Analyze

pyggdrasil.analyze.rhats(chains)

Computes an estimate of the rank-normalized split R-hat for a set of chains.

R-hat is also referred to as the potential scale reduction factor or the Gelman-Rubin statistic.

Uses the “rank” method recommended by Vehtari et al. (2019).

The rank-normalized R-hat diagnostic tests for lack of convergence by comparing the variance between multiple chains to the variance within each chain. If the chains have converged, the between-chain and within-chain variance estimates should agree, so R-hat should be close to 1.
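The between-chain versus within-chain comparison can be illustrated with a minimal sketch. It implements the classic split R-hat on plain Python lists and deliberately omits the rank normalization of Vehtari et al. (2019); the function name `split_rhat` and the synthetic chains are assumptions made for illustration and are not part of pyggdrasil.

```python
import random
from statistics import mean, variance


def split_rhat(chains):
    """Classic split R-hat for equal-length chains of scalar draws."""
    half = len(chains[0]) // 2  # draws per half-chain
    # Split each chain into a first and a second half so that drift
    # inside a single chain also shows up as between-chain variance.
    halves = []
    for chain in chains:
        halves.append(chain[:half])
        halves.append(chain[-half:])
    within = mean(variance(h) for h in halves)  # W: mean within-chain variance
    between = half * variance([mean(h) for h in halves])  # B: between-chain variance
    # Pooled estimate of the target variance.
    var_hat = (half - 1) / half * within + between / half
    return (var_hat / within) ** 0.5


rng = random.Random(0)
# Two chains sampling the same distribution.
mixed = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(2)]
# Two chains centred on different values, i.e. not converged.
stuck = [
    [rng.gauss(0, 1) for _ in range(500)],
    [rng.gauss(5, 1) for _ in range(500)],
]
rhat_mixed = split_rhat(mixed)  # expected to be close to 1
rhat_stuck = split_rhat(stuck)  # expected to be well above 1
```

Values near 1 indicate that the two variance estimates agree; the disagreeing chains produce a clearly larger value.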

Parameters:

Name Type Description Default
chains ndarray Array of arrays to calculate R-hat for (minimum of 2 chains, minimum of 4 draws). required

Returns:

Type Description
ndarray R-hat values for the given chains, computed from index 4 up to the chain length; the result is therefore 4 entries shorter than the chains.

Note
  • May return NaN if the chains are too short and all values are the same.
  • May raise an out-of-memory error if the chains are too long: 100,000 samples still works, but 1,000,000 does not, because storing all truncated chains in memory is too much even with 24 GB of RAM. TODO: find a way to calculate R-hat for longer chains.
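The relationship between chain length and output length, and the NaN case from the note, can be sketched as follows. This is an illustration under stated assumptions, not the pyggdrasil implementation: `rhat` is a compact unsplit stand-in, `running_rhats` is a hypothetical helper, and the exact prefix slicing is a guess that merely reproduces the "4 shorter" behaviour. The generator evaluates one truncated prefix at a time instead of storing all of them.

```python
from statistics import mean, variance


def rhat(chains):
    """Compact unsplit R-hat stand-in; NaN when the ratio is undefined."""
    n = len(chains[0])
    within = mean(variance(c) for c in chains)
    if within == 0:
        # Zero within-chain variance (for identical draws this is 0/0).
        return float("nan")
    between = n * variance([mean(c) for c in chains])
    return (((n - 1) / n * within + between / n) / within) ** 0.5


def running_rhats(chains, start=4):
    """Yield the diagnostic for growing prefixes, one prefix at a time."""
    length = len(chains[0])
    for end in range(start + 1, length + 1):
        # Only the current prefix is materialised here.
        yield rhat([chain[:end] for chain in chains])


chains = [
    [0.1, 0.4, 0.2, 0.9, 0.3, 0.7, 0.5, 0.6],
    [0.8, 0.2, 0.6, 0.1, 0.5, 0.3, 0.9, 0.4],
]
values = list(running_rhats(chains))  # one value per prefix
constant = rhat([[1.0] * 4, [1.0] * 4])  # short, identical draws
```

With chains of length 8 the sketch yields 4 values, and the constant chains produce NaN.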

pyggdrasil.analyze.ess(chains)

Calculates the effective sample size of a set of chains.

Uses the “bulk” method recommended by Vehtari et al. (2019); rank-normalized draws are used to calculate the effective sample size.

chains: array of arrays to calculate ESS for (minimum of 2 chains, minimum of 4 draws)

Returns:

Type Description
tuple[ndarray, ndarray] ess_bulk and ess_tail for the given chains, computed from index 4 up to the chain length.
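To give an intuition for what an effective sample size measures, here is a deliberately simplified single-chain sketch. It is not the bulk or tail estimator of Vehtari et al. (2019): it skips rank normalization and multi-chain pooling, and it truncates the autocorrelation sum at the first non-positive lag. The names `autocorrelation` and `simple_ess` are assumptions made for illustration.

```python
import random
from statistics import mean


def autocorrelation(draws, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(draws)
    mu = mean(draws)
    var = sum((x - mu) ** 2 for x in draws) / n
    cov = sum((draws[i] - mu) * (draws[i + lag] - mu) for i in range(n - lag)) / n
    return cov / var


def simple_ess(draws):
    """ESS = n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(draws)
    tau = 1.0
    for lag in range(1, n):
        rho = autocorrelation(draws, lag)
        if rho <= 0:  # stop at the first non-positive autocorrelation
            break
        tau += 2 * rho
    return n / tau


rng = random.Random(1)
independent = [rng.gauss(0, 1) for _ in range(1000)]
# AR(1) process: each draw depends strongly on the previous one.
correlated = [0.0]
for _ in range(999):
    correlated.append(0.9 * correlated[-1] + rng.gauss(0, 1))

ess_independent = simple_ess(independent)
ess_correlated = simple_ess(correlated)
```

The strongly autocorrelated chain carries far fewer effective draws than the independent one, even though both contain 1000 samples.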

pyggdrasil.analyze.Metrics

Metrics for comparing trees.

Attributes:

Name Type Description
_METRICS Dict[str, Callable[[TreeNode, TreeNode], float]] Dictionary of metrics.

get(metric) staticmethod

Return metric function.

Parameters:

Name Type Description Default
metric str Name of metric. required

Returns:

Type Description
Callable[[TreeNode, TreeNode], float] The metric function registered under the given name:
  • AD: Ancestor-Descendant Similarity; pyggdrasil.distances.AncestorDescendantSimilarity().calculate
  • MP3: MP3 Similarity; pyggdrasil.distances.MP3Similarity().calculate
  • TrueTree: True Tree Similarity; pyggdrasil.compare_trees
  • DL: Different Lineage Similarity; pyggdrasil.distances.DifferentLineageSimilarity().calculate
  • MLTD: MLTD Similarity; pyggdrasil.distances.MLTDSimilarity().calculate
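The name-to-function lookup described above follows a common registry pattern, sketched below. Everything in the block is hypothetical: `MetricRegistry`, the two stand-in metrics, and the use of frozensets in place of TreeNode objects are assumptions, and the error raised for an unknown name in pyggdrasil may differ.

```python
from typing import Callable, Dict


def overlap(a: frozenset, b: frozenset) -> float:
    """Stand-in metric: Jaccard overlap of two non-empty label sets."""
    return len(a & b) / len(a | b)


def identical(a: frozenset, b: frozenset) -> float:
    """Stand-in metric: 1.0 if both sets are equal, else 0.0."""
    return float(a == b)


class MetricRegistry:
    """Hypothetical registry mirroring a name -> function lookup."""

    _METRICS: Dict[str, Callable[[frozenset, frozenset], float]] = {
        "Overlap": overlap,
        "Identical": identical,
    }

    @staticmethod
    def get(metric: str) -> Callable[[frozenset, frozenset], float]:
        # Unknown names raise KeyError in this sketch.
        return MetricRegistry._METRICS[metric]


score = MetricRegistry.get("Overlap")(frozenset("abc"), frozenset("bcd"))
```

Keeping the mapping in one dictionary means callers can select a metric by its short name, for example from a configuration file or command-line argument.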

pyggdrasil.analyze.to_pure_mcmc_data(mcmc_samples)

Converts McmcRunData to PureMcmcData.

Takes a list of MCMC samples and converts it into an xarray structure that is easy to plot.

Parameters:

Name Type Description Default
mcmc_samples list[MCMCSample] List of MCMC samples. required

Returns: PureMcmcData

pyggdrasil.analyze.check_run_for_tree(desired_tree, mcmc_samples)

Check if a tree is in an MCMC run.

Returns a list of tuples of (iteration, tree, log-probability), or False if the tree is not found. Goes through the entire chain to find all instances of the tree.

Parameters:

Name Type Description Default
desired_tree Tree Tree to look for. required
mcmc_samples McmcRunData MCMC run to search. required

Returns: bool or list(tuple[int, Tree, float])
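The search behaviour can be sketched on stand-in data. The function `find_tree`, the tuple layout of a sample, and the nested-tuple trees are assumptions chosen so the block runs without pyggdrasil; the real function operates on McmcRunData and Tree objects.

```python
def find_tree(desired_tree, samples):
    """Return every (iteration, tree, log_prob) whose tree matches, or False."""
    hits = [
        (iteration, tree, log_prob)
        for iteration, tree, log_prob in samples
        if tree == desired_tree
    ]
    return hits if hits else False


# Trees are written as nested tuples purely for illustration.
tree_a = ("root", ("x", "y"))
tree_b = ("root", ("y", "x"))
run = [(0, tree_a, -12.5), (1, tree_b, -11.0), (2, tree_a, -12.5)]

found = find_tree(tree_a, run)  # matches at iterations 0 and 2
missing = find_tree(("root", ()), run)  # no match, so False
```

Because the result is either a list or False, callers can test it directly in an `if` statement before iterating over the hits.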

pyggdrasil.analyze.analyze_mcmc_run(mcmc_data, metric, base_tree)

Analyze an MCMC run.

Parameters:

Name Type Description Default
mcmc_data PureMcmcData MCMC run data to analyze, consisting of iteration number, tree, and log-probability. required
metric Callable[[TreeNode, TreeNode], float] Metric to apply to the trees. required
base_tree TreeNode Tree to compare all applicable metrics to. required

Returns:

Type Description
list[int], list[float] Iteration numbers and the corresponding results of the metric.
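A minimal sketch of this kind of analysis, applying one metric to every sampled tree against a base tree, might look as follows. The names `analyze_run` and `same_tree`, the tuple layout of a sample, and the argument order passed to the metric are assumptions; the real function takes PureMcmcData and TreeNode objects.

```python
def analyze_run(samples, metric, base_tree):
    """Apply metric(tree, base_tree) to each sample; return iterations and results."""
    iterations, results = [], []
    for iteration, tree, _log_prob in samples:
        iterations.append(iteration)
        results.append(metric(tree, base_tree))
    return iterations, results


def same_tree(tree, other):
    """Stand-in metric: 1.0 for identical trees, 0.0 otherwise."""
    return float(tree == other)


# Trees are written as nested tuples purely for illustration.
base = ("root", ("x", "y"))
run = [(0, ("root", ("y", "x")), -11.0), (1, base, -12.5)]
iterations, similarity = analyze_run(run, same_tree, base)
```

The two returned lists line up index by index, which makes them convenient to plot as a trace of the metric over iterations.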