Analyze
pyggdrasil.analyze.rhats(chains)
Compute an estimate of the rank-normalized split R-hat for a set of chains.
R-hat is sometimes referred to as the potential scale reduction factor or the Gelman-Rubin statistic.
Uses the “rank” method recommended by Vehtari et al. (2019).
The rank normalized R-hat diagnostic tests for lack of convergence by comparing the variance between multiple chains to the variance within each chain. If convergence has been achieved, the between-chain and within-chain variances should be identical.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
chains | ndarray | Array of arrays to calculate R-hat for (minimum of 2 chains, minimum of 4 draws). | required |
Returns:
Type | Description |
---|---|
ndarray | R-hat for the given chains from index 4 to length; the returned list is 4 shorter than the length of the chains. |
Note
- May return NaN if the chains are too short and all values are the same
- May raise an out-of-memory error if the chains are too long: 100,000 samples still works, 1,000,000 does not, because storing all truncated chains in memory is too much even with 24 GB of RAM. TODO: find a way to calculate R-hat for longer chains.
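To make the between-chain versus within-chain comparison concrete, here is a minimal sketch of the classic Gelman-Rubin R-hat in plain Python. It is not pyggdrasil's implementation: it skips both the chain splitting and the rank normalization of Vehtari et al. (2019), and the name `basic_rhat` is hypothetical.

```python
from statistics import mean, variance


def basic_rhat(chains):
    """Classic R-hat for equal-length chains (no splitting, no rank normalization).

    `chains` is a list of lists of floats; this is an illustrative stand-in,
    not the function documented above.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 chains with at least 2 draws each")
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same length")

    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)                   # B: between-chain variance

    if within == 0:
        # All draws in every chain are identical, so the ratio is undefined.
        return float("nan")

    # Pooled variance estimate, then the potential scale reduction factor.
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


mixed = basic_rhat([[1, 2, 3, 4], [1, 2, 3, 4]])          # chains agree
separated = basic_rhat([[0, 1, 0, 1], [10, 11, 10, 11]])  # chains disagree
```

Chains that explore the same region give a value near or below 1, while chains stuck in different regions give a value well above 1. Constant chains produce NaN in this sketch, which mirrors the first note above.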
pyggdrasil.analyze.ess(chains)
Calculates the effective sample size of a set of chains.
Uses the “bulk” method recommended by Vehtari et al. (2019): rank-normalized draws are used to calculate the effective sample size.
Parameters:
chains: array of arrays to calculate ESS for (minimum of 2 chains, minimum of 4 draws).
Returns:
Type | Description |
---|---|
tuple[ndarray, ndarray] | ess_bulk and ess_tail for the given chains, from index 4 to length. |
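As a rough illustration of what an effective sample size measures, the sketch below computes a simple single-chain estimate from autocorrelations, truncating the sum at the first non-positive lag. This is a simplified textbook estimator, not the rank-normalized bulk or tail method used by the function above; `autocorrelation` and `simple_ess` are hypothetical names.

```python
from statistics import mean


def autocorrelation(chain, lag):
    """Sample autocorrelation of `chain` at the given lag."""
    n = len(chain)
    mu = mean(chain)
    var = sum((x - mu) ** 2 for x in chain) / n
    cov = sum((chain[i] - mu) * (chain[i + lag] - mu) for i in range(n - lag)) / n
    return cov / var


def simple_ess(chain):
    """ESS = n / (1 + 2 * sum of positive leading autocorrelations)."""
    n = len(chain)
    mu = mean(chain)
    if all(x == mu for x in chain):
        return float("nan")  # constant chain: autocorrelation is undefined

    total = 0.0
    for lag in range(1, n):
        rho = autocorrelation(chain, lag)
        if rho <= 0:
            break  # truncate at the first non-positive autocorrelation
        total += rho
    return n / (1 + 2 * total)


alternating = [0, 1] * 10      # negatively correlated draws
sticky = [0] * 10 + [1] * 10   # strongly positively correlated draws
ess_alternating = simple_ess(alternating)
ess_sticky = simple_ess(sticky)
```

A chain that lingers in one state for long stretches carries less independent information than its length suggests, so its estimate falls below the number of draws.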
pyggdrasil.analyze.Metrics
Metrics for comparing trees.
Attributes:
Name | Type | Description |
---|---|---|
_METRICS | Dict[str, Callable[[TreeNode, TreeNode], float]] | Dictionary of metrics. |
get(metric)
staticmethod
Return metric function.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
metric | str | Name of metric. | required |
Returns:
Type | Description |
---|---|
Callable[[TreeNode, TreeNode], float] | The metric function. |
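The lookup pattern described here, a dictionary of named metric functions behind a static `get`, can be sketched as follows. Everything in this example is hypothetical: trees are stood in for by `frozenset` objects, and the metric names `"is_same"` and `"sym_diff"` are invented for illustration rather than taken from pyggdrasil.

```python
from typing import Callable, Dict

# Hypothetical stand-in for TreeNode: a tree is represented as a frozenset of edges.
Tree = frozenset


def _is_same(a: Tree, b: Tree) -> float:
    """1.0 if both trees have identical edge sets, else 0.0."""
    return float(a == b)


def _sym_diff(a: Tree, b: Tree) -> float:
    """Number of edges present in exactly one of the two trees."""
    return float(len(a ^ b))


class MetricRegistry:
    """Illustrative registry mapping metric names to two-tree functions."""

    _METRICS: Dict[str, Callable[[Tree, Tree], float]] = {
        "is_same": _is_same,
        "sym_diff": _sym_diff,
    }

    @staticmethod
    def get(metric: str) -> Callable[[Tree, Tree], float]:
        """Return the metric function registered under `metric`."""
        try:
            return MetricRegistry._METRICS[metric]
        except KeyError:
            known = ", ".join(sorted(MetricRegistry._METRICS))
            raise ValueError(f"unknown metric {metric!r}; known metrics: {known}")


tree_a = frozenset({(0, 1), (0, 2)})
tree_b = frozenset({(0, 1), (1, 2)})
distance = MetricRegistry.get("sym_diff")(tree_a, tree_b)
```

Keeping the functions in one dictionary means callers can select a metric by name, for example from a configuration file, without importing each function individually.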
pyggdrasil.analyze.to_pure_mcmc_data(mcmc_samples)
Converts McmcRunData to PureMcmcData.
Takes a list of MCMCSamples and converts it into an xarray object that is easy to plot.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mcmc_samples | list[MCMCSample] | List of MCMC samples. | required |
Returns: PureMcmcData
pyggdrasil.analyze.check_run_for_tree(desired_tree, mcmc_samples)
Check if a tree is in an MCMC run.
Returns a list of tuples of (iteration, tree, log-probability), or False if the tree never occurs. Goes through the entire chain to find all instances of the tree.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
desired_tree | Tree | | required |
mcmc_samples | McmcRunData | | required |
Returns: bool or list(tuple[int, Tree, float])
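The search described above can be sketched in a few lines. This assumes, purely for illustration, that a run is a list of `(iteration, tree, log_probability)` tuples and that trees support `==`; the actual `McmcRunData` and `Tree` types may differ, and `find_tree` is a hypothetical name.

```python
def find_tree(desired_tree, samples):
    """Return every (iteration, tree, log_probability) whose tree equals
    `desired_tree`, or False if there is no match.

    `samples` is assumed to be a list of 3-tuples; this is a stand-in for
    the real run data structure.
    """
    hits = [
        (iteration, tree, log_prob)
        for iteration, tree, log_prob in samples
        if tree == desired_tree
    ]
    return hits if hits else False


# Trees are stood in for by frozensets of edges in this example.
target = frozenset({(0, 1), (0, 2)})
other = frozenset({(0, 1), (1, 2)})
run = [(0, other, -12.5), (1, target, -10.0), (2, target, -10.0)]

found = find_tree(target, run)
missing = find_tree(frozenset({(2, 3)}), run)
```

Because the return value is either a list or `False`, callers should test it with `if result:` before iterating.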
pyggdrasil.analyze.analyze_mcmc_run(mcmc_data, metric, base_tree)
Analyze an MCMC run.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mcmc_data | PureMcmcData | MCMC run data to analyze, consisting of iteration number, tree, and log-probability. | required |
metric | Callable[[TreeNode, TreeNode], float] | Metric to apply to the trees. | required |
base_tree | TreeNode | Tree to compare all applicable metrics to. | required |
Returns:
Type | Description |
---|---|
list[int] | Iteration numbers. |
list[float] | Results of the metric. |
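A minimal sketch of this kind of analysis, under the assumption that the run is a list of `(iteration, tree, log_probability)` tuples and that the metric takes the sampled tree first and the base tree second. The real `PureMcmcData` layout and argument order may differ, and `apply_metric` is a hypothetical name.

```python
def apply_metric(samples, metric, base_tree):
    """Apply `metric(tree, base_tree)` to every sampled tree.

    Returns two parallel lists: the iteration numbers and the metric values.
    `samples` is assumed to be a list of (iteration, tree, log_probability).
    """
    iterations = [iteration for iteration, _, _ in samples]
    results = [float(metric(tree, base_tree)) for _, tree, _ in samples]
    return iterations, results


def edge_difference(a, b):
    """Illustrative metric: number of edges found in exactly one tree."""
    return len(a ^ b)


# Trees are stood in for by frozensets of edges in this example.
base = frozenset({(0, 1), (0, 2)})
run = [
    (0, frozenset({(0, 1), (1, 2)}), -12.5),
    (1, frozenset({(0, 1), (0, 2)}), -10.0),
]

iterations, distances = apply_metric(run, edge_difference, base)
```

The two returned lists line up index by index, so they can be passed directly to a plotting call as x and y values.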