API Reference

Simulations

pmhn.simulate_dataset(rng, n_points, theta, mean_sampling_time)

Simulates a dataset of genotypes and sampling times.

Parameters:

Name Type Description Default
rng

the random number generator.

required
n_points int

number of points to simulate.

required
theta ndarray

the log-MHN matrix. Can be of shape (n_mutations, n_mutations) or (n_points, n_mutations, n_mutations).

required
mean_sampling_time Union[ndarray, float, Sequence[float]]

the mean sampling time. Can be a float (shared between all data points) or an array of shape (n_points,).

required

Returns:

Type Description
ndarray

sampling times, shape (n_points,)

ndarray

genotypes, shape (n_points, n_mutations)
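The two accepted forms of mean_sampling_time can be illustrated with a small standard-library sketch. It assumes that sampling times are exponentially distributed with the given mean, which this page does not state. The helper name draw_sampling_times is hypothetical and is not part of pmhn.

```python
import random
from typing import Sequence, Union


def draw_sampling_times(
    rng: random.Random,
    n_points: int,
    mean_sampling_time: Union[float, Sequence[float]],
) -> list[float]:
    """Draw one sampling time per data point (hypothetical helper, not pmhn API)."""
    if isinstance(mean_sampling_time, (int, float)):
        # A single float is shared between all data points.
        means = [float(mean_sampling_time)] * n_points
    else:
        # Otherwise one mean per data point is expected, i.e. shape (n_points,).
        means = [float(m) for m in mean_sampling_time]
        if len(means) != n_points:
            raise ValueError("expected one mean sampling time per data point")
    # Assumption: sampling times are exponential with the given means.
    return [rng.expovariate(1.0 / m) for m in means]


rng = random.Random(0)
shared = draw_sampling_times(rng, 5, 1.0)
per_point = draw_sampling_times(rng, 3, [0.5, 1.0, 2.0])
```

Both calls return one non-negative time per data point; only the way the means are supplied differs.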

pmhn.simulate_genotype_known_time(rng, theta, sampling_time=1.0, start_state=None)

pmhn.simulate_trajectory(rng, theta, max_time, start_state=None)

Simulates a trajectory of the jump Markov chain.

Parameters:

Name Type Description Default
rng

the random number generator.

required
theta ndarray

the log-MHN matrix. theta[i, j] is the additive log-hazard contribution of mutation j to the appearance of mutation i.

required
max_time float

the maximum time to simulate.

required
start_state Optional[State]

the initial state of the chain. By default, it's the state with all 0s.

None

Returns:

Type Description
list[tuple[float, State]]

A list of (time, state) pairs, where time is the time of the jump and state is the state to which the jump occurred. The list is initialized with (0, start_state), i.e., the initial state at time 0.
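A minimal Gillespie-style sketch of such a jump chain is given here for orientation. It is not the pmhn implementation. It assumes the usual mutual hazard network parameterisation, in which an absent mutation i appears at rate exp(theta[i][i] + sum of theta[i][j] over present mutations j). The function name simulate_trajectory_sketch is hypothetical.

```python
import math
import random


def simulate_trajectory_sketch(rng, theta, max_time, start_state=None):
    """Simulate (time, state) pairs of the jump chain (illustrative only)."""
    n = len(theta)
    state = tuple(start_state) if start_state is not None else (0,) * n
    time = 0.0
    history = [(time, state)]  # the initial state at time 0
    while True:
        # Assumed hazard of each absent mutation i:
        # exp(theta[i][i] + sum of theta[i][j] over mutations j already present).
        rates = {
            i: math.exp(theta[i][i] + sum(theta[i][j] for j in range(n) if state[j]))
            for i in range(n)
            if not state[i]
        }
        if not rates:
            break  # every mutation is present, so no further jump can happen
        # The waiting time to the next jump is exponential with the total rate.
        time += rng.expovariate(sum(rates.values()))
        if time > max_time:
            break
        # The mutation that appears is chosen proportionally to its rate.
        genes = list(rates)
        i = rng.choices(genes, weights=[rates[g] for g in genes])[0]
        state = state[:i] + (1,) + state[i + 1:]
        history.append((time, state))
    return history


theta = [[0.0, 1.0, 0.0], [0.0, -0.5, 0.0], [-1.0, 0.0, 0.5]]
trajectory = simulate_trajectory_sketch(random.Random(42), theta, max_time=2.0)
```

Each jump adds exactly one mutation, so a trajectory over n mutations contains at most n + 1 entries.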

pmhn.sample_spike_and_slab(rng, n_mutations, diag_mean=0.0, diag_sigma=1.0, offdiag_effect=1.0, p_offdiag=0.2)

Samples a matrix with diagonal terms drawn from a normal distribution and offdiagonal terms drawn from a spike-and-slab distribution.

Parameters:

Name Type Description Default
rng

NumPy random number generator.

required
n_mutations int

number of mutations.

required
diag_mean float

mean of the normal distribution used to sample diagonal terms.

0.0
diag_sigma float

standard deviation of the normal distribution used to sample diagonal terms.

1.0
offdiag_effect float

the standard deviation of the slab used to sample non-zero offdiagonal terms

1.0
p_offdiag float

the probability of sampling a non-zero offdiagonal term.

0.2
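The sampling scheme described by these parameters can be sketched with the standard library. This is an illustration rather than the pmhn code. It assumes the slab is a zero-mean normal distribution and the spike is an exact zero. The function name sample_spike_and_slab_sketch is hypothetical.

```python
import random


def sample_spike_and_slab_sketch(rng, n_mutations, diag_mean=0.0, diag_sigma=1.0,
                                 offdiag_effect=1.0, p_offdiag=0.2):
    """Sample an (n_mutations, n_mutations) matrix as nested lists (illustrative only)."""
    matrix = [[0.0] * n_mutations for _ in range(n_mutations)]
    for i in range(n_mutations):
        for j in range(n_mutations):
            if i == j:
                # Diagonal terms come from a normal distribution.
                matrix[i][j] = rng.gauss(diag_mean, diag_sigma)
            elif rng.random() < p_offdiag:
                # Slab (assumed zero-mean normal): a non-zero offdiagonal term.
                matrix[i][j] = rng.gauss(0.0, offdiag_effect)
            # Otherwise the spike applies and the entry stays exactly 0.0.
    return matrix


# With p_offdiag=0.0 every offdiagonal entry falls into the spike.
theta = sample_spike_and_slab_sketch(random.Random(1), n_mutations=4, p_offdiag=0.0)
```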

Likelihood backends

pmhn.MHNBackend

Bases: Protocol

A backend for learning the MHN model.

All implementations of this interface must be able to compute the gradient and the loglikelihood of a given set of mutations and theta (log-MHN) matrix.

gradient_and_loglikelihood(mutations, theta)

Compute the gradient and the loglikelihood of a given set of mutations and theta matrix.

Parameters:

Name Type Description Default
theta ndarray

log-MHN matrix, shape (n_genes, n_genes)

required

Returns:

Type Description
ndarray

gradient of the loglikelihood with respect to theta, shape (n_genes, n_genes)

float

loglikelihood of the given mutations
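Because the backend is a Protocol, any class with a matching gradient_and_loglikelihood method satisfies it without inheriting from it. The sketch below shows that idea with stand-in names. BackendLike and ZeroBackend are hypothetical. The placeholder backend does not evaluate the MHN likelihood and only returns values of the documented shapes.

```python
from typing import Protocol, Sequence, runtime_checkable

Matrix = Sequence[Sequence[float]]


@runtime_checkable
class BackendLike(Protocol):
    """Stand-in for the backend interface (hypothetical name, not pmhn API)."""

    def gradient_and_loglikelihood(
        self, mutations, theta: Matrix
    ) -> tuple[list[list[float]], float]:
        ...


class ZeroBackend:
    """Placeholder backend: a zero gradient and a loglikelihood of 0.0."""

    def gradient_and_loglikelihood(self, mutations, theta):
        n = len(theta)
        # The gradient has the same shape as theta: (n_genes, n_genes).
        return [[0.0] * n for _ in range(n)], 0.0


backend = ZeroBackend()
grad, loglik = backend.gradient_and_loglikelihood(
    mutations=[[0, 1], [1, 1]],
    theta=[[0.0, 0.0], [0.0, 0.0]],
)
```

The runtime isinstance check only verifies that the method exists, not that its signature or return types match.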

pmhn.MHNLoglikelihood

Bases: Op

A wrapper around the MHN loglikelihood, so that it can be used in PyMC models.

This operation expects the (unconstrained/log) MHN matrix of shape (n_genes, n_genes).

perform(node, inputs, outputs)

This is the method which is called by the operation.

It calculates the loglikelihood.

Note

The arguments and the output are PyTensor variables.

pmhn.MHNCythonBackend

Bases: MHNBackend

A simple wrapper around the Cython implementation of the gradient and loglikelihood.

pmhn.MHNJoblibBackend

Bases: MHNBackend

Calculates the gradient and the loglikelihood by using multiple processes via Joblib, sending them individual patient data.

pmhn.PersonalisedMHNLoglikelihood

Bases: Op

A wrapper around the MHN loglikelihood, so that it can be used in PyMC models.

This operation expects the (unconstrained/log) MHN matrix of shape (n_genes, n_genes).

perform(node, inputs, outputs)

This is the method which is called by the operation.

It calculates the loglikelihood.

Note

The arguments and the output are PyTensor variables.

Priors

pmhn.prior_horseshoe(n_mutations, baselines_mean=0, baselines_sigma=10.0, tau=None)

Constructs a PyMC model with a horseshoe prior on the off-diagonal terms.

For a full description of this prior, see C.M. Carvalho et al., Handling Sparsity via the Horseshoe, AISTATS 2009.

Parameters:

Name Type Description Default
n_mutations int

number of mutations

required
baselines_mean float

prior mean of the baseline rates

0
baselines_sigma float

prior standard deviation of the baseline rates

10.0

Returns:

Type Description
Model

PyMC model. Use model.theta to access the (log-)mutual hazard network variable, which has shape (n_mutations, n_mutations)
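For intuition, a draw from a horseshoe prior as described by Carvalho et al. can be sketched without PyMC. Each term is normal with standard deviation tau times a half-Cauchy local scale. The sketch assumes a fixed tau and does not reproduce how pmhn builds its model. Both function names are hypothetical.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy distribution by inverse-transform sampling."""
    return scale * abs(math.tan(math.pi * (rng.random() - 0.5)))


def sample_horseshoe_offdiag(rng, n_mutations, tau=1.0):
    """Draw off-diagonal terms, shape (n_mutations, n_mutations - 1), as nested lists."""
    draws = []
    for _ in range(n_mutations):
        row = []
        for _ in range(n_mutations - 1):
            lam = half_cauchy(rng)  # local scale, one per entry
            # The entry is normal with standard deviation tau * lam (tau is fixed here).
            row.append(rng.gauss(0.0, tau * lam))
        draws.append(row)
    return draws


offdiag = sample_horseshoe_offdiag(random.Random(0), n_mutations=4, tau=0.1)
```

A small tau shrinks most entries towards zero, while the heavy-tailed local scales still allow a few large ones.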

pmhn.prior_regularized_horseshoe(n_mutations, baselines_mean=0, baselines_sigma=10.0, sparsity_sigma=0.3, c2=None, tau=None, lambdas_dof=5)

Constructs a PyMC model with a regularized horseshoe prior. To access the (log-)mutual hazard network parameters, use the theta variable.

Parameters:

Name Type Description Default
n_mutations int

number of mutations

required
baselines_mean float

prior mean of the baseline rates

0
baselines_sigma float

prior standard deviation of the baseline rates

10.0
sparsity_sigma float

sparsity parameter, controls the prior on tau. Ignored if tau is provided.

0.3
tau Optional[float]

if provided, will be used as the value of tau in the model

None

Returns:

Type Description
Model

PyMC model. Use model.theta to access the (log-)mutual hazard network variable, which has shape (n_mutations, n_mutations)

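Example
import pymc as pm
from pmhn import prior_regularized_horseshoe

model = prior_regularized_horseshoe(n_mutations=10)
with model:
    theta = model.theta
    # some_function_of is a placeholder for a user-defined log-density term
    pm.Potential("potential", some_function_of(theta))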

pmhn.prior_normal(n_mutations, mean=0.0, sigma=10.0, mean_offdiag=None, sigma_offdiag=None)

Constructs a PyMC model in which each entry is sampled from a multivariate normal distribution.

Parameters:

Name Type Description Default
mean float

prior mean of the diagonal entries

0.0
sigma float

prior standard deviation of the diagonal entries

10.0
mean_offdiag Optional[float]

prior mean of the off-diagonal entries, defaults to mean

None
sigma_offdiag Optional[float]

prior standard deviation of the off-diagonal entries, defaults to sigma

None

Note

This model is unlikely to result in sparse solutions and for very weak priors (e.g., very large sigma) the solution may be very multimodal.

pmhn.prior_only_baseline_rates(n_mutations, mean=0.0, sigma=10.0)

Constructs a PyMC model in which the theta matrix contains only diagonal entries.

pmhn.prior_spike_and_slab_marginalized(n_mutations, baselines_mean=0.0, baselines_sigma=10.0, sparsity_a=3.0, sparsity_b=1.0, spike_scale=0.1, slab_scale=10.0)

Construct a spike-and-slab mixture prior for the off-diagonal entries.

See the spike-and-slab mixture prior in this post.

Parameters:

Name Type Description Default
n_mutations int

number of mutations

required
baselines_mean float

mean of the normal prior on the baseline rates

0.0
baselines_sigma float

standard deviation of the normal prior on the baseline rates

10.0
sparsity_a float

shape parameter of the Beta distribution controlling sparsity

3.0
sparsity_b float

shape parameter of the Beta distribution controlling sparsity

1.0

Note

By default we set the sparsity prior to Beta(3, 1), so that $E[\gamma] = 0.75$, which should result in about 75% of the off-diagonal entries being close to zero.
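The figure of 0.75 follows from the mean of a Beta(a, b) distribution, a / (a + b). A quick check with the standard library, using variable names that mirror the parameters above:

```python
import random

sparsity_a, sparsity_b = 3.0, 1.0

# Mean of a Beta(a, b) distribution is a / (a + b).
expected_gamma = sparsity_a / (sparsity_a + sparsity_b)

# Monte Carlo check of the same quantity.
rng = random.Random(0)
draws = [rng.betavariate(sparsity_a, sparsity_b) for _ in range(20_000)]
empirical_mean = sum(draws) / len(draws)
```

Other choices of sparsity_a and sparsity_b shift the expected share of near-zero entries by the same formula.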

Visualisations

Mutual Hazard Network matrices

pmhn.plot_theta(theta, *, ax, gene_names=None, cmap=DEFAULT_COLORMAP, cbar=True, vmin=None, vmax=None, no_labels=False)

pmhn.plot_offdiagonal_sparsity(thetas, *, ax, thresholds=(0.01, 0.1, 0.2), true_theta=None, true_theta_color='orangered', true_theta_label='Data', xlabel='Off-diagonal sparsity', ylabel='Count')

Plots histogram representing the sparsity of the off-diagonal part of theta.

Parameters:

Name Type Description Default
thetas ndarray

Array of theta matrices, shape (n_samples, n_mutations, n_mutations)

required
ax Axes

axis to plot on

required
thresholds Sequence[float]

sparsity threshold (distinguishing between "existing" and "non-existing" interactions)

(0.01, 0.1, 0.2)
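One plausible reading of the plotted quantity is sketched here: for each sample, the fraction of off-diagonal entries whose absolute value lies below a threshold. This definition is an assumption, since this page does not spell it out. The function name offdiagonal_sparsity is hypothetical.

```python
def offdiagonal_sparsity(theta, threshold):
    """Fraction of off-diagonal entries with |value| < threshold (assumed definition)."""
    n = len(theta)
    offdiag = [theta[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(abs(v) < threshold for v in offdiag) / len(offdiag)


# Two toy samples of theta, i.e. shape (n_samples=2, n_mutations=3, n_mutations=3).
thetas = [
    [[1.0, 0.0, 0.05], [0.15, -1.0, 0.0], [0.5, 0.0, 2.0]],
    [[0.3, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, -0.2]],
]
thresholds = (0.01, 0.1, 0.2)

# One list of per-sample sparsities per threshold; these are the values a histogram would bin.
sparsities = {t: [offdiagonal_sparsity(th, t) for th in thetas] for t in thresholds}
```

Under this reading, a larger threshold counts more interactions as "non-existing", so the sparsity of a given sample can only grow with the threshold.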

pmhn.plot_offdiagonal_histograms(thetas, *, ax, theta_true=None, alpha=0.1)

pmhn.plot_theta_samples(theta_samples, *, width=4, height=3, theta_true=None)

Plot samples from theta.

Genotype matrices

pmhn.plot_genotypes(genotypes, *, ax, patients_on_x_axis=True, patients_label='Patients', genes_label='Genes', sort=True)

pmhn.plot_genotype_samples(genotype_samples)

Misc

pmhn.control_no_mutation_warning(silence=True)

Silence the warning that is raised when a mutation matrix does not contain any mutation.

pmhn.construct_matrix(diag, offdiag)

Constructs a square matrix from diagonal and offdiagonal terms.

Parameters:

Name Type Description Default
diag ndarray

array of shape (n,)

required
offdiag ndarray

array of shape (n, n-1)

required

Returns:

Type Description
ndarray

array of shape (n, n) with diagonal diag and the offdiagonal term at (k, i) given by offdiag[k, j(i)], where j(i) = i if i < k and j(i) = i - 1 if i > k, i.e. the diagonal position k is skipped

See Also

decompose_matrix, the inverse function.

pmhn.decompose_matrix(matrix)

Splits an (n, n) matrix into diagonal and offdiagonal terms.

Parameters:

Name Type Description Default
matrix ndarray

array of shape (n, n)

required

Returns:

Type Description
ndarray

diag, diagonal terms, shape (n,)

ndarray

offdiag, offdiagonal terms, shape (n, n-1)

See Also

construct_matrix, the inverse function.
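The index convention shared by the two functions can be shown with nested lists. This is a sketch of the documented mapping, not the pmhn implementation, and it assumes the diagonal position is skipped by shifting later columns down by one. The names ending in _sketch are hypothetical.

```python
def construct_matrix_sketch(diag, offdiag):
    """Build an (n, n) matrix from diag of shape (n,) and offdiag of shape (n, n-1)."""
    n = len(diag)
    matrix = []
    for k in range(n):
        row = []
        for i in range(n):
            if i == k:
                row.append(diag[k])
            else:
                # Column i maps to offdiag column i before the diagonal, i - 1 after it.
                row.append(offdiag[k][i if i < k else i - 1])
        matrix.append(row)
    return matrix


def decompose_matrix_sketch(matrix):
    """Split an (n, n) matrix into diag of shape (n,) and offdiag of shape (n, n-1)."""
    n = len(matrix)
    diag = [matrix[k][k] for k in range(n)]
    offdiag = [[matrix[k][i] for i in range(n) if i != k] for k in range(n)]
    return diag, offdiag


diag = [1, 2, 3]
offdiag = [[10, 11], [20, 21], [30, 31]]
matrix = construct_matrix_sketch(diag, offdiag)
# matrix == [[1, 10, 11], [20, 2, 21], [30, 31, 3]]
```

Decomposing the result returns the original diag and offdiag, which is the round-trip property implied by "the inverse function".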