API Reference
Simulations
pmhn.simulate_dataset(rng, n_points, theta, mean_sampling_time)
Simulates a dataset of genotypes and sampling times.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | | the random number generator. | required |
n_points | int | number of points to simulate. | required |
theta | ndarray | the log-MHN matrix. Can be of shape (n_mutations, n_mutations) or (n_points, n_mutations, n_mutations). | required |
mean_sampling_time | Union[ndarray, float, Sequence[float]] | the mean sampling time. Can be a float (shared between all data points) or an array of shape (n_points,). | required |
Returns:
Type | Description |
---|---|
ndarray | sampling times, shape (n_points,) |
ndarray | genotypes, shape (n_points, n_mutations) |
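The two accepted shapes for theta and mean_sampling_time can be illustrated with plain NumPy broadcasting. The sketch below is not the library's code: broadcast_inputs is a hypothetical helper that only shows how a shared matrix and a shared float could be expanded to one entry per data point.

```python
import numpy as np


def broadcast_inputs(n_points, theta, mean_sampling_time):
    """Hypothetical helper: expand both inputs to one entry per data point."""
    theta = np.asarray(theta, dtype=float)
    if theta.ndim == 2:
        # One shared log-MHN matrix: view it as n_points identical matrices.
        theta = np.broadcast_to(theta, (n_points,) + theta.shape)
    if theta.ndim != 3 or theta.shape[0] != n_points or theta.shape[1] != theta.shape[2]:
        raise ValueError(
            "theta must have shape (n_mutations, n_mutations) "
            "or (n_points, n_mutations, n_mutations)"
        )
    # A float becomes a constant vector; an array must already have n_points entries.
    times = np.broadcast_to(np.asarray(mean_sampling_time, dtype=float), (n_points,))
    return theta, times


thetas, mean_times = broadcast_inputs(5, np.zeros((3, 3)), 2.0)
print(thetas.shape, mean_times.shape)
```

An array of mean sampling times whose length differs from n_points makes np.broadcast_to raise a ValueError, which is the behaviour one would want from such a check.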
pmhn.simulate_genotype_known_time(rng, theta, sampling_time=1.0, start_state=None)
pmhn.simulate_trajectory(rng, theta, max_time, start_state=None)
Simulates a trajectory of the jump Markov chain.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | | the random number generator. | required |
theta | ndarray | the log-MHN matrix, theta[i, j] describes the additive log-hazard of mutation | required |
max_time | float | the maximum time to simulate. | required |
start_state | Optional[State] | the initial state of the chain. By default, it's the state with all 0s. | None |
Returns:
Type | Description |
---|---|
list[tuple[float, State]] | A list of (time, state) pairs, where time is the time of a jump and state is the state the chain jumped to. The list is initialized with (0, start_state), i.e., the initial state at time 0. |
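A jump Markov chain of this kind can be simulated with a Gillespie-style loop. The following is a minimal sketch, not the pmhn implementation. It assumes that the hazard of a still-absent mutation i in state x is exp(theta[i, i] + sum over j != i of theta[i, j] * x[j]), that a state is a tuple of 0/1 integers, and that mutations are never lost. The name simulate_trajectory_sketch is hypothetical.

```python
import numpy as np


def simulate_trajectory_sketch(rng, theta, max_time, start_state=None):
    """Gillespie-style sketch; the hazard convention is an assumption."""
    theta = np.asarray(theta, dtype=float)
    n = theta.shape[0]
    state = np.zeros(n, dtype=int) if start_state is None else np.array(start_state, dtype=int)
    t = 0.0
    # The history starts with the initial state at time 0.
    history = [(t, tuple(int(s) for s in state))]
    while True:
        absent = np.flatnonzero(state == 0)
        if absent.size == 0:
            break  # every mutation is present: no further jumps are possible
        # Assumed log-hazard of absent mutation i: theta[i, i] + sum_j theta[i, j] * state[j].
        # state[i] == 0 for absent i, so the diagonal term is not counted twice.
        log_rates = theta[absent, absent] + theta[absent] @ state
        rates = np.exp(log_rates)
        total = rates.sum()
        # Waiting time to the next jump is exponential with rate `total`.
        t += float(rng.exponential(1.0 / total))
        if t > max_time:
            break
        # The mutation that occurs is chosen proportionally to its hazard.
        i = rng.choice(absent, p=rates / total)
        state[i] = 1
        history.append((t, tuple(int(s) for s in state)))
    return history


rng = np.random.default_rng(0)
for time, state in simulate_trajectory_sketch(rng, np.zeros((3, 3)), max_time=100.0):
    print(time, state)
```

With an all-zero theta every absent mutation has hazard 1, so the example only exercises the bookkeeping: each recorded jump adds exactly one mutation and the times increase.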
pmhn.sample_spike_and_slab(rng, n_mutations, diag_mean=0.0, diag_sigma=1.0, offdiag_effect=1.0, p_offdiag=0.2)
Samples a matrix whose diagonal terms come from a normal distribution and whose offdiagonal terms come from a spike-and-slab distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
rng | | NumPy random number generator. | required |
n_mutations | int | number of mutations. | required |
diag_mean | float | mean of the normal distribution used to sample diagonal terms. | 0.0 |
diag_sigma | float | standard deviation of the normal distribution used to sample diagonal terms. | 1.0 |
offdiag_effect | float | the standard deviation of the slab used to sample non-zero offdiagonal terms | 1.0 |
p_offdiag | float | the probability of sampling a non-zero offdiagonal term. | 0.2 |
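The sampling scheme described by these parameters can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's code: the slab is taken to be a zero-mean normal with standard deviation offdiag_effect, the spike is taken to be an exact zero, and sample_spike_and_slab_sketch is a hypothetical name.

```python
import numpy as np


def sample_spike_and_slab_sketch(
    rng, n_mutations, diag_mean=0.0, diag_sigma=1.0, offdiag_effect=1.0, p_offdiag=0.2
):
    """Sketch: normal diagonal, spike-and-slab offdiagonal (assumed zero-mean slab)."""
    diag = rng.normal(diag_mean, diag_sigma, size=n_mutations)
    # Slab draws for every entry; most of them are discarded by the mask below.
    slab = rng.normal(0.0, offdiag_effect, size=(n_mutations, n_mutations))
    # Each entry is non-zero with probability p_offdiag.
    is_nonzero = rng.random((n_mutations, n_mutations)) < p_offdiag
    theta = np.where(is_nonzero, slab, 0.0)
    # Overwrite the diagonal with the normally distributed baseline terms.
    np.fill_diagonal(theta, diag)
    return theta


theta = sample_spike_and_slab_sketch(np.random.default_rng(42), n_mutations=5)
print(np.round(theta, 2))
```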
Likelihood backends
pmhn.MHNBackend
Bases: Protocol
A backend for learning the MHN model.
All implementations of this interface must be able to compute the gradient and the loglikelihood of a given set of mutations and theta (log-MHN) matrix.
gradient_and_loglikelihood(mutations, theta)
Compute the gradient and the loglikelihood of a given set of mutations
Parameters:
Name | Type | Description | Default |
---|---|---|---|
theta | ndarray | log-MHN matrix, shape (n_genes, n_genes) | required |
Returns:
Type | Description |
---|---|
ndarray | gradient of the loglikelihood with respect to theta, shape (n_genes, n_genes) |
float | loglikelihood of the given mutations, float |
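To make the shape of this interface concrete, here is a sketch of a structurally compatible class. BackendSketch and ToyGaussianBackend are hypothetical names, and the toy "likelihood" is a simple quadratic in theta chosen only so that the gradient is easy to verify. It is not an MHN likelihood.

```python
from typing import Protocol

import numpy as np


class BackendSketch(Protocol):
    """Hypothetical mirror of the interface described above."""

    def gradient_and_loglikelihood(
        self, mutations: np.ndarray, theta: np.ndarray
    ) -> tuple[np.ndarray, float]: ...


class ToyGaussianBackend:
    """Toy stand-in, NOT an MHN likelihood.

    Uses loglik = -0.5 * n_patients * sum(theta ** 2), whose gradient
    with respect to theta is -n_patients * theta.
    """

    def gradient_and_loglikelihood(self, mutations, theta):
        n_patients = mutations.shape[0]
        loglik = -0.5 * n_patients * float(np.sum(theta**2))
        grad = -n_patients * theta
        # Return order follows the documentation: gradient first, then loglikelihood.
        return grad, loglik


backend: BackendSketch = ToyGaussianBackend()
mutations = np.zeros((3, 2), dtype=int)  # 3 patients, 2 genes
theta = np.arange(4.0).reshape(2, 2)
grad, loglik = backend.gradient_and_loglikelihood(mutations, theta)
print(grad.shape, loglik)
```

Because Protocol relies on structural typing, ToyGaussianBackend does not need to inherit from BackendSketch; having a method with a matching signature is enough for a type checker.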
pmhn.MHNLoglikelihood
Bases: Op
A wrapper around the MHN loglikelihood, so that it can be used in PyMC models.
This operation expects the (unconstrained/log) MHN matrix of shape (n_genes, n_genes).
perform(node, inputs, outputs)
This is the method which is called by the operation.
It calculates the loglikelihood.
Note
The arguments and the output are PyTensor variables.
pmhn.MHNCythonBackend
Bases: MHNBackend
A simple wrapper around the Cython implementation of the gradient and loglikelihood.
pmhn.MHNJoblibBackend
Bases: MHNBackend
Calculates the gradient and the loglikelihood by using multiple processes via Joblib, sending them individual patient data.
pmhn.PersonalisedMHNLoglikelihood
Bases: Op
A wrapper around the MHN loglikelihood, so that it can be used in PyMC models.
This operation expects the (unconstrained/log) MHN matrix of shape (n_genes, n_genes).
perform(node, inputs, outputs)
This is the method which is called by the operation.
It calculates the loglikelihood.
Note
The arguments and the output are PyTensor variables.
Priors
pmhn.prior_horseshoe(n_mutations, baselines_mean=0, baselines_sigma=10.0, tau=None)
Constructs a PyMC model with a horseshoe prior on the off-diagonal terms.
For a full description of this prior, see C.M. Carvalho et al., Handling Sparsity via the Horseshoe, AISTATS 2009.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_mutations | int | number of mutations | required |
baselines_mean | float | prior mean of the baseline rates | 0 |
baselines_sigma | float | prior standard deviation of the baseline rates | 10.0 |
Returns:
Type | Description |
---|---|
Model | PyMC model. Use |
pmhn.prior_regularized_horseshoe(n_mutations, baselines_mean=0, baselines_sigma=10.0, sparsity_sigma=0.3, c2=None, tau=None, lambdas_dof=5)
Constructs a PyMC model with a regularized horseshoe prior.
To access the (log-)mutual hazard network parameters, use the theta variable.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_mutations | int | number of mutations | required |
baselines_mean | float | prior mean of the baseline rates | 0 |
baselines_sigma | float | prior standard deviation of the baseline rates | 10.0 |
sparsity_sigma | float | sparsity parameter, controls the prior on | 0.3 |
tau | Optional[float] | if provided, will be used as the value of | None |
Returns:
Type | Description |
---|---|
Model | PyMC model. Use |
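Example
import pymc as pm
from pmhn import prior_regularized_horseshoe

model = prior_regularized_horseshoe(n_mutations=10)
with model:
    theta = model.theta
    # some_function_of is a placeholder for a user-defined log-probability term
    pm.Potential("potential", some_function_of(theta))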
pmhn.prior_normal(n_mutations, mean=0.0, sigma=10.0, mean_offdiag=None, sigma_offdiag=None)
Constructs a PyMC model in which each entry is sampled from a multivariate normal distribution.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
mean | float | prior mean of the diagonal entries | 0.0 |
sigma | float | prior standard deviation of the diagonal entries | 10.0 |
mean_offdiag | Optional[float] | prior mean of the off-diagonal entries, defaults to | None |
sigma_offdiag | Optional[float] | prior standard deviation of the off-diagonal entries, defaults to | None |
Note
This model is unlikely to result in sparse solutions and for very weak priors (e.g., very large sigma) the solution may be very multimodal.
pmhn.prior_only_baseline_rates(n_mutations, mean=0.0, sigma=10.0)
Constructs a PyMC model in which the theta matrix contains only diagonal entries.
pmhn.prior_spike_and_slab_marginalized(n_mutations, baselines_mean=0.0, baselines_sigma=10.0, sparsity_a=3.0, sparsity_b=1.0, spike_scale=0.1, slab_scale=10.0)
Construct a spike-and-slab mixture prior for the off-diagonal entries.
See the spike-and-slab mixture prior in this post.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_mutations | int | number of mutations | required |
baselines_mean | float | mean of the normal prior on the baseline rates | 0.0 |
baselines_sigma | float | standard deviation of the normal prior on the baseline rates | 10.0 |
sparsity_a | float | shape parameter of the Beta distribution controlling sparsity | 3.0 |
sparsity_b | float | shape parameter of the Beta distribution controlling sparsity | 1.0 |
Note
By default the sparsity prior is Beta(3, 1), for which $E[\gamma] = 0.75$. This should result in about 75% of the off-diagonal entries being close to zero.
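The 0.75 figure follows from the mean of a Beta(a, b) distribution, a / (a + b). The short check below uses only the Python standard library; beta_mean is a hypothetical helper, and the Monte Carlo estimate is included only as a sanity check of the closed form.

```python
import random


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Closed form for the default sparsity_a=3.0, sparsity_b=1.0.
analytic = beta_mean(3.0, 1.0)

# Monte Carlo estimate from the standard library's Beta sampler.
rng = random.Random(0)
draws = [rng.betavariate(3.0, 1.0) for _ in range(20_000)]
empirical = sum(draws) / len(draws)

print(analytic, round(empirical, 2))
```

Changing sparsity_a and sparsity_b shifts this expectation; for example Beta(1, 1) gives a mean of 0.5.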
Visualisations
Mutual Hazard Network matrices
pmhn.plot_theta(theta, *, ax, gene_names=None, cmap=DEFAULT_COLORMAP, cbar=True, vmin=None, vmax=None, no_labels=False)
pmhn.plot_offdiagonal_sparsity(thetas, *, ax, thresholds=(0.01, 0.1, 0.2), true_theta=None, true_theta_color='orangered', true_theta_label='Data', xlabel='Off-diagonal sparsity', ylabel='Count')
Plots histogram representing the sparsity of the off-diagonal part of theta.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
thetas | ndarray | Array of theta matrices, shape (n_samples, n_mutations, n_mutations) | required |
ax | Axes | axis to plot on | required |
thresholds | Sequence[float] | sparsity thresholds (distinguishing between "existing" and "non-existing" interactions) | (0.01, 0.1, 0.2) |
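The quantity being histogrammed can be sketched without any plotting. The definition used here is an assumption: for each sample, sparsity is taken to be the fraction of off-diagonal entries whose absolute value falls below the threshold. The function name offdiagonal_sparsity is hypothetical.

```python
import numpy as np


def offdiagonal_sparsity(thetas, threshold):
    """Assumed definition: share of off-diagonal entries with |value| < threshold."""
    thetas = np.asarray(thetas, dtype=float)
    n = thetas.shape[-1]
    # Boolean mask selecting everything except the diagonal.
    offdiag_mask = ~np.eye(n, dtype=bool)
    # Shape (n_samples, n * (n - 1)): one row of off-diagonal entries per sample.
    offdiag = thetas[:, offdiag_mask]
    return np.mean(np.abs(offdiag) < threshold, axis=1)


thetas = np.zeros((2, 3, 3))
thetas[1] = 1.0  # second sample has no entries near zero
for threshold in (0.01, 0.1, 0.2):
    print(threshold, offdiagonal_sparsity(thetas, threshold))
```

A histogram of the returned values over posterior samples, one per threshold, would correspond to the kind of plot this function is described as producing.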
pmhn.plot_offdiagonal_histograms(thetas, *, ax, theta_true=None, alpha=0.1)
pmhn.plot_theta_samples(theta_samples, *, width=4, height=3, theta_true=None)
Plot samples from theta.
Genotype matrices
pmhn.plot_genotypes(genotypes, *, ax, patients_on_x_axis=True, patients_label='Patients', genes_label='Genes', sort=True)
pmhn.plot_genotype_samples(genotype_samples)
Misc
pmhn.control_no_mutation_warning(silence=True)
Silence the warning that is raised when a mutation matrix does not contain any mutation.
pmhn.construct_matrix(diag, offdiag)
Constructs a square matrix from diagonal and offdiagonal terms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
diag | ndarray | array of shape (n,) | required |
offdiag | ndarray | array of shape (n, n-1) | required |
Returns:
Type | Description |
---|---|
ndarray | array of shape (n, n) with the diagonal |
See Also
decompose_matrix, the inverse function.
pmhn.decompose_matrix(matrix)
Splits an (n, n) matrix into diagonal and offdiagonal terms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
matrix | ndarray | array of shape (n, n) |  required |
Returns:
Type | Description |
---|---|
ndarray | diag, diagonal terms, shape (n,) |
ndarray | offdiag, offdiagonal terms, shape (n, n-1) |
See Also
construct_matrix, the inverse function.
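The round trip between an (n, n) matrix and its (n,) diagonal plus (n, n-1) offdiagonal parts can be sketched as follows. The layout of offdiag is an assumption: row i is taken to be row i of the matrix with the diagonal entry removed. The names decompose_sketch and construct_sketch are hypothetical and do not reproduce the library's code.

```python
import numpy as np


def decompose_sketch(matrix):
    """Split an (n, n) matrix into diag (n,) and offdiag (n, n-1)."""
    matrix = np.asarray(matrix)
    n = matrix.shape[0]
    diag = np.diag(matrix).copy()
    # Boolean indexing walks the matrix row by row, so reshaping to (n, n-1)
    # gives each row with its diagonal entry removed (assumed layout).
    offdiag = matrix[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    return diag, offdiag


def construct_sketch(diag, offdiag):
    """Inverse of decompose_sketch under the same assumed layout."""
    diag = np.asarray(diag)
    offdiag = np.asarray(offdiag)
    n = diag.shape[0]
    matrix = np.empty((n, n), dtype=np.result_type(diag, offdiag))
    on_diagonal = np.eye(n, dtype=bool)
    matrix[on_diagonal] = diag
    matrix[~on_diagonal] = offdiag.ravel()
    return matrix


m = np.arange(9.0).reshape(3, 3)
diag, offdiag = decompose_sketch(m)
print(diag)
print(offdiag)
print(np.array_equal(construct_sketch(diag, offdiag), m))
```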