API Reference

fitree

FiTreeJointLikelihood

Bases: Op

__init__(trees, augment_max_level=2, C_s=None, conditioning=True, lifetime_risk_mean=None, lifetime_risk_std=None, eps=1e-64, tau=0.01)

This object computes the joint log-likelihood of the fitness matrix F_mat to be used in the MCMC sampling.

Parameters:

Name Type Description Default
trees TumorTreeCohort

TumorTreeCohort object.

required
augment_max_level int

Maximum level of augmentation.

2
C_s float

scaling factor at sampling.

None
conditioning bool

Whether to condition on the observed trees.

True
lifetime_risk_mean float

Mean lifetime risk.

None
lifetime_risk_std float

Standard deviation of lifetime risk.

None
eps float

machine epsilon.

1e-64
tau float

time window for the numerical integration.

0.01

Subclone

Bases: SubcloneBase, NodeMixin

__init__(node_id, mutation_ids, seq_cell_number, cell_number=None, parent=None, children=None, genotype=None, growth_params=None, node_path=None)

A subclone in the tree

Parameters:

Name Type Description Default
node_id int

node ID

required
mutation_ids Iterable[int]

mutation IDs

required
seq_cell_number int

number of sequenced cells in the subclone

required
cell_number int | None

number of cells in the subclone

None
parent Subclone | None

parent subclone

None
children Iterable[Subclone] | None

children subclones

None
genotype list[int] | None

genotype of the subclone

None
growth_params dict | None

growth parameters of the subclone

None
node_path str | None

path of the subclone in the tree

None

get_genotype()

Get the genotype of the subclone

Returns:

Name Type Description
list list[int]

genotype

get_growth_params(mu_vec, F_mat, common_beta, return_dict=False)

Get the growth parameters for the subclone

Parameters:

Name Type Description Default
mu_vec ndarray

mutation rate vector

required
F_mat ndarray

fitness matrix

required
common_beta float

common death rate

required
return_dict bool

whether to return a dict or not

False

Returns:

Name Type Description
growth_params dict | Any

None, or a dict with growth parameters:

{
"nu": mutation rate,
"alpha": birth rate,
"beta": death rate,
"lam": net growth rate,
"delta": running-max net growth rate,
"r": number of times achieving the running-max net growth rate,
"rho": shape parameter of the subclonal population size distribution (nu / alpha),
"phi": scale parameter of the subclonal population size distribution,
"gamma": growth ratio
}
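
The sketch below illustrates how a few of these entries relate to each other along one root-to-node lineage. It is not the library's implementation: the function name `lineage_growth_params` and its input layout are hypothetical, "lam" is assumed to be birth rate minus death rate, the running maximum is assumed to include the node itself and all of its ancestors, and "phi" and "gamma" are left out because their formulas are not given here.

```python
from itertools import accumulate


def lineage_growth_params(alphas, nus, beta):
    """Derive per-node parameters along one root-to-node path.

    alphas, nus: birth and mutation rates ordered from the root downwards
    (hypothetical input layout). beta: common death rate.
    """
    lams = [alpha - beta for alpha in alphas]   # net growth rate of each node
    deltas = list(accumulate(lams, max))        # running maximum along the path
    params = []
    for i, (alpha, nu) in enumerate(zip(alphas, nus)):
        # Count how many nodes up to and including node i attain the running max.
        r = sum(1 for lam in lams[: i + 1] if lam == deltas[i])
        params.append({
            "nu": nu,
            "alpha": alpha,
            "beta": beta,
            "lam": lams[i],
            "delta": deltas[i],
            "r": r,
            "rho": nu / alpha,                  # shape parameter, nu / alpha
        })
    return params


params = lineage_growth_params(
    alphas=[1.0, 1.2, 1.2, 1.1], nus=[1e-6] * 4, beta=1.0
)
for p in params:
    print(p["lam"], p["delta"], p["r"], p["rho"])
```

In this example the third node repeats the largest net growth rate seen so far, so its "r" is 2, and the fourth node inherits that count because its own rate is lower.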

TumorTree

__init__(patient_id, tree_id, root, tumor_size, weight=1.0, sampling_time=None)

A tumor tree

Parameters:

Name Type Description Default
patient_id int

patient id

required
tree_id int

tree id

required
root Subclone

root subclone

required
tumor_size float

total number of tumor cells

required
weight float

weight of the tree. Defaults to 1.0.

1.0
sampling_time float

sampling time of the tree. Defaults to None.

None

__str__()

String representation of the tumor tree

Returns:

Name Type Description
str str

string representation of the tumor tree

TumorTreeCohort

__init__(name, trees=None, n_mutations=0, N_trees=0, N_patients=0, mu_vec=None, common_beta=None, C_0=None, C_seq=None, C_sampling=None, t_max=None, mutation_labels=None, tree_labels=None, patient_labels=None, lifetime_risk=None)

A cohort of tumor mutation trees

Parameters:

Name Type Description Default
name str

name of the cohort

required
trees list[TumorTree]

list of tumor trees.

None
n_mutations int

number of mutations.

0
N_trees int

number of trees.

0
N_patients int

number of patients.

0
mu_vec ndarray

mutation rates.

None
common_beta float

common beta parameter.

None
C_0 (int, float)

static wild-type population size.

None
C_seq (int, float)

number of sequenced cells.

None
C_sampling (int, float)

scaling factor for the sampling time.

None
t_max float

maximum time.

None
mutation_labels list

mutation labels.

None
tree_labels list

tree labels.

None
patient_labels list

patient labels.

None
lifetime_risk float

lifetime risk.

None

VectorizedTrees

Bases: NamedTuple

This class stores the trees in a vectorized format

Parameters:

Name Type Description Default
cell_number

cell number of each node

required
seq_cell_number

sequenced cell number of each node

required
observed

observed status of each node

required
sampling_time

sampling time of each tree

required
weight

weight of each tree

required
tumor_size

tumor size of each tree

required
node_id

node ID of each node

required
parent_id

parent ID of each node

required
alpha

alpha parameter of each node

required
nu

nu parameter of each node

required
lam

lambda parameter of each node

required
rho

rho parameter of each node

required
phi

phi parameter of each node

required
delta

delta parameter of each node

required
r

r parameter of each node

required
gamma

gamma parameter of each node

required
genotypes

genotype of each node

required
N_trees

number of observed trees

required
N_patients

number of patients

required
n_nodes

number of nodes in the union tree (excluding the root)

required
beta

common death rate

required
C_s

sampling scale

required
C_0

root size

required
t_max

maximum sampling time

required

compute_normalizing_constant(trees, eps=1e-64, tau=0.01)

This function computes the normalizing constant for the joint likelihood of the trees. P(T_s < t_max) = 1 - P(T_s > t_max)

Parameters:

Name Type Description Default
trees VectorizedTrees

The tree object.

required
eps float

The machine epsilon. Defaults to 1e-64.

1e-64
tau float

The time window for the numerical integration. Defaults to 1e-2.

0.01
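
As a rough illustration of the identity P(T_s < t_max) = 1 - P(T_s > t_max) and of the role of tau, the sketch below integrates a stand-in density for the sampling time with the trapezoidal rule. The exponential density, the function name `prob_sampled_before`, and the use of eps as a lower clamp are all assumptions made for the example; the actual sampling-time distribution in fitree depends on the tree parameters.

```python
import math


def prob_sampled_before(density, t_max, tau=0.01, eps=1e-64):
    """Approximate P(T_s < t_max) by trapezoidal integration with step ~tau."""
    n_steps = max(1, round(t_max / tau))
    h = t_max / n_steps                       # actual step, close to tau
    total = 0.5 * (density(0.0) + density(t_max))
    for k in range(1, n_steps):
        total += density(k * h)
    p = total * h
    # Assumed use of eps: keep the result strictly positive so log(p) is finite.
    return min(max(p, eps), 1.0)


rate = 0.05  # hypothetical rate of the stand-in exponential sampling time
p_before = prob_sampled_before(lambda t: rate * math.exp(-rate * t), t_max=100.0)
p_after = math.exp(-rate * 100.0)             # closed-form P(T_s > t_max)
print(p_before, 1.0 - p_after)
```

The two printed values agree to within the integration error, which shrinks as tau decreases.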

generate_fmat(rng, n_mutations, mean_diag=0.1, sigma_diag=0.05, mean_offdiag=0.0, sigma_offdiag=1.0, p_diag=0.5, p_offdiag=0.5, positive_ratio=0.5)

Generate a fitness matrix with diagonal and off-diagonal elements.

Parameters:

Name Type Description Default
rng Generator

Random number generator.

required
n_mutations int

Number of mutations.

required
mean_diag float

Mean of the diagonal elements.

0.1
sigma_diag float

Standard deviation of the diagonal elements.

0.05
mean_offdiag float

Mean of the off-diagonal elements.

0.0
sigma_offdiag float

Standard deviation of the off-diagonal elements.

1.0
p_diag float

Probability of a diagonal element being non-zero.

0.5
p_offdiag float

Probability of an off-diagonal element being non-zero.

0.5
positive_ratio float

Ratio of positive diagonal elements.

0.5
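
To make the roles of these parameters concrete, here is a minimal standard-library sketch of a sparse random matrix with separate diagonal and off-diagonal settings. It is not `generate_fmat` itself: the name `sketch_fmat` is hypothetical, entries are assumed to be normally distributed, every off-diagonal position is treated independently (no symmetry or triangular structure is assumed), and `positive_ratio` is not modelled.

```python
import random


def sketch_fmat(rng, n_mutations,
                mean_diag=0.1, sigma_diag=0.05,
                mean_offdiag=0.0, sigma_offdiag=1.0,
                p_diag=0.5, p_offdiag=0.5):
    """Return an n-by-n list-of-lists matrix with sparse normal entries."""
    F = [[0.0] * n_mutations for _ in range(n_mutations)]
    for i in range(n_mutations):
        for j in range(n_mutations):
            if i == j:
                p, mean, sigma = p_diag, mean_diag, sigma_diag
            else:
                p, mean, sigma = p_offdiag, mean_offdiag, sigma_offdiag
            # Each entry is non-zero with probability p.
            if rng.random() < p:
                F[i][j] = rng.gauss(mean, sigma)
    return F


F_sketch = sketch_fmat(random.Random(0), n_mutations=4, p_diag=1.0, p_offdiag=0.0)
for row in F_sketch:
    print(row)
```

With `p_diag=1.0` and `p_offdiag=0.0` the result is a diagonal matrix, which is a quick way to check that the two probabilities act on the intended entries.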

generate_trees(rng, n_mutations, N_trees, mu_vec, F_mat, common_beta=1.0, C_0=100000.0, C_seq=10000.0, C_sampling=1000000000.0, tau=0.001, t_max=100, rule='parallel', k_repeat=0, k_multiple=1, return_time=True, parallel=False, n_jobs=-1)

Generate a list of trees with the given number of mutations and the given mutation rate vector and fitness matrix.

Parameters:

Name Type Description Default
rng Generator

The random number generator.

required
n_mutations int

The number of mutations to be considered.

required
N_trees int

The number of trees to be generated.

required
mu_vec ndarray

The n-by-1 mutation rate vector.

required
F_mat ndarray

The n-by-n fitness matrix.

required
common_beta float

The common death rate.

1.0
C_0 int | float

The static wild-type population size.

100000.0
C_seq int

Number of cells to sequence.

10000.0
C_sampling int | float

The number of cells to sample.

1000000000.0
tau float

The step size of the tau-leaping algorithm.

0.001
t_max float

The maximum time to generate the tree.

100
rule str

The type of the tree generation. Currently, only "parallel" is supported.

'parallel'
k_repeat int

The maximum number of repeated mutations.

0
k_multiple int

The maximum number of multiple mutations.

1
return_time bool

Whether to return the sampling time or not.

True
parallel bool

Whether to use parallel processing or not.

False
n_jobs int

The number of jobs to run in parallel. If -1, then all available cores are used. Defaults to -1.

-1
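
The tau parameter is the step size of a tau-leaping simulation. The sketch below shows the general technique on a single birth-death population using only the standard library; it is not the tree generator in fitree. The function names, the single-population setting, and the Poisson sampler (Knuth's multiplication method, suitable only for small means) are choices made for this illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (fine for small means)."""
    if mean <= 0.0:
        return 0
    threshold = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def tau_leap_birth_death(rng, n0, alpha, beta, tau=0.001, t_max=1.0):
    """Advance a birth-death population in fixed steps of length tau.

    Within each step the per-cell rates are held constant, so the numbers
    of births and deaths are Poisson with mean rate * n * tau.
    """
    n = n0
    for _ in range(round(t_max / tau)):
        births = poisson(rng, alpha * n * tau)
        deaths = poisson(rng, beta * n * tau)
        n = max(n + births - deaths, 0)       # population cannot go negative
    return n


final_size = tau_leap_birth_death(random.Random(0), n0=100, alpha=1.2, beta=1.0)
print(final_size)
```

A smaller tau gives a closer approximation to the exact stochastic process at the cost of more steps.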

load_cohort_from_json(path)

Load a TumorTreeCohort object from a JSON file.

Parameters:

Name Type Description Default
path str

Path to the JSON file containing the TumorTreeCohort object.

required

load_vectorized_trees_npz(path)

Load a VectorizedTrees NamedTuple from an .npz file.

Parameters:

Name Type Description Default
path str

Path to the .npz file containing the VectorizedTrees object.

required

plot_fmat(F_mat, mutation_labels=None, to_sort=True, figsize=(8, 6))

This function plots the fitness matrix F.

Parameters:

Name Type Description Default
F_mat ndarray

Fitness matrix.

required
mutation_labels list

Mutation labels. Defaults to None.

None
to_sort bool

Whether to sort the rows and columns of the fitness matrix.

True
figsize tuple

Figure size. Defaults to (8, 6).

(8, 6)

plot_fmat_posterior(F_mat_posterior, true_F_mat=None, mutation_labels=None, figsize=(8, 7))

This function plots the posterior of the fitness matrix F.

Parameters:

Name Type Description Default
F_mat_posterior ndarray

Posterior of the fitness matrix.

required
true_F_mat ndarray

True fitness matrix. Defaults to None.

None
mutation_labels list

Mutation labels. Defaults to None.

None
figsize tuple

Figure size. Defaults to (8, 7).

(8, 7)

plot_tree(cohort, tree_id, filename=None)

This function plots a tree in the cohort.

Parameters:

Name Type Description Default
cohort TumorTreeCohort

A cohort of tumor mutation trees.

required
tree_id int

Tree ID.

required
filename str

Filename to save the plot. Defaults to None.

None

prior_fitree(trees, diag_mean=0.0, diag_sigma=0.1, offdiag_mean=0.0, offdiag_sigma=0.1, min_occurrences=0, augment_max_level=2)

Construct a prior model for the fitness matrix F.

Parameters:

Name Type Description Default
trees TumorTreeCohort

TumorTreeCohort object.

required
diag_mean float

Mean of the normal prior for the diagonal entries.

0.0
diag_sigma float

Standard deviation of the normal prior for the diagonal entries.

0.1
offdiag_mean float

Mean of the normal prior for the off-diagonal entries.

0.0
offdiag_sigma float

Standard deviation of the normal prior for the off-diagonal entries.

0.1
min_occurrences int

Minimum number of occurrences for a mutation

0
augment_max_level int

Maximum level of augmentation for the trees.

2

Returns:

Name Type Description
model Model

PyMC model for the fitness matrix F.
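
For intuition about what the four prior parameters control, the sketch below evaluates the log-density of independent normal priors on the entries of a small matrix. It uses only the standard library and does not build a PyMC model. The name `log_prior_fmat` is hypothetical, and treating every off-diagonal entry as an independent free parameter is an assumption; `min_occurrences` and `augment_max_level` are not modelled.

```python
import math


def normal_logpdf(x, mean, sigma):
    """Log-density of a normal distribution at x."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_prior_fmat(F, diag_mean=0.0, diag_sigma=0.1,
                   offdiag_mean=0.0, offdiag_sigma=0.1):
    """Sum independent normal log-priors over all entries of F."""
    total = 0.0
    for i, row in enumerate(F):
        for j, value in enumerate(row):
            if i == j:
                total += normal_logpdf(value, diag_mean, diag_sigma)
            else:
                total += normal_logpdf(value, offdiag_mean, offdiag_sigma)
    return total


at_mean = log_prior_fmat([[0.0, 0.0], [0.0, 0.0]])
away = log_prior_fmat([[0.3, 0.0], [0.0, 0.3]])
print(at_mean, away)
```

Entries further from the prior mean lower the log-density, and a smaller sigma makes that penalty steeper.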

save_cohort_to_json(cohort, path)

Save a TumorTreeCohort object to a JSON file.

Parameters:

Name Type Description Default
cohort TumorTreeCohort

TumorTreeCohort object to be saved.

required
path str

Path to the JSON file where the cohort will be saved.

required

save_vectorized_trees_npz(vectorized_trees, path)

Save VectorizedTrees NamedTuple to a compressed .npz file.

Parameters:

Name Type Description Default
vectorized_trees VectorizedTrees

VectorizedTrees NamedTuple to be saved.

required
path str

Path to the .npz file where the object will be saved.

required

update_params(trees, F_mat, zero_window=0.01)

This function updates the growth parameters of the tree based on the fitness matrix.

Parameters:

Name Type Description Default
trees VectorizedTrees

The tree object.

required
F_mat jnp.ndarray

The fitness matrix.

required
zero_window float

The zero window for numerical stability. Defaults to 1e-2.

0.01

wrap_trees(trees, augment_max_level=None)

This function takes a TumorTreeCohort object as input and returns a VectorizedTrees object.

Parameters:

Name Type Description Default
trees TumorTreeCohort

a cohort of tumor trees

required
augment_max_level int

maximum level for augmentation.

None

Returns:

Name Type Description
tuple tuple[VectorizedTrees, TumorTree]

tumor tree cohort in vectorized format and the union tree
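
One way to picture the union tree and the per-tree "observed" status is the sketch below. It is an interpretation, not the behaviour of `wrap_trees`: each tree is represented here as a set of node paths (tuples of mutation IDs from the root), the function name `union_of_trees` is hypothetical, and augmentation via `augment_max_level` is not modelled.

```python
def union_of_trees(trees):
    """Merge trees given as sets of node paths into one union node list.

    Returns the union nodes (root excluded, shallow nodes first) and, for
    each tree, a list of booleans marking which union nodes it contains.
    """
    union_nodes = sorted(set().union(*trees), key=lambda path: (len(path), path))
    observed = [[path in tree for path in union_nodes] for tree in trees]
    return union_nodes, observed


tree_a = {(0,), (0, 1), (2,)}      # mutation 0, then 1 below it; mutation 2
tree_b = {(0,), (2,), (2, 1)}      # mutation 1 arises below 2 instead
union_nodes, observed = union_of_trees([tree_a, tree_b])
print(union_nodes)
print(observed)
```

Storing every tree against the same node list is what allows per-node quantities to be held in fixed-shape arrays with one row per tree.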