Distances

pyggdrasil.distances.TreeDistance

Bases: TreeSimilarityMeasure

Interface for distance functions between the trees.

The hyperparameters of the distance should be set at class initialization, as with models in scikit-learn.

Note

The distances between trees should be treated as tree dissimilarity measures, rather than mathematical metrics. For example, the triangle inequality does not need to hold.

triangle_inequality()

Returns True if the triangle inequality

$$d(t_1, t_3) \le d(t_1, t_2) + d(t_2, t_3)$$

is known to hold for this distance.

Note

If it is not known whether the triangle inequality holds for a metric, False should be returned.
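As an illustration of the interface described above, here is a minimal sketch of a distance class. `ToyNode`, `count_nodes`, and `SizeDifferenceDistance` are hypothetical names introduced for this example and are not part of pyggdrasil; the sketch only mirrors the documented method names `calculate`, `is_symmetric`, and `triangle_inequality`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a tree node with an integer label."""

    name: int
    children: list["ToyNode"] = field(default_factory=list)


def count_nodes(root: ToyNode) -> int:
    """Return the number of nodes in the tree rooted at ``root``."""
    return 1 + sum(count_nodes(child) for child in root.children)


class SizeDifferenceDistance:
    """Toy distance: scaled absolute difference between node counts."""

    def __init__(self, scale: float = 1.0) -> None:
        # Hyperparameters are fixed once, at initialization.
        if scale < 0:
            raise ValueError("scale must be non-negative")
        self.scale = scale

    def calculate(self, tree1: ToyNode, tree2: ToyNode) -> float:
        return self.scale * abs(count_nodes(tree1) - count_nodes(tree2))

    def is_symmetric(self) -> bool:
        # |a - b| == |b - a|, so symmetry is known to hold.
        return True

    def triangle_inequality(self) -> bool:
        # |a - c| <= |a - b| + |b - c| for real numbers and scale >= 0.
        return True


chain = ToyNode(0, [ToyNode(1, [ToyNode(2)])])           # 3 nodes
star = ToyNode(0, [ToyNode(1), ToyNode(2), ToyNode(3)])  # 4 nodes
single = ToyNode(0)                                      # 1 node

distance = SizeDifferenceDistance(scale=0.5)
d_chain_star = distance.calculate(chain, star)
d_star_single = distance.calculate(star, single)
d_chain_single = distance.calculate(chain, single)
```

For a measure where either property cannot be proven, the conservative choice described in the notes is to return False from the corresponding method.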

pyggdrasil.distances.TreeSimilarity

Bases: TreeSimilarityMeasure

Interface for similarity functions between the trees.

The hyperparameters should be set at class initialization, as with models in scikit-learn.

pyggdrasil.distances.TreeSimilarityMeasure

Bases: Protocol

Interface for similarity or distance functions between the trees.

The hyperparameters should be set at class initialization, as with models in scikit-learn.

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | _IntegerTreeRoot | Root of the first tree. The nodes should be labeled with integers. | required |
| tree2 | _IntegerTreeRoot | Root of the second tree. The nodes should be labeled with integers. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.
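Because the interface is a `Protocol`, any class with matching methods satisfies it without inheriting from it. The sketch below shows this structural typing with a hypothetical protocol `SimilarityMeasureSketch` and a hypothetical toy measure `SharedLabelFraction`; trees are represented as child-to-parent dictionaries purely for illustration, which is an assumption and not the representation pyggdrasil uses.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SimilarityMeasureSketch(Protocol):
    """Hypothetical mirror of the documented interface."""

    def calculate(self, tree1, tree2) -> float: ...

    def is_symmetric(self) -> bool: ...


class SharedLabelFraction:
    """Toy similarity: Jaccard index of the two trees' node labels."""

    def calculate(self, tree1: dict, tree2: dict) -> float:
        labels1, labels2 = set(tree1), set(tree2)
        return len(labels1 & labels2) / len(labels1 | labels2)

    def is_symmetric(self) -> bool:
        # Set intersection and union are commutative, so this is known.
        return True


# Child -> parent maps; the root maps to None.
tree_a = {0: None, 1: 0, 2: 1}
tree_b = {0: None, 1: 0, 3: 0}

measure = SharedLabelFraction()
conforms = isinstance(measure, SimilarityMeasureSketch)
score_ab = measure.calculate(tree_a, tree_b)
score_ba = measure.calculate(tree_b, tree_a)
```

Note that `isinstance` with a runtime-checkable protocol only verifies that the methods exist, not that their signatures or return types match.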

pyggdrasil.distances.calculate_distance_matrix(trees1, trees2, /, *, distance)

Calculates a cross-distance matrix d[i, j] = distance(trees1[i], trees2[j])

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| trees1 | Sequence[_IntegerTreeRoot] | sequence of trees in one set, length m | required |
| trees2 | Sequence[_IntegerTreeRoot] | sequence of trees in the second set, length n | required |
| distance | TreeSimilarityMeasure | distance or similarity function | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | distance matrix, shape (m, n) |
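The cross-distance computation can be sketched as follows. This is not the library's implementation: `cross_distance_matrix` and `LabelCountDifference` are hypothetical names, the result is a nested list rather than an `ndarray` to keep the example dependency-free, and trees are again assumed to be child-to-parent dictionaries. The signature copies the documented pattern of positional-only tree sequences and a keyword-only `distance`.

```python
from typing import Sequence


class LabelCountDifference:
    """Toy measure: absolute difference in the number of nodes."""

    def calculate(self, tree1: dict, tree2: dict) -> float:
        return float(abs(len(tree1) - len(tree2)))

    def is_symmetric(self) -> bool:
        return True


def cross_distance_matrix(
    trees1: Sequence, trees2: Sequence, /, *, distance
) -> list[list[float]]:
    """Return d with d[i][j] = distance.calculate(trees1[i], trees2[j])."""
    return [
        [distance.calculate(tree1, tree2) for tree2 in trees2]
        for tree1 in trees1
    ]


# m = 2 trees in the first set, n = 3 trees in the second set.
trees1 = [{0: None}, {0: None, 1: 0}]
trees2 = [{0: None}, {0: None, 1: 0}, {0: None, 1: 0, 2: 1}]

matrix = cross_distance_matrix(trees1, trees2, distance=LabelCountDifference())
```

Every pair is evaluated, so the cost is m times n calls to `calculate`, even for a symmetric measure.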

pyggdrasil.distances.AncestorDescendantSimilarity

Bases: TreeSimilarity

Ancestor-descendant accuracy.

Considers only ancestor-descendant relationships between mutations, i.e., excludes the root node. For an implementation that also considers the root, see AncestorDescendantSimilarityInclRoot.

Raises:

| Type | Description |
| --- | --- |
| DivisionByZeroError | If the first tree is a star tree. Because the root is not considered, a star tree has no ancestor-descendant pairs, so the score is undefined. The fork of scPhylo has not been updated yet. |
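To show why a star tree fails, here is a minimal sketch of the ancestor-descendant idea. It is an illustration under stated assumptions, not the scphylo implementation: trees are child-to-parent dictionaries, the score is taken to be the fraction of the first tree's ancestor-descendant pairs that also appear in the second tree, and the function names are hypothetical. In plain Python the empty-pair case surfaces as `ZeroDivisionError`.

```python
def ancestor_descendant_pairs(parent: dict) -> set:
    """Collect (ancestor, descendant) pairs, skipping the root as ancestor."""
    pairs = set()
    for node in parent:
        ancestor = parent[node]
        while ancestor is not None:
            if parent[ancestor] is not None:  # the root has no parent
                pairs.add((ancestor, node))
            ancestor = parent[ancestor]
    return pairs


def ad_accuracy(tree1: dict, tree2: dict) -> float:
    """Fraction of tree1's ancestor-descendant pairs preserved in tree2."""
    pairs1 = ancestor_descendant_pairs(tree1)
    pairs2 = ancestor_descendant_pairs(tree2)
    return len(pairs1 & pairs2) / len(pairs1)


# Child -> parent maps; node 0 is the root.
chain = {0: None, 1: 0, 2: 1, 3: 2}     # pairs: (1,2), (1,3), (2,3)
branched = {0: None, 1: 0, 2: 1, 3: 1}  # pairs: (1,2), (1,3)
star = {0: None, 1: 0, 2: 0, 3: 0}      # no pairs once the root is excluded

score = ad_accuracy(chain, branched)  # 2 of the 3 pairs in chain survive

try:
    ad_accuracy(star, chain)
    star_failed = False
except ZeroDivisionError:
    star_failed = True
```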

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2 using scphylo.tl.ad.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | Node | Root of the first tree. The nodes should be labeled with integers. | required |
| tree2 | Node | Root of the second tree. The nodes should be labeled with integers. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.

pyggdrasil.distances.MP3Similarity

Bases: TreeSimilarity

MP3 similarity.

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2 using scphylo.tl.mp3.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | Node | Root of the first tree. The nodes should be labeled with integers. | required |
| tree2 | Node | Root of the second tree. The nodes should be labeled with integers. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.

pyggdrasil.distances.AncestorDescendantSimilarityInclRoot

Bases: TreeSimilarity

Ancestor-descendant similarity, adapted from @laurabquintas / Laura Quintas.

Counts the root as a mutation, i.e., considers ancestor-descendant pairs between the root and the other nodes, effectively checking whether mutations exist in both trees. May lead to a higher similarity score than AncestorDescendantSimilarity.
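The effect of counting the root can be seen in a small sketch. As before, this is an illustration and not the library's or scphylo's code: trees are assumed to be child-to-parent dictionaries, the score is assumed to be the fraction of the first tree's pairs found in the second, and the `include_root` flag and function names are hypothetical.

```python
def ancestor_descendant_pairs(parent: dict, include_root: bool) -> set:
    """Collect (ancestor, descendant) pairs, optionally with root pairs."""
    pairs = set()
    for node in parent:
        ancestor = parent[node]
        while ancestor is not None:
            is_root = parent[ancestor] is None
            if include_root or not is_root:
                pairs.add((ancestor, node))
            ancestor = parent[ancestor]
    return pairs


def ad_accuracy(tree1: dict, tree2: dict, include_root: bool) -> float:
    """Fraction of tree1's pairs that are also present in tree2."""
    pairs1 = ancestor_descendant_pairs(tree1, include_root)
    pairs2 = ancestor_descendant_pairs(tree2, include_root)
    return len(pairs1 & pairs2) / len(pairs1)


# Child -> parent maps; node 0 is the root.
chain = {0: None, 1: 0, 2: 1, 3: 2}
branched = {0: None, 1: 0, 2: 1, 3: 1}
star = {0: None, 1: 0, 2: 0, 3: 0}

without_root = ad_accuracy(chain, branched, include_root=False)  # 2 of 3 pairs
with_root = ad_accuracy(chain, branched, include_root=True)      # 5 of 6 pairs

# With the root counted, a star tree has pairs (0,1), (0,2), (0,3),
# so the division is well defined.
star_score = ad_accuracy(star, chain, include_root=True)
```

Every non-root node contributes a pair with the root whenever it is present in both trees, which is why the score with the root included tends to be higher in this sketch.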

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2 using scphylo.tl.ad.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | Node | Root of the first tree. The nodes should be labeled with integers. | required |
| tree2 | Node | Root of the second tree. The nodes should be labeled with integers. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.

pyggdrasil.distances.DifferentLineageSimilarity

Bases: TreeSimilarity

Different-Lineage similarity.

For each pair of mutations that are in a different-lineage relation in the ground-truth tree, we check whether the same relation is preserved in the inferred tree.

Similarity out of one.
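The check described above can be sketched in a few lines, which also shows why the measure depends on which tree is treated as the ground truth. This is an illustration under assumptions, not the scphylo implementation: trees are child-to-parent dictionaries, the root is excluded from the set of mutations, and all function names are hypothetical.

```python
from itertools import combinations


def ancestors(parent: dict, node: int) -> set:
    """Return all proper ancestors of ``node``, including the root."""
    result = set()
    current = parent[node]
    while current is not None:
        result.add(current)
        current = parent[current]
    return result


def different_lineage_pairs(parent: dict) -> set:
    """Unordered mutation pairs where neither is an ancestor of the other."""
    mutations = [node for node in parent if parent[node] is not None]
    pairs = set()
    for a, b in combinations(mutations, 2):
        if a not in ancestors(parent, b) and b not in ancestors(parent, a):
            pairs.add(frozenset((a, b)))
    return pairs


def dl_accuracy(truth: dict, inferred: dict) -> float:
    """Fraction of the truth's different-lineage pairs kept in ``inferred``."""
    truth_pairs = different_lineage_pairs(truth)
    return len(truth_pairs & different_lineage_pairs(inferred)) / len(truth_pairs)


# Child -> parent maps; node 0 is the root.
star = {0: None, 1: 0, 2: 0, 3: 0}    # pairs: {1,2}, {1,3}, {2,3}
nested = {0: None, 1: 0, 2: 0, 3: 2}  # pairs: {1,2}, {1,3}

star_as_truth = dl_accuracy(star, nested)    # 2 of 3 pairs preserved
nested_as_truth = dl_accuracy(nested, star)  # 2 of 2 pairs preserved
```

Swapping the arguments changes the denominator, so the two scores differ here.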

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2 using scphylo.tl.dl.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | Node | Root of the first tree. The nodes should be labeled with integers. Considered the ground truth tree. | required |
| tree2 | Node | Root of the second tree. The nodes should be labeled with integers. Considered the inferred tree to be compared to the ground truth. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.

Known to be asymmetric.

pyggdrasil.distances.MLTDSimilarity

Bases: TreeSimilarity

Multi-labeled tree dissimilarity measure (MLTD), normalized to [0,1].

Similarity out of one.

Raises: May occasionally end in a segmentation fault. The cause is unknown and lies in scphylo.

calculate(tree1, tree2)

Calculates similarity between tree1 and tree2 using scphylo.tl.mltd.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tree1 | Node | Root of the first tree. The nodes should be labeled with integers. | required |
| tree2 | Node | Root of the second tree. The nodes should be labeled with integers. | required |

Returns:

| Type | Description |
| --- | --- |
| float | similarity from tree1 to tree2 |

is_symmetric()

Returns True if the similarity function is symmetric, i.e., $s(t_1, t_2) = s(t_2, t_1)$ for all pairs of trees.

Note

If it is not known whether the similarity function is symmetric, False should be returned.

Unknown, but probably not symmetric.