lmpy.data_preparation.tree_encoder

Module containing a class for encoding a Phylogenetic tree into a matrix.

See:
Leibold, M.A., E.P. Economo and P.R. Peres-Neto. 2010. Metacommunity phylogenetics: separating the roles of environmental filters and historical biogeography. Ecology Letters 13: 1290-1299.

Module Contents

Classes

TreeEncoder

Base constructor for tree encoder.

exception lmpy.data_preparation.tree_encoder.EncodingException[source]

Bases: Exception


class lmpy.data_preparation.tree_encoder.TreeEncoder(tree, pam)[source]

Base constructor for tree encoder.

Parameters
  • tree (TreeWrapper) – A tree object to encode.

  • pam (Matrix) – A presence-absence matrix (PAM) object.

_build_p_branch_length_values(self, node)[source]

Recurse through the tree to get P matrix values for node / tips.

Parameters

node (dendropy.Node) – The current clade.

Returns

A tuple of the branch length dictionary, the sum of branch lengths, and the p-values dictionary.

Return type

tuple

_build_p_matrix_no_branch_lengths(self)[source]

Creates a P matrix when no branch lengths are present.

Note

For this method, we assume that there is a total weight of -1 to the left and +1 to the right of each node. As we move down the tree (towards the tips), we divide the proportion of each previously visited node by 2 and recurse with this new visited list. Once we reach a tip, we can return that list of proportions, because it holds the tip's value for each of its ancestors.

Example

    3
    +-- 2
    |   +-- 1
    |   |   +-- 0
    |   |   |   +-- A
    |   |   |   +-- B
    |   |   |
    |   |   +-- C
    |   |
    |   +-- D
    |
    +-- 4
        +-- E
        +-- F

Step 1: (Node 3) []
  • recurse left with [(3,-1)]

  • recurse right with [(3,1)]

Step 2: (Node 2) [(3,-1)]
  • recurse left with [(3,-.5),(2,-1)]

  • recurse right with [(3,-.5),(2,1)]

Step 3: (Node 1) [(3,-.5),(2,-1)]
  • recurse left with [(3,-.25),(2,-.5),(1,-1)]

  • recurse right with [(3,-.25),(2,-.5),(1,1)]

Step 4: (Node 0) [(3,-.25),(2,-.5),(1,-1)]
  • recurse left with [(3,-.125),(2,-.25),(1,-.5),(0,-1)]

  • recurse right with [(3,-.125),(2,-.25),(1,-.5),(0,1)]

Step 5: (Tip A) - Return [(3,-.125),(2,-.25),(1,-.5),(0,-1)]

Step 6: (Tip B) - Return [(3,-.125),(2,-.25),(1,-.5),(0,1)]

Step 7: (Tip C) - Return [(3,-.25),(2,-.5),(1,1)]

Step 8: (Tip D) - Return [(3,-.5),(2,1)]

Step 9: (Node 4) [(3,1)]

  • recurse left with [(3,.5),(4,-1)]

  • recurse right with [(3,.5),(4,1)]

Step 10: (Tip E) - Return [(3,.5),(4,-1)]

Step 11: (Tip F) - Return [(3,.5),(4,1)]

Creates matrix:

            0       1       2       3       4
    A    -1.0    -0.5   -0.25  -0.125     0.0
    B     1.0    -0.5   -0.25  -0.125     0.0
    C     0.0     1.0    -0.5   -0.25     0.0
    D     0.0     0.0     1.0    -0.5     0.0
    E     0.0     0.0     0.0     0.5    -1.0
    F     0.0     0.0     0.0     0.5     1.0

See:

Page 1293 of Leibold et al. (2010).

Returns

An encoded phylogeny matrix.

Return type

Matrix
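The recursion walked through in the example above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the lmpy implementation: the nested-tuple tree, `tip_proportions`, and `build_matrix` are hypothetical names introduced here, whereas the real method works on the TreeWrapper passed to the constructor.

```python
# Hypothetical tree encoding (not the lmpy TreeWrapper):
# an internal node is (node_id, left, right); a tip is a plain string label.
TREE = (3,
        (2,
         (1,
          (0, "A", "B"),
          "C"),
         "D"),
        (4, "E", "F"))


def tip_proportions(node, visited=None):
    """Return {tip_label: [(node_id, proportion), ...]} for the clade at node."""
    visited = visited or []
    if isinstance(node, str):
        # Reached a tip: the visited list already holds its value per ancestor.
        return {node: visited}
    node_id, left, right = node
    # Halve the proportion of every previously visited node at each hop.
    halved = [(nid, prop / 2) for nid, prop in visited]
    result = {}
    result.update(tip_proportions(left, halved + [(node_id, -1.0)]))   # left: -1
    result.update(tip_proportions(right, halved + [(node_id, 1.0)]))   # right: +1
    return result


def build_matrix(tree, node_ids):
    """Return {tip_label: row}, one column per node id, 0.0 for non-ancestors."""
    rows = {}
    for tip, pairs in tip_proportions(tree).items():
        lookup = dict(pairs)
        rows[tip] = [lookup.get(nid, 0.0) for nid in node_ids]
    return rows


P = build_matrix(TREE, node_ids=[0, 1, 2, 3, 4])
for tip in sorted(P):
    print(tip, P[tip])
```

With the example tree, the rows printed for tips A through F correspond to the rows of the matrix shown in the example.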

_build_p_matrix_tip_proportion_list(self, node, visited=None)[source]

Builds a list of tip proportions for the P matrix without branch lengths.

Parameters
  • node (dendropy.Node) – The current clade.

  • visited (None or list of tuple) – A list of (node path id, proportion) tuples.

Note

The proportion for each visited node is divided by two at each hop as we go towards the tips.

Returns

A list of tip proportions.

Return type

list

_build_p_matrix_with_branch_lengths(self)[source]

Creates a P matrix when branch lengths are present.

Note

For this method, we assume that there is a total weight of -1 to the left and +1 to the right of each node. As we move down the tree (towards the tips), we divide the proportion of each previously visited node by 2 and recurse with this new visited list. Once we reach a tip, we can return that list of proportions, because it holds the tip's value for each of its ancestors.

Example

    3
    +-- 2 (0.4)
    |   +-- 1 (0.15)
    |   |   +-- 0 (0.65)
    |   |   |   +-- A (0.2)
    |   |   |   +-- B (0.2)
    |   |   |
    |   |   +-- C (0.85)
    |   |
    |   +-- D (1.0)
    |
    +-- 4 (0.9)
        +-- E (0.5)
        +-- F (0.5)

Value for any cell (tip)(node) =
    (l1 + l2/2 + l3/3 + … + ln/n) / (sum of branch lengths in daughter clade)

ln / n -> The length of branch n divided by the number of tips that descend from that clade. The branches are those on the path from the tip up to the node.

    P(A)(2) = (0.2 + 0.65/2 + 0.15/3) / (0.2 + 0.2 + 0.65 + 0.85 + 0.15)
            = (0.2 + 0.325 + 0.05) / 2.05
            = 0.575 / 2.05
            = 0.280

    P(D)(3) = (1.0 + 0.4/4) / (0.2 + 0.2 + 0.65 + 0.85 + 0.15 + 1.0 + 0.4)
            = 1.1 / 3.45
            = 0.319

These are magnitudes. The sign follows the left (-1) / right (+1) convention from the note, so both cells are negative in the matrix.

Creates matrix:

              0        1        2        3        4
    A    -1.000   -0.500   -0.280   -0.196    0.000
    B     1.000   -0.500   -0.280   -0.196    0.000
    C     0.000    1.000   -0.439   -0.290    0.000
    D     0.000    0.000    1.000   -0.319    0.000
    E     0.000    0.000    0.000    0.500   -1.000
    F     0.000    0.000    0.000    0.500    1.000

See:

Supplemental material of Leibold et al. (2010).

Returns

An encoded phylogeny matrix.

Return type

Matrix
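The cell formula from the example can be checked with a short, self-contained Python sketch. It is an illustration under stated assumptions rather than the lmpy code: the tuple-based tree, `clade_summary`, and `build_matrix` are hypothetical names, and the sketch assumes every branch length below the root is positive (a zero-length clade would divide by zero).

```python
# Hypothetical tree encoding (not the lmpy TreeWrapper):
# a tip is (label, branch_length);
# an internal node is (node_id, branch_length, left, right).
TREE = (3, 0.0,
        (2, 0.4,
         (1, 0.15,
          (0, 0.65, ("A", 0.2), ("B", 0.2)),
          ("C", 0.85)),
         ("D", 1.0)),
        (4, 0.9, ("E", 0.5), ("F", 0.5)))


def clade_summary(clade, p_values):
    """Fill p_values[(tip, node_id)] and return (total_length, tip_weights).

    total_length is the sum of branch lengths in the clade, including the
    clade's own branch. tip_weights maps each tip to l1 + l2/2 + ... + ln/n
    accumulated from the tip up to and including the clade's own branch.
    """
    if len(clade) == 2:                      # tip
        label, length = clade
        return length, {label: length}
    node_id, length, left, right = clade
    total = length
    tip_weights = {}
    for sign, child in ((-1.0, left), (1.0, right)):   # left: -1, right: +1
        child_total, child_weights = clade_summary(child, p_values)
        total += child_total
        for tip, weight in child_weights.items():
            # Cell value: weighted path length over daughter clade length.
            p_values[(tip, node_id)] = sign * weight / child_total
            tip_weights[tip] = weight
    n_tips = len(tip_weights)
    # This clade's own branch is shared equally by the tips below it.
    return total, {tip: w + length / n_tips for tip, w in tip_weights.items()}


def build_matrix(tree, tips, node_ids):
    """Return {tip: row} with one column per node id, 0.0 for non-ancestors."""
    p_values = {}
    clade_summary(tree, p_values)
    return {tip: [p_values.get((tip, nid), 0.0) for nid in node_ids]
            for tip in tips}


P = build_matrix(TREE, tips="ABCDEF", node_ids=[0, 1, 2, 3, 4])
for tip, row in P.items():
    print(tip, " ".join(f"{value:7.3f}" for value in row))
```

Accumulating the weighted path length on the way back up the recursion means each branch is visited once, which is one reasonable way to organize the computation; the library's own recursion may differ.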

encode_phylogeny(self)[source]

Encode the phylogenetic tree into a matrix.

Note

The result is P in the literature: a tip (row) by internal node (column) matrix that needs to match the provided PAM.

Raises

EncodingException – Raised if the PAM and tree do not match.

Returns

An encoding of the phylogenetic tree.

Return type

Matrix

classmethod from_file(cls, tree_file_name, pam_file_name)[source]

Creates an instance of the TreeEncoder class from a tree file and a PAM file.

Parameters
  • tree_file_name (str) – The location of the tree.

  • pam_file_name (str) – The location of the PAM.

Returns

A new tree encoder instance.

Return type

TreeEncoder

validate(self)[source]

Validates the tree / PAM combination.

Returns

Boolean indicating whether the tree / PAM combination is valid.

Return type

bool