lmpy.tree

Module for the Lifemapper TreeWrapper class.

Module Contents

Classes

PhyloTreeKeys

Keys for phylogenetic trees.

TreeWrapper

The constructor can optionally construct a |Tree| object by cloning another |Tree| object or by reading from a data source.

exception lmpy.tree.LmTreeException[source]

Bases: Exception

Initialize self. See help(type(self)) for accurate signature.

class lmpy.tree.PhyloTreeKeys[source]

Keys for phylogenetic trees.

MTX_IDX[source]

The tree attribute indicating the matrix index position for a node.

Type

str

SQUID[source]

The tree attribute indicating a hashed identifier for the taxon.

Type

str

MTX_IDX = 'mx'[source]
SQUID = 'squid'[source]
class lmpy.tree.TreeWrapper(*args, **kwargs)[source]

Bases: dendropy.Tree

The constructor can optionally construct a |Tree| object by cloning another |Tree| object passed as the first positional argument, or from a data source if the stream and schema keyword arguments are passed with a file-like object and a schema-specification string, respectively.

Parameters
  • *args (positional argument, optional) – If given, should be exactly one |Tree| object. The new |Tree| will then be a structural clone of this argument.

  • **kwargs (keyword arguments, optional) –

    The following optional keyword arguments are recognized and handled by this constructor:

    label

    The label or description of the new |Tree| object.

    taxon_namespace

    Specifies the |TaxonNamespace| object that the new |Tree| object will reference.

Examples

Tree objects can be instantiated in the following ways:

#! /usr/bin/env python

import copy
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

import dendropy
from dendropy import Node, Tree, TaxonNamespace

# empty tree
t1 = Tree()

# Tree objects can be instantiated from an external data source
# using the 'get()' factory class method

# From a file-like object
t2 = Tree.get(file=open('treefile.tre', 'r'),
                schema="newick",
                tree_offset=0)

# From a path
t3 = Tree.get(path='sometrees.nexus',
        schema="nexus",
        collection_offset=2,
        tree_offset=1)

# From a string
s = "((A,B),(C,D));((A,C),(B,D));"
# tree will be '((A,B),(C,D))'
t4 = Tree.get(data=s,
        schema="newick")
# tree will be '((A,C),(B,D))'
t5 = Tree.get(data=s,
        schema="newick",
        tree_offset=1)
# passing keywords to underlying tree parser
t7 = dendropy.Tree.get(
        data="((A,B),(C,D));",
        schema="newick",
        taxon_namespace=t3.taxon_namespace,
        suppress_internal_node_taxa=False,
        preserve_underscores=True)

# Tree objects can be written out using the 'write()' method.
t1.write(file=open('treefile.tre', 'w'),
        schema="newick")
t1.write(path='treefile.nex',
        schema="nexus")

# Or returned as a string using the 'as_string()' method.
s = t1.as_string("nexml")

# tree structure deep-copied from another tree
t8 = dendropy.Tree(t7)
assert t8 is not t7                             # Trees are distinct
assert t8.symmetric_difference(t7) == 0         # and structure is identical
assert t8.taxon_namespace is t7.taxon_namespace             # BUT taxa are not cloned.
nds3 = [nd for nd in t7.postorder_node_iter()]  # Nodes in the two trees
nds4 = [nd for nd in t8.postorder_node_iter()]  # are distinct objects,
for i, n in enumerate(nds3):                    # and can be manipulated
    assert nds3[i] is not nds4[i]               # independently.
egs3 = [eg for eg in t7.postorder_edge_iter()]  # Edges in the two trees
egs4 = [eg for eg in t8.postorder_edge_iter()]  # are also distinct objects,
for i, e in enumerate(egs3):                    # and can also be manipulated
    assert egs3[i] is not egs4[i]               # independently.
lves7 = t7.leaf_nodes()                         # Leaf nodes in the two trees
lves8 = t8.leaf_nodes()                         # are also distinct objects,
for i, lf in enumerate(lves7):                  # but order is the same,
    assert lves7[i] is not lves8[i]             # and associated Taxon objects
    assert lves7[i].taxon is lves8[i].taxon     # are the same.

# To create a deep copy of a tree with a different taxon namespace,
# use 'copy.deepcopy()'
t9 = copy.deepcopy(t7)

# Or explicitly pass in a new TaxonNamespace instance
taxa = TaxonNamespace()
t9 = dendropy.Tree(t7, taxon_namespace=taxa)
assert t9 is not t7                             # As above, the trees are distinct
assert t9.symmetric_difference(t7) == 0         # and the structures are identical,
assert t9.taxon_namespace is not t7.taxon_namespace         # but this time, the taxa *are* different
assert t9.taxon_namespace is taxa                     # as the given TaxonNamespace is used instead.
lves7 = t7.leaf_nodes()                         # Leaf nodes (and, for that matter other nodes
lves9 = t9.leaf_nodes()                         # as well as edges) are also distinct objects
for i, lf in enumerate(lves7):                  # and the order is the same, as above,
    assert lves7[i] is not lves9[i]             # but this time the associated Taxon
    assert lves7[i].taxon is not lves9[i].taxon # objects are distinct though the taxon
    assert lves7[i].taxon.label == lves9[i].taxon.label # labels are the same.

# to 'switch out' the TaxonNamespace of a tree, replace the reference and
# reindex the taxa:
t11 = Tree.get(data='((A,B),(C,D));', schema='newick')
taxa = TaxonNamespace()
t11.taxon_namespace = taxa
t11.reindex_subcomponent_taxa()

# You can also explicitly pass in a seed node:
seed = Node(label="root")
t12 = Tree(seed_node=seed)
assert t12.seed_node is seed
_annotate_node(self, node, annotation_attribute, annotation_value, update=False)[source]

Annotates a node with the given value.

Parameters
  • node (Node) – A node to add an annotation to.

  • annotation_attribute (str) – The annotation attribute to add. If None or ‘label’, update the node label.

  • annotation_value (object) – The value of the annotation.

  • update (bool, optional) – If True, update existing attribute. Defaults to False.

_annotation_method(self, label_attribute)[source]

Use the label attribute as the node label.

Parameters

label_attribute (str) – The annotation to use as the label for the nodes in the tree.

Returns

A method for retrieving the label for a taxon.

Return type

Method

_get_label_method(self, label_attribute)[source]

Gets the function to be used for retrieving labels.

Parameters

label_attribute (str) – An annotation name, ‘label’, or None used to determine which method to use to retrieve the label of a node.

Returns

Function for labeling nodes.

static _label_method(node)[source]

Use the label of the node or taxon for the label.

Parameters

node (Node) – The node to get the label for.

Returns

The label of the node or of the node’s taxon, if either has one; None if neither the node nor its taxon has a label.

Return type

str

_label_tree_nodes(self, node, i, prefix=None, overwrite=False)[source]

Private function to do the work when labeling nodes.

Parameters
  • node (Node) – A node to label.

  • i (int) – A count of the number of previously labeled nodes.

  • prefix (str, optional) – A prefix to use when labeling nodes resulting in labels like ‘prefix_0’. Defaults to None and no prefix.

  • overwrite (bool, optional) – Should node labels be overwritten. Defaults to False.

Returns

The number of nodes already labeled in the tree.

Return type

int

Note

  • Recursive.

add_node_labels(self, prefix=None, overwrite=False)[source]

Add labels to the nodes in the tree.

By default, only unlabeled nodes are labeled; set overwrite to True to replace existing labels.

Parameters
  • prefix (str, optional) – If provided, prefix the node labels with this string.

  • overwrite (bool, optional) – Indicates whether existing node labels should be overwritten or if they should be maintained. Defaults to False.

Note

  • This labels nodes the way that R does.
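The recursive labeling described for _label_tree_nodes and add_node_labels can be sketched roughly as follows. SimpleNode and label_nodes are hypothetical stand-ins, not lmpy or dendropy code, and the pre-order traversal and the choice to label only internal nodes are assumptions; the sketch does not try to reproduce R's exact labeling order.

```python
class SimpleNode:
    """Hypothetical stand-in for a tree node (not a dendropy Node)."""

    def __init__(self, label=None, children=None):
        self.label = label
        self.children = children or []


def label_nodes(node, i=0, prefix=None, overwrite=False):
    """Label internal nodes recursively and return the updated count."""
    if node.children:  # assumption: tips keep their own labels
        if node.label is None or overwrite:
            node.label = str(i) if prefix is None else '{}{}'.format(prefix, i)
            i += 1
        for child in node.children:
            # Thread the running count through every recursive call
            i = label_nodes(child, i, prefix=prefix, overwrite=overwrite)
    return i


root = SimpleNode(children=[
    SimpleNode(children=[SimpleNode('A'), SimpleNode('B')]),
    SimpleNode(label='keep', children=[SimpleNode('C'), SimpleNode('D')]),
])
count = label_nodes(root, prefix='nd_')
```

With overwrite left at False, the node already labeled 'keep' is left alone and does not consume a number.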

annotate_tree(self, annotation_dict, annotation_attribute=None, label_attribute=None, update=False)[source]

Annotates tree tips and nodes.

Parameters
  • annotation_dict (dict) – A dictionary where the keys correspond to the node labels and each value is either a single value or a dictionary mapping annotation names to annotation values.

  • annotation_attribute (str or None, optional) – Only used if annotation_dict contains single values, in which case this is the name of the annotation added for each node. Using None or ‘label’ will change the label of the node. Defaults to None.

  • label_attribute (str, optional) – Use the value of this annotation as the label for the node. Setting the value to ‘label’ or leaving as None will use the label of the node. Defaults to None.

  • update (bool, optional) – If True, update any existing annotations with the annotations provided. Defaults to False.
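The two accepted shapes of annotation_dict can be illustrated with a small sketch. It works on plain dictionaries rather than real tree nodes; the annotate function and the node layout are hypothetical, and the assumption that update=False leaves existing annotations untouched is inferred from the parameter description, not taken from the lmpy source.

```python
def annotate(nodes, annotation_dict, annotation_attribute=None, update=False):
    """Apply annotations to plain-dict nodes matched by their 'label'."""
    for node in nodes:
        if node['label'] not in annotation_dict:
            continue
        value = annotation_dict[node['label']]
        # Normalize the single-value form to the {name: value} form
        if not isinstance(value, dict):
            value = {annotation_attribute or 'label': value}
        for name, val in value.items():
            if name == 'label':
                node['label'] = val
            elif update or name not in node['annotations']:
                node['annotations'][name] = val


tips = [
    {'label': 'A', 'annotations': {}},
    {'label': 'B', 'annotations': {'mx': 9}},
]
# Single values: annotation_attribute names the annotation
annotate(tips, {'A': 0, 'B': 1}, annotation_attribute='mx')
# Nested dictionaries: each inner key is an annotation name
annotate(tips, {'A': {'squid': 'abc123'}})
```

Here tip B keeps its original 'mx' value because update is False.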

annotate_tree_tips(self, attribute_name, annotation_pairs, label_attribute='label', update=False)[source]

Annotates the tips of the tree.

Deprecated: Use annotate_tree instead.

Parameters
  • attribute_name (str) – The name of the annotation attribute to add.

  • annotation_pairs (dict) – A dictionary of label keys with annotation values.

  • label_attribute (str, optional) – If this is provided, use this annotation attribute as the key instead of the label. Defaults to ‘label’.

  • update (bool, optional) – Indicates if existing annotations should be updated. Defaults to False.

classmethod from_base_tree(cls, tree)[source]

Creates a TreeWrapper object from a base dendropy.Tree.

Parameters

tree (Tree) – A base dendropy tree object to wrap into a TreeWrapper.

Returns

The newly wrapped tree.

Return type

TreeWrapper

classmethod from_filename(cls, filename)[source]

Creates a TreeWrapper object by loading a file.

Parameters

filename (str) – A file path to a tree file that should be loaded.

Returns

The newly loaded tree.

Return type

TreeWrapper

Raises

IOError – Raised if the tree file cannot be loaded based on the file extension.
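Choosing a tree schema from the file extension, as from_filename is described as doing, might look something like the sketch below. The extension-to-schema mapping and the schema_for_filename helper are hypothetical; the extensions that lmpy actually recognizes may differ.

```python
import os

# Hypothetical mapping; not taken from the lmpy source
SCHEMA_BY_EXTENSION = {'.nex': 'nexus', '.tre': 'newick', '.xml': 'nexml'}


def schema_for_filename(filename):
    """Return a schema name for the file, or raise IOError if unknown."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        return SCHEMA_BY_EXTENSION[ext]
    except KeyError:
        raise IOError(
            'Cannot determine tree schema for extension {!r}'.format(ext))
```

A caller would then pass the returned schema name along with the path to the tree loader.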

get_annotations(self, annotation_attribute)[source]

Gets a list of (label, annotation) pairs.

Parameters

annotation_attribute (str) – The annotation attribute to retrieve.

Returns

A list of annotations.

Return type

list

get_distance_matrix(self, label_attribute='label', ordered_labels=None)[source]

Gets a Matrix object of phylogenetic distances.

Get a Matrix object of phylogenetic distances between tips using a lower memory footprint.

Parameters
  • label_attribute (str, optional) – The attribute of the tips to use as labels for the matrix. Defaults to ‘label’.

  • ordered_labels (list of str, optional) – If provided, use this order of labels.

Returns

A distance matrix from each tip to each of the other tips in the tree.

Return type

Matrix
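As an illustration of what a tip-to-tip phylogenetic distance matrix contains, the sketch below computes path lengths on a toy tree stored as a child-to-parent map. The tree, the helper names, and the plain nested-list result are all hypothetical; this is not how lmpy builds its Matrix object.

```python
# Hypothetical toy tree: child -> (parent, branch length); 'root' has no entry
PARENTS = {
    'A': ('n1', 1.0), 'B': ('n1', 1.0),
    'C': ('n2', 0.5), 'D': ('n2', 1.5),
    'n1': ('root', 2.0), 'n2': ('root', 1.0),
}


def path_to_root(node):
    """Map node and each of its ancestors to their distance from node."""
    dists = {node: 0.0}
    total = 0.0
    while node in PARENTS:
        node, length = PARENTS[node]
        total += length
        dists[node] = total
    return dists


def tip_distance(tip_a, tip_b):
    """Path length between two tips through their common ancestor."""
    up_a = path_to_root(tip_a)
    up_b = path_to_root(tip_b)
    # The most recent common ancestor minimizes the summed distance
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


labels = ['A', 'B', 'C', 'D']
matrix = [[tip_distance(a, b) for b in labels] for a in labels]
```

The ordered_labels parameter corresponds to the labels list here: it fixes the row and column order of the result.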

get_distance_matrix_dendropy(self, label_attribute='label', ordered_labels=None)[source]

Gets a Matrix object of phylogenetic distances between tips.

Gets the distance matrix between each tip using Dendropy.

Parameters
  • label_attribute (str, optional) – The attribute of the tips to use as labels for the matrix. Defaults to ‘label’.

  • ordered_labels (list of str, optional) – If provided, use this order of labels.

Note

This method may require a significant amount of memory for large trees.

The get_distance_matrix method has a smaller memory footprint and works at nearly the same speed.

Returns

A distance matrix from each tip to each of the other tips in the tree.

Return type

Matrix

get_labels(self)[source]

Gets tip labels for a clade.

Note

Bottom-up order.

Returns

A list of taxon labels for the taxa in the tree.

get_variance_covariance_matrix(self, label_attribute='label', ordered_labels=None)[source]

Gets a Matrix object of variance / co-variance for tips in tree.

Parameters
  • label_attribute (str, optional) – The attribute of the tips to use as labels for the matrix. Defaults to ‘label’.

  • ordered_labels (list of str, optional) – If provided, use this order of labels.

Returns

A matrix of variance / co-variance values for the tips in the tree.

Return type

Matrix

Raises

LmTreeException – If the tree does not have branch lengths.
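Under the common Brownian-motion convention, the covariance between two tips is the branch length they share from the root down to their most recent common ancestor, and the variance of a tip is its full root-to-tip distance. The sketch below applies that convention to a hypothetical toy tree; whether lmpy computes its matrix in exactly this way is an assumption, and ValueError stands in for LmTreeException.

```python
# Hypothetical toy tree: child -> (parent, branch length); 'root' has no entry
PARENTS = {
    'A': ('n1', 1.0), 'B': ('n1', 1.0),
    'C': ('n2', 0.5), 'D': ('n2', 1.5),
    'n1': ('root', 2.0), 'n2': ('root', 1.0),
}


def depth(node):
    """Distance from the root to node."""
    total = 0.0
    while node in PARENTS:
        node, length = PARENTS[node]
        if length is None:
            # Stand-in for LmTreeException when branch lengths are missing
            raise ValueError('tree is missing a branch length')
        total += length
    return total


def ancestors(node):
    """Return node followed by all of its ancestors up to the root."""
    found = [node]
    while node in PARENTS:
        node = PARENTS[node][0]
        found.append(node)
    return found


def shared_history(tip_a, tip_b):
    """Root-to-MRCA distance, i.e. the deepest common ancestor's depth."""
    common = set(ancestors(tip_a)) & set(ancestors(tip_b))
    return max(depth(n) for n in common)


labels = ['A', 'B', 'C', 'D']
vcv = [[shared_history(a, b) for b in labels] for a in labels]
```

Tips on opposite sides of the root share no history, so their covariance is zero.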

has_branch_lengths(self)[source]

Returns a boolean indicating if the entire tree has branch lengths.

Returns

An indication if the tree has branch lengths.

Return type

bool

has_polytomies(self)[source]

Returns boolean indicating if the tree has polytomies.

Returns

An indication if the tree has any polytomies.

Return type

bool

is_binary(self)[source]

Checks if the tree is binary.

Returns

An indication if the tree is binary.

Return type

bool

Note

  • Checks that every clade has either zero or two children.
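The difference between has_polytomies and is_binary can be shown on trees written as nested tuples, where a string is a tip. Both functions below are hypothetical sketches that assume a polytomy means a clade with more than two children; they are not the lmpy implementations.

```python
def is_binary(clade):
    """True if every clade has either zero or two children."""
    if isinstance(clade, str):  # a tip has zero children
        return True
    return len(clade) == 2 and all(is_binary(child) for child in clade)


def has_polytomies(clade):
    """True if any clade has more than two children (assumed definition)."""
    if isinstance(clade, str):
        return False
    return len(clade) > 2 or any(has_polytomies(child) for child in clade)


resolved = (('A', 'B'), ('C', 'D'))
polytomy = (('A', 'B', 'C'), 'D')
unary = (('A',), 'B')
```

The unary example has no polytomies but is still not binary, so under these definitions the two checks are not simple opposites.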

is_ultrametric(self, rel_tol=0.001)[source]

Checks if the tree is ultrametric.

Parameters

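rel_tol (float, optional) – The relative tolerance used to decide whether the minimum and maximum root-to-tip distances are equal. With the default of 0.001, they are treated as equal if they differ by no more than 0.1% (that is, they are 99.9% the same).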

Returns

Returns true if the distance from the root to each tip is the same (within the tolerance interval).

Return type

bool

Note

  • To be ultrametric, the branch length from root to tip must be equal for all tips.
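A relative-tolerance check of this kind can be sketched with math.isclose from the standard library. The function below takes a plain list of root-to-tip distances rather than a tree, and the use of math.isclose is an assumption about how such a comparison could be done, not a statement about the lmpy source.

```python
import math


def is_ultrametric(root_to_tip_distances, rel_tol=0.001):
    """Compare the shortest and longest root-to-tip distances."""
    shortest = min(root_to_tip_distances)
    longest = max(root_to_tip_distances)
    # Equal if the difference is within rel_tol of the larger value
    return math.isclose(shortest, longest, rel_tol=rel_tol)


nearly_equal = is_ultrametric([10.0, 10.005, 9.998])
clearly_unequal = is_ultrametric([10.0, 9.9])
```

Only the extremes need comparing: if the minimum and maximum agree within tolerance, every distance in between does as well.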

prune_tips_without_attribute(self, search_attribute=PhyloTreeKeys.MTX_IDX)[source]

Prunes the tree of any tips that don’t have the specified attribute.

Parameters

search_attribute (str, optional) – The attribute to look for when pruning tips in the tree. Defaults to PhyloTreeKeys.MTX_IDX.
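The pruning idea can be sketched on a tree of nested dictionaries. The prune_tips function and the node layout are hypothetical; unlike the method documented above, which operates on the tree itself, this sketch returns a pruned copy, and it does not collapse internal nodes left with a single child. The 'mx' default mirrors the value of PhyloTreeKeys.MTX_IDX.

```python
def prune_tips(node, search_attribute='mx'):
    """Return a pruned copy of node, or None if nothing beneath it survives."""
    children = node.get('children', [])
    if not children:
        # A tip survives only if it carries the attribute
        if search_attribute in node.get('annotations', {}):
            return dict(node)
        return None
    kept = []
    for child in children:
        pruned_child = prune_tips(child, search_attribute)
        if pruned_child is not None:
            kept.append(pruned_child)
    if not kept:
        return None  # an internal node with no surviving tips is dropped
    return {**node, 'children': kept}


tree = {'label': 'root', 'children': [
    {'label': 'n1', 'children': [
        {'label': 'A', 'annotations': {'mx': 0}},
        {'label': 'B'},
    ]},
    {'label': 'n2', 'children': [{'label': 'C'}, {'label': 'D'}]},
]}
pruned = prune_tips(tree)
```

Only tip A has a matrix index, so the whole n2 clade and tip B are removed.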