treeswift package¶
TreeSwift is a Python library for parsing, manipulating, and iterating over (rooted) tree structures. TreeSwift places an emphasis on speed.
Module contents¶
- class treeswift.Node(label=None, edge_length=None)[source]¶
Bases:
object
Node
class- child_nodes()[source]¶
Return a
list
containing thisNode
object’s children- Returns:
list
: Alist
containing thisNode
object’s children
- get_edge_length()[source]¶
Return the length of the edge incident to this
Node
- Returns:
float
: The length of the edge incident to thisNode
- is_leaf()[source]¶
Returns
True
if this is a leaf- Returns:
bool
:True
if this is a leaf, otherwiseFalse
- is_root()[source]¶
Returns
True
if this is theroot
- Returns:
bool
:True
if this is the root, otherwiseFalse
- newick()[source]¶
Newick string conversion starting at this
Node
object- Returns:
str
: Newick string conversion starting at thisNode
object
- num_children()[source]¶
Returns the number of children of this
Node
- Returns:
int
: The number of children of thisNode
- num_nodes(leaves=True, internal=True)[source]¶
Compute the total number of selected nodes in the subtree rooted by this
Node
(including itself)- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- Returns:
int
: The total number of selected nodes in thisTree
- set_edge_length(length)[source]¶
Set the length of the edge incident to this
Node
- Args:
length
: The new length of the edge incident to thisNode
- set_parent(parent)[source]¶
Set the parent of this
Node
object. Use this carefully, otherwise you may damage the structure of thisTree
object.- Args:
Node
: The new parent of thisNode
- traverse_ancestors(include_self=True)[source]¶
Traverse over the ancestors of this
Node
- Args:
include_self
(bool
):True
to include self in the traversal, otherwiseFalse
- traverse_bfs(include_self=True)[source]¶
Perform a Breadth-First Search (BFS) starting at this
Node
object’. Yields (Node
, distance) tuples- Args:
include_self
(bool
):True
to include self in the traversal, otherwiseFalse
- traverse_inorder(leaves=True, internal=True)[source]¶
Perform an inorder traversal starting at this
Node
object- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_internal()[source]¶
Traverse over the internal nodes below (and including) this
Node
object
- traverse_levelorder(leaves=True, internal=True)[source]¶
Perform a levelorder traversal starting at this
Node
object- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_postorder(leaves=True, internal=True)[source]¶
Perform a postorder traversal starting at this
Node
object- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_preorder(leaves=True, internal=True)[source]¶
Perform a preorder traversal starting at this
Node
object- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_rootdistorder(ascending=True, leaves=True, internal=True)[source]¶
Perform a traversal of the
Node
objects in the subtree rooted at thisNode
in either ascending (ascending=True
) or descending (ascending=False
) order of distance from thisNode
- Args:
ascending
(bool
):True
to perform traversal in ascending distance from the root, otherwiseFalse
for descendingleaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- class treeswift.Tree(is_rooted=True)[source]¶
Bases:
object
Tree
class- avg_branch_length(terminal=True, internal=True)[source]¶
Compute the average length of the selected branches of this
Tree
. Edges with lengthNone
will be treated as 0-length- Args:
terminal
(bool
):True
to include terminal branches, otherwiseFalse
internal
(bool
):True
to include internal branches, otherwiseFalse
- Returns:
The average length of the selected branches
- branch_lengths(terminal=True, internal=True)[source]¶
Generator over the lengths of the selected branches of this
Tree
. Edges with lengthNone
will be output as 0-length- Args:
terminal
(bool
):True
to include terminal branches, otherwiseFalse
internal
(bool
):True
to include internal branches, otherwiseFalse
- closest_leaf_to_root()[source]¶
Return the leaf that is closest to the root and the corresponding distance. Edges with no length will be considered to have a length of 0
- Returns:
tuple
: First value is the closest leaf to the root, and second value is the corresponding distance
- coalescence_times(backward=True)[source]¶
Generator over the times of successive coalescence events
- Args:
backward
(bool
):True
to go backward in time (i.e., leaves to root), otherwiseFalse
- coalescence_waiting_times(backward=True)[source]¶
Generator over the waiting times of successive coalescence events
- Args:
backward
(bool
):True
to go backward in time (i.e., leaves to root), otherwiseFalse
- collapse_short_branches(threshold)[source]¶
Collapse internal branches (not terminal branches) with length less than or equal to
threshold
. A branch length ofNone
is considered 0- Args:
threshold
(float
): The threshold to use when collapsing branches
- colless(normalize='leaves')[source]¶
Compute the Colless balance index of this
Tree
. If the tree has polytomies, they will be randomly resolved- Args:
normalize
(str
): How to normalize the Colless index (if at all)None
to not normalize"leaves"
to normalize by the number of leaves"yule"
to normalize to the Yule model"pda"
to normalize to the Proportional to Distinguishable Arrangements model
- Returns:
float
: Colless index (either normalized or not)
- condense()[source]¶
If siblings have the same label, merge them. If they have edge lengths, the resulting
Node
will have the larger of the lengths
- contract_low_support(threshold, terminal=False, internal=True)[source]¶
Contract nodes labeled by a number (e.g. branch support) below
threshold
- Args:
threshold
(float
): The support threshold to use when contracting nodesterminal
(bool
):True
to include terminal branches, otherwiseFalse
internal
(bool
):True
to include internal branches, otherwiseFalse
- deroot()[source]¶
If tree bifurcates at the root, contract an edge incident with the root to create a trifurcation at the root.
- diameter()[source]¶
Compute the diameter (maximum leaf pairwise distance) of this
Tree
- Returns:
float
: The diameter of this Tree
- distance_between(u, v)[source]¶
Return the distance between nodes
u
andv
in thisTree
- Args:
u
(Node
): Nodeu
v
(Node
): Nodev
- Returns:
float
: The distance between nodesu
andv
- distance_matrix(leaf_labels=False)[source]¶
Return a distance matrix (2D dictionary) of the leaves of this
Tree
- Args:
leaf_labels
(bool
):True
to have keys be labels of leafNode
objects, otherwiseFalse
to have keys beNode
objects- Returns:
dict
: Distance matrix (2D dictionary) of the leaves of thisTree
, where keys are labels of leaves;M[u][v]
= distance fromu
tov
- distances_from_parent(leaves=True, internal=True, unlabeled=False)[source]¶
Generator over the node-to-parent distances of this
Tree
; (node,distance) tuples- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
unlabeled
(bool
):True
to include unlabeled nodes, otherwiseFalse
- distances_from_root(leaves=True, internal=True, unlabeled=False, weighted=True)[source]¶
Generator over the root-to-node distances of this
Tree
; (node,distance) tuples- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
unlabeled
(bool
):True
to include unlabeled nodes, otherwiseFalse
weighted
(bool
):True
to define distance as sum of edge lengths (i.e., weighted distance), orFalse
to define distance as total number of edges (i.e., unweighted distance). If unweighted, edges with lengthNone
are counted in the height
- draw(show_plot=True, export_filename=None, show_labels=False, align_labels=False, label_fontsize=8, start_time=0, default_color='#000000', xlabel=None, handles=None)[source]¶
Draw this
Tree
- Args:
show_plot
(bool
):True
to show the plot, otherwiseFalse
export_filename
(str
): File to which the tree figure will be exported (otherwiseNone
to not save to file)show_labels
(bool
):True
to show the leaf labels, otherwiseFalse
align_labels
(bool
):True
to align the leaf labels (if shown), otherwiseFalse
to just put them by their tipslabel_fontsize
(int
): Font size of the leaf labels (in points). 8pt = 1/9in –> 1in = 72ptdefault_color
(str
): The default color to use if a node doesn’t have acolor
attributexlabel
(str
): The label of the horizontal axis in the resulting plothandles
(list
): List of matplotlibPatch
objects for a legend
- drop_edge_length_at_root(label='OLDROOT')[source]¶
If the tree has a root edge, drop the edge to be a child of the root node
- Args:
label
(str
): The desired label of the new child
- edge_length_sum(terminal=True, internal=True)[source]¶
Compute the sum of all selected edge lengths in this
Tree
- Args:
terminal
(bool
):True
to include terminal branches, otherwiseFalse
internal
(bool
):True
to include internal branches, otherwiseFalse
- Returns:
float
: Sum of all selected edge lengths in thisTree
- extract_subtree(node)[source]¶
Return a copy of the subtree rooted at
node
- Args:
node
(Node
): The root of the desired subtree- Returns:
Tree
: A copy of the subtree rooted atnode
- extract_tree(labels, without, suppress_unifurcations=True)[source]¶
Helper function for
extract_tree_*
functions
- extract_tree_with(labels, suppress_unifurcations=True)[source]¶
Extract a copy of this
Tree
with only the leaves labeled by the strings inlabels
- Args:
labels
(set
): Set of leaf labels to include.suppress_unifurcations
(bool
):True
to suppress unifurcations, otherwiseFalse
- Returns:
Tree: Copy of this Tree, including only the leaves labeled by the strings in
labels
- extract_tree_without(labels, suppress_unifurcations=True)[source]¶
Extract a copy of this
Tree
without the leaves labeled by the strings inlabels
- Args:
labels
(set
): Set of leaf labels to excludesuppress_unifurcations
(bool
):True
to suppress unifurcations, otherwiseFalse
- Returns:
Tree
: Copy of thisTree
, exluding the leaves labeled by the strings inlabels
- find_node(label, leaves=True, internal=False)[source]¶
Find and return the node(s) labeled by
label
, orNone
if none exist. Note that this function performs a linear-time search, so if you will be call- Args:
label
(str
): The label to search forleaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- Returns:
The
Node
object labeled bylabel
(or alist
ofNode
objects if multiple are labeled bylabel
), orNone
if none exist
- furthest_from_root()[source]¶
Return the
Node
that is furthest from the root and the corresponding distance. Edges with no length will be considered to have a length of 0- Returns:
tuple
: First value is the furthestNode
from the root, and second value is the corresponding distance
- gamma_statistic()[source]¶
Compute the Gamma statistic of Pybus and Harvey (2000)
- Returns:
float
: The Gamma statistic of Pybus and Harvey (2000)
- height(weighted=True)[source]¶
Compute the height (i.e., maximum distance from root) of this
Tree
- Returns:
float
: The height (i.e., maximum distance from root) of thisTree
- indent(space=4)[source]¶
Return an indented Newick string, just like
nw_indent
in Newick Utilities- Args:
space
(int
): The number of spaces a tab should equal- Returns:
str
: An indented Newick string
- label_to_node(selection='leaves')[source]¶
Return a dictionary mapping labels (strings) to
Node
objectsIf
selection
is"all"
, the dictionary will contain all nodesIf
selection
is"leaves"
, the dictionary will only contain leavesIf
selection
is"internal"
, the dictionary will only contain internal nodesIf
selection
is aset
, the dictionary will contain all nodes labeled by a label inselection
If multiple nodes are labeled by a given label, only the last (preorder traversal) will be obtained
- Args:
selection
(str
orset
): The selection of nodes to get"all"
to select all nodes"leaves"
to select leaves"internal"
to select internal nodesA
set
of labels to specify nodes to select
- Returns:
dict
: Dictionary mapping labels to the corresponding nodes
- labels(leaves=True, internal=True)[source]¶
Generator over the (non-
None
)Node
labels of thisTree
- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- ladderize(ascending=True)[source]¶
Ladderize this
Tree
by sorting eachNode
’s children by total number of descendants- Args:
ascending
(bool
):True
to sort in ascending order ofmode
, otherwiseFalse
- lineages_through_time(present_day=None, show_plot=True, export_filename=None, color='#000000', xmin=None, xmax=None, ymin=None, ymax=None, title=None, xlabel=None, ylabel=None)[source]¶
Compute the number of lineages through time. If seaborn is installed, a plot is shown as well
- Args:
present_day
(float
): The time of the furthest node from the root. IfNone
, the top of the tree will be placed at time 0show_plot
(bool
):True
to show the plot, otherwiseFalse
to only return the dictionary. To plot multiple LTTs on the same figure, setshow_plot
to False for all but the last plotexport_filename
(str
): File to which the LTT figure will be exported (otherwiseNone
to not save to file)color
(str
): The color of the resulting plottitle
(str
): The title of the resulting plotxmin
(float
): The minimum value of the horizontal axis in the resulting plotxmax
(float
): The maximum value of the horizontal axis in the resulting plotxlabel
(str
): The label of the horizontal axis in the resulting plotymin
(float
): The minimum value of the vertical axis in the resulting plotymax
(float
): The maximum value of the vertical axis in the resulting plotylabel
(str
): The label of the vertical axis in the resulting plot- Returns:
dict
: A dictionary in which each(t,n)
pair denotes the number of lineagesn
that existed at timet
- ltt(present_day=None, show_plot=True, export_filename=None, color='#000000', xmin=None, xmax=None, ymin=None, ymax=None, title=None, xlabel=None, ylabel=None)¶
Compute the number of lineages through time. If seaborn is installed, a plot is shown as well
- Args:
present_day
(float
): The time of the furthest node from the root. IfNone
, the top of the tree will be placed at time 0show_plot
(bool
):True
to show the plot, otherwiseFalse
to only return the dictionary. To plot multiple LTTs on the same figure, setshow_plot
to False for all but the last plotexport_filename
(str
): File to which the LTT figure will be exported (otherwiseNone
to not save to file)color
(str
): The color of the resulting plottitle
(str
): The title of the resulting plotxmin
(float
): The minimum value of the horizontal axis in the resulting plotxmax
(float
): The maximum value of the horizontal axis in the resulting plotxlabel
(str
): The label of the horizontal axis in the resulting plotymin
(float
): The minimum value of the vertical axis in the resulting plotymax
(float
): The maximum value of the vertical axis in the resulting plotylabel
(str
): The label of the vertical axis in the resulting plot- Returns:
dict
: A dictionary in which each(t,n)
pair denotes the number of lineagesn
that existed at timet
- mrca(labels)[source]¶
Return the Node that is the MRCA of the nodes labeled by a label in
labels
. If multiple nodes are labeled by a given label, only the last (preorder traversal) will be obtained- Args:
labels
(set
): Set of leaf labels- Returns:
Node
: The MRCA of theNode
objects labeled by a label inlabels
- mrca_matrix()[source]¶
Return a dictionary storing all pairwise MRCAs.
M[u][v]
= MRCA of nodesu
andv
. ExcludesM[u][u]
because MRCA of node and itself is itself- Returns:
dict
:M[u][v]
= MRCA of nodesu
andv
- num_cherries()[source]¶
Returns the number of cherries (i.e., internal nodes that only have leaf children) in this
Tree
- Returns:
int
: The number of cherries in thisTree
- num_lineages_at(distance)[source]¶
Returns the number of lineages of this
Tree
that existdistance
away from the root- Args:
distance
(float
): The distance away from the root- Returns:
int
: The number of lineages that existdistance
away from the root
- num_nodes(leaves=True, internal=True)[source]¶
Compute the total number of selected nodes in this
Tree
- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- Returns:
int
: The total number of selected nodes in thisTree
- order(mode, ascending=True)[source]¶
Order the children of the nodes in this
Tree
based onmode
- Args:
mode
(str
): How to order the children of the nodes of thisTree
"edge_length"
= order by incident edge length"edge_length_then_label"
= order by incident edge length, then by node label"edge_length_then_label_then_num_descendants"
= order by incident edge length, then by node label, then by number of descendants"edge_length_then_num_descendants"
= order by incident edge length, then by number of descendants"edge_length_then_num_descendants_then_label"
= order by incident edge length, then by number of descendants, then by node label"label"
= order by node label"label_then_edge_length"
= order by node label, then by incident edge length"label_then_edge_length_then_num_descendants"
= order by node label, then by incident edge length, then by number of descendants"label_then_num_descendants"
= order by node label, then by number of descendants"label_then_num_descendants_then_edge_length"
= order by node label, then by number of descendants, then by incident edge length"num_descendants"
= order by number of descendants"num_descendants_then_label"
= order by number of descendants, then by node label"num_descendants_then_label_then_edge_length"
= order by number of descendants, then by node label, then by incident edge length"num_descendants_then_edge_length"
= order by number of descendants, then by incident edge length"num_descendants_then_edge_length_then_label"
= order by number of descendants, then by incident edge length, then by node label
ascending
(bool
):True
to sort in ascending order ofmode
, otherwiseFalse
- rename_nodes(renaming_map)[source]¶
Rename nodes in this
Tree
- Args:
renaming_map
(dict
): A dictionary mapping old labels (keys) to new labels (values)
- reroot(node, length=None, branch_support=False)[source]¶
Reroot this
Tree
atlength
up the incident edge ofnode
. If 0 orNone
, reroot at the node (not on the incident edge)- Args:
node
(Node
): TheNode
on whose incident edge thisTree
will be rerootedlength
(float
): The distance up the specified edge at which to reroot thisTree
. If 0 orNone
, reroot at the node (not on the incident edge)branch_support
(bool
):True
if internal node labels represent branch support values, otherwiseFalse
- sackin(normalize='leaves')[source]¶
Compute the Sackin balance index of this
Tree
- Args:
normalize
(str
): How to normalize the Sackin index (if at all)None
to not normalize"leaves"
to normalize by the number of leaves"yule"
to normalize to the Yule model"pda"
to normalize to the Proportional to Distinguishable Arrangements model
- Returns:
float
: Sackin index (either normalized or not)
- suppress_unifurcations()[source]¶
Remove all nodes with only one child and directly attach child to parent
- traverse_inorder(leaves=True, internal=True)[source]¶
Perform an inorder traversal of the
Node
objects in thisTree
- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_levelorder(leaves=True, internal=True)[source]¶
Perform a levelorder traversal of the
Node
objects in thisTree
- traverse_postorder(leaves=True, internal=True)[source]¶
Perform a postorder traversal of the
Node
objects in thisTree
- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_preorder(leaves=True, internal=True)[source]¶
Perform a preorder traversal of the
Node
objects in thisTree
- Args:
leaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- traverse_rootdistorder(ascending=True, leaves=True, internal=True)[source]¶
Perform a traversal of the
Node
objects in thisTree
in either ascending (ascending=True
) or descending (ascending=False
) order of distance from the root- Args:
ascending
(bool
):True
to perform traversal in ascending distance from the root, otherwiseFalse
for descendingleaves
(bool
):True
to include leaves, otherwiseFalse
internal
(bool
):True
to include internal nodes, otherwiseFalse
- treeness()[source]¶
Compute the treeness (sum of internal branch lengths / sum of all branch lengths) of this
Tree
. Branch lengths ofNone
are considered 0 length- Returns:
float
: Treeness of thisTree
(sum of internal branch lengths / sum of all branch lengths)
- treeswift.read_tree(input, schema)[source]¶
Read a tree from a string or file
- Args:
input
(str
): Either a tree string, a path to a tree file (plain-text or gzipped), or a DendroPy Tree objectschema
(str
): The schema ofinput
(DendroPy, Newick, NeXML, Nexus, or linkage)- Returns:
If the input is Newick, either a
Tree
object ifinput
contains a single tree, or alist
ofTree
objects ifinput
contains multiple trees (one per line)If the input is NeXML or Nexus, a
dict
of trees represented byinput
, where keys are tree names (str
) and values areTree
objects
- treeswift.read_tree_dendropy(tree)[source]¶
Create a TreeSwift tree from a DendroPy tree
- Args:
tree
(dendropy.datamodel.treemodel
): A DendropyTree
object- Returns:
Tree
: A TreeSwift tree created fromtree
- treeswift.read_tree_newick(newick)[source]¶
Read a tree from a Newick string or file
- Args:
newick
(str
): Either a Newick string or the path to a Newick file (plain-text or gzipped)- Returns:
Tree
: The tree represented bynewick
. If the Newick file has multiple trees (one per line), alist
ofTree
objects will be returned
- treeswift.read_tree_nexml(nexml)[source]¶
Read a tree from a NeXML string or file
- Args:
nexml
(str
): Either a NeXML string or the path to a NeXML file (plain-text or gzipped)- Returns:
dict
ofTree
: A dictionary of the trees represented bynexml
, where keys are tree names (str
) and values areTree
objects
- treeswift.read_tree_nexus(nexus, translate=True)[source]¶
Read a tree from a Nexus string or file
- Args:
nexus
(str
): Either a Nexus string or the path to a Nexus file (plain-text or gzipped)translate
(bool
): Translate the node labels on the trees (if the Nexus file has a “Translate” section)- Returns:
dict
ofTree
: A dictionary of the trees represented bynexus
, where keys are tree names (str
) and values areTree
objectsIf the Nexus file had a “Taxlabels” section, the taxon labels will be stored in the output dictionary as a
list
associated with key"taxlabels"
If any trees in the Nexus file had information (e.g. the first
...
in thetree STATE_0 [...] = [&R] (...);
line), all information will be stored in the output dictionary: they will be associated with key"info"
in the output dictionary, and they will be stored in a dictionary where keys are tree names (str
)If
translate
wasTrue
, if a nodex
was translated,x.label
will be the translated label, andx.id
will be the original label (the “ID”)