A toolkit for working with phylogenetic data.
v0.18.0
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Pages
genesis::tree Namespace Reference

Classes

class  AttributeTreeEdgeData
 Data class for AttributeTreeEdges. More...
 
class  AttributeTreeNodeData
 Data class for AttributeTreeNodes. More...
 
class  BaseEdgeData
 Base class for storing data on Edges of a Tree. More...
 
class  BaseNodeData
 Base class for storing data on Nodes of a Tree. More...
 
class  Bipartition
 
class  BipartitionSet
 
class  CircularEdgeData
 Data class for CircularTreeEdges. More...
 
class  CircularLayout
 
class  CircularNodeData
 Data class for CircularTreeNodes. More...
 
class  ColorWriterPlugin
 Base class for creating plugin classes that allow coloring of Tree edges. More...
 
class  DefaultEdgeData
 Default class containing the commonly needed data for tree edges. More...
 
class  DefaultNodeData
 Default class containing the commonly needed data for tree nodes. More...
 
class  DefaultTreeNewickReader
 Read default Newick trees, i.e., trees with names and branch lengths. More...
 
class  DefaultTreeNewickReaderPlugin
 Provide a set of plugin functions for NewickReader to read a DefaultTree. More...
 
class  DefaultTreeNewickWriter
 Write default Newick trees, i.e., trees with names and branch lengths. More...
 
class  DefaultTreeNewickWriterPlugin
 Provide a set of plugin functions for NewickWriter to write a DefaultTree. More...
 
class  DefaultTreePhyloxmlWriter
 
class  DefaultTreePhyloxmlWriterPlugin
 
class  IndexedAttributeTreeNewickReader
 Read Newick trees with ordered attributes for the Nodes and Edges. More...
 
class  IndexedAttributeTreeNewickReaderPlugin
 Provide a set of plugin functions for NewickReader to read ordered attributes of the Nodes and Edges into an AttributeTree. More...
 
class  IteratorEulertour
 
class  IteratorLevelorder
 
class  IteratorNodeLinks
 
class  IteratorPath
 
class  IteratorPathSet
 Iterate the path between two TreeNodes (non-linearily), given their lowest common ancestor (LCA). More...
 
class  IteratorPostorder
 
class  IteratorPreorder
 
class  KeyedAttributeTreeNewickReader
 Read default Newick trees, i.e., trees with names and branch lengths. More...
 
class  KeyedAttributeTreeNewickReaderPlugin
 Provide a set of plugin functions for NewickReader to read key-value-pair data attributes into an AttributeTree. More...
 
class  LayoutBase
 
class  LayoutEdgeData
 Data class for LayoutTreeEdges. More...
 
class  LayoutNodeData
 Data class for LayoutTreeNodes. More...
 
class  LcaLookup
 Fast lookup of the lowest common ancestor (LCA) of two TreeNodes, relative to an arbitrary root node. More...
 
class  MassTreeEdgeData
 Data class for MassTreeEdges. Stores the branch length and a list of masses with their positions along the edge. More...
 
class  MassTreeKmeans
 
class  MassTreeNodeData
 Data class for MassTreeNodes. Stores nothing. More...
 
class  NewickBroker
 Stores a Newick tree in an intermediate format that can be further processed into a Tree. More...
 
struct  NewickBrokerElement
 Store the information for one element of a Newick tree. More...
 
class  NewickColorWriterPlugin
 Plugin class for Newick output that allows coloring of edges. More...
 
class  NewickInputIterator
 Iterate an input stream and parse it as Newick trees. More...
 
class  NewickReader
 
class  NewickWriter
 Write a Tree to Newick format. More...
 
class  PhyloxmlColorWriterPlugin
 Plugin class for PhyloXML output that allows coloring of edges. More...
 
class  PhyloxmlWriter
 Write a Tree to Phyloxml format. More...
 
class  PrinterCompact
 Print a Tree in a compact form, i.e., each node and edge on one line. More...
 
class  PrinterDetailed
 Print a text representation of the Tree, showing all nodes, edges and links with their indices. More...
 
class  PrinterTable
 Print lists of all nodes, edges and links including their indices and connections with each other. More...
 
class  RectangularEdgeData
 Data class for RectangularTreeEdges. More...
 
class  RectangularLayout
 
class  RectangularNodeData
 Data class for RectangularTreeNodes. More...
 
struct  SquashClustering
 Result structure for Squash Clustering. More...
 
class  Tree
 Class for representing phylogenetic trees. More...
 
class  TreeEdge
 
class  TreeLink
 
class  TreeNode
 
class  TreeSet
 

Functions

TreeEdgeadd_new_node (Tree &tree, TreeNode &target_node)
 Add a new Node as a leaf to an existing Node. More...
 
TreeEdgeadd_new_node (Tree &tree, TreeEdge &target_edge)
 Add a new Node as a leaf to an existing Edge, by also adding a new Node in the middle of that Edge. More...
 
bool all_equal (TreeSet const &tset, std::function< bool(TreeNode const &, TreeNode const &)> node_comparator, std::function< bool(TreeEdge const &, TreeEdge const &)> edge_comparator)
 Compare whether all Trees in a TreeSet are equal using a given comparator functional. More...
 
bool all_identical_topology (TreeSet const &tset)
 Compare whether all Trees in a TreeSet are equal using their default comparision operators for nodes and edges. More...
 
Tree average_branch_length_tree (TreeSet const &tset)
 Return a Tree where the branch lengths are the average of the Trees in the TreeSet, given that they all have the same topology. More...
 
bool belongs_to (Tree const &tree, TreeNode const &node)
 Return whether the TreeNode belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
bool belongs_to (TreeNode const &node, Tree const &tree)
 Return whether the TreeNode belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
bool belongs_to (Tree const &tree, TreeEdge const &edge)
 Return whether the TreeEdge belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
bool belongs_to (TreeEdge const &edge, Tree const &tree)
 Return whether the TreeEdge belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
bool belongs_to (Tree const &tree, TreeLink const &link)
 Return whether the TreeLink belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
bool belongs_to (TreeLink const &link, Tree const &tree)
 Return whether the TreeLink belongs to the Tree, i.e., whether it is owned by the Tree. More...
 
std::vector< double > branch_lengths (Tree const &tree)
 Get a vector of all branch lengths of a Tree, index by the edge index. More...
 
std::vector< std::pair
< TreeNode const *, size_t > > 
closest_leaf_depth_vector (const Tree &tree)
 Returns a vector containing the closest leaf node for each node, measured in number of edges between them and its depth (number of edges between them). More...
 
std::vector< std::pair
< TreeNode const *, double > > 
closest_leaf_distance_vector (Tree const &tree)
 Return a vector containing the closest leaf node for each node, using the branch_length as distance measure. More...
 
std::vector< std::pair
< TreeNode const *, double > > 
closest_leaf_distance_vector (Tree const &tree, utils::Matrix< double > const &node_distances)
 
Tree convert (Tree const &source, std::function< std::unique_ptr< BaseNodeData >(BaseNodeData const &node_data)> node_data_converter, std::function< std::unique_ptr< BaseEdgeData >(BaseEdgeData const &edge_data)> edge_data_converter)
 Create a tree with the same topology as the source tree, while converting its data. More...
 
AttributeTree convert_default_tree_to_attribute_tree (DefaultTree const &source)
 Helper function that takes a DefaultTree (or any Tree with Node and Edge data derived from it) and turns its data into an AttributeTree, that is, a Tree with AttributeTreeNodeData and AttributeTreeEdgeData. More...
 
MassTree convert_default_tree_to_mass_tree (DefaultTree const &source)
 Helper function that takes a DefaultTree (or any Tree with Node and Edge data derived from it) and turns its data into an MassTree, that is, a Tree with MassTreeNodeData and MassTreeEdgeData. More...
 
DefaultTree convert_to_default_tree (Tree const &source_tree)
 Convert a Tree to a DefaultTree with DefaultNodeData and DefaultEdgeData. More...
 
double deepest_distance (Tree const &tree)
 Return the longest distance from any point in the tree (on the edges) to any leaf. More...
 
double diameter (Tree const &tree)
 Get the diameter of the tree, i.e., the longest distance between any two nodes, measured using the branch_length. More...
 
double earth_movers_distance (MassTree const &lhs, MassTree const &rhs, double p=1.0)
 Calculate the earth mover's distance of two distributions of masses on a given Tree. More...
 
utils::Matrix< double > earth_movers_distance (std::vector< MassTree > const &trees, double p=1.0)
 Calculate the pairwise earth mover's distance for all MassTrees. More...
 
std::pair< double, double > earth_movers_distance (MassTree const &tree, double p=1.0)
 Calculate the earth mover's distance of masses on a given Tree. More...
 
TreeEdgeedge_between (TreeNode &lhs, TreeNode &rhs)
 Return the TreeEdge between two TreeNode&s, if they are neighbours, or nullptr otherwise. More...
 
TreeEdge const * edge_between (TreeNode const &lhs, TreeNode const &rhs)
 Return the TreeEdge between two TreeNode&s, if they are neighbours, or nullptr otherwise. More...
 
utils::Matrix< double > edge_branch_length_distance_matrix (Tree const &tree)
 
std::vector< double > edge_branch_length_distance_vector (Tree const &tree, TreeEdge const *edge)
 
std::vector< utils::Coloredge_color_branch_length_gradient (Tree const &tree, bool zero_based)
 
utils::Matrix< size_t > edge_path_length_matrix (Tree const &tree)
 
std::vector< size_t > edge_path_length_vector (Tree const &tree, TreeEdge const &edge)
 
utils::Matrix< signed char > edge_sides (Tree const &tree)
 Create a Matrix that indiciaces the relative position of the Edges of a Tree, i.e., whether they are on the root side or non-root side. More...
 
bool equal (Tree const &lhs, Tree const &rhs, std::function< bool(TreeNode const &, TreeNode const &) > node_comparator, std::function< bool(TreeEdge const &, TreeEdge const &) > edge_comparator)
 Compares two trees for equality given binary comparator functionals for their nodes and edges. More...
 
template<typename ElementType >
utils::Range
< IteratorEulertour< TreeLink
const, TreeNode const,
TreeEdge const > > 
eulertour (ElementType const &element)
 
template<typename ElementType >
utils::Range
< IteratorEulertour< TreeLink,
TreeNode, TreeEdge > > 
eulertour (ElementType &element)
 
TreeNode const * find_node (Tree const &tree, const std::string &name, bool replace_underscores)
 Finds a Node, given its name. If not found, nullptr is returned. More...
 
TreeNodefind_node (Tree &tree, const std::string &name, bool replace_underscores)
 Finds a Node, given its name. If not found, nullptr is returned. More...
 
Treefind_tree (TreeSet &tset, std::string const &name)
 Get the first Tree in a TreeSet that is stored with a given name, or nullptr if not found. More...
 
Tree const * find_tree (TreeSet const &tset, std::string const &name)
 Get the first Tree in a TreeSet that is stored with a given name, or nullptr if not found. More...
 
std::vector< std::pair
< TreeNode const *, double > > 
furthest_leaf_distance_vector (Tree const &tree)
 Opposite of closest_leaf_distance_vector(). More...
 
std::vector< std::pair
< TreeNode const *, double > > 
furthest_leaf_distance_vector (Tree const &tree, utils::Matrix< double > const &node_distances)
 
double height (Tree const &tree)
 Get the height of the tree, i.e., the longest distance from the root to a leaf, measured using the branch_length. More...
 
bool identical_topology (Tree const &lhs, Tree const &rhs)
 Returns true iff both trees have an identical topology. More...
 
size_t inner_edge_count (Tree const &tree)
 Return the number of Edges of a Tree that do not lead to a leaf Node. More...
 
size_t inner_node_count (Tree const &tree)
 Count the number of inner Nodes. More...
 
bool is_bifurcating (Tree const &tree)
 Return whether the Tree is bifurcating. More...
 
void ladderize (Tree &tree, LadderizeOrder order)
 
template<class Comparator >
std::vector< std::pair
< TreeNode const *, double > > 
leaf_distance_vector (Tree const &tree, utils::Matrix< double > const &node_distances, Comparator comp)
 Local helper function to calculate either closest_leaf_distance_vector() or furthest_leaf_distance_vector(). More...
 
size_t leaf_edge_count (Tree const &tree)
 Return the number of Edges of a Tree that lead to a leaf Node. More...
 
size_t leaf_node_count (Tree const &tree)
 Count the number of leaf Nodes of a Tree. More...
 
double length (Tree const &tree)
 Get the length of the tree, i.e., the sum of all branch lengths. More...
 
template<typename ElementType >
utils::Range
< IteratorLevelorder< TreeLink
const, TreeNode const,
TreeEdge const > > 
levelorder (ElementType const &element)
 
template<typename ElementType >
utils::Range
< IteratorLevelorder< TreeLink,
TreeNode, TreeEdge > > 
levelorder (ElementType &element)
 
TreeNode const & lowest_common_ancestor (TreeNode const &node_a, TreeNode const &node_b)
 Return the lowest common ancestor of two TreeNodes. More...
 
TreeNodelowest_common_ancestor (TreeNode &node_a, TreeNode &node_b)
 Return the lowest common ancestor of two TreeNodes. More...
 
utils::Matrix< size_t > lowest_common_ancestors (Tree const &tree)
 Return the lowest common ancestor of each pair of TreeNodes for a given tree, in form of a Matrix of their indices. More...
 
double mass_tree_binify_masses (MassTree &tree, size_t number_of_bins)
 Accumulate all masses of a MassTree into bins on each branch. More...
 
double mass_tree_center_masses_on_branches (MassTree &tree)
 Accumulate all masses of a MassTree on the centers of their edges. More...
 
double mass_tree_center_masses_on_branches_averaged (MassTree &tree)
 Accumulate all masses of a MassTree at the average mass position per edge. More...
 
void mass_tree_clear_masses (MassTree &tree)
 Clear all masses of a MassTree, while keeping its topology. More...
 
std::vector< double > mass_tree_mass_per_edge (MassTree const &tree)
 Return a std::vector that contains the total Mass for each edge of the given MassTree. More...
 
MassTree mass_tree_merge_trees (MassTree const &lhs, MassTree const &rhs, double const scaler_lhs=1.0, double const scaler_rhs=1.0)
 Merge all masses of two MassTrees into one and return it. More...
 
void mass_tree_merge_trees_inplace (MassTree &lhs, MassTree const &rhs, double const scaler_lhs=1.0, double const scaler_rhs=1.0)
 Merge all masses of two MassTrees by adding them to the first MassTree. More...
 
void mass_tree_normalize_masses (MassTree &tree)
 Scale all masses of a MassTree so that they sum up to 1.0. More...
 
void mass_tree_reverse_signs (MassTree &tree)
 Reverse the sign of each mass point on a MassTree. More...
 
void mass_tree_scale_masses (MassTree &tree, double factor)
 Scale all masses of a MassTree with the multiplicative factor factor. More...
 
double mass_tree_sum_of_masses (MassTree const &tree)
 Return the total sum of all masses on the MassTree. More...
 
void mass_tree_transform_to_unit_branch_lengths (MassTree &tree)
 Set all branch lengths of a MassTree to 1.0, while keeping the relative position of all masses on the branches. More...
 
bool mass_tree_validate (MassTree const &tree, double valid_total_mass_difference=-1.0)
 Validate the data on a MassTree. More...
 
size_t max_rank (Tree const &tree)
 Return the highest rank of the Nodes of a Tree. More...
 
utils::Matrix< double > node_branch_length_distance_matrix (Tree const &tree)
 Return a distance matrix containing pairwise distances between all Nodes, using the branch_length of the Edges as distance measurement. More...
 
std::vector< double > node_branch_length_distance_vector (Tree const &tree, TreeNode const *node=nullptr)
 Return a vector containing the distance of all nodes with respect to the given start node, where distance is measured in the sum of branch lengths between the nodes. More...
 
template<typename ElementType >
utils::Range
< IteratorNodeLinks< TreeLink
const, TreeNode const,
TreeEdge const > > 
node_links (ElementType const &element)
 
template<typename ElementType >
utils::Range
< IteratorNodeLinks< TreeLink,
TreeNode, TreeEdge > > 
node_links (ElementType &element)
 
std::unordered_set< std::string > node_names (Tree const &tree, bool leaves_only)
 Returns an unordered set of all TreeNode names of a Tree. More...
 
std::unordered_set< std::string > node_names (TreeSet const &tree_set, bool leaves_only)
 Returns a set of all TreeNode names of a TreeSet. More...
 
utils::SortedVector< std::string > node_names_sorted (Tree const &tree, bool leaves_only)
 Returns a set of all TreeNode names of a Tree. More...
 
utils::SortedVector< std::string > node_names_sorted (TreeSet const &tree_set, bool leaves_only)
 Returns a set of all TreeNode names of a TreeSet. More...
 
utils::Matrix< size_t > node_path_length_matrix (Tree const &tree)
 Return a matrix containing the pairwise depth of all nodes of the tree. More...
 
std::vector< size_t > node_path_length_vector (Tree const &tree, TreeNode const &node)
 Return a vector containing the depth of all nodes with respect to the given start node. More...
 
std::vector< size_t > node_path_length_vector (Tree const &tree)
 Return a vector containing the depth of all nodes with respect to the root node. More...
 
utils::Matrix< signed char > node_root_direction_matrix (Tree const &tree)
 Calculate a Matrix that indicates the nodes on the root side of a given node. More...
 
std::ostream & operator<< (std::ostream &out, Tree const &tree)
 
template<typename ElementType >
utils::Range< IteratorPath
< TreeLink const, TreeNode
const, TreeEdge const > > 
path (ElementType const &start, ElementType const &finish)
 
template<typename ElementType >
utils::Range< IteratorPath
< TreeLink, TreeNode, TreeEdge > > 
path (ElementType &start, ElementType &finish)
 
template<typename ElementType >
utils::Range< IteratorPathSet
< TreeLink const, TreeNode
const, TreeEdge const > > 
path_set (ElementType const &start, ElementType const &finish, ElementType const &lca)
 
template<typename ElementType >
utils::Range< IteratorPathSet
< TreeLink, TreeNode, TreeEdge > > 
path_set (ElementType &start, ElementType &finish, ElementType &lca)
 
std::vector< TreeLink const * > path_to_root (TreeNode const &node)
 Helper function that finds all TreeLinks between a given TreeNode and the root of the Tree. More...
 
template<typename ElementType >
utils::Range
< IteratorPostorder< TreeLink
const, TreeNode const,
TreeEdge const > > 
postorder (ElementType const &element)
 
template<typename ElementType >
utils::Range
< IteratorPostorder< TreeLink,
TreeNode, TreeEdge > > 
postorder (ElementType &element)
 
template<typename ElementType >
utils::Range< IteratorPreorder
< TreeLink const, TreeNode
const, TreeEdge const > > 
preorder (ElementType const &element)
 
template<typename ElementType >
utils::Range< IteratorPreorder
< TreeLink, TreeNode, TreeEdge > > 
preorder (ElementType &element)
 
void reroot (Tree &tree, TreeLink &at_link)
 Reroot the Tree at the given TreeLink. More...
 
void reroot (Tree &tree, TreeNode &at_node)
 Reroot the Tree at the given TreeNode. More...
 
void reroot_at_node (Tree &tree, size_t node_index)
 Reroot the Tree at the TreeNode with the given index. More...
 
void scale_all_branch_lengths (Tree &tree, double factor)
 Scale all branch lengths of a Tree by a given factor. More...
 
void set_all_branch_lengths (Tree &tree, double length)
 Set all branch lengths of a Tree to a given value. More...
 
std::string squash_cluster_tree (SquashClustering const &sc, std::vector< std::string > const &labels)
 Build a Newick-format tree for visualizing the result of a squash_clustering(). More...
 
SquashClustering squash_clustering (std::vector< MassTree > &&trees, double const p=1.0)
 Perfom Squash Clustering. More...
 
SquashClustering squash_clustering_init (std::vector< MassTree > &&trees, double const p)
 
void squash_clustering_merge_clusters (SquashClustering &sc, size_t i, size_t j, double const p)
 
std::pair< size_t, size_t > squash_clustering_min_entry (SquashClustering const &sc)
 
size_t subtree_max_path_height (Tree const &tree, TreeLink const &link)
 Calculate the height of a subtree, that is, the maximum path length to a leaf of that subtree, measured in edges between the link and the leaf. More...
 
std::vector< size_t > subtree_max_path_heights (Tree const &tree, TreeNode const &node)
 
std::vector< size_t > subtree_max_path_heights (Tree const &tree)
 
size_t subtree_size (Tree const &tree, TreeLink const &link)
 Return the size of the subtree defined by the given TreeLink, measured in number of nodes. More...
 
std::vector< size_t > subtree_sizes (Tree const &tree, TreeNode const &node)
 Calculate the sizes of all subtrees as seen from the given TreeNode. More...
 
std::vector< size_t > subtree_sizes (Tree const &tree)
 Calculate the sizes of all subtrees as seen from the root of the tree. More...
 
std::string svg_arc (double center_x, double center_y, double radius, double start_angle, double end_angle)
 
template<class NodeDataType , class EdgeDataType >
bool tree_data_is (Tree const &tree)
 Check whether the data of the nodes and edges of the Tree are exactly of the specified data types. More...
 
template<class NodeDataType , class EdgeDataType >
bool tree_data_is_derived_from (Tree const &tree)
 Check whether the data of the nodes and edges of the Tree are derived from the specified data types. More...
 
bool validate_topology (Tree const &tree)
 Validate that all internal pointers of the Tree elements (TreeLinks, TreeNodes, TreeEdges) to each other are correct and that some other invariants are met. More...
 

Enumerations

enum  LadderizeOrder { kSmallFirst, kLargeFirst }
 

Typedefs

using AttributeTree = Tree
 Alias for a Tree that stores TreeNodes and TreeEdges with string attributes on them. More...
 
using AttributeTreeEdge = TreeEdge
 Alias for a TreeEdge of an AttributeTree. See there for more information. More...
 
using AttributeTreeLink = TreeLink
 Alias for a TreeLink of an AttributeTree. See there for more information. More...
 
using AttributeTreeMap = std::map< std::string, std::string >
 Alias for the map type used by an AttributeTree. More...
 
using AttributeTreeNode = TreeNode
 Alias for a TreeNode of an AttributeTree. See there for more information. More...
 
using CircularTree = tree::Tree
 Alias for a tree::Tree used for a tree with information needed for tree drawing. More...
 
using CircularTreeEdge = tree::TreeEdge
 Alias for tree::TreeEdge used in a CircularTree. See CircularEdgeData for the data stored on the edges. More...
 
using CircularTreeLink = tree::TreeLink
 Alias for tree::TreeLink used in a CircularTree. More...
 
using CircularTreeNode = tree::TreeNode
 Alias for tree::TreeNode used in a CircularTree. See CircularNodeData for the data stored on the nodes. More...
 
using DefaultTree = Tree
 Alias for a Tree with data types DefaultNodeData and DefaultEdgeData. More...
 
using DefaultTreeEdge = TreeEdge
 Alias for a TreeEdge of a DefaultTree. See there for more information. More...
 
using DefaultTreeLink = TreeLink
 Alias for a TreeLink of a DefaultTree. See there for more information. More...
 
using DefaultTreeNode = TreeNode
 Alias for a TreeNode of a DefaultTree. See there for more information. More...
 
using LayoutTree = tree::Tree
 Alias for a tree::Tree used for a tree with information needed for tree drawing. More...
 
using LayoutTreeEdge = tree::TreeEdge
 Alias for tree::TreeEdge used in a LayoutTree. See LayoutEdgeData for the data stored on the edges. More...
 
using LayoutTreeLink = tree::TreeLink
 Alias for tree::TreeLink used in a LayoutTree. More...
 
using LayoutTreeNode = tree::TreeNode
 Alias for tree::TreeNode used in a LayoutTree. See LayoutNodeData for the data stored on the nodes. More...
 
using MassTree = Tree
 Alias for a Tree that stores masses on its TreeEdges. More...
 
using MassTreeEdge = TreeEdge
 Alias for a TreeEdge of a MassTree. See there for more information. More...
 
using MassTreeLink = TreeLink
 Alias for a TreeLink of a MassTree. See there for more information. More...
 
using MassTreeNode = TreeNode
 Alias for a TreeNode of a MassTree. See there for more information. More...
 
using RectangularTree = tree::Tree
 Alias for a tree::Tree used for a tree with information needed for tree drawing. More...
 
using RectangularTreeEdge = tree::TreeEdge
 Alias for tree::TreeEdge used in a RectangularTree. See RectangularEdgeData for the data stored on the edges. More...
 
using RectangularTreeLink = tree::TreeLink
 Alias for tree::TreeLink used in a RectangularTree. More...
 
using RectangularTreeNode = tree::TreeNode
 Alias for tree::TreeNode used in a RectangularTree. See RectangularNodeData for the data stored on the nodes. More...
 

Function Documentation

TreeEdge & add_new_node ( Tree &  tree,
TreeNode &  target_node 
)

Add a new Node as a leaf to an existing Node.

The function adds a new leaf to the Tree by appending it to an existing TreeNode. For this, four new elements are created and added to the tree:

  1. A TreeLink that gets added to the given node and connects it to the new node.
  2. A TreeLink for the new TreeNode.
  3. The new TreeNode itself.
  4. A TreeEdge that connects the new and the given TreeNode with each other.

Thus, the procedure is as shown:

add_new_node_node.png
The Tree before and after adding the new Node.

The new Node is added at the "end" of the given Node. That is, when traversing the Tree or the given Node, the new Node will be visited last. In the example, the tree is bifurcating before the new node is added. This is just an example - the function also works with Nodes that are already multifurcating.

The data pointers of the new Node and Edge are initialized to default-constructed objects:

  • The new node data type is the same as the one of the given target_node.
  • The new edge data type is the same as the one of the edge towards the root of the target_node.

This means that in case of a Tree where every Node and Edge has the same node and edge data type (standard case), the newly created ones will also have these data types.

Returns
The function returns the newly create TreeEdge. This way, all other new elements can be accessed, using TreeEdge::primary_link(), TreeEdge::secondary_link() and TreeEdge::secondary_node().

Definition at line 51 of file manipulation.cpp.

TreeEdge & add_new_node ( Tree &  tree,
TreeEdge &  target_edge 
)

Add a new Node as a leaf to an existing Edge, by also adding a new Node in the middle of that Edge.

This function is an extension of add_new_node( Tree&, TreeNode& ). Before adding the new leaf node, it first adds another Node that splits the given target_edge into two edges, and then adds the leaf to it.

Thus, the procedure is as shown:

add_new_node_edge.png
The Tree before and after adding the new Node.

After the function, the target_edge with all its data ends up on the to-root side of the new inner node.

For details of how the data pointers are handled, see add_new_node( Tree&, TreeNode& ). This function behaves in a similar way. The data objects of the new nodes and edges are default-constructed objects of the same type as the target_edge and its primary node.

Be aware that the data of target_edge is not changed. Thus, in trees with DefaultEdgeData, the branch lengths of all three affected edges might have to be changed to the desired values after calling this function.

Returns
The function returns the newly created TreeEdge that leads to the new leaf node.

Definition at line 130 of file manipulation.cpp.

bool all_equal ( TreeSet const &  tset,
std::function< bool(TreeNode const &, TreeNode const &)>  node_comparator,
std::function< bool(TreeEdge const &, TreeEdge const &)>  edge_comparator 
)

Compare whether all Trees in a TreeSet are equal using a given comparator functional.

See Tree::equal() for more information.

Definition at line 160 of file function/tree_set.cpp.

bool all_identical_topology ( TreeSet const &  tset)

Compare whether all Trees in a TreeSet are equal using their default comparision operators for nodes and edges.

Return true iff all Trees in a TreeSet have an identical topology.

Definition at line 195 of file function/tree_set.cpp.

Tree average_branch_length_tree ( TreeSet const &  tset)

Return a Tree where the branch lengths are the average of the Trees in the TreeSet, given that they all have the same topology.

The function works only under the following conditions:

* All trees must have the same topology.
* The TreeType must provide a data member `branch_length` for the edges.

Otherwise, the function throws an std::runtime_error. It does not check for node names, but the returned tree will contain the names of the first tree in the set.

TODO this function assumes that the tree edge has a branch_length. move it to default tree.

Definition at line 92 of file function/tree_set.cpp.

bool belongs_to ( Tree const &  tree,
TreeNode const &  node 
)

Return whether the TreeNode belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 199 of file tree/function/operators.cpp.

bool belongs_to ( TreeNode const &  node,
Tree const &  tree 
)

Return whether the TreeNode belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 207 of file tree/function/operators.cpp.

bool belongs_to ( Tree const &  tree,
TreeEdge const &  edge 
)

Return whether the TreeEdge belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 215 of file tree/function/operators.cpp.

bool belongs_to ( TreeEdge const &  edge,
Tree const &  tree 
)

Return whether the TreeEdge belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 223 of file tree/function/operators.cpp.

bool belongs_to ( Tree const &  tree,
TreeLink const &  link 
)

Return whether the TreeLink belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 231 of file tree/function/operators.cpp.

bool belongs_to ( TreeLink const &  link,
Tree const &  tree 
)

Return whether the TreeLink belongs to the Tree, i.e., whether it is owned by the Tree.

Definition at line 239 of file tree/function/operators.cpp.

std::vector< double > branch_lengths ( Tree const &  tree)

Get a vector of all branch lengths of a Tree, index by the edge index.

Definition at line 220 of file tree/default/functions.cpp.

std::vector< std::pair< TreeNode const *, size_t > > closest_leaf_depth_vector ( const Tree &  tree)

Returns a vector containing the closest leaf node for each node, measured in number of edges between them and its depth (number of edges between them).

The vector is indexed using the node().index() for every node. Its value contains an std::pair, where the first element is a NodeType* to the closest leaf node (with respect to its depth) and the second element its depth with respect to the node at the given index of the vector. The depth is the number of edges visited on the path between two nodes (0 for itself, 1 for immediate neighbours, etc).

Thus, leaf nodes will have a pointer to themselves and a depth value of 0, and for all other nodes the depth will be the number of edges between it and the closest leaf node.

There might be more than one leaf with the same depth to a given node. In this case, an arbitrary one is used.

Definition at line 273 of file tree/function/distances.cpp.

std::vector< std::pair< TreeNode const *, double > > closest_leaf_distance_vector ( Tree const &  tree)

Return a vector containing the closest leaf node for each node, using the branch_length as distance measure.

The vector is indexed using the node().index() for every node. Its value contains an std::pair, where the first element is a TreeNode* to the closest leaf node of the node at the index, measured using the branch_length; the second element of the pair is the distance value itself. Thus, leaf nodes will have a pointer to themselves and a distance value of 0.

See also furthest_leaf_distance_vector().

Definition at line 347 of file tree/default/distances.cpp.

std::vector< std::pair< TreeNode const *, double > > closest_leaf_distance_vector ( Tree const &  tree,
utils::Matrix< double > const &  node_distances 
)

Definition at line 356 of file tree/default/distances.cpp.

Tree convert ( Tree const &  source,
std::function< std::unique_ptr< BaseNodeData >(BaseNodeData const &node_data)>  node_data_converter,
std::function< std::unique_ptr< BaseEdgeData >(BaseEdgeData const &edge_data)>  edge_data_converter 
)

Create a tree with the same topology as the source tree, while converting its data.

This function takes the given source Tree (possibly with different data types at the nodes and edges), and copies its topology (i.e., all links, nodes and edges, and their structure) to the newly created result tree.

The data types are then converted using the two provided functions for the node data type and edge data type, respectively. If a node or an edge does not have data (i.e., the data pointer is a nullptr), the converter functions are not called, but the data of the new tree at that node or edge is also set to a nullptr.

Definition at line 58 of file tree/function/operators.cpp.

AttributeTree genesis::tree::convert_default_tree_to_attribute_tree ( DefaultTree const &  source)
inline

Helper function that takes a DefaultTree (or any Tree with Node and Edge data derived from it) and turns its data into an AttributeTree, that is, a Tree with AttributeTreeNodeData and AttributeTreeEdgeData.

Definition at line 211 of file tree/attribute_tree/tree.hpp.

MassTree genesis::tree::convert_default_tree_to_mass_tree ( DefaultTree const &  source)
inline

Helper function that takes a DefaultTree (or any Tree with Node and Edge data derived from it) and turns its data into an MassTree, that is, a Tree with MassTreeNodeData and MassTreeEdgeData.

Definition at line 213 of file tree/mass_tree/tree.hpp.

DefaultTree convert_to_default_tree ( Tree const &  source_tree)

Convert a Tree to a DefaultTree with DefaultNodeData and DefaultEdgeData.

This works for all Trees that have data derived from those two data base classes. Also nodes or edges without data work. However, an expection is thrown if any node or edge contains data that is not derived from DefaultNodeData and DefaultEdgeData, respectively.

The data itself is copied using the clone functions, so that all names and branch lengths are transferred to the returned Tree.

Definition at line 44 of file tree/default/operators.cpp.

double deepest_distance ( Tree const &  tree)

Return the longest distance from any point in the tree (on the edges) to any leaf.

Definition at line 275 of file tree/default/distances.cpp.

double diameter ( Tree const &  tree)

Get the diameter of the tree, i.e., the longest distance between any two nodes, measured using the branch_length.

Definition at line 210 of file tree/default/functions.cpp.

double earth_movers_distance ( MassTree const &  lhs,
MassTree const &  rhs,
double  p = 1.0 
)

Calculate the earth mover's distance of two distributions of masses on a given Tree.

The earth mover's distance is typically a distance measure between two distributions. See [Earth mover's distance](https://en.wikipedia.org/wiki/Earth_mover's_distance) for an introduction.

In our case, we use distibutions of masses along the branches of a tree. Each branch can have multiple masses at different positions within [0.0, branch_length].

The distance is calculated as the amount of work needed to move the masses of one distribution so that they end up in the positions of the masses of the other distribution. Work is here defined as mass times dislocation. Thus, the work is higher if either more mass has to be moved, or if mass has to be moved further.

Here, the parameter p is used to control the influence of mass and distance, with 0.0 < p < inf, and default p == 1.0, which is the neutral case. A larger p increases the impact of distance traveled, while a smaller p emphasizes differences of mass. For details, see the references cited below.

The resulting distance is independed of the rooting of the tree and commutative with respect to the two mass distributions.

The earth mover's distance is only meaningful if both mass distributions contain the same amount of total mass. See mass_tree_sum_of_masses() to check this. Also, in order to give comparable results over different tree topologies, the mass can be normalized using mass_tree_normalize_masses(). Then, the result of the earth mover's distance is always in the range [ 0.0, 1.0 ].

See earth_movers_distance( Sample const&, ... ) for an exemplary usage of this function, which applies the earth mover's distance to the placement weights (like_weight_ratio) of a PlacementTree.

References:

[1] Guppy Documentation: http://matsen.github.io/pplacer/generated_rst/guppy_kr.html#guppy-kr

[2] F. A. Matsen and S. N. Evans, “Edge principal components and squash clustering: using the special structure of phylogenetic placement data for sample comparison.”, PLoS One, 2011. DOI: 10.1371/journal.pone.0056859

[3] S. N. Evans and F. A. Matsen, “The phylogenetic Kantorovich-Rubinstein metric for environmental sequence samples.”, Statistical Methodology, 2012. DOI: 10.1111/j.1467-9868.2011.01018.x

Definition at line 60 of file tree/mass_tree/emd.cpp.

utils::Matrix< double > earth_movers_distance ( std::vector< MassTree > const &  trees,
double  p = 1.0 
)

Calculate the pairwise earth mover's distance for all MassTrees.

The result is a pairwise distance Matrix using the indices of the given vector. See earth_movers_distance( MassTree const&, MassTree const&, double ) for details on the calculation.

Definition at line 183 of file tree/mass_tree/emd.cpp.

std::pair< double, double > earth_movers_distance ( MassTree const &  tree,
double  p = 1.0 
)

Calculate the earth mover's distance of masses on a given Tree.

This function is mainly used as a speed-up for calculating earth_movers_distance( MassTree const&, MassTree const& ). See there for more details.

It uses the following convention for the two distributions: The masses of one distribution are stored using a positive sign, the masses of the other distribution use a negative sign. This way, only one Tree needs to be stored, and the algorithm is significantly simplyfied.

Thus, as the earth mover's distance is only meaningful if both distributions have the same sum, and we use opposite signs to store the masses, the sum of all masses on the tree should ideally be zero (apart from numerical derivations). See mass_tree_sum_of_masses() and mass_tree_validate() for functions to verify this.

The function returns two doubles: The first one is the actual distance, the second one gives the remaining mass at the root node. This should also be close to 0.0, as there, all masses from the subtrees should ideally cancel each other out. Use this value to check whether this actually worked out. Too big numbers indicate that something is wrong with the sums of the signed masses.

Definition at line 237 of file tree/mass_tree/emd.cpp.

TreeEdge * edge_between ( TreeNode &  lhs,
TreeNode &  rhs 
)

Return the TreeEdge between two TreeNode&s, if they are neighbours, or nullptr otherwise.

Definition at line 247 of file tree/function/operators.cpp.

TreeEdge const * edge_between ( TreeNode const &  lhs,
TreeNode const &  rhs 
)

Return the TreeEdge between two TreeNode&s, if they are neighbours, or nullptr otherwise.

Definition at line 262 of file tree/function/operators.cpp.

utils::Matrix< double > edge_branch_length_distance_matrix ( Tree const &  tree)

Definition at line 154 of file tree/default/distances.cpp.

std::vector< double > edge_branch_length_distance_vector ( Tree const &  tree,
TreeEdge const *  edge 
)

Definition at line 221 of file tree/default/distances.cpp.

std::vector< utils::Color > edge_color_branch_length_gradient ( Tree const &  tree,
bool  zero_based 
)

Definition at line 51 of file tree/default/edge_color.cpp.

utils::Matrix< size_t > edge_path_length_matrix ( Tree const &  tree)

Definition at line 154 of file tree/function/distances.cpp.

std::vector< size_t > edge_path_length_vector ( Tree const &  tree,
TreeEdge const &  edge 
)

Definition at line 208 of file tree/function/distances.cpp.

utils::Matrix< signed char > edge_sides ( Tree const &  tree)

Create a Matrix that indiciaces the relative position of the Edges of a Tree, i.e., whether they are on the root side or non-root side.

For a Tree with e many Edges, the resulting square Matrix has dimensions ( e, e ). The row at index i corresponds to edge i, which can be accessed via Tree::edge_at(). The entries of that row indicate whether the other edges (as indexed by the columns) are on the root side of the edge i, in which case the entry is -1, or on the non-root side, in which case the entry is 1. The entry for the edge i itself (which is the diagonale) is 0.

In other words, all edges on the proximal side of a given edge are denoted by a -1, while all edges on the distal side of the given edge are denoted by a 1.

As a consequence, the rows of edges that lead to a leaf are full of 1s (with the exception of the diagonal entry, which still is 0). This is because for leaf edges, all other edges are on the root side of the tree.

Definition at line 114 of file tree/function/functions.cpp.

bool equal ( Tree const &  lhs,
Tree const &  rhs,
std::function< bool(TreeNode const &, TreeNode const &) >  node_comparator,
std::function< bool(TreeEdge const &, TreeEdge const &) >  edge_comparator 
)

Compares two trees for equality given binary comparator functionals for their nodes and edges.

This function does a preorder traversal of both trees in parallel and calls the comparator functionals for each position of the iterator. It returns true iff the comparators are true for every position.

The comparator functionals can be either function pointers, function objects, or lambda expressions.

As the traversal is done in parallel, the trees are also checked for equal topology: their elements (links, nodes, edges) have to be equal in size and the rank of each node during the traversal has to be identical in both trees. Those assumptions are made because two trees that do not have identical topology are never considered equal.

Definition at line 101 of file tree/function/operators.cpp.

utils::Range< IteratorEulertour< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::eulertour ( ElementType const &  element)

Definition at line 190 of file eulertour.hpp.

utils::Range< IteratorEulertour< TreeLink, TreeNode, TreeEdge > > genesis::tree::eulertour ( ElementType &  element)

Definition at line 200 of file eulertour.hpp.

TreeNode const * find_node ( Tree const &  tree,
const std::string &  name,
bool  replace_underscores 
)

Finds a Node, given its name. If not found, nullptr is returned.

Definition at line 146 of file tree/default/functions.cpp.

TreeNode * find_node ( Tree &  tree,
const std::string &  name,
bool  replace_underscores 
)

Finds a Node, given its name. If not found, nullptr is returned.

Definition at line 168 of file tree/default/functions.cpp.

Tree * find_tree ( TreeSet &  tset,
std::string const &  name 
)

Get the first Tree in a TreeSet that is stored with a given name, or nullptr if not found.

Definition at line 54 of file function/tree_set.cpp.

Tree const * find_tree ( TreeSet const &  tset,
std::string const &  name 
)

Get the first Tree in a TreeSet that is stored with a given name, or nullptr if not found.

Definition at line 68 of file function/tree_set.cpp.

std::vector< std::pair< TreeNode const *, double > > furthest_leaf_distance_vector ( Tree const &  tree)

Opposite of closest_leaf_distance_vector().

Definition at line 363 of file tree/default/distances.cpp.

std::vector< std::pair< TreeNode const *, double > > furthest_leaf_distance_vector ( Tree const &  tree,
utils::Matrix< double > const &  node_distances 
)

Definition at line 372 of file tree/default/distances.cpp.

double height ( Tree const &  tree)

Get the height of the tree, i.e., the longest distance from the root to a leaf, measured using the branch_length.

Definition at line 200 of file tree/default/functions.cpp.

bool identical_topology ( Tree const &  lhs,
Tree const &  rhs 
)

Returns true iff both trees have an identical topology.

The topology is considered identical only if the order of edges is also the same in both trees. This means, although two trees might have the same number of leaves and branches, they might still be not identical (with respect to this function) when the branches appear in a different order or when the root sits at a different node.

Definition at line 173 of file tree/function/operators.cpp.

size_t inner_edge_count ( Tree const &  tree)

Return the number of Edges of a Tree that do not lead to a leaf Node.

Definition at line 99 of file tree/function/functions.cpp.

size_t inner_node_count ( Tree const &  tree)

Count the number of inner Nodes.

Definition at line 83 of file tree/function/functions.cpp.

bool is_bifurcating ( Tree const &  tree)

Return whether the Tree is bifurcating.

Definition at line 66 of file tree/function/functions.cpp.

void ladderize ( Tree &  tree,
LadderizeOrder  order 
)

Definition at line 273 of file manipulation.cpp.

std::vector<std::pair< TreeNode const*, double > > genesis::tree::leaf_distance_vector ( Tree const &  tree,
utils::Matrix< double > const &  node_distances,
Comparator  comp 
)

Local helper function to calculate either closest_leaf_distance_vector() or furthest_leaf_distance_vector().

Definition at line 300 of file tree/default/distances.cpp.

size_t leaf_edge_count ( Tree const &  tree)

Return the number of Edges of a Tree that lead to a leaf Node.

Definition at line 88 of file tree/function/functions.cpp.

size_t leaf_node_count ( Tree const &  tree)

Count the number of leaf Nodes of a Tree.

Definition at line 71 of file tree/function/functions.cpp.

double length ( Tree const &  tree)

Get the length of the tree, i.e., the sum of all branch lengths.

Definition at line 187 of file tree/default/functions.cpp.

utils::Range< IteratorLevelorder< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::levelorder ( ElementType const &  element)

Definition at line 232 of file tree/iterator/levelorder.hpp.

utils::Range< IteratorLevelorder< TreeLink, TreeNode, TreeEdge > > genesis::tree::levelorder ( ElementType &  element)

Definition at line 242 of file tree/iterator/levelorder.hpp.

TreeNode const & lowest_common_ancestor ( TreeNode const &  node_a,
TreeNode const &  node_b 
)

Return the lowest common ancestor of two TreeNodes.

Definition at line 391 of file tree/function/functions.cpp.

TreeNode & lowest_common_ancestor ( TreeNode &  node_a,
TreeNode &  node_b 
)

Return the lowest common ancestor of two TreeNodes.

Definition at line 427 of file tree/function/functions.cpp.

utils::Matrix< size_t > lowest_common_ancestors ( Tree const &  tree)

Return the lowest common ancestor of each pair of TreeNodes for a given tree, in form of a Matrix of their indices.

The entries in the resulting Matrix are the Node indices of the lowest common ancestor (LCA) of a given pair of Nodes. For example, for the result Matrix r, the entry r[ 3, 5 ] == 7 means that the LCA of Nodes 3 and 5 is Node 7.

These Nodes can for example be accesses via Tree::node_at().

Definition at line 434 of file tree/function/functions.cpp.

double mass_tree_binify_masses ( MassTree &  tree,
size_t  number_of_bins 
)

Accumulate all masses of a MassTree into bins on each branch.

Each branch is divided into intervals of equal size, where number_of_bins is the number of those intervals. The mid points of these intervals are then used as bins, to which the masses on the branch are moved. Each mass point is moved to its closest bin, so that all mass is accumulated at the bins.

This function is useful to reduce the data load of big MassTrees, without affecting the accuracy of downstream analyses too much. Using the interval mid points as bins means that masses are moved as little as possible.

Example: Given number_of_bins == 6, for a branch of length 3.6, the bins look like this:

Intervals   0.0   0.6   1.2   1.8   2.4   3.0   3.6
            |     |     |     |     |     |     |
               ^     ^     ^     ^     ^     ^
Bins           0.3   0.9   1.5   2.1   2.7   3.3

The function returns the work (mass times distance) that was needed to move the masses to the bins.

Definition at line 213 of file tree/mass_tree/functions.cpp.

double mass_tree_center_masses_on_branches ( MassTree &  tree)

Accumulate all masses of a MassTree on the centers of their edges.

This function can be used to minimize the data load of a MassTree. It is equal to mass_tree_binify_masses() when using number_of_bins == 1.

Return the work (mass times distance) that was needed to move the masses to the centers.

Definition at line 155 of file tree/mass_tree/functions.cpp.

double mass_tree_center_masses_on_branches_averaged ( MassTree &  tree)

Accumulate all masses of a MassTree at the average mass position per edge.

This function is similar to mass_tree_center_masses_on_branches(), but instead of accumulating the masses at the branch center, they are accumulated at their average position on the branch.

Return the work (mass times distance) that was needed to move the masses to the centers.

Definition at line 175 of file tree/mass_tree/functions.cpp.

void mass_tree_clear_masses ( MassTree &  tree)

Clear all masses of a MassTree, while keeping its topology.

Definition at line 96 of file tree/mass_tree/functions.cpp.

std::vector< double > mass_tree_mass_per_edge ( MassTree const &  tree)

Return a std::vector that contains the total Mass for each edge of the given MassTree.

The vector is indexed using the index of the edges.

Definition at line 260 of file tree/mass_tree/functions.cpp.

MassTree mass_tree_merge_trees ( MassTree const &  lhs,
MassTree const &  rhs,
double const  scaler_lhs = 1.0,
double const  scaler_rhs = 1.0 
)

Merge all masses of two MassTrees into one and return it.

The two scalers can be used to weight the masses differently, if needed.

The resulting tree will have a mass of scaler_lhs * mass(lhs) + scaler_rhs * mass(rhs), which usually is not unit mass any more. Thus, if needed, call mass_tree_normalize_masses() to rescale the masses back to unit mass.

Definition at line 60 of file tree/mass_tree/functions.cpp.

void mass_tree_merge_trees_inplace ( MassTree &  lhs,
MassTree const &  rhs,
double const  scaler_lhs = 1.0,
double const  scaler_rhs = 1.0 
)

Merge all masses of two MassTrees by adding them to the first MassTree.

The two scalers can be used to weight the masses differently, if needed.

The resulting tree will have a mass of scaler_lhs * mass(lhs) + scaler_rhs * mass(rhs), which usually is not unit mass any more. Thus, if needed, call mass_tree_normalize_masses() to rescale the masses back to unit mass.

Definition at line 71 of file tree/mass_tree/functions.cpp.

void mass_tree_normalize_masses ( MassTree &  tree)

Scale all masses of a MassTree so that they sum up to 1.0.

Definition at line 125 of file tree/mass_tree/functions.cpp.

void mass_tree_reverse_signs ( MassTree &  tree)

Reverse the sign of each mass point on a MassTree.

Definition at line 103 of file tree/mass_tree/functions.cpp.

void mass_tree_scale_masses ( MassTree &  tree,
double  factor 
)

Scale all masses of a MassTree with the multiplicative factor factor.

Definition at line 114 of file tree/mass_tree/functions.cpp.

double mass_tree_sum_of_masses ( MassTree const &  tree)

Return the total sum of all masses on the MassTree.

In order for the earth_movers_distance() algorithm to work properly (and give meaningful results), the total mass on the MassTrees should ideally be the same. This function can be used to check this.

Because of numerical issues however, be aware that the result might be slighly off. This is okay, as it usually is in the last digits of the double.

Definition at line 274 of file tree/mass_tree/functions.cpp.

void mass_tree_transform_to_unit_branch_lengths ( MassTree &  tree)

Set all branch lengths of a MassTree to 1.0, while keeping the relative position of all masses on the branches.

Definition at line 140 of file tree/mass_tree/functions.cpp.

bool mass_tree_validate ( MassTree const &  tree,
double  valid_total_mass_difference = -1.0 
)

Validate the data on a MassTree.

This function returns true iff the data on the Tree is valid:

  • The node and edge data types have to be MassTreeNodeData and MassTreeEdgeData, respectively.
  • The positions of the masses are in [ 0.0, branch_length ] on their respective branches.
  • If the optional arugment valid_total_mass_difference is not negative, the sum of all masses is also checked. It has to be close to 0.0, using the argument as the absolute allowed difference. This is useful to check whether the masses for calculating the one-argument version of the earth_movers_distance( MassTree const& ) are correct.

The function stops at the first encountered invalid condition and outputs a description message of the invalid value to LOG_INFO.

Parameters
treeMassTree to be validated.
valid_total_mass_differenceIf set to a non-negative value, it is used as the absolute allowed difference from zero for the total sum of all masses on the tree.

Definition at line 285 of file tree/mass_tree/functions.cpp.

size_t max_rank ( Tree const &  tree)

Return the highest rank of the Nodes of a Tree.

If the Tree is empty, 0 is returned.

Definition at line 57 of file tree/function/functions.cpp.

utils::Matrix< double > node_branch_length_distance_matrix ( Tree const &  tree)

Return a distance matrix containing pairwise distances between all Nodes, using the branch_length of the Edges as distance measurement.

The elements of the matrix are indexed using node().index().

Definition at line 54 of file tree/default/distances.cpp.

std::vector< double > node_branch_length_distance_vector ( Tree const &  tree,
TreeNode const *  node = nullptr 
)

Return a vector containing the distance of all nodes with respect to the given start node, where distance is measured in the sum of branch lengths between the nodes.

The vector is indexed using the node().index() for every node. Its elements give the distance of each node with respect to the given start node. The distance is the sum of branch lengths of the edges visited on the path between the two nodes.

If no Node pointer is provided, the root is taken as node.

Definition at line 120 of file tree/default/distances.cpp.

utils::Range< IteratorNodeLinks< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::node_links ( ElementType const &  element)

Definition at line 175 of file node_links.hpp.

utils::Range< IteratorNodeLinks< TreeLink, TreeNode, TreeEdge > > genesis::tree::node_links ( ElementType &  element)

Definition at line 185 of file node_links.hpp.

std::unordered_set< std::string > node_names ( Tree const &  tree,
bool  leaves_only 
)

Returns an unordered set of all TreeNode names of a Tree.

If leaves_only is set to true, nodes names of inner nodes are not included. Unnamed nodes (node.data.name == "") are always excluded. The only difference to node_names_sorted() is the type of container used for storing the result.

The provided Tree needs to have TreeNodes with data types deriveed from DefaultNodeData.

Definition at line 54 of file tree/default/functions.cpp.

std::unordered_set< std::string > node_names ( TreeSet const &  tree_set,
bool  leaves_only 
)

Returns a set of all TreeNode names of a TreeSet.

The function returns the set of all names of all Trees in the set. See node_names(...) this version of the function for details.

Definition at line 106 of file tree/default/functions.cpp.

utils::SortedVector< std::string > node_names_sorted ( Tree const &  tree,
bool  leaves_only 
)

Returns a set of all TreeNode names of a Tree.

If leaves_only is set to true, nodes names of inner nodes are not included. Unnamed nodes (node.data.name == "") are always excluded. The only difference to node_names() is the type of container used for storing the result.

The provided Tree needs to have TreeNodes with data types deriveed from DefaultNodeData.

Definition at line 81 of file tree/default/functions.cpp.

utils::SortedVector< std::string > node_names_sorted ( TreeSet const &  tree_set,
bool  leaves_only 
)

Returns a set of all TreeNode names of a TreeSet.

The function returns the set of all names of all Trees in the set. See node_names_sorted(...) this version of the function for details.

Definition at line 127 of file tree/default/functions.cpp.

utils::Matrix< size_t > node_path_length_matrix ( Tree const &  tree)

Return a matrix containing the pairwise depth of all nodes of the tree.

See node_path_length_vector(...) for more information.

The vector is indexed using the node().index() for every node.

Definition at line 59 of file tree/function/distances.cpp.

std::vector< size_t > node_path_length_vector ( Tree const &  tree,
TreeNode const &  node 
)

Return a vector containing the depth of all nodes with respect to the given start node.

The vector is indexed using the node().index() for every node. Its elements give the depth of each node with respect to the given start node. The depth is the number of edges visited on the path between two nodes (0 for itself, 1 for immediate neighbours, etc).

Definition at line 105 of file tree/function/distances.cpp.

std::vector< size_t > node_path_length_vector ( Tree const &  tree)

Return a vector containing the depth of all nodes with respect to the root node.

This function calls and returns the value of node_path_length_vector(...) using the root node of the tree.

Definition at line 148 of file tree/function/distances.cpp.

utils::Matrix< signed char > node_root_direction_matrix ( Tree const &  tree)

Calculate a Matrix that indicates the nodes on the root side of a given node.

The row and column indices of the Matrix represent TreeNode indices. Each element of the Matrix indicates whether the column node is in the subtree of the row node that contains the root (value 1), or in a subtree that does not contain the root (value -1), while the diagonale contains 0.

Definition at line 141 of file tree/function/functions.cpp.

std::ostream & operator<< ( std::ostream &  out,
Tree const &  tree 
)

Definition at line 278 of file tree/function/operators.cpp.

utils::Range< IteratorPath< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::path ( ElementType const &  start,
ElementType const &  finish 
)

Definition at line 325 of file path.hpp.

utils::Range< IteratorPath< TreeLink, TreeNode, TreeEdge > > genesis::tree::path ( ElementType &  start,
ElementType &  finish 
)

Definition at line 335 of file path.hpp.

utils::Range< IteratorPathSet< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::path_set ( ElementType const &  start,
ElementType const &  finish,
ElementType const &  lca 
)

Definition at line 304 of file path_set.hpp.

utils::Range< IteratorPathSet< TreeLink, TreeNode, TreeEdge > > genesis::tree::path_set ( ElementType &  start,
ElementType &  finish,
ElementType &  lca 
)

Definition at line 314 of file path_set.hpp.

std::vector< TreeLink const * > path_to_root ( TreeNode const &  node)

Helper function that finds all TreeLinks between a given TreeNode and the root of the Tree.

Both the primary_link() of the Node and the root_link() of the Tree are included in the list. The order of the list starts at the provided node and ends at the root.

Definition at line 358 of file tree/function/functions.cpp.

utils::Range< IteratorPostorder< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::postorder ( ElementType const &  element)

Definition at line 236 of file tree/iterator/postorder.hpp.

utils::Range< IteratorPostorder< TreeLink, TreeNode, TreeEdge > > genesis::tree::postorder ( ElementType &  element)

Definition at line 246 of file tree/iterator/postorder.hpp.

utils::Range< IteratorPreorder< TreeLink const, TreeNode const, TreeEdge const > > genesis::tree::preorder ( ElementType const &  element)

Definition at line 220 of file tree/iterator/preorder.hpp.

utils::Range< IteratorPreorder< TreeLink, TreeNode, TreeEdge > > genesis::tree::preorder ( ElementType &  element)

Definition at line 230 of file tree/iterator/preorder.hpp.

void reroot ( Tree &  tree,
TreeLink &  at_link 
)

Reroot the Tree at the given TreeLink.

The function sets the root of the tree to the node of the given link. This operation does not change the topology of the tree, but merely adjusts some internal properties. The main changes are that Tree::root_node() and Tree::root_link() will return the new root after calling this function, and that tree iterators will start traversing the tree from this new root by default.

There are three internal changes made to the tree data structure:

  • All primary and secondary ends of the edges on the path between the new root and the old root are swapped. This is because the edges now need to point towards the new root.
  • Similarly, all (primary) links of the nodes on that path are changed so that they point towards the new root.
  • Also, the (internal) root_link_index is changed to the new root link. This is used for the functions Tree::root_node() and Tree::root_link().

The difference between this function and reroot( Tree& tree, TreeNode const& ) is that when specifying a specific link, this link is used as the (primary) link of the new root node. This way, algorithms and iterators (e.g., IteratorLevelorder) will start traversing the tree in the direction of this link by default. When specifying a node for rerooting instead, the primary link of that node is used, so that iterators start traversing the tree in the direction of the old root instead. For most applications, this does not make a difference. However, there might be cases where the start directing makes a difference. Thus, we offer both versions of this function.

The link needs to be part of the tree, otherwise an exception is thrown.

Definition at line 205 of file manipulation.cpp.

void reroot ( Tree &  tree,
TreeNode &  at_node 
)

Reroot the Tree at the given TreeNode.

See reroot( Tree&, TreeLink& ) for details.

The node needs to be part of the tree, otherwise an exception is thrown.

Definition at line 253 of file manipulation.cpp.

void reroot_at_node ( Tree &  tree,
size_t  node_index 
)

Reroot the Tree at the TreeNode with the given index.

See reroot( Tree&, TreeLink& ) for details.

The node index needs to be valid for the tree, otherwise an exception is thrown.

Definition at line 261 of file manipulation.cpp.

void scale_all_branch_lengths ( Tree &  tree,
double  factor 
)

Scale all branch lengths of a Tree by a given factor.

This function simply multiplies all branch lengths with the given factor. See also set_all_branch_lengths() for setting the branch lengths to a value.

Definition at line 251 of file tree/default/functions.cpp.

void set_all_branch_lengths ( Tree &  tree,
double  length 
)

Set all branch lengths of a Tree to a given value.

See also scale_all_branch_lengths() for a scaling function.

Definition at line 236 of file tree/default/functions.cpp.

std::string squash_cluster_tree ( SquashClustering const &  sc,
std::vector< std::string > const &  labels 
)

Build a Newick-format tree for visualizing the result of a squash_clustering().

The resulting Tree is a tree of samples, i.e., each leaf node represents one MassTree that was used as input for the Squash Clustering. The labels vector needs to contain the labels for those tips, in the order of elements that was used for running squash_clustering().

Definition at line 237 of file squash_clustering.cpp.

SquashClustering squash_clustering ( std::vector< MassTree > &&  trees,
double const  p = 1.0 
)

Perfom Squash Clustering.

See the guppy documentation and the corresponding paper for details on this algorithm.

The funciton takes MassTrees as input, which are consumed. The optional parameter p is used as exponent to calculate the earth_movers_distance(). See there for details.

Definition at line 201 of file squash_clustering.cpp.

SquashClustering genesis::tree::squash_clustering_init ( std::vector< MassTree > &&  trees,
double const  p 
)

Definition at line 56 of file squash_clustering.cpp.

void genesis::tree::squash_clustering_merge_clusters ( SquashClustering &  sc,
size_t  i,
size_t  j,
double const  p 
)

Definition at line 132 of file squash_clustering.cpp.

std::pair<size_t, size_t> genesis::tree::squash_clustering_min_entry ( SquashClustering const &  sc)

Definition at line 91 of file squash_clustering.cpp.

size_t subtree_max_path_height ( Tree const &  tree,
TreeLink const &  link 
)

Calculate the height of a subtree, that is, the maximum path length to a leaf of that subtree, measured in edges between the link and the leaf.

Definition at line 289 of file tree/function/functions.cpp.

std::vector< size_t > subtree_max_path_heights ( Tree const &  tree,
TreeNode const &  node 
)

Definition at line 310 of file tree/function/functions.cpp.

std::vector< size_t > subtree_max_path_heights ( Tree const &  tree)

Definition at line 349 of file tree/function/functions.cpp.

size_t subtree_size ( Tree const &  tree,
TreeLink const &  link 
)

Return the size of the subtree defined by the given TreeLink, measured in number of nodes.

Definition at line 194 of file tree/function/functions.cpp.

std::vector< size_t > subtree_sizes ( Tree const &  tree,
TreeNode const &  node 
)

Calculate the sizes of all subtrees as seen from the given TreeNode.

The function returns a vector with as many elements as the Tree has nodes. The vector is indexed using the TreeNode::index() values.

Each value in the vector tells the size (in number of nodes) of the subtree of the correnspinding node, as seen from the given starting node, and excluding that starting node.

In methaphorical words, the given starting node is used as a hook where the tree is suspended from, so that it hangs down. A subtree is then the part of the tree that "hangs down" from a certain node. We then count the number of nodes in each of those subtrees (that is, we examine the subtree starting at each node of the tree). For the starting node, the count is always the number of nodes of the tree minus one (because the node is not counted itself).

Definition at line 216 of file tree/function/functions.cpp.

std::vector< size_t > subtree_sizes ( Tree const &  tree)

Calculate the sizes of all subtrees as seen from the root of the tree.

See subtree_sizes(...) for details.

Definition at line 284 of file tree/function/functions.cpp.

std::string genesis::tree::svg_arc ( double  center_x,
double  center_y,
double  radius,
double  start_angle,
double  end_angle 
)

Definition at line 102 of file circular_layout.cpp.

bool genesis::tree::tree_data_is ( Tree const &  tree)

Check whether the data of the nodes and edges of the Tree are exactly of the specified data types.

This function returns true iff all data types of the nodes and edges match exaclty the specified types. It uses typeid() for this matching.

Definition at line 63 of file tree/function/operators.hpp.

bool genesis::tree::tree_data_is_derived_from ( Tree const &  tree)

Check whether the data of the nodes and edges of the Tree are derived from the specified data types.

This function returns true iff all data types of the nodes and edges are derived from the specified types. It uses dynamic_cast() for this.

Definition at line 98 of file tree/function/operators.hpp.

bool validate_topology ( Tree const &  tree)

Validate that all internal pointers of the Tree elements (TreeLinks, TreeNodes, TreeEdges) to each other are correct and that some other invariants are met.

Validate the correctness of all Tree pointers etc.

This check is a bit pedantic, but better safe than sorry.

Definition at line 296 of file tree/function/operators.cpp.

Typedef Documentation

Alias for a Tree that stores TreeNodes and TreeEdges with string attributes on them.

Definition at line 51 of file tree/attribute_tree/tree.hpp.

Alias for a TreeEdge of an AttributeTree. See there for more information.

Definition at line 57 of file tree/attribute_tree/tree.hpp.

Alias for a TreeLink of an AttributeTree. See there for more information.

Definition at line 63 of file tree/attribute_tree/tree.hpp.

using AttributeTreeMap = std::map<std::string, std::string>

Alias for the map type used by an AttributeTree.

We define this alias in the namespace instead of inside of a class, because it is used in multiple places. Defining it here once allows to easily change the type in the future, should that be needed.

See AttributeTreeNodeData and AttributeTreeEdgeData for the data classes where this type is used.

Definition at line 81 of file tree/attribute_tree/tree.hpp.

Alias for a TreeNode of an AttributeTree. See there for more information.

Definition at line 69 of file tree/attribute_tree/tree.hpp.

Alias for a tree::Tree used for a tree with information needed for tree drawing.

Definition at line 46 of file circular_tree.hpp.

Alias for tree::TreeEdge used in a CircularTree. See CircularEdgeData for the data stored on the edges.

Definition at line 58 of file circular_tree.hpp.

Alias for tree::TreeLink used in a CircularTree.

Definition at line 63 of file circular_tree.hpp.

Alias for tree::TreeNode used in a CircularTree. See CircularNodeData for the data stored on the nodes.

Definition at line 52 of file circular_tree.hpp.

typedef Tree DefaultTree

Alias for a Tree with data types DefaultNodeData and DefaultEdgeData.

Definition at line 51 of file epca.hpp.

Alias for a TreeEdge of a DefaultTree. See there for more information.

Definition at line 53 of file tree/default/tree.hpp.

Alias for a TreeLink of a DefaultTree. See there for more information.

Definition at line 58 of file tree/default/tree.hpp.

Alias for a TreeNode of a DefaultTree. See there for more information.

Definition at line 63 of file tree/default/tree.hpp.

Alias for a tree::Tree used for a tree with information needed for tree drawing.

Definition at line 51 of file layout_tree.hpp.

Alias for tree::TreeEdge used in a LayoutTree. See LayoutEdgeData for the data stored on the edges.

Definition at line 63 of file layout_tree.hpp.

Alias for tree::TreeLink used in a LayoutTree.

Definition at line 68 of file layout_tree.hpp.

Alias for tree::TreeNode used in a LayoutTree. See LayoutNodeData for the data stored on the nodes.

Definition at line 57 of file layout_tree.hpp.

typedef Tree MassTree

Alias for a Tree that stores masses on its TreeEdges.

It is for example used to calculate the earth movers distance between to sets of masses distributed on a Tree.

See earth_movers_distance( MassTree const& tree, double ) for more details on the purpose of this tree type and on the earth movers distance in general.

The branches of an MassTree hold a list of masses, sorted along their position on the branch.

It is easily possible to merge the masses of two MassTrees by using mass_tree_merge_trees() or mass_tree_merge_trees_inplace().

Lastly, there are some some useful transformation functions:

See there for details.

Definition at line 53 of file placement/function/operators.hpp.

Alias for a TreeEdge of a MassTree. See there for more information.

Definition at line 76 of file tree/mass_tree/tree.hpp.

Alias for a TreeLink of a MassTree. See there for more information.

Definition at line 81 of file tree/mass_tree/tree.hpp.

Alias for a TreeNode of a MassTree. See there for more information.

Definition at line 86 of file tree/mass_tree/tree.hpp.

Alias for a tree::Tree used for a tree with information needed for tree drawing.

Definition at line 46 of file rectangular_tree.hpp.

Alias for tree::TreeEdge used in a RectangularTree. See RectangularEdgeData for the data stored on the edges.

Definition at line 58 of file rectangular_tree.hpp.

Alias for tree::TreeLink used in a RectangularTree.

Definition at line 63 of file rectangular_tree.hpp.

Alias for tree::TreeNode used in a RectangularTree. See RectangularNodeData for the data stored on the nodes.

Definition at line 52 of file rectangular_tree.hpp.

Enumeration Type Documentation

enum LadderizeOrder
strong
Enumerator
kSmallFirst 
kLargeFirst 

Definition at line 168 of file manipulation.hpp.