SuchTree Class API Reference

Overview

The SuchTree class provides high-performance phylogenetic tree manipulation using Cython. It supports: - Fast tree traversal and node queries - Patristic distance calculations - Topological analysis - Multiple tree formats (Newick, URL, file path) - Integration with NetworkX and igraph

Some best practices to keep in mind :

Use node IDs for performance-critical code
Prefer bulk methods (distances_bulk) for multiple calculations
Cache frequently-used properties (like RED values)
Use traversal generators for memory efficiency

The supporting SuchLinkedTrees class provides several useful bookkeeping features for working with co-phylogeny datasets, most importantly the ability to perform reciprocal masking of one tree by clades in its linked tree. Subset masking is thread-safe, fast and non-destructive. However, please note that SuchTree 1.3 is the last release that will include this interface for SuchLinkedTrees. The next major release will include a modernized, multi-tree container with a new interface.

Initialization

class SuchTree(tree_input: Union[str, Path])

Construct from: - Newick string - File path - URL (http/https/ftp)

Example:

tree = SuchTree("(A:0.1,B:0.2,(C:0.3,D:0.4));")
tree = SuchTree("https://example.com/tree.newick")

Core Properties

Tree Structure

Property	Type	Description	Example
`size`	`int`	Total nodes	`tree.size` → 7
`depth`	`int`	Max depth	`tree.depth` → 3
`num_leaves`	`int`	Leaf count	`tree.num_leaves` → 4
`root_node`	`int`	Root ID	`tree.root_node` → 0
`polytomy_epsilon`	`float`	Polytomy resolution	`1e-20`

Node Collections

Property	Type	Description	Contains
`leaves`	`Dict[str, int]`	Name → ID	`{'A': 1, 'B': 2}`
`leaf_nodes`	`Dict[int, str]`	ID → Name	`{1: 'A', 2: 'B'}`
`internal_nodes`	`np.ndarray`	Internal IDs	`[0, 3, 4]`
`all_nodes`	`np.ndarray`	All IDs	`[0, 1, 2, 3, 4, 5, 6]`
`leaf_node_ids`	`np.ndarray`	Leaf IDs	`[1, 2, 5, 6]`
`leaf_names`	`list`	Leaf names	`['A', 'B', 'C', 'D']`

Example Usage

Here are some examples of operations you can do with SuchTree.

# Initialize a tree and print some basic properties
tree = SuchTree("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
print(f"Tree depth: {tree.depth}")
print(f"Leaf names: {tree.leaf_names}")

# Find some node relationships
node_id = tree.leaves['A']
parent_id = tree.get_parent(node_id)
children = tree.get_children(parent_id)

# Distance analysis
dist = tree.distance('A', 'C')
dist_matrix = tree.pairwise_distances(['A', 'B', 'C', 'D'])

# Calculate RED values
red_values = tree.relative_evolutionary_divergence()

# Export tree as a networkx graph (requires networkx)
nx_graph = tree.to_networkx_graph()

Error Handling

NodeNotFoundError : Invalid leaf name
InvalidNodeError : Invalid node ID
TreeStructureError : Invalid tree operations
ValueError : Invalid input format
TypeError : Incorrect argument type

Core Methods

Node Relationships

get_parent(node: Union[int, str]) -> int

Get immediate parent node for a given node.

Args: node: Node identifier as either integer ID or leaf name string

Returns: Integer ID of parent node. Returns -1 if called on root node.

Raises: NodeNotFoundError: If leaf name doesn't exist in the tree InvalidNodeError: If node ID is out of valid range (0 <= id < tree.size)

Example:

parent_id = tree.get_parent("A")
parent_id = tree.get_parent(5)

get_children(node: Union[int, str]) -> Tuple[int, int]

Get direct children of a node.

Args: node: Node identifier as either integer ID or leaf name string

Returns: Tuple of (left_child, right_child) node IDs. Returns (-1, -1) for leaf nodes.

Raises: NodeNotFoundError: If leaf name doesn't exist InvalidNodeError: If node ID is invalid

Note: For multifurcating trees, only the first two children are returned. Use traverse_children() method for complete child iteration.

get_ancestors(node: Union[int, str]) -> Generator[int, None, None]

Generate ancestor node IDs from node to root. Yields parent IDs in ascending order from immediate parent to root.

get_descendants(node_id: int) -> Generator[int, None, None]

Generate all descendant node IDs in depth-first order. Includes the starting node in the output.

get_leaves(node: Union[int, str]) -> np.ndarray

Get array of leaf node IDs descended from a given node. Uses efficient buffer reuse for performance.

get_support(node: Union[int, str]) -> float

Retrieve node support value. Returns -1 if no support available. Works for both internal nodes and leaves.

is_leaf(node: Union[int, str]) -> bool

Check if node is a leaf. Uses optimized Cython implementation for fast checking.

is_internal(node: Union[int, str]) -> bool

Check if node is internal. Simply returns negation of is_leaf but provides clearer intent.

is_ancestor(ancestor: Union[int, str], descendant: Union[int, str]) -> int

Test ancestral relationship. Returns:

1 if ancestor of descendant
-1 if descendant is ancestor
0 if no direct relationship

is_descendant(descendant: Union[int, str], ancestor: Union[int, str]) -> bool

Convenience method that returns True if descendant is indeed a descendant of ancestor.

is_root(node: Union[int, str]) -> bool

Check if node is the tree root. Uses direct comparison with stored root node ID.

is_sibling(node1: Union[int, str], node2: Union[int, str]) -> bool

Check if two nodes share the same parent. Automatically returns False if either node is root.

has_children(node: Union[int, str]) -> bool

Determine if node has any children. Equivalent to is_internal but may be more intuitive for some users.

has_parent(node: Union[int, str]) -> bool

Check if node has a parent (i.e., is not root). Returns negation of is_root.

common_ancestor(a: Union[int, str], b: Union[int, str]) -> int

Find most recent common ancestor of two nodes. Uses optimized MRCA algorithm with visited node tracking.

path_between_nodes(a: Union[int, str], b: Union[int, str]) -> List[int]

Get node IDs forming the path between two nodes through their common ancestor. Returns list from a -> MRCA -> b.

Distance Analysis

SuchTree was originally built to compute patristic (leaf-to-leaf) distances as efficiently as possible. You have two choices to make :

Do you want to use leaf names or leaf IDs?
Do you want to to calculate distances for single a pair of leafs, or a table of leaf pairs?

It is important to remember that even if you have two trees with exactly the same leaf names, the leafs will have different node IDs if the topologies are different. For those cases, it is better to use use the leaf name to ID mappings (SuchTree.leaves and SuchTree.leaf_nodes) to associate your leaf nodes with other data.

distance(a: Union[int, str], b: Union[int, str]) -> float

Calculate patristic distance between two nodes along the tree.

Args: a: First node identifier (ID or name) b: Second node identifier (ID or name)

Returns: Sum of branch lengths along the path between nodes via their most recent common ancestor (MRCA)

Raises: NodeNotFoundError: If either node name doesn't exist InvalidNodeError: If either node ID is invalid

Complexity: O(h) where h is the height of the tree. Uses cached ancestor paths for optimal performance.

Example:

dist = tree.distance("A", "B")
dist = tree.distance(2, 5)

distances_bulk(pairs: np.ndarray) -> np.ndarray

Efficiently compute distances for multiple node pairs. Accepts (n, 2) array of node IDs. Uses Cython nogil implementation.

distances_by_name(pairs: List[Tuple[str, str]]) -> List[float]

Convenience wrapper for bulk distance calculation using leaf names instead of IDs.

pairwise_distances(nodes: list = None) -> np.ndarray

Generate full distance matrix for specified nodes (all leaves by default). Returns symmetric numpy array.

distance_to_root(node: Union[int, str]) -> float

Calculate total branch length from node to root. Optimized with cumulative distance caching.

nearest_neighbors(node: Union[int, str], k=1) -> List[Tuple[Union[int, str], float]]

Find k nearest neighbors to a node. Can search among specific nodes or all leaves by default.

Tree Traversal

SuchTree includes a collection of generators for traversing the tree in different ways. They all work the same way, other than the different order of traversal.

traverse_inorder(include_distances: bool = True) -> Generator

In-order traversal (left, root, right). Yields node IDs or (ID, distance) tuples.

traverse_preorder(from_node: Union[int, str] = None) -> Generator
"""
Iterate through nodes in pre-order traversal (parent before children).

Args:
    from_node: Starting node (default: root). Can be ID or name.

Yields:
    Node IDs in traversal order

Raises:
    NodeNotFoundError: If from_node name doesn't exist
    InvalidNodeError: If from_node ID is invalid

Memory:
    O(h) space complexity due to stack implementation, where h is tree height

Example:
```python
for node_id in tree.traverse_preorder():
    print(f"Visiting node {node_id}")

traverse_postorder(from_node: Union[int, str] = None) -> Generator

Post-order traversal (left, right, root). Useful for dependency resolution.

traverse_levelorder(from_node: Union[int, str] = None) -> Generator

Breadth-first level order traversal. Yields nodes by depth level.

traverse_leaves_only(from_node: Union[int, str] = None) -> Generator

Efficient traversal that only yields leaf nodes. Skips internal nodes.

traverse_internal_only(from_node: Union[int, str] = None) -> Generator

Traversal that skips leaf nodes. Useful for operations only on internal nodes.

traverse_with_depth(from_node: Union[int, str] = None) -> Generator[Tuple[int, int], None, None]

Traversal yielding (node ID, depth) pairs. Depth starts at 0 for root.

traverse_with_distances(from_node: Union[int, str] = None) -> Generator[Tuple[int, float, float], None, None]

Traversal yielding (node ID, distance to parent, cumulative distance to root).

Topological Analysis

bipartition(node: Union[int, str], by_id=False) -> frozenset
bipartitions(by_id=False) -> Generator[frozenset, None, None]
quartet_topology(a: Union[int, str], b: Union[int, str], c: Union[int, str], d: Union[int, str]) -> frozenset
quartet_topologies_bulk(quartets: Union[list, np.ndarray]) -> np.ndarray
quartet_topologies_by_name(quartets: List[Tuple[str, str, str, str]]) -> List[frozenset]

Graph Operations

adjacency_matrix(from_node: Union[int, str] = None) -> Dict[str, Any]
laplacian_matrix(from_node: Union[int, str] = None) -> Dict[str, Any]
incidence_matrix(from_node: Union[int, str] = None) -> Dict[str, Any]
degree_sequence(from_node: Union[int, str] = None) -> Dict[str, Any]

Export & Conversion

to_networkx_graph(from_node: Union[int, str] = None) -> 'networkx.Graph'
to_newick(include_support=True, include_distances=True) -> str
relative_evolutionary_divergence() -> Dict[int, float]

SuchTree Class API Reference

Overview

Initialization

Core Properties

Tree Structure

Node Collections

Example Usage

Error Handling

Core Methods

Node Relationships

Tree Navigation

Distance Analysis

Tree Traversal

Topological Analysis

Graph Operations

Export & Conversion