Step-by-step instructions

(Mac version)

The steps below are designed to be visited in order.

Step Title Description
Step 1 Setting Up A Build System

The first step is to set up the build system for your platform.

Step 2 Node and Tree Classes

The next step is to create C++ classes representing a node (Node class) in a phylogenetic tree (Tree class).

Step 3 Creating Trees

Input, output, and manipulation of phylogenetic trees are essential parts of our toolkit. We start by creating a tree in memory the hard way, by creating each node and connecting nodes by specifying each node’s neighboring nodes. If this seems way too labor-intensive, you’re right! In the next several steps, we will gradually work toward inputting and outputting trees as Newick tree descriptions.
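To give a feel for what "the hard way" means, here is a minimal sketch of wiring a three-leaf tree by hand. The member names (parent, left_child, right_sib) and the helper functions are assumptions made for this illustration; the classes built in the tutorial may be organized differently.

```cpp
#include <string>
#include <vector>

// Hypothetical node layout: each node points to its parent, its leftmost
// child, and its next sibling to the right.
struct Node {
    std::string name;
    double edge_length = 0.0;
    Node* parent = nullptr;
    Node* left_child = nullptr;
    Node* right_sib = nullptr;
};

// Hypothetical tree: owns its nodes and remembers which one is the root.
struct Tree {
    std::vector<Node> nodes;
    Node* root = nullptr;
};

// Builds (A:0.1,B:0.2,C:0.3) by setting every pointer explicitly.
void buildThreeLeafTree(Tree& t) {
    t.nodes.assign(4, Node());
    Node* root = &t.nodes[0];
    Node* a = &t.nodes[1];
    Node* b = &t.nodes[2];
    Node* c = &t.nodes[3];

    a->name = "A"; a->edge_length = 0.1;
    b->name = "B"; b->edge_length = 0.2;
    c->name = "C"; c->edge_length = 0.3;

    root->left_child = a;
    a->parent = root; a->right_sib = b;
    b->parent = root; b->right_sib = c;
    c->parent = root;

    t.root = root;
}

// Counts the children of a node by walking the sibling chain.
unsigned countChildren(const Node* nd) {
    unsigned n = 0;
    for (const Node* child = nd->left_child; child; child = child->right_sib)
        ++n;
    return n;
}
```

Even for three leaves there are ten assignments to get right, which is why reading a Newick description is the preferred route.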

Step 4 Creating a Tree Manipulator

In this section, we create a class (TreeManip) that manipulates a tree object.

Step 5 Saving trees

In this step you will learn how to save a tree in memory in the form of a string known as a Newick tree description. You will also begin using some functionality from the Boost C++ library.
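As a rough sketch of the idea, the function below walks a tree recursively and writes a Newick description. It uses only the standard library (std::ostringstream) rather than Boost, and the Node layout and the names writeSubtree and makeNewick are assumptions for this example.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical node layout (leftmost child plus right sibling).
struct Node {
    std::string name;
    double edge_length = 0.0;
    Node* left_child = nullptr;
    Node* right_sib = nullptr;
};

// Writes the subtree rooted at nd: leaves by name, internal nodes as a
// parenthesized, comma-separated list of children with edge lengths.
void writeSubtree(const Node* nd, std::ostringstream& out) {
    if (nd->left_child) {
        out << "(";
        for (const Node* child = nd->left_child; child; child = child->right_sib) {
            writeSubtree(child, out);
            out << ":" << child->edge_length;
            if (child->right_sib)
                out << ",";
        }
        out << ")";
    } else {
        out << nd->name;
    }
}

// Returns the Newick description of the tree rooted at root, with edge
// lengths written using the given number of decimal places.
std::string makeNewick(const Node* root, int precision) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(precision);
    writeSubtree(root, out);
    out << ";";
    return out.str();
}
```

For a root with two leaf children A (edge length 0.1) and B (edge length 0.2), makeNewick(&root, 2) yields the string (A:0.10,B:0.20);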

Step 6 Building trees and handling exceptions

In this step you will learn how to build a tree in memory from a Newick tree description and will create an exception class (XStrom) to handle unexpected run-time circumstances.
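One plausible shape for such an exception class is sketched below. Only the name XStrom comes from this tutorial; its members, and the checkParentheses function used to show it in action, are assumptions for illustration.

```cpp
#include <exception>
#include <string>

// Sketch of an exception class that carries a message describing what
// went wrong at run time.
class XStrom : public std::exception {
public:
    explicit XStrom(const std::string& msg) : _msg(msg) {}
    const char* what() const noexcept override { return _msg.c_str(); }
private:
    std::string _msg;
};

// Hypothetical sanity check on a Newick string: throws XStrom if the
// parentheses do not balance.
void checkParentheses(const std::string& newick) {
    int depth = 0;
    for (char ch : newick) {
        if (ch == '(') {
            ++depth;
        } else if (ch == ')') {
            if (--depth < 0)
                throw XStrom("too many right parentheses in tree description");
        }
    }
    if (depth != 0)
        throw XStrom("too many left parentheses in tree description");
}
```

Calling code would wrap tree building in a try block and report x.what() in a catch (const XStrom & x) handler instead of letting the program crash.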

Step 7 Summarizing tree topologies

In this part, you will learn how to summarize trees in terms of their component splits. By the end of this section, your program will read a tree file and report all unique tree topologies and the number of trees having each topology. You will also install the Nexus Class Library in order to read NEXUS-formatted tree files.
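The core idea can be sketched in a few lines: if each split is stored as a set of taxa, then two trees have the same topology exactly when their sets of splits are equal. The sketch below stores a split as a bitmask and assumes rooted clades over a small number of taxa; the names Split, Topology, makeSplit, and countTopologies are hypothetical.

```cpp
#include <initializer_list>
#include <map>
#include <set>
#include <vector>

// Bit i is set if taxon i belongs to the clade defined by the split.
using Split = unsigned long;

// A topology is identified by its set of splits (order does not matter).
using Topology = std::set<Split>;

// Builds a split from a list of taxon indices.
Split makeSplit(std::initializer_list<unsigned> taxa) {
    Split s = 0;
    for (unsigned t : taxa)
        s |= (1UL << t);
    return s;
}

// Tallies how many trees share each distinct topology.
std::map<Topology, unsigned> countTopologies(const std::vector<Topology>& trees) {
    std::map<Topology, unsigned> counts;
    for (const Topology& t : trees)
        ++counts[t];
    return counts;
}
```

Because std::set keeps its elements sorted, two trees whose splits were discovered in different orders still compare equal.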

Step 8 Adding program options

In this part, you will use the Boost Program Options library to add the ability to specify options on the command line so that information such as tree and data file names need not be hard coded.

Step 9 Reading and Storing Data

Now that the program can build trees, the next step is to read sequence data from a NEXUS-formatted file and store the data patterns compactly.
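Compact storage rests on the observation that many sites in an alignment show the same pattern of states across taxa, so each distinct pattern needs to be stored only once along with a count. A minimal sketch, assuming the sequences are already in memory as equal-length strings (countPatterns is a hypothetical name, and reading the file itself is not shown):

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Maps each distinct site pattern (one character per taxon) to the number
// of sites showing that pattern.
std::map<std::string, unsigned> countPatterns(const std::vector<std::string>& sequences) {
    std::map<std::string, unsigned> counts;
    if (sequences.empty())
        return counts;

    const std::size_t nsites = sequences[0].size();
    for (const std::string& seq : sequences) {
        if (seq.size() != nsites)
            throw std::invalid_argument("sequences must all have the same length");
    }

    for (std::size_t site = 0; site < nsites; ++site) {
        std::string pattern;
        for (const std::string& seq : sequences)
            pattern += seq[site];
        ++counts[pattern];
    }
    return counts;
}
```

For the three sequences AAAC, AAAC, and AAAT, the four sites collapse to two patterns: AAA (three sites) and CCT (one site).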

Step 10 Calculating the likelihood

Now that we can load sequence data and a tree into memory, it is time to calculate the likelihood of the tree, which is the probability of the observed data given that tree and a model of evolution.
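For intuition, the smallest possible case can be worked by hand: two sequences joined by a single edge of length v under the Jukes-Cantor (JC69) model. The sketch below is a stand-in for illustration only and does not use BeagleLib; jcLogLikelihood is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>

// Log-likelihood of a two-taxon tree under JC69. Each site contributes
// (1/4) * P(state2 | state1, v), where 1/4 is the equilibrium frequency and
//   P(same)      = 1/4 + (3/4) exp(-4v/3)
//   P(different) = 1/4 - (1/4) exp(-4v/3)
double jcLogLikelihood(const std::string& seq1, const std::string& seq2, double v) {
    if (seq1.size() != seq2.size())
        throw std::invalid_argument("sequences must have the same length");

    const double e = std::exp(-4.0 * v / 3.0);
    const double p_same = 0.25 + 0.75 * e;
    const double p_diff = 0.25 - 0.25 * e;

    double log_like = 0.0;
    for (std::size_t i = 0; i < seq1.size(); ++i) {
        const double site_like = 0.25 * (seq1[i] == seq2[i] ? p_same : p_diff);
        log_like += std::log(site_like);
    }
    return log_like;
}
```

Sites are assumed independent, so the site likelihoods multiply; summing their logs instead keeps the result in a range that a double can represent.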

Step 11 Adding the Model class

In this part we add a Model class to handle the tedium associated with informing BeagleLib of the details of the substitution model. The Eigen library is used to manage calculation of transition probability matrices.

Step 12 The Large Tree Problem

Here we explore the problem of underflow when computing likelihoods of large trees, and how to deal with this in BeagleLib.
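The underflow problem itself is easy to reproduce without any phylogenetics. The sketch below multiplies many small factors directly, which eventually rounds to zero, and then repeats the calculation while factoring out a scaler whenever the running value gets small. It is only an analogy for the rescaling idea, not BeagleLib's interface, and the function names and threshold are assumptions.

```cpp
#include <cmath>
#include <vector>

// Multiplies the factors directly; underflows to 0 if the true product is
// smaller than the smallest representable positive double.
double naiveProduct(const std::vector<double>& factors) {
    double product = 1.0;
    for (double f : factors)
        product *= f;
    return product;
}

// Returns the log of the product. Whenever the running value drops below
// a threshold, its log is moved into log_scaler and the value is reset to
// 1, so the running value never gets close to underflowing.
double logProductRescaled(const std::vector<double>& factors) {
    double value = 1.0;
    double log_scaler = 0.0;
    for (double f : factors) {
        value *= f;
        if (value < 1e-100) {
            log_scaler += std::log(value);
            value = 1.0;
        }
    }
    return std::log(value) + log_scaler;
}
```

With 400 factors of 0.1 the true product is 1e-400, which a double cannot represent, so naiveProduct returns 0 while logProductRescaled returns a finite value close to 400 log(0.1).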

Step 13 Adding a pseudorandom number generator

The next step is to create a class that can generate pseudorandom numbers.
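A minimal sketch of such a class is shown below. It wraps the Mersenne Twister engine from the standard library's random header; the class name Rng and its member functions are assumptions for this example, and the class built in the tutorial may rely on a different library.

```cpp
#include <random>

// Hypothetical wrapper around a Mersenne Twister engine. Setting the seed
// makes a run repeatable, which matters when debugging an MCMC analysis.
class Rng {
public:
    explicit Rng(unsigned seed = 1) : _engine(seed) {}

    // Restarts the stream of numbers from the given seed.
    void setSeed(unsigned seed) { _engine.seed(seed); }

    // Draws a value from Uniform[0, 1).
    double uniform() {
        std::uniform_real_distribution<double> dist(0.0, 1.0);
        return dist(_engine);
    }

    // Draws an integer from low to high, both endpoints included.
    int randint(int low, int high) {
        std::uniform_int_distribution<int> dist(low, high);
        return dist(_engine);
    }

private:
    std::mt19937 _engine;
};
```

Two Rng objects constructed with the same seed produce the same sequence of values.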

Step 14 Markov Chain Monte Carlo (MCMC)

Create a class whose objects carry out Markov chain Monte Carlo (MCMC) simulation.
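The heart of any such class is the Metropolis accept/reject step. The sketch below shows it in isolation on a toy target (a standard normal density) rather than a phylogenetic posterior; the function names, the sliding-window proposal, and the target are assumptions chosen to keep the example short.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Log of the (unnormalized) target density: a standard normal.
double logTarget(double x) {
    return -0.5 * x * x;
}

// Runs a Metropolis sampler with a symmetric sliding-window proposal of
// the given width and returns the value of x after each iteration.
std::vector<double> metropolis(unsigned niter, double window, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::vector<double> samples;
    samples.reserve(niter);

    double x = 0.0;
    double log_density = logTarget(x);
    for (unsigned i = 0; i < niter; ++i) {
        // Propose a new value centered on the current one.
        const double proposed = x + window * (unif(rng) - 0.5);
        const double log_proposed = logTarget(proposed);

        // Accept with probability min(1, target(proposed) / target(x)).
        if (std::log(unif(rng)) < log_proposed - log_density) {
            x = proposed;
            log_density = log_proposed;
        }
        samples.push_back(x);
    }
    return samples;
}
```

Over a long run, the sample mean and variance should approach 0 and 1, the mean and variance of the target.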

Step 15 Managing Output

Create a class whose objects simplify managing the output of an MCMC analysis.

Step 16 Updating Other Parameters

Create updater classes for state frequencies, exchangeabilities, omega, pinvar, and subset relative rates.

Step 17 Updating the Tree

Create an updater that modifies the tree topology and edge lengths.

Step 18 Heated Chains

Implement Metropolis-coupled MCMC in order to improve mixing.
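The piece that distinguishes Metropolis-coupled MCMC from ordinary MCMC is the proposed swap of states between two chains. If chain i samples the target density raised to the power beta_i, the swap is accepted with probability min(1, R), where log R = (beta_i - beta_j)(log p(x_j) - log p(x_i)). The sketch below assumes the whole density is heated and uses the common heating scheme beta_i = 1/(1 + lambda i); the function names are hypothetical.

```cpp
#include <cmath>

// Heating power for chain i; chain 0 is the cold chain (power 1).
double heatingPower(unsigned chain_index, double lambda) {
    return 1.0 / (1.0 + lambda * chain_index);
}

// Log of the acceptance ratio for swapping the states of chains i and j.
// log_density_i and log_density_j are the unheated log densities of the
// current states of chains i and j.
double logSwapRatio(double beta_i, double beta_j,
                    double log_density_i, double log_density_j) {
    return (beta_i - beta_j) * (log_density_j - log_density_i);
}

// Decides whether to accept the swap, given a Uniform(0,1) draw u.
bool acceptSwap(double log_ratio, double u) {
    return std::log(u) < log_ratio;
}
```

A swap that moves the better state into the colder chain has a positive log ratio and is always accepted; the reverse swap is accepted only occasionally.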

Step 19 Allowing Polytomies

Implement reversible-jump MCMC to allow polytomies in trees.

Step 20 The Steppingstone Method

Implement the steppingstone method of marginal likelihood estimation.
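In outline, the steppingstone estimator writes the marginal likelihood as a product of ratios, one per "stone", where ratio k is estimated by averaging L^(beta_{k+1} - beta_k) over log-likelihoods sampled from the power posterior with power beta_k. The sketch below assumes those samples have already been collected and works on the log scale for numerical stability; the function names and the data layout are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns log(mean(exp(v))) without overflow or underflow by factoring
// out the largest element first.
double logMeanExp(const std::vector<double>& v) {
    const double largest = *std::max_element(v.begin(), v.end());
    double sum = 0.0;
    for (double x : v)
        sum += std::exp(x - largest);
    return largest + std::log(sum / v.size());
}

// betas: increasing powers from 0 to 1, one more entry than log_likes.
// log_likes[k]: log-likelihoods sampled with power betas[k].
// Returns the estimated log marginal likelihood.
double logMarginalLikelihood(const std::vector<double>& betas,
                             const std::vector<std::vector<double>>& log_likes) {
    if (betas.size() != log_likes.size() + 1)
        throw std::invalid_argument("need exactly one more beta than stones");

    double total = 0.0;
    for (std::size_t k = 0; k < log_likes.size(); ++k) {
        if (log_likes[k].empty())
            throw std::invalid_argument("each stone needs at least one sample");

        // Each sampled log-likelihood is raised to the power difference.
        const double diff = betas[k + 1] - betas[k];
        std::vector<double> terms;
        terms.reserve(log_likes[k].size());
        for (double log_like : log_likes[k])
            terms.push_back(diff * log_like);

        total += logMeanExp(terms);
    }
    return total;
}
```

As a sanity check, if every sampled log-likelihood equals the same constant c, the power differences sum to 1 and the estimate is exactly c.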