The TreeUpdater
class updates three edge length proportions and (potentially) the tree topology using a modified version of the Larget and Simon (1999) LOCAL move for unrooted trees.
Create a new header file named tree_updater.hpp and copy into it the following class declaration.
#pragma once
#include "updater.hpp"
namespace strom {
class Chain;
class TreeUpdater : public Updater {
friend class Chain;
public:
typedef std::shared_ptr< TreeUpdater > SharedPtr;
TreeUpdater();
~TreeUpdater();
virtual double calcLogPrior();
private:
virtual void revert();
virtual void proposeNewState();
virtual void reset();
double _orig_edgelen_top;
double _orig_edgelen_middle;
double _orig_edgelen_bottom;
unsigned _case;
bool _topology_changed;
Node * _x;
Node * _y;
Node * _a;
Node * _b;
};
// Member function bodies go here
}
This class can simply use the base class (Updater
) version of the clear function in the constructor and, as usual, the destructor has no work to do.
inline TreeUpdater::TreeUpdater() {
// std::cout << "Creating a TreeUpdater" << std::endl;
Updater::clear();
_name = "Tree Topol. and Edge Prop.";
reset();
}
inline TreeUpdater::~TreeUpdater() {
// std::cout << "Destroying a TreeUpdater" << std::endl;
}
Recall that the reset
function is called at the end of the Updater::update
function to prepare the updater for the next update. Any temporary variables needed by an updater during the update process need to be re-initialized in this function, and this TreeUpdater
has several of these temporary variables, most of which are needed to keep track of how to revert if the proposed tree is rejected.
inline void TreeUpdater::reset() {
_topology_changed = false;
_orig_edgelen_top = 0.0;
_orig_edgelen_middle = 0.0;
_orig_edgelen_bottom = 0.0;
_log_hastings_ratio = 0.0;
_case = 0;
_x = 0;
_y = 0;
_a = 0;
_b = 0;
}
Now that we’ve created the calcLogTopologyPrior
function and (previously, in the Updater
class) the calcEdgeLengthPrior
function, the calcLogPrior
function simply needs to call those two functions and return the sum of the values they calculate.
inline double TreeUpdater::calcLogPrior() {
double log_topology_prior = Updater::calcLogTopologyPrior();
double log_edge_length_prior = Updater::calcLogEdgeLengthPrior();
return log_topology_prior + log_edge_length_prior;
}
This function, which overrides one of the 3 pure virtual functions defined by the base class Updater
, is unfortunately rather long and tedious owing to the fact that the first step in the Larget-Simon LOCAL move, which involves choosing 3 adjacent random internal edges, leads to 8 distinct possible proposed states. I’ve tried to make following the code in this function easier by creating diagrams in the comments showing the relationships among all the temporary variables, but to fully understand this function, you will need to read and understand the description of this move in the papers by Larget and Simon (1999). Keep in mind that we are modifying the original method; specifically we do not expand or contract the 3-edge segment before moving one of the attached nodes to a different place. This means that the Hastings ratio is 1.0 because the move is now symmetric, and the discussion of the Hastings ratio in Holder et al. (2005) is no longer relevant. The reason we can get away with this is that we will next create a TreeLengthUpdater
class that will handle scaling the tree up or down, so we do not need to do any rescaling in the TreeUpdater
proposal.
inline void TreeUpdater::proposeNewState() {
_case = 0;
_topology_changed = false;
assert(!_tree_manipulator->getTree()->isRooted());
// Choose random internal node x that is not the root and has parent y that is also not the root.
// After choosing x (and thus y), choose a and b randomly to create a contiguous 3-edge segment.
//
// a
// \ | /
// \|/
// x
// \ | /
// \|/
// y
// |
// |
// b
_x = _tree_manipulator->randomInternalEdge(_lot);
_orig_edgelen_middle = _x->getEdgeLength();
_y = _x->getParent();
// Choose focal 3-edge segment to modify
// Begin by randomly choosing one child of x to be node _a
_a = _tree_manipulator->randomChild(_lot, _x, 0, false);
_orig_edgelen_top = _a->getEdgeLength();
// Now choose a different child of x (or the parent) to be node _b
_b = _tree_manipulator->randomChild(_lot, _y, _x, true);
bool b_is_child_of_y;
if (_b) {
b_is_child_of_y = true;
_orig_edgelen_bottom = _b->getEdgeLength();
}
else {
b_is_child_of_y = false;
_b = _y->getParent();
_orig_edgelen_bottom = _y->getEdgeLength();
}
// Symmetric move: Hastings ratio = 1
_log_hastings_ratio = 0.0;
// Decide where along focal path (starting from top) to place moved node
double focal_path_length = _orig_edgelen_top + _orig_edgelen_middle + _orig_edgelen_bottom;
double u = _lot->uniform();
double new_attachment_point = u*focal_path_length;
if (new_attachment_point <= Node::_smallest_edge_length)
new_attachment_point = Node::_smallest_edge_length;
else if (focal_path_length - new_attachment_point <= Node::_smallest_edge_length)
new_attachment_point = focal_path_length - Node::_smallest_edge_length;
// Decide which node(s) to move, and whether the move involves a topology change
u = _lot->uniform();
bool x_node_slides = (bool)(u < 0.5);
if (x_node_slides) {
// _x slides toward _y
_topology_changed = (new_attachment_point > _orig_edgelen_top + _orig_edgelen_middle);
if (_topology_changed) {
_tree_manipulator->LargetSimonSwap(_a, _b);
if (b_is_child_of_y) {
// LargetSimonSwap case 1: a swapped with b
_a->setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle);
_x->setEdgeLength(new_attachment_point - _orig_edgelen_top - _orig_edgelen_middle);
_b->setEdgeLength(focal_path_length - new_attachment_point);
_case = 1;
} else {
// LargetSimonSwap case 2: x's children (except a) swapped with y's children (except b)
_a->setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle);
_x->setEdgeLength(new_attachment_point - _orig_edgelen_top - _orig_edgelen_middle);
_y->setEdgeLength(focal_path_length - new_attachment_point);
_case = 2;
}
} else {
_a->setEdgeLength(new_attachment_point);
_x->setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle - new_attachment_point);
if (b_is_child_of_y) {
_b->setEdgeLength(_orig_edgelen_bottom); // not really necessary
_case = 3;
} else {
_y->setEdgeLength(_orig_edgelen_bottom); // not really necessary
_case = 4;
}
}
} else {
// _y slides toward _x
_topology_changed = (new_attachment_point < _orig_edgelen_top);
if (_topology_changed) {
_tree_manipulator->LargetSimonSwap(_a, _b);
if (b_is_child_of_y) {
// LargetSimonSwap case 1: a swapped with b
_a->setEdgeLength(new_attachment_point);
_x->setEdgeLength(_orig_edgelen_top - new_attachment_point);
_b->setEdgeLength(_orig_edgelen_middle + _orig_edgelen_bottom);
_case = 5;
} else {
// LargetSimonSwap case 2: x's children (except a) swapped with y's children (except b)
_a->setEdgeLength(new_attachment_point);
_x->setEdgeLength(_orig_edgelen_top - new_attachment_point);
_y->setEdgeLength(_orig_edgelen_middle + _orig_edgelen_bottom);
_case = 6;
}
} else {
_a->setEdgeLength(_orig_edgelen_top);
_x->setEdgeLength(new_attachment_point - _orig_edgelen_top);
if (b_is_child_of_y) {
_b->setEdgeLength(focal_path_length - new_attachment_point);
_case = 7;
} else {
_y->setEdgeLength(focal_path_length - new_attachment_point);
_case = 8;
}
}
}
// flag partials and transition matrices for recalculation
_tree_manipulator->selectPartialsHereToRoot(_x);
_a->selectTMatrix();
_x->selectTMatrix();
if (_case == 2 || _case == 4 || _case == 6 || _case == 8) {
// In these cases b is below y, so it is y's edge that is modified
_y->selectTMatrix();
} else {
// In these cases b is above y, so it is b's edge that is modified
_b->selectTMatrix();
}
}
This is the last of the 4 pure virtual functions defined by the base class Updater
that we must override. The proposeNewState
function stored the particular path (of 8 possible paths) that was followed in generating the proposed tree in the data member _case
, so here we simply need to reconstruct the original tree from knowledge of the path followed and whether the topology was modified.
inline void TreeUpdater::revert() {
assert(_case > 0 && _case < 9);
if (_case == 2 || _case == 6)
_tree_manipulator->LargetSimonSwap(_a, _b);
else if (_case == 1 || _case == 5)
_tree_manipulator->LargetSimonSwap(_b, _a);
_a->setEdgeLength(_orig_edgelen_top);
_x->setEdgeLength(_orig_edgelen_middle);
if (_case == 1 || _case == 3 || _case == 5 || _case == 7)
_b->setEdgeLength(_orig_edgelen_bottom); // not actually necessary for case 3
else
_y->setEdgeLength(_orig_edgelen_bottom); // not actually necessary for case 4
}
Holder, Mark T., P. O. Lewis, D. L. Swofford, and B. Larget. 2005. Hastings ratio of the LOCAL proposal used in Bayesian phylogenetics. Systematic Biology 54(6): 961-965. DOI: 10.1080/10635150500354670
Larget, B., and Simon, D. 1999. Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees. Molecular Biology and Evolution 16:750–759. DOI: 10.1093/oxfordjournals.molbev.a026160