17.1 The TreeUpdater class

(Mac version)

< 17.0 | 17.1 | 17.2 >

The TreeUpdater class updates three edge length proportions and (potentially) the tree topology using a modified version of the Larget and Simon (1999) LOCAL move for unrooted trees.

Create a new header file named tree_updater.hpp and copy into it the following class declaration.

#pragma once    

#include "updater.hpp"

namespace strom {

    class Chain;

    class TreeUpdater : public Updater {

        friend class Chain;

        public:

            typedef std::shared_ptr&lt; TreeUpdater &gt; SharedPtr;

                                                TreeUpdater();
                                                ~TreeUpdater();

            virtual double                      calcLogPrior();

        private:

            virtual void                        revert();
            virtual void                        proposeNewState();
        
            virtual void                        reset();

            double                              _orig_edgelen_top;
            double                              _orig_edgelen_middle;
            double                              _orig_edgelen_bottom;

            unsigned                            _case;
            bool                                _topology_changed;
            Node *                              _x;
            Node *                              _y;
            Node *                              _a;
            Node *                              _b;
    };

    // Member function bodies go here
    
}   

The constructor and destructor

This class can simply use the base class (Updater) version of the clear function in the constructor and, as usual, the destructor has no work to do.

    inline TreeUpdater::TreeUpdater() { 
        // std::cout &lt;&lt; "Creating a TreeUpdater" &lt;&lt; std::endl;
        Updater::clear();
        _name = "Tree Topol. and Edge Prop.";
        reset();
    } 

    inline TreeUpdater::~TreeUpdater() {
        // std::cout &lt;&lt; "Destroying a TreeUpdater" &lt;&lt; std::endl;
    }   

The reset member function

Recall that the reset function is called at the end of the Updater::update function to prepare the updater for the next update. Any temporary variables needed by an updater during the update process need to be re-initialized in this function, and this TreeUpdater has several of these temporary variables, most of which are needed to keep track of how to revert if the proposed tree is rejected.

    inline void TreeUpdater::reset() {  
        _topology_changed       = false;
        _orig_edgelen_top       = 0.0;
        _orig_edgelen_middle    = 0.0;
        _orig_edgelen_bottom    = 0.0;
        _log_hastings_ratio     = 0.0;
        _case                   = 0;
        _x                      = 0;
        _y                      = 0;
        _a                      = 0;
        _b                      = 0;
    }   

The calcLogPrior member function

Now that we’ve created the calcLogTopologyPrior function and (previously, in the Updater class) the calcEdgeLengthPrior function, the calcLogPrior function simply needs to call those two functions and return the sum of the values they calculate.

    inline double TreeUpdater::calcLogPrior() {   
        double log_topology_prior    = Updater::calcLogTopologyPrior();
        double log_edge_length_prior = Updater::calcLogEdgeLengthPrior();
        return log_topology_prior + log_edge_length_prior;
    }   

The proposeNewState member function

This function, which overrides one of the 3 pure virtual functions defined by the base class Updater, is unfortunately rather long and tedious owing to the fact that the first step in the Larget-Simon LOCAL move, which involves choosing 3 adjacent random internal edges, leads to 8 distinct possible proposed states. I’ve tried to make following the code in this function easier by creating diagrams in the comments showing the relationships among all the temporary variables, but to fully understand this function, you will need to read and understand the description of this move in the papers by Larget and Simon (1999). Keep in mind that we are modifying the original method; specifically we do not expand or contract the 3-edge segment before moving one of the attached nodes to a different place. This means that the Hastings ratio is 1.0 because the move is now symmetric, and the discussion of the Hastings ratio in Holder et al. (2005) is no longer relevant. The reason we can get away with this is that we will next create a TreeLengthUpdater class that will handle scaling the tree up or down, so we do not need to do any rescaling in the TreeUpdater proposal.

    inline void TreeUpdater::proposeNewState() {    
        _case = 0;
        _topology_changed = false;
        assert(!_tree_manipulator-&gt;getTree()-&gt;isRooted());

        // Choose random internal node x that is not the root and has parent y that is also not the root.
        // After choosing x (and thus y), choose a and b randomly to create a contiguous 3-edge segment.
        //
        //        a
        //   \ | /
        //    \|/
        //     x
        //      \ | /
        //       \|/
        //        y
        //        |
        //        |
        //        b
        
        _x = _tree_manipulator-&gt;randomInternalEdge(_lot);
        _orig_edgelen_middle = _x-&gt;getEdgeLength();

        _y = _x-&gt;getParent();

        // Choose focal 3-edge segment to modify
        // Begin by randomly choosing one child of x to be node _a
        _a = _tree_manipulator-&gt;randomChild(_lot, _x, 0, false);
        _orig_edgelen_top = _a-&gt;getEdgeLength();

        // Now choose a different child of x (or the parent) to be node _b
        _b = _tree_manipulator-&gt;randomChild(_lot, _y, _x, true);
        bool b_is_child_of_y;
        if (_b) {
            b_is_child_of_y = true;
            _orig_edgelen_bottom = _b-&gt;getEdgeLength();
        }
        else {
            b_is_child_of_y = false;
            _b = _y-&gt;getParent();
            _orig_edgelen_bottom = _y-&gt;getEdgeLength();
        }
        
        // Symmetric move: Hastings ratio = 1
        _log_hastings_ratio = 0.0;

        // Decide where along focal path (starting from top) to place moved node
        double focal_path_length = _orig_edgelen_top + _orig_edgelen_middle + _orig_edgelen_bottom;
        double u = _lot-&gt;uniform();
        double new_attachment_point = u*focal_path_length;
        if (new_attachment_point &lt;= Node::_smallest_edge_length)
            new_attachment_point = Node::_smallest_edge_length;
        else if (focal_path_length - new_attachment_point &lt;= Node::_smallest_edge_length)
            new_attachment_point = focal_path_length - Node::_smallest_edge_length;

        // Decide which node(s) to move, and whether the move involves a topology change
        u = _lot-&gt;uniform();
        bool x_node_slides = (bool)(u &lt; 0.5);
        if (x_node_slides) {
            // _x slides toward _y
            _topology_changed = (new_attachment_point &gt; _orig_edgelen_top + _orig_edgelen_middle);
            if (_topology_changed) {
                _tree_manipulator-&gt;LargetSimonSwap(_a, _b);
                if (b_is_child_of_y) {
                    // LargetSimonSwap case 1: a swapped with b
                    _a-&gt;setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle);
                    _x-&gt;setEdgeLength(new_attachment_point - _orig_edgelen_top - _orig_edgelen_middle);
                    _b-&gt;setEdgeLength(focal_path_length - new_attachment_point);
                    _case = 1;
                } else {
                    // LargetSimonSwap case 2: x's children (except a) swapped with y's children (except b)
                    _a-&gt;setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle);
                    _x-&gt;setEdgeLength(new_attachment_point - _orig_edgelen_top - _orig_edgelen_middle);
                    _y-&gt;setEdgeLength(focal_path_length - new_attachment_point);
                    _case = 2;
                }
            } else {
                _a-&gt;setEdgeLength(new_attachment_point);
                _x-&gt;setEdgeLength(_orig_edgelen_top + _orig_edgelen_middle - new_attachment_point);
                if (b_is_child_of_y) {
                    _b-&gt;setEdgeLength(_orig_edgelen_bottom);    // not really necessary
                    _case = 3;
                } else {
                    _y-&gt;setEdgeLength(_orig_edgelen_bottom);    // not really necessary
                    _case = 4;
                }
            }
        } else {
            // _y slides toward _x
            _topology_changed = (new_attachment_point &lt; _orig_edgelen_top);
            if (_topology_changed) {
                _tree_manipulator-&gt;LargetSimonSwap(_a, _b);
                if (b_is_child_of_y) {
                    // LargetSimonSwap case 1: a swapped with b
                    _a-&gt;setEdgeLength(new_attachment_point);
                    _x-&gt;setEdgeLength(_orig_edgelen_top - new_attachment_point);
                    _b-&gt;setEdgeLength(_orig_edgelen_middle + _orig_edgelen_bottom);
                    _case = 5;
                } else {
                    // LargetSimonSwap case 2: x's children (except a) swapped with y's children (except b)
                    _a-&gt;setEdgeLength(new_attachment_point);
                    _x-&gt;setEdgeLength(_orig_edgelen_top - new_attachment_point);
                    _y-&gt;setEdgeLength(_orig_edgelen_middle + _orig_edgelen_bottom);
                    _case = 6;
                }
            } else {
                _a-&gt;setEdgeLength(_orig_edgelen_top);
                _x-&gt;setEdgeLength(new_attachment_point - _orig_edgelen_top);
                if (b_is_child_of_y) {
                    _b-&gt;setEdgeLength(focal_path_length - new_attachment_point);
                    _case = 7;
                } else {
                    _y-&gt;setEdgeLength(focal_path_length - new_attachment_point);
                    _case = 8;
                }
            }
        }

        // flag partials and transition matrices for recalculation
        _tree_manipulator-&gt;selectPartialsHereToRoot(_x);
        _a-&gt;selectTMatrix();
        _x-&gt;selectTMatrix();
        if (_case == 2 || _case == 4 || _case == 6 || _case == 8) {
            // In these cases b is below y, so it is y's edge that is modified
            _y-&gt;selectTMatrix();
        } else {
            // In these cases b is above y, so it is b's edge that is modified
            _b-&gt;selectTMatrix();
        }
    }   

The revert member function

This is the last of the 4 pure virtual functions defined by the base class Updater that we must override. The proposeNewState function stored the particular path (of 8 possible paths) that was followed in generating the proposed tree in the data member _case, so here we simply need to reconstruct the original tree from knowledge of the path followed and whether the topology was modified.

    inline void TreeUpdater::revert() { 
        assert(_case &gt; 0 && _case &lt; 9);
        if (_case == 2 || _case == 6)
            _tree_manipulator-&gt;LargetSimonSwap(_a, _b);
        else if (_case == 1 || _case == 5)
            _tree_manipulator-&gt;LargetSimonSwap(_b, _a);
        _a-&gt;setEdgeLength(_orig_edgelen_top);
        _x-&gt;setEdgeLength(_orig_edgelen_middle);
        if (_case == 1 || _case == 3 || _case == 5 || _case == 7)
            _b-&gt;setEdgeLength(_orig_edgelen_bottom);    // not actually necessary for case 3
        else
            _y-&gt;setEdgeLength(_orig_edgelen_bottom);    // not actually necessary for case 4
    }   

Literature Cited

Holder, Mark T., P. O. Lewis, D. L. Swofford, and B. Larget. 2005. Hastings ratio of the LOCAL proposal used in Bayesian phylogenetics. Systematic Biology 54(6): 961-965. DOI: 10.1080/10635150500354670

Larget, B., and Simon, D. 1999. Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees. Molecular Biology and Evolution 16:750–759. DOI: 10.1093/oxfordjournals.molbev.a026160