14.4 Testing the Chain Class

(Mac version)

< 14.3 | 14.4 | 15.0 >

Now all that is needed is to make changes to the Node and TreeManip classes to accommodate flipping transition matrices and partials, make Updater a friend of Tree and Node, and update the Strom::run function to set up Chain and run it. For this exercise we will use another variation on the rbcL data set used before.

Download rbcl10.nex and rbcl10.tre to your working directory with the other downloaded data and tree files.

Here are maximum likelihood estimates (using PAUP* Version 4.0a, build 159) of the GTR+G model parameters using the data stored in rbcl10.nex and the tree topology stored in rbcl10.tre. You can refer back to this table once your program begins estimating these parameters to see how the values compare. Remember, however, that your program is Bayesian and thus prior distributions will affect estimates, and the maximized log-likelihood computed by PAUP* will (almost surely) be higher than any log-likelihood you will see in your output. I have modified the output produced by PAUP* below by adding a second column (values in parentheses) in which I provide:

In short, if all is working, we should see our MCMC analysis hovering (after a few iterations) around the rate variance 3.94 and a log-likelihood a little less than -6723.144.

-ln L     6723.144
Base frequencies:
  A       0.294667
  C       0.189172
  G       0.206055
  T       0.310106
Rate matrix R:
  AC       1.25798 (0.06082)
  AG       5.76823 (0.27887)
  AT       1.33643 (0.06461)
  CG       1.29145 (0.06244)
  CT      10.03035 (0.48492)
  GT       1.00000 (0.04834)
Shape     0.253596 (3.94322)

Adding selectability to the Node class

Edit node.hpp and uncomment the class Updater; and friend class Updater; lines that are currently commented out. In addition, add the methods and data structures highlighted below in bold, blue text:

#pragma once    

#include &lt;string&gt;
#include &lt;vector&gt;
#include  &lt;iostream&gt;
#include "split.hpp"

namespace strom {

    class Tree;
    class TreeManip;
    class Likelihood;
    <span style="color:#0000ff"><strong>class Updater;</strong></span>

    class Node {
        friend class Tree;
        friend class TreeManip;
        friend class Likelihood;
        <span style="color:#0000ff"><strong>friend class Updater;</strong></span>

        public:
                                        Node();
                                        ~Node();

                    Node *              getParent()                 {return _parent;}
                    Node *              getLeftChild()              {return _left_child;}
                    Node *              getRightSib()               {return _right_sib;}
                    int                 getNumber()                 {return _number;}
                    std::string         getName()                   {return _name;}
                    Split               getSplit()                  {return _split;}
        
                    <span style="color:#0000ff"><strong>bool                isSelected()                {return _flags & Flag::Selected;}</strong></span>
                    <span style="color:#0000ff"><strong>void                select()                    {_flags |= Flag::Selected;}</strong></span>
                    <span style="color:#0000ff"><strong>void                deselect()                  {_flags &= ~Flag::Selected;}</strong></span>

<span style="color:#0000ff"><strong></strong></span>
                    <span style="color:#0000ff"><strong>bool                isSelPartial()              {return _flags & Flag::SelPartial;}</strong></span>
                    <span style="color:#0000ff"><strong>void                selectPartial()             {_flags |= Flag::SelPartial;}</strong></span>
                    <span style="color:#0000ff"><strong>void                deselectPartial()           {_flags &= ~Flag::SelPartial;}</strong></span>

<span style="color:#0000ff"><strong></strong></span>
                    <span style="color:#0000ff"><strong>bool                isSelTMatrix()              {return _flags & Flag::SelTMatrix;}</strong></span>
                    <span style="color:#0000ff"><strong>void                selectTMatrix()             {_flags |= Flag::SelTMatrix;}</strong></span>
                    <span style="color:#0000ff"><strong>void                deselectTMatrix()           {_flags &= ~Flag::SelTMatrix;}</strong></span>

<span style="color:#0000ff"><strong></strong></span>
                    <span style="color:#0000ff"><strong>bool                isAltPartial()              {return _flags & Flag::AltPartial;}</strong></span>
                    <span style="color:#0000ff"><strong>void                setAltPartial()             {_flags |= Flag::AltPartial;}</strong></span>
                    <span style="color:#0000ff"><strong>void                clearAltPartial()           {_flags &= ~Flag::AltPartial;}</strong></span>

<span style="color:#0000ff"><strong></strong></span>
                    <span style="color:#0000ff"><strong>bool                isAltTMatrix()              {return _flags & Flag::AltTMatrix;}</strong></span>
                    <span style="color:#0000ff"><strong>void                setAltTMatrix()             {_flags |= Flag::AltTMatrix;}</strong></span>
                    <span style="color:#0000ff"><strong>void                clearAltTMatrix()           {_flags &= ~Flag::AltTMatrix;}</strong></span>

<span style="color:#0000ff"><strong></strong></span>
                    <span style="color:#0000ff"><strong>void                flipTMatrix()               {isAltTMatrix() ? clearAltTMatrix() : setAltTMatrix();}</strong></span>
                    <span style="color:#0000ff"><strong>void                flipPartial()               {isAltPartial() ? clearAltPartial() : setAltPartial();}</strong></span>

                    double              getEdgeLength()             {return _edge_length;}
                    void                setEdgeLength(double v);

            static const double _smallest_edge_length;

            typedef std::vector&lt;Node&gt;    Vector;
            typedef std::vector&lt;Node *&gt;  PtrVector;
        
        private:
        
            <span style="color:#0000ff"><strong>enum Flag {</strong></span>
                <span style="color:#0000ff"><strong>Selected   = (1 &lt;&lt; 0),</strong></span>
                <span style="color:#0000ff"><strong>SelPartial = (1 &lt;&lt; 1),</strong></span>
                <span style="color:#0000ff"><strong>SelTMatrix = (1 &lt;&lt; 2),</strong></span>
                <span style="color:#0000ff"><strong>AltPartial = (1 &lt;&lt; 3),</strong></span>
                <span style="color:#0000ff"><strong>AltTMatrix = (1 &lt;&lt; 4)</strong></span>
            <span style="color:#0000ff"><strong>};</strong></span>

            void                clear();

            Node *              _left_child;
            Node *              _right_sib;
            Node *              _parent;
            int                 _number;
            std::string         _name;
            double              _edge_length;
            Split               _split;
            <span style="color:#0000ff"><strong>int                 _flags;</strong></span>
    };

    inline Node::Node() {
        //std::cout &lt;&lt; "Creating Node object" &lt;&lt; std::endl;
        clear();
    }

    inline Node::~Node() {
        //std::cout &lt;&lt; "Destroying Node object" &lt;&lt; std::endl;
    }

    inline void Node::clear() {
        <span style="color:#0000ff"><strong>_flags = 0;</strong></span>
        _left_child = 0;
        _right_sib = 0;
        _parent = 0;
        _number = -1;
        _name = "";
        _edge_length = _smallest_edge_length;
    }

    inline void Node::setEdgeLength(double v) {
        _edge_length = (v &lt; _smallest_edge_length ? _smallest_edge_length : v);
    }

}   

The Flag enum provides binary attributes for every node that can be turned on or off. The select(), selectPartial(), selectTMatrix(), setAltPartial(), and setAltTMatrix() member functions set (i.e. make the value 1) bits at positions 1, 2, 3, 4, and 5, respectively, while the member functions deselect(), deselectPartial(), deselectTMatrix(), clearAltPartial(), and clearAltTMatrix() unset (i.e. make the value 0 at) the same bit positions. As an example, suppose a node has 0 at all bit positions except position 4 (Flag::AltPartial). Here is what the node’s flag data member looks like in binary representation (showing only the 5 bits relevant to this discussion):

flags = 0 1 0 0 0
        | | | |  \Selected 
        | | |  \SelPartial
        | |  \SelTMatrix
        |  \AltPartial
         \AltTMatrix

If selectTMatrix() is called for this node, the result will be to set the 3rd bit (from the right) in addition to the bit already set in the 4th position (due to the fact that this is an OR operation):

flags = 0 1 0 0 0
      | 0 0 1 0 0
-----------------
        0 1 1 0 0    
        | | | |  \Selected 
        | | |  \SelPartial
        | |  \SelTMatrix
        |  \AltPartial
         \AltTMatrix

If clearAltPartial() is called for this node, the result will be to unset only the 4th bit (note that ~(0 1 0 0 0) = 1 0 1 1 1:

flags = 0 1 1 0 0
      & 1 0 1 1 1
--------------------
        0 0 1 0 0
        | | | |  \Selected 
        | | |  \SelPartial
        | |  \SelTMatrix
        |  \AltPartial
        \AltTMatrix

If bit operations still seem confusing, take a look at this explanation.

These member functions allow us to:

Add functions to TreeManip to select or deselect nodes

The added Node functions are normally called by TreeManip functions. Add the indicated function prototypes to the TreeManip class declaration:

    class TreeManip {   
        public:
                                        TreeManip();
                                        TreeManip(Tree::SharedPtr t);
                                        ~TreeManip();

            void                        setTree(Tree::SharedPtr t);
            Tree::SharedPtr             getTree();

            double                      calcTreeLength() const;
            unsigned                    countEdges() const;
            void                        scaleAllEdgeLengths(double scaler);

            void                        createTestTree();
            std::string                 makeNewick(unsigned precision, bool use_names = false) const;

            void                        buildFromNewick(const std::string newick, bool rooted, bool allow_polytomies);
            void                        storeSplits(std::set&lt;Split&gt; & splitset);
            void                        rerootAtNodeNumber(int node_number);

            <span style="color:#0000ff"><strong>void                        selectAll();</strong></span>
            <span style="color:#0000ff"><strong>void                        deselectAll();</strong></span>
            <span style="color:#0000ff"><strong>void                        selectAllPartials();</strong></span>
            <span style="color:#0000ff"><strong>void                        deselectAllPartials();</strong></span>
            <span style="color:#0000ff"><strong>void                        selectAllTMatrices();</strong></span>
            <span style="color:#0000ff"><strong>void                        deselectAllTMatrices();</strong></span>
            <span style="color:#0000ff"><strong>void                        selectPartialsHereToRoot(Node * a);</strong></span>
            <span style="color:#0000ff"><strong>void                        flipPartialsAndTMatrices();</strong></span>

            void                        clear();

        private:

            Node *                      findNextPreorder(Node * nd);
            void                        refreshPreorder();
            void                        refreshLevelorder();
            void                        renumberInternals();
            void                        rerootAtNode(Node * prospective_root);
            void                        extractNodeNumberFromName(Node * nd, std::set&lt;unsigned&gt; & used);
            void                        extractEdgeLen(Node * nd, std::string edge_length_string);
            unsigned                    countNewickLeaves(const std::string newick);
            void                        stripOutNexusComments(std::string & newick);
            bool                        canHaveSibling(Node * nd, bool rooted, bool allow_polytomies);

            Tree::SharedPtr             _tree;

        public:

            typedef std::shared_ptr&lt; TreeManip &gt; SharedPtr;
    };  

Add the bodies of these functions somewhere below the class declaration and before the curly bracket closing the namespace:

    inline void TreeManip::selectAll() {    
        for (auto & nd : _tree-&gt;_nodes) {
            nd.select();
        }
    }

    inline void TreeManip::deselectAll() {
        for (auto & nd : _tree-&gt;_nodes) {
            nd.deselect();
        }
    }

    inline void TreeManip::selectAllPartials() {
        for (auto & nd : _tree-&gt;_nodes)
            nd.selectPartial();
    }

    inline void TreeManip::deselectAllPartials() {
        for (auto & nd : _tree-&gt;_nodes) {
            nd.deselectPartial();
        }
    }

    inline void TreeManip::selectAllTMatrices() {
        for (auto & nd : _tree-&gt;_nodes)
            nd.selectTMatrix();
    }

    inline void TreeManip::deselectAllTMatrices() {
        for (auto & nd : _tree-&gt;_nodes) {
            nd.deselectTMatrix();
        }
    }

    inline void TreeManip::selectPartialsHereToRoot(Node * a) {
        a-&gt;selectPartial();
        while (a-&gt;_parent) {
            a = a-&gt;_parent;
            a-&gt;selectPartial();
        }
    }

    inline void TreeManip::flipPartialsAndTMatrices() {
        for (auto & nd : _tree-&gt;_nodes) {
            if (nd.isSelPartial())
                nd.flipPartial();
            
            if (nd.isSelTMatrix())
                nd.flipTMatrix();
        }
    }   

Modifying the Likelihood class to allow alternate partials and transition matrix indices

The ability to set a flag for a node indicating that an alternate partials index (or transition matrix index) is only useful if the Likelihood class pays attention to these flags. The first step is to reserve enough memory in BeagleLIb to accommodate alternate sets of indices. The highlighted lines below double the amount of memory allocated for partials, transition matrices, and scaler buffers.

    inline void Likelihood::newInstance(unsigned nstates, int nrates, std::vector&lt;unsigned&gt; & subset_indices) { 
        unsigned num_subsets = (unsigned)subset_indices.size();
        
        bool is_invar_model = (nrates &lt; 0 ? true : false);
        unsigned ngammacat = (unsigned)(is_invar_model ? -nrates : nrates);
        
        unsigned num_patterns = 0;
        for (auto s : subset_indices) {
            num_patterns += _data-&gt;getNumPatternsInSubset(s);
        }
        
        unsigned num_internals = calcNumInternalsInFullyResolvedTree();

        // add 1 to num_edges so that subroot node will have a tmatrix, root tip's tmatrix is never used
        unsigned num_edges = calcNumEdgesInFullyResolvedTree();
        unsigned num_nodes = num_edges + 1;
        unsigned num_transition_probs = num_nodes*num_subsets;
        
        long requirementFlags = 0;

        long preferenceFlags = BEAGLE_FLAG_PRECISION_SINGLE | BEAGLE_FLAG_THREADING_CPP;
        if (_underflow_scaling) {
            preferenceFlags |= BEAGLE_FLAG_SCALING_MANUAL;
            preferenceFlags |= BEAGLE_FLAG_SCALERS_LOG;
        }
        if (_prefer_gpu)
            preferenceFlags |= BEAGLE_FLAG_PROCESSOR_GPU;
        else
            preferenceFlags |= BEAGLE_FLAG_PROCESSOR_CPU;
        
        BeagleInstanceDetails instance_details;
        unsigned npartials = num_internals + _ntaxa;
        unsigned nscalers = num_internals;  // one scale buffer for every internal node
        unsigned nsequences = 0;
        if (_ambiguity_equals_missing) {
            npartials -= _ntaxa;
            nsequences += _ntaxa;
        }
        
        int inst = beagleCreateInstance(
             _ntaxa,                                // tips
             <span style="color:#0000ff"><strong>2*npartials,                           // partials</strong></span>
             nsequences,                            // sequences
             nstates,                               // states
             num_patterns,                          // patterns (total across all subsets that use this instance)
             num_subsets,                           // models (one for each distinct eigen decomposition)
             <span style="color:#0000ff"><strong>2*num_subsets*num_transition_probs,    // transition matrices (one for each edge in each subset)</strong></span>
             ngammacat,                             // rate categories
             <span style="color:#0000ff"><strong>(_underflow_scaling ? 2*nscalers + 1 : 0),  // scale buffers (+1 is for the cumulative scaler at index 0)</strong></span>
             NULL,                                  // resource restrictions
             0,                                     // length of resource list
             preferenceFlags,                       // preferred flags
             requirementFlags,                      // required flags
             &instance_details);                    // pointer for details
        
        if (inst &lt; 0) {
            // beagleCreateInstance returns one of the following:
            //   valid instance (0, 1, 2, ...)
            //   error code (negative integer)
            throw XStrom(boost::str(boost::format("Likelihood init function failed to create BeagleLib instance (BeagleLib error code was %d)") % _beagle_error[inst]));
        }
        
        InstanceInfo info;
        info.handle         = inst;
        info.resourcenumber = instance_details.resourceNumber;
        info.resourcename   = instance_details.resourceName;
        info.nstates        = nstates;
        info.nratecateg     = ngammacat;
        info.invarmodel     = is_invar_model;
        info.subsets        = subset_indices;
        info.npatterns      = num_patterns;
        info.partial_offset = num_internals;
        info.tmatrix_offset = num_nodes;
        _instances.push_back(info);
    }   

Next, modify getScalerIndex, getTMatrixIndex, and getPartialIndex so that they return the alternate indices when the appropriate node flags are set. Replace the bodies of each of these three functions with the versions below:

    inline unsigned Likelihood::getScalerIndex(Node * nd, InstanceInfo & info) const {  
        unsigned sindex = BEAGLE_OP_NONE;
        if (_underflow_scaling) {
            sindex = nd-&gt;_number - _ntaxa + 1; // +1 to skip the cumulative scaler vector
            if (nd-&gt;isAltPartial())
                sindex += info.partial_offset;
        }
        return sindex;
    } 
    
    inline unsigned Likelihood::getPartialIndex(Node * nd, InstanceInfo & info) const {
        // Note: do not be tempted to subtract _ntaxa from pindex: BeagleLib does this itself
        assert(nd-&gt;_number &gt;= 0);
        unsigned pindex = nd-&gt;_number;
        if (pindex &gt;= _ntaxa) {
            if (nd-&gt;isAltPartial())
                pindex += info.partial_offset;
        }
        return pindex;
    }
    
    inline unsigned Likelihood::getTMatrixIndex(Node * nd, InstanceInfo & info, unsigned subset_index) const {
        unsigned tindex = 2*subset_index*info.tmatrix_offset + nd-&gt;_number;
        if (nd-&gt;isAltTMatrix())
            tindex += info.tmatrix_offset;
        return tindex;
    }   

Finally, tell defineOperations to only recalculate transition matrices or partials if the node is selected. This will be important later when only a few nodes are affected by a proposal (no sense in recalculating the partial likelihoods for a clade if that clade was not involved in the MCMC proposal).

    inline void Likelihood::defineOperations(Tree::SharedPtr t) {  
        assert(_instances.size() &gt; 0);
        assert(t);
        assert(t-&gt;isRooted() == _rooted);
        
        _relrate_normalizing_constant = _model-&gt;calcNormalizingConstantForSubsetRelRates();

        // Start with a clean slate
        for (auto & info : _instances) {
            _operations[info.handle].clear();
            _pmatrix_index[info.handle].clear();
            _edge_lengths[info.handle].clear();
            _eigen_indices[info.handle].clear();
            _category_rate_indices[info.handle].clear();
        }
                
        // Loop through all nodes in reverse level order
        for (auto nd : boost::adaptors::reverse(t-&gt;_levelorder)) {
            assert(nd-&gt;_number &gt;= 0);
            if (!nd-&gt;_left_child) {
                // This is a leaf
                <span style="color:#0000ff"><strong>if (nd-&gt;isSelTMatrix())</strong></span>
                    queueTMatrixRecalculation(nd);
            }
            else {
                // This is an internal node
                <span style="color:#0000ff"><strong>if (nd-&gt;isSelTMatrix())</strong></span>
                    queueTMatrixRecalculation(nd);

                // Internal nodes have partials to be calculated, so define
                // an operation to compute the partials for this node
                <span style="color:#0000ff"><strong>if (nd-&gt;isSelPartial()) {</strong></span>
                    Node * lchild = nd-&gt;_left_child;
                    assert(lchild);
                    Node * rchild = lchild-&gt;_right_sib;
                    assert(rchild);
                    queuePartialsRecalculation(nd, lchild, rchild);
                <span style="color:#0000ff"><strong>}</strong></span>
            }
        }
    }  

Make Updater a friend of Tree

Edit tree.hpp and uncomment the class Updater; and friend class Updater; lines that are currently commented out.

Modifying the Model class

Make the following blue-highlighted additions to the Model class.

    class Model {   
        
        friend class Likelihood;

        public:
            typedef std::vector&lt;ASRV::SharedPtr&gt;      asrv_vect_t;
            typedef std::vector&lt;QMatrix::SharedPtr&gt;   qmat_vect_t;
            typedef std::vector&lt;unsigned&gt;             subset_sizes_t;
            typedef std::vector&lt;DataType&gt;             subset_datatype_t;
            typedef std::vector&lt;double&gt;               subset_relrate_vect_t;
            <span style="color:#0000ff"><strong>typedef std::vector&lt;QMatrix::SharedPtr&gt;   state_freq_params_t;</strong></span>
            <span style="color:#0000ff"><strong>typedef std::vector&lt;QMatrix::SharedPtr&gt;   exchangeability_params_t;</strong></span>
            <span style="color:#0000ff"><strong>typedef std::vector&lt;QMatrix::SharedPtr&gt;   omega_params_t;</strong></span>
            <span style="color:#0000ff"><strong>typedef std::vector&lt;ASRV::SharedPtr&gt;      ratevar_params_t;</strong></span>
            <span style="color:#0000ff"><strong>typedef std::vector&lt;ASRV::SharedPtr&gt;      pinvar_params_t;</strong></span>
            typedef boost::shared_ptr&lt;Model&gt;          SharedPtr;
        
                                        Model();
                                        ~Model();

            void                        activate();
            void                        inactivate();
            
            std::string                 describeModel();

            void                        setSubsetDataTypes(const subset_datatype_t & datatype_vect);

            void                        setSubsetRateVar(ASRV::ratevar_ptr_t ratevar, unsigned subset, bool fixed);
            void                        setSubsetPinvar(ASRV::pinvar_ptr_t pinvar, unsigned subset, bool fixed);
            void                        setSubsetExchangeabilities(QMatrix::freq_xchg_ptr_t exchangeabilities, unsigned subset, bool fixed);
            void                        setSubsetStateFreqs(QMatrix::freq_xchg_ptr_t state_frequencies, unsigned subset, bool fixed);
            void                        setSubsetOmega(QMatrix::omega_ptr_t omega, unsigned subset, bool fixed);

            void                        setSubsetRelRates(subset_relrate_vect_t & relrates, bool fixed);
            subset_relrate_vect_t &     getSubsetRelRates();
            bool                        isFixedSubsetRelRates() const;
            double                      calcNormalizingConstantForSubsetRelRates() const;

            void                        setTreeIndex(unsigned i, bool fixed);
            unsigned                    getTreeIndex() const;
            bool                        isFixedTree() const;

            unsigned                    getNumSubsets() const;
            unsigned                    getNumSites() const;
            unsigned                    getSubsetNumSites(unsigned subset) const;
            const QMatrix &             getQMatrix(unsigned subset) const;
            const ASRV &                getASRV(unsigned subset) const;

            void                        setSubsetIsInvarModel(bool is_invar, unsigned subset);
            bool                        getSubsetIsInvarModel(unsigned subset) const;

            void                        setSubsetSizes(const subset_sizes_t nsites_vect);
            subset_sizes_t &            getSubsetSizes();

            void                        setSubsetNumPatterns(const subset_sizes_t npatterns_vect);
            unsigned                    getSubsetNumPatterns(unsigned subset) const;

            void                        setSubsetNumCateg(unsigned ncateg, unsigned subset);
            unsigned                    getSubsetNumCateg(unsigned subset) const;

            <span style="color:#0000ff"><strong>state_freq_params_t &       getStateFreqParams();</strong></span>
            <span style="color:#0000ff"><strong>exchangeability_params_t &  getExchangeabilityParams();</strong></span>
            <span style="color:#0000ff"><strong>omega_params_t &            getOmegaParams();</strong></span>
            <span style="color:#0000ff"><strong>ratevar_params_t &          getRateVarParams();</strong></span>
            <span style="color:#0000ff"><strong>pinvar_params_t &           getPinvarParams();</strong></span>
        
            int                         setBeagleEigenDecomposition(int beagle_instance, unsigned subset, unsigned instance_subset);
            int                         setBeagleStateFrequencies(int beagle_instance, unsigned subset, unsigned instance_subset);
            int                         setBeagleAmongSiteRateVariationRates(int beagle_instance, unsigned subset, unsigned instance_subset);
            int                         setBeagleAmongSiteRateVariationProbs(int beagle_instance, unsigned subset, unsigned instance_subset);

        private:
        
            void                        clear();
        
            unsigned                    _num_subsets;
            unsigned                    _num_sites;
            subset_sizes_t              _subset_sizes;
            subset_sizes_t              _subset_npatterns;
            subset_datatype_t           _subset_datatypes;
            qmat_vect_t                 _qmatrix;
            asrv_vect_t                 _asrv;
        
            bool                        _tree_index;
            bool                        _tree_fixed;

            bool                        _subset_relrates_fixed;
            subset_relrate_vect_t       _subset_relrates;
        
            <span style="color:#0000ff"><strong>state_freq_params_t         _state_freq_params;</strong></span>
            <span style="color:#0000ff"><strong>exchangeability_params_t    _exchangeability_params;</strong></span>
            <span style="color:#0000ff"><strong>omega_params_t              _omega_params;</strong></span>
            <span style="color:#0000ff"><strong>ratevar_params_t            _ratevar_params;</strong></span>
            <span style="color:#0000ff"><strong>pinvar_params_t             _pinvar_params;</strong></span>
        };  

Add these 5 function bodies somewhere before the right curly bracket that closes the strom namespace in model.hpp:

    inline Model::state_freq_params_t & Model::getStateFreqParams() {   
        return _state_freq_params;
    }
    
    inline Model::exchangeability_params_t & Model::getExchangeabilityParams() {
        return _exchangeability_params;
    }
    
    inline Model::omega_params_t & Model::getOmegaParams() {
        return _omega_params;
    }
    
    inline Model::ratevar_params_t & Model::getRateVarParams() {
        return _ratevar_params;
    } 
    
    inline Model::pinvar_params_t & Model::getPinvarParams() {
        return _pinvar_params;
    }  

Make the following additions to the describeModel function in model.hpp. These additions specify which parameters are variable in the model and allow the Chain class to determine what specific parameter updaters are needed.

    inline std::string Model::describeModel() { 
        // Creates summary such as following and returns as a string:
        //
        // Partition information:
        //
        //          data subset           1           2           3
        //    -----------------------------------------------------
        //           num. sites          20          20          20
        //        num. patterns           7           5          17
        //          num. states           4           4           4
        //      rate categories           4           1           4
        //
        // Parameter linkage:
        //
        //          data subset           1           2           3
        //    -----------------------------------------------------
        //          state freqs           1           1           1
        //    exchangeabilities           1           1           2
        //        rate variance           1           2           3
        //               pinvar           1           2           -

        <span style="color:#0000ff"><strong>// Start with empty parameter vectors</strong></span>
        <span style="color:#0000ff"><strong>_state_freq_params.clear();</strong></span>
        <span style="color:#0000ff"><strong>_exchangeability_params.clear();</strong></span>
        <span style="color:#0000ff"><strong>_omega_params.clear();</strong></span>
        <span style="color:#0000ff"><strong>_ratevar_params.clear();</strong></span>
        <span style="color:#0000ff"><strong>_pinvar_params.clear();</strong></span>

        // Sets used to determine which parameters are linked across subsets
        std::set&lt;double *&gt; freqset;
        std::set&lt;double *&gt; xchgset;
        std::set&lt;double *&gt; omegaset;
        std::set&lt;double *&gt; ratevarset;
        std::set&lt;double *&gt; pinvarset;
        std::set&lt;double *&gt; relrateset;

        // Vectors of pointers to distinct parameters
        std::vector&lt;double *&gt; unique_freq;
        std::vector&lt;double *&gt; unique_xchg;
        std::vector&lt;double *&gt; unique_omega;
        std::vector&lt;double *&gt; unique_ratevar;
        std::vector&lt;double *&gt; unique_pinvar;
        std::vector&lt;double *&gt; unique_relrate;

        // Map for storing strings that will contain the information for each row
        std::map&lt;std::string, std::string&gt; ss = {
            {"subset",    ""},
            {"dashes",    ""},
            {"freqs",     ""},
            {"xchg",      ""},
            {"omega",     ""},
            {"ratevar",   ""},
            {"pinvar",    ""},
            {"ncateg",    ""},
            {"nsites",    ""},
            {"npatterns", ""},
            {"nstates",   ""}
        };

        // Ensure that the subset relative rates are fixed if there is only one
        // subset; otherwise the subset relative rates will be added to the list
        // of free parameters that are updated, which makes no sense in this case
        if (_num_subsets == 1)
            _subset_relrates_fixed = true;

        // Loop through subsets, building up rows as we go
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            // Ensure that for subsets in which the number of rate categories is 1 that
            // the gamma rate variance is fixed; otherwise the gamma rate variance will
            // be added to the list of free parameters that are updated, which makes
            // no sense in this case
            if (_asrv[i]-&gt;getNumCateg() == 1) {
                _asrv[i]-&gt;fixRateVar(true);
            }

            unsigned index;
            ss["subset"] += boost::str(boost::format("%12d") % (i+1));
            ss["dashes"] += "------------";

            // Determine whether state freqs are unique for this subset
            QMatrix::freq_xchg_ptr_t pfreq = _qmatrix[i]-&gt;getStateFreqsSharedPtr();
            QMatrix::freq_xchg_t & freq = *pfreq;
            double * freq_addr = &freq[0];
            auto f = freqset.insert(freq_addr);
            if (f.second) {
                unique_freq.push_back(freq_addr);
                <span style="color:#0000ff"><strong>if (!_qmatrix[i]-&gt;isFixedStateFreqs())</strong></span>
                    <span style="color:#0000ff"><strong>_state_freq_params.push_back(_qmatrix[i]);</strong></span>
                index = (unsigned)unique_freq.size();
            }
            else {
                auto iter = std::find(unique_freq.begin(), unique_freq.end(), freq_addr);
                index = (unsigned)std::distance(unique_freq.begin(), iter) + 1;
            }
            ss["freqs"] += boost::str(boost::format("%12d") % index);

            // Determine whether exchangeabilities are unique for this subset
            if (_subset_datatypes[i].isNucleotide()) {
                QMatrix::freq_xchg_ptr_t pxchg = _qmatrix[i]-&gt;getExchangeabilitiesSharedPtr();
                QMatrix::freq_xchg_t & xchg = *pxchg;
                double * xchg_addr = &xchg[0];
                auto x = xchgset.insert(xchg_addr);
                if (x.second) {
                    unique_xchg.push_back(xchg_addr);
                    <span style="color:#0000ff"><strong>if (!_qmatrix[i]-&gt;isFixedExchangeabilities())</strong></span>
                        <span style="color:#0000ff"><strong>_exchangeability_params.push_back(_qmatrix[i]);</strong></span>
                    index = (unsigned)unique_xchg.size();
                }
                else {
                    auto iter = std::find(unique_xchg.begin(), unique_xchg.end(), xchg_addr);
                    index = (unsigned)std::distance(unique_xchg.begin(), iter) + 1;
                }
                ss["xchg"] += boost::str(boost::format("%12d") % index);
            }
            else {
                ss["xchg"] += boost::str(boost::format("%12s") % "-");
            }

            // Determine whether omega is unique for this subset
            if (_subset_datatypes[i].isCodon()) {
                QMatrix::omega_ptr_t pomega = _qmatrix[i]-&gt;getOmegaSharedPtr();
                QMatrix::omega_t omegavalue = *pomega;
                double * omega_addr = &omegavalue;
                auto o = omegaset.insert(omega_addr);
                if (o.second) {
                    unique_omega.push_back(omega_addr);
                    <span style="color:#0000ff"><strong>if (!_qmatrix[i]-&gt;isFixedOmega())</strong></span>
                        <span style="color:#0000ff"><strong>_omega_params.push_back(_qmatrix[i]);</strong></span>
                    index = (unsigned)unique_omega.size();
                }
                else {
                    auto iter = std::find(unique_omega.begin(), unique_omega.end(), omega_addr);
                    index = (unsigned)std::distance(unique_omega.begin(), iter) + 1;
                }
                ss["omega"] += boost::str(boost::format("%12d") % index);
            }
            else {
                ss["omega"] += boost::str(boost::format("%12s") % "-");
            }

            // Determine whether rate variance is unique for this subset
            ASRV::ratevar_ptr_t pratevar = _asrv[i]-&gt;getRateVarSharedPtr();
            double & ratevar = *pratevar;
            double * ratevar_addr = &ratevar;
            auto r = ratevarset.insert(ratevar_addr);
            if (r.second) {
                unique_ratevar.push_back(ratevar_addr);
                <span style="color:#0000ff"><strong>if (!_asrv[i]-&gt;isFixedRateVar())</strong></span>
                    <span style="color:#0000ff"><strong>_ratevar_params.push_back(_asrv[i]);</strong></span>
                index = (unsigned)unique_ratevar.size();
            }
            else {
                auto iter = std::find(unique_ratevar.begin(), unique_ratevar.end(), ratevar_addr);
                index = (unsigned)std::distance(unique_ratevar.begin(), iter) + 1;
            }
            ss["ratevar"] += boost::str(boost::format("%12d") % index);

            // Determine whether pinvar is unique for this subset
            if (_asrv[i]-&gt;getIsInvarModel()) {
                ASRV::pinvar_ptr_t ppinvar = _asrv[i]-&gt;getPinvarSharedPtr();
                double & pinvar = *ppinvar;
                double * pinvar_addr = &pinvar;
                auto r = pinvarset.insert(pinvar_addr);
                if (r.second) {
                    unique_pinvar.push_back(pinvar_addr);
                    <span style="color:#0000ff"><strong>if (!_asrv[i]-&gt;isFixedPinvar())</strong></span>
                        <span style="color:#0000ff"><strong>_pinvar_params.push_back(_asrv[i]);</strong></span>
                    index = (unsigned)unique_pinvar.size();
                }
                else {
                    auto iter = std::find(unique_pinvar.begin(), unique_pinvar.end(), pinvar_addr);
                    index = (unsigned)std::distance(unique_pinvar.begin(), iter) + 1;
                }
                ss["pinvar"] += boost::str(boost::format("%12d") % index);
            }
            else {
                ss["pinvar"] += boost::str(boost::format("%12s") % "-");
            }

            // Save number of rate categories for this subset
            ss["ncateg"] += boost::str(boost::format("%12d") % _asrv[i]-&gt;getNumCateg());

            // Save number of sites for this subset
            ss["nsites"] += boost::str(boost::format("%12d") % _subset_sizes[i]);

            // Save number of patterns for this subset
            ss["npatterns"] += boost::str(boost::format("%12d") % _subset_npatterns[i]);

            // Save number of states for this subset
            if (_subset_datatypes.size() == _num_subsets)
                ss["nstates"] += boost::str(boost::format("%12d") % _subset_datatypes[i].getNumStates());
            else
                ss["nstates"] += boost::str(boost::format("%12s") % "?");

        }
        std::string s = "Partition information:\n\n";

        s += boost::str(boost::format("%20s%s\n") % "data subset" % ss["subset"]);
        s += boost::str(boost::format("%20s%s\n") % "-----------------" % ss["dashes"]);
        s += boost::str(boost::format("%20s%s\n") % "num. sites" % ss["nsites"]);
        s += boost::str(boost::format("%20s%s\n") % "num. patterns" % ss["npatterns"]);
        s += boost::str(boost::format("%20s%s\n") % "num. states" % ss["nstates"]);
        s += boost::str(boost::format("%20s%s\n") % "rate categories" % ss["ncateg"]);

        s += "\nParameter linkage:\n\n";

        s += boost::str(boost::format("%20s%s\n") % "data subset" % ss["subset"]);
        s += boost::str(boost::format("%20s%s\n") % "-----------------" % ss["dashes"]);
        s += boost::str(boost::format("%20s%s\n") % "state freqs" % ss["freqs"]);
        s += boost::str(boost::format("%20s%s\n") % "exchangeabilities" % ss["xchg"]);
        s += boost::str(boost::format("%20s%s\n") % "omega" % ss["omega"]);
        s += boost::str(boost::format("%20s%s\n") % "rate variance" % ss["ratevar"]);
        s += boost::str(boost::format("%20s%s\n") % "pinvar" % ss["pinvar"]);

        s += "\nParameter values for each subset:\n";

        s += "\n  relative rate:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            s += boost::str(boost::format("  %12d: %g\n") % (i+1) % _subset_relrates[i]);
        }

        s += "\n  state freqs:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            QMatrix::freq_xchg_t & freqs = *(_qmatrix[i]-&gt;getStateFreqsSharedPtr());
            std::vector&lt;std::string&gt; freqs_as_strings(freqs.size());
            std::transform(freqs.begin(), freqs.end(), freqs_as_strings.begin(), [](double freq) {return boost::str(boost::format("%g") % freq);});
            std::string tmp = boost::algorithm::join(freqs_as_strings, ",");
            s += boost::str(boost::format("  %12d: (%s)\n") % (i+1) % tmp);
        }

        s += "\n  exchangeabilities:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            if (_subset_datatypes[i].isNucleotide()) {
                QMatrix::freq_xchg_t & xchg = *(_qmatrix[i]-&gt;getExchangeabilitiesSharedPtr());
                std::vector&lt;std::string&gt; xchg_as_strings(xchg.size());
                std::transform(xchg.begin(), xchg.end(), xchg_as_strings.begin(), [](double x) {return boost::str(boost::format("%g") % x);});
                std::string tmp = boost::algorithm::join(xchg_as_strings, ",");
                s += boost::str(boost::format("  %12d: (%s)\n") % (i+1) % tmp);
            }
            else {
                s += boost::str(boost::format("  %12d: -\n") % (i+1));
            }
        }

        s += "\n  omega:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            if (_subset_datatypes[i].isCodon()) {
                double omega = *(_qmatrix[i]-&gt;getOmegaSharedPtr());
                s += boost::str(boost::format("  %12d: %g\n") % (i+1) % omega);
            }
            else {
                s += boost::str(boost::format("  %12d: -\n") % (i+1));
            }
        }

        s += "\n  rate variance:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            if (_asrv[i]-&gt;getNumCateg() &gt; 1) {
                double ratevar = *(_asrv[i]-&gt;getRateVarSharedPtr());
                s += boost::str(boost::format("  %12d: %g\n") % (i+1) % ratevar);
            }
            else
                s += boost::str(boost::format("  %12d: -\n") % (i+1));
        }

        s += "\n  pinvar:\n";
        for (unsigned i = 0; i &lt; _num_subsets; i++) {
            double pinvar = *(_asrv[i]-&gt;getPinvarSharedPtr());
            bool is_invar_model = _asrv[i]-&gt;getIsInvarModel();
            if (is_invar_model)
                s += boost::str(boost::format("  %12d: %g\n") % (i+1) % pinvar);
            else
                s += boost::str(boost::format("  %12d: -\n") % (i+1));
        }

        return s;
    }   

Modifying the Strom class declaration

Modify strom.hpp by adding the bold, blue parts below to the Strom class declaration:

#pragma once    

#include &lt;iostream&gt;
#include "data.hpp"
#include "likelihood.hpp"
#include "tree_summary.hpp"
#include "partition.hpp"
<span style="color:#0000ff"><strong>#include "lot.hpp"</strong></span>
<span style="color:#0000ff"><strong>#include "chain.hpp"</strong></span>
#include &lt;boost/program_options.hpp&gt;
#include &lt;boost/filesystem.hpp&gt;
#include &lt;boost/algorithm/string/split.hpp&gt;
#include &lt;boost/algorithm/string/classification.hpp&gt;

namespace strom {

    class Strom {
        public:
                                                    Strom();
                                                    ~Strom();

            void                                    clear();
            void                                    processCommandLineOptions(int argc, const char * argv[]);
            void                                    run();
        
        private:
            bool                                    processAssignmentString(const std::string & which, const std::string & definition);
            void                                    handleAssignmentStrings(const boost::program_options::variables_map & vm, std::string label, const std::vector&lt;std::string&gt; & definitions, std::string default_definition);
            bool                                    splitAssignmentString(const std::string & definition, std::vector&lt;std::string&gt; & vector_of_subset_names, std::vector&lt;double&gt;  & vector_of_values);

            double                                  _expected_log_likelihood;

            std::string                             _data_file_name;
            std::string                             _tree_file_name;
            Partition::SharedPtr                    _partition;

            Data::SharedPtr                         _data;
            Model::SharedPtr                        _model;
            Likelihood::SharedPtr                   _likelihood;
            TreeSummary::SharedPtr                  _tree_summary;
            <span style="color:#0000ff"><strong>Lot::SharedPtr                          _lot;</strong></span>

            <span style="color:#0000ff"><strong>unsigned                                _random_seed;</strong></span>
            <span style="color:#0000ff"><strong>unsigned                                _num_iter;</strong></span>
            <span style="color:#0000ff"><strong>unsigned                                _print_freq;</strong></span>
            <span style="color:#0000ff"><strong>unsigned                                _sample_freq;</strong></span>

            bool                                    _use_gpu;
            bool                                    _ambig_missing;
            bool                                    _use_underflow_scaling;

            static std::string                      _program_name;
            static unsigned                         _major_version;
            static unsigned                         _minor_version;

    };  

I’ve added several data members: _lot (the pseudorandom number generator), _random_seed (the starting seed for the pseudorandom number generator), _num_iter (the number of parameter updates that our Markov chain simulator will attempt, _sample_freq (the frequency with which parameter values will be saved), and _print_freq (the frequency with which a progress report will be issued to the console). The last one (_print_freq) will be ignored by our program for now; we will begin using it in the next step to control how much output appears on the screen as the program is running (we will not want a report after every iteration if we’ve asked for 10 million iterations!).

Modifying the Strom clear member function

Modify strom.hpp by adding the bold, blue parts below to the clear member function:

    inline void Strom::clear() {    
        _data_file_name             = "";
        _tree_file_name             = "";
        _tree_summary               = nullptr;
        _partition.reset(new Partition());
        _use_gpu                    = true;
        _ambig_missing              = true;
        _model.reset(new Model());
        _expected_log_likelihood    = 0.0;
        _data                       = nullptr;
        _likelihood                 = nullptr;
        _use_underflow_scaling      = false;
        <span style="color:#0000ff"><strong>_lot                        = nullptr;</strong></span>
        <span style="color:#0000ff"><strong>_random_seed                = 1;</strong></span>
        <span style="color:#0000ff"><strong>_num_iter                   = 1000;</strong></span>
        <span style="color:#0000ff"><strong>_print_freq                 = 1;</strong></span>
        <span style="color:#0000ff"><strong>_sample_freq                = 1;</strong></span>
    }   

Modifying the Strom processCommandLineOptions member function

Continue modifying strom.hpp by adding the bold, blue parts below to the Strom::processCommandLineOptions function:

    inline void Strom::processCommandLineOptions(int argc, const char * argv[]) {   
        std::vector&lt;std::string&gt; partition_statefreq;
        std::vector&lt;std::string&gt; partition_rmatrix;
        std::vector&lt;std::string&gt; partition_omega;
        std::vector&lt;std::string&gt; partition_ratevar;
        std::vector&lt;std::string&gt; partition_pinvar;
        std::vector&lt;std::string&gt; partition_ncateg;
        std::vector&lt;std::string&gt; partition_subsets;
        std::vector&lt;std::string&gt; partition_relrates;
        std::vector&lt;std::string&gt; partition_tree;
        boost::program_options::variables_map vm;
        boost::program_options::options_description desc("Allowed options");
        desc.add_options()
            ("help,h", "produce help message")
            ("version,v", "show program version")
            <span style="color:#0000ff"><strong>("seed,z",        boost::program_options::value(&_random_seed)-&gt;default_value(1),   "pseudorandom number seed")</strong></span>
            <span style="color:#0000ff"><strong>("niter,n",       boost::program_options::value(&_num_iter)-&gt;default_value(1000),   "number of MCMC iterations")</strong></span>
            <span style="color:#0000ff"><strong>("printfreq",  boost::program_options::value(&_print_freq)-&gt;default_value(1),   "skip this many iterations before reporting progress")</strong></span>
            <span style="color:#0000ff"><strong>("samplefreq",  boost::program_options::value(&_sample_freq)-&gt;default_value(1),   "skip this many iterations before sampling next")</strong></span>
            ("datafile,d",  boost::program_options::value(&_data_file_name)-&gt;required(), "name of a data file in NEXUS format")
            ("treefile,t",  boost::program_options::value(&_tree_file_name)-&gt;required(), "name of a tree file in NEXUS format")
            ("subset",  boost::program_options::value(&partition_subsets), "a string defining a partition subset, e.g. 'first:1-1234\3' or 'default[codon:standard]:1-3702'")
            ("ncateg,c", boost::program_options::value(&partition_ncateg), "number of categories in the discrete Gamma rate heterogeneity model")
            ("statefreq", boost::program_options::value(&partition_statefreq), "a string defining state frequencies for one or more data subsets, e.g. 'first,second:0.1,0.2,0.3,0.4'")
            ("omega", boost::program_options::value(&partition_omega), "a string defining the nonsynonymous/synonymous rate ratio omega for one or more data subsets, e.g. 'first,second:0.1'")
            ("rmatrix", boost::program_options::value(&partition_rmatrix), "a string defining the rmatrix for one or more data subsets, e.g. 'first,second:1,2,1,1,2,1'")
            ("ratevar", boost::program_options::value(&partition_ratevar), "a string defining the among-site rate variance for one or more data subsets, e.g. 'first,second:2.5'")
            ("pinvar", boost::program_options::value(&partition_pinvar), "a string defining the proportion of invariable sites for one or more data subsets, e.g. 'first,second:0.2'")
            ("relrate", boost::program_options::value(&partition_relrates), "a string defining the (unnormalized) relative rates for all data subsets (e.g. 'default:3,1,6').")
            ("tree", boost::program_options::value(&partition_tree), "the index of the tree in the tree file (first tree has index = 1)")
            ("expectedLnL", boost::program_options::value(&_expected_log_likelihood)-&gt;default_value(0.0), "log likelihood expected")
            ("gpu",           boost::program_options::value(&_use_gpu)-&gt;default_value(true),                "use GPU if available")
            ("ambigmissing",  boost::program_options::value(&_ambig_missing)-&gt;default_value(true),          "treat all ambiguities as missing data")
            ("underflowscaling",  boost::program_options::value(&_use_underflow_scaling)-&gt;default_value(false),          "scale site-likelihoods to prevent underflow (slower but safer)")
        ;
        boost::program_options::store(boost::program_options::parse_command_line(argc, argv, desc), vm);
        try {
            const boost::program_options::parsed_options & parsed = boost::program_options::parse_config_file&lt; char &gt;("strom.conf", desc, false);
            boost::program_options::store(parsed, vm);
        }
        catch(boost::program_options::reading_file & x) {
            std::cout &lt;&lt; "Note: configuration file (strom.conf) not found" &lt;&lt; std::endl;
        }
        boost::program_options::notify(vm);

        // If user specified --help on command line, output usage summary and quit
        if (vm.count("help") &gt; 0) {
            std::cout &lt;&lt; desc &lt;&lt; "\n";
            std::exit(1);
        }

        // If user specified --version on command line, output version and quit
        if (vm.count("version") &gt; 0) {
            std::cout &lt;&lt; boost::str(boost::format("This is %s version %d.%d") % _program_name % _major_version % _minor_version) &lt;&lt; std::endl;
            std::exit(1);
        }
    
        // If user specified --subset on command line, break specified partition subset 
        // definition into name and character set string and add to _partition
        if (vm.count("subset") &gt; 0) {
            _partition.reset(new Partition());
            for (auto s : partition_subsets) {
                _partition-&gt;parseSubsetDefinition(s);
            }
        }
        
        _model-&gt;setSubsetDataTypes(_partition-&gt;getSubsetDataTypes());
        
        handleAssignmentStrings(vm, "statefreq", partition_statefreq, "default:equal");
        handleAssignmentStrings(vm, "rmatrix",   partition_rmatrix,   "default:equal");
        handleAssignmentStrings(vm, "omega",     partition_omega,     "default:0.1"  );
        handleAssignmentStrings(vm, "ncateg",    partition_ncateg,    "default:1"    );
        handleAssignmentStrings(vm, "ratevar",   partition_ratevar,   "default:1.0"  );
        handleAssignmentStrings(vm, "pinvar",    partition_pinvar,    "default:0.0"  );
        handleAssignmentStrings(vm, "relrate",   partition_relrates,  "default:equal");
        handleAssignmentStrings(vm, "tree",      partition_tree,      "default:1"    ); 
    }   

Modifying the Strom run member function

Our final modifications to strom.hpp will be to add the bold, blue parts below to the Strom::run function:

    inline void Strom::run() {  
        std::cout &lt;&lt; "Starting..." &lt;&lt; std::endl;
        std::cout &lt;&lt; "Current working directory: " &lt;&lt; boost::filesystem::current_path() &lt;&lt; std::endl;

        try {
            std::cout &lt;&lt; "\n*** Reading and storing the data in the file " &lt;&lt; _data_file_name &lt;&lt; std::endl;
            _data = Data::SharedPtr(new Data());
            _data-&gt;setPartition(_partition);
            _data-&gt;getDataFromFile(_data_file_name);
            
            _model-&gt;setSubsetNumPatterns(_data-&gt;calcNumPatternsVect());
            _model-&gt;setSubsetSizes(_partition-&gt;calcSubsetSizes());
            _model-&gt;activate();

            // Report information about data partition subsets
            unsigned nsubsets = _data-&gt;getNumSubsets();
            std::cout &lt;&lt; "\nNumber of taxa: " &lt;&lt; _data-&gt;getNumTaxa() &lt;&lt; std::endl;
            std::cout &lt;&lt; "Number of partition subsets: " &lt;&lt; nsubsets &lt;&lt; std::endl;
            for (unsigned subset = 0; subset &lt; nsubsets; subset++) {
                DataType dt = _partition-&gt;getDataTypeForSubset(subset);
                std::cout &lt;&lt; "  Subset " &lt;&lt; (subset+1) &lt;&lt; " (" &lt;&lt; _data-&gt;getSubsetName(subset) &lt;&lt; ")" &lt;&lt; std::endl;
                std::cout &lt;&lt; "    data type: " &lt;&lt; dt.getDataTypeAsString() &lt;&lt; std::endl;
                std::cout &lt;&lt; "    sites:     " &lt;&lt; _data-&gt;calcSeqLenInSubset(subset) &lt;&lt; std::endl;
                std::cout &lt;&lt; "    patterns:  " &lt;&lt; _data-&gt;getNumPatternsInSubset(subset) &lt;&lt; std::endl;
                std::cout &lt;&lt; "    ambiguity: " &lt;&lt; (_ambig_missing || dt.isCodon() ? "treated as missing data (faster)" : "handled appropriately (slower)") &lt;&lt; std::endl;
                }

            std::cout &lt;&lt; "\n*** Resources available to BeagleLib " &lt;&lt; _likelihood-&gt;beagleLibVersion() &lt;&lt; ":\n";
            std::cout &lt;&lt; _likelihood-&gt;availableResources() &lt;&lt; std::endl;

            std::cout &lt;&lt; "\n*** Creating the likelihood calculator" &lt;&lt; std::endl;
            _likelihood = Likelihood::SharedPtr(new Likelihood());
            _likelihood-&gt;setPreferGPU(_use_gpu);
            _likelihood-&gt;setAmbiguityEqualsMissing(_ambig_missing);
            _likelihood-&gt;setData(_data);
            _likelihood-&gt;useUnderflowScaling(_use_underflow_scaling);

            std::cout &lt;&lt; "\n*** Model description" &lt;&lt; std::endl;
            std::cout &lt;&lt; _model-&gt;describeModel() &lt;&lt; std::endl;
            _likelihood-&gt;setModel(_model);

            _likelihood-&gt;initBeagleLib();

            std::cout &lt;&lt; "\n*** Reading and storing the first tree in the file " &lt;&lt; _tree_file_name &lt;&lt; std::endl;
            _tree_summary = TreeSummary::SharedPtr(new TreeSummary());
            _tree_summary-&gt;readTreefile(_tree_file_name, 0);
            Tree::SharedPtr tree = _tree_summary-&gt;getTree(0);
            <span style="color:#0000ff"><strong>std::string newick = _tree_summary-&gt;getNewick(0);</strong></span>

            if (tree-&gt;numLeaves() != _data-&gt;getNumTaxa())
                throw XStrom(boost::format("Number of taxa in tree (%d) does not equal the number of taxa in the data matrix (%d)") % tree-&gt;numLeaves() % _data-&gt;getNumTaxa());

            std::cout &lt;&lt; "\n*** Calculating the likelihood of the tree" &lt;&lt; std::endl;
            <span style="color:#0000ff"><strong>TreeManip tm(tree);</strong></span>
            <span style="color:#0000ff"><strong>tm.selectAllPartials();</strong></span>
            <span style="color:#0000ff"><strong>tm.selectAllTMatrices();</strong></span>
            double lnL = _likelihood-&gt;calcLogLikelihood(tree);
            std::cout &lt;&lt; boost::str(boost::format("log likelihood = %.5f") % lnL) &lt;&lt; std::endl;
            
            if (_expected_log_likelihood != 0.0) 
                std::cout &lt;&lt; boost::str(boost::format("      (expecting %.3f)") % _expected_log_likelihood) &lt;&lt; std::endl;
            
            <span style="color:#0000ff"><strong>// Create a Lot object that generates (pseudo)random numbers</strong></span>
            <span style="color:#0000ff"><strong>_lot = Lot::SharedPtr(new Lot);</strong></span>
            <span style="color:#0000ff"><strong>_lot-&gt;setSeed(_random_seed);</strong></span>
            
            <span style="color:#0000ff"><strong>// Create a Chain object and take _num_iter steps</strong></span>
            <span style="color:#0000ff"><strong>Chain chain;</strong></span>
            <span style="color:#0000ff"><strong>unsigned num_free_parameters = chain.createUpdaters(_model, _lot, _likelihood);</strong></span>
            <span style="color:#0000ff"><strong>if (num_free_parameters &gt; 0) {</strong></span>
                <span style="color:#0000ff"><strong>chain.setTreeFromNewick(newick);</strong></span>
                <span style="color:#0000ff"><strong>chain.start();</strong></span>

                <span style="color:#0000ff"><strong>// Output column headers and first line of output showing starting state (iteration 0)</strong></span>
                <span style="color:#0000ff"><strong>std::cout &lt;&lt; boost::str(boost::format("\n%12s %12s %12s %12s %12s %12s\n")</strong></span>
                    <span style="color:#0000ff"><strong>% "iteration"</strong></span>
                    <span style="color:#0000ff"><strong>% "lnLike"</strong></span>
                    <span style="color:#0000ff"><strong>% "lnPrior"</strong></span>
                    <span style="color:#0000ff"><strong>% "ratevar"</strong></span>
                    <span style="color:#0000ff"><strong>% "accept"</strong></span>
                    <span style="color:#0000ff"><strong>% "samples");</strong></span>
                <span style="color:#0000ff"><strong>GammaRateVarUpdater::SharedPtr ratevar_updater = std::dynamic_pointer_cast&lt;GammaRateVarUpdater&gt; (chain.findUpdaterByName("Gamma Rate Variance"));</strong></span>
                <span style="color:#0000ff"><strong>std::cout &lt;&lt; boost::str(boost::format("%12d %12.5f %12.5f %12.5f %12.1f %12d\n")</strong></span>
                    <span style="color:#0000ff"><strong>% 0</strong></span>
                    <span style="color:#0000ff"><strong>% chain.getLogLikelihood()</strong></span>
                    <span style="color:#0000ff"><strong>% chain.calcLogJointPrior()</strong></span>
                    <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getCurrentPoint()</strong></span>
                    <span style="color:#0000ff"><strong>% 0</strong></span>
                    <span style="color:#0000ff"><strong>% 0);</strong></span>

                <span style="color:#0000ff"><strong>for (unsigned iteration = 1; iteration &lt;= _num_iter; ++iteration) {</strong></span>
                    <span style="color:#0000ff"><strong>chain.nextStep(iteration);</strong></span>
                    
                    <span style="color:#0000ff"><strong>if (iteration % _sample_freq == 0) {</strong></span>
                        <span style="color:#0000ff"><strong>GammaRateVarUpdater::SharedPtr ratevar_updater = std::dynamic_pointer_cast&lt;GammaRateVarUpdater&gt; (chain.findUpdaterByName("Gamma Rate Variance"));</strong></span>
                        <span style="color:#0000ff"><strong>double log_prior = chain.calcLogJointPrior();</strong></span>
                        <span style="color:#0000ff"><strong>if (log_prior == Updater::getLogZero())</strong></span>
                            <span style="color:#0000ff"><strong>std::cout &lt;&lt; boost::str(boost::format("%12d %12.5f %12s %12.5f %12.1f %12d\n")</strong></span>
                                <span style="color:#0000ff"><strong>% iteration</strong></span>
                                <span style="color:#0000ff"><strong>% chain.getLogLikelihood()</strong></span>
                                <span style="color:#0000ff"><strong>% "-infinity"</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getCurrentPoint()</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getAcceptPct()</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getNumUpdates());</strong></span>
                        <span style="color:#0000ff"><strong>else</strong></span>
                            <span style="color:#0000ff"><strong>std::cout &lt;&lt; boost::str(boost::format("%12d %12.5f %12.5f %12.5f %12.1f %12d\n")</strong></span>
                                <span style="color:#0000ff"><strong>% iteration</strong></span>
                                <span style="color:#0000ff"><strong>% chain.getLogLikelihood()</strong></span>
                                <span style="color:#0000ff"><strong>% log_prior</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getCurrentPoint()</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getAcceptPct()</strong></span>
                                <span style="color:#0000ff"><strong>% ratevar_updater-&gt;getNumUpdates());</strong></span>
                    <span style="color:#0000ff"><strong>}</strong></span>
                <span style="color:#0000ff"><strong>}</strong></span>
                <span style="color:#0000ff"><strong>chain.stop();</strong></span>
            <span style="color:#0000ff"><strong>}</strong></span>
            <span style="color:#0000ff"><strong>else {</strong></span>
                <span style="color:#0000ff"><strong>std::cout &lt;&lt; "\nMCMC skipped because there are no free parameters in the model" &lt;&lt; std::endl;</strong></span>
            <span style="color:#0000ff"><strong>}</strong></span>
        }
        catch (XStrom & x) {
            std::cerr &lt;&lt; "Strom encountered a problem:\n  " &lt;&lt; x.what() &lt;&lt; std::endl;
        }

        std::cout &lt;&lt; "\nFinished!" &lt;&lt; std::endl;
    }   

Replacing the main function

You will also need to return main.cpp to the way it was before we changed it to test the Lot class. Replace everything in main.cpp with the following:

#include &lt;limits&gt;   
#include &lt;iostream&gt;
#include "strom.hpp"

using namespace strom;

// static data member initializations
std::string  Strom::_program_name        = "strom";
unsigned     Strom::_major_version       = 1;
unsigned     Strom::_minor_version       = 0;
const double Node::_smallest_edge_length = 1.0e-12;
const double Updater::_log_zero          = -std::numeric_limits&lt;double&gt;::max();
GeneticCode::genetic_code_definitions_t GeneticCode::_definitions = { // codon order is alphabetical: i.e. AAA, AAC, AAG, AAT, ACA, ..., TTT
    {"standard",             "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSS*CWCLFLF"},
    {"vertmito",             "KNKNTTTT*S*SMIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"yeastmito",            "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"moldmito",             "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"invertmito",           "KNKNTTTTSSSSMIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"ciliate",              "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVVQYQYSSSS*CWCLFLF"},
    {"echinomito",           "NNKNTTTTSSSSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"euplotid",             "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSCCWCLFLF"},
    {"plantplastid",         "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSS*CWCLFLF"},
    {"altyeast",             "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLSLEDEDAAAAGGGGVVVV*Y*YSSSS*CWCLFLF"},
    {"ascidianmito",         "KNKNTTTTGSGSMIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"altflatwormmito",      "NNKNTTTTSSSSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVVYY*YSSSSWCWCLFLF"},
    {"blepharismamacro",     "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*YQYSSSS*CWCLFLF"},
    {"chlorophyceanmito",    "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*YLYSSSS*CWCLFLF"},
    {"trematodemito",        "NNKNTTTTSSSSMIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSSWCWCLFLF"},
    {"scenedesmusmito",      "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*YLY*SSS*CWCLFLF"},
    {"thraustochytriummito", "KNKNTTTTRSRSIIMIQHQHPPPPRRRRLLLLEDEDAAAAGGGGVVVV*Y*YSSSS*CWC*FLF"}
};

int main(int argc, const char * argv[]) {

    Strom strom;
    try {
        strom.processCommandLineOptions(argc, argv);
        strom.run();
    }
    catch(std::exception & x) {
        std::cerr &lt;&lt; "Exception: " &lt;&lt; x.what() &lt;&lt; std::endl;
        std::cerr &lt;&lt; "Aborted." &lt;&lt; std::endl;
    }
    catch(...) {
        std::cerr &lt;&lt; "Exception of unknown type!\n";
    }

    return 0;
}   

Revising your strom.conf file

Replace the contents of strom.conf with the following. This sets all of the model parameters to the values PAUP* estimated with the exception of ratevar, which we will start at 1.0 rather than 3.94280 so we can test whether our rate variance updater is working.

We also set the number of rate categories to 4. This is necessary because, otherwise, the model would assume rate homogeneity and our one updatable parameter (gamma rate variance) would not be updated!

datafile    = rbcl10.nex    
treefile    = rbcl10.tre
statefreq   = default:[0.294667,0.189172,0.206055,0.310106]
rmatrix     = default:[0.06082,0.27887,0.06461,0.06244,0.48492,0.04834]
ratevar     = default:1.0  # MLE = 3.94322 = 1./0.25360
pinvar      = default:[0.0]
ncateg      = default:4
niter       = 1000
samplefreq  = 1
printfreq   = 1
expectedLnL = -6897.492 

Expected output

You should see output from 1000 calls to the nextStep function of the Chain object: I’ve shown only the first 10 iterations (plus the very last iteration) of this output below

   iteration       lnLike      lnPrior      ratevar       accept      samples
           0  -6897.49179     -1.00000      1.00000            0            0
           1  -6817.59032     -1.49718      1.49718        100.0            1
           2  -6817.59032     -1.49718      1.49718         50.0            2
           3  -6770.15548     -2.02658      2.02658         66.7            3
           4  -6770.15548     -2.02658      2.02658         50.0            4
           5  -6770.15548     -2.02658      2.02658         40.0            5
           6  -6770.15548     -2.02658      2.02658         33.3            6
           7  -6750.87905     -2.37988      2.37988         42.9            7
           8  -6742.62316     -2.58944      2.58944         50.0            8
           9  -6742.62316     -2.58944      2.58944         44.4            9
          10  -6742.62316     -2.58944      2.58944         40.0           10
        1000  -6723.60154     -3.70240      3.70240         30.9         1000

Note that the acceptance percentage for our gamma rate variance updater quickly converges to 30%, which is the target we specified by default in the Chain constructor (actually, in Chain::clear function, which is called by the constructor). Right now, we are tuning throughout the MCMC run, but soon we will change things so that autotuning is done for our updaters only during the burn-in period.

Also note that the likelihood improves as the gamma rate variance converges from the starting value of 1.0 to a value close to the maximum likelihood estimate (MLE) 3.94280, suggesting that our MCMC chain is converging on a region of parameter space having higher posterior probability density. The log-likelihood also approaches (but of course can never exceed) the maximum likleihood of -6723.144 as the chain runs.

Finally, notice that the lnPrior column is always equal to -1 times the gamma rate variance value. This is because the prior is completely determined by the prior on the gamma rate variance parameter (our model currently only has one estimated parameter). This gamma rate variance prior is, by default, Gamma(1,1), which is specified in Chain::clear. The log of this probability density is -log(1) – ratevar/1, which just equals -ratevar.

< 14.3 | 14.4 | 15.0 >