11.6 Test the Model class

(Mac version)

< 11.5 | 11.6 | 12.0 >

Install the Eigen library

Download the latest stable release of Eigen 3.

Unpack the downloaded zip or tar.gz file (e.g. 3.3.7.tar.gz) to your libraries directory. For example, on my computer, I now have this directory path: /Users/plewis/Documents/libraries/eigen-eigen-323c052e1731/Eigen I will refer to the eigen-eigen-323c052e1731 directory as EIGEN_ROOT. Inside EIGEN_ROOT is the directory named Eigen (starting with a capital E)

You now need to tell Xcode where the installed Eigen header files are located.

Choose Xcode > Preferences… from the main Xcode window, then click the “Locations” tab, and, finally, “Custom Paths”.

Click the + button and add name EIGEN_ROOT, Display Name EIGEN_ROOT, and, for Path, enter the full path to the directory you just created: for example, I entered this:

/Users/plewis/Documents/libraries/eigen-eigen-323c052e1731

This creates an alias that we can use in place of this path to specify where the NCL libraries and headers reside. You may remember that we already created an alias named BOOST_ROOT pointing to the Boost install directory and one named NCL_ROOT pointing to the Nexus Class Library install directory. Should we ever move the Eigen library, we need only update this “Custom Paths” path and would not have to modify the Xcode project.

Click on the blue project icon :blueproject: labeled strom at the top of the Project Navigator (not the yellow folder icon :yellowfolder: also labeled strom), then click on the “Build Settings” tab. Type “header search” into the search box. In the “Header Search Paths” row, double-click on the cell in the column with the blue project icon :blueproject:. Click the + symbol and add $(EIGEN_ROOT), pressing the enter key when finished. This tells Xcode where to find Eigen header files.

Because Eigen is a header-only library, we do not need to perform any other setup: it’s ready to go once Xcode knows where to find it!

Modify Strom

First, add a data member of type Model to the private section of the Strom class declaration (e.g. just below Data::SharedPtr _data;). Add another data member to hold the expected log-likelihood. The _expected_log_likelihood data member and the corresponding expectedLnL program option will help us in debugging but will be removed once our program is ready to be released. Finally, add declaration for 3 new private member functions, processAssignmentString, handleAssignmentStrings, and splitAssignmentString. The changes needed to the class declaration are highlighted below:

    class Strom {   
        public:
                                                    Strom();
                                                    ~Strom();

            void                                    clear();
            void                                    processCommandLineOptions(int argc, const char * argv[]);
            void                                    run();
        
        private:
            <span style="color:#0000ff"><strong>bool                                    processAssignmentString(const std::string & which, const std::string & definition);</strong></span>
            <span style="color:#0000ff"><strong>void                                    handleAssignmentStrings(const boost::program_options::variables_map & vm, std::string label, const std::vector&lt;std::string&gt; & definitions, std::string default_definition);</strong></span>
            <span style="color:#0000ff"><strong>bool                                    splitAssignmentString(const std::string & definition, std::vector&lt;std::string&gt; & vector_of_subset_names, std::vector&lt;double&gt;  & vector_of_values);</strong></span>

            <span style="color:#0000ff"><strong>double                                  _expected_log_likelihood;</strong></span>

            std::string                             _data_file_name;
            std::string                             _tree_file_name;
            Partition::SharedPtr                    _partition;

            Data::SharedPtr                         _data;
            <span style="color:#0000ff"><strong>Model::SharedPtr                        _model;</strong></span>
            Likelihood::SharedPtr                   _likelihood;
            TreeSummary::SharedPtr                  _tree_summary;

            bool                                    _use_gpu;
            bool                                    _ambig_missing;

            static std::string                      _program_name;
            static unsigned                         _major_version;
            static unsigned                         _minor_version;

    };  

Initialize the new data members in the body of the Strom::clear function:

    inline void Strom::clear() {    
        _data_file_name = "";
        _tree_file_name = "";
        _tree_summary = nullptr;
        _partition.reset(new Partition());
        _use_gpu = true;
        _ambig_missing = true;
        <span style="color:#0000ff"><strong>_model.reset(new Model());</strong></span>
        <span style="color:#0000ff"><strong>_expected_log_likelihood = 0.0;</strong></span>
        _data = nullptr; 
        _likelihood = nullptr;
    }   

Allow user to set model parameter values

Modify the Strom::processCommandLineOptions function (in the strom.hpp file) as follows to add options to allow the user to set various model parameters (as well as the expected log-likelihood).

    inline void Strom::processCommandLineOptions(int argc, const char * argv[]) {   
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_statefreq;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_rmatrix;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_omega;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_ratevar;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_pinvar;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_ncateg;</strong></span>
        std::vector&lt;std::string&gt; partition_subsets;
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_relrates;</strong></span>
        <span style="color:#0000ff"><strong>std::vector&lt;std::string&gt; partition_tree;</strong></span>
        boost::program_options::variables_map vm;
        boost::program_options::options_description desc("Allowed options");
        desc.add_options()
            ("help,h", "produce help message")
            ("version,v", "show program version")
            ("datafile,d",  boost::program_options::value(&_data_file_name)-&gt;required(), "name of a data file in NEXUS format")
            ("treefile,t",  boost::program_options::value(&_tree_file_name)-&gt;required(), "name of a tree file in NEXUS format")
            ("subset",  boost::program_options::value(&partition_subsets), "a string defining a partition subset, e.g. 'first:1-1234\3' or 'default[codon:standard]:1-3702'")
            <span style="color:#0000ff"><strong>("ncateg,c", boost::program_options::value(&partition_ncateg), "number of categories in the discrete Gamma rate heterogeneity model")</strong></span>
            <span style="color:#0000ff"><strong>("statefreq", boost::program_options::value(&partition_statefreq), "a string defining state frequencies for one or more data subsets, e.g. 'first,second:0.1,0.2,0.3,0.4'")</strong></span>
            <span style="color:#0000ff"><strong>("omega", boost::program_options::value(&partition_omega), "a string defining the nonsynonymous/synonymous rate ratio omega for one or more data subsets, e.g. 'first,second:0.1'")</strong></span>
            <span style="color:#0000ff"><strong>("rmatrix", boost::program_options::value(&partition_rmatrix), "a string defining the rmatrix for one or more data subsets, e.g. 'first,second:1,2,1,1,2,1'")</strong></span>
            <span style="color:#0000ff"><strong>("ratevar", boost::program_options::value(&partition_ratevar), "a string defining the among-site rate variance for one or more data subsets, e.g. 'first,second:2.5'")</strong></span>
            <span style="color:#0000ff"><strong>("pinvar", boost::program_options::value(&partition_pinvar), "a string defining the proportion of invariable sites for one or more data subsets, e.g. 'first,second:0.2'")</strong></span>
            <span style="color:#0000ff"><strong>("relrate", boost::program_options::value(&partition_relrates), "a string defining the (unnormalized) relative rates for all data subsets (e.g. 'default:3,1,6').")</strong></span>
            <span style="color:#0000ff"><strong>("tree", boost::program_options::value(&partition_tree), "the index of the tree in the tree file (first tree has index = 1)")</strong></span>
            <span style="color:#0000ff"><strong>("expectedLnL", boost::program_options::value(&_expected_log_likelihood)-&gt;default_value(0.0), "log likelihood expected")</strong></span>
            ("gpu",           boost::program_options::value(&_use_gpu)-&gt;default_value(true),                "use GPU if available")
            ("ambigmissing",  boost::program_options::value(&_ambig_missing)-&gt;default_value(true),          "treat all ambiguities as missing data")
        ;
        boost::program_options::store(boost::program_options::parse_command_line(argc, argv, desc), vm);
        try {
            const boost::program_options::parsed_options & parsed = boost::program_options::parse_config_file&lt; char &gt;("strom.conf", desc, false);
            boost::program_options::store(parsed, vm);
        }
        catch(boost::program_options::reading_file & x) {
            std::cout &lt;&lt; "Note: configuration file (strom.conf) not found" &lt;&lt; std::endl;
        }
        boost::program_options::notify(vm);

        // If user specified --help on command line, output usage summary and quit
        if (vm.count("help") &gt; 0) {
            std::cout &lt;&lt; desc &lt;&lt; "\n";
            std::exit(1);
        }

        // If user specified --version on command line, output version and quit
        if (vm.count("version") &gt; 0) {
            std::cout &lt;&lt; boost::str(boost::format("This is %s version %d.%d") % _program_name % _major_version % _minor_version) &lt;&lt; std::endl;
            std::exit(1);
        }
    
        // If user specified --subset on command line, break specified partition subset 
        // definition into name and character set string and add to _partition
        if (vm.count("subset") &gt; 0) {
            _partition.reset(new Partition());
            for (auto s : partition_subsets) {
                _partition-&gt;parseSubsetDefinition(s);
            }
        }
        
        <span style="color:#0000ff"><strong>_model-&gt;setSubsetDataTypes(_partition-&gt;getSubsetDataTypes());</strong></span>
        
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "statefreq", partition_statefreq, "default:equal");</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "rmatrix",   partition_rmatrix,   "default:equal");</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "omega",     partition_omega,     "default:0.1"  );</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "ncateg",    partition_ncateg,    "default:1"    );</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "ratevar",   partition_ratevar,   "default:1.0"  );</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "pinvar",    partition_pinvar,    "default:0.0"  );</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "relrate",   partition_relrates,  "default:equal");</strong></span>
        <span style="color:#0000ff"><strong>handleAssignmentStrings(vm, "tree",      partition_tree,      "default:1"    );</strong></span>
    }  

You will note that I have introduced 8 local variables (partition_statefreq, partition_rmatrix, partition_omega, partition_ratevar, partition_pinvar, partition_ncateg, partition_relrates, and partition_tree) that serve to capture information provided by the user in the configuration file. As these bits of information are captured, they must be transfered to the model. The first thing we must tell the model is the number of partition subsets that the user defined and the data types of those subsets (using the Model::setSubsetDataTypes function). Following that line, there are 8 handleAssignmentStrings function calls, one for each of the new quantities that the user can potentially modify using the configuration file.

The handleAssignmentStrings member function

This function checks to see whether the user specified any option in the configuration file with a name equal to label. If so, it calls on processAssignmentString to do all the work of parsing the assignment of a parameter value to different subsets. It also checks whether the user specified a default parameter value to be applied to all subsets unless overridden by a later subset-specific assignment. If the user has supplied a default, it must come first in the config file, otherwise it would overwrite assignments already made to specific subsets. If the user did not specify any assignments using the name provided via label, then the supplied default_definition is applied.

    inline void Strom::handleAssignmentStrings(const boost::program_options::variables_map & vm, std::string label, const std::vector&lt;std::string&gt; & definitions, std::string default_definition) { 
        if (vm.count(label) &gt; 0) {
            bool first = true;
            for (auto s : definitions) {
                bool is_default = processAssignmentString(label, s);
                if (is_default && !first)
                    throw XStrom(boost::format("default specification must be first %s encountered") % label);
                first = false;
            }
        }
        else {
            processAssignmentString(label, default_definition);
        }
    }   

The processAssignmentString member function

This function handles parsing a particular parameter assignment (e.g. strings such as rmatrix = first,second:1,1,1,1,1,1 or statefreq = default:.1,.2,.3,.4). Each of these assignment statements comprises a list of subsets (or the keyword default) and a parameter value (which may be multivariate), separated by a colon (:) character. The function splitAssignmentString is used to separate out the subsets into a vector of strings named vector_of_subset_names and a vector of doubles named vector_of_values.

The remainder of the function is just a long nested conditional that calls the specific member function of _model responsible for handling each possible parameter assignment. In the case of model parameters that can be linked across data subsets (e.g. statefreq, rmatrix, omega, pinvar, ratevar, and relrate), the parameter value is made into a shared pointer and it is the shared pointer that is assigned to different subsets. In the case of fixed settings (e.g. ncateg), actual values are assigned to the specified subsets. If the first element of vector_of_subset_names is default, then the assignment is made to all subsets; otherwise, each subset listed by the user is looked up by name and the assignment is made to that particular set of data subsets.

    inline bool Strom::processAssignmentString(const std::string & which, const std::string & definition) { 
        unsigned num_subsets_defined = _partition-&gt;getNumSubsets();
        std::vector&lt;std::string&gt; vector_of_subset_names;
        std::vector&lt;double&gt; vector_of_values;
        bool fixed = splitAssignmentString(definition, vector_of_subset_names, vector_of_values);
        
        if (vector_of_values.size() == 1 && vector_of_values[0] == -1 && !(which == "statefreq" || which == "rmatrix" || which == "relrate"))
            throw XStrom("Keyword equal is only allowed for statefreq, rmatrix, and relrate");

        // Assign values to subsets in model
        bool default_found = false;
        if (which == "statefreq") {
            QMatrix::freq_xchg_ptr_t freqs = std::make_shared&lt;QMatrix::freq_xchg_t&gt;(vector_of_values);
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++)
                    _model-&gt;setSubsetStateFreqs(freqs, i, fixed);
            }
            else {
                for (auto s : vector_of_subset_names) {
                    _model-&gt;setSubsetStateFreqs(freqs, _partition-&gt;findSubsetByName(s), fixed);
                }
            }
        }
        else if (which == "rmatrix") {
            QMatrix::freq_xchg_ptr_t xchg = std::make_shared&lt;QMatrix::freq_xchg_t&gt;(vector_of_values);
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++)
                    _model-&gt;setSubsetExchangeabilities(xchg, i, fixed);
            }
            else {
                for (auto s : vector_of_subset_names) {
                    _model-&gt;setSubsetExchangeabilities(xchg, _partition-&gt;findSubsetByName(s), fixed);
                }
            }
        }
        else if (which == "omega") {
            if (vector_of_values.size() &gt; 1)
                throw XStrom(boost::format("expecting 1 value for omega, found %d values") % vector_of_values.size());
            QMatrix::omega_ptr_t omega = std::make_shared&lt;QMatrix::omega_t&gt;(vector_of_values[0]);
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++)
                    _model-&gt;setSubsetOmega(omega, i, fixed);
            }
            else {
                for (auto s : vector_of_subset_names) {
                    _model-&gt;setSubsetOmega(omega, _partition-&gt;findSubsetByName(s), fixed);
                }
            }
        }
        else if (which == "pinvar") {
            if (vector_of_values.size() &gt; 1)
                throw XStrom(boost::format("expecting 1 value for pinvar, found %d values") % vector_of_values.size());
            ASRV::pinvar_ptr_t p = std::make_shared&lt;double&gt;(vector_of_values[0]);
            bool invar_model = (*p &gt; 0);
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++) {
                    _model-&gt;setSubsetIsInvarModel(invar_model, i);
                    _model-&gt;setSubsetPinvar(p, i, fixed);
                }
            }
            else {
                for (auto s : vector_of_subset_names) {
                    unsigned i = _partition-&gt;findSubsetByName(s);
                    _model-&gt;setSubsetIsInvarModel(invar_model, i);
                    _model-&gt;setSubsetPinvar(p, i, fixed);
                }
            }
        }
        else if (which == "ratevar") {
            if (vector_of_values.size() &gt; 1)
                throw XStrom(boost::format("expecting 1 value for ratevar, found %d values") % vector_of_values.size());
            ASRV::ratevar_ptr_t rv = std::make_shared&lt;double&gt;(vector_of_values[0]);
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++)
                    _model-&gt;setSubsetRateVar(rv, i, fixed);
            }
            else {
                for (auto s : vector_of_subset_names) {
                    _model-&gt;setSubsetRateVar(rv, _partition-&gt;findSubsetByName(s), fixed);
                }
            }
        }
        else if (which == "ncateg") {
            if (vector_of_values.size() &gt; 1)
                throw XStrom(boost::format("expecting 1 value for ncateg, found %d values") % vector_of_values.size());
            unsigned ncat = vector_of_values[0];
            if (vector_of_subset_names[0] == "default") {
                default_found = true;
                for (unsigned i = 0; i &lt; num_subsets_defined; i++)
                    _model-&gt;setSubsetNumCateg(ncat, i);
            }
            else {
                for (auto s : vector_of_subset_names) {
                    _model-&gt;setSubsetNumCateg(ncat, _partition-&gt;findSubsetByName(s));
                }
            }
        }
        else if (which == "tree") {
            if (vector_of_values.size() &gt; 1)
                throw XStrom(boost::format("expecting 1 value for tree, found %d values") % vector_of_values.size());
            unsigned tree_index = vector_of_values[0];
            assert(tree_index &gt; 0);
            _model-&gt;setTreeIndex(tree_index - 1, fixed);
            if (vector_of_subset_names[0] != "default")
                throw XStrom("tree must be assigned to default only");
        }
        else {
            assert(which == "relrate");
            if (vector_of_subset_names[0] != "default")
                throw XStrom("relrate must be assigned to default only");
            _model-&gt;setSubsetRelRates(vector_of_values, fixed);
        }

        return default_found;
    }   

The splitAssignmentString member function

The job of this function is to split a string such as first,second:1,1,1,1,1,1, default:.1,.2,.3,.4, or first,third:4 into two vectors: a vector of strings named vector_of_subset_names storing the subset names (or just the keyword default); and a vector of doubles named vector_of_values storing the (possibly multivariate) parameter value. These two vectors are supplied by the calling function (processAssignmentString) and passed by reference so that they can be filled by splitAssignmentString. In the example given, vector_of_subset_names would end up containing two strings (first and second) and vector_of_values would contain 6 double values, all equal to 1.0.

The splitting is accomplished using the boost::split function, which produces a vector (twoparts) containing 2 strings (first,second and 1,1,1,1,1,1). The first,second string is assigned to the variable comma_delimited_subset_names_string and the 1,1,1,1,1,1 string is assigned to the variable comma_delimited_value_string.

This section shown below checks to see if the parameter value in comma_delimited_value_string is enclosed in square brackets (using a regular expression). If so, then comma_delimited_value_string is assigned to the part inside the square brackets, and a boolean variable fixed is set to true, indicating that the user wishes for this parameter to have a fixed value (i.e. not updated during an MCMC analysis, for example).

        // Check for brackets indicating that parameter should be fixed     
        // now see if before_colon contains a data type specification in square brackets
        bool fixed = false;
        const char * pattern_string = R"(\s*\[(.+?)\]\s*)";
        std::regex re(pattern_string);
        std::smatch match_obj;
        bool matched = std::regex_match(comma_delimited_value_string, match_obj, re);
        if (matched) {
            comma_delimited_value_string = match_obj[1];
            fixed = true;
        }   

If comma_delimited_value_string is the string equal, then vector_of_values is assigned the single value -1.0, which will serve as a signal that equal was specified.

Assuming comma_delimited_value_string does not containg the string equal, then comma_delimited_value_string is split, again using boost::split, at the commas to yield the returned vector_of_strings vector, which is then converted to the double vector vector_of_values using std::transform. The std::transform function transforms each individual string in vector_of_strings to a double value that is appended to vector_of_values by the anonymous lambda function provided. The lambda function takes a string argument vstr provided by std::transform and converts it to a double value using the std::stod function.

The comma_delimited_subset_names_string is split to yield the returned vector_of_subset_names using boost::split once again.

Finally, some sanity checks are performed:

    inline bool Strom::splitAssignmentString(const std::string & definition, std::vector&lt;std::string&gt; & vector_of_subset_names, std::vector&lt;double&gt;  & vector_of_values) {  
        // Split subset names from definition
        std::vector&lt;std::string&gt; twoparts;
        boost::split(twoparts, definition, boost::is_any_of(":"));
        if (twoparts.size() != 2)
            throw XStrom("Expecting exactly one colon in assignment");
        std::string comma_delimited_subset_names_string = twoparts[0];
        std::string comma_delimited_value_string = twoparts[1];
        boost::to_lower(comma_delimited_value_string);
        
        // Check for brackets indicating that parameter should be fixed     
        // now see if before_colon contains a data type specification in square brackets
        bool fixed = false;
        const char * pattern_string = R"(\s*\[(.+?)\]\s*)";
        std::regex re(pattern_string);
        std::smatch match_obj;
        bool matched = std::regex_match(comma_delimited_value_string, match_obj, re);
        if (matched) {
            comma_delimited_value_string = match_obj[1];
            fixed = true;
        }   
        
        if (comma_delimited_value_string == "equal") {
            vector_of_values.resize(1);
            vector_of_values[0] = -1;
        }
        else {
            // Convert comma_delimited_value_string to vector_of_strings
            std::vector&lt;std::string&gt; vector_of_strings;
            boost::split(vector_of_strings, comma_delimited_value_string, boost::is_any_of(","));

            // Convert vector_of_strings to vector_of_values (single values in case of ratevar, ncateg, and pinvar)
            vector_of_values.resize(vector_of_strings.size());
            std::transform(
                vector_of_strings.begin(),      // start of source vector
                vector_of_strings.end(),        // end of source vector
                vector_of_values.begin(),       // start of destination vector
                [](const std::string & vstr) {  // anonymous function that translates
                    return std::stod(vstr);     // each string element to a double
                }
            );
            assert(vector_of_values.size() &gt; 0);
        }
        
        // Split comma_delimited_subset_names_string into vector_of_subset_names
        boost::split(vector_of_subset_names, comma_delimited_subset_names_string, boost::is_any_of(","));
        
        // Sanity check: at least one subset name must be provided
        if (vector_of_subset_names.size() == 0) {
            XStrom("At least 1 subset must be provided in assignments (use \"default\" if not partitioning)");
        }
        
        // Sanity check: if no partition was defined, then values should be assigned to "default" subset
        // and if "default" is in the list of subset names, it should be the only thing in that list
        unsigned num_subsets_defined = _partition-&gt;getNumSubsets();
        std::vector&lt;std::string&gt;::iterator default_iter = std::find(vector_of_subset_names.begin(), vector_of_subset_names.end(), std::string("default"));
        bool default_found = (default_iter != vector_of_subset_names.end());
        if (default_found) {
            if (vector_of_subset_names.size() &gt; 1)
                throw XStrom("The \"default\" specification cannot be mixed with other subset-specific parameter specifications");
        }
        else if (num_subsets_defined == 0) {
            throw XStrom("Must specify partition before assigning values to particular subsets (or assign to subset \"default\")");
        }
        return fixed;
    }   

Updating the run function

Change the Strom::run function in the strom.hpp file to accommodate the new lines in blue:

    inline void Strom::run() {  
        std::cout &lt;&lt; "Starting..." &lt;&lt; std::endl;
        std::cout &lt;&lt; "Current working directory: " &lt;&lt; boost::filesystem::current_path() &lt;&lt; std::endl;

        try {
            std::cout &lt;&lt; "\n*** Reading and storing the data in the file " &lt;&lt; _data_file_name &lt;&lt; std::endl;
            _data = Data::SharedPtr(new Data());
            _data-&gt;setPartition(_partition);
            _data-&gt;getDataFromFile(_data_file_name);
            
            <span style="color:#0000ff"><strong>_model-&gt;setSubsetNumPatterns(_data-&gt;calcNumPatternsVect());</strong></span>
            <span style="color:#0000ff"><strong>_model-&gt;setSubsetSizes(_partition-&gt;calcSubsetSizes());</strong></span>
            <span style="color:#0000ff"><strong>_model-&gt;activate();</strong></span>

            // Report information about data partition subsets
            unsigned nsubsets = _data-&gt;getNumSubsets();
            std::cout &lt;&lt; "\nNumber of taxa: " &lt;&lt; _data-&gt;getNumTaxa() &lt;&lt; std::endl;
            std::cout &lt;&lt; "Number of partition subsets: " &lt;&lt; nsubsets &lt;&lt; std::endl;
            for (unsigned subset = 0; subset &lt; nsubsets; subset++) {
                DataType dt = _partition-&gt;getDataTypeForSubset(subset);
                std::cout &lt;&lt; "  Subset " &lt;&lt; (subset+1) &lt;&lt; " (" &lt;&lt; _data-&gt;getSubsetName(subset) &lt;&lt; ")" &lt;&lt; std::endl;
                std::cout &lt;&lt; "    data type: " &lt;&lt; dt.getDataTypeAsString() &lt;&lt; std::endl;
                std::cout &lt;&lt; "    sites:     " &lt;&lt; _data-&gt;calcSeqLenInSubset(subset) &lt;&lt; std::endl;
                std::cout &lt;&lt; "    patterns:  " &lt;&lt; _data-&gt;getNumPatternsInSubset(subset) &lt;&lt; std::endl;
                std::cout &lt;&lt; "    ambiguity: " &lt;&lt; (_ambig_missing || dt.isCodon() ? "treated as missing data (faster)" : "handled appropriately (slower)") &lt;&lt; std::endl;
                }

            std::cout &lt;&lt; "\n*** Resources available to BeagleLib " &lt;&lt; _likelihood-&gt;beagleLibVersion() &lt;&lt; ":\n";
            std::cout &lt;&lt; _likelihood-&gt;availableResources() &lt;&lt; std::endl;

            std::cout &lt;&lt; "\n*** Creating the likelihood calculator" &lt;&lt; std::endl;
            _likelihood = Likelihood::SharedPtr(new Likelihood());
            _likelihood-&gt;setPreferGPU(_use_gpu);
            _likelihood-&gt;setAmbiguityEqualsMissing(_ambig_missing);
            _likelihood-&gt;setData(_data);

            <span style="color:#0000ff"><strong>std::cout &lt;&lt; "\n*** Model description" &lt;&lt; std::endl;</strong></span>
            <span style="color:#0000ff"><strong>std::cout &lt;&lt; _model-&gt;describeModel() &lt;&lt; std::endl;</strong></span>
            <span style="color:#0000ff"><strong>_likelihood-&gt;setModel(_model);</strong></span>

            _likelihood-&gt;initBeagleLib();

            std::cout &lt;&lt; "\n*** Reading and storing the first tree in the file " &lt;&lt; _tree_file_name &lt;&lt; std::endl;
            _tree_summary = TreeSummary::SharedPtr(new TreeSummary());
            _tree_summary-&gt;readTreefile(_tree_file_name, 0);
            Tree::SharedPtr tree = _tree_summary-&gt;getTree(0);
            
            if (tree-&gt;numLeaves() != _data-&gt;getNumTaxa())
                throw XStrom(boost::format("Number of taxa in tree (%d) does not equal the number of taxa in the data matrix (%d)") % tree-&gt;numLeaves() % _data-&gt;getNumTaxa());

            std::cout &lt;&lt; "\n*** Calculating the likelihood of the tree" &lt;&lt; std::endl;
            double lnL = _likelihood-&gt;calcLogLikelihood(tree);
            std::cout &lt;&lt; boost::str(boost::format("log likelihood = %.5f") % lnL) &lt;&lt; std::endl;
            
            <span style="color:#0000ff"><strong>if (_expected_log_likelihood != 0.0)</strong></span>
                <span style="color:#0000ff"><strong>std::cout &lt;&lt; boost::str(boost::format("      (expecting %.3f)") % _expected_log_likelihood) &lt;&lt; std::endl;</strong></span>
            
        }
        catch (XStrom & x) {
            std::cerr &lt;&lt; "Strom encountered a problem:\n  " &lt;&lt; x.what() &lt;&lt; std::endl;
        }

        std::cout &lt;&lt; "\nFinished!" &lt;&lt; std::endl;
    }   

Run the program using the new Model class

Run your program using the strom.conf file you created at the beginning of this step (at the end of the section Specifying a model).

The addition of the expectedLnL line provides the log-likelihood we expect our program to compute. Running the program should produce the expected log-likelihood. Your program now has the capability to compute the likelihood under a model involving data partitioning.

< 11.5 | 11.6 | 12.0 >