2.1 Create the Node Class

(Mac version)


In the Xcode Project Navigator pane on the left, you should now see your strom project (beside a blue project icon). Inside that you should see a strom folder (beside a yellow folder icon), and inside that a file named main.cpp. (If you do not see anything like this, your Project Navigator pane may be hidden. You can always get it back using View > Navigators > Show Project Navigator from the main menu.)

Control-click (or right-click) the strom folder (be sure to click the strom folder with the yellow icon, not the strom project) and choose New File… from the popup menu.

Select Header File and press the Next button. Name the new file node.hpp, check the strom target, and press the Create button to save node.hpp in the strom folder.

You should now have a strom folder inside your strom project, and inside the strom folder you should see two files: node.hpp and main.cpp.

Replace the default contents of node.hpp with the following code and save the file.

#pragma once

#include <string>
#include <vector>
#include <iostream>
//#include "split.hpp"

namespace strom {

    //class Tree;
    //class TreeManip;
    //class Likelihood;
    //class Updater;

    class Node {
            //friend class Tree;
            //friend class TreeManip;
            //friend class Likelihood;
            //friend class Updater;

        public:
                                        Node();
                                        ~Node();

                    Node *              getParent()     {return _parent;}
                    Node *              getLeftChild()  {return _left_child;}
                    Node *              getRightSib()   {return _right_sib;}
                    int                 getNumber()     {return _number;}
                    std::string         getName()       {return _name;}
                    //Split               getSplit()      {return _split;}

                    double              getEdgeLength() {return _edge_length;}
                    void                setEdgeLength(double v);

            static const double _smallest_edge_length;

            typedef std::vector<Node>    Vector;
            typedef std::vector<Node *>  PtrVector;

        private:

            void                clear();

            Node *              _left_child;
            Node *              _right_sib;
            Node *              _parent;
            int                 _number;
            std::string         _name;
            double              _edge_length;
            //Split               _split;
    };

    inline Node::Node() {
        std::cout << "Creating Node object" << std::endl;
        clear();
    }

    inline Node::~Node() {
        std::cout << "Destroying Node object" << std::endl;
    }

    inline void Node::clear() {
        _left_child = 0;
        _right_sib = 0;
        _parent = 0;
        _number = -1;
        _name = "";
        _edge_length = _smallest_edge_length;
    }

    inline void Node::setEdgeLength(double v) {
        _edge_length = (v < _smallest_edge_length ? _smallest_edge_length : v);
    }

}

Explanation of node.hpp

Pragmas

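The first line is a pragma, an instruction to the compiler that a given compiler may either honor or ignore. The once pragma is not part of the C++ standard, but it is supported by all major compilers. It says that this header file should be included at most once in any source file being compiled, even if several #include statements name it.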
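For comparison, the portable alternative to #pragma once is a classic include guard. The sketch below simulates including the same header twice within one file. The macro name DEMO_NODE_HPP, the struct DemoNode, and the function defaultDemoNumber are hypothetical names chosen only for illustration; they are not part of this tutorial's code.

```cpp
#include <cassert>

// First "inclusion" of the hypothetical header: the guard macro is not yet
// defined, so the body is compiled and the macro becomes defined.
#ifndef DEMO_NODE_HPP
#define DEMO_NODE_HPP
struct DemoNode {
    int number = -1;
};
#endif

// Second "inclusion": DEMO_NODE_HPP is already defined, so the preprocessor
// skips the body and DemoNode is not defined a second time (which would be
// a compile-time error).
#ifndef DEMO_NODE_HPP
#define DEMO_NODE_HPP
struct DemoNode {
    int number = -1;
};
#endif

// Returns the default node number, just to show that DemoNode is usable.
inline int defaultDemoNumber() {
    DemoNode nd;
    assert(nd.number == -1);
    return nd.number;
}
```

Removing both guards from this sketch would make the compiler reject the second definition of DemoNode, which is the problem that #pragma once prevents in node.hpp.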

Including header files

The #include <string> statement near the beginning of the file causes the code defining a standard string object to be inserted at that location in node.hpp, replacing the #include <string> line as if you had carried out a search and replace operation. This allows us to create and use objects of type std::string inside node.hpp (and in any other file that specifies #include "node.hpp"). Similarly, #include <vector> allows us to use the std::vector container to create arrays of Node objects, and #include <iostream> allows us to use the std::cout object to output information to the console.

You have probably noticed the #include "split.hpp" statement that has been commented out by preceding it with a double slash (//). We will uncomment this line later after we create the split.hpp file.

Namespaces

The Node class is wrapped in a namespace just in case this class is used along with other code that defines a class of the same name. The namespace we will be using to wrap everything in this tutorial is strom, which means “tree” in Czech. Note that the std in std::string is also a namespace. In order to use a standard string, you must qualify the name string with the namespace (i.e. std::string). While we will not do it, if you were to define a string class inside node.hpp, you would need to refer to it as strom::string outside the strom namespace.
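As a rough illustration of how namespace qualification keeps identically named classes apart, consider the sketch below. It uses a hypothetical namespace demo (standing in for strom) with a hypothetical class named string; neither is part of this tutorial's code.

```cpp
#include <cassert>
#include <string>

namespace demo {
    // A hypothetical class that happens to share its name with std::string.
    class string {
    public:
        const char * describe() const { return "demo::string"; }
    };

    // Inside the namespace, the unqualified name string means demo::string.
    inline string makeLocal() { return string(); }
}

// Outside the namespace, each name must be qualified, so there is no clash.
inline bool namesAreDistinct() {
    std::string  standard = "std::string";
    demo::string local    = demo::makeLocal();
    assert(standard != local.describe());
    return standard != local.describe();
}
```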

Data members

A data member is a variable defined for objects of a particular C++ class. Note that all data members have names beginning with a single underscore character. This is not required by C++ but is a convention that will be used throughout to make it easy to recognize data members as opposed to local variables or function parameters.

Node data members

The data members of the Node class include three pointers to other Node objects.

_parent: The _parent pointer points to the ancestor of the current Node. If _parent is 0, then the current Node is the root node.
_left_child: The _left_child pointer points to the left-most child node of the current Node in the tree. If _left_child is 0, then the Node is a leaf node in the tree.
_right_sib: Finally, the _right_sib pointer points to the next sibling Node on the right. If _right_sib is 0, then the current Node is the rightmost child of its ancestor.
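The three pointers described above can be sketched in action as follows. Because the real Node keeps its pointers private, this example uses a hypothetical stand-in named DemoNode with public members, plus hypothetical helpers addChild, countChildren, and demoChildCount. It writes nullptr where the text says 0; the two are equivalent for pointers.

```cpp
#include <cassert>

// Hypothetical stand-in for Node with public pointers (the tutorial's Node
// keeps these private).
struct DemoNode {
    DemoNode * left_child = nullptr;
    DemoNode * right_sib  = nullptr;
    DemoNode * parent     = nullptr;
};

// Attach child as the new rightmost child of parent.
inline void addChild(DemoNode & parent, DemoNode & child) {
    child.parent = &parent;
    if (!parent.left_child) {
        parent.left_child = &child;
        return;
    }
    DemoNode * last = parent.left_child;
    while (last->right_sib)
        last = last->right_sib;
    last->right_sib = &child;
}

// Count children by starting at the leftmost child and following
// right-sibling pointers until a null pointer is reached.
inline int countChildren(const DemoNode & nd) {
    int n = 0;
    for (const DemoNode * c = nd.left_child; c; c = c->right_sib)
        ++n;
    return n;
}

// Build a root with three leaf children and report the child count.
inline int demoChildCount() {
    DemoNode root, a, b, c;
    addChild(root, a);
    addChild(root, b);
    addChild(root, c);
    assert(root.parent == nullptr);     // the root has no parent
    assert(a.left_child == nullptr);    // leaves have no children
    assert(c.right_sib == nullptr);     // c is the rightmost child
    return countChildren(root);
}
```

One consequence of this layout is that a node needs only two downward pointers no matter how many children it has, which is why no per-node child array is required.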

There are four other data members of the Node class.

_name: This is a string that represents the taxon name of a leaf and is often (but not necessarily) an empty string for interior nodes.
_number: This is the node number, which serves as an index into the Tree::_nodes vector.
_edge_length: This is the length of the edge between this Node and its ancestor (i.e. _parent).
_smallest_edge_length: This is the smallest allowable edge length. This is a static data member, which means that it exists and its value can be set and used even if no object of this class is ever created. As such it functions as a global variable that can be used anywhere, but has the advantage of still being nestled within the Node class so there is no danger of it being confused with some global variable with the same name introduced by, for example, a third-party header file. Because it is static, we don’t initialize it in the constructor; instead, it will be initialized in main.cpp, which is our only source code file (all other files in this project are header files).
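A minimal sketch of how a static data member is declared in a class and then defined once in a source file is shown below. The namespace demo and the helper smallestEdgeLength are hypothetical, and the value 1.0e-12 is an assumption made for illustration only; it is not taken from this page.

```cpp
#include <cassert>

namespace demo {
    class Node {
    public:
        // Declaration only: this line does not allocate storage.
        static const double _smallest_edge_length;
    };
}

// Definition and initialization, placed in exactly one source file
// (main.cpp in this tutorial). The value 1.0e-12 is an assumed example.
const double demo::Node::_smallest_edge_length = 1.0e-12;

// The static member can be read without creating any Node object.
inline double smallestEdgeLength() {
    assert(demo::Node::_smallest_edge_length > 0.0);
    return demo::Node::_smallest_edge_length;
}
```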

Finally, there is one data member (_split) that is currently commented out. You will uncomment this line later.

Member functions

Accessors and setters

The Node class has several member functions. Most of these functions are accessors: they provide access to the private data members that are not otherwise accessible except by a member function of the Node class itself. The accessor functions are getParent(), getLeftChild(), getRightSib(), getNumber(), getName(), getEdgeLength(), and (commented out for now) getSplit().

Note that these functions are defined (i.e. their function body is provided) directly in the class declaration (i.e. the part between class Node and };). This is fine for really simple functions, but for functions with even slightly more complicated bodies, we will move the bodies further down in the file to avoid making the class declaration too difficult to comprehend at a glance.

The member function setEdgeLength is a setter: it sets the value of a particular data member to the specified floating point value. The function body is a bit too long to include in the class declaration, so its body is provided near the bottom (but inside the strom namespace code block). This function enforces a minimum length (_smallest_edge_length) for any edge.
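The clamping rule in setEdgeLength can be tried in isolation with the sketch below. The free function clampEdgeLength and the helper demoClamp are hypothetical, and the minimum of 1.0e-12 is an assumed value used only for this example.

```cpp
#include <cassert>

// Hypothetical free-function version of the rule used by setEdgeLength:
// any value below the minimum is replaced by the minimum.
inline double clampEdgeLength(double v, double smallest) {
    return (v < smallest ? smallest : v);
}

inline bool demoClamp() {
    const double smallest = 1.0e-12;   // assumed minimum, for illustration
    assert(clampEdgeLength(0.5, smallest) == 0.5);         // left unchanged
    assert(clampEdgeLength(0.0, smallest) == smallest);    // raised to minimum
    assert(clampEdgeLength(-3.0, smallest) == smallest);   // negative raised too
    return true;
}
```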

Constructors and destructors

Two member functions are special: the constructor function and the destructor function. You can identify these by the fact that they have no return type (not even void!) and are named after the class: the constructor's name is identical to the class name, and the destructor's name is the class name preceded by a tilde (~). The constructor is called automatically when a new object of the class is created, so it is the logical place for doing setup tasks that should be done before an object is used for the first time. The destructor is called automatically when the object is being deleted, and is the logical place for cleanup tasks that should be done before the object goes away.

The constructor defined here just reports that an object of type Node has been created, and then calls the clear() function to initialize data members. The destructor simply reports that a Node object is being destroyed. We will eventually comment out these std::cout statements to avoid cluttering the output, but for now it is nice to be able to see when objects are created and destroyed.
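To see when constructors and destructors run without reading console output, the sketch below counts live objects instead of printing messages. The class Counted and the function demoLifetime are hypothetical and are not part of this tutorial's code.

```cpp
#include <cassert>

// Hypothetical class that counts live objects instead of printing messages.
class Counted {
public:
    Counted()  { ++_alive; }   // runs when an object is created
    ~Counted() { --_alive; }   // runs when an object is destroyed
    static int alive() { return _alive; }
private:
    static int _alive;
};
int Counted::_alive = 0;

inline int demoLifetime() {
    assert(Counted::alive() == 0);
    {
        Counted a;
        Counted b;
        assert(Counted::alive() == 2);  // both constructors have run
    }   // a and b go out of scope here, so both destructors run
    return Counted::alive();
}
```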

Public versus private

You might wonder “why don’t we just make all data members public?” It is always wise to expose as little as possible to the outside world. Accessors provide read-only access to anyone who is interested, but do not allow just anyone to make changes to the data members of the Node class. This makes it harder for someone (maybe even yourself at a later time!) who does not fully understand your code to introduce errors when modifying it.

We will soon find that some classes use private members of Node to such an extent that we will make each of these classes a friend of Node. Any class declared as a friend of Node has full access to its private members. Because the classes Tree (which comprises Node objects), TreeManip (whose purpose is to manipulate Tree objects), Likelihood (whose purpose is to compute the likelihood of a tree), and Updater do not yet exist, these friend declarations, along with the matching forward declarations at the top of the namespace, are currently commented out.
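The effect of a friend declaration can be sketched as follows. The namespace demo, the member function renumber, and the function demoFriend are hypothetical; the real Tree class is written later in the tutorial and may look quite different.

```cpp
#include <cassert>

namespace demo {
    class Tree;   // forward declaration so Node can name Tree as a friend

    class Node {
        friend class Tree;              // Tree may touch private members
    public:
        int getNumber() const { return _number; }   // read-only access
    private:
        int _number = -1;
    };

    class Tree {
    public:
        // Allowed only because Tree is a friend of Node.
        void renumber(Node & nd, int n) { nd._number = n; }
    };
}

inline int demoFriend() {
    demo::Node nd;
    demo::Tree tree;
    assert(nd.getNumber() == -1);
    tree.renumber(nd, 7);
    return nd.getNumber();
}
```

Code outside Tree still has to go through getNumber(), so the private data stays protected from everyone except the classes explicitly named as friends.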