Exploring Binary Trees (in C++)
My notes and takeaways from a short dive into the world of DSA and binary trees.
A short while ago, I undertook the task of improving my C++ via DSA (which stands for Data Structure & Algorithms). The aim of practising DSA is simple: By implementing your own data structures and algorithms, you learn a lot of what happens under the hood of your everyday programming, allowing you to write more efficient and optimised code. In short, it helps you distinguish between bloat and useful code.
Before we move on, what exactly is even DSA? Let's define the words in there.
Data Structure: A data structure is a named location that allows you to store and organise data. An example of a data structure is the common
Map
found in multiple languagesAlgorithm: An algorithm is a set of steps that define processes/steps to perform a task. There are two main, important components of an algorithm: the order and the steps. A non-technical example is getting ready for work, you have 4 steps: wake up, shower, dress for the day, and head out. We defined the steps and also ordered them. Imagine showering, then waking up, then heading out, then dressing up! That's a sure way to land in an asylum.
That's the basics out of the way, back to my journey. Binary trees are a data tree structure with multiple nodes, with a node having no more than 2 children.
๐ฒ Binary Tree
What was the word salad above, you ask? Let's break down the concepts:
Tree Data Structure: A tree data structure is an abstract data structure that mimics a tree hierarchical structure with a set of connected points. Each point has one parent point, except the root point (where the tree starts from). Each point here is called a node. The connecting "lines" between a node is called an edge.
-
Binary Tree: A binary tree is a tree data structure where each node has no more than two children, usually referred to as the left and right nodes. A child here means a node connected to a node on an upper level. In the picture above, the root has two children's nodes, with the left child's node having two children and the right child's node having three children nodes. The picture above is not a binary tree.
Note that the data structures here like trees and graphs are abstract concepts. They are models/conventions for shaping data.
Based on that, we can go ahead and define a basic Binary Tree:
struct TreeNode {
int value;
TreeNode *right, *left;
explicit TreeNode(int val):
value(val),
right(nullptr),
left(nullptr) {};
};
Our binary tree is a struct that allows us to instantiate a root node, that is a node with no parent. It also points the left and right child nodes to nothing. Also, the edges are the pointers to its children nodes.
We use pointers (TreeNode *right, *left;
) in our struct because binary trees are recursive structures. Each node needs to reference its children, but we don't know how many nodes the tree will have in advance. Pointers allow us to dynamically allocate memory for new nodes as we build the tree, providing flexibility and efficient memory usage.
TreeNode root = TreeNode(0)
Let's add children nodes and see how it would tally.
root.right = new TreeNode(1);
root.left = new TreeNode(2);
auto* node_three = new TreeNode(3);
auto* node_four = new TreeNode(4);
auto* node_five = new TreeNode(5);
root.right->left = node_three;
root.right->right = node_five;
root.left->left = node_four;
That's the basics of the Binary Tree. Next, we will briefly look at the traversal of a Binary Tree.
โ๏ธ Traversing a Binary Tree
Traversing a binary tree simply means moving through the tree nodes. It can be broken down into two main categories:
Depth-First Search (DFS) Algorithms: Depth-first is an algorithm (remember, steps and order) that is as its name advertises, a search algorithm used to travel around a tree by going down. For example, using our tree above, we can traverse: 0-2-4-1-3-5. Starting from the root, then going down the leftmost node recursively, then moving to the right and repeating the same process. DFS has different types of traversals we won't be looking at, namely Preorder, Inorder, and Postorder traversals.
Breadth-First Search (BFS) Algorithms: Breadth-first is the opposite of the DFS algorithm, in that the traversal occurs across instead of down. Using our trusty tree above, a complete traversal would yield: 0-2-1-4-3-5. Unlike DFS, there's only one type of traversal when it comes to this algorithm. And that's Level Order Traversal.
That's it for this section. As I said above, a binary tree is an abstract data model (concept), not an actual topic to be cramming or sweating over. It only makes sense when applied to something practical (one good reason I hate Leetcode and the like).
๐ฌ Practical Examples
A good application of binary trees is in expression trees and ASTs (Abstract Syntax Trees). Expression trees are simply a way of representing mathematical expressions, whilst ASTs are used by compilers and parsers to represent code. Note that an expression tree/AST doesn't necessarily equate to a binary tree structure. A binary tree is just one of the ways they could be represented.
Starting out with expression trees, let's say we want to represent the following:
2 * ((1 + 7) / 4)
We can start out with a tree of the following structure:
// with the syntax <value|node>
<2|left> <*|root> <((1 + 7) / 4)|right>
The right node can then be broken down further to:
<1 + 7|left> </|root> <4|right>
The left node can also be broken down further..., do you see how this is coming along? By using a divide-and-conquer strategy, we can turn any complicated expression into a simple representation made up of multiple left-parent-right structures.
For mathematical expression evaluation, expression trees allow for efficient computation of complex formulas. By traversing the tree, we can easily separate and respect the order of operations and handle nested expressions individually. This is very useful in applications like spreadsheet software, where users input complex mathematical expressions that need to be evaluated quickly and accurately.
The same goes for ASTs. A common example of binary trees in AST is a compiler. Using a simple conditional as an example:
if (x > 0) print(x);
An AST can be thought of as a more elegant expression tree, we can represent the above as:
ASTNode* ifNode = new ASTNode("IfStatement");
ASTNode* condition = new ASTNode("BinaryExpression", ">");
condition->left(new ASTNode("Identifier", "x"));
condition->right(new ASTNode("Literal", "0"));
ASTNode* printCall = new ASTNode("CallExpression", "print");
printCall->left(new ASTNode("Identifier", "x"));
ifNode->left(condition);
ifNode->right(printCall);
This is a very simple representation of the code. As I mentioned above, these data structures are just abstract models to fit data. If we wanted to parse a longer statement like:
if (x > 0) {
print(x);
else if (x == 0) {
print(0);
} else {
print(-1);
}
We would find a node having more than two children nodes, in which case, it is still a tree structure, but no longer a binary tree.
We've covered the basics of binary trees, including their structure, implementation in C++, traversal methods, and practical applications in expression evaluation and compiler design. Binary trees are important in programming because they provide an efficient way to organize and search hierarchical data. They form the foundation for more complex tree structures like binary search trees (BSTs) and AVL trees, which are crucial for implementing efficient searching and sorting algorithms. I am still learning about these though and would be documenting my progress along the way.
That's it for this article! DSA is a very large and interesting field, especially when done with a language like C++. This is because C++ is a language that gives you a high degree of control over memory management (which allows for things to go so wrong) and performance optimization. Understanding these data structures allows you to choose the most efficient ways to organize and access data, while knowledge of algorithms in your arsenal helps you implement the most effective solutions to problems.
I am currently having fun playing around with multiple algos and structures and then trying to put them into practice in small projects, if you want to follow along or take a peek at my progress, it is available in this repository with tests too (yes, full test coverage ๐). Speaking of tests, TDD in C++ is very fun, I should write up on that too. Once again, thanks for reading my notes. If you have any corrections or additions for us, feel free to share in the comments. Till next time ๐!