A tree is a non-linear data structure where a node can have zero or more connections. The topmost node in a tree is called root. The linked nodes to the root are called children or descendants.
As you can see in the picture above, this data structure resembles an inverted tree, hence the name. It starts with a root node and branch off with its descendants, and finally leaves.
Implementing a tree is not too hard. It’s similar to a part02-linear-data-structures.asc. The main difference is that instead of having the next
and previous
links, we have an 0 or more number of linked nodes (children/descendants).
link:../../../src/data-structures/trees/tree-node.js[role=include]
Simple! Right? But there are some constraints that you have to keep at all times.
-
Loops: You have to be careful not to make a circular loop. Otherwise, this wouldn’t be a tree anymore but a graph data structure! E.g., Node A has B as a child, then Node B list Node A as its descendant forming a loop. ️
-
Parents: A node with more than two parents. Again, if that happens is no longer a tree but a part03-graph-data-structures.asc.
-
Root: a tree must have only one root. Two non-connected parts are not a tree. part03-graph-data-structures.asc can have non-connected portions and doesn’t have root.
-
The topmost node is called root.
-
A node’s primary linked nodes are called children.
-
A leaf or terminal node is a node without any descendant or children.
-
A node immediate ancestor is called parent. Yep, and like a family tree, a node can have uncles and siblings, and grandparents.
-
Internal nodes are all nodes except for the leaf nodes and the root node.
-
The connection/link between nodes is called edge.
-
The height of a tree is the distance (edge count) from the farthest leaf to the root.
-
The height of a node is obtained by counting the edges between the node and the most distant leaf. For instance, from the image above:
-
Node A has a height of 3.
-
Node G has a height of 1.
-
Node I has a height of 0.
-
-
The depth of a tree is the distance (edge count) from the root to the farthest leaf.
There are different kinds of trees, depending on the restrictions. E.g. The trees with two children or less are called binary tree, while trees with at most three children are Ternary Tree. Since binary trees are the most common, we will cover them here and others in another chapter.
The binary restricts the nodes to have at most two children. Trees can have 0, 1, 2, 7, or more, but not binary trees.
Binary trees are one of the most used kinds of trees, and they are used to build other data structures.
-
Priority Queues
The Binary Search Tree (BST) is a specialization of the binary tree. BST has the same restriction as a binary tree; each node has at most two children. However, there’s another restriction: the values are ordered. It means the left child’s value has to be less or equal than the parent. In turn, the right child’s value has to be bigger than the parent.
BST: left ≤ parent < right
The heap (max-heap) is a binary tree where the parent’s value is higher than both children’s value. Opposed to the BST, the left child doesn’t have to be smaller than the right child.
The (max) heap has the maximum value in the root, while BST doesn’t.
There are two kinds of heaps: min-heap and max-heap. For a max-heap, the root has the highest value. The heap guarantee that as you move away from the root, the values get smaller. The opposite is true for a min-heap. In a min-heap, the lowest value is at the root, and as you go down the lower to the descendants, they will keep increasing values.
Heap is better at finding max or min values in constant time O(1), while a balanced BST is good a finding any element in O(log n). Heaps are often used to implement priority queues, while BST is used when you need every value sorted.