Basic Programming: Trees

Ranked #6,603 in Computers & Electronics, #128,476 overall

A Recursive Data Structure

After the array and the linked list are understood, trees make a good next step.

Trees are often the first non-list data structure that a programmer learns. This is partially because they build on the concept of linked boxes that is used in linked lists, making trees easier to learn.

Trees, especially in the form of Binary Search Trees and Balanced Binary Search Trees, are useful for organizing data -- which makes it easier (quicker) to find for specific elements.

Because other data structures, such as heaps, are based on trees, it is important to understand trees before you tackle those data structures.

Besides their usefulness as data structures, trees are a perfect time to learn (and apply) recursion. Recursion is a programming concept that involves nested calls to the same method, instead of a for or while loop. It is used frequently and effectively when dealing with trees.

What Is A Tree?

The General Tree

Trees are made of nodes. Each node holds one piece of data and several links.

On a diagram, the circles are nodes and the lines are links. The value (the data) is written inside the circle representing the node.

There are a number of vocabulary words related to trees -- it helps to remember that many of the words (parent, child, grandparent) are used similarly to the way they would be in a family tree.

Terms Relative to the Current Node

  • Parent
    Every node (except the root) has one parent. The parent of a node is one level closer to the root.
  • Child
    A node may have zero, one, or more children. The maximum number of children allowed depends on the type of tree.
  • Grandparent, Grandchild
    Defined analogous to parent and child.
  • Sibling
    Nodes that Share the same parent
  • Subtree
    A subtree is a tree within a tree; the root of a subtree is a node (not the root) of another tree. A subtree is a portion of that other tree.

Terms Relative to the Tree

  • Root
    This is the beginning of the tree; the only node without a parent. Usually, the root is draw at the top of a tree diagram.
  • Leaf
    A node without any children.
  • Interior Nodes
    Nodes that are neither leaves nor the root. These nodes are not on either edge; they are in between the root and the leaves.
  • Complete Tree
    In a complete tree, all the leaves are on one or two levels. There may only be empty spaces on the lowest level (the one farthest from the root). In the last level, all the leaves must be on the right and all the empty spaces must be on the left. There may be only one (non-leaf) node with less than the maximum number of children.

Terms for Distance

The Measurement of a Tree

The two main ways of measuring distance in a tree are Height and Depth.

Depth: Distance From the Root

To determine the depth of a node, count the number of pointers it takes to travel from the root to the current node; that number is called the depth of the node. The root has a depth of zero. This does not make sense as a measurement for the entire tree; you cannot discuss the "depth of the tree".

Another Term:
A Level consists of all the nodes with a given depth. For example, the children of the root form the first level, Level 1.

Height: Distance From the Bottom/Leaves

All leaves have height one. For all other nodes, height is defined as the maximum of the heights of the their children, plus one. For example, if a node has 2 children, one of height 3 and the other of height 5, then the height of the node is 6.

The height of a tree is the height of it's root -- the greatest height in the tree. (Yes, I know that real, wooden trees have the roots at the bottom. It might help to think of computer science trees as being upside down -- we do draw the root at the top, after all.)

Binary Tree

The Most Common Tree

This is the standard type of tree -- when people talk about trees, this is what they probably mean.

In a binary tree, the nodes can have a maximum of 2 children. This is the rule that defines a binary tree as distinct from the general concept of a tree.

Order of Iteration

There are two general ways of iterating over (visiting all the nodes of) a tree. They are Breath-First and Depth-First.

Depth First

As the name implies, in a depth-first search you first move all the way down to a leaf (the bottom of the tree) then begin to move side ways.

There are three types of depth-first traversals:
  • In Order (Left-Root-Right)
    In an in order traversal, you first return the left child of the current node, then the current node, then the right child. When you return the left child, you are calling the traversal on it -- so the first node that actually gets returned is the far left root of the tree.
  • Pre-Order (Root-Left-Right)
    In a pre-order traversal, the first thing you return is the root. Then you return it's children -- the left, then the right. As in all the traversals, when you return a node, you are returning the result of it's traversal.
  • Post-Order (Left-Right-Root)
    In a post-order traversal, you return both children (left, then right) before the root. So, the first node in the tree to be returned is the far left leaf.


Notice: the left child is always returned prior to the right child. The naming of the traversals refers to when the root is returned.

Breath-First

In a breath-first traversal, you return the nodes by level, going left to right. You begin with the root, then the roots children (left, then right), then the root's grandchildren (from left to right), etc.

This type of traversal is commonly used to print out a representation of the tree.

Binary Search Tree

A Sorted Tree

A BST (Binary Search Tree) is a binary tree, but organized. This means that is has the same structure as a binary tree (2 children be node).

What makes the BST special is it's organization. In a binary search tree, the value of the left child of a node is less than that node. The value of the right child of a node is greater than that node. This makes all the nodes in the right subtree of a node greater than that node (and visa-versa for the left subtree).

Because of this organization, searching for an element is faster. At each node, you know instantly which way to turn, which can halve the number of remaining possibilities at each comparison. It means that you don't have to look at every node in the tree to know whether something is in the tree or not.

Balanced Binary Search Tree

More Complicated, More Efficient

One of the things that programmers often strive for is efficiency. Because a program is written once but used many times, it makes sense to increase complexity for the sake of efficiency, especially with in a data structure than can be implemented, then used by multiple applications.

While a binary search tree simply and effectively organized the data, it would take an irritatingly long time to find some node if, for example all the the node had only one child. If you added the elements of a sorted array, one by one, into a BST, then each node would have a right child, but no left child. It would degenerate into a linked list.

BBSTs are designed to prevent such eventualities by keeping the heights of the various subtrees balanced. Since there are different ways of achieving this goal, there are different types of BBST.

Common types of BBST include AVL trees and Red-Black trees. Splay trees are a type of BBST that is intentionally unbalanced. I'm currently constructing a lens covering only BBSTs, specifically these three types.

Implementation

Array VS. Linked

When we talk about trees, we usually draw them as values (or shapes) connected by lines. This is a good way for us to think about them and it accurately represents a linked implementation of a tree.

In a linked implementation, each node is a separate object with a reference, a link, to it's children and parent. This works fine in an object oriented language.

Another way to implement a tree is to use an array.

In a binary tree implemented using an array, the children of a node a index i are are indexes 2i and 2i + 1.

Related Programming Lenses

Loading

Reader Feedback

Am I Being Clear?

  • jessest May 3, 2012 @ 9:06 pm | delete
    A very nice explanation.
  • ujjaini Mar 7, 2012 @ 1:55 pm | delete
    this is very helpful ! thanks !
  • TerrenceYoung Oct 8, 2010 @ 4:28 pm | delete
    Having a good understanding of the fundamentals of computer science is key to career success ! This article brought me back to my college years! Thanks for the information.
  • badmsm Jun 8, 2009 @ 6:52 pm | delete
    Good coverage of the topic, I ALMOST understood it ;) 5 Stars & a Squid Angel Blessing

by

Astrieanna

I'm a Computer Science major at Johns Hopkins University.
Besides reading scifi and playing with computers, my main hobby is playing Go, a really cool...
more »

Feeling creative? Create a Lens!