

Analogous to organic trees in nature, many phenomena may be represented as trees. In the real world, there are often concepts that exhibit a treelike structure: an organization chart, the contents of a book, the parts list of a machine are but a few examples. Tree structures can serve as natural representations for any form of hierarchical organized systems. Systems which are composed of subsystems, subsubsystems, and so on, such as organizations, buildings, machines, formulas, and biological organisms may all be represented by tree structures. There are several ways to represent tree structures. In Chapter 13 of the "Language Description" we discussed term trees, representing symbolic expressions. In Chapter 1 of "Data Structures" sequences have been examined as special versions of binary term trees. In this chapter we will discuss trees in a more general context. We start with some terminology about abstract trees, then examine ways of representing tree structures and define operations on some of these structures.
Trees are collections of elements organized in hierarchical structures. Like other collections, homogeneous trees have only elements which are all of the same type. Heterogeneous trees have elements which may be of different types. Trees are made of nodes. The initial node of a tree is called the root node. In general, a node consists of a data item, representing an element of the tree, and a number of branches which are pointing to other nodes. Because a branch refers to another node, which may refer to again other nodes, and so on, we may say that a node represents a tree element and a collection of subtrees. The number of subtrees of a given node is called the degree of that node. If the degree of a node is zero, then the node is called a leaf node, or a terminal node. Nodes that resides in the interior of a tree are nonterminal nodes. They have a degree greater than zero. Based on the degrees of the nodes in a tree, the degree of a tree can be defined. The degree of a tree is equal to the maximum of all node degrees of the tree. For example, if the degree of a tree is 3 then the degrees of its nodes may be 0, 1, 2, or 3. If the degree of a tree is 1, the tree collapses to a unidirectional list; we call this a degenerated tree. If the degree of a tree is 2, the tree is called a binary tree. Another variety of trees that appears in practice has a degree that varies from node to node. These trees are called multiway trees, or simply, arbitrary trees. Trees are such semantically rich structures that a large range of additional terms are associated with trees. We will name a few. If a given node references any other nodes, we say that it is the parent of these subordinate nodes; each of these subordinate nodes is a child of the parent. A parent is also called the ancestor of its children and the children are descendants of their parents. Trees are often drawn "upside down" (see Figure 31). However, sometimes a presentation as an organization chart (see Figure 32) or as a diagram from left to right is more convenient (see Figure 33). All three examples are representing the same abstract tree. A /  \ /  \ B C D / \ /  \ / \ /  \ E F G H I Figure 31: Example of a tree structure
  A            B   C   D                    E   F   G   H   I       Figure 32: Example of an organization chart as a tree
A    B      E     F   C   D    G   H  I Figure 33: Example of a indented tree structure
In general, there are two ways to represent trees. One way is to describe trees by means of terms, as has been discussed earlier. Another way, which we will follow here, is to represent trees by means of descriptors. Because arbitrary trees, or multiway trees, are the most general kinds of trees there are, the corresponding treedescriptors should reflect this generality. As we have seen in the preceding description of abstract trees, trees are made of nodes, and a node consists of a data item and a collection of subtrees. And because a node together with its subtrees is also a tree, we can conclude that a treedescriptor has the following format:
where Item is an element of the tree and Subtrees is a list of subtrees. The corresponding constructor operation is: Tree (Element, Subtrees = list (Tree) ) > Tree; Tree (Item, Subtrees) = Tree: [ Item; Subtrees ]; In addition, we also like to have an operation which creates a leaf node: Leaf (Element) > Node; Leaf (anItem) = Tree (anItem, alist (Tree)); The Leaf operation calls the Tree constructor with an empty list of subtrees. We are now able to define a number of interesting operations on trees. These operations include:
As discussed earlier, the elements of a tree may be of any type. They may be descriptor types, collection types, term types, category types, and so on. Because we want to define operations on arbitrary trees, independent of the element types, we need to define a generic component for trees, where the tree type and the element type are generic parameters (see Figure 34). component ArbitraryTrees (Tree, Element); type Tree; type Node = Tree; Tree (Element, Subtrees = list(Tree)) > Tree; Leaf (Element) > Node; Node (Tree) > Node; Item (Node) > Element; ChangeItem (Node, NewItem = Element) > nothing; Subtrees (Node) > list (Tree); ChangeSubtrees (Node, NewSubtrees = list (Tree)) > nothing; Nodes (Tree) > multi (Node); Items (Tree) > multi (Element); Member (Tree, Element) > boolean; Search (Tree, Element) > optional (Node); Copy (Tree) > Tree; begin Tree (Item, Subtrees) = Tree: [ Item; Subtrees ];Leaf (anItem) = Tree (anItem, alist (Tree));Node (aTree) = aTree;Item (aNode) = aNode.Item;ChangeItem (aNode, NewItem) = [ aNode.Item:=NewItem ];Subtrees (aNode) = aNode.Subtrees;ChangeSubtrees(aNode, NewSubtrees) = [aNode.Subtrees:=NewSubtrees];Nodes (aTree) = ( aTree, Nodes (items (aTree.Subtrees)));Items (aTree) = Item (Nodes (aTree));Member(aTree, anItem) = [ if Items(aTree) == anItem then return(true); false];Search (aTree, anItem) = [ [ aNode = Nodes (aTree); if Item (aNode) == anItem then return (aNode) ]; no (Node) ];Copy (aTree) = [ T = copy (aTree); T.Item:=copy (aTree.Item); T.Subtrees:={ Copy (items (aTree.Subtrees)) }; T];end component ArbitraryTrees; Figure 34: A component for arbitrary trees We may now use this component to create trees and to operate on trees. As an example we will create a tree as represented by Figure 31. Let us assume that the node elements are text items. We will build the final tree in a number of intermediate steps: use ArbitraryTrees (Tree, text);TB = Tree ("B", { Leaf ("E"), Leaf ("F")}); TC = Leaf ("C"); TD = Tree ("D", { Leaf ("G"), Leaf ("H"), Leaf ("I")});TA = Tree ("A", { TB, TC, TD }); TA? The result is a tree as represented by: Tree: [ Item = "A"; Subtrees = { Tree: [ Item = "B"; Subtrees = { Tree: [ Item = "E"; Subtrees = { } ], Tree: [ Item = "F"; Subtrees = { }]}], Tree: [ Item = "C"; Subtrees = { }], Tree: [ Item = "D"; Subtrees = { Tree: [ Item = "G"; Subtrees = { }], Tree: [ Item = "H"; Subtrees = { }], Tree: [ Item = "I"; Subtrees = { }]}]}] Mark that the output has the structure of Figure 33. There are many operations that may be performed on a tree. Because trees are collections of elements it is sometimes convenient, just like with other collections such as arrays and lists, to generate a stream of all the elements of a tree. In the preceding component there are two operations producing streams, one of nodes and one of items. For nodes we have the following recursive definition: Nodes (Tree) > multi (Node); Nodes (aTree) = ( aTree, Nodes (items (aTree.Subtrees))); This definition produces a stream of all the nodes of a tree. It begins with the root node and then climbs the tree, starting with the first, most left subtree, and so on. The Items definition produces a stream of items in the same order as Nodes are produced: Items (Tree) > multi (Element); Items (aTree) = Item (Nodes (aTree)); We may now use this definition, for example, to construct a list L of all the elements of the tree TA : L = { Items (TA) }; The resulting elements of L are: { "A", "B", "E", "F", "C", "D", "G", "H", "I"} With this approach it is easy to construct a list from a tree. Another common task performed on a tree is that of executing a given operation on each element of the tree. The operations: Item (Node) > Element; Item (aNode) = aNode.Item; and ChangeItem (Node, NewItem = Element) > nothing; ChangeItem (aNode, NewItem) = [ aNode.Item:=NewItem ]; may be used to get the item of a node and to change the item. Assume, for example, that we want to double the letters in our example tree of Figure 31; so, "A" becomes "AA", "B" becomes "BB", and so on. We may write a definition using these operations in the following way: double (Tree) > nothing; double (aTree) = [ Node = Nodes (aTree); << all nodes of the tree >> Letter = Item (Node); ChangeItem (Node, Letter & Letter) ]; This definition visits all nodes of a tree and changes the corresponding items to double the letters. Visiting all nodes of a tree is also called the traversal of a tree. This operation is an example of operations that change the elements of a tree but does not change the structure of the tree. Because these kinds of operations are very much dependent on the actual element type of a tree, they will usually not be part of a generic tree component but are defined by the client of the component. Another category of operations on trees are those which are changing the structure of a tree. For example, if you want to update an organization chart, you may change the tree structure. The operations: Subtrees (Node) > list (Tree); Subtrees (aNode) = aNode.Subtrees; and ChangeSubtrees(Node, NewSubtrees = list(Tree) ) > nothing; ChangeSubtrees(aNode, NewSubtrees) = [aNode.Subtrees:=NewSubtrees]; may be used to get the subtrees of a node and to change the subtrees. Because the subtrees are represented as lists, the client of the component may update the list by means of list operations such as add, insert, delete. As a result the structure of the tree is changed. As an example, let us assume that we want to create a mirror image of a tree (see Figure 35).
To create such a mirror image we need a recursive definition which mirrors for each node the subtrees. Therefore it has to use the abovementioned operations to get the subtrees and to change the subtrees of a node: Mirror (Tree) > Tree; Mirror (T) = [ChangeSubtrees(T, { Mirror (prioritems (Subtrees (T))) }); T]; Note that this operation mutates the input tree. Another operation on trees is to test if a data item is a member of a tree: Member (Tree, Element) > boolean; Member (aTree, anItem) = [ if Items(aTree) == anItem then return (true); false]; Sometimes, there is also a need to search for an item in the tree and, if found, to return the corresponding node: Search (Tree, Element) > optional (Node); Search (aTree, anItem) = [ [ aNode = Nodes (aTree); if Item (aNode) == anItem then return (aNode)]; no (Node)]; As we have seen, some operations are mutating the input tree. If that is not what is required, then this may be solved by making a copy of the input tree by means of: Copy (Tree) > Tree; Copy (aTree) = [ T = copy (aTree); T.Item:= copy(aTree.Item); T.Subtrees:= { Copy (items (aTree.Subtrees)) }; T];
A binary tree is a tree where every node has at most two branches. This means that in a binary tree all subtrees are also binary trees. Actually, a binary tree is only a special case of an arbitrary tree. However, binary trees are very common in practice, and, furthermore, because of the limitation in the number of subtrees per node a simpler implementation is possible. A binary (sub)tree is either empty or it consists of three things:
We may translate this directly in a corresponding treedescriptor:
The corresponding constructor operation is: Tree (LeftTree = Tree, Element, RightTree = Tree) > Tree; Tree (Lefttree, Item, Righttree) = Tree: [ Lefttree; Item; Righttree ]; In addition, a leaf node is defined by: Leaf (Element) > Node; Leaf (anItem) = Tree (null (Tree), anItem, null (Tree)); The Leaf operation calls the Tree constructor with null values for the subtrees. Let us assume a tree with integer values as in Figure 36.
If we assume that the element type is integer, this tree can be created as follows: T1 = Tree (Leaf (9), 7, Tree (null (Tree), 4, Leaf (6)) ); The result is a binary tree as represented by: Tree: [ Lefttree = Tree: [ Lefttree = [ ]; Item = 9; Righttree = [ ] ]; Item = 7; Righttree = Tree: [ Lefttree = [ ]; Item = 4; Righttree = Tree: [ Lefttree = [ ]; Item = 6; Righttree = [ ] ] ] ] Many operations may be performed on a binary tree; a common one is that of executing a given operation on each element of the tree. To apply operations to each node of a tree, the tree must be traversed in some order. The actions to visit all nodes involve:
There are six different ways in which these three steps can be arranged. If we assume that the left subtree is always traversed before the right subtree, three principal orderings for tree traversal remain. These orderings are:
The corresponding definitions are: Preorder (Tree) > multi (Node); Preorder ( =null(Tree) ) = no (Tree); Preorder (T) = ( T, Preorder (T.Lefttree), Preorder (T.Righttree));Inorder (Tree) > multi (Node); Inorder ( =null(Tree) ) = no(Tree); Inorder (T) = ( Inorder (T.Lefttree), T, Inorder (T.Righttree));Postorder (Tree) > multi (Node); Postorder ( =null(Tree) ) = no(Tree); Postorder (T) = ( Postorder (T.Lefttree), Postorder (T.Righttree), T); These definitions are producing ordered stream of nodes. Exercise:
Item (Preorder (T1))? Item (Inorder (T1))? Item (Postorder (T1))? Another operation frequently used is a search for an element in a tree. We may define such a search operation as follows: Search (Tree, Element) > optional (Node); Search (aTree, anItem) = [ [ aNode = Preorder (aTree); if Item (aNode) == anItem then return (aNode) ]; no (Node) ]; This Search definition uses the preceding Preorder operation to read the nodes of the tree in preorder sequence. The search operation will be terminated as soon as a node with the requested item is found. Other operations defined for arbitrary trees are also applicable to binary trees with the exception of the Subtrees and ChangeSubtrees operations. Those operations are replaced by corresponding definitions for: LeftTree (Node) > Tree;ChangeLeftTree (Node, NewTree = Tree) > nothing;RightTree (Node) > Tree;ChangeRightTree (Node, NewTree = Tree) > nothing; As with arbitrary trees, the elements of a binary tree may also be of any type. So, the corresponding component is again a generic one (see Figure 37). component BinaryTrees (Tree, Element); type Tree; type Node = Tree; Tree (LeftTree = Tree, Element, RightTree = Tree) > Tree; Leaf (Element) > Node; Node(Tree) > Node; Item (Node) > Element; ChangeItem (Node, NewItem = Element) > nothing; LeftTree (Node) > Tree; ChangeLeftTree (Node, NewTree = Tree) > nothing; RightTree (Node) > Tree; ChangeRightTree (Node, NewTree = Tree) > nothing; Preorder (Tree) > multi (Node); Inorder (Tree) > multi (Node); Postorder (Tree) > multi (Node); Search (Tree, Element) > optional (Node); Member (Tree, Element) > boolean; Copy (Tree) > Tree; begin Tree (Lefttree, Item, Righttree) = Tree: [ Lefttree; Item; Righttree ];Leaf (anItem) = Tree (null(Tree), anItem, null(Tree) );Node (aTree) = aTree;Item (aNode) = aNode.Item;ChangeItem (aNode, NewItem) = [aNode.Item:=NewItem];LeftTree (aNode) = aNode.Lefttree;ChangeLeftTree (aNode, NewTree) = [aNode.Lefttree:=NewTree];RightTree (aNode) = aNode.Righttree;ChangeRightTree (aNode, NewTree) = [aNode.Righttree:=NewTree];Preorder (=null(Tree)) = no(Tree); Preorder (T) = ( T, Preorder (T.Lefttree), Preorder (T.Righttree));Inorder (=null(Tree)) = no(Tree); Inorder (T) = ( Inorder (T.Lefttree), T, Inorder (T.Righttree));Postorder (=null(Tree)) = no(Tree); Postorder (T) = ( Postorder (T.Lefttree), Postorder (T.Righttree), T);Search (aTree, anItem) = [ [ aNode = Preorder (aTree); if Item (aNode) == anItem then return (aNode) ]; no (Node)];Member(aTree, anItem) = [ [ aNode = Search (aTree, anItem); return (aNode <> null (Node) ) ]; false];Copy (aTree) = [ T = copy (aTree); T.Item:= copy (aTree.Item); T.Lefttree:= Copy (aTree.Lefttree); T.Righttree:= Copy (aTree.Righttree); T];end component BinaryTrees; Figure 37: A component for binary trees Exercise:
Because binary trees as well as lists are frequently used to represent sets of data, the question arises: Is there any difference in efficiency between searching for elements in a binary tree or in a list? In general, if we are using the abovementioned Search tree definition, this will not be as efficient as searching a linear list because of the recursive tree traversals. However, a major improvement can be achieved if we are imposing an ordering relation between the elements of a tree. In that case the data in the tree can be ordered from left to right according to this relation. Such trees are often called binary search trees or binary dictionaries. A binary search tree is ordered from left to right if for all nonleaf nodes holds:
The advantage of ordering is that, to search for an object in a binary search tree, it is always sufficient to search at most one subtree. For example, searching for the element 15 in the binary search tree of Figure 38 will proceed in the following way:
We start at the root, 17, and compare 15 and 17. Because 15 is less than 17 we discard the right subtree because all the elements of the right subtree are greater then 17, and we proceed with the left subtree where all elements are less than 17. Then we compare 15 with the next node element, 14, and select for the same reasons its right subtree; and finally we select the element 15. In general, the elements of a search tree may be of any type as long as an ordering relationship between the elements is defined. Such an ordering relation may be based on some characteristics of the objects, such as keys which are uniquely defining objects and which have an implicit ordering relationship or any other measure of ordering. The general method for searching in a binary search tree is a recursive procedure as defined by: Search (Tree, Element) > optional (Node); Search (=null(Tree), Key) = no (Node); Search (aNode, Key) = if Item (aNode) == Key then aNode else if Item (aNode) > Key then Search (aNode.Lefttree, Key) else Search (aNode.Righttree, Key); What have we gained in search efficiency if we are comparing a list search with a binary tree search? Let n be the number of elements. If the set of elements is represented by a list then the expected search time will be proportional to its length n. On average, we have to scan the list up to halfway through it. If the set of elements is represented by a binary search tree, the search time will be roughly proportional to the height of the tree. The height of a tree is the length of the longest path between the root and a leaf of the tree. The height depends on the shape of the tree. That means that the shape of the tree influences the search time. For example, the following examples in Figure 39 are all representing the same data in binary search trees of different shapes. A tree is a balanced tree if, for each node in the tree, its two subtrees has approximately equal number of nodes. If a binary search tree with n nodes is nicely balanced then its height is proportional to log n. The difference between n and log n is the improvement of a binary search tree over a list. This holds, unfortunately, only when a tree is approximately balanced. If the tree gets out of balance its performance will degrade. In extreme cases of totally unbalanced trees, a tree is in effect reduced to a unidirectional list. In such a case the tree's height is n, and the tree's performance is equally poor as that of a list. 17 14 11 / \ / \ \ / \ 11 15 14 / \ \ \ 14 22 17 15 / \ / \ \ \ / \ / \ 22 17 11 15 19 37 / \ \ 19 37 22 / \ 19 37
Another problem we have to face is how to insert new items in a binary search tree. A recursive procedure for binary search trees is given by the following definition: Insert (Tree, Element) > Tree; Insert (=null(Tree), Item) = Leaf (Item); Insert (T, Item) = if Item == Item (T) then T else if Item < Item(T) then Tree (Insert (T.Lefttree,Item), Item (T), T.Righttree) else Tree (T.Lefttree, Item (T), Insert (T.Righttree, Item) ); We may use this definition for inserting values in an existing tree or for constructing a new tree. For example, the tree of Figure 38 can be constructed from a list of node values as shown in the following example: NodeList = { 17, 14, 22, 11, 15, 19, 37 }; T1:= [ T:=null (Tree); T:= insert (T, items (NodeList)); T]; Let us now consider the delete operation for binary search trees. It is straightforward to delete a leaf from a tree, but deleting an internal node is more complicated as illustrated in Figure 310. \ \ \ \ 17 delete 17 ? / \ ====> / \ / \ / \ / \ / \ 14 22 14 22 / \ / \ / \ / \ / \ / \ / \ / \ 11 15 19 37 11 15 19 37
An internal node has two subtrees, Lefttree and Righttree. After the internal node has been removed, we have a hole in the tree and the left and right subtrees are no longer connected to the rest of the tree. They cannot both be directly connected to the parent of the internal node, because the parent can accommodate only one of them. If one of the subtrees is empty then the solution is simple: the nonempty subtree is connected to the parent. If they are both nonempty then one possibility is shown in Figure 311. \ \ \ \ ? promote 19 19 / \ =====> / \ / \ / \ / \ / \ 14 22 14 22 / \ / \ / \ \ / \ / \ / \ \ 11 15 19 37 11 15 37
The leftmost node of the right subtree is promoted from the current position upwards to fill the gap. After this transfer, the tree remains ordered. Of course, the same idea works symmetrically with the promotion of the rightmost node of the left subtree. According to these considerations, the operation to delete an element from the binary search tree is performed by the following two definitions: Delete (Tree, Element) > Tree; Delete (=null (Tree), Item) = null (Tree); Delete (T, Item) = if Item (T) == Item then if T.Lefttree == null (Tree) then T.Righttree else if T.Righttree == null (Tree) then T.Lefttree else [ item:=Item; Tree (T.Lefttree, Item, PromoteLeaf (T.Righttree, var (item))) ] else if Item < Item (T) then Tree (Delete (T.Lefttree, Item), Item (T), T.Righttree) else Tree(T.Lefttree, Item (T), Delete (T.Righttree, Item)); Promotion of a leaf is defined as: PromoteLeaf (Tree, var (Element) ) > Tree; PromoteLeaf (=null(Tree), Item) = null (Tree); PromoteLeaf (T, Leaf) = if T.Lefttree == null (Tree) then [ Leaf:=Item (T); T.Righttree] else Tree (PromoteLeaf (T.Lefttree, var (Leaf) ), Item (T), T.Righttree); A generic component for binary search trees is defined in Figure 312. component BinarySearchTrees (Tree, Element); type Tree; type Node = Tree; Node (Tree) > Node; Item (Node) > Element; Nodes (Tree) > multi (Node); Member (Tree, Element) > boolean; Search (Tree, Element) > optional (Node); Insert (Tree, Element) > Tree; Delete (Tree, Element) > Tree; begin Tree (LeftTree = Tree, Element, RightTree = Tree) > Tree; Tree (Lefttree, Item, Righttree) = Tree: [ Lefttree; Item; Righttree ];Leaf (Element) > Node; Leaf (anItem) = Tree (null (Tree), anItem, null (Tree));Node (aTree) = aTree;Item (aNode) = aNode.Item;Nodes (=null (Tree)) = no (Tree); Nodes (aTree) = (aTree, Nodes (aTree.Lefttree), Nodes (aTree.Righttree));Member (aTree, anItem) = [ if Item (Nodes (aTree)) == anItem then return (true); false];Search (=null(Tree), Key) = no (Node); Search (aNode, Key) = if Item (aNode) == Key then aNode else if Item (aNode) > Key then Search (aNode.Lefttree, Key) else Search (aNode.Righttree, Key);Insert (=null(Tree), Item) = Leaf (Item); Insert (T, Item) = if Item == Item (T) then T else if Item < Item (T) then Tree (Insert (T.Lefttree, Item), Item (T), T.Righttree) else Tree (T.Lefttree, Item (T), Insert (T.Righttree, Item));Delete (Tree, Element) > Tree; Delete (=null (Tree), Item) = null (Tree); Delete (T, Item) = if Item (T) == Item then if T.Lefttree == null (Tree) then T.Righttree else if T.Righttree == null (Tree) then T.Lefttree else [ item:=Item; Tree (T.Lefttree, Item, PromoteLeaf (T.Righttree, var (item))) ] else if Item < Item (T) then Tree (Delete (T.Lefttree, Item), Item (T), T.Righttree) else Tree(T.Lefttree, Item (T), Delete (T.Righttree, Item));PromoteLeaf (Tree, var (Element) ) > Tree; PromoteLeaf (=null(Tree), Item) = null (Tree); PromoteLeaf (T, Leaf) = if T.Lefttree == null (Tree) then [ Leaf:=Item (T); T.Righttree] else Tree (PromoteLeaf (T.Lefttree, var (Leaf) ), Item (T), T.Righttree);end component BinarySearchTrees; Figure 312: A component for binary search trees. There are some significant differences with the component for binary trees. The component for binary search trees is completely responsible for the structure of its trees. It distributes the items over the trees according to the criteria stated and it maintains and adapts the structure of the trees accordingly. That means that a client may not interfere with the structure imposed by the component. So, external operations which may change the structure of a tree or which may update its data items are not acceptable for binary search trees. For that reason, the following operations have been excluded from the components interface, either because they are forbidden for binary search trees or they have lost their meaning: Tree (LeftTree = Tree, Element, RightTree = Tree) > Tree;Leaf (Element) > Node;ChangeItem (Node, NewItem = Element) > nothing;LeftTree (Node) > Tree;ChangeLeftTree (Node, NewTree = Tree) > nothing;RightTree (Node) > Tree;ChangeRightTree (Node, NewTree = Tree) > nothing;Preorder (Tree) > multi (Node);Inorder (Tree) > multi (Node);Postorder (Tree) > multi (Node);
Several other variations of trees have been developed, including 23 dictionaries, AVLtrees, Btrees, and so on. Further discussions on the representation and use of trees can be found in Knuth(1975), Wirth(1976), Ado, Hopcroft and Ullman(1974, 1983), Baase(1978), Gonnet(1984), and Bratko (1990).
