[previous] [up] [next]     [index]
Next: Lists in Lists Up: More Self-referential Data Definitions Previous: Structures in Structures

Extended Exercise: Binary Search Trees

Programmers often work with trees, though rarely with family trees. A particularly well-known form of tree is the binary search tree. Many applications employ binary search trees to store and to retrieve information.

To be concrete, we discuss binary trees that manage information about people. In this context, a binary tree is similar to a family tree but instead of child structures it contains nodes:

(define-struct node (ssn name left right))
Here we have decided to record the social security number, the name, and two other trees. The latter are like the parent fields of family trees, though the relationship between a node and its left and right trees is not based on family relationships.

The corresponding data definition is just like the one for family trees:

A binary-tree (short: BT) is either
  1. false; or
  2. (make-node soc pn lft rgt)
    where soc is a number, pn is a symbol, and lft and rgt are BTs.

The choice of false to indicate lack of information is arbitrary. We could have chosen empty again, but false is an equally good and equally frequent choice that we should become familiar with.

Here are two binary trees:

   (make-node
     15 
     'd 
     false 
     (make-node 24 'i false false)) 
  
(make-node
  15 
  'd 
  (make-node 87 'h false false) 
  false) 

Figure [cross-reference] shows how we should think about such trees. The trees are drawn upside down, that is, with the root at the top and the crown of the tree at the bottom. Each circle corresponds to a node, labeled with the ssn field of a corresponding node structure. The trees omit false.


Exercises

Exercise 14.2.1

Draw the two trees above in the manner of figure [cross-reference]. Then develop contains-bt. The function consumes a number and a BT and determines whether the number occurs in the tree. Solution

Exercise 14.2.2

Develop search-bt. The function consumes a number n and a BT. If the tree contains a node structure whose soc field is n, the function produces the value of the pn field in that node. Otherwise, the function produces false.

Hint: Use contains-bt. Or, use boolean? to find out whether search-bt was successfully used on a subtree. We will discuss this second technique, called backtracking, in the intermezzo at the end of this part. Solution



Tree A: Tree B:
[bst] [bst]
 

Figure: A binary search tree and a binary tree


Both trees in figure [cross-reference] are binary trees but they differ in a significant way. If we read the numbers in the two trees from left to right we obtain two sequences:

displaymath72202

The sequence for tree A is sorted in ascending order, the one for B is not.

A binary tree that has an ordered sequence of information is a BINARY SEARCH TREE. Every binary search tree is a binary tree, but not every binary tree is a binary search tree. We say that the class of binary search trees is a PROPER SUBCLASS of that of binary trees, that is, a class that does not contain all binary trees. More concretely, we formulate a condition--or data invariant--that distinguishes a binary search tree from a binary tree:

The BST Invariant A binary-search-tree (short: BST) is a BT:

  1. false is always a BST;
  2. (make-node soc pn lft rgt) is a BST if
    1. lft and rgt are BSTs,
    2. all ssn numbers in lft are smaller than soc, and
    3. all ssn numbers in rgt are larger than soc.

The second and third conditions are different from what we have seen in previous data definitions. They place an additional and unusual burden on the construction BSTs. We must inspect all numbers in these trees and ensure that they are smaller (or larger) than soc.


Exercises

Exercise 14.2.3

Develop the function inorder. It consumes a binary tree and produces a list of all the ssn numbers in the tree. The list contains the numbers in the left-to-right order we have used above.

Hint: Use the Scheme operation append, which concatenates lists:

(append (list 1 2 3) (list 4) (list 5 6 7))
;evaluates to  
(list 1 2 3 4 5 6 7) 
What does inorder produce for a binary search tree? Solution

Looking for a specific node in a BST takes fewer steps than looking for the same node in a BT. To find out whether a BT contains a node with a specific ssn field, a function may have to look at every node of the tree. In contrast, to inspect a binary search tree requires far fewer inspections than that. Suppose we are given the BST:

(make-node 66 'a L R)
If we are looking for 66, we have found it. Now suppose we are looking for 63. Given the above node, we can focus the search on L because all nodes with ssns smaller than 66 are in L. Similarly, if we were to look for 99, we would ignore L and focus on R because all nodes with ssns larger than 66 are in R.


Exercises

Exercise 14.2.4

Develop search-bst. The function consumes a number n and a BST. If the tree contains a node structure whose soc field is n, the function produces the value of the pn field in that node. Otherwise, the function produces false. The function organization must exploit the BST Invariant so that the function performs as few comparisons as necessary. Compare searching in binary search trees with searching in sorted lists (exercise [cross-reference]). Solution


Building a binary tree is easy; building a binary search tree is a complicated, error-prone affair. To create a BT we combine two BTs, an ssn number and a name with make-node. The result is, by definition, a BT. To create a BST, this procedure fails because the result would typically not be a BST. For example, if one tree contains 3 and 5, and the other one contains 2 and 6, there is no way to join these two BSTs into a single binary search tree.

We can overcome this problem in (at least) two ways. First, given a list of numbers and symbols, we can determine by hand what the corresponding BST should look like and then use make-node to build it. Second, we can write a function that builds a BST from the list, one node after another.


Exercises

Exercise 14.2.5

Develop the function create-bst. It consumes a BST B, a number N, and a symbol S. It produces a BST that is just like B and that in place of one false subtree contains the node structure

(make-node N S false false)

Test the function with (create-bst false 66 'a); this should create a single node. Then show that the following holds:

  (create-bst (create-bst false 66 'a) 53 'b)
= (make-node 66 
             'a 
             (make-node 53 'b false false) 
             false) 
Finally, create tree A from figure [cross-reference] using create-bstSolution

Exercise 14.2.6

Develop the function create-bst-from-list. It consumes a list of numbers and names; it produces a BST by repeatedly applying create-bst.

The data definition for a list of numbers and names is as follows:

A list-of-number/name is either
  1. empty or
  2. (cons (list ssn nom) lonn)
    where ssn is a number, nom a symbol,
    and lonn is a list-of-number/name.

Consider the following examples:

  
(define sample
'((99 o)
(77 l)
(24 i)
(10 h)
(95 g)
(15 d)
(89 c)
(29 b)
(63 a)))
  
(define sample
(list (list 99 'o)
(list 77 'l)
(list 24 'i)
(list 10 'h)
(list 95 'g)
(list 15 'd)
(list 89 'c)
(list 29 'b)
(list 63 'a)))

They are equivalent, although the left one is defined with the quote abbreviation, the right one using list. The left tree in figure [cross-reference] is the result of using create-bst-from-list on this list. Solution



[previous] [up] [next]     [index]
Next: Lists in Lists Up: More Self-referential Data Definitions Previous: Structures in Structures

PLT