Programmers often work with trees, though rarely with family trees. A particularly well-known form of tree is the binary search tree. Many applications employ binary search trees to store and to retrieve information.
To be concrete, we discuss binary trees that manage information about people. In this context, a binary tree is similar to a family tree but instead of child structures it contains nodes:
(define-struct node (ssn name left right))Here we have decided to record the social security number, the name, and two other trees. The latter are like the parent fields of family trees, though the relationship between a node and its left and right trees is not based on family relationships.
The corresponding data definition is just like the one for family trees:
A binary-tree (short: BT) is either
- false; or
- (make-node soc pn lft rgt)
where soc is a number, pn is a symbol, and lft and rgt are BTs.
The choice of false to indicate lack of information is arbitrary. We could have chosen empty again, but false is an equally good and equally frequent choice that we should become familiar with.
Here are two binary trees:
(make-node
15
'd
false
(make-node 24 'i false false))
(make-node 15 'd (make-node 87 'h false false) false)
Figure
shows how we should think about such
trees. The trees are drawn upside down, that is, with the root at the top
and the crown of the tree at the bottom. Each circle corresponds to a node,
labeled with the ssn field of a corresponding node
structure. The trees omit false.
Exercise 14.2.1
Draw the two trees above in the manner of figure
. Then
develop contains-bt. The function consumes a number and a
BT and determines whether the number occurs in the
tree. Solution
Exercise 14.2.2
Develop search-bt. The function consumes a number n and a BT. If the tree contains a node structure whose soc field is n, the function produces the value of the pn field in that node. Otherwise, the function produces false.
Hint: Use contains-bt. Or, use boolean? to find out
whether search-bt was successfully used on a subtree. We will
discuss this second technique, called backtracking, in the intermezzo at
the end of this part. Solution
Tree A: Tree B:
Both trees in figure
are binary trees but they differ in a
significant way. If we read the numbers in the two trees from left to right
we obtain two sequences:
The sequence for tree A is sorted in ascending order, the one for B is not.
A binary tree that has an ordered sequence of information is a BINARY SEARCH TREE. Every binary search tree is a binary tree, but not every binary tree is a binary search tree. We say that the class of binary search trees is a PROPER SUBCLASS of that of binary trees, that is, a class that does not contain all binary trees. More concretely, we formulate a condition--or data invariant--that distinguishes a binary search tree from a binary tree:
The BST Invariant A binary-search-tree (short: BST) is a BT:
- false is always a BST;
- (make-node soc pn lft rgt) is a BST if
- lft and rgt are BSTs,
- all ssn numbers in lft are smaller than soc, and
- all ssn numbers in rgt are larger than soc.
The second and third conditions are different from what we have seen in previous data definitions. They place an additional and unusual burden on the construction BSTs. We must inspect all numbers in these trees and ensure that they are smaller (or larger) than soc.
Exercise 14.2.3
Develop the function inorder. It consumes a binary tree and produces a list of all the ssn numbers in the tree. The list contains the numbers in the left-to-right order we have used above.
Hint: Use the Scheme operation append, which concatenates lists:
(append (list 1 2 3) (list 4) (list 5 6 7)) ;evaluates to (list 1 2 3 4 5 6 7)What does inorder produce for a binary search tree? Solution
Looking for a specific node in a BST takes fewer steps than looking for the same node in a BT. To find out whether a BT contains a node with a specific ssn field, a function may have to look at every node of the tree. In contrast, to inspect a binary search tree requires far fewer inspections than that. Suppose we are given the BST:
(make-node 66 'a L R)If we are looking for 66, we have found it. Now suppose we are looking for 63. Given the above node, we can focus the search on L because all nodes with ssns smaller than 66 are in L. Similarly, if we were to look for 99, we would ignore L and focus on R because all nodes with ssns larger than 66 are in R.
Exercise 14.2.4
Develop search-bst. The function consumes a number n and
a BST. If the tree contains a node structure whose
soc field is n, the function produces the value of the
pn field in that node. Otherwise, the function produces
false. The function organization must exploit the BST
Invariant so that the function performs as few comparisons as
necessary. Compare searching in binary search trees with searching in
sorted lists (exercise
). Solution
Building a binary tree is easy; building a binary search tree is a complicated, error-prone affair. To create a BT we combine two BTs, an ssn number and a name with make-node. The result is, by definition, a BT. To create a BST, this procedure fails because the result would typically not be a BST. For example, if one tree contains 3 and 5, and the other one contains 2 and 6, there is no way to join these two BSTs into a single binary search tree.
We can overcome this problem in (at least) two ways. First, given a list of numbers and symbols, we can determine by hand what the corresponding BST should look like and then use make-node to build it. Second, we can write a function that builds a BST from the list, one node after another.
Exercise 14.2.5
Develop the function create-bst. It consumes a BST B, a number N, and a symbol S. It produces a BST that is just like B and that in place of one false subtree contains the node structure
(make-node N S false false)
Test the function with (create-bst false 66 'a); this should create a single node. Then show that the following holds:
(create-bst (create-bst false 66 'a) 53 'b)
= (make-node 66
'a
(make-node 53 'b false false)
false)
Finally, create tree A from figure Exercise 14.2.6
Develop the function create-bst-from-list. It consumes a list of numbers and names; it produces a BST by repeatedly applying create-bst.
The data definition for a list of numbers and names is as follows:
A list-of-number/name is either
- empty or
- (cons (list ssn nom) lonn)
where ssn is a number, nom a symbol,
and lonn is a list-of-number/name.
Consider the following examples:
|
|
They are equivalent, although the left one is defined with the quote
abbreviation, the right one using list. The left tree in
figure
is the result of using create-bst-from-list
on this list. Solution