Medical researchers rely on family trees to do research on hereditary diseases. They may, for example, search a family tree for a certain eye color. Computers can help with these tasks, so it is natural to design representations of family trees and functions for processing them.
One way to maintain a family tree of a family is to add a node to the tree every time a child is born. From the node, we can draw connections to the node for the father and the one for the mother, which tells us how the people in the tree are related. For those people in the tree whose parents are unknown, we do not draw any connections. The result is a so-called ancestor family tree because, given any node in the tree, we can find the ancestors of that person if we follow the arrows but not the descendants.
As we record a family tree, we may also want to record certain pieces of information. The birth date, birth weight, the color of the eyes, and the color of the hair are the pieces of information that we care about. Others record different information.
See figure
for a drawing of an ancestor family tree. Adam
is the child of Bettina and Carl; he has yellow eyes and was born in
1950. Similarly, Gustav is the child of Eva and Fred, has brown eyes, and
was born in 1988. To represent a child in a family tree is to combine
several pieces of information: information about the father, the mother,
the name, the birth date, and eye color. This suggests that we define a new
structure:
(define-struct child (father mother name date eyes))The five fields of child structures record the required information, which suggests the following data definition:
While this data definition is simple, it is unfortunately also useless. The definition refers to itself but, because it doesn't have any clauses, there is no way to create a child structure. If we tried to create a child structure, we would have to write
(make-child
(make-child
(make-child
(make-child
...
)))
... ... ... ...)
without end. It is for this reason that we demand that all self-referential
data definitions consist of several clauses (for now) and that at least one
of them does not refer to the data definition.
Let's postpone the data definition for a moment and study instead how we can use child structures to represent family trees. Suppose we are about to add a child to an existing family tree, and furthermore suppose that we already have representations for the parents. Then we can just construct a new child structure. For example, for Adam we could create the following child structure:
(make-child Carl Bettina 'Adam 1950 'yellow)assuming Carl and Bettina stand for representations of Adam's parents.
The problem is that we don't always know a person's parents. In the family
depicted in figure
, we don't know Bettina's parents.
Yet, even if we don't know a person's father or mother, we must still use
some Scheme value for the two fields in a child structure. We
could use all kinds of values to signal a lack of information (5,
false, or 'none); here, we use empty. For
example, to construct a child structure for Bettina, we do the
following:
(make-child empty empty 'Bettina 1926 'green)Of course, if only one of the two parents is missing, we fill just that field with empty.
Our analysis suggests that a child node has the following data definition:
A child node is (make-child f m na da ec) whereThis definition is special in two regards. First, it is a self-referential data definition involving structures. Second, the data definition mentions two alternatives for the first and second component. This violates our conventions concerning the shape of data definitions.
- f and m are either
- empty or
- child nodes;
- na and ec are symbols;
- da is a number.
We can avoid this problem by defining the collection of nodes in a family tree instead:
A family-tree-node (short: ftn) is either
- empty; or
- (make-child f m na da ec)
where f and m are ftns, na
and ec are symbols, and da is a number.
This new definition satisfies our conventions. It consists of two clauses. One of the clauses is self-referential, the other is not.
In contrast to previous data definitions involving structures, the
definition of ftn is not a plain explanation of what kind of data
can show up in which field. Instead, it is multi-clausal and
self-referential. Considering that this is the first such data definition,
let us carefully translate the example from figure
and
thus reassure ourselves that the new class of data can represent the
information of interest.
The information for Carl is easy to translate into a ftn:
(make-child empty empty 'Carl 1926 'green)Bettina and Fred are represented with similar nodes. Accordingly, the node for Adam is created with
(make-child (make-child empty empty 'Carl 1926 'green)
(make-child empty empty 'Bettina 1926 'green)
'Adam
1950
'yellow)
As the examples show, a simple-minded, node-by-node transliteration of
figure
requires numerous repetitions of data. For
example, if we constructed the child structure for Dave like the
one for Adam, we would get
(make-child (make-child empty empty 'Carl 1926 'green)
(make-child empty empty 'Bettina 1926 'green)
'Dave
1955
'black)
Hence it is a good idea to introduce a variable definition per node and to
use the variable thereafter. To make things easy, we use Carl to
stand for the child structure that describes Carl, and so on.
The complete transliteration of the family tree into Scheme can be found in
figure
;; Oldest Generation: (define Carl (make-child empty empty 'Carl 1926 'green)) (define Bettina (make-child empty empty 'Bettina 1926 'green));; Middle Generation: (define Adam (make-child Carl Bettina 'Adam 1950 'yellow)) (define Dave (make-child Carl Bettina 'Dave 1955 'black)) (define Eva (make-child Carl Bettina 'Eva 1965 'blue)) (define Fred (make-child empty empty 'Fred 1966 'pink))
;; Youngest Generation: (define Gustav (make-child Fred Eva 'Gustav 1988 'brown))
The structure definitions in figure
naturally correspond
to an image of deeply nested boxes. Each box has five compartments. The
first two contain boxes again, which in turn contain boxes in their first
two compartments, and so on. Thus, if we were to draw the structure
definitions for the family tree using nested boxes, we would quickly be
overwhelmed by the details of the picture. Furthermore, the picture would
copy certain portions of the tree just like our attempt to use
make-child without variable definitions. For these reasons, it is
better to imagine the structures as boxes and arrows, as originally drawn
in figure
. In general, a programmer must flexibly switch
back and forth between both of these graphical illustrations. For
extracting values from structures, the boxes-in-boxes image works best; for
finding our way around large collections of interconnected structures, the
boxes-and-arrows image works better.
Equipped with a firm understanding of the family tree representation, we can turn to the design of functions that consume family trees. Let us first look at a generic function of this kind:
;; fun-for-ftn : ftn -> ??? (define (fun-for-ftn a-ftree) ...)After all, we should be able to construct the template without considering the purpose of a function.
Since the data definition for ftns contains two clauses, the template must consist of a cond-expression with two clauses. The first deals with empty, the second with child structures:
;; fun-for-ftn : ftn -> ???
(define (fun-for-ftn a-ftree)
(cond
[(empty? a-ftree) ...]
[else ; (child? a-ftree)
... ]))
Furthermore, for the first clause, the input is atomic so there is nothing
further to be done. For the second clause, though, the input contains five
pieces of information: two other family tree nodes, the person's name,
birth date, and eye color:
;; fun-for-ftn : ftn -> ???
(define (fun-for-ftn a-ftree)
(cond
[(empty? a-ftree) ...]
[else
... (fun-for-ftn (child-father a-ftree)) ...
... (fun-for-ftn (child-mother a-ftree)) ...
... (child-name a-ftree) ...
... (child-date a-ftree) ...
... (child-eyes a-ftree) ...]))
We also apply fun-for-ftn to the father and
mother fields because of the self-references in the second clause
of the data definition.
Let us now turn to a concrete example: blue-eyed-ancestor?, the function that determines whether anyone in some given family tree has blue eyes:
;; blue-eyed-ancestor? : ftn -> boolean ;; to determine whether a-ftree contains a child ;; structure with 'blue in the eyes field (define (blue-eyed-ancestor? a-ftree) ...)
Following our recipe, we first develop some examples. Consider the family tree node for Carl. He does not have blue eyes, and because he doesn't have any (known) ancestors in our family tree, the family tree represented by this node does not contain a person with blue eyes. In short,
(blue-eyed-ancestor? Carl)evaluates to false. In contrast, the family tree represented by Gustav contains a node for Eva who does have blue eyes. Hence
(blue-eyed-ancestor? Gustav)evaluates to true.
The function template is like that of fun-for-ftn, except that we use the name blue-eyed-ancestor?. As always, we use the template to guide the function design. First we assume that (empty? a-ftree) holds. In that case, the family tree is empty, and nobody has blue eyes. Hence the answer must be false.
The second clause of the template contains several expressions, which we must interpret:
Our discussion suggests that we formulate a conditional expression and that the first condition is
(symbol=? (child-eyes a-ftree) 'blue)The two recursions are the other two conditions. If either one produces true, the function produces true. The else-clause produces false.
In summary, the answer in the second clause is the expression:
(cond [(symbol=? (child-eyes a-ftree) 'blue) true] [(blue-eyed-ancestor? (child-father a-ftree)) true] [(blue-eyed-ancestor? (child-mother a-ftree)) true] [else false])The first definition in figure
The function blue-eyed-ancestor? is unusual in that it uses the recursions as conditions in a cond-expressions. To understand how this works, let us evaluate an application of blue-eyed-ancestor? to Carl by hand:
(blue-eyed-ancestor? Carl)
= (blue-eyed-ancestor? (make-child empty empty 'Carl 1926 'green))
= (cond
[(empty? (make-child empty empty 'Carl 1926 'green)) false]
[else
(cond
[(symbol=?
(child-eyes (make-child empty empty 'Carl 1926 'green))
'blue)
true]
[(blue-eyed-ancestor?
(child-father (make-child empty empty 'Carl 1926 'green)))
true]
[(blue-eyed-ancestor?
(child-mother (make-child empty empty 'Carl 1926 'green)))
true]
[else false])])
= (cond
[(symbol=? 'green 'blue) true]
[(blue-eyed-ancestor? empty) true]
[(blue-eyed-ancestor? empty) true]
[else false])
= (cond
[false true]
[false true]
[false true]
[else false])
= false
The evaluation confirms that blue-eyed-ancestor? works properly
for Carl, and it also illustrates how the function works.
The second definition of blue-eyed-ancestor? in
figure
uses an or-expression instead of a
nested conditional. Use a hand-evaluation to show that this definition
produces the same output for the inputs empty and
Carl. Solution
Exercise 14.1.2
(blue-eyed-ancestor? empty)evaluates to false with a hand-evaluation.
Evaluate (blue-eyed-ancestor? Gustav) by hand and with DrScheme. For the hand-evaluation, skip those steps in the evaluation that concern extractions, comparisons, and conditions involving empty?. Also reuse established equations where possible, especially the one above. Solution
Exercise 14.1.3
Develop count-persons. The function consumes a family tree node and produces the number of people in the corresponding family tree. Solution
Exercise 14.1.4
Develop the function average-age. It consumes a family tree node and the current year. It produces the average age of all people in the family tree. Solution
Exercise 14.1.5
Develop the function eye-colors, which consumes a family tree node and produces a list of all eye colors in the tree. An eye color may occur more than once in the list.
Hint: Use the Scheme operation append, which consumes two lists and produces the concatenation of the two lists. For example:
(append (list 'a 'b 'c) (list 'd 'e)) = (list 'a 'b 'c 'd 'e)We discuss the development of functions like append in section
Exercise 14.1.6
Suppose we need the function proper-blue-eyed-ancestor?. It is like blue-eyed-ancestor? but responds with true only when some proper ancestor, not the given one, has blue eyes.
The contract for this new function is the same as for the old one:
;; proper-blue-eyed-ancestor? : ftn -> boolean ;; to determine whether a-ftree has a blue-eyed ancestor (define (proper-blue-eyed-ancestor? a-ftree) ...)The results differ slightly.
To appreciate the difference, we need to look at Eva, who is blue-eyed, but does not have a blue-eyed ancestor. Hence
(blue-eyed-ancestor? Eva)is true but
(proper-blue-eyed-ancestor? Eva)is false. After all Eva is not a proper ancestor of herself.
Suppose a friend sees the purpose statement and comes up with this solution:
(define (proper-blue-eyed-ancestor? a-ftree)
(cond
[(empty? a-ftree) false]
[else (or (proper-blue-eyed-ancestor? (child-father a-ftree))
(proper-blue-eyed-ancestor? (child-mother a-ftree)))]))
What would be the result of (proper-blue-eyed-ancestor? A) for any ftn A?
Fix the friend's solution. Solution