Suppose we wish to represent the inventory of a toy store that sells such things as dolls, make-up sets, clowns, bows, arrows, and soccer balls. To make an inventory, a store owner would start with an empty sheet of paper and slowly write down the names of the toys on the various shelves.
Representing a list of toys in Scheme is straightforward. We can simply use Scheme's symbols for toys and then construct lists from them. Here are a few short samples:
empty (cons 'ball empty) (cons 'arrow (cons 'ball empty)) (cons 'clown empty) (cons 'bow (cons 'arrow (cons 'ball empty))) (cons 'clown (cons 'bow (cons 'arrow (cons 'ball empty))))For a real store, the list will contain many more items, and the list will grow and shrink over time. In any case, we cannot say in advance how many items these inventory lists will contain. Hence, if we wish to develop a function that consumes such lists, we cannot simply say that the input is a list with either one, two, three, or four items. We must be prepared to think about lists of arbitrary length.
In other words, we need a data definition that precisely describes the class of lists that contain an arbitrary number of symbols. Unfortunately, the data definitions we have seen so far can only describe classes of data where each item is of a fixed size, such as a structure with a specific number of components or a list with a specific number of items. So how can we describe a class of lists of arbitrary size?
Looking back we see that all our examples fall into one of two categories. The store owner starts with an empty list and constructs longer and longer lists. The construction proceeds by constructing together a toy and another list of toys. Here is a data definition that reflects this process:
A list-of-symbols is either
- the empty list, empty, or
- (cons s los) where s is a symbol and los is a list of symbols.
This definition is unlike any of the definitions we have seen so far or that we encounter in high school English or mathematics. Those definitions explain a new idea in terms of old, well-understood concepts. In contrast, this definition refers to itself in the item labeled 2, which implies that it explains what a list of symbols is in terms of lists of symbols. We call such definitions SELF-REFERENTIAL or RECURSIVE.
At first glance, a definition that explains or specifies something in terms
of itself does not seem to make much sense. This first impression, however,
is wrong. A recursive definition, such as the one above, makes sense as
long as we can construct some elements from it; the definition is correct
if we can construct all intended elements.
Let's check whether our specific data definition makes sense and contains all the elements we are interested in. From the first clause we immediately know that empty is a list of symbols. From the second clause we know that we can create larger lists with cons from a symbol and a list of symbols. Thus (cons 'ball empty) is a list of symbols because we just determined that empty is one and we know that 'doll is a symbol. There is nothing special about 'doll. Any other symbol could serve equally well to form a number of one-item lists of symbols:
(cons 'make-up-set empty) (cons 'water-gun empty) ...Once we have lists that contain one symbol, we can use the same method to build lists with two items:
(cons 'Barbie (cons 'robot empty)) (cons 'make-up-set (cons 'water-gun empty)) (cons 'ball (cons 'arrow empty)) ...From here, it is easy to see how we can form lists that contain an arbitrary number of symbols. More important still for our problem, all possible inventories are adequately described by our data definition.
Show that all the inventory lists discussed at the beginning of this section belong to the class list-of-symbols. Solution
Exercise 9.2.2
Do all lists of two symbols also belong to the class list-of-symbols? Provide a concise argument. Solution
Exercise 9.2.3
Provide a data definition for the class of list of booleans. The
class contains all arbitrarily large lists of booleans. Solution