Section 17

Processing Two Complex Pieces of Data

On occasion, a function consumes two arguments that belong to classes with non-trivial data definitions. In some cases, one of the arguments should be treated as if it were atomic; a precisely formulated purpose statement typically clarifies this. In other cases, the two arguments must be processed in lockstep. Finally, in a few rare cases, the function must take into account all possible cases and process the arguments accordingly. This section illustrates the three cases with examples and provides an augmented design recipe for the last one. The last section discusses the equality of compound data and its relationship to testing; it is essential for automating test suites for functions.

17.1  Processing Two Lists Simultaneously: Case 1

Consider the following contract, purpose statement, and header:

;; replace-eol-with : list-of-numbers list-of-numbers  ->  list-of-numbers
;; to construct a new list by replacing empty in alon1 with alon2
(define (replace-eol-with alon1 alon2) ...)

The contract says that the function consumes two lists, which we haven't seen in the past. Let's see how the design recipe works in this case.

First, we make up examples. Suppose the first input is empty. Then replace-eol-with should produce the second argument, no matter what it is:

  (replace-eol-with empty L) 
= L

In this equation, L stands for an arbitrary list of numbers. Now suppose the first argument is not empty. Then the purpose statement requires that we replace empty at the end of alon1 with alon2:

(replace-eol-with (cons 1 empty) L) 
;; expected value:
(cons 1 L)

(replace-eol-with (cons 2 (cons 1 empty)) L) 
;; expected value:
(cons 2 (cons 1 L))

(replace-eol-with (cons 2 (cons 11 (cons 1 empty))) L) 
;; expected value: 
(cons 2 (cons 11 (cons 1 L)))

Again, L stands for any list of numbers in these examples.

;; replace-eol-with : list-of-numbers list-of-numbers  ->  list-of-numbers
;; to construct a new list by replacing empty in alon1 with alon2
(define (replace-eol-with alon1 alon2)
  (cond
    ((empty? alon1) alon2)
    (else (cons (first alon1) (replace-eol-with (rest alon1) alon2)))))

Figure 45:  The complete definition of replace-eol-with

The examples suggest that it doesn't matter what the second argument is -- as long as it is a list; otherwise, it doesn't even make sense to replace empty with the second argument. This implies that the template should be that of a list-processing function with respect to the first argument:

(define (replace-eol-with alon1 alon2)
  (cond
    ((empty? alon1) ...)
    (else ... (first alon1) ... (replace-eol-with (rest alon1) alon2) ... )))

The second argument is treated as it were an atomic piece of data.

Let's fill the gaps in the template, following the design recipe and using our examples. If alon1 is empty, replace-eol-with produces alon2 according to our examples. For the second cond-clause, when alon1 is not empty, we must proceed by inspecting the available expressions:

  1. (first alon1) evaluates to the first item on the list, and

  2. (replace-eol-with (rest alon1) alon2) replaces empty in (rest alon1) with alon2.

To gain a better understanding of what this means, consider one of the examples:

(replace-eol-with (cons 2 (cons 11 (cons 1 empty))) L) 
;; expected value: 
(cons 2 (cons 11 (cons 1 L)))

Here (first alon1) is 2, (rest alon1) is (cons 11 (cons 1 empty)), and (replace-eol-with (rest alon1) alon2) is (cons 11 (cons 1 alon2)). We can combine 2 and the latter with cons and can thus obtain the desired result. More generally,

(cons (first alon1) (replace-eol-with (rest alon1) alon2))

is the answer in the second cond-clause. Figure 45 contains the complete definition.

Exercise 17.1.1.   In several exercises, we have used the Scheme operation append, which consumes three lists and juxtaposes their items:

(append (list 'a) (list 'b 'c) (list 'd 'e 'f))
;; expected value: 
(list 'a 'b 'c 'd 'e 'f)

Use replace-eol-with to define our-append, which acts just like Scheme's append.    Solution

Exercise 17.1.2.   Develop cross. The function consumes a list of symbols and a list of numbers and produces all possible pairs of symbols and numbers.

Example:

(cross '(a b c) '(1 2))
;; expected value: 
(list (list 'a 1) (list 'a 2) (list 'b 1) (list 'b 2) (list 'c 1) (list 'c 2))

   Solution

17.2  Processing Two Lists Simultaneously: Case 2

In section 10.1, we developed the function hours->wages for the computation of weekly wages. It consumed a list of numbers -- hours worked per week -- and produced a list of weekly wages. We had based the function on the simplifying assumption that all employees received the same pay rate. Even a small company, however, employs people at different rate levels. Typically, the company's accountant also maintains two collections of information: a permanent one that, among other things, includes an employee's personal pay-rate, and a temporary one that records how much time an employee has worked during the past week.

The revised problem statement means that the function should consume two lists. To simplify the problem, let us assume that the lists are just lists of numbers, pay rates and weekly hours. Then here is the problem statement:

;; hours->wages : list-of-numbers list-of-numbers  ->  list-of-numbers
;; to construct a new list by multiplying the corresponding items on
;; alon1 and alon2
;; ASSUMPTION: the two lists are of equal length 
(define (hours->wages alon1 alon2) ...)

We can think of alon1 as the list of pay-rates and of alon2 as the list of hours worked per week. To get the list of weekly wages, we must multiply the corresponding numbers in the two input lists.

Let's look at some examples:

(hours->wages empty empty)
;; expected value:
empty

(hours->wages (cons 5.65 empty) (cons 40 empty))
;; expected value: 
(cons 226.0 empty)

(hours->wages (cons 5.65 (cons 8.75 empty)) 
              (cons 40.0 (cons 30.0 empty)))
;; expected value: 
(cons 226.0 (cons 262.5 empty))

For all three examples the function is applied to two lists of equal length. As stated in the addendum to the purpose statement, the function assumes this and, indeed, using the function makes no sense if the condition is violated.

The condition on the inputs can also be exploited for the development of the template. Put more concretely, the condition says that (empty? alon1) is true if, and only if, (empty? alon2) is true; and furthermore, (cons? alon1) is true if, and only if, (cons? alon2) is true. In other words, the condition simplifies the design of the template's cond-structure, because it says the template is similar to that of a plain list-processing function:

(define (hours->wages alon1 alon2)
  (cond
    ((empty? alon1) ...)
    (else ... )))

In the first cond-clause, both alon1 and alon2 are empty. Hence no selector expressions are needed. In the second clause, both alon1 and alon2 are constructed lists, which means we need four selector expressions:

(define (hours->wages alon1 alon2)
  (cond
    ((empty? alon1) ...)
    (else
      ... (first alon1) ... (first alon2) ...
      ... (rest alon1) ... (rest alon2) ... )))

Finally, because the last two are lists of equal length, they make up a natural candidate for the natural recursion of hours->wages:

(define (hours->wages alon1 alon2)
  (cond
    ((empty? alon1) ...)
    (else
      ... (first alon1) ... (first alon2) ...
      ... (hours->wages (rest alon1) (rest alon2)) ... )))

The only unusual aspect of this template is that the recursive application consists of two expressions, both selector expressions for the two arguments. But, as we have seen, the idea is easily explicable owing to the assumption that alon1 and alon2 are of equal length.

;; hours->wages : list-of-numbers list-of-numbers  ->  list-of-numbers
;; to construct a new list by multiplying the corresponding items on
;; ASSUMPTION: the two lists are of equal length 
;; alon1 and alon2
(define (hours->wages alon1 alon2)
  (cond
    ((empty? alon1) empty)
    (else (cons (weekly-wage (first alon1) (first alon2))
                (hours->wages (rest alon1) (rest alon2))))))

;; weekly-wage : number number  ->  number
;; to compute the weekly wage from pay-rate and hours-worked
(define (weekly-wage pay-rate hours-worked)
  (* pay-rate hours-worked))

Figure 46:  The complete definition of hours-->wage

To define the function from here, we follow the design recipe. The first example implies that the answer for the first cond-clause is empty. In the second one, we have three values available:

  1. (first alon1) evaluates to the first item on the list of pay-rates;

  2. (first alon2) evaluates to the first item on the list of hours worked; and

  3. (hours->wages (rest alon1) (rest alon2)) computes the list of weekly wages for the remainders of alon1 and alon2.

We merely need to combine these values to get the final answer. More specifically, given the purpose statement, we must compute the weekly wage for the first employee and construct a list from that wage and the rest of the wages. This suggests the following answer for the second cond-clause:

(cons (weekly-wage (first alon1) (first alon2))
      (hours->wages (rest alon1) (rest alon2)))

The auxiliary function weekly-wage consumes the two first items and computes the weekly wage. Figure 46 contains the complete definitions.

Exercise 17.2.1.   In the real world, hours->wages consumes lists of employee structures and lists of work structures. An employee structure contains an employee's name, social security number, and pay rate. A work structure contains an employee's name and the number of hours worked in a week. The result is a list of structures that contain the name of the employee and the weekly wage.

Modify the function in figure 46 so that it works on these classes of data. Provide the necessary structure definitions and data definitions. Use the design recipe to guide the modification process.    Solution

Exercise 17.2.2.   Develop the function zip, which combines a list of names and a list phone numbers into a list of phone records. Assuming the following structure definition:

(define-struct phone-record (name number)) ,

a phone record is constructed with (make-phone-record s n) where s is a symbol and n is a number. Assume the lists are of equal length. Simplify the definition, if possible.    Solution

17.3  Processing Two Lists Simultaneously: Case 3

Here is a third problem statement, given as in the form of a function contract, purpose statement, and header:

;; list-pick : list-of-symbols N[>= 1]  ->  symbol
;; to determine the nth symbol from alos, counting from 1;
;; signals an error if there is no nth item
(define (list-pick alos n) ...)

That is, the problem is to develop a function that consumes a natural number and a list of symbols. Both belong to classes with complex data definitions, though, unlike for the previous two problems, the classes are distinct. Figure 47 recalls the two definitions.

The data definitions:

A natural number [>= 1] (N[>= 1]) is either

  1. 1 or

  2. (add1 n) if n is a N[>= 1].

A list-of-symbols is either

  1. the empty list, empty, or

  2. (cons s lof) where s is a symbol and lof is a list of symbols.

Figure 47:  Data definitions for list-pick

Because the problem is non-standard, we should ensure that our examples cover all important cases. We usually accomplish this goal by picking one item per clause in the definition and choosing elements from basic forms of data on a random basis. In this example, this procedure implies that we pick at least two elements from list-of-symbols and two from N[>= 1]. Let's choose empty and (cons 'a empty) for the former, and 1 and 3 for the latter. But two choices per argument means four examples total; after all, there is no immediately obvious connection between the two arguments and no restriction in the contract:

(list-pick empty 1) 
;; expected behavior: 
(error 'list-pick "...")

(list-pick (cons 'a empty) 1)
;; expected value: 
'a

(list-pick empty 3) 
;; expected behavior: 
(error 'list-pick "...")

(list-pick (cons 'a empty) 3)
;; expected behavior: 
(error 'list-pick "...")

Only one of the four results is a symbol; in the other cases, we see an error, indicating that the list doesn't contain enough items.

The discussion on examples indicates that there are indeed four possible, independent cases that we must consider for the design of the function. We can discover the four cases by arranging the necessary conditions in a table format:

(empty? alos) (cons? alos)
(= n 1)
(> n 1)
The horizontal dimension of the table lists those questions that list-pick must ask about the list argument; the vertical dimension lists the questions about the natural number. Furthermore, the partitioning of the table yields four squares. Each square represents the case when both the condition on the horizontal and the one on the vertical are true. We can express this fact with and-expressions in the squares:

(empty? alos) (cons? alos)
(= n 1)


(and (= n 1)
(empty? alos))


(and (= n 1)
(cons? alos))
(> n 1)


(and (> n 1)
(empty? alos))


(and (> n 1)
(cons? alos))
It is straightforward to check that for any given pair of arguments exactly one of the four composite claims must evaluate to true.

Using our cases analysis, we can now design the first part of the template, the conditional expression:

(define (list-pick alos n)
  (cond
    [(and (= n 1) (empty? alos)) ...]
    [(and (> n 1) (empty? alos)) ...]
    [(and (= n 1) (cons? alos)) ...]
    [(and (> n 1) (cons? alos)) ...]))

The cond-expression asks all four questions, thus distinguishing all possibilities. Next we must add selector expressions to each cond-clause if possible:

(define (list-pick alos n)
  (cond
    [(and (= n 1) (empty? alos))
     ...]
    [(and (> n 1) (empty? alos))
     ... (sub1 n) ...]
    [(and (= n 1) (cons? alos))
     ... (first alos) ... (rest alos)...]
    [(and (> n 1) (cons? alos)) 
     ... (sub1 n) ... (first alos) ... (rest alos) ...]))

For n, a natural number, the template contains at most one selector expression, which determines the predecessor of n. For alos, it might contain two. In those cases where either (= n 1) or (empty? alos) holds, one of the two arguments is atomic and there is no need for a corresponding selector expression.

The final step of the template construction demands that we annotate the template with recursions where the results of selector expressions belong to the same class as the inputs. In the template for list-pick, this makes sense only in the last cond-clause, which contains both a selector expression for N[>= 1] and one for list-of-symbols. All other clauses contain at most one relevant selector expression. It is, however, unclear how to form the natural recursions. If we disregard the purpose of the function, and the template construction step asks us to do just that, there are three possible recursions:

  1. (list-pick (rest alos) (sub1 n))

  2. (list-pick alos (sub1 n))

  3. (list-pick (rest alos) n)

Since we cannot know which one matters or whether all three matter, we move on to the next development stage.

;; list-pick : list-of-symbols N[>= 1]  ->  symbol
;; to determine the nth symbol from alos, counting from 1;
;; signals an error if there is no nth item
(define (list-pick alos n)
  (cond
    [(and (= n 1) (empty? alos)) (error 'list-pick "list too short")]
    [(and (> n 1) (empty? alos)) (error 'list-pick "list too short")]
    [(and (= n 1) (cons? alos)) (first alos)]
    [(and (> n 1) (cons? alos)) (list-pick (rest alos) (sub1 n))]))

Figure 48:  The complete definition of list-pick

Following the design recipe, let us analyze each cond-clause in the template and decide what a proper answer is:

  1. If (and (= n 1) (empty? alos)) holds, list-pick was asked to pick the first item from an empty list, which is impossible. The answer must be an application of error.

  2. If (and (> n 1) (empty? alos)) holds, list-pick was again asked to pick an item from an empty list. The answer is also an error.

  3. If (and (= n 1) (cons? alos)) holds, list-pick is supposed to produce the first item from some list. The selector expression (first alos) reminds us how to get this item. It is the answer.

  4. For the final clause, if (and (> n 1) (cons? alos)) holds, we must analyze what the selector expressions compute:

    1. (first alos) selects the first item from the list of symbols;

    2. (rest alos) is the rest of the list; and

    3. (sub1 n) is one less that the original given list index.

    Let us consider an example to illustrate the meaning of these expressions. Suppose list-pick is applied to (cons 'a (cons 'b empty)) and 2:

       (list-pick (cons 'a (cons 'b empty)) 2)
    

    The answer must be 'b, (first alos) is 'a, and (sub1 n) is 1. Here is what the three natural recursions would compute with these values:

    1. (list-pick (cons 'b empty) 1) produces 'b, the desired answer;

    2. (list-pick (cons 'a (cons 'b empty)) 1) evaluates to 'a, which is a symbol, but the the wrong answer for the original problem; and

    3. (list-pick (cons 'b empty) 2) signals an error because the index is larger than the length of the list.

    This suggests that we use (list-pick (rest alos) (sub1 n)) as the answer in the last cond-clause. But, example-based reasoning is often treacherous, so we should try to understand why the expression works in general.

    Recall that, according to the purpose statement,

    (list-pick (rest alos) (sub1 n))
    

    picks the (n - 1)st item from (rest alos). In other words, for the second application, we have decreased the index by 1, shortened the list by one item, and now look for an item. Clearly, the second application always produces the same answer as the first one, assuming alos and n are ``compound'' values. Hence our choice for the last clause is truly justified.

Exercise 17.3.1.   Develop list-pick0, which picks items from a list like list-pick but starts counting at 0.

Examples:

(symbol=? (list-pick0 (list 'a 'b 'c 'd) 3)
         'd)

(list-pick0 (list 'a 'b 'c 'd) 4)
;; expected behavior:
(error 'list-pick0 "the list is too short")

   Solution

17.4  Function Simplification

The list-pick function in figure 48 is more complicated than necessary. Both the first and the second cond-clause produce the same answer: an error. In other words, if either

(and (= n 1) (empty? alos))

or

(and (> n 1) (empty? alos))

evaluates to true, the answer is an error. We can translate this observation into a simpler cond-expression:

(define (list-pick alos n)
  (cond
    [(or (and (= n 1) (empty? alos)) 
         (and (> n 1) (empty? alos))) (error 'list-pick "list too short")]
    [(and (= n 1) (cons? alos)) (first alos)]
    [(and (> n 1) (cons? alos)) (list-pick (rest alos) (sub1 n))]))

The new expression is a direct transliteration of our English observation.

To simplify this function even more, we need to get acquainted with an algebraic law concerning booleans:

  (or (and condition1 a-condition) 
      (and condition2 a-condition))
= (and (or condition1 condition2)
       a-condition)

The law is called de Morgan's law of distributivity. Applying it to our function yields the following:

(define (list-pick n alos)
  (cond
    [(and (or (= n 1) (> n 1))
          (empty? alos)) (error 'list-pick "list too short")]
    [(and (= n 1) (cons? alos)) (first alos)]
    [(and (> n 1) (cons? alos)) (list-pick (rest alos) (sub1 n))]))

Now consider the first part of the condition: (or (= n 1) (> n 1)). Because n belongs to N[>= 1], the condition is always true. But, if we replace it with true we get

(and true
     (empty? alos))     

which is clearly equivalent to (empty? alos). In other words, the function can be written as

(define (list-pick alos n)
  (cond
    [(empty? alos) (error 'list-pick "list too short")]
    [(and (= n 1) (cons? alos)) (first alos)]
    [(and (> n 1) (cons? alos)) (list-pick (rest alos) (sub1 n))]))

which is already significantly simpler than that in figure 48.

Still, we can do even better than that. The first condition in the latest version of list-pick filters out all those cases when alos is empty. Hence (cons? alos) in the next two clauses is always going to evaluate to true. If we replace the condition with true and simplify the and-expressions, we get the simplest possible version of list-pick, which is displayed in figure 49. While this last function is simpler than the original, it is important to understand that we designed both the original and the simplified version in a systematic manner and that we can therefore trust both. If we try to find the simple versions directly, we sooner or later fail to consider a case and produce flawed functions.

;; list-pick : list-of-symbols N[>= 1]  ->  symbol
;; to determine the nth symbol from alos, counting from 1;
;; signals an error if there is no nth item
(define (list-pick alos n)
  (cond
    [(empty? alos) (error 'list-pick "list too short")]
    [(= n 1) (first alos)]
    [(> n 1) (list-pick (rest alos) (sub1 n))]))

Figure 49:  The simplified definition of list-pick

Exercise 17.4.1.   Develop the function replace-eol-with following the strategy of section 17.2. Then simplify it systematically.    Solution

Exercise 17.4.2.   Simplify the function list-pick0 from exercise 17.3.1 or explain why it can't be simplified.    Solution

17.5  Designing Functions that Consume Two Complex Inputs

On occasion, we will encounter problems that require functions on two complex classes of inputs. The most interesting situation occurs when both inputs are of unknown size. As we have seen in the first three subsections, we may have to deal with such functions in three different ways.

The proper approach to this problem is to follow the general design recipe. In particular, we must conduct a data analysis and we must define the relevant classes of data. Then we can state the contract and the purpose of the function, which, in turn, puts us in a position where we can think ahead. Before we continue from this point, we should decide which one of the following three situations we are facing:

  1. In some cases, one of the parameters plays a dominant role. Conversely, we can think of one of the parameters as an atomic piece of data as far as the function is concerned.

  2. In some other cases, the two parameters are synchronized. They must range over the same class of values, and they must have the same structure. For example, if we are given two lists, they must have the same length. If we are given two Web pages, they must have the same length, and where one of them contains an embedded page, the other one does, too. If we decide that the two parameters have this equal status and must be processed in a synchronized manner, then we can pick one of them and organize the function around it.

  3. Finally, in some rare cases, there may not be any obvious connection between the two parameters. In this case, we must analyze all possible cases before we pick examples and design the template.

For the first two cases, we use an existing design recipe. The last case deserves some special consideration.

After we have decided that a function falls into the third category but before we develop examples and the function template, we develop a two-dimensional table. Here is the table for list-pick again:

alos
(empty? alos) (cons? alos)
n (= n 1)
(> n 1)

Along the horizontal direction we enumerate the conditions that recognize the subclasses for the first parameter, and along the vertical direction we enumerate the conditions for the second parameter.

The table guides the development of both the set of function examples and the function template. As far as the examples are concerned, they must cover all possible cases. That is, there must be at least one example for each cell in the table.

As far as the template is concerned, it must have one cond-clause per cell. Each cond-clause, in turn, must contain all feasible selector expressions for both parameters. If one of the parameters is atomic, there is no need for a selector expression. Finally, instead of a single natural recursion, we might have several. For list-pick, we discovered three cases. In general, all possible combinations of selector expressions are candidates for a natural recursion. Because we can't know which ones are necessary and which ones aren't, we write them all down and pick the proper ones for the actual function definition.

In summary, the design of multi-parameter functions is just a variation on the old design-recipe theme. The key idea is to translate the data definitions into a table that shows all feasible and interesting combinations. The development of function examples and the template exploit the table as much as possible. Filling in the gaps in the template takes practice, just like anything else.

17.6  Exercises on Processing Two Complex Inputs

Exercise 17.6.1.   Develop the function merge. It consumes two lists of numbers, sorted in ascending order. It produces a single sorted list of numbers that contains all the numbers on both inputs lists (and nothing else). A number occurs in the output as many times as it occurs on the two input lists together.

Examples:

(merge (list 1 3 5 7 9) (list 0 2 4 6 8))
;; expected value: 
(list 0 1 2 3 4 5 6 7 8 9)

(merge (list 1 8 8 11 12) (list 2 3 4 8 13 14))
;; expected value: 
(list 1 2 3 4 8 8 8 11 12 13 14) ;     


 Solution

Exercise 17.6.2.   The goal of this exercise is to develop a version of the Hangman game of section 6.7 for words of arbitrary length.

Provide a data definition for representing words of arbitrary length with lists. A letter is represented with the symbols 'a through 'z plus '_.

Develop the function reveal-list. It consumes three arguments:

  1. the chosen word, which is the word that we have to guess;

  2. the status word, which states how much of the word we have guessed so far; and

  3. a letter, which is our current guess.

It produces a new status word, that is, a word that contains ordinary letters and '_. The fields in the new status word are determined by comparing the guess with each pair of letters from the status word and the chosen word:

  1. If the guess is equal to the letter in the chosen word, the guess is the corresponding letter in the new status word.

  2. Otherwise, the new letter is the corresponding letter from the status word.

Test the function with the following examples:

  1. (reveal-list (list 't 'e 'a) (list '_ 'e '_) 'u)

  2. (reveal-list (list 'a 'l 'e) (list 'a '_ '_) 'e)

  3. (reveal-list (list 'a 'l 'l) (list '_ '_ '_) 'l)

First determine what the result should be.

Use the teachpack hangman.ss and the functions draw-next-part (from exercise 6.7.1) and reveal-list to play the Hangman game. Evaluate the following expression:

(hangman-list reveal-list draw-next-part)

The function hangman-list chooses a word randomly and pops up a window with a choice menu for letters. Choose letters and, when ready, click on the Check button to see whether your guess is correct. Enjoy!    Solution

Exercise 17.6.3.   In a factory, employees punch time cards as they arrive in the morning and leave in the evening. In the modern age of electronic punch cards, a punch card contains an employee number and the number of hours worked. Also, employee records always contain the name of the employee, an employee number, and a pay rate.

Develop the function hours->wages2. The function consumes a list of employee records and a list of (electronic) punch cards. It computes the weekly wage for each employee by matching the employee record with a punch card based on employee numbers. If a pair is missing or if a pair's employee numbers are mismatched, the function stops with an appropriate error message. Assume that there is at most one card per employee and employee number.

Hint: An accountant would sort the two lists by employee number first.    Solution

Exercise 17.6.4.   A linear combination is the sum of many linear terms, that is, products of variables and numbers. The latter are called coefficients in this context. Here are some examples:

[curriculum2a-Z-G-1.gif]

In all three examples, the coefficient of x is 5, that of y is 17, and the one for z is 3.

If we are given values for variables, we can determine the value of a polynomial. For example, if x = 10, the value of 5 · x is 50; if x = 10 and y = 1, the value of 5 · x + 17 · y is 67; and if x = 10, y = 1, and z = 2, the value of 5 · x + 17 · y + 3 · z is 73.

In the past, we would have developed functions to compute the values of linear combinations for specific values. An alternative representation is a list of its coefficients. The above combinations would be represented as:

(list 5)
(list 5 17)
(list 5 17 3)

This representation assumes that we always agree on using variables in a fixed order.

Develop the function value. It consumes the representation of a linear combination and a list of numbers. The lists are of equal length. It produces the value of the combination for these values.    Solution

Exercise 17.6.5.   Louise, Jane, Laura, Dana, and Mary are sisters who would like to save money and work spent on Christmas gifts. So they decide to hold a lottery that assigns to each one of them a single gift recipient. Since Jane is a computer programmer, they ask her to write a program that performs the lottery in an impartial manner. Of course, the program must not assign any of the sisters to herself.

Here is the definition of gift-pick. It consumes a list of distinct names (symbols) and randomly picks one of those arrangements of the list that do not agree with the original list at any position:

;; gift-pick: list-of-names  ->  list-of-names
;; to pick a ``random'' non-identity arrangement of names
(define (gift-pick names)
  (random-pick
    (non-same names (arrangements names))))

Recall that arrangements (see exercise 12.4.2) consumes a list of symbols and produces the list of all rearrangements of the items in the list.

Develop the auxiliary functions

  1. random-pick : list-of-list-of-names  ->  list-of-names, which consumes a list of items and randomly picks one of them as the result;

  2. non-same : list-of-names list-of-list-of-names  ->  list-of-list-of-names, which consumes a list of names L and a list of arrangements and produces the list of those that do not agree with L at any position.

    Two permutations agree at some position if we can extract the same name from both lists by applying first and the same number of rest operations to both. For example, (list 'a 'b 'c) and (list 'c 'a 'b) do not agree, but (list 'a 'b 'c) and (list 'c 'b 'a) agree at the second position. We can prove that by applying rest followed by first to both lists.

Follow the appropriate recipe in each case carefully.

Hint: Recall that (random n) picks a random number between 0 and n-1 (compare with exercise 11.3.1).    Solution

Exercise 17.6.6.   Develop the function DNAprefix. The function takes two arguments, both lists of symbols (only 'a, 'c, 'g, and 't occur in DNA, but we can safely ignore this issue here). The first list is called a pattern, the second one a search-string. The function returns true if the pattern is a prefix of the search-string. In all other cases, the function returns false.

Examples:

(DNAprefix (list 'a 't) (list 'a 't 'c)) 

(not (DNAprefix (list 'a 't) (list 'a)))

(DNAprefix (list 'a 't) (list 'a 't)) 

(not (DNAprefix (list 'a 'c 'g 't) (list 'a 'g)))

(not (DNAprefix (list 'a 'a 'c 'c) (list 'a 'c)))

Simplify DNAprefix, if possible.

Modify DNAprefix so that it returns the first item beyond the pattern in the search-string if the pattern is a proper prefix of the search-string. If the lists do not match or if the pattern is no shorter than the search-string, the modified function should still return false. Similarly, if the lists are equally long and match, the result is still true.

Examples:

(symbol=? (DNAprefix (list 'a 't) (list 'a 't 'c)) 
          'c)

(not (DNAprefix (list 'a 't) (list 'a)))

(DNAprefix (list 'a 't) (list 'a 't)) 

Can this variant of DNAprefix be simplified? If so, do it. If not, explain.    Solution

17.7  Extended Exercise: Evaluating Scheme, Part 2

The goal of this section is to extend the evaluator of section 14.4 so that it can cope with function applications and function definitions. In other words, the new evaluator simulates what happens in DrScheme when we enter an expression in the Interactions window after clicking Execute. To make things simple, we assume that all functions in the Definitions window consume one argument.

Exercise 17.7.1.   Extend the data definition of exercise 14.4.1 so that we can represent the application of a user-defined function to an expression such as (f (+ 1 1)) or (* 3 (g 2)). The application should be represented as a structure with two fields. The first field contains the name of the function, the second one the representation of the argument expression.    Solution

A full-fledged evaluator can also deal with function definitions.

Exercise 17.7.2.   Provide a structure definition and a data definition for definitions. Recall that a function definition has three essential attributes:

  1. the function's name,

  2. the parameter name, and

  3. the function's body.

This suggests the introduction of a structure with three fields. The first two contain symbols, the last one a representation of the function's body, which is an expression.

Translate the following definitions into Scheme values:

  1. (define (f x) (+ 3 x))

  2. (define (g x) (* 3 x))

  3. (define (h u) (f (* 2 u)))

  4. (define (i v) (+ (* v v) (* v v)))

  5. (define (k w) (* (h w) (i w)))

Make up more examples and translate them, too.    Solution

Exercise 17.7.3.   Develop evaluate-with-one-def. The function consumes (the representation of) a Scheme expression and (the representation of) a single function definition, P.

The remaining expressions from exercise 14.4.1 are evaluated as before. For (the representation of) a variable, the function signals an error. For an application of the function P, evaluate-with-one-def

  1. evaluates the argument;

  2. substitutes the value of the argument for the function parameter in the function's body; and

  3. evaluates the new expression via recursion. Here is a sketch:42

    (evaluate-with-one-def (subst ... ... ...) 
    		       a-fun-def)
    

For all other function applications, evaluate-with-one-def signals an error.    Solution

Exercise 17.7.4.   Develop the function evaluate-with-defs. The function consumes (the representation of) a Scheme expression and a list of (representations of) function definitions, defs. The function produces the number that DrScheme would produce if we were to evaluate the actual Scheme expression in the Interactions window and if the Definitions window contained the actual definitions.

The remaining expressions from exercise 14.4.1 are evaluated as before. For an application of the function P, evaluate-with-defs

  1. evaluates the argument;

  2. looks up the definition of P in defs;

  3. substitutes the value of the argument for the function parameter in the function's body; and

  4. evaluates the new expression via recursion.

Like DrScheme, evaluate-with-defs signals an error for a function application whose function name is not on the list and for (the representation of) a variable.    Solution

17.8  Equality and Testing

Many of the functions we designed produce lists. When we test these functions, we must compare their results with the predicted value, both of which are lists. Comparing lists by hand is tedious and error-prone. Let's develop a function that consumes two lists of numbers and determines whether they are equal:

;; list=? : list-of-numbers list-of-numbers  ->  boolean
;; to determine whether a-list and another-list 
;; contain the same numbers in the same order
(define (list=? a-list another-list) ...)

The purpose statement refines our general claim and reminds us that, for example, shoppers may consider two lists equal if they contain the same items, regardless of the order, but programmers are more specific and include the order in the comparison. The contract and the purpose statement also show that list=? is a function that processes two complex values, and indeed, it is an interesting case study.

Comparing two lists means looking at each item in both lists. This rules out designing list=? along the lines of replace-eol-with in section 17.1. At first glance, there is also no connection between the two lists, which suggests that we should use the modified design recipe.

Let's start with the table:

(empty? a-list) (cons? a-list)
(empty? another-list)
(cons? another-list)
It has four cells, which implies that we need (at least) four tests and four cond-clauses in the template.

Here are five tests:

(list=? empty empty) 

(not 
  (list=? empty (cons 1 empty)))

(not 
  (list=? (cons 1 empty) empty))

(list=? (cons 1 (cons 2 (cons 3 empty))) 
        (cons 1 (cons 2 (cons 3 empty))))

(not 
  (list=? (cons 1 (cons 2 (cons 3 empty))) 
          (cons 1 (cons 3 empty))))

The second and third show that list=? must deal with its arguments in a symmetric fashion. The last two show how list=? can produce true and false.

Three of the template's four cond-clauses contain selector expressions and one contains natural recursions:

(define (list=? a-list another-list)
  (cond
    [(and (empty? a-list) (empty? another-list)) ...]
    [(and (cons? a-list) (empty? another-list))
     ... (first a-list) ... (rest a-list) ...]
    [(and (empty? a-list) (cons? another-list)) 
    ... (first another-list) ... (rest another-list) ...]
    [(and (cons? a-list) (cons? another-list))
     ... (first a-list) ... (first another-list) ...
     ... (list=? (rest a-list) (rest another-list)) ...
     ... (list=? a-list (rest another-list)) ...
     ... (list=? (rest a-list) another-list) ...]))

There are three natural recursions in the fourth clause because we can pair the two selector expressions and we can pair each parameter with one selector expression.

From the template to the complete definition is only a small step. Two lists can contain the same items only if they are both empty or constructed. This immediately implies true as the answer for the first clause and false for the next two. In the last clause, we have two numbers, the first of both lists, and three natural recursions. We must compare the two numbers. Furthermore, (list=? (rest a-list) (rest another-list)) computes whether the rest of the two lists are equal. The two lists are equal if, and only if, both conditions hold, which means we must combine them with an and:

(define (list=? a-list another-list)
  (cond
    [(and (empty? a-list) (empty? another-list)) true]
    [(and (cons? a-list) (empty? another-list)) false]
    [(and (empty? a-list) (cons? another-list)) false]
    [(and (cons? a-list) (cons? another-list))
     (and (= (first a-list) (first another-list))
          (list=? (rest a-list) (rest another-list)))]))

The other two natural recursions play no role.

Let us now take a second look at the connection between the two parameters. The first development suggests that the second parameter must have the same shape as the first one, if the two lists are to be equal. Put differently, we could develop the function based on the structure of the first parameter and check structure of the other one as needed.

The first parameter is a list of numbers, so we can reuse the template for list-processing functions:

(define (list=? a-list another-list)
  (cond
    [(empty? a-list) ...]
    [(cons? a-list) 
     ... (first a-list) ... (first another-list) ...
     ... (list=? (rest a-list) (rest another-list)) ...]))

The only difference is that the second clause processes the second parameter in the same way as the first one. This mimics the development of hours->wages in section 17.2.

Filling the gaps in this template is more difficult than for the first development of list=?. If a-list is empty, the answer depends on another-list. As the examples show, the answer is true if, and only if, another-list is also empty. Translated into Scheme this means that the answer in the first cond-clause is (empty? another-list).

If a-list is not empty, the template suggests that we compute the answer from

  1. (first a-list), the first number of a-list;

  2. (first another-list), the first number on another-list; and

  3. (list=? (rest a-list) (rest another-list)), which determines whether the rest of the two lists are equal.

Given the purpose of the function and the examples, we now simply compare (first a-list) and (first another-list) and combine the result with the natural recursion in an and-expression:

(and (= (first a-list) (first another-list))
     (list=? (rest a-list) (rest another-list)))

While this step appears to be simple and straightforward, the result is an improper definition. The purpose of spelling out the conditions in a cond-expression is to ensure that all selector expressions are appropriate. Nothing in the specification of list=?, however, suggests that another-list is constructed if a-list is constructed.

We can overcome this problem with an additional condition:

(define (list=? a-list another-list)
  (cond
    [(empty? a-list) (empty? another-list)]
    [(cons? a-list) 
     (and (cons? another-list)
          (and (= (first a-list) (first another-list))
               (list=? (rest a-list) (rest another-list))))]))

The additional condition is (cons? another-list), which means that list=? produces false if (cons? a-list) is true and (cons? another-list) is empty. As the examples show, this is the desired outcome.

In summary, list=? shows that, on occasion, we can use more than one design recipe to develop a function. The outcomes are different, though closely related; indeed, we could prove that the two always produce the same results for the same inputs. Also, the second development benefited from the first one.

Exercise 17.8.1.   Test both versions of list=?.    Solution

Exercise 17.8.2.   Simplify the first version of list=?. That is, merge neighboring cond-clauses with the same result by combining their conditions in an or-expression; switch cond-clauses as needed; and use else in the last clause of the final version.    Solution

Exercise 17.8.3.   Develop sym-list=?. The function determines whether two lists of symbols are equal.    Solution

Exercise 17.8.4.   Develop contains-same-numbers. The function determines whether two lists of numbers contain the same numbers, regardless of the ordering. Thus, for example,

(contains-same-numbers (list 1 2 3) (list 3 2 1))

evaluates to true.    Solution

Exercise 17.8.5.   The class of numbers, symbols, and booleans are sometimes called atoms:43

An atom is either

  1. a number

  2. a boolean

  3. a symbol

Develop the function list-equal?, which consumes two lists of atoms and determines whether they are equal.    Solution

A comparison between the two versions of list=? suggests that the second one is easier to understand than the first. It says that two compound values are equal if the second is made from the same constructor and the components are equal. In general, this idea is a good guide for the development of other equality functions.

Let's look at an equality function for simple Web pages to confirm this conjecture:

;; web=? : web-page web-page  ->  boolean
;; to determine whether a-wp and another-wp have the same tree shape
;; and contain the same symbols in the same order
(define (web=? a-wp another-wp) ...)

Recall the data definition for simple Web pages:

A Web-page (short: WP) is either

  1. empty;

  2. (cons s wp)
    where s is a symbol and wp is a Web page; or

  3. (cons ewp wp)
    where both ewp and wp are Web pages.

The data definition has three clauses, which means that if we were to develop web=? with the modified design recipe, we would need to study nine cases. By using the insight gained from the development of list=? instead, we can start from the plain template for Web sites:

(define (web=? a-wp another-wp)
  (cond
    [(empty? a-wp) ...]
    [(symbol? (first a-wp))
     ... (first a-wp) ... (first another-wp) ...
     ... (web=? (rest a-wp) (rest another-wp)) ...]
    [else 
     ... (web=? (first a-wp) (first another-wp)) ...
     ... (web=? (rest a-wp) (rest another-wp)) ...]))

In the second cond-clause, we follow the example of hours->wages and list=? again. That is, we say that another-wp must have the same shape as a-wp if it is to be equal and process the two pages in an analogous manner. The reasoning for the third clause is similar.

As we refine this template into a full definition now, we must again add conditions on another-wp to ensure that the selector expressions are justified:

(define (web=? a-wp another-wp)
  (cond
    [(empty? a-wp) (empty? another-wp)]
    [(symbol? (first a-wp))
     (and (and (cons? another-wp) (symbol? (first another-wp)))
          (and (symbol=? (first a-wp) (first another-wp))
               (web=? (rest a-wp) (rest another-wp))))]
    [else 
     (and (and (cons? another-wp) (list? (first another-wp)))
          (and (web=? (first a-wp) (first another-wp))
               (web=? (rest a-wp) (rest another-wp))))]))

In particular, we must ensure in the second and third clause that another-wp is a constructed list and that the first item is a symbol or a list, respectively. Otherwise the function is analogous to list=? and works in the same way.

Exercise 17.8.6.   Draw the table based on the data definition for simple Web pages. Develop (at least) one example for each of the nine cases. Test web=? with these examples.    Solution

Exercise 17.8.7.   Develop the function posn=?, which consumes two binary posn structures and determines whether they are equal.    Solution

Exercise 17.8.8.   Develop the function tree=?, which consumes two binary trees and determines whether they are equal.    Solution

Exercise 17.8.9.   Consider the following two, mutually recursive data definitions:

A Slist is either

  1. empty

  2. (cons s sl) where s is a Sexpr and sl is a Slist.

A Sexpr is either

  1. a number

  2. a boolean

  3. a symbol

  4. a Slist

Develop the function Slist=?, which consumes two Slists and determines whether they are equal. Like lists of numbers, two Slists are equal if they contain the same item at analogous positions.    Solution

Now that we have explored the idea of equality of values, we can return to the original motivation of the section: testing functions. Suppose we wish to test hours->wages from section 17.2:

  (hours->wages (cons 5.65 (cons 8.75 empty)) 
                (cons 40 (cons 30 empty)))
= (cons 226.0 (cons 262.5 empty))

If we just type in the application into Interactions window or add it to the bottom of the Definitions window, we must compare the result and the predicted value by inspection. For short lists, like the ones above, this is feasible; for long lists, deep Web pages, or other large compound data, manual inspection is error-prone.

Using equality functions like list=?, we can greatly reduce the need for manual inspection of test results. In our running example, we can add the expression

(list=? 
  (hours->wages (cons 5.65 (cons 8.75 empty)) 
                (cons 40 (cons 30 empty)))
  (cons 226.0 (cons 262.5 empty)))

to the bottom of the Definitions window. When we click the Execute button now, we just need to make sure that all test cases produce true as their results are displayed in the Interactions window.

;; test-hours->wages : list-of-numbers list-of-numbers list-of-numbers  ->  test-result
;; to test hours->wages
(define (test-hours->wages a-list another-list expected-result)
  (cond
    [(list=? (hours->wages a-list another-list) expected-result)
     true]
    [else
     (list "bad test result:" a-list another-list expected-result)]))

Figure 50:  A test function

Indeed, we can go even further. We can write a test function like the one in figure 50. The class of test-results consists of the value true and lists of four items: the string "bad test result:" followed by three lists. Using this new auxiliary function, we can test hours->wages as follows:

(test-hours->wages 
  (cons 5.65 (cons 8.75 empty)) 
  (cons 40 (cons 30 empty))
  (cons 226.0 (cons 262.5 empty)))

If something goes wrong with the test, the four-item list will stand out and specify precisely which test case failed.

Testing with equal?: The designers of Scheme anticipated the need of a general equality procedure and provide:

;; equal? : any-value any-value  ->  boolean
;; to determine whether two values are structurally equivalent 
;; and contain the same atomic values in analogous positions 

When equal? is applied to two lists, it compares them in the same manner as list=?; when it encounters a pair of structures, it compares their corresponding fields, if they are the same kind of structures; and when it consumes a pair of atomic values, it compares them with =, symbol=?, or boolean=?, whatever is appropriate.

Guideline on Testing

Use equal? for testing (when comparisons of values are necessary).

Unordered Lists: On some occasions, we use lists even though the ordering of the items doesn't play a role. For those cases, it is important to have functions such as contains-same-numbers (see exercise 17.8.4) if we wish to determine whether the result of some function application contains the proper items. 

Exercise 17.8.10.   Define a test function for replace-eol-with from section 17.1 using equal? and formulate the examples as test cases using this function.    Solution

Exercise 17.8.11.   Define the function test-list-pick, which manages test cases for the list-pick function from section 17.3. Formulate the examples from the section as test cases using test-list-pick.    Solution

Exercise 17.8.12.   Define test-interpret, which tests interpret-with-defs from exercise 17.7.4, using equal?. Reformulate the test cases using this function.    Solution


42 We discuss this form of recursion in detail in part V.

43 Some people also include empty and keyboard characters (chars).