Section 26

Designing Algorithms

At first glance, the algorithms move-until-out and quick-sort have little in common. One processes structures; the other processes lists. One creates a new structure for the generative step; the other splits up a list into three pieces and recurs on two of them. In short, a comparison of the two examples of generative recursion suggests that the design of algorithms is an ad hoc activity and that it is impossible to come up with a general design recipe. A closer look, however, suggests a different picture.

First, even though we speak of algorithms as processes that solve problems, they are still functions that consume and produce data. In other words, we still choose data to represent a problem, and we must definitely understand the nature of our data if we wish to understand the process. Second, we describe the processes in terms of data, for example, ``creating a new structure'' or ``partitioning a list of numbers.'' Third, we always distinguish between input data for which it is trivial to produce a solution and those for which it is not. Fourth, the generation of problems is the key to the design of algorithms. Although the idea of how to generate a new problem might be independent of a data representation, it must certainly be implemented for whatever form of representation we choose for our problem. Finally, once the generated problems have been solved, the solutions must be combined with other values.

Let us examine the six general stages of our structural design recipe in light of our discussion:

Data analysis and design:
The choice of a data representation for a problem often affects our thinking about the process. Sometimes the description of a process dictates a particular choice of representation. On other occasions, it is possible and worthwhile to explore alternatives. In any case, we must analyze and define our data collections.

Contract, purpose, header:
We also need a contract, a definition header, and a purpose statement. Since the generative step has no connection to the structure of the data definition, the purpose statement should not only specify what the function does but should also include a comment that explains in general terms how it works.

Function examples:
In our previous design recipes, the function examples merely specified which output the function should produce for some given input. For algorithms, examples should illustrate how the algorithm proceeds for some given input. This helps us to design, and readers to understand, the algorithm. For functions such as move-until-out the process is trivial and doesn't need more than a few words. For others, including, quick-sort, the process relies on a non-trivial idea for its generative step, and its explanation requires a good example such as the one in figure 67.

Template:
Our discussion suggests a general template for algorithms:

(define (generative-recursive-fun problem)
  (cond
    [(trivially-solvable? problem)
     (determine-solution problem)]
    [else
     (combine-solutions
       ... problem ...
       (generative-recursive-fun (generate-problem-1 problem))
       [curriculum-Z-G-D-5.gif]
       (generative-recursive-fun (generate-problem-n problem)))]))

Definition:
Of course, this template is only a suggestive blueprint, not a definitive shape. Each function in the template is to remind us that we need to think about the following four questions:

  1. What is a trivially solvable problem?

  2. What is a corresponding solution?

  3. How do we generate new problems that are more easily solvable than the original problem? Is there one new problem that we generate or are there several?

  4. Is the solution of the given problem the same as the solution of (one of) the new problems? Or, do we need to combine the solutions to create a solution for the original problem? And, if so, do we need anything from the original problem data?

To define the algorithm, we must express the answers to these four questions in terms of our chosen data representation.

Test:
Once we have a complete function, we must also test it. As before, the goal of testing is to discover bugs and to eliminate them. Remember that testing cannot validate that the function works correctly for all possible inputs. Also remember that it is best to formulate tests as boolean-valued expressions that automatically compare the expected value with the computed value (see section 17.8).

Exercise 26.0.7.   Formulate informal answers to the four key questions for the problem of modeling a ball's movement across a canvas until it is out of bounds.     Solution

Exercise 26.0.8.   Formulate informal answers to the four key questions for the quick-sort problem. How many instances of generate-problem are there?    Solution

26.1  Termination

Unfortunately, the standard recipe is not good enough for the design of algorithms. Up to now, a function has always produced an output for any legitimate input. That is, the evaluation has always stopped. After all, by the nature of our recipe, each natural recursion consumes an immediate piece of the input, not the input itself. Because data is constructed in a hierarchical manner, this means that the input shrinks at every stage. Hence the function sooner or later consumes an atomic piece of data and stops.

With functions based on generative recursion, this is no longer true. The internal recursions don't consume an immediate component of the input but some new piece of data, which is generated from the input. As exercise 25.1.1 shows, this step may produce the input over and over again and thus prevent the evaluation from ever producing a result. We say that the program LOOPS or is in an INFINITE LOOP.

In addition, even the slightest mistake in translating the process description into a function definition may cause an infinite loop. The problem is most easily understood with an example. Consider the following definition of smaller-items, one of the two ``problem generators'' for quick-sort:

;; smaller-items : (listof number) number  ->  (listof number)
;; to create a list with all those numbers on alon  
;; that are smaller than or equal to threshold
(define (smaller-items alon threshold)
  (cond
    [(empty? alon) empty]
    [else (if (<= (first alon) threshold) 
	      (cons (first alon) (smaller-items (rest alon) threshold))
	      (smaller-items (rest alon) threshold))]))

Instead of < it employs <= to compare numbers. As a result, this function produces (list 5) when applied to (list 5) and 5.

Worse, if the quick-sort function from figure 68 is combined with this new version of smaller-items, it doesn't produce any output for (list 5):

  (quick-sort (list 5))
= (append (quick-sort (smaller-items 5 (list 5)))
          (list 5)
          (quick-sort (larger-items 5 (list 5))))
= (append (quick-sort (list 5))
          (list 5)
          (quick-sort (larger-items 5 (list 5))))

The first recursive use demands that quick-sort solve the problem of sorting (list 5) -- but that is the exact problem that we started with. Since this is a circular evaluation, (quick-sort (list 5)) never produces a result. More generally, there is no guarantee that the size of the input for a recursive call brings us closer to a solution than the original input.

Phase Goal Activity
Examples

to characterize the input-
output relationship
and the computational process via examples

create and show examples of trivially solvable problems [curriculum-Z-G-D-4.gif] create and show examples that require recursive processing [curriculum-Z-G-D-4.gif] illustrate how to work through the examples

Body                 

to define an algorithm

formulate tests for trivially solvable problems [curriculum-Z-G-D-4.gif] formulate answers for the trivial cases [curriculum-Z-G-D-4.gif] determine how to generate new problems from the given problem, possibly using auxiliary functions [curriculum-Z-G-D-4.gif] determine how to combine the solutions of these problems into a solution for the given problem

[curriculum-Z-G-D-5.gif]     [curriculum-Z-G-D-5.gif]     [curriculum-Z-G-D-5.gif]     
Termin.

to argue that the algorithm terminates for all possible inputs

show that the inputs to the recursive applications are smaller than the given input

Figure 69:  Designing algorithms

The lesson from this example is that the design of algorithms requires one more step in our design recipe: a TERMINATION ARGUMENT, which explains why the process produces an output for every input and how the function implements this idea; or a warning, which explains when the process may not terminate. For quick-sort, the argument might look like this:

At each step, quick-sort partitions the list into two sublists using smaller-items and larger-items. Each function produces a list that is smaller than the input (the second argument), even if the threshold (the first argument) is an item on the list. Hence each recursive application of quick-sort consumes a strictly shorter list than the given one. Eventually, quick-sort receives and returns empty.
Without such an argument an algorithm must be considered incomplete.

A good termination argument may on occasion also reveal additional termination cases. For example, (smaller-items N (list N)) and (larger-items N (list N)) always produce empty for any N. Therefore we know that quick-sort's answer for (list N) is (list N).55 To add this knowledge to quick-sort, we simply add a cond-clause:

(define (quick-sort alon)
  (cond
    [(empty? alon) empty]
    [(empty? (rest alon)) alon]
    [else (append 
	    (quick-sort (smaller-items alon (first alon))) 
	    (list (first alon)) 
	    (quick-sort (larger-items alon (first alon))))]))

The condition (empty? (rest alon)) is one way to ask whether alon contains one item.

Figure 69 summarizes the suggestions on the design of algorithms. The dots indicate that the design of an algorithm requires a new step: the termination argument. Read the table in conjunction with those of the preceding chapters.

Exercise 26.1.1.   Define the function tabulate-div, which accepts a number n and tabulates the list of all of its divisors, starting with 1 and ending in n. A number d is a divisior of a number n if the remainder of dividing n by d is 0, that is, (= (remainder n d) 0) is true. The smallest divisior of any number is 1; the largest one is the number itself.    Solution

Exercise 26.1.2.   Develop the function merge-sort, which sorts a list of numbers in ascending order, using the following two auxiliary functions:

  1. The first one, make-singles, constructs a list of one-item lists from the given list of numbers. For example,

    (equal? (make-singles (list 2 5 9 3))
            (list (list 2) (list 5) (list 9) (list 3)))
    

  2. The second one, merge-all-neighbors, merges pairs of neighboring lists. More specifically, it consumes a list of lists (of numbers) and merges neighbors. For example,

    (equal? (merge-all-neighbors (list (list 2) (list 5) (list 9) (list 3)))
            (list (list 2 5) (list 3 9)))
    
    (equal? (merge-all-neighbors (list (list 2 5) (list 3 9)))
            (list (list 2 3 5 9)))
    

    In general, this function yields a list that is approximately half as long as the input. Why is the output not always half as long as the input?

Make sure to develop the functions independently.

The function merge-sort first uses make-singles to create a list of single lists; then it relies on merge-all-neighbors to shorten the list of lists until it contains a single list. The latter is the result.    Solution

26.2  Structural versus Generative Recursion

The template for algorithms is so general that it even covers functions based on structural recursion. Consider the version with one termination clause and one generation step:

(define (generative-recursive-fun problem)
  (cond
    [(trivially-solvable? problem)
     (determine-solution problem)]
    [else
      (combine-solutions
	problem
	(generative-recursive-fun (generate-problem problem)))]))

If we replace trivially-solvable? with empty? and generate-problem with rest, the outline is a template for a list-processing function:

(define (generative-recursive-fun problem)
  (cond
    [(empty? problem) (determine-solution problem)]
    [else
      (combine-solutions
	problem
	(generative-recursive-fun (rest problem)))]))

Exercise 26.2.1.   Define determine-solution and combine-solutions so that the function generative-recursive-fun computes the length of its input.    Solution

This discussion raises the question of whether there is a difference between between functions based on structural recursion and those based on generative recursion. The answer is ``it depends.'' Of course, we could say that all functions using structural recursion are just special cases of generative recursion. This ``everything is equal'' attitude, however, is of no help if we wish to understand the process of designing functions. It confuses two classes of functions that are designed with different approaches and that have different consequences. One relies on a systematic data analysis and not much more; the other requires a deep, often mathematical, insight into the problem-solving process itself. One leads programmers to naturally terminating functions; the other requires a termination argument.

A simple inspection of a function's definition quickly shows whether a function uses structural or generative recursion. All self-applications of a structurally recursive function always receive an immediate component of the current input for further processing. For example, for a constructed list, the immediate components are the first item and the rest of the list. Hence, if a function consumes a plain list and its recursive use does not consume the rest of the list, its definition is not structural but generative. Or, put positively, properly recursive algorithms consume newly generated input, which may or may not contain components of the input. In any case, the new piece of data represents a different problem than the given one, but still a problem of the same general class of problems.

26.3  Making Choices

A user cannot distinguish sort and quick-sort. Both consume a list of numbers; both produce a list that consists of the same numbers arranged in ascending order. To an observer, the functions are completely equivalent.56 This raises the question of which of the two a programmer should provide. More generally, if we can develop a function using structural recursion and an equivalent one using generative recursion, what should we do?

To understand this choice better, let's discuss another classical example of generative recursion from mathematics: the problem of finding the greatest common divisor of two positive natural numbers.57 All such numbers have at least one divisor in common: 1. On occasion, this is also the only common divisor. For example, 2 and 3 have only 1 as common divisor because 2 and 3, respectively, are the only other divisors. Then again, 6 and 25 are both numbers with several divisors:

  1. 6 is evenly divisible by 1, 2, 3, and 6;

  2. 25 is evenly divisible by 1, 5, and 25.

Still, the greatest common divisior of 25 and 6 is 1. In contrast, 18 and 24 have many common divisors:

  1. 18 is evenly divisible by 1, 2, 3, 6, 9, and 18;

  2. 24 is evenly divisible by 1, 2, 3, 4, 6, 8, 12, and 24.

The greatest common divisor is 6.

Following the design recipe, we start with a contract, a purpose statement, and a header:

;; gcd : N[>= 1] N[>= 1]  ->  N
;; to find the greatest common divisior of n and m
(define (gcd n m) 
  ...)

The contract specifies the precise inputs: natural numbers that are greater or equal to 1 (not 0).

Now we need to make a decision whether we want to pursue a design based on structural or on generative recursion. Since the answer is by no means obvious, we develop both. For the structural version, we must consider which input the function should process: n, m, or both. A moment's consideration suggests that what we really need is a function that starts with the smaller of the two and outputs the first number smaller or equal to this input that evenly divides both n and m.

;; gcd-structural : N[>= 1] N[>= 1]  ->  N
;; to find the greatest common divisior of n and m
;; structural recursion using data definition of N[>= 1] 
(define (gcd-structural n m)
  (local ((define (first-divisior-<= i)
	    (cond
	      [(= i 1) 1]
	      [else (cond
		      [(and (= (remainder n i) 0) 
			    (= (remainder m i) 0))
		       i]
		      [else (first-divisior-<= (- i 1))])])))
    (first-divisior-<= (min m n))))

Figure 70:  Finding the greatest common divisor via structural recursion

We use local to define an appropriate auxiliary function: see figure 70. The conditions ``evenly divisible'' have been encoded as (= (remainder n i) 0) and (= (remainder m i) 0). The two ensure that i divides n and m without a remainder. Testing gcd-structural with the examples shows that it finds the expected answers.

Although the design of gcd-structural is rather straightforward, it is also naive. It simply tests for every number whether it divides both n and m evenly and returns the first such number. For small natural numbers, this process works just fine. Consider the following example, however:

(gcd-structural 101135853 45014640)

The result is 177 and to get there gcd-structural had to compare 101135676, that is, 101135853 - 177, numbers. This is a large effort and even reasonably fast computers spend several minutes on this task.

Exercise 26.3.1.   Enter the definition of gcd-structural into the Definitions window and evaluate (time (gcd-structural 101135853 45014640)) in the Interactions window.

Hint: After testing gcd-structural conduct the performance tests in the Full Scheme language (without debugging), which evaluates expressions faster than the lower language levels but with less protection. Add (require-library "core.ss") to the top of the Definitions window. Have some reading handy!    Solution

Since mathematicians recognized the inefficiency of the ``structural algorithm'' a long time ago, they studied the problem of finding divisiors in more depth. The essential insight is that for two natural numbers larger and smaller, their greatest common divisor is equal to the greatest common divisior of smaller and the remainder of larger divided into smaller. Here is how we can put this insight into equational form:

  (gcd larger smaller) 
= (gcd smaller (remainder larger smaller))

Since (remainder larger smaller) is smaller than both larger and smaller, the right-hand side use of gcd consumes smaller first.

Here is how this insight applies to our small example:

  1. The given numbers are 18 and 24.

  2. According to the mathematicians' insight, they have the same greatest common divisor as 18 and 6.

  3. And these two have the same greatest common divisor as 6 and 0.

And here we seem stuck because 0 is nothing expected. But, 0 can be evenly divided by every number, so we have found our answer: 6.

Working through the example not only explains the idea but also suggests how to discover the case with a trivial solution. When the smaller of the two numbers is 0, the result is the larger number. Putting everything together, we get the following definition:

;; gcd-generative : N[>= 1] N[>=1]  ->  N
;; to find the greatest common divisior of n and m
;; generative recursion: (gcd n m) = (gcd n (remainder m n)) if (<= m n)
(define (gcd-generative n m)
  (local ((define (clever-gcd larger smaller)
	    (cond
	      [(= smaller 0) larger]
	      [else (clever-gcd smaller (remainder larger smaller))])))
    (clever-gcd (max m n) (min m n))))

The local definition introduces the workhorse of the function: clever-gcd, a function based on generative recursion. Its first line discovers the trivially solvable case by comparing smaller to 0 and produces the matching solution. The generative step uses smaller as the new first argument and (remainder larger smaller) as the new second argument to clever-gcd, exploiting the above equation.

If we now use gcd-generative with our complex example from above:

(gcd-generative 101135853 45014640)

we see that the response is nearly instantaneous. A hand-evaluation shows that clever-gcd recurs only nine times before it produces the solution: 177. In short, generative recursion has helped find us a much faster solution to our problem.

Exercise 26.3.2.   Formulate informal answers to the four key questions for gcd-generative.    Solution

Exercise 26.3.3.   Define gcd-generative and evaluate

(time (gcd-generative 101135853 45014640))

in the Interactions window.

Evaluate (clever-gcd 101135853 45014640) by hand. Show only those lines that introduce a new recursive call to clever-gcd.    Solution

Exercise 26.3.4.   Formulate a termination argument for gcd-generative.    Solution

Considering the above example, it is tempting to develop functions using generative recursion. After all, they produce answers faster! This judgment is too rash for three reasons. First, even a well-designed algorithm isn't always faster than an equivalent structurally recursive function. For example, quick-sort wins only for large lists; for small ones, the standard sort function is faster. Worse, a badly designed algorithm can wreak havoc on the performance of a program. Second, it is typically easier to design a function using the recipe for structural recursion. Conversely, designing an algorithm requires an idea of how to generate new, smaller problems, a step that often requires deep mathematical insight. Finally, people who read functions can easily understand structurally recursive functions, even without much documentation. To understand an algorithm, the generative step must be well explained, and even with a good explanation, it may still be difficult to grasp the idea.

Experience shows that most functions in a program employ structural recursion; only a few exploit generative recursion. When we encounter a situation where a design could use either the recipe for structural or generative recursion, the best approach is often to start with a structural version. If it turns out to be too slow, the alternative design using generative recursion should be explored. If it is chosen, it is important to document the problem generation with good examples and to give a good termination argument.

Exercise 26.3.5.   Evaluate

(quick-sort (list 10 6 8 9 14 12 3 11 14 16 2))

by hand. Show only those lines that introduce a new recursive call to quick-sort. How many recursive applications of quick-sort are required? How many recursive applications of append? Suggest a general rule for a list of length N.

Evaluate

(quick-sort (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14))

by hand. How many recursive applications of quick-sort are required? How many recursive applications of append? Does this contradict the first part of the exercise?    Solution

Exercise 26.3.6.   Add sort and quick-sort to the Definitionswindow. Test the functions and then explore how fast each works on various lists. The experiment should confirm the claim that the plain sort function wins (in many comparisons) over quick-sort for short lists and vice versa. Determine the cross-over point. Then build a sort-quick-sort function that behaves like quick-sort for large lists and switches over to the plain sort function for lists below the cross-over point.

Hints: (1) Use the ideas of exercise 26.3.5 to create test cases. (2) Develop create-tests, a function that creates large test cases randomly. Then evaluate

(define test-case (create-tests 10000))
(collect-garbage)
(time (sort test-case))
(collect-garbage)
(time (quick-sort test-case))

The uses of collect-garbage helps DrScheme deal with large lists.    Solution


55 Of course, we could have just argued that the sorted version of a one-item list is the list, which is the basis of exercise 25.2.2.

56 The concept of observably equivalent functions and expressions plays a central role in the study of programming languages and their meaning.

57 The material on the greatest common divisor was suggested by John Stone.