[previous] [up] [next]     [index]
Next: Newton's Method Up: Variations on a Theme Previous: From Files to Lines

Binary Search

Applied mathematicians model the real-world with non-linear equations and then try to solve them. Here is a simplistic example:

Given a perfect cube that encloses 27m tex2html_wrap_inline72762 . What area do its six walls cover?
We know from geometry that if the length of a cube's side is x, the enclosed space is tex2html_wrap_inline72766 . Hence we need to know the possible values of x such that

displaymath72770

Once we have solved the equation, the covered area is tex2html_wrap_inline72772 .

In general, we are given a function f from numbers to numbers, and want to know some number r such that

displaymath72778

The value r is called the root of f. In our above example, tex2html_wrap_inline72782 , and the value r is the length of the side of the cube.[footnote]

For the past few centuries, mathematicians have developed many methods for finding the root of different types of functions. In this section, we study a solution that is based on the Mean Value Theorem, an early result of mathematical analysis. The resulting algorithm is a primary example of generative recursion based on a deep mathematical theorem. It has been adapted to other uses and has become known as the binary search algorithm in computer science.


displaymath72760

Figure: A numeric function f with root in interval [a,b] (stage 1)


The Mean Value Theorem says that a continuous function f has a root in an interval [a,b] if the signs of f(a) and f(b) differ. By continuous we mean a function that doesn't ``jump,'' that doesn't have gaps, and that always continues in a ``smooth'' fashion. The theorem is best illustrated with the graph of a function. The function f in figure [cross-reference] is below the x axis at a and above the x-axis at b. It is a continuous function, which we can tell from the uninterrupted, smooth line. And indeed, the function intersects the x axis somewhere between a and b.

Now take a look at the midpoint between a and b:

displaymath72822

It partitions the interval [a,b] into two smaller, equally large intervals. We can now compute the value of f at m and see whether it is below or above 0. Here f(m) < 0, so according to the Mean Value Theorem, the root is in the right interval: [m,b]. Our picture confirms this because the root is in the right half of the interval, labeled ``range 2'' in figure [cross-reference].

The abstract description of the Mean Value Theorem and the illustrative example describe a process for finding a root. Specifically, we use the halving step as many times as necessary to determine a tolerably small range in which f must have a root. Let us now translate this description into a Scheme algorithm, which we call find-root.

To begin with, we must agree on the exact task of find-root. It consumes a function, let's call it f, for which we need to find a root. In addition, it must consume the boundaries of the interval in which we expect to find a root. For simplicity, let's say that find-root consumes two numbers: left and right. But these parameters can't be just any two numbers. For our algorithm to work we must assume that

(or (<= (f left) 0 (f right))
    (<= (f right) 0 (f left))) 
holds. This assumption expresses in Scheme the condition of the Mean Value Theorem that the function must have different signs for left and right.

According to the informal process description, the task of find-root is to find an interval that contains a root and that is tolerably small. The size of the given interval is (- right left). For the moment, we assume that the tolerance is defined as a top-level variable TOLERANCE. Given that, find-root can produce one of the two boundaries of the interval because we know what its size is; let's pick the left one.

Here is a translation of our discussion into a contract, a purpose statement, and a header, including the assumption on the parameters:

;; find-root : (number -> number) number number -> number
;; to determine R such that f has a root in [R,(+ R TOLERANCE)] 
;;  
;; ASSUMPTION: (or (<= (f left) 0 (f right)) (<= (f right) 0 (f left))) 
(define (find-root f left right) ...) 
At this stage, we should develop an example of how the function works. We have already seen one; the following exercise develops a second one.


Exercises Exercise 27.3.1

Consider the following function definition:

;; poly : number -> number
(define (poly x) 
 (* (- x 2) (- x 4))) 
It defines a binomial for which we can determine its roots by hand--they are 2 and 4. But it is also a non-trivial input for find-root, so that it makes sense to use it as an example.

Mimic the root-finding process based on the Mean Value Theorem for poly, starting with the interval 3 and 6. Tabulate the information as follows:

tabular34378

Find an interval of size .5 (or less) in which poly contains a root. Solution


Next we turn our attention to the definition of find-root. We start from generative-recursive-fun and ask the four relevant questions:

  1. We need a condition that describes when the problem is solved and a matching answer. This is straightforward. The problem is solved if the distance from left to right is smaller than or equal to TOLERANCE:
    (<= (- right left) TOLERANCE)
    
    The matching result is left.
  2. We must formulate an expression that generates new problems for find-root. According to our informal process description, this step requires determining the midpoint and choosing the next interval. The midpoint is used several times, so we use a local-expression to introduce it:

    (local ((define mid (/ (+ left right) 2)))
      ...) 
    

    Choosing an interval is more complicated than that.

    Consider the Mean Value Theorem again. It says that a given interval is an interesting candidate if the function values at the boundaries have different signs. For the function's purpose statement, we expressed this constraint using

    (or (<= (f left) 0 (f right)) (<= (f right) 0 (f left)))
    
    Accordingly, the interval between left and mid is the next candidate if
    (or (<= (f left) 0 (f mid)) (<= (f mid) 0 (f left)))
    
    And, the interval between mid and right is it, if
    (or (<= (f mid) 0 (f right)) (<= (f right) 0 (f mid)))
    
    In short, the body of the local-expression must be a conditional:

    (local ((define mid (/ (+ left right) 2)))
      (cond 
        [(or (<= (f left) 0 (f mid)) (<= (f mid) 0 (f left))) 
         (find-root left mid)] 
        [(or (<= (f mid) 0 (f right)) (<= (f right) 0 (f mid))) 
         (find-root mid right)])) 
    
    In both clauses, we use find-root to continue the search.
The completed function is displayed in figure [cross-reference]. The following exercises suggest some tests and a termination argument.


;; find-root : (number -> number) number number -> number
;; to determine a number R such that f has a  
;; root between R and (+ R TOLERANCE)  
;;  
;; ASSUMPTION: f is continuous and monotonic 
(define (find-root f left right) 
  (cond 
    [(<= (- right left) TOLERANCE) left] 
    [else 
      (local ((define mid (/ (+ left right) 2))) 
        (cond 
          [(<= (f mid) 0 (f right)) 
           (find-root mid right)] 
          [else 
           (find-root left mid)]))])) 

Figure: The root-finding algorithm find-root



Exercises Exercise 27.3.2

Use poly from [cross-reference] to test find-root. Experiment with different values for TOLERANCE. Use the strategy of section [cross-reference] to formulate the tests as boolean-valued expressions. Solution

Exercise 27.3.3

Suppose the original arguments of find-root describe an interval of size S1. How large is the distance between left and right for the first recursive call to find-root? The second one? And the third? After how many evaluation steps is the distance between left and right smaller than or equal to TOLERANCE? How does the answer to this question show that find-root produces an answer for all inputs that satisfy the assumption? Solution

Exercise 27.3.4

For every midpoint m, except for the last one, the function find-root needs to determine the value of (f m) twice. Validate this claim for one example with a hand-evaluation.

Since the evaluation of (f m) may be time-consuming, programmers often implement a variant of find-root that avoids this recomputation. Modify find-root in figure [cross-reference] so that it does not need to recompute the value of (f mid).

Hint: Define a help function find-root-aux that takes two extra arguments: the values (f left) and (f right)Solution

Exercise 27.3.5

A table is a function that consumes natural numbers between 0 and VL (exclusive) and produces numbers:

;; g : N -> num
;; ASSUMPTION: i is between 0 and VL 
(define (g i) 
  (cond 
    [(= i 0) -10] 
    [(= i 1) ...] 
    ... 
    [(= i (- VL 1)) ...] 
    [else (error 'g ``is defined only between 0 and VL (exclusive)'')])) 
The number VL is called the table's length. The root of a table is the number in the table that is closest to 0. Even if we can't read the definition of a table, we can find its root with a search function.

Develop the function find-root-linear, which consumes a table, the table's length, and finds the root of the table. Use structural induction on natural numbers. This kind of root-finding process is often called a LINEAR SEARCH.

A table t is sorted in ascending order if (t 0) is less then (t 1), (t 1) is less than (t 2), and so on. If a table is monotonic, we can determine the root using binary search. Specifically, we can use binary search to find an interval of size 1 such that either the left or the right boundary is the root's index. Develop find-root-discrete, which consumes a table and its length, and finds the table's root.

Hints: (1) The interval boundary arguments for find-root-discrete must always be natural numbers. Consider how this affects the midpoint computation. (2) Also contemplate how the first hint affects the discovery of trivially solvable problem instances. (3) Does the termination argument from exercise [cross-reference] apply?

If the tabulating function is defined on all natural numbers between 0 and 1024, and if its root is at 0, how many recursive applications are needed with find-root-discrete and find-root-lin to determine a root interval? Solution

Exercise 27.3.6

We mentioned in section [cross-reference] that mathematicians are interested not only about the roots of functions, but also in the area that a function encloses between two points. Mathematically put, we are interested in integrating functions over some interval. Take another look at the graph in figure [cross-reference] on page [cross-reference]. Recall that the area of interest is that enclosed by the bold vertical lines at a and b, the x axis, and the graph of the function.

In section [cross-reference], we learned to approximate the area by computing and adding up the area of rectangles like the two above. Using the divide-and-conquer strategy, we can also design a function that computes the area based on generative recursion. Roughly speaking, we split the interval into two pieces, compute the area of each piece, and add the two areas together.

Step 1: Develop the algorithm integrate-dc, which integrates a function f between the boundaries left and right via the divide-and-conquer strategy employed in find-root. Use rectangle approximations when an interval has become small enough.

Although the area of a rectangle is easy to compute, a rectangle is often a bad approximation of the area under a function graph. A better geometric shape is the trapezoid limited by a, (f a), b, and (f b). Its area is:

displaymath72860

Step 2: Modify integrate-dc so that it uses trapezoids instead of rectangles.

The plain divide-and-conquer approach is wasteful. Consider that a function graph is level in one part and rapidly changes in another. For the level part it is pointless to keep splitting the interval. We could just compute the trapezoid over a and b instead of the two halves.

To discover when f is level, we can change the algorithm as follows. Instead of just testing how large the interval is, the new algorithm computes the area of three trapezoids: the given one, and the two halves. Suppose the difference between the two is less than

displaymath72868

This area represents a small rectangle, of height TOLERANCE, and represents the error margin of our computation. In other words, the algorithm determines whether f changes enough to affect the error margin, and if not, it stops. Otherwise, it continues with the divide-and-conquer approach.

Step 3: Develop integrate-adaptive, which integrates a function f between left and right according to the suggested method. Do not discuss the termination of integrate-adaptive.

Adaptive Integration: The algorithm is called ``adaptive integration'' because it automatically adapts its strategy. For those parts of f that are level, it performs just a few calculations; for the other parts, it inspects very small intervals so that the error margin is also decreased accordingly. Solution



[previous] [up] [next]     [index]
Next: Newton's Method Up: Variations on a Theme Previous: From Files to Lines

PLT