[previous] [up] [next]     [index]
Next: Backtracking with State Up: Changing StructuresVectors, and Previous: More Practice with Vectors

Collections of Structures with Cycles

Many objects in our world are related to each other in a circular manner. We have parents; our parents have children. A computer may connect to another computer, which in turn may connect to the first. And we have seen data definitions that refer to each other.

Since data represents information about real-world objects, we will encounter situations that call for the design of a class of structures with a circular relationship. In the past, we have skirted the issue, or we used a trick to represent such collections. The trick is to use an indirection. For example, in section [cross-reference], we associated each structure with a symbol, kept a table of symbols and structures around, and placed symbols into structures. Then, when we needed to find out whether some structure refers to another, we extracted the relevant symbol and looked in the table to find the structure for the symbol. While this use of indirection allows us to represent structures with mutual references or structures in a cyclic relationship, it also leads to awkward data representations and programs. This section demonstrates that we can simplify the representation of collections with structure mutation.

To make this idea concrete, we discuss two examples: family trees and simple graphs. Consider the case of family trees. Thus far, we have used two kinds of family trees to record family relationships. The first is the ancestor tree; it relates people to their parents, grandparents, and so on. The second is the descendant tree; it relates people to their children, grandchildren, and so on. In other words, we have avoided the step of combining the two family trees into one, the way it is done in the real world. The reason for skirting the joint representation is also clear. Translated into our data language, a joint tree requires that a structure for a father should contain the structures for his children, and each of the child structures should contain the father structure. In the past, we couldn't create such collections of structures. With structure mutations, we can now create them.

Here is structure definition that makes this discussion concrete:

(define-struct person (name social father mother children))
The goal is to create family trees that consist of person structures. A person structure has five fields. The content of each is specified by the following data definition:

An family-tree-node (short: ftn) is either
  1. false or
  2. a person.
A person is a structure:

(make-person n s f m c) where n is a symbol, s is number, f and m are ftns, and c is a (listof person).

As usual, the false in the definition of family tree nodes represents missing information about a portion of the family tree.

Using make-person alone, we cannot establish the mutual reference between a family tree node for a father and his child. Suppose we follow an ancestral tree strategy, that is, we create the structure for the father first. Then we can't add any child to the children field, because, by assumption, the corresponding structure doesn't exist yet. Conversely, if we follow a descendant tree strategy, we first create a structure for all of a father's children, but those structures can't contain any information about the father yet.


bidirectional family trees

Figure: Adding a child


What this suggests is that a simple constructor for this kind of data isn't really enough. Instead, we should define a GENERALIZED CONSTRUCTOR that not only creates a person structure but also initializes it properly when possible. To develop this function, it is best to follow the real world, where upon the birth of a child, we create a new entry in the family tree, record the child's parents, and record in the existing parents' entries that they have a newborn. Here is the specification for just such a function:

;; add-child! : symbol number person person -> person
;; to construct a person structure for a newborn  
;; effect: to add the new structure to the children of father and mother 
(define (add-child! name soc-sec father mother) ...) 
Its task is to create a new structure for a newborn child and to add the structure to an existing family tree. The function consumes the child's name, social security number, and the structures representing the father and the mother.

The first step of the design of add-child! is to create the new structure for the child:

(define (add-child! name soc-sec father mother)
  (local ((define the-child 
            (make-person name soc-sec father mother empty))) 
    ...)) 
This covers the first part of the contract. By naming the structure in a local-expression we can mutate it in the body of the expression.

The second step of the design of add-child! is to add a body to the local-expression that performs the desired effects:

(define (add-child! name soc-sec father mother)
  (local ((define the-child 
            (make-person name soc-sec father mother empty))) 
    (begin 
      (set-person-children! father 
                            (cons the-child (person-children father))) 
      (set-person-children! mother 
                            (cons the-child (person-children mother))) 
      the-child))) 
Since there are two specified effects and since the purpose statement also specifies a result, the body of the local-expression is a begin-expression with three subexpressions. The first mutates father, adding the-child to the list of children. The second mutates mother in an analogous manner. The last one produces the desired result.

Figure [cross-reference] illustrates the evaluation of an application of add-child!:

(add-child! 'Ludwig 3
            (make-person 'Adam ... ... ...) 
            (make-person 'Eve ... ... ...)) 
The top-half shows the new structure for 'Ludwig and how it refers to the father and mother structures. Just as in section [cross-reference], the picture uses arrows to relate the nodes of a family tree. But now this choice isn't just a convenience, it is dictated by necessity. As the bottom half of the figure shows, the structure mutation of add-child! modify the children fields of the father and mother structure. They add an additional item to the list in this field, and this new item is the structure for 'Ludwig. Without arrows, we wouldn't be able to draw this constellation of structures because it is impossible to draw the two structures as nested in each other.

With add-child! we can create family trees, one child at a time. What we need to learn is how to design functions that process this new class of family trees. In this case, we can almost always pick one of the two views that we used before: the ancestor family tree or the descendant family tree. Either view just ignores certain fields in the structures. Once we have chosen a view, we design the desired functions following the known recipes. Even if we decide to use the bi-directional relations in the new family tree representation, designing a function is usually simply a matter of formulating those auxiliary functions that correspond to the real-world family relationships and to compose them properly. The following few exercises demonstrate these principles.


Exercises

Exercise 43.2.1

Modify add-child! so that it has the following contract:

;; add-child! : symbol number ftn ftn -> person
The function otherwise behaves just like the original version.

Once we have the modified function, there is no need for make-person any more. We can create all forms of person structures with add-child! directly.

Transliterate the family tree in figure [cross-reference] into the new representation; use the new modified add-child! function exclusively. Solution

Exercise 43.2.2

Develop the function how-many-ancestors, which consumes a family tree node and determines how many ancestors there are. The node itself counts as an ancestor. Solution

Exercise 43.2.3

Develop how-many-descendants, which consumes a family tree node and determines how many descendants there are. The node itself counts as a descendant. Solution

Exercise 43.2.4

Develop names-of-cousins. The function consumes a person and produces the names of the cousins.

Hints: (1) Don't forget to use Scheme's built-in functions for processing lists. (2) Use a sufficiently large portion of your own family tree to test the functions. (3) For the testing step, compare the names of the results of the auxiliary functions with the expected results. Because the structures are mutually referential, it is difficult to compare them automatically. Alternatively, use eq?, Scheme's intensional equality predicate, to compare two structures. Why does this work? Solution


In sections [cross-reference] and [cross-reference], we encountered the problem of representing and traversing graphs. Recall that a graph is a collection of nodes and connections between nodes. The graph traversal problem is to determine whether there is a route from a node labeled orig to one called dest. In a simple graph, each node has exactly one one-way connection to another node.

Originally, we represented a graph as a list of named nodes. If one node was connected to another, the corresponding structure for the first node contained the name of the second node, not the node itself. Exercise [cross-reference] introduced a vector-based representation. Still, all of our representations used the indirection trick, so that if we wanted to move from one node to another, we first had to look up the connection in a table.

Using structure mutation, we can eliminate this indirection and create structures for nodes that contain each other, even if the graph contains a cycle. To understand how this works in a concrete manner, let's discuss how to model simple graphs such as those in figure [cross-reference] and how to design programs that find routes through such graphs. First, we need a structure definition for nodes:

(define-struct node (name to))
The name field records the name of the node, and the to field specifies to which other node it is connected. Second, we need a data definition:

A simple-graph-node (node) is a structure:

(make-node n t) where n is a symbol and t is a node.

The data definition is unusual in that it is self-referential, but it doesn't consist of several clauses. This immediately raises the question of how we can construct a node that complies with this definition. Clearly, applying make-node doesn't work; instead, we need to define a generalized constructor that immediately sets the to field of a node.

The generalized constructor consumes the atomic data for a node structure and constructs a legal node structure from there:

;; create-node : symbol -> node
;; to create a simple legal graph node with a-name in the name field 
(define (create-node a-name) 
  (local ((define the-node (make-node a-name false))) ...)) 
The natural candidate to place into the to field is the node itself. In other words, the generalized constructor creates a node that contains itself:
;; create-node : symbol -> node
;; to create a simple graph node that contains a-name and itself 
(define (create-node a-name) 
  (local ((define the-node (make-node a-name false))) 
    (begin 
      (set-node-to! the-node the-node) 
      the-node))) 
The generalized constructor makes the node using the ordinary constructor, initializing the name field properly and putting false into the to field. Although the latter is an improper action according to our data definition, it is acceptable because it is immediately corrected in the local-expression's body. Hence an application of create-node produces a node as promised.

With create-node we can create the nodes in a graph, but we can't establish the connections between them. To connect two nodes, we must modify the to field of one of the structures so that it contains the other. While this suggestion is generally on target, it raises the problem of how to identify the nodes. The family tree example suggests one solution, namely, to introduce one variable definition per node. Another comes from our orginal work with graphs, where we represented graphs as lists of symbolic pairs of connections or lists of nodes or vectors of nodes. Here we pursue the second option:

A simple-graph is a (listof node).

Assuming we have a list of all nodes, say the-graph, and a function for looking up the node with a given name, say lookup-node, we can create a connection from one node to the other with a structure mutation:

(set-node-to! (lookup-node from-name the-graph)
              (lookup-node to-name the-graph)) 
We can make connecting two nodes more convenient than that with an auxiliary function:
;; connect-nodes : symbol symbol graph -> void
;; effect: to mutate the to field in the structure with 
;; from-name in the name field so that it contains 
;; the structure with to-name in the name field 
(define (connect-nodes from-name to-name a-graph) 
  (set-node-to! (lookup-node from-name a-graph) 
                (lookup-node to-name a-graph))) 
Defining lookup-node is an exercise in structural function design, though it is best done using Scheme's assf function, which abstracts this situation.


;; create-node : symbol -> node
;; to create a simple graph node that contains itself 
(define (create-node name) 
  (local ((define the-node (make-node name false))) 
    (begin 
      (set-node-to! the-node the-node) 
      the-node)))

;; connect-nodes : symbol symbol graph -> void ;; effect: to mutate the to field in the structure named ;; from-name so that it contains the structure named to-name (define (connect-nodes from-name to-name a-graph) (set-node-to! (lookup-node from-name a-graph) (lookup-node to-name a-graph)))

;; lookup-node : symbol graph -> node or false ;; to lookup up the node named x in a-graph (define (lookup-node x a-graph) ...)

;; the-graph : graph ;; the list of all available nodes (define the-graph (list (create-node 'A) (create-node 'B) (create-node 'C) (create-node 'D) (create-node 'E) (create-node 'F)))

;; setting up the graph: (begin (connect-nodes 'A 'B the-graph) (connect-nodes 'B 'C the-graph) (connect-nodes 'C 'E the-graph) (connect-nodes 'D 'E the-graph) (connect-nodes 'E 'B the-graph))

Figure: Creating a simple graph via mutation


Now we can transliterate simple graphs into a Scheme representation. Suppose we start with the graph in figure [cross-reference], which is reproduced here in a tabular format:

tabular58452

The first step is to create a list of all the nodes and to name it. The second step is to establish the connections according to this table. Figure [cross-reference] shows the corresponding Scheme expressions. They are straight transliterations of the columns in the tabular representation of the graph. There is no need to reconnect the 'F node because it is already connected to itself.


Exercises

Exercise 43.2.5

Draw a picture of (create-node 'A) using the boxes-in-boxes approach from part [cross-reference] and the boxes-and-arrow approach from part [cross-reference]Solution

Exercise 43.2.6

Transliterate the given simple graph without creating a list of all nodes. Solution

Exercise 43.2.7

Develop the function symbolic-graph-to-structures. It consumes a list of pairs and creates a graph.

Example:

(define the-graph 
  (symbolic-graph-to-structures '((A B) (B C) (C E) (D E) (E B) (F F)))) 
Evaluating this definition is equivalent to evaluating the definitions in figure [cross-reference]Solution

Once we have a method for representing simple graphs, we can turn our attention to the problem of finding a route from one node in the graph to another. Recall the original specification from section [cross-reference]:

;; route-exists? : node node simple-graph -> boolean
;; to determine whether there is a route from orig to dest in sg 
(define (route-exists? orig dest sg) ...) 
Of course, we must reinterpret the names for our data classes in the new context, but otherwise the specification is perfectly fine.

The development of the original function demonstrated two new ideas. First, the function uses generative recursion. Once it is known that orig and dest are distinct nodes, the search resumes from the node to which orig is connected. Second, the function requires an accumulator to remember which nodes have been visited. Without the accumulator, the function may revisit the same node over and over again.

So, let's start from the template for generative recursion:

(define (route-exists? orig dest sg)
  (cond 
    [(eq-node? orig dest) true] 
    [else 
     (route-exists? ...  tex2html_wrap_inline73547  ... dest sg)])) 

The function eq-node? determines whether the two nodes are the same; this may just use eq?, Scheme's intentional equality predicate, or it may compare the names of the nodes, assuming they are unique. If the nodes are the same, a route exists. If not, we can generate a new, potentially useful problem by moving to the node to which orig is connected. In the graph representation of section [cross-reference], this requires looking in sg. In our new graph representation, the connection is a part of the node representation. Hence we can use node-to instead of looking in sg:

(define (route-exists? orig dest sg)
  (cond 
    [(eq-node? orig dest) true] 
    [else (route-exists? (node-to orig) dest sg)])) 

The function definition shows that, so far, sg is useless. Because a node in the new graph representation contains its neighbors, and the neighbor contains its neighbor, and so on, there is no need to use the table.

The termination argument for this function fails, just as for the original one in section [cross-reference]. To see why our new function may fail to terminate, take a look at its definition. It doesn't contain false, and the function cannot possibly produce false--even though we know that our sample graph, for example, doesn't contain a path from 'F to 'A or anywhere else. If we inspect what happens with

(route-exists? (lookup-node the-graph 'F) 
               (lookup-node the-graph 'A)) 

we see that route-exists? repeatedly visits the node 'F. In short, it forgets what it has processed so far.

We know that equipping route-exists? with an accumulator overcomes this lack of knowledge, but that requires another table lookup. We can do better than that with a structure mutation that records a visit by the route-exists? function. To do that, the node structures need an addtional field; we call it visited:

(define-struct node (name visited to))

Initially the field contains false. As route-exists? visits a node, it puts true into the field:

(define (route-exists? orig dest sg)
  (cond 
    [(eq-node? orig dest) true] 
    [(node-visited orig) false] 
    [else 
      (begin 
        (set-node-visited! orig true) 
        (route-exists? (node-to orig) dest sg))])) 
To exploit this new knowledge, the function checks the new structure field as one of the new termination conditions. If orig has been visited before, there is no route because the function has discovered a cycle in the graph.

The second structure mutation of this example illustrates two ideas. First, structure mutation can replace a table-based accumulator. In general, though, it is best to study a table-based version and to add structure mutations based on a solid understanding of the accumulated knowledge. Second, structure mutations can play a role in termination tests for generative recursion. After all, state change is motivated by the desire to remember things across function applications, and termination tests must discover whether things have changed. While the combination is rare, it is useful, and it appears time and again in the study of algorithms.


Exercises

Exercise 43.2.8

The function route-exists? assumes that the visited fields of all the nodes are initially false. A single use of the function, however, sets some of the fields in a graph to true. This implies that the function cannot be used twice in a row.

Develop a revised version of route-exists?, with the same specification, that sets all visited fields to false before it searches for a route between the given nodes.

Determine the abstract running time of the new function, assuming the graph has N nodes. Solution

Exercise 43.2.9

Develop the function reachable. It consumes a node in a simple graph. Its effect is to place true into the visited fields of all those nodes that are reachable from the given node and to ensure that the visited fields of all other nodes are falseSolution

Exercise 43.2.10

Develop make-simple-graph, a function that manages the state of a locally defined graph. The function accepts a simple graph in the form of lists of pairs of symbols: (listof (list symbol symbol)). It supports four services:

  1. adding nodes that are connected to already existing nodes (by name);
  2. changing the connection of a node (by name);
  3. determining whether a route between two nodes exists;
  4. and removing nodes that are not reachable from some given node.

Hint: Instead of using a list, the manager should use a node sequence, which is analogous to the hand structure from section [cross-reference]. A node sequence relies on the following structure:

(define-struct sequence (node next))
A sequence is similar to a list, but it supports structure mutations. Solution

The discussion of this section confirms the usefulness of the design recipes, even for collections of structures that refer to each other. The most important lesson is that such situations call for a generalized constructor, a function that creates a structure and immediately establishes the necessary connections. Generalized constructors correspond to the initializers of section [cross-reference]; we have also seen the idea in section [cross-reference] where we created a hand from a single card. In some cases, such as the one for simple graphs, we may also want to introduce auxiliary functions for mutating the structures a second time. Once we have those functions, we can use the standard recipes, including those for introducing additional structure fields.


[previous] [up] [next]     [index]
Next: Backtracking with State Up: Changing StructuresVectors, and Previous: More Practice with Vectors

PLT