Data Structures 1. Multi-value Functions 2. Lists and Streams 3. Descriptors 4. Trees

##### 5 SEARCHING STATE SPACES

State spaces are used for solving problems, such as planning optimal routes for a traveling salesman, determining the shortest distance for a robot arm, or playing a game like chess.

A state space can be represented by a graph whose nodes correspond to problem states. Arcs between nodes are representing legal transitions from one problem state to another. Solving the problem means finding a path from a given initial state to a desired solution state, also called goal state, by applying a sequence of transition rules. Each state in the problem is represented by a data structure which may be generated from a preceding data structure by means of a transition rule. This is different from our examples of graphs in the previous chapter where the data structures for nodes were already existing as part of the graph structure.

Given a state-space formulation of a problem, there are many approaches to finding a solution path. Two basic search strategies are: depth-first search and breadth-first search. A third method, called best-first search, uses heuristic information about a problem to select the most promising nodes of a solution path. In this chapter we will discuss the three search strategies.

5.1 Depth-first search

The depth-first search strategy has a built-in order for selecting problem states. It selects from the possible states in the problem space always a deepest one. Figure 5-1 illustrates a number of possible nodes used by a depth-first search.

```         A
/ | \
/  |  \
B   C   D
/ \     /|\
/   \   / | \
E     F G  H  I```

Figure 5-1: Depth-first search in a state space

The order of the nodes in this state space visited by a depth-first search is: A, B, E, F, C, D, G, H, I.

The framework for solving problems using depth-first search is very simple (see Figure 5-2). The specification says that the depth-first definition has three parameters: an initial or current state, a goal state, and a path representing the current history of states. The depth-first search may return multiple solutions. If no solution has been found no solution path will be returned.

```DepthFirst (State, Goal, Path) -> multi (Solution);
DepthFirst (aGoal, aGoal, aPath) = aGoal & aPath;
DepthFirst (aState, aGoal, aPath) =
[ NewState = NextMove (aState);
DepthFirst (NewState, aGoal, aState & aPath)
when not member (NewState, aPath) ];```

Figure 5-2: Framework for depth-first search

The meaning of the definition rules of DepthFirst are:

• if the current state is the same as the goal state then a solution path has been found: it is the current history path concatenated with the goal which is returned.
• if the current state is not the same as the goal state then a new state must be selected based on the current state by calling the NextMove definition. The NextMove definition belongs to the application domain and may return a sequence of new states. With this new state the depth-first definition will again be called recursively. But before that, it is checked if the new state has already been generated before and is part of Path. If that is the case the new state will be discarded in order to prevent cycling, and a new state will be selected.

The depth-first search makes extensive use of backtracking. In particular, the last definition rule may have three reasons to backtrack:

1) If the number of new states delivered by the NextMove definition has been exhausted, backtracking occurs to the caller of DepthFirst.

2) If a new state is already member of aPath, backtracking occurs to NextMove to compute another state.

3) If the recursive call to DepthFirst delivers no solution, backtracking occurs to NextMove to compute another state.

The depth-first algorithm presented here is independent of the representation of the states in the problem domain; that all is handled by the NextMove instruction. The only other definition which is needed is the member definition, known already from previous chapters on sequences and graphs.

To show how the depth-first strategy works let us consider the problem of Figure 5-3.

```-------------     -------------
|   |   |   |     |   |   |   |
| A | B | C | ==> | C | B | A |
|   |   |   |     |   |   |   |
-------------     ------------- ```

Figure 5-3: A tile rearrangement problem

There are three tiles on a row, called A, B, and C, representing the initial state of the problem. The transition from one state to another state consists of exchanging two tiles in the row. So, A, B, C may become B, A, C, or A, C, B. The problem is to find all the possible solutions starting from the initial state A, B, C and arriving at the goal state C, B, A.

To solve such a state-transition problem the programmer must decide how states are represented. In our example we will use the following term expression to represent a state:

`state (letter, letter, letter) => State;`

where

`type letter = term;`
`A, B, C => letter;`

Now we are able to define the domain specific NextMove operation:

```NextMove (State) -> multi (State);
NextMove ( state (L1, L2, L3) ) = ( state (L2, L1, L3),
state (L3, L2, L1),
state (L1, L3, L2) );```

The NextMove definition has three alternatives for the transition of a given state to a new state, as expressed by the serial expression. With this definition we may now complete the program as given in Figure 5-4.

```type State = term;
type Goal = State;
type Path = term;       << = sequence >>
type Solution = Path;```
```State & Path => Path;   << sequence operation >>
nil = null(Path);```
```DepthFirst (State, Goal, Path) -> multi (Solution);
DepthFirst (aGoal, aGoal, aPath) = aGoal & aPath;
DepthFirst (aState, aGoal, aPath) =
[ NewState = NextMove (aState);
DepthFirst (NewState, aGoal, aState & aPath)
when not member (NewState, aPath) ];```
```type element = State;
type sequence = Path;```
```member (element, sequence) -> boolean;
member (X, =nil)            = false;
member (X, X & Tail)        = true;
member (X, Head & Tail)     = member (X, Tail);```
```NextMove (State) -> multi (State);
NextMove ( state (L1, L2, L3) ) = ( state (L2, L1, L3),
state (L3, L2, L1),
state (L1, L3, L2) );```
```type letter = term;
state (letter, letter, letter) => State;
A, B, C => letter;```
`DepthFirst (state (A, B, C), state (C, B, A), nil)?`

Figure 5-4: An example of a depth-first program

This program will return the following 9 solutions:

```state(C, B, A) & (state(B, C, A) & (state(A, C, B) & (state(C, A, B) &
(state(B, A, C) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(C, A, B) & (state(B, A, C) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(B, C, A) & (state(B, A, C) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(C, A, B) & (state(A, C, B) & (state(B, C, A) &
(state(B, A, C) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(A, B, C) & [ ])
state(C, B, A) & (state(B, C, A) & (state(B, A, C) & (state(C, A, B) &
(state(A, C, B) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(C, A, B) & (state(A, C, B) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(B, C, A) & (state(A, C, B) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(C, A, B) & (state(B, A, C) & (state(B, C, A) &
(state(A, C, B) & (state(A, B, C) & [ ])))))```

There are 4 solutions with 6 states; 4 solutions with 4 states and 1 solution with 2 states.

Exercise:

• 5.1 Modify the framework for depth-first search of Figure 5-2 in such a way that at maximum one solution will be given.

The depth-first search often works well, as in this example, but there are many ways in which our simple procedure can run into trouble. Whether this will happen or not depends on the state space. Many state spaces are infinite. In such a space, the depth-first algorithm may miss a goal node, proceeding along an infinite branch of the graph. The program may then indefinitely explore this infinite part of the space never getting closer to a goal. To avoid aimless infinite (non-cyclic) branches, we can add a refinement to the basic depth-first search procedure: limiting the depth of search:

```DepthFirst (State, Goal, Path, Depth = integer) -> multi (Solution);
DepthFirst (aGoal, aGoal, aPath, Depth) = aGoal & aPath;
DepthFirst (aState, aGoal, aPath, Depth) =
[ NewState = NextMove (aState);
DepthFirst (NewState, aGoal, aState & aPath, Depth - 1)
when not member (NewState, aPath) and Depth > 1 ];```

The search is not allowed to go in depth beyond the initial Depth limit. This constraint is programmed by decreasing the depth limit at each recursive call, and not allowing this limit to become negative.

Another weakness of the depth-first search is that the path it finds may not be the shortest: since it pursues each branch as far as it can before trying any other, it may find a lengthy solution along one branch when some other branch might have been much shorter, as is illustrated by the results of our example program. Because many interesting problems have too large a search space to be searched exhaustively, a method which delivers shortest paths first is described in the following section.

In contrast to the depth-first search strategy, the breadth-first search strategy examines first all nodes at the same level before proceeding to the next level. This results in a search process that tends to develop more into breadth than into depth, as illustrated by Figure 5-5.

```         A
/ | \
/  |  \
B   C   D
/ \     /|\
/   \   / | \
E     F G  H  I```

Figure 5-5: Breadth-first search in a state space

The order of the nodes in this state space visited by a breadth-first search is: A, B, C, D, E, F, G, H, I.

The breadth-first search is not so easy to program as the depth-first search. The reason for the difference is that the depth-first search is based on a stack of nodes, which is implicit in the program structure, while the breadth-first search is keeping all nodes of the same level in a queue, using a first-in-first-out approach which must explicitly be programmed. For example, the nodes on the second level of Figure 5-5 could be stored in a queue of B, C, D. Additionally, for each element of the queue also a path to its parent node should be maintained in order to use that as a solution path as soon as a goal node has been found. That means that an element of the queue consists of two components: a node and a path to the parent node. As done before, we use the & operation to glue the two components together. That means, for example, that the nodes on the second level of Figure 5-5 are stored in the queue as:

`B & path1, C & path1, D & path1`

where path1 is represented by A & nil. The nodes on the third level are stored in its queue as:

`E & path2, F & path2, G & path3, H & path3, I & path3 `

where path2 is represented by B & A & nil, and path3 is D & A & nil. Each element of the queue represents a path. All the elements of a queue are in fact the nodes of a backward pointing path tree.

Based on this approach, the framework for solving problems using breadth-first search is given in Figure 5-6. The BreadthFirst definition has two parameters: the queue of state and path information and the goal description. The result may be zero, or more solutions.

```BreadthFirst (Queue, Goal) -> multi (Solution);
[ NewQueue = alist (Path);
[ aPath = items (aQueue);
aState = State (aPath);
if aState <> aGoal then
[ NewState = NextMove (aState);
when not member (NewState, aPath) ]
else result (aPath) ];
when not endlist (first (NewQueue)) ]; ```

Figure 5-6: A framework for breadth-first search

The procedure starts with the creation of a new empty queue. Then all the path elements of the input queue are read. From each path element the state information is obtained.

If the state is not equal to the goal then a new state is determined by calling the domain specific NextMove routine. The path information together with the new state will be added to the new queue if the new state has not been generated before in the current path.

If the state is equal to the specified goal then a solution has been found and the associated path will be returned by means of the result statement.

After all elements of the input queue have been processed, the BreadthFirst definition will now be called with the new queue unless the new queue is empty. In that case there are no new nodes to be processed and the searching finishes.

This procedure works so far, but there are still some improvements to be made. First of all, recursively calling the BreadthFirst routine is not necessary anymore, because the queue has all relevant information for the next cycle of processing. Furthermore, its is even not necessary to create for each level a new queue. One queue for the whole search process is sufficient as long as the reading of the queue elements is progressing from the front onwards and the elements are added at the back end of the queue. Searching stops when the end of the queue has been reached.

Our modified framework for breadth-first search is given in Figure 5-7.

```BreadthFirst (State, Goal) -> multi (Solution);
[ Queue = { Start & nil };
[ aPath = items (Queue);
aState = State (aPath);
if aState <> aGoal then
[ NewState = NextMove (aState);
when not member (NewState, aPath) ]
else result (aPath) ];
no (Solution) ];```

Figure 5-7: A more efficient framework for breadth-first search

As an example, we will illustrate the working of this breadth-first framework on our preceding example of exchanging tiles, with the program of Figure 5-8.

```type State = term;
type Goal = State;
type Path = term;       << = sequence >>
type Solution = Path;```
```State & Path => Path;   << sequence operation >>
nil = null(Path);```
```BreadthFirst (State, Goal) -> multi (Solution);
[ Queue = { Start & nil };
[ aPath = items (Queue);
aState = State (aPath);
if aState <> aGoal then
[ NewState = NextMove (aState);
when not member (NewState, aPath) ]
else result (aPath) ];
no (Solution) ];```
```State (Path) -> State;
State (State & Path) = State;```
```type element = State;
type sequence = Path;```
```member (element, sequence) -> boolean;
member (X, =nil)            = false;
member (X, X & Tail)        = true;
member (X, Head & Tail)     = member (X, Tail);```
```NextMove (State) -> multi (State);
NextMove ( state (L1, L2, L3) ) = ( state (L2, L1, L3),
state (L3, L2, L1),
state (L1, L3, L2) );```
```type letter = term;
state (letter, letter, letter) => State;
A, B, C => letter;```
`BreadthFirst (state (A, B, C), state (C, B, A) )?`

Figure 5-8: An example of a breadth-first program

This program will return the following 9 solutions in increasing order:

```state(C, B, A) & (state(A, B, C) & [ ])
state(C, B, A) & (state(C, A, B) & (state(B, A, C) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(B, C, A) & (state(B, A, C) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(C, A, B) & (state(A, C, B) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(B, C, A) & (state(A, C, B) & (state(A, B, C) & [ ])))
state(C, B, A) & (state(B, C, A) & (state(A, C, B) & (state(C, A, B) &
(state(B, A, C) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(C, A, B) & (state(A, C, B) & (state(B, C, A) &
(state(B, A, C) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(B, C, A) & (state(B, A, C) & (state(C, A, B) &
(state(A, C, B) & (state(A, B, C) & [ ])))))
state(C, B, A) & (state(C, A, B) & (state(B, A, C) & (state(B, C, A) &
(state(A, C, B) & (state(A, B, C) & [ ])))))```

Exercise:

• 5.2 Modify the framework for breadth-first search of Figure 5-7 in such a way that the search is not allowed to go in depth beyond an initial limit.

Our breadth-first search programs generate solution paths, one after another, ordered according to their lengths. In contrast to the depth-first strategy the breadth-first strategy is guaranteed to produce a shortest solution first. However, both strategies suffer from what is called combinatorial explosion. For many problems the number of alternatives to be explored is so high that depth-first and breadth-first strategies will not work in a limited amount of time. The reason is that, if each node in the state space has n successors the number of paths at depth d is nd (assuming no cycles). This means that the possible solutions paths grows exponentially with their depth.

To limit the number of candidate solution paths problem-specific information is needed to decide what the most promising ways are to proceed at each stage of the search. Search algorithms which are using specific information from the problem domain are called heuristic search strategies; one of them is described in the following section.

5.3 Best-first search

The best-first search strategy also starts at the start node and maintains a set of candidate paths. It uses a state-evaluation function which computes heuristic estimates for the nodes in the state space. There are several ways to define a state-evaluation function, depending on the application. In the following we will assume that the state-evaluation function is a cost function which is defined for the arcs of the state space and which estimates how far a given node is from another node. Such an estimate indicates how promising a node is with respect to reaching a goal node. The idea is to continue the search always from the most promising node in the candidate set.

Best-first search is a generalization of breadth-first search. Both are based on a queue of nodes. The difference is that while the breadth-first search builds a queue, level for level, in left to right order, the best-first search sorts the queue for each new node in increasing cost order and proceeds with the first node of the sorted queue. As a consequence, the queue may contain nodes from different levels and different branches.

A framework for best-first search is given in Figure 5-9. It is derived from the breadth-first search as given in Figure 5-7.

```BestFirst (Start = State, Goal = State) -> multi (Solution);
BestFirst (Start, aGoal) =
[ Queue = { cost (Start, 0) & nil };
aState = State (aPath); Cost = Cost (aPath);
if aState <> aGoal then
[ NewCost:=0;
NewState = NextMove (aState, var (NewCost) );
update (Queue, aPath, NewState, Cost + NewCost)
when not member (NewState, aPath) ]
else result (aPath) ];
no (Solution) ];```

Figure 5-9: A framework for best-first search

The main differences with the breadth-first search are in the area of queue handling and in the processing of cost handling.

The procedure starts with the creation of the queue. The first element of the queue consists of the Start node with an initial zero cost value and a nil path. The queue is processed in the following repeat loop.

The first element of the queue is removed from the queue and the State and Cost information is obtained from this element.

If the state is not the goal then a new state is determined by calling the domain specific NextMove routine. This NextMove definition will also return associated new cost information.

If the new state has not been generated before in the current path, the queue will be updated with the new state, path and cost information. The update procedure, as defined in Figure 5-10, will insert a new element at the right position in an already sorted queue.

If the state is equal to the specified goal then a solution has been found and the associated path will be returned by means of the result statement.

The processing of the updated queue is repeated until all elements of the queue have been processed. In that case there are no new nodes to be processed and the searching is finished.

```update (Queue, Path, State, Cost) -> nothing;
update (aQueue, aPath, aState, Cost) =
[ Element:= first (aQueue);
repeat [ ready when endlist (Element);
anItem = item (Element);
Cost1 = Cost (anItem);
Element:= next (Element) ];
insert (cost (aState, Cost) & aPath, Element) ];```

Figure 5-10: Updating the queue for best-first search

Let us investigate how the best-search algorithm works for the example of Figure 5-11. The numbers assigned to the branches of the tree are representing the distances from one node to another.

```           A
/ | \
2 / 8|  \ 3
/   |   \
B    C    D
/ \       /|\
5 /  2\   3 / | \ 6
/     \   / 2|  \
E       F G   H   I```

Figure 5-11: Best-first search in a state space

The best-first search starts with the initial node. It expands this node in a breadth-first manner and sorts the nodes according to the increasing distances. Then the first element of the sorted queue will be removed and its successors will be added to the queue, the queue will be sorted again, and so on. Sorting means in this case the insertion of new elements at the right position in an already sorted queue. The whole process will continue until the queue is empty. For the state space of Figure 5-11, a queue with the following elements and with the distances between parentheses, are built:

```A(0)                                   A is removed and expanded:
B(2), C(8), D(3)                       queue is sorted as:
B(2), D(3), C(8)                       B is removed and expanded:
E(7), F(4), D(3), C(8)                 queue is sorted as:
D(3), F(4), E(7), C(8)                 D is removed and expanded:
G(6), H(5), I(9), F(4), E(7), C(8)     queue is sorted as:
F(4), H(5), G(6), E(7), C(8), I(9)```

The last line represents a queue with only leaf nodes, which successively will be removed from the queue. Leaf nodes will be tested if they are goal nodes. If so, the associated solution path will be returned.

As an example, we will illustrate the working of this best-first searching framework on one of our preceding examples about determining different routes between Amsterdam and Rome (see Figure 20-3). We will define a component for the best-first searching framework applied to routes between cities (see Figure 5-12).

```component SearchBestFirst;
type Arc = term;
type Node = term;
type Cost = integer;
type State = term;
type Path = term;
type Solution = Path;
type Graph = list (Arc);
arc (Node, Node, Cost) => Arc;
SearchBestFirst (Graph, Start= State, Goal= State) -> multi (Solution);
begin
type Queue = list (Path);
State & Path => Path;	<< sequence operation >>
cost (State, Cost) => Path;
nil = null (Path);
graph:= null (Graph);```
```     SearchBestFirst (Graph, Start= State, Goal= State) -> multi (Solution);
SearchBestFirst (Graph, Start, Goal) = [ graph:= Graph; BestFirst (Start, Goal) ]; ```
```     BestFirst (Start = State, Goal = State) -> multi (Solution);
BestFirst (Start, aGoal) =
[ Queue = { cost (Start, 0) & nil };
aState = State (aPath); Cost = Cost (aPath);
if aState <> aGoal then
[ NewCost:=0;
NewState = NextMove (aState, var (NewCost) );
update (Queue, aPath, NewState, Cost + NewCost)
when not member (NewState, aPath) ]
else result (aPath) ];
no (Solution) ];```
```     update (Queue, Path, State, Cost) -> nothing;
update (aQueue, aPath, aState, Cost) =
[ Element:= first (aQueue);
repeat [ ready when endlist (Element);
anItem = item (Element);
Cost1 = Cost (anItem);
Element:= next (Element) ];
insert (cost (aState, Cost) & aPath, Element) ];```
```     State (Path) -> State;
State ( cost(State, Cost) & Path) = State;```
```     Cost (Path) -> Cost;
Cost ( cost(State, integer: Cost) & Path) = Cost;```
```     member (State, Path) -> boolean;
member (State, =nil)  = false;
member (State, cost (State, Cost) & Path)  = true;
member (State, cost (State1, Cost) & Path) = member (State, Path);```
```     NextMove (State, var (Cost)) -> multi (State);
NextMove (Node, Cost) = NeighborNode (Node, items(graph), var(Cost));```
```     NeighborNode (Node, Arc, var(Cost) ) -> optional (Node);
NeighborNode (X, arc (X, Y, integer: cost), Cost) = [Cost:= cost; Y];
NeighborNode (X, arc (Y, X, integer: cost), Cost) = [Cost:= cost; Y];```
`end component SearchBestFirst;`

Figure 5-12: An example of a best-first search component

We can now use this component in a dialogue session. We use the information of Figure 20-3 to fill the map:

`use SearchBestFirst;`
`Amsterdam, Berlin, Paris, Vienna, Nice, Rome => Node;`
```Map = { arc (Amsterdam, Berlin, 669), arc (Berlin, Vienna, 648),
arc (Vienna, Rome, 1150), arc (Amsterdam, Paris, 517),
arc (Paris, Vienna, 1271), arc (Paris, Nice, 912),
arc (Nice, Vienna, 1130), arc (Nice, Rome, 723)};```
`BestFirst (Amsterdam, Rome)?`

This program will return the 7 solutions in increasing distance order:

```  cost(Rome, 2152) & (cost(Nice, 1429) & (cost(Paris, 517) & (cost(Amsterdam, 0) & [ ])))
cost(Rome, 2467) & (cost(Vienna, 1317) & (cost(Berlin, 669) & (cost(Amsterdam, 0) & [ ])))
cost(Rome, 2938) & (cost(Vienna, 1788) & (cost(Paris, 517) & (cost(Amsterdam, 0) & [ ])))
cost(Rome, 3170) & (cost(Nice, 2447) & (cost(Vienna, 1317) & (cost(Berlin, 669) &
(cost(Amsterdam, 0) & [ ]))))
cost(Rome, 3641) & (cost(Nice, 2918) & (cost(Vienna, 1788) & (cost(Paris, 517) &
(cost(Amsterdam, 0) & [ ]))))
cost(Rome, 3709) & (cost(Vienna, 2559) & (cost(Nice, 1429) & (cost(Paris, 517) &
(cost(Amsterdam, 0) & [ ]))))
cost(Rome, 4223) & (cost(Nice, 3500) & (cost(Paris, 2588) & (cost(Vienna, 1317) &
(cost(Berlin, 669) & (cost(Amsterdam, 0) & [ ])))))```

Exercises:

• 5.3 The component of Figure 5-12 is based on the assumption that the best node to select is the node with the minimum distance from the initial node. It is often better to select a node with a minimum distance from a goal node. Modify the example in such a way that coordinates of the cities are given instead of the distances and that a node is selected based on the total estimated distance from start to goal.
• 5.4 Hill climbing is another heuristic search strategy. It differs from the best-first search in that the new queue is made by sorting the children of a node and placing them at the head of the queue, rather than by sorting the whole queue. As such is hill climbing a generalization of depth-first search. Develop a frame work for the hill climbing search strategy.

5.4 Summary

• State spaces are used for solving problems.
• A state space can be represented by a graph whose nodes correspond to problem states and arcs to possible transitions from one problem state to another. Solving the problem means finding a path from a given initial state to a desired solution state, also called goal state, by applying a sequence of transition rules.
• Three search strategies have been discussed: depth-first search, breadth-first search, and best-first search.
• The depth-first search strategy selects from the possible states in the problem space always a deepest one. Methods to prevent cycling are: test for repeated nodes; limit the depth of search.
• The breadth-first search strategy examines first all nodes at the same level before proceeding to the next level. The set of candidate nodes is maintained in a queue.
• In contrast to the depth-first strategy the breadth-first strategy is guaranteed to produce a shortest solution first.
• With large state spaces both strategies suffer from combinatorial explosion. Heuristic information is required in such cases.
• The best-first search strategy uses heuristic information about a problem to select the most promising nodes of a solution path.

5.5 References

Strategies for searching state spaces are described in any general text on artificial intelligence. Examples are Nilsson(1971, 1980), Winston(1984), Charniak and McDermott(1985), Bratko(1990).

 Part 2: Data Structures Chapter 5: Searching State Spaces