Data Structures 1. Multi-value Functions 2. Lists and Streams 3. Descriptors 4. Trees

##### 4 GRAPHS

Graphs are used as abstractions of real world problems such as describing task planning, railroad operations, air traffic routes, and transmission networks. They are applied in many applications for representing relations, situations or problems. In this chapter we will examine graphs. We will discuss ways how to represent graphs and will describe some operations on graphs.

4.1 Representing Graphs

Graphs are networks of nodes connected by arcs. For example, a map can be seen as a graph, in which the nodes are cities and the arcs are roads connecting the cities. If you want to find the shortest way between two cities, you have to solve the problem of finding the shortest path between nodes of a graph. Figure 4-1 shows an example.

```  B - - - - - - - - - C
/                   / \
/                   /   \
A - - - - - - - - - D - - G
\                       /
\                     /
E - - - - - - - - - F ```

Figure 4-1: Example of a graph.

A graph is a collection that include a set of nodes and a set of arcs. A node, also called a vertex, forms the basic element of the graph, much as we have seen for other structures such as lists and trees. An arc, also called an edge, is a connection between two nodes.

Graphs can be represented in several ways. One method is based on adjacency lists. It associates with each node a list of nodes that are adjacent to each node. Then a graph is a list of joints consisting of a node plus its adjacency list. The graph of Figure 4-1 can then, for example, be represented by:

```type joint = term;
joint (Node, list (Node)) => joint;```
```Graph1 = { joint (A, {B, D, E}), joint (B, {A, C}), joint (C, {B, D, G}),
joint (D, {A, C, G}), joint (E, {A, F}), joint (F, {E, G}),
joint (G, {C, D, F}) };```

Another method is to represent a whole graph as a pair of two sets: nodes and arcs. Each set can be represented as a list; each arc as a pair of nodes. The graph of Figure 4-1 can then be represented by:

```Graph2 = { { A, B, C, D, E, F, G },
{ arc (A, B), arc (B, C), arc (A, D), arc (C, D), arc (C, G),
arc (D, G), arc (A, E), arc (E, F), arc (F, G)} };```

A graph is a directed graph if the nodes in an arc are ordered; in that case the arc(A, B) points from A to B and arc(B, A) points from B to A and are therefore distinct arcs. A graph is said to be an undirected graph if the order of the nodes in an arc is irrelevant; in that case are arc(A, B) and arc(B, A) representing the same arc.

Both kinds of graph representations allow for the definition of unconnected nodes or unconnected (sub-)graphs. This may be useful if, for example, networks of railways are represented in a graph. Some cities or networks may be unconnected.

If all nodes and all (sub-)graphs are connected, a simpler representation may be chosen. In that case a set of arcs is sufficient to describe a graph. For example, the graph in Figure 4-1 can be thus represented as follows:

```Graph3 = { arc (A, B), arc (B, C), arc (A, D), arc (C, D), arc (C, G),
arc (D, G), arc (A, E), arc (E, F), arc (F, G)};```

What will be the most suitable representation will depend on the application and on operations to be performed on graphs.

4.2 Finding a path

A typical operation is to find a path between two given nodes of a graph. A path is a sequence of nodes such that successive pairs of nodes are connected by arcs. For example, in Figure 4-1 a path between A and G is the sequence A, D, G which is derived from the connections arc(A, D) and arc(D, G). There may be several paths between two nodes as will be clear from our example.

So, a specification for a path finding operation is:

`Path (Graph, FirstNode = Node, LastNode = Node) -> multi (Path);`

which says: return to me all the Paths between two nodes of a Graph. One node is called the FirstNode and the other is the LastNode.

Finding a path means, according to our definition, going from one node to another connected node. However, such a traversal of a graph may imply that the same nodes may be encountered more than once; this may result in cycling. Since a path may not contain any cycle, a node can appear in a path at most once. To prevent cycling it is necessary to maintain a list of already passed nodes, which is in fact the path found so far, and to test for each following node if it is already in the list. That means that the Path found so far is also an argument of the path finding algorithm. For convenience, we will make a separation between the initial path finding definition and the recursive path finding definition with Path as an argument. The initial path finding definition is:

```Path (Graph, FirstNode=Node, LastNode=Node) -> multi (Path);
Path (aGraph, FirstNode, LastNode) =
path (aGraph, FirstNode, LastNode, FirstNode & nil);```

This definition calls the second definition:

```path (Graph, FirstNode=Node, LastNode=Node, Path) -> multi (Path);
path (aGraph, LastNode, LastNode, aPath) = aPath;
path (aGraph, FirstNode, LastNode, aPath) =
[ Node = NextNode (aGraph, FirstNode, aPath);
path (aGraph, Node, LastNode, Node & aPath)]; ```

The second definition consists of two definition rules. The meaning of these definition rules are:

• If the FirstNode is the same as the LastNode then a complete path has been found which is returned.
• If the FirstNode and the LastNode are not the same then a Node following the FirstNode will be selected and a path between this Node and the LastNode will be tried to find by calling recursively the path definition. In front of the path found so far, called aPath, the new Node is added by means of the & operation. If a path between Node and LastNode does not exist then backtracking will occur and another Node will be selected. If another Node is not available further backtracking will occur.

To select a next node which is adjacent to a Node two criteria are used: it must be a neighbor node and it may not be part of the current path:

```NextNode (Graph, Node, Path) -> multi (Node);
NextNode (aGraph, aNode, aPath) =
[ NewNode = NeighborNode (aNode, items (aGraph));
NewNode when not member (NewNode, aPath) ];```

The selection of a neighbor node depends on the representation of the graph. Here we assume that a graph is only represented by a list of arcs. To select a neighbor node from a given node the corresponding arcs are used:

```NeighborNode (Node, Arc) -> optional (Node);
NeighborNode (X, arc (X, Y) ) = Y;
NeighborNode (X, arc (Y, X) ) = Y;```

We assumed that the arcs are represented as terms. We also assumed the arcs of the graph are not directed and that an arc from X to Y also implies an arc from Y to X. In case of a directed graph the last definition rule of NeighborNode should be omitted.

In our path searching algorithm the current path is extended with new nodes until the goal is reached or it comes to a dead end. The representation we are using for a path is a sequence as described in Chapter 1. As discussed, a sequence is a uni-directional list with the following structure:

`D & (C & (B & (A & nil))))`

This sequence is a representation for a path from A to B to C to D in the graph of Figure 4-1. The path structure acts as a stack for the accepted nodes. The first element of the stack is the initial node attached to nil, a null list element. A new node is added to the path, when the expression Node & Path is executed. The expression results in a term-structure based on the following declarations:

```type Path = term;
nil = null (Path);
Node & Path => Path;	<< sequence operation >>```

To search for a node in the path we are using the member definition as defined for sequences in Chapter 17, with some additional type definitions:

```type element = Node;
type sequence = Path;```
```member (element, sequence) -> boolean;
member (X, =nil)            = false;
member (X, X & Tail)        = true;
member (X, Head & Tail)     = member (X, Tail);```

With these definitions we are now able to find the paths from A to G in Figure 4-1. The complete program is given in Figure 4-2.

```type Graph = list (Arc);
type Arc = term;
type Node = term;
type Path = term;
nil = null (Path);```
```arc (Node, Node) => Arc;    << term-signature >>
Node & Path => Path;        << sequence operation >>```
```Path (Graph, FirstNode=Node, LastNode=Node) -> multi (Path);
Path (aGraph, FirstNode, LastNode) =
path (aGraph, FirstNode, LastNode, FirstNode & nil);```
```path (Graph, FirstNode=Node, LastNode=Node, Path) -> multi (Path);
path (aGraph, LastNode, LastNode, aPath) = aPath;
path (aGraph, FirstNode, LastNode, aPath) =
[ Node = NextNode (aGraph, FirstNode, aPath);
path (aGraph, Node, LastNode, Node & aPath) ]; ```
```NextNode (Graph, Node, Path) -> multi (Node);
NextNode (aGraph, aNode, aPath) =
[ NewNode = NeighborNode (aNode, items (aGraph));
NewNode when not member (NewNode, aPath) ];```
```NeighborNode (Node, Arc) -> optional (Node);
NeighborNode (X, arc (X, Y)) = Y;
NeighborNode (X, arc (Y, X)) = Y;   << not for directed graphs >>```
```type element = Node;
type sequence = Path;```
```member (element, sequence) -> boolean;
member (X, =nil)            = false;
member (X, X & Tail)        = true;
member (X, Head & Tail)     = member (X, Tail);```
`A, B, C, D, E, F, G => Node; `
```graph = { arc (A, B), arc (B, C), arc (A, D), arc (C, D), arc (C, G),
arc (D, G), arc (A, E), arc (E, F), arc (F, G)};```
`Path (graph, A, G)?`

Figure 4-2: Finding acyclic paths in graph

This program will return the following 5 paths from A to G:

`G & (D & (C & (B & (A & [ ]))))`
`G & (C & (B & (A & [ ])))`
`G & (C & (D & (A & [ ])))`
`G & (D & (A & [ ]))`
`G & (F & (E & (A & [ ])))`

Because the paths are represented as sequences, the visited nodes are read backwards.

4.3 Cost factors

In many path-finding problems, costs are associated with nodes or arcs and the problem is to find the least costly path. These costs may represent money, distances, weights or any other measure fitting to the application.

In this section we will apply graph searching to a traveling problem. As an example, we will develop a program to assist in travel planning. It examines different alternatives of a sight-seeing tour to cities in Europe. Given the distances in kilometers between cities as specified in Figure 4-3, what are the possible ways and associated costs to start in Amsterdam and to finish in Rome?

```               669
Amsterdam ---------- Berlin
|               |
517 |               |
|               |
Paris            | 648
|   \           |
|     \ 1271    |
912 |       \       |
|         \     |
|           \   |
|         _ Vienna
|  1130_-    /
|    _-     /
Nice -      /
\       / 1150
723 \     /
\   /
Rome```

Figure 4-3: Example of a weighted graph

We may represent this map as a list of pairs of cities with their distances:

```Map = { arc (Amsterdam, Berlin, 669), arc (Berlin, Vienna, 648),
arc (Vienna, Rome, 1150), arc (Amsterdam, Paris, 517),
arc (Paris, Vienna, 1271), arc (Paris, Nice, 912),
arc (Nice, Vienna, 1130), arc (Nice, Rome, 723) };```

This map represents an undirected graph, because the distance from Amsterdam to Berlin is the same as the distance from Berlin to Amsterdam, and so on.

The definitions we used in the preceding section have to be adapted to handle costs associated with arcs. The path definition needs an additional input parameter for Cost, while the NextNode, and NeigbourNode definitions both require an additional parameter used for output of Cost. Furthermore, to the final Path the total cost is attached by a > operation, which is a newly defined term operation. The complete program is given in Figure 4-4.

```type Graph = list (Arc);
type Arc = term;
type Node = term;
type Path = term;
type Cost = integer;
nil = null (Path);```
```arc (Node, Node, integer) => Arc;   << term-signature >>
Node & Path => Path;                << sequence operation >>
Cost > Path => Path;                << cost annotation >>```
```Path (Graph, FirstNode=Node, LastNode=Node) -> multi (Path);
Path (aGraph, FirstNode, LastNode) =
path (aGraph, FirstNode, LastNode, FirstNode & nil, 0);```
```path (Graph, FirstNode=Node, LastNode=Node, Path, Cost) -> multi(Path);
path (aGraph, LastNode, LastNode, aPath, Cost) = Cost > aPath;
path (aGraph, FirstNode, LastNode, aPath, Cost1) =
[ Cost2:=0; Node = NextNode (aGraph, FirstNode, aPath, var (Cost2));
path (aGraph, Node, LastNode, Node & aPath, Cost1 + Cost2) ]; ```
```NextNode (Graph, Node, Path, var (Cost) ) -> multi (Node);
NextNode (aGraph, aNode, aPath, Cost) =
[ NewNode = NeighborNode (aNode, items (aGraph), var (Cost) );
NewNode when not member (NewNode, aPath) ];```
```NeighborNode (Node, Arc, var (Cost) ) -> optional (Node);
NeighborNode (X, arc (X, Y, integer: cost), Cost) = [Cost:=cost; Y];
NeighborNode (X, arc (Y, X, integer: cost), Cost) = [Cost:=cost; Y];```
```type element = Node;
type sequence = Path;```
```member (element, sequence) -> boolean;
member (X, =nil)            = false;
member (X, X & Tail)        = true;
member (X, Head & Tail)     = member (X, Tail);```
`Amsterdam, Berlin, Paris, Vienna, Nice, Rome => Node;`
```Map = { arc (Amsterdam, Berlin, 669), arc (Berlin, Vienna, 648),
arc (Vienna, Rome, 1150), arc (Amsterdam, Paris, 517),
arc (Paris, Vienna, 1271), arc (Paris, Nice, 912),
arc (Nice, Vienna, 1130), arc (Nice, Rome, 723)};```
`Path (Map, Amsterdam, Rome)?`

Figure 4-4: Path finding with costs associated with arcs

This program will return the following 7 alternative routes from Amsterdam to Rome each with their total number of kilometers:

```2467 > Rome & (Vienna & (Berlin & (Amsterdam & [ ])))
4223 > Rome & (Nice & (Paris & (Vienna & (Berlin & (Amsterdam & [ ])))))
3170 > Rome & (Nice & (Vienna & (Berlin & (Amsterdam & [ ]))))
2938 > Rome & (Vienna & (Paris & (Amsterdam & [ ])))
3641 > Rome & (Nice & (Vienna & (Paris & (Amsterdam & [ ]))))
3709 > Rome & (Vienna & (Nice & (Paris & (Amsterdam & [ ]))))
2152 > Rome & (Nice & (Paris & (Amsterdam & [ ])))```

Exercises:

• 4.1 Graphs, arcs, and nodes can also be represented by corresponding descriptors. Write (generic) components for graphs, arcs, and nodes based on the program of Figure 4-4.
• 4.2 Add to the graphs component of Exercise 4.1 operations, which will allow the client to change the list of arcs.

4.4 Summary

• Graphs are networks of nodes connected by arcs.
• A graph is a directed graph if the nodes in an arc are ordered. A graph is said to be an undirected graph if the order of the nodes in an arc is irrelevant.
• Graphs can be represented in several ways. Some kinds of graph representations allow for the definition of unconnected nodes or unconnected (sub-)graphs.
• A typical operation on graphs is to find a path between two given nodes of a graph. There may be several paths between two nodes.
• A path may not contain any cycle.
• In some applications costs are associated with nodes or arcs. Often the problem is to find the least costly path.

4.5 References

Further discussion on the representation and use of graphs can be found in Knuth(1975), Ado, Hopcroft and Ullman(1974, 1983), and Bratko (1990).

 Part 2: Data Structures Chapter 4: Graphs