Data Structures 1. Multi-value Functions 2. Lists and Streams 3. Descriptors 4. Trees

##### 2 EXAMPLES INVOLVING LISTS
 2.1 List Functions 2.1.1 Length 2.1.2 Sum 2.1.3 Average 2.1.4 Member 2.1.5 Members 2.1.6 Take 2.1.7 Drop 2.1.8 Portion 2.2 Sorting Lists 2.2.1 Insertion sort 2.2.2 Quick sort 2.2.3 Stream based Sorting 2.3 Combinatorial Functions 2.3.1 Interleave 2.3.2 Permutations 2.3.3 Primes 2.3.4 Palindromes 2.3.5 Eight queens puzzle 2.4 Sets 2.4.1 Set 2.4.2 Union 2.4.3 Intersection 2.4.4 Difference 2.4.5 Subset 2.4.6 Equal 2.5 Summary

There are many applications of lists as collections of related data items. In Chapter 8 of the "Language Description" we discussed lists and their basic operations. In this chapter we will discuss a number of interesting list functions. Some are basic functions, others are used for sorting, or to solve combinatorial problems, or are used to represent mathematical sets.

2.1 List Functions

In this section we will describe some basic list functions. Lists can be taken apart, or rearranged and combined with other lists to form new lists; lists of numbers can be summed and multiplied; and lists can be transformed into streams and into sequences. We will define a number of those basic functions to show how such list functions can be defined and how different language elements can be used to form compound definitions. Some of those functions are polymorphic, others are type specific. Because most functions are simple we will not give detailed explanations as the examples will speak for themselves.

2.1.1 Length

The length of a list is a typical example of polymorphic definitions as discussed earlier. We will include the definition here for completeness because it is one of the important basic list functions.

The length of a list is defined as being the number of elements it contains. Its definition is:

```length (list (entity1)) -> integer;
length (L) = [ size:=0; [ E = items (L); size:=size + 1]; size];```

Examples of its use are:

```length ({2, 3, 4})?			<< 3 >>
length ({2.3, 4.5})?			<< 2 >>
length ({term: 2, 3.4, 'a', "abc"})?	<< 4 >>```

2.1.2 Sum

The sum of a list is the sum of its elements. Because the sum is type dependent we will give a definition for computing the sum of an integer list.

```sum (list (integer)) -> integer;
sum (L) = [ Sum:=0; [ E = items (L); Sum:=Sum + E]; Sum];```

Examples of its use are:

```sum ( {0} )?	                << 0 >>
sum ( {2, 3, 4} )?	        << 9 >>
sum ( {1, 2, 3, 4, 5, 6} )? 	<< 21 >>```

Exercise:

• 2.1 Define the product of a list with real values.

2.1.3 Average

The average of a list is the sum of the list divided by the length of the list. We will use the preceding sum and length functions to define the average of an integer list.

```average (list (integer)) -> integer;
average (L) = sum (L) / length (L);```

Examples of its use are:

```average ( {2, 3, 4} )?	        << 3 >>
average ( {10, 12, 15, 23} )?	<< 15 >>```

Remark: The average of an empty list is not defined because the length of an empty list will be zero. And division by zero is undefined.

The disadvantage of the preceding average function is that the input is scanned twice: one scan by the sum function and a second scan by the length function. A more efficient solution is to scan the list only once, as is done by the following definition:

```average1 (list (integer)) -> integer;
average1 (L) = [ Size:=0; Sum:=0;
[ E = items (L); Size:=Size + 1; Sum:=Sum + E];
Sum / Size];```

Examples of its use are:

```average1 ( {12, 33, 42} )?	<< 29 >>
average1 ( {23, 17, 29, 25} )?	<< 23 >>```

2.1.4 Member

The member function reports whether a given element E is a member of a list L. Its polymorphic definition is:

```member (entity1, list (entity1)) -> boolean;
member (E, L) = [ Member:=false;
if E == items (L) then [Member:=true; ready];
Member];```

Another, more abbreviated version of this definition is:

```member (entity1, list (entity1)) -> boolean;
member (E, L) = [ if E == items (L) then return (true);
false];```

Examples of its use are:

```member ( 4, {2, 3, 4} )?             << true  >>
member ( "a", {"A"} )?               << false >>
member ( "a", {"ab", "abc"} )?       << false >>
member ( "a", {"a", "ab", "abc"} )?  << true  >>```

Remark: In both versions, the scanning of the list stops as soon as a member has been found.

2.1.5 Members

The members function reports the number of occurrences of given element E in a list L. Its polymorphic definition is:

```members (entity1, list (entity1)) -> integer;
members (E, L) = [ i:=0; i:=i + 1 when E == items (L); i];```

Examples of its use are:

```members ("a", {"ab", "abc"} )?            << 0 >>
members ("a", {"a", "ab", "abc"} )?       << 1 >>
members ("a", {"a", "ab", "a", "abc"} )?  << 2 >>```

If we compare the member function with the members function we may conclude that the members function can also be used to test if a member is in the list because the relation (members(E, L) > 0) == member(E, L) is always true. However, the members function needs to scan the whole list, in contrast to the member function which stops as soon as a member has been found.

Exercise:

• 2.2 Define a members function which returns all the members in a list.

2.1.6 Take

The take function returns the first n elements of a list as a stream. It is a polymorphic function:

```take (list (entity1), integer) -> multi (entity1);
take (L, n) = [ i:=0; E = items (L); i:=i+1; ready when i > n; E];```

Examples of its use are:

```take ( {1, 2, 3, 4, 5, 6, 7}, 4)?  << ( 1, 2, 3, 4) >>
take ( {"a", "bc", "abc"}, 2)?     << ( "a", "bc") >>```

Exercise:

• 2.3 Define a polymorphic takelist function which will return the first n elements of a list in a new list by using the previous take function.

2.1.7 Drop

The drop function returns as a stream the remaining elements of a list after the first n elements have been dropped. The drop function is polymorphic:

```drop (list (entity1), integer) -> multi (entity1);
drop (L, n) = [ i:=0; E = items (L); i:=i+1; reject(i<=n); E];```

Examples of its use are:

```drop ( {1, 2, 3, 4, 5, 6, 7}, 4)?	<< ( 5, 6, 7) >>
drop ( {'a', 'b', 'c', 'd'}, 1)?	<< ('b', 'c', 'd') >>
drop ( {'a', 'b', 'c', 'd'}, 0)?	<< ('a', 'b', 'c', 'd') >>```

2.1.8 Portion

The portion function returns in a stream those elements of a list which are between the m and n th index number. Its definition is:

```portion (list (entity1), integer, integer) -> multi (entity1);
portion (L, m, n) = take ( {drop (L, m - 1) }, n - m + 1);```

Note that this definition makes use of an intermediate list.

Another version without an intermediate list is:

```portion (list (entity1), integer, integer) -> multi (entity1);
portion (L, m, n) = [ i:=0;
E = items(L);
i:=i+1;
reject(i < m);
E ];```

Examples of its use are:

```portion ( {1, 2, 3, 4, 5, 6, 7}, 3, 6)?  << (3, 4, 5, 6) >>
portion ( {1, 2, 3, 4, 5, 6, 7}, 3, 8)?	 << (3, 4, 5, 6, 7) >>
portion ( {1, 2, 3, 4, 5, 6, 7}, 3, 3)?	 << (3) >>
portion ( {1, 2, 3, 4, 5, 6, 7}, 4, 3)?	 << ( ) >>```

2.2 Sorting Lists

Sorting of sequences has been discussed in the preceding chapter. Sorting of lists is based on the same principles. We will discuss three methods of sorting lists. In all these methods we will assume that there is an ordering relation X < Y for the elements X and Y. The meaning of the ordering relation X < Y depends on its definition. We assume for our examples that such a relation is defined for integers, real, text, and other types, in the same way as defined for sequences:

```       term < term        -> boolean;
integer: I1 < integer: I2 = I1 < I2;
real: R1 < real: R2    = R1 < R2;
text: T1 < text: T2    = T1 < T2;```

This partial function can be extended with other data types for which the relation T1 < T2 is defined.

2.2.1 Insertion sort

There are different approaches for sorting lists based on the insertion method as discussed in the previous chapter. An easy method is based on sorting sequences. For example, an insertion sort for lists, using sequences, can be expressed in the following way:

```insertsort1 (list (entity1)) -> list (term);
insertsort1 (L) = { Items (insertsort (Sequence(L))) };```

The definition first converts the list L into a sequence, sorts the sequence with the corresponding insertsort into a sorted sequence and then converts the sorted sequence to a list of terms.

Another, more economic approach is to write directly a insertion sort for a list, as exemplified by the following definition:

```insertsort2 (list (entity1)) -> list (entity1);
insertsort2 (L) =
[ El1:= last (L);
if endlist (El1) then return (L);
repeat [ El2:= El1;
El1:= prior (El1);
Item1:= item (El1);
Item2:= item (El2);
if term: Item2 < term: Item1 then
[ Item1:= remove (El1);
El1:= El2;
repeat [ El2:= next (El2);
if ~ (term: item (El2) < term: Item1) then
];
];
];
L ];```

This version of insertion sort starts from the back end of the list and compares each element with the preceding element. If the preceding element is larger then the current element then the preceding element will be removed from the list and inserted in the already sorted part of the list. This process continues until all the elements are sorted.

Examples of its use are:

```insertsort2({5,9,4,3,2,5,7,9,2})?     << {2, 2, 3, 4, 5, 5, 7, 9, 9} >>
insertsort2({10,4,3,2,5,7,9,1,6,8})?  << {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} >>
insertsort2({12.0,10.9,4.0,3.2})?     << {3.2, 4.0, 10.9, 12.0} >>
insertsort2({"abc", "a", "ab", "a"})? << {"a", "a", "ab", "abc"} >>```

Note that, in contrast to insertsort1, insertsort2 mutates the input list. If you want to keep the original list intact, a copy operation should be used such as: insertsort2 ( {items (L)} ).

Exercise:

• 2.4 Use the insertsort to sort a list of Employees, as defined in Chapter 9 of de "Language Description", on decreasing salaries.

2.2.2 Quick sort

For quick sort we can use the same approaches as used for insertion sort. One approach is to convert lists to sequences and to use the quick sort method, based on the sequences, as expressed by the following definition:

```quicksort1 (list (entity1)) -> list (term);
quicksort1 (L) = { Items (quicksort (Sequence (L))) };```

Another approach is to write directly a quick sort for lists, as shown by the following definition:

```quicksort2 (list (entity1)) -> list (entity1);
quicksort2 (L) =
if endlist (first (L)) then L else
[ Smaller:=alist (entity1); Larger:=alist (entity1);
Pivot = remove (first (L));    << remove Pivot from list >>
[ E = items (L);               << scan list >>
if term: E < term: Pivot then add (Smaller, E)
add (Pivot, L); 	       << restore Pivot in list >>
{ items (quicksort2 (Smaller)), Pivot, items (quicksort2 (Larger)) } ];```

The quick sort definition works as follows: One element, the Pivot, is removed from the list which must be sorted. The rest of the list is divided over two other lists. All elements of the first list, Smaller, are less than the Pivot, all elements of the second list, Larger are greater than or equal to the Pivot. These two lists are sorted recursively. Finally, the sorted lists and the pivot are concatenated in the proper order to form a result list.

Examples of its use are:

```quicksort2({5,9,4,3,2,5,7,9,2})? 	<< {2, 2, 3, 4, 5, 5, 7, 9, 9} >>
quicksort2({10,4,3,2,5,7,9,1,6,8})?	<< {1,2,3,4,5,6,7,8,9,10} >>
quicksort2({12.0,10.9,4.0,3.2})? 	<< {3.2, 4.0, 10.9, 12.0} >>
quicksort2({"abc", "a", "ab", "a"})?    << {"a", "a", "ab", "abc"} >>```

Note that quicksort1 as well as quicksort2 are not mutating the input list.

2.2.3 Stream based Sorting

Most sorting methods are designed to minimize sorting times. For many purposes, that is what is required. However, there are situations where fast response times are more important than short sorting times, in particular if the user is not interested in the whole sorted list but only in a limited number of sorted items. For example, a publishing company may want to know on a weekly basis from all the thousands of books it sells, the sales of its top ten. Therefore it needs a sorting method which provides a stream of answers based on a partial sort.

The following definition sorts a list and produces a stream of items:

```streamsort (list (entity1)) -> multi (entity1);
streamsort (L) =
repeat[ MinElement:= first (L);
MinValue:= item (MinElement);
[ Element = nodes (L);   	<< scan list >>
Value = item (Element);
if term: Value < term: MinValue then
[ MinElement:=Element; MinValue:=Value] ];
remove (MinElement)];```

The working of this definition is very simple. It searches for a minimum value in the list. After the whole list has been scanned, the minimum element will be removed from the list and its value will be returned as an answer.

Examples of its use are:

```streamsort({5,9,4,3,2,5,7,9,2})?     << (2, 2, 3, 4, 5, 5, 7, 9, 9) >>
streamsort({10,4,3,2,5,7,9,1,6,8})?  << (1,2,3,4,5,6,7,8,9,10) >>
streamsort({12.0, 10.9, 4.0, 3.2})?  << (3.2, 4.0, 10.9, 12.0) >>
streamsort({"abc", "a", "ab", "a"})? << ("a", "a", "ab", "abc") >>```

Note that streamsort mutates the input list.

If not all the sorted items are needed but only a subset, as in our top ten example, we may combine this with a variation of the take definition:

```take1 (list (entity1), integer) -> multi (entity1);
take1 (L, n) = [ i:=0; E = streamsort (L); i:=i+1; ready when i > n; E ];```

Examples of its use are:

```take1 ( {5, 4, 3, 2, 1}, 4)?           << (1, 2, 3, 4) >>
take1 ( {12.0, 10.9, 4.0, 3.2}, 3)?    << (3.2, 4.0, 10.9) >>
take1 ( {"abc", "ac", "ab", "a"}, 2)?  << ("a", "ab") >>```

Exercise:

• 2.5 Determine the top 20 of the highest paid Employees of Exercise 2.4.

2.3 Combinatorial Functions

Many interesting problems are combinatorial in nature, that is they involve selecting or permuting elements of a list in some desired manner. This section describes a number of combinatorial functions.

2.3.1 Interleave

The interleave function returns a stream of lists which are representing all the possible ways of inserting a given element E into a list L. If L has n elements, then there are n + 1 possible ways of inserting a new element into L.

```interleave (entity1, list (entity1)) -> multi (list (entity1));
interleave (E, L) = [ i = 0..length (L); { take (L, i), E, drop (L, i) } ];```

Examples of its use are:

```interleave (7, {1, 2} )?  << {7, 1, 2}, {1, 7, 2}, {1 ,2, 7} >>
interleave ("a", {"b"})?  << {"a", "b"}, {"b", "a"} >>```

Exercise:

• 2.6 The preceding interleave function has the disadvantage that the front part of the list is scanned many times. Define an interleave function which does not have that disadvantage.

2.3.2 Permutations

The permute function returns a stream of all permutations of a list. If the list has n elements, then the number of permutations is n! = n * (n - 1) * (n - 2) * ... * 1. This can be seen by noting that any of the n elements of the list may appear in the first position, and then any of the remaining n - 1 elements may appear in the second position, and so on, until finally there is only one element that may appear in the last position.

The permute definition is a recursive function which makes use of the interleave function:

```permute (node (entity1)) -> multi (list (entity1));
permute (L) = if endlist (L) then alist (entity1)
else interleave (item (L), permute (next (L)));```

Note that, in contrast to other examples, the input of this definition is not a complete list but an element of a list.

Examples of its use are:

```permute (first ( {2, 3, 4, 5} ))?  << 4! = 24 answers >>
permute (first( {'a','b','c'} ))?  << 3! = 6 answers >>```

2.3.3 Primes

The prime function optionally returns a prime. It uses an external list of already found primes to compute the next prime.

```prime (list (integer), integer) -> optional (integer);
prime (L, I) = [ if mod (I, items (L)) == 0 then return;

Examples of its use are:

```L = alist (integer);
prime (L, 2..100)?```

Exercise:

• 2.7 Write a primes definition which produces a stream of primes. It has as input only an integer value which is the upper bound for the primes to be computed.

2.3.4 Palindromes

A palindrome is a word, verse, sentence or number that read the same backward or forward. The function palindrome checks whether a given text string is a palindrome.

```palindrome(text) -> boolean;
palindrome(T) = [L1 = {T[1..size(T)]}; L2 = {prioritems(L1)}; L1 == L2];```

Examples of its use are:

```palindrome ("madam")?                     << true >>
palindrome ("ABLE was I ere I saw ELBA")? << true >>
palindrome ("1991")? 	                  << true >>
palindrome ("")? 	                  << true >>```

2.3.5 Eight queens puzzle

The "eight queens puzzle" asks how to place eight queens on a chess board so that no queen is in check from any other (i.e., no two queens are in the same row, column, or diagonal). One way to solve the puzzle is to work across the board, placing a queen in each column. Once we have placed k - 1 queens, we must place the k th queen in a position where it does not check any of the queens already on the board. We will call such a position a safe position. If no safe position exists the program will backtrack to a preceding column to try another configuration.

To record the different positions we will represent the configurations on a board by a list of integer values. The integer value is the row number of a safe queen, its position in the list represents its column. For example, the list:

`{ 1, 5, 8, 6, 3, 7, 2, 4 }`

represents one of the 92 solutions of the eight queens problem. It shows the solution where the first queen is placed on location (1,1), the second queen on (2,5), the third on (3,8), and so on.

The program consists of the following definitions:

```queens (integer) -> multi (list (integer));
queens (0) = alist (integer);
queens (m) = [ Q = queens(m-1); n = 1..8; {items(Q), n} when safe(Q,m,n)];```
```safe (list (integer), integer, integer) -> boolean;
safe (L, m, n) = [ Check:=false; i:=0;
[ j = items (L); i:=i+1;
Check:=check (i, j, m, n);
~ Check];```
```check (integer, integer, integer, integer) -> boolean;
check (i, j, m, n) = (j == n) | (i+j == m+n) | (i - j == m - n);```

The check definition tests if two queens at coordinates (i, j) and (m, n) will hold each other in check. The safe definition checks whether a queen at coordinates (m, n) is in a safe position. The queens definition recursively determines all safe positions. All solutions are obtained by:

`queens (8)?`

Many authors have used the eight-queens puzzle to demonstrate the capabilities of a programming language. See, for example, Budd (1997) for single-solution programs in various object-oriented languages.

Exercise:

• 2.8 Write a queens version which not only works for 8 * 8 boards but in general for n * m boards.

2.4 Sets

Sets are used in many branches of mathematics and informatics. A set is a collection of entities with common properties. There are many useful operations defined on sets.

Sets and lists are closely related. In a list an element may occur more than once. In a set, however, each element is unique. In a list the order of the elements may be important; in a set no ordering is assumed. Still, lists and sets have a lot of other characteristics in common. Because of this close relationship we will represent sets by lists. Consequently, all functions defined on lists are also defined on sets.

In this section we will discuss a number of functions which are particularly defined on sets. We will make use of two list functions as defined in the first section of this chapter: the member function and the length function.

We will define a set as:

`type set = list (entity1);`

that means that all the set functions are polymorphic.

2.4.1 Set

Because a list may contain multiple occurrences of elements it is sometimes necessary to convert a list into a set. The set function will create a set without the multiple occurrences.

```set (list (entity1)) -> set;
set (S) = [ Set = alist (entity1);
[ E = items (S); if ~ member (E, Set) then add (Set, E) ];
Set ];```

Examples of its use are:

```set ( {"a", "ab", "abc"} )?            << {"a", "ab", "abc"} >>
set ( {"a", "ab", "a", "abc", "a"} )?  << {"a", "ab", "abc"} >>
set ( {2, 2, 2, 2} )?                  << {2} >>```

2.4.2 Union

The union of two sets is a new set containing all the elements that are in either of the two sets.

```union (set, set) -> set;
union (S1, S2) =
[ NewSet = { items (S1) };
[ E = items (S2); add (NewSet, E) when ~ member (E, NewSet) ];
NewSet ];```

Examples of its use are:

```union ( {1, 2, 3}, {4, 3, 2, 1} )? 	<< {1, 2, 3, 4} >>
union({"a", "ab"},{"a", "ab", "abc"})?	<< {"a", "ab", "abc"} >>```

2.4.3 Intersection

The intersection of two sets is a new set containing only the elements that are in both of the two sets.

```intersection (set, set) -> set;
intersection (S1, S2) =
[ NewSet = alist (entity1);
[ E = items (S1); add (NewSet, E) when member (E, S2)];
NewSet ];```

Examples of its use are:

```intersection ( {5, 4, 3}, {3, 5, 7, 9} )?          << {5, 3} >>
intersection ( {"a", "ab"}, {"a", "ab", "abc"} )?  << {"a", "ab"} >>```

2.4.4 Difference

The difference of two sets S1 and S2 is a new set containing all the elements of S1 that are not in S2.

```difference (set, set) -> set;
difference (S1, S2) =
[ NewSet = alist (entity1);
[ E = items (S1); add (NewSet, E) when ~ member (E, S2) ];
NewSet ];```

Examples of its use are:

```difference ( {5, 4, 3, 2}, {3, 5, 7, 9} )?       << {4, 2} >>
difference ( {"a", "ab"}, {"a", "ab", "abc"} )?  << { } >>```

2.4.5 Subset

The set S1 is a subset of S2 if every member of S1 is a member of S2.

```subset (set, set) -> boolean;
subset (S1, S2) =
[ Subset:=true;
[ E = items (S1);
[ Subset:=false; ready] when ~ member (E, S2) ];
Subset];```

Examples of its use are:

```subset ( {3, 6}, {2, 3, 4, 5, 6, 7} )?     << true >>
subset ( {1, 2, 3, 4, 5, 6, 7}, {3, 8} )?  << false >>
subset ( {3}, {3, 4} )?                    << true >>```

2.4.6 Equal

The two sets S1 and S2 are equal if they have the same number of elements and if S1 is a subset of S2.

```equal (set, set) -> boolean;
equal (S1, S2) = length (S1) == length (S2) & subset (S1, S2);```

Examples of its use are:

```equal ( {4, 3, 2}, {2, 3, 4, 5} )?                << false >>
equal ( {5, 4, 3, 2}, {3, 5, 2, 4})?              << true >>
equal ( {"abc", "a", "ab"}, {"a", "ab", "abc"})?  << true >>```

Exercise:

• 2.9 Write a version of equal (set, set) -> boolean; without using the subset function.

2.5 Summary

• Bi-directional lists can be used in a large variety of applications.
• The basic list functions we discussed are related to the length of a list, the sum and the average of a numeric list, the membership of a list and some functions to read a portion of a list.
• Sorting a list may be based on insertion sort, quick sort, or a stream producing sort.
• Stream producing sorts may be used for partial sorts.
• Examples of combinatorial functions include the computation of permutations, primes, palindromes, and the eight queens puzzle.
• Mathematical sets can be represented as lists. The operations for union, intersection, difference, subset, and equality for sets have been defined. Part 2: Data Structures Chapter 2: Examples involving Lists