Previous Home Contents Next

Chapter 9: Generators

9.1 Using generators in loops

9.2 Using generators via functions

9.3 Creating generators

Consider the problem of traversing a list to operate on each of its members. One might write code such as:

This approach makes good sense for linked lists, but is too expensive for lists represented as arrays. Each iteration would have to allocate a new array in the call to rest, leading to O(#L2) storage use where none is really necessary.

This approach makes good sense for lists represented as arrays, but is inappropriate for a linked list representation. Each iteration would need to traverse the list from the begining to find the desired element, leading to O(#L2) time where O(#L) should suffice.

Having seen this, consider the problem of writing a generic program to operate on some FiniteLinearAggregate structure which might be represented as an array, as a linked list, or in some other way.

How should loops be written to minimize cost in a generic program?

A related problem arises with more sophisticated data structures. Here, the steps to traverse an object can be rather intricate. The code for a loop which traverses two objects in parallel can be extremely difficult and error-prone.

How can loops be written which hide the complexity of data representation?

The answer to both these questions is the same in Aldor, and that is to use generators.



9.1 : Using generators in loops

Generators may be used in Aldor to obtain values serially, wherever required. For example: to compute numbers in a sequence; to access the components of a composite structure, such as a list, array or hash table; or to read data from a file.

The simplest way to use a generator in Aldor is with a for iterator on a repeat loop or a collect form:

In fact, this form of using generators is so common, that if the expression E in ``for v in E.'' does not belong to a generator type, then an implicit call is made to an appropriate generator function. This is equivalent to writing ``for v in generator E.''

With this, our example may be written as:

In Aldor all for-repeat and for-collect loops are based on generators. There is no built in method to traverse integer ranges, lists or other structures. It is the compiler's responsibility to make the use of generators reasonably efficient.

Generators are regular values in Aldor and, once created, may be passed as arguments to functions which consume them gradually, according to a cooperative scheme.



9.2 : Using generators via functions

Sometimes it is desired to use the values in a generator gradually, with some logic that is not naturally expressed as a for loop.

One might imagine writing a function, such as the following, to extract elements from a generator one at a time.

Here the loop body would be evaluated zero or one times, and the function would return nil or the first value in the generator. Subsequent calls would extract the remaining values.

The base Aldor library provides a set of functions which can be used in this way, but are somewhat more flexible. To use these functions one must have an include command for ``aldor'' and import from the appropriate generator type, e.g.:

With this, the following functions become available:

The function "step!" runs the generator code until the next value, or the end of the generator, is reached. After step! has been called, the function "empty?" may be used to test whether the generator has been exhausted. If empty? returns false, then the "value" function may be called to extract the current value from the generator.

The expression "for x in g repeat E" is equivalent to

The final function "bound" provides an upper bound on the number of values which may be returned by the generator, if it is possible to determine. This is useful, for example, to avoid allocating intermediate structures when extracting all the values from a generator into an array. If it is not possible to determine a bound, then -1 is returned.



9.3 : Creating generators

Generators are ultimately created with a "generate" expression in one of the following forms:

The first form, without the "to" part, is the most commonly used.

The body, E, of a generate is an expression containing some number of "yield" forms, each with some argument. The evaluation of the generate proceeds as follows:

None of the expressions in the body is evaluated when the generator is first formed. When the first value is requested from the generator, the body is evaluated until a yield is encountered. The argument of the yield is returned as the value of the generator, and evaluation of the generator is suspended. When another value is requested from the generator, evaluation resumes from the point where left off, and continues until the next yield is encountered. Evaluation proceeds in this way, suspending at yield points and resuming when further values are requested, until the evaluation of the body expression is complete. Note that some evaluation may occur after the last yield, before the body has finished evaluating.

All the yields for a given generate must have the same type of argument. If all the yields in a particular generate have type T, then the generate expression has type "Generator(T)".

If given, the "to" part provides a bound on the number of values which the generator may provide. If a bound is not given, the compiler is permitted, but not required, to deduce a bound when it can.

Examples:

generate expressions may have several yields:

A yield may appear in a loop:

The generator body may have arbitrary control flow within it. This encapsulates the logic for traversing a structure. This is an example from the innards of the HashTable type in the base Aldor library:

In the Base Aldor library, the generator for general Segment values (a..b by c) is given as:

Since it might not be obvious, we note that recursive functions may be used to implement generators. This is extracted from the "Tree" example in section 23.8.

Finally, we observe that when using the base library it is possible to form generators by providing a set of empty?, step!, value and bound functions. These would normally operate with a shared environment:


Previous Home Contents Next