Richard
Bird,
Geraint
Jones,
Oege
de Moor
(Oxford
University Computing Laboratory)
Nicholas Pippenger in a paper presented at POPL96 demonstrates the existence of a problem which can be solved in O(n) time by a program in Lisp which uses assignments, but for which an assignment-free Lisp program must take Omega(n log n). By solving this same problem in linear time in a pure functional language with lazy evaluation we show that -- in the case of this program at least -- the assignments required are only those necessary to turn a normal-order evaluator into a lazy one. This shows that lazy evaluators are strictly stronger than assignment-free eager ones.
The programs in this note should be read as Gofer programs: we assume that readers are familiar with Gofer, and with the standard prelude functions which we use. This file is a Gofer literate script which implements the example program which is used in our argument.
The essence of the example ~~~~~~~~~~~~~~~~~~~~~~~~~~ The program used in the proof of Pippenger's complexity result takes (an encoding) of a permutation and then applies it repeatedly to groups of symbols drawn from a potentially infinite sequence of inputs, and emits the symbols of the permuted sequences in turn. In essence, the computation to be implemented is: mapPerm np ps xss = (concat . map (perm ps) . group np) xss where ps is an encoding of a permutation on np-tuples perm ps applies the permutation ps to an np-tuple group np divides an infinite list into successive lists of length np (although there are some details which we will expand below). For definiteness, here is a possible implementation in Gofer: > perm :: [Int] -> [a] -> [a] > perm ps = zipWith (flip (!!)) ps . repeat > group :: Int -> [a] -> [[a]] > group np = unFold (not . null) (take np) (drop np) > unFold :: (a -> Bool) -> (a -> b) -> (a -> a) -> (a -> [b]) > unFold cons head tail = map head . takeWhile cons . iterate tail There is no claim being made here about the complexity of this Gofer program, nor about the exact timing of its execution. In fact that implementation of `perm' is quadratic in np, although perfectly good O(np log np) implementations are possible in a functional language; and `group' could be made more efficient, although only by a constant factor. (Both `concat' and `group n' are asymptotically linear in the number of items processed.) There is some detail of the setting up of the problem in Pippenger's paper. The program used in Pippenger's paper arranges to input symbols which are ignored during dead times when it is outputting values, and to output noise symbols during times when it is inputting values. These extra transactions are immaterial in Pippenger's result about asymptotic real-time behaviour, but simplify the constraint that the program be on-line, so for completeness we include them here by modifying mapPerm np ps = concat . map (preEcho (perm ps) . take np) . group (2*np) > preEcho :: ([a] -> [a]) -> [a] -> [a] > preEcho f xs = xs ++ f xs This definition of `mapPerms' is on-line, in that it is able to produce outputs of any length having consumed inputs of only exactly length (indeed, shorter inputs often suffice). In the context of lazy functional programming, this property may be better known as that `mapPerms' is a constructive function. The reader might care to prove that take n . mapPerm np ps . (++ undefined) . take n is identical to `take n . mapPerm np ps' for all n. Read as an implementation of `mapPerms' this program does not however meet the complexity criterion which Pippenger uses to divide pure and impure Lisp, since it is not real-time. An output of length n consists of O(n/np) applications of `perm ps' and so can take this many times O(np^2) time, which is to say O(n * np) time in all. (Even an O(np log np) implementation of `perm' would reduce this only to O(n log np).) In order to restrict the operations allowed in the program Pippenger defines the notion of a `symbolic' program which consumes a particular representation of the permutation. It first reads two symbols to be used in implementing a representation of numbers, and then reads a representation of np in unary notation followed by a sequence of np unary numbers to represent the permutation, and finally some padding to make this prologue independent of the numbers read. (The length of the numbers would be constant if they were guaranteed to be a permutation, this padding makes the length of the prologue independent of that constraint on the input.) Thus the problem is to implement > problem :: Eq a => [a] -> [a] > problem xs = prologue ++ mapPerm np ps (drop (length prologue) xs) > > where prologue = [ tt, ff ] ++ > takeWhile (== tt) tltlxs ++ > [ head (dropWhile (== tt) tltlxs) ] ++ > take (np^2) (after tltlxs) > > tt = head xs > ff = head (tail xs) > tltlxs = tail (tail xs) > > numbers = unFold farEnough unary after tltlxs > where farEnough = const True > np = head numbers + 1 > ps = take np (tail numbers) > > unary = length . takeWhile (== tt) > after = tail . dropWhile (== tt) on-line in real time, meaning that outputs of length n can be produced in O(n) time, with a constant independent of np. We will demonstrate a lazy functional program which does this. In case you are having difficulty with the representation, it is consistent with: ? problem "abaaabaabaaababbxxxxxx0123xxxx4567xxxx" a b aaab aab aaab ab b xxxxxx 0123 2310 4567 6754 (where spaces have been added to the output to emphasize the structure) showing here the permutation of length four, which maps 0123 to 2310, and is here applied twice only. Before proceeding we note that the `symbolic' property is that the function `problem' must be fully polymorphic in the type of the list which it consumes and outputs, excepting only the requirement that it be able to test its elements for equality. problem :: Eq a => [a] -> [a] Pippenger's result is proved only for symbolic programs. In essence Pippenger demonstrates that whilst something like `mapPerm' can be implemented in real-time (producing each output a bounded number of steps after it consumes the corresponding input) in Lisp with assignment, no on-line pure Lisp program can for all inputs produce outputs of length n in less than Omega(n log n) time. A real-time on-line lazy implementation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The observation which leads to a fast lazy functional program is that each of the permutations to be implemented by mapping `(perm ps)' over a sequence of np-long lists is the same: the permutation is applied obliviously. In fact the type of the term which does this mapping is map (preEcho (perm ps) . take np) :: [[a]] -> [[a]] and in a stronger type system we would be able to observe that all of the components of type [a] in this type are 2*np-long lists. Any such oblivious function can be implemented by a single application of the mapped function to a 2*np-long list of lists of type [a] produced by transposing the argument, and then transposing the result. trans . preEcho (perm ps) . take np . trans With care the transposing function can be written to make this implementation work on-line in real time on infinite arguments. The leftmost occurrence of `trans' must turn a 2*np-tuple of potentially infinite lists into a list of 2*np-tuples, producing the whole of each 2*np-tuple before it inspects the structure of its argument beyond the corresponding value which is the head of each component. The rightmost occurrence of `trans' must have the complementary property when turning a potentially infinite sequence of 2*np-tuples into an 2*np-tuple of lists. In both cases this requirement is met by > trans :: [[a]] -> [[a]] > trans = foldr (zipWith' (:)) (repeat []) > where zipWith' f = (. infinite) . zipWith f > infinite = unFold farEnough head tail > where farEnough = const True Notice that zipWith' is deliberately not strict in its second argument: if the stricter zipWith function were substituted for it, this would prevent the function from operating on-line and indeed make `trans' of an infinite sequence undefined. This leads us to the following program: > mapPerm :: Int -> [Int] -> [a] -> [a] > mapPerm np ps = concat . trans . > preEcho (perm ps) . take np . > trans . group (2*np) which implements mapPerm on lists of length 2*k*np in a time which is O(f(np) + 2*k*np) where f(np) is the cost of applying `perm ps'. Notice that the complexity of the implementation of the function `perm' is essentially immaterial to the long-term cost of applying mapPerm. The effect of the double transposition is to apply perm only once, and fully to evaluate the permutation within the time taken to output the first 2*np elements of the output. Although this might at first sight appear to fall short of the real-time criterion, since it might involve doing more than O(np) of work over the first 2*np components of its output, this does not matter: we are only concerned with asymptotic behaviour and the extra time needed, which is no more than O(np^2), can be attributed to the prologue to the problem (which consumes at least np^2 symbols and so takes time Omega(np^2)). Lazy versus normal-order ~~~~~~~~~~~~~~~~~~~~~~~~ It is well known that normal-order reduction can be simulated in an eager functional language by a systematic translation of the program. For simplicity, first each function in the program is curried, and then each application in the program is rewritten by M N --> (applyto N) M Eager evaluation of the resulting program corresponds to normal-order evaluation of the original program. This translation could be applied to our program to yield an on-line eager program for Pippenger's example problem, but the existence of our real-time on-line lazy implementation of mapPerm does not contradict Pippenger's result since the translation does not preserve the complexity of the program. In particular, wherever the program is not syntactically linear, the normal-order translation executed by an eager evaluator may execute the same sub-computation twice. In our program for example, the implementation of iterate f x = x : iterate f (f x) is essentially not linear. This makes `trans' not linear in the permuted-transposed-grouped sequence, and a normal-order evaluation of our program would re-evaluate the permutation (and everything which is involved in building the structure to be permuted) for each of the 2*np-tuples in the output. This non-linearity can be made more obvious by synthesizing a definition of zipWith' f [] ys = [] zipWith' f (x:xs) ys = f x (head ys) : zipWith' f xs (tail ys) and recalling that the non-strictness in ys is essential to the on-line behaviour of the program. This inefficiency is exactly that which is eliminated in a lazy evaluator which whenever an identified -- and possibly shared -- expression is evaluated replaces the closure for that expression by its value so that it need never be evaluated again even if its value is demanded again elsewhere. One might also attempt to eliminate the non-linearity from the definition of `iterate' by constructing a circular data-structure: iterate f x = fix ((x:). map f) where fix f = y where y = f y but the definition of `fix' is not linear, and efficient execution of this definition relies on a recursive reading of where-clauses: in Lisp terms, on allowing the definition of data objects other than functions in LETREC constructs.
We have shown in this case that the assignments which Pippenger's paper shows are essential for the efficient implementation of mapPerm need not be explicit in an on-line real-time functional program. Our program for mapPerm, together with Pippenger's result, shows that those assignments which are necessary to implement a lazy evaluator are sufficient to reduce the lower-bound complexity of this program.
Note that we cannot and do not claim to have shown that a lazy implementation of a pure functional language can be used to solve all problems with the same complexity as impure Lisp solutions. We do, however, appear to have shown that -- without either assignment or recursive data-object definition -- there is no complexity-preserving translation into an applicative-order eagerly evaluated language of arbitrary programs written in a lazy functional programming language.