    data Square a = Zero a | Succ (Quad (Square a))
    type Quad a = (a,a,a,a)

This is simple but inflexible (the size is a power of 2), and the shape invariant is not enforced.
    data Square a = Zero a | Succ (Square (Quad a))

Now the shape invariant is enforced (though the size is still inflexible). Idea: use the iterative version of the fast exponentiation algorithm on types. Vectors first:
    type Vector a = Vector_ () a
    data Vector_ v w = Zero v
                     | Even (Vector_ v (w,w))
                     | Odd  (Vector_ (v,w) (w,w))

    createVector x n = create () x n

    create :: v -> w -> Int -> Vector_ v w
    create v w n
      | n == 0  = Zero v
      | even n  = Even (create v (w,w) (half n))
      | odd n   = Odd  (create (v,w) (w,w) (half n))

(these are the rightist, right-perfect trees of Kaldewaij and Dielissen). Now matrices:
    type Rect a = Rect_ () a
    data Rect_ v w = Zero (Vector v)
                   | Even (Rect_ v (w,w))
                   | Odd  (Rect_ (v,w) (w,w))

This gives rectangular matrices. To get squares, make v a vector type constructor; then v (v a) is a square matrix.
    newtype Empty a    = E ()
    newtype Id a       = I a
    newtype Pair v w a = P (v a, w a)

    type Square a = Square_ Empty Id a
    data Square_ v w a = Zero (v (v a))
                       | Even (Square_ v (Pair w w) a)
                       | Odd  (Square_ (Pair v w) (Pair w w) a)

Access functions need to perform arithmetic to know where to go, building up polymorphic access functions as accumulating parameters. Other versions of exponentiation give interesting representations of vectors, but they don't always generalize to matrices (iterative algorithms do, non-iterative ones don't). Is this efficient? There are fewer tags and less pattern matching, but leaves are 50% deeper and, worse, access functions are created at run time (in order to make them well typed). Access functions could instead be created when the matrix is built and stored in the Zero nodes of the data structure, which saves recreating them for each access.
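To make the accumulating-parameter idea concrete, here is a hedged sketch (names and index arithmetic are ours, not necessarily the talk's) of indexing into the Vector_ type above. The accessors for the v and w parts are built up as we descend, and the explicit signature is what licenses the polymorphic recursion:

    -- look carries an accessor for the leftover part v (holding sz
    -- elements) and an accessor for a single block w (holding bs
    -- elements); both grow as we descend the structure.
    look :: (v -> Int -> a) -> (w -> Int -> a)
         -> Int -> Int -> Vector_ v w -> Int -> a
    look av aw sz bs (Zero v) i = av v i
    look av aw sz bs (Even t) i = look av aw2 sz (2*bs) t i
      where aw2 (w1,w2) j = if j < bs then aw w1 j else aw w2 (j - bs)
    look av aw sz bs (Odd t)  i = look av2 aw2 (sz + bs) (2*bs) t i
      where av2 (v',w)  j = if j < sz then av v' j else aw w (j - sz)
            aw2 (w1,w2) j = if j < bs then aw w1 j else aw w2 (j - bs)

    lookupVec :: Vector a -> Int -> a
    lookupVec = look (\() _ -> error "index out of range") (\a _ -> a) 0 1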
Adventure in types? We needed non-regular datatypes, polymorphic recursion, rank-2 polymorphism, and higher kinds to make this work.
    List    = K1 + Id * List
    Tree    = Fst + Tree * Snd * Tree
    Fork    = Id * Id
    Perfect = Id + Perfect . Fork

Non-recursive types are represented as finite trees; recursive types are modelled by infinite type expressions.
Tree is represented as a rational expression tree, while Perfect is modelled by an algebraic (non-rational) tree. As an example, consider a generic definition of flatten.
Specialization cannot be based on the structure of the representation tree, because a non-rational tree gives rise to infinitely many instances. Instead we transform the recursion equation of the type into recursive equations for the function. Example: from the user's definition of flatten,

    flatten<f>   :: f a -> List a
    flatten<Id>  = wrap
    flatten<K1>  = const []
    flatten<f+g> = flatten<f> | flatten<g>
    flatten<f*g> = flatten<f> >< flatten<g>

    wrap a = [a]
    (p >< q) (a,b) = p a ++ q b
specialization yields

    flatten<Perfect>     = flatten1<Perfect> wrap
    flatten1<Fork> p     = p >< p
    flatten1<Perfect> p  = p | flatten1<Perfect> (flatten1<Fork> p)
    flatten2<Tree> (p,q) = p | flatten2<Tree> (p,q) >< q >< flatten2<Tree> (p,q)

Now we apply this to tries. A trie is a search tree scheme due to Thue that employs the structure of search keys to organize information, generalized by Connelly and Morris to permit indexing by elements of an arbitrary monomorphic datatype. Consider finite maps from one type to another. We can build them by induction on the argument type using the following isomorphisms:
    1 -> v     == Maybe v
    (k+m) -> v == (k -> v) * (m -> v)
    (k*m) -> v == k -> (m -> v)

Define tries polytypically:
    Map<*> :: * -> *

where

    empty  :: Map<k> v
    lookup :: k -> Map<k> v -> Maybe v
    merge  :: (v -> v -> v) -> Map<k> v -> Map<k> v -> Map<k> v
Map<k> v represents the set of finite maps from k to v. Then (in point-free style):
    Map<1>   = Maybe
    Map<Int> = IntDictionary
    Map<k+m> = Map<k> * Map<m>
    Map<k*m> = Map<k> . Map<m>

We can use the techniques above to generate polytypic datatypes by specialization (as well as functions).
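As a hedged illustration (ours, with invented names) of how the sum and product isomorphisms translate into lookup functions on tries:

    -- lookup for a sum key: a trie for k+m is a pair of tries
    lookupSum :: (k -> mk v -> Maybe v)      -- lookup for k-tries
              -> (m -> mm v -> Maybe v)      -- lookup for m-tries
              -> Either k m -> (mk v, mm v) -> Maybe v
    lookupSum lk lm (Left k)  (tk, _) = lk k tk
    lookupSum lk lm (Right m) (_, tm) = lm m tm

    -- lookup for a product key: a trie for k*m is a k-trie of m-tries
    lookupProd :: (k -> mk (mm v) -> Maybe (mm v))
               -> (m -> mm v -> Maybe v)
               -> (k, m) -> mk (mm v) -> Maybe v
    lookupProd lk lm (k, m) t = lk k t >>= lm m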
    Map1<*->*> :: (*->*) -> (*->*)

The type constructor Map1 handles parameterized key types; for the types above:

    MapL m     = Maybe * m . MapL m
    MapT (m,n) = m * MapT (m,n) . n . MapT (m,n)
    MapF m     = m . m
    MapP m     = m * MapP (MapF m)

MapT is a first-order nested type; MapP is a higher-order nest. Now the polytypic definitions of the various trie functions can be specialised at these datatypes as before. The result ends up needing rank-2 polymorphism. (Discussion: the need for rank-2 polymorphism can be removed by using the class mechanism; the polymorphic arguments are simply polymorphic methods of other instances.)
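For instance, here is a hedged sketch (our rendering, not necessarily the calculated one) of the list-trie instance MapL together with its lookup; the polymorphic argument for element tries is what forces rank-2 types when done without classes:

    {-# LANGUAGE RankNTypes #-}

    -- MapL m v = Maybe v * m (MapL m v), following the equation above
    data MapL m v = MapL (Maybe v) (m (MapL m v))

    lookupL :: (forall w. k -> m w -> Maybe w)  -- lookup for element tries
            -> [k] -> MapL m v -> Maybe v
    lookupL lookE []     (MapL leaf _) = leaf
    lookupL lookE (k:ks) (MapL _ sub)  = lookE k sub >>= lookupL lookE ks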
Conclusion: at any individual instance, the definition of tries we calculate ends up very complicated, yet the original definition is very simple. Polytypic programming can be much simpler than monotypic programming.
Define a breadth-first traversal algorithm to traverse and rebuild a tree with breadth-first numbering on the nodes. A lazy solution is well known. Find a solution that works in ML.
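A hedged sketch (ours, not necessarily the solution presented) of one strictness-friendly, level-by-level answer, which avoids the lazy knot-tying and so transliterates directly into ML:

    data Tree a = E | N (Tree a) a (Tree a)

    bfnum :: Tree a -> Tree Int
    bfnum t = head (num 1 [t])
      where
        -- number a whole level, then recurse on all its children
        num _ []    = []
        num i level = fst (rebuild i level children')
          where
            k         = length [ () | N _ _ _ <- level ]
            children' = num (i + k) (concat [ [l, r] | N l _ r <- level ])

        -- walk the level again, consuming two renumbered children per node
        rebuild _ []       cs = ([], cs)
        rebuild i (E : ts) cs = let (ts', cs') = rebuild i ts cs
                                in (E : ts', cs')
        rebuild i (N{} : ts) (l' : r' : cs) =
            let (ts', cs') = rebuild (i + 1) ts cs
            in (N l' i r' : ts', cs')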
    data BinLocation = Memory | File Path Mode

We want compact representations of data types. A bit-vector encoding of
    Branch (Branch (Leaf True) (Leaf False)) (Leaf True)

(c. 400 bits) could be coded as 00111011 (8 bits). The compression codes Branch as 0, Leaf as 1, True as 1, and False as 0. The class Binary provides routines to compress arbitrary datatypes.
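A toy sketch (ours, not the library's actual interface) showing how the 8-bit code above arises:

    data Tree = Leaf Bool | Branch Tree Tree

    encode :: Tree -> [Bool]
    encode (Leaf b)     = [True, b]                        -- Leaf = 1, then the payload bit
    encode (Branch l r) = False : encode l ++ encode r     -- Branch = 0

    -- encode (Branch (Branch (Leaf True) (Leaf False)) (Leaf True))
    --   = [0,0,1,1,1,0,1,1] (writing Bools as bits), i.e. 00111011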
    data Ord k => BTree k d = BTree Int [(k,[d])] [BTree k d]

A B-tree of order w has between w and 2w items at each node.
    data Binary d => DBlock d = DBlock [d] (Maybe (BinPtr (DBlock d)))
        deriving Binary

Here the data is broken into pieces in order to put a bound on the number of data items per block. We provide a lazy functional access function (not monadic):
    getFAt :: Binary a => BinHandle -> BinPtr a -> a

An example piece of code:
    db2list :: BinHandle -> DBlock d -> [d]
    db2list bh (DBlock ds Nothing)    = ds
    db2list bh (DBlock ds (Just ptr)) = ds ++ db2list bh (getFAt bh ptr)

We want to lay out pointers to subtrees first, as the most common case is to access subtrees.
Rules have the form

    state1 if predicate --> state2

and are specified as a term rewriting system. The essence of a speculative, out-of-order processor can be specified in 25-30 rules. The fundamental question: does the speculative processor produce the same behaviours as a non-pipelined processor?
For an example, take a set of 6 basic assembly operations that is computationally complete (loads, ALU ops, conditional jumps). We model a multiprocessor system as

    Sys(Proc(pc,rf,im)|pg, dm)

Examples of non-pipelined processor rules:
    Proc(pc,rf,im) if im[pc] = r:=Op(r1,r2)
      --> Proc(pc+1, rf[r:=v], im)                 where v = op(rf[r1],rf[r2])

    Proc(pc,rf,im) if im[pc] = Jz(rc,ra) & rf[rc] = 0
      --> Proc(ra, rf, im)

    Sys(Proc(pc,rf,im)|pg, dm) if im[pc] = r:=Load(ra)
      --> Sys(Proc(pc+1, rf[r:=dm[a]], im)|pg, dm) where a = rf[ra]

The speculative model has a much richer internal structure, and we model these components also. Adding a ReOrder Buffer (ROB) and a Branch Target Buffer (BTB) gives rules like the following:
    Proc(pc,rf,rob,btb,im) if im[pc] = r:=Op(r1,r2)
      --> Proc(pc+1, rf, rob;ROB(t,pc,r:=Op(tv1,tv2)), btb, im)

    Proc(pc,rf,rob1;ROB(t,ia,r:=Op(v1,v2));rob2,btb,im)
      --> Proc(pc, rf, rob1;ROB(t,ia,r:=v);rob2, btb, im)   where v = op(v1,v2)

    Proc(pc,rf,rob1;ROB(t,ia,(Jz(0,npc),ppc));rob2,btb,im)
      if ppc = npc  --> Proc(pc, rf, rob1;rob2, btb', im)
      if ppc /= npc --> Proc(npc, rf, rob1, btb', im)
      where btb' = updateBTB(btb, ia, npc, taken/not-taken)

Rules at this level allow us to explore relaxed memory access models in the context of multiprocessors, for example; this is where many problems arise in practice. The rules can also be used to introduce pipeline stages by splitting a rule in two. Some conditions can easily be extracted from the original rule, but new rules corresponding to hazard conditions have to be introduced. Superscalar rules can be produced by composing pairs of rules operating on two instructions in the same phase at the same time.
To generate hardware many more details are required. The term rewriting system specifies the whole state, the conditions each rule requires, and the effect each rule has. Rules that do not conflict with one another can execute in parallel: rules R1 and R2 do not conflict in state S if
    Cond1(S) & Cond2(S)
      =>  Cond1(Effect2(S)) & Cond2(Effect1(S))
          & Effect1(Effect2(S)) = Effect2(Effect1(S))

In order to compare the complex processor to the simple one, we introduce drained states (where the ROB is empty), so that in a drained state there is a simpler mapping from the complex state to the simple state. But which rules can be used to drain? Using too many may make a commitment, in a non-deterministic system, from which there is no recovery. We require that for all R, RD.R.RD = RD.R, where RD are the draining rules.
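The non-conflict condition transliterates directly into a checkable predicate; a minimal sketch (ours), abstracting a rule as a guard plus a state transformer:

    data Rule s = Rule { cond :: s -> Bool, effect :: s -> s }

    -- R1 and R2 do not conflict in state s (the condition above)
    noConflict :: Eq s => Rule s -> Rule s -> s -> Bool
    noConflict r1 r2 s =
        not (cond r1 s && cond r2 s)
        || (  cond r1 (effect r2 s)
           && cond r2 (effect r1 s)
           && effect r1 (effect r2 s) == effect r2 (effect r1 s) )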
    query0 = do { s <- wg2.8
                ; restrict $ s!food .==. constant "Meat"
                ; project $ s!day }

and this generates the following SQL:
    SELECT s.day FROM wg2.8 AS s WHERE s.food = "Meat"

Most SQL generated by web sites comes from Perl scripts which construct SQL by joining together text strings. This has many problems, from dynamically generated syntax errors to lack of type safety.
We build datatypes that reflect the structure of the language, and a translator to express these as strings. This nicely separates the abstract syntax of the language from its concrete syntax. We then use Haskell to generate programs in the abstract syntax: this is where we gain the leverage of using Haskell. Finally we provide a run function, for example

    runSQL :: PrimSQL -> IO Int

While using datatypes provides structural guarantees, it doesn't give deeper type guarantees. One solution is to introduce a new abstract type whose interface is richly typed, and use the less richly typed abstract syntax tree as the implementation type. Then Haskell is able to guarantee richer properties of constructed elements. A similar trick works to model inheritance: suppose f is a method of A, g of B, and h of C, and B inherits from A, and C from B. By introducing type variables to the interfaces, we can construct interface types which allow each function to be called only on the types for which it is defined:
    f :: Int  -> IA a -> IO ()
    g :: Bool -> IA (IB b) -> IO ()
    h :: Char -> IA (IB (IC c)) -> IO ()

Managing the row types for SQL really needs extensible records and first-class labels, though by heavy use of type classes we can come close to achieving what we'd like.
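A hedged sketch (all names ours) of how those signatures can be realized: each interface type is a newtype over the shared implementation type, with a phantom parameter recording the rest of the inheritance chain:

    newtype IA a = IA String   -- the implementation type; String stands in
    newtype IB a = IB ()       -- phantom only, never built at run time
    newtype IC a = IC ()

    mkA :: String -> IA ()
    mkA = IA
    mkB :: String -> IA (IB ())
    mkB = IA
    mkC :: String -> IA (IB (IC ()))
    mkC = IA

    f :: Int -> IA a -> IO ()
    f n (IA s) = putStrLn (show n ++ ": " ++ s)

    g :: Bool -> IA (IB b) -> IO ()
    g _ (IA s) = putStrLn s

    h :: Char -> IA (IB (IC c)) -> IO ()
    h c (IA s) = putStrLn (c : ' ' : s)

    -- f 1 (mkC "select") is accepted; h 'x' (mkA "select") is rejected.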
The biggest change in ML2000 will be the introduction of subtyping and subsumption over object types (record types with recursion). One object type is a subtype of another either through having additional methods, or through having methods with subtypes. Object types can also have fields (mutable by default), which do not admit subtyping. Subtyping interferes with principal types, so more terms will have to be explicitly typed.
Haskell 98 is fixed, and now thoughts turn to the next version. Interestingly, the original goal for Haskell was to have a stable language for teaching etc. This goal has never yet been achieved. We wait to see if it is now. The next major version will be Haskell 2, with a 2-4 year time-frame. In the meantime there will be a web site maintaining discussions, and potential proposals with their pros and cons.
How do we make functional languages ideal for hosting embedded domain-specific languages such as scripting languages? Ideas include:
Specify COM components by giving an interface defining a set of methods, and then a coclass implementing a set of interfaces. All interfaces and coclasses are referred to by unique global identifiers and described by an IDL description. The HDirect tool takes an IDL description (or a type library) and produces a corresponding Haskell interface. HDirect is inspired by J/Direct and Visual Basic, and accepts both Microsoft's IDL and CORBA's OMG IDL. It enables Haskell to call C and COM, and can allow C and COM to call Haskell.
To encapsulate a Haskell program as a COM object we write an IDL description of the interface we wish to export. HDirect then generates Haskell stubs that can be filled in as desired. The interface table is also created by HDirect, using laziness both to delay the creation until needed and to share the table across multiple uses.
Mobile expressions are designed to enable distinct Clean applications to exchange and evaluate expressions, perhaps across separate machines, or via a file store, etc. The mechanism is type safe (through dynamic typing), and the received expressions can contain functions which are undefined in the receiving application; this implies that the executable code must be available for dynamic linking. Dynamic types are introduced by

    (dynamic expr :: t) :: Dynamic

which generates a pair (expr, typecode t). Dynamics can be stored in files or sent along channels, and they can be extracted in typed patterns (which implement a typecase). This can be expressed in terms of overloading (not all types can be made dynamic). The patterns can include type variables so, for example, here is a generic application launch program:
    launch :: Dynamic Dynamic -> Dynamic
    launch (appl :: a -> b) (doc :: a) = dynamic (appl doc)
    launch appl doc = doc

To write or send dynamic objects we use the graph copying algorithm of Concurrent Clean, converting internal addresses to names. Not everything can be converted into a dynamic: unique types, for example, cannot, because they contain references to state. When a dynamic value is retrieved, it is unpacked into the heap and the function references are checked to see whether they are known. If not, a dynamic link to the appropriate module is made, invoking the compiler if necessary.
    expr ::= \* pat -> body
    body ::= expr
           | expr * expr
           | pat <- body ; body
           | if expr then body else body
           | case ...

This can be desugared as follows (occurrences of pat on the right-hand side are syntactic copies of the pat occurring on the left):
    [\* pat -> b]     = [b]_pat
    [e]_pat           = arr (\pat -> e)
    [p <- b1; b2]_pat = arr (\x -> (x,x)) >>> first [b1]_pat >>> [b2]_(p,pat)
    [if e then b1 else b2]_pat
                      = arr (\pat -> if e then Left pat else Right pat)
                        >>> ([b1]_pat ||| [b2]_pat)
    [e1 * e2]_pat     = arr (\pat -> e2) >>> e1         if FV(e1) /\ BV(pat) = {}
                      = arr (\pat -> (e1,e2)) >>> app   otherwise

CGI scripts need to maintain state on the client.

    arr (\z -> "What is your question") >>> ask >>>
    (arr id &&& ask) >>>
    arr (\(q,a) -> "The answer to " ++ q ++ " is " ++ a)

or, in the syntactic sugar above:

    q <- ask * "What is your question"
    a <- ask * q
    "the answer to " ++ q ++ " is " ++ a

The interpretation of an arrow is a textual representation of its continuation, so an arrow script can save the remainder of its process; i.e., the entire state of the CGI combinators is accessible. We define a CGI functor (parametrized over an IO arrow, to enable it to perform effects):

    newtype CGIF (|->) a b =
        CGI (Either a (State,String) |-> Either b (State,String))
    data State = Empty | InLeft State | InRight State | Save String State

    ask :: ArrowChoice (|->) => CGIF (|->) String String
    ask = CGI $ arr $ \z -> case z of
            Left q           -> Right (Empty, q)
            Right (Empty, a) -> Left a

    arr f = CGI $ \* (Left b) -> Left (f b)

    CGI f >>> CGI g = CGI $ \* z -> case z of
        Left x                -> enterf * Left x
        Right (InLeft s', a)  -> enterf * Right (s', a)
        Right (InRight s', a) -> enterg * Right (s', a)

    first :: (Arrow (|->), Show d, Read d)
          => CGIF (|->) b c -> CGIF (|->) (b,d) (c,d)
    first (CGI f) = CGI $ \* x -> case x of
        Left (b, d)         -> enterf * (Left b, d)
        Right (Save v s, a) -> enterf * (Right (s, a), read v)
      where enterf = \* (x, d) ->
              y <- f * x
              case y of
                Left c       -> Left (c, d)
                Right (s, q) -> Right (Save (show d) s, q)
    f nil        = h1
    f (cons a l) = do b <- f l
                      h2 (a,b)

The monadic cata satisfies
    h* . F^(cataM h) = (cataM h)* . inF^

where F^ f : FA -> MFB (assuming f : A -> MB). The monadic extensions F^ are in bijection with natural transformations d : MFA -> FMA (e.g. accumulate). As an example of fusion, consider
    getLine :: IO [Char]
    getLine = do c <- getChar
                 if c == '\n'
                   then return []
                   else do cs <- getLine
                           return (c:cs)

This is expressible as a monadic unfold. To see fusion in action, consider

    k :: IO a
    k = do l <- getLine
           return (cata h l)

This can be transformed to

    k = do c <- getChar
           if c == '\n'
             then return h1
             else do a <- k
                     return (h2 (c,a))

The equation satisfied by the monadic ana is

    (F^(anaM g))* . g = out^* . anaM g

Unfortunately, this is not necessarily unique (in Haskell it does not necessarily terminate over all monads). As an example, consider graph traversal. Represent a graph as a set of vertices and an adjacency function. Introduce an imperative set of marked vertices together with an initializing run function:
    graphTrav :: [V] -> [V]
    graphTrav = run . gtrav

    gtrav :: [V] -> M [V]
    gtrav l = do l' <- mdrop l
                 case l' of
                   []    -> return []
                   v:l'' -> do mark v
                               y <- gtrav (p (v,l''))
                               return (v:y)

where the choice of p determines whether the traversal is depth first (use cons) or breadth first. Express gtrav = anaM gopen. Now, expressing filter as a cata on lists (filter q = cata fil), then

    filter q . graphTrav = filter q . run . gtrav
                         = run . M (filter q) . gtrav
                         = run . hyloM (fil, gopen)

As another example, consider monadic parsers: these are already hylomorphisms, as the internal data structure never actually gets built.
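A hedged sketch (ours) of a list-shaped monadic hylomorphism in the style used above, under the assumption that hyloM takes a monadic algebra pair and a monadic coalgebra and never builds the intermediate list:

    -- coalgebra: Nothing ends the (virtual) list, Just produces one element
    hyloM :: Monad m => (m b, (a, b) -> m b)   -- algebra: (h1, h2)
          -> (s -> m (Maybe (a, s)))           -- coalgebra: gopen
          -> s -> m b
    hyloM (h1, h2) g s = do
        step <- g s
        case step of
          Nothing      -> h1
          Just (a, s') -> do b <- hyloM (h1, h2) g s'
                             h2 (a, b)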
John Launchbury
Single-effect monadic fix points
Colin Runciman
Haskell and XML: combinators for generic text processing

XML is an extensible markup language which allows the introduction of tagged elements, with attributes, surrounding textual content. The tags produce a tree-structured document. We have a library of XML-generating functions, but here we concentrate on scripting XML trees. The basic components are:

    data Element = Elem Name [Attribute] [Content]
    data Content = CElem Element | CText String | ...
    type CFun    = Content -> [Content]

    processXMLwith :: CFun -> IO ()

How do we process elements? That is, what should we replace ?? with in the following processing stages:

    XML --parser--> Element --??--> Element --pretty--> String
Combinators for processing elements are as follows:

    keep c = [c]
    none c = []
    elem c = case c of { CElem _ -> [c]; CText _ -> [] }
    text c = case c of { CElem _ -> []; CText _ -> [c] }
    children c = case c of { CElem (Elem _ _ cs) -> cs; CText _ -> [] }

    f `o` g = concat . map f . g

    f `with` g    = filter (not . null . g) `o` f
    f `without` g = filter (null . g) `o` f

    (?) :: String -> CFun -> CFun

    f /> g = g `o` children `o` f         -- then
    f </ g = f `with` (g `o` children)    -- outside

    (f |>| g) c = if not (null c') then c' else g c
                  where c' = f c

    deep f    = f |>| deep f `o` children
    deepest f = deepest f `o` children |>| f

Some brief examples: talk `with` ("title"?) selects those documents which contain a title attribute; tag "meeting" /> tag "talk" /> tag "title" yields all talk titles; deep text strips all tag structure.
This scripting library is available, as is an interpreter for a concrete language in this form.
Dick Kieburtz
Comonads for effects

Monads and comonads are dual structures: as a monad gives structure to algebras, a comonad gives structure to coalgebras. A comonad is made from

    type W a
    counit :: W a -> a
    (!)    :: (W a -> b) -> (W a -> W b)

with duals of the monad laws. As an example, consider the state comonad:

    type CoST s a = (s -> a, s)
    counit (f,x) = f x
    !f = \(g,s) -> (\s' -> f (g,s'), s)

Now if S = (Stream B, A) and f :: B -> A -> A, we can define map on streams as follows:

    hd (mapstr f s a) = f (hd s) a
    tl (mapstr f s a) = mapstr f (tl s) (f (hd s) a)

Another example is the comonad of exceptions:

    data CoEx a = Left a | Right a
    counit (Left x)  = x
    counit (Right x) = x
    !f = Left . f

    raise e = Right e
    try e1 handler = case e1 of
        Left x  -> x
        Right x -> handler x

Functional languages are useful for programming reactive systems (use iteration in preference to recursion; input events are mapped to event streams). Comonads afford a natural way to interpret effects in coalgebras over codata: effects affect observations of codata, and state is manifested as a stream of valuations of state variables. It would be useful to have a notation in which codata and data are not easily confused. Discussion: could the arrow notation serve this purpose?
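A runnable Haskell rendering (ours) of the state comonad above, writing cobind for the (!) operation:

    type CoST s a = (s -> a, s)

    counit :: CoST s a -> a
    counit (f, s) = f s

    cobind :: (CoST s a -> b) -> CoST s a -> CoST s b
    cobind k (f, s) = (\s' -> k (f, s'), s)

    -- e.g. counit (cobind k w) == k w, one of the dual laws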
Peter Thiemann
Interpreting specialisation in type theory

Why interpret specialisation in type theory? Note the parallels between them: staged computation versus the phase distinction; static/dynamic versus compile-/run-time; two-level language versus constructors/objects; two-level types versus kinds/types; well-foundedness versus the phase-distinction requirement. Our goal is to map a typical partial evaluator into type theory by mapping static terms into constructor expressions, and dynamic terms into object expressions. The basic functions we want include:

    E : 2Exp -> term at object level (run-time part)
    T : 2Exp -> term at constructor level (compile-time part)

The compile-time kind, type and value structure echoes the run-time structure, except that it is elevated one level. Thus each integer i represents a one-point compile-time type i^, the function + becomes a type constructor, and so on. The definitions of E and T include:

    E[e:int]      = T
    E[e1+e2]      = E[e1] + E[e2]
    E[lift e:int] = choose (T[e:int]) (E[e:int])   where choose i^ T = i^
    T[e:int]      = int
    T[i:int]      = i^
    T[e1+e2]      = T[e1] +^ T[e2]

For a term e:t in simply-typed lambda-mix we can show that E[e] : type(t)(T[e]), and that reduction in type theory (restricted to constructors) simulates lambda-mix specialisation. A similar correspondence can be drawn between the module calculus and an extended version of lambda-mix.
Lennart Augustsson
Cayenne

Exercise: write a well-typed interpreter in Cayenne. We encode typing rules as a predicate which yields an empty type when false, and a non-empty type when true.

    data Truth  = truth
    data Absurd =
    data (/\) a b = (&) a b

    data Type = TBool | TInt | TArrow Type Type
    type TEnv = Symbol -> Type

    Decode :: Type -> #

To calculate the types of expressions:

    HasType :: TEnv -> Expr -> Type -> #
    HasType g (EBool _) TBool = Truth
    HasType g (EInt _)  TInt  = Truth
    HasType g (EAdd e1 e2) TInt = HasType g e1 TInt /\ HasType g e2 TInt
    HasType g (EVar x) t = EqType (g x) t
    HasType g (EAp f a) t = sig
        s  :: Type
        pf :: HasType g f (TArrow s t)
        pa :: HasType g a s
    HasType g (ELam x s' e) (TArrow s t) =
        HasType (extendT g x s) e t /\ EqType s' s
    HasType _ _ _ = Absurd

    absurd :: Absurd -> a
    absurd x = case x of { }

The (value) environment depends on the types of the variables, and these types are recorded in the type environment.

    VEnv (g :: TEnv) = (x :: Symbol) -> Decode (g x)

    extendT :: TEnv -> Symbol -> Type -> TEnv
    extendT g x t = \(x' :: Symbol) -> if (x == x') t (g x')

    extendV :: (g :: TEnv) -> (r :: VEnv g) -> (x :: Symbol) -> (t :: Type) ->
               (v :: Decode t) -> VEnv (extendT g x t)
    extendV g r x t v = \(x' :: Symbol) ->
        case x == x' of
          True  -> v     -- :: Decode t
          False -> r x'  -- :: Decode (g x')

Now we can use these types and proofs to build an untagged interpreter.

    interp :: (e :: Expr) -> (t :: Type) -> (g :: TEnv) ->
              VEnv g -> HasType g e t -> Decode t
    interp (EBool b) TBool g r p = b
    interp (EInt i)  TInt  g r p = i
    interp (EAdd e1 e2) TInt g r (p1 & p2) =
        interp e1 TInt g r p1 + interp e2 TInt g r p2
    interp (EVar x) t g r p = convert (g x) t p (r x)
    interp (EAp f a) t g r p =
        (interp f (TArrow p.s t) g r p.pf) (interp a p.s g r p.pa)
    interp (ELam x _ e) (TArrow s t) g r (p & _) =
        \(v :: Decode s) -> interp e t (extendT g x s) (extendV g r x s v) p
    interp _ _ _ _ p = absurd p

    convert :: (t :: Type) -> (t' :: Type) -> EqType t t' -> Decode t -> Decode t'

Conclusion: we can write an interpreter in Haskell style that returns the right type, at the expense of three extra arguments. These extra arguments can be optimized away since they have no computational content. This then gives a faster interpreter than an interpreter written in Haskell. Perhaps the real gain is to be made with partial evaluation, as the interpreter returns a value of the correct type.
Erik Meijer
Implicit arguments

Haskell-style overloading is constructed from a couple of basic concepts. We aim to dissect these concepts in the hope of being able to build something simpler and/or more powerful (multiple instances at one type, delayed resolution of ambiguous overloading, or dynamic creation of instances, for example). Here we focus on the dictionary translation, which introduces the dictionary arguments. Introduce implicit argument constructors:

    \\x -> e[x] :: {x :: a, ...} => b
    //x -> e    :: {...} => a -> b

and now

    \x -> e[x] = //x -> (\\x -> e[x])

and the typing judgements reflect this. Examples:

    read :: {readDict :: Read a} => String -> a
    show :: {showDict :: Show a} => a -> String
    f = (show . read)
      :: {readDict :: Read a, showDict :: Show a} => String -> String

    sort :: {cmp :: a -> a -> Ordering} => [a] -> [a]
    sort = \\cmp -> sortBy cmp

    x' = \\x -> x
    y' = \\y -> y
    x' :: {x :: a} => a
    y' :: {y :: b} => b
    foo :: {x :: a, y :: b} => (a,b)
    foo = (x', y')
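To round this off, a hedged sketch in the notation above (the explicit-abstraction step is our extrapolation, not an example from the talk) of how foo's implicit arguments can be discharged with //:

    fooExplicit :: a -> b -> (a,b)
    fooExplicit = //x -> //y -> foo   -- turn the implicit x and y into
                                      -- ordinary explicit arguments
    -- fooExplicit 3 'c' = (3,'c')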