The GeomLab language

This page contains a concise and fairly formal description of the programming language that is part of GeomLab. Reading it is almost certainly not the best way to learn to write programs for GeomLab: instead, it is better to follow the examples that are contained in the worksheets.

The GeomLab language is a general-purpose programming language in the sense that it is not tailored towards the graphics programming that is the theme of the worksheets, and it is capable of describing any function that can be computed. Its chief feature is that it is purely applicative: there are no "variables" in the language that can be assigned different values at different times in the execution of a program, but as in ordinary mathematics, each variable takes a single value for each use of the equation or definition in which it appears. As in mathematics, the same equation or definition can be used several times in a calculation, with the variables standing for different values each time.

Syntax

An expression or definition consists of a sequence of tokens made up of ASCII characters. Each token belongs to one of these classes: names, numbers, strings, operators and delimiters. Blanks and line breaks may not occur within tokens, except in comments and for blanks in strings, but are otherwise ignored unless they separate two consecutive tokens that might otherwise be read as one. Upper and lower case letters are treated as distinct.

An identifier is any sequence of letters, digits and underscores that begins with a letter or underscore, except for the reserved words that are listed below.

A number token is any sequence of decimal digits, followed optionally by a decimal point and a further (possibly empty) sequence of decimal digits, then optionally by the letter E, an optional plus or minus sign, and a sequence of digits.

A string is a sequence of characters enclosed between two double-quote characters. A string may not contain a double-quote character or a line break.

An operator or delimiter is one of the reserved words and special symbols in the list that follows. Reserved words appear entirely in lower case, and may not be used as names.

and define div else function if in let mod not op or then when
_ = + - $ * / & ~ : . ++ < <= <> > >= ( ) [ ] , ; |

The syntax of expressions and definitions in the GeomLab language is given in subsequent sections of this document using EBNF notation. In this notation, square brackets [ ... ] enclose optional text, and curly brackets { ... } enclose text that may be repeated zero or more times. Sometimes parentheses ( ... ) are used to group alternatives.

Values

The following kinds of value are denoted by expressions in the GeomLab language:

Numbers

Numbers in the GeomLab language are represented in double-precision floating point format, even if they are integers. Numbers are denoted by number tokens, and are yielded as the results of arithmetic operations.

Booleans

The Boolean values true and false are denoted by predefined names, and are yielded as the results of comparison operators.

Strings

Strings are denoted by string tokens. Strings are not much used in GeomLab programming, and are included mostly for internal use "under the bonnet" in implementing the other language features.

Lists

The empty list is denoted by the expression [], and non-empty lists may be constructed with the operator ":". A list expression [x1, x2, ..., xn] is an abbreviation for a list of length n constructed in this way.

Functions

Names in GeomLab programs may denote functions, either primitive functions that are part of the initial environment or installed as a plug-in extension, or functions that are defined as part of the GeomLab program. Each function takes a fixed number of values as its arguments and deliviers a single value as its result. Functions are created by function definitions (see Section 6.2) and by function expressions (see Section 8.2).

Other values

Other kinds of value may be added to the GeomLab languages as plug-in extensions. Typically, such extensions come with a collection of primitive functions for creating and manipulating the new values. In the GeomLab environment, colours and pictures are provided as additional kinds of value in this way.

Scope rules

Expressions are evaluated in the context of an environment that gives values to the variables that appear in the expression. An environment has two parts: the global part contains the pre-defined names that are built in to the GeomLab system, together with additional names that have been added by top-level definitions. There is also a local part of the environment that contains names that have been defined by pattern-matching in a function definition, or by one of the additional forms of expression covered in Section 8. Unless these additional forms of expression are used, there is no nesting of environments in GeomLab, and the scope rules can be summarised as follows:

The additional forms of expression introduced in Section 8 allow the possiblility of nested scopes in which function values are first-class citizens. For these forms of expression, it is necessary to add the rule that the local part of the environment is treated according to static binding. Thus, function values capture the local part of the environment at the point of definition, and it is this local part, and not the one in force at the point of application, that is used to evaluate the function body. Recursion is allowed in local function definitions but not in local value definitions.

Expressions

We give the basic syntax of expressions here, leaving operator sections (Section), list comprehensions (Comprehension), let expressions, and function expressions to be described later.

Primary    = Name | Number | String
           | Name { "(" [ Expression { "," Expression } ] ")" }
           | "(" Expression ")" 
           | Section
           | "[" [ Expression { "," Expression } ] "]"
           | "[" Expression ".." Expression "]"
           | Comprehension
Factor     = ( "-" | "~" ) Factor | Primary 
Term0      = { Factor ":" } Factor
Term1      = Term0 { ( "*" | "/" | "$" ) Term0
Term2      = Term1 { ( "+" | "-" | "&" ) Term1 }
Term3      = { Term2 "++" } Term2
Term4      = Term3 { ( "=" | "<" | "<=" | "<>" | ">" |  ">=" ) Term3 }
Term5      = Term4 { and Term4 }
Term       = Term5 { or Term5 }
BasicExpr  = Term | if Term then BasicExpr else BasicExpr
Expression = BasicExpr | LetExpr | FunctionExpr

Primary expressions

The smallest expressions are names, which denote the value to which they are bound in the environment; number tokens, which denote a numeric constant; and string tokens, which denote a constant string. Any expression may appear in parentheses as an operand in a larger expression.

An expression [e1, e2, ..., e_n] denotes a list made up of the n values that are denoted by the expressions e1, e2, ... e_n. This list is constructed using the empty list (denoted []) by n applications of the construction operator ":".

If x and y are integers, then the expression [x..y] stands for the list [x, x+1, x+2, ..., y], equivalent to _range(x, y), where range is the function defined as follows.

define _range(x, y) = if x > y then [] else x:_range(x+1, y);

Function application

An expression f(e1, e2, ..., e_n) denotes the application of a function f to the n arguments e1, ..., e_n. The number of arguments must match the number expected by f. The arguments and the function f are first evaluated. If f is a function that has been defined as part of the GeomLab program, then the argument values are matched with the patterns that appear in the definition of f, and the value of the application is the value yielded by the appropriate function body. Other functions are implemented as primitives in the initial environment of the GeomLab program, and deliver a result that cannot be expressed as the value of another expression.

Unary and binary operators

Simple expressions may be combined with various unary and binary operators to form more complex expressions; these operators have the binding powers that are implied by the syntax rules above. All binary operators associate to the left, except for ":" and ++, which associate to the right. An expression written with a prefix or infix operator is just an abbreviation for the application of the same operator as a function. Thus, the expression x + y is an abbreviation for the binary function application (+)(x, y); this expression uses the name (+) for the built-in function that adds numbers together. All the operators of GeomLab are bound to different primitive functions in the initial environment.

An exception to this rule is made in the case of the operators and and or, which are evaluated in a 'short-circuit' fashion: thus, in the expression e1 and e2, the sub-expression e1 must yield a Boolean value. If this value is true, then the value of the expression is whatever value is yielded by evaluating e2; otherwise the value of the expression is false. Similarly, the value of e1 or e2 is the value of e2 if e1 yields false, and otherwise it is true.

This means that e1 and e2 is equivalent to the conditional expression

if e1 then e2 else false

and e1 or e2 is equivalent to

if e1 then true else e2.

Conditional expressions

An expression if e1 then e2 else e3 is evaluated by first finding the value of e1. This should be a Boolean value, either true or false; an error is reported if it is not. Then either e2 or e3 is chosen for evaluation, depending on the Boolean value, and the value of the chosen expression becomes the value of the whole conditional expression. The other sub-expression is not evaluated.

Patterns

Patterns are used in function definitions (see Section 6) and function expressions (see Section 8) to match the arguments of a function. An attempt to match a pattern against a value may either succeed or fail; if it succeeds, then the names that appear in the expression become bound to parts of the original value. A name may appear more than once in a pattern or list of patterns; in that case, matching fails unless the values that it matches are all equal. The equality test that is applied is the same as the one used for the = operator.

PattPrimary = Ident | "_" 
            | [ "-" | "~" ] Number 
            | String
            | "(" Pattern ")" 
            | "[" [ Pattern { "," Pattern } ] "]"
            | Ident "(" [ Pattern { "," Pattern } ] ")"
PattFactor  = PattPrimary { ":" PattPrimary }
Pattern     = PattFactor { "+" Number }

Primary patterns

The simplest patterns are names, which match any value and bind the name to it; the anonymous pattern _ which matches any value but does not bind a name; positive and negative numbers and strings, which match the single, constant values denoted by the number or string. Any pattern may also be enclosed in parentheses and used as a primary part of a larger pattern.

A list pattern [p1, p2, ..., pn] matches any list of length n whose elements are matched by the patterns p1, p2, ... pn respectively.

A constructor pattern c(p1, ..., p_n) is allowed if c denotes a primitive constructor of n arguments. It matches values built with the constructor if patterns p1, ..., p_n match the arguments of the constructor. Primitive constructors defined in the GeomLab library include rgb(r, g, b) for constucting colours, and the commands ahead(d), left(a) and right(a) for constructing turtle commands. Others may be added by plug-in modules.

Cons patterns

A "cons" pattern p1:p2 matches any non-empty list whose head is matched by p1 and whose tail is matched by p2.

Plus patterns

A "plus" pattern has the form p+n, where n is a number. It matches a number x if n > 0 and the difference y = x - n is an integer such that y >= 0 and the pattern p matches y.

Definitions

Definitions appear in define paragraphs to add a definition to the global environment, and also in let expressions to define a name locally to an expression.

Definition = ValueDef | FuncDef

Value definitions

ValueDef = Name "=" Expression

A value definition defines a name as standing for the value of a certain expression. The expression is evaluated immediately, and the name becomes bound to its value.

Function definitions

FuncDef = Clause { "|" Clause }

Clause  = Name Formals "=" Expression [ when Expression ]

Formals = "(" [ Pattern { "," Pattern } ] ")"

A function definition defines a name as standing for a function. The function is defined by a sequence of clauses, each containing a list of patterns that are matched against the arguments of the function, and expression that gives the corresponding value yielded by the function, and optionally a boolean-valued guard expression after when that specifies a condition under which the clause applies. For a function definition to be syntactically valid, all the clauses must contain the same function name, and all must contain the same number of argument patterns, the number of arguments that is expected by the function.

When the function is applied to arguments, the clauses in the definition are considered in order, and the first applicable one determines the value that is yielded by the application. To apply a clause, the patterns in the clause are first matched with the incoming arguments. If they all match, then the guard (if any) is evaluated; if the value of the guard is false, then the clause does not apply. Finally, the right-hand side expression is evaluated, and its value becomes the value yielded by the function application. If any guard that is evaluated fails to return a Boolean result, or if no clause is applicable to the arguments, then the evaluation fails.

The definition of a function may have several clauses, each one matching a different pattern of arguments. For example, here is the definition of a function pow(a, b) that computes a raised to the power b:

define pow(a, b) = a * pow(a, b-1) when b > 0
  | pow(a, 0) = 1

The first clause deals with the case where b > 0, defining pow(a, b) in terms of pow(a, b-1); the second clause deals with the case where b = 0, giving the result directly, and providing a place for the recursion to stop. (The function is not defined at all if b < 0.)

The first clause in this definition has patterns (a and b) that will match any arguments that are supplied, but the guard b > 0 rules out those where b <= 0. The second clause matches those argument lists where the second argument is equal to 0.

Paragraphs

Paragraph = Expression [ ";" ] 
          | define Definition [ ";" ]

A program in the GeomLab language consists of a sequence of paragraphs that are entered at the top-level prompt or read from one or more text files. Each paragraph is either an expression to be evaluated in the current global environment, or a definition that adds to that environment.

When paragraphs are written on a file for use with the File/Load command of GeomLab, each paragraph must end with a semicolon. The semicolon can be omitted when paragraphs are entered at the interactive prompt.

Additional forms of expression

There are three additional forms of expression -- let expressions, function expressions and operator sections -- that are not needed for the worksheets, but are useful in more advanced programming. These forms of expression do not add to the expressive power of the language, but they do make some kinds of program easier to write.

let expressions

LetExpr = let Definition in Expression

An expression let d in e allows the name defined by the definition d to be used in the expression e. For example, the value of the expression

let y = x + 1 in y * y

is the square of whatever is the value of x + 1. The advantages of using a let expression is that it is often clearer to do so, and sometimes shorter and more efficient than writing out the expression and substituting the right-hand side of the definition for the left-hand side, like this:

(x + 1) * (x + 1)

Also, let expressions can be used to define functions that are local to a single expression.

function expressions

FunctionExpr = function Formals Expression

A function expression denotes a function that is defined by a single clause with no guard. A function expression

function (p1, p2, ..., pn) e

denotes the same function f as is defined by the definition

(p1, p2, ..., pn) = e

There is no need, however, to invent a fresh name f in order to write the function as a function expression. Lambda expressions are mainly used in more advanced programming to specify arguments to higher-order functions.

Operator sections

An operator symbol in parentheses, such as (+), denotes the two-argument function performed by the operator. This is useful with higher-order functions like this:

define sum(xs) = foldl((+), 0, xs)

A binary operator symbol may also appear in parentheses together with one or other of its operands and denotes a function of one argument. Thus (+1) is the function that adds one to its argument, and (12/) is the function that divides 12 by its argument.

(Concrete syntax for operator sections is missing from this summary.)

Operator sections are implemented in terms of two functions _lsect and _rsect, so that (12/) is equivalent to _lsect((/), 12) and (+1) is equivalent to _rsect((+), 1), where _lsect and _rsect are defined as follows.

define _lsect(f, x) = function (y) f(x, y);
define _rsect(f, x) = function (x) f(x, y);

=List comprehensions

List comprehensions provide compact notation for expressions that would otherwise involve higher-order functions such as map and filter.

Primary     = "[" Expression "|" 
                  Generator { "," Generator | "when" Expression } "]"
Generator   = Pattern "<-" Expression

For example, the comprehension [ f(x) | x <- xs ] is equivalent to map(f, xs), the compreshension [ x | x <- xs when p(x) ] is equivalent to filter(p, xs), and the compreshension [ x | xs <- xss, x <- xs ] is equivalent to concat(xss).

List comprehensions are implemented in terms of a single function _mapa, defined by

define _mapa(f, [], a) = a
  | _mapa(f, x:xs, a) = f(, _mapa(f, xs, a))

A list comprehension [ E | G1, G2, ... ] is defined as equivalent to the expression T(E, [G1, G2, ...], []), where the transformation T is defined as informally as follows.

T(E, [], a) = "E : a"
T(E, ["x <- E1", ...], a) =
  "_mapa(function (x, b) T(E, [...], b), E1, a)"
T(E, ["when E2", ...], a) =
  "if E2 then T(E1, [...], a) else a"

(The names x and b introduced by this translation are replaced by other names that do not occur elsewhere in the program.)

Syntax summary

Paragraph   = Expression [ ";" ] 
            | define Definition [ ";" ]
Definition  = ValueDef | FuncDef
ValueDef    = Name "=" Expression
FuncDef     = Clause { "|" Clause }
Clause      = Name Formals "=" Expression [ when Expression ]
Name        = Ident | Operator
Expression  = Term 
            | if Term then BasicExpr else BasicExpr
            | let Definition in Expression
            | function Formals Expression
Term        = Term5 { or Term5 }
Term5       = Term4 { and Term4 }
Term4       = Term3 { ( "=" | "<" | "<=" | "<>" | ">" |  ">=" ) Term3 }
Term3       = { Term2 "++" } Term2
Term2       = Term1 { ( "+" | "-" | "&" ) Term1 }
Term1       = Term0 { ( "*" | "/" | "$" ) Term0
Term0       = { Factor ":" } Factor
Factor      = ( "-" | "~" ) Factor | Primary
Primary     = Ident | Number | String
            | Ident { "(" [ Expression { "," Expression } ] ")" }
            | "(" Expression ")" 
            | "[" [ Expression { "," Expression } ] "]"
            | "[" Expression ".." Expression "]"
Formals     = "(" [ Pattern { "," Pattern } ] ")"
Pattern     = PattFactor { "+" Number }
PattFactor  = PattPrimary { ":" PattPrimary }
PattPrimary = Ident | "_" 
            | [ "-" | "~" ] Number 
            | String
            | "(" Pattern ")" 
            | "[" [ Pattern { "," Pattern } ] "]"
Operator    = and | div | mod | not | or | "=" | "+" | "-" 
            | "$" | "*" | "/" | "&" | "~" | ":" | "@" 
            | "<" | "<=" | "<>" | ">" | ">="