Chapter 4: Basic syntax
4.1 Syntax components
4.2 Escape character
4.3 Keywords
4.4 Names: identifiers and operators
4.5 Comments and descriptions
4.6 Application syntax
4.7 Grouping
4.8 Piles
4.1 : Syntax components
An Aldor program consists of a series of lines of text.
These lines of text may be stored in a single file, or gathered
from several files, or typed in as interactive input.
Some lines are not part of the Aldor program proper, but instead
control its composition and the environment in which it is handled.
These lines are called system commands. A system command is a line which has a hash character "#" as its first
character. (Note that no white space may precede the "#" on the
line.)
The example programs in this chapter use the following system commands:
#include "filename.as"
#pile
#endpile
The system command #include "filename.as" causes the lines of text
from ``filename.as'' to be inserted into the Aldor program in place
of the #include command.
The system commands #pile and #endpile are used to enclose
lines of text in which indentation is used to determine the nesting of
sequences of Aldor expressions. (See section 4.8.)
A complete list of system commands is given in section 24.1.
System commands used in the interactive interpreter are described in section 17.2.
When the series of lines comprising an Aldor program has been
collected together, these lines are interpreted as a series of words,
or tokens.
There are several classes of tokens, each of which has a different meaning:
Literals, such as 42, 1.414 and "Urania riphaeus", represent explicit values. Literals are described in section 5.2.
The exact rules for the syntax of each of these token classes are given
in section 24.2.3.
4.2 : Escape character
The underscore character "_" is used as an escape character
in Aldor to modify the interpretation of the characters which follow.
For example, an escape character followed by any amount of white space
(spaces, tabs, and newlines) causes the white space to be ignored, allowing
the characters on either side of the white space to form a single token,
such as a name or a literal.
Section 5.1 describes how the escape character can be used
inside an identifier, and section 5.2 describes
how the escape character can be used inside a literal.
4.3 : Keywords
The basic components of any Aldor program can be separated into
two broad categories: those which are defined by the language, and
those which may be defined or redefined by the program.
For example, the meaning of the word "if" is defined by the
language, and all "if" statements behave according to
the same rules.
On the other hand, the meaning of a name such as "a" or "9"
or "+" is determined by the program in which it is used.
A keyword in Aldor is a word whose meaning is fixed by the
definition of the language. The following words are keywords which
may not be redefined:
add and always assert break but default define delay do else except export extend fix for fluid free from generate goto has if import in inline is isnt iterate let local macro never not of or pretend ref repeat return rule select then to try where while with yield . , ; : :: :* $ @ | => +-> := == ==> ' [ ] { } ( )
Generally, language-defined aspects of keywords offer protocols
which allow them
to work with new types as well as with language-defined types.
So, for example, the language-defined "if" provides a way for
the condition to be an expression which evaluates to any type, provided
that type has certain properties.
The following keywords are meaningless in the current language
definition, but are reserved for future language extensions.
always assert but delay except fix is isnt let rule select try (| |) [| |] {| |} ` & ||
4.4 : Names: identifiers and operators
A name is an identifier used to denote a variable or a constant.
Most names begin with a letter or the character "%" and are made up of
letters, digits and the characters "%", "?" and "!".
The words "0" and "1" are also treated as names in Aldor
so that mathematical structures can export identity elements without
having to support integer literals. (See section 5.2.)
Examples:
mylist Integer empty? set! %5
When used in an identifier, the escape character is not included in the name
of the identifier. To include a single underscore character in the name
of an identifier, the sequence "__" must be used. So the name of the
identifier denoted by "mod_+" is "mod+", and the name of the
identifier denoted by "My__Integer" is "My_Integer".
A sequence of letters which would otherwise be considered a keyword
(such as "if") can be treated as a name by escaping one of its
constituent letters (as in "_if").
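The escape rules for identifiers can be sketched in a short program. This is only an illustration under stated assumptions: it assumes the libaldor library (the include files "aldor" and "aldorio", the type MachineInteger, and the names stdout and newline), none of which is described in this chapter, and the two constants are invented for the example.

```aldor
#include "aldor"
#include "aldorio"

-- Assumption: MachineInteger is the machine integer type of libaldor.
import from MachineInteger;

-- The source text "My__Integer" denotes the identifier named My_Integer.
My__Integer: MachineInteger == 42;

-- Escaping a letter lets the keyword spelling "if" serve as a name.
_if: MachineInteger == 7;

-- Both escaped names are used here like any other identifier.
stdout << My__Integer + _if << newline;
```

The same escaped spelling must be used wherever the name is written, since the unescaped text "if" would be read as the keyword.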
Certain names are treated as having special syntax properties by the language.
The following identifiers can be used as infix operators, prefix operators,
or both:
by case mod quo rem # + - +- ~ ^ * ** .. = ~= ^= / /\ < <= << <- \ \/ > >= >> ->
Aside from their syntactic properties,
these names behave just as other identifiers.
See section 4.6 for examples of using infix operators
in different contexts.
A few naming conventions are used in the standard libraries:
Note that these are only notational conventions and are not considered
as part of the language.
4.5 : Comments and descriptions
Comments and description strings
annotate a program to help other people and other programs understand it.
A comment begins with the two characters
"--" and continues until the end of the current
line of text.
Comments can be used to describe how a program operates, including
an explanation of special assumptions made by the program, or a
step-by-step description of the implementation of the algorithms
used by the program.
Comments are not examined by the compiler, and do not affect the meaning
of a program.
A description begins with two or three plus characters
("++" or "+++")
and continues until the end of the current line of text.
A description should be used to describe the external characteristics
of a program, such as the parameters it will accept or the method used
to compute the result.
Description strings are saved in the compiler output in a form
accessible by other programs.
If a description begins with three plus characters ("+++"),
then the name it describes should appear immediately after the description.
If a description begins with only two plus characters ("++"),
then the name it describes should appear immediately before the description:
+++ An approximation to Euler's constant,
+++ which is defined as the value of the limit
+++
+++    lim(n->infinity) (1 + 1/2 + 1/3 + ... + 1/n - ln n)
+++
gamma: DoubleFloat == 0.57721_56649_01532_86060_65121;
+++ `pi' is the ratio of a circle's circumference to its diameter.
pi: DoubleFloat == 3.14159_26535_89793_23846_26434; ++ This is not 22/7.
Avogadro: DoubleFloat == 6.022e23;
++ The ratio between grams and molecular weights.
Both "+++" and "++" are used so that after a semicolon we can
still associate a description with the previous declaration.
It is easy to remember the difference between comments and descriptions:
the Aldor compiler keeps positive remarks, and throws away negative ones.
Example:
-- This is a quick-and-dirty move generator, with two of
-- the utility functions also made visible.
ChessPiece: with {
        bestMoves: (Board, %) -> MoveTemplate;
                ++ `bestMoves p' suggests the best moves in the
                ++ given position.
        legalMoves: (Board, %) -> MoveTemplate;
                ++ `legalMoves p' generates quasi-legal moves.
                ++ It does not handle en passant or castling.
        value: (Board, %) -> DoubleFloat;
                ++ This is a score which estimates the current
                ++ value in the given position.
}
== add {
        ...
}
4.6 : Application syntax
Applications are typically used to denote function calls, array indexing, or element accessors for compound data types.
A prefix application typically has the following form:
f(a1, ..., an)
There are two additional forms for specifying a prefix application to one argument: juxtaposition and an infix dot.
f a
f.a
The second of these forms is completely equivalent to f(a); the first is equivalent in a free-standing occurrence but associates differently -- to the right, rather than the left:
f a b c       -- is equivalent to (f (a (b c)))
f.a.b.c       -- is equivalent to (((f.a).b).c)
f(a)(b)(c)    -- is equivalent to (((f(a))(b))(c))
Any application in which the argument is enclosed in parentheses
("( )") or square brackets ("[ ]") is treated as being of
the ``typical'' form, and so associates to the left -- even if a
space follows the applied object. Thus "first [1,2,3]" is treated
as identical with "first([1,2,3])".
The interpretation of mixed forms is determined by precedence rules:
the precedence of juxtaposition is lower than that of the other forms,
which are all equivalent, so an expression such as
"f(a).2(b)(c).x y" is associated as
"((((((f(a)).2)(b))(c)).x) y)". (A complete table of Aldor
operator precedence appears in section 4.7.1.)
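The three prefix application forms can be tried side by side. The following is a minimal sketch, not a definitive program: it assumes the libaldor library (the include files "aldor" and "aldorio", the type MachineInteger, and the names stdout and newline), and the names double and three are made up for the illustration.

```aldor
#include "aldor"
#include "aldorio"

-- Assumption: MachineInteger is the machine integer type of libaldor.
import from MachineInteger;

-- double and three are hypothetical names used only in this example.
double(n: MachineInteger): MachineInteger == n + n;
three: MachineInteger == 3;

stdout << double(three) << newline;         -- parenthesised form
stdout << double three << newline;          -- juxtaposition
stdout << double.three << newline;          -- infix dot

-- Juxtaposition associates to the right, so this is double(double three).
stdout << double double three << newline;
```

In each output line the application groups before "<<", because every application form has a higher precedence than "<<".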
Infix operators are applied to a pair of arguments using infix notation
for function application:
a + b -- infix notation for a call to `(+)(a, b)'
Infix operators are generic in that they can be given definitions in Aldor programs just as other identifiers. The typical form for an infix function definition is as follows:
(s1: S1) op (s2: S2) : T == E
where op is one of the infix operators listed in section 4.4.
An infix operator can be used in any context where other names can be used.
However, in some contexts the infix operator must be enclosed in parentheses
to suppress its normal syntactic properties:
-- Here is a declaration for `*'.
*: (%, %) -> %
-- Here `*' is used as a variable.
* := myMultiplicationMethod
-- Here is a typical use of `*' as an infix operator.
a * b
-- Here `*' is used as a prefix operator
-- by enclosing it in parentheses.
(*)(a, b)
-- Here `*' is passed as an argument to a function.
reduce(*, l)
-- Here `*' is being used as an argument of another infix operator.
f := (*) + g(*, 1)
An infix operator must be enclosed in parentheses to be used as a prefix
operator. Also, an infix operator cannot appear as an argument of another
infix operator unless it is enclosed in parentheses.
Alternatively, the same name may be given as an identifier,
rather than an infixed operator, using the escape character
to include special characters, for example: _*(a, b).
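The three ways of writing a call to an infix operator can be compared in one small program. This is a sketch under assumptions rather than a reference example: it relies on the libaldor library (the include files "aldor" and "aldorio", the type MachineInteger, and the names stdout and newline), and the constants a and b are invented here.

```aldor
#include "aldor"
#include "aldorio"

-- Assumption: MachineInteger is the machine integer type of libaldor.
import from MachineInteger;

-- a and b are hypothetical constants used only in this example.
a: MachineInteger == 6;
b: MachineInteger == 7;

stdout << a * b << newline;       -- ordinary infix use
stdout << (*)(a, b) << newline;   -- parenthesised, then applied as a prefix function
stdout << _*(a, b) << newline;    -- escaped, so the name is read as a plain identifier
```

All three lines denote a call to the same function; only the surface syntax differs.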
4.7 : Grouping
Complex expressions in Aldor are formed according to the precedence of the operators appearing in the expression. When an expression is formed, the operators with higher precedence form the subexpressions for the operators with lower precedence.
Parentheses or braces may be used to group expressions explicitly, as in this example:
foo(x: Integer, y: Integer): Integer == {
        a := x * y;
        b := a * a;
        b
}
The meaning of an expression is the same whether braces or parentheses
are used. Braces are normally used to enclose a longer expression
(especially sequences) split over several lines.
Parentheses are normally used to enclose shorter expressions
(especially multiple values; see section 5.8)
as part of other expressions.
An implicit semicolon is assumed, if possible, after a closing brace.
This is determined by whether the following token may start a new
expression. For instance, in the construct
f: with {...} == add ...
introduced in section 7.9, the "==" may not start an expression, so no semicolon is assumed.
To make the use of braces as natural as possible, an expression
in braces may not be used as an argument to an infixed operator,
e.g. "+", "-", "..".
This is because many infixed operators may also be used in prefixed
position. (Some, incidentally, may also be postfixed.) With
infixed operators, parentheses may be used to achieve the
desired effect -- for example:
x := ({a; b; c}) + d
y := (a; b; c) + d
4.7.1 : Precedence
Figure 4.1 provides a table of keywords and
operators,
given in order of syntactic precedence.
Expressions are represented
by ``e'' and keywords or operators by ``o''.
Each of the numbered entries in the table lists syntactic elements
with the same precedence.
The entries at the top of the table
group most loosely, and those at the bottom most strongly.
So, for example, since "+" is above "*", the expression
a * b + c * d
groups as "(a*b) + (c*d)".
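The grouping just described can be checked with a short program. This is a minimal sketch that assumes the libaldor library (the include files "aldor" and "aldorio", the type MachineInteger, and the names stdout and newline); the constants a, b, c and d are invented for the illustration.

```aldor
#include "aldor"
#include "aldorio"

-- Assumption: MachineInteger is the machine integer type of libaldor.
import from MachineInteger;

-- a, b, c and d are hypothetical constants used only in this example.
a: MachineInteger == 2;
b: MachineInteger == 3;
c: MachineInteger == 4;
d: MachineInteger == 5;

-- "*" is below "+" in the table, so it groups more strongly:
-- these two lines denote the same expression.
stdout << a * b + c * d << newline;
stdout << (a * b) + (c * d) << newline;

-- Parentheses are needed to obtain any other grouping.
stdout << a * (b + c) * d << newline;
```

The operator "<<" appears above both "+" and "*" in the table, which is why the arithmetic is grouped before anything is written to the output.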
Entries with the same level number have the same precedence.
For instance, ``*'' and ``/'' have the same grouping strength.
Some operators have both unary and binary forms.
Some of these operators have meanings defined in the standard base
libraries (infixed "+" and "*"), while others do not
(prefixed "=" and "+-").
A programmer may provide new meanings for operators, but not for keywords.
Entries for operators are flagged with ``[+]''; keyword entries are unflagged.
This table can serve as a convenient reference for determining relative
strengths of keywords and operators.
The full details of the language syntax are given in chapter 24.
Fig 4.1: Keyword and operator precedence
          Keyword/Operator               Associativity  Unary
 1.       ;                              (e o e) o e    ---
 2.       default define export extend   ---            ---
          fluid free import inline       ---            ---
          local macro                    ---            ---
 3.       ,                              (e o e) o e    ---
 4.       where                          (e o e) o e    ---
 5.       := == ==> +->                  e o (e o e)    ---
 6.       break do generate goto         ---            ---
          if iterate never repeat        ---            ---
          return yield                   ---            ---
          =>                             e o (e o e)    ---
 7.       for while                      ---            ---
 8.       and or                         (e o e) o e    ---
      [+] /\ \/                          (e o e) o e    ---
 9.   [+] = ~= ^= >= > >> <= < <<        (e o e) o e    o e
      [+] case is isnt                   (e o e) o e    o e
          has                            (e o e) o e    ---
10.   [+] .. by                          (e o e) o e    e o
11.   [+] + - +-                         (e o e) o e    o e
12.   [+] mod quo rem                    (e o e) o e    ---
13.   [+] * / \                          (e o e) o e    ---
14.   [+] ** ^                           e o (e o e)    ---
15.       :: @ pretend                   (e o e) o e    ---
16.   [+] -> <-                          e o (e o e)    ---
17.       $                              e o (e o e)    ---
18.       add with                       (e o e) o e    o e
19.       per rep not ~ #                ---            o e
          A B (juxtaposition)            e o (e o e)    ---
20.       A(B) A[B] A.B                  (e o e) o e    ---
[+] operators (may be overridden)
4.8 : Piles
Programmers often use indentation to make the visual structure of a program conform to its syntactic structure, so that programs are easier to read. In Aldor, white space is usually ignored except to delimit tokens and to compute source position information. However, the compiler can be instructed to use indentation as part of the syntax of the language using a scheme known as piling.
Two system commands are used to instruct the compiler to enable and disable piling syntax, as desired, at various points in an Aldor program. The system command "#pile" instructs the compiler to use piling syntax for the source lines which follow, and the system command "#endpile" instructs the compiler to ignore initial white space on the source lines which follow.
Although the system commands "#pile" and "#endpile" are usually found in pairs, the "#endpile" system command can be omitted at the end of a file.
When piling syntax is being used, indentation is treated roughly as follows (see section 24.3 for full details; more examples of Aldor programs which use piling syntax can be found in figure 18.1).
Expressions which are indented by the same amount are grouped together as a sequence (see section 5.9) as though they were enclosed in braces and separated by semicolons. A sequence of expressions indented by the same amount is called a pile.
An expression which is too large to fit on one line at the current indentation level can be continued on another line by indenting the continuation line more than the initial line.
The indentation rules are applied first to the most indented lines, working outward to the lines which are indented the least.
The following example shows the piling rules being applied to a program which uses piling syntax, to convert it to an equivalent program which does not use piling syntax:
#pile
GetUp()
if Saturday then
        CookBreakfast()
        Eat << Toast << Tomato << Bacon
                << Eggs
HaveShower()
DrinkCoffee()
Because the line "<< Eggs" is indented with respect to the previous line, the two are joined.
#pile
GetUp()
if Saturday then
        CookBreakfast()
        Eat << Toast << Tomato << Bacon << Eggs
HaveShower()
DrinkCoffee()
The "CookBreakfast" and "Eat..." lines form a pile, which can be rewritten as a semicolon separated sequence:
#pile
GetUp()
if Saturday then {
        CookBreakfast();
        Eat << Toast << Tomato << Bacon << Eggs
}
HaveShower()
DrinkCoffee()
And finally the entire program is treated as a pile:
{
GetUp();
if Saturday then {
        CookBreakfast();
        Eat << Toast << Tomato << Bacon << Eggs
}
HaveShower();
DrinkCoffee()
}
Readers wishing to experiment interactively with our examples (by using "aldor -gloop") should note that piling is the default in interactive use. The examples generally should still run correctly if the illustrated layouts are used -- a few may require the addition of braces ("{}"). See chapter 17 for further details.
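A complete file written with piling syntax might look like the following. This is a sketch under assumptions, not a definitive example: it relies on the libaldor library (the include files "aldor" and "aldorio", the types MachineInteger and String, and the names stdout and newline), and the function describe is invented for the illustration. The "#endpile" command is omitted because the piled text runs to the end of the file.

```aldor
#include "aldor"
#include "aldorio"
#pile

-- Assumption: MachineInteger and String are types provided by libaldor.
import from MachineInteger
import from String

-- describe is a hypothetical function used only in this example.
-- The three indented lines form a pile; they are read as though written
--   { n < 0 => "negative"; n > 0 => "positive"; "zero" }
describe(n: MachineInteger): String ==
        n < 0 => "negative"
        n > 0 => "positive"
        "zero"

-- This line has the same indentation as the other outermost lines,
-- so it is one more expression in the outermost pile.
stdout << describe(5) << newline
```

No semicolons are written because each line break between equally indented lines plays that role while piling is enabled.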