Chapter 24: Formal syntax
24.1 : Source
Aldor source is a collection of lines containing a textual representation of a program.
Lines beginning with the character ``#'' are system commands and are not part of the program text.
24.1.1 : Source inclusions
Source inclusion collects the source lines which make up a program. This process is controlled by the following system commands:
#include "file-name"
#reinclude "file-name"
Once a file has been included, further include commands for that file have no effect.
A reinclude command always includes the file, whether or not it has already been included.
The includer tries to find the file relative to the directory of the current source file and then in a sequence of user-specified and platform-specific directories.
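The bookkeeping described above can be modelled in a few lines of Python. This is only a sketch, not the Aldor includer: the class name `Includer`, the `search_dirs` parameter (standing in for the user-specified and platform-specific directories) and the `force` flag (standing in for reinclude) are all hypothetical, and it assumes that a repeated include of an already included file yields no lines.

```python
from pathlib import Path


class Includer:
    """Toy model of include / reinclude bookkeeping (hypothetical, not Aldor's code)."""

    def __init__(self, search_dirs=()):
        # Assumed stand-in for the user-specified and platform-specific directories.
        self.search_dirs = [Path(d) for d in search_dirs]
        self.seen = set()

    def resolve(self, name, current_file):
        # Try the directory of the current source file first, then the search path.
        candidates = [Path(current_file).parent / name]
        candidates += [d / name for d in self.search_dirs]
        for path in candidates:
            if path.is_file():
                return path.resolve()
        return None

    def include(self, name, current_file, force=False):
        """Return the file's lines, or [] if it was already included.

        force=True models reinclude, which always includes the file.
        """
        path = self.resolve(name, current_file)
        if path is None:
            raise FileNotFoundError(name)
        if path in self.seen and not force:
            return []
        self.seen.add(path)
        return path.read_text().splitlines()
```

Keying the `seen` set on the resolved path rather than on the name as written is a design choice of the sketch, so that the same file reached through two different relative names is still treated as one file.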
#assert identifier
#unassert identifier
#if identifier
#elseif identifier
#else
#endif
#! text
#
Lines which begin with ``#!'', or which begin with ``#'' and contain only white space, are ignored.
24.1.2 : Prepared source
The following commands allow source to be prepared by another program.
#line line-number ["file-name"]
#error text
24.1.3 : Other system commands
Other system commands control the environment:
#pile
#endpile
``#pile''/``#endpile'' and ``{''/``}'' pairs.
Closing ``#endpile'' lines at the end of the program may be omitted.
#library identifier "file-name"
``#libraryDir''.
#includeDir "directory-name"
#libraryDir "directory-name"
#int options
#quit
The quit command causes the language processor to abandon the program. If Aldor is running in interactive mode (-g loop), then this command causes the termination of the interactive session.
24.2 : Lexical structure
24.2.1 : Characters
The standard Aldor character set contains the following 97 characters:
a-z A-Z
0-9
( | left parenthesis   | ) | right parenthesis    |
[ | left bracket       | ] | right bracket        |
{ | left brace         | } | right brace          |
< | less than          | > | greater than         |
, | comma              | . | period               |
; | semicolon          | : | colon                |
? | question mark      | ! | exclamation mark     |
= | equals             | _ | underscore           |
+ | plus               | - | minus (hyphen)       |
& | ampersand          | * | asterisk             |
/ | slash              | \ | back-slash           |
' | apostrophe (quote) | ` | grave (back-quote)   |
" | double quote       | | | vertical bar         |
^ | circumflex         | ~ | tilde                |
@ | commercial at      | # | sharp                |
$ | dollar             | % | percent              |
24.2.2 : The escape character
Underscore is used as an ``escape'' character, which alters the meaning of the following text. The nature of the change depends on the context in which the underscore appears.
An escaped underscore is not an escape character.
An escape character followed by one or more white space characters
causes the white space to be ignored.
The remainder of this section assumes that escaped white space has been
removed from the source.
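As an illustration of the rule for escaped white space, the following Python sketch removes an escape character together with the run of white space that follows it, while leaving an escaped underscore untouched. The function name `drop_escaped_whitespace` is hypothetical, and the sketch deliberately ignores the fact that underscores are not escape characters inside comments and documentation.

```python
def drop_escaped_whitespace(src: str) -> str:
    """Remove each escape character that is followed by white space, plus that white space."""
    out = []
    i = 0
    n = len(src)
    while i < n:
        ch = src[i]
        if ch != "_":
            out.append(ch)
            i += 1
        elif i + 1 < n and src[i + 1] == "_":
            # Escaped underscore: keep both characters; the second is not an escape.
            out.append("__")
            i += 2
        elif i + 1 < n and src[i + 1].isspace():
            # Escape followed by white space: drop the escape and the whole run.
            i += 1
            while i < n and src[i].isspace():
                i += 1
        else:
            # Any other escape is left in place for later lexing stages.
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, a long name split over two lines as `foo_` followed by an indented `bar` comes out as `foobar`, whereas `a__ b` is unchanged because the underscore before the blank is itself escaped.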
24.2.3 : Tokens
The sequence of source characters is partitioned into tokens. The longest possible match is always used.
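The longest-match rule can be illustrated with a small Python sketch restricted to a handful of the symbolic tokens listed in this section. The table `SYMBOLS` is only a subset chosen for the example, and the function name `split_symbols` is hypothetical; a real lexer would of course also handle words, literals and comments.

```python
# A subset of the symbolic keywords and operators listed in this section.
SYMBOLS = [":", "::", ":*", ":=", "=", "==", "==>", "=>",
           "+", "-", "+-", "+->", "->", "<", "<=", "<<", "<-"]

# Trying longer candidates first implements "the longest possible match".
BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def split_symbols(text):
    """Split a run of symbol characters into tokens, always taking the longest match."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for sym in BY_LENGTH:
            if text.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"no symbol matches at position {i}: {text[i]!r}")
    return tokens
```

With this table, `+->` is read as one token rather than as `+-` followed by `>`, and `:==` is read as `:=` followed by `=`, since no longer token starts at the first character.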
The tokens are classified as follows:
add and break default define do else export extend for fluid free from generate goto has if import in inline iterate local macro never not of or pretend repeat return then to where while with yield . , ; : :: :* := == ==> +-> | => ' $ @ ( ) [ ] { }
The characters in a keyword cannot be escaped. That is, if a character is escaped, the token is not treated as a keyword.
always assert but delay except fix is isnt let rule select try ` & || (| |) [| |] {| |}
by case mod quo rem # + - +- ~ ^ * ** .. = ~= ^= / \ /\ \/ < > <= >= << >> <- ->
The characters in an operator cannot be escaped.
0
1
[%a-zA-Z][%?!a-zA-Z0-9]*
``a'', ``_#'', ``a_*'' and ``_if'' are all identifiers.
The escape character is not part of the identifier, so ``ab'' and ``_a_b'' represent the same identifier.
Identifiers are the only tokens for which the leading character may be
escaped.
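The identifier rules can be checked with a short Python sketch. It assumes that escaped white space has already been removed, and that an escaped character (an underscore followed by any character) may stand wherever an ordinary identifier character may; that second point is an assumption drawn from the examples, not a rule stated in the text. The names `PLAIN`, `ESCAPED` and `identifier_name` are hypothetical, and the sketch does not check whether an unescaped word is a keyword.

```python
import re

# The unescaped identifier pattern, as given in this section.
PLAIN = re.compile(r"[%a-zA-Z][%?!a-zA-Z0-9]*")

# Assumption: "_" followed by any character may appear wherever
# a plain identifier character may, including the leading position.
ESCAPED = re.compile(r"(?:[%a-zA-Z]|_.)(?:[%?!a-zA-Z0-9]|_.)*", re.DOTALL)


def identifier_name(token):
    """Return the identifier a token denotes, or None if it is not an identifier."""
    if not ESCAPED.fullmatch(token):
        return None
    # The escape character is not part of the identifier:
    # drop each escape but keep the character it escapes.
    return re.sub(r"_(.)", r"\1", token, flags=re.DOTALL)
```

Under these assumptions `identifier_name("ab")` and `identifier_name("_a_b")` both give `ab`, and `_if` gives an identifier spelled `if` rather than the keyword.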
`"'[^"]*`"'
An underscore or double quote may be included in a string-style literal by escaping it.
[2-9]
[0-9][0-9]+
[0-9]+`r'[0-9A-Z]+
Escape characters are ignored in integer-style literals and so may be used to group digits.
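A Python sketch of the integer-style patterns, with grouping escapes removed before matching, might look as follows. The names `INTEGER` and `integer_literal_text` are hypothetical. The sketch leaves out 0 and 1, which this section lists as separate tokens, and it rejects a leading escape because only identifiers may have their leading character escaped.

```python
import re

# The three integer-style forms: one digit 2-9, two or more digits, or a radix form.
INTEGER = re.compile(r"[2-9]|[0-9][0-9]+|[0-9]+r[0-9A-Z]+")


def integer_literal_text(token):
    """Return the literal with grouping escapes removed, or None if it is not integer-style."""
    # A leading escape is only allowed for identifiers, and "__" is an
    # escaped underscore (a real character), not a digit separator.
    if token.startswith("_") or "__" in token:
        return None
    digits = token.replace("_", "")
    return digits if INTEGER.fullmatch(digits) else None
```

For instance, `1_000_000` is accepted and reduces to `1000000`, and `16rFF` is accepted as a radix form; the sketch only classifies the text and does not compute a numeric value.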
[0-9]*`.'[0-9]+{[eE]{[+-]}[0-9]+}
[0-9]+`.'[0-9]*{[eE]{[+-]}[0-9]+}
[0-9]+[eE]{[+-]}[0-9]+
[0-9]+`r'[0-9A-Z]*`.'[0-9A-Z]+{e{[+-]}[0-9]+}
[0-9]+`r'[0-9A-Z]+`.'[0-9A-Z]*{e{[+-]}[0-9]+}
[0-9]+`r'[0-9A-Z]+`e'{[+-]}[0-9]+
Escape characters are ignored in floating point-style literals and so may be used to group digits.
Certain lexical contexts restrict the form of floats allowed.
This distinguishes cases such as sin 1.2 vs m.1.2.
A floating point literal may not
begin with ``.'', unless the preceding token is a keyword other than ``)'', ``|)'', ``]'' or ``}'',
contain ``.'', if the preceding token is ``.'',
end with ``.'', if the following character is ``.''.
``--'' and all characters up to the end of the line.
Underscores are not treated as escape characters in comments.
``++'' and all characters up to the end of the line.
Underscores are not treated as escape characters in documentation.
BACKSET
BACKTAB
``abc'' is taken as one token but ``a b c'' is taken as three.
24.3 : Layout
Normally page layout is not significant in an Aldor program. Within a ``#pile''/``#endpile'' pair, however, indentation and newlines have meaning, and are used to group collections of lines.
Source within such a pair is in a piling context.
Indentation sensitivity may be turned off by enclosing source in a ``{''/``}'' pair.
Within braces all white space (leading, embedded, and newlines) is ignored.
This is a non-piling context.
Piling and non-piling contexts may be nested.
The layout of a program in a piling context is understood by converting leading white space and newlines to special markers which are part of the language syntax.
This conversion follows the linearization rules:
``then'', ``else'', ``with'' or ``add''.
``('', ``['', ``{'' or ``,'', or the second begins with ``in'', ``then'', ``else'', ``)'', ``]'' or ``}''.