Carnegie Mellon University
Research Showcase @ CMU
Computer Science Department School of Computer Science
1978
An Informal Definition of Alphard (Preliminary)
Paul N. Hilfinger
Carnegie Mellon UniversityGary Feldman
Robert Fitzgerald
Izumi Kimura
Follow this and additional works at: http://repository.cmu.edu/compsci
CMU-CS-78-185
(Preliminary)
An Informal Definition of Alphard
Paul Hilfinger
Gary Feldman
Robert Fitzgerald
Izumi Kimura
Ralph L. London
KVS Prasad
VR Prasad
Jonathan Rosenberg
Mary Shaw
Wm. A. Wulf (editor)
February 12, 1978
Computer Science Department
Carnegie-Mellon University
Pittsburgh Pa., 15113
This work was supported in part by the Advanced Research Projects Agency cf the Department of Defense under contract: DAHC 15-72-C-0308 and F44620-73-C-0074 ( which is monitored by the Air Force Office of Scientific Research), and in part by the National Science Foundation Grant OCR 74-04187.
An Informal Definition of Alphard
Preface
An Informal Definition of Alphard i
An Informal Definition of Alphard ii
Apx A: Collected Syntax 40
Apx B: Standard Prelude 43
B.1. Primitive Prelude 43
B.2. Standard Prelude 43
B.3. Implementation Prelude 44
Apx C: Special Identifier Assumptions 46
C.1. Generator routines 46
C.2. Extensible routines 47
Apx D: A Complete Example 48
Apx E: Proof Rules 51
Chapter 1
Introduction
Many aspects of programming languages have become fairly standard in the past decade. Alphard constructs are intentionally similar in style and meaning to the analogous constructs in other languages. In particular, we have leaned heavily on the Algol-Pascal culture: the syntax of expressions, variable declarations, procedures, and so on, are all derived from this culture. The following, for example, is a fragment of a valid Alphard program and is obviously similar to Pascal:
We expect that the similarity between the Alphard constructs and the analogous ones in other languages will aid both the reader of this report and the programmers who use the language.
There are, however, a number of aspects of the language which differ significantly from many traditional languages. This section provides brief notes on these aspects of Alphard. It is, in effect, a list of points at which the reader should be aware that things may not be as expected.
1. Type Definitions: The programmer may define a new type through a construct called a form.
The form permits both the specification of the abstract properties of (objects of) the new
type and the implementation of that type in terms of pro-existing types. Type definitions
(forms) may be parameterized; in particular they may accept other form names as
parameters. Such forms are called "generic" and define a class of types (e.g., array(T).
where T is a type, defines array-of-integer, array-of-real, array-of-set-of-integer, etc.).
2. Primitive types: Integer, real, complex, etc, are not primitive types in Alphard; similarly,
structures such as arrays, records, and references (pointers) are not primitive. All of these
familiar notions are available, however. Either they are provided as "syntactic sugar"
through some standard abbreviations, or else they are made available to the programmer as a
part of a "standard prelude" -- a set of standard definitions which (conceptually) prefaces
every program. Specifications for the standard prelude are included as appendix
B to this report.
__________________
1 We are convinced that deciding what not to include in a language design is much harder than inventing clever
new things to include.
There are (only) two distinguished types in Alphard: they are distinguished in the sense that must be considered as part of the language and not as part of the standard prelude. are "rawstorage" and "boolean". Specifications of these types and their associated may be found in appendix B; informally, however:
3. Type Checking: Most modern programming languages contain some notion of the
"equivalence" of types and require that the types of actual parameters to procedures be
equivalent, to those specified by the corresponding formal parameter definitions. The
presence of parameterized and generic type definitions in Alphard makes it advantageous to
replace the notion of equivalence by a more liberal notion of “matching“. Formal parameter
definitions specify a collection of properties which the corresponding actual parameter must
possess, specify a collection of properties which are irrelevant, and provide a limited facility
for relating properties of distinct parameters. Together these define a class of valid actual
parameter types, and provide what is generally called “strong typing“; in particular, the
parameter specification and matching is sufficiently strong to ensure verifiability.
4. Scope Rules: Alphard's block structure is similar to that of Algol 60: declarations appear at
the head of a block, the meaning of an identifier is determined from its nearest enclosing
declaration, and so on. Unlike Algol, however, in Alphard the bodies of procedures and forms
do not inherit the names of variables available in enclosing blocks. The intent of these scope
restrictions2 is to ensure that all effects of an action can be determined by examining the
text immediately surrounding the action itself. An additional benefit is that Alphard can be
implemented (very efficiently) using a stack, but without the need for a display.
5. Operator Overloading: The meaning of the usual infix operators (e.g., "+", "*", etc.) may be
extended to programmer-defined data types. The symbols for, the associativity of, and the
precedence of these operators are fixed by the language (see appendix C).
__________________
2 A particular consequence of the scope restrictions - together with companion restrictions on overlapping actual
parameters and selectors -— is to prevent unintended "aliasing". That is, they ensure that within a given scope
there is at most one name for a given storage call.
6. Selectors: The programmer may define the representation of a data structure by means of a
selector. Intuitively, a selector defines an algorithm for naming data, just as a procedure or
function defines an algorithm for computing values. A selector may be thought of as a
procedure that returns a pointer (reference, address) to an element of a data structure; the
syntax for defining selectors is therefore similar to that of procedures. "Pointer" is not,
however, a type in Alphard; no variables of this type can be declared, and hence the "value"
returned by a selector cannot be stored. The effect of this (coupled with some verification
requirements) is that selectors are “safe”; most of the (useful) flexibility of general address
arithmetic is retained without introducing its corresponding dangers. In particular, it is
possible to define a restricted style of "reference“ variable completely within the language
and to ensure that this type is at least as safe as array indices in other languages.
7. Assertions: Assertions are permitted almost everywhere and special syntax encourages
their use in appropriate places. The language in which the assertions are written, however,
is not defined by Alphard. The choice of that language is, we believe, a private matter
between the programmer and verifier.
8. Iteration: Four iteration statements are provided, three of which are somewhat different
from what one might expect.
The fourth iteration construct is the familiar while statement.
9. Sugaring: A number of familiar notions such as records and enumerated types are not
primitive notions in Alphard. They are provided, however, as abbreviations for the more basic
notions from which they are formed3
This report is a precise but informal definition of Alphard; It is neither a primer nor a completely rigorous formal definition. It is intended, however, to be the reference for users, implementors, and verifiers. To that end we have attempted to be as precise as our human limitations and the vagaries of English permit. We have consciously adopted the style and tone of the Algol 60 report, which we believe remains the exemplar language definition.
__________________
3 Pun intended!
Chapter 2
Fundamental Concepts
__________________
4 We use the word “elaboration”, in preference to “execution”, to connote actions taken at “compile time” as well
as at “run time”. Elaboration may be thought of as an idealized, direct execution of the textual version of the
Alphard program.
5 The word “routine” is used systematically to cover the notions of proc, vproc, func, and sel.
6 At this point a result object may also be instantiated, but this is not essential to the present discussion.
__________________
7 Note that objects, not values, are typed, indeed naked values do not exist in Alphard -- values only exist in
objects. Thus, for example, we may speak informally of lhe 'velue produced by a procedure", but in fact the
procedure returns an object that contains the value.
A type results from the elaboration of a type description (see section 5.6) and consists of a base type, a (possibly null) sequence of actual type qualifiers, and an update set.
A base type is a form name; it uniquely identifies a class of types. For example, the base type of "vector(real,1,10)" is "vector". In the following, If T la a type. Base(T)8 represents the base type of T.
An actual type qualifier is intuitively an actual parameter in a type description; hence it corresponds to a formal parameter in a form definition. It may be an object address, an object, a routine name, a type, or a marker denoted unk9. In ”vector(real,1,10)", the actual type qualifiers are "real", "1", and "10".
A type none of whose qualifiers is unk is called a full type: a type with at least one unk qualifier is called a partial type.
The update set consists of a set of routine designators from among those defined with the base type; specifically, the update set consists of those routines which may have a (visible) effect on an object of the type. If T is a type, Update(T) denotes its update set.
If T is a type, then Qual(T) denotes the sequence of actual type qualifiers of T and Quali(T) denotes the ith element of that sequence.
__________________
8 Here and in the sequel we shall use functions such as Base(T), Qual(T), Update(k), etc. to explain semantic
aspects of the language; these functions are only part of the semantic exposition, not constructs in the language
itself.
9 The marker unk, which is to be read "unknown", denotes situations in which the corresponding form formal is
not considered a part of the nominal type.
A type description is a syntactic construct which describes a class of types and may designate restricted access to objects of those types. A type description consists of a base type, a sequence of formal type qualifiers and an update set. The elaboration of a type description yields a type or a nominal type.
A base type is (as above) a form name.
A formal type qualifier is either the marker unk or else a description at an object address, an object, a routine, or a type. When used to specify a formal parameter, a formal type qualifier may be an identifier preceded by a "?" symbol; in such cases an "implicit binding" is implied (see section 2.4).
The update set consists of a set of routine names trom among those defined with the base type. Update sets give restrictions on the effect-producing actions that may be applied to an object10. If D is a type description, Update(D) denotes its update set.
A nominal type, like a type, consists of a base type, a (possibly null) sequence of actual type qualifiers, and an update set. These notions are defined exactly as in the definition of type.
A type is always associated with an object. A nominal type, on the other hand, is always associated with an identifier. The nominal type of an identifier may, in the general case, be less specific than the type of the object to which that identifier is bound; however, the type of an object will always match the nominal type of the identifier.
During elaboration, some identifiers become associated with -- bound to -- entities: these entities may be objects, routines, or types. The binding at identifiers to obiacts is of particular interest and includes both the declaration of variables (and associated instantiation of an object) and parameter passing.
In all contexts in which an identifier may become bound to an object (i.a., in a variable declaration or formal parameter position) there is an associated type description, in the case of a variable declaration, this description determines the type of the object created, in the case of a formal parameter, the type description defines the types of allowed actual parameters, in both cases, however, the type description is elaborated to a nominal type which determines the permitted uses of the object identified through this identifier.
__________________
10 In practice we allow more than just the names of effect-producing operations in the update sol part of certain
type descriptions, notably those which specify generic (formal parameters, in such cases we allow non-olfact-
producing attribute names as well: this is merely a shorthand for an "assumes clause' (see section 5.12).
This abbrewalicn is permitted because oi its similarity oi intent to the update set: it describes a set of attributes
which the routine or form body requires for correct operation.
In both declarative and formal parameter positions the description of a binding may be preceded by either var or const. The only difference between the two is that in the latter case (const) the update set of the identifier is set to empty; in the former (var) case the update set is determined from the associated type description. A particular consequence of this mechanism is that parameters in const positions are, intuitively, passed "by reference" but cannot be modified by the called procedure.
If K is an identifier bound to an object, then we refer to this object as Obj(k) and to its associated nominal type as Type(k).
The process of parameter binding requires a notion of what it means for an actual parameter to match, or satisfy, a formal parameter specification, intuitively this process involves determining that the nominal type associated with the formal parameter ”includes", or “covers", the nominal type of the actual -- that is, ensuring that the behaviors permissible through the formal parameter name are among those permissible through the actual parameter name. For simplicity we break this process into three subprocesses: subsumption, syntactic satisfaction, and implicit binding. Each of these is used for a different kind of actual/formal matching as specified below.
In addition, all implicit formals bound to types must be bound to types syntactically satisfying the assumed definition of the form (see section 5.12).
The notions of subsumption, syntactic satisfaction, and implicit binding are defined below; we begin with the notion of subsumption -- the kind of matching used when an object parameter is expected.
Definition: We say that a (nominal) type Tf subsumes a (nominal) type Ta (in symbols, Tf>>Ta)
iff:
In some cases condition 3d cannot be checked at compile time. At the discretion of the implementors, the compiler may provide the options of generating warnings, generating checking code, or refusing to compile such cases.
Definition: Two types are identical if each subsumes the other.
In Alphard, both routines and forms may be "generic“. That is, they may require types as parameters -- or, equivalently, they may have parameters whose type is not specified in the routine (m) header. In such cases there will be an "assumes clause" which specifies the properties that the routine (m) assumes about the generic parameter; this clause gives sufficient information to check all uses of the parameter locally, in order for a given use of the M or routine to make sense, the actual parameter must at least meet the syntactic assumptions made about it. Thus the notion of matching formal and actual parameters in such cases involves of determining whether the actual parameter syntactically satisfies the formal parameter assumptions 11.
__________________
11 More generally, of course, a proof will be required to demonstrate that the actual parameter makes semantic
sense as well.
Definition Given an assumed declaration of a generic parameter, T:
form T
specs
<definitions of f1...fn)
and a candidate actual type
Q(a1, a2, ...)
whose base type is declared
form Q(p1, ...)
specs
we say that the type Q(a1,...) syntactically satisfies T if textual substitution of “Q(a1,...)" for
"T" uniformly throughout the specifications of T results in T‘s specifications containing
declarations of f1,...fn identical to those in T's specifications (ignoring assertions and
implementations), though possibly only after suitable renaming of formal parameters.
In the process of determining whether an actual parameter of type T matches a formal parameter specification we may discover that Qual of the formal is an identifier preceded by "?". Such identifiers are called "implicit formal parameters“, and are "implicitly bound" to corresponding (qualifiers of the) actual parameters. Such bindings are performed before other matching.
Definition Let Tf be the formal type and Ta be the actual type. If, in determining whether
Tf>>Ta. Base(Tf) or Quali(Tf) is an identifier preceded by a "?“, the identifier is implicitly
bound to the corresponding Base(Ta) or Quali(Ta) and becomes an "implicit formal parameter“.
The nominal type of an implicit formal is made identical to the nominal type of the corresponding
form formal. Note that only one binding is established for such identifiers, so multiple
occurrences must be consistent.
In the procedure declaration
proc P(x:vector(int,?lb,?ub)) is ...
for example, "lb" and "ub" are such implicit formals. They are, respectively, the lower and upper
bounds of an actual parameter vector. Thus, if some program fragment contains
... var y:vector(int, 1, 10); ... P(y)
then 1 and 10, respectively, will be implicitly bound to the formals lb and ub.
The following table attempts to recap the essential aspects of the notion of actual/formal parameter matching:
Formal | Actual | Matching Rule |
x:form | full type | syntactic satisfaction |
x:pform | type | syntactic satisfaction |
x:<routine description> | routine name | point 3 of match rule |
x:?T | object | Type(noject) must syntactically satisfy T |
x:<type description> | object | subsumption |
object | object | equality under &= for the type of the formal |
unk | anything | always matches |
?T | match | after implicit binding |
routine name | routine name | same routine |
An informal Definition of Alphard
Chapter 3
Basic Lexical Structure
<letter> | ::= | A | B |... | Z | a |... | z |
<digit> | ::= | 0 | 1 |... | 9 |
<alphanumeric> | ::= | <letter> | <digit> | ' |
<special symbol> | ::= | <basic symbol> | <operator> |
<basic symbol> | ::= | begin | end | endof | ; | : | ( | ) | $ | :: | & | |
if | then | else | fi | case | | ||
of | esac | fo | with | in | ni | first | suchthat | from | do | od | | ||
for | exitloop | leave | skip | assert | | ||
var | const | aux | as specified | = | init | | ||
final | unk | ? | proc | vproc | func | sel | label | | ||
note | eton | ! | elif | elof | pform | while | > | < | . | # | | ||
copy | alias | form | inline | pre | post | rule | | ||
is | forward | external | specs | impl | shared | | ||
invariant | initially | axiom | repmap | | ||
record | enumerated | assumes | value | generator | ||
<operator> | ::= | <binary operator> | <unary operator> |
<binary operator> | ::= | ↑ | * | / | div | rem | + | - | < | ≤ | = | <> | ≥ | > | and | or | cand | cor | imp | |
<assign op> | ||
<unary operator> | ::= | + | - | not |
Typographical features such as blanks (spaces), ends of lines, etc., are generally not significant (but see section 8.4.3); an implementation may use them to delimit identifiers, numbers, etc. Outside strings, no such features may appear immediately after the symbol '8." or '?', or around the symbols "." and "s" when they are used as described in sections 3.4.2 and 4.3.
Upper and lower case letters are distinct. Also, note that the grave symbol is considered a (significant) alphanumeric and thus may be used in constructing identifiers; it is intended that this be used to improve program readability by separating mnemonically signifcant portions of such identifiers.
Basic symbols such as begin are conceptually single characters and are underlined in this report to emphasize that fact. An implementation, however, must reserve (all upper/lower case spellings of) the corresponding identifiers to denote these symbols. Thus "BEGIN". "begin", "Begin", etc, are all interpreted as the basic symbol M: we strongly encourage, however, consistent use of one spelling in a given program.
The following two commenting constructs are lexically equivalent to a space (blank) character when they appear outside of strings.
note <any sequence not containing the lexeme "eton"> eton
!<any sequence up to end of line>
The first commenting construct encountered in a line takes precedence over any contained within it.
<identifier> | ::= | <letter> {<alphanumeric>}* |
<special identifier> | ::= | &start | &finish | &next | &done | &value | &subscript | &<operator> |
<identifier list> | ::= | <identifier>,+ |
All the above identifiers are distinct.
Identifiers have no inherent meanings. They identify objects, forms, types, procedures, selectors. statements, and parameters. Declarations establish the meanings of identifiers within particular scopes.
Two identifiers are defined to be similar if they differ at most in the typographical case used to spell them; thus "ABC", "Abc", "aBc", etc. are all similar. Except when used as routine names, similar identifiers may not be declared in the same scope12.
Special identifiers denote entities of special significance in the language. They may be defined but never directly referenced: they are invoked as the consequence of using some other construct defined by the language. A simple example of the use of such symbols appears in section 3.4.1, where the language-defined notion of "+" invokes the user-definable function named ”&+"; more interesting examples may be found in sections 4.7 and 4.3. (Requirements on the definitions of such routines appear in appendix C.)
In order to simplify the language definition, a number of familiar and convenient notations are provided indirectly rather than as a part of the syntax. To accomodate these, we define several "rewrite rules" that transform programs from the more familiar notation to that described by the report. These transformations convert infix operators to function invocations, provide “qualified names", and introduce semicolons. We shall use the notation C1 --> C2 to describe some of these transformations; the notation means that constructs of the form C1 are transformed into constructs of the form C2.
__________________
12This restriction is imposed in order to prevent subtle errors arising from the use of similar identifiers in the
same scope. Routines are exempted from the restriction in order to permit operator overloading.
Neither the syntax nor semantics of Alphard includes the traditional notion of arithmetic or boolean expressions with infix operators. Rather, the language is defined as though all operations were expressed as function invocations, in order to permit the user to write programs in the more familiar infix-expression format, however, two transformations are performed. First, the input text is fully parenthesized in order to observe the following precedences and associativities:
1. Associativities: The operators at highest and lowest precedence are right associative: the
remainder are left associative.
2. Precedence:
↑ (highest precedence)
* / div rem
+ -
≤ < = ≠ > ≥
not
and cand
or cor
imp
:= +:= -:= *:= etc. (lowest precedence)
After being parenthesized, expressions are converted to functional form. If α and β denote arbitrary unary (monadic) and binary (diadic) operators, respectively, then the following transformations are performed:
<term>1 β <term>2 --> &β(<term>1,<term>2))
α<term> --> &α(<term>)
<term>1 β:= <term>2 --> &:=(<term>1, &β(<term>2))
where the <term>s denote any phrases balanced in parentheses.
After being placed in functional form, three of the relational operators are rewritten as the boolean negation of one of the remaining three:
&≠(t1,t2) --> ¬(&=(t1,t2))In addition, two of the boolean operators are rewritten as conditional expressions:
t1 cand t2 --> if t1 then t2 else false fi
t1 cor t2 --> if t1 then true else t2 fi
This rewrite is required to avoid the possibility of undefined argument values in invocations.
Note that, as stated earlier, symbols of the form &<operator> may be defined by the user. Thus, by giving a definition to "&+", the programmer gives a definition to the operator "+"; this does not allow redefinition of existing operators, but does allow these operators to be extended to new types.
It is often convenient to refer to the (visible) components of an object by symbolic names; for example, the components of a record have traditionally been named in this way. The conventional syntax allows "X.y" to denote the y component of X.
The syntax of Alphard does not support such "dotted name" qualification directly, but instead uses the functional form, y(X). To permit the dotted-name notation and user-defined subscripting, qualified names are transformed in two steps. First, dotted names are eliminated in favor of a functional form:
<qualname>.<identifier> --> <identifier>(<qualname>) where <qualname> is any sequence of identifiers (including special identifiers), '.'s, 'S's, and sequences of lexemes belanced and enclosed in parentheses or square brackets. The rule is applied right-to-left; thus, for example
A.y --> y(A)
Q.g[s.t].f --> f(g(Q)[t(s)])
After all dots have been removed, square brackets are removed:
<term>[<expression list>] --> &subscript(<term>,<expression list>)
Thus, the example above becomes
Q.g[s.t].f --> f(g(Q)[t(s)]) --> f(&subscript(g(Q),t(s)))
Note that the user may define the selector &subscript and hence may specify the access algorithm for a type.
The effect of the following transformation is to eliminate the need for explicit semicoions to separate declarations or statements when those semicolons would (all at the end of a text line. According to the syntax in this report, certain phrases are separated from each other by semicolons. In those cases where the final lexeme on a line could end such a phrase, e.g.
end, ), fi, esac, fo, ni, od, exitloopand the next lexeme (i.e., excluding comments) could begin such a phrase , e.g.,
begin, (, if, case, with, first, do, forthe compiler automatically inserts a semicolon between the two unless, on the basis of preceding symbols it is possible to determine that doing so would render an otherwise syntactically valid program into an invalid one.
<special literal> | ::= | <unsigned integer> | <unsigned real> | <string> | <boolean> | <radix> |
<unsigned integer> | ::= | {<digit>}+ |
<unsigned real> | ::= | <unsigned rational>{E<scale-factor>}# | <unsigned integer>E<scale-factor> |
<unsigned rational> | ::= | <unsigned integer>.<unsigned integer> |
<scale-factor> | ::= | {+|-}#<unsigned integer> |
<string> | ::= | "<any sequence of characters with all quotes doubled>" |
<boolean> | ::= | true | false |
<radix> | ::= | {<alphanumeric>}+#<alphanumeric> |
An Informal Definition of Alphard
Chapter 4
Program Structure, Expressions and Statements
<compilation unit> | ::= | begin <block> end | <exec decl list> |
<block> | ::= | {<exec decl list>;}# {<stmt>};+ {;}# |
<exec decl list> | ::= | { <exec decl> };* |
<exec decl> | ::= | <var decl> | <const decl> | <proc decl> | <form decl> | <label decl> |
A block specifies a computation whose effect is as though the following order of execution were
observed:
The scope of all declarations in a block is the text of the block, where not superseded by nested
declarations.
Expressions and statements designate actions to be performed. Their elaboration results in changes In the execution state of the program. Expressions differ from statements only in that their elaboration may "produce values" as well as performing other actions; statements only perform actions. For definitional brevity and convenience, ovary expression is considered to be a statement, but not conversely. When an expression is used in a context requiring a statement, its "value" is discarded.
Somewhat more precisely, the "value produced by an expression" is an object resulting from its elaboration: the type of this object is uniquely determined by the rules stated in the remainder of this chapter. This resulting (unnamed) object exists until any immediately enclosing expression or statement that uses it has finished execution.
<expression> | ::= | <invocation> | <conditional expression> | <value expression> | |
<with expression> | <first expression> | ||
<stmt> | ::= | <expression> | <loop stmt> | <exit stmt> | <null stmt> | |
<inner block> | <labeled stmt> | <assert stmt> | ||
<labeled stmt> | ::= | <identifier> : <stmt> |
Statement labels are used by salt statements (section 4.9). The effect of an exit statement is to force control to the point immediately following the labeled statement whose label is used in the exit. Labels must be declared (see section 5.4) and may be used to label only one statement within the scope of their declaration.
<invocation> | ::= | <special literal> | <simple invocation>{<actuals>}# | (<invocation>) |
<actuals> | ::= | ({<actual>},*) |
<actual> | ::= | <expression> | <type description> |
<simple invocation> | ::= | { <identifier>$ }* <identifier> | <special identifier> |
Infix and prefix operators fall under this syntax by the rewrites of section 3.4.1. Subscripting, denoted by "[...]", falls under this syntax by the rewrites of section 8.4.2. Note: <special identifier>s may not appear in source programs; they result only from these rewrites.
The most obvious <lnvocation>s are those denoting routine "calls", e.g.:
sin(x)
integrate(F,a,b,eps)
In addition, however, <invocation>s result from the rewrite rules for infix operators and subscripting,
e.g.:
&:=(a, &+(b,c))
&:=(x, &subscript(V,i))
Finally, <invocation>s occur as part of type descriptions:
vector(int,1,unk)
A simple invocation may designate a type, an object, or a routine (a procedure or selector) as indicated in chapter 5, identifiers may designate multiple entities in any given context (operator overloading), so some means of resolving the conflict is necessary. An identifier may be qualified on the left by the name of the form containing its definition; such qualifiers are separated by "$". Alternatively, the proper definition may be determined by examining the types of the actuals -- that is, by choosing that definition for which type checking (section 2.4) succeeds, it is an error if the compiler cannot disambiguate statically (i.e., at compile-time).
Assuming that the entity designated by G is uniquely determined, an invocation such as:
G(e1,e2,...,en)
denotes an elaboration and possibly a resulting value as follows:
<conditional expression> | ::= | <if expression> | <case expression> |
<if expression> | ::= | if <expression> then <block> { elif <expression> then <block> }# { else <block> }# fi |
<case expression> | ::= | case <expression> of <case> { elof <case> }* { else <block> }# esac |
<case> | ::= | { <expression> },+ :: <block> |
Conditional expressions denote expressions and statements to be evaluated conditionally. Such an expression has a value if
<value expression> | ::= | value <identifier>{:<obj type>}# of <block> fo |
<with expression> | ::= | with <with list> in <block> ni |
<with list> | ::= | { <identifier>:<invocation> },+ |
<first expression> | ::= | first <template> suchthat <expression> { then <block> }# { else <block> }# fi |
<template> | ::= | <identifier> from {<identifier>:}#<type description> | <identifier> from <invocation> |
<loop stmt> | ::= | <simple loop> | <while stmt> | <for stmt> |
<simple loop> | ::= | do <block> od |
<while stmt> | ::= | while <expression> <simple loop> |
<for stmt> | ::= | for <template> <simple loop> |
<exit stmt> | ::= | exitloop | leave <identifier> |
<inner block> | ::= | begin <block> end | begin <block> endof {<identifier>}# |
<assert stmt> | ::= | assert <assertion> |
An assert statement indicates a condition that must be true when control passes through the statement. It has no semantic effect. The syntax of (anenlon) Is not specified by the language (other than that the assertion text must be enclosed in, and balanced in brackets, "{...}"). It is the province of a verifier or verifying compiler only.
An Informal Definition of Alphard
Chapter 5
Declarations
Declarations define routines (proc, vproc, func, and sel declarations) and classes of types (form declarations), specify the instantiation of objects (var, const), and bind identifiers to these entities.
The scope of a declaration -- the program text in which the binding it establishes is valid -- depends on the kind of declaration and the place it appears, in the sequel, normal Algol scope means the innermost block containing the declaration, including all blocks it encloses that do not redefine the identifier. Restricted Algol scope is the same as normal Algol scope, but excludes the text of routine and form declarations.
Generally identifiers naming routines and forms obey Algol scope rules; identifiers naming objects (i.e., variable names) obey restricted Algol scope. Thus, no free variable names appear in routine or form bodies; all variables are either locally declared or passed in through the parameter list. A few additional scope restrictions are discussed later.
Any declaration may be preceded by the keyword aux. Identifiers defined in such declarations may not be used (except within the assertion language). Auxiliary declarations serve as modeling tools in the specifications; the entities described by such declarations may or may not exist in the implementation, if such entities do exist, they may be implemented in a manner different from that described in the aux declaration. (Note, however, that the clause as specified may be used in an implementation to force precisely the representation appearing in an aux definition of an entity).
Occasionally the detailed implementation of an entity (variable, routine, or form may appear at a point remote from its declaration. The following important cases arise:
<label decl> | ::= | label <identifier list> |
Labels, like all other identifiers, must be declared before use. A label may be "placed" (used to label a statement) only once in the scope of its declaration.
<var decl> | ::= | <aux> var { <obj decl group>{<init fin clause>}# },+ |
<const decl> | ::= | <aux> const {<obj decl group> {<init fin clause>}# | <const assign>},+ |
<aux> | ::= | { aux }# |
<obj decl group> | ::= | <identifier list> : <obj type> |
<obj type > | ::= | <type description> | as specified |
<init fin clause> | ::= | = <expression> | { init <stmt> }# { final <stmt> }# |
<const assign> | ::= | <identifier list> = <expression> |
Object declarations may occur inside form declarations or in blocks; their meanings in the two contexts differ. See section 5.9 for a discussion of their meanings in form declarations.
In blocks, object declarations have restricted Algol scope (see section 5.1). Semantically, constant declarations (<const decl>s) differ from variable declarations (<var decl>s) only in that, outside the initialization and finalization clauses, Update(c) is empty for c a constant; for variables the update set is determined from the <type description>.
The <obj decl group>s within an object declaration are processed in unspecified order (note. however, that declarations are processed in ieft-to-right order; thus the programmer may impose an ordering if that is appropriate). For each group, the full type is evaluated, and for each identifier, a new object of that type is instantiated and bound to the identifier. If an init clause is present, it is executed. The declarationA final clause, if present, is executed just before deallecation of the variables or constants with which it is associated. Deallocation is always in reverse oi the (possibly unspecified) order of creation.
Objects may be designated "as specified" only in form implementations (section 5.9). This indicates that the <type description> is to be copied from the form specifications.
An <init fin clause> in a constant declaration can be omitted only in the specifications part of a form (see section 5.9).
<type description>s are syntactic entities which appear in object declarations and formal parameter specifications.
<type description> | ::= | <simple invocation> { ( {<formal qual>},+ ) }# {<update set>}# | |
?<identifier> { <update set> }# | ||
<formal qual> | ::= | <expression> | ?<identifier> { <update set> }# | { <identifier>: }# <type description> | |
unk | ||
<update set> | ::= | < { <identifier> | <special identifier> },+ > |
Note that the outer <>'s in the definition of <update set> are part of the language, not metabrackets (see examples below).
The <simple invocation> (see 4.3) must designate a unique form ("$" qualification allows duplicate nested form definitions). Disambiguation on the basis of argument types is not performed.
Elaboration of a (type description), T(e1,...,en)<p1,...,pm>, proceeds as follows13 :
The <update set> defines the update set of the type description (see 2.2). The listed identifiers
must be names of routines declared with the base type. Only these routines and routines with no visible
effects may be applied to the object within the scope covered by the declaration in which the type
description appears. In the case that the update set is attached to a ?identifier or a <type formal>,
the listed identifiers must be further specified in an "assumes clause" (see 5.12) unless they
__________________
13 Note that an implementation may require compile-lime elaboration oi those type descriptions used as formal
parameter specifications.
Certain entities -- forms and routines -- can be parameterized. Formal parameters specify these parameterizations. The process of determining whether a given sequence of actual: conforms to the sequence of formal parameters is known as matching or type checking. This process binds actual parameters to the corresponding formal parameters. It may also cause certain implicit bindings of identifiers marked with a “?" lexeme in the formals; these identifiers are implicit parameters with the same scope as ordinary formals. The implicit bindings are in effect whenever the explicit bindings of ordinary formals are.
The scope of a formal parameter to a routine is the text of the parameterized declaration. In the "restricted Algol scope" sense. That is, the scope of a formal does not include routine or form declarations within the parameterized routine text.
The scope of a formal parameter to a form is at most the text of the parameterized form declaration. It is also subject to "restricted Algol scope", in addition, unless explicitly redeclared in the form (specifically, redeclared as specified), the scope of form formals is limited to the specifications and object declarations (including init clauses) of the form other than shared objects.
<formals> | ::= | ( { <routine formal> | <binding> <obj formal> | <type formal> },+ ) |
<routine formal> | ::= | <formal id list> proc <parms> | <formal id list> { vproc | func | sel } <v parms> |
<binding> | ::= | { copy | { alias }# { const | var } }# |
<formal id list> | ::= | <identifier list> : |
<obj formal> | ::= | <formal id list> <type description> |
<type formal> | ::= | <formal id list> { form | pform } { <update set> }# |
Formal parameters give the specifications of allowable actual parameters and provide local names for these parameters. A <routine formal> indicates a parameter position to be filled with a procedure or selector. A <type formal> indicates a position to be filled by a type description. An <obj formal> indicates a position to be filled by an object. The association of actual parameters to formals is determined positionally.
The specification of an object parameter may be preceded by a qualifier that controls the binding of actual to formal; the possible qualifiers are copy, const, var, and alias. The qualifiers const and var denote "by reference" parameters; in both cases the parameter name is bound to the actual parameter object. const parameters (like names declared in const object declarations), have an empty update set. As specified in section 4.3, the qualifier copy indicates that a local object is to be instantiated
Routines encapsulate computations. A routine (proc, vproc, func) may or may not return a result
object. If it does, the update set of the compiler-generated name bound to the returned object is
always null; there can be no effects on a procedure result. A selector always has a result and must
not have effects; unless explicitly restricted, the <update set> of a selector result is the full update
set of the base type. Except within form specifications (section 5.9), routine declarations
have normal Algol scope.
<routine decl> | ::= | <vproc decl> | <proc decl> | <sel decl> |
<vproc decl> | ::= | <aux> {inline}# {vproc | func} <routine id> <v parms> <pre post> <assumes> |
{ <routine body> }# | ||
<proc decl> | ::= | <aux> {inline}# proc <routine id> <parms> <pre post> <assumes> { <routine body> }# |
<sel decl> | ::= | <aux> {inline}# sel <routine id> <v parms> <pre post> <assumes> { <routine body> }# |
<routine id> | ::= | <identifier> | <special identifier> |
<parms> | ::= | { <formals> }# |
<v parms> | ::= | { <formals> }# : <type description> |
<pre> | ::= | { pre <assertion>; }# |
<pre post> | ::= | <pre> { post <assertion>; }# |
<routine body> | ::= | is <stmt> | is as specified | is forward | is external (<system specs>) |
<form decl> | ::= | <aux> form <identifier> { <formals> }# <pre> <assumes> is <form body> |
__________________
14 This assumption is valid only so long as user—defined assignment operators, "&:=", preserve the intended
meaning. The compiler cannot enforce the correctness of any user-defined operation, and specifically not that of
properties.
Selectors (declared sel) name objects. Procedures (declared proc) produce effects but do not return values. Value-returning procedures (declared vproc), may produce effects and also return values (actually objects). Functions (declared func) are semantically equivalent to vproc's except that they are deterministic15 and do not have observable effects on their parameters. The <stmt> portion of the <routine body> of a vproc or func must be a single expression of the type returned by the routine.
The qualifier inline has no semantic effect. It indicates that the compiler should make the declared routine "open" -- i.e., produce a copy of the object code at each invocation site.
The pre and post clauses have no semantic effect, but are specifications of the routine‘s behavior. The pre clause gives conditions which will be true at entry; post gives conditions at routine exit (the keyword result is conventionally used in post conditions to specify the value returned).
The routine body may be absent only in form specifications (see 5.9). It may be given "as specified" only in form implementations; this indicates that the body is to be carried down from the specifications. The routine body may be specified as forward if its declaration appears later in the same <block>. The body may be specified as external if the text of the definition is to be found in the system-dependent entity specified by <system specs>.
The formal parameter lists and result types must be omitted in form implementations if the same routine (name) is declared in the specifications of the form. They are copied from the declaration of the routine in the specifications.
The assumes clause (see section 5.12) declares generic parameters (implicit and explicit). Throughout the text of the routine declaration, only those properties declared in the assumes clause are used in testing syntactic and semantic validity.
As stated previously, identifiers that name objects observe restricted Algol scope. Thus, the body of a routine cannot access objects declared outside itself unless they are passed through the parameter list. Also, as stated previously, the objects named by distinct formal parameter names cannot overlap unless they are explicitly qualified with alias.
__________________
15 That is, invocations With equal inputs yield equal outputs. More precisely, for F to be a func, &=(A,B) must
imply &=(F(A),F(E)). If &= is not defined, invocations with identical inputs must have identical outputs.
<form decl> | ::= | <aux> form <identifier> { <formals> }# <pre> <assumes> is <form body> |
<form body> | ::= | { <specs> }# { <impl> }# end { <identifier> }# | <abbrev body> | forward | external |
( <system specs> ) | as specified | ||
<specs> | ::= | specs { <var decl> | <other form decls> };+ |
<impl> | ::= | impl { <shared> <var decl> | <other form decls> }; |
<other form decls> | ::= | <routine decl> | <form decl> | <axiom>| <shared> <const decl> |
<shared> | ::= | shared # |
<axiom> | ::= | invariant <assertion> | initially <assertion> | axiom <assertion> | repmap |
The names defined in the specifications (<specs>) of a form are available outside the form declaration. The scope of these names is the same as if they had been declared immediately outside the form, except that var and const declarations become routine declarations as described below. The scope of the names in the specifications does not include the implementation (<impl>). Note, however, that all these names must be redeclared in the implementation. The implementation may be omitted if the form declaration appears as part of the specifications of another form. The specification may be omitted if the declaration appears in the implementation of another form, in which case it is copied from the specifications of the containing form.
The scope of object names appearing in the formal parameter list of the form is restricted to the specs and <var decls> of the <form body> unless they are explicitly redeclared (as specified) in the impl. In the latter case a run-time representation of these objects becomes part of the implementation of objects instantiated from the form. In such cases it is also possible for these names to be mentioned (again as specified) in the specs, and hence to be externally available. These redeclarations in the impl and specs must be compatible with the <binding> of the formal parameter; that is, an identifier redeclared var must also have a var <binding> (an identifier redeclared const may have a var, const, or copy <binding>.
Objects declared in a form usually become the concrete components of the objects that result from instantlating the form; there are thus distinct instantiations of these objects for different instantiations of the form. An exception occurs when a declaration is prefixed by the modifier shared: A single
instantiation of a shared object is common to all instantiations of the form. In particular, it makes perfect sense to define a shared const of a given type within the definition of that type. Such a constant functions as a named literal of the type.
<Axiom>s have no semantic effect, but provide further specifications.
Non—shared constant and variable declarations within io_rln_ specifications are shorthand for certain procedure and selector declarations. That is,
In the implementation, object declarations again become selector and procedure declaration: as follows. When an object of base type T is instantiated, the object declarations in its implementation are elaborated as usual (and the in_it clauses are performed); This results in a set of newly created objects which become the concrete components of the object being created. These components may be accessed within the bodies of the routines in the implementation by using implicit selectors (procedures) with the same names as those given in the object declarations. Thus, we can write
That is, inside f, "x" is treated as a selector on objects of type T and is applied to the formal parameter, Q, to access the x-component or the particular actual. Note in particular that "x", like all object names, obeys restricted Algol scope and hence is not inherited by the body of the proc, f.
Abbreviated form body definitions are provided for two commonly occurring kinda of abstractions, "records" and ordered "enumerated” types.
<abbrev body> | ::= | <record type> | <enumerated type> |
<record type> | ::= | record ( { <obj decl group> },+ ) |
<enumerated type> | ::= | enumerated ( <identifier list> ) |
shared const i1, ..., in:C | ! distinct constants |
func &= ...; | ! equality test |
vproc &:= ...; | ! assignment |
func min ...; | ! minimum element (=i1) |
func max ...; | ! maximum element (=in) |
func succ ...; | ! successor (not defined on in) |
func pred ...; | ! predecessor (not defined on i1) |
func card ...; | ! cardinality of enumeration (=n) |
func decode ...; | ! converts element to its ordinal (e.g., decode(i3)=3) |
func code ...; | ! converts ordinal to element (e.g., code(2)=i2) |
func spell ...; | ! convert element to stringlet (its printname) |
func unspell ...; | ! convert stringlet (printname) to element |
generator gen ...; | ! generates elements in order (i1 ... in) |
Generators are specialized forms. They are useful for defining objects that will be bound to the control variables of the for and first constructs (see 4.7, 4.8). (Any form may provide such objects, but their use is sufficiently constrained by the language that special abbreviated syntax is provided for defining forms intended specifically for this purpose.)
<form decl> | ::= | <aux> form <identifier> { <formals> }# <pre> <assumes> is <form body> |
Note that in this example we have chosen to ignore statement and predicate parameters other than k.
In order to generate loop control variables, a form must provide definitions for the routines &start, &next, &done and &value. A definition of the routine &finish is optional. If no definition is provided for &finish, the compiler will provide (1) an appropriate header, (2) the body skip in the specifications, and (3) the body as specified in the implementation. Restrictions on the definitions of &done, &start, &next, &finish, and &value are provided in appendix C.
A distinguished class of forms defining objects for controlling loops is designated by the reserved word generator and the syntax indicated above. This class is significant because the behavior of these objects is sufficiently constrained to be specified by a proof rule. Necessary properties of the specifications of the generator routines can be derived from the proof rules and the constraints16. As a result, these specifications are not written explicitly. Instead, proof rules for first and for loops that use the generator are written as shown above. An instantiation of a generator may be used only to control loop constructs for which it provides proof rules.
__________________
16 See "Abstraction and Verification in Alphard: Defining and Specifying Iteration and Generators" by Mary Shaw,
Wm. A. Wolf, and Ralph L. London, in Communications of the ACM, August 1977, pp. 553-564.
<assumes> ::= {assumes <form decl>},*
Assumed forms may not be parameterized and may not have implementations. Identifiers appearing in the update sets of generic parameters must be specified in an assumes clause (see 5.6). Actual parameters corresponding to generic formals must syntactically satisfy the assumptions about the formal (see 2.4).
This routine will determine whether two vectors, x and y, are equal so long as (1) the type of the elements of the vectors, T, provides an equality operator and (2) the two vectors have the same upper and lower bounds. An assumes declaration behaves as though it had normal Algol scope; in particular, assumptions about generic parameters of a form need not be repeated within routines (or other forms) declared within the form.
An informal Definition of Alphard 40
Appendix A
Collected Syntax
<letter> | ::= | A | B |... | Z | a |... | z |
<digit> | ::= | 0 | 1 |... | 9 |
<alphanumeric> | ::= | <letter> | <digit> | ' |
<special symbol> | ::= | <basic symbol> | <operator> |
<basic symbol> | ::= | begin | end | endof | ; | : | ( | ) | $ | :: | & | |
if | then | else | fi | case | | ||
of | esac | fo | with | in | ni | first | suchthat | from | do | od | | ||
for | exitloop | leave | skip | assert | | ||
var | const | aux | as specified | = | init | | ||
final | unk | ? | proc | vproc | func | sel | label | | ||
note | eton | ! | elif | elof | pform | while | > | < | . | # | | ||
copy | alias | form | inline | pre | post | rule | | ||
is | forward | external | specs | impl | shared | | ||
invariant | initially | axiom | repmap | | ||
record | enumerated | assumes | value | generator | ||
<operator> | ::= | <binary operator> | <unary operator> |
<binary operator> | ::= | ↑ | * | / | div | rem | + | - | < | ≤ | = | <> | ≥ | > | and | or | cand | cor | imp | |
<assign op> | ||
<unary operator> | ::= | + | - | not |
<identifier> | ::= | <letter> {<alphanumeric>}* |
<special identifier> | ::= | &start | &finish | &next | &done | &value | &subscript | &<operator> |
<identifier list> | ::= | <identifier>,+ |
<special literal> | ::= | <unsigned integer> | <unsigned real> | <string> | <boolean> | <radix> |
<unsigned integer> | ::= | {<digit>}+ |
<unsigned real> | ::= | <unsigned rational>{E<scale-factor>}# | <unsigned integer>E<scale-factor> |
<unsigned rational> | ::= | <unsigned integer>.<unsigned integer> |
<scale-factor> | ::= | {+|-}#<unsigned integer> |
<string> | ::= | "<any sequence of characters with all quotes doubled>" |
<boolean> | ::= | true | false |
<radix> | ::= | {<alphanumeric>}+#<alphanumeric> |
<compilation unit> | ::= | begin <block> end | <exec decl list> |
<block> | ::= | {<exec decl list>;}# {<stmt>};+ {;}# |
<exec decl list> | ::= | { <exec decl> };* |
<exec decl> | ::= | <var decl> | <const decl> | <proc decl> | <form decl> | <label decl> |
<expression> | ::= | <invocation> | <conditional expression> | <value expression> | |
<with expression> | <first expression> | ||
<stmt> | ::= | <expression> | <loop stmt> | <exit stmt> | <null stmt> | |
<inner block> | <labeled stmt> | <assert stmt> | ||
<labeled stmt> | ::= | <identifier> : <stmt> |
<invocation> | ::= | <special literal> | <simple invocation>{<actuals>}# | (<invocation>) |
<actuals> | ::= | ({<actual>},*) |
<actual> | ::= | <expression> | <type description> |
<simple invocation> | ::= | { <identifier>$ }* <identifier> | <special identifier> |
<conditional expression> | ::= | <if expression> | <case expression> |
<if expression> | ::= | if <expression> then <block> { elif <expression> then <block> }# { else <block> }# fi |
<case expression> | ::= | case <expression> of <case> { elof <case> }* { else <block> }# esac |
<case> | ::= | { <expression> },+ :: <block> |
<value expression> | ::= | value <identifier>{:<obj type>}# of <block> fo |
<with expression> | ::= | with <with list> in <block> ni |
<with list> | ::= | { <identifier>:<invocation> },+ |
<first expression> | ::= | first <template> suchthat <expression> { then <block> }# { else <block> }# fi |
<template> | ::= | <identifier> from {<identifier>:}#<type description> | <identifier> from <invocation> |
<loop stmt> | ::= | <simple loop> | <while stmt> | <for stmt> |
<simple loop> | ::= | do <block> od |
<while stmt> | ::= | while <expression> <simple loop> |
<for stmt> | ::= | for <template> <simple loop> |
<exit stmt> | ::= | exitloop | leave <identifier> |
<null stmt> | ::= | skip |
<inner block> | ::= | begin <block> end | begin <block> endof {<identifier>}# |
<assert stmt> | ::= | assert <assertion> |
<label decl> | ::= | label <identifier list> |
<var decl> | ::= | <aux> var { <obj decl group>{<init fin clause>}# },+ |
<const decl> | ::= | <aux> const {<obj decl group> {<init fin clause>}# | <const assign>},+ |
<aux> | ::= | { aux }# |
<obj decl group> | ::= | <identifier list> : <obj type> |
<obj type > | ::= | <type description> | as specified |
<init fin clause> | ::= | = <expression> | { init <stmt> }# { final <stmt> }# |
<const assign> | ::= | <identifier list> = <expression> |
<type description> | ::= | <simple invocation> { ( {<formal qual>},+ ) }# {<update set>}# | |
?<identifier> { <update set> }# | ||
<formal qual> | ::= | <expression> | ?<identifier> { <update set> }# | { <identifier>: }# <type description> | |
unk | ||
<update set> | ::= | < { <identifier> | <special identifier> },+ > |
<formals> | ::= | ( { <routine formal> | <binding> <obj formal> | <type formal> },+ ) |
<routine formal> | ::= | <formal id list> proc <parms> | <formal id list> { vproc | func | sel } <v parms> |
<binding> | ::= | { copy | { alias }# { const | var } }# |
<formal id list> | ::= | <identifier list> : |
<obj formal> | ::= | <formal id list> <type description> |
<type formal> | ::= | <formal id list> { form | pform } { <update set> }# |
<routine decl> | ::= | <vproc decl> | <proc decl> | <sel decl> |
<vproc decl> | ::= | <aux> {inline}# {vproc | func} <routine id> <v parms> <pre post> <assumes> |
{ <routine body> }# | ||
<proc decl> | ::= | <aux> {inline}# proc <routine id> <parms> <pre post> <assumes> { <routine body> }# |
<sel decl> | ::= | <aux> {inline}# sel <routine id> <v parms> <pre post> <assumes> { <routine body> }# |
<routine id> | ::= | <identifier> | <special identifier> |
<parms> | ::= | { <formals> }# |
<v parms> | ::= | { <formals> }# : <type description> |
<pre> | ::= | { pre <assertion>; }# |
<pre post> | ::= | <pre> { post <assertion>; }# |
<routine body> | ::= | is <stmt> | is as specified | is forward | is external (<system specs>) |
<form decl> | ::= | <aux> form <identifier> { <formals> }# <pre> <assumes> is <form body> |
<form body> | ::= | { <specs> }# { <impl> }# end { <identifier> }# | <abbrev body> | forward | external |
( <system specs> ) | as specified | ||
<specs> | ::= | specs { <var decl> | <other form decls> };+ |
<impl> | ::= | impl { <shared> <var decl> | <other form decls> }; |
<other form decls> | ::= | <routine decl> | <form decl> | <axiom>| <shared> <const decl> |
<shared> | ::= | shared # |
<axiom> | ::= | invariant <assertion> | initially <assertion> | axiom <assertion> | repmap |
<assertion> | rule <identifier> <assertion> | ||
<abbrev body> | ::= | <record type> | <enumerated type> |
<record type> | ::= | record ( { <obj decl group> },+ ) |
<enumerated type> | ::= | enumerated ( <identifier list> ) |
<form decl> | ::= | <aux> generator <identifier> { <formals> }# : <type description> <pre> <assumes> |
is <form body> | ||
<assumes> | ::= | {assumes <form decl>},* |
An informal Definition of Alphard 43
Appendix B
Standard Prelude
The complete specifications of the types in the standard prelude will appear in a future technical report. This appendix previews that report by listing the types to be included and summarizing their main properties. This, together with the reader‘s experience and goodwill, should suffice to understand the language definition.
The type definitions loosely referred to as the "standard prelude“ actually comprise three classes of definitions. The primitive prelude defines operationally the basic notion of linear contiguous storage and all of the common machine operations. The standard prelude proper includes types that we expect to be implemented for all versions of the language, usually as a part of the compiler. The implementation prelude includes types that provide facilities of the underlying hardware, operating system, or support environment of a particular implementation. In addition, types (such as real) which should be defined for a large class of systems (but not all) are defined here. It is our intention that whenever a particular implementation chooses to define a type given here, it follows our specifications to the greatest extent possible.
The RawStorage form supplies the fundamental abstraction of linear contiguous storage as described in Chapter 1. Fetching and storing are defined for RawStorage objects of equal length. The integer and bitwise boolean operations are provided for RawStorage objects of length one. In addition we provide support here for the subsequent definition of forms such as collections and references, as well as the ability to do storage management.
These types were chosen for their simplicity and common utility. They are intended to provide only primitive facilities that may reasonably be expected to appear (and be efficiently implementable ) on all systems. The reader must bear in mind that these specifications were selected with the understanding that they become essentially required of all implementations. When in doubt, therefore, we have tended to exclude features rather than to include them.
There are three major classes of types in the standard prelude. The sections below sketch the major properties of each.
The scalar types of the standard prelude are Boolean and Integer. Literals of these types are defined in section 3.5.
Boolean is the other form required by the language definition (Chapter 1). Objects of type Boolean are unstructured; they possess values from a set designated (true, false). The customary unary and binary functions are provided, along with assignment.
Integers are restricted to machine precision. The standard prelude supports the customary unary and binary arithmetic operators on integers, the arithmetic relations, named constants to describe the finite range of a particular implementation, assignment, and transfer functions to and from Stringlets.
Two linear, homogeneous, fixed-length data structures are provided in the standard prelude. Vectors may have elements of many types; Stringlets are minimally sufficient to support I/O operations.
Vectors may have elements of almost any type (the type must be allocatable without special restrictions). Any particular Vector, of course, contains elements of only one type. The length of a Vector is fixed at instantiation time. A subscript selector for integer indices and a generator that produces the elements in order are provided. If the element type supports either assignment or equality test, that operation is extended to the entire Vector. There is a slice routine which returns a subvector of the original vector (with the new origin forced to zero).
Stringlets closely resemble vectors of characters. The same operations are provided for Stringlets as for Vectors. In addition, literals are supported by the syntax of section 3.5 and an assortment of special predicates on Stringlets of length 1 is provided (isCharacter, isLetter, isDigit, etc.). Transfer functions to and from Integers are also provided. Note that type "character" is not included in the standard prelude; use Stringlets of length 1 instead.
Certain other types appear in some fashion in almost every language. These types do not have uniform specifications across implementations, but rather depend on the host machine. Since we don't want to require these for all systems, we cannot include them in the standard prelude. Instead, such types are provided in an implementation prelude -- a segment of the definition of each implementation that is frankly machine-dependent. In addition, certain types which may not exist under all implementations, but which must have uniform specifications for those systems on which they do exist, are included here.
The types defined here may provide direct access to features of the underlying hardware; they may support special facilities of the environment or the operating system; they may simply be data types that are best supported directly by the compiler.
Reals have the properties of hardware floating-point values. Unary and binary arithmetic operators and relations are supported to the extent that the underlying machine allows. Constants describing floating point accuracy, assignment, and transfer functions to and from Integers and Stringlets are also provided. There is NO mixed-mode arithmetic.
The implementation prelude requires a minimal set of operations on flies or characters. This, together with transfer functions to and from Stringlets, is intended to guarantee the availability of at least primitive input/output facilities. It is intended that each implementation prelude provide such richer support as is appropriate, provided such support is an upward compatible extension of the facilities described here.
Form IOFile supports sequential files of characters. Files are sequences of stringlets of length one. They are constructed by appending Stringlets and decomposed into Stringlets. The available operations are commands to open or close an File for reading or writing, to test the status of an File, and to read or write a Stringlet from or to an File. Note that File is an auxiliary form definition--such objects cannot be declared. IOFile, on the other hand, has a null representation. Hence it is not meaningful to declare objects of this type.
These types may provide direct access to features or the underlying hardware or to special facilities of the environment or the operating system.
An informal Definition of Alphard 46
Appendix C
Special Identifier Assumptions
A certain number of Alphard routines are used for special purposes. Their definitions are thus more constrained than has been indicated elsewhere in this report. These routines are distinguished in that their names begin with ampersand ("&"); syntactically, they are <special identifier>s. The routines are invoked by specific language constructs rather than by ordinary routine invocation.
These routines are grouped into two classes. Generator routines are invoked by the for and first iteration constructs in the manner described in sections 4.7 and 4.8. Extensible operators are effected by the infix operator rewrite rules as well as by the subscript rewrite rule (see section 3.4). Extensible operators include all operators named by <special identifier>s other than generator routines; they can be overloaded by user-defined forms.
The generator routines are &done, &start, &next, &finish and &value.
The restrictions on the invocation of generator routines as an ordinary routines are intended to support two conflicting requirements. First, it is mandatory that a for or first loop body not be able to use these functions on the object generating the loop control variable. It is, in general, desirable that they not be invoked in other arbitrary places outside of loops either. It is, however, useful to be able to use these routines in the definitions of other generators, specifically those which simultaneously generate elements of two or more data structures.
The invocation restriction is thus that: &start, &next and &finish can only be invoked in the definition of an &start, &next or &finish routine of another generator; &done can only be invoked in the definition of an &start, &next, &finish or &done routine in another generator; &value can only be invoked within another generator. This would seem to permit use of the same generator object to create control variables in two nested loops. Any problems which this might cause for the generator object would be detected as violations of sufficient independence of the object from the body of the outer loop.
Suppose a generator named gen defines a type of object that provides a control variable of type t for the "<identifier> from" construct. The generator routines must have header specifications subsuming:
The extensible routines are &↑, &*, &/, &div, &rem, &+(unary and binary). &-(unary and binary), &<,
&=, &>, &and, &or, &imp, ¬, &:= and &subscript.
Extensible routines can be invoked either by using the infix operator rewrite rule or by using the subscript rewrite rule (see 3.4). There is no loss of generality in prohibiting their invocation as ordinary routines.
Within form f, specifications of these operators must subsume:
The compiler is able to enforce only the precedence and syntax of these operators. However, their traditional use in mathematics raises other expectations about them; most people, for example, presume "+" is at least associative and possibly commutative. We strongly urge that the programmer overload these operators only with operations that preserve those expectations; failure to observe this convention may badly mislead the reader.
An informal Definition of Alphard 48
Appendix D
A Complete Example
We now present a complete Alphard program. This program defines finite sets with a fixed maximum size, then uses them in a small program. Several aspects of the program deserve special note.
Form FinSet defines one variety of sets. These sets must be homogeneous (i.e., all elements must be of the same type), but the elements may he of any type that provides assignment and equality operators. Thus FinSet is a generic type definition. The specifications of FinSet are stated in terms of ordinary mathematical sets; the restrictions that apply to sets of type FinSet are explicit.
We assume the types of the standard prelude, Appendix B. In particular, we use vectors, integers, and the generator upto for integers. The main program defines an ordered enumerated type and uses the generator that is automatically defined for such a type.
The names V and m used in the implementation of Finset are available to the bodies of the routines defined in that form. V and m may be used as qualifiers on any objects of type FinSet that those routines receive as parameters. The names V and m are not available outside the form.
FinSet defines routines &+(union), &*(intersect), &:=(assign), and &=(equality). These extend the definitions of the binary infix operators +, *, :=, and = to pairs of FinSets. The FinSet implementation also takes advantage of the rewrite rule for +:= and -:=.
Assertions are included in the specifications of the FinSet operators. (We use ”prime“ notation in post conditions: S' is the value that S had on entry to the procedure.) Some have also been included in the main program to explain the operation of the program.
Three kinds of loops are used. Routines Insert, Remove, and Has all use first loops to search for elements. Insert defaults the then part and Remove defaults the else part. Note that the equality test in the suchthat clauses of these loops uses the equality defined for EltType. As a result, the code for these routines may depend on the definition of the type passed as an instantiation parameter to FinSet. The main program declares three sets -- one is a set of colors and the other two are sets of sets (of colors). The program uses for loops with the generator color$gen on colors. It also uses a conventional while.
The '$' name qualification is used for names color$card and color$gen. These are constants defined for the ordered enumerated type color (number of elements and a standard generator, respectively). They are associated with the type rather than with any variable of the type. The qualification is used to distinguish ord and gen from the corresponding definitions associated with other enumerated types.
aux var FS:MathematicalSet(EltType) | |
invariant { cardinality(FS) ≤ MaxSize } | |
initially { FS = { } } | |
proc Insert(var S:FinSet,x:EltType) | pre cardinality( {x} ∪ S.FS) ≤ S.MaxSize } |
post { S.FS=S.FS' ∪ {x} } | |
proc Remove(var S:FinSet,x:EltType) | post { S.FS = S.FS' - {x} } |
vproc Choose(S:Finset):EltType | pre { S.FS ≠ { } } |
post result ∈ S.FS } | |
func Has(S:FinSet,x:EltType):boolean | post { result ≡ R.FS' ∪ S.FS' } |
func &+(R,S:FinSet):Finset | pre { cardinality(R.FS ∪ S.FS) ≤ R.MaxSize } |
post { result ≡ R.FS ∪ S.FS' } | |
func &*(R,S:FinSet):FinSet | post { result ≢ R.FS' ∩ S.FS' ) |
vproc &:=(var R,S:FinSet):FinSet | post { result = R.FS = S.FS' } |
func &=(R,S:FinSet):boolean | post { result ≡ (R.FS' = S.FS' } |
An informal Definition of Alphard 51
Appendix E
Proof Rules
Proof Rules Omitted From Preliminary Version