Butler Lampson - A Description of the Cedar Language

This document was made by OCR from a scan of the technical report. It has not been edited or proofread and is not meant for human consumption, but only for search engines. To see the scanned original, replace OCR.htm with Abstract.htm or Abstract.html in the URL that got you here.

A Description of the Cedar Language

A Cedar Language Reference Manual

Butler W. Lampson

CSL-83^.15 December 1983 (Printed November 1986) [P83-00016]

Abstract: The Cedar Language is a programming language derived from Mesa, which in turn is derived from Pascal. It is meant to be used for a wide variety of programming tasks, ranging from low-level system software to large applications. In addition to the sequential control constructs, static type checking and structured types of Pascal, and the modules, exception handling, and concurrency control constructs of Mesa, Cedar also has garbage collection, dynamic types, and a limited form of type parameterization.

This report describes the Cedar language. Except for chapter 2, it is written strictly in the style of a reference manual, not a tutorial. Furthermore, it describes the entire language, including a number of obsolete constructs and historical accidents. Hence it tells much more than you probably want to know. A summary of the safe langauge and comments throughout the manual suggest which constructs should be preferred for new programs.

CR Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classifications - Cedar, extensible languages; D.3.1 [Programming Languages]: Formal Definitions and Theory - semantics

Additional Keywords and Phrases: kernel language, polymorphism, data types

The work described here was completed in late 1983 but not published at that time. It attempts to provide a reasonably formal and precise definition of the Cedar programming language. The then current version of the language was perceived as an inadequate base for a number of planned extensions to the language and supporting environment: on the other hand, there was already a large body of Cedar code that could not simply be abandoned. These problems are dealt with by defining a small but powerful kernel language plus a mapping of existing Cedar constructs into that kernel. The kernel language introduces value spaces and operations over them that go well beyond what has been available in any implemented version of the Cedar language; it was to provide the basis for extension and simplification. The mapping from existing Cedar into the kernel provides not only a migration path for existing code but also a definitional method.

This report should be of interest to students of programming languages and their definitions. Most of the interesting ideas of the Cedar language appear in the kernel, which is described in Chapter 2. Such readers should note that the formalism used to describe the kernel has several known shortcomings. Its treatment of so-called dependent types is somewhat cavalier. A subsequent report by Burstall and Lampson ("A Kernel Language for Modules and Abstract Data Types," Digital Systems Research Center, September 1984) includes a more careful treatment of such types in a language very similar to the kernel. The present treatment also glosses over most of the definitional problems raised by the possibility of concurrent evaluation.

The report should also be of interest to Cedar programmers. Chapters 3 and 4 constitute the most complete. precise and accurate definition of the implemented Cedar langauge that has appeared to date. For a reader willing to make the effort to assimilate the concepts introduced in Chapter 2, this report can serve as an interim reference manual. The later chapters are painfully honest and complete; as the abstract notes, they say much more than anyone probably wants to know. As of March 1986, the only known differences between the description and implemention, other than minor bugs in each, are the following:

· The improved syntax for ENTRY and INTERNAL has not been implemented; these attributes must still precede the type in a procedure declaration (Section 3.5).

· Sections 3.3.4 and 4.3.4 document an improved design for opaque types that was never implemented. In current Cedar, opaque types behave as they do in Mesa.

· According to Section 4.14, if P is a procedure taking one argument, its application to x using dot notation is written without brackets, as x.P. In current Cedar, the alternative form x.P[] is also accepted.

Both classes of readers should note that many parts of the kernel language have never been implemented in their full generality. Some of the current developers and users of the Cedar language would not even agree that the directions of evolution suggested by the kernel language are desirable or feasible. The claims about long-term goals and promised improvements in this report should therefore be taken as the personal opinions of the author.

Ed Satterthwaite, March 1986

Chapter 1. Introduction

The Cedar language is a programming language derived from Mesa, which in turn is derived from Pascal. It is meant to be used for a wide variety of programmin^g tasks, ranging from low-level system software to large applications. In addition to the sequential control constructs, static type checking and structured types of Pascal, and the modules, exception handling, and concurrency control constructs of Mesa, Cedar also has garbage collection, dynamic types, and a limited form of type parameterization.

This manual describes the Cedar language. Except for the overview material in § 2.1 and the discussion of concepts in §§2.3-2.7. it is written strictly as a reference manual, not a tutorial. Furthermore, it describes the entire lan^guage, including a number of obsolete constructs and historical accidents. Hence it tells much more than you probably want to know. A summary of the safe language and comments throughout the manual, suggest which constructs should be preferred for new programs.

The manual is organized into three major parts:

Chapter 2: A description of a much simpler kernel langua^ge, in terms of which the current Cedar language is explained. This description includes:

An overview or glossary, in which the major technical terms used in the kernel are briefly defined (§2.1).

An informal explanation of the ideas of the kernel and the restrictions imposed by current Cedar (§§ 2.3-2.9)

A precise definition of the kernel (§2.2). Most readers will probably find this rather hard going.

Chapter 3: The syntax and semantics of the current Cedar language. The semantics is given precisely by a desugaring into the kernel. It is also given more informally by English text. This chapter also contains a number of examples to illustrate the syntax.

Chapter 4: The primitive types and procedures of Cedar. For each one, its type is given as well as an English definition of its meaning. This chapter is organized according to the class hierarchy of the primitive types (§4.1).

In addition, there is a one-page grammar for the full language, a shorter grammar for the safe language, and a two-page language summary which includes the grammar, the desugaring, and the examples from § 3. The tables in §§4.1-2 summarize the types and primitives.

To find your way around:

First read chapter 2, except for § 2.2.

Then consult the table of contents, or the index, for the topics of interest to you. The full grammar (at the end) and the class hierarchy (Table 4-1) may also be useful as starting points.

The manual is extensively cross-referenced. Section titles and numbers appear at the top of each page. The summaries and tables also point to the section in which each construct is defined.

Acknowledgements: Rod Burstall and Ed Satterthwaite helped me greatly in clarifying the ideas presented in § 2. Ed was also indispensable in getting an accurate description of the current Cedar language. Bill McKeeman's work on an earlier Cedar language description was the starting point for this manual. Will Crowther, Jim Horning and Lyle Ramshaw read part or all of the manual carefully, and made many helpful comments. Several other Cedar programmers have pointed out errors or omissions. Of course, I am responsible for the errors that remain.

Chapter 2. The kernel language

This document describes the Cedar language in terms of a much smaller language, which we will usually call the kernel or the Cedar kernel. Cedar differs from the kernel in two ways:

· It has a more elaborate syntax (§ 3). The meaning of each construct in Cedar is explained by giving an equivalent kernel program.

Often the kernel program is longer or less readable: the Cedar construct can be thought of as an idiom which conveniently expresses a common operation. Sometimes the Cedar construct has no real advantage, and the difference is the result of backward compatibility with the ten-year history of Mesa and Cedar.

· It has a large number of built-in or primitive types and procedures (§4). In the kernel language all of these could in principle be programmed by the user, though in fact most are provided by special code in the Cedar compiler. In general, you can view these built-in facilities much like a library, selecting the ones most useful for your work and ignoring the others.

Unfortunately, the current Cedar language is not a superset of the kernel language. Many important objects (notably types, declarations and bindings) which are ordinary values in the kernel that can be freely passed as arguments or bound to variables, are subject to various restrictions in Cedar: they can only be written in literal form, cannot be arguments or results of procedures, or whatever. The long-term goal for evolution of the Cedar language is to make it a superset of the kernel defined here. In the meantime, however. you should view the kernel as a concise and hopefully clear way of describing the meaning of Cedar programs.

To help in keeping the kernel and current Cedar separate, reserved words and primitives of the kernel which are not available in current Cedar are written in SANS-SERIF SMALL CAPITALS, rather than the SERIF SMALL CAPITALS used for those symbols in current Cedar. Operator symbols of the kernel which are not in current Cedar are not on the keyboard.

The kernel is a distillation of the essential properties of the Cedar language, not an entirely separate invention. Most Cedar constructs have simple translations into the kernel. Those which do not (e.g., some of the features of OPEN) are considered to be mistakes, and should be avoided in new programs.

Roadmap

§ 2.1 gives a brief summary of each major idea in the kernel, which may be helpful as an introduction and reminder. Most of the chapter (§§2.3-2.8) is an informal explanation of the concepts behind the kernel. Usually. terms are defined and explained before they are used, but some circularity seems to be unavoidable. Both this and the explanations in §§2.3-2.7 are given under five major headings, as follows:

Values and computations The type system

Programs

Conveniences

Miscellaneous

There is also a sketch of the restrictions imposed by the current Cedar language on the generality of the kernel: for more on this subject, see § 3. The meaning of the various built-in primitives is given in § 4. The incompatibilities between the kernel language and current Cedar are described in § 2.9. i.e., the constructs in Cedar which would have a different meaning in a kernel program. For the most part, these are bits of syntax which do not have consistent meanings in current Cedar: future evolution of the language will replace them with their kernel equivalents.

§ 2.2 precisely defines the syntax and semantics of the Cedar kernel language, the former with a grammar, and the latter by explaining how to take a program and deduce the function it computes and the state changes it causes. The kernel definition follows the ordering of the kernel grammar. This section is rather difficult to read, and you may prefer to skip it.

2.1 Overview

This section gives a brief summary of the essential concepts on which the Cedar language is based. The explanations are informal and incomplete. For more precise but more formal definitions, see § 2.2: for more explanation, see § 2.3-§ 2.8.

2.1.1 Values and computations

Application: The basic mechanism for computing in Cedar is applying a procedure (proc for short) to arguments. When the proc is finished, it returns some results, which can be discarded or passed as arguments to other procs. The application may also change the values of some variables. In the program an application is denoted by (the denotation of) the proc followed by square brackets enclosing (the denotation of) the arguments: f [first-3, last– x+ 1]: here the – symbol binds the value of the expression on the right to the name on the left. There are special ways of writing many kinds of application: x+ 1, person.salary. IF x<3 THEN red ELSE green, x<-7 .

Value: An entity which takes part in the computation (i.e., acts as a proc, argument or result) is called a value. Values are immutable: they are not changed by the computation. Examples: 3, TRUE, "Hello". X [x: !NT] IN x+ 3: actually these are all expressions which denote values in an obvious way. The X-expression denotes a proc value P: the name x is called a parameter. When P is applied to an argument, the parameter x is bound to the argument.

Variable: Certain values. called variables, can contain other values. The value contained by a variable v (usually called the value of v) is returned by v.VALUEOF, and can change when a new value is assigned to v. In addition to its results, a proc may have side-effects by changing the values of variables. Nearly every non-variable type T has a corresponding variable type VAR T; values of type VAR T contain values of type T. Every VAR type has a NEW proc which creates a variable of the type. A variable is usually represented by a single block of storage: the bits in this block hold the representation of its value. A variable may be local to a proc. or it may be created by an explicit call of NEW, and referred to by a REF or pointer value.

Group: A group is an ordered set of values, often denoted by a constructor like this: [3, x+ 1, "Hello"]. Like everything else. a group is itself a value.

Binding: A binding is an ordered set of [name. value] pairs, often denoted by a constructor like this: [x: INT-3, y: BOOL^—TRUE] (or simply [x-3, y^–TRUE], in which the types of the names are the syntactic types of the expressions). If b is a binding. b.n denotes the value of the name n in b. Note the difference between binding and assignment: one introduces a new name with a fixed value: the other changes the value of a variable.

Argument: A binding constructor written explicitly after an expression (e.g., Copy{from– x, to–y]) denotes application of the value P denoted by the expression to the value a denoted by the constructor, called the argument. P is usually a proc, and a is a binding, which is bound to P's domain declaration D to get the argument which is passed. In making this binding a is coerced, if necessary, to match the declaration:

If a name in D is missing from a. a default value is supplied.

If a value in a doesn't have the type required by D, it is coerced (if possible) into another value which does.

The constructor can also be for a group, in which case the names from D are attached to its elements to turn it into a binding.

2.1.2 The type system

Type: A type defines a set of values by specifying certain properties of each value in the set (e.g., integer between 0 and 10); these properties are so simple that the compiler can make sure that proc arguments have the specified properties. A value may have many types; i.e., it may be in many of these sets. A type also collects together some procs for computing with the value (e.g., add and multiply).

More precisely, a type is a value which is a binding with two items:

Its predicate, a function from values to the distinguished type BOOL. A value has type T if T's predicate returns TRUE when applied to the value.

Its cluster, a binding in which each value is usually a proc taking one argument of the type. For any expression e, the expression e.f denotes the result of looking up fin the cluster of is syntactic type V e, and applying the resulting proc to the value of e.

A proc's type depends on the types of its domain and range; a proc with domain (argument type) D and range (result type) R has the type D— )R. Every expression e has a syntactic type denoted V e. e.g., the range declared for its outermost proc; in general this may depend on the arguments. The value of e always has this type (satisfies this predicate); of course it may have other types as well.

Mark: Every value carries a set of marks (e.g., INT or ARRAY: think of them as little flags stuck on top of the value). The predicate HASMARK tests for a mark on a value; it is normally used to write type predicates. The set of all possible marks is partially ordered.

The set of marks carried by a value must have a largest member m. and it must include every mark smaller than m. Hence all the marks on a value can be represented by the single mark m: we can say that m is the mark on the value. This does not imply a total ordering on the marks.

Type-checking: The purpose of type-checking is to ensure that the arguments of a proc satisfy the predicate of the domain type; this is a special kind of pre-condition for executing the proc. The proc body can then rely on the fact that the arguments satisfy their type predicates. It must establish that the results satisfy the predicate of the range type; this is a special kind of post-condition which holds after executing the proc. Finally, the caller can rely on the fact that the results satisfy their type predicate. In summary:

Caller— establish pre-condition: arguments have the domain type;

rely on post-condition: results have the range type.

Body— rely on pre-condition: parameters have the domain type;

establish post-condition: returns have the range type.

Declaration: A declaration is an ordered set of [name, type] pairs. often denoted like this: [x: INT, y: BOOL]. If d is a declaration, a binding b has type d if it has the same set of names, and for each name n the value b.n has the type dn. A binding b matches d if the values of b can be coerced to yield a binding b' which has type d.

A declaration can be instantiated (e.g., on block entry) to produce a binding in which each name is bound to a variable of the proper type; instantiating the previous example yields

[x: VAR INT^—(VAR INT).NEW, y VAR BOOL-'(VAR BooL).NEw].

Class: A class is a declaration for the cluster of a type. For instance, the class Ordered is [T: TYPE. LESS: PROC[T, T]--0^.[Boot1, . . 1. C is a subclass of D if (loosely) C includes at least all the [name. type] pairs in D.

2.1.3 Programs

Name: A name (sometimes called an identifier) appearing in a program denotes the value bound to the name in the scope that the name appears in (unless the name is in a pattern before a colon (declaration or binding) or tilde (binding), or after a dot or $). An atom is a value that can be used to refer to a name; a literal atom is written like this: $alpha.

Expression: In a program a value is denoted by an expression, which is one of: a literal value-3 or "Hello":

a name– x or salary:

an application of a proc value to a group or binding value– GetProperties[directory. input]:

a A-expression. which yields a proc value– A [x: INT]=>[INT] IN (IF x<0 THEN –x ELSE x);

a constructor for a declaration or binding–[x: INT-3, y REAL-3.14D.

If a value is given for each free name in an expression, then it can be evaluated to produce a value. Thus an expression is a rule for computing a value. The entire program is a single expression. made up of sub-expressions according to the five constructs above.

Scope: A scope is a region of the program in which the value bound to a name does not change (although the value might be a variable, whose contents can change). For each scope there is a binding called ENV (for environment) which determines these values. A new scope is introduced (in the kernel) by IN (after LET or A) or by a REC [...] constructor for a declaration or binding; e.g..

LET x"3 IN x-l- 5:

LET REC Fact–A[n: INT]=>[r: INT] IN (IF n THEN 1 ELSE rt*Fact[n-1]) IN Fact[4].

The first expression evaluates to 8. the second to 24.

Constructors: Brackets delimit explicit constructors for group. declaration or binding values. They all have the form [x₁, x₂....], and are distinguished by the form of the x.:

an expression for a group; n: e for a declaration:

n–e or n: _e_1–e2for a binding.

Recursion: When names are introduced in a constructor in Cedar, this is done recursively:

If v is bound to n in a binding constructor, then in expressions in the constructor n has the value v. rather than its value in the enclosing scope. Exception: argument bindings are non-recursive.

If n is declared in a declaration constructor, then it may not be used in the constructor, unless there is an ordering of the declarations in the constructor such that a name is used only by later declarations. Exception: declared names may be used in the bodies of A-expressions in the constructor (see § 3.3.4).

In the kernel, however, constructors are non-recursive unless preceded by REC.

Dot notation: The form an looks up n in some binding associated with e. and does something with the result. There are three cases:

if e is a binding, an is just the value paired with n in e.

If e is a type. e.n is e.Cluster.n.

Otherwise. e.n is (V e.n)[e], and e.n[more args] is usually (V e.n)[e, more args]. Recall that V e is the syntactic type of e.

In all cases you are supposed to think of n as some property or behavior associated with e: an denotes that property or evokes that behavior.

2.1.4 Conveniences

Coercion: Each type cluster may contain To and From procs for converting between values of the type and values of other types (e.g.. Float: PROC[INT]-'[REAL]: this would be a To proc in REAL and a From proc in 4\T). One of these procs is applied automatically if necessary to convert or coerce an argument value to the domain type of a proc: this application is a coercion. Each coercion has an associated atom called its tag (e.g.. $widen for INT^—REAL or $output for INT—ROPE): several coercions may be composed into a single one if they have the same tag. The tags thus serve to prevent unexpected composition of coercions: all are NIL currently. however.

Exception: There is a set of exception values. An expression e denotes a value which is either of type Ce or is an exception. Whenever an exception value turns up in evaluating an expression e₁. it

immediately becomes the value of e₁. unless (in the kernel) e₁ has the form e, BuT {...}. The {...}

tests for exception values and can supply an ordinary value, or another exception. as the value of the BuT expression. An exception value may contain an ordinary value, called the argument of the exception. so that arbitrary information can be passed along with an exception.

Finalization: When a variable is no longer accessible, the storage it occupies is freed (automatically in the safe langua^ge). Before this is done. a finalization proc in the cluster of the variable's type is called to do any other appropriate resource deallocation. Finalization is done by separate processes. and hence must be explicitly synchronized with the rest of the program. The local variables of a proc or other scope may also be finalized (using UNwiND): this is done synchronously (§ 3.4.3A).

Safe: The safety invariant says that all references are legal, i.e., each REF T value is NIL or refers to a variable of type T. A proc is safe if it maintains the safety invariant whenever it is applied to arguments of the proper types. If a proc body (A-expression) is

checked, the compiler guarantees that the proc value is safe:

trusted. the programmer asserts that it is safe (the compiler makes no checks): the proc value is safe: unchecked, the compiler makes no checks and the proc value is unsafe.

It is best to write checked code whenever possible. However, checked code cannot call unsafe procs (since the compiler then cannot guarantee safety).

Process: Concurrency is obtained by creating a number of processes. Each process executes a single sequential computation. one step at a time. They all share the same address space. Shared data (touched by more than one process) can be protected by a monitor: only one process can execute within the procs of the monitor at a time. So that each process can know what to rely on. there must be an invariant for the monitored data which is established whenever a monitor proc returns or waits. A process can wait on a condition variable within a monitor: other processes can then enter the monitor. The waiting process runs again when the condition is notified. or after a timeout.

2.1.5 Miscellaneous

Allocation: Cedar has standard facilities for allocating new variables of any type (the NEw primitive): related variables can be allocated in the same zone. Normally. variables are deallocated automatically by the garbage collector when they can no longer be referenced: such variables can only be referred to by REFS. Variables can also be deallocated explicitly by FREE. but this is unsafe.

Static: An expression whose value is computed without executing the program is called static. Literals are static. as are names bound to literals. and any expression with static operands. Proc bodies are never static unless they are inline, and often not then.

Pragma: Some language constructs do not affect the meaning of the program (except possibly to make a legal program illegal). but only its time and space costs: these are called pragmas. Examples are INLINE for proc bodies and PACKED for arrays.

2.2 Kernel definition

This section gives the syntax and semantics of the Cedar kernel language. Motivation, and an explanation of the relation between the kernel and the current Cedar language, can be found in §§ 2.3-2.8. Since this section is rather formal, you are advised to read the rest of the chapter first, and then return here if you want a more precise definition.

The kernel is subdivided into

A rather austere core; anything can be desugared into this, but not very readably (§2.2.1). A set of conveniences: with these, readable programs can be written (§ 2.2.2). Imperative constructs: statements and loops (§ 2.2.3).

Exception handling (§ 2.2.4).

The format of this section interleaves grammar rules which give the syntax of the language with text which gives the meaning. The meaning of the core is given in English. For other parts of the kernel, it is given by desugaring rules which show how to rewrite each construct in terms of others: if rewriting is done repeatedly. the result is a core program, which may invoke some primitives. The meaning of these is also given in English. There is also some English explanation of the desugaring. but this is only a commentary and does not have the force of law.

See § 3.1 for the notation used in the grammar and desugaring.

2.2.1 The core

The Cedar core is a minimal subset of the kernel, barely adequate as a base into which the rest of the kernal can be desugared. In the core, there is syntax only for names. literals. application, X-expressions. a basic and a recursive binding construction, and syntactic type: everything else is done with primitives. We never write anything in the core, however, except to show the desugaring of a kernel construct. Thus the reader need not struggle with programs in the ugly core syntax.

Many readers may be happy with the kernel definition given in the other sub-sections of § 2.2, and may wish to avoid the formalism of this section.

Table 2-1 gives the core syntax (in the first column). together with a comment suggesting the meaning of each construct (in the last column). The meaning is given in detail in § 2.2.1A-G. The middle column gives the syntactic type of each construct. For readability, this is written in the full kernel language. with a few conventions:

a * in front of the syntactic type indicates that it gives less information that one would like. For instance. DDOTP has type DECL-⁴TYPE. which says nothing about the fact that the type is a cross type whose structure matches the structure of the decl.

A parameter to a primitive declared with :: is the type of some other argument: the argument for this type parameter may be omitted in an application of the primitive, in which case it is supplied as the syntactic type of the other argument. For instance, p: [r: TYPE. x: t]^—*[...] can be applied with p[x-3], which is short for ,[r—INT. x-3].

A bold name is a reference to another parameter, e.g.. t in the previous example.

In the kernel, a core primitive named xooly is in the cluster of the type of its argument under the name y. Thus DDOTP is in the cluster of DECL under the name P. so that d.P=DDOTP[d] if d is a decl.

name ::=

letter (letter I digit)... literal ::=

$n[

primitive
primitive ::=

ARROW I

DOMAIN I RANGE I MKPAIR

GROUP I

MKCROSS I

CDOTG MKBINDD

BDOTDIBDOTVI MKBINDP I LOOKUP[ THEN

ENV I

MKDECLI

DDOTP

DDOTT I DTOB I

BTOD

THEND I

(VENv).n -- Appears as an e or in a pattern.

A^-10M .;TOM literal. I
V primitive

[d: DECL, p: (d—>DECL)]—qa: --arrow--fl PE] --arrow--TYPE]^—,[r: TYPE]

[r_i:: TYPE, firs:: t₁,t,::TYPE. rest: t_]—*[v: t₁Xt₂]

[r_i:PE]—[t:TYPE] --iTYPE

[g: GROUP[TYPE]]^—)[c: --cross--TYPE]

*Et: --cross--TYPE]^—+[g: GROUP[TYPE]]

Ed: DECL, v: d.-rho[b: d]

[b: BINDING]-⁰[d: DECL] [d:: DECL, b: [1]-4[v: d.T

[p: PATTERN, I:: TYPE, v: t]-^3,[b: MKDECL[p. t]] =MKBINDD[d—MKDECap, t]. v— v] Ed:: DECL, b: d. n: ATOM]^—+[v: oToB[d].n]

DECL, b1• ^d1' d2' • : DECL, b,: dd,]-->[v:d,]-->[v:d1 THEND d, ]

^*BINDING

*[p: PATTERN, t: TYPE]— [d: DECL]

*[d: DECL]^—*[p: PATTERN]

*[d: DECL]^—>[l: TYPE]

*[d: DECL]^—)[b: BINDING] --=MKBINDP[p^—d.P, v^—dT.G]]

*[b: BINDING]^—*[d: DECL] --=MKDECL[p^—b.D.P, t^—MKCROSS[b.V]] DECL, d₂: DECL]^—.[v: DECL] --=BTOD[DTOB[d_i] THEN ErOB[d_i]l

An expression denotes a value according to a rule which depends on its syntax. If the expression is a name or literal, the value is the value of the name or literal. The remaining cases are discussed in the following sub-sections. Most of these cases define the value of the expression in terms of the value of its sub-expressions. The sub-expressions may be evaluated in any order.

A. The current environment ENV

The current environment ENV is a binding. The value of the expression n is ENV.n. ENV for a sub-expression is the same as ENV for its containing expression, except that:

For the b of a closure being applied, ENV is computed according to B below.

For the e of a FIX. ENV is computed according to E below.

Thus applying a closure and evaluating a FIX are the only ways to change ENV.

B. Application

The value of a standard application is obtained by evaluating e_l and e₂to obtain v₁ and v,, and applying v₁ to v₂. There are two cases for application:

v₁ is a primitive. The value of the application is a function of v₂ given in the definition of

the primitive. The core primitives are defined throughout § 2.2.1. the Cedar primitives in §4.

v is a closure c (C below), with domain declaration d, body b and environment E. The value of the application is the value of the expression b in the environment

MKBINDD[d, v₂] THEN E

(E below). Note that if the closure was made with A, the body must be type-checked when it is applied: a closure made with A was type-checked when it was made (C below).

Ve_t must be an arrow type. An application type-checks if Ve, implies Ve_rDOMAIN (G below). The type of the application is obtained by applying Ve₁.RANGE to v₂.In simple cases. Ve₁.RANGE is a

constant. For instance. NOT: BOOL^—>BOOL has RANGE=A BOOL=>TYPE IN BOOL. However, the result type may depend on the argument value. Thus

VMKBINDD.RANGE= A [d: DECL, v: d.T] = >TYPE IN [b: d]

so that MKBINDD[[i: INT]. 3] has type [b: [1: INT]] to go with its value [b–P-31].

C. Lambda

The value of a A-expression is a closure, which has three parts:

A domain declaration d. equal to the value of d₁.

A body b. which is the expression e (not the value of e).

An environment E. equal to the current environment ENV when the A is evaluated. A A-expression type-checks if

d₁ evaluates to a declaration d.

For any x of type d.T. Ve implies d₂.T in the environment MKBINDD[d. x] THEN E.

A A-expression type-checks if d₁ evaluates to a declaration: type-checking of the body is deferred until the closure is applied.

D. Pairs, groups and cross types

A pair is the basic structuring mechanism. MKPAIR[x, y] yields the pair <x. y>. Bigger structures are made, as in Lisp. by making pairs of pairs. When we are interested in the leaves of such a structure. we call it a group and call the leaves its elements. A group has type GRouP[7] if all its elements have type T or are NIL. A flat group is a pair in which first is not a group, and rest is a flat group or

NIL.

The type of a pair is a cross type: MKPAIR[x, y] has type TX U iff x has type T and y has type U. Cross types are made with MKCROSS, which turns a GROUP[TYPE] (i.e., a group whose elements are types) into a cross type in the obvious way:

MKCROSS[NI L] = NILTYPE

MKCROSS[T] = T if T is a type.

MKCROSS[ MKPAIR[x, y]]=MKCROSS[x]XMKCROSS[y]

Note that MKCROSS of a flat group is flat. CDOTG goes the other way, turning a cross type into a GROUP[TYPE] in which no element is a cross type. Thus MKCROSS is the inverse of CDOTG, but not necessarily the other way around.

E. Bindings

A binding is either NIL, or an <atom, value> tuple, or a <binding, binding> tuple. The primitive MKBINDD constructs a binding from a declaration d and a matching value v, i.e. (as the type of MKBINDD indicates), one with the type d.T. The resulting binding has type a and consists of the names from d paired with the corresponding values from v. Example:

MKBINDD[ [x: INT, b: BOOL]. [3, TRUE]] = [x-3, b–TRUE]

= < <$X, 3>, «$b, TRUE>, NIL > >

In this example. d-r is INTXBOOL.

The declaration and group in this example is written using the syntax of §2.2.2: in the core they would be MKDECL[p^—[Sx. Sb], r—imcRossllivr. BooLl] ] and kiK^pain[firsi-3. rest^—mKPAIRUirsi^—TRUE. rest^—NIL]] (where we have written the arguments of these primitives in the kernel syntax).

The primitives BTOD and BTOV return the arguments of the MKBINDD primitive that made the binding. MKBINDP is redundant: it is like MKBINDD, but takes a pattern instead of a declaration, and hence accepts any v with the right structure, regardless of the component types.

LOOKUP returns the value of the name n in the binding. THEN combines two bindings, giving priority to the first one in case of duplicate names. It works only for flat bindings, in which the first element of each <binding, binding> tuple is an <atom, value> tuple, and the second element is another <binding. binding> tuple or NIL. The value of b₁ THEN b₂ is another flat binding, obtained

by first replacing any tuple <<a, v>, b> in b₂ where a is equal to an atom in b₁by b. and then using this binding to replace the final NIL in b₁.

The binding constructor [(n^–e), has the value MKBINDP[p^—In, ...], v–[e, ...] ].

FIX makes a recursive binding: the value of FIX d₁–e is MKBINDD[d. v], where d is the value of d₁ in

ENV and v is the value of e in the environment (LET FIX d–e IN d–e) THEN ENV. Of course in general this computation may not terminate: normally the names in d occur in e only in the bodies of A-expressions, and in this case it does terminate. The FIX typechecks if De in the latter environment implies DTOT[d].

F. Declarations

A declaration is either NIL, or an <atom, type> tuple. or a <declaration, declaration> tuple. The primitive MKDECL constructs a decl from a pattern p and a value 1 of type GROUP[TYPE]. A pattern is a GROUP[ATOM]. i.e., either NIL, or an atom, or a pair of patterns: the ATOM elements must all be different. An application of MKDECL typechecks if t matches p. i.e., if

both p and t are NIL, or

p is an atom and 1 has type TYPE, or

p is a pair [p₁, p₂] and t is a cross type 1₁Xt, and p_i matches t_i and p₂ matches t₂.

The resulting declaration consists of the names from p paired with matching type values from I.

The primitives DDOTP and DDOTT return the arguments of the MKDECL primitive that made the declaration. Thus

DDOTT[NIL]=NILDECL;

DDOTT[<$n, 7)] = T:

DDOTT[<d_t. d₂>]=DDOTT[d₁]XDDOTT[d₂]

DTOB is redundant: it converts a declaration to a binding in which each name has the corresponding type as its value. Thus DTOB[[x: INT, y: REAL]]=[x-INT, jr-REAL]. The inverse is BTOD. also redundant: it is defined only if all the values in the binding are types. THEND combines two declarations just as THEN combines two bindings: V(b₁ THEN b₂)= Vb_i THEND Vb₂

G. Types and type-checking

A type is a value consisting of a pair:

the predicate, a function from values to BOOL. the cluster, a binding.

A value v has type T if Ts predicate applied to v is TRUE. T implies U iff (V x) T.Predicate[x]U.Predicate[x].

Typechecking consists of ensuring that the argument of an application has the type specified by the domain of the proc (B above). The body of a A-expression can then be type-checked (or the implementation of a primitive constructed) independently, assuming that the parameter satisfies the domain predicate. Symmetrically, the result of an application can be assumed to have the type specified by the range of the proc.

To complete the induction. it is also necessary to check that the value of the body of a A-expression has the range type (C above).

The primitive types in the kernel are:

BOOL, with two values TRUE and FALSE.

ATOM, with values denoted by literals of the form $n. TYPE, a predicate satisfied by any type value. ANY, a predicate satisfied by any value.

DECL. the type of a declaration (F above).

BINDING, the type of any binding.

Arrow types, the types of procs (C above). An arrow type has a domain type and a range type.

Cross types, the types of pairs (D above).

GROUP[7], the type of any pair in which all the elements have type T. Declarations, the types of bindings (E and F above).

There are no non-trivial implications among any of these types, except as follows:

DECLTYPE: BINDINGTYPE: GROUP[I]TYPE.

TANY for any type T.

T₁XT₂= U _IX U₂ iff T_i=U_i and T₂= U₂.

GROUP[T]=^.GROUP[U] if U.

T₁-+T₂U₁-1₂ T₁ and (Vx: U₁)(A T₁ IN T₂)[xl(X U₁IN L12)[x]. Note the
reversal of the domains.

d₁'d₂for declarations iff d_i.P=d₂.P and oToB[d₁].nDTOB[d₂J.n for each n in d₁.P.

2.2.2 Conveniences

Table 2-2 gives the syntax and semantics for kernel expressions. Most of this is straightforward sugar. LET adds the binding e_lto ENV in evaluating e₂.The separate case for b, ... simply allows the

H which normally enclose a binding constructor to be omitted in this case; see below. IF wraps e₂and e₃in A's so that they don't get evaluated: the iFPROC primitive chooses the one to evaluate and applies it.

The dot notation has three cases.

For a binding it just looks up n in the binding.

For a type it looks up n in the type's cluster.

For anything else, it looks up n in the cluster of V' e and applies the result to e. The special LOOPUPC primitive does something special if it finds a proc which takes more than one argument: it splits the proc into one which takes the first argument and returns a proc taking the remaining arguments. This ensures that if V e.n is such a proc P, the expression en[a b] will desugar into something equivalent to P[e a b].

The usual syntax for application is a proc e_lfollowed by an explicit binding constructor. The kind of application may depend on the type of e_l,via the APPLY element of its type; for a proc applied by the standard apply operator 4, APPLY is the identity. If e₁is followed by an group rather than a binding constructor, the argument is obtained by binding the group to the declaration which is e_i's domain.

Infix operators desugar straightforwardly into application: note that the choice of proc is determined by the type of the first operand only. AND and OR are not ordinary infix operators. since they evaluate no more than necessary; this is expressed by the desugaring into IF.

The remaining expression syntax is various constructors, described below, and the imperative and exception features described in the next two sections.

expression = coreExpression d₁ —> d₂

A(l^e_l)(I=>^e,)^We₃1

LET e_l IN e₂ I

LETb....INeI

IF e_l THEN e₂ ELSE e₃

e n

^...1I

e_l infixOp e₂ I

e₁ AND e₂ I e_l OR e₂ I

PATT ^pI [ b, !.. 1 I

REc[(p : t"'e)....]I

[ d. !.. ] X X

statements I simpleLoop but

infixOp ::=

PLUS THEN

literal :: = coreLiteral digit digit ... I

declaration ::= p: t 1

[(^p:I

binding :: =

1⁰-^el
d e I
pattern ::=

n I

^[pi^.-¹

primitive :: = corePrimitive

LOOKUPILOOKUPCI

PLUS I IFPROC

ARROW 4 [d_i.A d_i= >DECL IN d₂]

-- The domain defaults to [], the range to Ve_t (A Ve_i IN e, ) 4 e_r^ye_i a binding I

LET [b....] IN e

iFPROC[Ve₂. e_i,A IN e₂. A IN e₃11 []

IF VeBINDING THEN LOOKUP 4 Eve. $n]

ELSE IF VeTYPE THEN LOOKUP I [V e.cluster, $n]

ELSE ( LOOKUPC 4 [V e.cluster, $n] ) 4 [e]

e_l . APPLY 4 [b, ...] I e_l . APPLY I MKBINDD[Ve_i.DOMAIN, [e,....]

e_i . infix0a[e₂]

IF e₁ THEN e₂ELSE FALSE 11F e THEN TRUE ELSE e₂ 1

NIL I mkPAIR[e_l, [ (I e₂, )]-- Group constructor. I -- Pattern constructor: see the rule for p below. I

b PLUS ... PLUS NIL I

FIX [p, ...] : MKCROSS[[t, ...11-[e, ...]

d PLUS ... PLUS NIL I

xxxxxx I --Also recursive d maps into this? -- See § 2.2.3

-- See § 2.2.4.

MKCROSS

INT -- Numeric literal, giving the decimal representa
-- A d is not an e: a d must be before - or after LET or DECL. MKDECL[ PATT p, t]

[p, ...]: MKCROSS[[t ...]] -- to separate names and types
-- Only the [...] form is an e: a b must be written after LET. I

MKBINDP[PATT p, e] MKBINDD[LI, e]

-- Note: a pattern is not an e: it can appear only before - or or after PATT in the kernel.

PATT n=$n

PATT [p_i, ...]=[PATT p_i, ...]

-- Fill in types

Constructors

A bracketted sequence of expressions (e.g., [1, 2. 3]) denotes a flat group with its elements in the same order (e.g.. mKPAIR[1, MKPAIR[2. mkPAIR[3, Nit]]]. Thus a group constructor is just like the LIST function in Lisp. A pattern is a similar construct, except that it contains names which stand for the corresponding ATOM literals; PATT yields the group obtained by replacing each name n by the literal $n. After desugaring a pattern always appears after PATT and hence is always desugared into an atom or a GROUP[ATOM].

Brackets are also used to delimit binding and declaration constructors. They are distinguished from each other, and from group constructors, by the presence of – in each element of a binding constructor, and : in each element of a declaration constructor. The elements of a binding or declaration constructor are sugar for applications of the MKDECL. MKBINDP and MKBINDD primitives. The constructor itself strings the resulting declarations into a big one using the PLUS operator. which is just like THEN except that it does not allow duplicate atoms; the motivation for this is to allow the names and corresponding types or values to be written together, instead of factored as the primitives require. As a result, values made from constructors are always flat.

Note that these constructors do not nest, so they can only be used to build flat values. The only exception is that a d can be [(p: ... ]. This is intended for the d–e form of binding; e.g., if DivRem returns two INTS, you can write [d: INT, r. INT]-'DivRem[...] instead of [d, r]: INTXINT–DivRem[...].

The REC binding constructor is sugar for FIX which exactly parallels the non-recursive one.

2.2.3 Imperatives

These constructs are generally used together with non-functional procs.

statements ::= e; } IF (IsvoiD[e]) AND ... THEN [] ELSE ERROR

-- Ordering by non-prompt evaluation.

simpleLoop ::= SIMPLELOOP statements LET REC [loop'–(X IN { statements: loop[] })] IN loop[]

-- Only an exception (such as EXIT) will terminate the loop.

Each e in the statements must evaluate to VOID, which is a distinguished null value; this is to catch mistakes like writing x+1 as a statement. The definition of AND ensures that the is are evaluated left-to-right.

The simpleLoop is the standard way to express a loop in terms of recursion. You are supposed to use an exception to get out of this loop; Cedar provides a number of convenient ways to do this, such as EXIT and RETURN.

2.3 Values and computations

A computation in Cedar is the evaluation of an expression in some environment. This section describes the kinds of values which can be computed by Cedar programs, and the basic mechanisms for doing computations.

2.3.1 Application

The basic mechanism for computing in Cedar is applying a proc to argument values. A proc is a mapping

from argument values and the state of the computation. to result values, and a new state of the computation. The state is the values of all the variables.

A proc is implemented in one of two ways:

By a primitive supplied as part of the language (whose inner workings are not open to inspection, but which is defined in § 4).

By a closure, which is the value of a A-expression whose body in turn consists of an expression, which may contain further applications of procs to arguments. e.g., A [x: INT] IN x+ 3. When a closure is applied, the parameters declared after the A are bound to the arguments, and then the body after IN is evaluated in the new environment thus obtained.

In Cedar, each parameter value thus obtained is used to initialize a variable, which is the object named by the parameter in the body. Thus the body can assign to the parameters. Use of this feature is not recommended.

Note that when a A-expression is evaluated to obtain a closure its body is not evaluated, but is saved in the closure, to be evaluated when the closure is applied. Some constructs (IF, SELECT, AND, OR) are defined (see § 2.2.2 and § 3.8) by wrapping A-expressions around some arguments, and then applying them only when certain conditions hold; e.g., IF b THEN f[x] ELSE g[y] evaluates f[x] iff b iS TRUE and g[y] iff b iS FALSE.

Application is denoted in programs by expressions of the form fiarg, arg. mi. If the value off is a closure, this expression is evaluated by evaluating f and all the arg's, and then evaluating the body of the closure with the formal parameters bound to the arguments (unless an exception value turns up; see § 2.6.2). Thus to evaluate (A [x: INT] IN x+3)[4]:

evaluate the A-expression to obtain a closure;

evaluate the argument 4 to obtain the number 4; evaluate x+ 3 with x bound to 4 to obtain the number 7.

The first two evaluations can be done in either order (with different results in general, though not in this case).

To evaluate a primitive application such as x+3, evaluate the arguments, and then invoke the primitive on those arguments to obtain the result and any state change. With a few exceptions (e.g., assignment and dereferencing or following references), primitives are functions and can be thought of as tables which enumerate a result value for each possible combination of arguments. Invoking a primitive can therefore be viewed as a simple table lookup using the arguments as the table index.

Actually there may be one more step in an application. If an argument doesn't have the type expected by the proc, the argument is coerced to the proc's domain type if possible. If no coercion can be found, there is a type error. Coercion is discussed further in § 2.6.1 and § 4.13.

Most procs take a binding as argument, in which the various parts of the argument are named. E.g., OpenFile: PROC[name: ROPE, mode: Files.Mode] takes a binding with two values named name and mode. It might be applied like this: OpenFile[name–"Budget.memo", mode–$read]. If the names are missing, there is a positional coercion which supplies them left-to-right, see § 2.3.6. There is also a defaulting coercion that supplies missing parts of the binding: see § 4.11.

If f is neither a primitive nor a closure, the meaning of applying it is defined by the APPLY proc for its type: this case is discussed further in § 4.4.

There are many ways of writing applications other than f[x]. In fact, many Cedar primitives cannot be the values of expressions. and can only be applied by writing some other construct. The desugaring rules show how large parts of the Cedar syntax denote various special kinds of application. In each case. the meaning is defined by the standard meaning of application and the specific meaning of the primitives involved: see § 4.1.

This is partly because of history. and partly because specialized syntax makes the program more readable. Future evolution of the langua^ge will improve the situation.

Functions and order of evaluation

An expression is functional if

its value does not depend on the state, but only on the values bound to its free names, and evaluating it does not change the state.

As a consequence of this definition,

Two identical functional expressions in the same scope will always have the same value.

Note that a functional expression must not depend on values contained in variables bound to its free names. Thus. v.VALUEOF is not functional.

A proc is a function if every application of it is functional. It doesn't matter when or how many times a function is applied: the order of evaluation doesn't matter for functions. Thus Cedar functions can be thought of as mathematical functions for many purposes. Note that a constant can be regarded as an application of a function of no arguments.

Non-functional procs, on the other hand, are more complicated objects. Cedar makes no formal distinction, either in syntax or in the type system, between functions and procs. However, it does not define the order of evaluation in an expression, except that:

all arguments are evaluated before a proc is applied:

because of the desu^garing of IF, SELECT, AND and OR into A-expression. the order of evaluation for these expressions is determined by the first rule:

statements separated by semi-colons are evaluated in the order they are written.

As a consequence, two applications of non-functions should not be written in the same statement unless they don't affect each other: if this is done the effect of the program is unpredictable.

An expression is guaranteed to be functional if it only applies functions: thus if f is a function, p a non-functional proc, and x a variable, f[3] is functional and p[3] and p[ x] may not be. Furthermore. f[x] may not be functional, because it is sugar for f[x.vALuEoF], and VALUEOF is not a function. The value of a A-expression is a function if its body is functional. There are more complicated ways of ^guaranteeing that an expression is functional, just as for any other interesting property.

Because the values of variables constitute the state. it is only the existence of variables that allows non-functional procs to exist. In particular. the VALUEOF proc which returns the value of a variable is non-functional (because its result depends on the state), and the ASSIGN proc which changes the value of a variable is non-functional (because it changes the state).

2.3.2 Values

A Cedar program manipulates values. Anything which can be denoted by a name or expression in the program is a value. Thus numbers, arrays, variables, procedures, interfaces, and types are all values. In the kernel language, all values are treated uniformly. in the sense that each can be:

passed as an argument, bound to a name, or returned as a result.

These operations must work on all values so that application can be used as the basis for computation and A-expressions as the basis for program structure. In addition, each particular kind or type of value has its own primitive operations. Some of these (like assignment and equality) are defined for most types. Others (like addition or subscripting) exist only for certain specific types (numbers or arrays). None of these operations. however, is fundamental to the language. Formally. assignment or equality has the same status as any operation on an abstract type supplied by its implementor: thus INTEGER.ASSIGN has the same status as 10.GetInt. In practice, of course, special syntax is usually used to invoke these operations, and the implementations are not Cedar programs open to inspection by the editor or debugger. A complete description of the primitives supplied by the language can be found in Chapter 4, organized by the type of the main operand. Table 4-5 is an alphabetized index of these descriptions.

Restrictions on types, declarations, bindings and unions: In current Cedar, however, there are restrictions on values which are types, declarations or bindings: they can only be arguments or results of modules, and hence are first-class values only in the modelling language, and not within a module. Also, declarations and bindings cannot be constructed or bound to identifiers within a module. Unions are also restricted: they can only appear inside records. Nonetheless, it is simplest to emphasize the uniform treatment of all values, and consider separately the restrictions on types, declarations, bindings and unions. Future evolution will improve this situation.

Restriction on dot notation: In current Cedar you can only use dot notation for some operations of built-in clusters: the procs which access record fields, and others as noted in Table 4-5. As a substitute, there are various syntactic forms which are sugar for dot notation: infix, prefix and postfix operators, built-in functions, and funny applications. These desugarings are given in rules 20-24 of the Cedar grammar in § 3.

2.3.3 Variables

Certain values, called variables, can contain other values. A variable containing a value of type T has type VAR T. If the variable doesn't allow the value to be changed. the type is READONLY T: this is not the same as T, because there may be a VAR T value which is the same container. The value contained by a variable (usually called the value of the variable) can be changed by assigning a new value to the variable. The set of all variables accessible from the process array constitutes the state of the computation: these are all the variables which can be reached from any process. and a variable which cannot be reached cannot affect the computation. Note that a variable value is a container, which like all values is immutable; it may help to think of it as (the address of) a block of storage. The contents of a variable can be changed by assignment. Thus the value of a variable can change. even though the value that is the variable is immutable.

A suitable abstract representation for a VAR T is a value of type [Get: []–*T. Set: T–>0]. This representation is not used in Cedar, but it clarifies the way in which variables fit into the type system: VAR TVAR U only if T and U have the same predicate, because the Get proc requires TAU and the Set proc requires UST. READONLY T corresponds to [Get: 0–>T] and a write-only variable type would be [Set: T– U].

There is a coercion (an automatically applied conversion: see § 2.6.1) from VAR T to T. so that a variable can be passed without fuss as an argument to a proc which expects a value.

Restriction on variables: In current Cedar, variables generally cannot be passed as arguments or results. The only exception is that an interface can declare a variable (called an exported variable) for which an implementation supplies a value; this is normally written x: VAR INT in the interface, but for historical reasons it is also possible to write just x: INT. Certain primitives (e.g.. dereferencing a REF or POINTER) return variables. a variable can (indeed, must) be passed as the first argument to ASSIGN, and a variable can be bound to a name by a declaration in a LET or block (LET x^— I NT.NEW IN ... binds a VAR INT value to x). For the most part, however, a program which wants to handle variables must do so at one remove, through procs or REFS (or, unsafely,

POINTERS).

A variable is often represented by a block of storage; the bits in this block hold the representation of its value. All the built-in VAR types are represented in this way. A variable u overlaps another variable v if assigning to u can change the value of v. The primitive ASSIGN procs have the property that

if r and s are REFs. then rr overlaps st iff r= s.

For any variables u and v with the same VAR type, u overlaps v iff u= v, provided that no unchecked program has given overlapping blocks of storage to the two variables (if u and v have different types, one might be contained in the other).

The role of variables in non-functional expressions is discussed in § 2.3.1.

2.3.4 Groups

There is a basic mechanism for making a composite value out of several simpler ones. Such a composite value is called a group. and the simpler values are its components or elements. Thus [3, x+ 1, "Hello"] denotes a group, with components 3, the value of x+ 1, and "Hello". The main use of explicit groups is for passing arguments to procs without naming them (these are sometimes called positional arguments). This is done by binding the group to the declaration which is the domain type of the proc; the result is a binding which is the argument the proc expects. Thus, with P: [x: INT, y: R EA L]--q. . .1, the application P[2. 3.14] is sugar for P[ [x: INT, y REAL]-'[2, 3.14] ], which is equivalent to P[x-'2. y-3.14].

A group has a type which is the cross type of its component types: if x has type T and y has type U, then [x y] has type TX U. Thus for syntactic types, V[e_i, e₂. ]=Ve_/XVe₂X ... The X type

constructor is associative, and type implication (§2.4.2) extends to cross types elementwise. If the T

are types, there is a coercion called MKCROSS from [T₁.T₂....]to T₁X T₂X ...; because of this, the

explicit cross type is usually not needed.

Restriction on cross types: Current Cedar provides no way of making cross types except as domain and range types of a proc type (or other transfer type); e.g.. PROC [INT, REAL]^—+[BOOL, ATOM]. There are no procs taking groups except the group-to-binding coercions. Hence the only thing to do with a group is pass it to one of the built-in coercion procs by writing it as a proc argument or to a record or array constructor as described in the next section. Current Cedar does not have X, but it does have the MKCROSS group to cross type coercion described in the last paragraph and illustrated in the example.

2.3.5 Bindings

A binding is a group in which each component has a name. Thus, it is an ordered set of [name, value] pairs. There are three main uses for a binding:

• As an argument in an application. Thus, if P is a proc with type PROC[i: INT, b: BooL], its argument must be a binding such as [1-3, b^—TRUE]. The application then looks like this: P[i-3, b^—TRUE]. A binding argument is sometimes called a keyword argument list. See the next section for details.

· In a LET expression, to give names to values in the scope of the LET. Thus,

LET 1-3. b^—TRUE IN (IF b THEN 1+5 ELSE 0)

has the value 8. Current Cedar doesn't have LET expressions. but a binding at the beginning of a block has the same effect. See § 2.5.4 on scopes for details.

· As a way of collecting and naming a set of related values. A value can be extracted from the set using dot notation. Thus if b is the binding [1-3, b^—TRUE], the value of b.i is 3. In current Cedar this only works for interfaces; see § 3.3.4 and § 4.14 for details.

A binding is usually denoted by a constructor, which takes the form

[1-3, b^—TRUE]

or redundantly (if there are no coercions)

[i: INT-3, b: BOOL^—TRUE]

in which the types are specified explicitly (but you can't write the second form as the argument of an application). See § 2.5.5 on constructors for details.

2.3.6 Arguments

When a group or binding is bound to a declaration (d–v), there are various conversions called coercions which may be applied to the values. This usually happens when the arguments of a proc application are bound to the parameter declaration.

First, if v is a group rather than a binding, it is coerced to a binding by attaching the names from d to the elements of v in order. Thus in

[a: [NT, b: REAL]–[2, 3.14]

the group constructor is coerced to [a-2, b-3.14].

Next, if v is shorter than d, elements of the form n^—OMITTED are appended. where n is the corresponding name from the declaration. Thus in

[a: [NT, b: REAL]^—[2]

the group constructor is coerced to [a-2, b^—OMITTED].

Now the items of the binding are matched by name with the items of the declaration. There is an error unless the names match exactly. The remaining coercions are done on individual items. n: from the declaration and the corresponding n–v from the binding. If v has a type implying 1. all is well. Otherwise, if there is a sequence of coercions from the type of v to 1. these are applied to v. If no such sequence exists, there is an error. In particular, there is a coercion from OMITTED to the default value for :, if any. Thus in

[a: INT -0, b: REAL⁴-1.1]^—[b-3.14]

the group constructor is coerced to [a-0, b-3.14], and in

[a: INT^O-0, b: REAL⁴-1.1]–I]

it is coerced to [a-0, Coercions are discussed in § 2.6.1 and § 4.13, defaulting in § 4.14.

An important special case is constructors for record and array values. A record type has a construction proc; e.g.,

TYPE^•RECORDIa: INT, b: REAL.-0.01

has a proc R.CONS of type PROC[a: INT, b: REAL€0.]—)[R]. Thus R.CONS[a-2, b-3.1416] constructs a record value. There is also a coercion from BINDING to the particular declaration RB which is the domain type of R.CONS, so that

rl: R⁴-[a-2, b-3.1416]

is short for

rl: R⁴- R.coNs[a-2, b-3.1416].

Composing the positional coercion from GROUP to RB with R.CONS makes

rl: R<-[2.3.1416]

also short for the previous line.

The same scheme works for arrays. but only an array indexed by an enumeration has a corresponding binding which can be written; the elements of an array indexed by numbers don't have names which can be written in a binding. However, the group constructor still works.

2.4 The type system

This section describes the way in which types can be used to make assertions about the program which the compiler can verify. It also discusses the role of types in organizing the names of the program.

2.4.1 Types

Types serve two independent but related functions in Cedar:

· A type contains an assertion about some property of a value, e.g.. that it is a whole number between 0 and 10 represented in a single machine word. A value which has the property is said to be of that type, or to have that type.

The assertion part of a type is called its predicate. It is a function which accepts a single value (of any type) and returns TRUE iff the value satisfies the assertion. In principle the predicate can be applied to any value at runtime, but in practice a lot of optimization is done by the compiler.

· A type contains a collection of named procs (and perhaps other values) related in some useful way. Most often, the procs of type T take a value of type T as their first argument. For example, INT has PLUS, TIMES and MINUS procs (usually written as infix or prefix operators) which can be applied to INTS. The dot notation (see § 2.4.4) makes it easy to refer to the procs in a type's collection.

The collection part of a type is called its cluster. It is simply a binding. No rules are enforced about what kind of values are in the binding. However, the idea is that the cluster is an interface for manipulating values of the type (perhaps the main or even only interface). As with any interface, a tasteful choice of names and values is important.

The predicate and the cluster serve rather different purposes:

The predicate provides the basis for type-checking (§ 2.4.2). The most important function of type-checking is to guarantee the integrity of abstract data types: this is done with basic predicates called marks (§ 2.4.3).

The cluster provides the basis for convenient naming of a large collection of procs and other values (§2.4.4). Clusters are organized into a hierarchy of classes (§ 2.4.5).

Like everything else which can be named, a type is a value. Hence there is nothing special about binding a type value to a name. If T is a type expression, the binding

U: TYPE^— T

binds Ts value to U. In the scope of U, T and U are completely interchangeable (provided T is not rebound). Furthermore, with two exceptions, all type expressions are functional: identical type expressions in the same scope denote the same type value. The exceptions are the record and enumeration type constructors, which make a distinct type each time they are used (by constructing a new mark: see § 2.4.3).

Restriction on uses of types: Current Cedar has a number of restrictions on the use of TYPE values, given in §4.8.

2.4.2 Type predicates and type-checking

Type predicates provide a way of making assertions in the program which can be checked mechanically. These assertions take the form of declarations for the formal parameters of procs. In general the checking must be done during execution. Thus, if the program says

a: ARRAY [0..10] OF INT⁴-ALL[0]:

is INT⁴- s.Readl ni:

s.PutF[ a[i]]:

there must be a check that i>0 and 1<10 just before the expression a[i] is evaluated. This is called a bounds check: if it fails there is an exception called Runtime.BoundsFault. Where did this check

come from? Note that aid is short for Va.APPLY[a. d. and Va.APPLY is SUBSCRIPT, the subscript procedure for ARRAY [0..10] OF INT. The type of SUBSCRIPT iS PROC[array: VAR ARRAY [0..10] OF INT, index: [0..10]1—qvAn INT]. So when i is passed as the index argument. the declaration of SUBSCRIPT says it must have the type [0..10]. The predicate for this type is

[x: ANY] IN HASMARK[x, INT] AND LET y^—NARROW[x. INT] IN y>= 0 AND ye= 10.

Leaving the HASMARK term for later discussion. we see that the rest of the predicate is the same as the bounds check.

The type system is designed, however, so that most assertions can be checked statically (i.e., proved), by examining the text of the program without running it. Static checking has three obvious advantages:

It reports any errors after a single examination of the program, leaving none (of this kind) to be discovered later in Peoria.

It introduces no cost in time or space for run-time checking.

The compiler can take advantage of the assertions to generate better code.

Of course, there is a corresponding drawback: the assertions made by parameter declarations must be simple enough that the compiler can reliably prove or disprove them.

The proofs done for type-checking have exactly the same form as program correctness proofs based on preconditions and postconditions. Consider a proc whose value is the A-expression

A[x:7]=>[y:U]INe.

The domain declaration [x: 7] is a precondition for the body e. This means that any application of the proc must satisfy this condition. As a consequence, the body e can be analysed on the assumption that the precondition holds, i.e., that x has type T. Similarly, the range declaration [y: U] is a postcondition for the body. This means that given the precondition, any evaluation of e must produce a value y which has type U. In summary, for the body we assume the precondition and must establish the postcondition.

To make this hang together, each application must establish the precondition; this means that the argument must have the domain type. In return, the application can assume the postcondition; this means that the result of the application has the range type. Thus we have a linkage:

argument domain range result

The result in turn will be the argument of another application. In this way the proof is extended to larger and larger expressions, and finally to the whole program. In summary:

Application — establish pre-condition: arguments have the domain type;

rely on post-condition: results have the range type.

Body — rely on pre-condition: parameters have the domain type;

establish post-condition: returns have the range type.

These proofs require showing that an expression always has a particular type T. This is done by observing that every expression has a unique syntactic type U, which is the type of every evaluation of that expression; e.g.. an application always has the range type of its proc (see below for a more detailed discussion of syntactic type). If every value of type U has type T. we are done. Hence the usefulness of type implication. One type implies another. iff (V x) T[x] U[x]; sometimes we say that T is a sub-type of U. If two types are equal, each implies the other. However, there are many other useful cases of implication. For instance. VAR INT implies READONLY INT. The type implications in current Cedar are given in § 4.12.

Of course, not all arguments are applications. The kernel grammar gives the other possible forms of argument expressions, and we enumerate the proof rules for each:

A literal is like a zero-argument proc: it has a known range (e.g.. 3 has type INT, 'A has type CHAR).

A name has the type specified in its declaration or binding.

If there is only a declaration n: T (e.g., x: INT), it must be the domain declaration

of a A-expression, and we have already seen how to ensure that the n's value has type T when the resulting proc is applied.

If there is a binding n: T—e for the name (e.g.. x: INT-3), we must check that e has type T.

A A-expression A [x: 7]=>[y: U] IN e has the type [x: U]. This works for the reason
discussed in the next paragraph.

A binding constructor [x—e. y—A has the type of the corresponding declaration. [x: V e. y: VA.

There is one more link in the chain. An application f[x] has an arbitrary expression for f not necessarily a A-expression. The requirement is that f must have a proc type. say D—⁴R: D is the domain type and R the range type. Since the type of A D=>R IN e is D-^.R, satisfying the precondition D for the application is the same as satisfying the precondition D for the A-expression, and similarly in reverse for the postcondition. The value of f may be a primitive rather than a closure obtained from a A-expression. In this case, the implementation of the primitive can still depend on the precondition and must still establish the postcondition, but since the implementation cannot be examined (within the framework of Cedar) we can say nothing about how this is accomplished. Example: INT.PLUS, which is implemented by the machine's 32-bit add instruction.

In a proc type D—*R, D and R may be declarations which provide names for the arguments and results. In general. the expression R may include names declared in D. The range type of an application then depends on the argument values.

Restriction on dependent proc types: In current Cedar only a module has a type whose range depends on its argument values; the type returned by an interface, or the interfaces exported by an implementation, may depend on the interface and implementation parameters.

As a by-product of the type-checking proof rules just given, a syntactic type is derived for every expression e in the program. It is denoted by De. and computed as follows:

for a name, the declared type; for a literal, its type;

for an application, the range type (which may depend on the argument);

for a A the obvious proc type;

for a binding constructor. the declaration obtained by pairing the names with the syntactic types of the value expressions.

Typechecking ensures that whenever e is evaluated, the resulting value will have type De (though it may have other types as well, i.e., it may satisfy other predicates). The main use of syntactic types is in connection with dot notation (see § 2.4.4).

In order to carry out the proofs described above, the compiler must either compute the values of all types, including those denoted by complex expressions such as ARRAY [i...j] OF INT, or it must be able to prove the equality of unevaluated type expressions. For the most part. current Cedar requires the former approach: hence a type expression must have value which the compiler can compute. Such a value is called static: the rules for static values are given in § 3.9.1.

2.4.3 Marks

By this point you may have thought of asking why the assertions provided by type predicates are worth all this fuss. The reason is simple: they are the basis for authenticating values of an abstract type. so the implementation can be sure that it is working on properly formed values. Suppose you are the implementer of an abstraction. e.g.. Table. You provide operations to Lookup a key in the table, to Insert a [key. value] pair, and to Enumerate the items in the table. A Table is implemented as a REF to a record containing a sorted array a of items and an INT n which gives the number of

items. Lookup is implemented by binary search. All three operations are programmed on the assumption that elements 0 through n-1 of a are sorted, and that n is smaller than the size of the array. They will not work properly if these assumptions are not satisfied, and indeed they may try to subscript the array with an out-of-bounds index or to violate other requirements of the abstractions they depend on.

Here is a lower level, but perhaps more dramatic example. The dereferencing operation t for a REF REAL returns a VAR REAL, which can, for instance, be assigned to, as in the program fragment

r: REF REAL^—NEW[REAL(-1.0];

rt^. <- 3.14159

A REF REAL is represented by the address of a four-byte block of storage which holds a REAL, and the assignment to rt stores the four bytes which represent 3.14159 into that block. If somehow a REF BOOL finds its way into r. the assignment will still store four bytes, since it doesn't know any better. But the REF BOOL points to a two-byte block: the other two bytes that will be modified belong to some unrelated variable, which will be clobbered without warning.

The second example is scarier because the consequences of the bug seem more unpredictable. In both cases, however, the fundamental problem is the same: even if the implementation is correct, the wrong thing happens because it is given an improper value to work on. Or to make the same point in different words, the implementation cannot be held responsible for bad results from one of its operations, if it has no control over the validity of the arguments it receives.

So that the implementation of an abstraction can take responsibility for correct operation. there must be a way to authenticate a value of the abstract type. In Cedar this is done by placing a mark on the value: think of it as a little flag stuck into the value. The mark uniquely identifies the abstract type, and authority to affix it is under the control of the implementation. A correct implementation will mark only values which have the properties needed for a representation of an abstract value, and if no one else can affix the mark, the implementation can be sure that every value with the mark has the desired properties.

A mark can be thought of as an abbreviation for an assertion or type invariant which characterizes a proper abstract value, such as Table or REF REAL. Such an assertion can be quite complex. In the Table example, it would say that the representation is a record of the proper form, that n is less than the array size, and that the first n array elements are sorted. In the REF REAL example, it would say that the address points to a block of storage such that at least the first four bytes don't overlap any other blocks. Such assertions are not easy to write down formally, and proving them is certainly beyond the power of any existing program. So the abbreviations are not a mere convenience, but a necessity.

A new mark can be created on demand by the primitive

CREATEMARK: PROC[Rep: TYPE, tag: UNIQUEID]^—>[m: MARK, Affix: [Rep]^— [TYPEFROMMARK[m]] ]

The primitive HASMARK tests a value for the presence of a mark, so HASMARK[x. m] tests x for the presence of the mark tn. Affix adds the mark to a Rep value.

Restriction on marks: MARK, UNIQUEID, CREATEMARK, HASMARK and TYPEFROMMARK are not accessible in current Cedar. Record and enumeration type constructors provide some access to CREATEMARK, as described below. The 1STYPE primitive, also described below, is closely related to

HASMARK.

With these facilities, it is easy to create a new abstract type. Choose its representation type, and obtain a new mark m. TYPEFROMMARK[m] with an appropriate cluster added is the new abstract type. The implementation must use Affix to mark only values which satisfy the properties it demands.

The type returned by TYPEFROMMARK[m] has the predicate

A [x: ANY] = >[BOOL] IN HASMARK[x. m]

and an empty cluster. Except for subranges and bound unions, all types in current Cedar have a predicate of this form. The built-in types (INT. BOOL etc.) come with such predicates, and the built‑

in type constructor procs (ARRAY, RECORD etc.) obtain a mark from CREATEMARK. So that two invocations of ARRAY [0..10] OF INT will produce the same type, ARRAY and most of the other constructors use a canonical encoding of the constructor and its arguments for the UNIQUEID, and hence are functional. RECORD and ENUMERATION produce a different type each time they are invoked, so they obtain fresh unique identifiers. Since the program cannot invoke CREATEMARK directly. we need not explain how to prevent forgery of UNIQUEIDS. Future versions of Cedar will address

this problem.

In current Cedar you make a new abstract type by declaring it as an opaque type in an interface:

T: TYPE[ANY]

This generates a new mark, and declares T to be a type which has that mark. You get such a type by explicitly painting some other type, normally in an implementation which exports T to the interface which declared it:

T: PUBLIC TYPE^— Interface.T PAINTED RECORD [...].

See § 4.3.4 for more details.

The implementation actually stores a mark with each variable allocated by NEW. Such a variable can be referenced by a REF. and in particular by a REF ANY value. The type of a REF ANY value can be tested at runtime using the primitive

ISTYPE: PROC[x: ANY, U: TYPE]—>[Boot]

If V e is REF ANY and RT= REF T, then the value of ISTYPE[e, is TRUE iff the predicate for T just tests for mark m. and xt has the mark VAR m. ISTYPE is described in detail in § 4.3.1. along with the WITH ... SELECT construct and the NARROW primitive, which are more powerful operations built up from ISTYPE.

For other values, there is no mark actually stored: instead, types must be computable statically using the methods described in the last section. The AMTypes interface, however, gives a way to refer to any value in a uniform way, and to test its type at runtime.

There is only room for one mark on a variable, and this must encode all the marks that the value actually carries. We arrange for this by imposing a partial order on the marks, and requiring that:

The set of marks on a value must have a maximal element.

Every mark smaller than the maximal one must be on the value.

With these rules, a single mark stored on the value is enough to code all the others. In current Cedar. a value actually has only one mark, since:

The only way to create a new mark is with the record or enumeration type constructors, or by declaring an opaque type.

When you paint a type T with the mark of an opaque type, T must be a record or enumeration type, and the opaque type mark replaces the mark it had before.

Note that VAR T, READONLY T and T are different types with different marks, although VAR TREADONLY T, and there is a coercion VALUEOF from either one to T.

2.4.4 Clusters and dot notation

It is convenient to associate with a type the procs supplied by its implementor for dealing with values of the type. This is done by putting these procs into the type's cluster. The cluster is simply a binding which is part of the type value (the predicate is the other part). There are no rules enforced about what goes into the cluster. However, there is a special dot notation which makes it desirable to populate Ts cluster with procs which take a T as their first argument. The usual effect is like this: Ln is sugar for Vt.n[1], and t.n[other args] is sugar for V t.n[t, other args].

For example, if t has type T, and a proc [T, INT]-0[Boot.] is in Ts cluster under the name P, then

the proc can be applied by an expression like t.P[3], which is sugar for V' t.P[r, 3]. The name P is looked up only in Ts cluster, not in the current scope. If Q: [T]–[INT] is also in the cluster, it can be applied with t.Q. which is sugar for V t.Q[d.

The general rule that makes this work is the following: t.n is sugar for LooKuPc[Vt, $n][t]. LOOKUPC[Vf. $n] is just V t.n. except that if Vr.n is a proc that takes several arguments. it is split into a proc that takes the first argument and returns a proc taking the remaining ones. Thus LookuPc[Vt. $ n][t] will be a proc taking the remaining arguments. and tifi[other args] = LOOKUPC[V r. $n][t][other args] will be the same as V t.n[t, other args].

Dot notation can also be used to obtain values from a binding or from the cluster of a type without any application: T.P would be the proc named P in the previous example. The possible cases of dot notation in current Cedar are described in detail in § 4.14.

Restriction on constructing clusters: There is currently no way to explicitly construct clusters. The built-in types and type constructors have clusters: they are described in detail in § 4. In addition, there is a clumsy way to provide a cluster for an opaque or record type in an interface: every proc name in the interface is put into the type's cluster. For a record. the procedures supplied by the record constructor are also in the cluster. and they win if there are name conflicts. There is one of these clusters for each type in each imported interface value: if a module imports more than one value of the same interface, however, there are severe restrictions (see § 3.3.3).

2.4.5 Declarations

A declaration is the type of a binding. Thus, the binding [x-3, y-3.14] has the type [x: INT. y: REAL]. All the relationships among types, and between types and values, are carried over elementwise to decls and bindings: the elements are matched up by name rather than by position. A decl itself simply has the type DECL.

A decl is made up of two parts: the names or pattern, and the types. The basic operation for making decls. MKDECL, takes a pattern and a type. Thus MKDECL[ PATT[x. y], INTXREAL]= [x: INT, y: REAL]. In general, a pattern is one of NIL, a simple name, or a pair of patterns. just like a Lisp S-expression. Similarly, a type argument to MKDECL is one of NIL, a type, or a cross type. The type must decompose in a way which matches the pattern. Normally. as in Lisp, we deal only in flat patterns. where the first element of a pattern is always a name. Such flat patterns are conveniently denoted by constructors of the form [x, y. ...]. The reason for defining things in terms of pairs is that it makes it much simple to write down precise rules for the semantics, using structural induction on the values.

The main use of a decl is to type-check a binding. The basic binding constructor is MKBINDD[d, e], where d is a decl and e is matching group or binding. If e is a-binding. then its structure and names must match the structure and names of d, and each element of e must have the type demanded by the corresponding component of d, after a possible coercion. Thus MKBINDD[[x: INT, y: REAL]. [x-3. y-3.14]1= [x-3, y-3.14]. This may seem pointless, but it has two important uses:

Such a binding is used to bind the argument of a proc to the domain declaration. Even though the resulting binding is the same as the argument, the type-checking is essential.

There may be coercions involved, so that the resulting binding is not the same. Coercions on the component values are discussed in § 2.6.1. There are also coercions on the binding itself, which can default missing elements: these are discussed in § 2.3.6.

If e is a group, it is first coerced to a binding by attaching the names from the decl, as discussed in § 2.3.6. Thus in MKBINDD[[x: INT, y: REAL], [3. 3.14]] the second argument is coerced to [x-3. y-3.14], and things then proceed as before.

Bindings may also be used in LET expressions. Here the types are often redundant, and it is better to use the MKBINDP primitive to bind the value directly to a pattern. The syntactic type of the result is the decl whose type is the syntactic type of the value. Thus [x-3. y-3.14] is short for

MKBINDP[PATT[x. [3. 3.14]]: its syntactic type is MKDECL[[x. y]. V[3, 3.14]]= MKDECL[[x. y].
INTXREAL]=[x: INT, y: REAL].

A decl D in a block is interpreted somewhat differently. It becomes the argument of the NEWFRAME primitive. which turns the type of the decl D.T into the corresponding vAR type VT= D.T.MKVARO, allocates a new value v of type VT, and makes the binding MKBINDP[D.P, v] over the scope of the block. Thus

lx: INT: y: REAL: ^S}

becomes

LET [x y]^—[VAR INT. VAR REAL].NEW IN S

Here D=[x: INT: y: REAL], VT=[VAR INT, VAR REAL], and v= [VAR INT, VAR REAL].NEW. Note that

the types might have defaults, which are used to initialize the variables as part of the NEW operation.

Actually this is a bit oversimplified, since NEWFRAME has to separate the bindings in the block from the decls. construct the variable binding just described from the decl. and then combine it with the binding from the block. Thus

ix: INT: y: REAL; z: BOOL^—TRUE: ^S}

becomes

LET [x. y. z]^—'([VAR INT, VAR REAL].NEW PLUS [TRUE]) IN S

or more readably

LET x^—VAR INT. y^—VAR REAL. z: BOOL^—TRUE IN S

Anomaly about uninitialized names or variables: In Cedar the names in a block are introduced recursively, so that the d's and b's can refer to each other. It is possible for a binding or type to refer to a value which has not yet been initialized, with undefined results. See § 3.4.1 for a further discussion of this point.

2.4.6 Classes

Another important use of a declaration is to characterize the cluster of a type. Since the cluster is just a binding, it is characterized by its type, which is a decl. When used for this purpose, a decl is called a class. See § 4.1 for further discussion of classes, and an enumeration of the primitive classes of Cedar.

2.5 Programs

This section describes how meaning is assigned to kernel programs.

2.5.1 Structure of programs

A kernel program is an expression, which is either atomic (a name or literal), or is an application which involves sub-expressions: the proc being applied, and the arguments. The concrete syntax treats certain kinds of expressions specially: modules, blocks (which introduce new variables and return no value), and statements (which return no value). All desugar into simple expressions, however, and are treated identically in the kernel.

2.5.2 Names

A name is a part of a program which usually serves to denote a value. There are two contexts in which the occurrence of a name n denotes a value:

It may occur as an expression. Then n denotes the value bound to it in the scope in which the expression appears (see § 2.5.4 for details).

It may occur after a dot. as in en. Then the expression e.n denotes the binding for n supplied by e (see § 2.4.4 and § 4.14 for details):

the value bound to n in e, if e is a binding:

the value bound to i? in the cluster of e. if e is a TYPE; roughly (V e). n[e] otherwise.

There are also two defining contexts for a name n (see § 2.5.5 for details):

It may occur before a – in a binding constructor, as in n--e. The value of e is the value bound to n in the binding denoted by the constructor (see § 2.3.5 for details).

It may occur before a : in a declaration constructor. as in n: t. The value oft is the type of n in the declaration denoted by the constructor (see § 2.4.5 for details).

These constructors are usually recursive in Cedar: that is. the expression n elsewhere in the constructor denotes the value bound to n in that constructor: see § 2.5.6 for details. In the kernel they are non-recursive unless preceded by REC.

A name is not a value, but there are values of type ATOM which are related to names. An atom has a print name which is a rope (an immutable sequence of CHARS). A name following a $ is an atom literal: $n denotes the atom with print name n. Other properties of atoms are described in § 4.5.1A.

Caution on names: Current Cedar has several complications in its treatment of names:

•In an argBinding²⁷, n: e may be written instead of n- e. The syntactic context distinguishes this from a declaration, but this usage is not recommended.

An argBinding is not recursive: in {a-1; f[a-3. b–a+ 1]} b is bound to 2. not to 4.

The declaration in an import list is non-recursive: IMPORT M is short for IMPORT M: M. and the second M denotes its binding in the surrounding scope (i.e.. the binding supplied by the DIRECTORY). Inside the body of the module, of course, M denotes the imported parameter.

Names which appear in an enumerationTC⁵⁴ are treated specially: see § 4.7.1A for details.

2.5.3 Scope

A scope is a region of the program in which all names retain the same meanings (note that many names denote variables, which can change their values in the same scope. but each name continues to denote the same variable). In the kernel there are only three constructs which introduce a new scope. X, LET and REC. In current Cedar, these are sugared in a variety of ways: modules, import lists, proc bindings, blocks, exit labels, open. iterators, safeSelects and withSelects. All have straightforward desugarings, however.

2.5.4 Constructors

The kernel has constructors, denoted [...]. to make expressions which denote group, decl and binding values more readable. There is one flavor of constructor for each class:

A binding constructor is a list of binding elements (b in the kernel syntax) of the form I,– e or el– e. The presence of the – distinguishes it from the others. Here d is a decl element (not a declaration), and p is a pattern, in which the names are being defined rather than evaluated.

A decl constructor is a list of decl elements (d in the syntax) of the form p: t. The presence of the : without any – distinguishes it from the others. Again. p is a pattern.

A group constructor is a list of expressions. Note that decl and binding elements are not expressions, although constructors are expressions.

Constructors are useful for making decls and bindings where the names are literal. This is the normal case, and in fact the only case in current Cedar. If you want to make them out of other decls, for instance to bind an expression to a decl which is the value of a name dn, you cannot use a constructor: [dn-'e] would bind the value of e to the name dn, not to the decl which is its value. You have to write the decl-constructing primitive directly: MKDECL[d, e].

The only kinds of constructor you can write in current Cedar are:

Decl constructors for proc domains and ranges, and for records and unions (fields⁴³ in the syntax).

Binding constructors for arguments in an application, or as an expression alone if a record or array value is needed (argBinding²⁷ in the syntax).

2.5.5 Recursion

In the kernel, you get recursive definition of names only if you write PEG (or the unsugared form Fix) explicitly. In Cedar, on the other hand, decls and bindings are normally recursive, except for argBindings and import lists.

The recursion is legal in a block or interface body (although anomalies are possible in some cases when names are used before they are defined: see § 3.4.1). In fields it is illegal.

2.6 Conveniences

The facilities described here are not fundamental, but they are of great practical importance.

2.6.1 Coercion

A coercion is a proc which is automatically applied under some circumstances to map a value of one type T (called the source) to a value of another type U (called the dest), e.g. from [O..5) to INT. Coercions are obtained from the clusters of the types involved. The coercion mechanism adds no new functionality, since the programmer could always write the applications himself, but it is important in concealing some of the distinctions made by the type system when they are distracting rather than helpful.

There is exactly one (desugared) context in which a coercion is applied: when an expression e of syntactic type T appears as an argument in an application which expects a value of type U: this means that there is a binding n: U"e. Since nearly all Cedar constructs are desugared to application, coercions are widely applicable. The only (desugared) context in which there is no coercion is for the first operand of dot, since in that case the cluster of the operand is used to interpret the name which is the second operand. Thus in the expression e.n, it is always V e. the syntactic type of e, that is used to look up n, regardless of the fact that this expression may appear as an argument to a parameter of type U. If e is not a type or binding, however, then e.n desugars to P[e], where P= LOOKUPC[V e.Cluster, $n], and in the application of P, e does appear as an argument and can be coerced. Usually the cluster for T is set up with procs which take an argument of type T, so the domain of P is De and no coercion happens. This isn't always true, though: a subrange T of INT inherits the arithmetic procs of INT, for example, and there is a coercion from T to INT when PLUS is applied.

If it is sometimes natural to think in terms of a coercion from T to U that is implemented by the identity function. In fact, implication is stronger than that, since it propagates through many type constructors, including PROC, while coercion does not. Implication is discussed in § 2.4.2 and § 4.12.

There is a rather general rule for finding coercions from the clusters of types, though it is not of much practical importance in current Cedar, since there is no way for the user to define coercions. The rule goes like this. Each cluster may have a From item and a To item. T.From should consist of pairs with type [tag: ATOM, proc: T-+ U1, and T.To of pairs with type [tag: ATOM, proc: U-+71. Ignore the tags for the moment. Consider the binding n: U-'e. where V e= T, and TA U is false. For each proc P in T.From or U.To we try n: U- P[e^.].

If P: T-> V is in T.From, it maps e to a value of type V, and we have to bind n: U- P[e]. If we are done: otherwise we can recurse on this sub-problem.

If P: V-0 LI is in U.To, we have to bind m: If V we are done: otherwise we can
recurse on this sub-problem.

The whole process fails if no path of coercion procs takes us from T to U. The search can terminate when all paths have been explored, and a particular path can be abandoned when a type appears on it for the second time. Since the search is done statically (by the compiler), and since the results of an attempt to coerce T to U can be cached, the time required for the search is not a problem.

There are two obvious difficulties with this scheme. First. it may transform erroneous applications into legal ones, by coercing an argument in ways not intended by the programmer. Second, more than one path of coercion procs may exist, and different paths may give different results. The second difficulty can be avoided, and the first minimized, if every coercion proc P is chosen so that it has a (partial) inverse, and P^-1[P[x]]=x for all x in P.DOMAIN. This says that a coercion does not lose information. and that different paths give the same answer. Sometimes this is not feasible. e.g. for the narrowing coercion from INT to [0..5). The following rule gives the builder of clusters control over proliferating coercions:

If two procs on a coercion path have non-Nit. tags, they must have the same tag.

In general, coercions that don't lose information can have NIL tags, and others should have different tags.

The coercions in current Cedar are described in § 4.13. All have NIL tags. and none loses information except the subrange narrowing. Note that coercions extend componentwise to groups and bindings.

2.6.2 Exceptions

The basic idea behind exceptions is to extend the value space, so that it includes not only ordinary values, but also a set of exception values. An exception value has the special property that whenever it appears in an application, it becomes the value of the application, so that it propagates up through the control stack of the program until it finally becomes the value of the whole program. Of course this isn't always what is wanted, so there is a special HIDE construct which is not an ordinary application, but takes its argument value, ordinary or exception. and bundles it in a variant record which is a normal value. Then ordinary code can be used to test for the exception and take appropriate action. This construct is sugared to give distinctive ways of catching an exception: in the kernel with BUT (§ 2.2.4), and in Cedar with ENABLE, EXITS and REPEAT (§ 3.4.3). Cedar has two kinds of exception: GOTO labels and ERRORS, which must be raised and caught separately. and have slightly different semantics.

The main point of this treatment is that it does not require continuations or any other baroque explanation of how control is transferred to catch an exception. The view is that exceptions are simply a convenience feature: the same job could be done by returning a slightly larger result from each proc, with an appropriate status code.

An exception consists of a code and an optional argument value. The type of the code is ERROR T, where T is the type of the argument which goes with it. GOTO labels always have empty arguments. The argument is a way of passing some information along in addition to the identity of the exception.

A proper treatment of exceptions in the type system would require that each proc the exceptions that can emerge from an application of the proc. In fact, this is not possible in current Cedar.

Cedar also has signals, which historically were viewed as a kind of exception but different interpretation. as a way of obtaining dynamic rather than static scoping are discussed in § 3.4.3A.

range include all required or even

now have a very for names. They

2.6.3 Finalization

This subject is discussed in § 3.4.3A.

2.6.4 Concurrency

This subject is discussed in § 4.10. where the Cedar facilities for writing concurrent programs are given. Writing good concurrent programs, or even correct ones, is another matter, which is beyond the scope of this manual to more than hint at. Unfortunately, an adequate reference is lacking.

2.7 Miscellaneous

The different kinds of allocation are discussed in § 4.5. Static values are defined in § 3.9.1.

2.7.1 Pragmas

A pragma is a construct that does not change the meaning of the program, except perhaps to make something illegal which was legal without the pragma. Its purpose is to affect the implementation, generally by requesting optimization to favor one criterion over others. The pragmas in current Cedar are:

INLINE, which causes a proc body to be expanded inline when it is applied. See § 3.5.1 for details.

PACKED, which causes array components that fit in 8 or fewer bits to be packed, at the expense of more expensive code to access them (§ 4.4.2).

CHECKED, which forbids application of unsafe procs in a block, and adds runtime checking for some primitive procs which are otherwise unsafe—in particular, narrowing to a subrange, and assigning a proc (§ 3.4.4).

PRIVATE, which forbids access to items in an interface or instance except to modules which EXPORT (or SHARE) it (§ 3.3.6).

MACHINE DEPENDENT, which allows positions of record fields (§ 4.6.1) and representation values for enumeration elements (§ 4.7.1A) to be specified (strictly, it is the absence of MACHINE DEPENDENT that is the pragma, since the positions or representation values are legal only when it is present.)

3.1 Notation

This section describes the notation used in the grammar, desugaring, and commentary of this chapter.

3.1.1 Notation for the grammar

The grammar is written in a variant of BNF:

Bold parentheses are for grouping: ( interface I implementation). Item item means choose one.

?item means zero or one occurrences of item.

item; ... means zero or more occurrences of item separated by ":". The separator may also be ",",

ELSE, IN. or OR, or it may be absent. If the separator is ":", a trailing ¹¹;¹'is optional. item; !.. is just like item; ... but there is at least one occurrence.

A terminal is a punctuation character other than bold ()?1, or any character underlined, or a word in
SMALL CAPS. Note that a and {} are terminals, and do not denote optional occurrence and repetition as they do in many other variants of BNF.

The rules are numbered sequentially.

Special symbols mark constructs with special properties: t = unsafe:

· = obsolete;

=machine-dependent;

*= efficiency hack.

The grammar is written so that a non-terminal never expands to the empty string. When an element of a rule is optional, that is always indicated explicitly by "?" or "..." .

The following non-terminals are so basic to the language and so frequently used, that they are represented in the grammar by abbreviations:

b= binding¹³

d = declaration"

e = expression ¹⁹

n = name⁵⁶ (identifier)

s = statement¹⁴

t= type³⁶

I'm afraid this means that you must learn the meaning of these six abbreviations in order to read the grammar.

With the exception of these abbreviated non-terminals, each use of a non-terminal is cross-referenced with a small superscript number⁵⁹, unless the non-terminal is defined in one of the next few rules. If a non-terminal (other than e, t or n) is used in more than one rule, then all the rules that use it are listed in a comment after its definition.

Except for the entries in Table 3— 1, a terminal symbol appears in only one rule. These duplications do not lead to syntactic ambiguity. In most cases they are harmless, since the symbol has essentially the same meaning in each case, and the rules are separate only for greater readability, to highlight an unusual use of a construct, or for historical reasons. In some cases, however, the symbol has quite different meanings in different rules. These are marked on the left as follows:

· In the rules whose numbers are marked with * the symbol has a different meaning than in the others, and confusion is quite possible. The programmer should beware.

m In the rules whose numbers are marked with * the symbol has a different meaning than in the others, but the context is sufficiently clear that confusion is unlikely.

· The rules whose numbers are marked with • are obsolete and should be avoided. A superscriptxn indicates that the terminal is repeated n times in that rule.

m Text Box: expr, subrange, *position in record or enumeration constructor/built-in/funnyAppl, subrange,
application, typeName, fields, mdFields
interface body, block, enable, machine code. *enumerationTC
See note in § 3.2. • f

o{}

•

m :

•

o *

•

=> 4-

m ANY

m CODE EN DCASE

m ERROR
IN

LONG NOT

· NULL PACKED SELECT FROM SHARES

m SIGNAL TRASH TRUSTED

m USING

m WITH

19. 25. *51.1. *54

19, 25, 26, 37, 43, 51

2, 6, 8, 13, *54

2.3, 6, 7, 9, 17, 27. 29,30, 32, 34, 35, 43, 51, 52

6. 8.10, 17. 27.1. 30, 33, 35

1, 2, 3, 5, •7, 11, 13,

18, •27, 33, •34. 51. *51.1. 53

19, 37 25^x4. *51.1

21, *53 21, 58

20, 21, 58

·13, 22

6, 9, 17. 31, 33, 35, 52 14, 16, 18, 21, *55

2, 3, 13, 20, *22, *27

7, 34

*9. 40, 43

*13, 23 31, 52

*19, *24. 41.1

18, 22

38^x2, 45.1, 48

20. 22

14, •27, •52, •55

44, 45

29, 32, 34, 52

2, 3

*24, 41.1

27x2, 55x2

6, 13 1, *5 *32, 34

See note in § 3.2.

introducing names with types, except *51.1= position,

·7 = open, 027= argBinding 034= withSelect

dot notation for e is repeated for types

subrange, *position infixOp, *tag

infixOp, exponent

prefix0p, infixOp, exponent

·binding, infixOp

exits, enable, repeat, select choices^x4, unionTC

s, e'-STATE, iterator, e, *defaultIC interface.implementation,b,argBinding,*unary0p,*relOp open, withSelect

*enable, variableTC, fields *new exception. convert t to e

select endChoice, unionTC *expression, *funnyAppl, transferTC

iterator, relOp

cardinal/unspecified, pointer, descriptor

prefixOp, relOp

statement, •argBinding, •unionTC, •defaultTC

array, sequence

select, safeSelect, withSelect, unionTC

interface and implementation

*funnyAppl, transferTC argBinding, defaultTC block, machine code directory, *locks

*safeSelect, withSelect

Names introduced in the desugaring are written with one or more trailing prime ("'") characters. Such names cannot be written in a Cedar program, and hence they are safe from name conflicts. The desugaring is constructed so that the Cedar scope rules prevent multiple uses of these names from being confused.

3.1.3 Notation for the commentary

Each section of the commentary begins with grammar rules, desugaring and examples for part of the language. It continues with text which explains the meaning of the constructs. Generally the meaning is fairly clear from the desugaring, and this text is short. For blocks and especially for modules, however, there are many non-obvious implications of the desugaring. and a number of restrictions: these constructs have a lot of explanatory text.

Some kinds of information are put into specially marked paragraphs, which begin with one of the following italicized words:

Anomaly: the meaning of this Cedar construct is not explained by desugaring into the kernel, but by the special rule given here.

Caution: here is an implication of the definition which might surprise you. Performance: facts about the time or space required by some construct.

Representation: the values of a data type are represented in terms of other types like this.

Restriction: a construct is not fully general. and will cause a static error unless the additional conditions stated here are satisfied.

Style: advice about good Cedar style.

Symbols written in SANS-SERIF SMALL CAPITALS are in the kernel but not in current Cedar. The superscript notation used to cross-reference non-terminals in the grammar is also used in the examples, usually to point to a rule whose example introduces a name.

3.2 Lexical structure

$Text Box: -- But not one of the reserved words in Table 3 — 2. -- INT literal, decimal if radix omitted or D, octal if B. I -- INT literal in hex; must start with digit. I -- REAL as a scaled decimal fraction; note no trailing dot. -- With an exponent, the decimal point may be omitted. I -- CHAR literal; the C form specifies the code in octal. I [ ('extendedChar I ' )....] Rope.ROPE, TEXT, or STRING. I -- ATOM literal. -- Optionally signed decimal exponent.$ 56 name :: = letter (letter I digit)...

57 literal ::= num ?( ( Did I 13111 ) ?num ) I digit (digit IAIDICIDIEIE) ( Lib ) ?num I ?num . num ?exponent _I

num exponent I

' ( extendedChar _I' I " ) I • digit !.. (CIO I " ( extendedChar I ' ) " MUDI

$ n

58 exponent ::= (Ele) ?(+ I —) num

59 num :: = digit !..

60 extendedChar :: = space I \ extension I anyCharNot"'Or\

61 extension :: = digit_s digit, digit₃ i -- The character with code digit, digit₂ digit._; B. I

(nIN I 1111) I (lID I (121M I -- CR, \ 015 I TAB, '\011 BACKSPACE, '\010

OD I (111--) I' I " I \ FORMFEED, '\014 I LINEFEED, '\012 " I \

Examples

m, xl, x59y, longNameWithSeveralWords: INT: n: iNT-1 + 12D+ 2B3 +2000B

+ 1H +OFFH;

rl: REAL-0.1+.1+1.0E-1

+1E-1;

al: ARRAY [0..3] OF CHAR^—'['x, '\N, '\141];
r2: ROPE^—"Hello.\N...\NGoodbye\F": a2: ATOM ^—'$NamelnAnAtomLiteral;

= 1+12+1024+1024 -- +1+255

= 0.1+0.1+0.1

-- +0.1

The main body of the grammar (rules 1-55) treats a program as a sequence of tokens; these are the terminal symbols of the grammar. Rules 56-61 give the syntax of most tokens. A token is:

- A litera1⁵⁷. More information about literals of type T is in the section of § 4 devoted to T.

- A name⁵⁶, not one of the reserved words in Table 3-2. Note that case matters in names.

- A reserved word, which is a string of uppercase letters that appears in Table 3-2. A reserved word may not be used as a name, except in an ATOM literal.

- A punctuation symbol: any printing character not a letter or digit, and not part of one of the two-character sequences below. The legal punctuation symbols in programs are:

! @ # $ * — + = I( ) } +- t : • " . > /

The following ASCII characters are not legal punctuation symbols (and must not appear in a program except in an extendedChar^6o):

% & \ ?

- Text Box: One of the following two-character symbols (used in the grammar rules indicated): not equal²²

less than or equal²²

not less than22

greater than or equal²²

not greater than²²

chooses

subrange constructor^25.51.I bind by name⁶.³⁴

Note that Cedar uses a variant of ASCII which includes the characters (instead of the underbar and t (instead of

the circumflex ). Also, the character written — here is the ASCII minus, code 55B. and not any of the various dash or
typographer's minus characters with other codes, which are not in the standard Ascii set.

ABS	ELSE	ISTYPE	PACKED	SIGNAL
ALL	ENABLE	JOIN	PAINTED	SIZE
AND	END	LAST	POINTER	START
ANY	ENDCASE	LENGTH	PORT	STATE
ARRAY	ENDLOOP	LIST	PRED	STOP
ATOM	ENTRY	LOCKS	PRIVATE	STRING
BASE	ERROR	LONG	PROC	SUCC
BEGIN	EXIT	LOOP	PROCEDURE	TEXT
BOOL	EXITS	LOOPHOLE	PROCESS	THEN
BOOLEAN	EXPORTS	MACHINE	PROGRAM	THROUGH
BROADCAST	FINISHED	MAX	PUBLIC	TO
CARDINAL	FIRST	MIN	READONLY	TRANSFER
CEDAR	FOR	MOD	RECORD	TRASH
CHAR	FORK	MONITOR	REF	TRUSTED
CHARACTER	FRAME	MONITORED	REJECT	TYPE
CHECKED	FREE	NARROW	RELATIVE	UNCHECKED
CODE	FROM	NEW	REPEAT	UNCOUNTED
COMPUTED	GO	NIL	RESTART	UNTIL
CONS	GOTO	NOT	RESUME	USING
CONTINUE	IF	NOTIFY	RETRY	WAIT
DECREASING	IMPORTS	NULL	RETURN	WHILE
DEFINITIONS	IN	OF	RETURNS	WITH
DEPENDENT	INLINE	OPEN	SAFE	ZONE
DESCRIPTOR	INT	OR	SELECT
DIRECTORY	INTEGER	ORDERED	SEQUENCE
DO	INTERNAL	OVERLAID	SHARES

Table 3 — 2: Reserved words and predefined names

The program is parsed into tokens by starting at the beginning and successively taking from the front the longest sequence of characters which forms a token according to the rules above, after first discarding any number of initial whitespace characters or comments.

The whitespace characters are space, tab, and carriage return. A Tioga node boundary is also treated as a whitespace character.

A comment is one of:

A sequence of characters beginning with --, not containing -- or a carriage return. and ending either with -- or with a carriage return.

A Tioga node with the comment property.

Note that whitespace and comments are not tokens, but may appear before or after any token: they are token delimiters, and hence cannot appear in the middle of a token. Whitespace and comments thus do not affect the meaning of the program except:

When they delimit a token.

Within a CHAR literal or a ROPE literal, where they are taken literally. Thus • is equal to '\040, and "I

am --not--" is equal to "I\Nam --not--" and different from "I\Nam ".

Both reserved words (Table 3-2) and most names with predefined meanings (Table 4-5) are made up entirely of upper case letters. All are at least three characters long except the following:

DO GO IF IN OF OR TO.

Caution on use of reserved words and predefined names: They should not be rebound by the program: in some but not all cases the compiler forbids their rebinding.

3.3 Modules

module ::= DIRECTORY (n_d (: TYPE (n, 1 ) 1 )

?(USING [ n_u. ... ) ),

( interface _Iimplementation )

2 interface :: = : ?CEDAR DEFINITIONS

?locks (imports I) ?.(SHARES n_e ...) ?eaccess¹² { ?open⁷ (d I b): .

3 implementation ::= ri_m : ?CEDAR

?safety ( PROGRAM ?drType⁴²

MONITOR ?drType⁴² ( I locks)) (imports _I)

?(EXPORTS n_e, ...)

?.(SHARES n_s,

?*access¹² block .

3.iimports ::= IMPORTS ( (fl_y : I ) n, ), ... --In 2, 3.

4 safety ::= SAFE I UNSAFE --In 3.41. Stocks ::= LOCKS e?( USING n_u: t)

Examples

A [ (n_d : ( (TYPE n_t I TYPE Il_d) I TYPE n_d)), ] IN

LET (n_d•RESTRICT[n [$n ] )

IN ( interface I implementation )

LET r' [ n_m:INTERFACETYPC $ri_m, ...]] ] IN (imports I X = >r) IN

-- SHARES allows access to PRIVATE names in n_eLET REC n_m–open [ ?(1'–locks, ) (d I b), IN ti_n,

LET [(n_e: n_e) , , FRAME: TYPE n_ni

n_m: FRAME, CONTROL: PROGRAM]

IN (imports I X = >1^.1) IN

( I LET l'^—( LET LOCK ^...NEWLOCK IN (A IN LOCK) I locks) IN)

LET b'–NEWPROGINSTANCENOCklUNCONS IN

[ (n_e--^.BINDDFROM[n_e, b' PLUS n_m^–b`.n_m] ), ,

FRAME~MKINTTYPE[block],

n_m–b' , CONTROL–b%n _in]where the block body is desugared: [ (d I b), ri_m: PROGRAM drType–{s:

[(n_i:n_r), ...]= >r' IN LET [((n_v I n_i)– (n, PLUS n_f.BINDING) ]

X ?( [n_u t] ) e

Buffer: DEFINITIONS - {

Handle: TYPE-REF BufferObject: BufferObject: TYPE = Rope.RoPE New: PROC RETURNS[h: Handle];

Get: PROC[h: Handle] RETURNS[BufferObject]; Put: PROC[h: Handle. o: BufferObject] }:

Bufferlmpl: MONITOR [f: CIFS.OpenFile] LOCKS Buffer.GetLock[h]t

USING h: Buffer.Handle

IMPORTS Files: CIFS, 10, Rope

EXPORTS Buffer

— module body -- 1 .

-- Implementations can have arguments. -- LOCKS only in MONITOR, to specify -- a non-standard lock.

-- Note the absence of semicolons.

-- EXPORTS in PROGRAM or MONITOR.

-- Note the final dot.

Modules serve a number of functions (which might perhaps better be disentangled, but are not): A file of source text (BufferImpLmesa), or its translation into object code (BufferlmpLbcd).

The unit handled by the editor, named in DF files and models, and accepted by the compiler, the binder, and the loader.

A set of related structures (types, procedures, variables) which are freely accessible to each other, hiding secrets or irrelevant information from other modules.

A procedure which can accept interface types and bindings as arguments, and returns interface instances as results.

The procedures of a monitor, perhaps with its protected data.

The first two uses are not relevant to the language definition, and are not discussed further here. The others are the subject of this section.

There are two kinds of modules: interface modules (written with DEFINITIONS) and implementation modules (written with PROGRAM or MONITOR). They have the same header (except that interfaces have no EXPORTS list); it defines the parameters and results of the module viewed as a proc (§ 3.3.1) and specifies the name n_m of the module. The bodies (following the —) are different. Table 3 —3

summarizes the structure of modules and their types; it omits a number of details which are given in rules 1-3 and explained in the text.

Example	Module	Module type	Result	Result type
DIRECTORY Rope. 10;	Interface	[Rope: TYPE Rope, 10: TYPE 10]	Interface	TYPE Match
Match: DEFINITIONS-{...}	module	-⁰[TYPE Match]
DIRECTORY Match, Rope. 10; Matchlmpl: PROGRAM	Implementation module	[Match: TYPE Match, Rope: TYPE Rope, 10: TYPE 10.	Exported instance	Match
IMPORTS R: Rope. 1: 10		R: Rope. 1: 10]—)^.[Match]
EXPORTS Match-1...1

Table 3-3: Interface and implementation modules

The ensuing sub-sections deal in turn with:

§ 3.3.1: Modules as procedures. and the interface or instance values they return. § 3.3.2: How modules are applied.

§ 3.3.3: Module parameters: the DIRECTORY and IMPORTS lists: USING clauses.

§ 3.3.4: Interface module bodies and interfaces.

§ 3.3.5: Implementation module bodies: the EXPORTS list,

§ 3.3.6: SHARES and access¹².

The meanings of the other parts of a module header are discussed elsewhere: CEDAR in § 3.4.4.

MONITOR and LOCKS in § 4.10.

3.3.1 Modules and instances

A module is a proc which takes two kinds of arguments:

Interfaces, declared in the DIRECTORY list. These arguments are supplied by the model (or on the compiler's command line), and used during compilation.

Instances of interfaces, declared in the IMPORTS list. These arguments are also supplied by the model (or in a config file passed to the binder, or implicitly by the loader), and used during loading.

§ 3.3.3 discusses the types of these arguments and how they are declared. In addition, an implementation may take PROGRAM arguments declared in the drType following PROGRAM or MONITOR. These are ordinary values: they are discussed in § 3.3.2A.

When a module is applied to its arguments, the resulting value is

For an interface module, an interface.

For an implementation module, a binding whose values are instances:

one interface instance for each interface it exports:

one for the program instance, also called a global frame;

one for the program proc derived from the module body (§3.3.2A), called

CONTROL.

This application cannot be written in the program, only in the model: it is described in § 3.3.2.

An interface (sometimes called an interface type) is a type, as the latter name suggests. This type is a declaration (obtained from the declarations which constitute the module body), with an extended cluster that includes all the bindings in the module body that don't use declared names (§ 3.3.4). In the example, the Buffer interface (obtained by applying the Buffer module to the arguments declared in its DIRECTORY) has declarations for New, Get, and Put, and its cluster includes values for Handle and BufferObject.

An interface instance is a value whose type is an interface: such values are the results of instantiating implementation modules. In the example, Bufferlmpl returns (exports) an instance of Buffer.

A program instance or a global frame is a frame. as the latter name suggests, i.e., a binding obtained from the bindings and declarations of an implementation (PROGRAM or MONITOR) module body, just like any proc frame (§3.3.5). Normally code outside the module does not deal with the instance directly, but only with the exported interface instances. In the example, Bufferlmpl exports a program instance for the module and a CONTROL proc.

In most cases, there is:

Exactly one application of each module, and hence exactly one interface or one instance. Only one module which exports an interface.

Only one interface exported by a module.

Only one argument of the proper type for each module parameter (§3.3.3); hence it is redundant to write the arguments explicitly.

When these conditions hold, there is a close correspondence among the following four objects: an interface module;

the interface it returns (since its arguments need not be written explicitly);

the implementation module which exports the interface;

its instance (again, since its arguments need not be written explicitly).

The distinctions made earlier in this section then seem needless; it is sufficient to simply consider the interface and implementation modules, and identify them with the files which hold their text. In more complicated situations, however, it is necessary to know what is really going on.

In the example at the start of this section, BufferImpl is an implementation module with seven parameters:

Four interface parameters, declared in the DIRECTORY: Rope, CIFS, 10 and Buffer.

Three instance parameters, declared in the IMPORTS: Files (of type CIFS), 10 (of type 10), and Rope (of type Rope). Since the instance parameters are declared in an inner scope, the instance Rope is the one visible in the module body; the interface Rope is visible only in the header. The same is true for 10, but both the interface CIFS and the instance Files are visible in the body.

When BufferImpl is compiled, the four interface parameters must be supplied, in the form of (compiled) interface modules named Rope, CIFS, 10 and Buffer. When BufferImpl is instantiated (normally by loading it), the three instance parameters must be supplied, i.e. there must be other instantiated implementation modules which export the Rope, CIFS, and 10 interfaces. Normally there will be one of each, and the entire program will consist of eight modules:

the interface modules Rope, C1FS, 10 and Buffer,

implementation modules normally named Ropelmpl, CIFSImpl, 101mpl and Buffer1mpl, each exporting an instance of the corresponding interface

The instantiated Bufferlmpl exports an instance of Buffer, which can thus be used as a parameter by some other module.

3.3.2 Applying modules

A module is not applied to all its arguments at once. Instead, the arguments are supplied in two stages:

A module is applied to its interface (DIRECTORY) arguments by compiling it; the result is a BCD (represented by a .bcd file). The bcd is still a proc, with instance parameters. Like any proc, a module can be applied to different arguments (i.e., different interfaces) to yield different results (BCDs).

A BCD is applied to its instance (IMPORT) arguments by loading (or binding) it; the result is a program instance, together with any interface instances exported by the module. Again, the BCD can be applied to different arguments (i.e., different interface instances) to yield different instances. Indeed, because an instance may include variables, even two applications to the same arguments will yield different results (instances).

These two stages are separated for several reasons:

All the type-checking of a module can be (and is) done in the first stage, by the compiler. The only type error possible in the second stage is supplying an unsuitable argument.

Compiling is much slower than loading, and a module needs to be recompiled only when its interface arguments change, not when the interface instances change. The latter are changes in the implementations of the interfaces, and are much more common.

When there are multiple instances of the same module with the same interface parameters, they automatically get the same code.

We've always done it that way.

3.3.2A Initializing a program instance

The statements in the body of an implementation module form the body of a proc called the program procedure. The function of this proc is to initialize an instance of the module. When program instance PI is made, no code in the module is executed: hence PI may be uninitialized. It is the job of the program proc PP' to initialize PI, perhaps using the PROGRAM arguments if there are any. Until PP' has been called, PI is not in a good state. It would be better to supply the PROGRAM arguments along with the imported instances, and call PP' as part of making P1, so that PI is never accessible in its uninitialized state. But it isn't done that way: hence the programmer must ensure that PP is called before any use is made of P1. The preferred way to get hold of PP is from an interface to which it is exported: see § 3.3.5.

To confuse things. PP is not an ordinary procedure but a PROGRAM, and it must be called using the START construct (see § 4.4.1). Note that in addition to the statements of the module body, PP also contains the type-specific initialization code for any variables or non-static values in the instance: e.g., if x: INT^O-3, the value of x will not be 3 until after PP' has been called.

There is some error detection associated with this kludge. If a proc in the instance is called before the instance has been initialized by START, a start trap occurs. At this point, if PP' takes no arguments it is called automatically, and the original call then proceeds normally: if PP does take arguments, there is a Runtime.StartFault ERROR.

Caution on initializing monitors: If the module is a monitor, PP' runs without the monitor lock: if another process calls into the module while PP' is running, it will not wait, but will run concurrently with PP. This is unlikely to be right. It is unwise to rely on a start trap to initialize a monitor module: call PP explicitly with START.

Caution on referencing module variables before initialization: If a variable in the instance is referenced before the instance has been initialized, no error is detected, and the uninitialized value will be obtained. PP' can still be called to initialize the instance, and may still be called automatically by a start trap.

The program proc is bound to the name CONTROL in the result of an implementation module if its type is PROGRAM[] RETURNS (otherwise the proc RuntimeReportStartFault is bound to CONTROL). This allows the modeller (and binder) to get access to PP so as to control the order in which modules are started.

3.3.3 Parameters to modules: DIRECTORY and IMPORTS

The interface parameters of a module are declared in the DIRECTORY. An interface / has type TYPE n, where n is any one of the names given before DEFINITIONS in the header of the interface module that produced I. The INTERFACETYPE primitive in the desugaring takes a list of atoms and returns a type which implies TYPE n for each $n in the list. The reason for allowing several names is to aid conversion of an interface from one name to another: both names can continue in use for a while.

The use of these names provides a clumsy check that the proper interface is supplied as an argument. DIRECTORY n: TYPE and DIRECTORY n are both short for DIRECTORY n: TYPE n.

The compiler must be able to find the interface arguments. which in general are stored as files. When the modeller is used, it supplies these arguments from the specifications in the model. Otherwise, they may be specified explicitly on the compiler's command line, or failing that. the compiler gets the interface I from the file I.bcd.

An interface is a type which can only be used:

Before a dot (§4.14). to obtain a value from its cluster, which simply consists of the bindings in the interface module body (§3.3.4).

In an IMPORTS list as the type of an instance parameter to a module. After POINTER TO FRAME (§ 4.5.3)

The USING clause in the DIRECTORY, if present. restricts the cluster of the interface to contain only items with the names n_u, ... Thus in the example, only ROPE and Compare are in the cluster of Rope

in the BufferImpl module. This means that Rope.ROPE and Rope.Compare are legal, but Rope.n for any other n will be an error. Note that USING affects only the cluster of the parameter: it does not affect the clusters of any types or the bodies of any INLINE procs obtained from the interface. Thus within Rope. Compare might be bound by

Compare: PROC[rl, r2: ROPE] RETURNS NOOLHNLINE IF Length[r1]' = Length[r2] THEN ... I
A call of Rope.Compare in BufferImpi is all right, even though RopeLength in BufferImpi is an error.

In the example, CIFS, 10, and Rope are interfaces. They are the types of three IMPORTS parameters named Files, 10, and Rope (if the IMPORTS clause gives no name for the parameter, the name of the interface is recycled). An actual argument for an IMPORT parameter must be an interface instance, i.e., a value whose type is an interface type. Such a value is obtained from one or more modules which export the interface (§ 3.3.5). An instance is a binding: in it, the value of a name declared in the interface is provided by the exporter: the value of a name bound in the interface (e.g., x-3) is just the value the interface binds to the name (in this case, 3). This rule has two effects:

The client can ignore the distinction between names bound and declared in the interface, since both appear in the instance binding and are referenced uniformly with dot notation. This means that the client is not affected, for example. when a proc is moved from an INLINE in the interface to an ordinary definition in an implementation.

The client can often ignore the distinction between the interface and the instance, since all the values in the interface are also in the instance, with the same names. This is the motivation for the shorthand which allows the name of an IMPORT parameter to default to the name of the interface: the interface is no longer accessible, but I.x has the same meaning (namely 3) whether / is the interface or the instance.

Caution on inlines in interfaces: Names bound to inline procs in an interface do not appear in the interface binding, but only in an instance. This somewhat dubious rule ensures that clients won't have to add to their imports lists if a proc stops being an inline.

Restriction on importing multiple instances: An interface module may not import more than one instance of a given interface I. If an implementation module P imports more than one instance of I. the principal instance of / is the one with no name in the IMPORTS list (which is therefore named I by default). if P imports only one instance of type I. then that instance is the principal instance.

Restriction on importing a principal instance into imported interfaces-. Often an interface module has no IMPORTS. because it only needs access to the static values (types and constants) bound in its interface parameters, and does not need values for any names declared there (procs and interface variables). If an interface module does have IMPORTS, however, and there is more than one instance of any imported interface around, then there is a restriction on the argument values. Suppose that Intl imports Int2, and that a program module P imports Intl. Then Intl may only import one instance of Ina. and if P also imports MO. the principal instance of /n/2 in P must be the same as the value of Intl imported by the Intl imported by P. For example, with

DIRECTORY MO: Intl: DEFINITIO\S IMPORTS Int2V: Int2...

DIRECTORY Intl. Intl: P. PROGRAM IMPORTS Int1V: Intl. Int2V: Int2...

we must have in P that Int1V.Int2V--=Int2V.

3.3.4 Interface module bodies

The body of an interface module I is a collection of bindings (e.g.. x: INT-3) and declarations (e.g..

y: VAR INT or P: PROC[a: INT] RETURNS [REAL]).

Restriction on bindings in interfaces: The construct that follows the – in one of the bindings¹³ is restricted:

If it is an expression, it must be static (§3.9.1). Thus, no imported names. As a result, P: PROC^—LP

E: ERROR^—LE

are not allowed.

If it is a block (providing the body of a proc), it must be INLINE (because there isn't any place to put the compiled code).

It may not be CODE. This is an unfortunate accident of the implementation.

The result of applying an interface module is an interface (§3.3.2), which is a type I obtained by applying the primitive MKINTTYPE to the d's and b's of the body. This type is simply the declaration obtained by collecting the declarations in the body, with a cluster which is extended to include all the bindings of the body. However, MKINTTYPE omits any inline proc bindings from the type's cluster, instead leaving the proc declarations in I. It puts an extra item BINDING in Is cluster with the inline procs in it. When an instance Ins: of 1 is imported, the binding actually imported is Inst PLUS 1.B1NDING. This slightly dubious arrangement ensures that clients don't have to change imports lists if a proc stops being inline. This policy is not extended to other items, however, even though they might change from being bound in the interface to being interface variables.

The interface returned by

Red Blue. Green: DEFINITIONS^—...

has the types TYPE Red TYPE Blue and TYPE Green.

Restriction on referring to names introduced in an interface: The types and expressions in the declarations and bindings of an interface may refer to other names in the bindings as usual, but they may not refer to names introduced in the declarations, except that:

Any declared name may be used

in the body of an INLINE, or

after a "⁴-" in a defaultTC⁵⁵ in the fields⁴³ of a transferTC⁴¹ which is the type of a decl in the interface's body.

A declared (opaque) type may be used anywhere. For example, if an interface contains

I: DEFINITIONS—
x: INT-3;

y VAR INT;

T: TYPE[ANY]

then the following may also appear in the interface: xx: INT^—x+ 1;

P: PROC RETURNS[INT]^—INLINE {RETURN[x+A}:

Q: PROC [INT⁴-y];

V: TYPE^—RECORDV. REF T. g:

but the following are illegal: xy: INT^—y+ 1:

U: TYPE^—INT⁴-y:

W: TYPE^—ARRAY [0.4] OF INT;

The values of the bindings can be accessed directly by dot notation in any scope in which the interface is accessible. Thus if the value of the previous interface module is bound to J, e.g., because

J: TYPE / appeared in the DIRECTORY, then J.x is equal to 3. The declarations cannot be accessed directly (J.y is an error).

The declarations in an interface module are not quite like ordinary declarations. They are of three kinds, depending on whether the type of a declaration is:

A transfer type (§ 4.4.1); this is just like a declaration of a transfer parameter to an ordinary proc, except that it is readonly.

TYPE[ANY] or TYPE[e]; the type being declared is an opaque type or exported type. discussed in § 4.3.4. The expression e must be static. TYPE[ANY] or TYPE[E] is not allowed in an ordinary declaration; except in an interface, a type name must be bound to a type value when it is introduced.

VAR T, or READONLY T for any type T except TYPE: this is an interface variable: discussed in § 3.3.4.1 below. You can also write simply T here, but this is not recommended.

An interface instance // has the interface type / if for each item n: T in the interface, there is an item n— v in the instance, and v has type T. This is the same rule which determines that a binding has the type of a declaration; e.g., that a proc argument has the domain type. In this respect there is nothing special about an interface.

Note that a name can be declared PRIVATE in an interface, even though it must be declared PUBLIC in the exporter (§ 3.3.6). This can be useful if the name is used in a type constructor or inline proc in the interface, but its value should not be accessible to the client

3.3.4A Interface variables

An interface variable v gives clients of an interface direct access to a variable in a program module, namely the variable which is exported to v. This is the only kind of variable parameter in current Cedar.

·If you use the obsolete shorthand of T for VAR T in an interface variable declaration, you cannot declare a transfer type variable as an interface variable, since that already means passing the transfer value.

Caution on uninitialized interface variables: the variable which is exported to provide the value for an interface variable is not initialized until its module is initialized (§ 3.3.2A). However, there is nothing to stop it from being accessed sooner, with possibly undesired results.

Performance of interface variables: An interface variable can be read and (if not READONLY) set directly, which is significantly faster than Get and Set procs. Of course, the implementor gives up some control. These operations are not quite as fast as access to an ordinary variable, since there is an extra level of indirection which costs one or two extra instructions each time. There is also one pointer per interface variable per module which refers to it. If you use a private interface variable and inline Get and Set procs, you pay nothing in performance, but retain the option of changing the proc definitions later.

·You can get direct access to all the variables of a module by using a POINTER TO FRAME type (§ 4.5.3).

3.3.5 Implementation module bodies

The body of an implementation module Imp is simply a block. This block plays two roles. On the one hand, it is an ordinary block, the body of an almost ordinary proc PP' called the PROGRAM proc, which has parameters and results like any other. PP' is special in one way: it has a PROGRAM type rather than a PROC type. When PP' is applied (using the special construct START: see § 4.4.1), its declarations and bindings are evaluated, its statements are executed, and its results are returned

as with any proc. The only difference is that the values bound to the names introduced in the block (i.e., the frame of PP') are retained after the proc returns: in fact, forever (unless Runtime.Unnew is used to free the frame). Procs local to the block can access these values in the usual way. and values of exported names can also be accessed through interfaces, as explained below: see § 3.3.2A.

As with any proc (§ 3.5.1). the frame of PP includes the parameters and results from Imp's drType⁴² as well as the names introduced in the block's d's and b's. It also includes an additional item:

Imp: PROGRAM T- PP'

where Imp is the name of the module and T is its drType.

The body of Imp has a second role: to supply values for the names declared in the interfaces exported by Imp. For each interface Ex which Imp exports. an interface value Exi of type Ex is constructed. Each name n in Exl acquires a value as follows:

If n: T is in Ex and n: PUBLIC T-'x is in the body of Imp, then n–x is in ExI. This is a slightly peculiar kind of binding: as in an ordinary binding, x must be coerceable to T (§ 4.13). Note that n must have PUBLIC access (§ 3.3.6) in the body.

If n is Imp and n: T is in Ex, then n–PP is in Exl; the type of PP' (which is PROGRAM D RETURNS R, where D RETURNS R is Imp's drType) must be coerceable to T. This method of exporting PP' is the usual way of giving another module access to the program proc, so that it can be called to initialize the module at the proper time.

If n is declared in Ex, not bound in the body of Imp, and not the same as Imp, then n' UNBOUND is in Exl. UNBOUND is a special value with the following properties:

For a proc P, it causes a Runtime.UnboundProcedure signal on any application of P. For a variable v, it causes a Runtime.PointerFault error on any reference to v. For a type T, it causes no problem.

If n–x in Ex, then n-'x in Exi. Thus any names bound in the interface are bound the same way in any interface value.

Caution on exporting a name to several interfaces: A name can be exported to several interfaces without any warning, if it has a suitable type. This is unlikely to be what is wanted.

On the other hand, it is quite usual to have several modules exporting to the same interface. The modeller, loader and binder provide facilities for merging the interface instances produced by the several modules into a single instance that contains all the items bound by any of the modules.

The result of instantiating Imp is a binding with:

One item for each exported interface Ex, namely Ex: Ex^–Exl, where Exl is the interface value constructed above. Here Ex is the name n_d given to the interface in the DIRECTORY.

One item CONTROL: PROGRAM!' RETURNS 11 whose value is the program proc PP if that has no arguments and no results, and otherwise Runtime.ReportStartFault.

·One item for the type of the module's global frame, namely FRAME^—TYPE Imp.

.One item for Imp itself, namely Imp: FRAME. The value of this item is the program instance, i.e., the frame of the module's body. The instance exists before PP' is called (though it is uninitialized). In fact, its Imp item can be applied to call PP.

This binding is accessible in a model, where it can be used to get access to the interface instances, the program proc, the global frame type, and the program instance.

·You can pass FRAME as an argument to a DIRECTORY parameter /: TYPE Imp: like an interface: I provides access to constants bound in the module, and allows you to declare an IMPORTS parameter whose argument will be a program instance of the module. From I you can also obtain a first-class Cedar type POINTER TO FRAMEM see § 4.3.5. Is cluster includes a coercion from Ito POINTER TO FRAME[/], and the proc COPYIMPLINST (applied by the funnyAppl NEW). which is the same as the proc of the same name in cluster of POINTER TO FRAME[!].

•You can import Imp into another module (by writing DIRECTORY Imp ... IMPORTS ImpInst: Imp ...). and obtain access to all the variables and procs of the program instance.

3.3.6 PUBLIC. PRIVATE and SHARES

Cedar has a rather complicated mechanism for controlling access to names. Most uses of it are now considered to be obsolete, with the following exceptions:

Names to be exported must be declared PUBLIC.

Names included in an interface for use in inline procs etc.. but not intended for use by clients, should be declared PRIVATE.

Access to a name is declared by writing PUBLIC or PRIVATE right after the colon in a declaration: x: PUBLIC T

In the Cedar syntax these colons occur in the declarations" and bindings^I3 in bodies¹⁰, fields⁴³•⁵¹, and interface modules², and in the tag⁵³ of a unionTC. You can set a default access for all the names in a module². ³or record⁵° by writing PUBLIC or PRIVATE just before the { or RECORD: this is overridden by an explicit PUBLIC or PRIVATE inside. By default, an interface is PUBLIC and an implementation is PRIVATE.

A PRIVATE name defined in module M can only be referenced: from within M;

from a module which EXPORTS M.

•from a module which SHARES M: avoid this feature.

This does not mean that the name is invisible, but rather that it is an error to use it if, e.g., M is OPENed. Thus in

x: INT: {OPEN M: f[x]}

if x is bound in M (and not hidden by a USING clause), the call of f is equivalent to f[M.x] regardless of whether x is PUBLIC or PRIVATE. It is illegal if x is PRIVATE, but it never refers to the x declared by the x: INT.

Furthermore, if a record has any PRIVATE components, a constructor or extractor for the record is legal only in a module where use of the PRIVATE names is legal (even if the private components are not mentioned and have defaults).

Examples

CHECKED {

OPEN Buffer, Rope;

ENABLE Buffer.Overflow =>GOTO HandleOvfl; stream: 10.Stream-10.CreateFileStream["X"]; x: INT⁴-7;

{OPEN b–'–buffer;

ENABLE {

Files.Error--[error, file]-- = >{

stream.Put[10.rope[error[]; CONTINUE: ANY =>{ X⁴-12; GOTO AfterQuit } 1;

y: i\r€9; };

x⁴-stream.GetInt:

EXITS

AfterQuit=›{...}; HandleOvfl=>{...} }:

-- Unnamed OPEN OK for exported

-- interface or one with a USING clause. -- A single choice needn't be in {}.

-- Use a binding if a name's value is fixed. -- Better to initialize declared names. -- A statement may be a nested block. -- Multiple enable choices must be in {}. -- ERRORS can have parameters.

-- Choices are separated by semicolons. -- ANY must be last. ENABLE ends with -- Other bindings, decls and statements. -- Other statements in the outer block. -- Multiple EXIT choices are not in {}.

AfterQuit, HandleOvfl declared here, -- legal only in a Go-ro in the block.

The main function of a block is to establish a new scope (§ 2.3.4) and to allow for the allocation of variables declared in the block, as in Algol or Pascal. A Cedar block has four other features:

attributes: CHECKED, UNCHECKED and TRUSTED are treated in § 3.4.4 on safety. open⁷: a combination of sugar for LET and call by name; see § 3.4.2. enable⁸: catches signal and error exceptions in the body; see § 3.4.3.1. EXITS: catches GOTO exceptions in the body or enable; see § 3.4.3.2.

Note that the braces around a block may be replaced by BEGIN and END (§3.2).

The statements in a block are evaluated in the order they are written. The initialization expressions in the d's and b's are also evaluated in the order they are written; this may be important if they have side effects, although that should be avoided.

3.4.1 Scope of names and initialization

The names introduced in the block body's d's and b's (i.e., appearing before a : or –) are known throughout the body with the values supplied by the d's and b's, except in inner scopes where they are reintroduced: they are not known elsewhere in the block. The frame of the block can be coerced to a binding with a value for each such name.

Actually. the frame is a value of an opaque type which has a coercion (called UNCONS) to this bindin^g. As the desugaring for body indicates, the frame is constructed (by NEWFRAME). and then a LET makes the names in the binding known in the statements of the body.

Anomaly on order of evaluating bindings: A name introduced by a binding. n: T-'e, has the value of e throughout the body if e is static. If e is not static, it is evaluated after all preceding d's and b's, but before any following ones. This means that n is trash in all the d's and b's before its binding. Symmetrically, if e refers to a name introduced in a following decl or non-static binding. it will get a trash value. Compiling with the "u" switch will yield a warning in this case. Note that only attempts to use the value of n get trash; n may appear anywhere in a X-expression, and all will be well as long as the X-expression is not applied before n's binding is evaluated.

A name introduced by a declaration, n: T, is bound to a new VAR T. The variable bound to n is allocated, and its INIT proc is executed, before any statements in the block is executed (this is done by the NEWFRAME proc in the desugaring).

Anomaly on order of initializing variables: However, the INIT proc is executed (to set a REF or transfer value to NIL), and any initialization specified by a defaultTC⁵⁵ in T is done at the same time that a non-static binding would be evaluated. As with a binding, n.VALUEOF is trash before this time. Furthermore, any (unwise) assignment to n before this time is overridden by the defaultTC.

Caution on uninitialized RC variables: The failure to initialize RC variables is a safety loophole, since the trash can be picked up and used as an address.

Style of expressions in bindings and initializations: The expression in a binding or defaultTC should be functional. or at least it should have only benign side-effects. There is no enforcement of this recommendation, unfortunately. In current Cedar such an expression is evaluated exactly once. at the time described above. This may change in the future, however.

The variables created by a declaration are deallocated when execution of the block is complete, unless the block's frame is retained. Currently only an implementation's block³ has its frame retained. There are two ways to hang on to a variable v after execution of the block is complete:

Obtain a pointer to v with @: this pointer value can survive the block.

Obtain a proc value for a local proc which refers to v; this proc value can survive the block.

In the checked language both these dangling references are impossible: the @ operator, being unsafe, is forbidden, and ASSIGN for proc values gives an error unless the proc is local to a program instance (which has a retained frame).

Caution on dangling references to frames: An unchecked program can get into trouble.

Performance of block entry and exit: There is no overhead associated with block entry or exit. even if the block has an open, enable or EXITS. The only cost is for initializing the variables bound to its names. It is good style to use blocks freely to limit the scope of names.

3.4.2 OPEN

There are two forms of open. The first, n^—e, binds the name n to a_openIN e.UNREF. This is just like

X IN e.UNREF, except that there is a coercion from n to nn. In other words, every time n appears, its value is obtained by evaluating e.UNREF. The effect is exactly like call by name in Algol: the — is to remind you that this is not ordinary value binding. The value of e.UNREF is

e if the cluster of Ve does not include DEREFERENCE; et.UNREF if it includes DEREFERENCE:

In other words, a reference value is dereferenced (and a single-component record or binding replaced by the component), repeatedly if necessary, to obtain a non-reference value. In an open. e.UNREF must be a record, interface or instance.

The second, nameless, form of open gives an expression without binding it to a name: { OPEN e: ...}. The expression e.UNREF must evaluate to a binding b:

An interface or instance value is a binding (§3.4.2).

A record value has a corresponding binding which has the names of the record fields bound to the field values (or variables, for a VAR record).

sAn application returns a binding, though the call-by-name feature makes it unwise to use an application in an open.

The nameless open converts b into another binding by in which each value is a A_openproc (see

above), and introduces bp's names in the block with a LET. Thus in the program

R: TYPE^—RECORD [a: INT^I-3. b: REAL⁴-3.4]; r: R; { OPEN r: ...}

the names a and b are known in the body of the block, with the same meaning as r.a and r.b.

Style for nameless open: Nameless open should be used with discretion, with the smallest practicable scope, and only if the value being opened is very familiar, or heavily used, or both. Nameless open can cause great confusion. since it is not obvious from the text of the program where to find the bindings for the names it makes known. It should never be used when evaluation of e has a side‑

effect.

The scope of an open is all the rest of the block, including any enable and any EXITS. A single open may have several bindings or expressions. These are applied sequentially, so that the names bound by earlier ones are known to the later ones as well as to the rest of the block.

3.4.3 ENABLE and EXITS

The ENABLE and EXITS constructs are two forms of sugar for exception handling (§ 2.2.4, § 2.6.2). ENABLE catches signals and errors raised in the body (but not the open, enable, or exits). EXITS catches GOTOS in the body or enable (but not the open or exits). Both are in the scope of the open, if any. Neither is in the scope of any names introduced in the body.

3.4.3A ENABLE

An enable has a chance to catch any signal or error raised in the block (and not caught at a deeper level). A nearly identical construct can appear in an application²⁶; the following explanation covers both cases.

Each enable choice (enChoice⁹) has a list of expressions with exception values (.or ANY) before the =>. If ANY appears. it must be the last enChoice. If the exception is equal to one of these values, or if ANY appears. the statement after the => is executed. Control leaves this statement in one of the following ways:

A REJECT statement causes the exception to be the value of the block: it will then be propagated within the enclosing block, or if the block is a proc body it will be propagated to the application.

A GOTO statement sends control to the matching choice in the EXITS. There are three special cases¹⁶:

A RETURN is not allowed in an enChoice.

A CONTINUE statement ends execution of the current statement (in this case the block): execution continues with the next statement following. If the block is a proc body, the effect is the same as RETURN. You cannot write CONTINUE in a body's d's or b's.

·A RETRY statement begins execution of the current statement (in this case the block) over again at the beginning. You cannot write RETRY in a body's d's or b's.

The semantics of CONTINUE and RETRY follow from the desugaring of statement'''.

A RESUME statement (signals only) is discussed below.

·If the statement finishes normally, a REJECT statement is then executed.

If a single expression e appears before the =>, then within the enChoice statement the names in Ve.DOMAIN are declared and initialized to the arguments of the exception. With multiple expressions, or ANY, the arguments are inaccessible. •The use of ANY is not recommended.

Note that an error is caught by an enChoice with a matching exception value, not by one with a matching name. Normally an error E will be declared in some interface, its value will be supplied by a binding of the form E: PUBLIC ERROR ... — CODE, and both the signaller and the enChoice will refer to this value by the name E. In this case, it is natural to think of the binding as being by name. However, it is possible to have a different name for this exception value, e.g. by writing El: ERROR ... — E. It is also possible to bind some other exception value to E in a scope which includes some enChoice examined when the signal is raised. Thus in the silly program

E: ERROR^—CODE:

F: ERROR^—'E:

{ENABLE E=>{--Handler 1--...}:

E: ERROR^—CODE:

{ENABLE E=>{--Handler 2--...}:

IF switch THEN ERROR FELSE ERROR E;

if switch is true handler 1 will be used, and if it is false handler 2 will be used.

Finalization

You are supposed to think of an ERROR as an unusual value ev which can be returned from any application; this value immediately stops the evaluation of the containing application, which likewise returns ev as its value. This propagation is stopped only by an enable choice which catches the ERROR. As each application is stopped, it is finalized. Aside from invisible housekeeping. finalization confusingly consists of executing an enChoice which catches the ERROR UNWIND. The programmer can write any cleanup actions he likes in this statement.

Caution on ERRORS in finalization: If the finalization raises another ERROR which it does not catch. it will itself be stopped, with very confusing consequences. It isn't very useful to know exactly what happens then: avoid this situation.

Anomaly on order of finalization: In fact, things are a bit more complicated. When a signal or error is propagated, the enChoice statement is called as a proc from the SIGNAL or ERROR which raises the exception. When control leaves the statement by a GOTO (including EXIT. CONTINUE. RETRY or LOOP, but not RETURN, which is forbidden in an enChoice), the finalization is done. This means that the enChoice statement is executed before any finalization. This is useful for signals. which often resume. In some cases, however, notably if finalization would release monitor locks, this can cause trouble. Avoid the problem by exiting from the enChoice immediately with a GOTO.

Caution on exceptions in enable choices: An enChoice can raise a second exception ex2 and fail to catch it. This will probably result in confusion, and should be avoided. If it happens, ex2 is propagated just like the first exception exl: all the enChoices which saw exl will see ex2. This is because the enChoice statement for exl was called as a proc. Unless ex2 is a signal which is resumed, the enChoice which caught exl will be finalized and abandoned.

Caution on ANY and UNWIND: ANY unfortunately catches UNWIND, and hence its statement will be taken as the finalization. It is better not to use ANY. Also, it is possible to raise UNWIND explicitly: don't.

Signals

Conceptually, a signal is quite different from an error; in fact, it is very much like an ordinary application. The only differences are:

The proc to be called is an enChoice which is found exactly as though the signal were an error. The effect of this is that SIGNAL P[args] binds the proc name P to the proc body dynamically, by searching up the call stack for a binding of P. This is just the way Lisp binds free variables, except that a binding for P can only be found in an enChoice. not in the frame of a proc.

Actually this is not quite right. Like an error handler. the signal proc is not found by matching names. but by matching exception values. This point is discussed in detail above.

The enChoice can be terminated by a GOTO out of its body, unlike an ordinary proc. The GOTO exception is treated exactly like a GOTO out of an enChoice for an error: it causes all the intervening frames to be finalized.

The implementation, however, treats errors and signals in a very similar way: the only difference is that you cannot resume an error (return from the enChoice). In fact, you can invoke a signal with ERROR, which prevents it from being resumed; avoid this feature. In the future the distinction between signals and errors will be reflected more clearly in the implementation.

Anomaly on RESUME: The desugaring gives no explanation of how RESUME works, since it does not turn the enChoice for a signal into a proc at all. This is a defect.

3.4.3B EXITS

An EXITS construct (confusingly called REPEAT in a loop) declares one or more exceptions which are local to its block, and also catches them. The syntax is just like an enable. However, names called labels appear before the => rather than expressions, and the EXITS introduces these names in a scope which includes the block body and any enable, but not an open and not the statements in the EXITS itself. A label may only be used in a GOTO statement.

Anomaly on the separate name space of labels: Actually labels have their own name space, disjoint from the other names known in the block. Hence it is possible to declare a label n and still to refer to another n in the block. Avoid this feature.

Like the raising of any exception, a GOTO n stops execution of the current statement. The statement associated with n is executed. If it finishes normally, execution continues after the block in which n was declared. If it raises an exception, that exception becomes the value of the block.

Anomaly on GOTO and UNWIND: A GOTO skips any UNWIND enChoices that intervene between the GOTO and its matching EXITS. This is the only way to escape from a block without executing the UNWIND. You can avoid this anomaly by not nesting UNWIND enChoices within blocks that have

EXITS.

3.4.4 Safety

A SAFE proc has the property that if the safety invariants hold before it is called, they also hold afterwards. Roughly, these invariants ensure that the value of every expression has the syntactic type of the expression, and that addresses refer only to storage of the proper type (§4.5.1). An UNSAFE proc may lack this property. Hence a safe proc type implies the corresponding unsafe one.

We want to have confidence that the safety invariants hold. To this end, we want to have: as few unsafe procs as possible;

a mechanical guarantee that a proc is safe, if possible.

Clearly, a proc whose body calls only safe procs will be safe; this means that all the primitives it applies must be safe, as well as all the user-defined procs.

Applying this observation, Cedar provides three attributes which can be applied to a block:

CHECKED: the compiler allows only safe procs to be applied; hence the block is automatically safe, and any proc with the block as its body is safe.

UNCHECKED: there are no restrictions on the block, and it is unsafe.

TRUSTED: there are no restrictions on the block, but the programmer guarantees that it preserves the safety invariants; the compiler assumes that the block is safe. This is a restricted form of LOOPHOLE.

These attributes are defaulted as follows.

A block is checked if its enclosing block is checked: otherwise it is unchecked.

If CEDAR appears in the module header. the outermost block is checked, and a transfer type constructor anywhere in the module defaults the SAFE option to TRUE. Hence the resulting type will be safe, and its initialization must be safe or there is a type error.

Otherwise, the outermost block is unchecked, and a transfer type constructor anywhere in the module defaults the SAFE option to FALSE. Hence the resulting type will be unsafe, and there is no safety restriction on its initialization.

Of course you can override these defaults by writing CHECKED. UNCHECKED or TRUSTED on any block, and SAFE or UNSAFE on any transferTC (except ERROR, which is automatically safe). The defaults are provided to make it convenient to:

write new programs in the safe language:

continue to use old, unsafe programs without massive editing.

An unsafe proc value never has a safe type, and hence cannot be bound to a name declared with a safe type. This applies to enable choices for signals as well as to procs. In both cases, the body must be checked or trusted if the type is safe. ERRORS are treated differently, however, because of the view that an ERROR is a value returned from an application, unlike a signal which calls the enChoice expression. Hence the enChoice for an ERROR is treated just like any statement in its enclosing block, and is not considered to be bound to a proc when the ERROR is raised.

The following primitive procs are unsafe:

C. DESCRIPTOR and BASE.

t or FREE applied to a pointer (but not a REF), and all pointer arithmetic. APPLY of

a descriptor (because it involves dereferencing a pointer):

a computed sequence. or a record containing a computed sequence: a base pointer.

APPLY for process and port types (JOIN and port calls).

withSelect³⁴.

The fields of an OVERLAID union.

ASSIGN Of:

An unspecified type to anything other than the same unspecified type (§4.9). A union or variant record.

LOOPHOLE which produces a RC value (§4.5.1).

Examples

HistValue: TYPE[ANY]: -- Interface: An exported type.

Histogram: TYPE^—REF HistValue; A type binding.

baseHist: READONLY Histogram; An exported variable .

AddHists: PRoc[x, y: Histogram] An exported proc.
RETURNS [Histogram]:

Label Value: PRIVATE TYPE^—RECORD[ PRIVATE only for secret

first,last:INT,S:ROPE.X:R EAL,f,g:INT,r:R EF ANY]: stuff in an interface.
Label: TYPE^—REF LabelValue;

Next: PROC[1: Label] RETURNS[Label] An inline proc binding.

INLINE { RETURN [NARROW[1.1]] }:

H: TYPE–Histogram"; Size: INT-10; -- Implementation: Binds a TYPE and INT.

HistValue: PUBLIC TYPE—HV⁴⁰.¹: PUBLIC for exports.

baseHist: PUBLIC H^€NEw[HistValue^i—ALL[17]]; - An exported variable

x. y: HistValue-[ 20. 18, 16, 14, 12, 10, 8. 6, 4. 2. -- with initialization.

FatalError: ER ROKreason: ROPE]-CODE; Binds an error.

Setup: PROC [h: Handle³, a: INA–ENTRY {...}: -- Binds an entry proc.
ij,k: IN^-r⁴-0; p,q: BOOL: lb: Label: main: Handle;

Declarations are explained in § 2.2.1F and § 2.4.5. Their peculiarities in the different contexts where they can appear are explained elsewhere:

interfaces in § 3.3.4:

blocks in § 3.4.1:

fields in:

domains and ranges in § 4.4: records and unions in § 4.6: Access is explained in § 3.3.6.

Bindings are explained in § 2.3.5. See also § 3.7 on argument bindings. Note that the e in a binding is evaluated just once, even if several names are bound. There are four special forms of binding given in rule 13. however, which are defined here:

A TYPE binding is the only way in which a type value can be bound to a name, since types cannot be passed as parameters. Unlike other bindings, this one expects a type³⁶ rather than an expression¹⁹ after the –.

A name with a signal or error type can be bound to CODE; this use of CODE is not allowed anywhere else. See § 4.4.1 for details on the meaning of this.

ttA MACHINE CODE construct can be bound to a name with a proc type. This construct allows machine instructions to be assembled into a proc value. The instructions are separated by semicolons. Each instruction is assembled from a list of expressions separated by commas. An expression in the list is evaluated to yield a [0..256) static value which forms one byte of the instruction: successive expressions form successive bytes.

A A-expression derived from a block can be bound to a name with a proc type. The complicated semantics of this construction are explained in the following subsection.

3.5.1 PROC bindings

A binding of the form n: T–{...} is the only way to construct a proc value and bind it to a name, since you cannot write a A-expression in current Cedar.

There are other ways to construct proc values:

The expression in a defaultTC⁵⁵ is turned into a parameterless proc which is bound to Default in the type's cluster (§4.11).

The expression following — in an open or WITH ... SELECT is turned into a parameterless proc with a deproceduring coercion (§3.4.2).

The statement in an enable choice for an exception is turned into a proc with domain and range given by the exception type (§3.4.3A).

The expression following LOCKS in a module heading is turned into a proc according to a peculiar rule (§4.10).

The A-expression is constructed from the block in the following way. Its domain and range are the domain and range of the proc type T. Its body implicitly declares a variable for each item of the domain and range: these variables have the names of the domain and range items, and their scope is the entire block, not just the block body. The domain variables are initialized to the parameters. and the range variables in the usual way according to their types. Then the block, with a RETURN tacked on the end, is evaluated. A RETURN exception in the block is caught, and the current values of the range variables are the result of the A-expression. The only other way out of the block is to raise an exception.

A RETURN in the block is sugar for GOTO Return', which is caught as described. RETURN e assigns e to the range variables and then does a GOTO Return'.

Anomaly about parameter and result names: It is an error to introduce the same name twice in the domain, range or block.

Performance of proc calls: A proc call and return is about 30% faster if the proc is local, i.e., denoted by a name which was bound to a proc body in the same module as the call. A proc which is local to another proc, rather than bound in the body of an implementation, is about 20% slower to call. It also introduces some overhead when its parent proc is called, and its access to non-static names introduced in its parent proc is slower than access to other names. A call and return for an ordinary, non-local proc takes about 10 times as long as the statement x'-y+z, not counting the time for passing arguments or results. Each argument or result value costs as much as an assignment of that value. If the total size of the arugments is more than 11 words (in the current implementation), the cost of passing them is doubled, and likewise for results.

The attributes ENTRY and INTERNAL can be used only in a MONITOR: they are discussed in § 4.10.

The attribute INLINE has no effect on the meaning of the program, but it causes the proc body to be expanded inline whenever it is applied. This saves the cost of a proc call and return and sometimes the cost of argument passing, and it may allow static arguments to participate in static evaluation within the proc.

Restrictions on inlines: An INLINE proc may not be:

Recursive.

Exported.

Used as a proc value except in an application; thus you cannot assign it to a proc variable. The argument of FORK.

Accessed from the cluster of a POINTER TO FRAME type.

Caution on inlines in interfaces: An inline proc binding in an interface is not accessible from the interface (i.e., from a DIRECTORY argument); you must get it from an instance (i.e.. import the interface). See § 3.14.

Performance of inlines: Excessive application of inline procs will result in much larger compiled code. Excessive definition of inline procs will result in much larger data structures in the compiler, and hence in larger symbol table files, and a greater chance of overflowing the compiler's capacity. The following cases are efficient:

An inline proc in an implementation which is called zero or one times.

An inline proc which has a simple body, no locals, no named results, and no accesses to the formals after potential side effects.

3.6 Statements

Text Box: 14 Statement = SS
In 6.10. 17. 19.
sS = e14-e2 I e I block6 I escape I loop I NULL
16 escape :: = GOTO n I GO TO n I
EXIT I CONTINUE I •LOOP I •RETRY I (RETURN I RESUME) ?e I
•REJECT I tte 4- STATE
17 loop :: = (iterator I )
(WHILE e l UNTIL e I )
DO ?•open7 ?•enable8 ?body 1°
?(REPEAT (n, !..= >s); ...) ENDLOOP
iterator = THROUGH e I
FOR (n : t I *n)
( ( I DECREASING) IN e I { SIMPLELOOP {SS; GOTO Cont."; EXITS Retry"=>NULL}; EXITS Cont" = >NULL }

[e₁¹-e₂].-rovoio I e --must yield VOID-- I --all four yield VOID--

HEx[exception[code^— n'', args–NIL]] 1 GOTO ( Exit'¹⁷ I Cont'⁹ I Loop'¹⁷1 Retry'⁹) 1 ?(r`134-e;) GOTO (Return'¹³ I Resume') } I

THISEXCEPTIOND I DUMPSTATE[e]

( iterator ; 1 done'–FALSE; Next': PRoc–{}; )

{ Test"– IN (NOT elel FALSE);

{ open SIMPLELOOP {

IF Testy OR done' THEN GOTO FINISHED;

{ enable body EXITS Loop' = >NULL }; Nextll }

EXITS Exie= NULL; (n.!..=s); ...; FINISHEDNULLIN

FOR X': e IN e 1

( n: t; 1 )

( Range': TYPE^—e; done': BooL^€Range'.1sEmPTv; Next: PROC^—{ IF n ( >RangelAsT I <Range'.FiRsT )

THEN done' +-TRUE ELSE n⁴-n.(SUCC PRED) I;

n⁴-Range'.(FiRsT I LAST); I

e₁ e ) done': BOOL^— FALSE; Next': PRoc–{n⁴-e }. n⁴-e

1 2 2 ^ 1) •

e is a subrange. In FOR n: t ri is readonly except for the assignment in the iterator's desugaring.

Examples

AddHists[baseHist. baseHist[t; Setup[bh-'main, a-3];

{ENABLE FatalError=>RETURN[0]: []+-q31: ...}: IF i>3 THEN RETURN[25] ELSE GOTO NotPresent;

FOR t:INT DECREASING IN [0..5) UNTIL f[t]>3 DO u: INT⁴-0; ;

REPEAT Out=>{...}: FINISHED= >{...} ENDLOOP:

THROUGH [1..4] DO i⁴-i*i ENDLOOP:

FOR i: INTF1, i+2 WHILE i<8 Do j⁴-j+i ...; FOR I: Label^s-lb, I.Next WHILE l#NIL DO ...;

-- A statement can be an assignment, -- or an application without results, -- or a block,

-- or an IF or an escape statement,

-- or a loop. Try to declare r in the FOR -- as shown. Avoid OPEN or ENABLE -- after DO (use a block). FINISHED -- must be last.

-- Raises i to the 16th power.

-- Accumulates odd numbers in [1..8). -- Sequences through a list of Labels.

Cedar makes a distinction between expressions and statements. This distinction is most easily defined in terms of a special type called VOID, which is equivalent to the empty declaration a. This is the range type of a PROC [...]--q], and it is also the result type of a block, control, loop or NULL statement. An expression whose value is a VOID can be used as a statement, and cannot be used as an ordinary value in a binding (since it wouldn't have the right type). If you want to call a proc which returns values as a statement, you must assign the results to an empty group:

0⁴-fl^...1

Assignment is a special case: an assignment can be used as a statement even though its value is the value of the right operand. This is explained in the desugaring¹⁵ using a special proc TOVOID in the cluster of every assignable type; it takes a value of the type and returns a VOID. Note that the grammar is ambiguous here, since there are two parsings of e₁4-e₂ as a statement; the one written in

the rule for statement is preferred.

Anomaly about separators for SELECT: In a select²⁹ which is a statement (i.e., returns von), the choices are separated by semicolons; in a select expression they are separated by commas.

Anomaly about applying a parameterless proc: •If you write an expression whose value is a proc taking no arguments as a statement, the proc gets applied. Thus

is the same as

This is the only situation in which an ordinary proc gets applied by coercion (but see § 3.4.2 for open procs).

A statement¹⁴ is actually a rather complicated construct, as the desugaring shows. This is because of the CONTINUE and RETRY statements, which respectively terminate and repeat the statement containing the enable⁹ in which they appear. The desugaring shows exactly what this means in various obscure cases. CONTINUE and RETRY are legal only in an enable choice (§ 3.4.2). and they may not appear in a declaration at all. •RETRY should be avoided everywhere, since it introduces a loop into the program in a distinctly non-obvious way.

Escape^la consists mainly of the various flavors of GOTO (including EXIT, CONTINUE, LOOP, RETRY. RETURN and RESUME) which raise a local exception bound in an EXITS; this is explained in § 3.4.3B. REJECT is explained in § 3.4.3A.

Anomaly about GOTO and procs: You cannot use a GOTO to escape from a proc body, even though the body is within the scope of the label. Only normal completion. or a RETURN or ERROR exception (or a SIGNAL which is not resumed) can terminate the execution of a proc body.

A loop¹⁷ is repeated indefinitely until stopped by an exception. or by the iterator¹⁸ or the WHILE or UNTIL test. It has a body. bracketted by DO and ENDLOOP, which is almost like a block, but with some confusing differences:

You catch GOTO exceptions with REPEAT, which is exactly like EXITS in a block immediately around the loop, except for the different delimiting reserved word. Note that the scope of the labels does not include the iterator or the test, even though these are evaluated repeatedly during execution of the loop. This feature is best avoided if possible, but unfortunately is necessary if you want to catch the FINISHED exception explained below.

·You can write an open or enable. This is also best avoided, since the scope is confusing. It is better to write a block explicitly inside the DO if you need these facilities.

There are three special exceptions associated with loops:

EXIT is equivalent to GOTO Exit', where Exit' is a label automatically declared in the REPEAT of every loop. Its enable choice does nothing. Thus EXIT simply terminates the smallest loop that encloses it.

FINISHED is raised when the iterator or the WHILE/UNTIL test terminates the loop. It can be declared in the REPEAT like any label, but it must come last. If it is not declared, a null enable choice is supplied for it.

·LOOP causes the next repetition of the loop to start immediately. Anomaly about GOTO FINISHED: You cannot write GOTO FINISHED.

An iterator¹⁸ declares a control variable v which is initialized by the iterator and updated after each execution of the loop: the scope of v is the entire loop, and it is constant in the loop. After the loop is terminated by the iterator (i.e., in the FINISHED clause), the value of v is undefined. •If you omit the declaration and simply name an already declared variable, it will be used as the control variable, and will not be constant: it will still be undefined after the loop is terminated by the iterator. Avoid this feature.

There are three flavors of iterator:

THROUGH, which has no explicit control variable: THROUGH [0..k) or THROUGH [1..k] is convenient when you just want to loop k times.

FOR v: T IN [first .. last] ...; v is initialized to first, and set to v.SUCC after each repetition. The iterator finishes the loop after a repetition which leaves v> last. The > case can only occur in FOR v IN ..., when an out-of-range value is assigned to v in the loop body. DECREASING reverses the order in which the elements of the subrange are used. The subrange need not be static. Note that the subrange is evaluated only once, before execution of the loop begins.

FOR v: D-first, next ...; v is initialized to _first, and set to next after each repetition. This iterator never finishes the loop. Note that the expression next is reevaluated each time around the loop. The usual application is something like

FOR v: List"- header, v.next UNTIL v=NIL.

Note that the WHILE or UNTIL test is made with v equal to its value during the next repetition, and that both tests are made before the first repetition, so that zero repetitions are possible.

3.7 Expressions

Text Box: e . prefixOp I el . infixOp[e2] I
(A [x': Vet, y' :Ve2]=›[BOOL] IN relon ) [el. e2]
IF el THEN e2 ELSE FALSE I IF ei THEN TRUE ELSE e2
e . DEREFERENCE I STOP I ERROR NAMELESSERROR I
el . builtIn ?( [e2.... ?applEn ] )1
e . funnyAppl ?( [?argBinding ?applEn] )1
--Binding must coerce to a record. array. or •local string-- I 19 expression = n I literal⁵⁷ _I(e) I application2⁶ (e I typeName3⁷) . (9) n

prefixOp e I e_l infixOp e₂ I

e₁ relOp (4) e₂ I

e_i AND (2) e₂ J e_i OR (i) e₂ e t (9) I •STOP I ERROR I

builtln [ e₁ ?( , e_r ?applEn²⁷]

funnyAppl e ?( [?argBinding²⁷ ?appl En²⁷] ) [ argBinding²⁷ I I

subrange²⁵ ^I

if²⁸ 1 select²⁹ I safeSelect³² I •withSelect³⁴ ^I

to the right. Application has highest precedence. Subrange only after IN or THROUGH. s only

Precedence is in bold in rules 19-21. All operators associate to the left except which associates

in if ²⁸and select choices³⁰ ^{33 35}.

20 prefixOp ::= @ (8) I — (7) I (— I NOT) (3) VARTOPOINTER I UMINUS I NOT

21 infixOp ::= * I / I MOD (6) I + I — (5) 1^4-(0) TIMES I DIVIDE I REM I PLUS I MINUS I ASSIGN

22 relOp ::= ?NOT ( (= I < I >) I # J ?NOT ( ?NOT x'.(EQUAL I LESS I GREATER)[Y1 I =

(<= I>=) I IN) = OR X^'(< I >) Y^' X^r>⁼Y^'.FIRST AND(x'<=y'.LAST

--In 19.30. BUT {BoundsFault= >FALSE} ) )

23 builtIn ::= -- These are enumerated in Table 4-5.

24 funnyAppl ::= FORK I JOIN I WAIT I NOTIFY I BROADCAST _I

SIGNAL I ERROR I RETURN WITH ERROR _I

•NEW I •START I •RESTART ItiTRANSFER WITH I tIRETURN WITH

25 subrange (typeName³⁷ I )

([1()e1..e201))

--In 19. 39. 48.

26 application :: = e [?argBinding ?applEn]

27 argBinding :: = (n (e I 1 *TRASH )). 1.. I

(e I I *TRASH ),

In 19. 26. •TRASH may be written as NULL. - as :.

27JapplEn ::= ! enChoice⁹; ...-- In 19. 26.

Examples

Iv: LabelValue¹³⁴-1 i, 3, "Hello". 31.4E-1, (i+ 1), g[x]+ lb.f+j.PRED, NIL ]:

p1: PROCESS RETURNS [INT]⁴-FORK j]; ERROR NoSpace; WAIT bufferFilled:

RT: RTBasic.Type(-coDE[LabelValueⁱt

h[ - 3, NOT(i>j), i NOT >j, p OR q, lbst];
10⁹⁴- [first-0,1ast-5,x-3.2,g-2S-5,r- NIL.s—"1"];

last-j]4-1v1⁹:

b: BOOL*-i IN [1..1O]: FOR x: INT IN (0..1 1) DO ...: b⁴-( c IN Color5⁴(red..green] OR x IN INT[0..10) );

LET t'-(typeName I INT) , first'-( e_l I e₁.SUCC ) IN e.mksuBrIANGE[firse. (e₂ I e₂.PRED )] BUT {BoundsFault=>e.MKENAPTYSuBRANGE[e_i]l

LET m'^—e, a' -[argBinding] IN ( (m'. APPLY ) ?applEn )
(n (e I OMITTED I TRASH)), !..

(e I OMITTED I TRASH ),

BUT enChoice; ... }

-- A constructor with some sample

-- expressions.

FunnyAppls take one unbracketted arg; many return no result, so

-- must be statements.

-- An application with sample expressions. -- Short for /v⁴- LabelValue^ll...1.

-- Assignment to VAR binding

-- (extractor).

Subrange only in types or with IN. -- The INT is redundant.

fh⁴-Files.Open[name^–lb.s, mode– Files.read

! AccessDenied=>{...}; FatalError=){...}]; (GetProcs[j].ReadProc)[k];

file.Read[buffer– b. count–k];

f[i-3, j– k^—TRASH]: f[i-3, k^—TRASH]; f[3„ TRASH]:

-- Keywords are best for multiple args. -- Semicolons separate choices.

-- The proc can be computed.

-- File.Reactflle, b, (object notation).
--land k may be trash (see defaultTC⁵⁵). -- Likewise, if i, j, and k are in that order.

Most of the forms of expression are straightforward sugar for application: prefix, infix and postfix operators, explicit application of a primitive proc²³, or the funnyAppl²⁴ in which the first argument follows the proc name without any brackets. All of these constructs desugar into dot notation (§ 2.4.4, § 4.14); this means that the procs come from the cluster of the first argument. The exceptions to this rule are ALL, CONS for variant records and lists, LIST, and the single-argument forms of LOOPHOLE and NARROW, and VAL: all of these get the proc from the target type of the expression (§4.2.3). All the primitive procs are described in § 4.

Note that AND and OR are not simply sugar for application. Rather, they are sugar for an if expression, since the second operand is evaluated only if the first one is TRUE or FALSE respectively.

The order of evaluation for arguments of an application, and therefore for operands in an expression. is not defined (unless the operator is AND or OR). However, the arguments are evaluated one at a time, and all arguments are evaluated before the proc is applied. In particular, an assignment which executes completely behaves as though both left and right operands are completely evaluated before any assignments are done, even if the left side is a binding such as [a^—x, b

Rules 19-21 give the precedence for operators: t and . are highest (bind most tightly) and +- is lowest. All are left-associative except which is right-associative. Application has still higher precedence.

Style using precedence: The precedence rules are sufficiently complex that it is wise to parenthesize expressions which depend on subtle differences in precedence.

The first operand of assign can be an argBinding²⁷ whose value is a variable group or binding, i.e., one whose elements are variables; this is sometimes called an extractor. The second argument will typecheck if it is a group or binding with corresponding elements which can be assigned to the variables. Usually the second argument is either an application which returns more than one result, or a record-valued expression. You can omit elements of the left argBinding to discard the corresponding values; however, you can't write TRASH in the left operand. Note that the right operand is fully evaluated before any variables are changed by the assignment. Thus, for example, if

Pair. TYPE^—RECORD[INT. INT]

you can write

[ Pali': i]

to transpose i and j.

The expresssion ERROR is short for raising a nameless ERROR exception. You should think of it as a call to the debugger, appropriate for a state which "can't occur".

A funnyAppl which takes more than one argument has the extra arguments written inside brackets in the usual way; e.g., START P[3, "Help"]. RETURN WITH ERROR is explained in § 4.10.

Anomaly about NEW: The funnyAppl NEW e actually stands for e.COPYIMPLINST. See § 4.4.1 and § 4.5.3.

Anomaly about enables in funnyAppls: Enable choices are legal only for the following funnyAppls: FORK JOIN RESTART START STOP WAIT. You can write empty brackets if necessary to get a place for the enChoices.

3.8 IF and SELECT

28 if ::= IF e₁ THEN e₂ (ELSE e₃ 1 )

29 select :: = SELECT e FROM

choice: endChoice

The ":" is "." in an expression: also in 32 and 34.

30 choice ::= ( ( I relOp2² ) e_l ), !..= >e₂

31 endChoice ::= ENDCASE ( =>e₃1) In 29, 32, 34.

32 safeSelect :: = WITH e SELECT FROM safeChoice: endChoice³¹

33 safeChoice ::= n : t => e₂

34 •withSelect ::= WITH (n_i — e_l I. e_l ) SELECT ( ] te_n) FROM

withChoice: endChoice^m

•The — may be written as :.
35.withChoice :: = n₂ = > e₂1

n₂, n₂, L. => e₂

Examples

i4-(IF j<3 THEN 6 ELSE 8):

IF k NOT IN Range THEN RETURN[7]; SELECT f[j] FROM

<7=>{.,,}^•

IN [7..8]=>{...}•

NOT <= 8 =>{...}; ENDCASE= >ERROR;

IF e₁ THEN e₂ ELSE (e₃ 1 NULL)

LET selector'–e IN

choice ELSE ... endChoice

-- ELSE is a separator for repetitions of the choice.

IF ((selector (= 1 relOp ) e_l) OR ) THEN e₂ ELSE (e₃ 1 NULL)

LET v'^-e IN

safeChoice ELSE ... endChoice

IF ISTYPE[V, t] THEN LET n : t⁴-NARROW[v', t] IN e.₂

OPEN –e IN LET n'^.–($n₁ NIL), type – V v'.
selector'–(e_i.TAG 1 e_n) IN withChoice ELSE ... endChoice

-- e_llmust be defaulted except for a COMPUTED variant.

IF selector = $n₂ THEN OPEN

(BINDP[n', LooPHoLE[v".type'.n₂] ] 1 BINDP[T1', ) IN e₂

An IF with results must have an ELSE. -- SELECT expressions are also possible.

IF K7 THEN {...} ELSE ...

-- 7, 8 => or =7, = 8 = >{...} is the same. ENDCASE=>{...} is the same here. -- Redundant: choices are exhaustive.

IF test THEN IF test₂ THEN ijTrue, ELSE ifFalse₂

the grammar is ambiguous about which IF the ELSE belongs to. It belongs to the second one.

A select²⁹ is a sugared form of if which is convenient when one of several cases is chosen based on a single value. The selector expression e is evaluated once to yield a value selector', and then each of the choices is tested in turn. Within each choice, each expression e_lpreceding the => is compared

in turn with selector': the comparison is selector' relop e_l if e_l is preceded by a relop: otherwise it is selector' =e₁. If any comparison succeeds. the expression e₂ following the => is evaluated to yield the

value of the select. If no comparison succeeds, the next choice is tried. If no choice succeeds, the expression e₃ following the ENDCASE is evaluated to yield the value of the select; e₃ defaults to

NULL, and hence must be present when the select is not a statement to prevent a type error.

Style for SELECT: It is good practice to arrange the tests so that they are disjoint and exhaust the possible values of the selector. ENDCASE should be used to mean "in all other cases"; often the appropriate e₂raises an error. Don't use ENDCASE to mean another specific selector value which you

don't bother to mention. Another acceptable form is SELECT TRUE FROM ..., which selects the first choice that succeeds, and is sometimes easier to read than a long sequence of ELSE IF'S.

Performance of SELECT If the e₂ are static and select subsets of the selector values, the average size

of these subsets is not too large. and the density of unselected values is not too high. a select compiles into an indexed jump, which executes in a time independent of the number of choices.

A safeSelect³² is a special form for discriminating cases of unions or ANY. The selector must be a value for which ISTYPE can be evaluated dynamically (§4.3.1): REF ANY, PROC PROC T^—*ANY, V, REF V, or (LONG) POINTER TO V, where V is a variant record. Each choice specifies one possible type that the selector might have, and declares a name which is initialized to the selector value if it has that type. Thus, the example tests for r having the types REF INT and REF REAL. If it has REF INT, the first choice's e is evaluated; within e, rInt is a variable initialized to the selector, and has type REF INT. Likewise for REF REAL and the second choice. As with an ordinary select, the ENDCASE expression is evaluated (with no new names known) if none of the other choices succeeds. Note that safeSelect does ordinary binding by value, not the binding by name done in open and withSelect

withSelect³⁴ is an unsafe and rather tricky construction for discriminating cases of unions. Its use should be avoided unless a safeSelect can't do the job; this is the case for a COMPUTED tag, or if the call by name feature of withSelect is required.

It incorporates an open (§ 3.4.2) of the e_lbeing discriminated. This means that e_lis

dereferenced to yield a variant record value. It also means that this value is not copied, and hence it can change its type during execution of a choice, either by assignment to the variant part of a variant record (an unsafe operation), or by a change in the value of e_l.

If the union has a COMPUTED tag, the selector value to be used for the discrimination must be given as e_n in the withSelect. It is entirely up to the programmer to supply a meaningful

value. If the tag is not COMPUTED, e_llmust be omitted and the selector value is e₁.TAG.

The n₂ preceding => in a choice are literals of the (enumerated) type (§4.7.1A) which is

the tag type of the union (§4.6.3). They are compared with the selector, and if one matches, the e₂ following => is evaluated as with an ordinary select. If exactly one is given, then the

e₂ following => is in the scope of

OPEN n_i^—LOOPHOLE[e_i.UNREF, V.n₂], or simply OPEN LOOPHOLE[e₁.UNREF, V.n₂]

if no n₁-- followed the WITH. If several n₂ are given, then there is no discrimination, and the e₂ following => is in the scope of

OPEN n ^—e UNREF or OPEN e UNREF

3.9 Miscellaneous

This section deals with various topics that are not naturally associated with particular types or grammar rules.

3.9.1 Static values

An expression has a static value if the compiler can compute the value. Static values are required in various contexts, notable in type expressions, and as the right hand side of a binding in an interface module. In Cedar, an expression has a static value (is static for short) if it is:

a literal:

a name bound to a static value:

an application to static arguments of

a proc declared INLINE with a static body, or

a primitive which is not a loop, a REAL primitive (except unary minus, ABS or INTTOREAL), ASSIGN, @ or NEW. Note that IF and SELECT are evaluated.

Note that values obtained from an interface are static, but imported values are not.

Performance of static expressions: The compiler evaluates all static expressions, not just type expressions. This is often important for efficiency.

3.9.2 Size restrictions

Current Cedar has the following restrictions on the sizes of values:

· A record type T must have T.sizE<2¹⁶.

· A row type T must have T.sizE<2²⁸ and T.RANGE.SIZE<2¹⁶.

· A type T with T.SIZE>2¹⁶ lacks the following procs:

ALL ASSIGN

CONS

DESCRIPTOR

INIT NEW

· A subrange type T must have

O<T.LAST— T.FIRST<2¹⁶

—2¹⁵<T.FIRST<2¹⁵

T.LASTVF T.FIRST<0 THEN 2¹⁵+ T.FIRST ELSE 2¹⁶)

3.9.3 Checking

Possible errors arising from certain primitive operations are checked, and cause ERROR exceptions if they occur, in a CHECKED block, or if the compiler's "u" switch is on:

Dereferencing NIL.

Narrowing an out-of-range value to a subrange type.

Assigning a local proc to a proc variable (in CHECKED blocks only).

In an UNCHECKED block these errors are not checked for unless the program is compiled with the "u" switch.

Chapter 4. Primitives

This chapter gives detailed information about the primitive types, type-returning procs (type constructors), and other procs. It should be read after § 2.4, which defines a Cedar type and explains the basic ideas underlying the type system.

§ 4.1 gives the partial ordering called the class hierarchy that is used to classify the primitive types. § 4.2 lists all the primitives of Cedar. §§4.3-4.11 give the declarations and semantics for all the primitive classes and types. These descriptions are ordered according to the class hierarchy in Table 4-1. Each one specifies:

The declarations in the class that are not in any bigger class. The constructor for types in the class.

Any literals or basic constructors for values of types in the class Anomalies and facts about performance.

The implies relations on primitive types are summarized in § 4.12, and the coercions in § 4.13. The various cases of dot notation are described in § 4.14.

4.1 The class hierarchy

A useful way of organizing a set of types is in terms of the properties of their clusters. Since a cluster is a binding, its type is a declaration; we call such a declaration a class. For example. the class Numeric is

[T: TYPE:

PLUS: PROC[T, 7]-->[7]:

MINUS: PROC[T, 7]—)171;

. . . -- Declarations for other arithmetic procs.

LESS: PROC[T, 71-0[8000;

-- Declarations for many other procs.

By convention, the name T in a cluster denotes the type to which the cluster belongs. We call each <name, type> pair in the class an item.

Sometimes when a type U is derived from another type T (e.g.. REF T from 7), some of Us items are obtained from Ts items with the same names in some simple way (e.g., REF RECORD[a, b: INT] has procs a and b which dereference the REF and then apply the record's a and b procs). We say that U inherits the items from T.

A type T is in a class C if T.Cluster has the type C: we also say that T is a C type, e.g., INT is in class Numeric, or is a numeric type.

To make this explicit, we give the type CLASS a cluster proc called T^ype. such that every type T in class C' has type CType. For example. INT has type Numeric.Type. Thus.

T is a C type T in C .=L T has type C. Type (C.Trpe).Predicate[71=TRUF

A value satisfies the predicate for C.Type if it is a type. and its cluster satisfies the declaration which defines C. E.g., INT satisfies the predicate for Nwrierie.Type because it is a type. and its cluster contains procs for PLUS. MINUS. LESS etc. with the right types. Precisely. (C. Type).Predicate is

X [T: ANY] IN TYPF.PredicateM A CPredieate[T.elumed

ordered § 4.7 discrete* I numeric* I pointer—_*address subrange § 4.7.3

discrete § 4.7.1 whole number—>numeric

enumeration --painted-- § 4.7.1AD(BOOL BOOLEAN 0 I CHAR CHARACTERO)

numeric § 4.7.2 whole nurnber*/discrete REALO § 4.7.2B

whole number long number* I short number*

§ 4.7.2A

long number 11‘1Ta- LONG INTEGERO I •LONG CARDINAL°^.

short number INTEGERODNATO CARDINALODNATO

•unspecified § 4.9.1 •UNSPECIFIEDO I •LONG UNSPECIFIED°^,
exception I DECL I BINDING § 4.9.2 --kernel only-‑

process § 4.10 MONITOR LOCK 0 I CONDITION°^,

Notation:
n*

n =e I ... nDe

n is further specified in one of the indented lines below. n is a type. rather than a class.

n has its main definition under (and implies) class m.

n also appears under (implies) class m.

n includes (is implied by) the e classes, which together exhaust n. n includes (is implied by) the e classes, which are special cases.

Table 4 — I: The class hierarchy

A class C is a subclass of another class D if C=D. Recall the implies relation for declarations (§ 2.2.1F) means that

Each name n in C is also in D.

n's type in C implies n's type in D.

Precisely.

(V nEC.names) nED.names A (C.DTOB.nD.DTOB.n)

For example. the class ORDERED includes

LESS: PROC[T. 71--+BOOL

Every subclass of ORDERED must also declare a LESS proc which takes two Ts to a BOOL. If we had a richer assertion language, there would also be axioms defining LESS to be an ordering relation. Similarly, every ORDERED type (e.g., TNT) must have such a LESS proc in its cluster.

The subclass relation defines a class hierarchy, i.e., it gives a partial ordering on classes. Table 4-1 gives the class hierarchy for the primitive classes of Cedar. It is presented as a tree: a node N with sons N1, N_2'N_k is written

N N_i I N₂, I... I N_I(

and any of the N that are not leaves are marked with a * and defined on following indented lines:

N. N_ilI

^N12* ."'

In fact, however, the class hierarchy is not a tree but a partially ordered set; some classes appear more than once in the table, with appropriate cross-references. Classes produced by Cedar type constructors are named by the constructors; other, more general classes are given suggestive names, sometimes lower-case versions of the constructor names. Each primitive type also appears in the table, under its class in the tree.

4.2 Summary of primitives

36 type :: = typeName _I^builtInType_ItypeCons

37 typeName = n₁1 typeName . n₂ 1

typeName [e] I •n₂ typeName typeName.sPEcIALIzE[e] _ItypeName . n₂

In 19, 25. 36, 40.1, 49. --n₂ names a variant

38 builtInType = INT I REAL I TYPE I ATOM I MONITORLOCK I CONDITION

* ?tUNCOUNTED ZONE I •tMDSZone I *LONG CARDINAL I *I' ?LONG UNSPECIFIED -- See Table 4— 2. TYPE only as t in a b or an interface's d. INTEGER, CARDINAL, NAT, TEXT, STRING, BOOL. CHAR are predefined.

39 typeCons :: = subrange²⁵ _IpaintedTC^40.1 I transferTC⁴¹ I arrayTC⁴⁴ I seqTC⁴⁵ _ItdescriptorTC^45.1 ^IrefTC⁴⁶ I listTC⁴² _ItpointerTC⁴⁸ •trelativeTC⁴⁹ I recordTC5⁰ I unionTC⁵² I enumTC⁵⁴ I defaultTC⁵⁵

Examples

P: PROCE b: Bufferi.Handle,

i: INT ^I-TEXT[20].SIZE

TypeIndex: TYPE–[0..256); BinaryNode: TYPE^— Node⁵².binary

-- A type from an interface.

-- A bound sequence; only in SIZE, NEW.

-- A subrange type.

-- A bound variant type.

The tables in this section summarize the primitive and predeclared types, type constructors and procs of Cedar. There are also a number of interfaces which contain useful procs or values of primitive types; in some cases, the distinction between a primitive in the language and one in such an interface is rather arbitrary. These interfaces are Process, Inline. CedarReals, AMTypes, Rope, SafeStorage. UnsafeStorage, ListsAndAtoms, PrincOps. Runtime.

4.2.1 Primitive types and constructors

Table 4-2 lists the primitive or predeclared types of Cedar, giving the name for each in the current language, and either a definition or, for the primitive types, a comment suggesting the meaning of the type. Later sections describe the items in the clusters of these types, and give their representations.

Text Box: 40 § 4.3.3
40.1§ 4.3.4 2 §4.3.5 2 § 4.3.5
41 § 4.4.1
41 §4.4.1
44 § 4.4.2.1
45 § 4.4.2.2
45 § 4.4.3
46 § 4.5.1
47 § 4.5.1.1
48 § 4.5.1.2
49 § 4.5.4
50 § 4.6.1
51 § 4.6.1
52 § 4.6.3 54 § 4.7.1.1
54 § 4.7.1.1 25 § 4.7.3
55 § 4.11 MKVAR [readOnly. short: BOOL*-FALSE] variable
-- This proc in the cluster of each type T produces the type VAR T or READONLY T.

REPLACEPAINT [in: TYPE, from: OPAQUE.Type] general

Text Box: MKINTTYPE INTERFACETYPE MKXFERTYPE [LIST OF DECLORBINDING] interface

[LIST OF ATOM] TYPE n

[flavor:{PROC,PORT,PROCESS.SIGNALERROR,PROGRAM}, transfer
domain, range: DECL ⁴-NIL, safe: BOOL⁺-ISCEDAR]

MKPROC [domain, range: DECL ⁴-NIL, safe: BOOL⁴-ISCEDAR] PROC
^—MKXFERTYPE[PROC, domain, range, safe]

MKARR AY [domain: DISCRETE. Type-CARDINAL, range: TYPE, ARRAY
packed: BOOL<- FALSE]

Text Box: SEQUENCE DESCRIPTOR
reference
LIST
pointer
RELATIVE RECORD
union enumeration enumeration subrange
luster. assignable MKSEQUENCE [domain: TAG, range: TYPE, packed: BOOL ^4-FALSE]

•MKARRAYDESCR [arrayType: ARR AY. Type,

Text Box: MKLIST I'MKPOINTER long: BOOL <- FALSE, readOnly: BOOL - FALSE] MKREF [target: TYPE, base: BASE ⁴-WORLD.
readOnly, ordered. uncounted: BOOL<- FALSE] [range: TYPE, readOnly: BOOL - FALSE] [target: TYPE⁴- UNSPECIFIED,

long, readOnly, ordered, base: BOOL -FALSE] ^—MKREF[larget^— target, readOnly^— readOnly. ordered^—ordered, uncounted^—TRUE. base^–(IF long THEN WORLD ELSE MDS)].

*tMKRELATIVE [range: TYPE, baseType: BASE. Type]

MKRECORD [fields: DECL,

or MKMDRECORD access: {PUBLIC. PRIVATE} ⁴-CURRENTACCESS,
monitored: BOOL ⁴-FALSE]

MKPOSITION [first Word: NAT, firstBit: lastBit: INT<- –1]

MKUNION [selector: TAG, variants: LIST OF FIELD]

MKENUMERATION [LIST OF ATOM]

MKMDENUMERATION [LIST OF RECORD[ATOM, NAT]]

MKSUBRANGE [FIRST: T, LAST: T]

T is the Discrete base type, which has a MKSUBRANGE type constructor in its c CHANGEDEFAULT [type: TYPE, proc: (PROC[]^–'type), allowTrash: BOOL]

Table 4 —4: Type options and their constructors

4.2.3 Primitive procs

The primitive procs and other values of Cedar are listed in Table 4-5. All of the primitive procs in the Cedar language except the type constructors (see Table 4-3) appear here.

The Name column gives the name of the value in the cluster. For a proc, the following symbol summarizes the handling of exceptions:

A "!" means that application can cause an exception. and you can write an applEnable²⁷¹.

An italic "!" means that an exception is possible. but you cannot write an applEnable. If you are desperate, enclose the application in a block with an enable.

An italic "!!" means that an exception should be possible, but the implementation does not make the necessary check (e.g., for overflow on adding INTS).

If nothing follows the name, no exception is possible.

The Classes column gives the classes in which the name appears; see Table 4-1.

The Type column gives the type with which it is declared in those classes. The type usually refers to other names of the class. Since it is taken from the class declaration, it can use these names without explicit qualification: see the detailed class descriptions in § 4.3-4.11 for their meanings.

The Notes column gives information about how a proc is applied or a non-proc value is denoted in current Cedar. In the kernel a proc named P from the cluster of type T is applied to a value x of type T by the expression x.P if there is only one argument or x.P[y. ...] if there are several. In current Cedar, however, not all primitives can be applied or denoted by dot notation. There are three other ways of applying a primitive proc:

It may be an operator with a symbol listed in the Notes column. If it takes two arguments. the operator is infix. Thus for a proc named P with operator symbol EB, you write xey instead of x.PM. If it takes one argument the operator is usually prefix: you write ex instead of x.P. The t operator is postfix: you write xt instead of x.DEREFERENCE.

It may be a built-in proc named P, in which case you usually write PH or P[x, y, ...] as an alternative to x.P or x.P[y. ...]. For each built-in which cannot be applied using either of these notations, the ways of applying it are indicated explicitly in the Notes column; any ways not mentioned cannot be used.

It may be a funny application proc named P, in which case you write P x or P x[y, ...].

The three kinds of primitive proc are listed in that order, alphabetically within each kind. Values which are not procs (ABORTED, FALSE, FIRST. LAST. NIL. SIZE. TRUE) are listed with the built-in procs. Except for ABORTED, FALSE and TRUE, which are globally known and must be written alone, the cluster must be specified by dot notation (ivr.stzE) or optionally as an argument (S1ZE[INT]).

A few primitive procs cannot be desugared so simply into dot notation. These cases are indicated in the Notes column, and are described here:

Some PROC [T]-0[U] are coercions: CONS, FROMGROUND, LONG, TOGROUND, VALUEOF. This

means that they may be invoked automatically when typechecking demands a U and an expression has syntactic type 7; see § 4.13 for details.

Some involve target typing: ALL, CONS. LIST, VAL, union constructor: they are marked TT. For these the proc does not come from the cluster of the type of the first argument. Instead, it comes from the cluster of the so-called target type. An application of one of these procs must appear as an argument in another application (e.g., f[y, NARROW[x]] or z⁴-NARRow[x]). and not before a dot. In this context the target type is known from the declaration of the outer proc being applied (for ZASSIGN in the example: if the type off is PROC [U, 7]^—4[V], the target type for the NARROW application is 7). Target typing is also used for enumeration literals (§4.7.1A), and is optional to default the type argument of

DESCRIPTOR, NARROW or LOOPHOLE.

One is ambiguous: MINUS for CHAR and pointer. The type of the second argument decides.

Name Notes Classes Type

Operators (infix except as noted)

VARTOPOINTER @(prefix) general

EQUAL general

ASSIGN assignable

PLUS!! numeric

•CHAR. •ointer

macs!! numeric

'CHAR. •pOinter

ambiguous

UMINUS!! —(prefix) numeric

TIMES!! numeric

DIVIDE! / numeric

LESS ordered

GREATER > ordered

same as NOT DEREF- t(postfix) ref,

ERENCE! pointer

REM! MOD whole number

NOT NOT(prefix) BOOL

UNSAFE PROC[71^—*[MKPOINTER[larget-- T.TARGET, long" LONG]]

PROC[x: T, y: 7]-⁰[BOOL]

UNSAFE --sometimes-- PROC[x: VAR T, y: 71-⁰[7]

PROC[T, 7]-⁰[71 PROC[T, INTEGER]^—⁴171

PROC[T, 7]-⁴[71 PROC[T, 1-4[INTEGER]

PROC[T, INTEGER]^—,[71

PROC[T]-^1,171 PROC[T, 7]^—⁴[7] PROC[T, T]-0[71 PROC[T, 7]^—⁴[BOOL]

PROC[T, 7]-.[BOOL]

PROC[r. 1-4[TARGET]

UNSAFE PROC[r: 7]^—>[TARGET]

PROC[T, 71^—,171

PROC[BOOL]^—,[BOOL]

Text Box: FALSE FIRST!! first
FREE!
FROMGROUND
ISTYPE! LAST!! LENGTH LIST!
LONG
LOOPHOLE MAX
MIN NARROW! NEW!
NIL
ORD FRED!!
rest
SIZE SIZE SUCC!! TOGROUND TRUE VAL Text Box: FALSE
&first
FREE[...] or z.FREE[... FREE[...] or z.FREE[... ! if...]: coercion
ISTYPE[x, U] BOOL bool

discrete

LIST PROC[l: T1^—4[RANGE]

] ZONE PROC[z: 7', p: NEVVTYPE[NEWTYPE[U]]]^—+D

] UNCOUNTED ZONE UNSAFE PROC[z: T, p: NEWTYPEINEWTYPE[U]]]^—>[]

subrange PROC[x: GROUND]—+[7]

general PROC[x: T, U: TYPE]--'[BOOL]

discrete

AR RA Y,descriptor PROC[a: 7]-0[CARDINAL]

Text Box: Lisr[...] or z.Lisit...]: TT
also coercion also coercion also coercion
only; TT for U MAX[!..]
MIN[!..]
NARROW[...] : TT for U NEW[...] Or z.NEW[...] Text Box: LIST
short number
POINTER DESCRIPTOR
general ordered ordered general
ZONE Text Box: also NIL alone Text Box: L rest
S1ZE[T, n]
coercion
TRUE VAL[...]; TT Text Box: variable, address. transfer
enumeration discrete
LIST
general general discrete subrange
BOO L
enumeration PROC[z: ZONE g: RANGE X ...]^—+[7]

PROC[x:7]^—'[LONG 7]

PROC[p:7]^—'[LONG POINTER TO T.TARGET] PROC[p:7]^—⁴[LONG DESCRIPTOR FOR ARRAY OF T.RANGE] UNSAFE -- if u is RC-- PROC[t: T. U: TYPE]^—'[U]

PROC[T, PROC[T,

PROC[x: T, U: TYPO-0^.[U]

PROC[z: T^€SafeStorage.GetSystemZoneg, U: TYPE]

NEWTYPE[U]] Tor NILTYPE

PROC[7]^—ql NT] PROC[x: 1-.[T]

PROC[7]-1^.[7]

CARDINAL

PROC[T, CARDINAL]^—'[CARDINAL] PROC[x:

PROC[x: 7]^—'[GROUND]

BOOL

PROC[INT]-0171

Funny applications

BROADCAST no args ERROR

FORK! FORK P[args]

JOIN! no args

NEW no args

NOTIFY no args

RESTART! no args
RETURN WITH

RETURN WITH ERROR SIGNAL

START!

STOP! no args
TRANSFER WITH

WAIT! no args

Not in current Cedar

APPLY!

BINDING BYTESTOINSTRUCTIONS

Cluster Default

DOMAIN

DUMPSTATE HIDEEXCEPTION

!NIT

ISLONG

ISREADONLY LOCALSTRING coercion MACHINEINSTRUCTIONS NAMES

NEWEXCEPTIONCODE NEWFRAME

NEWLABEL

OMITTED

OPENPROCS

Predicate

RANGE TARGET TOVOID Trash UNBOUND

UNCONS coercion
UNREF

VALUE VALUEOF

CONDITION PROC[T]^—>
SIGNAL, ERROR like APPLY

PROC PROC[PROC[DOMAIN]^—'[RANGE]]^—>

[PROC[DOMAIN] —> [PROCESS []—*[RANGE]]]

PROCESS UNSAFE PROC[7]^—'[RANGE]

PROGRAM, PROC[p: 7]^—>[T]
POINTER TO FRAME

/: TYPE Imp PROC[p: POINTER TO FRAME[!]]^—'[POINTER TO FRAME[!]]

CONDITION PROC[T]^—[]

PROGRAM PROC[T1^—]

PrincOps.StateVector

ERROR like APPLY

SIGNAL like APPLY

PROGRAM like APPLY

PROGRAM PROC[1-⁴0

PrincOps.StateVector

CONDITION PROC[T]—+Q

Text Box: UNSAFE --sometimes-- PROC[map: T, arg: DOMAIN]—> [RANGE]
PROC[]-->[T1 TYPE
BOOL BOOL
PROC[ [CARDINAL] ]—'[STRING] map

general general map

variable variable variable

STRING

binding, decl exception

decl PROC []^—*FRAMETYPE[T]

exception PROC []^—⁴EXCEPTION

general PROC[ANY]-0BOOL

map TYPE

reference TYPE

assignable PROC [T]--'[VOID]

general PROC[]—>[T]

record PRoc[7]—'[FIELDS]
general

variable TYPE

variable PRoc[7]—'[VALUE]

•Anomaly about SIZE: There is another SIZE item in each cluster:.

SIZE: PROC[n: DOMAIN[^—⁴[CARDINAL] -- Returns the size of a PACKED ARRAY [0..n) OF T.

Apply with sizE[T. n].

This proc can be useful in calculating the space required for the target of a descriptor for a packed array. You can only apply it with sizE[T. n]; the second argument is what selects this proc. It is usually better to use a sequence.

In current Cedar the value of ISTYPE[x, TI is determined as follows. Here Tr-ZU means that T.Predicate= U.Predicate. Two types may be unequal and yet have the same predicate if they have different clusters. Currently, the cluster can only be changed by CHANGEDEFAULT.

1) It is TRUE statically if: Vx.--zT, or

one of G'x and T is an opaque type, and the other is the corresponding concrete type (only in an implementation that exports the opaque type).

2) It is tested dynamically if (with V any variant record type without a COMPUTED tag, and a the name of a particular variant).

Vx^-ZREF ANY and TZ.REF U for any U except ANY, or

U RETURNS ANY and T=PROC U RETURNS V for any U, and any V except ANY, or

Vx=PRoc ANY RETURNS V and T=PROC U RETURNS V for any V, and any U except ANY, or

Ox"-^---REF V. and TZREF V.a, or

Vx:-.-(LONG) POINTER TO V and TZ(LONG) POINTER TO V.a.

Ox V, and Tz. V.a, or

Note that the result is TRUE if x= NIL (except in the last case).

3) It causes a static error in all other cases. even if it is statically false.

In current Cedar, NARROW[x. T] is

IF ISTYPE[x. 71 THEN x ELSE ERROR e

where e is

AMTypes.NarrowRefFauli[x. T.RANGE.CODE] if ISTYPE[x, REF ANY]:

AMTypes.NarrowFaultU otherwise.

Note that NARROW[x, 7] gives a static error if ISTYPE[x. 7] does (case (3) above). Note also that ISTYPE and NARROW are conveniently packaged in the safeSelect construct (§ 3.8).

Performance of ISTYPE for PROC ANY: The ISTYPE (and therefore NARROW) of a PROC type with ANY domain or range are very slow, since they use AMTypes to do the test, and it consults the symbol tables.

Anomaly for target typing of NARROW and LOOPHOLE: For NARROW and LOOPHOLE the second argument may be defaulted to the target type.

Anomaly for LOOPHOLE on variable types: For a variable type T, if T.LOOPHOLE is applied to a second argument U which is not a variable type, U is coerced to UJAKvARD. Thus

{x: INT; LOOPHOLE[x, BOOL]-TRUE1

leaves x=1.

Every general type T with T.size<2¹⁶ has an EQUAL proc except a variant record or union type. A variant record type has EQUAL only if its variant part is a union in which all the cases are the same size. Note that a bound variant does have EQUAL, unless it is itself a variant record. EQUAL is denoted by the infix operator =.

Anomaly for equality of variants: If v is a variant record and b^y is one of its bound variants, the expression bv= v applies the EQUAL proc of the bound variant. This works even though v is not of the same type as by.

Representation and address equality: EQUAL compares addresses in the representation of a value; it does not dereference them. Thus types like ROPE and ZONE which are represented by addresses are compared by comparing the addresses.

Restriction on EQUAL procs: A type has an EQUAL proc only if T.size<2¹⁶.

4.3.2 Assignable types

Most types (see Table 4-1 for exceptions) are in this class, which is a subclass of general (§ 4.3.1) and has items:

ASSIGN: UNSAFE --sometimes-- PROC -- Returns y after storing it in x. Denoted by the

[x: VAR T, y: 7]--*[7] right-associative infix operator ⁴'.

TOVOID: PROC[7]^—+ -- Discards the value. See § 3.6.

Default: PROCO^—+[T] See § 4.11.

Trash: PROC[]—4[T] -- See § 4.11.

As explained in § 3.7, groups and bindings are assignable if their components are. Since you cannot write these types in declarations, you have to write the constructors explicitly on the left of the ⁴-;they are called extractors. E.g.,

{x: INT; y: REAL; Ex. y1⁴- Pai42. 3.4] }

Note that if T is not assignable. it cannot be used as the type of a proc argument or result, since arguments and results are passed by assignment to variables.

unions and variant records:

assigning an unspecified type to anything except itself.

In a CHECKED block, a proc value cannot be assigned if it is local to another proc rather than to an implementation (since this could lead to a dangling reference). This is checked at runtime.

Restriction on ASSIGN procs: A type T has ASSIGN only if T.stzE<2¹⁶.

Representation of ASSIGN: Since it involves a VAR parameter, an ASSIGN proc cannot be written in current Cedar. The primitive ASSIGN procs simply copy the bits of y's representation into the variable x, unless some of them represent REFS. In this case the assignment involves reference-counting if x is in counted storage; see § 4.5 for details.

CHANGEDEFAULT can take any type and produce a new one which is identical except for the cluster items named Default and Trash which determine how default values are supplied when a binding value is coerced to a decl type; see § 4.11 for details.

4.3.3 Variable types

4/3 varTC ::= ( I READONLY I VAR) t I ANY ( VAR I READONLY I VAR) t ANY
In 11.45 —48. ANY only in refTC. VAR only in interface decl.

For every non-variable type T there are corresponding variable types:

VAR T

READONLY T

SHORT VAR T

SHORT READONLY T

You cannot denote these types in current Cedar except in a few contexts, but they are fundamental to an understanding of how it works nonetheless. The basic facts about variables in Cedar are given in § 2.3.3. A variable type is made by the MKVAR proc in the cluster of the non-variable type (§ 4.3.1).

The variable class is a sub-class of general (§4.3.1) and hasNiL (§4.3.7), and has items:

VALUE: TYPE: -- (VAR U).VALUE=U: T.VALUE.MKVAR = T.

VALUEOF: PROC[71^—⁴[VALUE]: -- A coercion.

ISLONG: BOOL: -- FALSE for short vars.

ISREADONLY: BOOL: -- TRUE for readonly vars.

VARTOPOINTER: UNSAFE PRoc[7]—4 -- Apply by prefix et.

[MKPOINTER[range^— T.VALUE, long—ISLONG, readOnly^—ISREADONLY]];

Furthermore, T inherits the cluster of T.VALUE. The procs are not modified, since the VALUEOF coercion provides them with T.VALUE arguments where needed. There is one exception: the component procs described below are replaced by procs which return variables instead of values.

The INIT proc (§4.3.1) converts a block of storage into a legal variable of type T. at least in theory. In fact, it is currently a no-op except for

RC types (§4.5); these are set to NIL.

Bound variants; the tag field is set appropriately.

INIT cannot be supplied or called directly by the user; it can only be called indirectly, from NEW.

The NEW proc (§4.3.1) calls on the zone z to obtain a block of storage of size T.SIZE (§ 4.5.2), and applies T.INIT to convert the block into a VAR T, call it x. Then if T.Default exists, NEW calls it and assigns the result to x.

Caution on finalization: A variable type may have a finalization proc, which is called when no client references to a variable remain; see SafeStorage. This proc is executed concurrently, and must therefore provide proper synchronization.

Restriction on NEW: A type has a NEW proc only if T.sIzE<2¹⁶.

•The @ operator (VARTOPOINTER) does not work on a variable v which is a component of a packed array (no matter what its type is), or component of a record if v is represented in less than 16 bits. These restrictions are machine-dependent, and @ is unsafe: avoid it if at all possible.

Composite variables and component procs

MKVAR commutes with a composite type, cross type or declaration constructor. For example,

VAR [a: INT, b: REF ANY]

is equal to

[a: VAR INT, b: VAR REF ANY]

and likewise for READONLY. Similarly. VALUEOF commutes with the component procs for values of these types, so that v.a.VALUEOF= v.VALUEOF.a if v has the variable type just mentioned.

Another way to think of this is that one of these variables is the composite of a set of variables, one for each component If T is a record type, row type, cross type or declaration, then a component proc in Ts cluster which extracts a component of a T value (e.g., a field proc, APPLY which subscripts the array, etc.) has a counterpart in the cluster of VAR T which extracts a variable. Thus if a and r are array and record variables, then a[i] and r.f are also variables which can be modified by

ASSIGN.

4.3.4 Opaque types

4o.ipaintedTC::= typeName PAINTED t REPLACEPAINT[in^—t, from–typeName]
typeName must be an opaque type, t a recordTC or enumTC.

Example

HV: TYPE–Interface.HistValue PAINTED -- See 13 for use.

RECORD[...]

An opaque type declaration in an interface is the only way to declare a type parameter (except for the interface parameters declared in the DIRECTORY). Such a type parameter is called opaque. The type of an opaque type must be TYPE[ANY] or TYPE[n]: thus you can write

T: TYPE[ANY]

T: TYPE[n]

in an interface. These expressions are non-functional: each generates a new mark, and a type can be exported to T (i.e., has the type denoted by the TYPE[ANY] or TYPE[n] expression which declares T, and hence is an acceptable argument value for this formal parameter) only if it carries that mark. A type exported to T: TYPE[n] must have additional properties described below.

You attach one of these marks to a type using a paintedTC. The type being painted (t in the rule) must be a recordTC or enumTC. The paint comes from the typeName, which must be an opaque type: it replaces the new paint which the constructor would have supplied.

Any record or enumeration type can be painted from a type declared TYPE[ANY]: only a type so painted can be supplied as the argument for the declaration T: TYPE. T is called fully opaque.

A type V can be painted from a type U declared TYPE[n] if:

V.SIZE= n.

V is a recordTC or enumTC and has standard NEW, INIT, ASSIGN, EQUAL and ISTYPE procs. All the assignable primitive types do except

the RC types (§4.5.1):

bound variant types (§ 4.6.2):

types produced by a defaultTC⁵⁵:

composite types with a component that has a non-standard NEW, INIT. ASSIGN, or EQUAL proc.

Representation of standard procr. The standard NEW proc allocates n words. The standard 'NIT does nothing. The standard ASSIGN copies n words. The standard EQUAL compares n words bitwise. The standard ISTYPE compares the mark of the value with a single mark associated with the type.

Only a type painted with U can be supplied as the argument for the declaration U: TYPE[n]. U is called n-opaque.

Example: For the interface:

I: DEFINITIONS^—{ FO: TYPE[ANY]:

nO: TYPE[SIZE [INT]] }

this implementation is suitable:

limp!: PROGRAM EXPORTS I-1

FO: PUBLIC TYPE^— 1.F0 PAINTED RECORD[a: INT, b: ROPE];

nO: PUBLIC TYPE^— /.7/0 PAINTED RECORD[INT]:

Note that replacing INT by REF ANY in nO would not work. since this does not have standard ASSIGN and INIT procs.

The cluster of a fully opaque type T is empty: it provides no operations. A T value cannot be passed as a parameter, and there are no VAR T variables. Thus you cannot use T as the type in a declaration. The only thing to do with T is use it as the target of a reference type such as REF T.

The cluster of an n-opaque type U has VAR, NEW, INIT, ASSIGN, EQUAL and ISTYPE procs (the last not yet implemented). Thus these operations can be done on a U value. As a consequence, a U value can be passed as a parameter and declared.

Restriction on values of opaque types: All instances of any interface produced by applying an interface module which declares an opaque type T must supply a type value with the same predicate for T if they supply any value at all: this value is called the standard implementation of T. Because of this restriction, clients can safely interassien values of type T. no matter how obtained. In addition, it is safe for any exporter of T to convert a value of type T to a value of the corresponding concrete type, and conversely. The restriction arises from the fact that the type is identified by its mark: hence the same mark must not be assigned to two different types.

Anomaly on referencing opaque 'types: It is not necessary to import an interface to refer to an opaque type declared in that interface (because of the above restriction).

Within an implementation P which exports an opaque type T declared in interface I, LT and P.T (simply T within P) imply each other. However, they have different clusters, and are not equivalent. You can convert from one to the other using NARROW (§ 4.3.1).

Performance of converting between opaque and concrete types: The conversion between an opaque type and the corresponding concrete one costs nothing at runtime.

42.5 Interface types

The type of an interface module is d--4n_.: TYPE n _m],where d is the declaration given in the DIRECTORY; when the module is applied, the result is an interface, with type TYPE n _m.The interface

is itself a type. A value of that type is an instance exported by an implementation module that exports the interface. These classes have no standard items (except an implementation instance. which has COPYIMPLINST), but the clusters of these types do have the items bound or declared in the interface. Thus you cannot do anything with these types except

use them in a DIRECTORY. IMPORTS, SHARES. or EXPORTS:

select items from the cluster using dot notation:

use an interface type in an open.

See § 3.3.4-5 for complete information.

4.3.6 ANY

The type ANY is implied by every type. ANY cannot be the type of a d or b item, and an expression never has syntactic type ANY unless it is an ERROR application. ANY can only be used as the target of a REF or as the domain or range of a transfer type. A value whose type involves ANY cannot be dereferenced or applied, since these operations would yield an expression with syntactic type ANY. However, it can be narrowed (§ 4.3.1).

4.3.7 HasNIL

Variable, address and transfer types are in this class, which is a subclass of general (§ 4.3.1), and gives them one thing in common:

NIL: T -- A distinguished value pointing to no storage.
There is a universal value NIL (with type NILTYPE) which can be coerced into any particular T.NIL.

4.4 Map types

The map class is a subclass of assignable (§ 4.3.2) and has the items:

DOMAIN: TYPE: -- Domain type for the mapping.

RANGE: TYPE: -- Range type for the mapping.

APPLY: PROC[map: T,arg: DOMAIN]^—)[RANGE] map[ardis sugar for map.APPLY4arg. In current

Cedar, you can write this explicitly only for transfer types.

Usually DOMAIN and RANGE are declarations, so that bindings can be used for the arguments and results. Application is denoted by brackets (map[arg]), or explicitly (APPLY[map, arg]) for transfer types only.

There are several subclasses of map in Cedar, each with its own APPLY proc. These are summarized here, and treated in detail in the sections on the various subclasses.

Primitives (since you can't get hold of the value of the primitive, these can be applied only with the various special syntactic forms summarized in Table 4-5).

Transfer types: procs, and their close friends processes, signals, errors, ports and programs: applying a transfer value executes the body of some A-expression (§ 4.4.1). § 2.2.1 and § 2.6 tell all about applying procs.

Row and descriptor types: applying an array, sequence (or sequence-containing record), or array descriptor to an index value yields a value of the component type (§ 4.4.2).

BASE POINTER types: applying a base pointer to a value which is relative to that base yields a (non-relative) pointer: this is unsafe (§4.4.3).

Reference types: if the base type T has APPLY, then the reference type inherits it composed with DEREFERENCE. so that a[arg] is the same as at[arg] (§ 4.5.1).

In addition, many subclasses of TYPE have APPLY procs with assorted meanings (§ 4.8).

4.4.1 Transfer types

41 transferTC::=?safety4 xfer ?drType 41.ixfer ::= PROCEDURE I PROC I PROGRAM

PORT I PROCESS I SIGNAL I ERROR

42 drType = ?fields, RETURNS fields₂ I fields,

No domain for PROCESS. In 3.41.

43 fields ::= [d^u, ] 1 [C, ] 1 ANY
ANY only in drType. In 42. 50. 52.

MKXFERTYPE[drType, flavor–xfer]

domain–fields,, range–fields₂

Examples

Enumerate: PROC[

1: RL,

p: PROC[X: REF ANY] RETURNS [stop: BOOL]] RETURNS [stopped: BOOL];

p2:PROCESS RETURNSD:INTI€ FORK stream.Get: failed: ERROR [reason: ROPE]^—CODE;

Transfer is a subclass of map (§ 4.4) and of hasNii, (§ 4.3.7). The subclasses of transfer are PROC. PORT, PROGRAM, PROCESS, SIGNAL, and ERROR. These types are constructed by transfer type constructors which begin with those words, or in the kernel by the MKXFERTYPE constructor. What they have in common is that application executes the body of some A-expression, but the transfer class adds no items to the map class.

One transfer type T implies another U if

The subclass is the same.

T.RANGE implies U.RANGE.

U.DOMAIN implies T.DOMAIN.

See § 2.3.2 and § 4.12. One declaration D implies another E if:

They have the same names, or each has only one name, and

The corresponding types imply each other.

I.e.

If n: T is in D and n: U is in E, then T^.U.

If D=[m: 71 and E=[n: U], then TAU.

See § 2.2.1F. D implies a cross type T if D.T implies T: in this case T also implies D.

Either the domain or the range of a transfer type (or both) can be ANY. A value of these types cannot be applied, but it can be narrowed to a specific transfer type (§ 4.3.1).

Representations for transfer types are given in the PrincOps interface. They tend to change when the machine architecture changes.

An attempt to apply a NIL transfer value results in the error Runtime.UnboundP MC.

PROC types

The PROC class is a subclass of transfer (§ 4.4.1) with no additional items. In the kernel, a new proc value is made by evaluating a A-expression. In current Cedar, it is made by a binding of the form P:

in a block, where T is a proc type; see § 3.5.1 for details.

Assignment of a proc may lead to a dangling reference, if the proc value is for a local proc P and it survives the return of P's enclosing proc. In a checked block any assignment of a local proc value is disallowed (except the assignment of a parameter value to a parameter variable).

PROGRAM types

The program class is a subclass of transfer (§ 4.4.1), and also has items:

·STOP: PROC[]^—>111 -- Apply by STOP. Legal only if RANGE = O.
Denoted by STOP, since it takes no arguments.

·RESTART: PROC[7]^—>1] -- Apply by RESTART P. Legal only if RANGE = 0.

·COPYIMPLINST : PROC[p: 71—*[71 -- Apply by NEW P*,

Their use is not recommended; for details, consult a wizard. For more on implementations, see § 3.3.2.1 and § 3.3.5. COPYIMPLINST makes a copy of the implementation module for which p is the program proc, and returns the program proc of the copy. See § 4.5.3 for more details.

The syntax for applying a program P is

START P[args]

·The START may be omitted, so that it looks like an ordinary application; avoid this feature. This expression's type is VP.RANGE.

A program value is obtained from the frame of an implementation, which always includes the item:

Imp: PROGRAM T^— PP:,

where Imp is the name of the module, T its drType, and PP its program proc; see § 3.3.5. This value can be accessed:

from an interface exported by Imp which declares Imp as a PROGRAM T; as Pimp. where F is a POINTER TO FRAME of the implementation; as the CONTROL, item returned by the module.

·PORT types

Use of ports is complex, unsafe and not recommended. See chapter 9 of the Mesa manual if necessary.

PROCESS types

The process class is a subclass of transfer (§ 4.4.1) with no other items, but ProcesxAbort[P] raises the ERROR ABORTED in P. § 4.10 describes Cedar's facilities for concurrent programming.

A process always has DOMAIN =1]. The syntax for applying a process P is

JOIN P

This expression's type is VP.RANGE. •The JOIN may be omitted, so that it looks like an ordinary application; avoid this feature.

A process value is obtained from:

FORK: PROC[PROC[DOMAIN]^—*[RANGE]J^—>[PROC[DOMAIN]^—>IPROCESS 0—+[RANGE]]]

The syntax for using this is

FORK P[args].

The FORK P returns a proc which when applied to args creates a new process, starts it running, and returns it.

Anomaly for FORK: Note the peculiar parsing (FORK P)[args]. You cannot write FORK P alone to get hold of the process-creating proc.

SIGNAL and ERROR types

These are subclasses of transfer (§4.4.1) with no other items.

In the kernel, a new signal or error value is made by applying NEWEXCEPTIONVALUE. In current Cedar. it is made by a binding of the form

E: T^—CODE

in a d or b, where T is a signal or error type. The effect is to construct a unique, exception value. not equal to any other. An enable choice which catches this value will only catch an exception raised with this value; it cannot catch some other expression with the same name.

Anomaly for CODE: Unfortunately, CODE does not yield a unique value at each execution. The value is only unique to the textual occurrence of CODE and the module instance; if CODE appears inside a proc, the same value is produced each time the proc is applied. Thus care may be needed if the proc is recursive.

The syntax for applying an error (signal) E is ERROR (SIGNAL) E[args], or ERROR (SIGNAL) E if there are no arguments. For a signal, this expression's type is VE.RANGE; for an error, its type is ANY (since control can never return). •If the argument constructor is present. the ERROR or SIGNAL is optional; avoid this feature.

§ 2.6.2 and § 3.4.3 explain errors in detail. A signal is exactly like a proc, except that the closure that is executed is obtained from the statement of an enable choice; see § 3.4.3A for details.

You can write an expression consisting simply of ERROR; this is short for ERROR NAMELESSERROR. Here NAMELESSERROR is an error you cannot denote in the program. Hence it cannot be caught (except by ANY); you should think of it as a call to the debugger.

4.4.2 Row and descriptor types

44 arrayTC = ?*PACKED ARRAY ?L_i OF t₂

45 seqTC = ?*PACKED SEQUENCE tag⁵³ OF t Legal only as last type in a recordTC or unionTC. 45.1tdescriptorTC =

?LONG DESCRIPTOR FOR varTC⁴⁰

varTC must be an array type.

Examples

Vec: TYPE^—ARRAY [0..maxVecLen) OF INT: Chars: TYPE^—RECORD [text: PACKED SEQUENCE

len: [0..INTEGER.LAST] OF CHAR]: ch: Chars; v: Vec— A LL[0]:

dV: DESCRIPTOR FOR ARRAY OF INT‑

DESCR IPTOR[v];

mKARRAv[domain—t₁, range —t₂1

MKSEQUENcE[domain —tag, range—t]

mKARRAYDEscR[arrayType^—varTC]

-- A record with just a sequence in it. ch.rext[i] or ch[i] refers to an element.

A row value provides an indexed set of values of an arbitrary type. called the components of the row; application maps an index into the corresponding value. Usually the values are variables, so that assignment to a component is possible. A descriptor is an unsafe pointer to a row which includes a subrange of the domain or index type in the descriptor value; thus values of the same descriptor type can point to rows of different sizes. Because all the row types use the same representation for the set of values, it is possible to make a descriptor from any row value.

The domain or index type of a row must be a discrete type with no more than 2¹⁶distinct values: note that this rules out large subranges of INT. There is one element in the range set for each value of the domain type.

The PACKED argument of the row type constructors governs the representation of a row whose range type is represented in <8 bits. See the discussion of representation below. It also disallows

the use of @ on an element of the row.

The row class is a subclass of map (§ 4.4) and also has the item:

DESCRIPTOR: UNSAFE PROCk: VAR 71—)[LONG -- Returns a descriptor for r. DESCRIPTOR FOR ARRAY DOMAIN OF RANGE]

Since DESCRIPTOR returns an address, it must take a VAR; i.e., it can't be given a row value such as a constructor, but demands a row which has been declared or allocated.

Representation of rows: A VAR row value is represented by a contiguous block of words. If PACKED= FALSE, each element VAR occupies T.RANGE.SIZE words, and the successive elements occupy consecutive blocks of storage. beginning with the one indexed by T.DOMAIN.FIRST. If PACKED= TRUE and a T.RANGE value is represented in n<8 bits, each element occupies ^{2CEILING[L002I}n]lbits, i.e. 1, 2, 4 or 8 bits depending on its size; PACKED has no effect on the representation for ranges with bigger values. Note that the entire representation of a packed array may be smaller than a word, and need not be word-aligned in another packed array or in a record. This is the entire representation of an array value; a sequence value also has a tag field, which is represented like a component of the containing record.

Restriction on row sizes: A row type must have T.sizE<2²⁸ and T.RANGE.S1ZEQ¹⁶.

It is not possible to obtain a REF to a row component; this is because the implementation of both reference counting and REF ANY discrimination requires more information about each VAR than is available for an array element. If the row is PACKED. it is not possible to apply @ to obtain a pointer to an element either.

Performance of row arguments and results: Passing a row as an argument or result entails copying the representation. Unless the row is quite small, this is expensive. It is usually better to pass a REF. Very large rows (say, more than 100 words) should not be declared in a block, since this results in large frames which consume the 64k words of frame space. Instead, they should be allocated with

NEW.

4.4.2A ARRAY types

An array is a row with an element for each value in the domain: its APPLY proc is a total function. The advantages of this are that no space is needed to store the length of an array, and any bounds checking on a subscript is done against constant values (as part of narrowing the subscript to the domain type, which is usually a subrange). The disadvantages are that a given proc, written to deal with a given array type, cannot be used on other arrays of different lengths, since there is no way in current Cedar to parameterize the proc with a type. In this case it is better to use a sequence (§ 4.4.2B).

The array class is a subclass of row (§ 4.4.2) and of assignable (§ 4.3.2) if RANGE is assignable. It also has the items:

CONS: PROCIg: RANGE X ...1--qT1 -- A coercion from the group, or denoted 7[...].

ALL: PROC[x: RANGE]-*[T] -- Returns an array with each element = x

LENGTH: CARDINAL -- The cardinality of DOMAIN.

BASE: PROCEa: VAR 71--[LONG POINTER TO UNSPECIFIED] -- Returns the address of a's first element.

CONS takes a group of values, one for each element of the array, into an array value. Note that the argument of CONS may have omitted values, which are filled in if possible by the defaulting coercion for T.RANGE. If the index type is enumerated, CONS takes a binding, with one element named n of type T.RANGE for each index value n. In current Cedar you can't write T.CONS. Instead you write T itself; i.e.. T[...] for T.coNs[...]. Because CONS is a coercion from group or binding to array, you can omit the T whenever the group or binding appears as an argument or in a binding; see § 4.13. Examples:

I: TYPE-INT+-0: B: TYPE-BOOLEAN ⁴-TRUE A: TYPE-ARRAY [0..5) OF 1:

al: A-40, 1. 2. 3, 41:

a2: A-[ .1, 2, 3, 4]: 1: INT-A[4, 3, 2, 1, 0][1]: E: TYPE-ARRAY {red blue. green} OF B:

el: E-[TRUE, FALSE, TRUE]:

e2: E-[b/ue-FALSE]:

-- OK to omit A here.

-- Same as al, by defaulting.

-- 1=3. The A is required here.

-- Same as el.

Anomaly about ALL: ALL replicates its argument in all the elements of an array. In current Cedar you can't write T.ALL. Instead you just write ALL: it must be in an argument or binding. Unlike most built-ins, ALL is not sugar for dot notation. If the range type permits it. you can write ALL[TRASH] to trash all the elements.

a3: A-ALL[3]: -- Same as [3. 3, 3. 3. 3]

BASE returns the address of its VAR array argument. It is mostly useful for writing storage allocators. The resulting LONG POINTER TO UNSPECIFIED can also be passed to DESCRIPTOR to yield a descriptor for a different type of array; obviously this is dangerous.

•Anomaly about arrays with empty domains: An array may be declared with a domain type which is an empty subrange. The effect is to suppress the bounds checking in APPLY. If a pointer p to such an array is constructed (with a LOOPHOLE), then pt[i] (you can also write p[i], because p inherits APPLY) will never give an BoundsFault. This kludge is sometimes useful for obtaining arrays whose size is not static. However, beware that operations on the array other than subscripting (e.g.. equality tests, assignment and parameter passing) will believe the type declaration and do the wrong thing. It is generally better to use a sequence or a descriptor.

4.4.2B SEQUENCE types

A sequence is like an array, but each sequence value includes a tag value which specifies the number of elements in that sequence, i.e. the values of the domain type for which APPLY is defined. Note that APPLY for a sequence is usually not total. If the domain type is T and the tag value is v. then APPLY is defined for [T.FiRsT..v). Usually T is NAT, so that v is the number of elements in the sequence, and the elements are indexed by 0, 1, ..., v-1.

In current Cedar there are many restrictions on the use of sequences. A sequence type is defined by a sequenceTC⁴'; it is not a first-class type, and can only appear as the type of the last field of a variant record or union (§4.6.2). The items in the cluster of a sequence type are just those for a row: they are inherited by the containing variant record, which is the type a program normally deals with.

A record type T containing a sequence field is a variant record. T is a first-class type which can be bound to a name, but unlike a union-containing record it cannot be used where type³⁶ appears in the grammar, except in a refTC⁴⁶ (or pointerTC"). The only items in the cluster of T are the ones of the variant record class, and those inherited from the row class of the contained sequence:

DOMAIN: TYPE =TAGTYPE.

RANGE: TYPE -- The RANGE of the sequence.

APPLY: PROC[map: T. arg: DOMAIN]-*[RANGE] -- Indexes the sequence.
-{RETURN[map.VARIANTPART[arg]]}.

DESCRIPTOR: UNSAFE PROC[r: VAR T]-.[LONG DESCRIPTOR FOR ARRAY DOMAIN OF RANGE]
-{RETURN[DESCRIPTORITNARIANTPARTI]I. -- Yields a descriptor for the sequence.

The tag of a sequence is readonly.

Hence the only uses of T are:

As the target type of a reference type. e.g.. REF T.

In the form Tin] to yield a specialization of T.

The specialization T[n] has TAG = T.TAGTYPE.FIRST.SUCCⁿ, and n elements in the sequence; n need not be static. This application causes a Runtime.BoundsFault if n NOT IN T.TAGTYPE. Tin] is also not a first-class type; you cannot write it where type³⁶ appears in the grammar. and it has only the

following cluster (§ 4.3.1):

NEW: PROC[z: ZONE⁴-SafeStorage.GetSystemZonell -- Denoted NEW[714 or z.NEWinnil T: TYPE]---[r. REF GENERAL]

SIZE: CARDINAL

GENERAL: TYPE -- The type of the unspecialized sequence.

Note that since you cannot use T or T[n] in a declaration, there are no declared variables, record fields. or arguments to non-primitive procs of these types; you must use REF T (or a pointer to 7). Furthermore, these types have no ASSIGN or EQUAL procs; you must do these operations on the components. Finally, there are no constructors for sequence types; you must explicitly trash the sequence field in a record constructor. A sequence does get initialized when allocated, however; in current Cedar this just means that non-composite RC variables are set to NIL.

Thus the normal way to use a sequence is to embed it in a record (which need not have any other components), and to allocate one of the desired size using NEW (as in the examples below). The record value can then be applied to index the sequence. Usually it is convenient to have DOMAIN= NAT. If, however, some maximum length N is important to you, consider DOMAIN =[0..N]: then the value of the tag field for a sequence of length n<N is just n, and the valid indices are IN

[0..n).

Examples:

StackRep: TYPE^—RECORD[

top: INT⁴- —1,

item: SEQUENCE size: NAT OF 71; Number: TYPE^— RECOR D[

sign: minus},

magnitude: SELECT kind: * FROM

short =>[val: [(lam,

long=>[val: LONG CARDINAL],

extended=>[val: SEQUENCE length: NAT OF CARDINAL] ENDCASE]:

rsl: REF StackRep^€NEW[StackRep[100]]; rsl.top= —1, rsl[i] is trash.

rs2: REF StackRep⁴-NEW[StackRep[100]⁴-[top-3, item^—TRASH]]; rsl.top= 3. rs2[i] is trash.
rnl: REF Numbenextended NEW[Number.extendecti2*4:

rn1[2]= rnit[2] = rnLitem[2]= rnit.item[2]. but all start out trashed.

•A sequence may have a COMPUTED tag, with the same meaning as for unions: no tag field exists. no bounds checking is possible so that application is unsafe, and the cluster has no DESCRIPTOR proc. You can still compute the address of the sequence with @ and use the unsafe three-argument form of DESCRIPTOR (§ 4.4.2.3). Example:

-- Here is the recommended unsafe method for imposing an indexable structure on raw storage.

WordSeq: TYPE^— R ECORD[SEQUENCE COMPUTED CARDINAL OF Word];

A sequence may not have an OVERLAID tag, and * cannot be used for the tag type.

A sequence may appear in a MACHINE DEPENDENT record. It must come last, both in the record constructor and in the layout. The total length of a record with a zero-length sequence part must be a multiple of the word length. The size of the sequence field (if specified) must describe a zero-length sequence; i.e.. it must account for just the space occupied by the tag field (if any).

There is a predefined sequence TEXT; see Table 4-2 for its declaration. There are literals of type REF TEXT, denoted as in rule 57 by the characters of the literal enclosed in doublequotes. Such a literal is shorthand for a constructor (which you couldn't actually write in current Cedar, since it lacks constructors for sequences). REF TEXT can be used where efficiency is critical; for general purposes use Rope.ROPE.

•There are also unsafe predefined types LONG STRING and STRING: see Table 4-2 for their declaration. They are described here for completeness, but should not be used. These types are pointers to a StringBody type also given in Table 4-2.

Anomaly for StringBody: In spite of the declaration, StringBody behaves like a sequence with tag maxlength and sequence text. Thus z.NEw[StringBody[n]] returns a STRING or LONG STRING with maxlength= n: if s is a STRING or LONG STRING, s[i] indexes its text, etc. You can also use &text. as with sequences, but this is not recommended: because of the definition, sdext[i] is never bounds-checked (use sfip, and DESCRIPTOR[s.text] describes an array of length 0 (use DESCRIPTOR[rt].

·There is a special kludge for allocating a string in the local frame of a proc:

LOCALSTRING: PROC[ [length: CARDINAL] ]-[STRING] -- A coercion.

Because this is a coercion, you can write

s: STRING^–[20]

to obtain a local string of length 20. Of course, the storage will be freed when the proc frame is freed, and a dangling reference may remain. This construct is legal only in declarations as the e of a defaultTC.

·There are literals of type STRING, denoted just like REF TEXT literals as in rule 57. Since they are string literals, they are allocated in the MDS, where they consume precious space. By suffixing L to the literal, you can get it allocated in the proc frame, where the space is recovered when the frame is freed, at the risk of a dangling reference.

1^.4.4.2C Descriptor types

A descriptor is a pointer to a row value which includes a subrange of the row's domain as part of the descriptor value. A proc which takes descriptors rather than rows or REFS to rows can deal with rows of different sizes. Because a descriptor is like a pointer, there are short, long and relative descriptors which are exactly analogous to short, long and relative pointers: see § 4.5.1 and § 4.5.4 for details.

Style for rows of variable length: Applying a descriptor is unsafe. It is generally better to use a REF to a sequence-containing record.

Like a row, a descriptor can be applied to yield a VAR of the range type. If it is READONLY, the VAR will be READONLY too.

Descriptor is a subclass of row (§4.4.2) and address (§ 4.5). Like array, it has the items:

LENGTH: PROC[a: 71^–+[CARDINAL] -- Returns the cardinality of the subrange in a.

BASE: UNSAFE PROC[a: 7] -- Returns the address of a's first element.
^–*[LONG POINTER TO UNSPECIFIED]

Like pointer, it has:

TARGET: TYPE -- The type of the arrayType used to

make the descriptor.

In addition, there is an unsafe and untypesafe proc for making a descriptor with RANGE =CARDINAL from a LONG POINTER:

DESCRIPTOR: UNSAFE PROC[base: LONG POINTER TO UNSPECIFIED, length: CARDINAL, type: TYPE]

[d: LONG DESCRIPTOR FOR ARRAY CARDINAL OF type]

d.LENGTH = length and d.BASE= base.

Anomaly for target typing of DESCRIPTOR: The type argument of DESCRIPTOR may be omitted, in which case it is the range type of the target type (which must be a descriptor type). Similarly if the target type is packed.

There is a compile-time coercion from LONG DESCRIPTOR to DESCRIPTOR, which works exactly like the similar coercion from LONG POINTER to POINTER (§ 4.5.1B).

Erk4.4.3 BASE POINTER types

A base pointer by is like an ordinary pointer, except that it has an APPLY operation which maps a relative pointer rp (see § 4.5.4) into an ordinary pointer p. Its class is a subclass of pointer (§ 4.5.1B) and approximately a subclass of map (§4.4), but with the items:

APPLY: UNSAFE PROC[bp: T, rp: DOMAIN]--⁴p: rp.TARGET]

DOMAIN: T RELATIVE POINTER

Note that the type of bp[rp] is determined by the type of rp, and has nothing to do with the type of bp. There can be many relative pointer types for a single base pointer type. The scheme is much less safe than ordinary pointers, since a particular relative pointer in general makes sense only relative to a particular base value, but the type system allows it to be used with any base value of the proper base type.

In other respects, a base pointer is like an ordinary pointer: indeed, it is a subclass of pointer. Thus, it has a target type of its own, and can be dereferenced to yield a value of that type. This allows it to point to a record or other variable at the start of the region. Note that the base pointer's target has nothing to do with the range of its APPLY, which is the target of the relative pointer it is applied to: unlike other map types, a base pointer has no RANGE of its own.

A base pointer type implies the corresponding non-base type, and vice versa.

Representation of base pointers: The APPLY proc is

A [bp: T, rp: DOMAIN] IN

LOOPHOLE[LOOPHOLE[bp.LONG, LONG CARDINAL]+ LOOPHOLE[rp.LONG, LONG CARDINAL], LONG POINTER TO rp.RANGE]t

if T.TARGET.ISLONG =TRUE or DOMAIN.ISLONG= TRUE, or the same thing without the LONGS if neither is long.

Anomaly for relative array descriptors: A relative array descriptor (obtained by using a descriptor type as the range argument of the type constructor) doesn't quite work this way, since it uses the bounds in the descriptor, rather than in TARGET.DOMAIN, to check the subscript.

4.5 Address types

46 refTC ::= REF ( varTC⁴⁰ I ) MKREF[target–( varTC I ANY )]

47 listTC :: = LIST ( OF varTC^4O 1) MKLIST[range–( varTC I REF ANY )]

48 tpointerTC:: = ?LONG ?ORDERED ?•13ASE

POINTER Nsubrange²⁵ (TO varT0¹⁰ 1 ) 1 •POINTER TO FRAME [ n ] n Subrange only in a relativeTC: no typeName³⁷ on it.

49 •trelativeTC ::= typeName³⁷ RELATIVE t MKRELATIVE[range^—t, baseType–typeName]
t must be a pointer or descriptor type. typeName a base pointer type.

Examples

ROText: TYPE^—REF READONLY TEXT:

RL: TYPE^—LIST OF REF READONLY ANY: rl:RL:

UnsafeHandle: TYPE^—LONG POINTER TO Vec⁴⁴:

-- NARROW[ri.first. ROTexi]t is a -- READONLY TEXT (or error).

Address is a subclass of assignable (§ 4.3.2) and of hasNIL (§ 4.3.7). it has no items of its own. An address value is the address of a variable, i.e., of a block of storage.

Storage is a precious resource which must be reclaimed when it is no longer needed, i.e., when the variable it represents will no longer be touched by the program. Cedar provides safe storage which does this reclamation automatically, and unsafe storage which must be reclaimed explicitly by the program. A checked program (§3.4.4) deals only with safe storage, and need not be concerned with how storage is reclaimed, or how things can go wrong, except for one point discussed in the next paragraph. If you write only checked programs, you can skip to § 4.5.1. An unchecked program must maintain the safety invariants, in order to ensure that the Cedar system continues to function. These invariants are given in the remainder of this sub-section.

Cedar has two garbage collectors for reclaiming safe storage. The incremental collector runs continuously and reclaims storage without stopping other computations for more than a few milliseconds at a time. The trace-and-sweep collector runs only when invoked, and stops other computations for many seconds. The disadvantage of the incremental collector is that it cannot reclaim a cyclic structure, even if that structure can no longer be reached by the program. Therefore, a production program, especially a real-time or interactive one, should break the cycles in its structures when they are no longer needed. The package finalization mechanism is often helpful in doing this. It and other features of Cedar safe storage are described in the SafeStorage interface.

Anomaly in garbage collection: It is possible that an unreachable variable will not be reclaimed because it appears to be pointed to by some double-word quantity in a frame which is not actually a REF. This can happen because the collectors cannot tell which double-words in a frame are REFS. and hence proceed conservatively.

Definitions

To state the safety invariants, we need some definitions.

A safe variable (SV for short) is a frame or a counted variable, i.e., one allocated by z.NEW, where z is a ZONE (§4.5.2). A safe reference (SR for short) is a transfer or REF value. A SR is the only legitimate way of addressing a SV, and furthermore, a SR can legitimately only be stored in a SV. A reference-containing type (RC for short) is a SR type, or a composite type with a RC component.

A SV is reachable if:

it is the process array or, in current Cedar, the global frame of a module, or a SR which points to it is stored in some reachable SV.

The collector tries to reclaim safe storage when it is no longer reachable. A SV v is good if:

It overlaps no variable of another type.

If its VALUE type is RC, then v.VALUEOF is good.

A SR is good if it points to a good SV of the proper type, or is NIL. A composite RC value is good if each of its RC components is good.

The idea is that if:

new address values are generated only by NEW or frame allocation, and

these allocators always return an SR which is the address of a SV that doesn't overlap any other SV, and

SR values never get damaged or mistyped,

then by keeping track of the SRs the collector can know about all possible ways of reaching an SV. If there are no ways, the SV can be freed.

For the purpose of this analysis. we assume that every value is held in some variable; the fact that some values are constant is not important here. Storage can be modified only by an ASSIGN proc for some variable. Hence the behavior of ASSIGN determines how values can change. A composite variable (§4.3.4) is made up of other variables; in Cedar record, union and row variables are composite. ASSIGN for a composite variable is simply a sequence of ASSIGNS for the components. Therefore the remaining analysis considers only non-composite variables.

Safe storage main invariants

Cedar safe storage depends on three invariants. These in turn depend on some local invariants (L1- L4), and some properties of the Cedar primitives (P1-P3) given below. The proofs of the main invariants follow these definitions.

S1) Every SV is good.

S2) Every SR is good. S3) A SV is not freed if there is an SR for it in some other SV.

Local invariants

1.1) No variable of another type overlaps an existing SV. The allocator ensures that no SV will do so, because NEWETI returns the address of a block of at least T.SIZE words, none of which is part of an existing SV. Similarly, applying a closure allocates a block of unused words at least as large as the frame. Unchecked code must ensure this for other variables.

L2) Assignment to a SV works, and is type-correct: the value being assigned has the VALUE type of the SV, and the assignment leaves it as the value of the SV (P1). Unchecked code must ensure that only a SR of the proper type is assigned to a SV. In particular, it must not produce a SR value out of thin air, unless it is known that there is an equal existing reachable SR value.

L3) A counted SV is reached only through a REF: the allocator which creates a counted SV returns only a REF to it. There is no safe operation for obtaining a counted SV except from a REF. Unchecked code must not produce a counted SV except from a REF.

L4) An SR which points to a frame (i.e., a transfer SR) is stored only in a frame which is freed first. A checked assignment cannot assign a transfer SR unless it points to a global frame, which is never freed (except by an unsafe operation); when a SR is bound to a name, it must be from the same or a larger scope. Unchecked code must not preserve a transfer SR after its frame has been freed

Primitive properties

P1) ASSIGN(to^—SV, from^—SR) leaves SV.vALuEoF=SR and affects no other non-overlapping variable. If SV is counted (i.e.. came from dereferencing a REF) it updates the count correctly.

P2) The collector does not free a counted SV holding a SR until the value of the SR is NIL.

P3) The collector does not free a SV until no SR on the stack points to it and it has a zero reference count,

P4) A SR is stored only in a SV. Proof of main invariants

S1) Every SV v is good. Proof by induction.

Basis: A SV is good when created by NEW.

Induction: There are three ways v might cease to be good:

Another variable might come to overlap it, but this doesn't happen (L1).

If v.VALUEOF is SR, it might change:

By assignment to it, but ASSIGN replaces the value in v with another SR value (L2), and this other value is good (S2).

By an assignment to some other variable which clobbers v, but no variable of another type

overlaps v (Si), and no assignment to a non-overlapping variable can clobber v (P1).

If v.VALUEOF is SR. it might cease to be good, but it points to a good SV (S2) which remains good (SI).

S2) Every SR is good. Proof by induction.

Basis: the values produced by NEW and by applying a closure are good.

Induction: the other source of SRs is SVs (P4). and these are good (SI). Furthermore, an SV for which an SR exists is not freed (S3). so the SR remains good.

S3) A SV v is not freed if there is an SR for it. Proof by case analysis.

A) Not by the reference counting garbage collector. because:

This collector frees v only if no SR on the stack points to it, and it has a zero reference count (P3).

An SR can be stored only in a SV, i.e., on the stack or in a counted SV (P4).

The number of SRs pointing to v in counted SVs is equal to the reference count for v, by induction:

Basis: Both start at zero.

Induction: There are three ways the number of counted SRs pointing to v can change:

ASSIGN to a counted SV, which updates the count correctly, because a counted SV is

reached only through a REF (L3), and any assignment through a REF updates the count

correctly (P1).

ASSIGN to some variable w of another type, but v is good, hence overlaps no variable of another type (S1), hence is not affected by ASSIGN to w (P1).

Freeing a counted SV, but it is not freed until its value is NIL (P2).

B) Not by the trace-and-sweep garbage collector, because:

It implements the definition of reachability. Note that the collector sets SRs for unreachable SVs to NIL, thus breaking circular structures.

C) Not by the frame deallocator, because:

A frame is either permanent (a global frame), or an SR which points to it is stored only in a frame which is freed first (L4).

It is possible to obtain a variable without going through a SR value by using an unsafe pointer-containing type (PC for short). The non-composite PC types are:

pointer (which includes POINTER TO FRAME, string and uncounted zone):

descriptor:

A program which obtains a variable from a PC value (by dereferencing a pointer, applying a string or descriptor, or using NEW or FREE for an uncounted zone) must maintain the safety invariants LlL4.

4.5.1 Reference types

This class is a subclass of address (§ 4.5) and has the items:

TARGET: VARIABLE. Type -- Always a variable type.

DEREFERENCE: PROC[r T]—olT.TARGETI -- Denoted by rt

APPLY: PROC[r. T, arg: T.TARGET.DOMAIN]-0 -- Inherited from the target type if it has APPLY.
[T.TARGET.RANGE]

f PROC[r. 7', arg: T.TARGET.f.DOMAIN]^—> -- Inherited from the target type for each procf

[T.TARGETf.RANGE] in its cluster: see below.

The target of a reference type T may be any variable type VAR U or READONLY U. If T is READONLY, then T.TARGET is READONLY also: this means that assignment to the dereferenced address is impossible. Dereferencing a T yields a VAR U (which can then be coerced to a U value if appropriate). Dereferencing NIL causes the error Runtime.PoiraerFault.

If the target has an APPLY, DESCRIPTOR. WAIT, NOTIFY or BROADCAST proc, or any record field procs in its cluster, these are inherited by the reference type (except that APPLY is not inherited by a BASE POINTER, which has its own APPLY: see § 4.4.3). The value of an inherited f is

[r. T, arg: T.TARGETJ.DOMAIN] IN rt.f[arg]

In other words. the address is dereferenced, and then the target's f is applied. The effect is that a reference to an array or proc can be applied without explicit dereferencing, a reference to an array can be turned into a descriptor, a reference to a condition can be used to do a WAIT or whatever, and a reference to a record can be used to select a field.

Procs which get into a cluster by being in an interface instance are also inherited in this way, but this is not useful, since they are not modified to dereference their reference argument: this is a deficiency. To compensate for this, you can define such procs to take a REF T, so they will be useful when inherited from T.Cluster to (REF T).Cluster.

4.5.1 A REF types

The REF class is a subclass of reference (§ 4.5.1) and has no additional items. A REF value can be safely created only by a NEW proc. Every general type except union has one of these (§ 4.3.1).

The type VAR ANY may be the target of a REF: it cannot appear anywhere else. This REF type is denoted REF ANY, or simply REF. It is implied by every REF type. ISTYPE can be used to test the particular REF type of a REF ANY value, and NARROW can be used to convert a REF ANY value into a REF T value (§ 4.3.1). These two operations are combined in a convenient way by safeSelect³² (§ 3.8). REF ANY does not have a DEREFERENCE proc, and of course there are no procs for it to inherit from the target.

LIST types

The LIST class is a subclass of REF. and has items:

RANGE: VARIABLE.Type: ‑

first: PROC[/: 71^—>[RANGE]: ‑

rest: PROC[!: 71-4T1: ‑
CONS: PROC[z: ZONE⁴-SafeStorage.GetSystemZonen,

x: RANGE. y: 71--q71

LIST: PROC[z: ZONE⁴- SafeStorage.GetSystemZond -g: RANGE X ...J^— [T]

- Always a variable type.

- Denoted 1.first,notfirst[1]

- Denoted !Jest, not rest[I]

-- Denoted z.CONS[x. y] or CONS[x. y].

- Denoted z.L4ST g or LIST g.

The TARGET type R of a list type T is opaque, but it may be thought of as an unpainted record [first: RANGE, rest: TI; thus a list value is a REF to an R. The first and rest procs return the fields of an R. LIST is short for LIST OF REF ANY.

CONS is NEW[R⁴-[x, y]]: the optional zone tells where to do the NEW. LIST does a series of coNses, yielding a list such that

LIST[x_cr x_n] rest' first= x

Note that the g argument of LIST may have omitted values, which are filled in if possible by the defaulting coercion for RANGE. Examples:

Text Box: -- Same as 1, by defaulting. I: TYPE^—INT(-0

L: TYPE^—LIST OF

1: L–Lis10, 1, 2. 3, 41: rn: L–LIST[ , 1, 2, 3. 4]:

The type ATOM

An ATOM is a REF to an opaque type which is exported from AtomsPrivate as

AtomRec: TYPE^—RECORD[

printName: Rope.ROPE.

propertyList: REF ANY ^4-NIL,

link: ATOM ⁴-NIL]

There are no additional items in ATOM'S cluster; the useful operations on ATOMS are provided by the ListsAndAtoms interface. However, the language does provide ATOM literals for atoms which have Cedar names as their printnames, with the syntax $n. Examples:

$red

$VeryLongAtomMadeUp0fSeveralWords

There is a coercion from ATOM to any enumerated type: see § 4.7.1A.

Anomaly for space in ATOM literals: You cannot put white space between the $ and the name in an ATOM literal. In return, the name may be a reserved word.

4.5.1 B *Pointer types

Pointer is a subclass of reference (§ 4.5.1). There are two flavors of pointer: short and long. Short pointers occupy one word, and point only within the 64k word main data space where frames are allocated. Long pointers occupy two words and point anywhere.

Pointer dereferencing is unsafe; hence all the inherited procs are also unsafe. Dereferencing a pointer may cause an address fault if it points to storage which is not mapped by the operating system; this is about the least disastrous thing that can happen if an unsuitable value gets into a pointer.

Long pointer types have the following dubious items:

·PLUS: PROC[T, LONG INTEGER]^—q7] -- Denoted by infix +.

'MINUS: PROC[T, LONG INTEGER]^—,in -- Denoted by infix —.

·DIFF: PROC[T, 1--)^.[LONG INTEGER] -- Also denoted by infix —.

Anomaly for MINUS on pointers: The infix "—" cannot be desugared into dot notation, since there are two procs denoted by an infix " —" whose first argument is a pointer. The choice between MINUS and DIFF is based on the type of the second argument.

Short pointer types have the same procs without the LONG. They also have the following coercion. called lengthening:

LONG: PROC[p:1^—>[LONG POINTER TO TARGET]

Note that VAR types have a VARTOPOINTER proc (denoted by prefix @); this turns a VAR T into a LONG POINTER TO T.

Anomaly for narrowing to a short pointer. The VARTOPOINTER and BASE primitives turn a variable into a LONG POINTER. If the compiler can determine that the variable is in the main data space. then an application of one of these primitives can be narrowed into a POINTER. This is done statically; if an error is possible it is reported by the compiler, even though the actual narrowing might have been successful.

The subrange in pointerTC⁴⁸ is only for a pointer type used as the range argument of RELATIVE

(§4.5.4).

9.5.2 Zone types

The zone class is a subclass of address (§ 4.5) and has the items:

NEWTYPE: PROC[U: TYPO-⁰[A: REFERENCE];

NEW: PROC[z: T, U: TYPE]-*[r: NEWTYPE[0];

FREE: PROC[z: T, p: VAR NEWTYPE[U]]-⁰4]; For a ZONE.

FREE: UNSAFE PROC[z: T. -- For an uncounted zone.
p: NEWTYPE[ NEWTYPE[U]]]->[]:

Currently there are exactly three zone types:

ZONE, with NEWTYPE= [U: TYPE] IN MKREF[target^—U]. which implies REF.

UNCOUNTED ZONE, with NEWTYPE=A [U: TYPE] IN MKPOINTER[targe/- U, long-TRUE]. which implies LONG POINTER.

MDSZone, with NEWTYPE= [U: TYPE] IN MKPOINTER[targe/- U. long-FALSE], which implies
POINTER.

In other words, a ZONE deals in REFS, an UNCOUNTED ZONE in LONG POINTERS, and an MDSZone in POINTERS. The latter two are called uncounted zone types.

NEW is explained in § 4.3.1. FREE takes a pointer p to a variable v containing a reference r to a variable fr. For a ZONE, the expression denoting p must have the form @v. In spite of appearances, this is safe; think of r as a VAR parameter to FREE, and the @ as indicating in the application that it is modified. For example,

{v: REF-NEW[INTI: FREE[@v] }

The reference r must be supplied by the NEW proc of the same zone; this is checked for a ZONE. FREE sets v to NIL. In addition:

For a ZONE, FREE sets all the REF variables of fv to NIL; this helps to break circular structures, but only the collector actually reclaims storage. Hence FREE for a ZONE is safe.

For an uncounted zone, FREE reclaims the storage for fi^g by calling the Dealloc proc of the zone (see below); hence FREE is unsafe for an uncounted zone; the safety invariant demands that FREE not be called with a pointer unless the variable will not be used any more. It is best if no other pointers to f^yexist.

New zones can be obtained, and other aspects of storage allocation monitored and controlled, using the procs in SafeStorage (for ZONES) or UnsafeStorage (for uncounted zones). It is also possible, though not recommended, to make up your own UNCOUNTED ZONE using a type like this:

UncountedZoneRep: TYPE-LONG POINTER TO MACHINE DEPENDENT RECORD [

procs (0: 0..31): LONG POINTER TO MACHINE DEPENDENT RECORD [

Alloc (0): PROC[zone: UncountedZoneRep, size: CARDINAL]-+[LONG POINTER].

Dealloc (1): PROC[zone: UncountedZoneRep, object: LONG POINTER]

-- possibly followed by other fields-- j,

data (2: 0..31): LONG POINTER -- Optional; see below

-- possibly followed by other fields-- ]:

The same structure serves for a MDSZone. with all the LONGS dropped and the field positions adjusted accordingly. You must use a LOOPHOLE to turn one of these Rep values into an uncounted zone value.

If z is an uncounted zone, the code generated for z.NEW[7] is zr.procst.Alloc[z, T.SIZE]

and the code generated for z.FREE[p] is

{temp: LONG POINTER-pt; /ft ztprocst.Deallock. temp] }

Usually p is @q. for some variable q which holds the pointer being freed.

Within this framework, you may design a representation of zone objects appropriate for your storage manager. In general, you should create an instance of a UncountedZoneRep for each zone instance. The procs record can be shared by all zones with the same implementation; the data pointer normally references the state information for a particular zone.

•4.5.3 POINTER TO FRAME types

Text Box: This class contains the types of the frames subclass of address, and has the items:
FRAME: TYPE
FROMIMPLINST: PROC[FRAME][7] COPYIMPLINST: PROC[i: 71—4[T] of instances of implementation modules (§ 3.3.5). It is

-- POINTER TO FRAMEM.FRAME =1.

-- Coercion from FRAME to T.

-- Returns a copy of the module instance i. Denoted NEW i.

-- Coercion from T to the program proc for the instance.

In addition. T has a field proc for each value in the frame.

Note that there are coercions from an imported module instance II: I to the corresponding POINTER TO FRAME, and from the latter to the program proc for the frame. You can get a POINTER TO FRAME[/] value from an imported implementation using the first coercion, or from NEW PF, where PF is an existing POINTER TO FRAME[/] value (an application of COPYIMPLINST).

4.5.4 RELATIVE types

Sometimes it is convenient to have addresses which are relative to the base of some region. Such pointers can be shorter than ordinary pointers. Also, the entire collection of variables in the region can be moved in storage simply by changing the base; in fact, it can be written out and later read in to a possibly different place, and any relative pointers stored in it will still be valid. Cedar provides some (unsafe) support for this facility, in the form of RELATIVE types. A RELATIVE type has a target type which plays the same role as the target type in an ordinary pointer. The analogy to dereferencing a pointer is applying a base pointer to the relative pointer. The RELATIVE class has no DEREFERENCE or APPLY proc. The only useful thing to do with a RELATIVE value is to apply a suitable BASE POINTER to it (§ 4.4.3).

Relative is a subclass of address (§ 4.5) and has items:

BASE: TYPE; -- The type of the base pointer.

SUBRANGE: SUBRANGE. Type -- The subrange type; only for pointers.

TARGET: TYPE; -- If b: BASE and rp: T then b[rp] has type TARGET.

A relativeTC takes a pointer or descriptor type as its range argument. The TARGET of the RELATIVE type is the TARGET of the range. To indicate the desired size of a RELATIVE POINTER value, the type constructor for the range pointer type can specify a subrange of CARDINAL. There are coercions between RELATIVE POINTER types which differ only in their subranges; these are just like the coercions between subranges of CARDINAL (§ 4.7.3).

Node: TYPE^—MACHINE DEPENDENT RECORD [ -- rands is a union or variant part.
type (0: 0..15): Typelndex. rator (1: 0..13): Op54_. -- This is the common part.

rands (1: 14..79): SELECT n (1: 14..15): * FROM -- Both union and tag have pos.

nonary = >11 -- Type of n is I nonary, unary, binary}.

unary =>[a (1: 16..47): REF Node], -- Can use same name in several variants.
binary -= >[a (1:16..47), b(1:48..79): REF Node] -- At least one variant must fill 1: 14..79.

ENDCASE

Record types are Cedar's facility for grouping values of different types (since group and binding types cannot be named or written in ordinary declarations). Unions are closely related to records because they must be embedded within records in current Cedar.

4.6.1 Record types

RECORD is a subclass of assignable (§ 4.3.2), or of general (§ 4.3.1) if any component is not assignable. The MKRECORD type constructor takes one argument called fields: a declaration or group of TYPES: in the latter case, it is rebound to a decl with secret names. If fields=[n_i: T_I, n₂: T₂, ..., n_k:

T_k]_.MKRECORD produces a type with items:

n.: PROC171^—>T_, -- One for each name in the decl.

FIELDS: DECL

CONS: PROC[b: FIELDS]—)[7] -- Apply by T b ; a coercion from the binding.

UNCONS: PROCM^—> [FIELDS] -- No denotation; a coercion to the binding.

UNWRAP: PROC[T]^—⁴UI If fields= [n: i.e. for a single-component record Nameless fields are not very useful, since there is no way to name the field procs. The values of the n procs are not accessible; they can only be applied with dot notation. Thus if r is a record value.

r.n. denotes its eth field.

A record type T with a single component of type LI inherits all of U's cluster. There is also a coercion UNWRAP from T to U. The effect is that a T value behaves just like a U value. but not vice

versa.

A variant record inherits some procs from the sequence or union type it contains (§4.4.2B, § 4.6.2).

If v is a VAR U returned by a field proc, you can only apply @ to it if U.sizE>l, or U's representation occupies an entire word, or by accident v happens to occupy a whole word in the record representation.

Record types in interfaces are painted: each type produced by RECORD[...] (i.e., by MKRECORD or MKMDRECORD) in an interface has a unique mark. Thus two occurrences of a record type constructor in an interface always produce two different types. In this respect, recordTCs are like unionTCs and enumerationTCs, and differ from all other type constructors. In a program module. however, record types are not painted (unless they are machine-dependent or union-containing; this is a deficiency. and these should not be painted either in this context). The reason is to ensure that old values will still be useful after module replacement. Since painting is the only way to generate unique marks, it is the only way that an implementation can guarantee that its types cannot be forged. In practice. however, the protection afforded by opaque types (§4.3.4) is usually adequate.

Representation of records: A record variable is represented by a contiguous block of storage, in which the bits representing each field are contiguous and do not cross a word boundary unless they fill a block of words, but are otherwise arranged at the discretion of the compiler. It is not possible to obtain a REF to a record element; this is because the implementation of both reference counting and REF ANY discrimination requires more information about each VAR than is available for a record field. Unless a field fills one or more words, it is not possible to obtain a pointer to the field either (using @); this is because pointers point to words.

Restriction on record sizes: A record type T must have T.sizE<2¹².

A MACHINE DEPENDENT RECORD type constructor can specify the exact arrangement of the fields in a record, using the syntax of rules 51-53. Examples are given with the rules. Fields must be arranged according to the following constraints.

A pos⁵¹-¹ (w) for a field with type U means that the field occupies words w through w+ U.S1ZE, which are bits 16w through 16*(w+U.sizE)-1, of the record variable; (w: f.I) means that it occupies bits 16w+f through 16w+1 inclusive (051<1 is required; there is no upper bound on I). Like everything else in a type constructor, all of w, f and / must be static.

The pos must be large enough to hold a variable of the field type U: if U.sIzE>l, it must exactly fill U.SIZE words; if U.sizE=1 and U is represented in less than 16 bits (possible for a discrete, row, or record type), it need only be as large as the representation, but may not cross a word boundary. Union fields are treated specially (§4.6.3).

If there is a union field, the subfields of at least one case must exactly fill the pos specified for the union field.

Fields may not overlap, and if they fill at least one word, they together must completely fill an integral number of words. The order of fields is not important, except that any variant part must come last both in the layout and in the constructor.

If any field has a pos, each must have one. A machine dependent record may have no pos. In this case, the fields are arranged consecutively, and the constructor must be such that that the rules about word alignment and boundary crossing are not violated by this arrangement; this may require the presence of dummy fields which fill out unused space.

Note that a pos is really explicit code for the field proc, written in a rather restrictive special language.

4.6.2 Variant record types

There are two classes, unions (§ 4.6.3) and sequences (§ 4.4.2B), whose types are not first-class type values, but can only appear as the type of the last field of a record or union. A record whose last field is one of these types is a variant record. and its last field is a variant field. The other property shared by a union and a sequence type is that each is a generalization of a number of special cases: there is a single value called the tag which identifies the special case.

For a union, the special cases are unrelated. and the tag is a value from an enumeration.

For a sequence, the special cases are rows of different length. and the tag is a value from the row's domain. .

The tag⁵³ is treated as a field of the containing variant record. This field is readonly. tFor a union it can be changed only by an unsafe assignment to the entire variant part or the entire variant record. There is no way to change the tag field of a sequence. IA tag of COMPUTED or OVERLAID means that there is no tag field: instead. the tag value must be supplied by an expression in a withSelect³⁴ when it is needed for specialization. Tags of * and OVERLAID are only for unions, and are explained in § 4.6.3.

The cluster of a variant record has the items:

The usual procs for the record fields (including the variant field itself, and the tag), and any items inherited by the record type.

For a union, the types of the bound variants: T.n= T.SPECIALIZE[Sn].

TAGTYPE: TYPE -- The type of the tag.

TAG: TAGTYPE: -- Another proc for the tag field.

VARIANTTYPE: TYPE -- The (union or sequence) type of the variant field.

VARIANTPART: PROC[7]--0^.[VARIANTTYPE] -- Another proc for the variant field.

SPECIALIZE: PROC[x: TAGTYPE]-4^.[BT: TYPE] BT is a bound variant of r, denoted 7[x] for a

sequence-containing variant record type.

Specialization yields a record type called a bound variant in which the type of the variant field is one of the special cases of the union or sequence. The bound variant differs by:

GENERAL: TYPE -- The type of the unbound variant record.
lacking SPECIALIZE.

a readonly tag field,

for a union:

VARIANTTYPE equal to the corresponding case,

procs inherited from the corresponding case.

Note that if the special case is itself a union or sequence, the bound variant is still a variant record: otherwise it is an ordinary record. A bound variant of a union-containing variant record is denoted T.n, since the bound variant types are in the cluster of the variant record type (.alternate, obsolete notations are 71n] or n 7). A bound variant of a sequence-containing variant record is denoted ne].

Anomaly for equality of variants: A variant record type has EQUAL only if it does not have a SEQUENCE field, and for any two tag values a and b. T.a.SIZE= T.b.SIZE. Even if not all sizes are equal. the bound variants have an EQUAL which takes the variant record as its second argument: hence by v is always correct.

The special properties of the subclasses of variant records are given in the sections on unions (§ 4.6.3) and sequences (§ 4.4.2B).

4.6.3 Union types

Together with REF ANY, union types provide Cedar's facilities for associating a type T with a class which contains subtypes T₁, T. and dynamically narrowing a value of type T into a value of the

proper type T. REF ANY is more convenient:

Any REF T is a subtype of REF ANY; no pre-planning of the subtypes is required.

REF T implies REF ANY: hence procs taking REF ANY accept any REF T without further ado. Union types, on the other hand. have performance advantages:

A union type is just a value, not constrained to be a REF. These values or their VARS can be embedded in records or arrays without paying for extra storage allocation or an extra level of indirection.

The subtype of a union type can be discriminated somewhat faster than a REF ANY. Union types can therefore be recommended when performance tuning is required.

Text Box: or of general (§4.3.1) if it has an union type is defined by a unionTC52: the type of the last field of a variant
of union type Tis denoted by T.n: Like record, union is a subclass of assignable (§4.3.2), unassignable component. Assignment to a union is unsafe. A it is not a first-class type in Cedar, and can only appear as record (§4.6.2) or another union. A union type has items:

the types of the union cases, named by their tags–thus case n

CONS (see below).

The types and tag are inherited by the containing variant record, which is the type a program normally deals with. Note that a union type is always painted (although it shouldn't be painted in an implementation).

A case of the union has items:

the field procs for its fields:

GENERAL: TYPE -- The union type of which this is a case.
These are inherited by the containing bound variant record in the obvious way.

The cases of the union are given by the arms of the SELECT. The type of the tag must be an enumeration, and each case is named by one or more literals of the enumeration. Thus Node in the example has cases binary, unary and nonary. and the type of the tag could have been written {binary. unary. nonary}. The * which actually appears for the tag type is short for an enumTC⁵⁴ which lists all the names preceding the => symbols of the SELECT in turn. If the tag type is given explicitly, any enumeration values which don't appear preceding a => symbol have empty cases.

A record type T containing a union field is a variant record. T is a first-class type which can be used like any other Cedar type. The only items in the cluster of T are the ones of the variant record class. The fields of the union cases are not in the cluster of the variant. However, the fields of the selected case in a bound variant are in the cluster (e.g., in the example Node.binary has procs for a and b). The name declared in a field must not be the same as any name declared in the containing record. However, the same name may be declared in more than one case of the union. •NULL following => is an obsolete synonym for

Anomaly for union constructors: A constructor for a union value has the form a[...], where a is one of the enumeration literals of the tag type, and [...] is an ordinary argBinding²⁷ for the fields of case a. The literal a may not be omitted. Thus

n: Node⁴- rator– plus, rands– binary[a– NIL, b^–N11]1

and also

n: Node.binary⁴-[rator– plus, rands– binary[a–NIL, b^–NIL]]

Anomaly for union values: If n is the name of the variant field, and r: T, r.n is legal only as the first operand of In all other cases, only a constructor can denote a union value.

The primitive ISTYPE can be used to distinguish the case of a union-containing variant record value x, and NARROW can be used to obtain a value bx of the bound variant type from x; see § 4.3.1. The safeSelect³² construct is a useful and efficient combination of ISTYPE and NARROW which deals systematically with any number of cases. The withSelect³⁴ construct is an unsafe version of safeSelect which can be used with any union type, and is the only alternative when the tag is COMPUTED or OVERLAID. See § 3.8 for discussion of these forms.

If the tag is OVERLAID, any field name that appears in exactly one case of the union has a proc in the cluster of the variant record. When such a proc is applied to a value x, there is no checking that x is the proper case of the union. f Obviously this is not typesafe, and it is also unsafe in general.

A union U has machine-dependent fields if and only if its containing record type R is machine-dependent. U must be last both in the fields and in the representation. Its pos includes the tag. It need not be word-aligned, though its tag and each field in each case must obey the alignment rules for record fields (§ 4.6.1). If R's representation is <16 bits in size, all cases must be the same size. Otherwise. all cases of R must be a multiple of 16 bits in size, and at least one case of U must exactly fill the space given by the pos for U.

4.7 Ordered types

Ordered types can be compared, and they have subranges. The subclasses of ordered are discrete, numeric, pointer, and subrange. Ordered is a subclass of assignable (§ 4.3.2), and has items:

LESS: PROC[T. 71—>[Bootl: -- Apply by infix <. See rules 19, 22.

GREATER: PROC[T, T]—4[BOOL]; -- Apply by infix >. See rules 19, 22.

MAX: PROC[T⁴-TFIRST, ...]^—*[71: -- Apply by MAX[x, y, ...].

MIN: PROC[T⁴-T.LAST, ...]^—>[T]; -- Apply by MIN[x, y, ...].

All these procs do just what you expect. MAX and MIN accept more arguments than you have the patience to write. Pointers have these procs only if ORDERED = TRUE.

The class also has items:

SUBRANGE: CLASS; -- The class of subrange types of T.

MKSUBRANGE: PROC[first, last: 7]^—>[SUBRANGE]; -- See rule 25 for denotations.
MKEMPTYSUBRANGE: PROCUirm: ThqSUBRANGE] -- See rule 25 for denotations. These are discussed in § 4.7.3

4.7.1 Discrete types

The discrete types are those which have a useful bijection into an interval of the natural numbers: whole numbers and enumerations. These are the types that can be used as domains for row types (§ 4.4.2). The class is a subclass of ordered (§ 4.7), and has items:

FIRST: T LAST: T

PRED: PROC[x: 7]^—)[7] -- Predecessor. May cause a bounds fault.

succ: PROC[x: 7]—4[71 -- Successor. May cause a bounds fault.

Whole numbers are discussed in § 4.7.2A as a subclass of numeric.

4.7.1A Enumeration types

n T

n_k:^T

FROMATOM: PROC[ATOM]^—>[7] ORD: [71^—)[INT]

VAL: [IN^-1]^—q71

-- *The value of the first element of T. Also denoted 71_%1

-- •The value of the last element of T. Also denoted nn_k]

-- A coercion. The argument must be static.

T.FIRST.SUCCⁿ.ORD=n.

-- Denote by target typing only. VAL[x.ORD]= x

The ATOM to enumeration coercion is done only at compile-time: the effect is that you can write $n₁ rather than T.n_i for an enumeration literal anywhere except before a dot (and by desugaring, as the first operand of an operator). Note that when the n_i appear in the type constructor, or as tags in a

variant record declaration or constructor, they are not expressions: hence this coercion doesn't apply, and you can't write $n₁ in those contexts.

ORD and VAL convert between T and INT.

Enumeration types in interfaces are painted; each type produced by {...} (i.e., by MKENUMERATION or MKMDENUMERATION) in an interface has a unique mark. Thus two occurrences of an enumTC always produce two different types unless both are in implementations and are textually identical. In this respect, enumTCs are like recordTCs and unionTCs, and differ from all other type constructors.

•Anomaly for enumeration literals: You can write n_ifor T.n_i in an argument or binding where the desired type is T. In these contexts, even if n_i is known in the current scope, it denotes T.n₁ and not

the value it is bound to in the scope. Thus

Color: TYPE^—{ red blue. green}:

red: Color<-Color.blue;

c: Color⁴- red

leaves c= Color.red, not =Color.blue. In fact, red=red is false! It is best not to redeclare enumeration names. Better yet is to always write atoms for enumeration literals, and qualify explicitly with the type in the rare cases where this fails because the literal comes before a dot. Thus red= $red would be false, and $red= red would be illegal.

Representation of enumerations: The representation of n in an enumeration type is the same as that of the INT i. For a subrange of an enumeration, T.FIRST.SUCC' is represented by 1.

The type BOOL or BOOLEAN

This is an enumeration type {FALSE, TRUE}: BOOLEAN is a synonym for BOOL. It also has items:

NOT: PROC[BOOL]^—*[BOOL] -- Denoted by prefix NOT or —.

IFPROC[U: TYPE, test: BOOL. -- Denoted by IF test THEN "ifTrue"

ifTrue, ifFalse: PROC[—>[U]]—>[U] ELSE "ifFalse"

The meaning of " °True" and "ifFalse" is that in the construct

IF test THEN ifTrue ELSE ifFalse

the ifTrue and if expressions are converted into parameterless procs and passed to IFPROC, which applies the one selected by test. The other one is never applied, so that expression is never evaluated.

Note that AND and OR look like infix operators on Booleans, but have special evaluation rules for their arguments, because they are desugared into IF expressions (§ 3.7). The literals TRUE and FALSE can always be written without qualification.

The type CHAR or CHARACTER

This is an enumeration type which could be written {'\000, '\377} if the CHAR literals were
names; CHARACTER is a synonym for CHAR. CHAR literals are written:

As 'c for any character c except \, denoting the ith CHAR value, where i is the ASCII character code for c.

As "\ddd, where each d is an octal digit, denoting the dddBth CHAR value.

As '\c for various values of c, denoting the CHAR values for various non-printing or otherwise confusing characters (see rule 57).

·As ddaC, denoting the same value as '\ddd (obsolete).

Note that CHAR literals are not names, and you cannot use any of the notations for enumeration literals: CHAR[c/] or CHAR.ci or $c/ are not allowed if cl is a CHAR literal.

CHAR also has the following dubious items:

·PLUS: PROC[T, INTEGER]^—'[T]

·MINUS: PROC[T, INTEGER]-⁴471

·DIFF: PROC[T. 7]^—>[INTEGER]

-- Denoted by infix + . -- Denoted by infix —.

-- Also denoted by infix —.

Anomaly for CHAR MINUS: The infix " —" cannot be desugared into dot notation, since there are two procs denoted by an infix " —" whose first argument is a CHAR. The choice between MINUS and DIFF is based on the type of the second argument.

4.7.2 Numeric types

Numeric types have arithmetic operations. There are no numeric type constructors, only the

primitive types INT= LONG INTEGER, LONG CARDINAL, INTEGER, CARDINAL and REAL. All except REAL are subclasses of whole numbers, corresponding to different finite subsets of the integers, and are discrete as well (§ 4.7.1). The class is a subclass of ordered (§ 4.7), and has items:

PLUS: PROW', 71^—>[7] MINUS: PROC[T, T]^—)^.[T] TIMES: PROC[T,

DIVIDE: PROC[T, T]--q71

ABS: PROC[T]—>[T] UMINUS: PROC[T]—>[7]

4.7.2A Whole numbers

-- Denoted by infix "+". -- Denoted by infix " —". -- Denoted by infix "*".

-- Denoted by infix "/". Truncates toward

0: —(i/j)=(— 0/j= i/(—j). except for REAL,

which normally rounds.

-- Denoted by prefix " — ".

This class is a subclass of discrete (§ 4.7.1) and of numeric (§ 4.7.2), and has the item:

REM: PROC[T. 7]^—>[7] -- Denoted by infix MOD. i=t(i/j)+ i MOD j

Considerable confusion surrounds Cedar's treatment of whole numbers. This section gives a simple but somewhat idealized description of how it works. Then it tells you the hard facts; future versions of Cedar will adhere more closely to the ideal, and this part will shrink. Finally, it describes various obsolete facilities whose use is not recommended.

In general, a whole number type (except the CARDINAL types) is a subrange of INT, which is [-2'¹..2³¹). This means that all the arithmetic procs work on INTs. If an argument of such a proc is a subrange value, it is coerced to INT (this cannot lose information or cause a fault), and the result is coerced to a subrange type if necessary (with a possible Runtime.BoundsFault). An arithmetic proc gives a BoundsFault if its result is not an INT (overflow).

Anomaly in arithmetic: In fact, there are two deficiencies in the implementation:

1) There is no overflow checking on the numeric procs, except for DIVIDE and REM, which may raise an ERROR defined in Inline.

2) A subrange with <2¹⁶ values is called short (currently all subranges have this property, as do INTEGER and NAT). If all arguments are short, the result of an arithmetic proc is truncated to 16 bits without notice (even if it is static). This means that the result is always IN [-2¹⁵..2¹⁵),and may differ from the correct result by some multiple of 2¹⁶.You can force proper INT arithmetic by writing at least one argument as x.LONG rather than x. Thus the program

x. y: [0..10000)⁴-1000;

z: X^*Y:

w: INTⁱ-x.LONG*y

initializes w to 1000000 but z to 16960. Beware. This will also happen if x and y are declared as INTEGER or NAT, since these too are short.

There are several forms of whole number literal, given in rule 57. The radix may be: Decimal, the default., or specified by D after the number.

Octal, specified by B after the number. A sequence of digits without a B is never taken as octal, except in a CHAR literal.

Hexadecimal, specified by H after the number. A hex number may include the letters A through F, denoting the hex digits with decimal values 10 through 15. It must start with a digit in the range 0 through 9, however.

The optional number following the radix character is a scale factor, given in decimal; that many zeros are tacked on the end of the number. Precisely,

num₁ R num, = num₁ 0 R num₃ if num₃= num₂-1: nurn₁ R 0 = num₁ R

Note that literals are always non-negative; a static negative value can be obtained by arithmetic; e.g., —1.

Representation of whole numbers: Short values are represented in one word; other INT values require two words. The representation is twos complement, with one more negative than positive value.

Performance of whole numbers: Arithmetic is less efficient on subranges with FIRST*0 (except for INTEGER, which is efficient). Widening a short value to INT is more efficient if FIRST=0. Multiply and divide are quite slow when the arguments are not short. Short divide is faster when FIRST=0 than for INTEGER.

The interface Inline has inline procedures for doing bit manipulation on numbers, for obtaining the quotient and remainder simultaneously, and for doing certain other calculations more efficiently than is possible using the procs described above.

•Cardinal types

The type LONG CARDINAL has elements in the range [0..2³²);CARDINAL is the subrange [0..2¹⁶).

The arithmetic procs produce answers modulo 2³ (or modulo 2¹⁶if all arguments are short cardinals). Use of these types is not recommended, mainly because there are confusing coercions to and from INT. If you program so that these coercions are never invoked, by never mixing CARDINAL and INT values, you will avoid these problems; in the future Cedar will not have these coercions, and cardinal types will be harmless.

Anomaly for mixed integer and cardinal arithmetic: •Current Cedar attempts to do the "right" thing when subranges of INT are mixed with subranges of LONG CARDINAL in an arithmetic proc. by supplying various coercions which may lose information. Do not use these features (unfortunately, the compiler won't check for their non-use); if you need to understand them. consult a wizard.

4.7.2B The type REAL

Cedar uses the IEEE standard 32-bit floating point arithmetic for REALS. There are REAL literals with syntax given in rule 57; they are rounded to the nearest representable number. The exponent, if present. indicates the power of 10 by which the number or fraction should be multiplied. A literal that overflows the representation is a static error; one that underflows is replaced by its denormalized approximation. Note that a REAL literal can begin, but not end, with a decimal point.

The interface CedarReals has items for handling exceptions that can arise in real arithmetic, for changing the rounding modes. etc.

4.7.3 Subrange types

Each discrete type U has a MKSUBRANGE type constructor; its application is denoted by the syntax in rule 25. The first and last arguments specify the first and last elements of the subrange; the FIRST and LAST items in the subrange cluster have these values. The number of values in the subrange type is last— first+1. The subrange is empty if last<first. It is also possible to make an empty subrange with first= U.FIRST using the EMPTYSUBRANGE type constructor. You cannot make an empty subrange with last= U.FIRST.

In current Cedar the arguments of MKSUBRANGE must satisfy

—2¹⁵_first<2¹⁵ AND (last— first)<2¹⁶ — 1 AND lasK(IF first^ql THEN 2¹⁵+ first ELSE 2¹⁶)

There is a subrange class for each discrete type, which is a subclass of discrete (§4.7.1), with the items:

GROUND: TYPE; -- The type whose MKSUBRANGE or

EMPTYSUBRANGE proc produced T.

TOGROUND: PROC[x: 7]--->IGROUND] -- A widening coercion.

FROMGROUND: PROC[x: GROUND]-⁴171 -- A narrowing coercion; may raise

RuntimeBoundsFault. Apply explicitly by nx].

Note that there are coercions both to and from the ground type. The former cannot lose information or raise an exception, but the latter raises BoundsFault if its argument is not in the subrange. Subranges have their own FIRST, LAST, and ASSIGN items, as well as the items of general. They also inherit unchanged all the procs of the ground type with names not in the subrange class (including the MKSUBRANGE and EMPTYSUBRANGE type constructors); these procs still take the same arguments, and the coercions make it convenient to apply them to subrange values. There are no special arithmetic or comparison procs for subranges. Note that assigning a value of the ground type to a subrange variable will invoke the FROMGROUND coercion, with its attendant bounds check.

Representation of subranges: If T is a subrange type, T.FIRST is represented by the INT 0 (except for INTEGER, which has 0 represented by 0), and T.LAST by the INT (T.LAST— T.FIRST± 1). The number of bits required to represent a T value is the n such that

2ⁿ- ¹<(T.LAST — T.FIRST± 1)<2ⁿ

In current Cedar, a subrange value always fits in one word, because a subrange may not have more than 2¹⁶values.

4.8 TYPE types

All type values have type TYPE. TYPE is not a general type; it lacks SIZE. NEW and the other general procs nearly all types have. Furthermore, in current Cedar a type can't be passed as a parameter, with two exceptions:

An interface type parameter can be declared in a DIRECTORY statement, and the resulting interface type can be used to declare an interface parameter in an IMPORTS clause. The argument for this latter parameter is supplied by an implementation which exports the interface type. See § 4.3.5.

An opaque or exported type can be declared in an interface module. An implementation of the interface provides the actual argument. See § 4.3.4.

A type also can't be returned as a result, with two parallel exceptions:
an interface type is returned by an interface module;

an exported type is returned by an instance of an implementation. The other possible uses of a type value are these:

A type value appears in a declaration, after a colon; e.g.. i: INT.

A type value appears as a value bound to a type name; e.g., T: TYPE^—'INT.

Some of the values in the cluster of a primitive type can be denoted by T.n. In general a proc cannot be denoted this way, though it is often possible to write x.P[...] to apply the primitive P to x and other arguments.

Certain primitives take type arguments: CODE, DESCRIPTOR. FIRST. ISTYPE, LAST, LOOPHOLE, NARROW, NEW, SIZE and a number of type constructors.

The runtime type system (in the interface AMTypes) provides complete facilities for manipulating types during execution of the program (but currently not for constructing them). The type values it manipulates have the type AMTypes.Type, rather than TYPE. A AMTypes.Type can be obtained from a TYPE using the primitive:

CODE: PROC [T: TYPE]^—)[AMTypes.Type].

In a number of cases the syntax T[x] (which looks like applying a type value) can be used. Depending on the class of T, the meaning varies. The cases are summarized here, and described in detail in the appropriate section above:

TYPE applied to a static integer n yields an opaque type of size n: applied to ANY it yields a fully opaque type (§ 4.3.4).

A record type applied to a group or binding yields a record value; this is called a record constructor (§ 4.6.1). The same thing works for arrays (§4.4.2A).

A sequence-containing record type applied to a (not necessarily static) CARDINAL yields a record type containing a sequence of definite length, which can only be used in NEW and SIZE (§ 4.4.2B).

A subrange type (including NAT, INTEGER, or CARDINAL) applied to a value of its ground type yields a subrange value (§4.7.3).

·A variant record type applied to a static tag value yields a bound variant type (§4.6.2).

·An enumerated type applied to a name which is one of the enumeration literals yields the corresponding enumeration value (§4.7.1A).

The last two cases are obsolete notations for expressions which should be written with dot notation. One other use of TYPE is to denote the type of an interface: TYPE n (§ 4.3.5).

4.9 Miscellaneous types

•4.9.1 Unspecified

The type UNSPECIFIED both implies and is implied by any type T with T.SIZE= 1. The type LONG
UNSPECIFIED is implied by any type T with T.SIZE=2, and implies any type T equal to a type of
the form LONG ... or REAL. In a CHECKED block, T must not be RC (§ 4.5). These types are
assignable (§4.3.2), and in addition have a peculiar collection of operations in their clusters: if you
need to know about any of these, consult a wizard. The main use of unspecified types is as domains
of procs which must accept an assortment of types as arguments. Their use should be avoided if at

all possible.

4.9.2 Kernel types

Declarations are explained in § 2.4.5, groups in § 2.3.4, and bindings in § 2.3.5. There is a summary of the relations among these classes in § 2.8. The different kinds of constructor are explained in § 2.2.5. Precise definitions of the types and primitives are in § 2.2.1.

4.10 Concurrency

This section describes the Cedar facilities for concurrent programming, and offers some very sketchy guidance on the proper construction of concurrent programs. The paper by Lampson and Redell ("Experience with processes and monitors in Mesa," Comm. ACM, Feb. 1980) has more information on this subject.

4.10.1 Processes

FORK creates a new concurrent process P, which is returned as the value of the FORK. P runs the proc which is the first argument of the FORK. P is destroyed when the proc returns. JOIN P waits until P is destroyed, and returns the results returned by the proc. Thus

x⁴-JOIN FORK Proc[x. y]

is an inefficient way of doing

x⁴- Procfx.

Process.Detach[P] never waits, and causes the results of P to be discarded silently. If you do neither JOIN nor Detach, the process stays around uselessly after its proc returns.

A FoRKed proc runs just like one which is applied in the usual way, except that an exception which escapes from it is not propagated to the proc doing the FORK, but instead calls the debugger (an applEnⁿⁱ can be written on a FORK, but it does not catch exceptions from the new process). Thus any proc that can be FORKed can also be called normally, but not vice versa, since a proc to be FoRKed must handle all exceptions.

4.10.2 Monitors

Monitors are for synchronizing access to shared variables. A monitor is a construct which unifies synchronization, declaration of shared data, and the code which touches the data. A monitor is a module which normally contains all the procs that access a certain set of shared variables. These are of two kinds (declared in the block which contains the proc body), ENTRY procs which can be called only from outside the monitor, and INTERNAL procs, which can be called only from within the monitor. A monitor module can also contain other, external procs; these are in the module, but are not considered to be in the monitor. They have no special properties, and should not access any shared data that changes; however, this rule is not enforced.

Only one proc in the monitor is allowed to run at a time, so that such a proc behaves as though only one process could access the data. Associated with a monitor there should be an invariant. which is true of the shared data whenever no monitor proc is running. This invariant can be assumed whenever an ENTRY proc is entered, and must be established whenever an ENTRY proc returns, and whenever a proc in the monitor does a WAIT. There should be no shared variables not protected by a monitor. Further discussion of how to write concurrent programs that work is beyond

the scope of this manual.

There is exactly one MONITORLOCK variable associated with each monitor (not necessarily with each MONITOR module instance, though this is so in the simplest case). Note that this is a variable, and is not assignable: usually you use a reference to it. In most cases, however, this variable is not declared explicitly, but instead is declared implicitly with the name LOCK:

A MONITOR module with no LOCKS clause has an implicit declaration of a variable LOCK:

MONITORLOCK.

A MONITORED RECORD has an implicit declaration of a field LOCK: MONITORLOCK.

The locks⁴ clause in a MONITOR module determines which monitor all the entry and internal procs of the module belong to (i.e., which MONITORLOCK they lock and unlock). There are three cases, increasingly complicated to handle and providing increasing amounts of flexibility and concurrency. Use the simplest case you can get away with.

1) If there is no locks clause, the procs in one instance of the module all belong to a single monitor associated with the instance. The MONITORLOCK is the LOCK variable of the module instance.

2) If there is a locks clause but it has no USING clause, the e of the locks clause is evaluated to obtain the MONITORLOCK. This is done in the scope of the module parameters and any open on the module block. This case is useful when procs in several MONITOR modules must be part of the same monitor. One module declares the lock, and the others import Alternatively, it can be allocated elsewhere, and passed to each instance at initialization.

3) If the locks clause has a USING n_u: T. every proc I'_m in the monitor must have a parameter n_u: T. The e in the locks clause is made into a proc

PROC[n_u: t] RETURNS [MONITORLOCK]^—{RETURN[e]l

(in the same scope in which e is evaluated in (2)), and P_u is applied to the n_u parameter of P_m to yield the lock variable each time it is needed. This case is useful when there are

many instances of the shared data, all operated on by the same procs, and each instance has an invariant which is independent of the others.

Restriction on LOCK expressions: The evaluation of the expression that yields a lock must not do a

WAIT.

Caution that lock expressions must be functional: In cases (2) and (3), the expression that yields the lock variable is reevaluated each time the lock is needed, i.e., at start and end of each ENTRY proc application, and of each WAIT. Within a given application of an ENTRY proc, it must always yield the same variable, or chaos will result: however, this is not enforced.

Caution on global variables with USING: In case (3). the global variables of the MONITOR module instance are not protected by the lock. Almost certainly they should be changed only during initialization.

•In cases (2) and (3), the expression that yields the lock variable may yield a MONITORLOCK. a record containing a field LOCK: MONITORLOCK, or a reference value which can be dereferenced to yield one of these. This is a minor convenience to save you from writing ?.LOCK. and it should be avoided.

An ENTRY proc may be inline, and may be declared in an interface. In this case the interface must have a locks clause, which probably refers to an interface variable or has a USING.

4.10.3 Conditions. WAIT and SIGNAL

Often a monitor proc cannot complete its job. but must wait for the state of its data to change (e.g., in a bounded buffer, the Put proc might find the buffer full, and must wait for space to be available). Waiting is done by a WAIT primitive, which specifies a condition variable of type CONDITION on which to wait. Note that this is a variable and is not assignable: usually you use a

reference to it.

The WAIT releases the monitor lock for the monitor that encloses it, so the waiting process must establish the monitor invariant. Execution will resume after WAIT c at some time after one of the following is true:

There is a BROADCAST done on c.

There is a NOTIFY done on c, and the waiting process has the highest priority of any process waiting on c, and has been waiting on c longer than any other process with the same priority.

The process has been waiting longer than the timeout interval associated with c. There are procs in the Process interface for setting timeout intervals. There is no special indication that waiting ended because of a timeout: the program can read the clock, or find this out in some other way.

An ABORTED ERROR is caused in the process by some other process. There is a proc in the Process interface to accomplish this. The ABORTED is the result of the WAIT, and never arises from any other primitive.

A process continuing after a WAIT has no special priority, and may not assume anything about the monitor data except the invariant. Thus a WAIT should be inside a loop of the form UNTIL data is such that the process can proceed DO WAIT c ENDLOOP

The idea is that WAIT is simply an optimization of busy waiting, in which the process repeatedly tests for the desired state, wasting a lot of processor cycles.

For this to work, when a monitor proc changes the data so that a waiting process might be able to proceed, it should do a BROADCAST to a condition variable which has been declared to reflect this fact. It may do a NOTIFY instead if only one process should proceed, and it is always the process at the head of the condition queue: this is an optimization which may avoid needless execution of several waiting processes (but if misused, it may prevent the right process from running). In a properly written program. BROADCAST is always correct.

There is no way to time out a process waiting to acquire a monitor lock.

Note that an internal proc doing a WAIT in a monitor with a USING clause must have a suitable n_uparameter.

4.10.4 Exceptions

An exception which is the result of an entry proc will not release the lock when the proc is finalized, unless there is an enChoice⁹ which catches only UNWIND in the enable of the proc's block. Hence every entry proc should have such an enChoice, unless it is known that it never raises an ERROR or a SIGNAL that isn't resumed. Of course, the UNWIND enChoice should establish the invariant. If no work is required to do this, it can simply be NULL.

Anomaly about errors exiting from ENTRY procs: Recall that the current implementation of ERROR handling does not do finalization until there is a GOTO out of the enChoice that catches the error

(§ 3.4.3.1). This means that if the error came out of an entry proc the lock is not released: hence the enChoice should refrain from calling any monitor procs.

If the exception is actually raised in the ENTRY proc itself, an alternative is to raise it using RETURN wrrH ERROR instead of ERROR. This causes the lock to be released first. Of course the monitor invariant should be established. In this case the lock is released before the error is propagated, so the enChoice that catches it is free to call the monitor again.

An enChoice on a WAIT, like all the other code in a monitor proc, is executed with the lock held.

4.10.5 Miscellaneous

The monitor data must be initialized before any entry procs are called. It is unwise to rely on a start trap (§ 3.3.2A) for this, since the monitor lock is not held during execution of the program proc. An initialization proc should be called (•or the module should be sTARTed explicitly) before any processes are allowed to call entry procs of the monitor.

Performance of process primitives: WAIT, NOTIFY, BROADCAST, and entry to and exit from an ENTRY proc are quite efficient: each costs significantly less than an ordinary proc call. A process switch costs about as much as calling a null proc with no arguments or results. A FORK/JOIN pair costs about 30 times as much.

4.11 Defaults

ss defaultTC :: = CHANGEDEFAULT[OldT^—t, (

t I Default—NIL, trashOK-FALSE]

t ⁴-e I Default—INLINE A IN e, trashOK—FALSE] I

*t e ± TRASH I Default—INLINE A IN e, trashOK—TRUE]

*t <- TRASH Default—t.Trash, trashOK — TR U )

defaultTC legal only as the type in a decl in a body⁹ or field ⁴³(n: t ^Fe). in a TYPE binding¹³, or in NEW Note the terminal I. •TRASH may be written as NULL.

Declaration

n: T⁴- e

n: Tin drType⁴²

Binding

short for

n^—x

n^— or nothing

n^—x

n^—OMITTED

n^—x

ERROR

n^—x

n^—e (in scope of decl)

n^—x

ERROR

Table 4 —6: Usual cases for defaults

Anomaly on discarding defaults for domain and dange declarations: The last column says that if you just write T in a proc domain or range declaration, any default is discarded. This means that you can tell by looking at the declaration whether there will be defaulting, without knowing anything about the defaulting properties of the types.

The basic idea is complicated by an assortment of features for improving efficiency. which are described in the remainder of this section. Defaulting is controlled by two items in the cluster for a type T. and by two special values. The cluster items are:

Default: PROC [1— [7j, a procedure which supplies a default value. If this item is missing or NIL, values of T cannot be defaulted. Defaulting is done by coercing the special value OMITTED to T.Defaultn.

Trash: PROC []^.—[7]; a procedure which supplies a trash value of type T, a haphazard collection of bits of the same size as a value of type T. If this item is missing, values of T cannot be trashed. The main virtue of this procedure is that executes very fast. See the description of TRASH below.

The CHANGEDEFAULT primitive makes a new type with these items modified. It cannot be written in a program, but is invoked by the syntax for defaultTC.

CHANGEDEFAULT: PROC[OidT: TYPE, Default: PROC D^—q7], trashOK: BOOL]^—*[Newr TYPE]

NewT has the same predicate and cluster as OldT, except that: NewT.Default is Default.

NewT.Trash is copied from OldT.Trash if trashOK=TRuE; a missing OldT.Trash causes an error in this case. NewT.Trash is omitted if trashOK= FALSE.

As described earlier, a type in a proc domain or range which is not a defaultTC has its Default and Trash procs omitted.

The two special values cannot be written explicitly in a program, but are supplied as follows:

OMITTED — in an argBinding²⁷ the syntax n— which omits the value, means n^—OMITTED. Then if there is a Default, OMITTED is coerced to T.Defaultp to provide a value of type T. There is also a coercion which adds n^—OMITTED to a binding which lacks n, so that n can be left out entirely with the same effect as writing n^—' . You can write a denotation for OMITTED in a VAR constructor, i.e., on the left side of 4-.

In a group (constructor without names), an empty element means OMITTED; note that the group is first coerced to a binding by attaching the binding's names to the group elements in order (§2.2.6), and then if the resulting binding is too short, n^—OMITTED elements are added for the trailing names.

TRASH—a binding can specify this value explicitly with the syntax n^—TRASH. It is unwise to use TRASH if the program uses the value. Its purpose is to avoid the cost of initializing a variable which is going to be reinitialized before it is read.

The effect of these rules is that binding [n₁—e₁ to [n_i: T_I ...] has the same effect as binding any of
[nl ...]. [...], or [ ...] to [n₁: ' T11 ...] (assuming that any free variables have the same bindings).

Primitive types and those returned by primitive type constructors (except CHANGEDEFAULT) have a Trash proc, and a Default proc equal to the Trash proc, with the following exceptions:

CONDITION, MONITORLOCK and PORT have no Trash or Default: they do have an !NIT proc which sets any variable to NIL.

REF and PROC types have no Trash, and a Default which returns NIL.

Bound variant records have no Trash, and a Default which sets the tag value appropriately.

Composite types have a Trash or Default if all their component types do; it is the obvious concatenation of the component Trash or Default procs.

Including the various dangerous uses of TRASH which omit initializations, we get a larger and more confusing summary table, which should be ignored except by efficiency hackers.

Default type constructor			T<- e	T⁴- e I TRASH	fl-TRASH Tin domain/ range decl
Default			IN e	X IN e	TTrash
Trash				T.Trash	T.Trash
Declaration Binding	short for	n: T^4-	n: The	n: T-e 'TRASH n:T'-TRASH n: T
n— x	n^—x	x	x	x	x	x
n— or nothing	n^—OMITTED	ERROR	e (Belau*	e (Defau4)	T Trash	ERROR
n^—TRASH	n^—'TRASH	ERROR	ERROR	T.Trashn	T.Trasha	ERROR

Table 4 — 7: Complete cases for defaults

4.12 Type implication

A type T implies another type 7' (TT for short) if for any value x.

T.Predicate[x]r .Predicate[x]

In other words, if any value that has type T (satisfies T's predicate) also has type r, then T implies T . A consequence is that a proc with domain type 7' can safely be given a value of type T, since this value must also have type 7', as required by the proc. We also say that a 7' value is as good as a r value. or that T is a subtype of T .

If Ts predicate includes a test for some mark, then any type which implies T must test for the same mark or a bigger one. For instance, if R is a variant record type with variants a, b, and c, then R.a=^. R if R.a.S1ZE = R.SIZE. In fact, the predicate for R.a tests for R's mark and for a tag equal to a. In other words, a bound variant value is as good as an unbound one.

From the implementation's viewpoint (and after all, it is the implementation of an abstraction that is responsible for attaching marks), two values should have the same mark only if they both have representations with all the properties implied by that mark: occupy at least that much space. have the proper fields interpreted in the proper way, etc. This is the rationale for marks: to distinguish values which are not acceptable to the same primitives. Of course this is not an enforceable rule: an implementation can unwisely allow the marks it controls to be applied to unsuitable values.

For example, [0..5][0..7] because both occupy four bits and represent the integer unbiased. But [1..5] does not imply [0..7], because it happens that the implementation biases the representation of a subrange value, so that the value 1 is represented in [1..5] by binary 0000, but in [0..7] by binary 0001. [1..5] and [0..7] must have different marks. but [0.3] and [0..7] can have the same mark (which might be called "four bit unbiased representation for unsigned integer"), and distinguish the values with the rest of their predicates (0<x<5 vs 0<x<7).

For T to imply r . there must be a proof that 7's predicate implies r's predicate. If T is an arbitrary type, and nothing is known about its relationship to other types, or if it tests for a unique

mark. then no such proof is possible. As a result, only an argument with syntactic type T is acceptable to a T– >I? proc. For built-in types and type-returning procs, however. the compiler knows the predicates and keeps track of the implications. The implies relations among built-in type are (the transitive closure of those) specified in the following table.

Certain points about the table are of special interest:

The first line says that implies extends elementwise to declaration types.

The line for transfer types (including PROC) says that (D---PR)=(.0'–') if D'=D and R=R". The relation is reversed for the domain types, because a D'–+R' proc P' must accept any D', while a D–^,1? proc P only accepts Ds. If P is used in the former context, it is only

guaranteed to get a D'. and that must imply a D.

There are no implications of the form VAR TVAR U. You might think that T=U should imply this, but it doesn't work, because a VAR can be assigned to, and assigning a U (say a [0..7]) to a T (say a [0..5]) clearly won't do. So a VAR T can't be as good as a VAR U. which can be assigned a U value. In fact, if there were write-only VARS, the relation would be backwards. This is a reflection of the fact that the only interesting operation on such VARS is assignment, which has the type [VAR T. 7^-1–>[7]: as we have seen. proc type implication is backwards from the domain type implication.

Any argument omitted from the type constructor applications in the table may take any legal value, but it must take the same value in both applications in a single row.

4.13 Coercions

In a binding n: t- e. the value e must have the type t. To ensure that it does, the binding constructor is type-checked by requiring V e to imply t. If it does not, an attempt is made to find a coercion function C: V e–>t which can map the argument to the required type. If C is found, the binding is rewritten as n: t–C[e]. which typechecks. We say that e is coerced to the type D.

A coercion may also be done in an application such as f[e]: this is actually a special case of a binding. Note that infix operators, including assignment, are special ways of writing applications, and hence also do coercions. In particular. x: REAL: x⁴-3 will coerce 3 to a REAL.

There are no coercions from VAR T to VAR U; this is because coercing produces a new value, but a new VAR would be disjoint from the old one and would increase the size of the state, which is unlikely to be what is wanted.

Note that if T implies U (see § 4.12), no coercion from T to U is needed to make an application type-check. Another way of thinking about this: TU means that there is a coercion function from T to U. but it does no computation. This is why REF T can be coerced to REF U if T'U.

A group or binding can be coerced element by element. Formally, a declaration type, which is the type of a binding, has one coercion for each coercion that an element type has. These can be composed to coerce several elements.

There is currently no way for the program to specify coercion procs. However, there is a modest set of built-in coercions, which are are listed in table 4-9. These can be composed. if the types permit it, to yield a coercion function. None of them loses information, except those from various whole numbers to REAL: in other words, they all have inverses. None of them can raise an exception, except a coercion from a base type to a subrange. which can cause Runtime.BoundsFault. Any argument omitted from the type proc applications in the table may take any legal value, but it must take the same value in both applications in a single row.

4.14 Dot notation

Cedar provides a single basic mechanism for getting a name looked up in a particular binding. rather than in the current scope (§2.4.4):

If b is a binding, then b.n is the value of n in b: it is an error if b has no element n. By a natural extension:

If T is a type, then T.n is the value of n in Ts cluster.

By a somewhat less natural, but very useful further extension (inspired by classical notation for records, and by Smalltalk):

If e is an expression not a type or binding, then let P=(Ve).n.

If P.DOMAIN=[p: a then e.n is if el .

Otherwise, if P.DomAIN=[p_i: D_i, p₂: ..., p_n: DJ, e.n is A [p₂: D₂. p_n: DJ IN

P[e. p₂, p_n]

In other words, the value of n is obtained from the cluster of e's syntactic type; call it P. If P takes one argument, it is applied to e. Otherwise. e.n denotes a proc which collects the other arguments p₂.

p_n that P wants, and applies P to e, p₂. p_n. In current Cedar you can't do anything with this
proc except apply it immediately: you have to write e.n[...].

There are four major applications for dot notation in current Cedar; they are described in the table below. All use the simple rules just stated (look up n in a binding; in the cluster of a type; or in the cluster of De and then apply it). But the sources of the clusters used and the procedure values in the clusters are quite various.

Object notation is the most general, since any opaque, record or enumeration type D defined in an interface acquires a user-defined cluster by this method. The current implementation is clumsy: all the procs in the interface I from which D comes are added to D's cluster, with the names they have in I, except those whose names are already in D's cluster. Of course, an element of this cluster is only useful if it takes a D or reference to D as its first argument. The reference case is often useful because when these procs are inherited by a reference type, they are not modified. E.g.. if P: [REF

is in D's cluster, it will also be in REF D's cluster, and if r: REF D, then r.P will be correct.

The interface I from which P is obtained is normally an interface instance I (which is imported). not an interface type IT (declared in the DIRECTORY clause), because only the instance provides a proc value for P. See § 3.3 for more on interfaces.

Restriction on object notation with multiple imported instances: The value for P always comes from the principal imported instance of IT (see § 3.3.3). You can ignore this if only one IT value is imported. If more than one is imported, however. confusion can result. If it does, consult a wizard.

The cluster for a record type R is formed automatically by the record type constructor, and simply contains a procedure for each field f. T₁,which takes an R and returns a T_r There are similar

clusters for VAR R and READONLY R, in which the procedures take VAR or READONLY R and return

VAR or READONLY T

An imported interface instance can be thought of as a binding, with a value for each name in the interface. (Actually it is more like a record; its cluster contains a proc for each name declared in the interface, which returns the exported value when applied to the interface value.) An interface type also yields a binding, which contains those names which are bound in the interface rather than simply declared (usually constants and types).

Index of points to note

This section lists the headings of the paragraphs throughout the manual calling attention to points that should be specially noted: anomalies, cautions, performance, representation, restrictions, and style notes. It also gives the number of the page on which each note can be found. See § 3.1.3 for an explanation of these categories.

Anomalies

ALL.........................................................................................................	84
Applying a parameterless proc.....................................................	58
Arithmetic............................................................................................	102
Arrays with empty domains...........................................................	84
CHAR MINUS.......................................................................................	101
CODE......................................................................................................	82
Discarding defaults for domain and range declarations.....	109
Enables in funnyAppls...................................................................	61
Enumeration literals.........................................................................	100
Equality of variants..........................................................................	75, 97
Errors exiting from ENTRY procs...............................................	107
FORK........................................................... ........ ........................	81
Garbage collection............................................................................	88
GOTO and procs.................................................................................	58
GOTO and UNWIND............................................................................	53
GOTO FINISHED.................................................................................	59
LOOPHOLE on variable types........................................................	75
MINUS on pointers...........................................................................	92
Mixed integer and cardinal arithmetic.......................................	103
Narrowing to a short pointer.......................................................	92
NEW........................................................................................................	61
Order of evaluating bindings........................................................	49
Order of finalization........................................................................	52
Order of initializing variables......................................................	50
Parameter and result names...........................................................	56
Relative array descriptors...............................................................	87
RESUME................................................................................................	53
StringBody...............................................................................................	86
Separate name space of labels.......................................................	53
Separators for SELECT....................................................................	58
SIZE........................................................................................................	74
Space in ATOM literals.....................................................................	92
Target typing of DESCRIPTOR......................................................	86
Target typing of NARROW and LOOPHOLE: ...........................	75
Union constructors..........................................................................	98
Union values.......................................................................................	98

Cautions

ANY and UNWIND..............................................................................................

Dangling references to frames

Errors in finalization
Exceptions in enable choices

Exporting a name to several interfaces

Finalization
Global variables with USING

Initializing monitors........................................................................................
Inlines in interfaces

Inlines in interfaces........................................................................................
Lock expressions must be functional

Referencing module variables before initialization

Uninitialized interface variables

Uninitialized RC variables

Use of reserved words and predefined names

Block entry and exit 50

Converting between opaque and concrete types 78

Inlines................................................................................................................... 57

Interface variables 46

ISTYPE for PROC ANY 75

Proc calls 56

Process primitives 108

Row arguments and results 83

SELECT................................................................................................................... 63

Static expressions 64

Whole numbers 102

Restrictions

ASSIGN procs

Bindings in interfaces

Cross types

Dot notation

EQUAL procs

Importing a principal instance into imported interfaces

Importing multiple instances
Inlines

LOCK expressions

NEW

Object notation with multiple imported instances

Record sizes

Referring to names introduced in an interface

Row sizes

Types, declarations, bindings and unions

Values of opaque types

Variables

Text Box: 90. 93 Index
BASE POINTER
bcd
BEGIN
benign side-effects bind by name
binder
BINDING
binding 87, 91, 94 40, 42
39 50
37
40
11, 45
3, 10, 19, 42, 50. 20, 55. 34, 64, Text Box: abstract access Text Box: equality
fault
ALL
allocation
AND
anomaly
ANY 34

81 81,107 101

18 48,55

79, 87, 75

61, 83.84

49 88

34, 58. 63, 71

67, 104

12. 61, 100

33, 36, 116

11, 35, 51, 63, 75. 79, 80, 87

Text Box: and UNWIND
applEn
application
of a program of an error of a module
APPLY
argBin ding argument arithmetic array
with empty domains arrayTC
arrayType arrow type as good as
ASCII
assertion language
ASSIGN
assignable
assignment
of a local proc
ATOM
literal
AtomsPrivate attribute
b
BASE 52

60, 70

3, 5, 9. 12, 16, 50, 61, 79

42, 45

12, 17, 79. 84, 87, 90,91

60, 62

3, 6, 16, 18, 19, 20 101-103

20, 83

82 86 9

110

37,101

75, 76, 85, 89

74, 75, 79, 83, 87. 95. 98, 99,105 3, 58. 61, 64

5, 8, 11, 92, 100 92

13.34,39,55 83.86,92,94

constructor

in interfaces block entry and exit block of storage BNF

body

bold parentheses

Fi00 L BOOLEAN

bound variant BoundsFault

braces brackets

BROADCAST

BTOD BTOV

built-in builtInType

BUT

butChoice

call by name

CARDINAL

carriage return cases of unions catch

caution

CDOTG CEDAR

Cedar kernel CedarReals

CHANGEDEFAULT CHAR

literal

MINUS CHARACTER CHECKED

checking choice

chooses class

hierarchy cleanup

41, 114

45, 64

19, 76, 83. 87, 96

9. 16. 45, 48. 57, 59

35 11,100

100

75. 76, 78, 97,110 84, 85, 99, 101, 103,

111

14 91,107

10,11

60, 71. 72

67 13,15

101

50, 63 101, 102

63, 98

51, 82 36, 116 10

67, 103

76,109

101

51 64 6, 64. 88

62.63 37

1, 4, 65.70-73 65.66

Text Box: 9, 16
33 33 33
4. 11, 12, 56. 61.
114 Text Box: dangling reference
DDOTP DDOTT
deallocated decimal
decimal point Text Box: 74 DECL
35, 45, 56, 82. 104 declaration
15 declaration type
3, 4, 6, 16, 18, 20, DECREASING closure CLRMFullGram.press CLRMSafeGram CLRMSumm.press cluster

Cluster

CODE

code coercion

71, 75, 76. 79, 83, 86. 91. 94. 96, 100, 101. 103. 111, 113

colon 5

command line 44

comma 39, 58

comment 38

commentary 36

compatibility 2

compile-time 86

compiler 40, 41, 42

compiler switch 64

component 19, 76, 82

proccomponent 77

composite 77

composite type 110

composite variable 77, 89

computation 16

COMPUTED 63, 85, 97, 98

concepts 2

concrete type 78

concurrency 6, 76, 105

CONDITION 107, 110

condition variable 6, 107

CONS 61, 64, 83, 91. 95

constant 59

constructor 3, 5, 14, 20, 70

for a union 98

contain 3, 18

container 18

contents 5, 18
contiguous block of storage 83, 96

CONTINUE 51, 58

CONTROL 41, 43. 47, 81

control variable 59

conveniences 6, 7
converting between opaque

and concrete types 78

COPYIMPLINST 47, 78, 81, 94

core language 7, 8

counted storage 76, 88

cross type 10, 19, 80

cross-reference 36

current environment 9

cyclic structure 88

d 8, 13, 34. 39, 55

50, 81

11 11 50

102

103

4, 11, 34. 41, 55, 80 111

default 3, 17. 20. 62. 83. 91,

108

default access 48
defaults for domain and

range declarations 109

Default 75,109

defaultTC 78, 108

deferred 9

DEFINITIONS 40

delimiter 38

denormalized 103

denote 9

dereference 50. 64, 90, 92

dereferencing NIL 64

DEREFERENCE 90

descriptor 82, 83, 84. 85. 86, 90.

DESCRIPTOR 64

descriptorTC 82

desugaring 1,7,33.35

Detach 105

DF files 40

DIFF 92.101

different types 96,100

digit 37

DIRECTORY 41.43

discrete type 82.99

discriminating 63

DIVIDE 101

domain 4,9,12,15,56

domain declaration 109
domain type implication 111

DOMAIN 79, 80, 81, 84, 87

dot 5, 44

dot notation 5, 12, 20, 61. 70, 79,

114, 115

drType 39, 80

DTOB 11

dynamically narrowing 97

e 8, 13, 14, 15. 34, 36,

39, 48, 55. 57. 60, 62, 67, 95, 99. 108

editor 40

efficiency 34

elements 10, 19

elementwise 111

empty 39

empty domain 84

empty subrange 84, 103

Text Box: EMPTYSUBRANGE 103
enable 48, 51, 57.
enable choice 51, 52, 56.
82
enable in funnyAppl 61
enChoice 48.60
END 39
endChoice 62
ENDCASE 35, 63
endpoints 62
English 7
ENTRY proc 105.107 Text Box: 59 58, 61, 62, first-class values
FIX
flat
FOR
forbid defaulting
FORK
formal
FRAME
frame 18 9,10 10 59 108
57. 61. 7
94
41, 47, 81.
49, 105
50. 81, 86. frame allocation

FREE

free variables

FROMATOM FROMGROUND FROMIMPLINST

fully opaque

function

functional

funny application funny Appl

garbage collector General

general type

global frame

global variables good

GOTO

GOTO and procs GOTO and UNWIND GOTO FINISHED

grammar

GREATER GROUND

group guarantee

hasNiL header

HEX

hexadecimal

HIDE

hiding history

identifier

idiom

IEEE floating point if

if expression

IFPROC

immutable imperative implementation

implies

52 100 103

77, 78

17, 50, 68. 77

71, 73

60, 61

6.88

74, 76, 79, 85, 91, 95, 97.98

41, 89

106

51, 58

59 34

99 103

3, 10, 19

76, 79, 80, 87

54 15

102

15 40 2

5, 34

103

12, 62 61

100

7,14

39, 40, 46, 69, 78, 81,

100

9, 11, 66, 80, 87, 98,

105, 110, 111

Text Box: 39,41,43,114,115 44,78
94
44 44 64 35. 13
2 88 82 82 70 13.
3 65, 62
60. 76, 61 84, 90, 92, 96,
97. 98
49, 64, 74
4, 16, 43, 46, 85, 108
43, 43 50 50
44, 57, 107
44, 57
44, 45, 57.64
67. 102
49
40, 41, 44, 50, 78,114
101
101
39, 40, 41, 44, 45, 50.
57, 69, 77,100, 107, 115
40, 91
40, 41. 74, 78,114 Text Box: 43 105
6,53.106
76 76 63,74,75,91,98 75
65 57,59
61,81,105
1, 2 7
13
105 2
19
9,56,80 53
99.103 import

interface

module instance multiple instances principal instance values

IN IN

incompatibilities incremental collector index

indexed set of values infix

infixOp

informal

inherit

INIT

initialize

initialization initializing monitors initializing variables

inline

inlines in interfaces

INLINE

lnline

inner scopes instance

INT INTEGER

interface

interface instance interface type

interface variable 46

INTERFACETYPE INTERNAL

invariant

ISLONG

ISREADONLY

ISTYPE

ISTYPE for PROC ANY

item iterator

JOIN

kernel

kernel definition kernel expressions kernel types

keyboard

keyword argument list

A-expression labels

LAST

LENGTH 83, 86

lengthening 92

LESS 99

LET 12, 20

library 2

Lisp 52

LIST 61, 91

lists of items 39

LisisAndAtoms 67.92

listTC 87

literal 5. 8. 13, 37, 64. 85

literal parentheses 35

loader 40, 41, 42

local 3, 56

local proc 50, 64

local string 86

LOCALSTRING 86

LOCK 106

LOCK expression 106

LOCKS 56

locks 39

LONG 35, 92

LONG CARDINAL 101, 102

LONG INTEGER 101

LONG POINTER 92, 93

LONG STRING 86

LONG UNSPECIFIED 105

LOOKUP 10

LOOKUPC 12

loop 14, 57, 58, 59, 64

LOOPHOLE 50, 53, 61, 63, 74, 75

on variable types 75

MACHINE CODE 56

machine-dependent 34, 77, 99

MACHINE DEPENDENT 85
MACHINE DEPENDENT

RECORD 96

machine instructions 56

main data space 92

map 80, 83, 87

map type 79

mark 4, 77, 96, 110

matches 4

MAX 99

mdFields 95

MDSZone 93

meaning 7, 68

Mesa 2

Mesa manual 81

MIN 99

MINUS 71. 92, 101

MINUS on pointers 92
mixed integer and cardinal

arithmetic 103

MKBINDD 10

MKBINDP 10

MKCROSS 10,19

Text Box: REF 91,93,110
REF ANY 63,74,91,97
REF TEXT 85
reference 91, 92
reference type 90
reference-containing 88
RC type 78
RC variables 50
reference-counting 76
referencing opaque types 78 Text Box: referring to module variables before initialization 43
referring to names introduced
in an interface 45
refTC 84. 87
REJECT 51
related structures 40
RELATIVE 92, 94
relative array descriptor 87
relative pointer 87, 94
relativeTC 87 Text Box: relOp rely
REM REPEAT
ReportStartFault representation reserved word rest
RESTART
restrict
restriction
result
result name
RESUME
retained
RETRY
RETURN
RETURN WITH ERROR
rewriting
ROPE
ROPE literal Rope
row
row as an argument or result
row size
row type
rows of different sizes rule
Runtime
runtime type system
safe
SAFE
safe language safe reference safe storage Text Box: 60, 62 4
101
53, 59 43, 47 36,116 2, 37, 38
91
61, 81 44 2,33,36,116 3.16,18
56
53
47,50 51,58 51,56 108
35
75,85 38
67
82,83 Text Box: 83 83 64 82,86
69 67 104 Text Box: 48,34,39,57 6,87,93
53 33 88 88 predeclared types predefined names predicate Predicate

prefix

pre fixOp prime

primitive

primitive application

primitive proc

primitive type principal instance PrincOps

priority

PRIVATE

proc

PROC PROC ANY

PROC bindings

proc calls

PROC type

PROC type implication proc value

procedure

process

process array process primitive

PROCESS type Process

PROGRAM

PROGRAM arguments

program instance program proc

PROGRAM type

program value properties

PUBLIC

punctuation radix

RAISE

range

range declarations

RANGE

reachable

READONLY

REAL REC

reclaimed record type

record field

record field proc record size

recordTC

recursion

68 38 4,11

74. 110 61, 70

60 36

1, 2, 8, 9,

16,

54,

64,

61, 70, 73

67-69

44, 114

67, 80

107

46, 48

110

63, 74, 75

56 56

4, 80

111

56, 57

6, 105

18, 88

108

81 67

41 41.42,47

43,94,108

81 81

46,47,48

102

4, 12, 56

109
79,	80,	84,	91
88
18,	46,	76,	84,	90, 97,

114

64, 101, 103

5,14

20, 50, 95, 96,114,

115

97 91 96

77, 95

5, 57

type implication type option

type value

TYPE n

type-checking typeCons

typeName

TYPE[ANY] TYPE[n]

"u" switch

UMINUS

UNBOUND UnboundProc

unchecked

UNCHECKED UNCONS

uncounted zone underflow

underlined

UNHIDE

uninitialized

interface variables RC variables

union

union constructor

union value unionTC

Unnew

unpainted record

UNREF

unsafe

UNSAFE

unsafe storage UnsafeStorage unspecified

UNTIL

UNWIND

UNWIND and ANY UNWIND and GOTO UNWRAP

upper case

user-defined cluster

USING

VAL VALUE

value

VALUEOF

values of opaque type

VAR

VAR ANY

variable

variable type variant field

110

36, 70

104

43. 78

4. 11, 43 67

36, 60, 67, 77, 87

46, 68, 77 46, 68.77

64 101

47. 80

6, 88

53, 64

49, 95

90, 93

103

35 15 43

46 50

75, 76, 84, 91, 97

98 98 95

47 91

50, 63

34. 54, 63, 92, 93, 97, 98

74, 75, 77, 82, 85, 86 88

67,93

76,105

52, 53,107

53 95

37. 38

114

35, 44, 106

61, 100

76, 87

353, 9, 16, 18.. 51, 52 76

3, 18, 46, 76, 83, 90,

111, 114

3. 18, 49, 75, 79, 87, 90

75, 76

variant record

VARIANTPART VARIANTTYPE

varTC

VARTOPOINTER VOID

WAIT WHILE

white space whole number

whole number literal widening

WITH

withChoice withSelect wizard

write-only

xDOTy

xfer

ZONE

._c 'Vidd

()

63,74,75.76.84,96, 97

55,76,82.87

76,92 14.58

6,61,91,107

38,92

101,102

102

103 35,56

63,97,98 81,105,114

111

7 80

6,75,76,93

37, 70. 71. 72

70, 71, 72

37 37

5, 37, 92

11 37 37

37 101

101

36, 37

35, 37

35. 37, 85, 97, 101 35, 37. 101

35, 37

37. 101

14. 35. 37

35, 37

37, 99

35, 37, 75

35, 37, 51. 53. 63 37, 99

37. 50. 64. 76. 77.

83, 85, 92. 93. 96 13, 35, 37

37, 101

§3.4 6 block ::= ?(CHECKED I UNCHECKED I TRUSTED)

{ ?open ?enable ?body ?( EXITS (n, !..=>s): ) } --In 3. 13. 14. §4.4

7 open :: = OPEN (n e I e). !.. --In 2. 5.

8 enable :: = ENABLE { enChoice: }:

9 enChoice ::=( e. !.. I ANY) => s --In 7.27.1.

10 body :: = (d I b): L. : s: ... I s: L. --In 5.17.

§4.6

§3.6 14 Statement ::= e_l ⁴-e₂ I e I block⁶ I escape I loop I NULL

16 escape :: = GOTO n I EXIT I CONTINUE I (RETURNIRESUME) ?e

17 loop ::= ?iterator ?(WHILE e I UNTIL e) §4.7
DO ?body¹⁰ ?(REPEAT FINISHED= >s) ENDLOOP §4,11 18 iterator ::= THROUGH e

FOR n : t ( ?DECREASING IN e I ^4-e_l , e₂)

e is a subrange. n is readonly.

4tupaintedTC ::= typeName³⁷ PAINTED t

41 transferTC ::=?(SAFE I UNSAFE) xfer ?drType

41.ixfer = PROC I PROGRAM I PORT I PROCESS I SIGNAL I ERROR

42 drType ::= ?fields_' RETURNS fields₂ I fields_' -In 3.41.

43 fields :: = d11_,...1 I ft 1 I

ANY --In 42. 50. 52. ANY only in drType.

44 arrayTC :: = ARRAY t₁ OF t2

45 seqTC ::= SEQUENCE n : OF t2 -- Only as last type in 50 or 52 refTC ::= REF ?vaTTC⁴⁴⁾

47 listTC = LIST ?(OF varTC⁴0)

so recordTC :: = ?MONITORED RECORD fields⁴³

52 unionTC = SELECT n : (t I *) FROM (n => fields⁴³), ?. ENDCASE -- Only as last type in fields of 50 or 52.

54 enumTC ::= In, ...1

55 defaultTC :: = e

Only as t in a decl in body⁹ or field⁴³ (n: t e).in a TYPE bindingⁿor in NEW.

§3.7 19 expression :: = n I litera1⁵⁷ I (e) I (e I typeName³⁷) . (9) n prefixOp e I e_i infixOp e₂ I e_l AND (2) e₂ I e_i OR (I) e, I e t (9) I ERROR I[ argBinding²⁷ ] I

application7o I

builtln [ e.... ?applEn^27.1]

funnyAppl e ?( [?argBinding²⁷ ?applEn²⁷-¹1 )

subrange25 I if28 I select²⁹ I safeSelect³² I s

Precedence is in bold in rules 19-21. All operators associate to the left except f. which associates to the right. Application has highest precedence. Subrange only after iN or THROUGH. s onl^y in if²⁸and select choices³⁰ ³³.

20 prefixOp ::= g (8) I - (7) I (- I NOT) (3)

21 infixOp = * I / I MOD (6) I + I - (5) 1 relOp (4)1 *- (0)

22 relOp ::= ?NOT (?--(-= < )) < = I >= I # I IN) --In 21. 30.

23 builtln :: = -- These are enumerated in Table 4-5.

24 funnyAppl ::= FORK I JOIN I WAIT I NOTIFY I BROADCAST I SIGNAL I ERROR I RETURN WITH ERROR

25 subrange ::= ?typeName³⁷ ( [ I ( )e,til

^e2 I)) In 19.39.

26 application :: = e [ ?argBinding ?applEn ]

27 argBinding ::= (n ?e ). !.. I (?e). --In 19.26.

27AapplEn = ! enChoice⁹: in 19. 26.

§3.8 28 if ::= IF e_l THEN e, ?(ELSE e₃)

29 select :: = SELECT e FROM choice: endChoice The ":" is "." in an expression. here and in 32.

30 choice :: = (?relOp²² e_l ). !.. = > e₂

31 endChoice = ENDCASE ?(=> e₃) --In 29. 32. 34.

32 safeSelect ::= WITH e SELECT FROM safeChoice: endChoice³¹

safeChoice = n : t => e₂

§3.2 56 name = letter (letter I digit)... -- Not a reserved word (Table 3-2).

57 literal :: = num ?( DIB)I digit (digit IAIBICIDIEIF) H ?num . num ?exponent I num exponent I $ n I

• (extendedChar I ' I ") I " (extendedChar I ') "

58 exponent :: = E ?( + I - ) num

59 num = digit !..

f,o extendedChar :: = space I \ extension I anyCharNof "Or\ 61 extension ::=

§3.3 I module ::= DIRECTORY (n_d?( : TYPE ?Il_i)

?(USING [ n_u, ... ) ),

( interface I implementation )

2 interface :: = : ?CEDAR DEFINITIONS ?locks ?imports

·SHARES n₅, ...) ?•access12 { ?open^? (d I b); 1 .

3 implementation :: = n_m : ?CEDAR ?safety

( PROGRAM ?drType⁴² I MONITOR ?drType⁴2 ?locks ) ?imports ?(EXPORTS n_e, ...)?*(SHARES n, ...)

·access^I2 block .

3.iimports::= IMPORTS ((n;,.: I) n_il),...)--In 2.3.

$Text Box: 36 type :: = typeName I builtlnType I typeCons 37 typeName ::= n1 1 typeName. n2 I typeName[e] •n2 typeName --In 19. 25. 36. 40.1. 49. 38 builtInType ::= INT 1 REAL I TYPE 1 ATOM I CONDITION I MONITORLOCK 1 •LONG CARDINAL 1 •11LONG UNSPECIFIED I stMDSZone I* ?tUNCOUNTED ZONE See Table 4 -2. TYPE only in a b or an interfaces d 39 typeCons :: = subrange25 I paintedTC40.1 I transferTC41 I arrayTC44 I seqTC45 I tdescriptorTC45.1 I refTC46 I listTC47 tPointerTC48 I •-relativeT019 I recordTC50 I unionTC52 enumTCS4 I defaultTC55 40 varTC ::= ( I READONLY 1 VAR) t 1 ANY In L 45-48. ANY only in refit. VAR only in interface dect 4o.ipaintedTC :: = typeName37 PAINTED t 41 transferTC ::=?safety4 xfer ?drType 41.ixfer ::= PROCEDURE I PROC 1 PROGRAM 1 PORT 1 PROCESS 1 SIGNAL 1 ERROR 42 drType :: = ?fieldsi ?(RETURNS fields2) --In 3.41. 43 fields :: = j I [t, I ANY --In 42. 50. 52. ANY only in 42. 44 arrayTC :: = ?*PACKED ARRAY ?t1 OF t2 r 45 seqTC = ?*PACKED SEQUENCE tag53 OF t Legal only as last type in the fields of a recordTC or unionTC. 45.1-rdescriptorTC ::= ?LONG DESCRIPTOR FOR varTeto varTC must be an array type. 46 refTC:: = REF ?varTC40 47 listTC :: = LIST ?(OF varTC40) 48 fpointerTC :: = ?LONG ?ORDERED ?•BASE POINTER ?•subrange25 ?(TO varTC40)10POINTER TO FRAME [ n ] Subrange only in a relativeTC: no typeName37 on it. 49•1-relativeTC::= typeName37 RELATIVE t so recordTC ::= ?accessI2 ( ?MONITORED RECORD fields43 t- MACHINE DEPENDENT RECORD (mdFields I •fields43) ) 51 fmdFields = [( (n posy, : ?•access12 t ...] -- In 50.52. 51.1tpos ::= el ?(: e2 e3) ) in 51. 53. 52 unionTC = SELECT tag FROM ( n, => ( fields43 I mdFields51 I •NULL) ), ?. ENDCASE Legal only as last type in the fields of a recordTC or unionTC. : ?•access12 I *t(COMPUTEDIOVERLAID) ) 53 tag ::= (n ?fpossii (t I *) --In 44. S2.* only in unionTC. 54 enumTC::= {n....} I MACHINE DEPENDENT {(?n (e) I n), ...1 55 defaultTC ::= t 4- I t e 1 Id e ,_1 TRASH I *t 4- TRASH defaultTC legal only as the type in a decl in a body9 or field 41(n: I 0- e). in a TYPE binding13. or in NEW. Note the terminal'. *TRASH may be written NULL.$ Text Box: 58 exponent ::= (HO ?( + 1 -)um
59 num :: = digit !.. $Text Box: 60 extendedChar :: = space I \ extension I anyCharNot-Or\ 61 extension :: = digit) digit2 digit3 InINIrE111-11121E111E111L1"1"1\$ Text Box: Ni-- i literaL
-- Hex INT literal. I
-- REAL literal. I
CHAR literal. I
-- ROPE. MOM literal. 4 safety :: = SAFE I UNSAFE --In 3. 41.

5 locks :: = LOCKS e ?(USING n_u: t)

§3.4 6 block ::= ?(CHECKED I UNCHECKED 1 TRUSTED)

{ ?open ?enable ?body ?( EXITS (n, !..=>s); ) } --In 3.13.14.

7 open :: = OPEN (n--ele).!..;

In 2. 5. 17. •The - may be written as:.

8 enable :: = ENABLE (enChoice 1 {enChoice:...}); --In 5. 17.

9 enChoice ::=( e, I ANY) => s --In 7.27.1.

10 body :: = (d 1 b): !.. ; s: I S; L. --In 5. 17.

declaration :: = n, : ?access varTC⁴⁰

In 2. 10. 43. VAR. READONLY only for interface var.

12 access ::= PUBLIC 1 PRIVATE --In 2. 3. I1. 13. 50. 51. 51

13 binding ::= n, !.. ?access t (

e I t₂ -- if t=TYPE-- I CODE 1

?INLI NE ?(ENTRY 1 INTERNAL) block⁶ )

if ?TRUSTED MACHINE CODE { (e....): •-• I In 2. 10. •The - may be written as =. •ENTRY or INTERNAL may be written before/. Block or MACHINE CODE only for proc types.

§3.6 14 Statement ::= e_l ⁴-e, I e I block⁶ I escape I loop I NULL In 6. 9. 10. 17. 19.

16 escape ::= GOTO n I GO TO n I EXIT 1 CONTINUE 'SLOOP _I

·RETRY 1 •REJECT 1 (RETURN 1 RESUME) ?e 1 tte (- STATE

17 loop ::= ?iterator ?(WHILE e I UNTIL e)

DO ?•open⁷ ?•enable⁸ ?body^lo

·REPEAT (n, !.. => s); ...) ENDLOOP

is iterator :: = THROUGH e

FOR (n : t I *n) ( ?DECREASING IN e I <- e_l , e,) e is a subrange. In FOR n: t n is readonly.

§3.7 19 expression :: = n I litera1⁵⁷ I (e) I (e 1 typeName³⁷). (9) n I prefixOp e I e_l infixOp e₂ I e_l AND (2) e₂ I e_l OR (t) e₂ I e t (9) I •STOP 1 ERROR 1[ argBinding²⁷ ] I application26 I

builtIn [ e.... ?applEn²⁷-¹] I

funnyAppl e ?( [?argBinding²⁷ ?applEn^27.1] ) I subranae²⁵ I if²⁸ I select²⁹ I safeSelect³² I •withSelect³⁴ I s

Precedence is in bold in rules 19-21. All operators associate to the left except

which associates to the right. Application has highest precedence. Subrange only after I \ or THROLG H. s only in if²⁸ and select choices³⁰ ^33
35.

20 prefixOp :: = C (8) I - (7) I (- 1 NOT) (3)

21 infixOp ::= * I MOD (6)1 + I - (5)1 relOp (4) 1^4.-(0)

22 relOp ::= ?NOT (?- (= I < I >) I <- I>= # I IN) --In 21. 30.

23 builtIn ::= -- These are enumerated in Table 4 - 5.

24 funnyAppl ::= FORK 1 JOIN I WAIT I NOTIFY I BROADCAST I SIGNAL I ERROR I RETURN WITH ERROR I •NEW (_*START

·RESTART I ttTRANSFER WITH 1 if RETURN WITH

25 subrange ::= ?typeName³⁷

([e_i..e₂ Il[e_i _.-e2.)1(ei^e2] I ( e_i e₂ ) ) --in /9. 39. 48.

26 application ::= e [ ?argBinding ?applEn ]

27 argBinding :: = (n (e I I *TRASH )), !.. 1(e I I *TRASH). L.
In /9. 26. .The - may be written as :. •NULL may be written for TRASH.

miapplEn ::= ! enChoice⁹; In 19. 26.

bt if ::= IF e, THEN e, (ELSE _I)

Nseleel ::= SELECT e FROM
"

choice: endChoice

The ":" is ". M an expression: also in 32 and 34.

,,,choice ::= ( ( I relOpⁿ ) e_l ). !..=>e, niendChoice ::= ENDCASE ( > e; _I)

In 29. 32. 34.

usafeSelect ::= WITH e SELECT FROM
safeChoice: endChoice'
!,safeChoice = n : t => e,

,,•witbSelect ::= WITH (n, --e, I • e, ) SELECT ( I fen) FROM

withChoice: endChoice

*The -- may be written as:.

ⁿ!;•withChoice ::= => e, I

n:. !.. => e,

IF e: THEN e, ELSE (e:1 NULL) LET selector -e IN

choice ELSE „. endChoice

ELSE is a separator or of the choice.

IF ( (selector' (= I relOp ) e,)OR ) THEN e: ELSE (e, l NULL)

LET v'-e IN

safeChoice ELSE... endChoice

IF iSTYF⁵1E VA] THEN LET n : ROWiv', t] IN e,
OPEN v' -e, IN LET n'-(Sn, I NIL). type'-vs'. selector*-(e,.TAG I e,,) IN withChoice ELSE ... endChoice

-- e_l, must be defaulted except fora COMPUTED variant.

IF selector =Sn, THEN OPEN

(BINDP[ri'. LOOPHOLE]v'.type^..nd ] I BIND**. ) IN e;

i^.(IF j<3 THEN 6 ELSE 8): -
IF k NOT IN Range THEN RETUR \171:

Text Box: - An IF with results must have an ELSE.
- SELECT expressions are also possible.
- IF 1<7 THEN [...1 ELSE...
- 7.8 => or =7, = 8 =>{.,.} is the same. - ENDCASE=>{...} is the same here. - Redundant: choices are exhaustive.
- Assume r REF ANY in this example. - tint is declared in this choice only.
- Only the REF ANY r is known here. SELECT f[j] FROM -

<7 =>1...}: _

IN [7..8]=>{...]: ‑

NOT <=8=>1...}: -

ENDCASE=>ERROR: -

WITH r SELECT FROM

‑

rint: REF INT= >RETURN[GcclItIntt. 17]: -
rReal: REF REAL = >RETURN[Floor[Sin rRealtlif

ENDCASE = >RETURNIIF r= NIL THEN ELSE -

Text Box: - See rule 52 for the variant record Node. - dn is a Node.binary in this choice only. - dn is a Node.unary in this choice only. - dn is just a Node here. nr: REF Node"-...; WITH dn--nr SELECT FROM -binar^y = nr^.-dn.b1:

unary=>{nr^.•dn.a}: -

ENDCASE=>{nr^.NILI: -

typeName [e]l typeName typeName.SPECiALIZE[e] I typeName . n₂

In 19. 25. 36. 40.1. 49. --n₂ names a variant.

obuiltInType ::= INT I REAL I TYPEI ATOM I MONITOR LOCK !CONDITION I

* ?tUNCOUNTED ZONE I efIvIDs ZoneA*LONG CARDINAL1* j ?LONG UNSPECIFIED --See Table 4 -2. TYPE only as t in a b or an interface's d INTE ER. CARDIN^AL NA . TEXT STRING. BOOL CHAR are predefined

Text Box: -- See 13 for use.
-- A record with just a sequence in it.
-- ch./ex:PI or chid refers to an element.
NARROWIrifirsi. ROTextjt is a
- READONLY TEXT (or error). .typeCons ::= subrange" paintedTC*^1! transferTC' I arrayTC" I seqT0^, fdescriptorTe^s'

refTC" I listTC" tpointerTC" •frelativeTC' I recordTC^,' I unionT0² I enumTC' I defaultTC"

,ararlt ::= READONLY I VAR) t I ANY ( VAR I READONLY I VAR) I ANY

In 11.45-d . ANY only in refTC. AR only in interface dee.

typeName PAINTED t REPLACEPAiNllin: I. from: typeName]

typeName must bean opaque type. t recordTC or enumTC.

transferTC :: = ?safety' xfer ?drType MKXFERTYPE[drType. ^Flavor-xfer]ii_ixfer::= PROCEDURE I PROC I PROGRAM _I

PORT 1 PROCESS 1SIGNAL I ERROR

drType ::= ?fields, RETURNS fields, I fields: domain-fields:. range-fields,

No domain for PROCESS. In 3. 41.

Text Box: mKAPIRAY[domain-t., range-(51 mKSEQuENCE[domain -tag. range-I]
MKARRAYDESCRIarrayType-varTC] fields ::= [dⁿ. ] I 11 - I ANY

ANY only in dr_ Type. In 42. 50.52.

iiarrayTC::= ?*PACKED ARRAY ?t, OF t,

45setrg.0 ::=?*PACKED SEQUENCE tag^,' OF t

Legal only as last type in a recordTC or un?onTC.

..5itdescriptorTC::=

?LONG DESCRIPTOR FOR varTC'

varTC must be an array type.

. rein: = REF ( varTC'l )

listTC ::= LIST( OF vari C'" I MKLIST ) MKREFl_[target-( varTC ANY )]

range-( varTC REF ANY )]

4.fpointerTC:: = ?LONG ?ORDERED ?*BASE MKPOINTER[target-( varTC I UNSPECIFIED).

*POINTER TO FRAME n ]

POINTER Nsubran(e^, (TO varTC'' I )1

subrange-subrange ] I

Subrange only in a relative C: no typeName' on it.

s.trelativeTC ::= typeName" RELATIVE t MKRELATivElrange-t. baseType-typeName]

i must be a pointer or descriptor type typeName a base pointer type.

Text Box: - Otherwise there's a compile-time error. -- Ql]. Q(1- ] trash i(not in argBindine). - No defaulting or trash for j.
-- Qa. C[k- 4 leave k =3. Text Box: i: INT.
j: I INT -3-
k 3_
I: INT.-3 !TRASH, m: INT.TRASH ]: recordTC::=?access^o(

?MONITORED RECORD fields" I

t MACHINE DEPENDENT RECORD (mdFields l•fields1 )

tmdFields = [((n pos). !.. : --In 50. 52.

?•access^o t), ...]

,tpos ::= ( e, ?(: e,)) -- In 51. 53.

52 unionTC ::= SELECT tag FROM

(n....= )(fields-⁰ I md Fields`' *NUM..- ?. ENDCASE

Legal only as last type_I in a recordTC or unionTC

5, ta ::= (n ): Naccess^r, 1
*tCOMPUTED^(tpos" *tOVERLAID )(t I')

In 44. S2. • only in unionTC⁵².

5.,enumTC::= n. I I

MACHINE DEPENDENT ( (n I) (e) n). ^„

«defaultTC::= CHANGEDEFAULTkIdT-t(

t Default-NIL trashOK-FALSII

t e I Default- INLINE IN e. trash() -FALSE] _I

*t e 1 TRASH i Default-1NLINE A IN e. trashOK -TRUE] 1

*t TRASH Default-t.Trash, trashOK -TRUE] )

defaultTC legal any as the type tn a dee) in a body' or field" tn: t ^re3 in a TYPE binding'. or NEW_ Note the terminal I. .TRASH may be written as NULL

Symbols	Rules	Explanation
Symbols