## Semantic Restrictions

Consider an operation called `"some element of_"` that accepts a set and answers an arbitrary element. Clearly the parameterization of the set should not impact the meaningfulness of the operation. It is therefore desirable to define the operation with maximum generality: it may accept any set whatsoever and will answer an arbitrary value from it.

```
Method "some element of_" is
[
    aSet : set of any
|
    /* Convert the set to a tuple, thereby ordering the elements in some
     * (arbitrary) manner. Answer the first element. */
    aSet→tuple [1]
] : any;

/* Note that "aNumber" is typed as "any". If we changed the annotation to
 * "natural number", then the compiler would reject the operation (because "any"
 * is not a subtype of "natural number"). */
someNumbers : set of natural number := {1, 2, 3};
aNumber : any := some element of someNumbers;
Assert: aNumber ∈ someNumbers;
```

`someNumbers` is a set comprising only natural numbers, i.e., positive integers. The method accepts any set, and `set of natural number` is a subtype of `set of any`, so the method is certainly applicable. Intuitively then, an arbitrary element chosen from this set will be a natural number. But the method's return type is `any`, so the compiler cannot reach the same conclusion as human intuition. To improve the compiler's inferential capabilities, another method can be introduced that overrides the generic behavior for instances of the subtype.

```
/* Both the parameter type and return type are subtypes of the original method,
 * so this constitutes an override. */
Method "some element of_" is
[
    aSet : set of natural number
|
    aSet→tuple [1]
] : natural number;

/* Because "someNumbers" is typed as "set of natural number", the compiler infers
 * that the send of "some element of_" will invoke the override rather than the
 * general method. Therefore the result is known to be a natural number, and the
 * comparison operator "_≤_" may be used. */
someNumbers : set of natural number := {1, 2, 3};
aNumber : natural number := some element of someNumbers;
Assert: aNumber ≤ 3;
```

The ability to define such an override is extremely powerful, but it should set off alarm bells that both methods have the same implementation. This override clearly exists solely for the benefit of the compiler. Without it, the compiler (apparently) cannot deduce that an element arbitrarily chosen from a set of natural numbers can only be a natural number. This is not necessarily so onerous — until one wishes to select an element from a `set of string`!

```
Method "some element of_" is
[
    aSet : set of string
|
    aSet→tuple [1]
] : string;

/* The next statement says it all … */
someStrings : set of string := {"this", "is", "old", "now"};
aString : string := some element of someStrings;
Assert: aString[1] ∈ {¢t, ¢i, ¢o, ¢n};
```

This situation arises naturally and continually when defining and using operations on parameterized types. Were this the only mechanism that Avail provided, the language would prove grossly unsatisfactory for practical use. A programmer would be forever duplicating code in order to gain the benefit of type inference. It would be far better if a programmer could define an operation once and for all at the maximum appropriate level of generality, yet have any specific usage of the operation benefit from local information. To determine the legality of an operation, i.e., a message send, the compiler must already know the types of its arguments, so it would be convenient if those types were also available for ascertaining the type of value that the operation produces.

In Avail, local information can be leveraged at the site of a particular message send through the definition of a semantic restriction — a function that is applied by the compiler to the inferred types of the arguments of the associated method. A single semantic restriction affords several opportunities:

• It can strengthen the result type of a message send if the argument types indicate that such strengthening is both possible and warranted.
• It can reject a message send as illegal if one or more of its arguments are deemed inappropriate based on an assessment of their types.
• It can encode information about the semantics of an operation in such a way that the compiler can mechanically operate upon it. In a sense, it assists the compiler in leveraging the meaning of an operation.
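The shape of this mechanism can be sketched outside Avail. The following Python model (all names are hypothetical, invented for illustration) treats a semantic restriction as an ordinary function that the compiler applies to the inferred *types* of the arguments, and that may either strengthen the result type or reject the call site:

```python
# Minimal model of a semantic restriction: a function applied by the
# "compiler" to argument types, not argument values. All names here are
# hypothetical, not part of any real API.

class RejectParse(Exception):
    """Raised by a restriction to reject a message send at compile time."""

def set_of(element_type):
    """Model Avail's parametric "set of T" as a simple tagged pair."""
    return ("set", element_type)

def some_element_restriction(set_type):
    """Restriction for "some element of_": strengthen the generic result
    type ("any") to the element type of the argument's set type."""
    tag, element_type = set_type
    if tag != "set":
        raise RejectParse("expected a set type")
    return element_type

# The compiler would invoke the restriction on the argument's inferred type:
assert some_element_restriction(set_of(int)) is int
assert some_element_restriction(set_of(str)) is str
```

Applied to the model of `set of natural number`, the restriction answers the model of `natural number`; applied to `set of string`, it answers `string`.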

Consider the following redesign of `"some element of_"`, and assume that no overrides are defined:

```
/* This is the sole implementation of "some element of_". The overrides are
 * gone — and presently unnecessary. */
Method "some element of_" is
[
    aSet : set of any
|
    aSet→tuple [1]
] : any;

/* This semantic restriction applies only to "some element of_". Notice that the
 * argument, "aSetType", is annotated with a metatype; the actual argument will
 * be bound to "set of any" or one of its infinitely many subtypes. */
Semantic restriction "some element of_" is
[
    aSetType : (set of any)'s type
|
    /* Remember that a set type is parametric on the type of its elements.
     * "_'s⁇element type" accepts a set type and answers the value of this
     * type parameter. */
    aSetType's element type
] : type;

/* The initializations of "aNumber" and "aString" now both work as desired. */
someNumbers : set of natural number := {1, 2, 3};
aNumber : natural number := some element of someNumbers;
someStrings : set of string := {"much", "better"};
aString : string := some element of someStrings;
```

How does this work? When the compiler parses the expression `some element of someNumbers`, it determines that it represents a send of the message `"some element of_"` to the value of the argument `someNumbers`. The argument happens to be a variable use phrase, so its inferred type is `set of natural number`. The compiler then identifies which semantic restrictions apply. In this case, only a single semantic restriction exists, and it is applicable for any value that is an instance of `(set of any)'s type`. By the law of metacovariance, `set of natural number` being a subtype of `set of any` necessitates that `(set of natural number)'s type` is a subtype of `(set of any)'s type`, and therefore `set of natural number` is an instance of `(set of any)'s type`. The semantic restriction is thus applicable, so the compiler applies it: it runs the user-supplied function on the type inferred for `someNumbers`. The function answers `natural number`, the element type of `set of natural number`. Finally, the compiler computes the type intersection of the declared return type of `"some element of_"` (`any`) and the type answered by the semantic restriction (`natural number`) to obtain `natural number`. This is the type inferred for the expression `some element of someNumbers`. It is trivially compatible with the declared type of `aNumber`, so the initialization is permissible.

Consideration of the expression `some element of someStrings` is analogous. Eventually the compiler invokes the semantic restriction with the value `set of string` and it answers `string`. So the type inferred for the expression is ultimately `string`, meaning that the initialization of `aString` is valid. A single implementation of `"some element of_"` and a single semantic restriction take the place of the infinitely many overrides previously conjectured to be necessary.

Why does the compiler compute the type intersection of the return type and the semantic restriction's answer? Because multiple semantic restrictions may be specified for the same method and even for the same parameter types. This ensures the modularity of the system. Only the semantic restrictions already defined for a method can be applied to a particular message send. Subsequently loaded modules can introduce new semantic restrictions that will have effect on further message sends.

To make this concrete, consider the following code sample:

```
/* Assume that "some element of_" is implemented and restricted as above. This
 * is an additional semantic restriction. */
Semantic restriction "some element of_" is
[
    aSetType : (set of any)'s type
|
    /* The incomplete type "set" actually has two type parameters: the
     * element type and the allowed cardinality. The allowed cardinality is
     * expressed as an integral type and, when unstated, is assumed to be
     * "whole number" ≝ [0..∞).
     *
     * The predicate is true when the upper bound of this type is zero; that
     * is, when the set type is known to have the empty set, ∅, as its only
     * instance. */
    If ⎡aSetType⎤ = 0 then
    [
        Reject parse, expected:
            "an argument that is not \
            \|guaranteed to be ∅"
    ];
    /* Answer ⊤. By an identity law, this type will disappear in any type
     * intersection with a type other than itself. Why answer ⊤? Because we
     * don't actually want to strengthen the type here. That has already been
     * done for us in the previous semantic restriction. We could answer
     * "any" to get the same effect in this case, but we can always answer ⊤
     * to indicate at a glance that we don't care to strengthen the inferred
     * type. */
    ⊤
] : type;

/* It is clearly invalid to answer an arbitrary element of the empty set: it
 * doesn't have any. Because of the new semantic restriction, the compiler will
 * reject the parse of the following message send and emit a compiler error. */
bogus : any := some element of ∅;
```

Mentally move the initializing declaration of `aNumber` to a point below the introduction of the second semantic restriction. When the compiler parses the expression `some element of someNumbers`, it applies both semantic restrictions. The first yields the value `natural number` and the second yields the value `⊤`. The type intersection of `any`, `natural number`, and `⊤` is `natural number`. The inferred type remains unchanged from before.
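The intersection step can be made concrete with a toy model. In this hypothetical Python sketch, a type is a finite set of allowed values, and a `TOP` sentinel stands in for ⊤ as the identity of intersection; the fold mirrors the compiler's computation of `any` ∩ `natural number` ∩ ⊤:

```python
# Toy model of combining restriction answers by type intersection.
# Types are modeled as frozensets of allowed values; TOP stands in for
# Avail's ⊤. All names are illustrative only.

TOP = None  # sentinel: intersecting with TOP changes nothing

def intersect(a, b):
    """Type intersection; TOP is the identity element."""
    if a is TOP:
        return b
    if b is TOP:
        return a
    return a & b

def inferred_type(declared, restriction_answers):
    """Fold the declared return type together with every restriction's
    answer, mirroring the compiler's type-intersection step."""
    result = declared
    for answer in restriction_answers:
        result = intersect(result, answer)
    return result

any_type = frozenset(range(-10, 11))  # stand-in for "any"
natural  = frozenset(range(1, 11))    # stand-in for "natural number"

# First restriction answers "natural number"; second answers ⊤.
print(inferred_type(any_type, [natural, TOP]) == natural)  # prints True
```

Because ⊤ is the identity of intersection, a restriction that answers ⊤ never weakens the result inferred by its siblings.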

The advantage introduced by the new semantic restriction does not pertain to strengthening of the inferred type. Instead it improves the meaningfulness of the operation by utilizing type information local to a particular message send. It is absurd to ask for one of the elements of the empty set because it does not include any elements. The new semantic restriction codifies this by invoking `"Reject parse,expected:_"` if the argument necessarily describes a type whose only instance is the empty set. This is possible because the incomplete type `set` has a type parameter that constrains the cardinality of instances, and the semantic restriction can query the upper bound of this constraint via `"⎡_⎤"`.
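The rejection logic can likewise be sketched in the same hypothetical Python model, here representing a set type as an element type plus a pair of cardinality bounds, and rejecting exactly when the upper bound is zero:

```python
# Sketch of cardinality-based rejection. A set type is modeled as an
# element type plus [low..high] cardinality bounds. Names are
# illustrative only, not any real API.

class RejectParse(Exception):
    """Raised by a restriction to reject a message send at compile time."""

class SetType:
    def __init__(self, element_type, low=0, high=float("inf")):
        self.element_type = element_type
        self.low, self.high = low, high  # allowed cardinality range

def restrict_some_element_of(set_type):
    # Mirror the predicate ⎡aSetType⎤ = 0: the upper cardinality bound is
    # zero, so the type's only possible instance is ∅.
    if set_type.high == 0:
        raise RejectParse("an argument that is not guaranteed to be ∅")

empty_set_type = SetType(int, low=0, high=0)  # sole instance is ∅
try:
    restrict_some_element_of(empty_set_type)
except RejectParse as e:
    print("rejected:", e)
```

A set type with a nonzero upper bound passes through untouched, matching the tolerance described above for sets that merely *might* be empty.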

Note that the implementation specifies nothing about the lower bound of the cardinality constraint. So, it does not block message sends that might not work correctly, only those that cannot possibly work correctly. But because semantic restrictions are modular, a less tolerant semantic restriction can later be introduced that imposes a burden on the programmer to prove that a set cannot be empty before it can qualify as an argument to `"some element of_"`.

```
/* This is a more cumbersome, but statically safer, semantic restriction that
 * forces the programmer to ensure the nonemptiness of an argument set. */
Semantic restriction "some element of_" is
[
    aSetType : (set of any)'s type
|
    /* Two different error messages are provided as an aid for the poor
     * programmer who has to suffer with this draconian semantic restriction. */
    If ⎡aSetType⎤ = 0 then
    [
        Reject parse, expected:
            "an argument that is not \
            \|guaranteed to be ∅"
    ];
    If ⎣aSetType⎦ = 0 then
    [
        Reject parse, expected:
            "an argument that cannot \
            \|possibly be ∅"
    ];
];
```

The compiler cannot check a semantic restriction for correctness. For example, it is entirely possible to define a semantic restriction on `"some element of_"` that always answers `string`; this will certainly be wrong whenever the message is sent to a `set of natural number`. Why bother with a semantic restriction then? One may as well ask, why bother with programming at all? Semantic restrictions are tremendously convenient and powerful so long as they are coded and tested for correctness.

A semantic restriction allows a programmer to assert that a particular expression yields a type that is stronger than the compiler can otherwise ascertain. In many programming languages, this is accomplished via a type cast, a special syntactic construct that mandates dynamic coercion of a value into a stronger type. Such a cast is necessary for each expression that has an insufficiently strong static type. The run-time value may or may not be checked for conformance to the stronger type. A language that actually checks type casts for validity during program execution is run-time type safe; one that does not is run-time type unsafe.
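For contrast, a checked cast in the conventional style can be sketched in Python (the helper name is hypothetical); every call site that needs the stronger type must repeat the cast, which is precisely the clutter that semantic restrictions eliminate:

```python
# Sketch of a runtime-checked type cast, the conventional alternative to
# semantic restrictions. The helper name is invented for illustration.

def checked_cast(expected_type, value):
    """Coerce `value` to `expected_type`, failing fast on a mismatch, the
    way a run-time-type-safe language validates a cast."""
    if not isinstance(value, expected_type):
        raise TypeError(
            f"expected {expected_type.__name__}, got {type(value).__name__}")
    return value

some_numbers = {1, 2, 3}
# The static type of the picked element is only "object"; the programmer
# must cast at each site to recover the stronger type:
a_number = checked_cast(int, next(iter(some_numbers)))
print(a_number in some_numbers)  # prints True
```

A run-time type unsafe language would skip the `isinstance` check entirely, silently propagating a mistyped value.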

Avail is run-time type safe. When a message send is strengthened by a semantic restriction, the result produced at execution time is checked for conformance to the type answered by the semantic restriction at compile time. This ensures that any type error is detected as early as possible, before it can corrupt subsequent computation. Semantic restrictions are applied automatically by the compiler wherever appropriate, so the programmer need not annotate expressions that must be coerced at run time, thereby reducing syntactic clutter and incidental code complexity.

Avail is first-order type safe. If every semantic restriction is correct, then the compiler guarantees that the program is completely free of type errors. Thus every value computed by an expression will be an instance of that expression's inferred type. Avail is not second-order type safe, since semantic restrictions cannot be checked for correctness — although their implementations are ordinary code and therefore themselves first-order type-safe.
