The Avail Programming Language

Quick Start

This quick start guide is targeted at experienced programmers who want to get started writing Avail as quickly as possible. No special effort has been expended to ensure that the material herein is suitable for those with a limited or nonexistent programming background. Furthermore, the quick start guide assumes that you have already installed and configured an Avail environment suitable for development of personal projects, and that you know how to launch the workbench.

The quick start guide endeavors to use code patterns as often as possible to engage the phenomenal pattern finding abilities of the human brain. For the fastest possible start, I recommend that you study the code samples to get a feel for the language but skim the text as much as possible, concentrating on the text only when samples don't click for you. You may want to read sections that don't contain code samples more carefully, since these sections are likely to answer technical questions about the language at a high level. Some sections end with a quick reference subsection; you can generally skip these subsections completely on a first pass.

Modules

Modules are where Avail code lives. Modules are organized recursively into roots and packages, and every package is positioned recursively within a root. Modules and packages both end with the .avail file extension, but modules are regular files whereas packages are directories. Each package contains an eponymous module, called the representative, which specifies the exports of that package.
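
The naming rules above can be modeled in a few lines. The following Python sketch is not part of Avail: the helper names `is_avail_name` and `representative_of` and the package name `Geometry` are hypothetical, and the code only manipulates path names without touching the file system.

```python
from pathlib import PurePosixPath

def is_avail_name(path):
    """Modules and packages both carry the .avail extension."""
    return PurePosixPath(path).suffix == ".avail"

def representative_of(package):
    """Return the path of a package's representative: the module inside
    the package directory that shares the package's name."""
    package = PurePosixPath(package)
    if not is_avail_name(package):
        raise ValueError(f"not an Avail package name: {package}")
    return package / package.name

# A hypothetical package directly inside the myproject root.
rep = representative_of("myproject/Geometry.avail")
print(rep)  # myproject/Geometry.avail/Geometry.avail
```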

Before you can start coding, you will need to create a root for your own Avail packages and modules. Set the AVAIL_ROOTS environment variable to something like this:

myproject=$HOME/.avail/repos/myproject.repo,$HOME/myproject;avail=$HOME/.avail/repos/avail.repo,$INSTALL/src/avail;examples=$HOME/.avail/repos/examples.repo,$INSTALL/src/examples

This path contains three (3) module root specifications:

Root Description
avail The standard library.
examples The standard examples.
myproject Your exploratory project, where your own Avail packages and modules go.
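
To make the shape of the AVAIL_ROOTS value concrete, here is a small Python sketch that splits it apart. It is only an illustration, not how the workbench reads the variable. I assume that root specifications are separated by semicolons, that each has the form name=first path,second path, and that the first path is the root's repository file while the second is its source path; the function name `parse_avail_roots` is hypothetical.

```python
def parse_avail_roots(value):
    """Split an AVAIL_ROOTS-style string into
    {root name: (repository path, source path)}."""
    roots = {}
    for spec in value.split(";"):
        if not spec:
            continue  # tolerate a trailing semicolon
        name, _, paths = spec.partition("=")
        repository, _, source = paths.partition(",")
        roots[name] = (repository, source)
    return roots

example = (
    "myproject=$HOME/.avail/repos/myproject.repo,$HOME/myproject;"
    "avail=$HOME/.avail/repos/avail.repo,$INSTALL/src/avail;"
    "examples=$HOME/.avail/repos/examples.repo,$INSTALL/src/examples"
)
roots = parse_avail_roots(example)
for name, (repository, source) in roots.items():
    print(name, "->", source)
```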

Obviously you can choose a name other than myproject or a path other than $HOME/myproject. There is no need for the last component of your root's source path to match the name of the root. For the rest of this guide, I will assume that you called the root myproject.

Once you have updated AVAIL_ROOTS, relaunch the Avail workbench. You can now populate your myproject root with modules and packages. Use a shell or text editor to create a module called "Exploration.avail" directly inside myproject, then switch focus back to the workbench and hit F5 (or choose the "Refresh" item from the "Build" menu) to update the module view. You should be able to find Exploration now under your myproject root. You will double-click Exploration whenever you want to recompile it.

Anatomy of a Module

Module Header

The module header specifies linkage information for the module. It supports a small variety of keywords, each of which leads a section. The following sections are most likely to be immediately useful:

Keyword Section Description
Module module name section Declares the name of the module.
Uses private imports section Declares the private imports of the module.
Extends extended imports section Declares the re-exported imports of the module.
Names introduced names section Declares all names introduced and exported by the module.
Entries entry points section Declares all entry points of the module.

Module Body

The module body is where code lives. When you are experimenting with Avail, you will put your code into the body of your Exploration module.

Using the Standard Library

In order to gain access to the native Avail syntax and the extensive functionality of the standard library, be sure to include "Avail" in the private imports section of the module header. This section begins with the keyword Uses.

This is very important — much more important even than including the standard libraries of other programming languages. Since Avail's native syntax is supplied by the standard library, and not built into the compiler, you won't even be able to use literals, define constants, declare variables, or write statements without importing the standard library. So don't forget this critical step.

Tokenization

Module bodies are fully tokenized prior to parsing. The parser operates on tokens to produce unambiguous top-level sends of macros and methods.

There are four (4) types of tokens, described below using regular expressions and Unicode categories:

Token Type Scanning Rule
keyword [\p{L}\p{Nl}][\p{Cf}\p{L}\p{Mc}\p{Mn}\p{Nd}\p{Nl}\p{Pc}]*
string literal "(?:\\"|[^"])*?"
nonnegative integer literal \p{Nd}+
operator \P{Cn} (and does not match a previous rule)

Keyword Tokens

Keyword tokens may be used as identifiers, i.e., names of constants and variables. Keyword tokens are read greedily.

String Literals

The Avail compiler has built-in support for scanning string literals. String literals are read non-greedily.

Nonnegative Integer Literals

The Avail compiler has built-in support for scanning nonnegative integer literals, e.g., those described by the type whole number. Nonnegative integer literals are read greedily, and can therefore denote arbitrarily large finite values, not just ℤ/(2³²) or ℤ/(2⁶⁴), i.e., the integers modulo 2³² or 2⁶⁴.

Operator Tokens

Every other non-whitespace character is scanned as a single operator token, so ::= produces three (3) tokens, not one (1).

Whitespace

Whitespace (\p{Z}*) is permitted to appear before or after any token, and must appear between two distinct greedily read tokens of the same type in order to distinguish them as separate tokens.

Intervening whitespace is not necessary to distinguish adjacent tokens of different token types.
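
The scanning rules are easier to absorb when you can run them. The Python sketch below models the four token types with the standard `unicodedata` module. It is a model of the table above, not the Avail compiler's scanner. It assumes that tabs and line breaks count as whitespace in addition to \p{Z}, it treats an unterminated quotation mark as an operator, and it ignores comments. All function names are hypothetical.

```python
import re
import unicodedata

# String literal rule taken from the token table.
STRING_LITERAL = re.compile(r'"(?:\\"|[^"])*?"')
# Non-letter categories allowed after the first character of a keyword.
KEYWORD_CONTINUE = {"Cf", "Mc", "Mn", "Nd", "Nl", "Pc"}

def category(ch):
    return unicodedata.category(ch)

def is_keyword_start(ch):
    return category(ch).startswith("L") or category(ch) == "Nl"

def is_keyword_part(ch):
    return category(ch).startswith("L") or category(ch) in KEYWORD_CONTINUE

def is_whitespace(ch):
    # \p{Z} per the text; str.isspace() adds tabs and line breaks (assumption).
    return category(ch).startswith("Z") or ch.isspace()

def tokenize(source):
    """Return (token type, text) pairs, trying the rules in table order."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if is_whitespace(ch):
            i += 1
            continue
        string_match = STRING_LITERAL.match(source, i) if ch == '"' else None
        if is_keyword_start(ch):
            j = i + 1
            while j < len(source) and is_keyword_part(source[j]):
                j += 1  # keywords are read greedily
            kind = "keyword"
        elif string_match:
            j = string_match.end()
            kind = "string literal"
        elif category(ch) == "Nd":
            j = i + 1
            while j < len(source) and category(source[j]) == "Nd":
                j += 1  # digits are read greedily
            kind = "nonnegative integer literal"
        else:
            j = i + 1  # any other character is a one-character operator
            kind = "operator"
        tokens.append((kind, source[i:j]))
        i = j
    return tokens

for token in tokenize('x ::= 0;'):
    print(token)
```

Running it on x ::= 0; yields a keyword, three separate operator tokens for ::=, an integer literal, and a final operator. Print:"Hi" splits into three tokens without any whitespace, whereas abcd stays a single keyword unless whitespace separates ab from cd.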

Tokens carry their leading ("_'s⁇leading whitespace") and trailing whitespace ("_'s⁇trailing whitespace"), meaning that macros can react to whitespace if they are so inclined. This is what allows 6.4 to be recognized as a literal double, but not 6 . 4. This mechanism allows macros to decide whether whitespace is significant situationally, rather than the compiler enforcing a universal policy.

Comments

Block comments begin with /* and end with */. Unlike in C, C++, or Java, block comments nest, allowing you to comment out sections of code that happen to contain other block comments. Block comments are permitted wherever whitespace would be permitted.
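
Nesting is the one place where habits from C-family languages mislead. The Python sketch below strips nested block comments by tracking a depth counter; it is a model of the rule, not the compiler's implementation, and the function name `strip_block_comments` is hypothetical. As a simplification it does not treat string literals specially, so a /* inside a string would be misread.

```python
def strip_block_comments(source):
    """Remove /* ... */ comments, honoring nesting via a depth counter."""
    kept = []
    depth = 0
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair == "/*":
            depth += 1          # entering a (possibly nested) comment
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1          # leaving the innermost open comment
            i += 2
        else:
            if depth == 0:
                kept.append(source[i])
            i += 1
    if depth != 0:
        raise ValueError("unterminated block comment")
    return "".join(kept)

# The inner comment does not end the outer one, as it would in C.
print(repr(strip_block_comments("a /* x /* y */ z */ b")))  # 'a  b'
```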

There are no special end-of-line comments.

Parsing

The compiler parses a module body by attempting to convert sequences of tokens into unambiguous top-level sends of statement-valued macros and ⊤-valued methods. Whenever such a send is recognized, the compiler evaluates it immediately, allowing it to perform its side effects in the environment of the module undergoing compilation. These side effects are how new Avail elements are introduced, like method definitions, macro definitions, semantic restrictions, grammatical restrictions, etc. Once an Avail element has been introduced, it can be used immediately, i.e., in the next top-level statement.

Types

Avail is statically typed, and more strongly typed than any traditional programming language. Every value has a most specific type that is distinct from the most specific type of every other value. This type is called the instance type (referred to as a singleton type in some programming language literature). For example, 5's type and "Hello!"'s type are both instance types.

Explicit type annotations are required for variable declarations and block argument declarations. They are permitted in some other places, like label declarations and return type declarations. Otherwise types are inferred from context.

Every value is an instance of infinitely many types, so Avail trivially supports multiple inheritance. Avail's types are organized into an algebraic lattice, so any two types have a most specific more general type (the type union) and a most general more specific type (the type intersection).
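
If the lattice vocabulary is unfamiliar, a toy model may help. In the Python sketch below, finite sets of values stand in for types: subset plays the role of subtype, a one-element set plays the role of an instance type, and set union and intersection play the roles of type union and type intersection. This is only an analogy under those assumptions; Avail's actual types are far richer than finite sets, and none of these names exist in Avail.

```python
def instance_type(value):
    """The most specific 'type' of a value in this toy model."""
    return frozenset([value])

five = instance_type(5)
hello = instance_type("Hello!")

small = frozenset(range(0, 10))      # 0 through 9
even = frozenset(range(0, 20, 2))    # even numbers below 20

union = small | even                 # most specific more general type
intersection = small & even          # most general more specific type

print(five <= small)                 # 5's instance type is a subtype of small
print(sorted(intersection))
```

The value 5 is an instance of `five`, of `small`, of `union`, and of every other set that contains it, which mirrors the statement that every value is an instance of many types at once.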

Every type is denotable in Avail itself, i.e., there are no types intelligible to the compiler which cannot be mentioned explicitly by an Avail programmer. Additionally, every type is itself a value, whose own type is one of Avail's infinitely many metatypes. Metatypes are organized by the law of metacovariance: given two types A and B, A is a subtype of B iff A's type is a subtype of B's type.

Types Quick Reference
Message Description
"number" type shared by all numbers
"integer" type shared by all finite integers
"float" type shared by all floats
"double" type shared by all doubles
"boolean" type shared by true and false
"tuple" type shared by all tuples
"string" type shared by all strings
"set" type shared by all sets
"map" type shared by all maps
"_'s⁇type" get instance type of value
"_'s⁇instance" get sole instance of instance type
"_⊆_" subtype
"_∪_" union
"_∩_" intersection
"_∈_" instance of
"_`?→_†" cast

Literals

Unlike traditional programming languages which have fixed syntax for a small variety of literal types, Avail's macros permit values of any data type to be literalized. Macros act on phrases at compile time, and can execute arbitrary user code to effect substitutions, thereby allowing any computable value to be literalized. It is therefore not possible to give an exhaustive list of literal formats.

Built-in Literals

String literals and nonnegative integer literals are the only literal formats built into the compiler.

Floating-point Literals

Floating-point literals must include the fractional part. Exponential notation is optional. If the literal ends in f (U+0066), then a single-precision floating-point number is indicated; otherwise, a double-precision floating-point number is indicated.
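
The rule above can be summarized as a single regular expression. The Python sketch below classifies a piece of text as a single-precision literal, a double-precision literal, or neither. Keep in mind that Avail builds these literals with macros over tokens rather than with a regular expression, so this only models the prose; the names `FLOAT_LITERAL` and `classify_float_literal` are hypothetical.

```python
import re

# digits '.' digits, optional exponent with optional sign, optional 'f'
FLOAT_LITERAL = re.compile(r"\d+\.\d+(?:[eE][+-]?\d+)?(f?)")

def classify_float_literal(text):
    """Return 'float', 'double', or None if text is not a literal."""
    match = FLOAT_LITERAL.fullmatch(text)
    if match is None:
        return None
    return "float" if match.group(1) == "f" else "double"

for text in ["6.4", "6.4f", "1.5e-3", "6", "6."]:
    print(text, "->", classify_float_literal(text))
```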

Negative Numeric Literals

Negative numeric literals are constructed using the macro "-_".

Boolean Literals

The boolean type comprises the values true and false (written thus).

Null Literals?

Avail does not expose a null value for use by a programmer. An unassigned variable can be used for a similar purpose, but it does not travel as conveniently as a traditional null.

Literals Quick Reference
Message Description
"…#.…#«f»?" floating-point literal constructor
"…#.…#…" floating-point literal constructor (positive exponential notation without sign)
"…#.…#e|E«+|-»!…#«f»?" floating-point literal constructor (positive exponential notation with sign)
"-_" negative numeric literal constructor
"true" value representing truth
"false" value representing falsehood

Constants

Avail features a rich collection of immutable data types and a plethora of persistent operators. It is very natural to bind the results of expressions to constants, so constant definitions appear much more commonly in Avail code than variable declarations. As the name suggests, constants are written once, upon creation, and never change during their lifetime.

Module Constants

If a constant definition appears as a top-level statement, then a module constant is created. This constant can be referenced from any lexically subsequent statement in the module; it never goes out of scope.

Local Constants

If a constant definition appears within a block expression, then a local constant is created. This constant can be referenced from any lexically subsequent statement or expression in the same block. It goes out of scope after the block expression ends.

Constants Quick Reference
Message Description
"…::=_;" constant definition

Variables

Despite Avail's many functional programming features, Avail is an imperative programming language. As such, it permits the declaration of variables—holders of state that can be written as well as read and can thus change over time.

Variables can be initialized upon declaration. A variable that is not explicitly initialized is deemed unassigned until it is written. Reading from an unassigned variable will raise a cannot-read-unassigned-variable exception.

An unassigned variable can be used for a purpose similar to a traditional null value (which Avail does not support).
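
The assigned and unassigned states can be pictured as a small state machine. The Python sketch below models a variable as a cell that refuses to be read while unassigned. It is an analogy, not Avail code: the class `Variable`, its method names, and the Python exception `CannotReadUnassignedVariable` are stand-ins chosen for this illustration.

```python
class CannotReadUnassignedVariable(Exception):
    """Stand-in for Avail's cannot-read-unassigned-variable exception."""

_UNASSIGNED = object()  # private marker; never visible to callers

class Variable:
    def __init__(self, value=_UNASSIGNED):
        self._value = value

    @property
    def is_assigned(self):
        return self._value is not _UNASSIGNED

    def read(self):
        if not self.is_assigned:
            raise CannotReadUnassignedVariable()
        return self._value

    def write(self, value):
        self._value = value

    def clear(self):
        """Make the variable unassigned again."""
        self._value = _UNASSIGNED

v = Variable()          # declared without initialization
print(v.is_assigned)    # False
v.write(5)
print(v.read())         # 5
v.clear()
print(v.is_assigned)    # False
```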

Module Variables

If a variable declaration appears as a top-level statement, then a module variable is created. This variable can be referenced from any lexically subsequent statement in the module; it never goes out of scope.

Local Variables

If a variable declaration appears within a block expression, then a local variable is created. This variable can be referenced from any lexically subsequent statement or expression in the same block. It goes out of scope after the block expression ends.

Variables Quick Reference
Message Description
"…:_†;" variable declaration
"…:_†:=_;" variable declaration with initialization
"…" variable use (just say its name)
"…:=_;" variable assignment
"_↑is assigned" true if variable is assigned
"_↑is unassigned" false if variable is assigned
"Clear_↑" make variable unassigned
"_↑++" increment (integer variables only)
"_↑--" decrement (integer variables only)
"cannot-read-unassigned-variable exception" raised when reading unassigned variable

Blocks

Blocks are lexical specifications of functions. A function is a body of code whose execution is deferred until the function is applied (or invoked or executed or called, depending on what terminology is familiar to you).

Blocks without Parameters

Blocks for arity-0 functions (i.e., functions without arguments) are just bodies of code inside left square bracket [ (U+005B) and right square bracket ] (U+005D).

Blocks with Parameters

Block parameters are separated by commas , (U+002C) and precede a vertical line | (U+007C).

Return Types

Blocks infer their return types from their final expressions, but you can also annotate the return type of a block explicitly. This annotation is only necessary if you wish to weaken the return type of the block (because the compiler will always infer the strongest possible type).

The return type of a block that ends with a statement is ⊤, Avail's most general type. A return type of ⊤ means that a block returns control to its caller without producing a value.

⊥ is Avail's most specific type. A return type of ⊥ means that a block never returns control to its caller, i.e., because it always raises an exception, loops forever, switches continuations, etc.

Semicolons

By now you might be confused by the semicolons. When are they needed, and when not?

The body of a block is a series of statements followed by an optional expression. Statements are ⊤-valued, and must end with a semicolon ; (U+003B). The optional final expression produces a value, and must not end with a semicolon, even if ⊤-valued.

Summary: Every "line of code" in a block except the last one definitely ends with a semicolon. If the last "line" is a value-producing expression, then don't put a semicolon after it; otherwise, throw a semicolon in at the end.

Labels

Following the parameters section, an optional label definition is permitted. The label represents a continuation that resumes at the top of the block, i.e., the beginning of the applied function. The return type of the label is optional, but should be included whenever you plan to exit the continuation using the label. If omitted, the label return type is assumed to be ⊥ (because continuations are contravariant by return type).

Defining a label permits use of several transfer-of-control mechanisms:

Message Description
"Restart_" Restart the continuation (like traditional continue)
"Exit_with_" Exit the continuation with a specific value (usually like traditional return)
"Exit_" Exit the continuation without producing a value (like traditional break)

Lexical Closures and Outers

Blocks behave as lexical closures of bindings declared in outer scopes. These bindings comprise arguments, labels, primitive failure reasons, local and module constants, and local and module variables. The bindings declared in an outer scope and captured by lexical closure are called outers. Mutable outers, i.e., module and local variables, may be rebound (by assignments or other mechanisms).

Blocks in Control Structures

Because blocks are just lexical specifications of functions, they may appear wherever functions may appear. This means that blocks can occur as arguments to standard control structures, like "If_then_" and "While_do_". This makes Avail ideal for higher-order programming.

And because block labels represent continuations, blocks can even be used to implement low-level constructs like loops.
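
For readers who have not worked with continuations, the following Python sketch imitates what restarting and exiting a labeled block do, using exceptions to transfer control. It is an analogy only: Python has no first-class continuations, and the names `Label`, `apply_with_label`, `count_to`, and `first_even` are hypothetical. Restarting re-enters the body from the top with the same arguments; exiting leaves the body, optionally with a value.

```python
class _Transfer(Exception):
    def __init__(self, label, value=None):
        super().__init__()
        self.label = label
        self.value = value

class _Restart(_Transfer):
    pass

class _Exit(_Transfer):
    pass

class Label:
    def restart(self):
        raise _Restart(self)

    def exit_with(self, value):
        raise _Exit(self, value)

    def exit(self):
        raise _Exit(self)

def apply_with_label(body, *args):
    """Apply body(label, *args), honoring restarts and exits of its label."""
    label = Label()
    while True:
        try:
            return body(label, *args)
        except _Restart as transfer:
            if transfer.label is not label:
                raise  # belongs to an enclosing block
        except _Exit as transfer:
            if transfer.label is not label:
                raise
            return transfer.value

def count_to(limit):
    counter = 0  # mutable outer captured by the body

    def body(label):
        nonlocal counter
        counter += 1
        if counter < limit:
            label.restart()  # loop by restarting the block
        return counter

    return apply_with_label(body)

def first_even(values):
    def body(label):
        for value in values:
            if value % 2 == 0:
                label.exit_with(value)  # early exit with a value

    return apply_with_label(body)

print(count_to(3))             # 3
print(first_even([1, 3, 4]))   # 4
```

Note how `count_to` builds a loop out of nothing but a restartable block and a captured variable.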

Evaluating a Block Immediately

It is sometimes useful to evaluate a block immediately after closing it. This is especially useful when you just want to introduce a new scope for local constants and variables, or when you want to use a label to exit an inner scope without exiting the outermost block.

Blocks Quick Reference
Message Description
"\ \|[\ \|««…:_†§‡,»`|»?\ \|«Primitive…#«(…:_†)»?§;»?\ \|«$…«:_†»?;§»?\ \|«_!§»\ \|«_!»?\ \|]\ \|«:_†»?\ \|«^«_†‡,»»?" block definition
"function" type shared by all functions
"Restart_" Restart the continuation (like traditional continue)
"Exit_with_" Exit the continuation with a specific value (usually like traditional return)
"Exit_" Exit the continuation without producing a value (like traditional break)
"_(«_‡,»)" apply a function (with or without arguments)

Methods

Methods are named operations. Method names are called messages. In traditional languages, messages are typically constrained to be standard identifiers. Avail's story is rather different.

Message Pattern Language

Messages are specified using a pattern language that is commonly understood by the various method definers, like "Method_is_". A message teaches the parser how to recognize that some new syntax represents an invocation of the named method.

The chart below describes the pattern language of messages, omitting some patterns that are only useful for defining macros.

  1. The Pattern column shows a single pattern abstractly, with metacharacters in blue monospace. Bold patterns are eponymous and descriptive, i.e., keyword stands for a keyword token. Italic patterns are shortened to conserve horizontal space, and elaborated upon in the description.
  2. The Example Message column shows a message that embeds the pattern at least once.
  3. The Example Signature shows the types of the arguments accepted by the method, in their natural order.
  4. The Example Use shows an expression that sends the message.
  5. The Description tersely describes the feature that is activated by using the pattern.

Throughout the chart, the red arrow pointing downwards then curving rightwards (U+2937) indicates the beginning of a blue monospace region that corresponds to the illustrated pattern.

Pattern Example Message Example Signature Example Use Description
keyword "integer" <> integer match a keyword token case-sensitively
operator "" <> match an operator token
p⁇ "_occurrences⁇of_" <whole number, any> 7 of 9 OR 7 occurrences of 9 match 0 or 1 occurrence of a pattern p
t1|..|tn "If|if_then_else_" <boolean, []→⊤, []→⊤> If a ≥ 1 then [a] else [0] OR if a ≥ 1 then [a] else [0] match any token t1 .. tn of an alternation
«p1|..|pn» "«a lion|a tiger|a bear|oh my»" <> a lion OR a tiger OR a bear OR oh my match any pattern p1 .. pn of an alternation
p~ "«red alert»~" <> red alert OR RED ALERT OR rEd AlErT OR etc. match a pattern p case-insensitively
_ "Print:_" <string> Print: "Hello!" match a type-safe argument expression, pass it as an argument
_↑ "`↑_↑" <variable> var match a variable use, pass the variable itself as an argument
_† "_`?→_†" <any, type> n ?→ [10..20] match a type expression, evaluate it in the module scope, pass result as an argument
… "…::=_;" <token, expression phrase> x ::= 0; match a keyword token, pass it as an argument
…# "…#.…#«f»?" <literal token⇒integer, literal token⇒integer, boolean> 6.4 match a type-safe literal token, pass it as an argument
…! …!" <token> ¢= match an operator token, pass it as an argument
«p» "sum«_»" <number+> sum 1 2 3 match ≥0 of pattern p, pass accumulated arguments as a single tuple
«p1‡p2» "{«_‡,»}" <any*> {2, 4, 6} match n≥0 of pattern p1 interleaved with n-1 occurrences of pattern p2, pass accumulated arguments as a single tuple
«p»? "…#.…#«f»?" <literal token⇒integer, literal token⇒integer, boolean> 6.4 OR 6.4f match 0≤n≤1 occurrence of pattern p, pass n=1 as an argument
«p»# "«please‡,»#stop" <whole number> stop OR please stop OR please, please, please stop match n occurrences of pattern p, pass n as an argument
«p1|..|pn»! "«a lion|a tiger|a bear|oh my»!" <[1..4]> a lion OR a tiger OR a bear OR oh my match any pattern p1 .. pn of an alternation, pass the pattern's ordinal, e.g., 1 for a lion, 2 for a tiger, etc.
`m "Yes`…`?" <> Yes…? escape the following metacharacter m, disabling its special meaning (otherwise … and ? each would have had special meaning)
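
The escaping rule in the last row is mechanical enough to model directly. The Python sketch below walks a message and reports which metacharacters remain active after backquote escapes are taken into account. It is a reading aid, not Avail's message splitter: the set of metacharacters is an assumption that covers only the ones appearing in the chart, and the function names are hypothetical.

```python
# Metacharacters that appear in the chart above (assumed incomplete).
METACHARACTERS = set("_…⁇|«»~↑†#!?‡")
ESCAPE = "`"

def split_message(message):
    """Return (kind, character) pairs, where kind is 'meta' or 'literal'."""
    parts = []
    i = 0
    while i < len(message):
        ch = message[i]
        if ch == ESCAPE and i + 1 < len(message):
            # The escaped character loses any special meaning.
            parts.append(("literal", message[i + 1]))
            i += 2
        elif ch in METACHARACTERS:
            parts.append(("meta", ch))
            i += 1
        else:
            parts.append(("literal", ch))
            i += 1
    return parts

def active_metacharacters(message):
    return [ch for kind, ch in split_message(message) if kind == "meta"]

print(active_metacharacters("Print:_"))    # ['_']
print(active_metacharacters("Yes`…`?"))    # []
print(active_metacharacters("_`?→_†"))     # ['_', '_', '†']
```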

Method Overriding

The ordering of a method's parameter types is called its signature. Methods can be overridden for any combination of parameter types. The names of the parameters are not considered part of the method's signature.

Method Dispatch

Every argument of a message send participates in the resolution of a method. The method definition selected is the most specific definition available for the supplied combination of arguments. If no method definition is uniquely most specific at runtime, then the dispatch mechanism raises an ambiguous-method-definition exception.
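
The dispatch rule can be sketched outside Avail. In the Python model below, a method is a list of (signature, body) pairs, Python classes stand in for Avail types, and a definition wins only if its signature is at least as specific as every other applicable signature in every argument position. This is a simplified illustration, not Avail's dispatch implementation; `dispatch`, `at_least_as_specific`, and the Python exception `AmbiguousMethodDefinition` are hypothetical names.

```python
class AmbiguousMethodDefinition(Exception):
    """Stand-in for Avail's ambiguous-method-definition exception."""

def at_least_as_specific(sig_a, sig_b):
    """True if every type in sig_a is a subtype of its peer in sig_b."""
    return all(issubclass(a, b) for a, b in zip(sig_a, sig_b))

def dispatch(definitions, *args):
    # Every argument participates in selecting the applicable definitions.
    applicable = [
        (sig, body) for sig, body in definitions
        if len(sig) == len(args)
        and all(isinstance(arg, t) for arg, t in zip(args, sig))
    ]
    if not applicable:
        raise LookupError("no applicable definition")
    # Keep the definitions that are most specific among the applicable ones.
    best = [
        (sig, body) for sig, body in applicable
        if all(at_least_as_specific(sig, other) for other, _ in applicable)
    ]
    if len(best) != 1:
        raise AmbiguousMethodDefinition(args)
    return best[0][1](*args)

definitions = [
    ((object, object), lambda a, b: "any, any"),
    ((int, object), lambda a, b: "int, any"),
    ((object, int), lambda a, b: "any, int"),
]

print(dispatch(definitions, "a", "b"))  # any, any
print(dispatch(definitions, 1, "b"))    # int, any
```

Sending the arguments 1 and 2 to this method is ambiguous: both the (int, object) and the (object, int) definitions apply, and neither is more specific than the other, so `dispatch` raises `AmbiguousMethodDefinition`.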

WORK IN PROGRESS