Metacza: Introduction


‘Metacza’ is short for ‘Meta Compiler with Zero Apprehension’. The language it compiles is also called Metacza, and the language it compiles into is C++, specifically the C++ Template Meta Language. Metacza looks at single statements and translates them one at a time. Because it does not try to fully understand the code, it is called ‘with zero apprehension’.

Metacza is quite a beast of a language: a higher order, purely functional language with closures, dynamic typing, and multiple dispatch. It inherits all these features from C++ Template Meta Programming (C++TMP), which is also such a language, so the compilation is only quite a shallow step. Yet the baroque syntax of C++TMP gave rise to the idea of a nicer syntax that makes using it more fun.

Metacza is designed to be usable as a pre-compiler for C++. Pre-processor and namespace directives are passed through to the C++ output file, making it feasible to write C++ header files in Metacza.

The following sections give an overview of the Metacza language.


The Relation to C++

Before we go into the details of the Metacza language, a few words on its relation to C++. Metacza compiles into a C++ meta language, namely the C++ Template Meta Language. This means that what is a value in Metacza is something beyond (i.e. 'meta') a value in C++:

Values in Metacza are types in C++.

Definitions in Metacza, therefore, translate to typedefs and struct definitions in C++, and function arguments translate to templates of structs. But since you are interested in this topic (you are still reading), you probably know what the C++ Template Meta Language is and that it is actually a functional language. So let us proceed to the Metacza language and compiler itself.

Whenever you think you need to write raw C++ code, you can do that by putting it between %{ and %}. For example:

let
    %{
        enum { value = 10 };
    %}
in
data mynum_tag;

In this example, the enum definition will be put inside the generated struct mynum_tag.
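
As a rough illustration of the effect, the generated C++ might resemble the struct below. This is a hand-written sketch and not actual Metacza output: only the placement of the enum inside struct mynum_tag is taken from the description above, everything else is an assumption.

```cpp
// Hand-written approximation of what `data mynum_tag;` together with the raw
// block could become; the real generated code may contain further members.
struct mynum_tag {
    enum { value = 10 };   // copied verbatim from between %{ and %}
};

// The raw member is visible to ordinary C++ code.
static_assert(mynum_tag::value == 10, "raw enum is a member of mynum_tag");
```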


Command Line

Use metacza --help to get a list of command line options.

You can add command line options in an input file by putting them on the first line after the keyword metacza. Options with parameters must use an equals sign to separate the option from its value; they must not be written as two arguments, although that is supported on the command line.

The initial dashes for command line options may be dropped in the first line.

E.g. if you are writing a file that will always be used with Boost, you may start a file like this:

#! metacza style=boost

#include <boost/mpl/int.hpp>

...

The command line arguments on the first line of the file are parsed similarly to a Unix shell, but currently a little more simply:

Arguments are separated by whitespace. Quotation with single quotes is supported and works just like in a bash shell (no escape characters inside the string). Single quotes used as part of an argument may be escaped using a backslash.

Double quotes and dollar characters will be rejected and must be quoted with single quotes.

E.g., the above could also be written as follows:

#! metacza 'style=boost'

...

Just like in a shell, quotation does not start a new argument, but quoted text is glued to the current argument. To separate arguments, white space is needed. The following is equivalent to the above:

#! metacza style='boost'

...
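
The quoting rules above can be sketched as a small tokenizer. This is not Metacza's actual parser: the function names splitFirstLineArgs and checkFirstLineExamples are hypothetical, and the handling of a backslash that is not followed by a single quote (kept as an ordinary character) is my assumption, since the text does not cover that case.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not Metacza's actual code): splits the option text
// that follows "metacza" on the first line into separate arguments.
inline std::vector<std::string> splitFirstLineArgs(const std::string& text) {
    std::vector<std::string> args;
    std::string current;
    bool started = false;  // true once the current argument has begun
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == ' ' || c == '\t') {
            // Unquoted whitespace is the only thing that separates arguments.
            if (started) {
                args.push_back(current);
                current.clear();
                started = false;
            }
        } else if (c == '\'') {
            // Quoted text is glued to the current argument; no escapes inside.
            const std::size_t close = text.find('\'', i + 1);
            if (close == std::string::npos)
                throw std::invalid_argument("unterminated single quote");
            current.append(text, i + 1, close - i - 1);
            started = true;
            i = close;
        } else if (c == '\\' && i + 1 < text.size() && text[i + 1] == '\'') {
            // Outside quotes, a backslash escapes a single quote.
            current.push_back('\'');
            started = true;
            ++i;
        } else if (c == '"' || c == '$') {
            throw std::invalid_argument("'\"' and '$' must be single-quoted");
        } else {
            current.push_back(c);
            started = true;
        }
    }
    if (started) args.push_back(current);
    return args;
}

// All three spellings from the text yield the single argument "style=boost".
inline void checkFirstLineExamples() {
    const std::vector<std::string> expected(1, "style=boost");
    assert(splitFirstLineArgs("style=boost") == expected);
    assert(splitFirstLineArgs("'style=boost'") == expected);
    assert(splitFirstLineArgs("style='boost'") == expected);
}
```

Calling checkFirstLineExamples() from a test program confirms that the three first lines shown above are treated alike under these rules.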

The Metacza Functional Language

Metacza is a purely functional language. All objects are functions. That is, any definition defines a function. A constant is a nullary function (i.e., one without parameters). And a data type definition introduces a function whose funcall expression maps to itself, meaning the term foo(5), if evaluated, maps to foo(5), making the term foo(5) a new constant of the language.

Metacza is a dynamically typed language, so you do not need to declare parameter types of functions or definitions or return values: by default, every function accepts any type of value.

Kinds

Since Metacza is a dynamically typed language, types are generally not declared or mentioned in the language, but are implicit. However, the arity of a function does matter. The arity of a function is classified using kinds.

_ is the kind of a nullary function, i.e., a simple value that takes no arguments.
_(_) is the kind of a unary function, i.e., a function that takes one argument.
_(_,_) is the kind of a binary function, i.e., a function that takes two arguments.
_(_...) is the kind of a variadic function that takes 0 or more arguments.
_(_(_)) is the kind of a function that takes a function with one argument as an argument.

Kinds may be arbitrarily nested.
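
Since function arguments translate to templates of structs, kinds can be pictured as the shapes of C++ template parameters. The following is only an analogy written by hand; the struct names are made up for this sketch and say nothing about the names Metacza generates.

```cpp
#include <type_traits>

struct aValue {};                                  // kind _
template <class X> struct unaryFn {};              // kind _(_)
template <class X, class Y> struct binaryFn {};    // kind _(_,_)
template <class... Xs> struct variadicFn {};       // kind _(_...)

// kind _(_(_)): the parameter F must itself be a one-argument template.
template <template <class> class F>
struct takesUnaryFn {
    typedef F<aValue> type;   // applying F only works because F has kind _(_)
};

static_assert(std::is_same<takesUnaryFn<unaryFn>::type, unaryFn<aValue>>::value,
              "a unary template fits a parameter of kind _(_)");
// takesUnaryFn<aValue> or takesUnaryFn<binaryFn> would be rejected by the
// C++ compiler, because their kinds do not match.
```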

Values

The values in Metacza are either of simple type like boolean, integer, or string, or are structured types that are marked using a constructor. As mentioned before, constructors are functions that map to themselves. To declare a constructor, use a data declaration. Example:

data red;
data green;
data blue;

This declares constructors for constructing new values called red, green, and blue. The above constructors have no arguments, thus they are of kind _.

Constructors may also have arguments, and then they look more like normal funcalls:

data car(_);

This declares the constructor car that takes one argument, i.e., it has kind _(_). Since the language is dynamically typed, you do not need to state which type the argument has: the constructor will accept anything. The kind is inferred from the structure of the declaration.
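
Because values in Metacza are types in C++, these data declarations can be pictured as empty structs and struct templates. The sketch below is hand-written under that assumption; the real output of the compiler may look different.

```cpp
#include <type_traits>

struct red {};                      // data red;
struct green {};                    // data green;
struct blue {};                     // data blue;
template <class X> struct car {};   // data car(_);

// car(red) is simply the type car<red>: it "maps to itself" and is a new
// constant that differs from car(green).
typedef car<red> redCar;
static_assert(std::is_same<redCar, car<red>>::value, "car(red) is car(red)");
static_assert(!std::is_same<car<red>, car<green>>::value,
              "different arguments give different values");

// The argument may be anything, including another constructed value.
typedef car<car<blue>> nestedCar;
```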

Functions

Function definitions in Metacza may be split into several clauses, because it is possible to define functions in terms of pattern matching on the arguments. You can have several clauses to define the return value of a function depending on its arguments. To pre-declare a function, you can have a prototype, which looks just like a constructor definition, but without the data keyword:

fib(x);

This states that there is a function called fib that has one argument of arbitrary type. In the simplest case, the prototype only defines the number of arguments of the function. As a feature inherited from C++11, Metacza has variadic functions (just as C++11 has variadic templates):

foo(x1,x2,x3);     // three arguments
bar(x,y...);       // one or more arguments

There may only be one such prototype per function. Every defining mention of fib after the prototype will be taken to be a specialisation. In a specialisation, the arguments given to the function are matched against the argument patterns in the function definition. Therefore, any funcall definition containing constants will also be taken to be a specialisation, even if no prototypical definition was seen before (it may be in a different file). Finally, if a variable occurs more than once on the left hand side of a funcall definition, then the definition is also taken to be a specialisation, because this is an implicit equality constraint that is only possible in specialisations. On the other hand, whether your function definition is really consistent will be checked only by the C++ compiler: remember the ‘zero apprehension’ part... So to define the whole fib function, you can use several clauses:

fib(n);                         // prototype
fib(0) = 0;                     // first specialisation
fib(1) = 1;                     // second specialisation
fib(n) = fib(n-1) + fib(n-2);   // ERROR: Too general!

WAIT! What looks like a natural function definition using pattern matching in an off-the-shelf functional language will lead to a C++ compile error: the reason is that C++ does not allow specialisations (pattern matches) that are just as general as the prototype. The last of the above clauses will, therefore, lead to an error.

To avoid the problem, you have to put the general clause at the beginning of the function definition. And since it is just as general as the prototype, you must leave out the prototype. The correct definition for the above function would, therefore, be:

fib(n) = fib(n-1) + fib(n-2);   // general case first instead of prototype
fib(0) = 0;                     // first specialisation
fib(1) = 1;                     // second specialisation
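
In C++ terms, the general clause plays the role of the primary template and the other clauses are explicit specialisations. The sketch below is hand-written to show that correspondence; the alias int_ for wrapping integers as types is my assumption and need not match what Metacza emits.

```cpp
#include <type_traits>

// Integers wrapped as types, since Metacza values are C++ types.
// The alias int_ is an assumption of this sketch.
template <int I> using int_ = std::integral_constant<int, I>;

// fib(n) = fib(n-1) + fib(n-2);   general case, doubles as the prototype
template <class N>
struct fib
    : int_<fib<int_<N::value - 1>>::value + fib<int_<N::value - 2>>::value> {};

// fib(0) = 0;
template <> struct fib<int_<0>> : int_<0> {};

// fib(1) = 1;
template <> struct fib<int_<1>> : int_<1> {};

static_assert(fib<int_<10>>::value == 55, "fib(10) is 55");
```

Writing the general clause as a partial specialisation `template <class N> struct fib<N>` after a separate prototype is what the C++ compiler rejects, because such a specialisation is not more specialised than the primary template.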

The first definition of a function must always be one without pattern matching. It will be called the prototypical definition, because it defines the arity of the function just like a prototype. All other definitions must be specialisations and match the prototypical definition with respect to the number of parameters and the argument types. Note that since a function definition may be distributed among many files, the prototypical definition may not be seen by Metacza. This is especially interesting with variadic functions. Imagine in File A, there is a variadic prototype:

colourOf(...);

This declares a function with an arbitrary number of arguments. Then imagine in File B, there is the following definition:

colourOf(x) = ...;

Since Metacza cannot see the prototype in File A, and since the definition of colourOf contains no pattern matching and no duplicate variables, Metacza will take the definition in File B to be a prototypical definition. But this is probably not correct, because this is meant to be a specialisation (it has a more special arity).

To help Metacza, you must predeclare that colourOf is already known. This is done by the following definition:

const colourOf;
colourOf(x) = ...;
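
The difference matters in the C++ output. A hand-written sketch of the two files, assuming the variadic prototype becomes a variadic primary template, might look like this; the body of the specialisation is left empty because the text elides it.

```cpp
struct red {};

// File A (assumed translation of `colourOf(...);`): variadic primary template.
template <class... Args> struct colourOf;

// File B, with `const colourOf;` in effect: a partial specialisation for
// exactly one argument. The body is elided in the text, so it stays empty.
template <class X> struct colourOf<X> {};

// The one-argument case is now defined ...
static_assert(sizeof(colourOf<red>) > 0, "colourOf(x) has a definition");
// ... while colourOf<red, red> is still only declared by the prototype.

// Without the const declaration, File B would instead be taken as
//   template <class X> struct colourOf { ... };
// which redeclares colourOf with a different parameter list and clashes
// with File A once both are included.
```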

Patterns in function specialisations may use arbitrary type constructors. In the fib function, we have encountered 0 and 1. But of course, user defined values may be used, too:

colourOf(car(x)) = x;

In such a definition, Metacza will treat all identifiers that are used as a functor as constructors, and all others as variables. I.e., car is taken to be a constant of kind _(...) and x will be treated as a variable of kind _.

Sometimes you may want to pass non-nullary functors. In such a case, you must predeclare the identifier to be a variable in the function (i.e., a parameter to the function):

let
    var Thing;
in
colourOf(Thing(x)) = x;

This will treat Thing as a variable, too, so the pattern definition for colourOf will match calls like colourOf(car(green)) [with Thing=car and x=green] just like colourOf(plane(red)) [with Thing=plane and x=red].
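
Both patterns have a direct counterpart in C++ partial specialisation. The sketch below is hand-written; plane is assumed to be declared like car, and the second function is renamed anyColourOf only so that both variants can sit side by side in one file.

```cpp
#include <type_traits>

struct red {};
struct green {};
template <class X> struct car {};     // data car(_);
template <class X> struct plane {};   // data plane(_);  assumed for this sketch

// colourOf(car(x)) = x;   car is a fixed constructor, x is a variable
template <class T> struct colourOf;
template <class X> struct colourOf<car<X>> { typedef X type; };

// let var Thing; in colourOf(Thing(x)) = x;   Thing is a variable as well,
// which in C++ makes it a template template parameter.
template <class T> struct anyColourOf;
template <template <class> class Thing, class X>
struct anyColourOf<Thing<X>> { typedef X type; };

static_assert(std::is_same<colourOf<car<green>>::type, green>::value,
              "car pattern extracts x");
static_assert(std::is_same<anyColourOf<car<green>>::type, green>::value,
              "Thing=car, x=green");
static_assert(std::is_same<anyColourOf<plane<red>>::type, red>::value,
              "Thing=plane, x=red");
```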

We will go into the details of declarations later.

Lambda Functions

What might be surprising is the fact that the C++ Template Meta Language is a higher order functional language. Real closures are supported. Such a level of abstraction was introduced accidentally in C++, and I was surprised to learn about this. This section will give an overview of what this means and how it is mapped in Metacza.

A higher order functional language allows functions to be used as values freely. For example, you may return a function from another function. In Metacza, an expression in curly braces is taken to be a functional value, or an 'anonymous function', or 'lambda function'.

lazy24 = { 1 * 2 * 3 * 4 };

This declares a function with no arguments that returns a function that computes the value 24. You can invoke this like a normal function:

foo = lazy24();

The actual value will only be computed when lazy24 is invoked. You can have more interesting functions, e.g.:

makeAdd(x) = { (y) = x + y };

This declares a function that takes one argument x and that returns a function that takes another argument and adds the two. As can be seen, you can define a lambda function that takes arguments using a similar syntax as with normal functions, namely by using a parameter list and an equal symbol followed by the function body. The following then constructs a new function that adds 5:

add5 = makeAdd(5);

And of course this can be invoked:

value11 = add5(6);

makeAdd is noteworthy in that it returns a closure: a function that has access to the parameters of makeAdd after makeAdd has already terminated. It is not directly possible to do this in C++ itself (because the stack frame will be gone by then). In C++ Template Meta Language, though, it is possible, and it is also possible in Metacza.
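
To see why this works at the template level, here is a hand-written sketch of makeAdd. It assumes that a lambda is represented as a type with a nested template called apply (a convention similar to Boost.MPL metafunction classes) and that integers are wrapped with an alias int_; the code Metacza really generates may be organised differently.

```cpp
#include <type_traits>

template <int I> using int_ = std::integral_constant<int, I>;

// makeAdd(x) = { (y) = x + y }
template <class X>
struct makeAdd {
    // The returned lambda is an ordinary type (a nullary value) ...
    struct type {
        // ... whose nested template still refers to X: this is the closure.
        template <class Y>
        struct apply : int_<X::value + Y::value> {};
    };
};

typedef makeAdd<int_<5>>::type add5;      // add5 = makeAdd(5);
typedef add5::apply<int_<6>> value11;     // value11 = add5(6);

static_assert(value11::value == 11, "5 + 6 is 11");
```

There is no stack frame involved: X is part of the type makeAdd<X>::type itself, so it stays available for as long as the type is.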

You can write higher order functions that take functions and compose them:

compose(f,g) = { (x) = f(g(x)) };

To try this out:

add5  = makeAdd(5);
mul10 = { (x) = x * 10 };
test  = compose(add5, mul10)(4);      // OK: will compute 45
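
The same composition can be written by hand in C++. The sketch assumes, as an illustration only, that lambdas are types with a nested apply template and that integers are wrapped with an alias int_; add5 is written out directly here instead of going through makeAdd to keep the block short.

```cpp
#include <type_traits>

template <int I> using int_ = std::integral_constant<int, I>;

// Lambdas modelled as types with a nested `apply` template (an assumed
// convention, similar to Boost.MPL metafunction classes).
struct add5  { template <class X> struct apply : int_<X::value + 5> {}; };
struct mul10 { template <class X> struct apply : int_<X::value * 10> {}; };

// compose(f,g) = { (x) = f(g(x)) }
template <class F, class G>
struct compose {
    struct type {
        template <class X>
        struct apply
            : F::template apply<typename G::template apply<X>::type> {};
    };
};

// test = compose(add5, mul10)(4);
typedef compose<add5, mul10>::type::apply<int_<4>> test;
static_assert(test::value == 45, "add5(mul10(4)) is 45");
```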

Note that there is a difference between a lambda function and a normal function with respect to the type: lambda functions are nullary values (they are of kind _) that can be passed to a function as values directly. You cannot pass normal functions that simply if they have a higher arity, since by default, function parameters are assumed to have kind _.

mul20(x) = x * 20;
test2    = compose(add5, mul20)(4);   // ERROR: mul20 is not a simple value!

So you need to wrap a normal function into a lambda block to convert it to a nullary value:

test3 = compose(add5, { (x) = mul20(x) })(4);   // OK!
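
The kind mismatch is easy to reproduce in plain C++. In the hand-written sketch below, callWith4 is a hypothetical helper that stands in for compose: it has one parameter of the default kind _, which in C++ is a plain class parameter. The nested apply convention for lambdas and the alias int_ are assumptions as well.

```cpp
#include <type_traits>

template <int I> using int_ = std::integral_constant<int, I>;

// mul20(x) = x * 20;   a normal function: a template, kind _(_)
template <class X> struct mul20 : int_<X::value * 20> {};

// A parameter of kind _ accepts types only. callWith4 is hypothetical.
template <class F>
struct callWith4 : F::template apply<int_<4>> {};

// callWith4<mul20> does not compile: mul20 is a template, not a type.

// { (x) = mul20(x) }   wraps the template into a type, i.e. a nullary value
struct mul20Lambda {
    template <class X> struct apply : mul20<X> {};
};

static_assert(callWith4<mul20Lambda>::value == 80, "4 * 20 is 80");
```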

In normal expressions, Metacza treats lambdas and higher arity functions the same. As you have already seen, you can invoke nullary values like functions for convenience, so the fact that they are different is hidden in many places:

normalFunc(x) = x * 5;
lambdaFunc    = {(x) = x * 5};
invokeNormal  = normalFunc(6);     // this is a different kind of invocation...
invokeLambda  = lambdaFunc(6);     // ...than this one.

Here, Metacza knows from the definition what kind of things normalFunc and lambdaFunc are, so the difference is hidden in the invocation (although the compiled code looks quite different). However, in some places (e.g. when used as parameters), Metacza does not know the identifier yet. In such cases, you need to use declarations to change the default kind.

Declarations

Declarations are information for the Metacza compiler. We have seen already that a declaration will start with either const or var. Everything that is already declared and known to the C++ compiler is called const, while everything that is an input for following definitions is called var.

Consider a simple function for enumerating colours. The prototype is:

enumerate(_);

If you want to enumerate the above colours, you might think that you can simply write:

enumerate(red)   = 0;
enumerate(green) = 1;
enumerate(blue)  = 2;

However, Metacza may not have seen that red etc. are constants (because they may have been defined in a different file) and since in the function head enumerate(red), red is not a functor, it will treat it as a variable. So if Metacza has not seen red's definition, you may need a const declaration here:

let const red   in enumerate(red)   = 0;
let const green in enumerate(green) = 1;
let const blue  in enumerate(blue)  = 2;
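
Seen from the C++ side, the const declaration decides between an explicit specialisation on an existing type and a clause with a fresh template parameter. The sketch below is hand-written, with int_ as an assumed alias for wrapping integers.

```cpp
#include <type_traits>

template <int I> using int_ = std::integral_constant<int, I>;

struct red {};
struct green {};
struct blue {};

// enumerate(_);
template <class Colour> struct enumerate;

// let const red in enumerate(red) = 0;   red names the existing type
template <> struct enumerate<red>   : int_<0> {};
template <> struct enumerate<green> : int_<1> {};
template <> struct enumerate<blue>  : int_<2> {};

// If red were taken as a variable instead, the clause would correspond to
//   template <class red> struct enumerate<red> ...
// which is as general as the prototype and is rejected by the C++ compiler.

static_assert(enumerate<green>::value == 1, "green is 1");
```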

Declarations can also be used to declare the kind of an identifier. If you want to pass non-nullary functors to a function, you need to predeclare a parameter's kind. For example:

let
    var f : _(_);
in
apply(f,x) = f(x);

With such a predeclaration, you can pass functors of the declared kind to the function:

mul20(x) = x * 20;
test17 = apply(mul20, 30);

Note that since lambda functions are nullary values, you cannot use lambda functions for a parameter declared like that. To do that, you must prefix them with a '*' operator, which transforms lambda functions into normal functions:

test18 = apply(*{ (x) = x + 10 }, 30);
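
A hand-written C++ picture of both calls follows. The kind declaration var f : _(_) is shown as a template template parameter. How Metacza really implements the '*' operator is not described here, so picturing it as selecting the nested apply template of a lambda type is purely my assumption, as is the alias int_.

```cpp
#include <type_traits>

template <int I> using int_ = std::integral_constant<int, I>;

// let var f : _(_); in apply(f,x) = f(x);
template <template <class> class F, class X>
struct apply : F<X> {};

// mul20(x) = x * 20;
template <class X> struct mul20 : int_<X::value * 20> {};

// test17 = apply(mul20, 30);
static_assert(apply<mul20, int_<30>>::value == 600, "30 * 20 is 600");

// { (x) = x + 10 } as a type with a nested template (assumed convention)
struct add10Lambda {
    template <class X> struct apply : int_<X::value + 10> {};
};

// test18 = apply(*{ (x) = x + 10 }, 30);   here `*` is pictured as selecting
// the nested template, which has the required kind _(_).
static_assert(apply<add10Lambda::apply, int_<30>>::value == 40,
              "30 + 10 is 40");
```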

Namespaces

Metacza supports all namespace directives of C++. This means the following are all valid Metacza statements:

namespace foo { }
namespace foo2 = bar::foo2;
using namespace foo3;

Definitions inside namespace ... { ... } are translated in that namespace, and Metacza keeps track of the definitions in the same way as C++ does. The directives themselves are translated into C++ one-to-one.

Metacza extends the namespace directives a little bit and allows qualified names when defining a namespace. Such directives are translated into (hopefully correct) C++, too. E.g.:

namespace boost::mpl {
...
}

In the same way, namespace copy directives can be qualified:

namespace foo::bar = foz::baz;

The defined namespace identifier must not be qualified with an absolute qualifier (i.e., must not start with ::).
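
C++11 itself does not accept qualified names in these two directives (nested namespace definitions only became part of C++ with C++17), so they have to be expanded. One plausible expansion, written by hand and not taken from Metacza's output, is shown below; the names outer::inner and marker are made up for the sketch.

```cpp
#include <type_traits>

namespace foz { namespace baz { struct marker {}; } }

// namespace outer::inner { ... }   expanded into nested namespaces
namespace outer { namespace inner {
    struct thing {};
} }

// namespace foo::bar = foz::baz;   the alias is placed inside namespace foo
namespace foo { namespace bar = foz::baz; }

static_assert(std::is_same<foo::bar::marker, foz::baz::marker>::value,
              "foo::bar is another name for foz::baz");
```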

Assertions

Metacza supports static assertions. They look like normal funcalls with two arguments: first the condition that has to be true, second a message to be shown in case the assertion fails.

Assertion statements are translated to the static_assert construction of C++11.

assert(x > 5, "Expected larger value");
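
Since a Metacza value such as x is a type in C++, the condition has to be read off that type. A hand-written guess at the C++ side, with x assumed to be an integral_constant, an arbitrary sample value of 7, and a message text chosen just for this sketch:

```cpp
#include <type_traits>

typedef std::integral_constant<int, 7> x;   // sample value, 7 is arbitrary

// An assertion on x > 5 becomes a compile-time check on the wrapped integer.
static_assert(x::value > 5, "x must be greater than 5");
```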

December 5th, 2011
Comments? Suggestions? Corrections? You can drop me a line.
zpentrabvagiktu@theiling.de