Features of the new Script Compiler

Peter Bouillon [fernewelten] edited this page Jan 9, 2022 · 46 revisions

Extensions to the compiling process

  • Every open delimiter must be closed with a matching close delimiter. This is checked right at the scanning stage.
  • A lot of error messages concern two code spots. In many cases the compiler will stop at the first spot and pinpoint the second spot in the message. For instance, if a function header doesn't match another function header, the compiler will stop at the first header and pinpoint in the message where that other header is.
  • The compiler will sometimes emit warnings. Warnings don't halt a compiler run.

Auto pointering

When a variable is declared as a managed type, then the compiler knows that it must be a pointer variable because non-pointers to managed types are illegal. (See below for one big exception to this rule.) If you forget the * in these cases, then the compiler will happily imply it and carry on compiling. For instance, in the example below, Var1, Var2, and W will be implied to have the type Weapon * because Weapon is a managed struct.

managed struct Weapon
{
    int A;
} Var1;
     
Weapon Var2;
     
function Foo(Weapon W)
{ … }

In practice, this means that you can get away with omitting nearly all the pointer symbols in your AGS code.

The big exception is import declarations. In import declarations, it is legal to declare managed structs that aren't pointers, and automatically generated AGS code will actually do that, so the compiler does not imply any * for things that are declared import. Any pointers must be explicitly written out in import declarations.
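A minimal sketch of the difference, assuming a managed struct Weapon as in the example above. The variable names Sword and Dagger are hypothetical and only serve as illustration.

```ags
managed struct Weapon
{
    int A;
};

import Weapon *Sword;     // import declaration: the * must be written out
Weapon Dagger;            // no import keyword: Weapon * is implied
```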

Compile-time evaluations

The compiler will evaluate integer or float expressions at compile time whenever it knows their arguments. Whenever the compiler expects a constant, it will accept an expression, too, provided that it can evaluate that expression at compile-time.

int Weapons[7 * 9 + 1];

Extensions to the type system

Structs

Structs can contain variables that are other structs. In this case, managed structs can only be included via pointers; non-managed structs may not be included via pointers.

struct Pair { 
    int X, Y; 
};
managed struct Trio {
    int X, Y, Z; 
};

struct TalkingPositions { 
    Pair PosOfChar1, PosOfChar2; // these components can't be Pair * because Pair isn't managed
}; 
struct FloatingHead { 
    Trio *HeadPos;     // HeadPos must be a pointer because Trio is managed
};

Note: Currently there is an engine limitation that prevents managed structs from containing other managed structs or arrays.

Immediately after a struct definition, you can define global variables that have the type of the struct:

managed struct Pair
{
    int i, j;
} * pair1;

Attributes

Declaration:

struct S 
{
    attribute float att1a;
    readonly attribute float att1b;
    static attribute float att1c;

    attribute float att2a[];
    readonly attribute float att2b[];
    static attribute float att2c[];
};

attribute float att1d (this struct S);
readonly attribute float att1e (this struct S);
attribute float att1f(static struct S);

attribute float att2d[](this struct S);
readonly attribute float att2e[](this struct S);
attribute float att2f[](static struct S);

When an attribute of a non-array type is encountered, e.g., att1a, then the compiler declares internally the functions import float S::get_att1a() -- the so-called "getter" of the attribute -- and import void S::set_att1a(float) -- the so-called "setter" of the attribute. After this declaration, the compiler can't distinguish the situation from a situation where these functions have been declared "manually" with a forward declaration (see below, section "Functions").

All getters and setters are automatically declared as "import" functions.

When a "readonly" attribute is encountered, the compiler will only declare its getter, not its setter.

Extender syntax:

  • Attribute att1d is processed in the same way as att1a is (i.e., the compiler declares import float S::get_att1d() and import void S::set_att1d(float))
  • Attribute att1e is processed in the same way as att1b is
  • Attribute att1f is processed in the same way as att1c is

In all cases, you are still obliged to define the respective functions with body. You can do that with a definition even if the definition is in the same module as the attribute: float S::get_att1a() { … } (AFAIK, the old compiler didn't accept this.)

Or you can use extender notation: float get_att1a(this S) { … }
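As a rough sketch of what such definitions could look like, the following supplies a getter and a setter with body for att1a. The backing field stored is hypothetical and not part of the original example; the function names follow the get_/set_ convention described above.

```ags
struct S
{
    attribute float att1a;
    float stored;    // hypothetical backing field, an assumption for this sketch
};

// Getter with body; the compiler has already declared it as an import function
float S::get_att1a()
{
    return this.stored;
}

// Setter with body
void S::set_att1a(float value)
{
    this.stored = value;
}
```

With these in place, reading the attribute of a variable of type S calls the getter and assigning to the attribute calls the setter.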

Attributes of array types, e.g., att2a, declare the getter float S::geti_att2a(int) and setter void S::seti_att2a(int, float).

Extender syntax:

  • Attribute att2d is processed in the same way as att2a is (i.e., the compiler declares import float S::geti_att2d(int) and import void S::seti_att2d(int, float))
  • Attribute att2e is processed in the same way as att2b is
  • Attribute att2f is processed in the same way as att2c is

Multi-dimensional non-dynamic arrays

You can declare int a[5, 3]; or, equivalently, int a[5][3];. You can have any positive number of dimensions, e.g., int a[2][3][5][7][11];.
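A small sketch of a two-dimensional array in use. It assumes that elements are accessed with one pair of brackets per dimension, which the text above does not spell out; the names Board and ClearBoard are hypothetical.

```ags
int Board[8, 8];          // same as int Board[8][8];

function ClearBoard()
{
    for (int row = 0; row < 8; row++)
        for (int col = 0; col < 8; col++)
            Board[row][col] = 0;    // assumed access syntax: one [] per dimension
}
```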

Import, export

Entities that are declared outside of functions can have one of two linking categories:

1. import (the declaration has the import keyword)

2. global (the declaration occurs outside of function bodies and does not have the import keyword)

These categories form a precedence: A global declaration supersedes an import declaration.

This isn't an edge case. It's the usual run of things when header files and code files interoperate.

  • In the header file of a module X, a variable is declared as import, e.g., import int var;
  • The code file of module X re-declares the variable to be global, e.g., int var;
  • Also in the code file, there's an export statement, e.g., export var;

So

  • When this module X is compiled, this results in the exported variable var.
  • When other modules are compiled, the header file of module X is prepended to them. This results in an imported variable var.
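Put together, the interplay described above might look like the following sketch. The file names ModuleX.ash and ModuleX.asc are assumptions chosen for illustration.

```ags
// ModuleX.ash (header file of module X, prepended to other modules)
import int var;

// ModuleX.asc (code file of module X)
int var;        // global re-declaration; supersedes the import from the header
export var;     // makes var available to other modules
```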

From the point of view of the compiler, a variable may be defined multiple times. The compiler accepts a re-definition of a variable that is identical in all aspects except for the linking category and that "upgrades" the linking category.

Functions

Functions can now be defined in any sequence. So you can reference a function first and define it afterwards. This makes recursive algorithms much easier to code.

int Func1(int a)
{
    return 2 * Func2(a);    // uses Func2()
}

int Func2(int a)     // defines Func2() and is below its first use -- this used to be illegal.
{ … }

You can declare the signature of a function by having a ';' follow the function header instead of the body. This is called a “forward declaration” of the function. There are two variants of forward declarations:

  • Forward declarations of functions that are outside a struct declaration
  • Forward declarations of functions that are within a struct declaration.

Rules for forward declarations:

  • Parameters needn't be named, e.g., int a(int, float);
  • If you do name the parameters in forward declarations, the names are immaterial, e.g., int a(int b, float c);
  • You can have any number of forward declarations as long as they are compatible with each other and with the function declaration where the body follows (see below). Each function may only be defined once with body.
  • Forward declarations need to have the import keyword only when the function needs to be available in other modules.
    • This means specifically, if the forward declaration is in a header file then it should have the import modifier.
    • If you have a struct declaration in a code file (.asc file) and if that struct has functions then it makes sense to declare the function headers without the import keyword. The struct is only going to be used within the module, so its functions will also only be used within the module. (“Say what you mean” principle: Don't declare functions as import if they aren't actually supposed to be used that way.)
  • Any function header may name defaults for parameters. (In contrast, the old compiler only allowed defaults in forward declarations.) But the defaults in all the function headers must be “compatible” with each other (see the section below).
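The rules above can be sketched as follows. The function Scale is hypothetical; FloatToInt and IntToFloat are the standard AGS conversion functions.

```ags
int Scale(int, float);                // forward declaration, parameters unnamed
int Scale(int b, float c);            // another one; the names b and c are immaterial

int Scale(int amount, float factor)   // the one definition with body
{
    return FloatToInt(IntToFloat(amount) * factor);
}
```

Since no other module is supposed to call Scale in this sketch, none of the headers carries the import keyword.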

Compatibility of re-declared functions

A function may have multiple forward declarations and at most one definition with body. The compiler reads all the function headers and keeps track of the "known information" about the function. If a function header is incompatible with the "known information" then this is flagged as an error.

A function header is compatible with this "known information" if all of the following hold:

  • It has the same return type.
  • It has the same number of parameters.
    • Its parameter list ends on ..., i.e., the function is declared as "variadic", if and only if known information has it that the function is variadic.
  • The type of its n-th parameter is the same as the type of the n-th parameter of known information.
  • When the n-th parameter of known information has a default X then its n-th parameter has the same default X and vice versa.
    • Exception, for reasons of compatibility: If a body follows the function header then the function header may either name all the defaults that "known information" has or none of them
    • Exception, for reasons of compatibility: float parameters may have the default 0, which will result in a warning. For all other purposes, including function header compatibility, this is treated as 0.0. Similarly, a 0 default for String or managed parameters will yield a warning but be treated as null for the purposes of function header compatibility.
  • If a body follows the function header then some parameters may be declared readonly. readonly parameters cannot be modified within the function body. Known information doesn't keep track of readonly, so whether the function header is compatible doesn't depend on what readonly parameters there are.
  • The linking type of "known information" may differ from the linking type of the current function header. That is, a global function may be re-declared as import etc. In this case, the linking type of "known information" becomes the highest linking type mentioned. So when any of the function headers declare the function to be global then the function is global.
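A sketch of how these rules might play out for a hypothetical function Heal. The incompatible headers are commented out; each of them would be flagged as an error according to the rules above.

```ags
import int Heal(int amount = 10, float factor = 1.5);

// Compatible: same return type and parameter types. Since a body follows,
// the header may name all of the known defaults or, as here, none of them.
// The function ends up global because this header is not an import.
int Heal(int amount, float factor)
{
    return FloatToInt(IntToFloat(amount) * factor);
}

// Incompatible with the headers above:
// float Heal(int amount, float factor);            // different return type
// int Heal(int amount);                            // different number of parameters
// int Heal(int amount = 5, float factor = 1.5);    // different default
```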

Enums

Immediately after an enum definition, you can define global variables that have the type of the enum:

enum TrafficLightColour
{
    red,
    amber,
    green,
} col1;

Extensions to expressions

Ternaries

  • Standard form: foo = (x > 5) ? 1 : 2; means that foo is set to 1 if (x > 5), to 2 otherwise.
  • You can leave out the second part of a ternary so that "?" and ":" go right next to each other, for instance, int foo = bar ?: 15; This means that foo is normally set to bar, but if bar is 0 or null then foo is 15 instead. This is particularly handy for specifying a default in case a pointer variable turns out to be null. (For pointers, this resembles the ?? operator in C#.)
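A short sketch of the shortened form with an integer and with a pointer. The function name and the parameter target are hypothetical; player is the usual AGS global.

```ags
function ShortTernaryDemo(Character *target)
{
    int bar = 0;
    int foo = bar ?: 15;                      // bar is 0, so foo is set to 15

    // Fall back to the player character when target is null
    Character *speaker = target ?: player;
    speaker.Say("foo is %d", foo);
}
```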

Increment, decrement operator

The old compiler treated ++ and -- as kinds of assignment symbols, so they had to go last in an assignment. As a consequence, a statement such as ++var; was illegal.

Now, ++ and -- are parts of expressions: they are operators that can be prepended or appended. So it is possible to write if (i++ == ++j) … just as in C or C#.
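The following sketch shows both operators as stand-alone statements and within a larger expression. The function name is hypothetical, and the evaluation is assumed to follow the C rules that the text refers to.

```ags
function IncrementDemo()
{
    int i = 3;
    int j = 3;

    ++i;                  // prefix increment as a statement of its own
    j--;                  // postfix decrement as a statement of its own

    // Both forms inside a larger expression
    if (i++ == ++j)
        Display("equal");
}
```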

Bitwise negation

When i is an integer expression, ~i is calculated as the bitwise negation of it: Each 1 bit is converted into a 0 bit and each 0 bit is converted into a 1 bit. This is just as in C or C#. As far as I know, the "old" compiler doesn't recognize the ~ operator.
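A typical use is clearing a single bit with a mask. This is only a sketch; the function and variable names are hypothetical.

```ags
function ClearBitDemo()
{
    int flags = 7;            // binary ...0111
    int mask = ~2;            // every bit set except bit 1
    flags = flags & mask;     // clears bit 1, leaving 5 (binary ...0101)
}
```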

Function calls as partial expressions

If a function call yields a managed object, you can continue and use this object. For instance, you can continue with . and a component of the object, as in foo(bar).baz. (AFAIK, the old compiler didn't do this.)

Sequences such as a.b.c[17 * d].e[f] may now be arbitrarily long. (AFAIK, the old compiler had a limit on the number of symbols that make up such a term.)
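As an illustration, the sketch below continues directly after a call with a component access. The struct Weapon and the function MakeWeapon are hypothetical and only serve this example.

```ags
managed struct Weapon
{
    int Damage;
};

Weapon *MakeWeapon(int damage)    // hypothetical factory function
{
    Weapon *w = new Weapon;
    w.Damage = damage;
    return w;
}

function ChainDemo()
{
    int d = MakeWeapon(12).Damage;    // continue with . right after the call
    Display("Damage: %d", d);
}
```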

Pseudo attributes

If a is a dynamic array, then a.Length will return the number of elements that were allocated to it.
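For instance, a loop can use Length as its bound instead of repeating the allocation size. A minimal sketch with hypothetical names:

```ags
function FillSquares()
{
    int squares[];
    squares = new int[10];

    // squares.Length is the number of elements allocated above
    for (int i = 0; i < squares.Length; i++)
        squares[i] = i * i;
}
```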

Implied this in struct functions

When you are in the body of a struct function and you use an identifier that can be interpreted as a component of this then this. will be implied. (The old compiler did not do this.)

struct Warrior
{
    int ArmorDamage;
    import function Fight();
};
function Warrior::Fight()
{
    if (ArmorDamage < 0)    // taken to mean 'if (this.ArmorDamage < 0)'
        …
}

Concatenated string literals

String literals that are next to each other (separated only by whitespace) are treated as if they were one concatenated string literal.

player.Say("Hello" " world");  // same as "Hello world"

Backslash combinations in string literals

The compiler currently recognizes \', \", \?, \\, \a, \b, \e (escape char), \f, \n, \r, \t, and \v, as well as the customary \[.

The compiler recognizes numeric character notations starting with \0\7 (octals) as well as notations starting with \0x (hexadecimals).
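A brief sketch of a literal that uses some of the combinations listed above. How the resulting characters are rendered on screen is up to the engine and not covered here; the function name is hypothetical.

```ags
function EscapeDemo()
{
    // Contains tab characters (\t), a newline (\n) and embedded double quotes (\")
    Display("Name:\tRoger\nQuote:\t\"Hello\"");
}
```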

Extensions to statements

Compile-time constants

Declarations that begin with const define compile-time constants. Currently, int and float are possible. For instance, const int i = 5 + 3; declares that i is synonymous with the integer value 8. No code is generated at this point; the constant is only tracked in the symbol table. Whenever i is found in the code from this point onwards until i gets out of scope, it is treated as if an 8 was written there.

In contrast, declarations that begin with readonly define variables that are allocated at the point of declaration. For instance, when readonly int i = 5 + 3; is encountered outside of functions, then bytes are allocated in global space for the variable i and loaded with the integer 8. The compiler will see to it that the variable is not changed. But it might still be changed behind the scenes, for instance if this variable is exported and then imported in another module where it is NOT declared readonly.

Note: If a declaration begins with const string, then it isn't interpreted as a compile-time-constant. Instead it is interpreted as a declaration of a variable that has the special type const string. (Since variables of all kinds of type string have been deprecated for years, you probably don't want to do this.)
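The contrast between const and readonly could be sketched like this. All names are hypothetical, and the use of a constant as an array size is an assumption derived from the statement that the constant is treated as if its value had been written there.

```ags
const int MaxWeapons = 5 + 3;     // compile-time constant; no storage is allocated
const float Gravity = 9.81;       // float constants are possible, too

int Weapons[MaxWeapons];          // treated as if int Weapons[8]; had been written

readonly int Limit = 5 + 3;       // a real global variable that holds 8
```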

Unintended fall through a switch case, fallthrough; statement, dead end detection

The compiler tries to find out whether code execution falls through the end of a non-empty case into the next case.

switch(inventory_item)
{
default:
    player.Say("I can't use this thing this way."); // fall through, will yield a warning
case iCup:
    player.Say("Putting the cup onto the table.");
    …
}

Such fall through is usually unintended (a break; is missing). So the compiler warns when it detects the situation. In order to shut up the warning and declare that the fall through is intended, you can code fallthrough; immediately in front of the next case.

default:
    player.Say("I can't use this thing this way.");
    fallthrough; 
case iCup:
    …

The compiler will try to detect whether statements can't be reached, e.g., because a break; preceded them. This will yield a warning.

function squared(int i)
{
    return i * i;
    if (i > 5000)    // can't be reached
        …
}

The compiler will try to detect whether the end of a function body can be reached when the function has a return type that isn't int. In this case, a default integer 0 would be returned which doesn't match the return type.

float squared (float f)
{
    float result = f * f;
} // control flow can reach the closing brace of 'squared' which is an error.

Local variable definitions can hide existing definitions

This only pertains to local variables, i.e., variables that are declared within a function.

Within a pair of braces { … } you can declare a local variable that has the same name as an already existing variable. Then this new definition will be valid within the pair of braces. Whenever the variable name is referenced within the braces, it will be interpreted to refer to this new definition. There's no way of referencing the old definition while the new definition is in scope, so the new definition "hides" the old definition.

When the closing brace is encountered, the new definition goes out of scope and the old definition will "reappear".

This will also work for variable declarations at the start of the parenthesized list of a for statement. Those variables are valid within the for statement and hide any variables that have the same name.
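The hiding rules described in this section can be sketched as follows; the function and variable names are hypothetical.

```ags
function ShadowDemo()
{
    int count = 1;
    {
        int count = 2;            // hides the outer count inside these braces
        Display("%d", count);     // refers to the inner count
    }
    Display("%d", count);         // the outer count has "reappeared"

    for (int count = 0; count < 3; count++)
    {
        // here, count refers to the loop variable
    }
}
```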

Expressions as statements

You can use any expression with an appended ; as a statement, just as in C or C#. But this can lead to unintended mistakes, so the compiler will only accept those expressions that have a side effect, defined to be either a function call, or an increment or decrement operation. This means that 7; is illegal but both i--; and --i; are legal.

In contrast, the old compiler only accepted a function call or an assignment as an expression statement (where var++; was an assignment for the old compiler). This caused problems with statements such as ++var; that should be legal but don't end on an assignment symbol.

Just as in the old compiler, it's still illegal to use a naked ; as a statement, so the notorious code while(i != 0); will yield a compiler error instead of an endless loop at runtime.
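To summarize the rules of this section in a sketch (the function name is hypothetical; the rejected statements are commented out so that the rest still compiles):

```ags
function StatementDemo()
{
    int i = 3;

    i--;                  // legal: decrement
    --i;                  // legal: decrement
    Display("%d", i);     // legal: function call

    // 7;                 // illegal: no side effect
    // i + 1;             // illegal: no side effect
    // while (i != 0);    // illegal: a naked ; is not a statement
}
```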
