Features of the new Script Compiler
Contents:
- Extensions to the compiling process
- Extensions to the type system
- Extensions to expressions
- Extensions to statements
- Every open delimiter must be closed with a matching close delimiter. This is checked right at the scanning stage.
- A lot of error messages concern two code spots. In many cases the compiler will stop at the first spot and pinpoint the second spot in the message. For instance, if a function header doesn't match another function header, the compiler will stop at the first header and pinpoint in the message where that other header is.
- The compiler will sometimes emit warnings. Warnings don't halt a compiler run.
When a variable is declared as a `managed` type, then the compiler knows that it must be a pointer variable because non-pointers to managed types are illegal. (See below for one big exception to this rule.) If you forget the `*` in these cases, then the compiler will happily imply it and carry on compiling. For instance, in the example below, `Var1`, `Var2`, and `W` will be implied to have the type `Weapon *` because `Weapon` is a `managed` struct.
managed struct Weapon
{
int A;
} Var1;
Weapon Var2;
function Foo(Weapon W)
{ … }
In practice, this means that you can get away with omitting nearly all the pointer symbols in your AGS code.
The big exception is `import` declarations. In import declarations, it is legal to declare `managed struct`s that aren't pointers, and automatically generated AGS code will actually do that, so the compiler does not imply any `*` for things that are declared `import`. Any pointers must be explicitly written out in import declarations.
The compiler will evaluate integer or float expressions at compile time whenever it knows their arguments. Whenever the compiler expects a constant, it will accept an expression, too, provided that it can evaluate that expression at compile-time.
int Weapons[7 * 9 + 1];
`struct`s can contain variables that are other `struct`s. In this case, `managed struct`s can only be included via pointers; non-managed structs may not be included via pointers.
struct Pair {
int X, Y;
};
managed struct Trio {
int X, Y, Z;
};
struct TalkingPositions {
Pair PosOfChar1, PosOfChar2; // these components can't be Pair * because Pair isn't managed
};
struct FloatingHead {
Trio * HeadPos; // HeadPos must be pointer because Trio is managed;
};
Note: Currently there is an engine limitation that prevents managed structs from containing other managed structs or arrays.
Immediately after a `struct` definition, you can define global variables that have the type of the struct:
managed struct Pair
{
int i, j;
} * pair1;
Declaration:
struct S
{
attribute float att1a;
readonly attribute float att1b;
static attribute float att1c;
attribute float att2a[];
readonly attribute float att2b[];
static attribute float att2c[];
};
attribute float att1d (this struct S);
readonly attribute float att1e (this struct S);
attribute float att1f(static struct S);
attribute float att2d[](this struct S);
readonly attribute float att2e[](this struct S);
attribute float att2f[](static struct S);
When an attribute of a non-array type is encountered, e.g., `att1a`, then the compiler internally declares the functions `import float S::get_att1a()` (the so-called "getter" of the attribute) and `import void S::set_att1a(float)` (the so-called "setter" of the attribute). After this declaration, the compiler can't distinguish the situation from a situation where these functions have been declared "manually" with a forward declaration (see below, section "Functions").
All getters and setters are automatically declared as "import" functions.
When a "readonly" attribute is encountered, the compiler will only declare its getter, not its setter.
Extender syntax:
- Attribute `att1d` is processed in the same way as `att1a` is (i.e., the compiler declares `import float S::get_att1d()` and `import void S::set_att1d(float)`)
- Attribute `att1e` is processed in the same way as `att1b` is
- Attribute `att1f` is processed in the same way as `att1c` is
In all cases, you are still obliged to define the respective functions with a body. You can do that with a definition below the attribute, even if the definition is in the same module as the attribute: `float S::get_att1a() { … }` (AFAIK, the old compiler didn't accept this.)
Or, you can use extender notation: `float get_att1a(this S) { … }`
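As a minimal sketch of how a declaration and its definitions might fit together, the following declares one read-write and one `readonly` attribute and supplies the functions the compiler expects. The struct `Thermometer`, the backing field `celsius_`, and the attribute names are hypothetical and only serve the illustration.

```ags
struct Thermometer
{
    float celsius_;                       // hypothetical backing field
    attribute float Celsius;              // compiler declares get_Celsius() and set_Celsius(float)
    readonly attribute float Fahrenheit;  // compiler declares only get_Fahrenheit()
};

// Getter for Celsius, defined with body in the same module
float Thermometer::get_Celsius()
{
    return this.celsius_;
}

// Setter for Celsius
void Thermometer::set_Celsius(float value)
{
    this.celsius_ = value;
}

// Getter for the readonly attribute; no setter is declared or needed
float Thermometer::get_Fahrenheit()
{
    return this.celsius_ * 1.8 + 32.0;
}
```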
Attributes of array types, e.g., `att2a`, declare the getter `float S::geti_att2a(int)` and setter `void S::seti_att2a(int, float)`.
Extender syntax:
- Attribute `att2d` is processed in the same way as `att2a` is (i.e., the compiler declares `import float S::geti_att2d(int)` and `import void S::seti_att2d(int, float)`)
- Attribute `att2e` is processed in the same way as `att2b` is
- Attribute `att2f` is processed in the same way as `att2c` is
You can declare `int a[5, 3];` or, equivalently, `int a[5][3];`. You can have any positive number of dimensions: `int a[2][3][5][7][11];`.
There are two linking categories that entities declared outside of functions can have:
1. `import` (the declaration has the `import` keyword)
2. global (the declaration occurs outside of function bodies and does not have the `import` keyword)

These categories form a precedence: A global declaration supersedes an `import` declaration.
This isn't an edge case. It's the usual run of things when header files and code files interoperate.
- In the header file of a module X, a variable is declared as `import`, e.g., `import int var;`
- The code file of module X re-declares the variable to be global, e.g., `int var;`
- Also in the code file, there's an `export` statement, e.g., `export var;`
So:
- When this module X is compiled, this results in the exported variable `var`.
- When other modules are compiled, the header file of module X is prepended to them. This results in an imported variable `var`.
From the point of view of the compiler, a variable may be defined multiple times. The compiler accepts a re-definition of a variable that is identical in all aspects except for the linking category and that "upgrades" the linking category.
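The header/code-file interplay described above can be sketched as follows. The file names `ModuleX.ash` and `ModuleX.asc` are assumptions for the illustration; the declarations are the ones named in the text.

```ags
// ModuleX.ash (header file, prepended to every module that is compiled later)
import int var;     // linking category: import

// ModuleX.asc (code file of module X; the header above has already been read)
int var;            // same variable, "upgraded" from import to global
export var;         // makes the global available to other modules
```

When another module is compiled, it only sees the `import int var;` line from the header, so `var` stays an imported variable there.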
Functions can now be defined in any sequence. So you can reference a function first and define it afterwards. This makes recursive algorithms much easier to code.
int Func1(int a)
{
return 2 * Func2(a); // uses Func2()
}
int Func2(int a) // defines Func2() and is below its first use -- this used to be illegal.
{ … }
You can declare the signature of a function by having a ';' follow the function header instead of the body. This is called a “forward declaration” of the function. There are two variants of forward declarations:
- Forward declarations of functions that are outside a `struct` declaration
- Forward declarations of functions that are within a `struct` declaration
Rules for forward declarations:
- Parameters needn't be named: `int a(int, float);`
- If you do name the parameters in forward declarations, the names are immaterial: `int a(int b, float c);`
- You can have any number of forward declarations as long as they are compatible with each other and with the function declaration where the body follows (see below). Each function may only be defined once with body.
- Forward declarations need to have the `import` keyword only when the function needs to be available in other modules.
  - This means specifically, if the forward declaration is in a header file then it should have the `import` modifier.
  - If you have a `struct` declaration in a code file (.asc file) and if that `struct` has functions then it makes sense to declare the function headers without the `import` keyword. The `struct` is only going to be used within the module, so its functions will also only be used within the module. (“Say what you mean” principle: Don't declare functions as `import` if they aren't actually supposed to be used that way.)
- Any function header may name defaults for parameters. (In contrast, the old compiler only allowed defaults in forward declarations.) But the defaults in all the function headers must be “compatible” with each other (see the section below).
A function may have multiple forward declarations and at most one definition with body. The compiler reads all the function headers and keeps track of the "known information" about the function. If a function header is incompatible with the "known information" then this is flagged as an error.
A function header is compatible with this "known information" if all of the following hold:
- It has the same return type.
- It has the same number of parameters.
  - The parameter list must end on `...`, i.e., the function must be declared as "variadic", if and only if known information has it that the function is variadic.
- The type of the n-th parameter of the function header is the same as the type of the n-th parameter of known information.
- When the n-th parameter of known information has a default `X` then the n-th parameter of the function header has the same default `X` and vice versa.
  - Exception, for reasons of compatibility: If a body follows the function header then the function header may either name all the defaults that "known information" has or none of them.
  - Exception, for reasons of compatibility: `float` parameters may have the default `0`, which will result in a warning. For all other purposes, including function header compatibility, this is treated as `0.0`. Similarly, a `0` default for `String` or managed parameters will yield a warning but be treated as `null` for the purposes of function header compatibility.

Two further points:
- If a body follows the function header then some parameters may be declared `readonly`. `readonly` parameters cannot be modified within the function body. Known information doesn't keep track of `readonly`, so whether the function header is compatible doesn't depend on what `readonly` parameters there are.
- The linking type of "known information" may differ from the linking type of the current function header. That is, a global function may be re-declared as `import` etc. In this case, the linking type of "known information" becomes the highest linking type mentioned. So when any of the function headers declare the function to be global then the function is global.
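The compatibility rules above can be illustrated with a small sketch. The function `Clamp` and its parameter names are hypothetical; the commented-out headers show declarations that, by the rules as stated, would be flagged as incompatible.

```ags
int Clamp(int value, int low = 0, int high = 100);  // forward declaration with defaults
int Clamp(int v, int lo = 0, int hi = 100);         // compatible: parameter names are immaterial

// Compatible: a body follows, and the header names none of the known defaults
int Clamp(int value, int low, int high)
{
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

// int Clamp(int value, int low = 1, int high = 100);  // incompatible: default of 2nd parameter differs
// float Clamp(int value, int low, int high);          // incompatible: return type differs
// int Clamp(int value, int low);                      // incompatible: number of parameters differs
```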
Immediately after an `enum` definition, you can define global variables that have the type of the enum:
enum TrafficLightColour
{
red,
amber,
green,
} col1;
- Standard form: `foo = (x > 5) ? 1 : 2;` means that `foo` is set to `1` if `(x > 5)`, to `2` otherwise.
- You can leave out the second part of a ternary so that `?` and `:` go right next to each other. For instance, `int foo = bar ?: 15;` means that `foo` is normally set to `bar`, but if `bar` is 0 or `null` then `foo` is `15` instead. This is particularly handy for specifying a default in case a pointer variable turns out to be `null`. (For pointers, this is the same functionality as the `??` operator in C#.)
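A short sketch of the shortened ternary in use, for a pointer and for an integer. The function `Announce` and its parameters are hypothetical; `player` and `Character.Say` are the usual AGS built-ins.

```ags
function Announce(Character *candidate, int requested_volume)
{
    // Fall back to the player character when no speaker was passed in
    Character *speaker = candidate ?: player;

    // Fall back to 15 when the caller passed 0
    int volume = requested_volume ?: 15;

    speaker.Say("Volume is %d", volume);
}
```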
The old compiler treated `++` and `--` as kinds of assignment symbols, so they had to go last in an assignment. A statement such as `++var;` was therefore illegal.
Now, `++` and `--` are parts of expressions; they are prepended or appended operators. So it is possible to do `if (i++ == ++j) …` just as in C or C#.
When `i` is an integer expression, `~i` is calculated as the bitwise negation of it: Each `1` bit is converted into a `0` bit and each `0` bit is converted into a `1` bit. This is just as in C or C#. As far as I know, the "old" compiler doesn't recognize the `~` operator.
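A typical use of bitwise negation is clearing a single flag bit. The following is a minimal sketch; the function name `ClearFlag` and its parameters are hypothetical.

```ags
// Returns 'flags' with all bits of 'flag' switched off.
// ~flag has a 0 wherever 'flag' has a 1, so the '&' masks those bits out.
int ClearFlag(int flags, int flag)
{
    return flags & ~flag;
}
```

For instance, with `flags` equal to 7 (binary 111) and `flag` equal to 4 (binary 100), the result is 3 (binary 011).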
If a function call yields a managed object, you can continue and use this object. For instance, you can continue with `.` and a component of the object: `foo(bar).baz`. (AFAIK, the old compiler didn't do this.)
Sequences such as `a.b.c[17 * d].e[f]` may now be arbitrarily long. (AFAIK, the old compiler had a limit on the number of symbols that make up such a term.)
If `a` is a dynamic array, then `a.Length` will return the number of elements that were allocated to it.
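As a sketch, `Length` lets a loop run over a dynamic array without passing the size around separately. The function `SumOfSquares` is hypothetical, and the sketch assumes that `count` is positive.

```ags
int SumOfSquares(int count)
{
    int squares[] = new int[count];   // assumes count > 0
    int sum = 0;
    for (int i = 0; i < squares.Length; i++)
    {
        squares[i] = i * i;
        sum += squares[i];
    }
    return sum;
}
```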
When you are in the body of a struct function and you use an identifier that can be interpreted as a component of `this`, then `this.` will be implied. (The old compiler did not do this.)
struct Warrior
{
int ArmorDamage;
import function Fight();
};
function Warrior::Fight()
{
if (ArmorDamage < 0) // taken to mean 'if (this.ArmorDamage < 0)'
…
}
String literals that are next to each other (separated only by whitespace) are treated as if they were one concatenated string literal.
player.Say("Hello" " world"); // same as "Hello world"
The compiler currently recognizes `\'`, `\"`, `\?`, `\\`, `\a`, `\b`, `\e` (escape char), `\f`, `\n`, `\r`, `\t`, and `\v`, as well as the customary `[`.
The compiler recognizes numeric character notations starting with `\0` … `\7` (octals) as well as notations starting with `\0x` (hexadecimals).
Declarations that begin with `const` define compile-time constants. Currently, `int` and `float` are possible. For instance, `const int i = 5 + 3;` declares that `i` is synonymous with the integer value 8. No code is generated at this point; this is only tracked in the symbol table. Whenever `i` is found in the code from this point onwards until `i` goes out of scope, it is treated as if an 8 was written there.
In contrast, declarations that begin with `readonly` define variables that are allocated at that point. For instance, when `readonly int i = 5 + 3;` is encountered outside of functions, then bytes are allocated in global space for the variable `i` and loaded with the integer 8. The compiler will see to it that the variable is not changed. But it might still be changed behind the scenes, for instance if this variable is exported and then imported in another module where it is NOT declared `readonly`.
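The difference between the two kinds of declaration can be sketched as follows. The names `MaxSlots`, `StartSlots`, and `Inventory` are hypothetical, and the sketch assumes that a `const int` is accepted wherever the compiler expects a constant, as the section on compile-time evaluation suggests.

```ags
const int MaxSlots = 5 + 3;       // compile-time constant: only tracked in the symbol table
readonly int StartSlots = 5 + 3;  // a real variable in global space, protected by the compiler

int Inventory[MaxSlots];          // array size is known at compile time (8 elements)
```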
Note: If a declaration begins with `const string`, then it isn't interpreted as a compile-time constant. Instead it is interpreted as a declaration of a variable that has the special type `const string`. (Since variables of all kinds of type `string` have been deprecated for years, you probably don't want to do this.)
The compiler tries to find out whether code execution falls through the end of a non-empty case into the next case.
switch(inventory_item)
{
default:
player.Say("I can't use this thing this way."); // fall through, will yield a warning
case iCup:
player.Say("Putting the cup onto the table.");
…
}
Such fall through is usually unintended (a `break;` is missing). So the compiler warns when it detects the situation. In order to shut up the warning and declare that the fall through is intended, you can code `fallthrough;` immediately in front of the next case.
default:
player.Say("I can't use this thing this way.");
fallthrough;
case iCup:
…
The compiler will try to detect whether statements can't be reached, e.g., because a `break;` preceded them. This will yield a warning.
function squared(int i)
{
return i * i;
if (i > 5000) // can't be reached
…
}
The compiler will try to detect whether the end of a function body can be reached when the function has a return type that isn't `int`. In this case, a default integer `0` would be returned, which doesn't match the return type.
float squared (float f)
{
float result = f * f;
} // control flow can reach the closing brace of 'squared' which is an error.
This only pertains to local variables, i.e., variables that are declared within a function.
Within a pair of braces `{ … }` you can declare a local variable that has the same name as an already existing variable. Then this new definition will be valid within the pair of braces. Whenever the variable name is referenced within the braces, it will be interpreted to refer to this new definition. There's no way of referencing the old definition while the new definition is in scope, so the new definition "hides" the old definition.
When the closing brace is encountered, the new definition goes out of scope and the old definition will "reappear".
This will also work for variable declarations at the start of the parenthesized list of a `for` statement. Those variables are valid within the `for` statement and hide any variables that have the same name.
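The hiding rules can be sketched in one small function. The function `ShadowDemo` and the variable `count` are hypothetical; `Display` is the usual AGS built-in.

```ags
function ShadowDemo()
{
    int count = 1;
    {
        int count = 2;          // hides the outer 'count' until the closing brace
        Display("%d", count);   // refers to the inner definition
    }
    Display("%d", count);       // the outer definition has "reappeared"

    for (int count = 0; count < 3; count++)
    {
        // inside the 'for' statement, 'count' is the loop variable
    }
    Display("%d", count);       // the outer definition again
}
```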
You can use any expression with an appended `;` as a statement, just as in C or C#. But this can lead to unintended mistakes, so the compiler will only accept those expressions that have a side effect, defined to be either a function call, or an increment or decrement operation. This means that `7;` is illegal but both `i--;` and `--i;` are legal.
In contrast, the old compiler only accepted a function call or an assignment as an expression statement (where `var++;` was an assignment for the old compiler). This causes problems when we have statements such as `++var;` that should be legal but don't end on an assignment symbol.
Just as in the old compiler, it's still illegal to use a naked `;` as a statement, so the notorious code `while(i != 0);` will yield a compiler error instead of an endless loop at runtime.
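The rules for expression statements can be summarized in a short sketch. The function `StatementDemo` is hypothetical; the commented-out lines are the ones that, by the rules above, the compiler rejects.

```ags
function StatementDemo()
{
    int i = 3;
    i--;                   // legal: decrement operation
    --i;                   // legal: decrement operation
    Display("%d", i);      // legal: function call

    // 7;                  // illegal: no side effect
    // i + 1;              // illegal: no side effect
    // while (i != 0);     // illegal: naked ';' as a statement
}
```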