Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

starlark: add support for addressing #101

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 18 additions & 4 deletions doc/spec.md
Original file line number Diff line number Diff line change
Expand Up @@ -1505,7 +1505,7 @@ Operand = identifier
| ListExpr | ListComp
| DictExpr | DictComp
| '(' [Expression] [,] ')'
| ('-' | '+') PrimaryExpr
| ('-' | '+' | '*' | '&') PrimaryExpr
.

DotSuffix = '.' identifier .
Expand Down Expand Up @@ -1637,13 +1637,15 @@ Examples:

### Unary operators

There are three unary operators, all appearing before their operand:
`+`, `-`, `~`, and `not`.
Starlark has the following unary operators, which all appear before their operand:
`+`, `-`, `~`, `*`, `&`, and `not`.

```grammar {.good}
UnaryExpr = '+' PrimaryExpr
| '-' PrimaryExpr
| '~' PrimaryExpr
| '*' PrimaryExpr
| '&' PrimaryExpr
| 'not' Test
.
```
Expand Down Expand Up @@ -1690,9 +1692,21 @@ The bitwise inversion of x is defined as -(x+1).
~0 # -1
```

The Go implementation of Starlark additionally has the unary
prefix operators `*` and `&`, in order to support Stargo, which
provides Starlark bindings for Go variables.
These operators are not part of standard Starlark, and must be enabled
in Go by the `-addressing` flag.

The expression `*ptr` dereferences a pointer value, `ptr`.
The expression `&expr` returns the address of an expression `expr`;
it may be applied only to field selection or index expressions
such as `&a[i]`, `&x.f`, or `&x[i].f[j].g`.
Consult the Stargo documentation for more details.

<b>Implementation note:</b>
The parser in the Java implementation of Starlark does not accept unary
`+` and `~` expressions.
`+` and `~` expressions, nor `*` and `&` as unary operators.

### Binary operators

Expand Down
120 changes: 104 additions & 16 deletions internal/compile/compile.go
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ import (
const debug = false // TODO(adonovan): use a bitmap of options; and regexp to match files

// Increment this to force recompilation of saved bytecode files.
const Version = 5
const Version = 6

type Opcode uint8

Expand Down Expand Up @@ -80,9 +80,10 @@ const (

IN

// unary operators
// unary operators (order of +/-/* must match Token)
UPLUS // x UPLUS x
UMINUS // x UMINUS -x
USTAR // x USTAR *x
TILDE // x TILDE ~x

NONE // - NONE None
Expand All @@ -94,13 +95,16 @@ const (
NOT // value NOT bool
RETURN // value RETURN -
SETINDEX // a i new SETINDEX -
INDEX // a i INDEX elem
INDEX // a i INDEX elem elem = a[i]
SETDICT // dict key value SETDICT -
SETDICTUNIQ // dict key value SETDICTUNIQ -
APPEND // list elem APPEND -
SLICE // x lo hi step SLICE slice
INPLACE_ADD // x y INPLACE_ADD z where z is x+y or x.extend(y)
INPLACE_ADD // x y INPLACE_ADD z where z is x+y or x.extend(y)
MAKEDICT // - MAKEDICT dict
ADDRESS // var ADDRESS addr fails if not Variable [stargo only]
VALUE // var VALUE value identity if not Variable [stargo only]
SETVALUE // var x SETVALUE - fails if not Variable [stargo only]

// --- opcodes with an argument must go below this line ---

Expand Down Expand Up @@ -139,6 +143,7 @@ const (
// TODO(adonovan): add dynamic checks for missing opcodes in the tables below.

var opcodeNames = [...]string{
ADDRESS: "address",
AMP: "amp",
APPEND: "append",
ATTR: "attr",
Expand Down Expand Up @@ -186,12 +191,14 @@ var opcodeNames = [...]string{
POP: "pop",
PREDECLARED: "predeclared",
RETURN: "return",
VALUE: "value",
SETDICT: "setdict",
SETDICTUNIQ: "setdictuniq",
SETFIELD: "setfield",
SETGLOBAL: "setglobal",
SETINDEX: "setindex",
SETLOCAL: "setlocal",
SETVALUE: "setvalue",
SLASH: "slash",
SLASHSLASH: "slashslash",
SLICE: "slice",
Expand All @@ -202,13 +209,15 @@ var opcodeNames = [...]string{
UNIVERSAL: "universal",
UNPACK: "unpack",
UPLUS: "uplus",
USTAR: "ustar",
}

const variableStackEffect = 0x7f

// stackEffect records the effect on the size of the operand stack of
// each kind of instruction. For some instructions this requires computation.
var stackEffect = [...]int8{
ADDRESS: 0,
AMP: -1,
APPEND: -2,
ATTR: 0,
Expand Down Expand Up @@ -261,6 +270,7 @@ var stackEffect = [...]int8{
SETGLOBAL: -1,
SETINDEX: -3,
SETLOCAL: -1,
SETVALUE: -2,
SLASH: -1,
SLASHSLASH: -1,
SLICE: -3,
Expand All @@ -270,6 +280,8 @@ var stackEffect = [...]int8{
UNIVERSAL: +1,
UNPACK: variableStackEffect,
UPLUS: 0,
USTAR: 0,
VALUE: 0,
}

func (op Opcode) String() string {
Expand Down Expand Up @@ -1005,25 +1017,41 @@ func (fcomp *fcomp) stmt(stmt syntax.Stmt) {
fcomp.set(lhs)
}

case *syntax.UnaryExpr:
// *x = ...
fcomp.expr(lhs.X)
fcomp.emit(DUP)
fcomp.emit(VALUE)
set = func() {
fcomp.setPos(lhs.OpPos) // op == STAR
fcomp.emit(SETVALUE)
}

case *syntax.IndexExpr:
// x[y] = ...
fcomp.expr(lhs.X)
fcomp.addr(lhs.X)
fcomp.expr(lhs.Y)
fcomp.emit(DUP2)
fcomp.setPos(lhs.Lbrack)
fcomp.emit(INDEX)
if resolve.AllowAddressing {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems error-prone to emit different bytecode depending on dialect flags. An application might compile and cache code for a library that doesn't depend on this flag. The library could be loaded by different future invocations of that library in different modes.

One solution would be to include the resolve flag in the serialized header. But I think I'd prefer the simplicity of emitting these opcodes unconditionally. They're always no-ops if AllowAddressing is false, right? How severe is the overhead?

fcomp.emit(VALUE)
}
set = func() {
fcomp.setPos(lhs.Lbrack)
fcomp.emit(SETINDEX)
}

case *syntax.DotExpr:
// x.f = ...
fcomp.expr(lhs.X)
fcomp.addr(lhs.X)
fcomp.emit(DUP)
name := fcomp.pcomp.nameIndex(lhs.Name.Name)
fcomp.setPos(lhs.Dot)
fcomp.emit1(ATTR, name)
if resolve.AllowAddressing {
fcomp.emit(VALUE)
}
set = func() {
fcomp.setPos(lhs.Dot)
fcomp.emit1(SETFIELD, name)
Expand Down Expand Up @@ -1143,7 +1171,7 @@ func (fcomp *fcomp) assign(pos syntax.Position, lhs syntax.Expr) {

case *syntax.IndexExpr:
// x[y] = rhs
fcomp.expr(lhs.X)
fcomp.addr(lhs.X)
fcomp.emit(EXCH)
fcomp.expr(lhs.Y)
fcomp.emit(EXCH)
Expand All @@ -1152,11 +1180,19 @@ func (fcomp *fcomp) assign(pos syntax.Position, lhs syntax.Expr) {

case *syntax.DotExpr:
// x.f = rhs
fcomp.expr(lhs.X)
fcomp.addr(lhs.X)
fcomp.emit(EXCH)
fcomp.setPos(lhs.Dot)
fcomp.emit1(SETFIELD, fcomp.pcomp.nameIndex(lhs.Name.Name))

case *syntax.UnaryExpr:
// *x = rhs
fcomp.expr(lhs.X)
fcomp.emit(USTAR)
fcomp.emit(EXCH)
fcomp.setPos(lhs.OpPos) // op == STAR
fcomp.emit(SETVALUE)

default:
panic(lhs)
}
Expand All @@ -1170,6 +1206,47 @@ func (fcomp *fcomp) assignSequence(pos syntax.Position, lhs []syntax.Expr) {
}
}

// addr emits code for a chain of x.f and a[i] operations such as a.f[i].g[j].
//
// In Addressing mode, all nonrecursive calls to addr() (a chain of ATTR
// and INDEX ops) must be followed by one of ADDRESS, SETFIELD,
// SETINDEX, or VALUE:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This really confused me until I read the interpreter's code for VALUE a second time: it's a no-op for values that don't implement Variable. I think that should be mentioned here, too.

//
// &(expr.f[i].g) expr ATTR<f> i INDEX ATTR<g> ADDRESS
// (expr.f[i].g).x = y expr ATTR<f> i INDEX ATTR<g> y SETFIELD<x>
// (expr.f[i].g)[j] = y expr ATTR<f> i INDEX ATTR<g> y j SETINDEX
// _ = expr.f[i].g expr ATTR<f> i INDEX ATTR<g> VALUE
//
// We could in principle extend this scheme to all variables (such as x
// in x.f)), not just the x[i] and x.f operations applied to them, but
// it would add a cost to every variable lookup, and in practice it is
// unnecessary as most starlark.Variables live in a module such as http
// for http.DefaultServeMux, so they are acccessed using a dot expression.
//
// See comments at starlark.Variable for background.
//
func (fcomp *fcomp) addr(e syntax.Expr) {
switch e := e.(type) {
case *syntax.ParenExpr:
fcomp.addr(e.X)

case *syntax.DotExpr:
fcomp.addr(e.X)
fcomp.setPos(e.Dot)
fcomp.emit1(ATTR, fcomp.pcomp.nameIndex(e.Name.Name))

case *syntax.IndexExpr:
fcomp.addr(e.X)
fcomp.expr(e.Y)
fcomp.setPos(e.Lbrack)
fcomp.emit(INDEX)

default:
fcomp.expr(e)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: delete blank line here.

}
}

func (fcomp *fcomp) expr(e syntax.Expr) {
switch e := e.(type) {
case *syntax.ParenExpr:
Expand Down Expand Up @@ -1207,10 +1284,10 @@ func (fcomp *fcomp) expr(e syntax.Expr) {
fcomp.block = done

case *syntax.IndexExpr:
fcomp.expr(e.X)
fcomp.expr(e.Y)
fcomp.setPos(e.Lbrack)
fcomp.emit(INDEX)
fcomp.addr(e)
if resolve.AllowAddressing {
fcomp.emit(VALUE)
}

case *syntax.SliceExpr:
fcomp.setPos(e.Lbrack)
Expand Down Expand Up @@ -1255,13 +1332,23 @@ func (fcomp *fcomp) expr(e syntax.Expr) {
}

case *syntax.UnaryExpr:
fcomp.expr(e.X)
switch e.Op {
case syntax.AMP, syntax.STAR:
fcomp.addr(e.X)
default:
fcomp.expr(e.X)
}
fcomp.setPos(e.OpPos)
switch e.Op {
case syntax.AMP:
fcomp.emit(ADDRESS)
case syntax.MINUS:
fcomp.emit(UMINUS)
case syntax.PLUS:
fcomp.emit(UPLUS)
case syntax.STAR:
fcomp.emit(USTAR)
fcomp.emit(VALUE)
case syntax.NOT:
fcomp.emit(NOT)
case syntax.TILDE:
Expand Down Expand Up @@ -1317,9 +1404,10 @@ func (fcomp *fcomp) expr(e syntax.Expr) {
}

case *syntax.DotExpr:
fcomp.expr(e.X)
fcomp.setPos(e.Dot)
fcomp.emit1(ATTR, fcomp.pcomp.nameIndex(e.Name.Name))
fcomp.addr(e)
if resolve.AllowAddressing {
fcomp.emit(VALUE)
}

case *syntax.CallExpr:
fcomp.call(e)
Expand Down
Loading