Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specify enhanced control structures #27

Open
wants to merge 1 commit into
base: spec-gold
Choose a base branch
from
Open

Conversation

jdpage
Copy link
Collaborator

@jdpage jdpage commented Mar 26, 2018

Specifically, this provides the following:

  • Bare do blocks. This is equivalent to the ability to just put a { ... } somewhere in C-like languages, which can be quite useful. Easy to implement.
  • Named loops. This is pretty straightforward to implement, and is massively useful. In addition, it actually saves space in the program image--you don't need a sentinel for breaking out of multiple nested loops anymore!
  • let blocks with multiple bindings. Not earth-shaking, but plays well with the redo feature.
  • let blocks with an attached scope. Easy to implement (since let statements have an implicit scope anyway), and another feature which plays nice with the redo feature.
  • A redo statement.

The redo statement is a little weird, and is a combination of a Scheme feature and a Perl5 feature. In Perl5, the redo statement is the missing partner to break and continue (or next in Perl), in that it allows you to rerun the current loop iteration without re-evaluating the condition. It's equivalent to something like:

while (x < 42) {
myloop:
    /* ... */
    goto myloop;
    /* ... */
}

which for us would be

while x < 42 do: myloop
  /* ... */
  redo myloop
  /* ... */
end

On the Scheme side, you can label your let forms. A Scheme let is actually equivalent to creating and immediately calling an anonymous closure, and labelling a let just gives it a name. You can then express tail-recursive looping constructs by calling the given name, and it lets you do tail-recursion without having to provide an additional helper function with an accumulator.

Our redo would combine those features, with the caveat that it must be tail-recursive (since regular recursion on the C64 would be a disaster). It's actually somewhat cheaper than a function call. In addition, the compiler can use it internally to express loops; it's essentially something that we'd be implementing anyway, and it's so useful that we might as well expose it to the user.

I think I'd like, if possible, to avoid having to introduce a goto statement into Gold-syntax, and redo is a step towards providing a facility which provides a lot of its power in a structured, safe fashion.

I'm not sure about the do: foo ... end syntax for naming blocks, since : normally means that we're specifying a type. It's a kind of binding, so it feels like = should be involved, so here's my two alternate syntaxes for labelled do, let-do, and while-do respectively. Note that bare do is dropped in favour of an empty let binding in both:

let loop = do
    /* ... */
end

let x: u8 = 7, loop = do
    /* ... */
end

while x < 7, loop = do
    /* ... */
end
let(loop) do
    /* ... */
end

let(loop) x: u8 = 7 do
    /* ... */
end

while(loop) x < 7 do
    /* ... */
end

@coveralls
Copy link

Pull Request Test Coverage Report for Build 154

  • 0 of 0 (NaN%) changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 58.727%

Totals Coverage Status
Change from base Build 151: 0.0%
Covered Lines: 465
Relevant Lines: 743

💛 - Coveralls

effectively tail-recursion: ::

fun factorial(n: u16) -> u16
let acc: u16, k: u16 = 1, n do
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is missing a mut?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is, but I might be wrong. All that "immutable" means is that "you can take a pointer to this, and as long as it's still valid, the result of dereferencing that pointer won't change".

In the factorial example, this is actually true; if you take a pointer to, say, acc, then you must be taking the pointer inside the let-do-block. However, that pointer is only valid to the end of the scope. The redo statement has a subtlety, here -- since it ends the block, it closes the scope, but until the scope is closed, the bindings keep the same value. It then re-starts the block, and re-opens the scope, this time with the new memory values. In other words, calling redo necessarily invalidates the pointer.

fun foo(n: u16) -> u16
  let k: u16, i: u16 = n, n do
    let j: &u16 = &k
    if k > 1 then
      /* this is fine */
      redo (i - 1, *j - 1)
    else
      return k
    end
  end
end

Notice above--the pointer j is actually valid until the next statement executed after the redo statement, and can be used safely! The memory used by k isn't changed until then -- in fact, k isn't even rebound until the redo statement completes. The fact that we re-use the memory for k and i should be an implementation detail; by the time we re-use it, you no longer have normal access to it anyway.

I agree that this is a little confusing at first, but I don't want to make the use of redo require you to smear gratuitous muts everywhere; in fact, I kind of see it as an alternative to mut. It gives you two options for an algorithm:

  1. Implement the algorithm iteratively, explicitly marking the memory that's to be re-used. Memory to be re-used can be changed arbitrarily.
  2. Implement the algorithm tail-recursively, allowing the compiler to automatically re-use memory based on scope. Memory to be re-used can only be changed per-iteration.

Which option is clearer depends on the algorithm and the person writing it, of course, though I personally feel like option 2 really is the better option.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see. this is starting to explain some of the advantages to your enhanced redo. i'm still struggling with it, but i'm going to let it sit for a while to be sure my aversion isn't "this is different!" but rather "this could be better".

@woodrowbarlow
Copy link
Collaborator

woodrowbarlow commented Apr 4, 2018

i am definitely on board with bare do blocks, named loops, redo statements.


for named loops, it feels a little awkward to have the name on the right side of the "do"... once i see "do", i expect the next thing to be a statement. and i think it's reasonable to expect that. i'm also worried that our use of colons is losing any sense of consistency. lemme think on that, but i'm not sure if i can do better than the options you've already suggested.


for multiple-binding let statements, i just feel that the explanation you provided in the spec document is muddled. you've tried to define a basic let statement, a multiple-binding let statement, and a let statement bound to a loop's scope all at once and i think it wasn't clear.

let acc: u16, k: u16 = 1, n

i think this means "assign the value 1 to acc and the value n to k. beyond, like, two bindings in this syntax I find it gets very difficult to read. i find myself literally putting two fingers on the screen and matching up "okay, this value gets assigned to this identifier, this value gets assigned to this one...".

also, is this allowed?

let x: u8, y: u8 = 0

does it mean both values get assigned a value of zero? or that y gets a value of zero and x is left uninitialized?

can a multiple-binding let statement contain bindings of multiple different types?


as for this concept of let statements being bound to a scope in a new way, i'm confused. it seems to be only useful for the redo statement.

my hesitation is mostly this:

let mut x: u8 = 5,
let mut y: i16 = 0,
while x > 0 do
    /* whatever */
end

these bindings are both scoped within the loop, but only because of a single comma! they certainly don't look like they belong to the inner loop. if i'm reading this, i might put a print(y) statement after the end for debug.

this is not the whitespace style you used in your document, but i think it is the natural syntax
(newlines after each comma) especially because it will be common to have maybe 4 or so variables bound in this way.


final summary

if i've correctly understood your motivation for multiple-binding let statements, scoping a let statement to a loop in a new way, and allowing expressions in a redo statement, it's basically to support this tail recursion example you've provided:

fun factorial(n: u16) -> u16
    let acc: u16, k: u16 = 1, n do
        if k > 1 then
            redo (acc * k, k - 1)
        else
            return acc
        end
    end
end

however, if i'm correctly understanding what these new features are meant to do, here is the same code written without multiple-binding let statements, let statements scoped to a loop in a new way, or expressions in a redo statement.

fun factorial(n: u16) -> u16
    let mut acc: u16 = 1
    let mut k: u16 = n
    do
        if k > 1 then
            acc *= k
            k -= 1
            redo
        else
            return acc
        end
    end
end

your example saved a few keystrokes, but isn't more optimized than mine and i argue that it's less readable than mine for three reasons:

  • the multiple assignment requires matching value order with the identifiers when they aren't right next to each other
  • the redo requires matching expression order with the identifiers and they aren't even on the same line
  • if you want the do and end to share the same vertical column, then the let binding in your example appears to belong to the parent scope with only a little comma hinting otherwise.
    • and, like i said above, i believe this syntax will naturally become a de-facto standard as the let bindings become more complex.

@jdpage
Copy link
Collaborator Author

jdpage commented Apr 4, 2018

for named loops, it feels a little awkward to have the name on the right side of the "do" ... i'm also worried that our use of colons is losing any sense of consistency.

I agree with this 100% -- colons should only be used for notating types. Of the options I've given, I like Alternative 1 best, with name = do. I don't really want to overload parentheses more than they're overloaded. (In fact, I'm tempted to suggest that we use [ ] for both function calls and indexing, but that might be a little too radical.)

for multiple-binding let statements ... the explanation ... is muddled.

It is. Maybe I should define the combined syntax, then provide examples of each sub-part?

let acc: u16, k: u16 = 1, n

i think this means "assign the value 1 to acc and the value n to k. beyond, like, two bindings in this syntax I find it gets very difficult to read.

Yeah. I think my thought process was that it allows for nice stuff later like a, b = b, a to swap variables, and allowing multiple return from functions, but to be honest if that's important I'd prefer to just allow both 'C' multiple assignment (i.e. let acc: u16 = 1, k: u16 = n) and the proposed syntax. The 'C'-style multiple assignment is way easier to read.

also, is this allowed? let x: u8, y: u8 = 0

My inclination is the say "no", but I guess allowing uninitialized declarations that can't be read from until they've been assigned could be useful. C# allows it, but Rust does it better by making control structures into expressions.

Maybe "no" for now, then relax it later if it seems useful?

can a multiple-binding let statement contain bindings of multiple different types?

Yes!

as for this concept of let statements being bound to a scope in a new way, i'm confused. it seems to be only useful for the redo statement.

That's true. If all you wanted to do was restrict the scope of some variables, you could just throw them in a bare do block.

my hesitation is mostly this:

let mut x: u8 = 5,
let mut y: i16 = 0,
while x > 0 do
    /* whatever */
end

these bindings are both scoped within the loop, but only because of a single comma!

More proof that my description is muddy, since the above code snippet is a syntax error the way I intended the spec to be read. It should be this instead (if we're allowing C-style multiple assignment as discussed above):

let mut x: u8 = 5, mut y: i16 = 0 do
    while x > 0 do
        /* whatever */
    end
end

If we're not allowing C-style multiple assignment (i.e. PR-as-reviewed), it would be one of the following:

let mut x: u8 = 5 do
    let mut y: i16 = 0 do
        while x > 0 do
            /* whatever */
        end
    end
end
let mut x: u8, mut y: i16 = 5, 0 do
    while x > 0 do
        /* whatever */
    end
end

if i've correctly understood your motivation for [above features], it's basically to support [tail recursion] ... here is the same code written without [above features] ... your example saved a few keystrokes, but isn't more optimized than mine and i argue that it's less readable than mine ...

My motivation is kind of mixed. On the one hand, the features given are really useful as internal compiler constructs anyway, and while one could do without them, they are handy if you want to implement a naturally-tail-recursive algorithm tail (which, admittedly, factorial is not). When it comes down to it, anything that you can express tail-recursively can be expressed iteratively as well -- this is the entire point of tail recursion.

I think that support for recursion is important. That's a hard sell on the C64, but I like to think that this captures a lot of the parts that one actually wants. If you have an algorithm that's naturally tail-recursive, you can express it. If you don't, then you can express it either tail recursively or iteratively.

the redo requires matching expression order with the identifiers and they aren't even on the same line

This criticism applies equally to function calls. At least the redo is probably on the same screen as the let.

@jdpage
Copy link
Collaborator Author

jdpage commented Apr 4, 2018

As a side note, if you REALLY wanted to implement factorial without recursion, a better way would be:

fun factorial(n: u16) -> u16
  let mut acc: u16 = 1
  for k: u16 in 2 to n + 1 do
    acc *= k
  end
  return acc
endfun

@jdpage
Copy link
Collaborator Author

jdpage commented Apr 5, 2018

Yet another alternative syntax for named loops, riffing on the idea that redo is similar to a tail-recursive function call.

let-bindings and for-loops fit really nicely into this syntax since they bind variables. while-loops are incredibly awkward.

/* Named let-binding with variables (name is 'foo') */
let foo(x: u8 = 7, y: i16 = 300) do
  redo foo
  redo foo(8, 9)
  break foo
  continue foo
end

/* Named do block */
let foo() do
  /* ... */
end

/* Named for loop */
for foo(x: u8) in 0 to 7 do
  /* ... */
end

/* Named while loop #1 (ugly) */
while foo() x < 7 do
  /* ... */
end

/* Named while loop #2 
 * (ugly and name position is inconsistent with above) */
while x < 7 foo() do
  /* ... */
end

/* Named while loop #3 (inconsistent syntax) */
while x < 7 let foo() do
  /* ... */
end

/* Named while loop #4 (weird but ok maybe?) */
let foo() while x < 7 do
  /* ... */
end

/* Named while loop #5 (impossible to parse context-free,
 * since you have to know whether 'foo' is bound) */
while foo(x < 7) do
  /* ... */
end

@woodrowbarlow
Copy link
Collaborator

woodrowbarlow commented Apr 9, 2018

what about this?

kinda similar to function declarations.

[scope <name>[(<binding>[, <binding>])]] [for|while <condition>] do [...] end

examples:

fun factorial(n: u16) -> u16
    scope inner(acc: u16 = 1, k: u16 = n) do
        if k > 1 then
            redo inner(acc * k, k - 1)
        else
            return acc
        end
    end
end

scope myloop(x: u8 = 15, y:i16 = -2) while x > 5 do
    if (x % 2 == 0) then
        redo myloop
    elseif (x % 5 == 0) then
        redo myloop(x, y*2)
    end
    x -= 1
end

scope iterator for x: u8 in 0 to 10 do
    /* ... */
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants