Skip to content

Questions and errors in the spec #804

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Alxandr opened this issue May 7, 2020 · 26 comments
Closed

Questions and errors in the spec #804

Alxandr opened this issue May 7, 2020 · 26 comments

Comments

@Alxandr
Copy link

Alxandr commented May 7, 2020

According to the language spec, the following is the ways you can slice an array:

image

As far as I can tell, this does not accept the following strings:

  • foo[::]
  • foo[1::]
  • foo[::1]
  • foo[1::1]

However, as from what I can see, all of those works. Is this just an error in the BNF? Or am I reading it wrong?

@sbarzowski
Copy link
Collaborator

sbarzowski commented May 7, 2020

Thanks for catching this. I think you're right. The second and third expr should be marked as optional. And I think the second rule is going to be redundant then.

We support all the variants you mentioned. We even have rewriting rules for desugaring them in the next section of the spec:
image

@sparkprime Is that correct?

@Alxandr If you feel like contributing to Jsonnet documentation, this snippet comes from here: https://github.com/google/jsonnet/blob/v0.15.0/doc/ref/spec.html#L328. That's completely optional ofc. If not, I'll do it myself later.

@Alxandr
Copy link
Author

Alxandr commented May 7, 2020

I'm currently busy porting the whole thing to rust (for fun), so maybe after xD. Gonna leave this issue in my github inbox so I remember it til then, but it will probably be a week at least.

Also; while we're on the topic: { [x]+:true for x in ['a'] } is not legal according to the BNF, but is accepted by the parser. I've not gotten far enough yet to figure out whether this should be legal or not.

@sbarzowski
Copy link
Collaborator

That's cool! If you have any questions feel free to ask them here. Please report any discrepancies/bugs that you find – reimplementation allows to catch very tricky issues and it's a great way for us to make sure that docs, tests and code stay in sync.

I think you're also right about the +: in comprehensions and I think it should be legal. It's a bit weird thing to do, but it's clear what it should mean. I'm not sure if it's covered by the test suite or if anyone does that in the real code.

@Alxandr
Copy link
Author

Alxandr commented May 7, 2020

It would be great if someone could say definitely whether or not it should be legal or not. Currently, I've set up my parser not to accept it, but it's easily enough to reverse. On the same note, the fact that you can't use :: and ::: in comprehension I cannot find anywhere in the spec. Nor have I yet figured out what ::: even means (but I've only read the parsing section yet, with some minor exceptions).

And for those interested; the code can be found here. Parsing is found in this file (link is to specific commit, not master).

@Alxandr
Copy link
Author

Alxandr commented May 7, 2020

More pseudo-random notes (figured I might just write these downs as I go through things, and it can be discussed/ignored later):

image

This claims that desugar_assert is a function that takes a field and a boolean (not stating what that boolean is, nor how the assert got morphed into a field) and then returns a field (which is also an assert). I'm guessing that you've just labeled everything inside of an object a "field" (which is a bit confusing, but fine), but is binds a boolean? Because that doesn't seem to follow, given that it translates into local <binds>;. If not, what is binds?

The same kind of thing happens at desugar_field. Here, it obviously make sense for it to be called a field, but this function clearly doesn't take just 2 parameters. Not to mention, there is even imbalance in parentheses used:

image

My guess here is that (id h e) is the field (which make sense), binds is something else which I'll have to figure out, and b is whether or not we're inside an object?

Also; on desugar_field, it says it's for desugaring object fields, but it sill has the b bool, that it passes forward to desugar_expr.

@sbarzowski
Copy link
Collaborator

sbarzowski commented May 8, 2020

+: in comprehensions

It would be great if someone could say definitely whether or not it should be legal or not.

Ok, I check and it works in both official implementations, but it is not allowed by the spec. I'm not sure which way we should go at this point and it may take a while to commit one way or another. I don't think this is blocking, though. We slowly extend the language in various ways anyway, so you can consider it not allowed as of v0.15, but it's quite likely to be added in the future versions.

Actually, you may consider leaving object comprehensions for later, because it's historically one of the most problematic parts of the language, with a lot of boundary cases (mostly related to the visibility and scoping rules).

:: and :::

On the same note, the fact that you can't use :: and ::: in comprehension I cannot find anywhere in the spec.

Well, the syntax formally allows only one field format for object comprehensions ([expr]: expr):
image.

This is a restriction we may potentially relax in future versions, though.

Nor have I yet figured out what ::: even means (but I've only read the parsing section yet, with some minor exceptions).

: is "default visibility", :: is always hidden, ::: is always visible. When you use : it's visible by default, but when : field is used to override a hidden field it stays hidden. E.g. {a:: 1} + {a: 2} gets printed as {} (the field a is hidden). On the other hand ::: will be visible no matter what. The always visible variant is a pretty obscure feature, included mostly for completeness.

Formally, it is covered here:
image
and here (this is actually the most complicated rule in the whole spec – object inheritance):
image

This part describes what is the visibility of fields in the resultant object:
image

Desugaring fields

This claims that desugar_assert is a function that takes a field and a boolean (not stating what that boolean is, nor how the assert got morphed into a field) and then returns a field (which is also an assert). I'm guessing that you've just labeled everything inside of an object a "field" (which is a bit confusing, but fine)[...]

Yes, both asserts and locals inside the object count as "fields". I agree it's a bit weird. In some ways they are like fields – they are carried along with an object and have access to self and super. But they cannot be indexed, which I would normally consider what makes things "fields". But they are often referred to as "fields" in Jsonnet spec and implementations. I personally prefer the term "object locals" and "object asserts" when possible to avoid confusion.

[...], but is binds a boolean? Because that doesn't seem to follow, given that it translates into local ;. If not, what is binds?

Sorry, there is a mistake in that function's signature. It doesn't take a Boolean as a second argument, but a bunch of Binds. Bind is a short for binding –̀ the definition of a single local variable which binds a value to a name. For example in local a = 42, b = a + a; ... there are two Binds a = 42 and b = a + a.

As you noticed, the desugaring rules for a field miss that Binds argument. And yes there is an extra paren in that desugar_field rule – you don't need one after the first e. Thanks for catching this stuff.

FYI the binds come from these rules:
image

And here is a formal definition of a Bind (in the abstract syntax):
image

My guess here is that (id h e) is the field (which make sense), binds is something else which I'll have to figure out, and b is whether or not we're inside an object?

id h e is a field, where h is :, :: or :::. binds is a bunch of local bindings as explained above. b is whether or not we are inside the object.

You are probably wondering why do we need that and what's the intention behind all these. So, we have the binds parameter to eliminate object locals (aka local fields) from the core language. Object locals are just "copy-pasted" into every "real" field and every object assert. The C++ implementation implements that literally, but the version in Go (which is a newer implementation) object locals are actually preserved (the behavior is equivalent, only more efficient). And the bool for whether or not we are inside the object is used for $ which refers to the lexically outermost object (see local $ = self in the rule for desugaring object literals).

@Alxandr
Copy link
Author

Alxandr commented May 8, 2020

Thanks a bunch. I knew what a binding is, and did arrive at where "binds" originate myself, however, I didn't know (at the time of writing) if binds was a list of bindings, a list of local names in scope, etc. Given that both the reference implementations support +: in object comprehensions, I'm going to allow that as well. And I apologize about some of these questions that are somewhat obvious, but I'm reading the spec and implementing top to bottom, so I've not yet gotten to the point where the rules for :, :: etc. that you posted are explained, as I'm still just on the desugar section.

On a happy note, my parser passes all the parser_tests from the go implementation (though they only check that it accepts valid input and rejects invalid input, not what it parses it to).

@sbarzowski
Copy link
Collaborator

And I apologize about some of these questions that are somewhat obvious [...]

Please don't worry about this. I think most engineers have a habit of trying too hard. The sort of things that you are asking about used to be confusing for me as well, when I was implementing go-jsonnet :-). I don't mind answering even basic questions as long as they are asked in a public forum (so that the next person with a similar issue can find it). It's also valuable feedback for us about what could be explained better.

@Alxandr
Copy link
Author

Alxandr commented May 8, 2020

I'm back with more questions :). I've gotten past the first bind_expr after I realized I was looking at an object and not a local expression. The { is really easy to interpret as just a grouping of statements, because that's more akin to how they've been used in the BNF. I recommend making them clearer here somehow, maybe by putting them in ', like literals are done in the BNF above.

I'm currently completely stumped by the desugaring of object comprehension though, and figured I'd try to walk through how I read the definition here, so that I can be corrected, and maybe the description can be improved. And if I'm lucky, I might just figure it out as I'm writing (rubber ducking at it's finest). The definition looks like this:

image

And for having something to discuss, let's go with the following object comprehension expression:

{ 
  local l1 = 1,
  [f2]: 'body',
  local l2 = 2, 
  for f1 in [['a'], ['b']] 
  if true 
  for f2 in f1 
}

It's fairly convoluted, but it has all of the features (as far as I know) of a object comprehension that's being defined how to deal with in the desugar definition above.

So, if I am to interpret the desugar definition, it should go as follows (using some pseudo code):

// whether or not we're in an object, originates outside
let in_obj: b;

// Let arr fresh
let arr = Symbol('arr'); // new "binding" (no value)

// and x1…xn be the sequence of variables defined in forspec compspec
let x1 = 'f1'; // not sure if these are correct
let x2 = 'f2';

return {
  [desugar_local_expr(
    expr: local f1 = arr[1] // this is one indexed I assume?
              , f2 = arr[2]
              ; f2,
    in_obj: in_obj)]:
    
    desugar_local_expr(
      expr: local f1 = arr[1]
                , f2 = arr[2]
                , l1 = 1,
                , l2 = 2,
                ; 'body',
      in_obj: true),

  for arr in desugar_arr_comp_expr(
    expr: [[f1, f2] for f1 in [['a'], ['b']] if true for f2 in f1],
    in_obj: in_obj)
};

And as predicted, I might just have figured it out by trying to explain it 😅. Regardless, I figured I'd try to also explain the reason I'm personally having so much problem with this. What makes it really hard to interpret the definition of the desugar expressions (in addition to the fact that it's super terse) is the fact that it's really hard to see what is part of the language used to write the desugar function, and what is part of an jsonnet expression inside the function. For instance, if we look at the following definition only:

image

  1. The [ bracket at the beginning is jsonnet verbatim.
  2. The desugar_expr is desugar syntax for a call.
  3. The ( is desugar syntax.
  4. The local is jsonnet verbatim.
  5. The x_1 is a variable in desugar syntax to be printed.
  6. The = is jsonnet verbatim.
  7. The whole of arr[1] is jsonnet verbatim.
  8. The ... is repetition in desugar syntaxt.
  9. The x_n is a variable in desugar syntax to be printed.
  10. The = is jsonnet verbatim.
  11. The arr[ is jsonnet verbatim.
  12. The n is a variable in desugar syntax to be printed.
  13. The ] is jsonnet verbatim.
  14. The ; is jsonnet verbatim.
  15. The e_f is a variable in desugar syntax to be printed.
  16. The , is desugar syntax (part of the call started at pr.2).
  17. The b is a variable in desugar syntax (part of the call started at pr.2).
  18. The ) is desugar syntax (ends the call started at pt.2, matches pt.3).
  19. The ] is jsonnet verbatim.
  20. The : is jsonnet verbatim.

Basically, it's really hard to know what is and isn't jsonnet verbatim, and what is "code". Maybe adding color coding or somesuch to make this clearer would vastly help with readability here. Even having gone though this multiple times, I'm not sure the list above is correct.

@Alxandr Alxandr changed the title Error in language spec BNF? Questions and errors in the spec May 8, 2020
@Alxandr
Copy link
Author

Alxandr commented May 8, 2020

Also; as far as I can tell, jsonnet is 0-indexed:

local arr = [0, 1, 2, 3];

arr[2] // <- outputs 2

However, that would offset all of the variables defined in the spec here: 𝚕𝚘𝚌𝚊𝚕 x1=arr[1] by one, because x1 is the first x.

@Alxandr
Copy link
Author

Alxandr commented May 9, 2020

I find myself stuck once again:

image

An example here is ["key"]+: "value":
e: "key"
h: :
e_': "value"

Which means that e_n = "key"[$outerself/self,$outersuper/super] which seems weird and I don't know what it means. There is a comma inside a index/member access, and we are dividing tokens?
Then e_m is

if "key"[$outerself/self,$outersuper/super] in super 
then super["key"[$outerself/self,$outersuper/super]] + "value" 
else "value"

and we end up with

["key"]: 
  if "key"[$outerself/self,$outersuper/super] in super 
  then super["key"[$outerself/self,$outersuper/super]] + "value" 
  else "value"

@sbarzowski
Copy link
Collaborator

sbarzowski commented May 9, 2020

The desugaring rule for object comprehensions

So, if I am to interpret the desugar definition, it should go as follows (using some pseudo code):

AFAICT your interpretation is correct.

What makes it really hard to interpret the definition of the desugar expressions (in addition to the fact that it's super terse) is the fact that it's really hard to see what is part of the language used to write the desugar function, and what is part of an jsonnet expression inside the function.

I agree that it is hard to distinguish sometimes and that can be quite confusing. I have never thought about it, though. Thanks for spelling it out.

Being terse, on the other hand, is not a bug, it's a feature. You can't read it as fast as prose, but you'll need waaay more prose to describe things precisely (and that tends to be language-lawyery prose which is not exactly accessible either). That said, I would like to add comments explaining the intention for each rule, e.g. "reduce object comprehensions to use only one for and no ifs".

Maybe adding color coding or somesuch to make this clearer would vastly help with readability here.

Good idea. The most natural way for me seems to be using <code>-like formatting (the same that is used for verbatim tokens in abstract syntax.

And of course you're right about 0-indexing vs 1-indexing. Another good catch :-).

FYI go-jsonnet desugares object locals differently, but it turned out to also be quite painful. I plan to get rid of this rule completely and directly handle arbitrarily nested fors and ifs in runtime (the behavior for the end users will still be the same, of course)

Desugaring +:

Which means that e_n = "key"[$outerself/self,$outersuper/super] which seems weird and I don't know what it means. There is a comma inside a index/member access, and we are dividing tokens?

No. It just proves your previous point about lack of clarity about what is Jsonnet syntax and what part is the syntax of desugaring rules. The [$outerself/self,$outersuper/super] is the latter. It means that self and super are replaced with $outerself and $outersuper in that expression. Generally the field names are evaluated in the scope outside of the object, because they are all needed to create it. So for example { a: "foo", b: { a: "bar", [self.a]: "baz" }} evaluates to:

{
   "a": "foo",
   "b": {
      "a": "bar",
      "foo": "baz"
   }
}

In your example this whole substitution does nothing, because it does not use self or super. You can see a precise definition of this substitution in the "Capture-Avoiding Substitution" section. This is the keyword substitution variant.

If you are wondering why this is needed, it's because the expression for field name is also used in an if within the object. For example if you have {a: "foo", b: { [self.x] +: 15 } it desugars to something like { "a": "foo", "b": local $outerself = self; { [$outerself.x]: if $outerself.x in super then super[$outerself.x] + 15 else 15 }}.

FYI go-jsonnet does not implement that rule literally – the +: is actually not desugared and is handled during evaluation, which turns out to be much easier to implement than the whole substitution business here.

@Alxandr
Copy link
Author

Alxandr commented May 9, 2020

Oh, thanks. That's a great explanation :). Though, you might want to explain the replacement syntax (if I didn't just miss it).

Also, I just found this:

image

I can't find base anywhere else in the spec, so I'm not sure what this is supposed to be.

@sbarzowski
Copy link
Collaborator

s/base/e/, sorry

@Alxandr
Copy link
Author

Alxandr commented May 14, 2020

I've finished the code to lower from AST to Core (though my core language is slightly larger than the spec, as I keep the ´import/importstrto be evaluated later, and I've also opted for keeping the+:` operator instead of lowering that at this point. However, I have (at least) a bug in array comprehensions.

Using the following input:

[
  x * y
  for x in [1, 2, 3]
  if true
  for y in [4, 5, 6]
]

I get the following (erroneous) output:

local
  arr = [
    1,
    2,
    3,
  ];
std.join(
  [
  ],
  std.makeArray(
    std.length(
      arr,
    ),
    function(
      i=error 'Parameter not bound'
    )
      local
        x = arr[i];
      [

        if true
        then local
          arr = [
            4,
            5,
            6,
          ];
             std.join(
          [
          ],
          std.makeArray(
            std.length(
              arr,
            ),
            function(
              i=error 'Parameter not bound'
            )
              local
                y = arr[i];
              [
                x
                * y,
              ],
          ),
        )
        else [
        ],
      ],
  ),
)

Unfortunately, I don't know jsonnet well enough to see where I've gone wrong here. Do you see any obvious errors?

@Alxandr
Copy link
Author

Alxandr commented May 14, 2020

More questions. According to the spec; the following jssonet

{ c: 5 } { f: $.c }

Should expand to this:

{
    ["c"]: local
        $ = self;
    5,
}
  + {
      ["f"]: local
          $ = self;
      $["c"],
  }

Now, this might be entirely correct (and I might get answers if I started implementing the engine), but it doesn't seem correct. Basically, as per my understanding of self, self.c should not exist in the second object. Yet, when I try out jsonnet on the website, I get { c: 5, f: 5 }. Is this correct? And is the expansion correct? Am I just misunderstanding self? Is self late enough bound that it's bound after the two objects have been merged?

@sbarzowski
Copy link
Collaborator

sbarzowski commented May 14, 2020

Do you see any obvious errors?

Yes. There is one unnecessary array. This is already an array, you don't need to wrap it in another one:

        if true
        then local
          arr = [
            4,
            5,
            6,
          ];
             std.join(
          [
          ],
          std.makeArray(
            std.length(
              arr,
            ),
            function(
              i=error 'Parameter not bound'
            )
              local
                y = arr[i];
              [
                x
                * y,
              ],
          ),
        )
        else [
        ],

BTW a great way to debug Jsonnet programs in cases like this is to take the interesting subexpressions out of the program and evaluate them on their own. Due to referential transparency, explicit namespaces and everything being an expression this tends to work much better than in other languages. Basically the only need you need to fix if you take expression out of context is local variables passed from above (and Jsonnet will always tell you about a missing variable, and it's impossible to get evaluation to succeed, but produce a wrong result this way).

EDIT: from what I say this is also a problem in desugaring rules – there is an extra [] for the if rule and I'm not sure about nested for rule,

@sbarzowski
Copy link
Collaborator

sbarzowski commented May 14, 2020

Now, this might be entirely correct (and I might get answers if I started implementing the engine), but it doesn't seem correct. Basically, as per my understanding of self, self.c should not exist in the second object. Yet, when I try out jsonnet on the website, I get { c: 5, f: 5 }. Is this correct?

Yes.

And is the expansion correct?

Yes.

Am I just misunderstanding self?

Maybe.

Is self late enough bound that it's bound after the two objects have been merged?

Yes, exactly. This is how operator + provides inheritance in the OO sense – it can refer to fields which can be overridden and/or defined in the base object. The fields of objects are always evaluated only when they are indexed. The local variables are captured in the object (just like in a function), but self and super are preserved – their meaning is dependent on the final object.

So, of course { f: $.c }.c will fail to evaluate on its own, but you can provide c by combining it with another object like in your example.

I usually think of Jsonnet objects as stacks of layers. Each layer consists a bunch of field definitions. So {a: 1} is a single layer object with field a, while {a: 1} + {b: 2} + {a: 3, c: 1} has three layers (first with field a, second with b and third with a and c). All indexing operations are defined only on the stacks, not on individual layers. The a + b operation creates a new object by putting layers of b on top of the layers from a.

Self, super and indexing are best explained with an example:

local obj = {
  val: 1,
  with_var: obj.val,
  with_self: self.val,
  with_super: super.val.
};
local obj2 = obj + {val: 2};
local obj3 = {val: 3} + obj2;

Indexing object from outside like obj.val starts looking for the field val from the top of the stack. The obj is a variable which refers to a fixed, concrete object, so obj2.with_var = 1. On the other hand self.x also starts looking from the field x from the top of the stack of the object in which it is evaluated, so obj2.with_self = 2. The last variant super.x searches is from the layer below the current one towards the bottom. So obj2.with_super will fail to evaluate, because it has already reached the bottom of the stack without finding val (there is no layer below the one where with_super is defined). When you put a layer with val below it, it will work – obj3 evaluates to {val: 2, with_var: 1, with_self: 2, with_super: 3}.

Hope it helps.

@Alxandr
Copy link
Author

Alxandr commented May 15, 2020

I got array comprehension working :). It seems there are two errors in the spec here:

image

the array around the recursive call here is unneeded, as desugar_arrcomp will always produce an array.

image

the same is true here.

@sparkprime
Copy link
Collaborator

sparkprime commented May 15, 2020 via email

@Alxandr
Copy link
Author

Alxandr commented May 15, 2020

Not sure I get the question? Is what to desugar e'?

@sparkprime
Copy link
Collaborator

Sorry I misread what you said. You're right. Also there's a "base" just below that should be an e.

The desugar_expr could be around the (e') as well, but that's more stylistic than correctness

@sparkprime
Copy link
Collaborator

Thanks for taking the time to read through this by the way. It's rare for people to read it in detail unless they're actually needing to use it for something. And being as it's written by a human being and not verified, there still lingering mistakes even years after it was written.

@Alxandr
Copy link
Author

Alxandr commented May 16, 2020

Yeah, well, I am using it 😛. But yeah, I figured I might as well take the time to report any issues I find given that I'm going through it with a rather fine comb (not sure if that expression translated correctly). I've started looking at the rules for validation and execution and will probably have more questions shortly because I think the syntax needs to explained more, but for now, I'm redoing my parser to a different architecture, so not entirely sure when I'll get to it.

sbarzowski added a commit to sbarzowski/jsonnet that referenced this issue May 21, 2020
This change only covers clear, simple errors
and not stylistic issues or spec-implementation
discrepancies.
sbarzowski added a commit to sbarzowski/jsonnet that referenced this issue May 21, 2020
This change only covers clear, simple errors
and not stylistic issues or spec-implementation
discrepancies.
@Alxandr
Copy link
Author

Alxandr commented May 21, 2020

By the way; I've (for the most part) finished implementing v3 of the parser, and I'm quite happy with this one. There are some issues with slices (not parsing them, I believe, but figuring out what expr is what in the resulting parse tree), but I'll be looking at that shortly. In the meantime, I have my parser running in WASM, so I added it as a plugin to astexplorer. I thought some of you might find this cool. It's available here: https://jsonnet-astexplorer.now.sh/. Just select jsonnet from the language dropdown, and you can inspect the parse tree output of the jsonnet parser 😃.

sbarzowski added a commit that referenced this issue Aug 12, 2020
This change only covers clear, simple errors
and not stylistic issues or spec-implementation
discrepancies.
@sbarzowski
Copy link
Collaborator

I think we fixed all issues reported here. Feel free to reopen if we missed something or if there is something new (or just open a new issue, either way is fine).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants