
Nested tuple unpacking #194

Closed · metagn opened this issue Feb 20, 2020 · 10 comments

Comments

@metagn (Contributor) commented Feb 20, 2020

This issue now proposes nested tuple unpacking in the core language. I have implemented some of the features I previously proposed in a small library: https://github.com/hlaaftana/definesugar.

Relevant issues on the Nim repo: nim-lang/Nim#13439, nim-lang/Nim#2436

The current tuple unpacking syntax is limited to:

let (a, b*, c) = (1, 2, 3)

This issue proposes:

let (a, (b*, c)) = (1, (2, 3))

The main use is for sugar/optimization in for loops:

for i, (a, b) in {"a": 1, "b": 2}:
  discard

I don't know if it's useful in any other context.
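
For comparison, a minimal sketch of the closest thing that works in current Nim, with no nested unpacking: unpack inside the loop body instead.

# current Nim: iterate with `pairs` and unpack the tuple in the body
for i, pair in {"a": 1, "b": 2}:
  let (key, val) = pair
  echo i, ": ", key, " = ", val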

The old version of the RFC (long, kept only for reference):

2. Multiple definitions of the same term

I thought of this syntax:

let (a, (b, c) as y) as x = (1, (2, 3))
var (a, b) as (c, d) = (1, 2)

but this looks kind of ugly, and with var this would copy instead of aliasing, as the word as implies:

var (a, b) as x = (1, 2)
x[0] = 3
echo a # 1

The solution is just the way you do nested tuples now:

let
  x = (1, (2, 3))
  (a, y) = x
  (b, c) = y
let irrelevant = (1, 2)
var
  (a, b) = irrelevant
  (c, d) = irrelevant

3. Unpacking in routine arguments

I realized after the fact that the macro solution for this is more expressive than the syntax solution.

proc foo((a, b): (int, int)) =
  discard

let x = (1, 2)
foo(x)
var t: Thread[(int, int)]
t.createThread(foo)

vs:

proc foo(x, y: int, str: string) {.packArgs.} =
  discard

var t: Thread[firstArgType(foo)]
t.createThread(foo)

proc foo(a: bool, x, y: int, str: string) {.packArgs(1..2).} =
  discard
foo(true, (1, 2), "str")
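
For reference, a minimal sketch of how this is handled in current Nim without either proposal: take the whole tuple as a single parameter and unpack it inside the body. This assumes threads are enabled and a recent Nim where the Thread API lives in std/typedthreads (on older versions it comes from system with --threads:on).

import std/typedthreads

proc worker(args: (int, int)) {.thread.} =
  let (a, b) = args   # unpack inside the body instead of in the signature
  echo a + b

var t: Thread[(int, int)]
createThread(t, worker, (1, 2))
joinThread(t)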

4. Named tuple unpacking

By typing out the fields, you can make unpacking named tuples field-order-agnostic.

type Person = tuple[name: string, age: int]
let person: Person = (name: "John Smith", age: 30)
let (age: a, name: n) = person

Note that this depends on a var (a: int, b: int) = (1, 2) syntax not existing, which it currently doesn't.

This could also be extended to objects/ref objects/anything with fields, but it would have to be duck typed. For objects, it would mean the left-hand side of an assignment could have the side effect of raising a FieldError, because of case object fields. This would be solved if the ProveField warning were turned into an error in the specific case of a case object field being unpacked, but there would still be no solution for general field access unpacking.

type Obj = object
  field1: int
  case isBranch: bool
  of true:
    field2: int
  else: discard
let (field1: a, field2: b) = Obj(field1: 1, isBranch: false) # compile time error
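
For context, a minimal sketch of why the run-time side effect matters: plain field access on an inactive branch already raises today, assuming field checks are enabled (the default in debug builds); the defect is named FieldDefect on recent Nim, FieldError on older versions.

# reusing the Obj type defined just above
let o = Obj(field1: 1, isBranch: false)
echo o.field1   # fine
# echo o.field2 # compiles (ProveField is only a warning) but raises FieldDefect/FieldError
#               # at run time when field checks are on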

5. Splat/spread unpacking

I wrote this without postfix * in mind (in a macro it would be replaced with some other prefix) and I don't think it makes that much sense for tuples. Keeping it here for reference.

Nim has no concept of a splat/spread operator as in other languages, so this would be a completely new idea to Nim. My suggestion in this case is a prefix * as in Python. The key use of this is only getting the first/last few elements of a tuple, using the _ special symbol, although most tuples are too small for this to matter.

let (a, b, *c, d) = (1, 2, 3, 4, 5)
let (a, b, *_) = (1, 2, 3, 4, 5)

There can only be 1 splat/spread in a single unpack. Splatting could also work for named tuples, but the compiler would have to create a new named tuple type, so it would not work for objects.
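
For comparison, a minimal sketch of the current workaround for the tuple case: index the leading elements explicitly, since no spread exists in current Nim.

let t = (1, 2, 3, 4, 5)
let (a, b) = (t[0], t[1])
echo a, " ", b   # 1 2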

6. Array/generic subscript unpacking

This was originally simpler, but it doesn't fit the core language, so I mixed it with other wild ideas and am keeping it here for reference.

You can either extend the let (,) syntax to work for anything you can index from (not the best idea), or introduce a new let [,] syntax for arrays, which is excessive because it implies you can do [a, b] = [1, 2] as a normal assignment.

var arr: array[1..10, someType]
var [firstElem, *rest] = arr
var i: someType
[firstElem, i] = [1, 2]

Having a syntax for arrays distinct from the one for tuples would mean you could index by enums, just like you can in normal arrays.

type StreetlightColor = enum
  Red, Yellow, Green

var areOn: array[StreetlightColor, bool]
let [Red: redOn, Yellow: yellowOn, Green: greenOn] = areOn
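
For reference, a minimal sketch of what this looks like in current Nim without the proposed bracket syntax: plain enum-indexed access.

# reusing the StreetlightColor enum and `areOn` array declared just above
areOn[Yellow] = true
let redOn = areOn[Red]
let yellowOn = areOn[Yellow]
let greenOn = areOn[Green]
echo redOn, " ", yellowOn, " ", greenOn   # false true false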

Combined with splatting though, this syntax can get complex.

type Days = enum
  Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday

var arr: array[Days, bool]
let [Monday..Thursday: *weekdayArr, Friday: fridayValue, Saturday..Sunday: *weekendArr] = arr
let [Monday: *weekdayArr[4], Friday: fridayValue, Saturday: *weekendArr[2]] = arr

This could be extended for anything that you can index like seqs or strings, but it would have the same side effect problem as 4.

let [0: first, 0..^1: all, ^3: lastThird] = @[1, 2, 3, 4, 5]

So you could have an alternative syntax that implies side effects are possible like so:

let {0: first, 0..^1: all, ^3: lastThird} = @[1, 2, 3, 4, 5]
import json
let {"a": a, "b": b} = %*{"a": 1, "b": 2}
# expands to
let table = %*{"a": 1, "b": 2}
let a = table["a"]
let b = table["b"]

This is also weird syntax because {} constructs arrays of 2-tuples and sets and works as its own subscript operator; it has nothing to do with [] subscripting.

Custom unpacking macros:

This would be an experimental feature akin to case statement macros, term rewriting macros, for statement macros, etc. that requires a parser change. I don't know exactly how the parser works, but the way I see it, it would go like this: let a = b (and likewise var and const) can have any untyped expression for a that a = b would allow, as long as it's not one of the predefined assignment syntaxes of Nim. For example, if Nim is parsing (a, b, (c, d)) and reaches (a, b, then gets the token (, it re-records the tokens starting from the first ( as an expression, which is then mapped to an unpack(lhs: untyped, rhs: T) macro. This returns an nnkAsgn that gets remapped inside a let/var/const statement, or stays as nnkAsgn if it comes from an assignment. The fact that let (a, b) and let a are parsed normally and not as expressions means those would not be handled by unpack, meaning you can return them as the concrete value of unpack.

import options, macros
let opt = some 3
macro unpack[T](lhs: untyped, rhs: Option[T]): untyped =
  template invalid = error("invalid left hand side for Option", lhs)
  case lhs.kind
  of nnkCall:
    if lhs.len == 2 and lhs[0] == ident"Some":
      result = newAssignment(lhs[1], newCall(bindSym"get", rhs))
    else: invalid
  of nnkIdent, nnkAccQuoted:
    # this should be unreachable, `let ident = expr` is always handled by Nim
    warning("got identifier/accent for left hand side of unpack, should not happen", lhs)
    result = newAssignment(lhs, rhs)
  else:
    invalid
let Some(val) = opt
echo val # 3
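
As a point of reference, roughly the same thing can be written today as an ordinary user-callable macro, without any parser change; this is only a sketch, and the names unpackSome, opt2 and val2 are hypothetical. It only handles the Some(ident) shape.

import std/[options, macros]

macro unpackSome(lhs, rhs: untyped): untyped =
  ## expects `lhs` of the form Some(<ident>); expands to `let <ident> = get(rhs)`
  expectKind lhs, nnkCall
  expectLen lhs, 2
  if not lhs[0].eqIdent("Some"):
    error("invalid left hand side for Option", lhs)
  result = newLetStmt(lhs[1], newCall(bindSym"get", rhs))

let opt2 = some 3
unpackSome(Some(val2), opt2)
echo val2  # 3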

The only difference between the proposed unpack hook and case statement macros is that case statement macros do not recheck for another case statement macro, whereas the unpack macro would. I.e.:

import macros

{.experimental: "caseStmtMacros".}

type X = distinct string

macro match(s: X): untyped =
  result = quote do:
    echo "recursion worked"

macro match(n: tuple): untyped =
  result = newTree(nnkCaseStmt)
  result.add(newCall(bindSym"X", n[0][0]))
  for i in 1 ..< n.len:
    let it = n[i]
    case it.kind
    of nnkElse, nnkElifBranch, nnkElifExpr, nnkElseExpr:
      result.add it
    of nnkOfBranch:
      let br = newTree(nnkOfBranch)
      let min1 = it.len - 1
      for j in 0..<min1:
        br.add(newCall(bindSym"X", it[j][0]))
      br.add(it[min1])
      result.add(br)
    else:
      error "'match' cannot handle this node", it

case ("foo", 78)
of ("foo", 78): echo "tuple yes"
of ("bar", 88): echo "tuple no"
else: discard

prints "tuple yes". The problem with recursing is that it may go on infinitely, which is solved by just doing echo result.repr, maybe a explain macro like what concepts have or a hard limit like term rewriting macros have.

Here is a concept sketch for JSON:

import json

let obj = parseJson"""{"name": "John Smith", "age": 30, "children": ["James", "Mary"]}"""
let Object{name: String(name), age: Int(age), children: Array[String](children)} = obj
@Clyybber

I agree on 1.
(3 or 4 don't seem too crazy, but also not too useful :p)

@Varriount

I agree with 1 as well. I'm on the fence about 3 and 5.

With regards to 2, 4, and 6, I just don't feel like there's enough of a use-case or frequency to justify the added syntax.

@Araq (Member) commented Feb 21, 2020

Only 1 is acceptable, the rest is overkill.

@metagn (Contributor, Author) commented Feb 21, 2020

I can understand them being too much for the core language, though I don't know if it's worth creating some macro like customLet: a = b or customLet(a = b) if it's never going to be as expressive as let a = b. Macros that create objects/enums also have this problem: first, macros can't generate nnkObjectTy or nnkEnumTy, so they have to generate an entire new type section; second, nnkObjectTy and nnkEnumTy are not standalone untyped expressions that can be used in macros, and neither are a*: int or case a: int.
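
A minimal sketch of the workaround described above, i.e. a macro that has to emit an entire type section because nnkObjectTy can't stand on its own (makeObj is a hypothetical name):

import std/macros

macro makeObj(name: untyped): untyped =
  # emits:  type <name> = object
  #           x: int
  result = newTree(nnkTypeSection,
    newTree(nnkTypeDef,
      name,
      newEmptyNode(),                  # no generic parameters
      newTree(nnkObjectTy,
        newEmptyNode(),                # no pragmas
        newEmptyNode(),                # no inheritance
        newTree(nnkRecList,
          newIdentDefs(ident"x", ident"int")))))

makeObj(Foo)
echo Foo(x: 3).x  # 3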

@metagn (Contributor, Author) commented Feb 21, 2020

I don't want this whole issue to go to waste just for nested tuples, so I wrote a better section on custom unpacking macros; hopefully it makes sense.

@timotheecour (Member)

  1. Multiple definitions of the same term

instead of

var (a, b) as x = (1, 2)

we can build upon nim-lang/Nim#13508 (lvalue references) as follows:

var x = (1, 2)
var (a,b) {.byaddr.} = x

which should be a (conceptually) straightforward extension of nim-lang/Nim#13508
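
For reference, the single-variable form already works today via std/decls; a minimal sketch (the tuple form above is the proposed extension):

import std/decls

var x = (1, 2)
var a {.byaddr.} = x[0]   # `a` aliases x[0] instead of copying it
a = 3
echo x                    # (3, 2)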

@metagn (Contributor, Author) commented Mar 1, 2020

I made a library that does most of the unpacking syntax enhancements and has custom unpacking: https://github.com/hlaaftana/definesugar. I was thinking: would people like an RFC that proposes a variable pragma-macro syntax as described in this comment? I would reduce this RFC to only nested tuples, and the new RFC would propose something like:

import macros

macro foo(x: untyped): untyped =
  expectKind x, {nnkVarSection, nnkLetSection}
  expectLen x, 1
  result = newVarStmt(ident(x[0][0].strVal & "Foo"), x[0][2])

var
  x {.foo.} = 4
  y = 5
echo xFoo # 4

Edit: I forgot about this issue; the RFC I was proposing in this comment already exists, just not on this repo: nim-lang/Nim#6696. Also, it's pretty much implemented with nim-lang/Nim#13508.

@metagn changed the title from "Enhanced tuple unpacking" to "Nested tuple unpacking" on Mar 1, 2020
@gemath commented Sep 29, 2020

General rant, couldn't think of a better place to put it:

For the love of <whatever you believe in>, let's stop it with the piecemeal, special-case, wouldn't-it-be-cool-if-we-could-do-that-too features with bells-and-whistles and special-sauce syntax. The concepts we are talking about have names: structural matching, destructuring, and (in the case of nim-lang/Nim#245) also content matching, and they are separate but connected. There are two options IMHO:

a) go all in and implement them consistently in a general and obvious KISS way without nice-to-have bloat.
b) stick to the minimal implementations we already have in the language core and standard lib (like tuple unpacking and {.byaddr.}, in the case at hand) and publish functional extensions as third-party modules.

Making users guess which additional special case of destructuring might be supported this month by lang core/stdlib is not helpful and frankly makes Nim look bad consistency-wise compared to other languages.

If, like me, you prefer a), here are some opinions:

  • Stick to simple, general principles and to what's already there and stay backwards compatible, without imposing limitations the user "just has to know" about.
    example principle: "Destructuring patterns look like constructor calls with identifiers for values."
type
  A = tuple[s: string, n: int]
  B = object
    s: string
    a: A

let
  (x, _) = a1    # defines x like the existing tuple unpacking
  (s: y, n: _) = a1    # every kind of constructor syntax works
  B(s: z, a: (_, j)) = b1    # defines z and j, objects and structural depth work too, just like in constructor calls
  (s: w, a: (_, k)) = b1    # we don't even need the object type name because of type inference
  # insert other language core constructor kinds here...

Similar to definesugar by @hlaaftana, but part of the language core without a need for special syntax. The compiler already does structural matching by figuring out the type and it already knows how to parse constructor calls. Everybody who knows constructor syntax can easily learn destructuring patterns, with no unnecessary exceptions.

  • Respect separation of concerns and reusability (this is about synergy with nim-lang/Nim#245)
    Structural and content matching are two different things and should be implemented separately. For a statically typed language like Nim, that's a clear distinction: the former happens at compile time, the latter (modulo optimizations) at execution time (let's take hints from Rust, not so much from Python/Ruby). General matching builds on structural matching and could reuse principles from destructuring: "Matching patterns look like constructor calls with content matching patterns for values". So please don't implement nim-lang/Nim#245 stand-alone; there's synergy and consistency to gain.

@haxscramper commented Sep 29, 2020

@gemath

First of all

publish functional extensions as third-party modules

I think it is an inferior solution, because we already have six different libraries that do whatever is described in the RFC, and two more that partially implement features from it.

From my perspective it is quite clear that if the ecosystem has ~9 different libraries implementing partially overlapping features (and in the case of gara/macroutils/nimtrs/patty/matsuri or definesugar/unpack/nimpylib (partially), significantly overlapping ones), it is a sign that there is demand for pattern matching.

In addition to consolidating syntax from 6+ different libraries (how many should we get before choosing a pattern matching library becomes a real issue?), this RFC also provides support for Rust-like if let, makes Nim one step closer to being appealing to functional programmers, simplifies two very common use cases (writing macros and extracting/validating data from external sources), and makes Python-like sequence unpacking possible (there are two libraries for it already).

An increasing number of programming languages implement pattern matching support, and there is a big difference between something being implemented as a third-party module and being shipped with the stdlib. The latter is considered part of the language, while the former is just "well, yeah, now I need to install a package manager, find out how to make it work, install the package itself, etc."

special case, wouldn't-it-be-cool-if-we-could-do-that-too features with bells-and-whistles and special-sauce syntax

without nice-to-have bloat

I'm not saying it is the best possible syntax ever for pattern matching, but I tried to make it pragmatic rather than having a minimal set of features. In my opinion pattern matching should not only work with sum types, since real-world use cases extremely often consist of things like:

  • if "key" in dict followed by let val = dict["key"] - probably even more often if you are dealing with any kind of configuration format - key-value pairs matching is a very common use case and perfectly justified.
  • checking if value matches some predicate and then doing something with it - custom predicates are more than justified
  • extracting nth subnode (array access syntax) from NimNode - this is the use case for pattern matching in most cases and checking for things like of Infix([== ident(".."), @lhs, @rhs]) is much better (more declarative, easier to understand) than good old of nnkInifx: if node[0] == ident(".."): let lhs = node[1]; let rhs = node[2];. This is a simple example, but for something more complex (for example a custom DSL) it quickly gets increasingly annoying to write.
  • Syntax sugar for kind checks and differences from object construction. - first - pattern matching of sum types is expected to work more or less similar in terms of syntax. Yes, we have some additions but in general you can translate code from haskell, ocaml, rust, F# and other functional languages into nim without really changing its semantic. So this is a good move towards making nim more appealing for functional programmers (on top already existing features).
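
A runnable sketch of the manual check from the third bullet, for comparison; takeRange is a hypothetical macro name.

import std/macros

macro takeRange(e: untyped): untyped =
  # the "good old" way: kind check plus positional indexing
  if e.kind == nnkInfix and e[0].eqIdent(".."):
    let lhs = e[1]
    let rhs = e[2]
    result = quote do:
      echo "lo=", `lhs`, " hi=", `rhs`
  else:
    error("expected an infix `..` expression", e)

takeRange(1 .. 5)   # prints: lo=1 hi=5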

And then, having support for these features naturally drags in a couple of others.

piecemeal

consistently in a general and obvious KISS way

Suggestions are welcome. List the things you don't like and provide possible suggestions for improving the syntax (or reasons why some feature is not necessary).

Making users guess which additional special case of destructuring might be supported this month

Due to the complexity of the design space, it was necessary to consider many possible implementations and use cases. If the GitHub discussion thread seems overly erratic (it certainly is), you should look at the last version of the syntax draft and/or the currently implemented tests.

In addition, documentation will be provided, so users who are willing to read manuals won't be "guessing". Yes, this is a new library and users are expected to read its documentation. Sequence keywords are chosen to be close to the sequtils module, making it even less of an issue.

supported this month by lang core/...

This addition does not require modifications to the language core, except for making caseStmtMacros a non-experimental feature (considering they have been around since 2018, that's a more than reasonable request in my opinion), as opposed to this RFC (#194).

Matching patterns look like constructor calls with content matching patterns for values

I think pattern matching should look like object access. If you have [int] defined, that means you are very likely to treat the object as some kind of int-indexable array. If you have kind, most likely this is a case object, and so on. It might be possible to spend a large amount of time making the implementation as restrictive as possible, wrapping everything in concepts and putting asserts everywhere, but I don't see any reason why it is necessary to do so.

whatever you believe in

I certainly have a strong opinion that writing

let val = some("hello")
if Some(@x) ?= val:
  echo x, "ok"

# and

case head[0]:
  of Asgn([@lhs is Ident(), @rhs]):

is better than

let val = some("hello")
if val.isSome():
  let x = val.get()
  echo x, "ok"

# and

case head[0].kind:
  of nnkAsgn:
    if head[0][0].kind == nnkIdent:
      let
        lhs = head[0][0]
        rhs = head[0][1]
    else:
      discard # deal with other cases like `let (a, b)`, but now this is in the
              # same case branch with +1 level of conditional nesting

"But last" element - comparing implementations in different language (nim with new pattern matching library)

Haskell:

butLast :: [a] -> a
butLast [] = error "Cannot take one but last from empty list!"
butLast [x] = error "Cannot take one but last from list with only one element!"
butLast [x, _] = x
butLast (_:xs) = butLast xs

Nim (with the new pattern matching library):
func butLastGen[T](a: seq[T]): T =
  case a:
    of []: raiseAssert("List is empty")
    of [_]: raiseAssert("Only one element")
    of [@pre, _]: pre
    of [_, all @tail]: butLastGen(tail)
    else: raiseAssert("Not possible")

assertEq butLastGen(@["1", "2"]), "1"

OCaml:
let rec second_to_last lst =
  match lst with
  | [] -> failwith "Empty list"
  | x::[] -> failwith "Singleton list"
  | fst::snd::[] -> snd
  | hd::tl -> second_to_last tl
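
For completeness, a plain-Nim version without any pattern matching library (a hypothetical recursive formulation mirroring the ones above):

func butLastPlain[T](a: seq[T]): T =
  if a.len == 0:
    raiseAssert("List is empty")
  elif a.len == 1:
    raiseAssert("Only one element")
  elif a.len == 2:
    return a[0]
  else:
    return butLastPlain(a[1 .. ^1])   # drop the head and recurse

doAssert butLastPlain(@["1", "2"]) == "1"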

@metagn (Contributor, Author) commented Jan 26, 2021

Closing because it's too broad; it doesn't even address nkVarTuple etc.
