Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP to port lisp-parser.dart #2

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

duncanmak
Copy link

A few months ago, I wanted to learn Kotlin and thought it'd be a good exercise to port the Lisp parser in dart-petitparser over to Kotlin.

But I don't really know Kotlin and the use of the by operator felt quite mysterious to me. I think I spent 2 evenings on this, and what I ended with doesn't compile.

In terms of API, I remember thinking that java-petitparser's API seems a lot more similar than kotlin-petitparser. Since I'm targeting the JVM, I wondered what the difference is using java-petitparser in kotlin vs. using kotlin-petitparser.

I was reminded of this little experiment and wanted to get it moving forward, so I'm opening a draft PR to see if I could get some help or tips to getting this port completed.

Thanks!

import org.petitparser.core.parser.consumer.CharPredicate.Companion.any
import org.petitparser.core.parser.consumer.CharPredicate.Companion.anyOf
import org.petitparser.core.parser.consumer.CharPredicate.Companion.char
import org.petitparser.core.parser.consumer.CharPredicate.Companion.pattern
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are predicates, you want the parsers:

import org.petitparser.core.parser.consumer.any
import org.petitparser.core.parser.consumer.anyOf
import org.petitparser.core.parser.consumer.char
import org.petitparser.core.parser.consumer.digit

Possibly PetitParser could reduce their visibility?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe prefixing the predicates with is, i.e. (isAny, isChar) will help differentiate the two sets?

Not tested the functionality though ...
@renggli
Copy link
Member

renggli commented Jan 12, 2024

I made some changes to your branch that fixes the syntax errors. I haven't tested the changes though.

The API tries to adopt Kotlin practices. If you wanted the Java API you could use the Java library. IMHO, the biggest problem with the Java API is that sequences and choices are essentially untyped, however this makes porting dynamic grammars like the lisp one quite tricky.

The use of the delegates (by-operator) was me trying to make the definition of grammars more lightweight. Though Kotlin also seem to have its limitations there: it is very easy to confuse the type system and/or crash the compiler. I am open for suggestions of how this could be improved?

@duncanmak
Copy link
Author

I made some changes to your branch that fixes the syntax errors. I haven't tested the changes though.

@renggli Thanks for pushing the changes! Now that it finally compiles, I'm gonna spend some time to work on making it do the right thing. I hope it won't take me too long to complete this PR.

Thanks again!

@duncanmak
Copy link
Author

Hello @renggli

I've been using a copy of this parser in my project, but I ran into an issue.

Using something like this in a test causes it to loop forever:

    read("""
      (foo 1) ; comment
      (foo 2)
    """.trimIndent()

I looked through the examples in Dart and Kotlin and my understanding is that the existing parser is line-based, so it doesn't support reading multi-line content, am I wrong in saying that?

I changed start to accept more than one atom by doing this:

val start by atom.end()
val start by atom.star().end()

Is that a correct step to take?

@renggli
Copy link
Member

renggli commented Jun 28, 2024

Hmm, looks to me like the Dart parser also reads multiple atoms at once?

@duncanmak
Copy link
Author

@renggli You're right about the multiple atoms. That was a change I made locally by mistake.

I'm still trying to get comments to parse correctly. I think it's because I don't understand how star() works - how I have it in my grammar right now, it's causing parse to get stuck.

I've written this smaller example as a test:

class TestGrammar : Grammar() {

  val comment by seq(char(';'), newline().not()).star()

  private val symbolToken by seq(
    pattern("a-zA-Z!#$%&*/:<=>?@\\^_|~+-"),
    pattern("a-zA-Z0-9!#$%&*/:<=>?@\\^_|~+-").star(),
  )
  val symbol by symbolToken.flatten().map { s ->
    when (s) {
      "#t", "#true" -> true
      "#f", "#false" -> false
      "#!null" -> null
      else -> s
    }
  }
}

Here are my tests:

object Example {
  val g = TestGrammar()

  @Test
  fun symbol() {
    assertEquals("foo", g.symbol.parse("foo").value)
    assertEquals(true, g.symbol.parse("#t").value)
  }

  @Test
  fun comment() {
    assert(g.comment.parse("; foo") is Output.Success)
  }

}

I noticed with my comment test above:

// Dart
// Parser comment() => char(';') & Token.newlineParser().neg().star();

// 1 - passes
val comment by seq(char(';'), newline().not()).star()

// 2 - hangs
val comment by seq(char(';'), newline().not().star())

But isn't 2 the direct translation of the original Dart code?


Similarly, I'm also trying to add #| nested comments |# to my grammar, but the application of star() is also wonky:

val nestedComment = seq(string("#|"), string("|#").not().star(), string("|#"))

@renggli
Copy link
Member

renggli commented Jul 18, 2024

Note that the not() parser is not consuming any input, that's why it hangs when you repeat it with star(). The neg() parser is different and consumes and returns the character that could not be parsed. It unfortunately was missing, but can be trivially implemented as:

fun <R> Parser<R>.neg(message: String = "input not expected") =
  any().skip(not(message))

I've added it in 08fc311.

@duncanmak
Copy link
Author

@renggli Oh! I looked into the definition of neg and not earlier! I should have compared it with the Dart implementation, and I'd have seen what's missing.

This is great, my tests finally pass. Do you think you could do a 0.0.2 release? For now, I've just copied the neg definition into my own Grammar.

@renggli
Copy link
Member

renggli commented Jul 19, 2024

I released 1.1.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants