Student Proposal: Implementing a Semantic parser for Elm #16

Open
ValerianClerc opened this issue Apr 12, 2021 · 0 comments

Edits: prioritize step 4 alongside step 3, use TDD

Name: Valerian Clerc
Email: valerian.clerc@gmail.com
Slack nickname: vclerc
Potential mentor: TBD

Summary

Following this project idea from the GSoC project page, I'd like to add a Semantic parser for the Elm language. Semantic is a GitHub-supported technology that powers GitHub's code navigation. Adding Semantic support for Elm involves building a Haskell library that works with tree-sitter's output to enable richer language support on GitHub (and potentially in other dev tools in the future). As the Semantic documentation describes, adding a new language is a long procedure, but it breaks down into the distinct, modular steps listed below.

What will the project focus on

These are the complete steps for adding a new language to Semantic:

  1. Write a tree-sitter parser for the language (already mostly done!).
  2. Create a Haskell library providing an interface to the C source code generated by tree-sitter.
  3. Create a Haskell library in Semantic to auto-generate ASTs.
  4. Add tests for precise ASTs, tagging and graphing, and evaluating code written in the language.

How will I achieve this

  • I will complete and test the tree-sitter-elm parser (step 1 above) to prepare it for the later steps of the Semantic integration.
  • I will develop a Haskell library that interfaces with the C source generated by tree-sitter-elm (step 2). Examples of similar libraries for other languages can be found in this repo. This step looks simple: it appears to consist of adding some config files and a submodule reference to the tree-sitter library. Examples can be found in past PRs.
  • I will create a Haskell library in Semantic to auto-generate ASTs (step 3). The result will be a package similar to this. This step appears to be the bulk of the work for this project.
  • Alongside the AST library, I will write thorough tests for it (step 4). Using test-driven development will help keep the library sturdy and dependable, and it also covers another requirement of Semantic integration.
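
To make steps 3 and 4 a bit more concrete, here is a minimal sketch of the general idea: turning an untyped parse tree into a typed Haskell AST, with a test-first style check. Everything in it is an assumption for illustration only. The `Node` type is a hand-written stand-in for tree-sitter output, the node kind names (`number_literal`, `identifier`, `function_call`) are made up rather than taken from the tree-sitter-elm grammar, and `Expr` and `toExpr` are hypothetical names. This is not Semantic's actual API, and the real library would generate its AST types rather than have them written by hand.

```haskell
module Main where

-- Hypothetical stand-in for a tree-sitter parse tree node:
-- a node kind, the source text it covers, and its children.
data Node = Node
  { nodeKind     :: String
  , nodeText     :: String
  , nodeChildren :: [Node]
  } deriving (Eq, Show)

-- A tiny typed AST for an invented fragment of Elm expressions.
data Expr
  = IntLit Int
  | Var String
  | Apply Expr [Expr]
  deriving (Eq, Show)

-- Convert an untyped node into the typed AST.
-- Unknown or malformed nodes are reported as Left with a message.
toExpr :: Node -> Either String Expr
toExpr (Node "number_literal" txt []) =
  case reads txt of
    [(n, "")] -> Right (IntLit n)
    _         -> Left ("not an integer: " ++ txt)
toExpr (Node "identifier" txt []) = Right (Var txt)
toExpr (Node "function_call" _ (f : args)) =
  Apply <$> toExpr f <*> traverse toExpr args
toExpr (Node kind _ _) = Left ("unexpected node kind: " ++ kind)

-- Helper for building leaf nodes in tests.
leaf :: String -> String -> Node
leaf kind txt = Node kind txt []

-- Test-first style check: the expected AST is written down
-- before (and independently of) the conversion function.
main :: IO ()
main = do
  let tree =
        Node "function_call" ""
          [ leaf "identifier" "add"
          , leaf "number_literal" "1"
          , leaf "identifier" "x"
          ]
      expected = Right (Apply (Var "add") [IntLit 1, Var "x"])
  if toExpr tree == expected
    then putStrLn "ok"
    else error ("mismatch: " ++ show (toExpr tree))
```

Returning `Either String Expr` rather than failing outright keeps malformed or unsupported nodes visible as test failures with a readable message, which fits the test-driven approach described above.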

Benefits

Semantic and tree-sitter-elm are impactful projects because they power features that developers use directly, namely code navigation on GitHub. GitHub code navigation helps users understand code quickly by letting them click on a variable or function and see its declaration or other references to that identifier. Semantic is in active development, and GitHub promises that this is just "scratching the surface" of the project's possibilities! Most major languages already have a Semantic package, so creating one for Elm will give it the same level of features and support that languages like Python and Java have.

Timeline

Weeks of May 17th - 31st:

Community bonding: meeting mentors and Elm community members. Setting and refining goals and expectations for the summer, and familiarizing myself with the Elm ecosystem. Scope out steps 1 and 2 in more rigorous detail. Read documentation! Start experimenting with the tree-sitter parser, and define what work still needs to be done on it.

Weeks of June 7th - July 5th:

Heads-down work on this project. Ideally I will finish steps 1 and 2 during this period and start working on steps 3 and 4 (the pace will probably depend on how much work remains on the tree-sitter parser before I can move on).

Week of July 12th:

First evaluation: check in on the goals we set during the first weeks, and reflect on what's working well and what could be changed. Set realistic goals for the rest of the summer.

Weeks of July 19th - August 16th:

Code some more! Ideally I will be working on steps 3 and 4 at this point, and will have them wrapped up and well documented by August 16th. I will be moving across the country in early August, so I'm hoping to put in extra hours in the first half of the summer to make up for the distraction and chaos of moving.

Goals

The goal of this project is to advance the development of a Semantic parser for Elm. Conveniently, the project is broken down into discrete steps (1-4), so even if I don't finish it in its entirety, others can build on the steps that I have completed. Considering that step 1 is mostly complete and step 2 is a short PR, my main goal will be implementing steps 3 and 4: building a Haskell AST generator for Elm and writing tests for it.

Requirements

  • tree-sitter knowledge
  • Haskell (learning in progress)
  • Elm (learning in progress)
  • Functional programming knowledge (university course taken, hoping to learn more!)
  • Compilers knowledge (university course taken, hoping to learn more!)

Thanks for reading my proposal, and I'm happy to take any feedback or input :)

-Valerian
