Yes, I think I win the longest prefix competition
- Introduction
- Usage
- Limitations
- Design (and about developing a tree-sitter spec from org-element ASTs)
- Extra test files
Since there are currently several efforts to improve the definition of the Org-mode grammar, this project lets you take a region of Org-mode text, parse it with the Emacs Lisp implementation of Org, and produce output in a format suitable for a tree-sitter corpus.
This way, anyone can write the part of the spec they care about as test files for a tree-sitter (TS) parser.
The functions are not interactive yet; for now, the main workflow is to call
(org-to-tree-sitter-corpus-convert-org-file
"/abs/path/to/org-to-tree-sitter-corpus/test_files/test_inputs/simple_tag.org"
'delete-old)
Which will
- forcefully delete the old converted file (if it exists)
- create /abs/path/to/org-to-tree-sitter-corpus/test_files/test_inputs/simple_tag.txt with this content (the leading "," is only in this README file to avoid bad parsing)
========== simple tag ==========
,* Has a tag :tag3:
---
(org_data (headline (stars) (title) (tags)))
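To make the layout of a corpus entry concrete, here is a minimal sketch of how such an entry could be assembled from a test name, the Org input, and the expected tree. The function name `my/sketch-format-corpus-entry` is hypothetical and not part of this project. It also assumes the usual tree-sitter layout, where the test name sits between two banner lines of `=` characters.

```elisp
;; Hypothetical helper, NOT part of org-to-tree-sitter-corpus.
;; Assumes the usual tree-sitter corpus layout: banner, name, banner,
;; input text, a "---" separator line, then the expected tree.
(defun my/sketch-format-corpus-entry (name input tree)
  "Return a corpus entry string for test NAME.
INPUT is the Org text and TREE is the expected tree as a Lisp list."
  (let ((banner (make-string 10 ?=)))
    (concat banner "\n" name "\n" banner "\n"
            input
            ;; Make sure the separator starts on its own line.
            (unless (string-suffix-p "\n" input) "\n")
            "---\n"
            (prin1-to-string tree) "\n")))
```

For example, calling it with "simple tag", the headline text, and the list `(org_data (headline (stars) (title) (tags)))` yields an entry shaped like the one above, without the README-only leading comma.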
Because the converter copies the input verbatim into the .txt file, the input .org file must not contain syntax specific to tree-sitter tests, such as the --- separator or the = banner.
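One way to guard against this limitation is to scan the Org text before converting it. The sketch below is not part of this project, and the predicate name is hypothetical. It assumes that a conflicting line is one made only of three or more `=` or three or more `-` characters, which matches the delimiters tree-sitter corpus files commonly use.

```elisp
;; Hypothetical pre-flight check, NOT part of org-to-tree-sitter-corpus.
;; Assumption: a line consisting only of 3+ "=" or 3+ "-" characters
;; (optionally followed by blanks) would be read as a corpus delimiter.
(defun my/sketch-has-corpus-delimiter-p (text)
  "Return non-nil if TEXT contains a line that looks like a corpus delimiter."
  (string-match-p "^\\(?:=\\{3,\\}\\|-\\{3,\\}\\)[ \t]*$" text))
```

Note that an Org horizontal rule (a line of five or more dashes) is flagged by this check too, since it would be indistinguishable from the separator once copied into the corpus file.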
Currently, the conversion from the parse tree produced by org-element to an AST suitable for tree-sitter tests is done by the helper functions org-to-tree-sitter-corpus--transform-*. Each one is responsible for transforming an org-element tree node into the matching list expression. This also makes it possible to control the format expected from a tree-sitter grammar.
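As a rough illustration of the idea, the sketch below walks an org-element tree and turns every node into a list headed by its type name. It is a generic traversal written for this README, not the project's actual org-to-tree-sitter-corpus--transform-* helpers. The function names and the choice to rename `org-data` to `org_data` by swapping dashes for underscores are assumptions.

```elisp
(require 'org)
(require 'org-element)
(require 'seq)

;; Hypothetical helpers, NOT the project's transform functions.
(defun my/sketch-node-name (type)
  "Return TYPE, an org-element type symbol, with dashes as underscores."
  (intern (replace-regexp-in-string "-" "_" (symbol-name type))))

(defun my/sketch-node-to-sexp (node)
  "Return a list naming NODE and, recursively, its non-text children."
  (cons (my/sketch-node-name (org-element-type node))
        (mapcar #'my/sketch-node-to-sexp
                ;; Plain text is stored as strings; skip it here.
                (seq-remove #'stringp (org-element-contents node)))))

(defun my/sketch-string-to-sexp (text)
  "Parse TEXT as Org and return its tree as nested lists of type names."
  (with-temp-buffer
    (insert text)
    (delay-mode-hooks (org-mode))
    (my/sketch-node-to-sexp (org-element-parse-buffer))))
```

A per-type transformer, as used in this project, allows more control than this generic walk: each node type can decide which child nodes to emit and how to name them.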
I haven't tested nested paragraphs or structures yet; the project is at a very early stage.
Still, the headline handling shows how we read the extra metadata in order to build a relevant tree-sitter tree for the tests, i.e. adding the (tags) or (todo) … nodes where relevant.
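The headline case can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the function names are hypothetical, and placing (todo) before (title) is a guess about the node order. It relies on the `:todo-keyword` and `:tags` properties that org-element sets on headline nodes.

```elisp
(require 'org)
(require 'org-element)

;; Hypothetical sketch, NOT the project's headline transformer.
;; Assumption: (todo) comes before (title), and (tags) comes last.
(defun my/sketch-transform-headline (headline)
  "Return a corpus-style list for HEADLINE, an org-element headline node."
  (append '(headline (stars))
          (when (org-element-property :todo-keyword headline) '((todo)))
          '((title))
          (when (org-element-property :tags headline) '((tags)))))

(defun my/sketch-first-headline-sexp (text)
  "Parse TEXT as Org and transform its first headline."
  (with-temp-buffer
    (insert text)
    (delay-mode-hooks (org-mode))
    (org-element-map (org-element-parse-buffer) 'headline
      #'my/sketch-transform-headline nil t)))
```

With the default TODO keywords, a headline such as "* TODO Buy milk" would gain a (todo) node, while "* Has a tag :tag3:" would gain a (tags) node instead.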
The files in the ./test_files/laundry folder currently come from the Laundry test files, at commit 5a396be.