Comparison with approach of react-simple-code-editor #109

Closed
curran opened this issue Jul 31, 2019 · 10 comments

@curran
Contributor

curran commented Jul 31, 2019

I recently came across this project: https://github.com/satya164/react-simple-code-editor

That library works by allowing users to edit inside a native textarea, while overlaying syntax-highlighted code on top of it. On the face of it this seems like quite an elegant and simple approach that works well.
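
To make the overlay idea concrete, here is a minimal sketch of the highlighting half. It assumes a toy keyword tokenizer standing in for a real highlighter such as Prism; the names `highlight`, `escapeHtml`, and the `kw` class are hypothetical and not taken from react-simple-code-editor. The returned HTML would be rendered into a `pre` stacked exactly with the textarea, using the same font metrics.

```typescript
// Toy stand-in for a real highlighter such as Prism (assumption for illustration).
const KEYWORDS = new Set(["const", "let", "function", "return", "if", "else"]);

// Escape characters that would otherwise be parsed as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Re-highlights the whole string; in this approach it runs on every keystroke.
function highlight(code: string): string {
  // Split into identifier-like words and everything else, keeping both kinds.
  const parts = code.split(/([A-Za-z_$][\w$]*)/);
  return parts
    .map((part) =>
      KEYWORDS.has(part)
        ? `<span class="kw">${escapeHtml(part)}</span>`
        : escapeHtml(part)
    )
    .join("");
}
```

In a browser, the result would typically be assigned to the `pre` element's `innerHTML` from the textarea's `input` handler, which is why the full text is reprocessed on each change.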

I find myself wondering what makes the more complex approach taken by CodeMirror superior to that approach, which seems so simple by comparison.

I can only think of the following drawbacks to the approach taken by react-simple-code-editor:

  • Syntax highlighting requires a complete parse on each keystroke. This could be addressed by using tree-sitter or lezer for maintaining the AST.
  • The syntax-highlighted DOM tree is clobbered and regenerated on each keystroke. This could be addressed by using virtual DOM to update the DOM based on the fresh AST (e.g. using preact or react).
  • Really large files would render in full with this approach, whereas CodeMirror only renders what's visible.
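
The third point can be illustrated with a small sketch of viewport-limited rendering. It assumes fixed-height lines, which is a simplification; CodeMirror's actual viewport logic is more involved and is not reproduced here. `visibleLineRange` and its parameters are hypothetical names.

```typescript
interface LineRange {
  from: number; // first line to render (inclusive, 0-based)
  to: number; // last line to render (exclusive)
}

// Computes which lines need DOM nodes for the current scroll position.
// Assumes every line has the same pixel height (illustrative simplification).
function visibleLineRange(
  scrollTop: number,
  viewportHeight: number,
  lineHeight: number,
  totalLines: number,
  margin = 2 // extra lines rendered above and below the viewport
): LineRange {
  const first = Math.floor(scrollTop / lineHeight) - margin;
  const last = Math.ceil((scrollTop + viewportHeight) / lineHeight) + margin;
  return {
    from: Math.max(0, first),
    to: Math.min(totalLines, last),
  };
}
```

With a range like this, the amount of DOM work depends on the viewport size rather than on the document length, whereas the textarea overlay always renders everything.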

The above problems can be ignored for use cases where the files will not be large, or where syntax highlighting can be skipped once the amount of text exceeds a certain threshold.

I'm posting this issue as a sanity check, to see if I'm missing some important points that make the complex update patterns within CodeMirror inherently a better choice than the approach taken by react-simple-code-editor. There probably are many, but I fail to see them at the moment.

@marijnh
Member

marijnh commented Jul 31, 2019

This isn't a new idea—it was also the approach used by Editarea, the only halfway serious predecessor of CodeMirror.

> I find myself wondering what makes the more complex approach taken by CodeMirror superior to that approach, which seems so simple by comparison.

  • Can't use anything but single-font single-style text inside the editor or the characters get misaligned.

  • Reading from a textarea is all-or-nothing, so when the document is big, even if you manage to do incremental highlighting, you have to read the entire string and then diff it against the previous value, which is expensive. (Have you tried pasting a big document into their demo? It becomes unusable.)
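
A minimal sketch of that read-and-diff step, assuming a single contiguous edit and a plain common-prefix/common-suffix scan. `diffRange` is a hypothetical helper, not part of CodeMirror or react-simple-code-editor. Even this cheap diff may have to touch the whole string, which is the cost described in the second point.

```typescript
interface TextChange {
  from: number; // start of the changed range (same offset in old and new text)
  oldTo: number; // end of the replaced range in the old text
  newTo: number; // end of the inserted range in the new text
}

// Finds the smallest single range that differs between two versions of the text.
function diffRange(oldText: string, newText: string): TextChange {
  const minLen = Math.min(oldText.length, newText.length);

  // Scan forward over the common prefix.
  let from = 0;
  while (from < minLen && oldText.charCodeAt(from) === newText.charCodeAt(from)) {
    from++;
  }

  // Scan backward over the common suffix, without crossing the prefix.
  let oldTo = oldText.length;
  let newTo = newText.length;
  while (
    oldTo > from &&
    newTo > from &&
    oldText.charCodeAt(oldTo - 1) === newText.charCodeAt(newTo - 1)
  ) {
    oldTo--;
    newTo--;
  }

  return { from, oldTo, newTo };
}
```

For a one-character insertion in the middle of a large document, both scans together walk over almost the entire text, on top of the cost of reading the textarea's full value.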

In short, most of the features that CodeMirror provides couldn't be built on top of this approach, and forget about scalability. A component like this is perfectly appropriate if you know your snippets are going to be small and you don't need much extra functionality, but doesn't scale beyond that.

@marijnh closed this as completed Jul 31, 2019

@curran
Contributor Author

curran commented Jul 31, 2019

Yes I see, all that makes perfect sense. Thank you for your time in responding!

@curran
Contributor Author

curran commented Aug 1, 2019

FWIW I came across this video that I found very informative, regarding why CodeMirror operates the way it does: YouTube: Marijn Haverbeke - Grammar-based language modes for text editors. I'm beginning to grok that by comparison, the approach taken by react-simple-code-editor is extremely crude and wasteful of compute resources.

@marijnh
Member

marijnh commented Aug 1, 2019

Note that the system I describe there never made it into mainstream CodeMirror (though you can use it if you install it separately), and lezer is a new, different approach.

@curran
Contributor Author

curran commented Aug 2, 2019

Good to know! Looking forward to one day maybe a talk like that on Lezer. It's very interesting.

@marijnh
Member

marijnh commented Aug 2, 2019

Max Brunsfeld has done some talks on tree-sitter (http://tree-sitter.github.io/tree-sitter/#talks-on-tree-sitter) that you might find interesting.

@curran
Contributor Author

curran commented Aug 2, 2019

Oh nice! Thank you for that link.

Actually, one of my main questions about Lezer is why not just use tree-sitter, which appears to do more or less the same thing. That could be a good point to address on the Lezer home page (which surprisingly doesn't mention tree-sitter), as many people will probably wonder the same thing when they discover the project. The point is partially addressed in the README here https://github.com/lezer-parser/lezer , but it asks a lot of the reader to figure out exactly what Lezer has or does that tree-sitter doesn't.

@curran
Contributor Author

curran commented Aug 2, 2019

image

Something like this - "How is Lezer different from existing things?"

@curran
Contributor Author

curran commented Aug 2, 2019

A little experiment I did: paste the large unminified d3.js build into the react-simple-code-editor demo, then record performance after the first parse, when entering a single character.

image

11 seconds are spent invoking Prism for highlighting, plus 500 ms for "Recalculate Style" and 500 ms for "Layout", totaling 1 second for DOM updates and rendering.

I'm still kind of intrigued at the idea of using tree-sitter (WASM build) for parsing, and React for DOM updates, which might make the whole approach more workable. The approach, however, would preclude partial highlighting I think (the whole content would be parsed and rendered).

@curran
Contributor Author

curran commented Aug 3, 2019

So I put together a proof-of-concept that uses

  • highlighted-pre-over-textarea approach (like react-simple-code-editor)
  • tree-sitter for incremental parsing
  • diff-match-patch for computing text diffs (needed for using tree-sitter)
  • React for DOM updates

It seems to work. It turns out tree-sitter is a pain to use, due to WASM issues (needed to set up a custom server for MIME type problems, large build size for WASM language) as well as a picky API (requiring row,column offsets for edits).
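
For readers unfamiliar with the row/column requirement, here is a rough sketch of the conversion. It assumes the edit is described by an object with `startIndex`, `oldEndIndex`, `newEndIndex` and matching `row`/`column` positions, which to my understanding is the shape tree-sitter's JavaScript bindings accept in `tree.edit(...)`. `pointAt` and `toTreeSitterEdit` are hypothetical helpers, not the ones used in the proof of concept, and columns are counted here in UTF-16 code units, which may not match every binding.

```typescript
interface Point {
  row: number;
  column: number;
}

interface Edit {
  startIndex: number;
  oldEndIndex: number;
  newEndIndex: number;
  startPosition: Point;
  oldEndPosition: Point;
  newEndPosition: Point;
}

// Converts a character offset into a 0-based row/column pair
// by counting newlines before the offset.
function pointAt(text: string, index: number): Point {
  let row = 0;
  let lineStart = 0;
  for (let i = 0; i < index; i++) {
    if (text.charCodeAt(i) === 10 /* "\n" */) {
      row++;
      lineStart = i + 1;
    }
  }
  return { row, column: index - lineStart };
}

// Builds an edit description from a changed range
// (for example, one computed by a text diff).
function toTreeSitterEdit(
  oldText: string,
  newText: string,
  from: number,
  oldTo: number,
  newTo: number
): Edit {
  return {
    startIndex: from,
    oldEndIndex: oldTo,
    newEndIndex: newTo,
    startPosition: pointAt(oldText, from),
    oldEndPosition: pointAt(oldText, oldTo),
    newEndPosition: pointAt(newText, newTo),
  };
}
```

Note that `pointAt` scans from the start of the text on every call, so a real implementation would likely keep a line-offset index instead.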

Repeating the same manual performance test of adding a single character to the unminified d3.js build, the update took 1.2 seconds (down from 11 seconds with react-simple-code-editor).

image

0.6 seconds is taken by my walkTree function that generates JSX from tree-sitter's syntax tree, and 0.5 seconds is taken by React's virtual DOM reconciliation. 100ms is taken by rendering updates (down from 1 second taken by react-simple-code-editor, since the DOM is minimally changed).

Here's the code:

https://github.com/datavis-tech/codearea
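
As a rough illustration of the kind of tree walk mentioned above (this is not the `walkTree` from that repository), here is a simplified sketch over a hypothetical `SyntaxNode` shape with `type`, `startIndex`, `endIndex`, and `children`. It emits flat segments that a React component could then map to `span` elements.

```typescript
// Hypothetical node shape, loosely modeled on a concrete syntax tree.
interface SyntaxNode {
  type: string;
  startIndex: number;
  endIndex: number;
  children: SyntaxNode[];
}

interface Segment {
  text: string;
  type: string | null; // null for text not covered by any leaf (e.g. whitespace)
}

// Walks the tree depth-first and emits one segment per leaf,
// plus untyped segments for the gaps between leaves.
function segmentsFromTree(root: SyntaxNode, source: string): Segment[] {
  const segments: Segment[] = [];
  let pos = root.startIndex;

  const visit = (node: SyntaxNode): void => {
    if (node.children.length > 0) {
      node.children.forEach(visit);
      return;
    }
    if (node.startIndex > pos) {
      segments.push({ text: source.slice(pos, node.startIndex), type: null });
    }
    segments.push({
      text: source.slice(node.startIndex, node.endIndex),
      type: node.type,
    });
    pos = node.endIndex;
  };

  visit(root);
  if (pos < root.endIndex) {
    segments.push({ text: source.slice(pos, root.endIndex), type: null });
  }
  return segments;
}
```

Because this visits every leaf on each update, its cost grows with the document size even when the parse itself is incremental, which is consistent with the walk showing up in the profile above.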

It's just a rough proof of concept, but maybe could be useful one day for something.

I'd love to try pairing this with Lezer - the build size would come way down.
