Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

100% Freeze on any catastrophic backtracking #104

Open
jeff-hykin opened this issue Aug 13, 2019 · 4 comments
Open

100% Freeze on any catastrophic backtracking #104

jeff-hykin opened this issue Aug 13, 2019 · 4 comments
Labels

Comments

@jeff-hykin
Copy link

Reasons a change is needed:

  • Debugging catastrophic backtracking is tough because the entire process freezes. Inside of VS Code, basically the only way to find the bug is to guess what it is, and the see if the process still crashes.
  • Catastrophic backtracking is easy to accidentally introduce, even for regex veterans
  • Catastrophic backtracking is hard to write tests for; it can require a very specific circumstance to become catastrophic

Recommended change:
Find some way to have a timeout on a per-line check. If the line goes past the timeout (ex: 5 seconds) the parser just groups the entire rest of the document (all of the remaining lines) as one scope, and closes any unclosed scopes. The reason being: that it would fail safely (no freeze) but would fail badly (not-user friendly) so that the failsafe is not used a a crutch to allow poorly written regex to become acceptable. Failing like this would also give the dev the ability to see exactly which line the error occurred at.

Difficulty:
Probably pretty difficult. The only solutions I can imagine are that the code needs to be async (I'd assume its currently synchronous), or some advanced tool like worker threads would need to be used (pass the current line to the worker, let it start working, if the worker doesn't finish in the timeout, then fail)

@alexdima
Copy link
Member

We call into oniguruma, which is written in C++. Control never returns and it does not have a timeout option of any sorts. In the past, I have "debugged" such cases only by logging in node-oniguruma right before the call into oniguruma.

Here I would add a printf for source_.c_str() so I'd get to find out which regular expression begins and never terminates.

@jeff-hykin
Copy link
Author

jeff-hykin commented Aug 13, 2019

@alexandrudima Is the call into oniguruma from this upstream https://github.com/kkos/oniguruma? I can post an issue on there.

Funny enough, I'd bet this issue is much easier to fix in C++ thanks to multithreading.

Thanks for the debugging advice too.

@alexdima
Copy link
Member

We have plans to improve things for VS Code. See microsoft/vscode#77140 where we want to look into running the TM grammars on separate threads.

We consume oniguruma via the node-oniguruma project at https://github.com/atom/node-oniguruma . I'm not entirely sure why, but they have a pretty old version of oniguruma

@jeff-hykin
Copy link
Author

That is an awesome post, thanks for linking me to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants