Whitespace is breaking token detection #105
Comments
Hey, so looking at your example, you are putting raw newlines inside a quoted string value, which is not going to work. Simply use a template literal, e.g.:
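As a rough illustration of that point, the sketch below checks whether two snippets compile as JavaScript: one with a line break inside a single-quoted string, one with the same text inside a template literal. `isValidScript` is a hypothetical helper written for this example (it is not part of Sparser or Prettify), and the sample text is only modelled on the tokens that appear in this thread.

```javascript
// Hypothetical helper: returns true when `source` compiles as JavaScript.
// The source is only compiled by the Function constructor, never executed.
function isValidScript(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const NL = "\n";
const body = '{"hello": {"FOO": {"world": 1234}}}';

// A raw line break inside a single-quoted string is a syntax error.
const singleQuoted = "const s = '" + NL + body + NL + "';";

// The same text inside a template literal is valid.
const templateLiteral = "const s = `" + NL + body + NL + "`;";

console.log(isValidScript(singleQuoted));    // false
console.log(isValidScript(templateLiteral)); // true
```

The check relies only on the language rule that ordinary quoted strings cannot span lines, so it says nothing about how a particular lexer chooses to recover from such input.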
Lastly, Sparser is no longer maintained. I don't know your exact use case, but if you don't require diffing and just want the data structures, then maybe take a peek at my hard-forked variation, Prettify, which leverages the powerful Sparser under the hood. It's still a WIP but might help you.
Ty for the reply @panoply, I'm parsing JS-like scripts like the one I shared, not just JSON. Wrapping text à la "template literals" is managed by my code using the " ' " token, so I'm expecting to find JS mixed with JSON in my inputs. Prettify looks promising! I will check on it once you officially release it :)
No problem @zxpectre, happy to help! Can you submit a detailed issue to Prettify for me (with a detailed code sample)? I will be doing some work on the script lexer this week, and it would be nice to find out what is causing the issue, both to prevent it from occurring in other use cases and, with some luck, to bring it up to a stable enough level that you can use it in your project.
@zxpectre I must have read your issue incorrectly. I see now that you are parsing the entirety of:
I assumed you were only parsing the contents of
I would really appreciate it if you could cover my use case, as I'm sure this can help everybody; these are very generic needs, btw. I'm in a hurry and using Sparser right now, but I could migrate if Prettify does a nice job for us! Can I ask you to share the output of your method? I will try to make a detailed issue if the output is handy for me. I like the idea of returning a tree; Sparser has some limitations that complicate things when trying to nest nodes correctly (it sometimes mixes global and local scopes on end tokens, so it is hard to nest recursively).
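For the nesting problem described above, one possible workaround is to ignore the scope names entirely and rebuild the tree from the token types alone, pushing on `start` and popping on `end`. This is only a sketch under assumptions: it expects a Sparser-style parse table with parallel `token` and `types` arrays, and `buildTree`, the node shape, and the sample table are hypothetical names and data made up for the example, not part of Sparser or Prettify.

```javascript
// Hypothetical helper: nest a flat parse table into a tree.
// Assumes `table.token` and `table.types` are parallel arrays and that
// structures are delimited by the types 'start' and 'end'.
function buildTree(table) {
  const root = { token: null, type: "root", children: [] };
  const open = [root]; // stack of structures that are currently open

  table.token.forEach((token, i) => {
    const type = table.types[i];
    const parent = open[open.length - 1];

    if (type === "start") {
      const node = { token, type, children: [] };
      parent.children.push(node);
      open.push(node);
    } else if (type === "end") {
      if (open.length === 1) {
        throw new Error(`Unmatched end token at index ${i}: ${token}`);
      }
      // Record the closing token on the structure it closes.
      open.pop().close = token;
    } else {
      parent.children.push({ token, type });
    }
  });

  // Anything still on the stack was opened but never closed.
  const unclosed = open.slice(1).map((node) => node.token);
  return { root, unclosed };
}

// Made-up table for the input: f({"a": 1})
const sample = {
  token: ["f", "(", "{", '"a"', ":", "1", "}", ")"],
  types: ["word", "start", "start", "string", "operator", "number", "end", "end"],
};

const { root, unclosed } = buildTree(sample);
console.log(JSON.stringify(root.children, null, 2));
console.log(unclosed); // [] because every start has a matching end
```

Because the tree is driven by `start` and `end` only, it does not depend on which scope an end token reports. The `unclosed` list also gives a cheap signal that closing tokens went missing from the table.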
Prettify will return an almost identical structure, as it's using Sparser under the hood (but with various bug fixes and some improved handling across the board). Don't get too married to the naming convention of

{
begin: [
-1, -1, 1, 1, 3, 3, 5,
5, 5, 8, 8, 8, 11, 11,
11, 11, 8, 5, 3
],
ender: [
-1, -1, -1, -1, -1, 17, 17,
17, 16, 16, 16, 15, 15, 15,
15, 15, 16, 17, -1
],
lexer: [
'script', 'script', 'script',
'script', 'script', 'script',
'script', 'script', 'script',
'script', 'script', 'script',
'script', 'script', 'script',
'script', 'script', 'script',
'script'
],
lines: [
0, 0, 0, 0, 0, 1, 0,
0, 1, 2, 0, 1, 2, 0,
1, 2, 2, 2, 0
],
stack: [
'global', 'global', 'method',
'method', 'method', 'method',
'object', 'object', 'object',
'object', 'object', 'object',
'object', 'object', 'object',
'object', 'object', 'object',
'method'
],
token: [
'jsonToObj',
'(',
'replaceAll',
'(',
"'\n",
'{',
'"hello"',
':',
'{',
'"FOO"',
':',
'{',
'"world"',
':',
'1234',
'}',
'}',
'}',
`',"FOO",cache.foo));\n`
],
types: [
'word', 'start', 'word',
'start', 'string', 'start',
'string', 'operator', 'start',
'string', 'operator', 'start',
'string', 'operator', 'number',
'end', 'end', 'end',
'string'
]
}

The defect occurs at the … I could introduce a rule for this, but I'd personally rather not allow invalid syntax to pass through.
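One way to read a table like the one above is to zip the parallel arrays into one record per token and then look for anything odd. The sketch below copies only the `token` and `types` columns from that output; `toRecords` is a hypothetical helper, not part of Sparser or Prettify.

```javascript
// `token` and `types` copied from the parse table above.
const table = {
  token: [
    "jsonToObj", "(", "replaceAll", "(", "'\n", "{", '"hello"', ":", "{",
    '"FOO"', ":", "{", '"world"', ":", "1234", "}", "}", "}",
    `',"FOO",cache.foo));\n`,
  ],
  types: [
    "word", "start", "word", "start", "string", "start", "string",
    "operator", "start", "string", "operator", "start", "string",
    "operator", "number", "end", "end", "end", "string",
  ],
};

// Hypothetical helper: turn the parallel arrays into one record per token.
function toRecords(parseTable) {
  return parseTable.token.map((token, index) => ({
    index,
    token,
    type: parseTable.types[index],
  }));
}

const records = toRecords(table);

// String tokens that contain a line break.
const multiLineStrings = records
  .filter((r) => r.type === "string" && r.token.includes("\n"))
  .map((r) => r.index);

// Structures that were opened but never closed as separate tokens.
const starts = records.filter((r) => r.type === "start").length;
const ends = records.filter((r) => r.type === "end").length;

console.log(multiLineStrings); // [ 4, 18 ]
console.log(starts - ends);    // 2
```

On this data the two flagged string tokens are the opening `'` plus line break at index 4 and the trailing `',"FOO",cache.foo));` token at index 18. The two unclosed structures match the two `)` characters that ended up inside that last string token instead of being emitted as `end` tokens.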
Hi, I'm afraid whitespace is breaking proper token detection; without the whitespace this works.
Am I missing some option setup here?
Works:
Fails:
producing a last compact token of
',"FOO",cache.foo));
Options: