Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rationalise regular expressions in Kotlin lexer #1324

Closed
pyrmont opened this issue Sep 7, 2019 · 4 comments · Fixed by #1326
Closed

Rationalise regular expressions in Kotlin lexer #1324

pyrmont opened this issue Sep 7, 2019 · 4 comments · Fixed by #1326
Assignees
Labels
bugfix-request A request for a bugfix to be developed.

Comments

@pyrmont
Copy link
Contributor

pyrmont commented Sep 7, 2019

Name of the lexer
Kotlin

Code sample

name = %r'@?[_\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Nl}][\p{Lu}\p{Ll}\p{Lt}\p{Lm}\p{Nl}\p{Nd}\p{Pc}\p{Cf}\p{Mn}\p{Mc}]*'
name_backtick = %r'#{name}|`#{name}`'
id = %r'(#{name_backtick})'

Additional context
Does id have any purpose? It's used throughout the lexer rather than name but as best as I can tell is just name inside a capture group. The capture group is never used, which seems wasteful.

Directing this at you, @lordcodes, because I'm not familiar with Kotlin and so am not sure if there would be a reason to have done things this particular way.

@pyrmont pyrmont added the bugfix-request A request for a bugfix to be developed. label Sep 7, 2019
@pyrmont pyrmont self-assigned this Sep 7, 2019
@lordcodes
Copy link
Contributor

No, I had exactly the same thought when making changes to the Kotlin lexer. name_backtick can be used instead.

@pyrmont
Copy link
Contributor Author

pyrmont commented Sep 7, 2019

Perhaps this is going overboard but is it worth simply putting the backticks into the name regex as optional matches? It's a non-goal of Rouge to check syntax and so the fact this would allow: (1) backticks in places they shouldn't be; and (2) 'unbalanced' backticks, doesn't seem problematic to me. But what do you think? Perhaps overrationalising things?

@lordcodes
Copy link
Contributor

lordcodes commented Sep 7, 2019

Can be done, I am assuming name and name_backtick were kept separate to make the regexes easier to read.
I will have a look in more detail soon.

@pyrmont
Copy link
Contributor Author

pyrmont commented Sep 7, 2019

Yeah, if they were combined it's probably a good idea to include a code comment explaining that backticks are being included but will result in some technically false positives.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix-request A request for a bugfix to be developed.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants