Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Ruby parsing for block assigns #1732

Closed
wants to merge 6 commits into from

Conversation

tambeta
Copy link
Contributor

@tambeta tambeta commented Apr 9, 2018

Assigning blocks' return values to variables is a common idiom in Ruby. Fix the parser to take this into account.

I'm not sure if matching against regexen would be acceptable in ctags, but this would certainly make things easier. The proposed implementation doesn't use regexps.

I originally submitted this fix to Geany, which includes ctags as part of its source tree - Geany devs recommended posting it to upstream first. The issue is explained over there.

Assigning blocks' return values to variables is a common idiom in Ruby.
Fix the parser to take this into account.
@coveralls
Copy link

coveralls commented Apr 9, 2018

Coverage Status

Coverage increased (+0.03%) to 84.785% when pulling ceaa167 on tambeta:ruby-assign-blocks into 8e09804 on universal-ctags:master.

@b4n
Copy link
Member

b4n commented Apr 9, 2018

Could you provide test cases for this? Also, test cases for your TODOs would be nice too and help future work.

tambeta added 2 commits April 9, 2018 18:33
After entering the `while` block parsing the line character by
character, we don't need to use the previously introduced
canMatchKeywordWithAssign function.
A test case assigning various blocks to variables, for exercising the
fix introduced in 6c90039.
@tambeta
Copy link
Contributor Author

tambeta commented Apr 9, 2018

OK, for now I added a test case for determining if the correct tags are extracted if blocks are assigned to variables.

@masatake
Copy link
Member

masatake commented Apr 9, 2018

@tambeta, Thank you.

Could you paste here an example input that uses ||= and/or += if you don't have time?
I will convert it to .b test case as I wrote in http://docs.ctags.io/en/latest/units.html#gathering-test-cases-for-known-bugs

method_a input.rb /^ x = def method_a$/;" f class:Bar
method_b input.rb /^ def method_b$/;" f class:Bar
method_c input.rb /^ def method_c$/;" f class:Bar
method_d input.rb /^ def method_d$/;" f class:Bar
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't m, c and x be reported as something (variables I guess)?

I'm not sure where this syntax is used, but apparently it does something (store the result of the last statement? not sure from experimenting…), so I guess they should ideally be reported, shouldn't they?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could be reported as variables, but I believe this hasn't been the case thus far. Anyway, it isn't in the scope of this PR, as it just aims to fix the annoying bug where class members are reported as free-floating methods if block assignment is used.

A failing test case using shortcut assignment operators with blocks,
serving as a TODO for improving on 6c90039.
@tambeta
Copy link
Contributor Author

tambeta commented Apr 9, 2018

@masatake, I took the time to read about the protocol for adding known bug tests and the latest commit adds a failing test where various "shortcut" assignment operators are used (+= etc).

Extend block assignment parsing by allowing "shortcut assignment"
operators, e.g. `||=`, `+=`.
@tambeta
Copy link
Contributor Author

tambeta commented Apr 10, 2018

Giving it some thought, it wasn't a big job to extend parsing to allow "shortcut" assignment operators. The latest commit introduces this.

parsers/ruby.c Outdated
*/
static bool advanceWhile (const unsigned char** s, bool (*predicate) (int))
{
const int s_length = strlen ((const char *)*s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t is better.

parsers/ruby.c Outdated
unsigned char this_char;
int i = 0;

for (i = 0; i < s_length; ++i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function scans the input twice: once by strlen and once by this for.
How about replacing them with following while?

while (**s != `\0`)

parsers/ruby.c Outdated
@@ -167,7 +235,7 @@ static bool parseRubyOperator (vString* name, const unsigned char** cp)
{
if (canMatch (cp, RUBY_OPERATORS[i], notOperatorChar))
{
vStringCatS (name, RUBY_OPERATORS[i]);
vStringCatS (name, RUBY_OPERATORS[i]);
Copy link
Member

@masatake masatake Apr 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you keep this line as the orignal?

@masatake
Copy link
Member

I made some minor comments but it looks good to me.

@tambeta
Copy link
Contributor Author

tambeta commented Apr 10, 2018

Noted and implemented.

@masatake
Copy link
Member

Thank you.

@masatake
Copy link
Member

Could you look at #1734. To make the history of commits simple, I rearranged your changes.

@tambeta
Copy link
Contributor Author

tambeta commented Apr 11, 2018

OK, let's take the discussion over there.

@masatake
Copy link
Member

I merged your work in #1734. Thank you.

@masatake masatake closed this Apr 15, 2018
@tambeta
Copy link
Contributor Author

tambeta commented Apr 15, 2018

Great, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants