Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spurious Performance/Count violations in rails #1868

Closed
wli opened this issue May 5, 2015 · 12 comments · Fixed by #1872
Closed

Spurious Performance/Count violations in rails #1868

wli opened this issue May 5, 2015 · 12 comments · Fixed by #1872

Comments

@wli
Copy link
Contributor

wli commented May 5, 2015

Just upgraded to 0.31.0, and it looks like Performance/Count is complaining about my rails code. Specifically, a line like this:

User.select { |x| x.admin? == true }.count

will result in an error: "Use count instead of select...count"

However, the .count function in rails does not work with a block. You can see that nothing happens when I try to follow the suggestion:

irb(main):033:0> User.count{|x| x.admin?}
   (9.6ms)  SELECT COUNT(*) FROM "users" 
=> 56373
irb(main):034:0> User.count
   (5.5ms)  SELECT COUNT(*) FROM "users" 
=> 56373
irb(main):035:0> User.select{|x| x.admin?}.count
  (530.9ms)  SELECT "users".* FROM "users" 
=> 21

Is there a workaround for this, or do I need to turn off this cop for all rails projects?

cc @rrosenblum

@bbatsov
Copy link
Collaborator

bbatsov commented May 6, 2015

I find your usage of select a bit strange - why not use User.where(admin: true).count instead?

@wli
Copy link
Contributor Author

wli commented May 6, 2015

Funny you bring that up, it complains about that case too: Use count instead of where...count.. Specifically, something like where(admin: false).select('distinct user_ip').count. This is a different failure mode, but same cop. This one can luckily be fixed by simply swapping the order of the where and select clauses, but ideally it wouldn't complain about it since there's no performance benefit by swapping the clauses.

For my original post, I simplified the example query so it's understandable. I agree you'd never do this with the user table in a real project, but that doesn't change the overall problem here. The actual query I'm using in my project is a bit more complex, and can't be easily done in the database.

In addition, when following a relation, you might want to avoid a database query if you have all the data local already. Something like this:

user_with_comments = User.joins(:comments)
num_long_comments = user_with_comments.select { |x| x.size > 5 }.count
return user_with_comments, num_long_comments

@bbatsov
Copy link
Collaborator

bbatsov commented May 6, 2015

If for where(admin: false).select('distinct user_ip').count the cop reports a problem it's a bug. The cop is supposed to ignore select without a block or &:something argument.

@wli
Copy link
Contributor Author

wli commented May 6, 2015

Yep, here's the actual violation call if it helps debugging:

app/models/provider_stat.rb:47:109: C: Use count instead of where...count.
    where(["provider_id = ? and created_at >= ? and created_at < ? ", provider.id, month, month + 1.month]).select('distinct user_ip').count

@rrosenblum
Copy link
Contributor

Thanks for reporting the bug, and sorry for any issues that it has caused you.

This cop can be improved, but I do not think that it can be completely fixed.

As @bbatsov said,

The cop is supposed to ignore select without a block or &:something argument.

The cop, right now, does not have a safe guard for the params only being the &:something format.
This part of the cop can be fixed so that where(admin: false).select('distinct user_ip').count would not register an offense.

As for the case of User.select { |x| x.admin? == true }.count, this is a completely valid case in rails that I do not think we can account for in RuboCop.

select api
count api

The use case from rails is Model.all.select { |m| m.field == value }. From the documentation:

This will build an array of objects from the database for the scope, converting them into an array and iterating through them using Array#select.

Active Record is kind of lazy in its implementation of select with a block. It does not modify the query to the database as I originally expected. The source code show that it gets the results as an array and then calls select.

def select(*fields)
  if block_given?
    to_a.select { |*block_args| yield(*block_args) }
  else
    raise ArgumentError, 'Call this with at least one field' if fields.empty?
    spawn._select!(*fields)
  end
end

Unfortunately, Active Record does not implement the same logic for count with a block.

def count(column_name = nil, options = {})
  # TODO: Remove options argument as soon we remove support to
  # activerecord-deprecated_finders.
  column_name, options = nil, column_name if column_name.is_a?(Hash)
  calculate(:count, column_name, options)
end

I personally feel that Active Record should treat count with a block in the same way that it treats select with a block. If they had similar implementations, then this would not be an issue. I will try to find more information as to why it works this way and possibly create a pull request for to change the implementation of count.

Since there are conditions that this cop cannot safely handle, I think that the options are to remove the cop, or modify it to only complain about the &:something syntax. I would like to get @bbatsov, @jonas054, and anyone else's opinion this situation.

@bbatsov
Copy link
Collaborator

bbatsov commented May 6, 2015

I think we should just fix the problem with select('some string') for now. The problematic usage is fairly uncommon (at least in my experience), so I doubt many people will have issues with it.

rrosenblum added a commit to rrosenblum/rubocop that referenced this issue May 7, 2015
rrosenblum added a commit to rrosenblum/rubocop that referenced this issue May 9, 2015
bbatsov added a commit that referenced this issue May 9, 2015
[Fix #1868] Only register an offense in Count for select with block or &:something format
@tobypinder
Copy link

There are a number of scenarios where this usage impacts our applications too. User.select { |x| x.admin? == true }.count is an antipattern because it forces the full ActiveRecord::Relation to load, but there are many times that we specifically try to get the "full" queries to execute to avoid n+1 problems, and let ruby sift the data after.

I would speculate that many people manage their ActiveRecord::Relation execution like this, especially since Rails 4 has such good support for passing around unexecuted relations, and popular gems like Bullet help identify n+1 query issues that may have arisen from over specific queries.

While I recognise that the cop is unfixable currently, I might suggest disabling it by default as @rrosenblum suggested was one of the alternatives? Or restricting its capabilities even further, if it's possible to make a safe variant?

@bbatsov
Copy link
Collaborator

bbatsov commented May 12, 2015

Half the Rails checks have limitations, so I'd rather leave it to the users to decide which to use and which to leave out. I plan to dwell more on those some day, but they are generally low on my todo list. It's way more important for me to avoid positives in the universal Ruby checks.

@tobypinder
Copy link

My mistake, I didn't realise this was a Rails-specific cop - I had assumed these were rails specific problems for an otherwise generic Ruby performance cop.

@rrosenblum
Copy link
Contributor

The original intention for this cop was not to be a Rails specific cop. In implementation, it appears to not function well in the context of Rails.

I can modify the cop to not register an offense when select...count is called as a class method. This would fix the use case of User.select { |x| x.admin? == true }.count, and still leave to cop to be effective when called on enumerable. The only thing that this will not catch is dynamic programming of setting the class to a variable before hand.

klass = user? User : Account
klass.select { |x| x.admin? == true }.count

@KODerFunk
Copy link

Hi! I use relation for unpersisted object on new action for predefined form filling and check visitors = f.object.registration_visitors.select { |v| v.user.present? }.size.
Check assigned user because use cocoon gem and nested attributes for registration_visitors when one registration_visitor is empty.

@ngoskillz
Copy link

I just ran into the same problem @wli and my work around was to put everything up to the select in a separate variable, then ran that variable with .count on the next line to avoid the rubocop issues.

Edited: I just realized that the count method can take in a block, so you don't need the select. Either way, both are good for rubocop. You can do this:
User.count { |x| x.admin? == true }

Or with your other example:
user_with_comments = User.joins(:comments)
num_long_comments = user_with_comments.count { |x| x.size > 5 }
return user_with_comments, num_long_comments

For example with your code above:
User.select { |x| x.admin? == true }.count

Workaround:
your_variable = User.select { |x| x.admin? == true }
your_variable.count

So in your other case:
user_with_comments = User.joins(:comments)
num_long_comments = user_with_comments.select { |x| x.size > 5 }.count
return user_with_comments, num_long_comments

You could try this workaround:
user_with_comments = User.joins(:comments)
num_long_comments = user_with_comments.select { |x| x.size > 5 }
return user_with_comments, num_long_comments.count

I hope this workaround helps if anyone else runs into this rubocop issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants