-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop generating more methods than necessary #293
base: main
Are you sure you want to change the base?
Conversation
|
||
gem.contains_path?(source_location) | ||
source_location == "(eval)" || gem.contains_path?(source_location) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is the source_location equal to (eval)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, forgot to mention that in the PR description, good catch. That happens when:
class Foo
module_eval("def foo; end")
end
Foo.instance_method(:foo).source_location
# => ["(eval)", 1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most usages of eval
and friends do:
class Foo
module_eval("def bar; end", __FILE__, __LINE__)
end
puts Foo.instance_method(:bar).source_location
# => ["foo.rb", 2]
though, so I would expect this to be an edge-case situation.
Still, I think erring on the side of caution for eval defined methods is the right call.
593bfa7
to
da53327
Compare
We used to treat methods that didn't have a source location the same as methods that explicitly were not defined in the current gem. That resulted in Tapioca creating more method definitions than necessary. We would only skip method generation for a method if the constant it was on was an ignored type (i.e. a built-in type), so that we wouldn't keep redefining methods for built-in types. However, for all other types, especially types that come from other gems, we would just keep on generating all the methods regardless of if they were defined by this gem or not. Moreover, the source location check was happening at the wrong location, before unwrapping the method signature. Thus, many methods with signatures would not be generated when the previous problem was fixed, since our code would see them as being defined in Sorbet runtime. The fix is to return a more fine-grained result from `method_in_gem?` which signals yes/no/don't-have-source-location. Based on that we can skip generating don't-have-source-location cases if they are for built-in types and totally ignore the methods that have a source location and are definitely not defined in the current gem. Additionally, if we try to unwrap the method signature and we get an exception, that means the signature block raised an error. If we continue with the method as is, the source location checks would think the method definition does not belong in the gem (since the method is still wrapped), and we would thus skip the method generation. To avoid that, the `signature_for` method is now raising a custom exception to signal that exceptional case, so that we can at least continue generating "a" method definition.
da53327
to
0d40ba9
Compare
Unfortunately, while the CI is green, this work is still not complete, since the implementation ends up removing too many methods from generated RBI files. In general there are 2 ways in which "gem B" can add methods to a constant
To do this properly, we need to attribute such dynamic method definitions to the file require that triggered them, just like how we did with mixin tracking. Unlike mixin tracking, though, this is much harder to do, since we can't intercept |
I've taken a stab at fixing the problem above. I have a failing test demonstrating Ufuk's use case and some half-working code. Here's my WIP PR, I'll return to it next week. |
Does hooking |
That is exactly what @egiurleo's PR in the comment above is doing, but in general I am afraid it will be brittle since you need one constant to forget to call |
This commit adds a `MethodDefinition` tracker which properly attributes methods definitions to the gem that triggers them. This is useful for tracking which gem is responsible for a method definition, and is used in the `Tapioca::Gem::Pipeline` to check if a method should be generated for the given gem or not.
By tracking line numbers for method definitions, we can generate more accurate source locations for methods in RBI files. This will help with tools that rely on source locations, such as Ruby LSP.
When we are dealing with C-extension/native methods, we have been relying on the fact that `source_location` method returning `nil` for those methods. When we switched to using the method definition tracker, we ended up breaking this and started generating method definitions for all the stdlib gem constant methods. This commit tries to catch the case the same case by looking at the previous frame and seeing if it is a `require`, which means that something was required but there are no more Ruby frames, which implies that the attribution should be to a C-extension/native method.
Aliased methods seem to have equal `Method` objects associated with them, so we can't use the method object the a key in a hash. Instead, we should store the method definitions in a hash keyed by the owner of the method, and then the method name. This way, we can disambiguate between aliased methods.
dcd91bd
to
deda787
Compare
By checking the same flag that Sorbet runtime uses to skip method_added handling, we can avoid processing method redefinitions that are triggered by Sorbet runtime.
6ee7756
to
a11f4ce
Compare
@@ -7,7 +7,7 @@ gemspec | |||
CURRENT_RAILS_VERSION = "7.1" | |||
rails_version = ENV.fetch("RAILS_VERSION", CURRENT_RAILS_VERSION) | |||
|
|||
gem "minitest" | |||
gem "minitest", "< 5.25.0" # minitest 5.25.0+ is incompatible with minitest-hooks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be removed minitest/minitest@6d83843
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll clean up this branch quite a lot, but I needed to make sure the tests weren't failing.
Previously we were doing two things wrong: 1. Looking up the method object from the class that the `{singleton_}method_added` call was happening on and indexing method definitions keyed by that was mostly working. However, it was failing in an interesting way when a method was an alias to another method. In that case, the method object would be the same, but the method definition would be different. For example, `alias_method :foo, :bar` would cause the lookup of `foo` method to return the definition of `bar` method. That would cause us to fail generating `foo` method for a gem, if `bar` was defined in another gem. 2. We were storing a single definition for a method, but that proved to be not enough. For example, if a method was defined in a gem and then redefined in application code, we would only store the definition from the application location, and never be able to attribute the method to the gem it was originally defined in. That lead us to not generate that method for the gem. Instead, we now store all definitions of a method in `MethodDefinition` tracker. When looking up a method definition, we now look up all definitions. If there are no definitions, that means the method is probably a C-method, so we return `nil`. If there are definitions, we see if we can find one that matches the gem location. If not, we check for `(eval)` locations, in which case, we include the method but can't return a source location. If all fails, we return `false` to signal that we couldn't find a definition. If we are able to find a definition that matches the gem location, we return that definition. This new logic is used both in `Tapioca::Gem::Listeners::Methods` and `Tapioca::Gem::Listeners::SourceLocation` listeners. The former uses it to check if the method should be included in the gem RBI, and the latter uses it to add more correct source location information in the comments.
330b100
to
862f422
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic work!
sig { params(method_name: Symbol, owner: Module).returns(T.any([String, Integer], NilClass, T::Boolean)) } | ||
def method_definition_in_gem(method_name, owner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is less-than-obvious, but I can't think of a better one. How about we add a doc comment? The return value is particularly tricky to grok
sig { params(method_name: Symbol, owner: Module).returns(T.any([String, Integer], NilClass, T::Boolean)) } | |
def method_definition_in_gem(method_name, owner) | |
# @returns - The source file and line number, if this method was defined in this pipeline's gem | |
- `nil` if the source location is unknown | |
- `true` if the method was defined via `eval` without a known file/line number | |
- `false` if this method wasn't defined by this gem | |
sig { params(method_name: Symbol, owner: Module).returns(T.any([String, Integer], NilClass, T::Boolean)) } | |
def method_definition_in_gem(method_name, owner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use a class to represent the result, maybe accompanied with an enum to state the found/not found?
Or we can go full blown inheritance:
class MethodLocationLookupResult
abstract!
end
# The method doesn't seem to exist
class MethodLocationUnknown < MethodLocationLookupResult; end
# The method exists but doesn't have a source location
class MethodLocationNotSpecified < MethodLocationLookupResult; end
# The method exists and has a source location
class MethodLocationLookupInGem < MethodLocationLookupResult
@location: [String, Integer]
end
This would make handling the return result much easier.
# this looks something like: | ||
# "(eval at /path/to/file.rb:123)" | ||
# and we are just interested in the "/path/to/file.rb" part | ||
EVAL_SOURCE_FILE_PATTERN = T.let(/\(eval at (.+):\d+\)/, Regexp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we anchor the start or end of this with ^
and/or $
?
|
||
file, line = Object.const_source_location(constant_name) | ||
|
||
# Ruby 3.3 adds automatic definition of source location for evals if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GOOD
# `file` and `line` arguments are not provided. This results in the source | ||
# file being something like `(eval at /path/to/file.rb:123)`. We try to parse | ||
# this string to get the actual source file. | ||
file = file&.sub(EVAL_SOURCE_FILE_PATTERN, "\\1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file = file&.sub(EVAL_SOURCE_FILE_PATTERN, "\\1") | |
file = file&.sub(EVAL_SOURCE_FILE_PATTERN, '\1') |
lib/tapioca/gem/pipeline.rb
Outdated
|
||
gem.contains_path?(source_file) | ||
end | ||
|
||
sig { params(method: UnboundMethod).returns(T::Boolean) } | ||
sig { params(method: UnboundMethod).returns(T.nilable(T::Boolean)) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment about nil
meaning no source location
@@ -129,15 +129,19 @@ def qualified_name_of(constant) | |||
end | |||
end | |||
|
|||
SignatureBlockError = Class.new(StandardError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a Tapioca::Error
?
lib/tapioca/runtime/reflection.rb
Outdated
resolved_loc = locations.find { |loc| loc.absolute_path == resolved_loc.absolute_path } | ||
return ["", 0] unless resolved_loc | ||
|
||
[resolved_loc.absolute_path || "", resolved_loc.lineno] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there the risk of absolute_path
being nil yet lineno is not?
definition = @pipeline.method_definition_in_gem(event.method.name, event.constant) | ||
|
||
case definition | ||
when NilClass, FalseClass, TrueClass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is unfortunate with the return type of method_definition_in_gem
, I think we can do a bit more elegant, see my comment on the method itself.
sig { params(method_name: Symbol, owner: Module).returns(T.any([String, Integer], NilClass, T::Boolean)) } | ||
def method_definition_in_gem(method_name, owner) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could use a class to represent the result, maybe accompanied with an enum to state the found/not found?
Or we can go full blown inheritance:
class MethodLocationLookupResult
abstract!
end
# The method doesn't seem to exist
class MethodLocationUnknown < MethodLocationLookupResult; end
# The method exists but doesn't have a source location
class MethodLocationNotSpecified < MethodLocationLookupResult; end
# The method exists and has a source location
class MethodLocationLookupInGem < MethodLocationLookupResult
@location: [String, Integer]
end
This would make handling the return result much easier.
@@ -10,14 +10,14 @@ module MethodDefinition | |||
|
|||
@method_definitions = T.let( | |||
{}.compare_by_identity, | |||
T::Hash[Module, T::Hash[Symbol, T.nilable([String, Integer])]], | |||
T::Hash[Module, T::Hash[Symbol, T::Array[[String, Integer]]]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're really using this tuple a lot, I wonder if we should extract it as a FileLocation
:
class FileLocation
@path: String
@line: Integer
end
Motivation
We used to treat methods that didn't have a source location the same as methods that explicitly were not defined in the current gem. That resulted in Tapioca creating more method definitions than necessary.
We would only skip method generation for a method if the constant it was on was an ignored type (i.e. a built-in type), so that we wouldn't keep redefining methods for built-in types. However, for all other types, especially types that come from other gems, we would just keep on generating all the methods regardless of if they were defined by this gem or not.
Moreover, the source location check was happening at the wrong location, before unwrapping the method signature. Thus, many methods with signatures would not be generated when the previous problem was fixed, since our code would see them as being defined in Sorbet runtime.
Implementation
The fix is to return a more fine-grained result from
method_in_gem?
which signals yes/no/don't-have-source-location. Based on that we can skip generating don't-have-source-location cases if they are for built-in types and totally ignore the methods that have a source location and are definitely not defined in the current gem.Additionally, if we try to unwrap the method signature and we get an exception, that means the signature block raised an error. If we continue with the method as is, the source location checks would think the method definition does not belong in the gem (since the method is still wrapped), and we would thus skip the method generation. To avoid that, the
signature_for
method is now raising a custom exception to signal that exceptional case, so that we can at least continue generating "a" method definition.Tests
Updated existing tests.