Create `infer_object` #2167

DanielNoord · 2023-05-06T19:32:31Z

Type of Changes

	Type
✓	🐛 Bug fix
✓	✨ New feature
✓	🔨 Refactoring

Description

Perhaps I should have created an issue for this, but:

This is my proposal to remove the import hell from astroid and the method assignment we do for the inference.py file.
Instead of calling node.infer() we would now just call infer_object(node).
This decouples the nodes and their meta-information from the actual astroid magic that needs and uses that information. Next steps would be to change calls in the astroid code base and remove _infer everywhere.
The migration in pylint seems rather straightforward as there are not that many calls to infer() actually.

I haven't made a news fragment as I'd like to do this after we have finalised the internal transition.

Let me know what you think guys!

codecov · 2023-05-06T19:40:51Z

Codecov Report

Merging #2167 (75f0b08) into main (0740a0d) will decrease coverage by 0.02%.
The diff coverage is 98.92%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2167      +/-   ##
==========================================
- Coverage   92.53%   92.52%   -0.02%     
==========================================
  Files          94       94              
  Lines       10804    10864      +60     
==========================================
+ Hits         9998    10052      +54     
- Misses        806      812       +6

Flag	Coverage Δ
linux	`92.28% <98.92%> (-0.02%)`	⬇️
pypy	`87.50% <98.92%> (+0.01%)`	⬆️
windows	`92.12% <98.92%> (-0.02%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
astroid/inference.py	`95.17% <98.90%> (+0.59%)`	⬆️
astroid/nodes/node_ng.py	`91.58% <100.00%> (-1.11%)`	⬇️

... and 2 files with indirect coverage changes

Pierre-Sassoulas

I don't know what the right approach is, but shouldn't we remove more code in node_classe and scope_nodes if we refactor the way it's done right now ?

Pierre-Sassoulas · 2023-05-07T07:45:06Z

astroid/inference.py

+    node: nodes.NodeNG, context: InferenceContext | None = None, **kwargs: Any
+) -> Generator[InferenceResult, None, None]:
+    """Find the infer method for the given node and call it."""
+    if isinstance(


Given the number of time this is going to be called it might be a good idea to do stats on the most common one so it's tested first if we go with this approach.

Yeah I think we might want to refactor this anyway before releasing the final version.

For now I'd say, let's use this as a start and see how we can optimise. I think there are a lot of other small optimisations we could do here.

Even with a statistical approach, I'm nervous about this approach, it feels like a code smell that we should be using polymorphism, which is more like the existing way.

We could also use a dictionary of nodes types and associated methods, which makes it (almost) O(1). However, I think there might be good reason to eventually refactor this inference function to be part of an instantiated AstroidManager object instead of having it rely on one global AstroidManager class (as this makes parallelisation much harder).
Thus, too many optimisations to this don't make sense as the moment I think. It's more about the intent of decoupling nodes from the interaction with nodes.

Agree with @jacobtylerwalls about the code smell. Couldn't we put this in the nodes instead of outside the node ? Sometime the deferred creation of the infer function wasn't required.

Just look at the imports that are needed in this file. If we want to put this in astroid.nodes we would have to do all of those within the methods. See also my other comment in the main thread.
This design really boils down whether the nodes should know how to infer themselves. I don't think they do as it heavily complicates the structure of the code and so far we haven't found a good way to combine it in one file.

DanielNoord · 2023-05-07T07:51:04Z

No, this works already because we don't remove the old way of doing inference with this PR.
Only after we remove _infer we need to do some more refactoring.

jacobtylerwalls · 2023-05-07T13:07:05Z

Before offering a view on whether this is the right direction, I'd like to understand a bit more of the benefits.

This decouples the nodes and their meta-information from the actual astroid magic that needs and uses that information.

Can you elaborate on this? I'm not sure I see the immediate upside.

DanielNoord · 2023-05-07T13:12:12Z

Can you elaborate on this? I'm not sure I see the immediate upside.

One of the issue (imo) with the current code structure is that the node objects serve two purposes: they 1) encode information about the node, its position, children, parent, etc. and 2) have a lot of methods that interact with them. However, this second category doesn't actually need to be "on" the nodes and gives a lot of issues such as this 1000 line file of method assignments.
I think it should be possible to create a full astroid.nodes module without any imports from other files: they are essentially just drop in replacements of stdlib ast nodes. The second part of the astroid library would then be the inference logic that can interact with the nodes package and can tell you more about the nodes based on inference. By decoupling the storage of "nodes information" and the "interaction with nodes" I think we would get a much cleaner code base without the import and method assignment mess that we currently deal with.

Pierre-Sassoulas · 2023-05-07T13:29:53Z

However, this second category doesn't actually need to be "on" the nodes and gives a lot of issues such as this 1000 line file of method assignments.
I think it should be possible to create a full astroid.nodes module without any imports from other files: they are essentially just drop in replacements of stdlib ast nodes.

And ast nodes are very well typed so inheriting from ast nodes would be nice.

We could use composition inside NodeNG for inference instead of using an external function that check the type of its argument hough.

jacobtylerwalls · 2023-05-07T13:33:30Z

I think we would get a much cleaner code base without the import and method assignment mess that we currently deal with.

The method assignments are definitely lame, but we can solve that in two ways: either your proposal, or just overriding _infer() everywhere. I think I'd like to see the second approach before deciding, and I'm volunteering to put up a PR to compare!

Also, I don't think we should merge either approach to main just yet--I think we should still review and merge incremental PRs, but just merge them to a dedicated branch. With something this disruptive it would be nice to wait to merge to main until the docs are done and we're sure the performance has improved. (In other words, I'd like the chance to have a final go/no-go decision.)

DanielNoord · 2023-05-07T13:33:40Z

We could use composition inside NodeNG for inference instead of using an external function that check the type of its argument hough.

This doesn't solve the issue that many inference functions use parts of the astroid.nodes package themselves. For example, some nodes are inferred based on the type of their parent. This is an inherent cyclic import in the inference code paths which will always exist as long as nodes need to know how to infer themselves.

DanielNoord · 2023-05-07T13:36:50Z

I think we would get a much cleaner code base without the import and method assignment mess that we currently deal with.

The method assignments are definitely lame, but we can solve that in two ways: either your proposal, or just overriding _infer() everywhere. I think I'd like to see the second approach before deciding, and I'm volunteering to put up a PR to compare!

I could help review such as PR. I have tried previously to fix this, but the import dependency were just too sever to define them cleanly on the nodes themselves.

Also, I don't think we should merge either approach to main just yet--I think we should still review and merge incremental PRs, but just merge them to a dedicated branch. With something this disruptive it would be nice to wait to merge to main until the docs are done and we're sure the performance has improved. (In other words, I'd like the chance to have a final go/no-go decision.)

I can understand that, but do think this creates a very high risk of becoming another 2.0 branch. The change itself is very disruptive, but is managable for client libraries (as can be seen from searching for .infer( in pylint). There is a balance to strike here.

jacobtylerwalls · 2023-05-07T17:09:41Z

I could help review such as PR.

Thank you! Taking a break, but I made some good progress earlier today on this.

jacobtylerwalls · 2023-05-13T13:39:21Z

Superseded by #2171

jacobtylerwalls · 2023-05-14T13:51:06Z

Sounds like I prematurely closed this @DanielNoord, sorry 😅

We could also use a dictionary of nodes types and associated methods, which makes it (almost) O(1)

For me, this was one of the main flaws I felt with this approach. If we reintroduce a dynamic mapping of nodes to inference functions, we've reintroduced the original indirection we were trying to improve on. If it's not dynamic, then it's coded with a match/case or a storm of isinstance checks.

DanielNoord · 2023-05-14T13:54:24Z

For me, this was one of the main flaws I felt with this approach. If we reintroduce a dynamic mapping of nodes to inference functions, we've reintroduced the original indirection we were trying to improve on. If it's not dynamic, then it's coded with a match/case or a storm of isinstance checks.

I think if we hardcode the mapping there is much less indirection. For me the main issue was that the node objects are defined across multiple files and you can't see what methods they have at runtime.
My splitting off the inference from the nodes you can still reasonably understand the different code paths by looking at the code.

jacobtylerwalls · 2023-05-14T13:55:44Z

if it's not dynamic, then it's coded with a match/case

And then in this case, we lose the static analysis/type checking benefits of the polymorphism approach. See this disable on my branch necessitated from better inference.

jacobtylerwalls · 2023-05-14T14:04:58Z

I'm starting to wonder if we're solving two different problems or if we're just coming from two different ideas of what makes better Python design. Or both 😅 !

For me the main issue was that the node objects are defined across multiple files and you can't see what methods they have at runtime.
My splitting off the inference from the nodes you can still reasonably understand the different code paths by looking at the code.

Would you be able to flesh this out a little more? I'm not sure what you're referring to when you say you can't see what methods they have at runtime.

DanielNoord · 2023-05-14T14:16:35Z

I'm starting to wonder if we're solving two different problems or if we're just coming from two different ideas of what makes better Python design. Or both 😅 !

Well, I'm in general not a big fan of composition when it involves setting instance attributes. The current base nodes "work" because they "upgrade" (or build upon) the node that they inherit from them and give some more functionality.
I think the pattern where we have to annotate an attribute at the class level but can't enforce it being set shows that composition might not be ideal in that situation.

Would you be able to flesh this out a little more? I'm not sure what you're referring to when you say you can't see what methods they have at runtime.

Well, by looking at the definition of a node in the current code you can't see they will have a _infer method at runtime. That's (obviously) a flaw. Your proposed fix fixes that issue and is therefore already a clear improvement.

However, I still wonder whether it makes sense to have that much code in the nodes modules that depends on other modules or on itself. If I were able to redesign astroid I would probably split the inference engine from the nodes module. The inference just has too much state that is now put into the nodes. For example, NodeNG.infer still depends on global state of the AstroidManager. I think it would be much better to be able to instantiate a manager and ask it to infer a node for you that you pass to it. That makes it much easier to have one entry point for astroid which I think makes it easier to parallelise or daemon-ize some of the parts of pylint and astroid.

So, I'm not necessarily opposed to your PR as it is definitely an improvement. However, on a higher level I think we should move in another direction and we could start doing so with this issue/fix.

DanielNoord · 2023-05-24T22:31:52Z

@jacobtylerwalls gentle ping on this issue as I don't want to lose all the work you put into the other PR (and this discussion).

jacobtylerwalls · 2023-05-25T07:55:06Z

Sounds like we need a decision. Should we get more views from others?

DanielNoord · 2023-05-27T09:13:41Z

@Pierre-Sassoulas @mbyrnepr2 @cdce8p Do you have an opinion on the matter?

Pierre-Sassoulas · 2023-05-27T19:13:21Z

I think #2171 is the best approach because the implementation would be in the node class where I'd personally expect it to be.

jacobtylerwalls · 2023-06-06T17:52:20Z

Nick weighed in in favor of #2171 at pylint-dev/pylint#8735 (comment).

DanielNoord · 2023-06-06T18:40:35Z

Nick weighed in in favor of #2171 at pylint-dev/pylint#8735 (comment).

I mean, Nick would probably also be in favour of an approach where we split inference from the nodes as long as the module isn't so "gnarly".

I must confess it's a bit frustrating that I feel that some of my points (e.g., the use of global state and its impact on future multiprocessing improvements, composition with incompatible __init__ and __postinit signatures, etc.) haven't been discussed. I still feel strongly about the fact that we should keep the nodes simple like they are in ast and do everything else we need in a separate module. That said, it feels like I'm fighting an uphill battle here so I'll give in. I probably won't be contributing much to the implementation but I'd be happy to give input when asked for it!

jacobtylerwalls · 2023-06-06T18:46:43Z

One problem is that we have splintered conversation across a couple of PRs. Daniël, I'd be glad to open an issue and summarize the discussion we've had so far. I'd like to make sure that all points get a hearing before we take a decision. Will that work well for you?

jacobtylerwalls · 2023-06-06T18:49:18Z

(It's possible that for the issues you feel strongly about, I just failed to see clearly their connection with the changesets we're discussing--but that could become clearer for me once we have a side-by-side pros-n-cons table. I'll tag you to collaborate on it when it's ready!)

Pierre-Sassoulas · 2023-06-06T19:53:58Z

I still feel strongly about the fact that we should keep the nodes simple like they are in ast and do everything else we need in a separate module.

I view astroid as an overlay over the ast module with a lot of embedded utilities to make your life easier (the main one being inference). If we want those utility to be discoverable when using astroid without reading the doc we need them to be on the node object. I'm not super knowledgeable about astroid but it feel like this is one positive aspect of astroid's design that you're able to simply explore the node with a small reproducer file with the node you want to target and a debugger without having to read tons of docs (docs that are lacking, and that we should start to produce if we choose to do external utility functions imo).

Regarding the limitation of global states and multiprocessing, I'm not sure I followed well enough because I don't see how the proposed design would affect it ? Could you summarize the main argument in another issue like Jacob suggested ? If there's a limitation with the approach from #2171 we need to take it into account before investing in a refactor, for sure.

DanielNoord · 2023-06-06T20:32:47Z

but that could become clearer for me once we have a side-by-side pros-n-cons table. I'll tag you to collaborate on it when it's ready!)

I'm assuming Jacob will make an issue, or am I misunderstanding you here?

jacobtylerwalls · 2023-06-06T20:35:17Z

I will when I have time in the next few days!

jacobtylerwalls · 2023-06-07T09:16:41Z

Turns out we had an existing issue! See #679 (comment).

@DanielNoord I would be very grateful if you would read and respond when you have time, including fleshing out the parts where I didn't want to speak for you or didn't fully understand your argument. (I would like to be able to 😄 !)

Create infer_object

75f0b08

DanielNoord added the Enhancement ✨ Improvement to a component label May 6, 2023

DanielNoord added this to the 3.0.0a3 milestone May 6, 2023

DanielNoord requested review from Pierre-Sassoulas and jacobtylerwalls May 6, 2023 19:32

Pierre-Sassoulas reviewed May 7, 2023

View reviewed changes

DanielNoord mentioned this pull request May 7, 2023

Remove all references to _infer #2168

Closed

2 tasks

jacobtylerwalls mentioned this pull request May 8, 2023

Move inference methods to nodes module(s) #2171

Merged

jacobtylerwalls closed this May 13, 2023

jacobtylerwalls removed their request for review May 13, 2023 13:39

jacobtylerwalls mentioned this pull request Jun 6, 2023

Checking incorrect usage of the keyword 'nonlocal' results in fatal errors. pylint-dev/pylint#8735

Closed

DanielNoord deleted the split-infer branch June 6, 2023 18:40

jacobtylerwalls mentioned this pull request Jun 7, 2023

Remove monkey-patching of methods onto classes #679

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Create `infer_object` #2167

Create `infer_object` #2167

DanielNoord commented May 6, 2023

codecov bot commented May 6, 2023

Pierre-Sassoulas left a comment

Pierre-Sassoulas May 7, 2023

DanielNoord May 7, 2023

jacobtylerwalls May 7, 2023

DanielNoord May 7, 2023

Pierre-Sassoulas May 7, 2023

DanielNoord May 7, 2023

DanielNoord commented May 7, 2023

jacobtylerwalls commented May 7, 2023

DanielNoord commented May 7, 2023

Pierre-Sassoulas commented May 7, 2023

jacobtylerwalls commented May 7, 2023

DanielNoord commented May 7, 2023

DanielNoord commented May 7, 2023

jacobtylerwalls commented May 7, 2023

jacobtylerwalls commented May 13, 2023

jacobtylerwalls commented May 14, 2023

DanielNoord commented May 14, 2023

jacobtylerwalls commented May 14, 2023

jacobtylerwalls commented May 14, 2023

DanielNoord commented May 14, 2023 •

edited

Loading

DanielNoord commented May 24, 2023

jacobtylerwalls commented May 25, 2023

DanielNoord commented May 27, 2023

Pierre-Sassoulas commented May 27, 2023

jacobtylerwalls commented Jun 6, 2023

DanielNoord commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

Pierre-Sassoulas commented Jun 6, 2023

DanielNoord commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

jacobtylerwalls commented Jun 7, 2023

Create infer_object #2167

Create infer_object #2167

Conversation

DanielNoord commented May 6, 2023

Type of Changes

Description

codecov bot commented May 6, 2023

Codecov Report

Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

Pierre-Sassoulas May 7, 2023

Choose a reason for hiding this comment

DanielNoord May 7, 2023

Choose a reason for hiding this comment

jacobtylerwalls May 7, 2023

Choose a reason for hiding this comment

DanielNoord May 7, 2023

Choose a reason for hiding this comment

Pierre-Sassoulas May 7, 2023

Choose a reason for hiding this comment

DanielNoord May 7, 2023

Choose a reason for hiding this comment

DanielNoord commented May 7, 2023

jacobtylerwalls commented May 7, 2023

DanielNoord commented May 7, 2023

Pierre-Sassoulas commented May 7, 2023

jacobtylerwalls commented May 7, 2023

DanielNoord commented May 7, 2023

DanielNoord commented May 7, 2023

jacobtylerwalls commented May 7, 2023

jacobtylerwalls commented May 13, 2023

jacobtylerwalls commented May 14, 2023

DanielNoord commented May 14, 2023

jacobtylerwalls commented May 14, 2023

jacobtylerwalls commented May 14, 2023

DanielNoord commented May 14, 2023 • edited Loading

DanielNoord commented May 24, 2023

jacobtylerwalls commented May 25, 2023

DanielNoord commented May 27, 2023

Pierre-Sassoulas commented May 27, 2023

jacobtylerwalls commented Jun 6, 2023

DanielNoord commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

Pierre-Sassoulas commented Jun 6, 2023

DanielNoord commented Jun 6, 2023

jacobtylerwalls commented Jun 6, 2023

jacobtylerwalls commented Jun 7, 2023

Create `infer_object` #2167

Create `infer_object` #2167

DanielNoord commented May 14, 2023 •

edited

Loading