Speed up at_addrs with a dict. #38

rainwoodman · 2018-07-05T18:40:54Z

The old loop was O(mn). This is O(m+n).
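For comparison, here is a minimal sketch of the two strategies. The names `at_addrs_scan` and `at_addrs_dict` are illustrative only and are not objgraph's API. Note that the scan is O(mn) only when `address_set` is a list or other sequence; with a real `set`, each membership test is O(1) on average and the scan is already O(n).

```python
import gc


def at_addrs_scan(address_set):
    """Old approach: walk every GC-tracked object, test its id for membership."""
    res = []
    for o in gc.get_objects():
        if id(o) in address_set:  # O(m) per test if address_set is a list
            res.append(o)
    return res


def at_addrs_dict(address_set):
    """New approach: build an id -> object map once, then look up each address."""
    id_to_obj = dict((id(o), o) for o in gc.get_objects())
    res = []
    for i in address_set:
        if i in id_to_obj:  # skip addresses with no tracked object
            res.append(id_to_obj[i])
    return res
```

Both functions return the same objects, though the dict version returns them in the order of `address_set` rather than in GC order.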

mgedmin

It's hard for me to make decisions about this function since I don't use it myself. I expect that for small address sets naive iteration can be faster (and should use less memory), but I don't know what the expected average size of the address set is.

Also, the KeyError thing -- perhaps it makes sense, but the old behavior was to suppress errors, and changing it this way would probably require a major version bump according to SemVer. Again, I don't know what makes more sense to users, as I'm not one.

mgedmin · 2018-07-09T19:25:17Z

objgraph.py

+    id_to_obj = dict((id(o), o) for o in gc.get_objects())
+
+    for i in address_set:
+        o = id_to_obj[i]


This can raise KeyError.
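To illustrate the concern, the sketch below contrasts a strict lookup with one that skips unknown addresses. The helper names `lookup_strict` and `lookup_lenient` are hypothetical, and `-1` stands in for an address that is not a key in the map.

```python
import gc


def lookup_strict(id_to_obj, addresses):
    """Raises KeyError as soon as an address has no tracked object."""
    return [id_to_obj[i] for i in addresses]


def lookup_lenient(id_to_obj, addresses):
    """Silently skips addresses with no tracked object (the old behavior)."""
    return [id_to_obj[i] for i in addresses if i in id_to_obj]


class Probe:
    pass


probe = Probe()
id_to_obj = {id(o): o for o in gc.get_objects()}
addresses = [id(probe), -1]  # -1 is an assumed stale address

print(lookup_lenient(id_to_obj, addresses) == [probe])
try:
    lookup_strict(id_to_obj, addresses)
except KeyError as exc:
    print("KeyError for address", exc)
```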

rainwoodman · 2018-07-09T20:02:16Z

Since the objects themselves are not copied, the memory used by the dict should be close to 16 bytes * number of GC objects. GC itself probably consumes as much memory as this.
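One way to check this estimate on a given interpreter is sketched below. It assumes CPython, where `sys.getsizeof` reports the dict's own table but not the `int` keys it keeps alive, so the two are measured separately. The function name `id_map_overhead` is made up for this example, and the figures will vary by Python version.

```python
import gc
import sys


def id_map_overhead():
    """Return (table bytes per entry, key bytes per entry) for an id -> object map."""
    id_to_obj = {id(o): o for o in gc.get_objects()}
    n = len(id_to_obj)
    table_bytes = sys.getsizeof(id_to_obj)  # hash table only, not keys or values
    key_bytes = sum(sys.getsizeof(k) for k in id_to_obj)  # the int id objects
    return table_bytes / n, key_bytes / n


table_per_entry, keys_per_entry = id_map_overhead()
print(f"dict table: {table_per_entry:.1f} bytes per object")
print(f"int keys:   {keys_per_entry:.1f} bytes per object")
```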

I did not benchmark this in typical use cases (I only ran objgraph on small cases), though I can imagine it is difficult to find cases where O(m+n) is slower than O(mn).

For a stress test, one can probably collect new IDs with the gc debug flag DEBUG_SAVEALL set, and run this on the full id list.
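A rough timing harness for the comparison could look like the following. It is only a sketch: it does not use DEBUG_SAVEALL, it simply samples ids of currently tracked objects, and `time_lookup`, `scan`, and `via_dict` are names invented for this example. The sample is passed as a list so that the scan shows its O(mn) behavior.

```python
import gc
import timeit


def time_lookup(lookup, addresses, repeat=3):
    """Return the best wall-clock time in seconds of lookup(addresses)."""
    return min(timeit.repeat(lambda: lookup(addresses), number=1, repeat=repeat))


def scan(addresses):
    # Membership test against a list costs O(m) for each of the n objects.
    return [o for o in gc.get_objects() if id(o) in addresses]


def via_dict(addresses):
    # One O(n) pass to build the map, then O(1) average per address.
    id_to_obj = {id(o): o for o in gc.get_objects()}
    return [id_to_obj[i] for i in addresses if i in id_to_obj]


if __name__ == "__main__":
    sample = [id(o) for o in gc.get_objects()[:200]]
    print("scan:", time_lookup(scan, sample))
    print("dict:", time_lookup(via_dict, sample))
```

Increasing the sample size should make the gap between the two more visible; for very small samples the scan may well win, as suggested earlier in the thread.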

mgedmin

Eh, LGTM.

Would you care to add a small changelog note in CHANGES.rst?

klahnakoski · 2018-08-29T12:39:56Z

objgraph.py

-    for o in gc.get_objects():
-        if id(o) in address_set:
-            res.append(o)
+    id_to_obj = dict((id(o), o) for o in gc.get_objects())


I suggest

id_to_obj = {id(o): o for o in gc.get_objects()}

klahnakoski · 2018-08-29T12:43:03Z

objgraph.py

-            res.append(o)
+    id_to_obj = dict((id(o), o) for o in gc.get_objects())
+
+    for i in address_set:


I suggest changing these lines to

return [id_to_obj[i] for i in address_set if i in id_to_obj]

👍 except I'd put it all on a single line

rainwoodman added 2 commits July 5, 2018 11:40

Speed up at_addrs with a dict.

09cc5c0

The old loop was O(mn). This is O(m+n).

flake

21c7cb7

mgedmin reviewed Jul 9, 2018

Ignore addresses not backed by objects.

7f5e2e3

lint

775a796

mgedmin reviewed Jul 17, 2018

klahnakoski reviewed Aug 29, 2018

mgedmin added the waiting-for-updated-pr label Oct 29, 2019
