More efficient resolving #16

alexdutton · 2015-02-15T20:13:53Z

I was profiling some of my code which used jsonpointer, and I noticed that resolving pointers could be made faster.

My profiling script is at https://gist.github.com/alexsdutton/bb95e47a381d0e250fd3/988feaa0e8968893287df01c87eb233c9b86a010. Running on af04ec0, we get the following times on my laptop (Macbook Air, Fedora, Python 2.7.8):

3.43161201477 (/a/b/c/d/e/f)
8.24996113777 (/b/1/1/1/1/1)
7.04101395607 (/c/1/d/1/e/1/f)
1.00753712654 (/-)

After each commit we get, respectively:

2.40125393867
5.88285589218
4.9583530426
1.05458307266

3.08297109604
4.70882201195
4.62245106697
1.13339781761

1.59154200554
3.8058848381
3.40330195427
0.9848549366

So, about a 2x speed-up on the original code.

coveralls · 2015-02-15T20:15:46Z

Coverage remained the same at 100.0% when pulling c4372cd on alexsdutton:more-efficient-resolving into af04ec0 on stefankoegl:master.

coveralls · 2015-02-15T20:15:46Z

Coverage remained the same at 100.0% when pulling c4372cd on alexsdutton:more-efficient-resolving into af04ec0 on stefankoegl:master.

kxepal · 2015-02-15T21:04:32Z

You can also move try-except in JsonPointer.resolve out of the for loop to prevent Python start/stop tracking context on every loop. This will give you few points for the last test case.

kxepal · 2015-02-15T21:05:10Z

Also, it may be worth to optionally cythonize jsonpointer for greater speedup.

Also means that isinstance(..., Sequence) doesn't get called twice for non-list sequences (once in walk, and one in get_part)

alexdutton · 2015-02-19T00:28:39Z

The latest commit gives me the following numbers:

1.25914096832
2.52786207199
2.27428197861
0.568547010422

… but adds a bit more duplication. It also makes get_part and walk redundant, but I've left those in because other people might be relying on them.

coveralls · 2015-02-19T00:28:43Z

Coverage decreased (-17.78%) to 82.22% when pulling 9653d1b on alexsdutton:more-efficient-resolving into af04ec0 on stefankoegl:master.

alexdutton · 2015-02-19T00:29:46Z

Ah, the coverage decrease will because the tests no longer cover get_part and walk (and I've added more lines of code)

kxepal · 2015-02-19T07:03:57Z

@alexsdutton my idea was just about to replace this:

for part in self.parts:
  try:
    doc = self.walk(doc, part)
  except JsonPointerException:
    if default is _nothing:
      raise
    else:
      return default
return doc

with:

try:
  walk = self.walk
  for part in self.parts:
    doc = walk(doc, part)
except JsonPointerException:
  if default is _nothing:
    raise
  return default
else:
  return doc

Inlining function calls may indeed change the timings because stack frames aren't cheap, but it there should be a balance between performance optimizations and code quality. Again, Cythonization will give you much better numbers without need to play with micro optimizations.

alexdutton · 2015-02-19T22:00:14Z

@kxepal that gives:

1.53566789627
3.51046395302
3.03776717186
0.941489219666

I can play with Cython to see how that fares, but presumably that would lead to two codebases (i.e. you'd still need the pure Python implementation for non-CPython environments)?

kxepal · 2015-02-19T22:10:17Z

@alexsdutton no, it doesn't leads to have two codebases. You can make Cython module as optional and fallback to pure Python one in case if it's hard (hello, windows) or not makes a sense to build (hello, pypy).

kxepal · 2015-02-19T22:12:00Z

About the numbers, yes, there is no much gain, but this have to be done if you're going to optimize code for performance (:

alexdutton · 2015-02-19T22:18:10Z

The reason I'm trying to push it as far as it can go is that I've got code that's doing 5.8m resolve calls, and it's taking 100s just for those at its most optimized, down from 175s (I think) when I started.

re codebases, you'd have the Cython implementation, and the Python one, which is two. Have I missed something?

(I presume the usual approach is to do a "try: import _jsonpointer" to get the Cython implementation, and if that fails, use the pure Python implementation)

kxepal · 2015-02-19T22:24:34Z

@alexsdutton sure, things could be done iterative. Your improvements are significant enough boosts the performance.

As about Cython, take a look on https://github.com/KeepSafe/aiohttp - it has multidict module implemented both as in Python and Cython in the same time.

alexdutton · 2015-02-19T22:53:11Z

Well that was easy. Pulled out just the resolve method, added a cdef unicode part (which is presumably bad for Py3 compat) and numbers become:

1.12338781357
2.26954817772
2.0227599144
0.52579498291

Though I have broken the tests. (Yes, all a bit hacky at this hour)

kxepal · 2015-02-19T22:54:46Z

@alexsdutton w00t! looks promising, isn't it? (;

alexdutton · 2015-02-19T22:59:51Z

Worth trying to get the attention of @stefankoegl at this point?

alexdutton · 2015-02-19T23:00:33Z

(and yes, certainly promising 😁)

stefankoegl · 2015-02-23T17:52:00Z

jsonpointer.py

+        ptype = type(doc)
+        if ptype == dict:
+            pass
+        elif ptype == list or isinstance(doc, Sequence):


Doesn't ptype == list imply isinstance(doc, Sequence) It should then be possible to reduce this to

elif isinstance(doc, Sequence):

Oh, I miss this accidentally too. Normally, this should be used is operator since types are some sort of singletons and there is no need to invoke regular comparison routines for them.

The idea with check for list before call isinstance is an optimization for general case where doc is mostly dict or list. So the idea is to use cheaper operations for common cases and fallback to more wide, but costly, type checks: isinstance(foo, Sequence) is 100 times slower when direct type comparison because of function call and unwind all the registered types to those abstract Sequence one.

OK, so in that case the is operator should be used, and it should have some comment explaining that this is a performance optimization.

stefankoegl · 2015-02-23T17:55:56Z

Sorry for the late reply! I was away from computers for a few weeks.

Yes, this certainly looks promising! I've added one inline comment. It might be just an optimization, but if so, it should be commented (otherwise it might not be noticed, and be removed in the future).

alexdutton added 3 commits February 15, 2015 19:45

Simplify walk method. No need to look before we leap.

3f460a5

Reduce number of isinstance() calls in get_part

76b542b

Optimize in get_part for the common cases of doc being dict or list

c4372cd

Re-factor resolve and to_last to not make lots of function calls.

9653d1b

Also means that isinstance(..., Sequence) doesn't get called twice for non-list sequences (once in walk, and one in get_part)

stefankoegl reviewed Feb 23, 2015
View reviewed changes

stefankoegl merged commit 9653d1b into stefankoegl:master Sep 10, 2017

stefankoegl mentioned this pull request Sep 10, 2017

Optimizations from #16 reverted #21

Open

More efficient resolving #16

More efficient resolving #16

Uh oh!

Conversation

alexdutton commented Feb 15, 2015

Uh oh!

coveralls commented Feb 15, 2015

Uh oh!

coveralls commented Feb 15, 2015

Uh oh!

kxepal commented Feb 15, 2015

Uh oh!

kxepal commented Feb 15, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

coveralls commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

kxepal commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

kxepal commented Feb 19, 2015

Uh oh!

kxepal commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

kxepal commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

kxepal commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

alexdutton commented Feb 19, 2015

Uh oh!

stefankoegl Feb 23, 2015

Choose a reason for hiding this comment

Uh oh!

kxepal Feb 23, 2015

Choose a reason for hiding this comment

Uh oh!

stefankoegl Mar 2, 2015

Choose a reason for hiding this comment

Uh oh!

stefankoegl commented Feb 23, 2015

Uh oh!

Uh oh!