Performance improvements, and associated API changes #138

davidhassell · 2021-04-22T12:10:17Z

Fixes #130
Fixes #132
Fixes #137

sadielbartholomew

Fantastic! Users should be very pleased with the performance gains. Great work.

I'm happy with the API changes proposed in the associated issues and satisfied that they are implemented successfully in this PR.

Regarding the code, I have made a few minor comments as provided in-line.

For your reference, below I have outlined the results of some speed tests I conducted to gauge the performance gains. All tests go faster, though there wasn't nearly as much of a relative speed up as you have reported, especially in the latter case. Do you know why that might be?

Observed performance improvements

Test suite

As a quick indication of the speed of running the full test suite in my local environment (as reported in the pytest message "Ran ... tests in ... s" rather than by using a time command), across five runs on each branch I saw:

current master: "Ran 171 tests in ..." between 30.2 and 32.3 s;
this branch: "Ran 195 tests in ..." between 27.7 and 29.6 s.

So the suite is running faster despite having more tests to get through, which is great.

Snippet from (end of) #130 (comment)

Running that snippet I see:

current master: ~0.021 (0.020565794229769382)
this branch: ~0.017 (0.016937515400059056)
which is faster after this PR but not by nearly as much as suggested when you did the same test?
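To put a number on the difference, the relative speed-up follows directly from the two mean timings quoted above. A minimal sketch; the helper name `speedup` is only illustrative and is not part of cfdm:

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster `after_s` is than `before_s`."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("timings must be positive")
    return before_s / after_s


# Mean read times (seconds) quoted above for current master and this branch.
master = 0.020565794229769382
branch = 0.016937515400059056

print(f"speed-up: {speedup(master, branch):.2f}x")
```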

cfdm/constructs.py

sadielbartholomew

Just registering a few more minor comments to complete my review started above. Thanks.

cfdm/mixin/container.py

sadielbartholomew · 2021-05-09T23:52:07Z

-        * If ignore_type=False and the LHS operand is not of the same type, or
-          a squblcass of, the RHS operand then return False
+        * If ignore_type=False and the LHS operand is not of the same
+          type, or a squblcass of, the RHS operand then return False


Suggested change

-          type, or a squblcass of, the RHS operand then return False
+          type, or a subclass of, the RHS operand then return False

Fixed in next commit

Not quite, it looks like! We can always sort it later for the pre-release checks, though, if you prefer.

Oh - I see! How about now?
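For reference, the rule stated in the docstring under discussion can be illustrated with a small standalone check. This is only a sketch of the rule as worded there, not the body of cfdm's `_equals_preprocess`; the function and class names are hypothetical:

```python
def same_type_or_subclass(lhs, rhs, ignore_type=False):
    """Return False only when types matter and LHS is not an instance
    of the RHS operand's type (or of a subclass of it)."""
    if ignore_type:
        # Type differences are deliberately ignored.
        return True
    return isinstance(lhs, type(rhs))


class Base:
    pass


class Derived(Base):
    pass


print(same_type_or_subclass(Derived(), Base()))  # LHS is a subclass instance
print(same_type_or_subclass(Base(), Derived()))  # LHS is a parent instance
```

The first call passes the check and the second fails it, since the comparison is not symmetric: a subclass instance on the left is accepted, a parent-class instance is not.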

cfdm/mixin/fielddomain.py

davidhassell · 2021-05-10T07:38:44Z

Re. Snippet from (end of) #130 (comment)

I still get a factor of 10 speed up:

In [1]: import cfdm, timeit

In [2]: f = cfdm.example_field(1)

In [3]: cfdm.write(f, 'tmp.nc')

In [4]: assert cfdm.__version__ == '1.8.9.0'

In [5]: sum(timeit.repeat("cfdm.read('tmp.nc')", globals=globals(), repeat=100, number=1))/100
Out[5]: 0.02655579654999656

In [1]: import cfdm, timeit

In [2]: assert cfdm.__version__ == '1.8.8.0'

In [3]: sum(timeit.repeat("cfdm.read('tmp.nc')", globals=globals(), repeat=100, number=1))/100
Out[3]: 0.2563364723699988

???
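To make the two sessions above easier to repeat on each installed version, the timing call can be wrapped in a small helper. This is a sketch, not part of cfdm: `time_call` is a hypothetical name, and it reports the minimum alongside the mean because the minimum is usually less sensitive to background load:

```python
import timeit
from statistics import mean


def time_call(func, repeat=100):
    """Time `repeat` single calls of `func`; return (mean, best) in seconds."""
    samples = timeit.repeat(func, repeat=repeat, number=1)
    return mean(samples), min(samples)


# Example usage (requires cfdm and the 'tmp.nc' file written above):
#   import cfdm
#   avg, best = time_call(lambda: cfdm.read("tmp.nc"))
```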

sadielbartholomew · 2021-05-10T11:55:58Z

I still get a factor of 10 speed up

🤔 Ah, wait, sorry @davidhassell, I was about to ask which master state you are using to compare against, but I see from the version assertions that it is instead probably the tagged 1.8.8.0 release. That could be the root of it, because I was comparing against the current master. Still, I think my test is the fairer one for this PR, as testing against 1.8.8.0 also picks up changes from outside this PR, though you acknowledge that, I think:

Note that much of the improvement comes from unrelated changes (such as removing unnecessary repr calls and unnecessary deep copies).

My bad, let me re-do the test comparing against 1.8.8.0, though it is noteworthy that we're only seeing a speed up of a factor of ~1.25 from this PR, at least in my environment. Still, that's good in itself!

sadielbartholomew · 2021-05-10T13:23:44Z

OK when attempting comparison with the v1.8.8.0 release branch, I keep seeing a seg fault ☹️, even if I reduce the repeat parameter to 2 and also 1!:

>>> import cfdm, timeit
>>> f = cfdm.example_field(1)
>>> cfdm.write(f, 'tmp.nc')
>>> assert cfdm.__version__ == '1.8.8.0'
>>> sum(timeit.repeat("cfdm.read('tmp.nc')", globals=globals(), repeat=100, number=1))/100
Segmentation fault (core dumped)

which isn't useful but is potentially illuminating in the seg faulting saga. I'm not sure I've seen one outside of the test suite before. Luckily I didn't get any such seg faults on the current master or on this branch, so that could imply something has been fixed in a dependency or else in our library...

Either way, because of that I can't do the comparison test with 1.8.8.0. Let me try an earlier version, one moment...
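One way to stop a crash like this from taking down the interactive session is to run the timing snippet in a child interpreter and inspect its return code. The sketch below assumes a POSIX system, where `subprocess` reports a negative return code when the child is killed by a signal; `run_isolated` is a hypothetical helper, not part of cfdm:

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=300.0):
    """Run `code` in a fresh interpreter.

    Returns (stdout, signal_name), where signal_name is None unless
    the child process was killed by a signal.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # POSIX: a negative return code is minus the killing signal number.
        return proc.stdout, signal.Signals(-proc.returncode).name
    return proc.stdout, None


# Example usage (requires cfdm and an existing 'tmp.nc'):
#   out, sig = run_isolated(
#       "import cfdm, timeit; "
#       "print(min(timeit.repeat(\"cfdm.read('tmp.nc')\", "
#       "globals=globals(), repeat=5, number=1)))"
#   )
```

A seg fault in the child would then show up as a signal name of `SIGSEGV` rather than ending the parent session.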

sadielbartholomew · 2021-05-10T13:30:35Z

Every release branch I try below 1.8.8.0 is seg faulting so let's assume there is some issue with my environment there and that I can't do the comparison test. I'm happy to take your factor of 10 comparison from v1.8.8.0 to master after this is merged as representative.

sadielbartholomew · 2021-05-10T13:42:32Z

All feedback except for one typo (which can be fixed later if you like) addressed well now in the feedback response commit, thanks. So I think we are ready to merge?

davidhassell · 2021-05-10T14:14:40Z

Great - thanks @sadielbartholomew, as ever, for the careful review. I agree that my various PRs relating to this are not that clean (sorry!), but I think we've got there.

I'll merge when I've heard back from you on that typo (0a60a47), and done a final run of the unit tests (and checked that the upstream cf-python tests still pass).

I'm looking forward to reviewing the append mode PR later.

sadielbartholomew · 2021-05-10T15:30:02Z

All good, thanks @davidhassell. Please merge away!

davidhassell · 2021-05-10T16:21:06Z

Great - will do!

davidhassell added 30 commits March 23, 2021 20:15

dev

ac94b43

dev

839547b

tests pass

fd44732

devs

f12efd3

tests pass

eb4c1f9

tests pass

01b9e7f

chain

b033a02

devs

5151a73

devs

8c57b83

devs

07b1e16

devs

7919db1

devs

7a12511

protect logging.debug

0d18935

devs

4e56400

devs

79425b3

docs

2273a0c

docs

eb7f7d1

docs

fa282ef

devs

cddab75

devs

08b3700

comments

48d05f4

devs

a9e9435

devs

084fc2d

devs

6c22c58

devs

0d9da41

devs

5a94bb6

devs

905fc9d

devs

48085ae

devs

ad9a2e8

del_construct

393b5bb

experiment on Pattern import for 3.6

ef5e4b6

davidhassell marked this pull request as draft April 23, 2021 07:25

davidhassell marked this pull request as ready for review April 23, 2021 07:25

experiment on Pattern import for 3.6 (2)

85f2fcb

davidhassell marked this pull request as draft April 23, 2021 07:35

davidhassell marked this pull request as ready for review April 23, 2021 07:36

davidhassell added 6 commits April 23, 2021 11:16

increase test coverage

24964ef

spelling

4b02a00

spelling

72af4fa

spelling

234ae5c

core/mixin/constructaccess.py -> fielddomain.py

94df404

use get_data_axes 'default' keyword

b45e737

sadielbartholomew approved these changes May 7, 2021

sadielbartholomew reviewed May 10, 2021

resolution of some review issues

62ad749

davidhassell added 2 commits May 10, 2021 15:06

docs: acknowledge sponsors in sidebar

53605ff

typo in _equals_preprocess addressed

0a60a47

davidhassell merged commit 4120dcf into NCAS-CMS:master May 10, 2021

davidhassell changed the title ~~Performance improvement, and associate API changes~~ Performance improvements, and associated API changes May 10, 2021

davidhassell added this to the 1.8.9.0 milestone May 10, 2021

davidhassell mentioned this pull request May 12, 2021

Performance improvements NCAS-CMS/cf-python#210

Merged

davidhassell deleted the performance2 branch November 14, 2022 09:05
