
Snapshot discovery and reading takes quadratic time #539

Closed
huonw opened this issue Aug 16, 2021 · 9 comments

Comments

@huonw
Contributor

huonw commented Aug 16, 2021

Describe the bug

We're using syrupy, and it works well. Thank you!

Unfortunately we have nearly 500 snapshots, and our test runs are starting to get quite slow. It seems like syrupy makes testing take quadratic time with respect to the number of snapshots.

To reproduce

Create this file

# test_performance.py
import pytest
import os

SIZE = int(os.environ.get("SIZE", 1000))

@pytest.mark.parametrize("x", range(SIZE))
def test_performance(x, snapshot):
    assert x == snapshot
    # assert x == x

Run, for instance:

for s in 100 500 1000 2000; do
    echo "size = $s"
    # create the snapshots
    SIZE=$s pytest test_performance.py --snapshot-update
    # just check them
    SIZE=$s pytest test_performance.py
done

The times reported by pytest scale quadratically with the number of tests/snapshots (O(size**2)). I think this is because the number of read_file/_read_snapshot_fossil calls and discover_snapshots calls, as reported by python -m cProfile -m py.test test_performance.py, scales linearly (O(size)) with the number of tests/snapshots, and the work required for each call also scales linearly (because, for the snapshots, the files contain O(size) data).

The times and numbers of calls (for the invocation that just checks the snapshots, not the one that updates them) are something like:

| size | time (seconds) | discover_snapshots calls | read_file calls |
| ---: | ---: | ---: | ---: |
| 100 | 0.15 | 200 | 300 |
| 500 | 1.73 | 1000 | 1500 |
| 1000 | 6.24 | 2000 | 3000 |
| 2000 | 21.84 | 4000 | 6000 |

Things of note:

  • each doubling from 500 -> 1000 -> 2000 multiplies the time by approximately 4 (a classic marker of quadratic performance)
  • the number of calls seems rather large: 2 discover_snapshots calls and 3 read_file calls per test/snapshot

Expected behavior

The test runs should be linear in the number of tests/snapshots. For instance, if the assert x == x line is used (and the snapshot fixture removed) instead of assert x == snapshot, the test run is linear: even SIZE=10000 finishes in < 4s on my machine.

It seems like this could be handled by discovering the snapshots once (or once per file) and reading each snapshot file once too.

Environment (please complete the following information):

  • OS: macOS
  • Syrupy Version: 1.4
  • Python Version: 3.8

@noahnu
Collaborator

noahnu commented Aug 16, 2021

It definitely shouldn't be quadratic. I'll take a look this week. I suspect there's a few easy optimizations we can do. We also have benchmarking setup in the repo.

@noahnu
Collaborator

noahnu commented Aug 18, 2021

This was a really good catch. As I mentioned above, there are a number of optimizations that can be made. I'll leave this GitHub issue open as I merge in performance improvements over the next couple of weeks.

The first improvement is a change to only discover snapshots once per test file rather than once per assertion. This should have a significant impact on your use case, especially if using parameterization. See https://github.com/tophat/syrupy/releases/tag/v1.4.2

I'll have to think about some options around hoisting the read/write snapshot logic out of the assertions, and into the session. Perhaps we batch the writes, and memoize the reads? There might be a memory concern for sufficiently large snapshots, however I think we can optimize for performance here.
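
A minimal sketch of the write-batching half of that idea (hypothetical code, not syrupy's actual API: the SnapshotWriteBatch class and the write_file callback are made up here, just to illustrate doing one write per file per session instead of one per assertion):

```python
# Hypothetical sketch: queue snapshot writes during the session and flush
# each snapshot file exactly once at the end, instead of once per assertion.
from collections import defaultdict


class SnapshotWriteBatch:
    def __init__(self):
        # filepath -> {snapshot_name: serialized_data}
        self._pending = defaultdict(dict)

    def queue_write(self, filepath, name, data):
        # Called once per assertion; no file I/O happens here.
        self._pending[filepath][name] = data

    def flush(self, write_file):
        # Called once at session finish: one merge-and-write per file.
        for filepath, snapshots in self._pending.items():
            write_file(filepath, snapshots)
        self._pending.clear()
```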

@huonw
Contributor Author

huonw commented Aug 20, 2021

> The first improvement is a change to only discover snapshots once per test file rather than once per assertion. This should have a significant impact on your use case, especially if using parameterization. See https://github.com/tophat/syrupy/releases/tag/v1.4.2

Thanks for the quick improvement! It leads to a 33% improvement in the test above and a 25% improvement in our real test suite (68s -> 51s), which is definitely a useful start.

However, I'm not entirely sure it's working as intended: I installed the new version:

$ pip freeze | grep syrupy
syrupy==1.4.2

I reran the test in the issue, and the time and the number of calls to discover_snapshots/read_file have definitely improved (by 50% and 33% respectively).

| size | time (seconds) | discover_snapshots calls | read_file calls |
| ---: | ---: | ---: | ---: |
| 100 | 0.12 | 101 | 201 |
| 500 | 1.24 | 501 | 1001 |
| 1000 | 4.20 | 1001 | 2001 |
| 2000 | 15.70 | 2001 | 4001 |
  1. discover_snapshots is still called more than once per file (i.e., more than one call in this example, which has only a single test file): pstats's callers functionality suggests the culprit might be a register_request function. For instance, for SIZE=2000:
    % callers discover_snapshots
    ...
    Function                                              was called by...
                                                              ncalls  tottime  cumtime
    .../syrupy/extensions/base.py:91(discover_snapshots)  <-       1    0.000    0.005  .../syrupy/report.py:83(__attrs_post_init__)
                                                                2000    0.009   11.034  .../syrupy/session.py:81(register_request)
    
  2. read_file is by far the dominant factor, so the improvement is close to the reduction in read_file calls (33%). For instance, for SIZE=2000, 21.5s out of the 24.2s test run time is spent in read_file (running under cProfile seems to add significant overhead here, which is why this total is higher than the 15.7s in the table above).

Assuming the extra calls in point 1 can be eliminated (leaving 1 or 2 discover_snapshots calls per file, and number of snapshots + number of files × (1 or 2) read_file calls), this still looks like it could be a ~50% reduction in our real-world test time, which would be awesome (even without the potentially harder work of resolving the overall quadratic behaviour)!

> I'll have to think about some options around hoisting the read/write snapshot logic out of the assertions, and into the session. Perhaps we batch the writes, and memoize the reads? There might be a memory concern for sufficiently large snapshots, however I think we can optimize for performance here.

That seems reasonable to me. I wonder if throwing @functools.lru_cache on the discovery/read functions might resolve this with little effort?
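
For illustration, a minimal sketch of what that could look like; read_snapshot_file here is a made-up stand-in, not syrupy's actual function:

```python
# Sketch only: read_snapshot_file is a hypothetical stand-in for the real
# snapshot-file read, just to illustrate the lru_cache idea.
import functools


@functools.lru_cache(maxsize=None)
def read_snapshot_file(filepath: str) -> str:
    # The expensive part (reading/parsing the whole file) now runs once per
    # unique path; later assertions against the same file hit the cache.
    with open(filepath) as f:
        return f.read()


# Any code path that rewrites a snapshot file (e.g. --snapshot-update) would
# need to call read_snapshot_file.cache_clear() to avoid serving stale data.
```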

In terms of memory concerns, I think the memory impact should be reasonably controlled, as the cached data would be roughly proportional to the total snapshot file size, which is generally relatively small.

For instance, in our real-world test suite, we have 7.5MB of .ambr files in total, and the largest individual one is 2.6MB, which would be totally fine to hold in memory for us. We're likely to be reducing some of that size in future too, because a human can't validate that snapshots that large look right, or even that a diff is correct.

Also, my feeling is that read performance is a far more important focus than write performance for now, as I'd expect read-only test runs to happen far more often than writing ones (e.g. CI is read-only, and most tests even in a --snapshot-update run would only read, without writing, since in my experience only a few snapshots change at a time).

@noahnu
Collaborator

noahnu commented Aug 20, 2021

I didn't notice any impact from caching the read calls in the ambr serializer. If anything, performance was slightly worse from the additional cache overhead. Ended up repeating the logic around caching "discovery" across "report" and "session". When I tried to memoize the discover_snapshot method, I didn't notice any improvements, which leads me to believe the time spent is actually in the looping + dict merging.

With my latest change, cProfile reports 2 discover_snapshot calls rather than 1001.

https://github.com/tophat/syrupy/releases/tag/v1.4.3

It should at least be linear now

@huonw
Contributor Author

huonw commented Aug 20, 2021

That's awesome, thank you! 👏

1.4.3 takes our test suite to 35s (-49% overall), way better!

Here's the updated table; it's much faster, but still quadratic:

| size | time (seconds) | discover_snapshots calls | read_file calls |
| ---: | ---: | ---: | ---: |
| 100 | 0.09 | 2 | 102 |
| 500 | 0.75 | 2 | 502 |
| 1000 | 2.42 | 2 | 1002 |
| 2000 | 8.24 | 2 | 2002 |

> I didn't notice any impact from caching the read calls in the ambr serializer. If anything, performance was slightly worse from the additional cache overhead. Ended up repeating the logic around caching "discovery" across "report" and "session". When I tried to memoize the discover_snapshot method, I didn't notice any improvements, which leads me to believe the time spent is actually in the looping + dict merging.

I experimented a little bit and found that putting caching into the DataSerializer.read_file method made a big difference, seemingly resolving the quadratic behaviour when reading: #543.
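
Roughly, the shape of such a class-level read cache (a simplified illustrative sketch, not the actual diff in #543; _file_cache and _parse_file are made-up names):

```python
# Simplified sketch of a read cache keyed by file path; not the actual
# implementation in #543.
class DataSerializer:
    _file_cache = {}  # filepath -> parsed snapshot data

    @classmethod
    def read_file(cls, filepath):
        cached = cls._file_cache.get(filepath)
        if cached is None:
            cached = cls._parse_file(filepath)  # the formerly repeated O(size) work
            cls._file_cache[filepath] = cached
        return cached

    @classmethod
    def _parse_file(cls, filepath):
        # Stand-in for the real .ambr parsing.
        with open(filepath) as f:
            return f.read()
```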

@noahnu
Collaborator

noahnu commented Aug 20, 2021

If we're saying the read call is the issue, then there's a linear relationship between the number of reads and the number of assertions; in fact, there's one read per assertion.

I'm not sure where the quadratic relationship is coming from. Have you tried with substantially more test cases? With and without syrupy?

Also feel free to join our discord: https://discord.gg/kZYy8agD

@huonw
Contributor Author

huonw commented Aug 20, 2021

Yeah, unfortunately each file is generally O(number of assertions) in size, because it stores info for each assertion, so when syrupy reads the whole file again it's doing O(number of assertions) work (since it has to at least touch every byte in the file to find the names). That is, O(number of assertions) assertions each doing O(number of assertions) parsing work, leading to quadratic O(number of assertions**2) behaviour for each file.

The simple test in the issue is an extreme example, with up to 2000 assertions in a single file, but it has an impact even for our real-world ~450 snapshots spread across 25 files.

> Have you tried with substantially more test cases? With and without syrupy?

Here's the test with SIZE=10000:

| version | time (s) |
| --- | ---: |
| syrupy==1.4.3 | 201 |
| syrupy #543 | 4.64 |
| no syrupy | 3.57 |

(syrupy 1.4.3 takes approximately 5² = 25 × the time it takes for SIZE=2000 (8.24s), matching expectations for quadratic behaviour. The no-syrupy version is the one described in the issue, with the snapshot fixture removed and the assertion changed to assert x == x.)
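
As a quick back-of-the-envelope check of that scaling (numbers taken from the tables above):

```python
# If time grows like size**2, going from SIZE=2000 to SIZE=10000 (5x more
# tests) should multiply the time by 5**2 = 25.
t_2000 = 8.24                                    # seconds, syrupy 1.4.3, from the table above
predicted_10000 = t_2000 * (10000 / 2000) ** 2   # 8.24 * 25 = 206 s
print(predicted_10000)                           # ~206 s, close to the measured 201 s
```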

> Also feel free to join our discord: https://discord.gg/kZYy8agD

Sorry, I'd prefer not to do so for now, but thanks for the invitation! 😄

@noahnu
Collaborator

noahnu commented Aug 20, 2021

Based on your metrics, it seems performance is under control now, or at a minimum it's no longer quadratic, so I'll close this issue. If you have other ideas/requests, we're always open to contributors.

@noahnu noahnu closed this as completed Aug 20, 2021
@huonw
Contributor Author

huonw commented Aug 20, 2021

Sounds good. I think writing snapshots is still quadratic (in the number of written snapshots), but that's not something we're struggling with, unlike the reading performance, which we've now resolved here 🎉

Thanks for being so responsive!
