Help wanted: ideas for improving the test run time #37

Closed · jmcnamara opened this issue May 7, 2023 · 10 comments

@jmcnamara (Owner) commented May 7, 2023

Help wanted

One of the goals of rust_xlsxwriter is fidelity with the xlsx file format generated by Excel. This is achieved using integration tests that take files created in Excel 2007 and compare them, file by file and element by element, with files created using rust_xlsxwriter.

Here is a typical test file, the associated xlsx file and the test runner code.
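
(For illustration, a minimal sketch of what one of these tests might look like. The file names and the compare_xlsx_files helper are hypothetical stand-ins for the real test runner, which unzips both workbooks and diffs them member by member:)

use rust_xlsxwriter::{Workbook, XlsxError};

#[test]
fn bootstrap01_demo() -> Result<(), XlsxError> {
    // Create the workbook with rust_xlsxwriter.
    let mut workbook = Workbook::new();
    let worksheet = workbook.add_worksheet();
    worksheet.write_string(0, 0, "Hello")?;
    workbook.save("bootstrap01_out.xlsx")?;

    // Compare it, file by file and element by element, against the
    // reference file created in Excel (hypothetical helper).
    compare_xlsx_files("tests/input/bootstrap01.xlsx", "bootstrap01_out.xlsx");
    Ok(())
}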

This approach has a number of advantages from a maintenance point of view:

  • It allows incremental test-driven development of Excel features.
  • It allows bug reports to be replicated quickly in Excel and compared with rust_xlsxwriter.
  • It avoids subjective arguments about whether rust_xlsxwriter or some other third-party Excel-reading software is correct in its implementation/interpretation of the XLSX file specification, since it uses Excel as the standard.

For the end user, output files that are effectively identical to files produced by Excel mean the maximum possible interoperability with Excel and with applications that read XLSX files.

The test suite contains an individual test file for each xlsx file (although there is sometimes more than one test against the same input file). Each of these test files is compiled into, and run as, its own crate, which makes the test suite slow. For usability reasons I don't want to test more than one xlsx file per file/crate (apart from, perhaps, the grouping scheme outlined below).
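
(For context: Cargo compiles and links every top-level file under tests/ as its own crate, so the current layout effectively means one compile-and-link cycle per test file. File names here are illustrative:)

tests/bootstrap01.rs    <- built and linked as a separate test crate
tests/bootstrap02.rs    <- another separate test crate
...                     <- ~540 files, ~540 compile-and-link cycles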

There are currently ~540 test files and it takes 8+ minutes to run on a 3.2 GHz 6-Core Intel Core i7 with 32GB of fast RAM:

$ time cargo test

real	8m36.340s
user	30m34.062s
sys	9m0.802s

In the GitHub Actions CI this is currently taking around 18 minutes.

There will eventually be around 800 test files so the runtime will be ~50% longer.

nextest is a bit faster, but not significantly so. These timings also don't include the doc tests:

$ time cargo nextest run

real	7m45.029s
user	26m44.624s
sys	6m59.271s

A few months ago, when the test suite took around 4 minutes, I tried to consolidate the tests into one crate using the advice in the article Delete Cargo Integration Tests. This was significantly faster, by around 5-10x, but it didn't allow me to run individual tests (I'm 99% sure). I tried to replicate it again, to redo the performance testing and verify whether individual tests can be run, but failed for reasons related to test refactoring in the meantime.
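
(For reference, the consolidation the article describes boils down to one entry point that pulls each former test file in as a module, so the whole suite compiles and links as a single crate. A sketch, with illustrative file names:)

// tests/integration/main.rs -- the only top-level test file Cargo sees.
mod bootstrap01; // was tests/bootstrap01.rs
mod bootstrap02; // was tests/bootstrap02.rs
// ...one `mod` line per former test file.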

For comparison, the Python test suite runs 1600 integration and unit tests in 18 seconds, the Perl test suite takes around 3 minutes, and the C test suite takes 5 minutes.

Anyway, to the help wanted: if anyone has ideas on how the test runtime might be improved, or if you can get the "Delete Cargo Integration Tests" approach above working again for comparison, let me know. I might be able to come up with a hybrid approach where tests under development or debug live in their own crates and are moved back into an overall test crate/folder afterwards.

@jmcnamara added the help wanted label and removed the question label May 7, 2023
@jmcnamara (Owner)

Closing since there probably isn't any fix for this. But if anyone comes up with anything, I'd be happy to hear it.

@jmcnamara reopened this Jun 21, 2023
@jmcnamara closed this as not planned Jul 5, 2023
@adriandelgado (Contributor) commented Jul 24, 2023

Hi, I've always thought the time to test this crate is massive, and I almost gave up the last time I contributed.
My recommendation is to use the insta crate instead of pretty_assertions. It's specifically designed for this sort of testing (snapshot testing). Look at this quote from its description:

Snapshot tests are particularly useful if your reference values are very large or change often.

I haven't tried to implement this to check whether it is indeed faster; I'm just speculating.
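
(For illustration, a minimal sketch of what a snapshot test with insta might look like. generate_worksheet_xml is a hypothetical stand-in for whatever produces the output under test:)

#[test]
fn worksheet_xml_snapshot() {
    // insta records the reference output in a snapshot file and
    // diffs the new output against it on later runs.
    let xml = generate_worksheet_xml();
    insta::assert_snapshot!(xml);
}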

@adriandelgado (Contributor)

You can put all the tests in a single crate and use insta's CLI to check specific snapshots.
See: https://insta.rs/docs/cli/#review

@adriandelgado (Contributor)

@jmcnamara I think this is a very important issue to solve to attract more contributors in the future.

@jmcnamara (Owner) commented Jul 25, 2023

@adriandelgado I agree it is important to fix this issue. It is starting to have a direct effect on my development time.

I’m offline for a few days so I’ll try your suggestions when I get home.

Ultimately I think the issue is the amount of time required to compile, link and run several hundred test crates. The only way around that is to group them all into one sub-folder crate.

As I said in the initial post above, I did that several months ago and it was significantly faster. However, it limited my ability to run individual tests, so I reverted the change. I tried to get it working again around the time of posting this issue but couldn't get it to compile due to some changes made to the test suite in the meantime. If you think you could get it working based on the outline in the article linked above, that would be helpful. If not, I'll get around to it soon myself.

@jmcnamara reopened this Jul 25, 2023
@jmcnamara (Owner)

I'm working on this now and I think I have a reasonable fix.

jmcnamara added a commit that referenced this issue Jul 27, 2023
Refactored integration tests into single crate for increased performance. Tests now run ~300% faster.

Issue #37
@jmcnamara (Owner)

My guess about time wasted on compiling and linking was right. I moved all the tests into a sub-directory/crate and the test execution went from around 700 sec to 2 sec!

I now have the issue of not being able to run individual tests (useful when working on new features), but I'll figure out a way around that.
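
(One possible workaround: cargo's test-name filter still works within a single test binary. Assuming the consolidated crate is named integration, something like

cargo test --test integration bootstrap01

should run only the tests whose names match bootstrap01.)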

Note that cargo test compiles the example files as well, and that can take a bit longer. I'm looking for a way to turn that off by default and run it only explicitly, but I haven't found one yet. In the meantime you can run just the integration tests like this:

cargo test --test '*'

Or the integration and lib tests (slightly slower, but not by much):

cargo test --tests

Also, I turned off the doctests by default since they are slow (~60 secs). You can run those with:

cargo test --doc

Try it out and if you spot any other improvements let me know.

@jmcnamara (Owner) commented Jul 28, 2023

It doesn't currently seem to be possible to run cargo test without building the examples (see this question), but the two more verbose options above are workable.

Closing since the single-crate solution solved the problem this issue was opened for.

@arifd commented Aug 14, 2024

This just happened, which may be of interest.

@jmcnamara (Owner)

> This just happened, which may be of interest.

Thanks for the heads up. That does indeed look promising.
