Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests for HSLuv/HPLuv clogging up the test runner #487

Closed
LeaVerou opened this issue Mar 16, 2024 · 10 comments · Fixed by #490
Closed

Tests for HSLuv/HPLuv clogging up the test runner #487

LeaVerou opened this issue Mar 16, 2024 · 10 comments · Fixed by #490
Assignees

Comments

@LeaVerou
Copy link
Member

It’s great that we have so many tests for these new color spaces (16K tests!) but that means that now 99% of our testsuite is testing these two color spaces, and npm run test now takes several minutes (!).

I propose we pick a few characteristic tests to run with a regular testsuite run, and the rest can be run explicitly (via e.g. htest test/hpluv/all.js), when we really need to thoroughly test correctness for these color spaces.

(also if there's anything on the hTest side that could improve that experience, I'm all ears!)

@facelessuser
Copy link
Collaborator

I think it is a matter of efficiency and testing what matters. In my library, I can test all relevant tests pretty fast, and Python is not known for its blazing fast speed 🙃 .

coloraide git:(main) ✗ python3 -m pytest tests/test_hsluv/test_snapshot.py
========================================================================================== test session starts ==========================================================================================
platform darwin -- Python 3.11.3, pytest-7.1.3, pluggy-1.0.0
rootdir: /Users/facelessuser/Code/github/coloraide
plugins: xonsh-0.13.4
collected 1 item

tests/test_hsluv/test_snapshot.py .                                                                                                                                                               [100%]

=========================================================================================== 1 passed in 1.28s ===========================================================================================

My comment from the test:

        The snapshot contains a number of unnecessary tests, at least for our purposes. It is sufficient to verify that
        our path from sRGB to HSLuv and HPLuv is correct, both forward and backwards, as our conversion path is noted
        below:

            ```
            srgb -> srgb-linear -> xyz-d65 -> luv -> lchuv -> hsluv
            srgb -> srgb-linear -> xyz-d65 -> luv -> lchuv -> hpluv
            ```

        Verifying each step in the chain makes no sense. If we are able to get through this entire chain and the sRGB
        to HSLuv checks out, both forward and backwards through the path, then we can consider our implementation
        successful. Any failure in the chain would cause errors to propagate up that would cause the test to fail. It
        would only be useful to test each point in the chain if we were trying to identify where in the chain breakage
        occurred.

You can see what I actually test here: https://github.com/facelessuser/coloraide/blob/main/tests/test_hsluv/test_snapshot.py

I would advise testing only the forward and backward conversion from HSLuv <-> sRGB and HPLuv <-> sRGB. Don't test XYZ as it is tested in the chain already. I wanted to formally pass the HSLuv and HPLuv tests to claim it was compliant, so I did not test all their data points for Luv and LCh, because again, if I can pass the HSLuv and HPLuv tests, I'm already passing LChuv and Luv. You can do a small set of Luv/LCh tests for completeness if you like.

One thing I'm uncertain of is whether part of the problem is that the test suite generates individual tests or each case generating excessive overhead? If so, I would consider maybe having a snapshot test, one test that validates the snapshot, so no additional overhead per case in the JSON. I imagine, even if I were to test every data point in the JSON in my own test suite, I could only see it increasing the snapshot test time by a little over double. So, I suspect some excessive overhead is generated by the test suite per test case in the JSON. But maybe I'm wrong.

@lloydk
Copy link
Collaborator

lloydk commented Mar 16, 2024

It’s great that we have so many tests for these new color spaces (16K tests!) but that means that now 99% of our testsuite is testing these two color spaces, and npm run test now takes several minutes (!).

That was the reason I initially had the tests in the hsluv directory but somebody suggested moving them to the top level test directory. 😃

I propose we pick a few characteristic tests to run with a regular testsuite run, and the rest can be run explicitly (via e.g. htest test/hpluv/all.js), when we really need to thoroughly test correctness for these color spaces.

I'm fine with that.

(also if there's anything on the hTest side that could improve that experience, I'm all ears!)

The majority of the time is spent updating the console with the countdown of the number of tests remaining.

More specifically converting result to a string in this code causes the majority of the slowdown:

https://github.com/LeaVerou/htest/blob/bc877e6a806f759d29976e9524dbf25f70f12874/src/env/node.js#L65-L79

And I think this line in the toString() method could be problematic because ret.children might have 4096 tests (x4 if you're running the whole test suite)

https://github.com/LeaVerou/htest/blob/bc877e6a806f759d29976e9524dbf25f70f12874/src/classes/TestResult.js#L367

@lloydk
Copy link
Collaborator

lloydk commented Mar 16, 2024

I would advise testing only the forward and backward conversion from HSLuv <-> sRGB and HPLuv <-> sRGB. Don't test XYZ as it is tested in the chain already. I wanted to formally pass the HSLuv and HPLuv tests to claim it was compliant, so I did not test all their data points for Luv and LCh, because again, if I can pass the HSLuv and HPLuv tests, I'm already passing LChuv and Luv. You can do a small set of Luv/LCh tests for completeness if you like.

The tests only check sRGB to/from HSLuv and HPLuv.

@lloydk
Copy link
Collaborator

lloydk commented Mar 16, 2024

Note that the test times reported by htest are misleading and don't account for any overhead added by htest.

image

@facelessuser
Copy link
Collaborator

Yeah, that overhead was what I was afraid was happening.

@lloydk
Copy link
Collaborator

lloydk commented Mar 17, 2024

I hacked in some throttling to the node runner and the results look pretty good:

image

I did make some modifications to the hsluv tests to not parse color strings or create Color objects which speed up the test times reported by htest by around 3x.

Changes I made to src/env/node.js:

	done: (result, options) => {
		let tree;
		// Get the tree every 16ms or when there are no pending tests
		if (result.lastUpdate === undefined || performance.now() - result.lastUpdate > 16 || result.stats.pending === 0) {
			result.lastUpdate = performance.now();
			let messages = result.toString({ format: options.format ?? "rich" });
			tree = getTree(messages).toString();
			tree = format(tree);
			logUpdate(tree);
		}
...

I also had to comment out this line in the same file because I was getting a "console.log is not a function" error.

//import console from "./console.js";

Changing from 16ms to 100ms made things slightly faster but less smooth.

@lloydk
Copy link
Collaborator

lloydk commented Mar 17, 2024

Keep in mind that my PC is around 10 years old so those are probably close to worst case results

@LeaVerou
Copy link
Member Author

Good catch, throttling on the hTest side would probably be a good thing regardless. But even if perf was not an issue, I'm not sure having 8K tests testing a single color space is a good thing.

Note that the test times reported by htest are misleading and don't account for any overhead added by htest.

That is intentional, since you typically care more about how long your code takes to run rather than the test runner. They just don 't help in this case 😁

@facelessuser
Copy link
Collaborator

I'm not sure having 8K tests testing a single color space is a good thing.

I generally agree. I think it is a bit excessive. I assume it was done for the same reason I did, you are implementing a space, it has a test suite, so you verify against the test suite. If overhead is not a problem, it is completed fairly quickly.

But, I think at this point, we've verified the spaces, we can probably pick some data points at random as a sanity check, and call that sufficient maybe.

@lloydk
Copy link
Collaborator

lloydk commented Mar 17, 2024

The conversion tests already contain a few tests for HSLuv and HPLuv so I'll just move the comprehensive test suite back to the hsluv directory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants