Compare datasets - Integrate Quisby into Pbench Server API #3470
Conversation
Great start, but I think there are some better patterns you could use.
Some more comments: partly (sorry) because I was too lazy before to go through the unit tests.
The test_unsuccessful_get_with_incorrect_data() test is not working, even though it is passing; and it appears to have a bad mock setup, which probably isn't helping. There are a few other gaps in the coverage; one in particular that I would like to see closed is the multiple/non-comparable benchmarks error. And, I would like to see a test for comparing public datasets owned by someone else from a non-authorized user (which I think should work and pass...). And, there is a bug which also shows up in two of the tests.
Beyond those, I have a number of other thoughts, suggestions, and nits.
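A rough sketch of the public-dataset test being asked for; the test name, the dataset names, and the HTTPStatus.OK expectation are assumptions for illustration, relying on the query_get_as fixture used elsewhere in these tests:

def test_compare_public_datasets_of_other_user(self, query_get_as):
    # Public datasets owned by someone else should still be comparable by a
    # non-owning user; "uperf_3"/"uperf_4" and the OK expectation are placeholders.
    query_get_as(["uperf_3", "uperf_4"], "test", HTTPStatus.OK)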
if not benchmark_type:
    raise APIAbort(
        HTTPStatus.UNSUPPORTED_MEDIA_TYPE, f"Unsupported Benchmark: {benchmark}"
    )
I think we convinced ourselves that UNSUPPORTED_MEDIA_TYPE is not an appropriate status to use here. I think we settled on BAD_REQUEST. 🤷 @dbutenhof?
I dislike using the overly generic BAD_REQUEST intensely; it's just that the more I think about it, the less intensely I dislike it compared to all the other options ... 😦
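For concreteness, the change under discussion would presumably look something like this (a sketch based on the diff above, not a committed change):

if not benchmark_type:
    raise APIAbort(
        HTTPStatus.BAD_REQUEST, f"Unsupported Benchmark: {benchmark}"
    )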
There are still a couple of places where a mock is returning a mock Tarball class object where I'm pretty sure it should be returning an instance of the class. And, there are a couple of other similar sorts of issues (e.g., functions which should be marked @staticmethod). These should be fixed before we merge this PR.
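To illustrate the class-versus-instance point, a minimal sketch; the MockTarball name is borrowed from the test fixtures quoted later in this review, while the enclosing class name and the path are placeholders:

from pathlib import Path


class MockTarball:
    def __init__(self, path: Path):
        self.tarball_path = path


class TestCompareSketch:
    # Returning the class object itself leaves callers with no usable tarball:
    def mock_find_dataset_wrong(self, dataset):
        return MockTarball

    # Better: a @staticmethod (it never uses self) that returns an *instance*:
    @staticmethod
    def mock_find_dataset(dataset):
        return MockTarball(Path("/fake/tarball.tar.xz"))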
That said, the test coverage for lib/pbench/server/api/resources/datasets_compare.py is 95%, which is excellent -- it's missing just two lines -- consider going for 100% (we just need one more test case...). 🏁
Also, there are a few opportunities for "DRYing" the test code out. There are some code blobs which could be refactored and made common, and, better yet, I think there are some opportunities to collapse whole test functions together using Pytest's "parametrization".
And, there are some smaller items for your consideration.
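As a sketch of the parametrization idea (the inputs and expected statuses below are invented for illustration; only the query_get_as fixture comes from the existing tests):

import pytest
from http import HTTPStatus


@pytest.mark.parametrize(
    "datasets,user,expected_status",
    [
        (["uperf_1", "uperf_2"], "test", HTTPStatus.OK),
        (["uperf_1", "nonexistent"], "test", HTTPStatus.BAD_REQUEST),
        (["uperf_1", "uperf_2"], None, HTTPStatus.UNAUTHORIZED),
    ],
)
def test_compare(query_get_as, datasets, user, expected_status):
    # One parametrized test can replace several near-identical test functions.
    query_get_as(datasets, user, expected_status)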
I think we're getting close here; down to relatively small stuff.
Oh, this is so close!
Thanks for doing the parametrization -- that's much nicer, I think (although, we did lose some negative assertions in the process, but I think it's worth it).
There are a few rough edges remaining, which we could ignore, but there are more substantial problems with two of the tests which I'd think should be addressed, and that will give you a chance for some additional polishing.
def test_unsuccessful_get_with_incorrect_data(self, query_get_as, monkeypatch):
    def mock_extract(_tarball_path: Path, _path: str) -> str:
        raise "IncorrectData"
Raising a string as an exception results in a TypeError exception...so...yeah, this results in an exception, which is what you wanted, but it doesn't produce a plausible behavior. I don't know what exact behavior we want here, but something more like this would be better:
raise RuntimeError("Testing unexpected exception behavior")
def test_tarball_unpack_exception(self, query_get_as, monkeypatch):
    def mock_extract(_tarball_path: Path, _path: str):
        raise TarballUnpackError
This raise is raising an exception class; while this does work, what we want to raise is an instance of an exception class...so,
raise TarballUnpackError(_tarball_path, f"Testing unpack exception for path {_path}")
(And, you should presumably remove the leading underscore from the parameter names, now that we're using them.)
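Put together, the corrected mock would presumably look something like this (a sketch combining the two suggestions above):

def mock_extract(tarball_path: Path, path: str):
    raise TarballUnpackError(
        tarball_path, f"Testing unpack exception for path {path}"
    )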
def test_tarball_unpack_exception(self, query_get_as, monkeypatch):
    def mock_extract(_tarball_path: Path, _path: str):
        raise TarballUnpackError

    monkeypatch.setattr(CacheManager, "find_dataset", self.mock_find_dataset)
    monkeypatch.setattr(self.MockTarball, "extract", mock_extract)
    monkeypatch.setattr(Metadata, "getvalue", mock_get_value)
    monkeypatch.setattr(
        QuisbyProcessing, "compare_csv_to_json", mock_compare_csv_to_json
    )
    query_get_as(["uperf_1", "uperf_2"], "test", HTTPStatus.INTERNAL_SERVER_ERROR)
Based on looking at the coverage, I don't think this test is working correctly. I think we're expecting this test to raise an APIInternalError at line 110, but Cobertura claims that that line is never executed. Presumably, the exception is being raised at line 102, instead, but I don't know why. (Is there some reason why user test doesn't have access to datasets uperf_1 and uperf_2 such that the mock_find_dataset() call is raising an exception?)
Yeah @webbnh, it was due to this raise TarballUnpackError, as you mentioned in the earlier comment: it expects an instance of TarballUnpackError, which I missed. After applying the suggested
raise TarballUnpackError(_tarball_path, f"Testing unpack exception for path {_path}")
it works as expected :)
Fix and debug Refactoring
So, I just checked build #13, and it's still showing two gaps in the coverage. I wouldn't block the merge on this basis, except that I think that we think that the existing tests are supposed to cover those gaps. So, I think we need to at least understand why the tests aren't working as expected. 😞
At last!
Now assuming Jenkins and GitHub do their jobs, someone can merge soon. Sigh. (And, for the record, we had a 100% successful run with 100% coverage of the new module, but Jenkins glitched and couldn't talk to GitHub, so the result never propagated. We're hoping for a reprise... of the success, that is, not of the glitch...)
Implementation of an API to compare multiple datasets using Quisby processing:
GET /api/v1/compare?datasets=d1,d2,d3
Currently, the pquisby package supports only the uperf benchmark.
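A hedged usage sketch of the endpoint described above; the server host, the bearer-token header, and the use of the requests library are assumptions for illustration, not part of the PR:

import requests

# Placeholder host and token -- substitute real values for your deployment.
response = requests.get(
    "https://pbench.example.com/api/v1/compare",
    params={"datasets": "uperf_1,uperf_2"},
    headers={"Authorization": "Bearer <token>"},
)
response.raise_for_status()
comparison = response.json()  # Quisby comparison output for the listed datasets
print(comparison)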