
Compare datasets - Integrate Quisby into Pbench Server API #3470

Merged
merged 7 commits into distributed-system-analysis:main on Jun 26, 2023

Conversation

siddardh-ra
Member

Implementation of an API to compare multiple datasets using Quisby processing:

GET /api/v1/compare?datasets=d1,d2,d3

Currently, the pquisby package supports only the uperf benchmark.
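A client-side sketch of invoking this endpoint (the server host is hypothetical; the comma-separated `datasets` query parameter follows the description above):

```python
from urllib.parse import urlencode

def build_compare_url(server: str, datasets: list[str]) -> str:
    """Build the compare URL: dataset IDs go into one comma-separated
    `datasets` query parameter."""
    return f"{server}/api/v1/compare?" + urlencode({"datasets": ",".join(datasets)})

# Hypothetical server host; a real call would typically also need an
# Authorization header for non-public datasets.
url = build_compare_url("https://pbench.example.com", ["d1", "d2", "d3"])
```

Note that `urlencode` percent-encodes the commas (`d1%2Cd2%2Cd3`); the server decodes them back transparently.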


@dbutenhof dbutenhof left a comment


Great start, but I think there are some better patterns you could use.

lib/pbench/server/api/resources/compare_dataset.py (10 resolved review threads)

@dbutenhof dbutenhof left a comment


Some more comments: partly (sorry) because I was too lazy before to go through the unit tests.

lib/pbench/server/api/resources/compare_dataset.py (3 resolved review threads)
lib/pbench/client/__init__.py (1 resolved review thread)
lib/pbench/test/unit/server/test_compare_datasets.py (2 resolved review threads)

@webbnh webbnh left a comment


The test_unsuccessful_get_with_incorrect_data() test is not working, even though it is passing; and, it appears to have a bad mock setup, which probably isn't helping. There are a few other gaps in the coverage, one in particular which I would like to see closed is the multiple/non-comparable benchmarks error. And, I would like to see a test for comparing public datasets owned by someone else from a non-authorized user (which I think should work and pass...). And, there is a bug which also shows up in two of the tests.

Beyond those, I have a number of other thoughts, suggestions, and nits.

lib/pbench/server/api/resources/compare_dataset.py (4 resolved review threads)
Comment on lines 127 to 130
    if not benchmark_type:
        raise APIAbort(
            HTTPStatus.UNSUPPORTED_MEDIA_TYPE, f"Unsupported Benchmark: {benchmark}"
        )
Member


I think we convinced ourselves that UNSUPPORTED_MEDIA_TYPE is not an appropriate status to use here.

I think we settled on BAD_REQUEST. 🤷 @dbutenhof?

Member


I dislike using the overly generic BAD_REQUEST intensely, it's just that the more I think about it, the less intensely I dislike it compared to all the other options ... 😦
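For illustration, here is the check under discussion with BAD_REQUEST substituted; APIAbort here is a minimal stand-in for Pbench's class, and the supported-benchmark set is hypothetical (only uperf, per the PR description):

```python
from http import HTTPStatus

class APIAbort(Exception):
    """Minimal stand-in for Pbench's APIAbort: an HTTP status plus a message."""
    def __init__(self, http_status: HTTPStatus, message: str):
        super().__init__(message)
        self.http_status = http_status

SUPPORTED_BENCHMARKS = {"uperf"}  # pquisby currently handles only uperf

def check_benchmark(benchmark: str) -> None:
    # BAD_REQUEST signals a client error: the request named a benchmark
    # the server cannot compare.
    if benchmark not in SUPPORTED_BENCHMARKS:
        raise APIAbort(HTTPStatus.BAD_REQUEST, f"Unsupported Benchmark: {benchmark}")
```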

lib/pbench/test/unit/server/test_compare_datasets.py (4 resolved review threads)
lib/pbench/server/api/resources/compare_dataset.py (1 resolved review thread)
@webbnh webbnh added the Server label Jun 23, 2023
@webbnh webbnh added this to the v0.73 milestone Jun 23, 2023
@siddardh-ra siddardh-ra added the API label Jun 23, 2023

@webbnh webbnh left a comment


There are still a couple of places where a mock is returning a mock Tarball class object where I'm pretty sure it should be returning an instance of the class. And, there are a couple of other similar sorts of issues (e.g., functions which should be marked @staticmethod). These should be fixed before we merge this PR.

That said, the test coverage for lib/pbench/server/api/resources/datasets_compare.py is 95%, which is excellent -- it's missing just two lines -- consider going for 100% (we just need one more test case...). 🏁

Also, there are a few opportunities for "DRYing" the test code out. There are some code blobs which could be refactored and made common, and, better yet, I think there are some opportunities to collapse whole test functions together using Pytest's "parametrization".

And, there are some smaller items for your consideration.

lib/pbench/server/api/__init__.py (1 resolved review thread)
lib/pbench/test/unit/server/test_datasets_compare.py (6 resolved review threads)

@dbutenhof dbutenhof left a comment


I think we're getting close here; down to relatively small stuff.

lib/pbench/server/api/resources/datasets_compare.py (2 resolved review threads)
lib/pbench/test/unit/server/test_datasets_compare.py (2 resolved review threads)
dbutenhof
dbutenhof previously approved these changes Jun 24, 2023

@webbnh webbnh left a comment


Oh, this is so close!

Thanks for doing the parametrization -- that's much nicer, I think (although we did lose some negative assertions in the process, I think it's worth it).

There are a few rough edges remaining, which we could ignore, but there are more substantial problems with two of the tests which I'd think should be addressed, and that will give you a chance for some additional polishing.

lib/pbench/test/unit/server/test_datasets_compare.py (1 resolved review thread)
Comment on lines 78 to 80
    def test_unsuccessful_get_with_incorrect_data(self, query_get_as, monkeypatch):
        def mock_extract(_tarball_path: Path, _path: str) -> str:
            raise "IncorrectData"
Member


Raising a string as an exception results in a TypeError exception...so...yeah, this results in an exception, which is what you wanted, but it doesn't produce a plausible behavior.

I don't know what exact behavior we want here, but something more like this would be better:

            raise RuntimeError("Testing unexpected exception behavior")

Comment on lines 95 to 97
    def test_tarball_unpack_exception(self, query_get_as, monkeypatch):
        def mock_extract(_tarball_path: Path, _path: str):
            raise TarballUnpackError
Member


This raise is raising an exception class; while this does work, what we want to raise is an instance of an exception class...so,

            raise TarballUnpackError(_tarball_path, f"Testing unpack exception for path {_path}")

(And, you should presumably remove the leading underscore from the parameter names, now that we're using them.)
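The failure modes called out in the two comments above can be shown directly; DemoUnpackError is a stand-in for TarballUnpackError, whose constructor (per the suggestion above) takes a tarball path and a message:

```python
class DemoUnpackError(Exception):
    """Stand-in for TarballUnpackError: __init__ requires arguments."""
    def __init__(self, tarball: str, msg: str):
        super().__init__(f"{tarball}: {msg}")

# Raising a bare string fails immediately: exceptions must derive
# from BaseException, so Python raises TypeError instead.
try:
    raise "IncorrectData"
except TypeError:
    string_raise_fails = True

# Raising the class itself makes Python instantiate it with no arguments;
# that fails here because __init__ requires two.
try:
    raise DemoUnpackError
except TypeError:
    class_raise_fails = True

# Raising a constructed instance behaves as intended.
try:
    raise DemoUnpackError("uperf_1.tar.xz", "Testing unpack exception")
except DemoUnpackError as exc:
    message = str(exc)
```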

lib/pbench/test/unit/server/test_datasets_compare.py (2 resolved review threads)
Comment on lines 95 to 104
    def test_tarball_unpack_exception(self, query_get_as, monkeypatch):
        def mock_extract(_tarball_path: Path, _path: str):
            raise TarballUnpackError

        monkeypatch.setattr(CacheManager, "find_dataset", self.mock_find_dataset)
        monkeypatch.setattr(self.MockTarball, "extract", mock_extract)
        monkeypatch.setattr(Metadata, "getvalue", mock_get_value)
        monkeypatch.setattr(
            QuisbyProcessing, "compare_csv_to_json", mock_compare_csv_to_json
        )
        query_get_as(["uperf_1", "uperf_2"], "test", HTTPStatus.INTERNAL_SERVER_ERROR)
Member


Based on looking at the coverage, I don't think this test is working correctly. I think we're expecting this test to raise an APIInternalError at line 110, but Cobertura claims that that line is never executed. Presumably, the exception is being raised at line 102, instead, but I don't know why. (Is there some reason why user test doesn't have access to datasets uperf_1 and uperf_2 such that the mock_find_dataset() call is raising an exception?)

Member Author


Yeah @webbnh, it was due to this raise TarballUnpackError; as you mentioned in the earlier comment, it expects an instance of TarballUnpackError, which I missed. After applying your suggestion, raise TarballUnpackError(_tarball_path, f"Testing unpack exception for path {_path}"), it works as expected :)

lib/pbench/test/unit/server/test_datasets_compare.py (1 resolved review thread)
@webbnh

webbnh commented Jun 26, 2023

So, I just checked build #13, and it's still showing two gaps in the coverage. I wouldn't block the merge on this basis, except that I think that we think that the existing tests are supposed to cover those gaps. So, I think we need to at least understand why the tests aren't working as expected. 😞


@webbnh webbnh left a comment


At last!

@dbutenhof

dbutenhof commented Jun 26, 2023

At last!

Now assuming Jenkins and GitHub do their jobs, someone can merge soon. Sigh.

(And, for the record, we had a 100% successful run with 100% coverage of the new module, but Jenkins glitched and couldn't talk to GitHub so the result never propagated. We're hoping for a reprise... of the success, that is, not of the glitch ...)

@webbnh webbnh merged commit 1eebe0f into distributed-system-analysis:main Jun 26, 2023
Labels
API, Server
Projects
Status: Done
3 participants