Improve client benchmark identification #3496

Merged: 1 commit into distributed-system-analysis:main from bmark, Jul 17, 2023

Conversation

dbutenhof
Member

PBENCH-1186
PBENCH-1187

A bit of a catch-all. First, this simplifies and standardizes the way the dashboard identifies a benchmark. Eventually we'll be "smart enough" to go beyond the Pbench benchmark wrappers (as we intend to move away from relying on them), and `dataset.metalog.pbench.script` will become unreliable. This adds a new `server.benchmark` metadata key, which for now carries the same value but can be extended beyond that in the future.

The dashboard would also like to be able to identify, in advance, which benchmark types the current server supports, and we'd like to avoid hardcoding that knowledge in the dashboard. For this purpose, I've added a "visualization" key to the endpoints response indicating the benchmark types supported by the current server's `datasets_visualize` and `datasets_compare` APIs.

For some reason, a minor consolidation of metadata-setting code in the intake API base resulted in odd unit test and functional test failures. The strange part is that these should have been failing earlier, since they're cases where a dataset doesn't have `server.index-map` but the API requires it (for example, `datasets_detail` and `datasets_contents`). I reworked the code a bit to fail with a `CONFLICT` error (already used elsewhere) rather than an internal server error when the dataset is `server.archiveonly`, and adjusted the tests to match.
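
As a rough sketch of how a dashboard client might consume these two additions (the URL paths, query parameter, dataset ID, and response shapes below are illustrative assumptions, not details taken from this PR):

    # Sketch only: endpoint paths, parameter names, and response shapes are assumed.
    import requests

    SERVER = "https://pbench.example.com"   # placeholder server
    DATASET = "<dataset-resource-id>"       # placeholder dataset ID

    # The endpoints response now advertises which benchmark types this server's
    # datasets_visualize and datasets_compare APIs can handle.
    endpoints = requests.get(f"{SERVER}/api/v1/endpoints").json()
    supported = endpoints.get("visualization", {}).get("benchmarks", [])

    # Each dataset carries a server.benchmark metadata value identifying its
    # native benchmark (currently derived from dataset.metalog.pbench.script).
    meta = requests.get(
        f"{SERVER}/api/v1/datasets/{DATASET}/metadata",
        params={"metadata": "server.benchmark"},
    ).json()
    benchmark = meta.get("server.benchmark", "unknown")

    if benchmark in supported:
        print(f"{DATASET} ({benchmark}) can be visualized on this server")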

@dbutenhof added the Server and API labels Jul 13, 2023
@dbutenhof self-assigned this Jul 13, 2023
Member

@siddardh-ra left a comment

👍

@dbutenhof merged commit 5329c0f into distributed-system-analysis:main Jul 17, 2023
@dbutenhof deleted the bmark branch July 17, 2023 12:01
Member

@webbnh left a comment

I'm late to this game, but I think this change may have introduced a bug into one of the tests (and I think it may have removed some checks for stuff that we do want to test for)...and, as long as I'm here, I have a few other comments.

Comment on lines +124 to +125
if Metadata.getvalue(dataset, Metadata.SERVER_ARCHIVE):
raise APIAbort(HTTPStatus.CONFLICT, "Dataset is not indexed")
Member

This message is insufficient: in 90% of the cases when users receive this, they are going to ask, "why isn't this dataset indexed?" And, it's unlikely that they are going to realize that it is because indexing was disabled on it when it was uploaded.

So, this message really should say something more like, "Indexing of this dataset was disabled by the owner".

Member Author

It's not necessarily by deliberate action of the owner: if a tarball is uploaded without a metadata.log in the right place, we'll accept it but mark it "archive only". I suppose I could have said that it's marked for archive only, but at the time that seemed more complicated and more obscure than just saying it wasn't indexed. Maybe "indexing was disabled" would be OK, but "by the owner" implies deliberate intent.

Member

if a tarball is uploaded without a metadata.log in the right place, we'll accept it but mark it "archive only".

Ah, yes, I'd neglected that detail. But, yes, "marked for archive only" would be much better than "not indexed".
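
A minimal sketch of what the check on lines +124 to +125 might look like with the wording this thread converged on (the exact message in any follow-up change may differ):

    if Metadata.getvalue(dataset, Metadata.SERVER_ARCHIVE):
        raise APIAbort(
            HTTPStatus.CONFLICT,
            "Dataset is marked for archive only and has not been indexed",
        )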

Comment on lines +615 to +620
# BENCHMARK provides a simple identification of the native benchmark where
# one can be identified. At the simplest for Pbench Agent tarballs this is
# just "dataset.metalog.pbench.script" but will be expanded in the future
# for example to identify a "pbench-user-benchmark -- fio" as "fio".
SERVER_BENCHMARK = "server.benchmark"
SERVER_BENCHMARK_UNKNOWN = "unknown"
Member

It probably would be good to have a reference here to possible values for this key (i.e., to BenchmarkName or endpoints["visualization"]["benchmarks"]?).

Member Author

Those would constitute examples, but I already added fio as an example. Maybe eventually we'll support visualization of all possible benchmarks, but right now either of those references would be incomplete.

Member

So, the value associated with "server.benchmark" can be anything the user wants it to be...but, if it's not one of the BenchmarkName values, the Server won't be able to recognize it, and if it's not one of endpoints["visualization"]["benchmarks"] it can't be visualized by the Dashboard, right? I vote that we say that, here....
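
To make the distinction concrete, a hypothetical illustration (the BenchmarkName values and the stub endpoints data below are placeholders, not the server's actual lists):

    from enum import Enum

    class BenchmarkName(Enum):   # stand-in for the server's real BenchmarkName
        FIO = "fio"
        UPERF = "uperf"

    # Stub of the endpoints response described above.
    endpoints = {"visualization": {"benchmarks": ["fio", "uperf"]}}

    recognized = {b.value for b in BenchmarkName}
    visualizable = set(endpoints["visualization"]["benchmarks"])

    benchmark = "linpack"        # e.g., some stored server.benchmark value
    if benchmark not in recognized:
        print(f"{benchmark!r} is stored, but the server can't interpret it")
    elif benchmark not in visualizable:
        print(f"{benchmark!r} is recognized, but can't be visualized by the dashboard")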

Comment on lines 427 to +437
     if not metalog:
-        metalog = {"pbench": {"script": "Foreign"}}
+        benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
+        metalog = {"pbench": {"name": dataset.name, "script": benchmark}}
         metadata[Metadata.SERVER_ARCHIVE] = True
         attributes["missing_metadata"] = True
+    else:
+        p = metalog.get("pbench")
+        if p:
+            benchmark = p.get("script", Metadata.SERVER_BENCHMARK_UNKNOWN)
+        else:
+            benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
Member

Alternatively,

                benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
                if not metalog:
                    metalog = {"pbench": {"name": dataset.name, "script": benchmark}}
                    metadata[Metadata.SERVER_ARCHIVE] = True
                    attributes["missing_metadata"] = True
                else:
                    p = metalog.get("pbench", {"script": benchmark})
                    benchmark = p.get("script", benchmark)

Comment on lines +100 to +103
assert (
dataset.metadata["dataset.metalog.pbench.script"]
== dataset.metadata["server.benchmark"]
)
Member

While this assertion (generally) holds at the moment, this isn't really what we want going forward, right? That is, we're hoping that at some point we'll have pbench-user-benchmark as the script and fio as the benchmark, right? So, it would be better if we could structure this assertion in a way which doesn't break when we get where we want to be going. (I.e., "test as specification".)

(Also, you didn't take this shortcut at lines 190 and 192.)

Member Author

For the archiveonly test, I know exactly what values to expect; this is an iteration, and all we can really know at this point is that they ought to match.

When and if we work out a more general "base benchmark" recognition that doesn't rely on dataset.metalog.pbench.script, we'll have to figure out how to mimic that on the functional test side.

Member

all we can really know at this point is that they ought to match.

They should only match currently because of an implementation detail, not because that's specifically the way we want these fields to behave. It would be best if the tests were constructed in a way that doesn't require them to change when details of the implementation change.... (Ideally, we would not change the tests nearly as often as we seem to do.)
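
Purely as an illustration of the "test as specification" idea (the expected-value mapping below is invented for the example, not something proposed in this PR):

    # Hypothetical: assert against what the test knows it uploaded, rather than
    # against another piece of server-derived metadata.
    EXPECTED_BENCHMARK = {"fio_rw_2018.02.01T22.40.57": "fio"}   # invented mapping
    expected = EXPECTED_BENCHMARK.get(dataset.name, "unknown")
    assert dataset.metadata["server.benchmark"] == expected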

Comment on lines 730 to +735
     for dataset in datasets:
-        server_client.update(dataset.resource_id, access=expected)
-        print(f"\t ... updating {dataset.name} to {access!r}")
-        meta = server_client.get_metadata(
-            dataset.resource_id, metadata="dataset.access"
-        )
-        assert meta["dataset.access"] == expected
+        response = server_client.update(
+            dataset.resource_id, access=expected, raise_error=False
+        )
+        print(f"\t ... updating {dataset.name} to {access!r}")
+        if response.ok:
Member

Shouldn't the print() at line 734 precede the operation at line 731? (And, shouldn't the print() be referencing expected rather than access, since that's what line 732 does?)

Member Author

This shows up as a diff because I rearranged code, but I didn't change the line.

Member

True. But, I still think the lines are in the wrong order, etc. 🙂
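
For reference, the ordering being suggested would look roughly like this (a sketch of the intent, not the merged code):

    for dataset in datasets:
        # Report the intended change before attempting it, and report `expected`,
        # which is what the update actually requests.
        print(f"\t ... updating {dataset.name} to {expected!r}")
        response = server_client.update(
            dataset.resource_id, access=expected, raise_error=False
        )
        if response.ok:
            ...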

dbutenhof added a commit to dbutenhof/pbench that referenced this pull request Jul 20, 2023
PBENCH-1215

Webb pointed out a broken functional test in distributed-system-analysis#3496. Once it was testing what I
had wanted it to test, I realized that it wasn't quite working. This contains
some server and functional test changes to make archive-only dataset queries
work correctly and to be sure that's adequately tested in the functional test
cases.
dbutenhof added a commit that referenced this pull request Jul 26, 2023
* Fix some problems with archive-only behavior

PBENCH-1215

Webb pointed out a broken functional test in #3496. Once it was testing what I
had wanted it to test, I realized that it wasn't quite working. This contains
some server and functional test changes to make archive-only dataset queries
work correctly and to be sure that's adequately tested in the functional test
cases.