Improve client benchmark identification #3496
Conversation
PBENCH-1186
PBENCH-1187

A bit of a catch-all. First, this simplifies and standardizes the dashboard's need to identify a benchmark. Eventually we'll be "smart enough" to go beyond the Pbench benchmark wrappers (as we intend to move away from relying on them) and `dataset.metalog.pbench.script` will become unreliable. This adds a new `server.benchmark`, which right now is the same but can be extended beyond that in the future.

The dashboard would also like to be able to identify which benchmark types the current server can support, in advance, and we'd like to avoid hardcoding that knowledge in the dashboard. For this purpose, I've added a "visualization" key in the endpoints response to indicate the benchmark types supported by the current server's `datasets_visualize` and `datasets_compare` APIs.

For some reason, minor consolidation of metadata-setting code in the intake API base resulted in weird unit test and functional test failures. The odd part is that these should have been failing earlier, as they're cases where a dataset doesn't have `server.index-map` but the API requires it (for example in `datasets_detail` and `datasets_contents`). I twizzled the code a bit to fail with a `CONFLICT` error (which had already been used elsewhere) rather than a server internal error when the dataset is `server.archiveonly`, and adjusted the tests to match.
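As a rough illustration of how a client might use the new endpoints key, here is a minimal sketch. It assumes the server publishes its endpoints description at `/api/v1/endpoints` and that the new key is laid out as `endpoints["visualization"]["benchmarks"]` (the layout referenced later in this review); the server URL and field handling are illustrative, not part of this PR.

```python
import requests

SERVER = "https://pbench.example.com"  # hypothetical server address

# Fetch the endpoints description once, then ask which benchmark types the
# server's datasets_visualize / datasets_compare APIs can handle.
endpoints = requests.get(f"{SERVER}/api/v1/endpoints").json()
supported = set(endpoints.get("visualization", {}).get("benchmarks", []))

# A dashboard can then gate its "Visualize"/"Compare" actions per dataset,
# using the dataset's server.benchmark metadata value (e.g., "fio").
benchmark = "fio"
if benchmark in supported:
    print(f"{benchmark} can be visualized on this server")
else:
    print(f"{benchmark} is not supported by this server's visualization APIs")
```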
👍
I'm late to this game, but I think this change may have introduced a bug into one of the tests (and I think it may have removed some checks for stuff that we do want to test for)...and, as long as I'm here, I have a few other comments.
if Metadata.getvalue(dataset, Metadata.SERVER_ARCHIVE):
    raise APIAbort(HTTPStatus.CONFLICT, "Dataset is not indexed")
This message is insufficient: in 90% of the cases when users receive this, they are going to ask, "why isn't this dataset indexed?" And, it's unlikely that they are going to realize that it is because indexing was disabled on it when it was uploaded.
So, this message really should say something more like, "Indexing of this dataset was disabled by the owner".
It's not necessarily by deliberate action of the owner: if a tarball is uploaded without a `metadata.log` in the right place, we'll accept it but mark it "archive only". I suppose I could have said that it's marked for archive only, but at the time that seemed more complicated and more obscure than just saying it wasn't indexed. Maybe "indexing was disabled" would be OK, but "by the owner" implies deliberate intent.
> if a tarball is uploaded without a `metadata.log` in the right place, we'll accept it but mark it "archive only".
Ah, yes, I'd neglected that detail. But, yes, "marked for archive only" would be much better than "not indexed".
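A possible rewording along the lines discussed above, a sketch only (the final message text was not settled in this thread; `Metadata`, `dataset`, and `APIAbort` are the same names used in the hunk above):

```python
from http import HTTPStatus

# Sketch: report the cause (the dataset is marked archive-only) rather than
# the symptom (it isn't indexed).
if Metadata.getvalue(dataset, Metadata.SERVER_ARCHIVE):
    raise APIAbort(
        HTTPStatus.CONFLICT,
        "Dataset is marked archive-only, so it was never indexed",
    )
```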
# BENCHMARK provides a simple identification of the native benchmark where
# one can be identified. At the simplest for Pbench Agent tarballs this is
# just "dataset.metalog.pbench.script" but will be expanded in the future
# for example to identify a "pbench-user-benchmark -- fio" as "fio".
SERVER_BENCHMARK = "server.benchmark"
SERVER_BENCHMARK_UNKNOWN = "unknown"
It probably would be good to have a reference here to possible values for this key (i.e., to `BenchmarkName` or `endpoints["visualization"]["benchmarks"]`?).
Those would constitute examples, but I already added `fio` as an example. Maybe eventually we'll support visualization of all possible benchmarks, but right now either of those references would be incomplete.
So, the value associated with `"server.benchmark"` can be anything the user wants it to be...but, if it's not one of the `BenchmarkName` values, the Server won't be able to recognize it, and if it's not one of `endpoints["visualization"]["benchmarks"]` it can't be visualized by the Dashboard, right? I vote that we say that, here....
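One way to capture that in the comment itself, a sketch rather than the merged wording:

```python
# BENCHMARK provides a simple identification of the native benchmark where
# one can be identified; at the simplest, for Pbench Agent tarballs, this is
# "dataset.metalog.pbench.script", falling back to "unknown". The value is
# free-form: only values matching BenchmarkName are recognized by the server,
# and only those listed in endpoints["visualization"]["benchmarks"] can be
# visualized or compared by the dashboard.
SERVER_BENCHMARK = "server.benchmark"
SERVER_BENCHMARK_UNKNOWN = "unknown"
```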
 if not metalog:
-    metalog = {"pbench": {"script": "Foreign"}}
+    benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
+    metalog = {"pbench": {"name": dataset.name, "script": benchmark}}
     metadata[Metadata.SERVER_ARCHIVE] = True
     attributes["missing_metadata"] = True
+else:
+    p = metalog.get("pbench")
+    if p:
+        benchmark = p.get("script", Metadata.SERVER_BENCHMARK_UNKNOWN)
+    else:
+        benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
Alternatively,

benchmark = Metadata.SERVER_BENCHMARK_UNKNOWN
if not metalog:
    metalog = {"pbench": {"name": dataset.name, "script": benchmark}}
    metadata[Metadata.SERVER_ARCHIVE] = True
    attributes["missing_metadata"] = True
else:
    p = metalog.get("pbench", {"script": benchmark})
    benchmark = p.get("script", benchmark)
assert (
    dataset.metadata["dataset.metalog.pbench.script"]
    == dataset.metadata["server.benchmark"]
)
While this assertion (generally) holds at the moment, this isn't really what we want going forward, right? That is, we're hoping that at some point we'll have `pbench-user-benchmark` as the script and `fio` as the benchmark, right? So, it would be better if we could structure this assertion in a way which doesn't break when we get where we want to be going. (I.e., "test as specification".)

(Also, you didn't take this shortcut at lines 190 and 192.)
For the archiveonly test, I know exactly what values to expect; this is an iteration, and all we can really know at this point is that they ought to match.

When and if we work out a more general "base benchmark" recognition that doesn't rely on `dataset.metalog.pbench.script`, we'll have to figure out how to mimic that on the functional test side.
> all we can really know at this point is that they ought to match.
They should only match currently because of an implementation detail, not because that's specifically the way we want these fields to behave. It would be best if the tests were constructed in a way that doesn't require them to change when details of the implementation change.... (Ideally, we would not change the tests nearly as often as we seem to do.)
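One way the functional test assertion could be phrased closer to the intended specification, a sketch only (the `pbench-user-benchmark` carve-out is an assumption about where the implementation is headed, not something this PR implements):

```python
script = dataset.metadata["dataset.metalog.pbench.script"]
benchmark = dataset.metadata["server.benchmark"]

# Spec-level expectations: server.benchmark is always populated, and whenever
# the script itself names the benchmark, the two agree. A future server that
# unwraps "pbench-user-benchmark -- fio" into "fio" would still pass.
assert benchmark, "server.benchmark should always be set"
if script != "pbench-user-benchmark":
    assert benchmark == script
```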
 for dataset in datasets:
-    server_client.update(dataset.resource_id, access=expected)
-    print(f"\t ... updating {dataset.name} to {access!r}")
-    meta = server_client.get_metadata(
-        dataset.resource_id, metadata="dataset.access"
-    )
-    assert meta["dataset.access"] == expected
+    response = server_client.update(
+        dataset.resource_id, access=expected, raise_error=False
+    )
+    print(f"\t ... updating {dataset.name} to {access!r}")
+    if response.ok:
Shouldn't the `print()` at line 734 precede the operation at line 731? (And, shouldn't the `print()` be referencing `expected` rather than `access`, since that's what line 732 does?)
This shows up as a diff because I rearranged code, but I didn't change the line.
True. But, I still think the lines are in the wrong order, etc. 🙂
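For reference, the ordering being suggested would look roughly like this (a sketch; `expected` is the access value the update is meant to apply, as in the surrounding test):

```python
for dataset in datasets:
    # Announce the change before issuing it, and report the value actually
    # being requested (expected) rather than the loop's access variable.
    print(f"\t ... updating {dataset.name} to {expected!r}")
    response = server_client.update(
        dataset.resource_id, access=expected, raise_error=False
    )
    if response.ok:
        ...  # verify the change as before
```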
* Fix some problems with archive-only behavior

PBENCH-1215

Webb pointed out a broken functional test in #3496. Once it was testing what I had wanted it to test, I realized that it wasn't quite working. This contains some server and functional test changes to make archive-only dataset queries work correctly and to be sure that's adequately tested in the functional test cases.