Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support limited non-alphanumeric metadata keys #3354

Merged
merged 4 commits into from
Mar 22, 2023

Conversation

dbutenhof
Copy link
Member

PBENCH-1111

Internally defined and managed Pbench Server metadata keys are lowercase alphanumeric strings; however the dataset.metalog namespace is extracted directly from the tarball metadata.log file and some benchmarks (especially fio and uperf generate tool and iteration keys that aren't accessible because they fail the normal key path validation checks.

This PR loosens validation checks in some paths so GET APIs that retrieve metadata will be able to find these keys, for example uperf iterations like dataset.metalog.iterations/1-tcp_stream-64B-1i.

(NOTE: I first noticed this problem when I added general metadata filtering, and made special allowances in the filter keyword processing, but at the time didn't want to expand the scope to deal with more general handling. I was reminded when I added the key aggregator, and finally I'm fixing it.)

@dbutenhof dbutenhof added Server API Of and relating to application programming interfaces to services and functions labels Mar 17, 2023
@dbutenhof dbutenhof self-assigned this Mar 17, 2023
webbnh
webbnh previously approved these changes Mar 20, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK to me, but...do we guard against metadata keys containing dots, colons, and any other characters that we have special semantics for?...and do we have tests that ensure that we disallow the things we're supposed to disallow? And, I found a typo and I have a code suggestion.

webbnh
webbnh previously approved these changes Mar 20, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; just two "small questions".

lib/pbench/test/unit/server/test_schema.py Outdated Show resolved Hide resolved
lib/pbench/test/unit/server/test_schema.py Show resolved Hide resolved
webbnh
webbnh previously approved these changes Mar 21, 2023
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

lib/pbench/server/api/resources/__init__.py Show resolved Hide resolved
lib/pbench/test/unit/server/test_schema.py Show resolved Hide resolved
PBENCH-1111

Internally defined and managed Pbench Server metadata keys are lowercase
alphanumeric strings; however the dataset.metalog namespace is extracted
directly from the tarball metadata.log file and some benchmarks (especially
fio and uperf generate tool and iteration keys that aren't accessible
because they fail the normal key path validation checks.

This PR loosens validation checks in some paths so `GET` APIs that retrieve
metadata will be able to find these keys, for example uperf iterations like
`dataset.metalog.iterations/1-tcp_stream-64B-1i`.

(NOTE: I first noticed this problem when I added general metadata filtering,
and made special allowances in the `filter` keyword processing, but at the
time didn't want to expand the scope to deal with more general handling. I
was reminded when I added the key aggregator, and finally I'm fixing it.)
(There could always be more, but I think this covers the basics.)
Some more test cases, but also now recursively validating JSON keys to avoid
that loophole.
Confirm that multiple unacceptable JSON keys are collected across various
levels of the JSON tree.
Copy link
Member

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@dbutenhof dbutenhof merged commit afb9bef into distributed-system-analysis:main Mar 22, 2023
@dbutenhof dbutenhof deleted the metakey branch March 22, 2023 15:42
@portante portante added this to the v0.73 milestone Mar 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Of and relating to application programming interfaces to services and functions Server
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants