-
Notifications
You must be signed in to change notification settings - Fork 313
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include a new downsampling operation #1574
Include a new downsampling operation #1574
Conversation
The new downsampling operation allows Rally to hit the Elasticsearch /<source-index>/_rollup/<target-index> endpoint to start a downsampling operation on the source index. Expected parameters include: * fixed_interval: the aggregation granularity using the same interval definition used by date histograms * source-index: the index to downsample applying the aggregation * target-index: the index created as a result of the aggregation on the @timestamp field
We would like to benchmark the downsample operation so we make it a non-administrative operation. Also we don't want it to be retried if it fails to avoid latency to be affected by retries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some suggestions.
One optional thing to consider that might be cool/useful is to add a parameter source for this operation, that can provide defaults for source_index
and target_index
for ease-of-use. We have helper function params.get_target(track, params)
for this purpose in param sources, and your target-index
could just default to some fixed pattern like f"{source-index}_{fixed-interval}"
Co-authored-by: Rick Boyd <boyd.richardj@gmail.com>
Co-authored-by: Rick Boyd <boyd.richardj@gmail.com>
@elasticmachine run rally/rally-tracks-compat |
The CI is failing with |
This means that test is now passing! It should be removed from elastic/rally-tracks@117f306 after merging this pull request. Or we could change that line in rally-tracks to See https://docs.pytest.org/en/7.1.x/how-to/skipping.html for details on that pytest fixture. |
Co-authored-by: Rick Boyd <boyd.richardj@gmail.com>
docs/track.rst
Outdated
"""""""""" | ||
|
||
* ``fixed_interval`` (optional, defaults to ``1h``): The aggregation interval key defined as in `https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#fixed_intervals`. | ||
* ``source-index`` (optional, defaults to ``tsdb-index``): The index containing data to aggregate which includes a ``@timestamp`` field. Note that this index should be marked read-only prior to the execution of this operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* ``source-index`` (optional, defaults to ``tsdb-index``): The index containing data to aggregate which includes a ``@timestamp`` field. Note that this index should be marked read-only prior to the execution of this operation. | |
* ``source-index`` (optional): The index containing data to aggregate which includes a ``@timestamp`` field. Note that this index should be marked read-only prior to the execution of this operation. If there is only one index defined in the ``indices`` of the track definition, that will be used as the default. |
assert p["source-index"] == "tsdb" | ||
assert p["target-index"] == "tsdb-1m" | ||
|
||
def test_downsample_empty_params(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this test buys us? Checked with a debugger and captured the value of p:
{'fixed-interval': '1h', 'source-index': None, 'target-index': 'None-1h', 'request-timeout': None, 'headers': None, 'opaque-id': None}
Might be better instead to have the Track constructor call include an indices
param with value like [{"name": "test-source-index", "body": "index.json"}]
and assert that source-index gets provided as test-source-index
tests/track/params_test.py
Outdated
|
||
p = source.params() | ||
|
||
assert p["fixed-interval"] != "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the documented default, and should need changing if the default changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still have 3 "unresolved conversations" on this PR, relating to documentation update and hardening testing the defaults
tests/track/params_test.py
Outdated
p = source.params() | ||
|
||
assert p["fixed-interval"] == "1h" | ||
assert p["source-index"] != "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To put my above comment slightly differently, this assert only passes because the source-index
is None
instead of ""
, technically different but not a going concern. You've covered my larger concern in test_downsample_default_index_param
, where you ensure there's a default when there is supposed to be a default, so perhaps this can just be deleted and the test is left to assert the default fixed-interval
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, thank you for iterating
The new downsampling operation allows Rally to hit the Elasticsearch
/<source-index>/_downsample/<target-index>
endpoint to start a downsampling operation on the source index. Expected parameters include:fixed_interval
: the aggregation granularity using the same interval definition used by date histogramssource-index
: the index to downsample applying the aggregationtarget-index
: the index created as a result of the aggregation on the timestamp fieldExample: aggregate the index
test-source-index
on the timestamp field using a 1 minute interval and store the result in a new index calledtest-target-index
.