Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSDB: add index timestamp range check #78291

Merged
merged 30 commits into from
Nov 11, 2021

Conversation

weizijun
Copy link
Contributor

  • Add a index.time_series.start_time and index.time_series.end_time setting to time series indices and reject documents with @timestamp outside the range
  • Reject documents without an @timestamp

@elasticsearchmachine elasticsearchmachine added v8.0.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Sep 24, 2021
@nik9000 nik9000 requested review from nik9000 and imotov September 24, 2021 20:13
@nik9000 nik9000 added the :StorageEngine/TSDB You know, for Metrics label Sep 24, 2021
@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 24, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. You already did the end time only getting bigger. It's just the down time.

Could you make sure the setting aren't set-able without being in time series mode? I think right now they are behind the feature flag but not behind enabling the mode. I probably didn't mention that when we typed at each other earlier.

@weizijun
Copy link
Contributor Author

Could you make sure the setting aren't set-able without being in time series mode? I think right now they are behind the feature flag but not behind enabling the mode. I probably didn't mention that when we typed at each other earlier.

ok, I will add the time_series mode check

@nik9000
Copy link
Member

nik9000 commented Sep 26, 2021 via email

Copy link
Contributor

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nik9000 let's discuss the timestamp issue.

This reverts commit b4a68c3.
This reverts commit 63e552c.
* upstream/master: (521 commits)
  Migrate custom role providers to licensed feature (elastic#79127)
  Remove stale AwaitsFix in InternalEngineTests (elastic#79323)
  Fix errors in RefreshListenersTests (elastic#79324)
  Reeable BwC Tests after elastic#79318 (elastic#79320)
  Mute BwC Tests for elastic#79318 (elastic#79319)
  Reenable BwC Tests after elastic#79308 (elastic#79313)
  Disable BwC Tests for elastic#79308 (elastic#79310)
  Adjust BWC for node-level field cap requests (elastic#79301)
  Allow total memory to be overridden (elastic#78750)
  Fix SnapshotBasedIndexRecoveryIT#testRecoveryIsCancelledAfterDeletingTheIndex (elastic#79269)
  Disable BWC tests
  Mute GeoIpDownloaderCliIT.testStartWithNoDatabases (elastic#79299)
  Add alias support to fleet search API (elastic#79285)
  Create a coordinating node level reader for tsdb (elastic#79197)
  Route documents to the correct shards in tsdb (elastic#77731)
  Inject migrate action regardless of allocate action (elastic#79090)
  Migrate to data tiers should always ensure a TIER_PREFERENCE is set (elastic#79100)
  Skip building of BWC distributions when building release artifacts (elastic#79180)
  Default ENFORCE_DEFAULT_TIER_PREFERENCE to true (elastic#79275)
  Deprecation of transient cluster settings (elastic#78794)
  ...

# Conflicts:
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/tsdb/10_settings.yml
#	server/src/main/java/org/elasticsearch/common/settings/IndexScopedSettings.java
#	server/src/main/java/org/elasticsearch/common/settings/Setting.java
#	server/src/main/java/org/elasticsearch/index/IndexMode.java
#	server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
@imotov
Copy link
Contributor

imotov commented Oct 18, 2021

@elasticmachine test this please

@weizijun
Copy link
Contributor Author

@elasticmachine test this please

@imotov After #79136 Merged, I will change the timestamp check in DataStreamTimestampFieldMapper

@imotov
Copy link
Contributor

imotov commented Oct 19, 2021

There are test failing here. I am not sure if you can see them.

org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests > testDynamicIndexSettingsAreClassified FAILED
    java.lang.AssertionError: setting [index.time_series.end_time] is not classified as replicated or not replicated
    Expected: is <true>
         but: was <false>
        at __randomizedtesting.SeedInfo.seed([9AB92C831026ACBD:2F05BBEA3A1C9FC7]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:956)
        at org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests.testDynamicIndexSettingsAreClassified(TransportResumeFollowActionTests.java:227)
REPRODUCE WITH: ./gradlew ':x-pack:plugin:ccr:test' --tests "org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests.testDynamicIndexSettingsAreClassified" -Dtests.seed=9AB92C831026ACBD -Dtests.locale=ar-EG -Dtests.timezone=Africa/Windhoek -Druntime.java=11

Suite: Test class org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests
  1> [2021-10-18T22:07:28,045][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testDynamicIndexSettingsAreClassified] before test
  1> [2021-10-18T22:07:28,568][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testDynamicIndexSettingsAreClassified] after test
  2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:ccr:test' --tests "org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests.testDynamicIndexSettingsAreClassified" -Dtests.seed=9AB92C831026ACBD -Dtests.locale=ar-EG -Dtests.timezone=Africa/Windhoek -Druntime.java=11
  2> java.lang.AssertionError: setting [index.time_series.end_time] is not classified as replicated or not replicated
    Expected: is <true>
         but: was <false>
        at __randomizedtesting.SeedInfo.seed([9AB92C831026ACBD:2F05BBEA3A1C9FC7]:0)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.junit.Assert.assertThat(Assert.java:956)
        at org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests.testDynamicIndexSettingsAreClassified(TransportResumeFollowActionTests.java:227)
  1> [2021-10-18T22:07:28,617][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testValidation] before test
  1> [2021-10-18T22:07:29,608][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testValidation] after test
  1> [2021-10-18T22:07:29,631][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testFilter] before test
  1> [2021-10-18T22:07:29,632][INFO ][o.e.x.c.a.TransportResumeFollowActionTests] [testFilter] after test
  2> NOTE: leaving temporary files on disk at: /dev/shm/elastic+elasticsearch+pull-request+part-2/x-pack/plugin/ccr/build/testrun/test/temp/org.elasticsearch.xpack.ccr.action.TransportResumeFollowActionTests_9AB92C831026ACBD-001
  2> NOTE: test params are: codec=Asserting(Lucene90): {}, docValues:{}, maxPointsInLeafNode=803, maxMBSortInHeap=7.284619573576025, sim=Asserting(RandomSimilarity(queryNorm=false): {}), locale=ar-EG, timezone=Africa/Windhoek
  2> NOTE: Linux 5.10.0-9-cloud-amd64 amd64/Oracle Corporation 11.0.12 (64-bit)/cpus=32,threads=1,free=429254720,total=536870912
  2> NOTE: All tests run in this JVM: [ActivateAutoFollowPatternActionRequestTests, ShardChangesResponseTests, TransportResumeFollowActionTests]
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 -Dsun.jnu.encoding=UTF8
org.elasticsearch.index.TimeSeriesModeTests > testValidateTimestamp FAILED
    java.lang.IllegalArgumentException: [index.mode=time_series] requires [index.routing_path]
        at __randomizedtesting.SeedInfo.seed([84AD7654F32C4F66:426B8A67AE365071]:0)
        at org.elasticsearch.index.IndexMode$2.validateWithOtherSettings(IndexMode.java:70)
        at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:356)
        at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:350)
        at org.elasticsearch.common.settings.Setting.get(Setting.java:484)
        at org.elasticsearch.common.settings.Setting.get(Setting.java:455)
        at org.elasticsearch.common.settings.Setting.get(Setting.java:598)
        at org.elasticsearch.common.settings.AbstractScopedSettings.get(AbstractScopedSettings.java:709)
        at org.elasticsearch.index.IndexSettings.<init>(IndexSettings.java:581)
        at org.elasticsearch.index.IndexSettings.<init>(IndexSettings.java:561)
        at org.elasticsearch.index.mapper.MapperServiceTestCase.createIndexSettings(MapperServiceTestCase.java:212)
        at org.elasticsearch.index.mapper.MapperServiceTestCase.createMapperService(MapperServiceTestCase.java:183)
        at org.elasticsearch.index.mapper.MapperServiceTestCase.createMapperService(MapperServiceTestCase.java:171)
        at org.elasticsearch.index.mapper.MapperServiceTestCase.createMapperService(MapperServiceTestCase.java:140)
        at org.elasticsearch.index.TimeSeriesModeTests.testValidateTimestamp(TimeSeriesModeTests.java:83)
REPRODUCE WITH: ./gradlew ':server:test' --tests "org.elasticsearch.index.TimeSeriesModeTests.testValidateTimestamp" -Dtests.seed=84AD7654F32C4F66 -Dtests.locale=nb -Dtests.timezone=America/Marigot -Druntime.java=11

@weizijun
Copy link
Contributor Author

I'm trying to fix it. If I have any questions, I will ask your help.

* upstream/master: (209 commits)
  Enforce license expiration (elastic#79671)
  TSDB: Automatically add timestamp mapper (elastic#79136)
  [DOCS]  `_id` is required for bulk API's `update` action (elastic#79774)
  EQL: Add optional fields and limit joining keys on non-null values only (elastic#79677)
  [DOCS] Document range enrich policy (elastic#79607)
  [DOCS] Fix typos in 8.0 security migration (elastic#79802)
  Allow listing older repositories (elastic#78244)
  [ML] track inference model feature usage per node (elastic#79752)
  Remove IncrementalClusterStateWriter & related code (elastic#79738)
  Reuse previous indices lookup when possible (elastic#79004)
  Reduce merging in PersistedClusterStateService (elastic#79793)
  SQL: Adjust JDBC docs to use milliseconds for timeouts (elastic#79628)
  Remove endpoint for freezing indices (elastic#78918)
  [ML] add timeout parameter for DELETE trained_models API (elastic#79739)
  [ML] wait for .ml-state-write alias to be readable (elastic#79731)
  [Docs] Update edgengram-tokenizer.asciidoc (elastic#79577)
  [Test][Transform] fix UpdateTransformActionRequestTests failure (elastic#79787)
  Limit CS Update Task Description Size (elastic#79443)
  Apply the reader wrapper on can_match source (elastic#78988)
  [DOCS] Adds new transform limitation item and a note to the tutorial (elastic#79479)
  ...

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/IndexMode.java
#	server/src/test/java/org/elasticsearch/index/TimeSeriesModeTests.java
@weizijun
Copy link
Contributor Author

weizijun commented Nov 2, 2021

@elasticmachine test this please

@imotov I fixed the bwc failed tests. The reason for the failure is that the latest version is 8.1.0, and the settings skip version is before 8.0.0. I fixed it.

@imotov
Copy link
Contributor

imotov commented Nov 2, 2021

@elasticmachine test this please

if (value >= endTime) {
throw new IllegalArgumentException("time series index @timestamp value [" + value + "] must be smaller than " + endTime);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I wrote this, but on further reading, I think maybe it makes sense to just move it all to DataStreamTimestampFieldMapper. I thought I was being clever, but all of this code should be able to run just fine from there and it'd be nice that DateFieldMapper doesn't need to care about the validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with moving this logic to DataStreamTimestampFieldMapper, since there there is already validation for the @timestamp field (whether it exists and whether a single value has been specified). And I think at least initially, tsdb will work only with data streams.

@martijnvg martijnvg self-requested a review November 11, 2021 11:14
/**
* End time of the time_series index.
*/
private volatile Instant timeSeriesEndTime;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that right now these can't be null so all indices are going to validate the range. I wonder if we can convert these to null for the defaults so we disable the validation when it isn't set.

@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

@elasticmachine, test this please

@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

@elasticmachine, test this please

1 similar comment
@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

@elasticmachine, test this please

@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

@elasticmachine, test this please

@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

All checks have passed

Nice

@nik9000 nik9000 merged commit 8b2019a into elastic:master Nov 11, 2021
@nik9000
Copy link
Member

nik9000 commented Nov 11, 2021

Thanks for all that back and for @weizijun ! I've merged master in and merged it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
external-contributor Pull request authored by a developer outside the Elasticsearch team :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.0.0-rc1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants