-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fold EngineDiskUtils into Store, for better lock semantics #29156
Conversation
Pinging @elastic/es-distributed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minors Looks good @bleskes
/** | ||
* This class contains utility methods for mutating the shard lucene index and translog as a preparation to be opened. | ||
*/ | ||
public abstract class EngineDiskUtils { | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class is empty now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woops. will remove.
return userData; | ||
} | ||
|
||
private IndexWriter newIndexWriter(final boolean create, final Directory dir) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just pass an open mode instead of the redundant create flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
if (translogUUID.equals(getUserData(writer).get(Translog.TRANSLOG_UUID_KEY))) { | ||
throw new IllegalArgumentException("a new translog uuid can't be equal to existing one. got [" + translogUUID + "]"); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra newline?
/** | ||
* Force bakes the given translog generation as recovery information in the lucene index. | ||
*/ | ||
public void associateIndexWithNewTranslog(final String translogUUID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this override the existing one? what is the usecase for associating a new one? can you document it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the "ugly" part of the PR. What we're doing is create an empty translog and bake it's new uuid into an existing lucene index. This is done during snapshot restore and file based peer recovery. Before it was bundled nicely in one method but now it has be done out of store and the uuid passed in here. I will add this to the java docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm seems dangerous, can we assert stuff here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I tried to come up with some check but couldn't. I'll give it some more thought tmr morning.
@@ -93,7 +93,7 @@ which returns something similar to: | |||
{ | |||
"commit" : { | |||
"id" : "3M3zkw2GHMo2Y4h4/KFKCg==", | |||
"generation" : 3, | |||
"generation" : 4, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm why did this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To create an empty shard, we now:
- Create an empty index via
Store.createEmpty
- Create an empty translog
- Associate it with the index using
Store. associateIndexWithNewTranslog
This creates one more commit compared to how it used to be. I can change createEmpty to require a translogUUID as a parameter , if you prefer. I'm OK either way.
|
||
|
||
/** | ||
* Marks an existing lucene index and marks it with a new history uuid. This should be called on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This java docs is stale. I'll work on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
* This is used to make sure no existing shard will recovery from this index using ops based recovery. | ||
*/ | ||
public void bootstrapNewHistory() | ||
throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we may not a new line here.
* created and the existing lucene index needs should be changed to use it. | ||
*/ | ||
public void associateIndexWithNewTranslog(final String translogUUID) | ||
throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same here - we may not need this new line.
…tore # Conflicts: # server/src/test/java/org/elasticsearch/index/engine/EngineDiskUtilsTests.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* master: (40 commits) Do not optimize append-only if seen normal op with higher seqno (elastic#28787) [test] packaging: gradle tasks for groovy tests (elastic#29046) Prune only gc deletes below local checkpoint (elastic#28790) remove testUnassignedShardAndEmptyNodesInRoutingTable elastic#28745: remove extra option in the composite rest tests Fold EngineDiskUtils into Store, for better lock semantics (elastic#29156) Add file permissions checks to precommit task Remove execute mode bit from source files Optimize the composite aggregation for match_all and range queries (elastic#28745) [Docs] Add rank_eval size parameter k (elastic#29218) [DOCS] Remove ignore_z_value parameter link Docs: Update docs/index_.asciidoc (elastic#29172) Docs: Link C++ client lib elasticlient (elastic#28949) [DOCS] Unregister repository instead of deleting it (elastic#29206) Docs: HighLevelRestClient#multiSearch (elastic#29144) Add Z value support to geo_shape Remove type casts in logging in server component (elastic#28807) Change BroadcastResponse from ToXContentFragment to ToXContentObject (elastic#28878) REST : Split `RestUpgradeAction` into two actions (elastic#29124) Add error file docs to important settings ...
* es/master: (22 commits) Fix building Javadoc JARs on JDK for client JARs (#29274) Require JDK 10 to build Elasticsearch (#29174) Decouple NamedXContentRegistry from ElasticsearchException (#29253) Docs: Update generating test coverage reports (#29255) [TEST] Fix issue with HttpInfo passed invalid parameter Remove all dependencies from XContentBuilder (#29225) Fix sporadic failure in CompositeValuesCollectorQueueTests Propagate ignore_unmapped to inner_hits (#29261) TEST: Increase timeout for testPrimaryReplicaResyncFailed REST client: hosts marked dead for the first time should not be immediately retried (#29230) TEST: Use different translog dir for a new engine Make SearchStats implement Writeable (#29258) [Docs] Spelling and grammar changes to reindex.asciidoc (#29232) Do not optimize append-only if seen normal op with higher seqno (#28787) [test] packaging: gradle tasks for groovy tests (#29046) Prune only gc deletes below local checkpoint (#28790) remove testUnassignedShardAndEmptyNodesInRoutingTable #28745: remove extra option in the composite rest tests Fold EngineDiskUtils into Store, for better lock semantics (#29156) Add file permissions checks to precommit task ...
As follow up to #28245 , this PR removes the logic for selecting the right start commit from the Engine constructor in favor of explicitly trimming them in the Store, before the engine is opened. This makes the constructor in engine follow standard Lucene semantics and use the last commit. Relates #28245 Relates #29156
#28245 has introduced the utility class`EngineDiskUtils` with a set of methods to prepare/change translog and lucene commit points. That util class bundled everything that's needed to create and empty shard, bootstrap a shard from a lucene index that was just restored etc. In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That would sometime fail due to concurrent shard store fetching or other short activities that require the files not to be changed while they read from them. Since there is no way to wait on the index writer lock, the `Store` class has other locks to make sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the translog manipulations on their own.
As follow up to #28245 , this PR removes the logic for selecting the right start commit from the Engine constructor in favor of explicitly trimming them in the Store, before the engine is opened. This makes the constructor in engine follow standard Lucene semantics and use the last commit. Relates #28245 Relates #29156
#28245 has introduced the utility class
EngineDiskUtils
with a set of methods to prepare/change translog and lucene commit points. That util class bundled everything that's needed to create and empty shard, bootstrap a shard from a lucene index that was just restored etc.In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That would sometime fail due to concurrent shard store fetching or other short activities that require the files not to be changed while they read from them.
Since there is no way to wait on the index writer lock, the
Store
class has other locks to make sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this PR foldsEngineDiskUtils
intoStore
. Sadly this comes with a price - the store class doesn't and shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the translog manipulations on their own.