We need to start persisting the progress of the migration as it runs so that other Kibana instances have a way to know when a migration is not actually in progress, but has failed. There are a number of ways we could do this, but I wanted to write down what I think we should do:
The current migration process uses the ability to create the target index as a sort of "lock" signaling that it is the only Kibana instance currently migrating the index. I don't think this is sufficient if we want to track progress, do retries, and avoid relying on a single shared clock. I think we now need to store the migration progress in a new temporary index.
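As a rough illustration, the progress document stored in that index might look something like the following. This is a sketch only: the proposal calls for a `node_id`, an attempt count, and an error field, but the exact names here are assumptions.

```ts
// Hypothetical shape of the progress document; field names are assumptions.
interface MigrationProgress {
  node_id: string | null; // instance currently holding the "lock"; unset when a migration fails
  attempt: number;        // incremented on each retry, capped (e.g. at 10)
  error?: string;         // present once a migration attempt has failed
}
```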
Rather than using the creation of the migration target index to identify which Kibana instance will run the migration, we could use the `.kibana_migration_{hash(kibana.index)}/_doc/progress` document as our "lock" (a sketch of this acquisition flow follows the list). Each instance will:
1. Check if a migration is necessary at startup
2. If no migration is necessary, continue with startup
3. Else, put the progress index template
4. Attempt to create the progress document (using `op_type=create`) with the right `node_id` and an attempt count of 1
5. If the create was successful, this instance will run the migration
6. If the document already exists:
    i. Read the progress document (using `preference=_primary`)
    ii. If the document includes an error:
        a. Log the error
        b. If the attempt count is 10 (configurable?):
            - log an error instructing the user to delete the `.kibana_migration_{hash(kibana.index)}` index to resume
            - abort startup
        c. Else, attempt to update the document (using the `version` param) with our `node_id`, incrementing the attempt count
        d. The instance that successfully updates the document runs the migration
    iii. If the document does NOT include an error:
        a. Wait 30 seconds and fetch the document again (using `preference=_primary`)
        b. If the version is greater than before, go to step 6.iii.a
        c. If the version is not greater than before:
            - log an error naming the `node_id` that failed to show signs of life
            - go to step 6.ii.b
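Here is a minimal sketch of that acquisition flow, assuming the `@elastic/elasticsearch` JavaScript client (v7-style responses) and the hypothetical field names from the earlier sketch. Note that `preference=_primary` was removed in Elasticsearch 7.x, and the internal `version` compare-and-swap was superseded by `if_seq_no`/`if_primary_term`; the code mirrors the proposal as written rather than being a definitive implementation.

```ts
import { Client } from '@elastic/elasticsearch';

// Placeholder for `.kibana_migration_{hash(kibana.index)}`.
const progressIndex = '.kibana_migration_abc123';

// Steps 4-5: `create` fails with a 409 if the document already exists,
// which is what lets the document act as a lock.
async function tryAcquire(client: Client, nodeId: string): Promise<boolean> {
  try {
    await client.create({
      index: progressIndex,
      id: 'progress',
      body: { node_id: nodeId, attempt: 1 },
    });
    return true; // we created the document, so we run the migration
  } catch (err: any) {
    if (err.statusCode === 409) return false; // someone else holds the lock
    throw err;
  }
}

// Step 6: the document already exists. Read it, and either retry a failed
// migration or watch the document's version for signs of life.
async function handleExistingLock(client: Client, nodeId: string): Promise<boolean> {
  let lastVersion = -1;
  for (;;) {
    const { body } = await client.get({
      index: progressIndex,
      id: 'progress',
      preference: '_primary', // removed in Elasticsearch 7.x; see note above
    });
    if (body._source.error) {
      console.error('previous migration failed:', body._source.error);
      return takeOver(client, nodeId, body); // step 6.ii
    }
    if (lastVersion !== -1 && body._version <= lastVersion) {
      console.error(`node ${body._source.node_id} showed no signs of life`);
      return takeOver(client, nodeId, body); // step 6.iii.c -> 6.ii.b
    }
    lastVersion = body._version;
    await new Promise((resolve) => setTimeout(resolve, 30_000)); // step 6.iii.a
  }
}

// Steps 6.ii.b-d: abort after 10 attempts, otherwise race to claim the
// document; whoever wins the version-conflict race runs the migration.
async function takeOver(client: Client, nodeId: string, doc: any): Promise<boolean> {
  if (doc._source.attempt >= 10) {
    throw new Error(`migration failed 10 times; delete ${progressIndex} to resume`);
  }
  try {
    await client.index({
      index: progressIndex,
      id: 'progress',
      version: doc._version, // compare-and-swap on the internal version
      body: { ...doc._source, node_id: nodeId, attempt: doc._source.attempt + 1 },
    });
    return true;
  } catch (err: any) {
    if (err.statusCode === 409) return false; // another instance won the race
    throw err;
  }
}
```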
The Kibana instance running the migration will (see the sketch after this list):
1. Reindex the progress document every 15 seconds, causing its version to increment
2. Re-check if a migration is necessary
    - If a migration is no longer necessary, it's possible one completed between when we first checked and when we acquired the progress document
3. If a migration is necessary, delete old versions of the target index
4. Recreate the target index
5. Migrate documents; before each write:
    - fetch the progress document and verify the `node_id`
    - if the `node_id` does not match, abort startup with an error
6. If an error occurs during the migration:
    - fetch the progress document and verify the `node_id`
    - update the progress document (using the `version` param) to include the error and unset the `node_id`
    - abort startup
7. When the migration is complete and the target index has been refreshed, delete the progress index
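A sketch of the owner's side, under the same assumptions as above (hypothetical names, `@elastic/elasticsearch` v7-style client, and the internal-`version` compare-and-swap that newer Elasticsearch replaces with `if_seq_no`/`if_primary_term`):

```ts
// Step 1: reindexing the document every 15 seconds bumps its version,
// which is exactly what waiting instances watch for as a sign of life.
function startHeartbeat(client: Client, nodeId: string, attempt: number) {
  return setInterval(() => {
    client
      .index({
        index: progressIndex,
        id: 'progress',
        body: { node_id: nodeId, attempt },
      })
      .catch((err) => console.error('failed to heartbeat progress document', err));
  }, 15_000);
}

// Step 5: before each write, confirm we still own the lock.
async function assertOwnership(client: Client, nodeId: string) {
  const { body } = await client.get({ index: progressIndex, id: 'progress' });
  if (body._source.node_id !== nodeId) {
    throw new Error('another instance took over the migration; aborting startup');
  }
  return body;
}

// Step 6: record the error and unset node_id so that waiting instances
// can retry immediately instead of waiting out the liveness timeout.
async function recordFailure(client: Client, nodeId: string, error: Error) {
  const doc = await assertOwnership(client, nodeId);
  await client.index({
    index: progressIndex,
    id: 'progress',
    version: doc._version,
    body: { ...doc._source, node_id: null, error: error.message },
  });
}

// Step 7: on success, the temporary progress index is no longer needed.
async function finishMigration(client: Client) {
  await client.indices.delete({ index: progressIndex });
}
```

One design note: the 15-second heartbeat and the 30-second poll give waiting instances roughly two missed heartbeats before declaring the owner dead, without relying on the instances' clocks agreeing with one another.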