fix(upgrade): Improving NoCodeUpgrade logic to account for Bootstrap logic #3301
In a recent version of DataHub, we introduced a BootstrapManager that is responsible for executing steps when GMS boots up. These steps include ingesting the default set of policies and the stock data platforms. This process does not support upgrading from a version of DataHub that predates the NoCodeUpgrade: even if the table is created and the bootstrap succeeds, the service will not qualify for the upgrade, because that table already has rows.
The solution is to change the NoCodeUpgrade qualification check so that it tests whether the number of rows not created by the system user is 0. As part of this, we are also standardizing the concept of a "system actor" as an actor with the primary key identifier "urn:li:corpuser:__datahub_system". This identifier was chosen to avoid future naming conflicts and to reuse the existing concept of a corpuser. Previously, we had multiple notions of a system actor.
Going forward, we will standardize on
urn:li:corpuser:__datahub_system
as the official system actor. This means that we may need an upgrade for companies that already have the previous system actor names in their DBs. We will consider whether this is required (it is not yet) and publish an upgrade in the future if necessary. For now, this should solve the immediate problem of upgrading to the latest version from an old version of DataHub.
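
To illustrate the revised qualification check, here is a minimal sketch in Python using an in-memory SQLite database. It is not the code from this PR: the table name `metadata_aspect_v2`, the `createdby` column, the sample URNs, and the helper names `qualifies_for_upgrade` and `demo` are assumptions made for illustration. The idea it shows is that rows written by the system actor during bootstrap no longer block the upgrade, while any row written by another actor does.

```python
import sqlite3

# The standardized system actor named in this description.
SYSTEM_ACTOR = "urn:li:corpuser:__datahub_system"


def qualifies_for_upgrade(conn, table="metadata_aspect_v2"):
    """Return True when no rows were created by anyone other than the system actor.

    The table and column names are assumptions for this sketch.
    """
    (non_system_rows,) = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE createdby != ?",
        (SYSTEM_ACTOR,),
    ).fetchone()
    return non_system_rows == 0


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metadata_aspect_v2 ("
        "urn TEXT NOT NULL, aspect TEXT NOT NULL, createdby TEXT NOT NULL)"
    )

    # A row ingested at boot time by the system actor (hypothetical policy URN).
    conn.execute(
        "INSERT INTO metadata_aspect_v2 VALUES (?, ?, ?)",
        ("urn:li:dataHubPolicy:0", "dataHubPolicyInfo", SYSTEM_ACTOR),
    )
    only_bootstrap_rows = qualifies_for_upgrade(conn)

    # A row written by a regular user (hypothetical user URN).
    conn.execute(
        "INSERT INTO metadata_aspect_v2 VALUES (?, ?, ?)",
        ("urn:li:dataset:example", "ownership", "urn:li:corpuser:jdoe"),
    )
    with_user_rows = qualifies_for_upgrade(conn)

    conn.close()
    return only_bootstrap_rows, with_user_rows


if __name__ == "__main__":
    print(demo())
```

With only bootstrap rows present the sketch reports that the instance qualifies for the upgrade; once a non-system row exists, it does not.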