NEP-18178 migration across environments (POC) #32
https://jobready.atlassian.net/browse/NEP-18178
DO NOT MERGE .. I am splitting this work into 2 parts to clean up.
https://jobready.atlassian.net/browse/NEP-18450
https://jobready.atlassian.net/browse/NEP-18548
WORK COMPLETED in #37
Transferring Dashboards across Environments (Proof of Concept)
In this doc, we discuss how to transfer dashboards across Superset hosting environments, with the goal of working towards an API call that automates the process.
Background
A common practice is to set up infrastructure to deploy multiple Superset environments. For example, a simple setup might be:
For the above example, the Superset staging env often holds connections to staging databases, and the Superset production env will hold connections to the production databases.
Where the database schema structure for the local dev, staging, and production databases is exactly the same, dashboards can be transferred across Superset hosting environments.
It requires some manual updating of the exported yaml files before importing into the target environment. Also required is some understanding of the underlying dashboard export structure and how the object UUIDs work and relate to each other, especially in the context of databases and datasets.
Dashboard Export/Import within same Environment
This is a fairly straightforward process.
There are multiple methods for exporting a dashboard.
Each method will result in a zip file that contains a set of yaml files, as per the list below, which is an export of the Sales dashboard from the default example dashboards.
Each of the above yaml files holds UUID values for the primary object and any related objects.
Example of the database yml file.
If we grep the databases/examples.yaml we can see the UUID of the database.
Now if we look at the UUID values in the datasets, we will see both the dataset UUID and the reference to the database UUID.
If the above zip file was imported as is to the same Superset environment, all UUIDs would already exist in Superset, so these objects would be found and updated with the imported zip data.
If the above zip file was imported to a different target Superset environment, it would fail as there would be no matching database UUID entry in that target Superset environment.
Migrate a dashboard across to a different Superset Environment
Given the above knowledge, we can now think about how to migrate dashboards between Superset environments.
If we took an export from the Staging env and an export from the Production env and compared the database UUIDs and dataset UUIDs, we would see that the UUIDs are unique to each environment (Superset instance), and that the database connection string is also unique to each env.
Given we have a request to 'transfer' a dashboard across to a different environment, say Staging to Production, how would we then proceed?
This works on the condition that the databases in Staging and Production have exactly the same schema structure. From the above discussion on UUIDs, you can see that if we want to import a Staging dashboard export into the Production environment we will need to perform the following steps:
The process above assumes that whoever is migrating the dashboard has a copy of the target database yml files from the databases/ directory, so that in steps 3 and 4 we can replace the staging database yaml with the production one.
Requirements
For the above process we need to know or have access to the following:
Currently the Superset Acumania repo holds the copies of the Dashboard backups in extracted yaml file format.
https://github.com/rdytech/superset-acumania/tree/develop
The ideal process for creating a dashboard is to create the initial template version on the staging (or edge) environment, then back up that dashboard into the acumania repo.
As this template can be considered the source dashboard, it makes logical sense to use that backup to replicate into a different environment.
Potentially we could store all database yml files in the superset-acumania repo, where we currently store the dashboard backups.
Then we could use that list as part of the script/class calls to Export -> manipulate yaml files -> Import to the new env.
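The eventual automation could chain those steps around the Superset REST API's dashboard export and import endpoints. A rough, transport-agnostic sketch (the endpoint paths follow Superset's documented /api/v1/dashboard/export/ and /api/v1/dashboard/import/ routes; the function name and injected callables are assumptions):

```python
def migrate_dashboard(dashboard_id, source_host, target_host,
                      edit_zip, http_get, http_post):
    """Export a dashboard from the source env, hand the zip bytes to
    edit_zip() for the yaml manipulation described above, then import
    the edited zip into the target env.

    http_get/http_post are injected (e.g. thin wrappers over an HTTP
    client that add auth headers) so this sketch stays transport-agnostic."""
    # The q parameter is a rison-encoded list of dashboard ids.
    export_url = f"{source_host}/api/v1/dashboard/export/?q=!({dashboard_id})"
    zip_bytes = http_get(export_url)
    edited_zip = edit_zip(zip_bytes)
    # The import endpoint expects the zip as multipart form data.
    import_url = f"{target_host}/api/v1/dashboard/import/"
    return http_post(import_url, files={"formData": edited_zip},
                     data={"overwrite": "true"})
```

Injecting the HTTP callables keeps the orchestration testable without a live Superset instance; the real script would supply authenticated wrappers.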
Further Work:
Proof of Concept
Some initial coding has been done to prove the direction works.
Needs another 3pt to finish off and clean up specs.
Note .. jb has been doing this process manually to get the boards migrating across environments.
Example of migrating Modules dashboard to a Pool1 client.
Gotchas!
Migrating a Dashboard ONCE .. to a new target env, database, and schema will result in a new Dashboard being created in that target env.
Migrating the same Dashboard a second time, to the same target env, database, BUT different schema will NOT create a new Dashboard.
It will attempt to update the same Dashboard, as the UUID for the dashboard has not changed.
It will also NOT change any of the Datasets to the new schema; this looks to be a limitation of the import process.
This may lead to some confusing results.
References
Some more helpful references relating to cross-environment workflows.