RFC: Ability to tag migrations and execute migrations by tag. #1044

Closed
rquadling opened this issue Feb 25, 2017 · 10 comments

@rquadling
Collaborator

rquadling commented Feb 25, 2017

Scenario:

Multi-developer team, developing branches of code and migrations for a product.

Each branch will have multiple migrations created over the time that the branch is in development. Some of these migrations will need to be run before the code is deployed, others afterwards. Not all migrations are known prior to the development of the branch, so they end up interleaved.

e.g.

AddNewTable (run before code deploy)
IntroduceNewColumnForRenaming (run before code deploy)
AddTriggersToPopulateNewColumn (run before code deploy)
DropTriggersAndOldColumn (run after code deploy)
AddSecondNewTable (run before code deploy)

Whilst it is possible to manually rename the migration versions, this is still a problem for automatic deployment tools, in that the migrations have to be manually processed within the deployment process.

In addition to migrating, rolling back migrations in the right order in terms of code deployment is quite complicated.

The proposal is to incorporate migration tagging, allowing you to tag migrations when they are created and filter migrations when they are executed.

Migration creation
Tagging would start with the team agreeing upon a default set of tag values. These would be added to the phinx.yml (or equivalent) and saved in the project repo for all developers to share. There will be 2 types of tag sets: SingleTag (no more than one tag may be selected from this set), MultiTag (any number of tags may be selected).

When you create a migration, you can supply an optional --tag parameter, and supply any number of tags.

Each tag will be validated against the tag definitions such that you cannot choose more than 1 tag from a SingleTag set, and that all tags are valid.

Any invalid tags will abort the creation of the migration.
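The validation rule above could be sketched as follows. This is illustration only, written in Python for brevity (Phinx itself is PHP). The `TAG_SETS` layout, the set names `phase` and `kind`, and the `validate_tags` helper are all assumptions, not an existing Phinx API or config format.

```python
# Hypothetical tag definitions, as they might be loaded from phinx.yml.
# "single" = SingleTag set (at most one tag), "multi" = MultiTag set.
TAG_SETS = {
    "phase": {"type": "single", "tags": {"BEFORE", "AFTER"}},
    "kind": {"type": "multi", "tags": {"CREATE", "PERMISSIONS", "VIEWS"}},
}


def validate_tags(tags, tag_sets=TAG_SETS):
    """Return the tags unchanged if valid, otherwise raise ValueError."""
    known = set().union(*(s["tags"] for s in tag_sets.values()))
    unknown = [t for t in tags if t not in known]
    if unknown:
        # Any invalid tag aborts creation of the migration.
        raise ValueError(f"Unknown tags: {unknown}")
    for name, tag_set in tag_sets.items():
        chosen = [t for t in tags if t in tag_set["tags"]]
        if tag_set["type"] == "single" and len(chosen) > 1:
            raise ValueError(f"At most one tag allowed from '{name}': {chosen}")
    return list(tags)
```

With these definitions, `validate_tags(["BEFORE", "CREATE", "VIEWS"])` passes, while `validate_tags(["BEFORE", "AFTER"])` raises because both tags come from the same SingleTag set.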

Once validated, the tags would be presented to the migration template creation logic, similarly to the migration name and version.

The expected route for the template would be to add the tags as a comma separated value stored in a new migration class level constant TAGS.

For those using their own migration templates and/or migration template creation classes, then it will be their responsibility to upgrade their templates and classes to process the supplied tags.

Migration execution
Currently, any unexecuted migration is a target for migrating. An additional --tag parameter will be used to filter migrations down to those that have all of the supplied tags. More tags means a more specific filter. At the time of writing, I am not sure if an option should exist to allow the tags to be treated as OR rather than AND. My personal opinion is that this feature only makes sense with an AND approach.
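To make the AND semantics concrete, here is a minimal sketch in Python (again not Phinx code). The `Migration` record, the `filter_by_tags` helper, and the migration names `GrantReportingAccess` and `DropOldColumn` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration:
    version: int                   # creation timestamp, like a Phinx version
    name: str
    tags: frozenset = frozenset()  # tags assigned when the migration was created


def filter_by_tags(migrations, wanted):
    """Keep only migrations that carry every wanted tag (AND semantics)."""
    wanted = frozenset(wanted)
    return [m for m in migrations if wanted <= m.tags]


migrations = [
    Migration(20170225090000, "AddNewTable", frozenset({"BEFORE", "CREATE"})),
    Migration(20170225091000, "GrantReportingAccess", frozenset({"BEFORE", "PERMISSIONS"})),
    Migration(20170225092000, "DropOldColumn", frozenset({"AFTER"})),
]

selected = filter_by_tags(migrations, ["BEFORE", "CREATE"])
print([m.name for m in selected])  # ['AddNewTable']
```

An OR variant would replace the subset test with a non-empty intersection (`wanted & m.tags`), which is why supplying more tags narrows the result under AND but would widen it under OR.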

So, as an example, say you have several new tables and permissions to migrate, adding these commands to your deployment pipeline would allow you to define the migration order in a consistent way:

phinx breakpoint
phinx migrate --tag BEFORE CREATE
phinx migrate --tag BEFORE PERMISSIONS
phinx migrate --tag BEFORE VIEWS

Now whilst this will control the order of migrations being executed, the rollback logic would currently fail, and so this RFC is based upon the acceptance of #926. Assuming that request is merged, migration ordering is based upon creation date, grouped by tags, and rollback ordering is based upon reverse execution time. At this stage, filtering the rollback by tags could lead to all sorts of inconsistencies, so unless enough people want the feature, I would not incorporate rollback filtering by tags.

By using the breakpoint, if there is anything wrong with the deployed migrations, you can roll back to the beginning of the deployment process significantly more easily.
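The ordering described above (creation order within each tagged step, rollback in reverse execution order) can be illustrated with a small Python sketch. It assumes the rollback behaviour requested in #926; `run_pipeline`, `rollback_order`, and the version numbers are hypothetical, and the migration names come from the scenario at the top of this issue.

```python
def run_pipeline(pending, steps):
    """Simulate a sequence of `migrate --tag ...` calls.

    pending: list of (version, name, tags) tuples for unexecuted migrations.
    steps:   list of tag collections, one per migrate call.
    Returns the migration names in execution order.
    """
    remaining = sorted(pending, key=lambda m: m[0])  # creation order
    executed = []
    for wanted in steps:
        wanted = set(wanted)
        batch = [m for m in remaining if wanted <= m[2]]
        executed.extend(batch)
        remaining = [m for m in remaining if m not in batch]
    return [name for _, name, _ in executed]


def rollback_order(executed_names):
    """Rollback is the reverse of the execution order."""
    return list(reversed(executed_names))


pending = [
    (1, "AddNewTable", {"BEFORE"}),
    (2, "IntroduceNewColumnForRenaming", {"BEFORE"}),
    (3, "AddTriggersToPopulateNewColumn", {"BEFORE"}),
    (4, "DropTriggersAndOldColumn", {"AFTER"}),
    (5, "AddSecondNewTable", {"BEFORE"}),
]

order = run_pipeline(pending, [{"BEFORE"}, {"AFTER"}])
print(order)
print(rollback_order(order))
```

Here `AddSecondNewTable` runs before `DropTriggersAndOldColumn` even though it was created later, and the rollback starts with `DropTriggersAndOldColumn`, so every AFTER migration is undone before any BEFORE migration.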

Migration status
When running phinx status, all tags would be displayed.

@rquadling
Collaborator Author

@coatesap, @dignat, @hkwak, @joeHickson, @pedanticantic, @antriver

Any input you may have on this RFC would be greatly appreciated.

@joeHickson

I like the AND filter on multiple tags - in your BEFORE [CREATE|PERMISSIONS|VIEWS] scenario you could still just run migrate --tag BEFORE if the order of the second tag is irrelevant.

This process could run migrations out of order which may prove problematic for those running large releases.
e.g. dev 1 commits 1 BEFORE, 1 AFTER, dev 2 commits 2 BEFORE, 2 AFTER - both are deployed in the same release by running --tag BEFORE then --tag AFTER, resulting in the migrations being run in the order 1B, 2B, 1A, 2A.
Given the existing functionality would still be the default, this is just something to be aware of if switching to use tags.

Rollback by timestamp feels logical but would force the rollback of a whole release.
Using the example above, it would not be possible to roll back just dev 1's or dev 2's code in the release because they are now interleaved.
As the code would likely need to be reverted as one, this is unlikely to be a problem - you would have to pull the whole deployment and roll back all migrations regardless of which developer's commit caused the problem.

@coatesap

As this is trying to solve the issue around which migrations run before and after a code deployment, when using blue-green deployments, here's a completely different alternative for discussion!...

What if migrations were normally run by the application itself, in the front controller, for instance? This is the preferred method of Flyway (a Java migration tool), and it means you can never forget to run a migration, or end up with an application and database out of sync.

This would work great for fast migrations such as a column rename, and would save the developer a huge amount of time and complexity building before/after migrations, adding triggers and temporary columns, etc. Where slow migrations are involved (adding a column to a very large table) and running them as part of the application bootstrap process would be undesirable, the option should still be there for a developer to run a migration remotely - which we can do anyway with Phinx.

This approach also means that the CI pipeline would never need to get directly involved with database migrations.

@rquadling
Collaborator Author

@joeHickson The use of tags would have to be something the dev team would have to be made fully aware of and would be incorporated into any CI/CD setup. Hopefully, peer-review would catch any untagged migrations. So, as such, if the team decides to use multiple tags, then multiple tags would have to be used consistently. I would envisage that a tagged migration will only be executed with a tagged migrate command.

With multiple features having their migrations interleaved (regardless of tagging), this is something that is already an issue. Reordering/renumbering migrations upon merging into master could be an option here - though that messes up the dev with the migrations already executed. No obvious answer to this except to release 1 branch at a time I think.

Untagged migrations would need to be executed at some stage. So running phinx migrate without any tags would only execute migrations that are not tagged. If they are executed as part of the deployment process, then it would still be peer review that decides if an untagged migration should be tagged.
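The selection rule proposed here (a tagged migration only runs under a tagged migrate command, and an untagged migrate only runs untagged migrations) might look like this in a Python sketch. `select_pending` and the migration name `FixTypoInComment` are hypothetical; nothing here is existing Phinx behaviour.

```python
def select_pending(pending, wanted=()):
    """Choose which pending migrations a migrate call would run.

    pending: dict mapping migration name -> set of tags (insertion order
             stands in for creation order).
    wanted:  tags passed via --tag; empty means a plain `phinx migrate`.
    """
    wanted = set(wanted)
    if not wanted:
        # Plain migrate: only migrations with no tags at all.
        return [name for name, tags in pending.items() if not tags]
    # Tagged migrate: only migrations carrying every requested tag.
    return [name for name, tags in pending.items() if wanted <= tags]


pending = {
    "AddNewTable": {"BEFORE"},
    "FixTypoInComment": set(),
    "DropTriggersAndOldColumn": {"AFTER"},
}

print(select_pending(pending))              # ['FixTypoInComment']
print(select_pending(pending, ["BEFORE"]))  # ['AddNewTable']
```

Under this rule a pipeline that runs only tagged steps would never execute `FixTypoInComment`, which is the gap that peer review is expected to catch.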

@coatesap Running migrations as part of the front controller would leave no room for them to fail, as the code would already be deployed. If migrations are a separate part of the deployment process, the successful execution of all appropriate BEFORE migrations would allow code deployment, followed by all appropriate AFTER migrations, and this could be automated considerably.

Whilst this doesn't take away running manual migrations, having tags describes the developer's intent as to when the migration should be run, in a consistent way, such that the entire team can review the code with less effort.

@ghost

ghost commented Mar 10, 2017

A few questions:

  • Do the possible tags need to be written into phinx.yml? Can't they simply be defined on-the-fly when creating a new Migration?
  • Maybe a command to tag a Migration after it has been created would come in handy (although, now that I think of it, it would involve altering the Migration code, so I guess it's impossible - can still be done manually though).
  • It seems to me that this is somehow related to the concept of migrating/rolling back a single Migration by name, which afaik is not currently possible (but was asked for at least in ability to rollback an individual migration without migrations that have... #451, probably other issues too). With the feature described here, it would be possible to tag a single migration in order to execute only that migration, but it wouldn't be possible to roll back that single migration by tag (for the reasons described above). Not sure it's relevant, just thought I'd bring it up.

@rquadling
Collaborator Author

@daniel-gomes-sociomantic The idea behind the tags is to be able to control the order of execution better. With v0.8.0, the rollback is in reverse execution order. What we now want is to be able to control the execution order so that it is not based upon when the migration was created. By having even the simplest of tagging (BEFORE_DEPLOY/AFTER_DEPLOY), you gain the separation needed to control the migrations such that if you DO need to roll back, the reverse execution order is accurate (i.e. all the AFTER_DEPLOYs get rolled back first).

For the team I work with, it is about being able to (for example) order the migrations based upon the intent of the migration (BEFORE/AFTER is the big one so with 1 branch of code, you can run all the before migrations, deploy the code, run all the after migrations without needing 3 separate branches).

Besides BEFORE/AFTER, it would also be useful to group the migrations related to table creation, column manipulation, index creation, fk creation, etc., so that a rollback gets them all in the right order too.

By using migration templates that include the tags, then there is very little needed for a developer to remember. They need a new table migration, then they create a migration using the CreateTable template. If they then create a load more migrations and then another CreateTable migration, all the create table migrations can be grouped together and executed in sequence. And if they are the first migrations to be run, they will be the last migrations rolled back.

@winkbrace
Contributor

I would really like this feature. We currently have a work-around implemented with 2 configuration files and 2 separate migration directories for the pre- and post-deploy migrations. It's a little cumbersome to always have to specify the configuration file.

This will also solve our problem (as referenced) of being able to re-run a single migration and not all migrations since.

@josegonzalez
Member

@winkbrace is this something you would be interested in working on?

@rquadling
Collaborator Author

There is a branch of work allowing devs to have greater control over the naming of migrations. For example, we want to enforce that the current git branch ID (XXX-nnn) is prefixed to the migration name.

It may be worth combining tagging with this feature in some way to allow the tag to be included in the name. So, in my case XXX-nnn-BEFORE-AddNewStuff / XXX-nnn-AFTER-RemoveOldStuff.

@othercorey
Member

Closing due to lack of activity or out of scope. If there is new interest, please open a PR.


6 participants