Migrations are a right royal pain in the arse #482
Comments
@ErisDS Having been recently in there a fair bit, I should have a good perspective on some of those changes. Just need to compile my thoughts. Off the bat though, I think part of simplifying the process at least (it wouldn't be simplifying the amount of code we write) might be having a merged database schema per version, so that each time a new instance is spun up it doesn't have to start at ground zero and run through all the migrations/changes to get to the current version.
That's what I'm trying to get at, although not very eloquently. Even though migrations as we have them are a pretty standard model (I think it is used in Rails, and is essentially what Liquibase does), it just seems backwards to start at ground zero and work up, when it's where we want to be that is the known thing, and getting there that is the difficult part.
Hmm, so it turns out that right now, Knex has no ability to modify columns :( knex/knex#46 This is going to need to be built ourselves, either by building the necessary features into Knex or by manually writing migrations.
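Until column modification lands in Knex, a workaround might be dropping to raw SQL for the ALTER. A minimal sketch, assuming an already-initialised Knex instance; the table and column names are made-up examples, not a real Ghost migration:

```js
// Hypothetical sketch: manually changing a column with knex.raw, since the
// schema builder can't modify columns yet. `knex` is an initialised Knex
// instance; the table/column names are invented examples.
function widenPostsTitle(knex) {
  // MySQL understands MODIFY COLUMN directly.
  return knex.raw('ALTER TABLE posts MODIFY COLUMN title VARCHAR(255) NOT NULL');
  // SQLite has no MODIFY COLUMN, so there the equivalent would be creating a
  // new table with the desired definition, copying the rows across, dropping
  // the old table and renaming the new one.
}
```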
Ok, so we definitely need to be able to modify columns. Options are (I guess) extend Knex, or try thinking about this another way. We also definitely need the schema somewhere, even if that's a generated file using @ErisDS's script (which would be read-only, obviously). I'm not sure if it's over-complicating things a bit, but one idea about how we could rework migrations (a rough sketch follows below):
- There's a schema file containing all the table data.
- Any new Ghost installs build their database from this file (the same as how Rails does new installs).
- When making schema changes, you change the schema file. If you then try to run the migration task, you get presented with an error telling you that the current migrations don't match up to the schema, and a prompt to create a matching migration to get there.

Two benefits of this approach:
The clever part (not needed right now): the migration system should in theory be able to detect table additions/deletions, as well as column additions/deletions, in the schema. When you try to migrate, it could pre-load a new migration file with what it thinks you're trying to do. Thoughts?
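To make the schema-file idea above a bit more concrete, here's a minimal sketch of a single schema file and a "build a fresh install from it" step. It assumes Knex's schema builder; the tables and columns are an invented subset for illustration, not the real Ghost schema:

```js
// schema.js: one place describing every table (an invented subset for
// illustration, not the real Ghost schema).
const schema = {
  posts: {
    id:       { type: 'increments' },
    title:    { type: 'string', maxlength: 150, nullable: false },
    markdown: { type: 'text', nullable: true }
  },
  settings: {
    id:    { type: 'increments' },
    key:   { type: 'string', maxlength: 150, nullable: false, unique: true },
    value: { type: 'text', nullable: true }
  }
};

// Build one table from its schema entry using Knex's schema builder.
function createTableFromSchema(knex, tableName) {
  return knex.schema.createTable(tableName, (t) => {
    Object.keys(schema[tableName]).forEach((columnName) => {
      const spec = schema[tableName][columnName];
      let column;
      if (spec.type === 'increments') {
        column = t.increments(columnName);
      } else if (spec.type === 'string') {
        column = t.string(columnName, spec.maxlength);
      } else {
        column = t[spec.type](columnName); // text, integer, dateTime, ...
      }
      if (spec.nullable === false) { column.notNullable(); }
      if (spec.unique) { column.unique(); }
    });
  });
}

// A fresh install builds every table straight from the schema file instead of
// replaying the whole migration history.
function initFromSchema(knex) {
  return Object.keys(schema).reduce(
    (chain, tableName) => chain.then(() => createTableFromSchema(knex, tableName)),
    Promise.resolve()
  );
}
```

The migration task could then diff this file against the existing migrations (or the live DB) and refuse to run until they match, which is where the "clever part" above would hook in.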
There are 2 clear, very high priority next steps for this:
Anyone up for taking one or t'other?
The migration system is a convoluted & undocumented little corner of hell. Please help me out by discussing how we can improve it.
Don't get me wrong, it works, and it works really quite well. However, it seems that it is tricky to understand what is going on, and on top of that, getting a migration, fixtures file, import & export setup just to change a tiny property on the DB is... ridiculous.
I think we need to break this whole problem down into smaller chunks which are solved separately and independently. Then we need to document how this works & put into place tools to make it as easy to do as possible.
The following is a brain dump of all the things I know to be true about the current system, along with some ideas on how to improve it. It's a pretty big topic so discussing various areas / suggestions for improvements may need to be done by creating a new issue for that idea.
Migrations
Migrations are what we call changes to the database schema. However, the schema itself is defined through the magical method of a series of migrations. There is nowhere that the entire schema can be seen and understood, apart from my magical gist: https://gist.github.com/ErisDS/cd0f3d33072410309f37.
I think Migrations are really two problems.
At the very least, I think that at this stage we are, and need to be, moving much faster than we can with this system in terms of DB changes. So I think it would be a good idea, at some point before the so-called 0.3.0 release, to stabilise the schema, create a version of Ghost which upgrades to the stable schema, and then in the next version flatten everything down again & remove the upgrade path. This might mean, for example, that 0.3.0 is in fact the 'stable schema version' and that anyone (just us & our VIPs atm) on 0.2.0 must upgrade to 0.3.0 before they can upgrade to any 0.4.0 version. 0.4.0 would then become the version we release to KS backers.
KS backers would get a nice clean migration system. VIPs would have to jump through the 0.3.0 hoop or start a fresh 0.4.0 install.
Fixtures
Fixtures are the default data which should exist in a fresh DB at the current version of the code. The data relies on the structure, so it can only be inserted once we have reached the correct structure.
Upgrading fixtures is a strange problem that I think has been misunderstood.
Fixtures should only be for the initial state of the DB; it is expected that the user will overwrite them, and therefore we cannot change them later. If we decide we want the 'Fresh Install State' to be different, then we simply change the fixtures.
However, at the moment, we are using 'fixture upgrades' to add new settings which are required to be present, and this is the only thing we are using them for. I think this is indicative of a completely different problem.
Settings, as they stand, cannot really be added through the API. Additionally, apart from the fixtures, we have no way of describing the settings which are required, what sort of data they should contain and how to validate them.
I think a lot of problems could be solved by improving the API for settings: having a definition of required settings, their default values, their validation rules, and whether they are overridable (see import), and then having a way to request a setting which, if it is not present, adds it with its default value.
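For illustration, a minimal sketch of what that could look like, assuming a Knex-style `settings` key/value table; the setting names, defaults and validation rules here are invented, not Ghost's real ones:

```js
// Hypothetical definition of the settings we require, with a default and a
// validation rule each (names and rules invented for illustration).
const defaultSettings = {
  title:        { defaultValue: 'Ghost', validate: (v) => v.length <= 150 },
  description:  { defaultValue: 'Just a blogging platform.', validate: (v) => v.length <= 200 },
  postsPerPage: { defaultValue: '6', validate: (v) => /^\d+$/.test(v) }
};

// Read a setting; if it has never been written, insert and return its default.
function getSetting(knex, key) {
  const definition = defaultSettings[key];
  return knex('settings').where({ key: key }).first().then((row) => {
    if (row) {
      return row.value;
    }
    return knex('settings')
      .insert({ key: key, value: definition.defaultValue })
      .then(() => definition.defaultValue);
  });
}

// Write a setting, rejecting values that fail the declared validation rule.
// (Assumes the row already exists, e.g. because getSetting ran first.)
function setSetting(knex, key, value) {
  const definition = defaultSettings[key];
  if (definition && !definition.validate(value)) {
    return Promise.reject(new Error('Invalid value for setting: ' + key));
  }
  return knex('settings').where({ key: key }).update({ value: value });
}
```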
**Update:** we also want to move away from using these fixtures as test fixtures and only use them in tests if we are testing the 'Fresh Install State'.
Export
Export is a JSON dump of EVERYTHING in the DB, so that the DB can be recreated later if something goes wrong. At the moment, the only thing that is different between different versions of export is the list of tables which are present.
This information should be accessible by doing some raw SQL queries to list the tables, I believe?
I think we can flatten export down into a single, simple tool by querying the database and asking what's there. We will have a data version stamp which we can put in the metadata, and I recommend also adding the Ghost version number, just in case.
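A sketch of what that flattened exporter could look like. It assumes SQLite, where the table list can be read from `sqlite_master` (MySQL would query `information_schema.tables` instead); the exact shape of `knex.raw`'s result varies per client, so the row handling is illustrative only:

```js
// Sketch of a flattened exporter: ask the database which tables exist, dump
// each one, and stamp the result with version info. Assumes SQLite.
function exportData(knex, databaseVersion, ghostVersion) {
  return knex.raw(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
  ).then((result) => {
    // The raw result shape varies by client; assume an array of row objects here.
    const tableNames = result.map((row) => row.name);

    return Promise.all(
      tableNames.map((name) => knex(name).select())
    ).then((tables) => {
      const exportFile = {
        meta: {
          exported_on: Date.now(),
          version: databaseVersion,      // data/schema version stamp
          ghost_version: ghostVersion    // belt and braces
        },
        data: {}
      };
      tableNames.forEach((name, i) => {
        exportFile.data[name] = tables[i];
      });
      return exportFile;
    });
  });
}
```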
Import
Import is a slightly different ball game to export. Bringing data in is a bit harder as we need to know what version of data we have, and what version of database schema we have and how to get from one to the other.
We also have to know a bit more about the structure / type of the data we are importing. At the moment there is a set of rules which goes something like:
These rules are reasonably sane, and the way the import is written, there is very little reason to have any more than one importer. In the case that this changes, we could expand the import system, but for now I believe it can be flattened?
For example, if we finally update the users to work for multi-user, we can change the importer at the same time, and any user importing TO that version will get the correct functionality.
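If the import ever does need to grow again, one way the flattened version could stay extensible is to key importers off the data version stamp in the export's meta block. This is a sketch only; the version numbers and importer names are made up:

```js
// Sketch: one importer per data version, chosen from the version stamp in the
// export file's meta block. Versions and transforms here are invented.
const importers = {
  '000': importData000,   // hypothetical: an older export format
  '001': importData001    // hypothetical: the current export format
};

function importFromFile(knex, exportFile) {
  const version = exportFile.meta && exportFile.meta.version;
  const importer = importers[version];
  if (!importer) {
    return Promise.reject(new Error('Unsupported data version: ' + version));
  }
  return importer(knex, exportFile.data);
}

// The current importer simply inserts what it is given; an importer for an
// older data version would first transform the data into the current shape
// (e.g. reshaping users for multi-user) before inserting it.
function importData001(knex, data) {
  return Object.keys(data).reduce(
    (chain, tableName) => chain.then(() => {
      return data[tableName].length
        ? knex(tableName).insert(data[tableName])
        : null;
    }),
    Promise.resolve()
  );
}

function importData000(knex, data) {
  // Hypothetical: upgrade the old shape to the current one, then reuse the
  // current importer.
  return importData001(knex, data);
}
```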
Summary
The user can only ever be on a single version of the codebase, which has a single version of the database schema. Thus all each version needs is to be able to handle getting TO that version from any earlier version.
The code we have achieves this, but I think it's a little bit complex and I have the feeling it's a bit backwards. It seems to be focused around pushing from nothing towards a new version, when in fact we should be pulling users into the right version from wherever they are. It's a subtle difference that I'm not articulating very well.
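One way of putting that "pulling users in" framing into code: the current codebase owns a map of "how to get here from version X" steps and runs only the one that applies, rather than replaying a chain of pushes from zero. This is purely a sketch; the version numbers, step names, and the assumption that the DB version lives in a `databaseVersion` setting are all illustrative:

```js
// Sketch of the "pull towards the current version" framing. The current code
// knows its own schema version and, for each older version it still supports,
// how to bring that database up to date in one step.
const CURRENT_VERSION = '003';

const upgradePaths = {
  '001': upgradeFrom001,   // hypothetical steps
  '002': upgradeFrom002
};

function migrateUp(knex, databaseVersion) {
  if (databaseVersion === CURRENT_VERSION) {
    return Promise.resolve();          // already where we want to be
  }
  const upgrade = upgradePaths[databaseVersion];
  if (!upgrade) {
    return Promise.reject(
      new Error('Cannot upgrade from database version ' + databaseVersion)
    );
  }
  // Each upgrade step is written to land directly on CURRENT_VERSION.
  return upgrade(knex).then(() =>
    knex('settings').where({ key: 'databaseVersion' }).update({ value: CURRENT_VERSION })
  );
}

function upgradeFrom001(knex) { /* invented: changes needed from 001 */ return Promise.resolve(); }
function upgradeFrom002(knex) { /* invented: changes needed from 002 */ return Promise.resolve(); }
```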
Thoughts?