Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AddColumn migrations created with bad default #6035

Closed
roji opened this issue Jul 9, 2016 · 28 comments
Closed

AddColumn migrations created with bad default #6035

roji opened this issue Jul 9, 2016 · 28 comments
Assignees
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@roji
Copy link
Member

roji commented Jul 9, 2016

Hope I'm not reporting a dup here.

Testing with SqlServer, I've added this property to my model:

builder.Entity<Blog>().Property(b => b.SomeProp).ValueGeneratedOnAdd();

Generating a migration creates the following code:

migrationBuilder.AddColumn<int>(
    name: "SomeProp",
    table: "Blogs",
    nullable: false,
    defaultValue: 0)
    .Annotation("SqlServer:ValueGenerationStrategy", SqlServerValueGenerationStrategy.IdentityColumn);

Note the defaultValue: 0, which I didn't specify in the model in any way. Note also that if the column is defined when creating the table, the correct CreateTable migration is generated:

SomeProp = table.Column<int>(nullable: false)
    .Annotation("SqlServer:ValueGenerationStrategy", SqlServerValueGenerationStrategy.IdentityColumn),

This causes an issue with Npgsql (npgsql/efcore.pg#68) since the default value is examined in the migration decision process.

@gdoron
Copy link

gdoron commented Jul 10, 2016

@roji How otherwise would you handle the already existing rows in the table?

@roji
Copy link
Member Author

roji commented Jul 10, 2016

That's a good question, but it seems like it's a migration detail. In other words, at the end of the process I should get the table definition that corresponds to my model, i.e. the same table definition I would have gotten if I created the table with the column in the first place.

One solution would be to add another migration operation to remove the default after the column is added. This way rows that existed before the addition have default values, but rows created afterwards don't.

Another possibility which seems better to me, is to force the developer to take manual action here, i.e. to generate the column addition without a default, making it fail if there are any rows. After all, adding a non-nullable non-default column requires a real decision from the developer - putting a zero for ints may not what what they want. It may even make them think and make the new column nullable, depending on their scenario.

@gdoron
Copy link

gdoron commented Jul 10, 2016

@roji non nullable property obviously means NOT NULL column, so the developer made the decision already, I think there's no need to force any other manual action.

I agree that leaving the default value as part of the scheme is a bad idea though.

I would rather have something like valueForExistingRows which would update the rows before adding the new row.

migrationBuilder.AddColumn<int>(
    name: "SomeProp",
    table: "Blogs",
    nullable: false,
    valueForExistingRows: 0) // <---------
    .Annotation("SqlServer:ValueGenerationStrategy", SqlServerValueGenerationStrategy.IdentityColumn);

putting a zero for ints may not what what they want.

The auto generated migration is just a helper, you always should review the generated migrations and script and customize it if needed.

What do you think?

ps. Truncate Table is DDL 😄

@roji
Copy link
Member Author

roji commented Jul 10, 2016

@roji non nullable property obviously means NOT NULL column, so the developer made the decision already, I think there's no need to force any other manual action.

The developer made the decision about the column being not null, but not regarding what should happen to existing rows... Basically it's a matter of whether 0 for existing rows should be a default, or something we force developers to choose. Both options make sense I guess; I definitely prefer the second because it seems like something the developer should confront - they may not have thought about the existing rows and just putting a zero there may cause trouble later. But I also understand the need for migrations to be seamless and not require too much manual intervention. It's a decision that has to be made.

Regarding your specific proposal (valueForExistingRows), something like that could make sense. Under the hood we're talking about two migrations: adding the column with the default, then altering the column to remove the default. I think that means two operations in the migration code as well (i.e. migrationBuilder.AddColumn(...) followed by migrationBuilder.AlterColumn(...)), but I may be wrong.

@gdoron
Copy link

gdoron commented Jul 10, 2016

Under the hood we're talking about two migrations: adding the column with the default, then altering the column to remove the column.

It's two operations. Update existing rows and adding the new column, no need to create a default value at all.

The table needs to be locked though so there won't be a new row between the update and the altering of the table.

@roji Now that I rethink about it, I prefer the migration to leave the value blank and not compile and force the developer to fill the value.
I just don't think it's inline with how migrations in EF behave.

@roji
Copy link
Member Author

roji commented Jul 10, 2016

Under the hood we're talking about two migrations: adding the column with the default, then altering the column to remove the column.

It's two operations. Update existing rows and adding the new column, no need to create a default value at all.
The table needs to be locked though so there won't be a new row between the update and the altering of the table.

Uh, how do you update the existing rows if you haven't yet added the column?? Or am I totally misunderstanding?

If you add a column with a default, then remove the default everything seems nice and safe, no concurrency/locking issues...

Anyway, I guess we can wait for Monday to see what the team thinks...

@gdoron
Copy link

gdoron commented Jul 10, 2016

Uh, how do you update the existing rows if you haven't yet added the column?? Or am I totally misunderstanding?

OMG I'm an idiot... 😳

Yes you're right. You need to add default values and remove it later.

@roji
Copy link
Member Author

roji commented Jul 10, 2016

We all get brain malfunctions every now and then :)

@gdoron
Copy link

gdoron commented Jul 10, 2016

Actually, I was off, but not that off.
What I usually do is add the new column as nullable.
Then update the column with the desired value.
Then alter the column to not null.

That is the reason I got confused.

@flagbug
Copy link

flagbug commented Jul 10, 2016

See also: #5995

@rowanmiller
Copy link
Contributor

Triage: We should not try and set a default value if the column is set as identity, since it is not valid (and SQL Server would fill in the missing values anyway).

On a more general note, using the CLR default when you add a non-nullable column is just what we scaffold and we anticipate that folks would often change this to something that makes sense for their app. We don't think we need to change anything here.

@roji
Copy link
Member Author

roji commented Jul 15, 2016

@rowanmiller, note the discrepancy between what happens if the non-nullable column is present when the table is created, and when it's added later (i.e. between the CreateTable migration and the AddColumn migration) - in the first case there's no default, in the second there is. This is the part that bothers me most; if you always set the CLR default as the column default it would at least be consistent (although still wrong IMHO).

Again, I get the reasoning behind the default - when adding a column to an existing table there's the issue of existing rows, which doesn't exist when creating a new table. But that seems to be a detail; it seems that the database schema for a given model should be the same regardless of the history/migration path taken to reach it.

@rowanmiller
Copy link
Contributor

@roji we discussed this at length and decided the complexity of removing the default after adding the column isn't worth the benefit. Since the column is non-nullable, EF will always send a value in INSERT statements, so the database default is never used. It would only be used if SQL was run against the database out side of EF. Another factor in our decision is that we've had this behavior in EF6.x since the 4.3 release and have not had any complaints about it.

@roji
Copy link
Member Author

roji commented Jul 18, 2016

Thanks for looking at this, I didn't realize removing the default was such a complicated operation.

I'll describe how this bit me, I'm interested in how you deal with it. In the current Npgsql provider, you can mark an int property as ValueGeneratedOnAdd, without specifying any default. When the Npgsql migrations SQL generator sees such a property, it generates a PostgreSQL serial, which is actually just an int with a default coming from a sequence. The logic here is that the user may decide to have their own default which isn't a sequence. A similar thing is done with GUID fields that have ValueGeneratedOnAdd but no default. Note that this same logic works both for keys and non keys.

Now, as long as the column is present when the table is first created, everything works fine. But when the column is added later, the default is added by EFCore, and the migrations provider doesn't create a serial. In other words, it's not possible to know whether the default is a genuine user intention, or whether it's just an artifact of the column addition.

How do you guys handle this sort of thing?

@rowanmiller
Copy link
Contributor

I think that is similar to the issue we are planning to fix here, where the default value should not be added for database generated columns. BTW there is also DefaultValueSQL, which is different from a static default value and is where something like "newid()" would be used for SQL Server generated GUID columns.

@ajcvickers
Copy link
Contributor

@roji Also, look at SqlServerValueGenerationStrategy. This is how we define a high-level group of behaviors for SQL Server value generation across the stack. This is what allows Migrations and the update pipeline to know what to create/expect for different strategies.

@roji
Copy link
Member Author

roji commented Jul 18, 2016

@rowanmiller I check for DefaultValueSQL in addition to DefaultValue, if any of them are set I bypass the implicit sequence creation.

@ajcvickers, if I understand correctly you support two strategies:

  1. Your migrations SQL provider checks for SqlServerValueGenerationStrategy.Identity and generates identity columns.
  2. HiLo, where it's up to the user to manually specify the sequence in the model. In other words, migrations don't automatically seem to take HiLo into account in any way. There also seems to be a default for a single HiLo sequence, having a sequence-by-column seems to require an additional annotation.

In the PostgreSQL world, the standard for auto-increment columns (keys or otherwise) is the serial pseudo-type, which in reality implicitly creates a column-specific sequence and sets the column's default to get values from it (here are the relevant docs). The natural expectation is that when EF sees a value-generated int, it would do exactly that.

Requiring users to manually specify sequences for each value-generated would be awkward to say the least, as would be an additional annotation just to say "please create a sequence for this". This is why I made things simple: if your column is numeric, has ValueGeneratedOnAdd and doesn't have a default, I deduce you want a sequence. The only problem with this strategy is this issue - that some defaults aren't actually defaults...

Do you have any suggestion for supporting this scenario without imposing explicit sequence creation in the model or an added annotation? I hope I haven't misunderstood how you do thing in SqlServer and missed something.

@ajcvickers
Copy link
Contributor

@roji Is the problem that the PostgreSQL provider is using the "defaultValue" set in the Migrations model as the way to choose what strategy to use? I don't think this can work. I think you need to look in the actual EF model to determine which strategy is being used and then create the appropriate Migrations model from this. @bricelam might have more insight.

@roji
Copy link
Member Author

roji commented Jul 18, 2016

@ajcvickers, ah, that seems like a good way forward. The model would retain the user's intent (no default) even if the database has a default because of the migrations... I'll look into that, thanks!

@roji
Copy link
Member Author

roji commented Jul 18, 2016

Yep, that seems like it works. My NpgsqlMigrationsAnnotationProvider checks the property for ValueGenerate.OnAdd and for null DefaultValue/DefaultValueSql, and if so, passes an annotation that makes the migrations SQL generator create a serial column.

Note that this overrides the default value 0 set in the migration operation.

Thanks for the help!

@roji roji closed this as completed Jul 18, 2016
@rowanmiller
Copy link
Contributor

Reopening this one as I think we should still avoid scaffolding a default value for new columns when the column is marked as store generated.

@rowanmiller rowanmiller reopened this Jul 19, 2016
@gdoron
Copy link

gdoron commented Jul 20, 2016

new columns when the column is marked as store generated.

What does "store generated" mean?
Does it mean that new columns won't have default values when they're not nullable?

@rowanmiller
Copy link
Contributor

@gdoron it means that values for the column are generated by the database (e.g. SQL Server IDENTITY) so there is no need for a default value. All other non-nullable columns will still have a default value generated.

@roji
Copy link
Member Author

roji commented Jul 20, 2016

@rowanmiller I admit I'm a bit confused as well by the terminology (have always been a little)... Columns with default are also generated by the database. In addition, the Npgsql/PostgreSQL model seems to complicate this somewhat - let's take autoincrement (serial) columns. On the CLR model, the user is simply supposed to specify ValueGeneratedOnAdd, without specifying an explicit default (that would be needlessly verbose and un-PostgreSQL-like). The database column created from this should be a serial column, which in reality is a column with a default backed by a sequence (the sequence is implicitly created when the serial column is created).

To try to explain this in terms of the SqlServer provider, imagine a new value generation strategy in addition to identity and hilo. With this strategy, every ValueGeneratedOnAdd column gets its own sequence, which is automatically created by the migrations and named after the column. This is more or less how PostgreSQL serial works.

I think it's worth disambiguating whether, when we say "default value", we mean that on the CLR model or in the database column definition, because the two are not the same.

@bricelam bricelam added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Aug 10, 2016
@roji
Copy link
Member Author

roji commented Sep 10, 2016

@ajcvickers, coming back to this after a long pause I have a small question with regards to your comment above.

In that comment you suggest that I look at the model directly to find out whether a property in question has a default or not - looking at the migration operation doesn't work because EFCore adds a default there even if one isn't defined on the model.

My question is whether this should happen directly in my MigrationsSqlGenerator, or whether an annotation should be generated in my MigrationsAnnotationProvider, and be examined in the MigrationsSqlGenerator. On the one hand, I can see that SqlServerMigrationsAnnotationProvider generates a ValueGenerationStrategy annotation which will later be examined by SqlServerMigrationsSqlGenerator. On the other hand, I can see cases where SqlServerMigrationsSqlGenerator directly looks up properties on the model (via FindProperty) and examines various things. I understand that annotations get embedded in actual migration code, but I'm not sure exactly what the consequences of each method are.

@ajcvickers
Copy link
Contributor

@roji I think @bricelam is probably better able to answer that.

@bricelam
Copy link
Contributor

Use FindProperty in the Migrations SQL generator to discover additional context about performing an operation. For example, SQL Server checks to see if a column is indexed to determine the default max length (since SQL Server has a limit on the maximum bytes in an index).

Use annotations to expose store-specific features that the user may want to configure--imagine that they're creating a column without a backing property in the model. SQLite adds an annotation to set AUTOINCREMENT on the column because it is the best thing to do when working with EF, but the user free to change (remove) it if they know what they're doing.

I guess as a principle, try not to make decisions for the user in the SQL generator unless there is a way for them to override it (e.g. configure the max length in the call to CreateColumn).

@roji
Copy link
Member Author

roji commented Sep 12, 2016

@bricelam thanks for the guidance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

No branches or pull requests

6 participants