Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Concatenating null with not null string resulting in null #16120

Conversation

Muppets
Copy link
Contributor

@Muppets Muppets commented Jun 16, 2019

DO NOT COMMIT THIS - work in progress.

Looking for some feedback on WIP for issue #3836.

My attack plan was to include a Coalesce function call with all string concatenations. This does produce somewhat ugly SQL for a ton of common scenarios. I changed the check to only wrap references to nullable columns in Coalesce but this will miss a ton of different possible outcomes.

Looking for some guidance on what you fancy doing here, uglify everything or only cover some really specific examples?

I've not updated all the unit tests to match this change yet so there are broken tests.

@mburbea
Copy link

mburbea commented Jun 28, 2019

Why not use concat function available in most major sql languages that supports adding null strings together?

@smitpatel
Copy link
Contributor

Concat support was added in SqlServer 2012. Cannot use it since we still support SqlServer 2008.
https://docs.microsoft.com/en-us/sql/t-sql/functions/concat-transact-sql?view=sql-server-2017

@mburbea
Copy link

mburbea commented Jun 28, 2019

SQL server 2008R2 is going EOL next month. I think it makes sense to offer concat.

@ajcvickers
Copy link
Contributor

@divega Looking for your input on this.

@smitpatel
Copy link
Contributor

Ping @divega

@divega
Copy link
Contributor

divega commented Sep 3, 2019

Here is the requested input:

  1. I did some looking around and there are fundamental variations in support for CONCAT() across many of the most relevant database engines. For example, according to what I could find:

    • In MySQL, CONCAT() does return null if any of the arguments is null.
    • In SQLite, CONCAT() isn't supported.
    • In Oracle, CONCAT() only accepts two arguments.
  2. It seems that CONCAT() would be a good option only for SQL Server 2012 and above, and PostgreSQL 9.1 and above. We could implement this, but assuming we still care about older versions and other databases, we would need a fallback translation that still uses the concatenation operator, and presumably a way to switch behaviors. All in all, it doesn’t seem compelling.

  3. Given that CONCAT() doesn't help addressing Query: Concatenating null with not null string resulting in null #3836 across providers, I don't see a better way to achieve the desired behavior than pursuing the direction of this PR. That is, add COALESCE(x,'') when we know x can be null. The generated SQL is more complicated, but that should improve a bit when we start caching generated SQL statements based on sniffing parameter values. @smitpatel also mentioned that people can already explicitly add their own coalesce operator code in the LINQ query to get the desired result, but that feels more like a workaround than anything. We have precedent in older versions of EF where C# semantics for string concatenation was fixed based on feedback and community PRs.

  4. Incidentally, the concatenation operator in SQL Standard and in the majority of the databases outside of SQL Server is ||, and not +. I believe we missed making || the default in relational and now most providers need to override it.

@divega divega removed their assignment Sep 3, 2019
@divega
Copy link
Contributor

divega commented Sep 3, 2019

One more detail to take into account:

  1. Changing the behavior in this way, although and improvement, is also technically a behavioral breaking change.

@smitpatel
Copy link
Contributor

Design notes:

  • We need to add COALESCE(column, N'') only when column can be null and is used in string concatenation.
  • The place where we find nullability is NullSemanticsRewritingExpressionVisitor.
  • Based on that, @divega, is it fair to say that if user wants database nulls behavior then null + "b" may return null rather than "b"?
  1. Separate issue. Though as discussed, it ended up with that way because we don't special case string concat. They are a + b in tree and we always print +. That is to say, it wasn't because of SqlServer but because of ease of printing due to our Sql tree structure. We can fix it. Probably best to file an issue and put good-first-issue

  2. Already targeting master branch so this would go into next breaking release.

@divega
Copy link
Contributor

divega commented Sep 4, 2019

(Fixed/simplified my initial write up)

Based on that, @divega, is it fair to say that if user wants database nulls behavior then null + "b" may return null rather than "b"?

Yes, I think it is reasonable for the same UseRelationalNulls flag to control both null comparisons and this. The name is a good fit, plus both behaviors are a consequence of null meaning unknown.

@smitpatel
Copy link
Contributor

@Muppets - Would you like to work based on new design? @maumar can guide you in terms of how NullSemanticsRewritingExpressionVisitor works.

@smitpatel
Copy link
Contributor

Added design notes in original issue. Closing PR.

@smitpatel smitpatel closed this Sep 11, 2019
@Muppets Muppets deleted the Query-Concating-null-with-not-null-string-resulting-in-null branch February 2, 2020 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants