-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document how to use a value comparer to compensate for different behavior of string comparisons in client and store #2979
Comments
Hi, It's an interesting problem and it is not just string keys that catch people out. Querying the data in memory (eg: from a previously materialised query) vs projecting the query to the store can definately give different results depending on how the store is configured. Harry versus harry vs HARRY, etc. We had problems when we allowed searches against columns like Name, Street, etc... Another one for me was importing external data, say a spreadsheet, and matching the column values in the sheet to a description in a reference table (to get a 1:M relationship and normalise the data). If I cached the reference table to cut I/O then I had to be careful to do case insensitive queries on the in memory data. If I allowed it to query the store each time I got different results. I might be forgetting something, but pretty sure that is what I found.. |
I consistently run into this issue when migrating to EF, and the existing database uses strings as PK/FKs. These existing databases are almost always case insensitive, and I have to update the collations to case sensitive to prevent EF from losing track of relationships. This causes friction with direct sql users that expect case insensitive comparisons, and get tripped up when the collation is case sensitive. |
I was able to call Include to get relative data which have case insensitive PK/FKs in old EF core 1.0. This was changed in 2.0, is there anyway to bring back old feature? |
Note from API review: should be able to use ValueComparers to implement this. |
Is adding a ValueConverter like the following the best workaround for case-insensitive comparisons at present?
Are there workarounds that use ValueComparers? It looks to me like we'll always end up in EntityFinder.GenerateEqualExpression which wouldn't respect a ValueComparer. |
What is the priority of working with existing data for the EF Core team? |
@junkbondtrader that paragraph implies that only a subset of our customers will be in a situation in which some of the strings in the database they access through EF Core aren't normalized (i.e. they don't match in memory because of different casing, etc.) and that they also don't own the database or for some other reason are unable to normalize the stored strings themselves. We don't know what this subset is in terms of percentage or business impact, but we believe that it is a relatively small subset within the universe of all EF Core users. There are likely going to be other features in our backlog that are going to help a larger portion of our customers. Anyway, this is just one of the factors that can play a role in how we prioritize. There are many more detailed in https://docs.microsoft.com/ef/core/what-is-new/roadmap#release-planning-process. |
@divega It's not just a question of owning the database- this creates a maddening disconnect between tsql queries (SSRS reports, SSMS sleuthing, stored procedures) and EF code, one that I've had to explain countless times to new people. But what you're really saying is string FKs are bad, and don't expect this issue to be addressed in EF. |
@StormRider01 if I wanted to say "don't expect this issue to be addressed on EF Core", I would close it. You are actually preaching to the choir. I am the one that created the issue because I agree it is important, and I know we have some ideas on how to fix it. The problem is the size of the EF Core backlog and how much of a dent we can make on it on each release. In responding @junkbondtrader's comment I just tried to explain why other issues have been prioritized over this one. |
@StormRider01 anyway, your comment implies something that is a very good point about this issue: when things fail to match and you get unexpected behavior (e.g. navigation properties don't get fixed up), it is actually hard to figure out why. In my mind, that automatically raises the priority of addressing it, at least a bit. |
Since this has been opened for 5 years and still not made it into a release, can you please suggest a reasonable workaround? I'm stumped how we avoid causing client side evaluation in this, especially as we have a custom OData wrapper. Normalizing the database is not an answer as we are looking at the names of our clients businesses and it is not up to us how they chose to capitalize things and it is lost data if we do so |
Also see high-level collations issue dotnet/efcore#19866 |
any update on this? |
@chamikasandamal I've been playing with this and it seems that this can now be handled using a value comparer (). For example, to ensure EF Core does case-insensitive string key comparisons: protected override void OnModelCreating(ModelBuilder modelBuilder)
{
var comparer = new ValueComparer<string>(
(l, r) => string.Equals(l, r, StringComparison.OrdinalIgnoreCase),
v => v.ToUpper().GetHashCode(),
v => v);
modelBuilder.Entity<Blog>().Property(e => e.Id).Metadata.SetValueComparer(comparer);
modelBuilder.Entity<Post>(b =>
{
b.Property(e => e.Id).Metadata.SetValueComparer(comparer);
b.Property(e => e.BlogId).Metadata.SetValueComparer(comparer);
});
} Or if you also need to deal with padding due to fixed length string keys: protected override void OnModelCreating(ModelBuilder modelBuilder)
{
var comparer = new ValueComparer<string>(
(l, r) => string.Equals(
l == null ? null : l.Trim(),
r == null ? null : r.Trim(),
StringComparison.OrdinalIgnoreCase),
v => v.Trim().ToUpper().GetHashCode(),
v => v);
modelBuilder.Entity<Blog>(b =>
{
b.Property(e => e.Id).HasColumnType("char(20)");
b.Property(e => e.Id).Metadata.SetValueComparer(comparer);
});
modelBuilder.Entity<Post>(b =>
{
b.Property(e => e.Id).HasColumnType("char(20)");
b.Property(e => e.Id).Metadata.SetValueComparer(comparer);
b.Property(e => e.BlogId).HasColumnType("char(20)");
b.Property(e => e.BlogId).Metadata.SetValueComparer(comparer);
});
} You could also handle the padding with a value converter that removed and added padding as needed to and from the database. |
@ajcvickers Does the |
@RichardD2 It's very likely this doesn't work on 3.1. Value comparers have seen significant improvements since then. |
@ajcvickers Thanks. Looks like using a value converter works in 3.1 as an alternative: public sealed class StringToUpperValueConverter : ValueConverter<string, string>
{
private static readonly Expression<Func<string, string>> ToProvider = s => s;
private static readonly Expression<Func<string, string>> FromProvider = s => s == null ? null : s.ToUpperInvariant();
public static readonly StringToUpperValueConverter Instance = new();
private StringToUpperValueConverter() : base(ToProvider, FromProvider)
{
}
} [AttributeUsage(AttributeTargets.Property)]
public sealed class StringToUpperAttribute : Attribute
{
} protected override void OnModelCreating(ModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
...
foreach (var entityType in modelBuilder.Model.GetEntityTypes())
{
foreach (var property in entityType.GetProperties())
{
if (property.PropertyInfo?.GetCustomAttribute<StringToUpperAttribute>() is null) continue;
property.SetValueConverter(StringToUpperValueConverter.Instance);
}
}
} |
Typically string comparisons performed on the store can have different behaviors to comparisons in memory, e.g. trailing blanks and casing may be ignored in the store.
In previous versions of EF this made things like fixup of string keys hard to predict: related data would be pulled from the database because the FK and PK values matched according to the server semantics but once in memory the strings didn't match and therefore fixup didn't happen. This is described in more detail in CodePlex dotnet/efcore#284.
This kind of difference appears more often when working with existing data, so how important this is depends on how much we prioritize those scenarios. If the data was stored by EF then string keys have a better chance to be an exact match.
The idea to address this is that on the client, EF Core could always use a custom string comparison algorithm that would mimic the behavior in the database. That is, one that ignores trailing blanks and casing.
The text was updated successfully, but these errors were encountered: