Use column/property facets for parameter types in Update Pipeline #4134
Comments
Do you have evidence of indexes not being used?
I'm sorry, I was going off what I learned about db design, not an actual use case. I cannot repro the issue, and after research (http://stackoverflow.com/questions/14342954/ado-net-safe-to-specify-1-for-sqlparameter-size-for-all-varchar-parameters) the functional impact I described seems to be wrong. Still, if the developer goes through the trouble of telling you the parameter size, I think you should use it instead of 4000/8000. This would still avoid query cache fragmentation but produce the query that the developer expects. Thanks.
But HasMaxLength etc. does not define a parameter size; it defines a storage max size.
@ErikEJ Exactly. Theoretically, for very simple patterns that compare a parameter for equality against a column, we could borrow the size of the column to create the parameter, but in the general case (i.e. for arbitrary expressions) it could become harder to keep track of how they relate. Even for the simple case, what happens if the data in the parameter doesn't fit in the size of the column? Should we truncate it and potentially return wrong results? Should we instead have the smarts to short-circuit the comparison on the client, knowing that the parameter could never match any value in the column?

@michaelpaulus I understand your expectation here, but I believe it is not as straightforward as you may think. Rather than having a set of complicated rules to try to meet that expectation when we can, we use simpler rules that functionally work in all cases and have desirable effects on query caching on the server. By the way, these rules are also proven: we have used them in previous O/RMs.
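To make the truncation concern above concrete, here is a minimal sketch using raw ADO.NET; the People table, the nvarchar(5) column size, and connectionString are invented for illustration:

```csharp
using System.Data.SqlClient;

// Hypothetical table: People with a [Name] nvarchar(5) column.
using (var connection = new SqlConnection(connectionString))
{
    connection.Open();

    var command = connection.CreateCommand();
    command.CommandText = "SELECT * FROM People WHERE [Name] = @name";

    var p = command.CreateParameter();
    p.ParameterName = "@name";
    p.Value = "abcdefgh"; // 8 characters
    p.Size = 5;           // size borrowed from the column facet
    command.Parameters.Add(p);

    // SqlClient truncates the input string to fit Size, sending "abcde",
    // so the query can match a row it should not match (wrong results)
    // instead of correctly matching nothing.
    command.ExecuteReader();
}
```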
In EF6 the parameter length was defined by the max length for insert/update params; this is no longer the case in EF7. For select params, the parameter length wasn't defined by the max length in either version. EF6 output:
EF7 output:
I'm not necessarily asking about max length, although this was a departure from EF6, but more to the point: I've declared the data type explicitly using HasColumnType("nvarchar(200)") or ForSqlServerHasColumnType("nvarchar(200)") and I still get nvarchar(4000). As for what you would do if the value passed in is longer than the specified length, you already have this logic for values longer than 4000. I would suggest using the specified length; if the value is longer, go to 4000; and if the value is longer than 4000, go to -1 (max) (see the sketch after this comment). The only other question I have: does this have any effect on SQL Server memory allocation? If I have a table with 50 nvarchar(25) columns and they are all being inserted with nvarchar(4000) parameters, does this change the amount of memory SQL Server allocates for all of those params? In my short testing it seems that it does: the plan with nvarchar(4000) params adds an additional Compute Scalar step to the execution plan with a much larger row size. Compare
with
for table
Thanks again for reviewing this.
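In code, the rule proposed above amounts to something like the following sketch; the helper name and the 4000 bound for nvarchar are illustrative assumptions (8000 would be the bound for varchar/varbinary):

```csharp
// Illustrative sketch of the proposed rule: use the declared size if the
// value fits, fall back to 4000 if it fits there, otherwise -1 (max).
static int ProposedParameterSize(string value, int? declaredLength)
{
    if (declaredLength.HasValue && value.Length <= declaredLength.Value)
    {
        return declaredLength.Value;         // e.g. nvarchar(200)
    }

    return value.Length <= 4000 ? 4000 : -1; // nvarchar(4000) or nvarchar(max)
}
```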
So the issue is not indexes not being used, but potential increased memory consumption server side in the update pipeline?
The issue is that HasMaxLength / HasColumnType / ForSqlServerHasColumnType don't produce parameters with that size. For select parameters this is the same as EF6; for insert/update parameters this is different from EF6. I pointed out both issues in the repro steps: 4/5 select, 6/7 update. I'm not sure whether this is an issue or not, but it is definitely not the expected behavior from the developer's perspective. The functional impact I originally wrote is wrong. I'm just bringing up the point that insert/update behavior differs from EF6 and asking whether this has an impact on server memory. It seems to, but I'm no expert in this area. Thanks.
@michaelpaulus thanks for clarifying that your concern is about the update pipeline. I missed that detail in your original post. I agree that this could have memory and perf implications on the server that we may need to take into account. I also agree that, by construction, parameters generated in the update pipeline have a more direct correspondence to columns.
Clearing milestone because I would like to chat about this one in triage again (I probably missed the discussion while I was out). |
assigning to @ajcvickers to see if there are any easy wins for RC2 here |
We should consider borrowing the unicodeness of the column as well as the length. See #4608. |
Marking for re-triage. No priority has been set. |
BTW, regarding my previous comment on borrowing both unicodeness and length: as per #4949 (comment), it seems that we are already doing the right thing for the unicodeness.
In looking at #4425 and #4134 it became quickly apparent that the support for inferring Unicode was too specific to Unicode. What is really needed is that if the type mapping to be used can be inferred in some way, then that type mapping should be used for all relevant facets, not just Unicode. Only in cases where no inference is possible should the default mapping for the CLR type be used.
These changes address issue #4425 while at the same time re-introducing the parsing of sizes in store type names so that these can then be used in parameters for #4134. The mapping of strings and binary types based on facets has been split out so that the information provided by the type mapper can be interpreted more easily by the scaffolding code. This also simplifies the type mapper APIs themselves. A new class, ScaffoldingTypeMapper, has been introduced that can be used to determine what needs to be scaffolded for a given store type. This has not yet been integrated into scaffolding, but has unit tests to check that it gives the correct answers. Scaffolding must provide this service with the full store type name (e.g. nvarchar(256)) together with information about whether the property will be a key, an index, or a rowversion. The service then gives back data indicating whether the type is inferred (need not be explicitly set) and, if so, whether max length and/or unicode facets need to be set.
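A rough sketch of how such a service might be consumed; the member names (FindMapping, IsInferred, and the scaffold-facet flags) are assumptions based on the description above rather than a confirmed API surface:

```csharp
// Ask how "nvarchar(256)" should be scaffolded for a property that is
// neither a key/index nor a rowversion.
var info = scaffoldingTypeMapper.FindMapping(
    storeType: "nvarchar(256)", keyOrIndex: false, rowVersion: false);

if (info.IsInferred)
{
    // The store type can be inferred from the CLR type plus facets, so
    // scaffolding emits only the facets the result says are needed,
    // e.g. HasMaxLength(256) and/or IsUnicode(...), instead of an
    // explicit HasColumnType("nvarchar(256)") call.
}
else
{
    // No inference possible: scaffold the explicit store type name.
}
```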
Issues: #4608 Use column/property facets for parameter types in Query Pipeline, #4134 Use column/property facets for parameter types in Update Pipeline. If a property length is specified, then it is used to infer the length for parameters relating to that property, unless the data is too long, in which case unbounded length is used. Because the query pipeline has code that can infer the length to use for a parameter, fragmentation is still avoided without always using the 4000/8000 value.
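For example (hypothetical Blog entity; the comments describe the intended behavior under this change, not captured output):

```csharp
// Hypothetical model configuration:
modelBuilder.Entity<Blog>()
    .Property(b => b.Title)
    .HasMaxLength(200);

// With this change, saving a Blog whose Title fits in 200 characters
// should send the insert/update parameter as nvarchar(200) instead of
// nvarchar(4000); a Title longer than 200 characters falls back to
// unbounded (nvarchar(max), i.e. Size = -1).
```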
Title
HasMaxLength / HasColumnType / ForSqlServerHasColumnType don't produce parameters with that size
Functional impact
Data types on generated parameters don't match the actual SQL data types and don't work as expected. This can lead to indexes not being used properly due to size mismatches between parameters and columns.
Minimal repro steps
Expected result
Parameter is created with the size specified by HasMaxLength or HasColumnType
Actual result
Parameter is always created with the _maxSpecificSize value from Microsoft.Data.Entity.Storage.Internal.SqlServerMaxLengthMapping
Further technical details
There is a comment in the method Microsoft.Data.Entity.Storage.Internal.SqlServerMaxLengthMapping.ConfigureParameter:
// For strings and byte arrays, set the max length to 8000 bytes if the data will
// fit so as to avoid query cache fragmentation by setting lots of different Size
// values otherwise always set to -1 (unbounded) to avoid SQL client size inference.
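Paraphrased in code, the behavior that comment describes is roughly the following; this is an illustrative sketch, not the actual SqlServerMaxLengthMapping source (4000 is the nvarchar bound, 8000 the byte bound):

```csharp
using System.Data.Common;

// Sketch of the current sizing behavior: any configured facet is ignored;
// the parameter gets the bounded default size when the value fits,
// otherwise -1 (unbounded).
static void ConfigureParameterSketch(DbParameter parameter, int maxSpecificSize)
{
    var length = (parameter.Value as string)?.Length
                 ?? (parameter.Value as byte[])?.Length;

    parameter.Size = length.HasValue && length.Value <= maxSpecificSize
        ? maxSpecificSize  // 4000 for nvarchar, 8000 for varbinary
        : -1;              // nvarchar(max) / varbinary(max)
}
```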
I don't agree with this comment if the user is going through the trouble of telling you what size to use.