-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Documentation is unclear for Guid primary keys #34260
Comments
I don't think anything in dotnet/runtime#86084 is relevant to this discussion - that's purely about endianness in the binary serialization APIs around Guid. Getting the endianness right when serializaing Guid is an internal concern of the low-level database driver (SqlClient in this case), which is also why I think the impact on the end user in that conversation is greatly exaggerated (i.e. users generally shouldn't need to deal with binary serialization of GUIDs in database contexts, unless the GUID is stored as a binary field). In any case, all this has nothing really to do with index fragmentation.
The SequentialGuidValueGenerator that's used by default with the EF SQL Server provider largely mimics the SQL Server NEWSEQUENTIALID function, as is clearly documented; the point there is precisely to reduce index fragmentation. A link to the NEWSEQUENTIALID page in the SQL Server docs is provided right there, so users can go deeper and understand exactly what that means. What additional documentation do you think is missing in the EF docs, beyond us replicating the entire SQL Server NEWSEQUENTIALID docs inside the EF docs, which I don't believe we should do? If you're looking for guidance in the EF docs on how to generate your own binary representation of efficient GUID values for SQL Server, then I don't think that belongs in our docs. That's quite an anti-pattern, and doing this correctly has nothing to do with EF.
That's most likely not the case: i haven't personally verified this, but SQL Server has its own sort order which UUIDv7 wouldn't be appropriate for (see #33579 (comment)).
I still am not clear on whether you're representing your GUIDs as binary or string in the database; at the top of your post you mention |
@jscarle A .NET The endianness only comes in to play when serializing to binary, which you don't seem to be doing. |
Closing this as no answer was provided to my last comment, and I'm not seeing anything actionable on our side. @jscarle if you'd like to continue the conversation, please post back here and we can revisit. |
I am currently in the midst of trying to repair the data architecture of an application which fell victim to the original designer's poor decision to make the primary keys of each of the tables Guids stored as char(36) strings... sigh... 😔
Concidently, with the arrival of Guid.CreateVersion7, this has renewed discussions regarding the use of Guid in databases and surfaced the issues faced SQL Server's endianess conflicts when dealing with Guids. Fragmentation of clustered indexes when using Guids has been going on for years now.
I know that similar discussions have come up within this repository and I know that the documentation has been improved in the past, but honestly I'm now more confused than ever.
In the documentation, Explicitly configuring value generation, there is a mention that "on SQL Server, when a GUID property is configured as a primary key, the provider automatically performs value generation client-side, using an algorithm to generate optimal sequential GUID values", but it doesn't mention whether the generation is lexicographical, nor if it is stored lexicographically within the database by taking into account the endianness difference between Guids in .NET and uniqueidentifiers in SQL Server.
In SQL Server's documentation for NEWSEQUENTIALID, there is a mention of how it can be used to "generate GUIDs to reduce page splits and random IO at the leaf level of indexes".
The EF Core documentation for the SQL Server provider for Guids also mentions that "the provider automatically generates optimal sequential values, similar to SQL Server's NEWSEQUENTIALID function", but again without any mention of its impacts on clustered index fragmentation.
There's also the privacy concern that sequential uniqueidentifier values brings forward.
On the surface, it seems like Version 7 of the Guid would solve a lot of these issues, but with the lack of clarity regarding the way a Guid translate to the way it's finally stored in the database, and what EF Core does when storing it, makes me unsure on which direction to go with this.
I could write my own value converter that calls
TryWriteBytes(destination, bigEndian: true)
when storing andnew Guid(source, bigEndian: true)
when reading with abinary(16)
instead of auniqueidentifier
, but I'd rather avoid going through all of the trouble to reimplement something that may already be done internally.So any clarification would be greatly appreciated, and thanks for all your hard work @roji and @ajcvickers!
The text was updated successfully, but these errors were encountered: