Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for the FTS5 trigram tokenizer #1655
base: development
Are you sure you want to change the base?
Add support for the FTS5 trigram tokenizer #1655
Changes from 1 commit
85784be
3df8834
a303308
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have read the SQLite Documentation which says:
Still, I would suggest exposing two distinct options for
case_sensitive
andremove_diacritics
.The reasons for this suggestion are:
As time has passed, I have learned that the Swift version of infrequently used SQLite APIs should be as close to the original as possible. It should have the full SQLite smell. This spares the user from translating the SQLite knowledge that was not always easily acquired into a foreign Swift API.
SQLite evolves and fixes bugs. Today both options only support 0 and 1, but this may change. It is good to let the user extend the list of options, even if GRDB api lags behind:
This kind of SQLite evolution has happened for Unicode61:
remove_diacritics=2
was introduced once a bug was discovered inremove_diacritics=1
.Some GRDB APIs, such as
FTS5.Diacritics
used by Unicode61, do not support such extensibility. This is not an example to follow.Maybe we could aim at:
API:
The
FTS5.TrigramDiacriticsOption
values (.keep
and.remove
), which require SQLite 3.45, would be unavailable on Apple platforms.FTS5.TrigramDiacriticsOption
can remain fully available.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re-reading myself with a critic eye, I stick to my initial arguments about the "SQLite smell", but I really start to wonder if
TrigramCaseSensitiveOption
andTrigramDiacriticsOption
are that useful.This would pass my review like a breeze :-)
Yes, this would make the Swift FTS5 apis not very consistent. Keeping the old existing apis is very important, because all code that users can put in a migration must basically compile forever (modifying a migration is a cardinal sin). This should not prevent new apis from being shaped by the experience that was acquired over the years.
I leave it up to your good taste 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed feedback!
When using the raw
Int?
arguments we obviously can't attach any availability information. Do you think it would make sense to note the SQLite >= 3.45 requirement in the documentation for theremoveDiacritics
parameter? Or better to just leave that to the SQLite documentation entirely since we can't enforce it anyway?(In theory we could also have two
trigram()
methods as well, one only taking thecaseSensitive
parameter and a second one with both parameters with more restrictive availability. But I don't think that makes sense since that approach won't work if SQLite adds more options for the existing parameters so probably better not to try to enforce availability for the parameters at all than try to partially do it.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. I see. Yes it wouldn't be great if people could write
t.trigram(caseSensitive: 3)
in an app that targets iOS 17. OK, I have been a poor guide here.So are we back to
TrigramCaseSensitiveOption
andTrigramDiacriticsOption
, with built-in values subject to availability checkings, along with raw initializers for people who know what they are doing?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that makes sense, that makes it possible to add availability checks while allowing the user to use raw values to bypass restrictions if needed.
A couple of minor questions:
Do we make
TrigramDiacriticsOption
itself unavailable for non-custom SQLite or only theremove
option? I don't think there is a use case for manually using a raw value on OSs where it isn't supported anyway?Thoughts on adding
Equatable
,Hashable
andBitwiseCopyable
conformances forTrigramCaseSensitiveOption
andTrigramDiacriticsOption
; I don't imagine these would see much use but probably no downside to adding them either?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️