Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full-text search support for EF Core #1

Closed
roji opened this issue Apr 14, 2016 · 13 comments
Closed

Full-text search support for EF Core #1

roji opened this issue Apr 14, 2016 · 13 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@roji
Copy link
Member

roji commented Apr 14, 2016

From @roji on April 10, 2016 10:31

Same as #999 but for EF Core. @rwasef1830 you're welcome to do this whenever you want.

Note that there's the question of where the C# text functions should live (in EF6 it's NpgsqlTextFunctions), EF Core has an opinion on this - follow
dotnet/efcore#2850.

Copied from original issue: npgsql/npgsql#1021

@rwasef1830
Copy link
Contributor

rwasef1830 commented Feb 10, 2017

@roji I'm planning to take a stab at this, since I'll need EF Core soon. A little issue though, dotnet/efcore#2850 is still open.

Do you suggest I go with a static NpgsqlTextFunctions class ? It seems EF.Functions still hasn't been implemented, so I can't follow the recommendation there. Or perhaps, Npgsql.EF.Functions ?

@roji
Copy link
Member Author

roji commented Feb 10, 2017

Great, that would be a really nice addition! Good question on EF.Functions.

I don't know full-text very well, but here's a different direction... One of the nice things about EF Core is that it allows easy translation between any .NET expression type type and SQL - it doesn't have to by a static method somewhere. Ideally, users should be able to perform PostgreSQL operations by using the same .NET APIs they're already used to. For example, to do a regex search, they simply do Regex.Match() as they're used to, and the EF Core provider translates that into PostgreSQL's regex operator.

So, we could implement full-text support by translating expressions that make use of Npgsql's NpgsqlTsVector and NpgsqlTsQuery. For example, it could look like the following:

var query = new NpgsqlTsQuery(...).
var result = ctx.Books.Where(b => b.IsMatch(query));

In this example, IsMatch() is an extension method added by the Npgsql EF Core, which recognizes it while doing query translation and emits the @@ operator. This is just the very basic operation - we'd also have to think about tsvector, etc.

What do you think? This makes these special operations very idiomatic and easy to use, without a global static class. If you like this direction, it would be good to submit a proposal for the API you'd like to implement (it obviously doesn't have to cover anything, at least not at first).

Another point to consider is the definition of full-text searches in migrations... The docs show the use of expression indices. In #119 I looked at modeling this in EF Core, but it turned out PostgreSQL supports extremely flexible scenarios (e.g. a single index over multiple expressions and columns). It may be worth revisiting and modeling at least a simple, single-expression case.

Let me know your thoughts...

@rwasef1830
Copy link
Contributor

rwasef1830 commented Feb 10, 2017

Hey @roji,
Thanks for the ideas, here are my thoughts on some of them:

So, we could implement full-text support by translating expressions that make use of Npgsql's NpgsqlTsVector and NpgsqlTsQuery. For example, it could look like the following:

var query = new NpgsqlTsQuery(...).
var result = ctx.Books.Where(b => b.IsMatch(query));

In this example, IsMatch() is an extension method added by the Npgsql EF Core, which recognizes it while doing query translation and emits the @@ operator.

I think it looks idiomatic, but it encourages a false expectation that there is a C# side implementation to it (unlike Regex.IsMatch and String.Contains). I don't like the fact that when you type the period character after b, intellisense is polluted with possible members and methods that work in normal context, but are not actually usable in LINQ and vice versa. It just adds noise and harms discoverability in this case.

I completely agree with idiomatic looking code, but only when there is a valid C# implementation to such methods (such that they work in a LINQ-to-Objects query), otherwise, the methods should be completely separate (or delineated somehow) to make it absolutely clear that this is a 100% Postgresql side function.

In addition, the C# vs Postgresql semantics for tsquery are sort of different. You can cast a string as tsquery, or you can ask postgresql to convert it to one. To replicate this in idiomatic C#, you would have an explicit cast function on NpgsqlTsQuery that actually throws exceptions and for conversion, the user would use NpgsqlTsQuery.Parse in the LINQ code.

This leads to me to think that full-text-search is a strictly postgresql technology and as such should not appear as idiomatic C# code (because it has no equivalent C# implementation).

What do you think about this problem ? Maybe there is a way to reduce it without losing idiomatic shape of the API completely, I'm open to suggestions.

Another point to consider is the definition of full-text searches in migrations... The docs show the use of expression indices. In #119 I looked at modeling this in EF Core, but it turned out PostgreSQL supports extremely flexible scenarios (e.g. a single index over multiple expressions and columns). It may be worth revisiting and modeling at least a simple, single-expression case.

As for this, in the EF6 code, I didn't writing full-text-search specific code for migrations because I found that it would be a lot of work to express the full fidelity of Postgres index definition and it wouldn't be strongly typed anyway, the user is better off writing raw migration SQL to create the index using the most expressive way directly instead of through an awkward C# API.

I still think so for EF Core. Better to leave it to raw SQL migration statements.

If the user wants to add a tsvector column (and then index that), it can be modeled as a regular string property in EF6 (and depend on unknown string behavior), or maybe as a proper NpgsqlTsVector property in EF Core (but would need to map that type, not sure how in EF Core).

What do you think ?

I'll try to think of some way to better express queries and post some example API to discuss later.

@roji
Copy link
Member Author

roji commented Feb 11, 2017

@rwasef1830, I agree with all your comments. You're especially right that we should be adding extension methods to regular types (e.g. string) only in cases where the method actually exists in C# and does something meaningful without PostgreSQL/EFCore.

One small caveat to this, is that we do have two full-search types (NpgsqlTsVector and NpgsqlTsQuery) which only work in the context of PostgreSQL, so it may make sense to define extension methods over these types and translate them to SQL. I'm not saying the entire EF Core full-text search API can/should be implemented via extension methods on these types, but that where it makes sense such extension methods seem better than EF.Functions.

Regarding EF.Functions, I just posted this comment to try to get some progress. Hopefully we'll be able to at least get EF.Functions and a guideline on how to add name extension methods, and the we're unblocked. Obviously feel free to add your own comments there. What would be great is if you could start thinking about how the full-text search API would look like in Npgsql EF Core, assuming EF.Functions is there, so we can start doing back-and-forths on it.

Finally, I also agree with your comment on migrations... That's the conclusion I reached when looking at expression indices, and I guess that's still valid. People can always use raw SQL to do whatever they want.

@ShawnTheBeachy
Copy link

Any updates on where this is at currently?

@rwasef1830
Copy link
Contributor

EF Core 2.0 has landed with the EF.Functions helper we need to go ahead with this, I'm just waiting for a free chance to work on this.

@Biarity
Copy link

Biarity commented Dec 15, 2017

Any updates on this?

@rwasef1830
Copy link
Contributor

rwasef1830 commented Dec 15, 2017

I started working on this and then real life got in the way and then it just dropped off the radar. I plan to resume work on it next week. My humble apologies for the huge delay :-(

@roji
Copy link
Member Author

roji commented Dec 15, 2017

@rwasef1830, great to hear you'll be looking at this soon!

Check out some of the work that's been done in EF Core on this, I think at least some partial SqlServer support has been added (dotnet/efcore#10408, dotnet/efcore#10488, dotnet/efcore#10462). This isn't to say we should be doing what SqlServer is doing, but we should at least take a look and compare etc.

@roji
Copy link
Member Author

roji commented Mar 29, 2018

Issue number one, closed thanks to @rwasef1830 in #315!

@roji roji closed this as completed Mar 29, 2018
@roji roji removed the good first issue Good for newcomers label Mar 29, 2018
@roji roji added this to the 2.1.0 milestone Mar 29, 2018
@roji
Copy link
Member Author

roji commented May 16, 2018

@rwasef1830, I've done a lot of work on documentation on http://npgsql.org. I've left a placeholder page for full text search, do you have time to work on it?

The idea is to provide the minimal basic concepts for setting up FTS, and especially providing some code. You can take a look at some other pages to see what documentation looks like there.

If you're too busy let me know and I'll work on it.

@rwasef1830
Copy link
Contributor

@roji Sorry, I got held up at the moment for a couple of weeks. Don't have time to work on it right now.

@rwasef1830
Copy link
Contributor

@roji Opened a pull request with the documentation page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants