Extended computed fields #2578

Open
steve-chavez opened this issue Nov 26, 2022 · 7 comments
Labels
enhancement a feature, ready for implementation

Comments

@steve-chavez
Member

steve-chavez commented Nov 26, 2022

Problem

It's not possible to pass parameters to computed columns. For example, if I want to use the pgroonga_highlight_html function like this:

SELECT pgroonga_highlight_html(content, ARRAY['fast', 'PostgreSQL']) FROM samples;

it's currently not possible to create a computed column and pass the ARRAY['fast', 'PostgreSQL'] argument in:

GET /samples?select=*,wrapped_pgroonga_highlight_html 

Proposal

Taking Wolfgang's syntax idea for aggregates (ref) and the SEARCH HTTP method with underscore operators (ref), we could support:

SEARCH /samples?select=*,content.wrapped_pgroonga_highlight_html(body->keywords)&id=eq.1

{
  "keywords": ["fast", "PostgreSQL"]
}

With the extended computed column:

create function wrapped_pgroonga_highlight_html(tbl samples, keywords text[]) returns text as $$
  SELECT pgroonga_highlight_html(tbl.content, keywords);
$$ language sql stable;

Drawbacks

Adding a computed column for each output expression in select changes the database state and could be considered a sort of "damage" (like mocks in unit tests, see ref). To avoid this, we'd need a way to allowlist functions so they can be called directly in select.

I think computed columns are the only safe way to expose transformations in select, though. One can also define generic computed columns (taking an anyelement param) to avoid adding too many of these.

@steve-chavez steve-chavez added the idea Needs of discussion to become an enhancement, not ready for implementation label Nov 26, 2022
@adrinr

adrinr commented Nov 30, 2022

I am actually tackling a similar issue; this would be ideal for us as well!

@steve-chavez
Member Author

Taking the idea from #915 (comment), a refinement to the above syntax would be:

SEARCH /samples?select=*,$f.wrapped_pgroonga_highlight_html(content,$body.keywords)&id=eq.1

{
  "keywords": ["fast", "PostgreSQL"]
}

@steve-chavez
Member Author

steve-chavez commented Jan 25, 2023

To avoid a more complex syntax for now, we could allow:

GET /samples?select=*,wrapped_pgroonga_highlight_html&wrapped_pgroonga_highlight_html.keywords=["fast", "PostgreSQL"]

or

GET /samples?select=*,$f.wrapped_pgroonga_highlight_html&wrapped_pgroonga_highlight_html.keywords=["fast", "PostgreSQL"]

@wolfgangwalther
Member

wolfgangwalther commented Feb 26, 2023

To avoid a more complex syntax for now, we could allow:

GET /samples?select=*,wrapped_pgroonga_highlight_html&wrapped_pgroonga_highlight_html.keywords=["fast", "PostgreSQL"]

Uh, this is interesting. I like the symmetry between passing arguments to RPCs and other functions: Whenever another query string parameter does not have an operator, it's an argument. This can be a top-level argument for the RPC, or a nested argument with function names.
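
The rule described here ("no operator means it's an argument") can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PostgREST's parser: the operator set is a small subset, and the reserved parameter names and the helper `classify_param` are hypothetical.

```python
# Hypothetical sketch of the "no operator means argument" rule.
# Assumptions: a small subset of operators and of reserved parameter names.
OPERATORS = {"eq", "neq", "gt", "gte", "lt", "lte", "like", "ilike", "in", "is", "fts"}
RESERVED = {"select", "order", "limit", "offset"}


def classify_param(key: str, value: str) -> tuple:
    """Classify one query string parameter as reserved, filter or argument."""
    if key in RESERVED:
        return ("reserved", key, value)
    prefix = value.split(".", 1)[0]
    if "." in value and (prefix in OPERATORS or prefix == "not"):
        # The value starts with "<operator>.", so it is a filter.
        return ("filter", key, value)
    if "." in key:
        # Nested argument: "<function name>.<parameter>".
        target, param = key.split(".", 1)
        return ("argument", target, param, value)
    # Top-level argument, as when calling an RPC.
    return ("argument", None, key, value)
```

With this sketch, `id=eq.1` is classified as a filter, while `wrapped_pgroonga_highlight_html.keywords=["fast", "PostgreSQL"]` is classified as a nested argument for `wrapped_pgroonga_highlight_html`. Note that a plain argument value that happens to start with an operator name (for example `in.progress`) would be misread as a filter here.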

@steve-chavez
Member Author

Yeah, I like that one too. Also with aliases it can be shorter:

GET /samples?select=*,highlight:wrapped_pgroonga_highlight_html&highlight.keywords=["fast", "PostgreSQL"]

This should be straightforward to support because it reuses existing parsers.
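
As a rough illustration of how an alias could be mapped back to the computed column it names, here is a minimal Python sketch. It assumes a flat select list without casts (`::`) or embedded resources, and the helpers `select_aliases` and `resolve_target` are hypothetical names, not part of PostgREST.

```python
# Hypothetical sketch: resolve an alias used as an argument prefix.
# Assumption: flat select list, no casts ("::") and no embedded resources.
def select_aliases(select: str) -> dict:
    """Map each alias in a select list to the name it stands for."""
    aliases = {}
    for item in select.split(","):
        if ":" in item:
            alias, name = item.split(":", 1)
            aliases[alias] = name
    return aliases


def resolve_target(target: str, aliases: dict) -> str:
    """Return the real name behind an alias, or the target itself."""
    return aliases.get(target, target)
```

For `select=*,highlight:wrapped_pgroonga_highlight_html`, the prefix in `highlight.keywords` would resolve to `wrapped_pgroonga_highlight_html`, while an unaliased prefix is returned unchanged.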

@steve-chavez steve-chavez added enhancement a feature, ready for implementation and removed idea Needs of discussion to become an enhancement, not ready for implementation labels Feb 27, 2023
@steve-chavez steve-chavez changed the title More complex transformations on select Parametrized computed columns May 11, 2023
@steve-chavez steve-chavez changed the title Parametrized computed columns Extended computed columns May 17, 2023
@steve-chavez
Member Author

Whenever another query string parameter does not have an operator, it's an argument. This can be a top-level argument for the RPC, or a nested argument with function names.

I've just noticed that this might conflict with the OpenAPI fix in #1970 (comment).

So I think we should change the syntax to:

GET /samples?select=*,highlight:wrapped_pgroonga_highlight_html&highlight.keywords=arg.["fast", "PostgreSQL"]

(Note the highlight.keywords=arg.)

I'm sure that will make the transition to operators on the left side easier (#2066).
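
A minimal Python sketch of how the `arg.` prefix could be recognized is shown below. It is an assumption-laden illustration, not the actual implementation: the helper `parse_argument` is hypothetical, and decoding the value as JSON (with a fallback to the raw string) is an assumption based on the `["fast", "PostgreSQL"]` example, since the thread does not specify how values are decoded.

```python
import json

ARG_PREFIX = "arg."


def parse_argument(key: str, value: str):
    """Return (target, parameter, decoded value) for an `arg.` parameter.

    Returns None when the value does not carry the `arg.` prefix,
    for example for a filter such as id=eq.1.
    """
    if not value.startswith(ARG_PREFIX):
        return None
    # "highlight.keywords" -> target "highlight", parameter "keywords".
    target, _, param = key.rpartition(".")
    raw = value[len(ARG_PREFIX):]
    try:
        decoded = json.loads(raw)  # assumption: values are JSON-encoded
    except json.JSONDecodeError:
        decoded = raw  # assumption: otherwise keep the raw string
    return (target or None, param, decoded)
```

Because the prefix is explicit, a filter such as `id=eq.1` is never mistaken for an argument, and an argument never has to be guessed from the absence of an operator.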

@steve-chavez steve-chavez changed the title Extended computed columns Extended computed fields Dec 19, 2024
@steve-chavez
Member Author

Another use case for this is ts_rank_cd (#1758).

I'm also wondering if we could provide an "implicit computed field" for ts_rank_cd/ts_rank, to make FTS more feature-complete out of the box. This would only work for tsvector columns.

Development

No branches or pull requests

3 participants