=author:"Wang, I" gets expanded to "Wang, Y" despite the equal sign #198

Open
marblestation opened this issue Feb 28, 2023 · 2 comments

@marblestation
Collaborator

Solr (or the solr supervisor process) generates a transliteration:

cd /app/conf # in adsnest montysolr container
grep -i ^'wang\\,\\ i=' author_generated.translit
wang\,\ i=>wang\,\ y

These synonyms are merged with the ones that we hand curate (which do not contain any Wang reference).

When one uses the "=" syntax, we would expect that all synonyms and author name expansion would be skipped, but this is not the case.
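To make the expectation concrete, here is a minimal Python sketch of the behavior described above. It is not the montysolr implementation: the helper names (`parse_rules`, `merge`, `expand_author`) and the `exact` flag are hypothetical, and the rule format is inferred only from the single `source=>target` line shown in the grep output.

```python
import re


def unescape(term):
    """Drop the backslash escapes used in the rule file, e.g. '\\,' becomes ','."""
    return re.sub(r"\\(.)", r"\1", term)


def parse_rules(lines):
    """Parse 'source=>target' lines into a dict of source -> set of targets."""
    rules = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, _, targets = line.partition("=>")
        # Split only on commas that are not escaped with a backslash
        # (assumption: several targets may be listed, comma-separated).
        parts = re.split(r"(?<!\\),", targets)
        rules.setdefault(unescape(source), set()).update(unescape(p) for p in parts)
    return rules


def merge(*rule_sets):
    """Merge generated and hand-curated rule sets into one mapping."""
    merged = {}
    for rules in rule_sets:
        for source, targets in rules.items():
            merged.setdefault(source, set()).update(targets)
    return merged


def expand_author(name, rules, exact=False):
    """Return the set of names to search; with exact=True nothing is expanded."""
    key = name.lower()
    if exact:
        return {key}
    return {key} | rules.get(key, set())


generated = parse_rules([r"wang\,\ i=>wang\,\ y"])
curated = {}  # the hand-curated file has no Wang entry
rules = merge(generated, curated)

expanded = expand_author("Wang, I", rules)             # current behavior
exact_only = expand_author("Wang, I", rules, exact=True)  # expected for =author:
```

With `exact=True` the lookup returns only the name as typed, which is what the "=" syntax is expected to do; without it, the generated transliteration adds "wang, y".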

@JCRPaquin
Contributor

I'm not sure what the purpose of author_generated.translit is or how it was originally created. Is it still necessary to maintain it alongside our hand-curated author name transliterations?

> When one uses the "=" syntax, we would expect that all synonyms and author name expansion would be skipped, but this is not the case.

Is this what we expect from user behavior (e.g. sampled user queries and results) or from our design? It might be possible to prevent synonym/transliteration expansion for queries, but it's hard to say that's the right course of action without assessing user impact.

@JCRPaquin self-assigned this on Aug 25, 2023
@JCRPaquin added the "authors" label and removed the "search" label on Oct 16, 2023
@JCRPaquin
Contributor

Query annotations, like the exact search annotation (=), currently don't propagate to the query code: with the way the code is written today, we can't access the query AST. I have a patch that enables propagating this information sitting on a branch, but it's not currently my priority, so it'll be a while before I post another update on this issue.

The patch in a nutshell: when executing the Query objects generated for each subquery, we provide a character stream. If we inject a wrapper that exposes the AST node the Query object originated from, the query code can traverse the character stream wrappers until it finds the one holding the AST node reference.
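A rough illustration of that idea, written in Python rather than the project's Java and not taken from the actual patch. All names here (`AstNode`, `StreamWrapper`, `AstNodeStream`, `find_ast_node`) are hypothetical stand-ins; the sketch only assumes that stream wrappers nest and that each keeps a reference to the stream it wraps.

```python
import io
from dataclasses import dataclass


@dataclass
class AstNode:
    """Hypothetical query AST node; `exact` stands for the '=' annotation."""
    field: str
    value: str
    exact: bool = False


class StreamWrapper:
    """Generic character stream wrapper that delegates reads to `inner`."""

    def __init__(self, inner):
        self.inner = inner

    def read(self, size=-1):
        return self.inner.read(size)


class AstNodeStream(StreamWrapper):
    """Wrapper that also carries the AST node the query text came from."""

    def __init__(self, inner, node):
        super().__init__(inner)
        self.node = node


def find_ast_node(stream):
    """Walk the wrapper chain until a wrapper carrying an AST node is found."""
    while stream is not None:
        if isinstance(stream, AstNodeStream):
            return stream.node
        # Plain streams have no `inner`, which ends the walk.
        stream = getattr(stream, "inner", None)
    return None


node = AstNode(field="author", value="Wang, I", exact=True)
# Another wrapper sits on top, as other filters in the chain might.
stream = StreamWrapper(AstNodeStream(io.StringIO(node.value), node))

found = find_ast_node(stream)
skip_expansion = found is not None and found.exact
text = stream.read()
```

Code that consumes the stream could then check `skip_expansion` before applying synonyms or transliterations, while still reading the same characters as before.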
