Partially a query parsing issue, but likely to also be an indexing issue.
Especially in technical fields, or when doing digital humanities-style queries, there are a lot of valid queries which include metacharacters. It is not clear how to represent many of these in the Lucene query syntax, or how to escape out to a simpler syntax. It is also not clear how many of these the query engine can handle at all. Some examples:
A* search in computer science ("A star" algorithm)
identifiers used in biomedicine. users could try to query by prefix, suffix, or sub-pattern. sometimes dashes, periods, spaces, or other characters carry meaning
math. even simple things like searching for exponentiation, or symbols like β (\beta in LaTeX). these appear in titles, abstracts, body, citations, etc. do we flatten these down (in a Unicode-aware way) to, e.g., "b" for indexing? expand to "beta"? other issues: function syntax, arrows, primes, dots, set inclusion, real numbers ("R"), natural numbers ("N"), dot product, etc.
chemical formulas: arrows, other notation
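To make the two halves of the problem concrete, here is a rough Python sketch, not a proposal for the actual implementation. `escape_lucene` backslash-escapes the characters that the classic Lucene query parser treats as syntax, so a term like `A*` is parsed literally. `fold_symbols` shows one possible Unicode-aware flattening: NFKC normalization followed by expanding plain Greek letters to their names. Both function names are hypothetical, and the choice to expand β to "beta" rather than "b" is an assumption made only for illustration.

```python
import re
import unicodedata

# Characters treated as syntax by the classic Lucene query parser.
_LUCENE_SPECIAL = re.compile(r'[\\+\-!():^\[\]"{}~*?|&/]')


def escape_lucene(term: str) -> str:
    """Backslash-escape query syntax characters so the term is parsed literally."""
    return _LUCENE_SPECIAL.sub(lambda m: "\\" + m.group(0), term)


def fold_symbols(text: str) -> str:
    """Hypothetical folding step: NFKC, then spell out plain Greek letters."""
    out = []
    for ch in unicodedata.normalize("NFKC", text):
        name = unicodedata.name(ch, "")
        parts = name.split(" LETTER ", 1)
        # Only handle simple names such as "GREEK SMALL LETTER BETA";
        # accented or variant letters are left unchanged.
        if name.startswith("GREEK ") and len(parts) == 2 and " " not in parts[1]:
            out.append(parts[1].lower())
        else:
            out.append(ch)
    return "".join(out)


print(escape_lucene("A*"))        # A\*
print(escape_lucene("IL-2"))      # IL\-2
print(fold_symbols("β-catenin"))  # beta-catenin
print(fold_symbols("ℝ"))          # R
print(fold_symbols("x²"))         # x2
```

Escaping only fixes the parsing side: if the analyzer drops `*` or `-` at index time, the escaped query still has nothing to match. The last line also shows the cost of naive folding, since "x²" becomes "x2" and the exponent is lost.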