Bug: hyphenated phrases cause weird search behaviors #203

Open
aryehgigi opened this issue Jun 18, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@aryehgigi

Describe the Bug
According to the docs of the paper/search public API: "Hyphenated query terms yield no matches (replace it with space to find matches)" (see here).
But I don't see any difference in the matching results when I replace the hyphen with a space.

To Reproduce

>>> import requests
>>> print(requests.get("https://api.semanticscholar.org/graph/v1/paper/search/", params = {"query":'Evidence-based Syntactic Transformations for IE', "fields": "title,corpusId","offset": "0", "fieldsOfStudy": "Computer Science"}, headers={...}).json()["total"])
92
>>> print(requests.get("https://api.semanticscholar.org/graph/v1/paper/search/", params = {"query":'Evidence based Syntactic Transformations for IE', "fields": "title,corpusId","offset": "0", "fieldsOfStudy": "Computer Science"}, headers={...}).json()["total"])
92
>>> print(requests.get("https://api.semanticscholar.org/graph/v1/paper/search/", params = {"query":'Evidence Syntactic Transformations for IE', "fields": "title,corpusId","offset": "0", "fieldsOfStudy": "Computer Science"}, headers={...}).json()["total"])
1
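
The same comparison can be scripted so that all three totals come from one run. This is only a sketch, not part of the original report: `hyphen_variants`, `fetch_total` and `compare_totals` are hypothetical helper names, the request parameters are copied from the session above, and the API key header is left for the caller to supply.

```python
import re

SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search/"


def hyphen_variants(query):
    """Return the query as written, with hyphens replaced by spaces,
    and with each hyphenated suffix dropped (the three forms tried above)."""
    variants = [
        query,
        query.replace("-", " "),
        re.sub(r"-\w+", "", query),
    ]
    # Drop duplicates (e.g. for a query without hyphens), keeping order.
    return list(dict.fromkeys(variants))


def fetch_total(query, headers=None):
    """Call the search endpoint and return the `total` it reports."""
    import requests  # imported here so the other helpers work without it

    params = {
        "query": query,
        "fields": "title,corpusId",
        "offset": "0",
        "fieldsOfStudy": "Computer Science",
    }
    response = requests.get(SEARCH_URL, params=params, headers=headers)
    response.raise_for_status()
    return response.json()["total"]


def compare_totals(queries, fetch=fetch_total):
    """Map each query string to the total returned by `fetch`."""
    return {query: fetch(query) for query in queries}
```

Calling `compare_totals(hyphen_variants("Evidence-based Syntactic Transformations for IE"))` issues three live requests. The `fetch` argument exists so a stub can be passed in when checking the helpers offline.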

Expected Behavior
When I replace the hyphen with a space, I expect to get more accurate and thus fewer results.

Actual Behavior
When I replace the hyphen with a space, I get the same excessive number of results as when the hyphen is there.

Environment Details
Platform: Linux

@aryehgigi added the bug label Jun 18, 2024
@cfiorelli
Collaborator

In this case the feature for query operators on this endpoint does not exist. The documentation is going to be updated by end of today.

Thank you!

@aryehgigi
Author

@cfiorelli
Sorry, I missed that this issue was closed.

iiuc you only updated the docs but didn't change the behavior.
So I'm not sure you understood my issue, as this is still a bug (unless you intend to mark it as a known issue?).
When I am looking for a paper that has a hyphen in its title (e.g. Evidence-based Syntactic Transformations for IE), I expect either the "Evidence-based Syntactic Transformations for IE" query or the "Evidence based Syntactic Transformations for IE" query to find it. Instead, both return 92 results!
wdyt?
thanks!

@cfiorelli
Collaborator

Investigating over DM. Broken out into 2 distinct issues:

  1. Documentation update for "hyphenated query terms"
    The docs indicated a functionality which does not exist for this endpoint: Using a hyphen to exclude a keyword.
    After reviewing your report we found that the behavior is working as intended but the documentation was misleading or inaccurate. As of today it seems the docs have reverted and are again showing the misleading instruction about using hyphenated query terms. I'll take a look at what's going on here later today.

  2. Searching for a paper title fails to return the paper, but returns 92 other papers
    Holding for follow-up with @aryehgigi to make sure I've got it clear before moving on this one.
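
For point 2, one way to confirm whether the expected paper is missing from a page of results is to compare titles after normalizing hyphens, whitespace and case. This is a rough sketch under assumptions: `normalize_title` and `find_title` are hypothetical helpers, and `papers` is assumed to be the `data` list of a search response requested with `fields=title`.

```python
import re


def normalize_title(title):
    """Lowercase a title and treat runs of hyphens/whitespace as one space."""
    return re.sub(r"[\s\-]+", " ", title).strip().lower()


def find_title(papers, expected_title):
    """Return the 0-based position of the expected title in `papers`,
    or None if no entry matches after normalization."""
    target = normalize_title(expected_title)
    for position, paper in enumerate(papers):
        # A missing or null title is treated as an empty string.
        if normalize_title(paper.get("title") or "") == target:
            return position
    return None
```

A `None` result for the first page only shows that the paper is not among the returned entries on that page; later pages would need the same check with a higher `offset`.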

@cfiorelli reopened this Jul 10, 2024
@aryehgigi
Author

Yes, point 2 is the main one I was actually aiming at.

Imagine a user looking for a paper titled "AI2: the Seattle-based company..". They might try searching for seattle-based, which would not find the paper for some unknown reason.
see a real example in my initial comment of this issue

thanks a lot for reopening this!
