
Ability to dynamically skip erroneous entries #2166

Open
anyangml opened this issue Oct 18, 2024 · 3 comments
Labels
client (Issues/PRs relating to the OPTIMADE client.), suggestions

Comments

@anyangml

Hi,
I am trying to use the OPTIMADE API to download the AFLOW dataset. However, I believe there is a server-side issue at the following endpoint: https://aflow.org/API/optimade/v1/structures?page_number=758&page_limit=100


The API will always crash at this point. I can manually loop through the pages with requests, but I wonder if there is an option to skip bad endpoints using the optimade client?

@ml-evs
Member

ml-evs commented Oct 18, 2024

Hi @anyangml, you should be able to filter out providers in the CLI with

  --exclude-providers TEXT        A string of comma-separated provider IDs to
                                  exclude from queries.

For example, run optimade-get --exclude-providers aflow,oqmd,mp. If you are using the client within your own script, you can instead pass exclude_providers=["aflow", "oqmd", "mp", "whoever"] to the OptimadeClient on initialisation.

I think generally, the AFLOW team would appreciate it if you report issues too! I can confirm that I also get a server error with that link, and it seems like they now provide an email address and issue tracker in the JSON response that you could use to report the problem.

@ml-evs added the question (Further information is requested) label Oct 18, 2024
@ml-evs changed the title from "Aflow server side issue" to "Aflow server side issue & excluding providers in client" Oct 18, 2024
@anyangml
Author

Oh, I didn't mean to skip the entire provider. I just want to skip the bad endpoints within Aflow. I believe this can be done with requests if I loop through the page numbers.
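A minimal sketch of the manual loop described above, for readers who want a starting point. It is not part of the optimade client. It uses the standard-library urllib instead of requests so that it has no third-party dependencies, and the helper names fetch_page and crawl_pages are hypothetical. It also assumes that entries are returned under the "data" key of the JSON response, as the OPTIMADE specification describes.

```python
import json
import urllib.request

# Endpoint taken from the issue; the query parameters are filled in per page.
BASE_URL = "https://aflow.org/API/optimade/v1/structures"


def fetch_page(page_number, page_limit=100, timeout=60):
    """Fetch one page and return the decoded JSON body (hypothetical helper)."""
    url = f"{BASE_URL}?page_number={page_number}&page_limit={page_limit}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def crawl_pages(first_page, last_page, fetch=fetch_page):
    """Collect entries page by page, recording failed pages instead of stopping."""
    entries, failed_pages = [], []
    for page_number in range(first_page, last_page + 1):
        try:
            body = fetch(page_number)
        except (OSError, ValueError) as exc:
            # OSError covers urllib's URLError/HTTPError (e.g. a 500 response)
            # and timeouts; ValueError covers a body that is not valid JSON.
            failed_pages.append((page_number, repr(exc)))
            continue
        entries.extend(body.get("data", []))
    return entries, failed_pages
```

Calling crawl_pages(750, 770) would then return the entries from the pages that responded, together with a list of the pages that failed, which could be included in a report to the AFLOW maintainers.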

@ml-evs
Member

ml-evs commented Oct 20, 2024

Ah, I see. That's not currently possible, though we could add it as an option (e.g., the client could automatically switch to crawling entry by entry to find the bad one).

Can you verify that the pages afterwards actually work though? Obviously it would be better if AFLOW could just fix this page for you.

I'd be happy to accept a PR for this, otherwise I will add it to the backlog.
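The entry-by-entry fallback mentioned above could look roughly like the following sketch. This is only an illustration of the idea, not the client's implementation. The names single_entry_pages and find_bad_entries are hypothetical, fetch_entry is a caller-supplied function that requests page_number=n with page_limit=1, and page numbering is assumed to start at 1, which would need to be checked against the AFLOW server.

```python
def single_entry_pages(bad_page, page_limit=100):
    """Page numbers that cover the same entries as `bad_page` when page_limit=1.

    Assumes 1-based page numbering (an assumption, not confirmed for AFLOW).
    """
    start = (bad_page - 1) * page_limit + 1
    return range(start, start + page_limit)


def find_bad_entries(bad_page, fetch_entry, page_limit=100):
    """Re-crawl a failing page one entry at a time.

    `fetch_entry(n)` is a hypothetical callable that requests
    page_number=n with page_limit=1 and returns the decoded JSON body.
    Returns the entries that could be fetched and the positions that failed.
    """
    good_entries, bad_positions = [], []
    for n in single_entry_pages(bad_page, page_limit):
        try:
            body = fetch_entry(n)
        except (OSError, ValueError):
            # Network/HTTP errors and malformed JSON both mark the entry as bad.
            bad_positions.append(n)
            continue
        good_entries.extend(body.get("data", []))
    return good_entries, bad_positions
```

Under these assumptions, page 758 with a page limit of 100 corresponds to single-entry pages 75701 through 75800, so at most 100 extra requests are needed to isolate the failing entry or entries.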

@ml-evs changed the title from "Aflow server side issue & excluding providers in client" to "Ability to dynamically skip erroneous entries" Oct 20, 2024
@ml-evs added the suggestions and client (Issues/PRs relating to the OPTIMADE client.) labels and removed the question (Further information is requested) label Oct 20, 2024