List public files in dataset by API call #1717
Comments
@michbarsinai @landreev @ekraffmiller @scolapasta does the "native" API already have an endpoint for this? If not, where would be a good place for it? In case you don't know @rliebz, he's the (awesome) developer behind https://github.com/CenterForOpenScience/osf.io/tree/master/website/addons/dataverse which end users can read about at https://osf.io/getting-started/#dataverse
@scolapasta and @sekmiller can you please double-check that in 9c1d2cf I'm showing files from the most recently published version and only the most recently published version? @michbarsinai I'm pretty sure @rliebz is going to want to do the lookup by globalId (i.e. DOI) rather than database id, so that's what I implemented. However, it makes this endpoint different than the rest in the Datasets API.
The code looks correct for showing the right files. However, I agree that consistency across the API would be nice. Also, a general form of this should be straightforward, so you could get the files from any version. So something like id/version/files, where the API already has special logic for handling version as ":latest-published". But I do agree with you about supporting global id as well as db id. That should be true for all our APIs, though. So I'm going to assign this to @michbarsinai to see what he can do with this by EOD Monday. @michbarsinai could you take a look and let me know what you suggest before you start on it?
This endpoint looks like just what I needed! Also, allowing version to be specified from this endpoint is certainly something of which the OSF could easily take advantage. I do think allowing global ids in more places will make the APIs more compatible with each other, especially from the perspective of adding functionality to existing usages of the SWORD API.
@scolapasta The infrastructure has supported getting DvObjects by global id for quite some time now (see https://github.com/IQSS/dataverse/blob/master/src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java#L174). However, the DOI naming scheme contains slashes, and so does not play well with a REST API. We could introduce a scheme where the DOI is escaped, or the slashes are replaced with dashes, or maybe the whole thing is base64-encoded. Not sure any of these is a good idea; at least, none of them is very intuitive. We could offer another endpoint that converts global ids to local ones.
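To make the three encoding options above concrete, here is a minimal Python sketch of what each would do to a DOI before it is placed in a URL path segment. The DOI value is hypothetical and the variable names are illustrative only; this is not how Dataverse encodes identifiers, just what the alternatives would look like.

```python
import base64
from urllib.parse import quote, unquote

# Hypothetical DOI, used only for illustration.
doi = "doi:10.5072/FK2/EXAMPLE"

# Option 1: percent-escape every reserved character, including "/" and ":".
percent_escaped = quote(doi, safe="")

# Option 2: replace slashes with dashes. This is lossy: a DOI that already
# contains a dash cannot be reliably converted back.
dashed = doi.replace("/", "-")

# Option 3: URL-safe base64, which never produces "/" or "+".
b64 = base64.urlsafe_b64encode(doi.encode("utf-8")).decode("ascii")

# Options 1 and 3 round-trip back to the original DOI.
assert unquote(percent_escaped) == doi
assert base64.urlsafe_b64decode(b64).decode("utf-8") == doi

print(percent_escaped)
print(dashed)
print(b64)
```

All three results are free of slashes, but none of them is recognizable as the DOI a SWORD client would already hold, which is the intuitiveness concern raised above.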
@michbarsinai I think that because the SWORD API takes DOIs in their original form, it would be a bit cumbersome to have to use differently-escaped versions of the same DOI to be able to use the two APIs. In that sense, I think if you escaped/replaced/base64-ed DOIs for the native API, the SWORD API should accept those as well. I do think an endpoint converting global IDs to local IDs would make it relatively simple to hook any native API endpoint into existing SWORD clients.
@rliebz I also lean towards a conversion endpoint. Using global identifiers in an encoded form does not sound intuitive at all.
This is now supported for all versions via the standard datasets API. Opened new issue #1837 for persistent id support.
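For clients, a request against the datasets API might be built as in the sketch below. It assumes the path follows the id/version/files shape discussed earlier in this thread, under an `/api/datasets/` prefix; the server name and dataset id are hypothetical, and `files_url` is an illustrative helper, not part of any Dataverse client library.

```python
from urllib.parse import quote


def files_url(server, dataset_id, version=":latest-published"):
    """Build a URL for listing the files of one dataset version.

    dataset_id is the database id. version may be a version number such
    as "1.0" or a special name such as ":latest-published".
    The path layout is an assumption based on the id/version/files shape.
    """
    return "{}/api/datasets/{}/versions/{}/files".format(
        server.rstrip("/"),
        dataset_id,
        quote(str(version), safe=":."),  # keep ":" and "." readable
    )


# Hypothetical server and dataset id.
print(files_url("https://dataverse.example.org", 42))
print(files_url("https://dataverse.example.org/", 42, "1.0"))
```

Defaulting to ":latest-published" matches the original request in this issue: only files from the most recently published version, never from a draft.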
Using the SWORD API route http://guides.dataverse.org/en/latest/api/sword.html#display-a-dataset-statement, it is possible to retrieve a list of files from the most recent version of a dataset. Because the most recent version will only be "released" (published) when a newer draft version does not exist, it is not always possible to retrieve only the public files from this route.
The metadata route specified in http://thedata.harvard.edu/guides/dataverse-api-main.html#metadata, which was available in 3.6 and provided this functionality by listing available materials from the latest released version of a dataset, is no longer supported.
Ideally, an API route that listed public files in a dataset would take a dataset ID or DOI and return a list of files for that dataset, including filenames and file IDs. Additionally, it could take a version (as in http://guides.dataverse.org/en/latest/api/native-api.html#datasets) and return the files for that version.