Expose all download history through the API #557

Closed
Eh2406 opened this issue Feb 15, 2017 · 11 comments
Labels
A-backend ⚙️ C-enhancement ✨ Category: Adding new behavior or a change to the way an existing feature works

Comments

@Eh2406
Contributor

Eh2406 commented Feb 15, 2017

Hi,

What is the best way to get the download history for all crates? The data is publicly available by scraping the graphs out of each crate's page, but that is rude and inelegant.

If the data is available, how can I improve the docs to make it easier to find?
If it is not available, what can I do to add that functionality?

@alexcrichton
Member

Right now the only route for this is the downloads route, but that just provides the data you see rendered already. I don't believe there's a route for a historical, paginated version of this. Adding one would be fine though!

@Eh2406
Contributor Author

Eh2406 commented Feb 15, 2017

Thank you for that link! I will try to grok the code and open a PR. So far I mostly have questions. :-P

I think the comment may be out of date, or I don't know how to read it: the link /crates/quadrature/downloads just gets me an error. It seems to be matching L89: "/crates/:crate_id/:version" instead of L94: "/crates/:crate_id/downloads".

Is there some kind of caching for tx.prepare(..), or are we rebuilding the statement with each request?

Is there some kind of caching for the website? All the download counts (except for today's data) are going to be static, so it seems a shame to hit the database repeatedly.

Is there a schema for the tables that we can query?

@alexcrichton
Member

Nah, currently we don't have any caching; everything hits the database. There's also currently no caching around tx.prepare(..). The schema is probably best looked at by following the README to prepare a local database and exploring that.

@Eh2406
Contributor Author

Eh2406 commented Feb 15, 2017

I will experiment with a local instance when I have a chance. :-)

How do I hit the "/crates/:crate_id/:version" target? Everything I try just gets me error messages.

@alexcrichton
Member

This should do the trick:

curl -H 'Content-Type: application/json' https://crates.io/api/v1/crates/libc/downloads

@carols10cents changed the title from "programmatically got downlod history?" to "Expose all download history through the API" on Feb 21, 2017
@carols10cents added the A-API and C-enhancement ✨ (Category: Adding new behavior or a change to the way an existing feature works) labels on Feb 21, 2017
@Eh2406
Contributor Author

Eh2406 commented Feb 24, 2017

Concrete suggestion:
Add an offset query parameter to version::downloads. Its value is how many days before today the most recent result should fall.

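// Days back from today for the most recent result; 0 if missing or invalid.
let offset = req.query().get("offset").and_then(|s| s.parse::<i64>().ok()).unwrap_or(0);
let cutoff_date_end = ::now() + Duration::days(-offset);
let cutoff_date_start = cutoff_date_end + Duration::days(-90);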

This is the smallest change I can think of that makes the data available.
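For readers following along, here is a minimal, standalone sketch of the offset idea. It is not the crates.io code: `parse_offset`, `window_for_offset`, the plain `HashMap` of query parameters, and the use of integer day numbers in place of real timestamps are all assumptions made so the example runs with only the standard library. Treating negative offsets as 0 is also an assumption; the suggestion above does not say what should happen to them.

```rust
use std::collections::HashMap;

/// Length of the window the endpoint returns, in days (90 in the thread).
const WINDOW_DAYS: i64 = 90;

/// Parse a non-negative `offset` from already-decoded query parameters.
/// Missing, malformed, or negative values fall back to 0 (assumption).
fn parse_offset(query: &HashMap<String, String>) -> i64 {
    query
        .get("offset")
        .and_then(|raw| raw.parse::<i64>().ok())
        .filter(|days| *days >= 0)
        .unwrap_or(0)
}

/// Return `(start, end)` as day numbers. `today` is a day number (for
/// example, days since the Unix epoch) and `end` lies `offset` days back.
fn window_for_offset(today: i64, offset: i64) -> (i64, i64) {
    let end = today - offset;
    (end - WINDOW_DAYS, end)
}

fn main() {
    let mut query = HashMap::new();
    query.insert("offset".to_string(), "30".to_string());

    let offset = parse_offset(&query);
    let (start, end) = window_for_offset(17_000, offset);
    println!("offset={} start={} end={}", offset, start, end);
}
```

Falling back to 0 keeps the existing behaviour (the most recent 90 days) for any client that does not send the parameter.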

@Eh2406
Contributor Author

Eh2406 commented Mar 3, 2017

Alternative suggestion:
Add a page query parameter to version::downloads. Its value is the number of 90-day windows to step back from today.

// Number of 90-day pages back from today; 0 if missing or invalid.
let page = req.query().get("page").and_then(|s| s.parse::<i64>().ok()).unwrap_or(0);
let cutoff_date_end = ::now() + Duration::days(-90 * page);
let cutoff_date_start = cutoff_date_end + Duration::days(-90);

Is either of these an acceptable idea? How can they be improved?
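To illustrate how the page variant would walk back through history, here is a small standalone sketch. The function name `window_for_page` and the integer day numbers are hypothetical stand-ins for the real timestamp arithmetic, and whether the boundaries are inclusive or exclusive is an assumption the suggestion above leaves open.

```rust
/// Length of one page, in days (90 in the thread).
const PAGE_DAYS: i64 = 90;

/// Return `(start, end)` day numbers covered by `page`, counting back
/// from `today`. Page 0 is the most recent 90 days.
fn window_for_page(today: i64, page: i64) -> (i64, i64) {
    let end = today - PAGE_DAYS * page;
    (end - PAGE_DAYS, end)
}

fn main() {
    let today = 17_000; // hypothetical day number, e.g. days since the epoch
    for page in 0..3 {
        let (start, end) = window_for_page(today, page);
        println!("page {}: days {}..{}", page, start, end);
    }
}
```

Each page starts where the next older page ends, so a client that requests pages 0, 1, 2, and so on sees a contiguous history with no gaps.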

@carols10cents
Member

Sure, paging sounds great and would match other endpoints' interfaces.

@sgrif
Contributor

sgrif commented Mar 9, 2017

Random note if that endpoint is using Diesel by the time someone gets to it -- it can be done entirely in SQL as

use diesel::expression::dsl::*;

let cutoff_end_date = now - (90 * offset).days();
let cutoff_start_date = cutoff_end_date - 90.days();

@sgrif
Contributor

sgrif commented Mar 9, 2017

> Also there's currently no caching around tx.prepare(..).

All of the endpoints moved to Diesel do! ;) https://github.com/diesel-rs/diesel/blob/428db9515e5c7769a9313b4cf9bc14f1ced290e7/diesel/src/connection/statement_cache.rs
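As a rough illustration of what "caching around prepare" means in general, the sketch below keeps prepared statements in a map keyed by their SQL text, so each distinct query is prepared only once. This is not Diesel's implementation; `PreparedStatement`, `StatementCache`, and `get_or_prepare` are hypothetical names, and the "prepare" step is simulated with a counter instead of a database call.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a database's prepared statement handle.
struct PreparedStatement {
    sql: String,
}

/// Caches prepared statements by their SQL text.
struct StatementCache {
    statements: HashMap<String, PreparedStatement>,
    /// Counts simulated prepare round trips, for demonstration only.
    prepare_calls: usize,
}

impl StatementCache {
    fn new() -> Self {
        StatementCache {
            statements: HashMap::new(),
            prepare_calls: 0,
        }
    }

    /// Return the cached statement for `sql`, preparing it on first use.
    fn get_or_prepare(&mut self, sql: &str) -> &PreparedStatement {
        if !self.statements.contains_key(sql) {
            // A real cache would ask the database to prepare `sql` here.
            self.prepare_calls += 1;
            self.statements.insert(
                sql.to_string(),
                PreparedStatement { sql: sql.to_string() },
            );
        }
        &self.statements[sql]
    }
}

fn main() {
    let mut cache = StatementCache::new();
    let sql = "SELECT 1";

    println!("prepared: {}", cache.get_or_prepare(sql).sql);
    cache.get_or_prepare(sql); // served from the cache, no second prepare
    println!("prepare calls: {}", cache.prepare_calls);
}
```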

@Eh2406
Contributor Author

Eh2406 commented Mar 13, 2017

Closed in #611
