Restructuring backend as MVC #912

Closed
vignesh-sankaran opened this issue Jul 25, 2017 · 13 comments
Labels
C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear

Comments

@vignesh-sankaran
Contributor

Relaying the thoughts of @carols10cents from an email sent to me. We could look into setting up modules with the following structure:

  • Model: All DB interaction structs
  • View: Encodable implementations e.g. Encodable Crate
  • Controller: All functions associated with API endpoints

This would also require a restructuring of unit and integration tests, so this would definitely not be a small task.
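
To make the proposed split concrete, here is a minimal, self-contained sketch of the three layers as Rust modules. Every name in it (`model::Crate`, `view::EncodableCrate`, `controller::show`) and the in-memory slice standing in for the database are assumptions made for illustration, not the actual crates.io code.

```rust
// Hypothetical sketch of an MVC-style module split; not the real crates.io code.

pub mod model {
    /// Model: a row as stored in the database (simplified).
    pub struct Crate {
        pub id: i32,
        pub name: String,
        pub downloads: i32,
    }

    impl Crate {
        /// Stand-in for a database query; the slice plays the role of the table.
        pub fn find_by_name<'a>(rows: &'a [Crate], name: &str) -> Option<&'a Crate> {
            rows.iter().find(|krate| krate.name == name)
        }
    }
}

pub mod view {
    use super::model::Crate;

    /// View: the shape that would be serialized for API clients.
    #[derive(Debug, PartialEq)]
    pub struct EncodableCrate {
        pub id: String,
        pub downloads: i32,
    }

    impl<'a> From<&'a Crate> for EncodableCrate {
        fn from(krate: &'a Crate) -> Self {
            // Assumption of this sketch: clients see the name, not the numeric id.
            EncodableCrate { id: krate.name.clone(), downloads: krate.downloads }
        }
    }
}

pub mod controller {
    use super::model::Crate;
    use super::view::EncodableCrate;

    /// Controller: glue for an endpoint such as `GET /crates/:crate_id`.
    pub fn show(rows: &[Crate], crate_id: &str) -> Result<EncodableCrate, String> {
        Crate::find_by_name(rows, crate_id)
            .map(|krate| EncodableCrate::from(krate))
            .ok_or_else(|| format!("crate `{}` not found", crate_id))
    }
}
```

In this layout the controller is the only layer that knows about both of the others, which is what would let the model and view be tested on their own.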

Thoughts @carols10cents?

@carols10cents
Member

I am into this! I would prefer to do this incrementally, if possible, so that there isn't one big PR that would need to be reviewed and kept up to date. Even though the intermediate state of being half reorganized might be more confusing, it would be MUCH easier for me to have smaller independent-ish PRs to review and merge :)

@jtgeibel
Member

I think a good place to start would be to move the API endpoints under cargo and frontend modules. In particular, it would be nice to see the following in src/lib.rs:

// used by both `cargo search` and the frontend
api_router.get("/crates", C(krate::index));

// Cargo only routes - comments below reference src/crates-io/lib.rs in cargo's repository
api_router.put("/crates/new", C(krate::cargo::new));                              // publish()
api_router.get("/crates/:crate_id/:version/download", C(krate::cargo::download)); // src/cargo/sources/registry/remote.rs:download()
api_router.get("/crates/:crate_id/owners", C(krate::cargo::owners));              // list_owners()
api_router.put("/crates/:crate_id/owners", C(krate::cargo::add_owners));          // add_owners()
api_router.delete("/crates/:crate_id/owners", C(krate::cargo::remove_owners));    // remove_owners()
api_router.delete("/crates/:crate_id/:version/yank", C(version::cargo::yank));    // yank()
api_router.put("/crates/:crate_id/:version/unyank", C(version::cargo::unyank));   // unyank()

// Frontend routes
api_router.get("/crates/:crate_id", C(krate::frontend::show));
api_router.get("/crates/:crate_id/:version", C(version::frontend::show));
api_router.get("/crates/:crate_id/:version/dependencies", C(version::frontend::dependencies));
api_router.get("/crates/:crate_id/:version/downloads", C(version::frontend::downloads));

The search API is the only route I've found that is currently used by both.

I've attached my list of all backend routes and where they are used. I hope to get this cleaned up and submitted as documentation soon.

routes.xlsx

@jtgeibel
Member

I have several commits where I've started working on this and I'm looking for some feedback on the general approach. For now I've started with @sgrif's completed Diesel port since that removes a lot of code and there is less to move around. Once that lands I can rebase and clean this up for a PR.

I moved the functionality in commits one route at a time, except for routes that share a lot of code like yank and unyank.

My main question is, does this module structure look good? I moved cargo functionality and associated private functions under krate::cargo and version::cargo. While I have not done this yet, I would move much of the remaining krate and version functionality to krate::frontend and version::frontend respectively.

I see no reason to split any of the other API modules (keyword, category, user, token) as they are all frontend only. I think a few other files, such as src/dependency.rs and src/owners.rs, could also be split in a similar way as some of the functions they provide are used only by the new cargo submodules.

@carols10cents carols10cents added the C-internal 🔧 Category: Nonessential work that would make the codebase more consistent or clear label Aug 2, 2017
@vignesh-sankaran
Contributor Author

This looks like a good start; the separation of cargo and frontend endpoint code looks good. I feel we should aim to separate all DB-related code into its own namespace, apart from the code linked to API endpoints. I'm not sure how we can further separate View-Controller code, e.g. the /crates/:crate_id endpoint from the krate function.

I'd like to set up a tracking issue, maybe we could set up a meeting over IRC or Skype @jtgeibel?

@vignesh-sankaran
Contributor Author

vignesh-sankaran commented Sep 30, 2017

OK, so I've had a look through the codebase. Here are some suggestions for how we could move forward with the refactor of crates.io.

@jtgeibel recently created separate modules for krate and version. The next steps from here are to split out cargo and frontend related functionality in both modules into cargo.rs and frontend.rs separately.

An idea I've had is that we could put all the structs in the mod.rs, and for functions in the struct that are only used in either cargo.rs or frontend.rs, we can add those into the respective files via another impl on the struct. My knowledge on this area of Rust is admittedly a little hazy, I think it's possible but I'm not sure what other implications of doing this might be. Functions in the struct used between the 2 files, and externally can be impl'd in mod.rs.
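
On the question of whether this is possible: Rust does allow several inherent `impl` blocks for one type, and they may live in different modules of the same crate. A minimal sketch follows, using a simplified `Crate` struct and hypothetical method names; inline `mod` blocks stand in for `krate/mod.rs`, `krate/cargo.rs` and `krate/frontend.rs`.

```rust
// Inline modules stand in for krate/mod.rs, krate/cargo.rs and krate/frontend.rs.
pub mod krate {
    /// Defined once, in what would be krate/mod.rs.
    pub struct Crate {
        pub name: String,
        pub max_version: String,
    }

    impl Crate {
        /// Shared constructor, usable from both submodules and from outside.
        pub fn new(name: &str, max_version: &str) -> Crate {
            Crate { name: name.to_string(), max_version: max_version.to_string() }
        }
    }

    pub mod cargo {
        use super::Crate;

        // A second inherent impl block for the same type, in another module.
        impl Crate {
            /// Hypothetical helper only needed by cargo-facing endpoints.
            pub fn download_path(&self) -> String {
                format!("/crates/{}/{}/download", self.name, self.max_version)
            }
        }
    }

    pub mod frontend {
        use super::Crate;

        impl Crate {
            /// Hypothetical helper only needed by frontend-facing endpoints.
            pub fn display_title(&self) -> String {
                format!("{} v{}", self.name, self.max_version)
            }
        }
    }
}
```

One implication worth knowing: the location of an `impl` block does not scope a `pub` method, so `download_path` is callable from the frontend module too. A method declared without `pub` is private to the module containing its `impl` block, which is one way to keep cargo-only helpers from leaking.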

For user, token.rs and keyword.rs, we could follow a similar structure, where we take out the struct into a mod.rs, and leave all the functions in either a frontend.rs, cargo.rs, or have both files, depending on which functions are used by either endpoint. My reasoning for this is that it keeps the line count within a file smaller, which makes it easier to find something e.g. a function, and also to keep consistency with the krate and version modules.

As for category.rs, dependency.rs, download.rs and owner.rs, we could put these files into a struct module to allow for sharing across the other modules. This is something to research further though since there are functions not in structs in these files.

Regarding all middleware files, we could have them in a middleware module so that it's clear what files like local_uploads.rs are doing.

I'm thinking we could have every file in a module of some sort, so we don't have any stray files, and feel it could be clearer to tell what a file / module does without having to look inside of it. For some files e.g. render.rs, I think they are only utilised by specific modules i.e. krate. We could also move email.rs inside user since that's the only place this file's contents are utilised in.

I'd also like to refactor the tests so that all tests related to specific functionality get put into the files and modules they are testing for, rather than having them in the src/tests directory. Perhaps we could have src/tests for blackbox tests only i.e. spin up a server, and hit the server directly, accounting for different inputs and their expected outputs. Perhaps it's worth me creating a separate issue on this?
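
For reference, Rust's usual way to colocate tests is a `#[cfg(test)]` submodule at the bottom of the file under test, which would leave the separate tests directory for black-box tests. A minimal sketch; the `is_valid_name` function and its rules are made up for illustration and are not the crates.io validation logic.

```rust
/// Hypothetical helper: a name must start with an ASCII letter and may
/// continue with ASCII alphanumerics, `-` or `_`.
pub fn is_valid_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(first) if first.is_ascii_alphabetic() => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
        }
        _ => false,
    }
}

// Compiled only for `cargo test`; lives next to the code it covers.
#[cfg(test)]
mod unit_tests {
    use super::is_valid_name;

    #[test]
    fn accepts_simple_names() {
        assert!(is_valid_name("serde_json"));
    }

    #[test]
    fn rejects_empty_names_and_leading_digits() {
        assert!(!is_valid_name(""));
        assert!(!is_valid_name("1crate"));
    }
}
```

Because the test module is a child of the module it covers, it can also exercise private functions, which black-box tests cannot.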

@kureuil
Contributor

kureuil commented Sep 30, 2017

Hey everyone, just giving my 2 cents on this issue :)

@jtgeibel recently created separate modules for krate and version. The next steps from here are to split out cargo and frontend related functionality in both modules into cargo.rs and frontend.rs separately.

So first of all, I don't really see the gain of tagging endpoints with either a cargo or a frontend tag and I don't even think an explanation was ever given on how this improves on the status quo. Then, why only choose cargo ? A wiki page was recently created to track projects that use crates.io's API, why wouldn't we also have a docsrs module for endpoints used by docs.rs for example ? Another problem I have with this separation is that it is not future-proof: imagine cargo stops using some endpoint. Should we move the endpoint to the mod.rs saying that since version x.y.z of cargo it is not used anymore, or should it stay in the cargo.rs file and say that it is not used by cargo since version x.y.z ?

I think a far more sensible approach to refactoring the backend of crates.io would be to first split modules (that are getting out of hand) by functionality. Taking the krate.rs module, we could split it this way:

  • mod.rs: Crate type definition & impl
  • read.rs: index, summary, show endpoints, everything related to listing or getting specific crate's information
  • publishing.rs: new, parse_new_headers crate publishing related things.
  • meta.rs/link.rs: versions, reverse_dependencies, downloads, readme & all meta kind of links
  • following.rs: Follow, follow, unfollow, following & all following-related code.

IMO, this separation makes it far easier to find what you are looking for in the crates.io codebase, and it also makes it easier to split up the module because you don't have to ask "Does cargo use this endpoint?" or anything like that.

Then, the second step to refactoring the backend of crates.io would be to extract all conduit endpoints into a dedicated endpoints.rs module, which will only contain the smallest HTTP endpoints whose only job is to call the business logic from the corresponding module. This makes the code easier to test (we don't have to launch an HTTP server to test the yanking behavior) and it makes it easier for us to migrate from Conduit to another web framework (though due to the current state of the web framework ecosystem, we might have to wait a little longer).

Regarding all middleware files, we could have them in a middleware module so that it's clear what files like local_uploads.rs are doing.

Big 👍 from me on this one, though the example might not be the best. I think the LocalUploads middleware should eventually be replaced by a proper filesystem & S3 abstraction à la Django Storages, but this would at least need another issue and probably even a new crate.

I'm thinking we could have every file in a module of some sort, so we don't have any stray files

Well, technically even "stray files" are modules in Rust, so I don't see the benefit of creating a folder whose sole purpose is to contain a single file named mod.rs (which makes it less clear what its actual content will be). Moreover, this wouldn't even help with an example.rs to example/mod.rs migration, as those are the same in a Rust project.

For some files e.g. render.rs, I think they are only utilised by specific modules i.e. krate. We could also move email.rs inside user since that's the only place this file's contents are utilised in.

I'm not really sure moving these utils-kind of modules deeper into the project's hierarchy is the best course of action. For example, the render.rs isn't related in any kind to the krate module and it probably shouldn't ever be. Moving it into the krate module would add unnecessary coupling between the two and would slow us down in case of refactoring or if we wanted to use the rendering functionality from another module. Same thoughts for the email module.

I'd also like to refactor the tests so that all tests related to specific functionality get put into the files and modules they are testing for, rather than having them in the src/tests directory. Perhaps we could have src/tests for blackbox tests only i.e. spin up a server, and hit the server directly, accounting for different inputs and their expected outputs. Perhaps it's worth me creating a separate issue on this?

This would be good, but it would imply being able to test the code without spinning up a server for each test, which means decoupling the code from the HTTP server. Anyway, 👍 on this: the src/tests directory should make sure that we never break the API's backwards compatibility.

@vignesh-sankaran
Contributor Author

Thanks for your input @kureuil, my responses are below.

So first of all, I don't really see the gain of tagging endpoints with either a cargo or a frontend tag and I don't even think an explanation was ever given on how this improves on the status quo. Then, why only choose cargo ? A wiki page was recently created to track projects that use crates.io's API, why wouldn't we also have a docsrs module for endpoints used by docs.rs for example ?

So I think the reasoning for splitting modules into cargo and frontend modules is that we can split out the endpoints known to us and make changes to them independently of each other, and more importantly, not accidentally change something that will break the cargo endpoints. Yes, we don't know all external users, but we only have to provide a guarantee that cargo and the frontend work, since technically crates.io isn't an API service in the way other services are, e.g. GitHub's GraphQL API. We could, however, make an exception and guarantee non-breaking changes for docs.rs as well.

It's a bit off topic, but it could be possible to set up GraphQL for crates.io via the juniper crate for Iron, but that'd require porting everything over to Iron, and if Rocket or another framework becomes the de facto web standard for Rust in a couple of years, then we'd have to migrate to that as well. If we're refactoring, I feel a discussion about Iron at the very least could be had as well, though it'd be a monumental amount of work, possibly even verging on re engineering crates.io...

Another problem I have with this separation is that it is not future-proof: imagine cargo stops using some endpoint. Should we move the endpoint to the mod.rs saying that since version x.y.z of cargo it is not used anymore, or should it stay in the cargo.rs file and say that it is not used by cargo since version x.y.z ?

So in this example, we'd deprecate the endpoint and keep it in the cargo module. I don't think we can remove the endpoint because of Rust's stability guarantees.

I think a far more sensible approach to refactoring the backend of crates.io would be to first split modules (that are getting out of hand) by functionality. Taking the krate.rs module, we could split it this way:

This sounds like a good plan to start the refactor of the krate module. It depends if we choose to go down the cargo and frontend separation though.

Ideally, in this refactoring, it'd be nice to have no file be greater than ~300 lines or roughly avoid scrolling more than a monitor length within a file.

Well, technically even "stray files" are modules in Rust, so I don't see the benefit of creating a folder whose sole purpose is to contain a single file named mod.rs (which makes it less clear what its actual content will be). Moreover, this wouldn't even help with an example.rs to example/mod.rs migration, as those are the same in a Rust project.

So the modules category.rs, dependency.rs, download.rs, and owner.rs are fairly low-level to the database and only contain structs. I'd like to group these somehow, since it adds cognitive load to have to know that these files don't have any functions that are directly linked to API endpoints.

I'm not really sure moving these utils-kind of modules deeper into the project's hierarchy is the best course of action. For example, the render.rs isn't related in any kind to the krate module and it probably shouldn't ever be. Moving it into the krate module would add unnecessary coupling between the two and would slow us down in case of refactoring or if we wanted to use the rendering functionality from another module. Same thoughts for the email module.

Fair enough, we could keep the rest of the modules as is then.

I'm going to create a separate issue for the test refactoring, feel it'll help make things clearer regarding our current state of test coverage and subsequently work to improve that.

@kureuil
Contributor

kureuil commented Oct 1, 2017

So I think the reasoning for splitting modules into cargo and frontend modules is that we can split out the endpoints known to us and make changes to them independently of each other, and more importantly, not accidentally change something that will break the cargo endpoints. Yes, we don't know all external users, but we only have to provide a guarantee that cargo and the frontend work, since technically crates.io isn't an API service in the way other services are, e.g. GitHub's GraphQL API. We could, however, make an exception and guarantee non-breaking changes for docs.rs as well.

I don't see why crates.io wouldn't be a public API service, as it seems that no-one is against having public documentation for the API endpoints (See #731 and #741). Limiting ourselves to only support the crates.io frontend and cargo seems like shooting ourselves in the foot (feet?) as the API is already public and people are already using it, so telling them "The API might not be stable because you're not as big as cargo" doesn't seem fair to me. We should ensure that the API has no breaking changes no matter where it is used or by whom. This is the whole purpose of namespacing the API endpoints with the /api/v1 prefix: if we have to make breaking changes, we introduce a "new" API under the /api/v2 namespace and so forth.
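
As a rough illustration of that namespacing idea, the sketch below dispatches on the version prefix so that v1 behaviour can stay frozen while breaking changes go under v2. The `/api/v2` prefix is hypothetical here, and the function names and string "handlers" are placeholders, not crates.io code.

```rust
/// Which API generation a request path addresses.
#[derive(Debug, PartialEq)]
pub enum ApiVersion {
    V1,
    V2,
}

/// Splits a path such as `/api/v1/crates/serde` into its version and the
/// remaining route (`crates/serde`). The trailing slash in each prefix keeps
/// `/api/v10/...` from being mistaken for v1.
pub fn split_version(path: &str) -> Option<(ApiVersion, &str)> {
    if let Some(rest) = path.strip_prefix("/api/v1/") {
        Some((ApiVersion::V1, rest))
    } else if let Some(rest) = path.strip_prefix("/api/v2/") {
        Some((ApiVersion::V2, rest))
    } else {
        None
    }
}

/// v1 handlers stay as they are; breaking changes only land in the v2 branch.
pub fn dispatch(path: &str) -> String {
    match split_version(path) {
        Some((ApiVersion::V1, route)) => format!("v1 handler for {}", route),
        Some((ApiVersion::V2, route)) => format!("v2 handler for {}", route),
        None => "404".to_string(),
    }
}
```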

It's a bit off topic, but it could be possible to set up GraphQL for crates.io via the juniper crate for Iron, but that'd require porting everything over to Iron, and if Rocket or another framework becomes the de facto web standard for Rust in a couple of years, then we'd have to migrate to that as well. If we're refactoring, I feel a discussion about Iron at the very least could be had as well, though it'd be a monumental amount of work, possibly even verging on re engineering crates.io...

Let's not rush ourselves here; the landscape might change once async I/O is used more widely, and Rocket might even be stable by then. About refactoring: if we decouple crates.io from conduit, it would help greatly if we are to change framework and go for something like Iron/Nickel/whatever. Though that's not the issue here.

So in this example, we'd deprecate the endpoint and keep it in the cargo module.

Really not a fan of this. Moreover, if during front-end work the Ember application happens to use an endpoint that was only used by cargo before, we would have to move this endpoint into a mod.rs/common.rs file, which feels very cumbersome to me and neither easy to maintain nor easy to keep track of.

Ideally, in this refactoring, it'd be nice to have no file be greater than ~300 lines or roughly avoid scrolling more than a monitor length within a file.

I don't like this kind of "hard" rule about file length. Limiting ourselves to a certain number of lines might promote features whose logic is split across multiple files just because of the limit, making it harder for someone to contribute to that particular feature. It is also probably one of the best ways to end up with file names like krate.rs, krate2.rs, krate3.rs, krate5.rs, which say very little about their content.

So the modules category.rs, dependency.rs, download.rs, and owner.rs are fairly low-level to the database and only contain structs. I'd like to group these somehow, since it adds cognitive load to have to know that these files don't have any functions that are directly linked to API endpoints.

Maybe they should be submodules of the krate module but I don't think I'd try to merge them or anything myself.

@vignesh-sankaran
Contributor Author

vignesh-sankaran commented Oct 1, 2017

I don't see why crates.io wouldn't be a public API service, as it seems that no-one is against having public documentation for the API endpoints (See #731 and #741). Limiting ourselves to only support the crates.io frontend and cargo seems like shooting ourselves in the foot (feet?) as the API is already public and people are already using it, so telling them "The API might not be stable because you're not as big as cargo" doesn't seem fair to me. We should ensure that the API has no breaking changes no matter where it is used or by whom. This is the whole purpose of namespacing the API endpoints with the /api/v1 prefix: if we have to make breaking changes, we introduce a "new" API under the /api/v2 namespace and so forth.

My main concern here is that the crates.io backend wasn't designed with this use case in mind. An example of a service that was is GitHub: they created a GraphQL API because third-party vendors were having difficulty querying specific data out of it. We simply don't know ahead of time which data third-party users of the crates.io API want most, at least with fixed endpoints à la GitHub's v3 API. That's why I feel the priority of any API changes has to be ember and cargo, since that's all we know. I feel if we want to support external users in the same manner cargo and ember are, we need to know where the most interest lies in crates.io data, and make more general endpoints for /api/v2.

Let's not rush ourselves here; the landscape might change once async I/O is used more widely, and Rocket might even be stable by then. About refactoring: if we decouple crates.io from conduit, it would help greatly if we are to change framework and go for something like Iron/Nickel/whatever. Though that's not the issue here.

👍 regarding async I/O stabilisation, I'm not sure how we could decouple the backend from conduit to ensure an easier migration to another framework. Interested to hear your thoughts on this.

Really not a fan of this. Moreover, if during front-end work the Ember application happens to use an endpoint that was only used by cargo before, we would have to move this endpoint into a mod.rs/common.rs file, which feels very cumbersome to me and neither easy to maintain nor easy to keep track of.

Hmm, fair enough. If we decide to split on cargo and ember lines, we could add in tests to ensure that's kept up to date.

I don't like this kind of "hard" rule about file length. Limiting ourselves to a certain number of lines might promote features whose logic is split across multiple files just because of the limit, making it harder for someone to contribute to that particular feature. It is also probably one of the best ways to end up with file names like krate.rs, krate2.rs, krate3.rs, krate5.rs, which say very little about their content.

Sorry, I should have been clearer regarding this. I wanted to avoid a file with the line count of krate/mod.rs cropping up again. I took the wrong route in suggesting a rough guideline for file length, but this is probably something we can discuss if we feel it's starting to happen again.

So this discussion's starting to get a bit long. I'm thinking that we could narrow it down to a few key questions so we can set up a task list and make this an E-Mentor issue:

  • Do we want to split along cargo and ember functionality in krate and version?
  • Do we want to properly support potential external users of the crates.io API? This would include proper API documentation and knowing what users want to extract from it.
  • Do we want to migrate away from conduit to another framework? If so, what time scale do we provide for this? If not, how can we decouple the backend away from conduit to facilitate an easier framework switch in the future?
  • How do we deal with dependency.rs, download.rs, and owner.rs?

@kureuil
Contributor

kureuil commented Oct 1, 2017

My main concern here is that the crates.io backend wasn't designed with this use case in mind. An example of a service that was is GitHub: they created a GraphQL API because third-party vendors were having difficulty querying specific data out of it. We simply don't know ahead of time which data third-party users of the crates.io API want most, at least with fixed endpoints à la GitHub's v3 API. That's why I feel the priority of any API changes has to be ember and cargo, since that's all we know. I feel if we want to support external users in the same manner cargo and ember are, we need to know where the most interest lies in crates.io data, and make more general endpoints for /api/v2.

The thing is that even if it wasn't designed with this use case in mind, people are building tools against crates.io no matter what, and we probably should encourage a rich tool ecosystem imho. I think maybe another route to explore for a /api/v2 would be json:api compliance; ember-data has a built-in adapter for this kind of backend, so it might be interesting to see what it's worth. Instead of "most desired" endpoints, maybe a "most used" endpoints metric would be more useful?
By any chance, do we have any kind of analytics on which API routes are used @carols10cents ?

👍 regarding async I/O stabilisation, I'm not sure how we could decouple the backend from conduit to ensure an easier migration to another framework. Interested to hear your thoughts on this.

I'll quote myself on this one and give more details about what I'm talking about.

Then, the second step to refactoring the backend of crates.io would be to extract all conduit endpoints into a dedicated endpoints.rs module, which will only contain the smallest HTTP endpoints whose only job is to call the business logic from the corresponding module. This makes the code easier to test (we don't have to launch an HTTP server to test the yanking behavior) and it makes it easier for us to migrate from Conduit to another web framework (though due to the current state of the web framework ecosystem, we might have to wait a little longer).

For example, taking the dependencies function in the version/mod.rs module:

// Might not compile at all, no warranty

// src/lib.rs
api_router.get(
    "/crates/:crate_id/:version/dependencies",
    C(version::endpoints::dependencies),
);

// version/mod.rs
// Business logic only: no Conduit types appear in this signature.
pub fn dependencies(conn: &PgConnection, crate_name: &str, crate_version: &semver::Version) -> CargoResult<Vec<EncodableDependency>> {
    let krate = Crate::by_name(crate_name).first::<Crate>(conn)?;
    let version = Version::belonging_to(&krate)
        .filter(versions::num.eq(crate_version.to_string()))
        .first::<Version>(conn)
        .map_err(|_| {
            human(&format_args!(
                "crate `{}` does not have a version `{}`",
                crate_name,
                crate_version
            ))
        })?;
    let deps = version.dependencies(conn)?;
    let deps = deps.into_iter()
        .map(|(dep, crate_name)| dep.encodable(&crate_name, None))
        .collect();
    Ok(deps)
}

// version/endpoints.rs
// Thin HTTP layer: parse the request, call the business logic, encode the response.
pub fn dependencies(req: &Request) -> CargoResult<Response> {
    let conn = req.db_conn()?;
    let crate_name = &req.params()["crate_id"];
    let semver = &req.params()["version"];
    let version = match semver::Version::parse(semver) {
        Ok(version) => version,
        Err(_) => return Err(human(&format_args!("invalid semver: {}", semver))),
    };
    let deps = super::dependencies(&*conn, crate_name, &version)?;

    #[derive(Serialize)]
    struct R {
        dependencies: Vec<EncodableDependency>,
    }
    Ok(req.json(&R { dependencies: deps }))
}

This is just a quick example I made without compiling anything, but it should give you the big picture: the version::endpoints::dependencies function ends up with all of the Conduit-specific code, while the version::dependencies function has a more typesafe signature and is easier to test in isolation because it doesn't know that it is used in an HTTP context, and therefore it can be tested without spinning up a Conduit server. Technically, we might argue about whether the version::dependencies function should return a Vec<Dependency> or a Vec<EncodableDependency>.
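
To show the testing benefit in a form that does compile on its own, here is a stripped-down variant of the same split. It is only a sketch: the `Store` map stands in for the database, `Request` is a hypothetical framework-shaped type rather than Conduit's, and the hand-rolled JSON is there purely to avoid extra dependencies.

```rust
use std::collections::HashMap;

/// Stand-in for the database: (crate name, version) -> dependency names.
pub type Store = HashMap<(String, String), Vec<String>>;

/// Business logic: knows nothing about HTTP.
pub fn dependencies(store: &Store, crate_name: &str, version: &str) -> Result<Vec<String>, String> {
    store
        .get(&(crate_name.to_string(), version.to_string()))
        .cloned()
        .ok_or_else(|| format!("crate `{}` does not have a version `{}`", crate_name, version))
}

/// Hypothetical framework-shaped request; only the endpoint layer touches it.
pub struct Request {
    pub params: HashMap<String, String>,
}

/// Endpoint: extract parameters, call the business logic, encode the result.
pub fn dependencies_endpoint(store: &Store, req: &Request) -> Result<String, String> {
    let crate_name = req.params.get("crate_id").ok_or("missing crate_id")?;
    let version = req.params.get("version").ok_or("missing version")?;
    let deps = dependencies(store, crate_name, version)?;
    // Debug formatting is a shortcut here, not a real JSON encoder.
    Ok(format!("{{\"dependencies\":{:?}}}", deps))
}
```

A test of `dependencies` needs only a `Store` value, so the yank-style behaviours mentioned earlier could be checked the same way without constructing any request at all.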

So this discussion's starting to get a bit long. I'm thinking that we could narrow it down to a few key questions so we can set up a task list and make this an E-Mentor issue:

👍 Here are my answers :D

Do we want to split along cargo and ember functionality in krate and version?

I don't think so :)

Do we want to properly support potential external users of the crates.io API? This would include proper API documentation and knowing what users want to extract from it.

I think so, lots of work on this side, esp. documentation.

Do we want to migrate away from conduit to another framework? If so, what time scale do we provide for this? If not, how can we decouple the backend away from conduit to facilitate an easier framework switch in the future?

Probably, with no time scale, as we will probably want to wait for async I/O to stabilize. Cf. a couple of paragraphs above.

How do we deal with dependency.rs, download.rs, and owner.rs?

¯\_(ツ)_/¯

@jtgeibel
Member

jtgeibel commented Oct 2, 2017

Thanks a lot for the feedback @vignesh-sankaran and @kureuil.

So first of all, I don't really see the gain of tagging endpoints with either a cargo or a frontend tag

I agree, although so far I've been using that as a bit of a shorthand. I think it makes sense to group cargo related functionality together and agree that frontend code can be broken down better. I've seen some good suggestions in this thread and have tried to incorporate them.

When it comes to cargo related endpoints, I think there are a few special things to keep in mind. First, while we don't want to break any existing endpoints, we should be extra careful with the ones hit by cargo. For example, cargo expects us to return a 200 response even for errors. If we migrate to a /v2 api I could potentially see us maintaining these particular endpoints longer than other /v1 endpoints.
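
A sketch of what that constraint can look like at the response layer. The `Response` and `AppError` types are stand-ins, and the `{"errors":[{"detail": ...}]}` body shape is an assumption for illustration; both should be checked against the actual crates.io and cargo code.

```rust
/// Minimal stand-in for an HTTP response.
#[derive(Debug, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// Hypothetical application error carrying a human-readable message.
pub struct AppError {
    pub detail: String,
}

/// Cargo-facing endpoints: report failures in the body but keep the status at 200.
pub fn cargo_response(result: Result<String, AppError>) -> Response {
    match result {
        Ok(body) => Response { status: 200, body: body },
        Err(err) => Response {
            status: 200,
            // Debug formatting quotes the message; a real implementation
            // would use a proper JSON encoder instead.
            body: format!("{{\"errors\":[{{\"detail\":{:?}}}]}}", err.detail),
        },
    }
}
```

Keeping this conversion in one function would make the cargo-specific behaviour easy to find, and a possible /v2 could map the same `AppError` to real HTTP status codes without touching the v1 path.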

Second, cargo does a lot of read access directly against the index. Cargo does need to go through crates.io to write to the index and to upload to S3. Non-cargo routes don't interface with these subsystems or the associated error types.

For these reasons I think there is value in placing the cargo endpoints into krate/cargo.rs and version/cargo.rs files. A PR that modifies a cargo.rs file may warrant a closer review. I see these api modules as containing only the endpoint (controller level) functionality. The S3 (s3/lib.rs) and git (git.rs) related code seems to be well factored out already. The diesel and serde functionality would remain (temporarily) in the mod.rs files and could eventually be grouped under a db/ directory (or similar) for modules that implement the model and view functionality.

To summarize, I think there are several things to keep in mind in this restructuring:

  • Try to decouple things in an MVC style. In addition to krate and version we could split the MV from the C in keyword.rs, token.rs, and user/mod.rs.
  • Most endpoints only query the database, some write to the database (tokens, following, owners, etc.), and even fewer involve committing to the index and uploading to S3.
  • While we care about backwards compatibility for all of the routes, we may have to maintain the cargo routes for a longer deprecation period if we ever get to that point.

Here is my proposed outline for the krate and version portions of the directory structure:

  • krate/mod.rs - diesel and serde related stuff; move that code to db/ or something similar long-term
  • krate/cargo.rs - index, download, new, list/add/remove owners [The index route will be clearly marked in the doc comment as also used by the frontend.]
  • krate/follow.rs - beyond cargo.rs I think these are the only other krate related routes that write to the database
  • krate/owners.rs - Read-only functionality: owner_team() and owner_user(). These frontend routes are already separate from the cargo associated routes.
  • krate/downloads.rs - downloads()
  • krate/read.rs - read-only routes of data written during a publish: show(), readme(), versions(), reverse_dependencies(), summary()
  • version/mod.rs - diesel and serde related stuff; move that code to db/ or something similar long-term
  • version/cargo.rs - yank/unyank
  • version/deprecated.rs - version::index appears to be unused
  • version/downloads.rs - downloads()
  • version/read.rs - read-only routes of data written during a publish: show(), dependencies(), downloads(), authors()

bors-voyager bot added a commit that referenced this issue Oct 18, 2017
1102: Split krate and version functionality into submodules r=carols10cents

This series of commits splits the api endpoints in the `krate` and `version` modules into submodules as an initial step of the refactoring discussed in #912.

## Module structure

### `krate::search::search`

Shared by cargo and the frontend, used for searching crates.

### `krate::publish::publish`

Used by `cargo publish`.  This endpoint updates the index, uploads to S3, and caches crate metadata in the database.

### `{krate,version}/metadata.rs`

Endpoints in these files provide read-only access to data that could largely be recreated by replaying all crate uploads.  The only exception I've seen to this so far is that some responses include download counts.

### `{krate,version}/downloads.rs`

Provide crate and version level download stats (updated in `version::cargo::download`).

### `krate/owners.rs`

All endpoints used by cargo and the frontend for maintaining the list of crate owners.

### `krate/follow.rs`

Read/write access to a user populated list of followed crates.

### `version::deprecated`

The `version::deprecated::{index,show}` routes appear to be unused.  We should confirm this and discuss a plan for potential removal.

### `version::yank`

Yank and unyank functionality.

## Code that remains in `mod.rs`

The code that remains in the `mod.rs` files consists primarily of structs for querying, inserting, and serializing.  I'm thinking that these structs could be moved to modules under a `src/models` directory along with: `src/category.rs`, `src/dependency.rs`, `src/download.rs`, `src/owner.rs`.  (The `keyword`, `token` and `user` modules also have model logic which can be extracted.)

## Remaining work

In order to simplify review of this PR, I've only moved code around and haven't done any refactoring of logic.  There are probably bits of code that we can move from the endpoints to the model logic, especially if it returns a QueryResult.  Feel free to let me know if you see any low-hanging fruit there, otherwise we can address that in a future PR.  (Some of the logic still in `mod.rs` returns CargoResult which covers all error types.  We will probably need to put some thought into if the model represents just the database or also includes the index and S3 state.  I think further exploration of this is best tracked under #912.)

/cc @vignesh-sankaran @kureuil
@jtgeibel
Member

jtgeibel commented Nov 1, 2017

@vignesh-sankaran and @kureuil, I've posted a proposed MVC style module layout over in PR #1155.

@locks
Contributor

locks commented Sep 7, 2019

The above-mentioned restructure PR has been merged. Is there any remaining work? Could we close this and open a new issue for said remaining work if that's the case?

@sgrif sgrif closed this as completed Sep 7, 2019

6 participants