-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposed API Restructure #1053
Comments
Nice write up. Having this all written out helps as a reference to everyone as we move forward. Here is the summary in my words with more action items and tips:
|
I never coded in my life, but I'm a strong proponent of OOP. It's ok if we overload the models, and treat them as objects with roles, properties and methods. Controllers IMO should be very thin. Some code duplication is OK, people optimize too much for this. |
I agree with Yash, I think current trends in MVC recommend making your models "heavy". |
Also, why does re-doing the api server matter if we are unbundling it? In 1 year from now, the api server will just be an http server that writes stuff to a queue? |
We are discussing patterns to be formalized as we move forward. Specifically, testing patterns are immediately important. Focusing on and bulking up our unit tests will ensure we can keep development on API fast. Currently, API has a lot of integration and functional tests which have become harder to maintain as API changes. |
We have had a few meetings centered around this over the months, and Bryan has taken some initiative so that we can formalize and document our thoughts. The backend has been looped in on our direction in the meetings, but having a formal document will help us as we move forward and this is the start of it. |
This is something I didn't specifically cover in depth because it can get complex. More or less, the cleanest solution in my mind would to define some class that allows us to handle complex operations. For example, an To make it concrete, instead of something like this: app.post( //...
instance.findOne(),
sauron.allocateNetwork('instance'),
mavis.getDock('instance'),
docker.deployInstance('instance', 'sauronResult', 'weaveResult') or something like this, where the model hides all the logic but talks to multiple other services Instance.prototype.deploy = function (callback) {
series([
sauron.allocateNetwork,
mavis.getDock,
docker.deployInstance
], function () { callback(self); })
}
// and then having
app.post( // ...
instance.findOne(),
instance.deploy()) you could end up with something like: InstanceDeployment.deployInstance = function (instance, callback) {
series([
sauron.allocateNetwork,
mavis.getDock,
docker.deployInstance,
instance.updateWithStuff
], function () { callback(updatedInstance); })
}
// with a route like
app.post( // ...
instance.findOne(),
InstanceDeployment.deployInstance(instance))
All this to more or less say: yes, the models for the Instance should more or less only deal with the database. If we can limit models to that, it should make it easier to unit test the entire file. Defining the 'overarching models' is a little more tricky but with some thought, it can happen. |
First, OOP is awesome, but with that comes things like separation of concerns and other paradigms that make the design cleaner. We are just trying to define here the best ways to deal with our complexity moving forward. As for why to re-do the API? We are spending a lot of time now dealing with the complexity of the API and it's become cumbersome to do anything in it. We need to fix that debt before we can really even consider doing more things to and with the API. |
Brilliant Bryan, I really like what is proposed here. The basics are solid and I think we should move forward with them. I agree with TJ where mongo models should be just for talking to DB. Then we can create models like
|
@bkendall great write up. Also I want to mention that we are already moving this way by workerizing our code. |
I just finished re-reading this now that I'm back from my week off. I'm in complete agreement with the analysis and the proposed initiatives and I'll try to support it going forward when refactoring or building new logic. |
API Next
There has been a good amount of discussion about restructuring (rewriting) the API for various reasons, but I don't think we have ever iterated all the reasons and discussed why we want them. This document will attempt to do just that and propose a new way to format our API code so it is more reliable, readable, and testable.
Issues
Various reasons have been proposed for why we need to update our API. I want to focus on a few specific cases:
Coverage, and Route Conditional Logic
Currently, coverage reports are tough to get for API's tests. Of our two test suits, only one of them has coverage enabled: the unit tests. Coverage is a good metric for ensuring code is tested, but when we don't have enough tests to cover all our code, we won't get good feedback from this metric.
Interestingly, the argument has been made that we need "coverage for our routes". Express itself makes this difficult because of how it loads its middleware. Getting coverage in our routes is virtually impossible because of this, and I don't believe that coverage for our routes is a very good metric for our testing (yet).
The reasoning behind the argument for "coverage for our routes" stems from the fact we have conditionals embedded in our route flows. This, I strongly believe, is erroneous behavior to have defined in a route. It makes testing very difficult because we have to create complicated functional tests to trigger routes at specific times with specific conditions. All of this is moot, however, because we cannot get coverage in the first place.
Increasing coverage in the appropriate places and removing conditional logic in routes would be very solid first steps to increasing the confidence in and the testability of our API.
Unit Tests
In the discussion of coverage, I mentioned that we only have it turned on in the unit tests. This, in my view, is the correct place for coverage. However, since it has not been the paradigm to write unit tests for models from the beginning, we are lacking unit tests for many files. Currently (as of 2015/09/14), the coverage of what we do have unit tests for is at
86.42%
. While this is respectable, it's still a far cry from getting all our files covered. Our testing framework,Lab
, doesn't include files in coverage that aren't run during the unit tests, so we aren't including any file in which we have0%
coverage.Addressing this issue means we need to do two things. First, we need to make sure that un-tested files are included in the coverage reports.
Lab
does not do this, so we need to make it visible. Then we will be able to actually write the tests and ensure our models are covered.Test Duration
As a side effect of our in-route conditions, we have had to write a good deal of functional/integration tests that make full-blown requests through our API to test features. By increasing the number of unit tests (which are by definition much smaller and faster) and reducing the number of functional/integration tests (calling all the way through the API), we should be able to speed up our tests dramatically. This will increase developer confidence and allow us to move faster and debug problems more rapidly.
Proposed Changes
There are a handful of changes that I would like to propose and rules for moving forward.
Ultimately, I want to get closer to a "Model View Controller" methodology, but Express's names and how we've architected all our logic puts us pretty far removed from being able to define those three things clearly.
These tasks are very much related, but let's try to address them each individually.
Routes, Middleware, and Models
That's the way that Express's homepage defines its product. What this means to me, really, is that it is not opinionated when it comes to your usage. It provides the request/response/http server for you and ways to define behavior on routes. Unfortunately, this means that without really well defining how one is going to use Express before you start, it can become messy (which, I believe API has become).
Express's lack of opinion actually semi-infuriating at times. Even the examples directory contains vastly different ways of using Express (and I'll just ignore the fact that I can't seem to find any tests in any of these examples). Two that I found useful at all are route-separation and mvc.
a quick aside...
I think that we are using express in a very broken way. By having our routes handle a lot of conditions and logic, we're making it very, very difficult for ourselves to test our code. I am to blame for this as well; I remember talking to TJ about adding
then()
andelse()
todat-middleware
. At the time, when the API was young and we were moving very quickly, it seemed like a grand idea. However, as API has become more complex, these conditional routes that we created have become a major headache. In order to move away from this, however, we need to define how to do it and where to move our conditionals.Middleware (and branching)
"Middleware" has been abused in our routes. Having Middleware to control flow is tricky to debug and is something that has bitten us. A better definition for Middleware in an MVC viewpoint is "a function that prepares a request for use by a Controller". This means that it doesn't make any changes to a model (there are one or two exceptions, like sessions). It simply gathers information and prepares information for the Controller.
A very clear example of this in Middleware is
body-parser
. It looks at the body it was handed and turns it into a JSON object that is more friendly to us and provides it asreq.body
. That value can then be used by a Controller or modified by another Middleware.One case were I do see branching with Middleware to be useful is in validation. If Middleware ever 'branches' in some way, it should be for the sole purpose to fail and to fail quickly. For example,
dat-middleware
exposes functionality likemw.body('key').require()
. This checks that the request body containskey
and causes an error if it does not. This makes validation very easy and clear: this code either validates the body or returns an error. A good case fordat
's modification methods ismw.body().pick('key')
, which prunesbody
to just containbody.key
and allows us to prevent extra values being present. We know thatdat-middleware
is very well tested and has been validated, so it is a very useful Middleware.What we need to remove and avoid doing in the future is the following:
This conditional is very simple and is one that you may think is 'easy to test', but this is the mindset we want to avoid. The overhead to test this condition is substantial: you have to create an entire request to the API, at least twice, to validate that both Middleware. This conditional branching Middleware is difficult to test and maintain. Where we move these conditionals to is forthcoming.
There is one reasonable use for
.then()
: it can be used to conditionally modify the request if a value is present. However, it would be best to hide this condition as best as possible. Let's say the following exists in a route:This is quite a handful to read and isn't instantly clear in a route. If this a defined Middleware name
replaceQueryUsernameWithId(req, res, next)
which was just invoked, it would be immediately clear what happened to the request. Additionally, while writing and validating this new Middleware, you could test it with a fake request and make sure it is modified correctly. Importantly, the entire route doesn't need to be called to test this Middleware, and a simple unit test suite for this Middleware prevents it from being changed without acknowledgment from the developer and possibly breaking other routes that use it.(Note that I used
someHelper.findUserId
and never mentioned testing it specifically. If that helper and method were written on the model and then wrapped with library [e.g.middlewarize
] to make it Middleware-friendly, there should be unit tests written for that Model method already. When the custom Middleware is written and tested, the Model can be mocked out or stubbed so the Middleware chain can be tested as a unit.)Given this quick discussion, we can come up with two rules for any given middleware:
Models
Models we have defined (more or less) as something that interacts with an external service. This includes things from the Instance Model (interfacing with MongoDB and Neo4j) to the Docker Model (which, unsurprisingly, deals with Docker). I believe we are using these very well. On occasion, we can use these Models and their methods directly in our Middleware chain definition for a route to perform actions. We do this through the use of
middlewarize
,mongooseware
, or similar, that wraps the Model conveniently so the Models can be stored on the request and worked with in a Middleware fashion. Being able to sayinstances.findOne({_id: 'params.id'})
and put the Instance on the request is very powerful.These Model methods should be fairly straight forward and simple, if we are to use them as Middleware.
findOne
is a great example: it takes an object as options, calls to the database, and calls back with the Instance. Likedat-middleware
, we have to trust that any library wrapping this Model to be used as Middleware is behaving correctly. So long as the wrapper takes the called back data and puts it on the request in the correct place, all we have left to test is the Model call itself. A couple quick unit tests later, we have 100% confidence that this will work every time.Model Methods as Middleware Tangent
One thing I do want to make clear is that we are taking Model methods and wrapping them to use them as Middleware. For example, take this Model method:
We are able to call
walrus.model.eatFood(4)
in a route and the Walrus on our request gains energy (or whatever effect comes from eating food). What we do not want to do is the following:This means that the Model has knowledge of how it's being used. Using this function from another Model becomes very tricky as well because you have to build up a request and a response to be used, and you don't know if this method responds or simply continues with
next
(or conditionally does either - scary).All of this I like to summarize as the following: using models as middleware (wrapped through some library, usually) is a powerful tool, but models should never provide middleware.
Routes and Controllers
Now that we've discussed some important details about both Middleware and Models, we need a way to connect them to a route. In the traditional MVC way, we want to place our Middleware at the beginning of our routes and then use Controllers to call our Models. However, Express has no formal notion of 'Controllers', because it is "minimal and flexible"...
Take a look at the MVC example in Express's repository. They have a interesting methodology they employ to define "Controllers". Specifically, the Users Controller. In it, there are 4 methods defined that directly call a function on
res
:list
,edit
,show
, andupdate
. These are the defined controllers forGET /users
,GET /users/:user_id/edit
,GET /users/:user_id
, andPUT /users/:user_id
, respectively. Now, these Controllers are very simplistic. They simply are rendering data or making a very simplistic change to the 'database'. You can use your imagination to see how models could fit into this paradigm.There is a little bit of magic happening in
lib/boot.js
to map routes to these Controller functions. Importantly, here is the bit I want to highlight (omitting some logs):I think this is very telling of how things should be structured. What I didn't mention about this example is the
before
method in the Users Controller. It is used here as Middleware before the handler is called. This Middlewarebefore
is defined in the Controller and if you look at it, you can see it doesn't do anything except put a user on the request object (there's some magic withnext('route')
in there, but we can ignore that for now). That function is what is run before anything that could or would deal with the Model (the Controller) would run. This matches precisely with our new definition of Middleware: simply modifying the request so a Controller has the correct data.The Big Finish
Did you see what happened there? "Controllers" showed up! And here's where I think Express lead us astray. Everything that looks like
function (req, res, next)
is called 'middleware' in Express-land. Unfortunately, everything looks like that. If I were to ask you what are 'Controllers' are in the current API, you could only say "the routes", but that's not strictly true. This culmination of our conditional Middleware has put our code into a weird, tricky spot to test, and we can't do it without writing crazy requests to our API that have a high amount of overhead.I don't necessarily like moving all the validation/before logic into Controllers like the MVC example because we have some very good, well tested tools that allow us to write Middleware cleanly and quickly. When we do write our own, we can write and expose it in such a way that it is able to be unit tested, improving our confidence in the code! We do need to rid our routes of conditional branches because it doesn't make sense to have the conditions in the routes. It makes it very difficult to test and to read, and conditions in the routes means that we can never guarantee coverage of it.
Functional Example
Let's bring all this discussion home with a relatively simple example. For this, we will use an API that provides access to Walrus models. Walrus models are capable of accepting Gifts (which can be looked up in the database) and having Energy. Some gifts are edible (Food) and some are not (Shoes). The Walrus will throw away inedible Gifts, expending Energy, but will eat edible Gifts, gaining Energy. The Walrus is given Gifts by doing a
PUT
request to/walrus/:id
with a Gift in the body ({gift: 'shoe'}
). Here's a simple example that demonstrates some of the complexity we currently have in the API (with some comments for clarity):If you've spent any amount of time in our API, then you know that this kind of pattern is pretty prevalent. What's tough about this is that in order to test that the Walrus does the correct thing with the Gift we have to spin up the entire API and make multiple requests with multiple Gifts and validate that all our Walruses have correct values at the end.
"But, you have two model methods here (
throwGarbageAway
andconsumeFood
) that you can test! That's good, isn't it?"Yes, but no. This pattern of a route also leads people to simply mocking the Walrus and making sure the right model methods are called. We are doing that because we are trying to test the conditional logic in our route! This is what makes our tests so difficult and expansive, especially when we get nested conditions. The conditions here are also impossible to get coverage on (because of reasons eluded to above).
Here's the same route written with a different mindset:
There isn't much changed with this refactor. Importantly, the conditional logic is gone from the route! The route doesn't know any more than it needs. And notice how this reads (the comments in the routes for clarity): Middleware, Controller, View.
There is 'global' Middleware for all
/walrus/:id
routes at the top. Following that is a Middleware helper that is specific to this route which does some additional work to verify and gather information about the gift. This is all functionally exactly as before.The Controller happens next, where the Walrus is now simply calling
processGift
. The route doesn't do any conditional checking, the Walrus is now responsible for that!processGift
is probably a function that simply callsthrowGarbageAway
orconsumeFood
as before, but now the route doesn't care. The Model function forprocessGift
now should be completely unit tested with all sorts of Food and Shoes, and it makes the appropriate calls as necessary. The best part is that the route never needs to be called to check that conditional.Finally, the View takes our updated walrus and returns it to us. Pretty straight forward.
What's left to test here? Our unit tests take care of "consume or throw away the Gift based on if it is edible", so we don't have to make a bunch of requests to test that. Our helper Middleware
validateGift
can be unit tested by mocking out database calls and making sure it causes an error if the gift doesn't exist. All we functionally care about is that if we call this route with a valid food or shoe, the walrus behaves accordingly. No more truth-table-like validation of routes, we can simply make sure the route behaves correctly.Express doesn't make MVC very easy, and that's why we need to define these better. By separating our Middleware from our Controllers (and our Views), we can define routes in better ways to utilize Models and Controllers in a way that's more logical and testable.
In Conclusion
More or less, the ultimate crux in our testing pattern is that we've built up a massive amount of debt to conditional logic in our routes and not enough Models with unit tests. If we are able to make Models more appropriately handle any given input, remove conditions from our routes, and clean up our Middleware, we should be able to have a much easier time moving forward with the API. This discussion is one that only focuses on the formatting of routes and the roles of Models and Middleware, but it is one step in the process of improving our lives as developers and maintainers.
Let's take one last look at the proposed changes, with a little more detail:
Redefine the roles of Routes, Middleware, and Models
By using Middleware as something that only prepares the request for Controllers, we limit the scope of what it does and can easily separate out 'Controller' logic from the 'Middleware' logic (logic which is all Express 'Middleware').
Remove conditional logic from routes
Bringing conditions out of the routes and into Models/Controllers allows us to test these conditionals much more easily, bringing down the number of required functional tests and increasing the number of unit tests. This speeds up our testing feedback loop considerably.
Strengthen unit tests for Models, enabling coverage
As we move conditional logic to Models, we can ensure that they are covered much more easily with unit tests and get better feedback out of coverage reports.
Edit: formatting
The text was updated successfully, but these errors were encountered: