FEATURE: Code splitting by route #10

Closed
KyleAMathews opened this issue Jul 23, 2015 · 31 comments
Labels
help wanted Issue with a clear description that the community can help with.

Comments

@KyleAMathews
Contributor

Once we upgrade to ReactRouter 1.0, leverage Webpack's code splitting to dynamically load sections of large sites e.g. docs/* or blog/*

http://webpack.github.io/docs/code-splitting.html
remix-run/react-router#755

@KyleAMathews added the help wanted label Aug 3, 2015
@phlogisticfugu
Contributor

What does RelateRocket have to do with code splitting? Some googling shows that it looks to be your startup?

I think implementing this could open up an important feature: speed

  • absolute minimum loaded on first page load
    HTML, CSS, images, with bundle loaded asynchronously so as not to block page load
  • bundle just contains react modules for current page, for event handling
  • routes use code splitting to only load modules for other pages on-demand, or via pre-fetch

@KyleAMathews
Contributor Author

LOL, yeah, RelateRocket is my startup. I meant "React Router" but the two are close enough I guess that I substituted one for the other. I updated the original comment.

And yup, yup, yup! Larger sites especially will really benefit from code splitting. Right now relaterocket.co only has ~5-6 pages so the initial JS load isn't large, but eventually the plan is it'll have 100s if not 1000s, so code splitting will be hugely beneficial.

@KyleAMathews
Contributor Author

Because Gatsby loads all pages/modules upfront, to implement code splitting, internally each split would effectively be treated as its own site. E.g. /, /about, /pricing, /contact would be one split, /docs/* another, and /blog/* the third.

Each would share the top-level html/_template components.

@sergeylukin

+1 for code split

I want to raise a few issues though.

  • What would happen with, say, 100s or 1000s of pages inside one split (like /blog/*)?

    @KyleAMathews just for reference your blog's bundle is already above 1mb before gzip and that's for <100 pages, isn't it? Pretty scary :)

  • What would navigation from one split to another look like: a partial page change plus a history update, or a full page redirect?

P.S. Thanks for an amazing tool! I was amazed when I first tried it out - it just works :)

@KyleAMathews
Contributor Author

@sergeylukin hey glad it "just worked"! That's the goal!

So pagination/code splitting is something I'm still researching but the idea is you'd definitely be able to get chunk sizes down to whatever size you'd like. For example, my thinking is now that pagination should be a default way to load chunks. So if you have /blog/1, /blog/2 etc. each of those would be its own chunk.

Navigation to a new chunk doesn't require a page reload. You would click on the link and wait a bit while the new JS chunk downloads, and then the page transition would happen. Most of the time you'd barely notice the delay unless you're on a slow connection. We're leveraging React Router's dynamic routing functionality here: https://github.com/rackt/react-router/blob/master/docs/guides/advanced/DynamicRouting.md

Oh, and on my blog bundle: it's smaller now since #48, at 629kb before gzip and 202kb after :)

@knpwrs
Contributor

knpwrs commented Dec 9, 2015

It's late and I'm tired, but I think I figured out a way to get everything working with code splitting. Here are my thoughts:

  1. Upgrade to react-router ^1.0.0.
  2. Make all routes use getComponent instead of handler (component).
  3. Add the bundle-loader in lazy mode in front of the markdown wrapper loader thing.

I have some small experiments that demonstrate that this should work. Since gatsby makes use of require.context for the pages directory all pages will automatically be run through the build. By using the bundle-loader in lazy mode, each page (a component) gets its own webpack chunk. From there, just use getComponent instead of handler, do some callbacks, and boom. Full code split functionality.
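A rough sketch of the "use getComponent, do some callbacks" step, for illustration only. `fakeLazyPage` and `toGetComponent` are hypothetical names; `fakeLazyPage` stands in for the function that bundle-loader in lazy mode would return, so the snippet runs without webpack. The `getComponent(location, callback)` shape is the one from React Router 1.0.

```javascript
// Stand-in for what something like `require('bundle?lazy!./pages/about')`
// would return: a function that loads the chunk and then calls back with
// the module. Here nothing is fetched; the component is handed back directly.
function fakeLazyPage(component) {
  return function load(callback) {
    callback(component);
  };
}

// Adapt a lazy loader to React Router 1.0's getComponent(location, callback).
function toGetComponent(load) {
  return function getComponent(location, callback) {
    load(function (module) {
      // Accept both CommonJS exports and transpiled ES module default exports.
      callback(null, module && module.default ? module.default : module);
    });
  };
}

// A plain route object that resolves its component on demand.
var aboutRoute = {
  path: 'about',
  getComponent: toGetComponent(fakeLazyPage(function About() {}))
};
```

In a real build the stand-in would be replaced by the loader call, and the router would only trigger the chunk download when the route is matched.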

If you see no problems with this I'd like to take a whack at implementing it.

@KyleAMathews
Contributor Author

Hey Ken! Thanks for looking into this! Getting this done would be huge!

Two thoughts/questions. The bundle-loader (didn't know this existed!) would have to be added in front of every page loader type. I think that's doable as we're using https://github.com/lewie9021/webpack-configurator which allows for this sort of programmatic reconfiguration. The tricky part would be determining what constitutes a "page" in a given Gatsby site as someone could have markdown, json, yaml, video, etc. files all being turned into pages.

Second question is would it be possible to merge page chunks together? E.g. say the / chunk also should include the blog index page (/blog/) as those are the two most common pages, etc. In my musing on this, my thought was you could specify these larger chunks using glob syntax. The reason is this is a very nice optimization for high latency network connections. Downloading a ~150kb chunk once means you don't have to hit the network again for a while vs. having to hit the network every time you want a new page. Imagine you're on a cell phone on the train going through tunnels, it'd be nice if most of a site was cached after browsing to a few pages. Also on smaller sites e.g. < 50 pages, perhaps default to one bundle. My blog's bundle.js for example, with ~100 pages is only ~200kb after gzip. A very digestible chunk for all but the worst networks.

@KyleAMathews
Contributor Author

Though one interesting advantage of single-page chunks is they'd be very stable as many pages don't change post-launch. With #100 you could have a service worker which caches pages (perhaps even do it in the background after the initial page load) meaning most of a Gatsby site would work offline and load near-instantly. This would be especially cool for a docs site as the docs would be instantly referenceable whenever you need them. Though sadly mobile Safari doesn't support service workers yet.

@KyleAMathews
Contributor Author

And another complexity—metadata. Right now templating in Gatsby assumes you can access all metadata from any page. E.g. here where the example docs site gathers all doc pages and pulls out the title.

There's some exploration of this in #47 but this needs to be solved before code splitting can happen.

@knpwrs
Contributor

knpwrs commented Dec 9, 2015

Second question is would it be possible to merge page chunks together? E.g. say the / chunk also should include the blog index page (/blog/) as those are the two most common pages, etc.

require.ensure (used by bundle-loader) takes an optional third parameter for chunk name (doc). bundle-loader specifies a unique-ish chunk name per loader call. Theoretically (I haven't tried this) you can make your own bundle-loader implementation that takes some sort of configuration to enable chunk names to be consistent across different paths.

With #100 you could have a service worker which caches pages (perhaps even do it in the background after the initial page load) meaning most of a Gatsby site would work offline and load near-instantly.

You might want to look into offline-plugin. I haven't used it yet (and I see you mentioned the serviceworker-loader over in #100) but it might help you out. I'll make mention of it over in #100 as well.

And another complexity—metadata. Right now templating in Gatsby assumes you can access all metadata from any page. E.g. here where the example docs site gathers all doc pages and pulls out the title.

Consider my custom bundle-loader idea from before. You could make modules that resolve to metadata + a lazy load function, so the resolved value of the module would be something like:

exports.data = {
  title: 'foo',
  children: [],
  //...
};
exports.load = function () {
  require.ensure([], function (require) {
    // ...
  }, 'name');
};

Metadata would add some slight overhead to the initial bundle, but the vast majority of the content would still be lazy loaded. You could make metadata an option so people could opt-in.

You might also consider making this implementation closer to promise-loader which is the same thing as bundle-loader but it gives you a promise instead of calling a callback.
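As a sketch of the promise-flavoured variant: a callback-style `load` function (the shape bundle-loader gives you in lazy mode) can be wrapped so callers get a promise instead. `promisifyLoader` is a made-up name for illustration, and because the callback-style loader has no error argument, this sketch has no rejection path.

```javascript
// Wrap a callback-style lazy loader so that it returns a Promise.
// `load` is assumed to have the shape load(function (module) { ... }).
function promisifyLoader(load) {
  return function loadAsPromise() {
    return new Promise(function (resolve) {
      // The loader only ever reports success, so resolve with the module.
      load(resolve);
    });
  };
}
```

Usage would look like `promisifyLoader(page.load)().then(function (module) { /* render */ })`.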

@knpwrs
Contributor

knpwrs commented Dec 9, 2015

Looks like bundle-loader actually has a name parameter, so you can use that directly.

@KyleAMathews
Contributor Author

Woah, very cool! The Webpack ecosystem keeps delivering :) So this sounds completely perfect. Tell me if this sounds right.

  1. By default, files are put in a per-file chunk.
  2. If desired for optimization, you can create named "sections" where all pages matching a glob pattern will get chunked together. E.g. for a blog I might do something like:
sections: {
  "2015": "/blog/2015*",
  "2014": "/blog/2014*",
  "oldPosts": "/blog/!(2014|2015)*"
}

So then when creating chunks, matching paths would get added to a named multi-path chunk. This same pattern would also be added into the require.ensure stuff with react-router so it knows which named chunks to load when a particular path is navigated to.
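To make the path-to-chunk mapping concrete, here is a minimal sketch. All names (`sectionToRegExp`, `chunkNameForPath`, the `page-` fallback prefix) are hypothetical. It only understands `*` and `!(a|b)`, and it reads `!(a|b)` as "does not continue with a or b", which is the intent of the example above but not necessarily what a real glob library would do.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Translate a section pattern into a RegExp (simplified glob, see lead-in).
function sectionToRegExp(pattern) {
  var source = '';
  var i = 0;
  while (i < pattern.length) {
    var ch = pattern[i];
    if (ch === '!' && pattern[i + 1] === '(') {
      var close = pattern.indexOf(')', i);
      if (close === -1) throw new Error('Unclosed !( in pattern: ' + pattern);
      var alternatives = pattern.slice(i + 2, close).split('|').map(escapeRegExp);
      source += '(?!(?:' + alternatives.join('|') + '))';
      i = close + 1;
    } else if (ch === '*') {
      source += '.*';
      i += 1;
    } else {
      source += escapeRegExp(ch);
      i += 1;
    }
  }
  return new RegExp('^' + source + '$');
}

// Return the named section a path belongs to, or a per-file chunk name.
function chunkNameForPath(path, sections) {
  var names = Object.keys(sections);
  for (var i = 0; i < names.length; i++) {
    if (sectionToRegExp(sections[names[i]]).test(path)) return names[i];
  }
  var slug = path.replace(/^\/+|\/+$/g, '').replace(/[^a-z0-9]+/gi, '-');
  return 'page-' + (slug || 'index');
}

var sections = {
  '2015': '/blog/2015*',
  '2014': '/blog/2014*',
  oldPosts: '/blog/!(2014|2015)*'
};
```

The same function could be used on both sides: when assigning chunk names at build time, and when the router needs to know which named chunk to request for a path.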

@KyleAMathews
Contributor Author

On metadata—getting metadata for a single file is easy—the tricky thing is when you want metadata such as all the titles + paths of your blog entries, etc.

@phlogisticfugu
Contributor

as this is implemented, some things I myself would like to see:

  • +1 for the use of lazy-loading to maximize the performance of the first page load
  • use hashes in bundle/chunk names, so that CDNs and browsers can safely use very long cache times, while still permitting rapid change to content. New content would then get a different bundle name.
  • also on the names of chunks, if something regex-able like "chunk-" could be included in the name, then CDNs could use that for changing the cache policy.

edit: bundle -> chunk
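For reference, a sketch of what hashed, regex-able chunk names might look like in a webpack `output` section. `filename`, `chunkFilename`, `[name]` and `[chunkhash]` are standard webpack options and placeholders; the output path, the exact naming scheme and the `isLongCacheable` helper are assumptions for illustration.

```javascript
// Fragment of a webpack config: content-hashed names, with a "chunk-"
// prefix on lazily loaded chunks so a CDN rule can single them out.
var output = {
  path: '/public', // hypothetical output directory
  filename: '[name]-[chunkhash].js',
  chunkFilename: 'chunk-[name]-[chunkhash].js'
};

// Example of the kind of pattern a CDN or server rule could match on.
var CHUNK_PATTERN = /(^|\/)chunk-[^\/]+\.js$/;

function isLongCacheable(url) {
  return CHUNK_PATTERN.test(url);
}
```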

@KyleAMathews
Contributor Author

@phlogisticfugu 100% agree. Gatsby should be as speedy as possible in all respects. Minimal page loads through lazy loading, compress/minify all assets (e.g. #18), and cache everywhere (browser through service workers/normal caching & unique asset names for CDNs).

@knpwrs
Contributor

knpwrs commented Dec 9, 2015

On metadata—getting metadata for a single file is easy—the tricky thing is when you want metadata such as all the titles + paths of your blog entries, etc.

Extending my custom bundle-loader above, you can get the paths that a context is able to load:

var context = require.context('./posts', true, /\.md$/);
var keys = context.keys(); // keys are relative to the context directory: ['./foo.md', './bar.md', ...]

@knpwrs
Contributor

knpwrs commented Dec 9, 2015

From that point each post is just a module with the meta data and a lazy-load function.
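Put together, building a site-wide index from a context might look roughly like this. It assumes each module resolved through the context exports `{ data, load }` as in the custom bundle-loader idea above, and that keys look like `'./foo.md'`; `keyToPath` and `buildPageIndex` are hypothetical helpers.

```javascript
// Turn a context key into a site path, e.g. './2015/foo.md' -> '/2015/foo/'.
function keyToPath(key) {
  return key.replace(/^\./, '').replace(/\.md$/, '/');
}

// Collect metadata and lazy loaders for every page the context knows about.
// `context` is expected to behave like the result of require.context():
// callable with a key, and exposing keys().
function buildPageIndex(context) {
  return context.keys().map(function (key) {
    var page = context(key); // assumed shape: { data: {...}, load: function }
    return {
      path: keyToPath(key),
      data: page.data,
      load: page.load
    };
  });
}
```

Templates could then read titles and paths from the index without pulling in any page body; a body is only fetched when its `load` function is called.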

@phlogisticfugu
Contributor

On metadata—getting metadata for a single file is easy—the tricky thing is when you want metadata such as all the titles + paths of your blog entries, etc.

so, re-reading https://webpack.github.io/docs/code-splitting.html it would seem to make sense then if the meta data for all pages is in its own module, and that module is included in all entry-point chunks. Then it's available on all pages, plus webpack manages it properly as a dependency. The main content of each page would then be separated from the metadata, and lazy-loaded in its own module

@KyleAMathews
Contributor Author

it would seem to make sense then if the meta data for all pages is in its own module

Yeah, that's what I'm thinking too. It's far easier and the size would only become prohibitive for very large sites (would have to run tests to see). Titles and dates, etc. are pretty small. So basically take the pages array that's being passed around now and remove the body. See also Sitegen's Query API #47 (comment)
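"Take the pages array and remove the body" could be as small as the sketch below. The shape of a page object (`path`, `title`, `body`, ...) is assumed for illustration, and `toMetadata` is a hypothetical name.

```javascript
// Copy every page object without its (potentially large) body field.
// The input array and its objects are left untouched.
function toMetadata(pages) {
  return pages.map(function (page) {
    var meta = {};
    Object.keys(page).forEach(function (key) {
      if (key !== 'body') meta[key] = page[key];
    });
    return meta;
  });
}
```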

@knpwrs
Contributor

knpwrs commented Dec 11, 2015

only become prohibitive for very large sites

And if you really needed to split up the metadata you could always just make multiple contexts. But I'd say get the simple case working first.

I started an attempt but I don't really Coffeescript. I'll probably keep trying but if anyone else wants to take a whack at this feel free.

@KyleAMathews
Contributor Author

Feel free to convert the coffeescript to js first. Most code has already been converted.

@KyleAMathews
Contributor Author

Gatsby 0.8 includes an upgrade to React Router 2.0 making code splitting now possible.

@knpwrs
Contributor

knpwrs commented Feb 25, 2016

oOoOoOo... I might start taking another look at this. I've been silly busy lately.

@KyleAMathews changed the title from "Code splitting by route" to "FEATURE: Code splitting by route" Feb 26, 2016
@KyleAMathews
Contributor Author

@knpwrs that'd be great!

And how's that old saying go... "if you want something done, ask a busy person" ;-)

@KyleAMathews
Contributor Author

@KyleAMathews
Contributor Author

To combine the code splitting sections/groups + the above article, perhaps when you define page groups, every page within that group gets rolled up within a module e.g. blog/* becomes blog__pages.js which has within it import './blog-post-1' etc. for each matching page.
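A sketch of how such a roll-up module could be generated at build time. Only the `blog__pages.js` name and the `import './blog-post-1'` line come from the comment above; `groupModuleName`, `generateGroupModule` and the way file names are derived are assumptions.

```javascript
// Derive the roll-up module's file name from a group pattern,
// e.g. 'blog/*' -> 'blog__pages.js'.
function groupModuleName(groupPattern) {
  return groupPattern.replace(/\/\*$/, '').replace(/\//g, '-') + '__pages.js';
}

// Produce the source text of the roll-up module: one import per page file.
function generateGroupModule(pageFiles) {
  return pageFiles.map(function (file) {
    return "import './" + file + "';";
  }).join('\n') + '\n';
}
```

For example, `generateGroupModule(['blog-post-1', 'blog-post-2'])` yields two import lines, which webpack would then bundle into a single chunk for the group.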

@sergeylukin

Did you know that one can limit the total number of chunks generated by webpack? I just learned about it myself and decided to share it, as I couldn't find it mentioned here.

Btw, a side effect I've discovered is that if it is set to 1, the build is faster, so my webpack.development.config.js has:

plugins: [
  new webpack.HotModuleReplacementPlugin(),
  ...
  new webpack.optimize.LimitChunkCountPlugin({maxChunks: 1})
]

and it puts all chunks in the main bundle, which is OK for development as long as the build is faster :)

While production may have something like:

plugins: [
  ...
  new webpack.optimize.LimitChunkCountPlugin({maxChunks: 15})
]

Production configuration preferences vary, of course, and for gatsby a chunk limit may not be the ideal solution, but it's an option to consider and probably the easiest one to go for atm.

Cheers

@KyleAMathews
Contributor Author

@sergeylukin that is nifty — thanks for bringing it up. Definitely worth looking into it. Might be an interesting option / inspiration.

@bnjmnt4n

@KyleAMathews I would like to contribute, but I'm not too familiar with the Gatsby codebase or with Webpack. Perhaps you could suggest a concrete place to start looking?

@KyleAMathews
Contributor Author

@demoneaux you'd like to contribute to this issue or in general?

@KyleAMathews
Contributor Author

Code splitting is coming in 1.0! Wrote up plans at #431

Closing this issue in favor of that one.
