What is the update schema step? #14797

Closed
corymortimerhcp opened this issue Jun 14, 2019 · 11 comments
Labels
type: question or discussion Issue discussing or asking a question about Gatsby

Comments

@corymortimerhcp

Summary

We are in the process of upgrading our project from 2.1.23 to 2.9.4. Everything is going well, except that the update schema step takes several minutes to complete. While looking for solutions, we came across #12692 (comment), which says that we could "disable" this step if we don't need the context. I don't understand what this step does or how we can disable it. I tried commenting it out in the source code to see how the step affects our website, and I don't see any difference except that our build is dramatically faster. Our questions are:

  • What is the "update schema" build step and what does it do?
    • We are hoping a better understanding of this step can help us narrow down why our builds are taking significantly longer than other projects.
  • Is there a way to disable this step (as stated in the above issue)? What are the advantages and disadvantages of disabling this step?

Relevant information

Below is a log from us running GATSBY_ENV=development node --max-old-space-size=4096 ./node_modules/.bin/gatsby develop -p 4000

success open and validate gatsby-configs - 0.094 s
success load plugins - 0.911 s
success onPreInit - 0.009 s
success initialize cache - 0.061 s
success copy gatsby files - 0.061 s
success onPreBootstrap - 0.014 s
Starting to fetch data from Contentful
Fetching default locale
default locale is : en-US
contentTypes fetched 37
Updated entries  948
Deleted entries  0
Updated assets  1462
Deleted assets  0
Fetch Contentful data: 3190.536ms
Starting to fetch data from Contentful
Fetching default locale
default locale is : en-US
contentTypes fetched 14
Updated entries  58
Deleted entries  0
Updated assets  71
Deleted assets  0
Fetch Contentful data: 663.739ms
Starting to fetch data from Contentful
Fetching default locale
default locale is : en-US
contentTypes fetched 1
Updated entries  46
Deleted entries  0
Updated assets  57
Deleted assets  0
Fetch Contentful data: 346.716ms
success source and transform nodes - 6.301 s
success building schema - 3.322 s
warn Attempting to create page "/pure-maintenance", but page "/pure-maintenance/" already
success createPages - 12.551 s
success createPagesStatefully - 0.044 s
success onPreExtractQueries - 0.013 s
success update schema - 662.858 s
success extract queries from components - 0.722 s
success write out requires - 0.024 s
success write out redirect data - 0.006 s
success onPostBootstrap - 0.002 s
⠀
info bootstrap finished - 690.478 s
⠀
⠀
success run static queries - 12.627 s — 116/116 9.19 queries/second
success run page queries - 25.075 s — 639/639 25.50 queries/second
warn
WARNING: We noticed you're using the `useBuiltIns` option without declaring a core-js
version. Currently, we assume version 2.x when no version is passed. Since this default
version will likely change in future versions of Babel, we recommend explicitly setting the
core-js version you are using via the `corejs` option.

You should also be sure that the version you pass to the `corejs` option matches the version
specified in your `package.json`'s `dependencies` section. If it doesn't, you need to run one
 DONE  Compiled successfully in 31576ms                                           10:04:53 AM
⠀
⠀
You can now view housecall-growth-site in the browser.
⠀
  http://localhost:4000/
⠀
View GraphiQL, an in-browser IDE, to explore your site's data and schema
⠀
  http://localhost:4000/___graphql
⠀
Note that the development build is not optimized.
To create a production build, use npm run build
⠀
info ℹ 「wdm」:
info ℹ 「wdm」: Compiled successfully.
⠀

✨  Done in 1102.22s.

Other interesting info about our build

  • We query 3 contentful environments (1 environment has a majority of the assets)
  • We use gatsby-image on many of our images that are hosted on contentful and on images that are queried through static queries
  • We are using @gatsby-contrib/gatsby-plugin-elasticlunr-search and the search index is pretty large

Environment (if relevant)

System:
    OS: macOS High Sierra 10.13.6
    CPU: (12) x64 Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 10.12.0 - ~/.nvm/versions/node/v10.12.0/bin/node
    Yarn: 1.12.3 - /usr/local/bin/yarn
    npm: 6.4.1 - ~/.nvm/versions/node/v10.12.0/bin/npm
  Languages:
    Python: 2.7.10 - /usr/bin/python
  Browsers:
    Chrome: 74.0.3729.169
    Firefox: 66.0.2
    Safari: 12.1.1
  npmPackages:
    gatsby: ^2.1.23 => 2.9.4
    gatsby-image: ^2.0.31 => 2.1.2
    gatsby-paginate: ^1.0.17 => 1.0.18
    gatsby-plugin-favicon: ^3.1.5 => 3.1.6
    gatsby-plugin-google-analytics: ^2.0.16 => 2.0.20
    gatsby-plugin-google-tagmanager: ^2.0.15 => 2.0.15
    gatsby-plugin-material-ui: 1.2.5 => 1.2.5
    gatsby-plugin-netlify: ^2.0.11 => 2.0.17
    gatsby-plugin-offline: ^2.0.24 => 2.1.1
    gatsby-plugin-react-helmet: ^3.0.8 => 3.0.12
    gatsby-plugin-robots-txt: ^1.4.0 => 1.4.0
    gatsby-plugin-segment-js: ^3.0.1 => 3.0.1
    gatsby-plugin-sharp: ^2.0.24 => 2.1.3
    gatsby-plugin-sitemap: ^2.0.6 => 2.1.0
    gatsby-plugin-web-font-loader: ^1.0.4 => 1.0.4
    gatsby-source-contentful: ^2.0.67 => 2.0.67
    gatsby-source-filesystem: ^2.0.23 => 2.0.38
    gatsby-transformer-remark: ^2.3.1 => 2.3.12
    gatsby-transformer-sharp: ^2.1.15 => 2.1.21
  npmGlobalPackages:
    gatsby-cli: 2.6.5

File contents (if changed)

All of these files are heavily modified and contain some proprietary content. We will summarize our changes below or add files with the config information removed.
gatsby-config.js:

module.exports = {
  siteMetadata: {
    title: 'title',
    siteUrl: `siteUrl`
  },
  plugins: [
    'gatsby-plugin-react-helmet',
    {
      resolve: `gatsby-plugin-material-ui`,
      options: { theme }
    },
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `images`,
        path: `${__dirname}/src/images`
      }
    },
    'gatsby-transformer-remark',
    'gatsby-transformer-sharp',
    {
      resolve: 'gatsby-plugin-sitemap',
      options: {
        exclude: ['excluded urls']
      }
    },
    'gatsby-plugin-sharp',
    {
      resolve: 'gatsby-source-contentful',
      options: 'contentful options'
    },
    {
      resolve: 'gatsby-source-contentful',
      options: '2nd contentful options'
    },
    {
      resolve: 'gatsby-source-contentful',
      options: '3rd contentful options'
    },
    {
      resolve: `@gatsby-contrib/gatsby-plugin-elasticlunr-search`,
      options: 'options as an object'
    },
    {
      resolve: `gatsby-plugin-web-font-loader`,
      options: {
        google: {
          families: ['font families']
        }
      }
    },
    {
      resolve: 'gatsby-plugin-favicon',
      options: 'favicon options'
    },
    {
      resolve: 'gatsby-plugin-robots-txt',
      options: 'robots options'
    },
  ]
}

package.json: N/A
gatsby-node.js: All of our pages that are not built from Contentful are built in this file. We do not use the default pages(?) folder to build our pages. We call the createPage function to generate our entire site.
gatsby-browser.js: Redux setup and other 3rd party component setup. Nearly identical to gatsby-ssr.js
gatsby-ssr.js: Redux setup and other 3rd party component setup. Nearly identical to gatsby-browser.js

@gatsbot gatsbot bot added the type: question or discussion Issue discussing or asking a question about Gatsby label Jun 14, 2019
@stefanprobst
Contributor

In the "schema update" step, Gatsby infers the shape of the SitePage nodes, which represent pages in Gatsby's data layer. For all other node types but SitePage, the full data shape is already available in the "build schema" step, which runs after sourceNodes. However, fields added to SitePage.context in createPages need this extra "schema update" step to be available in the schema.

As long as you're only passing ids, slugs, or small amounts of data via context, this should not be a performance problem, but it can be when passing larger amounts of data to pages via context.
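To make the "only ids and slugs" advice concrete, here is a minimal sketch. The helper `buildPageSpecs`, the shape of `entries`, and the template path are hypothetical and only illustrate the idea; in a real gatsby-node.js this logic would live inside `exports.createPages`, and each spec would be passed to `actions.createPage`.

```javascript
// Sketch only: `buildPageSpecs` and the shape of `entries` are hypothetical.
function buildPageSpecs(entries, component) {
  return entries.map(entry => ({
    path: `/${entry.slug}/`,
    component,
    // Only small identifiers go into context. The page query can use the id
    // to fetch everything else from the data layer.
    context: { id: entry.id, slug: entry.slug },
  }))
}

// Stand-in for Gatsby's `actions.createPage`, so the sketch runs on its own.
const createdPages = []
const createPage = page => createdPages.push(page)

const entries = [
  { id: "a1", slug: "pricing", body: "long text", related: [1, 2, 3] },
  { id: "b2", slug: "about", body: "more long text", related: [] },
]

buildPageSpecs(entries, "/src/templates/page.js").forEach(createPage)
console.log(createdPages[0].context) // { id: 'a1', slug: 'pricing' }
```

The large fields (`body`, `related`) never reach SitePage.context, so the schema update step has only two small fields per page to look at.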

Now, in most cases you will not actually need the full SitePage.context.* fields in your schema (unless you plan to run GraphQL queries on any of those fields, which is pretty niche). In fact, we are planning to remove this step (see here and here), which unfortunately has to wait for a major version.

In the meantime you can comment out schema updating (as in the linked comment). For a more official way, we could maybe add this as an experimental flag?

@corymortimerhcp
Author

Thanks for the reply Stefan. It turns out that we do have some pages with lots of data in the context to accommodate search. Once I commented those out, our pages were blazing fast again. We'll dig into the docs about adding search in a better way (https://www.gatsbyjs.org/docs/adding-search/) to figure out a way that would fit our needs.

Adding an experimental flag may be a quicker route for us than redoing our whole search. We can take a look at that issue and then follow the contributing guidelines? https://www.gatsbyjs.org/contributing/how-to-contribute/

@master12

So I also just ran into this issue. Please add the flag for this, and let me know if you need help contributing this change.

@corymortimerhcp
Author

@master12 We actually decided to rework our search, since it would eventually break as well. As for the flag, our contribution may be a few weeks out, since we are resolving our issue by limiting the amount of data we pass through context.

@VikLiegostaiev

Hi all, @stefanprobst can you please explain why the "update schema" step takes so much more time compared to version 2.0.117, for example? Our project migrated from 2.0.117 to 2.9.0, and the difference in the "update schema" step is enormous.
Was - 36s
Now - 1581s

@stefanprobst
Contributor

@VikLiegostaiev Sorry for the late reply.

The fact that schema updating can take a lot longer has to do with the Schema Customization API introduced in Gatsby v2.2. In particular, schema inference previously checked only one random string field value to see whether it is a Date (or a File path) and set the field type in the schema accordingly. Since v2.2 we do this more consistently by checking more than one field value. Especially in cases where you pass a lot of data (with a lot of string values) down via context, this has a perf impact.
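Since the cost described above grows with the number of string values in context, one rough way to spot heavy pages is to count those values before calling createPage. The helper below is a hypothetical diagnostic, not part of Gatsby, and the sample context objects are made up for illustration.

```javascript
// Hypothetical diagnostic, not a Gatsby API: counts the string values nested
// anywhere inside a page's context object.
function countStringValues(value) {
  if (typeof value === "string") return 1
  if (value === null || typeof value !== "object") return 0
  // Object.values works for both arrays and plain objects.
  return Object.values(value).reduce(
    (total, child) => total + countStringValues(child),
    0
  )
}

const smallContext = { id: "a1", slug: "pricing" }
const largeContext = {
  id: "a1",
  searchDocs: [
    { title: "Pricing", body: "long text", tags: ["plans", "billing"] },
    { title: "About", body: "more long text", tags: [] },
  ],
}

console.log(countStringValues(smallContext)) // 2
console.log(countStringValues(largeContext)) // 7
```

Logging this count per page inside createPages is one way to find which pages carry the most context data; what counts as "too many" is something to judge from your own build timings.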

Everybody: since setting experimental flags is not going to happen (#14636 has been closed), the next best thing is to use the Schema Customization APIs to explicitly define the fields for the SitePage type (and not include the context field), which will prevent type inference:

// gatsby-node.js
// Alternatively use `exports.sourceNodes` API for v2.11 and below
exports.createSchemaCustomization = ({ actions }) => {
  actions.createTypes(`
    type SitePage implements Node @dontInfer {
      path: String!
    }
  `)
}

@stefanprobst
Contributor

I'm going to close this as answered, but don't hesitate to ask if anything needs further clarification!

@VikLiegostaiev

VikLiegostaiev commented Aug 6, 2019

@stefanprobst Hi
I have another problem to discuss, if possible. I moved most of the data from page context into plugins, where I create nodes in sourceNodes, and I now query that data in GraphQL queries in pages or in static queries. After this change, the update schema step dropped from 1500 sec to 700 sec. However, we then added more static pages to generate (additional languages), and the time increased almost 3 times. The update schema step now takes about 2000 sec. Can you explain why update schema takes so much time even though the biggest data was removed from context? I also tried adding type definitions to one of the plugin's nodes, with no improvement in time. Gatsby version - 2.12.1

@stefanprobst
Contributor

Hi @VikLiegostaiev,
there actually was an issue with the infer extension not being handled correctly in the schema update step, which was fixed in 2.13.45, so maybe updating the package will help.
If not, is your repo public somewhere?

@j6workz

j6workz commented Sep 20, 2019

We ran into this same issue too. The problem was that we needed all the data being passed through the context. So instead of sending large JSON objects to context, we stringified them and parsed them in the page. And yes, it worked like a charm. Hope this helps.
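A minimal sketch of that stringify-and-parse workaround follows. The helper names `toPageContext` and `fromPageContext` and the field name `searchDataJson` are hypothetical. The idea is that context then holds a single string field instead of a deeply nested object, which presumably leaves far fewer values for inference to examine.

```javascript
// Sketch only: helper and field names are hypothetical.

// Build side (would be used when calling createPage in gatsby-node.js):
// pack the large object into one JSON string.
function toPageContext(id, searchData) {
  return { id, searchDataJson: JSON.stringify(searchData) }
}

// Page side (would be used in the page component, which receives pageContext):
// unpack the string back into an object.
function fromPageContext(pageContext) {
  return JSON.parse(pageContext.searchDataJson)
}

const searchData = {
  docs: [
    { title: "Pricing", tags: ["plans", "billing"] },
    { title: "About", tags: [] },
  ],
}

const context = toPageContext("a1", searchData)
console.log(typeof context.searchDataJson) // string
console.log(fromPageContext(context).docs[0].title) // Pricing
```

This only works for data that survives a JSON round trip, and the stringified field can no longer be queried or filtered through GraphQL.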

@VikLiegostaiev

@j6workz thanks for the advice, we will check. It seems to be a very simple and very effective solution :-)

5 participants