Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add automatic parsing cache for Rails apps #3156

Merged
merged 1 commit into from
Feb 23, 2021

Conversation

casperisfine
Copy link
Contributor

This is a followup to #3155

I don't know if that's a feature that is in scope of the project, if not I can very easily add this caching in our own project, but I figured I'd first try to see if I can speed this up for all users of the library.

I tried it in our app and it's pretty effective, GraphQL::Schema.from_definition is down from 3.3% of boot time to 0.8%:

Before:

$ bundle exec stackprof ~/Downloads/stackprof-shopify-boot-production-cpu\ \(2\).dump --method='GraphQL::Schema::BuildFromDefinition.from_definition'
GraphQL::Schema::BuildFromDefinition.from_definition (/tmp/bundle/ruby/3.0.0/gems/graphql-1.11.4/lib/graphql/schema/build_from_definition.rb:9)
  samples:     0 self (0.0%)  /   1744 total (3.3%)
  callers:
    1744  (  100.0%)  GraphQL::Schema.from_definition
  callees (1744 total):
    1331  (   76.3%)  GraphQL::Language::Parser.parse
     413  (   23.7%)  GraphQL::Schema::BuildFromDefinition::Builder#build

After:

$ bundle exec stackprof ~/Downloads/stackprof-shopify-boot-production-cpu\ \(3\).dump --method='GraphQL::Schema::BuildFromDefinition.from_definition'
GraphQL::Schema::BuildFromDefinition.from_definition_path (/tmp/bundle/ruby/3.0.0/bundler/gems/graphql-ruby-0292e15485c6/lib/graphql/schema/build_from_definition.rb:13)
  samples:     0 self (0.0%)  /    401 total (0.8%)
  callers:
     401  (  100.0%)  GraphQL::Schema.from_definition
  callees (401 total):
     372  (   92.8%)  GraphQL::Schema::BuildFromDefinition.from_document
      29  (    7.2%)  GraphQL::Language::Parser.parse_file

There is still a bit of time spent in the parser because one of our gem parse some queries before the case is properly initialized, but that's for us to fix.

@rmosolgo @swalkinshaw what do you think?

@swalkinshaw
Copy link
Collaborator

🤔 parsing a file is mostly a function of https://github.com/github/graphql-client I think so it would make sense to have this cache live there (but that doesn't solve Shopify's use case directly).

After your other PR, graphql-client (and Shopify's usage) could specify a custom parser with this caching behaviour? This functionality could work without any changes to GraphQL::Language::Parser right?

@casperisfine
Copy link
Contributor Author

I think so it would make sense to have this cache live there

I suppose. Again I apologies for my fairly shallow understanding of the global architecture of all this. I'm mostly reading profiles and walking up from there. But yes from what you say it would make sense I suppose.

After your other PR, graphql-client (and Shopify's usage) could specify a custom parser with this caching behaviour? This functionality could work without any changes to GraphQL::Language::Parser right?

That is something I somewhat considered, since the parser is substitutable, maybe the cache should be decoupled and applied regardless of the parser. But then I figure that in such case, we have no way to invalidate the cache if the parser change, hence why I opted for the slightly more conservative approach.

Let me know what you think is best and I'll happily change my PR.

@swalkinshaw
Copy link
Collaborator

But then I figure that in such case, we have no way to invalidate the cache if the parser change, hence why I opted for the slightly more conservative approach.

That's a good point.

We'll see what @rmosolgo thinks 😄

Copy link
Owner

@rmosolgo rmosolgo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I think this is a great feature, but I'm hesitant to turn it on by default. There are some folks who parse schema files to build their production schemas, and this would start writing files to their production servers, right? I'm worried it might not play nice with environments that don't support the application process writing to the filesystem (Heroku?). I see you have a rescue that might cover that case, but I'm not exactly sure how the system might fail in that case.

Besides that, IIUC, this cache has no cleanup mechanism, so someone might be surprised to find a growing cache directory in production. Is that something to consider here?

Anyways, those considerations in mind, how about including this cache & documenting it, but not enabling it by default? It could be behind a global method like GraphQL.enable_parse_file_cache or something.

Comment on lines 25 to 31
tmp_path = "#{cache_path}.#{rand}"
File.binwrite(tmp_path, Marshal.dump(payload))
File.rename(tmp_path, cache_path.to_s)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool, is this to avoid two ruby processes writing to the same cache file at the same time? You do two binwrites, and then the last one to .rename "wins"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@casperisfine
Copy link
Contributor Author

Is that something to consider here?

I think that's all good point. For context I mostly took inspiration from Bootsnap, I even toyed with using it as the cache backend. All this assumes you have some kind of build step in which you boot your application first and save the boot cache, e.g. heroku / Dockerfile etc.

how about including this cache & documenting it, but not enabling it by default?

Maybe there's a middleground? e.g. not enabled by default, but if we detect you have Bootsnap enabled, we automatically enable it?

GraphQL.enable_parse_file_cache or something

I'd say the best would be to set the cache root path.

@rmosolgo
Copy link
Owner

Sorry for the hangup here.

if we detect you have Bootsnap enabled, we automatically enable it?

That's 👍 by me!

@casperisfine casperisfine force-pushed the railtie-cache branch 4 times, most recently from c44cfcb to 2f8c2ca Compare February 19, 2021 17:39
@casperisfine
Copy link
Contributor Author

@rmosolgo no worries.

I've updated the PR it should be good to go now.

@rmosolgo rmosolgo added this to the 1.12.6 milestone Feb 23, 2021
@rmosolgo
Copy link
Owner

Thanks for your work on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants