-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance optimizations and caching #698
Conversation
# Conflicts: # tests/Mappers/Parameters/TypeMapperTest.php
# Conflicts: # src/FactoryContext.php # src/Mappers/ClassFinderTypeMapper.php # src/Mappers/GlobTypeMapper.php # src/Mappers/StaticClassListTypeMapper.php # src/Mappers/StaticClassListTypeMapperFactory.php # src/Reflection/CachedDocBlockFactory.php # src/SchemaFactory.php # tests/AbstractQueryProvider.php # tests/FieldsBuilderTest.php # tests/Mappers/GlobTypeMapperTest.php # tests/Mappers/Parameters/TypeMapperTest.php # tests/Mappers/RecursiveTypeMapperTest.php # tests/SchemaFactoryTest.php # website/docs/other-frameworks.mdx
So, we're basically building a lot of complexity and cache clearing logic for development environments, that could be solved by just caching the autoload in dev and clearing when editing your vendor packages? Is that right? I'm just not sure I see why it's such an issue to write a command to clear your composer autoload when you add a new vendor package. What am I missing? |
No, not really. This has nothing to do with vendor or autoload at all. In small-medium sized codebases, you may have a dedicated GraphQL namespace that only contains types and controllers, that's small and quick to load. In our case, however, there's no such thing. As the application is modular, GraphQLite annotated files may be contained in multiple namespaces. So we have to supply the entire application namespace to $schemaFactory->addNamespace('App') Now, in a production environment GraphQLite just caches everything with a large TTL, so scanning the entire codebase isn't a concern. It's done once, and never thereafter. In dev environments, however, this is a massive pain. GraphQLite does 5+ filesystem scans (in search of classes), each of which takes 100ms in native FS environments (Linux or non-Dockerized Mac/Windows) and 2500-3500ms in Docker for Mac in our case. So 10-15 secs are wasted just to scan the project for files. Also, because of a short cache TTL, basically each request starts without cache, meaning GraphQLite has to process thousands of classes and extract type/controllers from them on every single request. This could, of course, be solved with a larger cache TTL, but then developers would have to be manually resetting the cache every time they make a change in the application's code. So this PR aims to solve exactly those two last points:
The first point addressed by combining "controller" and "type" namespaces together, and making sure all of GraphQLite uses There's generally no added complexity, aside from the |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #698 +/- ##
============================================
- Coverage 95.72% 95.16% -0.56%
- Complexity 1773 1818 +45
============================================
Files 154 174 +20
Lines 4586 4781 +195
============================================
+ Hits 4390 4550 +160
- Misses 196 231 +35 ☔ View full report in Codecov by Sentry. |
@oojacoboo I've added missing tests and simplified some things where possible. I've also attempted to extract parts of this PR as separate PRs, but it's all interconnected so it's a bit hard to do. I will if needed though. Because the PR touches a lot of things, here's the list of changes: Class finding:
Class bound cache:
Docblocks:
Class finder computed cache:
|
@oprypkhantc this is amazing! |
Thank you 😄 And yes, it should be way less unnecessary stuff being scanned and parsed. Ideally none xd |
@oprypkhantc so, looks like we were setting the GlobTTL explicitly, which has been improving performance on our end in development. But, after testing this PR, I'm seeing some additional performance improvements, so that's great. Can you try upgrading node for the docs? See if that fixes the tests without other issues. Also, this is going to be a BC break, so it looks like we'll target 7.1. The |
We just wanted to upgrade but were stopped by this. |
@Lappihuan It would definitely improve on the situation a lot, since the filesystem scan would only ever be done once in a request's lifespan, which should (?) fix the issue, but I can't say for sure. There are things where |
Great @oojacoboo :) I've fixed the docs and added a changelog entry. |
Is the only bc break the namespace configuration? |
@Lappihuan Also the Namespaces aren't truly a breaking change though - I've marked both methods as deprecated, but they they still exist |
So this is not the case and it doesnt need to wait for 7.1 |
Well it's up to you. Technically there are still breaking changes, just that they're not likely to affect many users. |
could |
Sure thing, I added it back. It just sets either ->devMode() or ->prodMode() depending on the TTL. |
@Lappihuan what's the issue with targeting 7.1? This whole PR is a pretty sizable update with a higher likelihood of breaks. |
@oprypkhantc Can you please review the new |
@oojacoboo Sure, I took a look. In short, after my PR, alekitto's changes would not really help. What alekitto did was add a cache of classes that exist in the project, with a TTL. That's only important in this scenario: For that specific case, these are the options without sacrificing the developer experience:
If none of the options work, and a developer is willing to sacrifice the DX and most improvements from this PR, they can use $cache = new Psr16Cache(new FilesystemAdapter(defaultLifetime: 15));
$factory = (new SchemaFactory($cache))->prodMode(); This is more or less equivalent to the current |
@Lappihuan I never heard back from you on the issues with targeting 7.1 for this release. I still think the changes here are significant enough to warrant that. |
@oojacoboo its probably the safer bet to target 7.1 |
Closes #671
There a few things I focused in this PR. It might seem a bit overcomplicated for what it is, but as I said in the comment to the issue, it's harder than I anticipated. These were my primary goals for a development environment:
ClassA
changes, then only the cache related to that class is invalidated to avoid the "full scan"These may seem a bit too much, but unfortunately filesystems in containerized environments on non-Linux hosts are slow as hell. Even with recent improvements that Docker has done for MacOS, it's still slower than native Linux environments by orders of magnitude (2400ms versus 100ms on my Mac), which is why I feel that all of the above optimizations were necessary for a good DX.
Still, I attempted to hide as much of that complexity as possible behind a
ClassFinder
interface. All of it is contained within one namespace, and if this ever becomes irrelevant or too much to handle - it can just as easily be removed.Namespaces and class finding
As proposed in the issue,
addControllerNamespace()
andaddTypeNamespace()
were combined into a singleaddNamespace()
. Deprecations were added too. This allowed simplifying and combining a lot of the code that used to handle caching of different namespaces. Now there's a singleClassFinder
, global for the entireSchemaFactory
, and other classes don't need to think about multiple namespaces or caching.devMode()
,prodMode()
and globTTLThere used to be a setting called
globTTL
, which controlled the expiration ofglob()
caches. This is no longer relevant and hence was removed:devMode()
enablesfilemtime()
based caching, as well as setting a very small expiration on "computed" caches (that is - caches that can't be invalidated in parts based on file changes, only entirely)prodMode()
disablesfilemtime()
based caching, as well as setting an infinite expiration on "computed" caches.filemtime()
is very performant and even has built-in cache, but is still unnecessary on productionClassFinderComputedCache
That's a new type of cache that I used in all places where
ClassFinder
used to be iterated directly previously. It has a single function:The idea is that it's responsible for iterating over the
ClassFinder
instance you provide. Initially, when there's no cache, it will iterate over all found classes and call$map($classReflection)
on every found reflection, and remember the results in the "entries" cache, keyed by the filename. This way we know which entries relate to which files, and we can invalidate them per file when only some files change.Then,
$reduce
is called to combine the resulting entries map into a result - usually some kind of cache object. This also allows skipping the "invalidate them per file" part on production, where individual entries aren't stored, but rather the result of$reduce
is cached as-is.Result
The schema introspection time (from the ground up, without cache) reduced from 60s+ to 15s, and allowed adding/changing/removing types and controllers without resetting the whole cache - which now takes <5s.