-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More work to prevent queries from running when there's in-progress node processing #8859
Conversation
@gatsbyjs/core it'd be great to write tests for handling these coordination problems between nodes being created and queries being run. @thebigredgeek it'd be great to talk to you sometime about how you solved similar problems in https://github.com/thebigredgeek/papq |
Thoughts on how to test this. Add a new integration test site that has a fake source and transformer plugin. The source plugin emits updates to a node every 10ms and then the transformer plugin transforms the node with an artificial delay of 20ms. There's one page that queries the transformed node. The test could be starting We could also just change a markdown file directly but that's not as controllable around timings so probably not what we want. |
@@ -61,8 +61,22 @@ const queue = new Queue((plObj, callback) => { | |||
) | |||
}, queueOptions) | |||
|
|||
// Pause running queries when new nodes are added (processing starts). | |||
emitter.on(`CREATE_NODE`, () => { | |||
queue.pause() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be converted to use xstate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't pause task that is already in progress, right? Just will stop any queued tasks from executing? So there is case where we still will be executing query in parallel to CREATE_NODE processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Though in practice this is fine as most queries are sync and what's not sync is background jobs or fetching data from elsewhere, neither of which would be affectes by node transformations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but markdown processing is async (and most of reports about problem this PR solves is about markdown). That's why I was asking if we should delay dispatching CREATE_NODE
action while query are executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a query of a markdown node is in progress, it gets the markdown node syncly (this should be a word) at the start so even if a new file node immediately causes the markdown remark node to be deleted, it won't affect the in-progress query.
That being said, querying will eventually be fully async so we'll need a better solution for that. I think the current PR is fine as it's simple and solves the problem. Next action would be to add tests and then ensure we have a solution when we go async e.g. delaying API/redux work work until the current queries are finished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I was looking in wrong spot - at the transformer level we are already shielded from this issue as node is already passed. But before that we use run-sift
to actually get the node and this one is async as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR seems good as-is, just trying to see if we need to pursue further
It could be actually that my adding xstate to gatsby-source-filesystem to only emit new File nodes when the previous work was done was a red herring and all that was necessary to do (as included in this PR) was to pause query running when a new node is created and only resume when plugin work is finished. I should have a bit of time Monday to test this theory. |
I'm thinking if maybe we should add some redux middleware to delay dispatching |
@pieh turns out it's what I suspected that pausing query running on CREATE_NODE is all that's required... which is both great as this simplifies the problem a lot and annoying as I had been chasing the wrong thing for a while haha :-D Oh well. Would appreciate some manual testing from someone to verify this fixes things on e.g. reactjs.org! |
Checked reactjs.org before and after. Definitely could reproduce |
This includes a number of fixing including this PR which prevents errors during development when editing markdown files gatsbyjs/gatsby#8859
This includes a number of fixing including this PR which prevents errors during development when editing markdown files gatsbyjs/gatsby#8859
The corresponding code was removed in #8859.
) The corresponding code was removed in gatsbyjs#8859.
Mostly this involved including the changes from gatsbyjs#8859 into file-watcher.js
) The corresponding code was removed in gatsbyjs#8859.
This includes a number of fixing including this PR which prevents errors during development when editing markdown files gatsbyjs/gatsby#8859
…ss node processing (gatsbyjs#8859) * More work to prevent queries from running when there's in-progress node processing * Remove unused code * Turns out having query running pause on CREATE_NODE is all that's needed * Make state machine local to each instance of the plugin
) The corresponding code was removed in gatsbyjs#8859.
@gaearon reported that reactjs.org was still seeing problems with broken queries
when saving a markdown file multiple times in quick succession.
I investigated and confirmed and found two improvements to our query runningsystem that prevent this. With these fixes, I'm not able to trigger the bug anymore
with VSCode set to autosave every 100ms.
This PR has two fixes:Changing a file can trigger pages to be recreated which often triggerTOUCH_NODE which would change the state back to idle. This wasn't accurrate
as queries haven't yet been run so I added a new state to source-filesystem's
state machine to track that a query has been queued so we actually wait to emit
new nodes until queries have run.
Pause query running when new nodes are created and resume once the processingis finished. A file node could get created which would dirty a page with a
markdown query and the query sometimes runs before the markdown file node had
been recreated which results in an empty query result. Pausing and resuming
queue running fixes that.
After more investigation, it occurred to me that all that was necessary was to pause query running when node processing was happening which is easy to detect. That removed the need for special code in gatsby-source-filesystem & means we'll solve this same problem for any other plugin which might encounter it in the future.