Handle circular dependency upon dependency removal #506

@raejin raejin commented Jan 3, 2020

Summary
This is a proposed fix for issue #500, which I submitted. Please refer to the linked issue for the problem statement and the scope of the bug.

Currently, Metro does not handle dependency removal correctly in the presence of certain circular dependencies. This matters because we rely on the Metro server to produce a correct dependency graph so that only the needed modules are bundled. The problem is especially visible when a dependency is removed from the entry point: because the dependency is part of a cycle, the updated dependency graph still keeps it and all of its dependencies.

I updated one existing test case and added two more, covering the following scenarios, to ensure the correctness of the algorithm:

  1. Remove B from E: removes a dependency with a transitive cyclic dependency
    image

  2. Remove B from E: removes a cyclic dependency which is both an inverse dependency and a direct dependency
    image

  3. Remove B from E: removes a subgraph that has an internal cyclic dependency
    image

Feel free to propose more tests to ensure the correctness of the algorithm. I'm aware that this may introduce more expensive graph updates for certain scenarios, but I believe that ensuring the correctness is far more important for dependency graph updates.

Implementation

Previously, our implementation stopped removing a circular dependency as soon as the module had any remaining inverseDependencies:

if (module.inverseDependencies.size) {
  return;
}

This can be illustrated by this example:
image

When removing B from E, B still has one inverse dependency left, A. Hence, the rest of the graph remains untouched. My proposed solution is to have the async function canSafelyRemoveFromParentModule recursively check all the inverse dependencies. In this example, we look up the inverse dependencies of A, following them all the way up until there are no further inverse dependencies. We can safely remove this dependency if and only if its end inverse dependency (in this case, A has B as its end inverse dependency) has only one path and that path is the same as the parent path.
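
To make the idea concrete, here is a minimal sketch of the E/B/A example on a simplified graph. It is not the code in this PR: the graph shape and the helper names (buildGraph, removeEdge, isOnlyKeptAliveByCycle) are assumptions made for illustration, and the check is a plain upward walk that treats any module without inverse dependencies as top-level.

```javascript
// Hypothetical, simplified model: each module records the paths it depends on
// and the paths that depend on it. This is not Metro's real Graph type.
function buildGraph(edges) {
  const graph = new Map();
  const get = path => {
    if (!graph.has(path)) {
      graph.set(path, {dependencies: new Set(), inverseDependencies: new Set()});
    }
    return graph.get(path);
  };
  for (const [from, to] of edges) {
    get(from).dependencies.add(to);
    get(to).inverseDependencies.add(from);
  }
  return graph;
}

// Drop the edge parent -> child from both sides.
function removeEdge(graph, parent, child) {
  graph.get(parent).dependencies.delete(child);
  graph.get(child).inverseDependencies.delete(parent);
}

// Walk inverse dependencies upward from `start`. If the walk reaches a module
// that nothing depends on (a top-level module such as an entry point), `start`
// is still needed. If every chain only loops back into modules already seen,
// `start` is kept alive by a cycle alone.
function isOnlyKeptAliveByCycle(graph, start) {
  const visited = new Set([start]);
  const stack = [...graph.get(start).inverseDependencies];
  while (stack.length > 0) {
    const path = stack.pop();
    if (visited.has(path)) {
      continue;
    }
    visited.add(path);
    const inverse = graph.get(path).inverseDependencies;
    if (inverse.size === 0) {
      return false; // reached a top-level module other than `start`
    }
    stack.push(...inverse);
  }
  return true;
}

// E -> B, with B and A depending on each other.
const cyclic = buildGraph([['E', 'B'], ['B', 'A'], ['A', 'B']]);
removeEdge(cyclic, 'E', 'B');
const cyclicResult = isOnlyKeptAliveByCycle(cyclic, 'B');

// Same cycle, but E also depends on A directly, so B must stay.
const shared = buildGraph([['E', 'B'], ['E', 'A'], ['B', 'A'], ['A', 'B']]);
removeEdge(shared, 'E', 'B');
const sharedResult = isOnlyKeptAliveByCycle(shared, 'B');
```

In the first graph the only remaining inverse dependency of B is A, and A leads straight back to B, so the sketch reports that B can go. In the second graph the walk reaches E through A, so B is still needed.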

Test plan

  • Updated 1 existing test and added 2 unit tests to ensure the correctness of the dependency removal logic in the presence of circular dependencies.
yarn run jest packages/metro/src/DeltaBundler/__tests__/traverseDependencies-test.js

@facebook-github-bot added the CLA Signed label Jan 3, 2020

codecov-io commented Jan 3, 2020

Codecov Report

Merging #506 into master will increase coverage by 0.02%.
The diff coverage is 90.9%.

@@            Coverage Diff             @@
##           master     #506      +/-   ##
==========================================
+ Coverage   84.07%   84.09%   +0.02%     
==========================================
  Files         175      175              
  Lines        5864     5891      +27     
  Branches      973      981       +8     
==========================================
+ Hits         4930     4954      +24     
- Misses        822      825       +3     
  Partials      112      112
Impacted Files Coverage Δ
...ges/metro/src/DeltaBundler/traverseDependencies.js 94.81% <90.9%> (-1.49%) ⬇️

Last update 984aab8...9193ee1.

inverseDependencies is not empty. Rather, we recursively check the
inverseDependencies to see if it eventually only points to the removed
module. If this is the case, then we need to proceed removing dependency
instead of returning early.
@raejin raejin force-pushed the fix-dependency-removal-bug-when-circular-dependency-happens branch from 2e2b4fb to ae87904 Compare January 3, 2020 23:29

@noahsug noahsug left a comment

Looks great! How much does this impact performance?

Comment on lines 368 to 369
'',
new Set(),

Maybe make these args optional with default values
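
For illustration, a small sketch of what default values could look like. The helper name collectTopLevelModules and its parameters are hypothetical and do not come from the PR. The point it shows is that JavaScript evaluates default expressions on every call, so a `new Set()` default is not shared between calls.

```javascript
// Hypothetical helper: collects the top-level modules (those with no inverse
// dependencies) found above `path`. Names and signature are assumptions.
function collectTopLevelModules(
  path,
  inverseDependencies,
  visited = new Set(), // defaults are evaluated on every call, so each
  found = new Set(), // top-level call starts with fresh sets
) {
  if (visited.has(path)) {
    return found;
  }
  visited.add(path);
  const parents = inverseDependencies.get(path) ?? new Set();
  if (parents.size === 0) {
    found.add(path);
  }
  for (const parent of parents) {
    collectTopLevelModules(parent, inverseDependencies, visited, found);
  }
  return found;
}

// c is required by b, which is required by the top-level module a.
const inverse = new Map([
  ['a', new Set()],
  ['b', new Set(['a'])],
  ['c', new Set(['b'])],
]);
const first = collectTopLevelModules('c', inverse);
const second = collectTopLevelModules('c', inverse);
```

Callers no longer pass placeholder arguments, and the recursive calls still thread the same sets through explicitly.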

* Given `inverseDependencies`, tracing back inverse dependencies to
* see if it only leads back to `parentModule`.
*/
async function canSafelyRemoveFromParentModule<T>(

I think this function is synchronous? In which case we should remove async

Contributor Author

Ahh, nice catch :)

// there isn't circular dependency. Thus, we check if it can be safely remove
// by tracing back the inverseDependencies.
if (
module.inverseDependencies.size &&

@noahsug noahsug Jan 4, 2020

Can we do the module.inverseDependencies.size check inside canSafelyRemoveFromParentModule?

Contributor Author

I think this makes it clearer what the canSafelyRemoveFromParentModule function is meant to do: it signals that if module.inverseDependencies.size is non-zero, we need an additional check of all the inverse dependencies.

cpojer commented Jan 6, 2020

Thank you so much for this fix. I believe this looks good but do you mind answering @noahsug's questions? I am also curious if there is any measurable performance difference but overall I think it is probably fine – all this data is available in the graph already and no expensive I/O needs to be done for this. If it's less than ~100ms additional time spent for a graph of 10k modules with one change I think we can live with this.

raejin commented Jan 16, 2020

@cpojer

> Thank you so much for this fix. I believe this looks good but do you mind answering @noahsug's questions? I am also curious if there is any measurable performance difference but overall I think it is probably fine – all this data is available in the graph already and no expensive I/O needs to be done for this. If it's less than ~100ms additional time spent for a graph of 10k modules with one change I think we can live with this.

After running the initial solution with our entrypoint, which amounts to 8730 modules, I found an obvious performance bug in it (~10s to remove the entrypoint).

After more investigation, I implemented a memoized version which helps avoid unnecessary DFS. With the updated approach, we're looking at 1~2 seconds when removing an entrypoint which has 8730 modules. I also added more tests around various edge cases that I came up with while debugging with our entrypoint.
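
The memoized code itself is not shown in this thread, so the following is only a rough sketch of how caching can cut down repeated upward walks. Everything in it is an assumption: the name createLivenessChecker, the explicit entryPoints set, and the rule that the cache holds only definitive answers and is discarded whenever the graph changes.

```javascript
// Hypothetical helper: returns a function answering "is this module still
// reachable from an entry point?" and caches answers between calls. The cache
// is only valid while the graph does not change.
function createLivenessChecker(inverseDependencies, entryPoints) {
  const memo = new Map(); // module path -> boolean, definitive answers only

  return function isLive(start) {
    if (memo.has(start)) {
      return memo.get(start);
    }
    const visited = new Set([start]);
    const stack = [start];
    let live = false;
    while (stack.length > 0) {
      const path = stack.pop();
      if (entryPoints.has(path) || memo.get(path) === true) {
        live = true;
        break;
      }
      for (const parent of inverseDependencies.get(path) ?? []) {
        // Skip modules already seen in this walk or already known to be dead.
        if (!visited.has(parent) && memo.get(parent) !== false) {
          visited.add(parent);
          stack.push(parent);
        }
      }
    }
    if (live) {
      memo.set(start, true);
    } else {
      // The walk exhausted everything above `start` without finding an entry
      // point, so every module it visited is unreachable as well.
      for (const path of visited) {
        memo.set(path, false);
      }
    }
    return live;
  };
}

// b and c only require each other; d is required by c; a by the entry point.
const inverseDeps = new Map([
  ['entry', new Set()],
  ['a', new Set(['entry'])],
  ['b', new Set(['c'])],
  ['c', new Set(['b'])],
  ['d', new Set(['c'])],
]);
const isLive = createLivenessChecker(inverseDeps, new Set(['entry']));
const liveA = isLive('a');
const liveD = isLive('d'); // also caches b and c as unreachable
const liveB = isLive('b'); // answered from the cache, no new walk
```

Only negative results from an exhausted walk are cached for every visited module. Caching a negative result while a cycle is still being explored could wrongly mark a module as dead, which is why the sketch waits until the whole walk has finished.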

@raejin raejin force-pushed the fix-dependency-removal-bug-when-circular-dependency-happens branch from 76f8c9c to fb93ce0 Compare January 16, 2020 12:27
memoized solution to short circuit any situation when a module does not
need further DFS.
@raejin raejin force-pushed the fix-dependency-removal-bug-when-circular-dependency-happens branch from fb93ce0 to d194e41 Compare January 16, 2020 12:34

@noahsug noahsug left a comment

Nice work!! My comments are mostly style stuff

for (const dependency of module.dependencies.values()) {
removeDependency(module, dependency.absolutePath, graph, delta);
}
await Promise.all(

@noahsug noahsug Jan 16, 2020

I don't think there's anything asynchronous going on here, so we can remove the await Promise.all(. See below for comment.

removeDependency(module, dependency.absolutePath, graph, delta);
}
}
await Promise.all(

I don't think there's anything asynchronous going on here, so we can remove the await Promise.all(. See below for comment.


@raejin raejin Jan 17, 2020

will do!

return canSafelyRemove;
}

async function removeDependency<T>(

I don't think this is doing anything asynchronous, so we can remove async.

Since JavaScript is single-threaded, the recursive await Promise.all( below is actually running synchronously (unless I'm missing something - we're not using jest-worker or anything truly async, right?)
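
A tiny sketch of that point, using a hypothetical removeAsync stand-in rather than the PR's removeDependency: the body of an async function that never awaits real asynchronous work runs to completion as soon as it is called, so mapping over such calls does the work one after another on the current call stack.

```javascript
const log = [];

// Hypothetical stand-in for a graph update that performs no I/O.
async function removeAsync(name) {
  log.push(`remove ${name}`);
}

// Calling the async functions already runs their bodies, one after another,
// on the current call stack; nothing is executed in parallel.
const pending = ['a', 'b', 'c'].map(removeAsync);
log.push('after map');

// Promise.all here only waits on promises that are already fulfilled; its
// callback runs later, as a microtask, after the synchronous code finishes.
Promise.all(pending).then(() => log.push('after Promise.all'));
log.push('end of script');
```

By the time 'after map' is logged, all three removals have been recorded, which is the behaviour a plain loop over a synchronous function would give without the extra promises.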

Contributor Author

ahhh noo I see what you meant. I think I did remove all the necessary await. updating to address this!

const result = getAllTopLevelInverseDependencies(
inverseDependencies,
graph,
'',

nit: a comment here like '', // current module name could be helpful since there isn't a named variable to explain what it does

* this can happen when trying to see if we can safely remove from
* a module that was deleted. This is why we filtered them out with `delta.deleted`
* 2. We have one top module and it is parentModule
*

nit: remove empty line, and comment length seems inconsistent. I'm not sure what the style guidelines are on that

delta: Delta,
): boolean {
const visited = new Set();
const result = getAllTopLevelInverseDependencies(

maybe rename result to inverseDependencies? Since result isn't actually the result the function returns

return true;
}

const filterNotDeletedResult = Array.from(result).filter(

could rename to something like undeletedInverseDependencies

@raejin raejin force-pushed the fix-dependency-removal-bug-when-circular-dependency-happens branch from fa6ce94 to b836f22 Compare January 17, 2020 02:00

cpojer commented Jan 17, 2020

@raejin

> After more investigation, I implemented a memoized version which helps avoid unnecessary DFS. With the updated approach, we're looking at 1~2 seconds when removing an entrypoint which has 8730 modules. I also added more tests around various edge cases that I came up with while debugging with our entrypoint.

Just wanted to clarify this is not the usual case for every change, only when removing a module that has 8k+ dependencies of its own, right? Removing a module here and there won't have a big performance impact, is that correct?

raejin commented Jan 23, 2020

> @raejin
>
> > After more investigation, I implemented a memoized version which helps avoid unnecessary DFS. With the updated approach, we're looking at 1~2 seconds when removing an entrypoint which has 8730 modules. I also added more tests around various edge cases that I came up with while debugging with our entrypoint.
>
> Just wanted to clarify this is not the usual case for every change, only when removing a module that has 8k+ dependencies of its own, right? Removing a module here and there won't have a big performance impact, is that correct?

Yes, that is correct. Even with 8k modules the latency is negligible in my opinion.

@cpojer cpojer left a comment

Great, let's land it. Thanks so much @raejin and @noahsug!

@facebook-github-bot facebook-github-bot left a comment

@cpojer is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@cpojer merged this pull request in 6fa9c13.
