-
Notifications
You must be signed in to change notification settings - Fork 357
Prevent union-finding cycles for shared qspecs #3011
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Certain graphs caused infinite recursion when unioning nodes to groups based on shared quantization specs while implicit sharing was enabled. The problem occurred when `NodeOrEdge` A with its own `QuantizationSpec` received an edge (in `shared_with_map`) to `EdgeOrNode` B which in turn had a `SharedQuantizationSpec` pointing back to A. Remedy this problem by checking if B, from the scenario above, has a `SharedQuantizationSpec` pointing to A; if that is the case, don't union them together by letting A point back to B. Avoiding the union/cycle preserves correctness because the nodes are effectively already united.
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3011
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit b5e4ea8 with merge base 4dffb40 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
This PR is the proposed solution to the bug I initially reported here in the ExecuTorch repo: pytorch/executorch#13842 |
|
Thanks, can you add a test in https://github.com/pytorch/ao/blob/main/test/quantization/pt2e/test_quantize_pt2e.py for us to understand the issue better? also for the test, add a comment that follows this format: ao/torchao/quantization/pt2e/prepare.py Line 278 in 4dffb40
|
|
@jerryzh168 has imported this pull request. If you are a Meta employee, you can view this in D82597005. |
- Add test case of implicit sharing for a model where one input is shared between two different ops. - Add code comments to `_union_if_no_cycle`
I dont follow why there A -> B edge if B's quant spec points to A |
I've added a test now. Have a look and see what you think :) |
It's quite complicated, but I will attempt to explain. We have the following model: And the following sitatuation of qspecs:
Then algorithm This is what happens when the problem occurs:
Maybe there's a smarter way to solve this problem but I could only come up with the solution to skip producing the edge at step 2. If we skip forming an edge in step 2, the nodes/edges will still be unioned in the end because when the last edge |
thanks for the detailed example! this is very helpful, could you add this explanation to the test as well? looking at the example, seems like this is still a relatively simple case, what if it takes a few more steps for (clone, eq) to point back to (x, eq)? It seems to me, at the high level, we need some ordering of the output nodes and input edges, so that instead of have the pointer always point from whatever edges we are processing to another node or edge, as we are doing in the implicit sharing code, we always follow the ordering instead. for example in this case, we have: the problem right now is (clone, eq) points to (x, eq), and then later (x, eq) points to (clone ,eq) again, forming a loop, due to lack of ordering. let's say, we follow an order that's listed above (when sharing, only higher ordering entities will point to lower ordering entities, not the reverse), even if we need to union (clone, eq) (3) and (x, eq) (4) multiple times, we always union the higher with the lower, and we'll always have (x, eq) points to (clone, eq) for both step 2 and 3 since the ordering of (x, eq) is higher than (clone, eq). We could add a new ordering map probably to assign the ordering for each entities (output node and input edge) in the graph I think. |
I dont think thats the issue. Loop is really of (x, eq) -> (clone, eq) -> clone node -> (x, clone) -> (x, eq) loop. |
Given (x, clone) has its own quant spec why do we end up with shared_with_map[(x, clone)] = (x, eq)? If anything I would have expected, after processing (x, clone), shared_with_map to be empty OR with implicit sharing shared_with_map[(x, eq)] = (x, clone). Basically which ever edge is annotated to have its own qspec should never be a key in the |
Now in this |
even something annotated with its own spec, it could share with other nodes / edges when they are the "same", e.g. output node of op1, and input edge (op1, op2), input edge (op1, op3) are the same thing since we just need to insert one observer, when implicit sharing is enabled. |
I think (clone, eq) points to (x, eq) IIUC, but cc @martinlsm to confirm, it might be helpful to write down |
|
btw can you not detect cycles explicitly? I think the issue is that we need to have one leader node that takes the responsibility of owning the qspec. In situations like it like "i dont have qspec, go ask this other guy" and then we have guy x, asking guy y, askin guy z, asking guy x. So it seems to me that for implicit sharing we have to denote one node in the cycle as leader. |
I agree with @kimishpatel . Even if we decide some order by sorting, there could be shared qspecs pointing in the reverse order and thus we could risk forming a loop if I understand things correctly. |
I just submitted a new version now that is kind of what you suggest here I guess. Instead of skipping forming the union, I instead reverse the edge in the
Then regarding this good argument, I don't believe it is possible to be more steps than one because |
Oh, right I forgot about this one. Will sort that out on Monday. But in the meantime, let me know what you think about the new solution if you got the time. |
I think it similar, if we have ordering, then even if user specifies a sharing with a reverse direction, we'll have the node/edge with a lower ordering_id to hold the final qspec, so we will have to modify the personally I feel the ordering map will help make it easier since no need to worry about cycle detection anymore, but both works I feel |
| # Parent already references child with a shared qspec. We would create | ||
| # a cycle if we formed an edge from the child to the parent. Therefore, | ||
| # we reverse the edge in this particular case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will this be more confusing than just assign an ordering before hand?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also I haven't thought through, but wondering if it's possible that root_child can go around and end up pointing to root_parent again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @kimishpatel is this the cycle detection you have in mind?
seems OK to me, if this is the only thing that's needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure. Isnt the cycle already formed (parent_qspec.edge_or_node == root_child) before we come here. it feels we are detecting that and correcting it. I might be wrong though. I
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kimishpatel Prior to forming the problematic union, we are in the state that is shown in figure below about to assign shared_with_map[(clone,eq)] = (x,eq). So there's not already a union we are correct. We are just reversing the green edge by assigning shared_with_map[(x,eq)] = (clone,eq) to make the edge point in the same direction as the blue one (edge_or_node_to_qspec).
I think cycle detection is probably more easy to reason about. it is clear in its intent. Unless there is a significant concern on other aspect, I would suggest we do cycle detection |
I have done an update regarding this now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, seems like it's the only change that's needed to prevent a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok looks good
I think it's the only thing needed yes! Proving it would be hard but I have not yet found a case where it does not work. But if this problem is ever to be seen again, it would just have to revisited I guess. So we are good to go and can merge this? Thanks guys for the useful feedback and discussion in this PR! |


Certain graphs caused infinite recursion when unioning nodes to groups based on shared quantization specs while implicit sharing was enabled. The problem occurred when
NodeOrEdgeA with its ownQuantizationSpecreceived an edge (inshared_with_map) toEdgeOrNodeB which in turn had aSharedQuantizationSpecpointing back to A. While in this state, looking up the quantization spec in_unwrap_shared_qspecresulted in an endless recursion loop and the program crashed.Remedy this problem by, prior to unioning two trees, checking if the root of the parent has a
SharedQuantizationSpecpointing to the root of the child; if that is the case, reverse the edge by letting parent point to the child. This will ensure that no cycle like the one described above is formed.Add a test case which this commit fixes; namely, a graph where one input edge is shared between two ops. This graph would cause trigger cyclic reference bug that was seen prior to this patch.