Description
For Type-Based Alias Analysis (TBAA), the frontend emits metadata nodes, and an analysis pass in the middle-end matches access tags to determine whether two pointers may alias. In the original format, TBAA metadata is attached only to load and store instructions.
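As a minimal illustration of what type-based reasoning allows (a hypothetical sketch, not taken from the reproducer; the function name `store_then_reload` is made up for this example): under C's strict-aliasing rules, an `int` access and a `float` access carry different access tags, so the optimizer may assume the store through `f` does not modify `*i`.

```c
/* Hypothetical example: int and float accesses get different TBAA tags. */
int store_then_reload(int *i, float *f) {
    *i = 1;      /* store tagged as an "int" access */
    *f = 2.0f;   /* store tagged as a "float" access */
    return *i;   /* TBAA lets the optimizer forward the value 1 here,
                    because a float store is assumed not to alias an int */
}
```

When the two pointers refer to distinct objects, as valid code requires, the function returns 1 whether or not the optimization is applied.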
With the enhanced struct-path TBAA format (enabled by -new-struct-path-tbaa), however, TBAA nodes are also attached to aggregate memory operations whose semantics can cause two pointers to alias. As a result, memcpy calls are usually tagged with TBAA nodes. This happens in the Clang frontend, here:
llvm-project/clang/lib/CodeGen/CGExprAgg.cpp, line 2200 in bd6e324:
    if (CGM.getCodeGenOpts().NewStructPathTBAA) {
Even though the memcpy is tagged with TBAA metadata nodes, TypeBasedAliasAnalysis does not compute the alias sets correctly, because the structure of these TBAA nodes differs slightly from what it expects. I have two test cases in which alias analysis returns NoAlias for a load and a memcpy that actually alias (the correct answer is MustAlias or MayAlias).
Reproducer: https://godbolt.org/z/7vK714zWx
In the godbolt test case, look at line 15 in the IR tab: an Add %0, %1 is converted to Add %0, %0, which is in turn converted to Shl %0, 1. The optimizer assumes that %0 and %1 (the load that was eliminated) hold the same data. Since b has been copied into a between the two loads, that is not true. This happens because TBAA reports the memcpy and the load of %1 as NoAlias, which is wrong.
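The pattern described above can be sketched in C as follows. This is a hypothetical reduction written for illustration, not the code behind the godbolt link; the struct `S` and the function `sum_before_and_after` are assumed names, and whether this exact snippet triggers the bug has not been verified.

```c
/* Hypothetical sketch of the load / memcpy / load pattern. */
struct S {
    int x;
    int y;
};

int sum_before_and_after(struct S *a, const struct S *b) {
    int before = a->x;  /* first load of a->x (%0 in the description) */
    *a = *b;            /* aggregate copy; Clang typically emits a memcpy,
                           tagged with TBAA under the new format */
    int after = a->x;   /* second load (%1): must observe the copied value */
    return before + after;
}
```

With a = {1, 2} and b = {10, 20}, a correct compilation returns 11. If the second load were wrongly treated as NoAlias with the copy and replaced by the first, the sum would become before + before, which matches the Add %0, %0 to Shl %0, 1 rewrite described above.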
I confirmed the issue after discussing it with the author of enhanced TBAA (@kosarev).
The same issue was also seen in GVN. I have created a detailed page listing the bugs: https://discourse.llvm.org/t/type-based-alias-analysis-giving-incorrect-aliasing-information-with-enhanced-tbaa-format/79455