Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filtered AliasSet #723

Open
wants to merge 9 commits into
base: development
Choose a base branch
from
Open

Filtered AliasSet #723

wants to merge 9 commits into from

Conversation

fabianbs96
Copy link
Member

@fabianbs96 fabianbs96 commented May 10, 2024

The LLVMAliasSet contains a lot of spurious aliases due to its context insensitivity and its almost not present field-sensitivity.
Hence, the client analyses based on this alias information produce many false positives and need to propagate many more facts than necessary, resulting in a severe performance hit.

Therefore, some analyses in phasar, such as the IFDSTaintAnalysis and IDEExtendedTaintAnalysis, filter the alias sets on-the-fly to achieve better results.

This PR provides a wrapper FilteredLLVMAliasSet around the LLVMAliasSet that abstracts and caches the filtering.

Copy link
Collaborator

@vulder vulder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you implement the caching (I might have missed it 😆 ) and how exactly do you ensure that the cache is a) not to big and b) how do you do cache invalidation.


class FilteredLLVMAliasSet {
public:
using traits_t = AliasInfoTraits<FilteredLLVMAliasSet>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might call it alias_traits_t so it easier to see what kind of traits one accesses in the following code.

FilteredLLVMAliasSet(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change


FilteredLLVMAliasSet(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(const FilteredLLVMAliasSet &) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) = delete;
FilteredLLVMAliasSet &operator=(FilteredLLVMAliasSet &&) noexcept = delete;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make a difference to declare a deleted function noexcept?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a) you do the same thing for the move ctor and b) with noexcept is the correct signature of the function, so deleting it like this is the proper way

private:
FilteredLLVMAliasSet(MaybeUniquePtr<LLVMAliasSet, true> AS) noexcept;

MaybeUniquePtr<LLVMAliasSet, true> AS;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
MaybeUniquePtr<LLVMAliasSet, true> AS;
MaybeUniquePtr<LLVMAliasSet, true /*requiresAlignment*/> AS;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

return nullptr;
}

[[nodiscard]] static bool isConstantGlob(const llvm::GlobalValue *GlobV) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[[nodiscard]] static bool isConstantGlob(const llvm::GlobalValue *GlobV) {
[[nodiscard]] static bool isConstantGlobalValue(const llvm::GlobalValue *GlobV) {

I find Glob not super easy to understand. I immediatly mixed it with shell globbing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +29 to +31
if (const auto *Glob = llvm::dyn_cast<llvm::GlobalVariable>(GlobV)) {
return Glob->isConstant();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a dyn_cast here? GlobV is already the type you want.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GlobV here is a GlobalValue, but I need it to be a GlobalVariable in order to call isConstant().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sry I miss read that

Comment on lines 98 to 99
if (llvm::isa<llvm::ConstantExpr>(Alias) ||
llvm::isa<llvm::ConstantData>(Alias)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (llvm::isa<llvm::ConstantExpr>(Alias) ||
llvm::isa<llvm::ConstantData>(Alias)) {
if (llvm::isa<llvm::ConstantExpr, llvm::ConstantData>(Alias)) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks; I didn't know about that!

@fabianbs96
Copy link
Member Author

Hi @vulder, thanks for your comments.
To answer your questions:

Where do you implement the caching (I might have missed it 😆 )

The caching is implemented in getAliasSet(), but now I have also implemented it for getReachableAllocationSites()

and how exactly do you ensure that the cache is a) not to big and b) how do you do cache invalidation.

Currently, I have not implemented cache invalidation and just keep the cache growing. In my tests (on some coreutils), it has not been any problem. Also, for IFDS and IDE analyses, it is hard to predict, when an alias set is never used again.

One could implement some ref-counting or least-recently-used strategy, but I havent' implemented it (yet?)

@vulder
Copy link
Collaborator

vulder commented May 13, 2024

Hi @vulder, thanks for your comments. To answer your questions:

Where do you implement the caching (I might have missed it 😆 )

The caching is implemented in getAliasSet(), but now I have also implemented it for getReachableAllocationSites()

and how exactly do you ensure that the cache is a) not to big and b) how do you do cache invalidation.

Currently, I have not implemented cache invalidation and just keep the cache growing. In my tests (on some coreutils), it has not been any problem. Also, for IFDS and IDE analyses, it is hard to predict, when an alias set is never used again.

One could implement some ref-counting or least-recently-used strategy, but I havent' implemented it (yet?)

Ok, just keep in mind invalidation could also be interesting when the cache grows to big. So, even when a set would be needed again afterwards it would be useful to remove it from the cache to reduce the memory footprint.

@@ -155,7 +155,7 @@ void IFDSTaintAnalysis::populateWithMayAliases(
container_type &Facts, const llvm::Instruction *Context) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name Context is misleading/confusing me, context rather resembles a calling context, while here it is used as the statement at which the aliases are requested, in the sense of flow-sensitivity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants