-
Notifications
You must be signed in to change notification settings - Fork 341
[Index] Only hash Decl
once when generating the hash value for a record
#10846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please test llvm |
// declaration. The declaration IDs serialized above is always greater than | ||
// zero, so there is no ambiguity between newly serialized decls and the | ||
// serialization of declaration IDs. | ||
HashBuilder.add(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't really matter, but is this actually needed 🤔?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn’t convince myself that there is no chance that two hashes look the same even though the values that went into them weren’t, ie. if the declaration ID matches the first character of the serialized declaration and things from there … just happen to line up – this just eliminates the possibility. I guess it doesn’t really matter because this is just a hash anyway and we can’t rule out conflicts anyway, but maybe better to be safe here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems exceedingly unlikely to me as we end up adding the type with a #
in first. But 🤷 Happy to keep if you really want it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed it. Saves a little bit of work, it’s unlikely that we get this sort of conflict and since we’re hashing, we technically don’t have a guarantee about conflict-freeness anyway.
/// process runs. | ||
/// | ||
/// See `hashDecl` for its use. | ||
std::map<const Decl *, size_t> HashedDecls; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use a DenseMap here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, nothing particular. Probably just a result of my various iterations of this PR. Changed it to a DenseMap
.
7ac7d49
to
8386c45
Compare
@swift-ci Please test llvm |
…cord Say we are indexing the header file containing `NSArray`. Almost every declaration in here has a relation to `NSArray` (mostly child-of relations). This means that for every of those declarations we hash `NSArray` again by creating a `DeclHashVisitor` that, among others, adds the string `NSArray` to the hash builder. To avoid this, keep track of all the declarations that we have already incorporated into the hash. If we did already hash the declaration, only incorporate an integer into the hash.
@swift-ci Please test llvm |
Say we are indexing the header file containing
NSArray
. Almost every declaration in here has a relation toNSArray
(mostly child-of relations). This means that for every of those declarations we hashNSArray
again by creating aDeclHashVisitor
that, among others, adds the stringNSArray
to the hash builder.To avoid this, keep track of all the declarations that we have already incorporated into the hash. If we did already hash the declaration, only incorporate an integer into the hash.
This reduces the overhead of index-while-building by 16% (3.58% overhead to 3.00% overhead), based on a single test project.