-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HashMaps comparison always seems to return the same hash code #47
Comments
Hmm.. i mean hash key collisions are a performance issue, the only thing that would be a correctness issue would be if two equal values had different hashes. And then, if I touch various hashmaps, I can get various hash values:
For hashmaps, the hash code is calculated by summing the hashcodes of keys and values in the map: hashCode(): number {
return this.hamt.fold(
(acc: number, value: V, key: K & WithEquality) =>
getHashCode(key) + getHashCode(value), 0);
} One thing here is that we must make sure that the order of elements is not relevant in the computation of the hash code for hashmaps. So for instance, when we compute the hash code for vectors, we do: hashCode(): number {
let hash = 1;
for (let i=0;i<this.length();i++) {
hash = 31 * hash + getHashCode(L.nth(i, this._list));
}
return hash;
} So, for every new element we start on the hash for the whole vector so far, times 31. But that means that the order of elements in the the vector is relevant, so we can't do that directly in the HashMap hash code computation. So then we get to the hashcode computation for numbers and strings (since you have string keys, integer values). For numbers, we use some 'magic' (from my point of view certainly) formula from immutablejs (src/Comparison.ts, line 111). So now I'll just check whether there could be a way of combining the hash values of the hashmap elements in a less naive way than simple summing, that would keep the attribute that order doesn't matter. I'll peek quickly at what immutablejs and some java libraries do. |
So, I see that vavr does the same, it just sums elements of the hashmap: // hashes the elements regardless of their order
static int hashUnordered(Iterable<?> iterable) {
return hash(iterable, Integer::sum);
}
private static int hash(Iterable<?> iterable, IntBinaryOperator accumulator) {
if (iterable == null) {
return 0;
} else {
int hashCode = 1;
for (Object o : iterable) {
hashCode = accumulator.applyAsInt(hashCode, Objects.hashCode(o));
}
return hashCode;
}
} In addition the hashmap for elements is obtained by hashing the array of For that, we could modify the hashing function for hashmaps, to not just sum hash(key) and hash(value), but instead using prelude's This would be probably better, because it does: export function fieldsHashCode(...fields: any[]): number {
// https://stackoverflow.com/a/113600/516188
// https://stackoverflow.com/a/18066516/516188
let result = 1;
for (const value of fields) {
result = 37*result + getHashCode(value);
}
return result;
} meaning at least we'd have 37*hash(key)+hash(value) instead of directly hash(key)+hash(value). I believe this would help with your case. So I'd probably stop with the research, try that change and if it helps with your use case and doesn't break any prelude-ts tests, then merge that to master. |
aaaaaaah there was a bug in the hashCode() for hashMap... hashCode(): number {
return this.hamt.fold(
(acc: number, value: V, key: K & WithEquality) =>
- getHashCode(key) + getHashCode(value), 0);
+ acc + getHashCode(key) + getHashCode(value), 0);
} it was returning the hashcode of the last element traversed... I'll do also the |
Hi, I pushed prelude-ts 1.0.1 to npm with the fix. Right now it's not visible on the npm website yet, but it's probably just a matter of a cache refresh, the release should be there. |
Ahh, that is such a silly mistake, I looked at this code trying to figure out what was going wrong and also completely missed it. I love it! Thanks for the swift response/fix!! |
Hi,
Given the following
All hash maps are giving me the exact same hash code:
I'd expect some conflicts to be possible given the [comment] above the hashcode function, but since all 3 are returning
1696
even after arbitrary changes, this means we basically can't compare hash maps.This is on the latest version of PreludeTS.
I'd love to help to fix this but the logic behind generating the hashcode is a bit above my head right now ):
The text was updated successfully, but these errors were encountered: