Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deriving Hashable #3446

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

fehrenbach
Copy link
Contributor

This implements deriving support for Hashable, as discussed in purescript/purescript-prelude#188.

The generated code is not quite ideal at the moment. For this datatype:

data Foo = Foo Int Boolean

I basically generate this code:

instance hashableFoo :: Hashable Foo where
  hash (Foo x y) =
    Hash ((0*31
           + case hash x of
               Hash x' => x')*31)
           + case hash y of
               Hash y' => y')

which results in the following code:

var hashableFoo = new Data_Hashable.Hashable(function () {
    return eqFoo;
}, function (x) {
    return (((0 * 31 | 0) + (function () {
        var v = Data_Hashable.hash(Data_Hashable.hashableInt)(x.value0);
        return v;
    })() | 0) * 31 | 0) + (function () {
        var v = Data_Hashable.hash(Data_Hashable.hashableBoolean)(x.value1);
        return v;
    })() | 0;
});

I'm slightly bothered by these IIFEs.
I've tried to generate code with let to unwrap, something like this:

hash (Foo x y) = let (Hash hx) = hash x
                     (Hash hy) = hash y
                 in Hash ((0*31 + hx)*31 + hy)

But I can't make it work. I always get errors about my let-bound names not being found.

If anyone has a clue what I need to do to make let work, even without a constructor like in the following, please let me know.

let forty = 40
   two = 2
in forty + two

This does not seem to work:

forty <- freshIdent "forty"
two <- freshIdent "two"
let fortyplustwo = Let FromLet [BoundValueDeclaration (ss, []) (VarBinder ss forty) (Literal ss (NumericLiteral (Left 40)))
                               ,BoundValueDeclaration (ss, []) (VarBinder ss two) (Literal ss (NumericLiteral (Left 2)))]
                   (App (App (Var ss (Qualified (Just dataSemiring) (Ident C.add))) (Var ss (Qualified Nothing forty))) (Var ss (Qualified Nothing two)))
pure fortyplustwo

@natefaubion
Copy link
Contributor

Let destructuring is desgured before type classes are derived. The equivalent of

hash (Foo x y) = let (Hash hx) = hash x
                     (Hash hy) = hash y
                 in Hash ((0*31 + hx)*31 + hy)

Would be

hash (Foo x y) =
  case hash x of
    Hash hx -> case hash y of
      Hash hy -> Hash (...)

Or

hash (Foo x y) = case hash x, hash y of
  Hash hx, Hash hy -> Hash (...)

@fehrenbach
Copy link
Contributor Author

Ah, that makes sense. Thanks!

@fehrenbach
Copy link
Contributor Author

It took some doing, but this generates decent code now:

data Test = Test Boolean { a :: Int, b :: Array Number }

derive instance eqTest :: Eq Test
derive instance hashableTest :: Hashable Test
var hashableTest = new Data_Hashable.Hashable(function () {
    return eqTest;
}, function (x) {
    var v = Data_Hashable.hash(Data_Hashable.hashableArray(Data_Hashable.hashableNumber))(x.value1.b);
    var v1 = Data_Hashable.hash(Data_Hashable.hashableInt)(x.value1.a);
    var v2 = Data_Hashable.hash(Data_Hashable.hashableBoolean)(x.value0);
    return (((((0 * 31 | 0) + v2 | 0) * 31 | 0) + v1 | 0) * 31 | 0) + v | 0;
});

Thanks for the help, @natefaubion

@fehrenbach fehrenbach changed the title Deriving Hashable [wip] Deriving Hashable Oct 24, 2018
hashBinder :: Ident -> Binder
hashBinder hx = ConstructorBinder ss dataHashableHashNewtypeConstructor [ VarBinder ss hx ]

-- acc * 31 + h -- for want of a better combining function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the tradeoff here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the combining function used by Java for lists etc. (see https://docs.oracle.com/javase/7/docs/api/java/util/List.html#hashCode()). It has issues when used with strings, where certain length prefixes lead to collisions. I'm not sure how much of a problem that is in practice with arbitrary datatypes.

The Haskell hashable library uses combine h1 h2 = (h1 * 16777619) xor h2 (FNV-1, see https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function). I think the reason I didn't use this, is that I thought xor is not inlined like * and +, but it appears I'm wrong about that.

A better hash function would probably be something like murmur_32, which is used by Google's Guava library as the default for short hashes. The main problem is that it's quite a bit more code, which is a pain to generate in deriving. It's also slower to compute.

We could probably do even more clever things in the future. For small types we could find (minimal?) perfect hash functions, for example.

Related: I'm not at all certain that the seeds [0..] are the best we can do.

@garyb garyb mentioned this pull request Jan 12, 2019
3 tasks
@flip111
Copy link

flip111 commented Mar 27, 2020

Would love to see this merged :)

@drathier
Copy link

What is this blocked on now?

@fehrenbach
Copy link
Contributor Author

fehrenbach commented Jan 27, 2021 via email

@JordanMartinez
Copy link
Contributor

My vote is to not include this in the v0.14.0 release. If I recall, this isn't breaking, right? So it could be added in a v0.14.1 release or something? Even if not, the delay has been long enough in my opinion.

@JordanMartinez
Copy link
Contributor

Any updates on this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants