[Feature] Add optional hashing functions for different experimental settings #464
Comments
The closest research I can find is https://michiel.buddingh.eu/distribution-of-hash-values#summary, whose summary section says CRC32 is a good choice: the value space of CRC32 is smaller than that of MD5 or other cryptographic hash functions, but collisions don't affect its distribution, as demonstrated by the experiments in the article. In fact, the maximum buckets Flagr supports is That said, I agree there's a need for the flexibility to run different hash functions, and there's room to provide more experimentation results on various inputs.
@victor-mariano-leite I added a test in the openflagr repo to verify my hypothesis: #35. I would also check the distribution of
Nice! And an interesting point: is it bad practice to use sequential IDs (such as autogenerated SQL IDs) as the entity_id? That is our case, and I was wondering whether, if older users are more likely to be assigned to an experiment (since the trigger that assigns a user to a variant bucket is used more often by retained users), the split might be biased there as well.
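One way to probe that worry empirically is to bucket a block of sequential IDs with CRC32 and look at the spread of bucket counts. This is a rough sketch, not a rigorous statistical test; the bucket count, sample size, and the `bucketCounts` helper are all arbitrary choices for illustration:

```go
package main

import (
	"fmt"
	"hash/crc32"
	"strconv"
)

// bucketCounts hashes the sequential entity IDs 1..n with CRC32 (IEEE)
// and returns how many IDs landed in each of numBuckets buckets.
func bucketCounts(n, numBuckets int) []int {
	counts := make([]int, numBuckets)
	for id := 1; id <= n; id++ {
		h := crc32.ChecksumIEEE([]byte(strconv.Itoa(id)))
		counts[int(h)%numBuckets]++
	}
	return counts
}

func main() {
	counts := bucketCounts(1_000_000, 1000)
	min, max := counts[0], counts[0]
	for _, c := range counts {
		if c < min {
			min = c
		}
		if c > max {
			max = c
		}
	}
	// With a well-distributed hash, the spread around the expected
	// count (1_000_000 / 1000 = 1000 per bucket) should be small.
	fmt.Println("min:", min, "max:", max)
}
```

If sequential IDs did bias the assignment, the min/max spread here would be noticeably wide; a chi-square test over `counts` would be the more principled follow-up.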
Stale issue message
Hi @victor-mariano-leite, @zhouzhuojie, I am very interested in this issue because I am troubleshooting something similar right now. Did you end up finding correlations between treatment assignments under this hash function that would bias your experiments?
The current implementation of Flagr seems to use a CRC32 mapping to generate the entity hash. As far as I know, though, CRC32 is commonly used for purposes other than A/B-test hashing, since as the number of randomization units scales, the likelihood of collisions increases more than with MD5, for example, potentially producing sample ratio mismatches.
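For context, a CRC32-based bucketing scheme along these lines can be sketched as follows. This is a minimal illustration, not Flagr's actual code; the salt, the 1000-bucket count, and the `bucketFor` name are assumptions made here for the example:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// bucketFor maps an entity ID to a bucket in [0, numBuckets) using
// CRC32 (IEEE polynomial). A per-flag salt lets each flag shuffle
// entities independently, so the same user lands in different buckets
// for different flags.
func bucketFor(salt, entityID string, numBuckets uint32) uint32 {
	return crc32.ChecksumIEEE([]byte(salt+entityID)) % numBuckets
}

func main() {
	// A variant rolled out at 50% would then claim buckets [0, 500) of 1000.
	b := bucketFor("my-flag", "user-42", 1000)
	fmt.Println("bucket:", b, "in 50% rollout:", b < 500)
}
```

The key property under discussion is whether the mod-reduced output stays uniform across realistic entity ID populations, not whether the raw 32-bit checksum is collision resistant.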
To validate this scenario, I've gathered a sample experiment from my company.
I suspect this is caused by CRC32, but I'm not sure; is there any way to validate this more rigorously in Flagr?
I've seen MD5 or a Jenkins hash function used to assign units to their variants, since they are collision resistant.
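For comparison, an MD5-based assignment could look like the sketch below. This is illustrative only: taking the first 8 bytes of the digest as a big-endian uint64 is one common convention, not something Flagr does, and `md5Bucket` is a hypothetical name:

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// md5Bucket maps an entity ID to a bucket by interpreting the first
// 8 bytes of the MD5 digest of salt+entityID as a big-endian uint64
// and reducing it modulo numBuckets.
func md5Bucket(salt, entityID string, numBuckets uint64) uint64 {
	sum := md5.Sum([]byte(salt + entityID))
	return binary.BigEndian.Uint64(sum[:8]) % numBuckets
}

func main() {
	fmt.Println(md5Bucket("my-flag", "user-42", 1000))
}
```

The trade-off is speed: MD5 is slower than CRC32 per lookup, which is why an optional, configurable hash function (rather than a wholesale replacement) is the more flexible ask.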
In any case, for flexibility and more general use cases, it would be interesting if we could choose the randomization algorithm, right-sizing it for one's use case.