Unicode alphabets not supported? #22
Comments
Thanks for bringing it up. Do you know what other implementations do (e.g. JS or PHP)? The underlying issue can have two reasons:
Given all this information, it might be possible to support tuples as the alphabet and avoid the character representation complications altogether. Pull requests for that functionality would be very welcome. In the meantime, you could probably unblock yourself by using a mapping mechanism in combination with hashids.
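A minimal sketch of such a mapping mechanism, assuming Python 3 (where a `str` is a sequence of code points). The alphabets and the helper names `to_emoji` and `to_ascii` are hypothetical and not part of hashids; the idea is to let hashids work with a plain ASCII alphabet and translate its output character by character.

```python
# Hypothetical alphabets: 16 ASCII characters (hashids needs at least 16
# unique characters) and 16 face emoji, U+1F600 through U+1F60F.
ASCII_ALPHABET = "abcdefghijklmnop"
EMOJI_ALPHABET = "".join(chr(0x1F600 + i) for i in range(len(ASCII_ALPHABET)))

# One-to-one translation tables in both directions.
TO_EMOJI = str.maketrans(ASCII_ALPHABET, EMOJI_ALPHABET)
TO_ASCII = str.maketrans(EMOJI_ALPHABET, ASCII_ALPHABET)


def to_emoji(hashid: str) -> str:
    """Translate an id written in ASCII_ALPHABET into emoji."""
    return hashid.translate(TO_EMOJI)


def to_ascii(emoji_id: str) -> str:
    """Translate an emoji id back so it can be handed to the decoder."""
    return emoji_id.translate(TO_ASCII)


# "ponmlk" is a stand-in for an id produced with ASCII_ALPHABET.
emoji_id = to_emoji("ponmlk")
assert to_ascii(emoji_id) == "ponmlk"
```

Since the translation is a bijection between the two alphabets, the round trip is lossless and the library itself never sees a non-ASCII character.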
The PHP implementation seems to have the same limitation. The Swift implementation's readme mentions that it supports emoji alphabets by using an array rather than a string internally: https://github.com/malczak/hashids. I'm unblocked myself, having implemented my own thing in a non-generic way (and in PHP, incidentally).
If Swift supports arrays, we can certainly support lists/tuples.
My understanding from the Swift readme is the array-thing is an inner implementation detail, and not a change in API (i.e. alphabet is still passed in from outside as a string). Not sure if that would affect your plan here. I might actually be able to speak to this (a little bit, from a JavaScript perspective, anyway):
In JavaScript, characters that need more than 2 bytes are also represented as pairs of 16-bit code units ("surrogate pairs"), i.e.
e.g.
Maybe the same principles hold for Python too?
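As a rough illustration of the surrogate-pair point, here is how Python 3 treats such a character. This sketch assumes Python 3, where `len()` counts code points; Python 2 "narrow" builds behaved more like JavaScript and counted 16-bit units instead.

```python
face = "\U0001F600"  # a face emoji outside the Basic Multilingual Plane

# Python 3 indexes strings by code point, so the emoji is one character.
code_points = len(face)

# Encoded as UTF-16, the same character needs two 16-bit code units,
# which is the surrogate pair JavaScript exposes as string length 2.
utf16_bytes = face.encode("utf-16-be")
utf16_units = len(utf16_bytes) // 2

assert code_points == 1
assert utf16_units == 2
assert utf16_bytes.hex() == "d83dde00"  # high surrogate D83D, low surrogate DE00
```

Under this assumption an alphabet can be treated as a plain sequence of characters in Python 3 as long as every symbol is a single code point.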
Hi @blukis, sorry for going underground for so long. Is this still a feature you'd be interested in? Are more implementations able to handle emojis and unicode combinators these days? There is a whole rabbit hole with Unicode normalisation waiting, but maybe we can do something simple.
I can't be of much help on this unfortunately, nor do I have a use for it currently. I wound up kind of "pivoting to PHP" for that project, and the need went away.
Fair enough, I will close this issue for now. If you or anybody else ends up needing this, I will take another look. I would assume there is existing functionality to split Unicode strings these days.
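One simple way to split a Unicode string into alphabet symbols with only the standard library is sketched below. It assumes Python 3, and the function name `split_symbols` is hypothetical. It only attaches combining marks to the preceding character; it is not full grapheme cluster segmentation, so multi-code-point emoji sequences would still be split apart.

```python
import unicodedata


def split_symbols(text: str) -> list:
    """Split text into symbols, keeping combining marks with their base."""
    symbols = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if symbols and unicodedata.combining(ch):
            symbols[-1] += ch
        else:
            symbols.append(ch)
    return symbols


# "e" followed by U+0301 (combining acute accent) stays together as one symbol.
parts = split_symbols("e\u0301a")
assert parts == ["e\u0301", "a"]
```

A list of symbols produced this way is the kind of value a list/tuple alphabet could accept, sidestepping the question of how a single string is indexed.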
It seems like unicode alphabets are not supported, but perhaps I am doing something wrong? e.g. the following code (Google Cloud Platform environment)...
I get this output (in a browser):
(i.e. I'd expect hashid to be some combination of faces from alphabet, but they are appearing as [mostly] unknown characters.) Any possibility to add support for these funkier alphabets? Thanks!