Which base62 algorithm does this implement? #29

Open
coolaj86 opened this issue Dec 16, 2021 · 8 comments

@coolaj86

My understanding is that there is no formal spec for base62, but that the "glowfall" implementation (despite its lack of stars) has become the de facto implementation (used the most across the most repos).

Does this follow that spec? Or a different one? Or create a new one?

@tuupola
Owner

tuupola commented Dec 17, 2021

I have not heard of glowfall before. This library implements a mathematical, byte-by-byte base conversion of arbitrary data. There is really only one way to do it. Standards such as base85, which have extra rules for compressing spaces etc., are not pure base conversions.
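
For readers who want to see what a pure base conversion looks like, here is a minimal Python sketch. It is not taken from this library's source. It assumes the bytes are read as one big-endian integer and that each leading zero byte is written as one leading "0" character, which is consistent with the four-zero output reported for `\x00\x00\x00\x00\xff\xff\xff\xff` later in this thread. The function names `base62_encode` and `base62_decode` are hypothetical.

```python
GMP_ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def base62_encode(data: bytes, alphabet: str = GMP_ALPHABET) -> str:
    # Leading zero bytes vanish when the data is read as an integer,
    # so count them and re-add one alphabet[0] per byte (assumed convention).
    stripped = data.lstrip(b"\x00")
    zeros = len(data) - len(stripped)
    number = int.from_bytes(stripped, "big")
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(alphabet[remainder])
    return alphabet[0] * zeros + "".join(reversed(digits))


def base62_decode(text: str, alphabet: str = GMP_ALPHABET) -> bytes:
    # Mirror of the encoder: leading alphabet[0] characters become zero bytes.
    stripped = text.lstrip(alphabet[0])
    zeros = len(text) - len(stripped)
    number = 0
    for char in stripped:
        number = number * 62 + alphabet.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return b"\x00" * zeros + body
```

Under these assumptions, `base62_encode(b"\xff\xff\xff\xff")` gives `4gfFC3`. Libraries that split the input into fixed-size bit chunks instead of treating it as one integer will produce different strings even with the same alphabet.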

By default this library uses 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz as its character set. It is the same one the GNU Multiple Precision Arithmetic Library (GMP) uses. GMP has been around since the early 1990s. I personally like this character set the most, since the encoded base62 strings preserve the sort order of the encoded values.
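
The sort-order point can be illustrated with a small Python sketch. It assumes fixed-width encodings (the property is easiest to see when all strings have the same length) and compares the GMP character set with a second alphabet, `LOWER_FIRST`, chosen here only for illustration. The helper `encode_fixed` is hypothetical and not part of this library.

```python
GMP = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
# Illustrative alphabet with lowercase before uppercase (not in ASCII order).
LOWER_FIRST = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode_fixed(number: int, alphabet: str, width: int = 4) -> str:
    # Encode a non-negative integer as exactly `width` base62 digits.
    digits = []
    for _ in range(width):
        number, remainder = divmod(number, 62)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def keeps_order(values: list[int], alphabet: str) -> bool:
    # True if sorting the encoded strings gives the same order as sorting the numbers.
    by_string = sorted(encode_fixed(v, alphabet) for v in values)
    by_value = [encode_fixed(v, alphabet) for v in sorted(values)]
    return by_string == by_value


values = [5, 40, 12, 61, 3000, 62]
gmp_keeps_order = keeps_order(values, GMP)
lower_first_keeps_order = keeps_order(values, LOWER_FIRST)
```

Here `gmp_keeps_order` is `True` because the GMP characters appear in ascending ASCII order, while `lower_first_keeps_order` is `False` because plain string comparison ranks uppercase letters before lowercase ones.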

@coolaj86
Author

I think I was mistaken. The GMP, GnuPG, and Saltpack implementations seem to be the most recognized.

Would you mind giving example output for your implementation compared to these references?

Raw   : "Hello, 世界"
Base64: SGVsbG8sIOS4lueVjA (18 chars)
Base62: 1wJfrzvdbuFbL65vcS (18 chars)

Raw   : "Hello World"
Base64: SGVsbG8gV29ybGQ (15 chars)
Base62: 73XpUgyMwkGr29M (15 chars)

Raw   : [ 0, 0, 0, 0, 255, 255, 255, 255 ]
Base64: AAAAAP____8 (11 chars)
Base62: 000004gfFC3 (11 chars)

Raw   : [ 255, 255, 255, 255, 0, 0, 0, 0 ]
Base64: _____wAAAAA (11 chars)
Base62: LygHZwPV2MC (11 chars)

@rauanmayemir

I too am having issues with this. I went through a bunch of Go libraries implementing base62:

  1. None of them use the 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz alphabet, and there is not even an option to set it
  2. Even with the alphabet set manually, the decoded content doesn't match (I use the default GMP character set from this library)

@tuupola
Owner

tuupola commented Jan 16, 2025

> Even with the alphabet set manually, the decoded content doesn't match (I use default GMP from this lib)

Does not match what exactly?

@tuupola
Owner

tuupola commented Jan 16, 2025

@coolaj86 I can see there is a difference in number of leading zeroes in \x00\x00\x00\x00\xff\xff\xff\xff. Which library generated the reference output?

| data | character set | output |
| --- | --- | --- |
| Hello, 世界 | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz | 1wJfrzvdbuFbL65vcS |
| Hello World | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz | 73XpUgyMwkGr29M |
| \x00\x00\x00\x00\xff\xff\xff\xff | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz | 00004gfFC3 |
| \xff\xff\xff\xff\x00\x00\x00\x00 | 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz | LygHZwPV2MC |

@rauanmayemir

@tuupola I tried decoding output encoded with this library and couldn't get it to match the original data.

Fortunately, I found a library that successfully decodes data encoded with your library! 🎉 https://github.com/deatil/go-encoding/blob/30f6f00c7215fcc36f3969458254e4287b3da000/base62/base62.go

@tuupola
Owner

tuupola commented Jan 16, 2025

@rauanmayemir If you use a Go library with a different character set, you could also use the characters setting of this library to match the character set of the Go library.

https://github.com/tuupola/base62?tab=readme-ov-file#character-sets
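
When the data is already encoded and cannot be re-encoded, another option is to remap the characters of the existing string from one alphabet to the other before decoding. The Python sketch below shows the idea. It assumes both sides use the same underlying base conversion and differ only in character set. The `OTHER` alphabet and the `translate_alphabet` helper are hypothetical and stand in for whatever the second library actually uses.

```python
GMP = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
# Hypothetical alphabet of the other library (lowercase before uppercase).
OTHER = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def translate_alphabet(encoded: str, source: str, target: str) -> str:
    # Map each character by its position: source[i] becomes target[i].
    table = str.maketrans(source, target)
    return encoded.translate(table)


# A string produced with the GMP character set, rewritten for the other alphabet.
remapped = translate_alphabet("4gfFC3", GMP, OTHER)
```

With these two alphabets `remapped` is `4GFfc3`. If the other library uses a different conversion algorithm, remapping characters alone will not make the decoded bytes match.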

@rauanmayemir

@tuupola I get it, but the encoded data originated from this library, so I can't go back. Even after setting the characters, most Go libraries don't return the expected result. I'm just glad I found one implementation that matches.
