I recently tried to pack a binary-form UUID with this library, and for certain specific UUIDs, it would fail.
I just realized why: you're packing PHP strings as msgpack strings, but msgpack strings are UTF-8 Unicode strings, whereas the PHP string type is just binary data.
Some byte sequences are not valid UTF-8, so packing/unpacking will fail.
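To illustrate why only certain UUIDs fail, here is a minimal sketch that checks whether the 16 raw bytes of a UUID happen to form valid UTF-8. The helper names `uuidToBinary` and `isValidUtf8` and the two sample UUIDs are made up for this example and are not part of any library.

```php
<?php
// Hypothetical helper: turn a textual UUID into its 16-byte binary form.
function uuidToBinary(string $uuid): string
{
    return hex2bin(str_replace('-', '', $uuid));
}

// Hypothetical helper: an empty pattern with the /u modifier makes PCRE
// validate the subject as UTF-8; preg_match() returns false on invalid input.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// All bytes are ASCII except "c3 a9", which is a valid two-byte sequence.
$lucky = uuidToBinary('00112233-4455-46c3-a901-020304050607');

// Starts with 0xf4 followed by 0x7a, which is not a continuation byte.
$unlucky = uuidToBinary('f47ac10b-58cc-4372-a567-0e02b2c3d479');

var_dump(isValidUtf8($lucky));   // bool(true)
var_dump(isValidUtf8($unlucky)); // bool(false)
```

Whether a given UUID survives a round trip as a msgpack string therefore depends purely on its byte values.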
I believe PHP strings should be packed as the binary type in msgpack.
The problem, of course, arises when you know you have Unicode strings and the msgpack recipient on the other end is not a PHP script and expects them to be encoded as strings.
Since there's no Unicode string type in PHP, the rybakit/msgpack package actually goes so far as to use a UTF-8 detection/validation pattern, which must have detrimental performance implications.
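For reference, a detection-based approach of that general kind could look roughly like the sketch below. This is not the code of rybakit/msgpack or of any other package: `packDetected` is a hypothetical helper, it only handles payloads shorter than 256 bytes, and the header bytes are taken from the msgpack format (fixstr `0xa0`-`0xbf`, str 8 `0xd9`, bin 8 `0xc4`).

```php
<?php
// Hypothetical helper: choose the msgpack str or bin family by validating
// the payload as UTF-8 first. The validation pass is the extra per-string cost.
function packDetected(string $value): string
{
    $length = strlen($value);
    if ($length > 255) {
        throw new InvalidArgumentException('Sketch only handles up to 255 bytes.');
    }

    if (preg_match('//u', $value) === 1) {
        // Valid UTF-8: fixstr for up to 31 bytes, otherwise str 8.
        return $length < 32
            ? chr(0xa0 | $length) . $value
            : "\xd9" . chr($length) . $value;
    }

    // Not valid UTF-8: fall back to bin 8.
    return "\xc4" . chr($length) . $value;
}

echo bin2hex(packDetected('abc')), PHP_EOL;      // a3616263
echo bin2hex(packDetected("\xff\x00")), PHP_EOL; // c402ff00
```

Note that a binary value which happens to be valid UTF-8 would still be sent as a string under this scheme, so the receiver cannot rely on the type alone.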
I guess there's no way to do this "right" in PHP, but if you're going to encode PHP strings as strings, the README should at least note that binary strings aren't supported.
Alternatively, you could try the slightly faster UTF-8 string detection "hack" I used here.
> I believe PHP strings should be packed as the binary type in msgpack.
Also, it is worth mentioning that you can pack UUIDs by calling $packer->packBin() or by setting the string detection mode to Packer::FORCE_BIN. In both cases, the packer will skip detecting the string type.
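To show what forcing the bin type means on the wire, here is a small self-contained sketch that hand-encodes a UUID as msgpack bin 8 (header byte `0xc4`, one length byte, then the raw bytes). It deliberately does not call the library: `packAsBin8` is a hypothetical helper written for this illustration, and the sample UUID is arbitrary.

```php
<?php
// Hypothetical helper: encode up to 255 raw bytes as msgpack "bin 8".
// No UTF-8 check is involved, so any byte sequence is accepted.
function packAsBin8(string $bytes): string
{
    $length = strlen($bytes);
    if ($length > 255) {
        throw new InvalidArgumentException('bin 8 holds at most 255 bytes.');
    }

    return "\xc4" . chr($length) . $bytes;
}

// This UUID is not valid UTF-8 in binary form, yet it packs without issue.
$uuid = hex2bin(str_replace('-', '', 'f47ac10b-58cc-4372-a567-0e02b2c3d479'));

echo bin2hex(packAsBin8($uuid)), PHP_EOL;
// c410f47ac10b58cc4372a5670e02b2c3d479
```

Because the type is chosen up front, there is nothing to detect and the payload is copied through untouched.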