This repository has been archived by the owner on Dec 20, 2024. It is now read-only.
Restore: fix utf-8 encoding returning buffers #51
History:

- `leveldown`, however, may return data as a buffer, and we've had to account for this. Let's call this the maybe-string problem. The primary solution has been to pass `leveldown` a boolean `*asBuffer` option. If false, `leveldown` returns data as a string.
- When encoding logic moved from `levelup` into `encoding-down`, we didn't take the `*asBuffer` options into account (we didn't realize that at the time). Some stores would return a buffer instead of a string. To deal with that, coercion to string was added to `level-codec` in Fix/utf8 decoding #12 (7.0.0).
- The `*asBuffer` logic was restored in asBuffer fix encoding-down#19; we thought coercion was no longer necessary.

This PR restores the coercion, to work around an ecosystem quirk:
`leveldown` and `memdown` handle strings and buffers differently. While `leveldown` stores both types as a byte array (meaning you can put a buffer and get back a string if so desired, and vice versa), `memdown` stores them as-is (meaning if you put a buffer, you'll get back a buffer; if you put a string, you'll get back a string - simplified). This leads to unexpected behavior.

Another issue (which won't be fixed by this PR but is very relevant) is that `memdown` isn't able to compare a string key to a buffer key (or any other type for that matter); you can only safely use one key type in your db. Possible solutions are discussed in Level/memdown#186. Let's call this the mixed-type problem. It is relevant because:

- We could make `memdown` behave like `leveldown` and thus remove the need for this `level-codec` PR. Before you say "that sounds like the simplest solution", wait...
- We could make `memdown` behave like `level-js`, which doesn't have the maybe-string problem either, albeit for a different reason. It treats strings and buffers as distinct keys and values, even if their bytes are the same. Arguably - especially when viewed outside of the historical context of Level - this is the least-surprising behavior because you get back what you store. Working with binary data is a distinctly different use case from working with utf8 strings. You'll only sometimes have the need to process utf8 data as binary, which you can still do.

So, fixing the mixed-type problem might also fix the maybe-string problem, but we could still choose to merge this PR as a short- to medium-term solution.
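
To make the maybe-string problem concrete, here is a minimal sketch in plain Node.js. `ByteStore`, `AsIsStore` and `decodeUtf8` are hypothetical stand-ins written for this illustration; they are not the real `leveldown`, `memdown` or `level-codec` code, and they only approximate the behavior described above.

```javascript
// Hypothetical stand-ins for illustration only; not the real store APIs.

// Byte-array store: every value is kept as bytes, so the caller picks
// the return type on read via an asBuffer flag.
class ByteStore {
  constructor () { this.data = new Map() }
  put (key, value) { this.data.set(String(key), Buffer.from(value)) }
  get (key, asBuffer) {
    const bytes = this.data.get(String(key))
    return asBuffer ? bytes : bytes.toString('utf8')
  }
}

// As-is store: values come back exactly as they were stored,
// whatever the caller asked for.
class AsIsStore {
  constructor () { this.data = new Map() }
  put (key, value) { this.data.set(String(key), value) }
  get (key, asBuffer) { return this.data.get(String(key)) }
}

// Coercion in a utf8 decoder (assumed shape): buffers become strings,
// strings pass through unchanged.
function decodeUtf8 (data) {
  return Buffer.isBuffer(data) ? data.toString('utf8') : String(data)
}

const byteStore = new ByteStore()
const asIsStore = new AsIsStore()
byteStore.put('k', Buffer.from('hello'))
asIsStore.put('k', Buffer.from('hello'))

const fromBytes = byteStore.get('k', false) // a string, as requested
const fromAsIs = asIsStore.get('k', false)  // still a buffer
const coerced = decodeUtf8(fromAsIs)        // a string after coercion

console.log(typeof fromBytes, Buffer.isBuffer(fromAsIs), coerced)
// prints: string true hello
```

With coercion in the decoder, a caller using a utf8 encoding gets a string from either kind of store, which is the short-term effect this PR aims for. It does nothing for the mixed-type problem, since keys are still compared by the store itself.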