base64 compatibility #49

Open · mattrobenolt opened this issue Aug 19, 2022 · 6 comments

@mattrobenolt (Member) commented Aug 19, 2022

We currently rely on btoa and atob because they work in browsers and in the versions of Node we tested against, but that's not as universal as we had hoped.

We should instead provide a compatibility shim on top of the Buffer API, which is more likely to exist. We might also need a fallback for when neither Buffer nor btoa and friends are available.

Either way, for now it'd be preferable to use the Buffer API over btoa whenever it's available.

I propose something like this, though I don't know the best way to structure it in JavaScript/TypeScript these days:

let b64encode;
let b64decode;
if (typeof Buffer !== "undefined") {
  // Node.js and friends: latin1 maps bytes 1:1 to code points, matching btoa/atob semantics.
  b64encode = (x) => Buffer.from(x, "latin1").toString("base64");
  b64decode = (x) => Buffer.from(x, "base64").toString("latin1");
} else if (typeof btoa !== "undefined") {
  // Browsers and other runtimes that expose btoa/atob natively.
  b64encode = btoa;
  b64decode = atob;
} else {
  // Neither exists natively; we'd need to ship a pure-JS fallback here.
  throw new Error("no native base64 implementation available");
}

console.log(b64encode("foo"));  // "Zm9v"
console.log(b64decode("Zm9v")); // "foo"
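
A pure-JS fallback for that last branch is doable in a few lines. Here's a rough sketch (the helper names are hypothetical, assuming only standard ECMAScript and binary strings whose code points are all <= 0xFF):

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode a binary string three bytes (24 bits) at a time into four base64 chars.
function b64encodeFallback(s) {
  let out = "";
  for (let i = 0; i < s.length; i += 3) {
    const b1 = s.charCodeAt(i);
    const b2 = i + 1 < s.length ? s.charCodeAt(i + 1) : -1;
    const b3 = i + 2 < s.length ? s.charCodeAt(i + 2) : -1;
    out += B64[b1 >> 2];
    out += B64[((b1 & 3) << 4) | (b2 < 0 ? 0 : b2 >> 4)];
    out += b2 < 0 ? "=" : B64[((b2 & 15) << 2) | (b3 < 0 ? 0 : b3 >> 6)];
    out += b3 < 0 ? "=" : B64[b3 & 63];
  }
  return out;
}

// Decode by accumulating 6 bits per character and emitting each full byte.
function b64decodeFallback(s) {
  let out = "";
  let acc = 0;
  let bits = 0;
  for (const ch of s.replace(/=+$/, "")) {
    acc = ((acc << 6) | B64.indexOf(ch)) & 0xffffff; // mask keeps acc within 24 bits
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out += String.fromCharCode((acc >> bits) & 0xff);
    }
  }
  return out;
}

console.log(b64encodeFallback("foo"));  // "Zm9v"
console.log(b64decodeFallback("Zm9v")); // "foo"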

I think this is still reasonably applicable, since we don't use it only for the Authorization header: while that's the only use of btoa, atob is used to decode binary row data, which we do for every row in a QueryResult.

Refs: #47

@vinerz commented Jun 14, 2024

Chiming in: beyond being outdated and not recommended for new projects, atob and btoa severely compromise the decoding of UTF-8 data.

By using these methods, the package naively treats values as ASCII-encoded, destroying any characters outside that encoding. The following TEXT value would be butchered: “Ô meu amigo, nós precisamos reunir o pessoal para o aniversário de José”

I have confirmed the decoding issue locally by forcing btoa to comply with the newer standards, which returns the data correctly decoded. Once that is fixed, another issue arises: the decoder no longer knows the string length (it was previously miscalculated), and the column values leak throughout the entire row, rendering the result useless.
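
To make the corruption concrete, here's a minimal sketch (assuming a runtime that has both Buffer and atob, e.g. recent Node):

const text = "José";                                            // contains a non-ASCII character
const b64 = Buffer.from(text, "utf8").toString("base64");       // "Sm9zw6k="
console.log(atob(b64));                                         // "JosÃ©" -- the raw UTF-8 bytes shown as latin1
console.log(Buffer.from(atob(b64), "latin1").toString("utf8")); // "José" -- bytes re-decoded as UTF-8

atob alone gives back the bytes, not the text; without the second UTF-8 decode the characters look destroyed, and the length no longer matches the original string.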

@mattrobenolt (Member, Author) commented Jun 14, 2024 via email

@vinerz commented Jun 14, 2024

Hey @mattrobenolt! You are right to assume the data received is indeed a correctly encoded binary UTF-8 string. The thing with btoa is that it literally means binary-to-ASCII conversion (and vice versa for atob), so any characters outside that range will be improperly decoded.

Here's a simple example using my test above:

{
  "headers": [
    ":vtg1 /* VARCHAR */"
  ],
  "types": {
    ":vtg1 /* VARCHAR */": "VARCHAR"
  },
  "fields": [
    {
      "name": ":vtg1 /* VARCHAR */",
      "type": "VARCHAR",
      "charset": 255
    }
  ],
  "rows": [
    {
      ":vtg1 /* VARCHAR */": "� meu amigo, nós precisamos reunir o pessoal para o aniversário de José"
    }
  ],
  "rowsAffected": 0,
  "insertId": "0",
  "size": 1,
  "statement": "SELECT \"Ô meu amigo, nós precisamos reunir o pessoal para o aniversário de José\"",
  "time": 1.217693
}

Now, if I monkeypatch atob like this:

globalThis.atob = (str: string) => Buffer.from(str, 'base64').toString('utf-8')

The response is correctly parsed:

{
  "headers": [
    ":vtg1 /* VARCHAR */"
  ],
  "types": {
    ":vtg1 /* VARCHAR */": "VARCHAR"
  },
  "fields": [
    {
      "name": ":vtg1 /* VARCHAR */",
      "type": "VARCHAR",
      "charset": 255
    }
  ],
  "rows": [
    {
      ":vtg1 /* VARCHAR */": "Ô meu amigo, nós precisamos reunir o pessoal para o aniversário de José"
    }
  ],
  "rowsAffected": 0,
  "insertId": "0",
  "size": 1,
  "statement": "SELECT \"Ô meu amigo, nós precisamos reunir o pessoal para o aniversário de José\"",
  "time": 2.099751
}

@mattrobenolt (Member, Author) commented

I'll spend a little time with this later, but I think this is something different. I'll explain if I'm correct.

@vinerz commented Jun 14, 2024

@mattrobenolt it is indeed something different. There was a misunderstanding on my part about the cast property in the connection constructor. I had set it as a function to transform DATETIME fields into real JavaScript Date instances.

In my thought process, I thought cast was a post-processor of the results output, not a fully fledged parser function itself. I see that the package exports the default parser, so I will use that as the default case. Thank you for your time! =)
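
For reference, a sketch of that approach (assuming @planetscale/database's cast(field, value) option and its exported default cast parser; the DATETIME transform itself is hypothetical):

import { connect, cast } from "@planetscale/database";

const conn = connect({
  url: process.env.DATABASE_URL, // illustrative configuration
  cast(field, value) {
    // Transform DATETIME columns into real JavaScript Date instances...
    if (field.type === "DATETIME" && value !== null) {
      return new Date(value.replace(" ", "T") + "Z"); // assumes values are stored as UTC
    }
    // ...and delegate everything else to the package's default parser.
    return cast(field, value);
  },
});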

@mattrobenolt (Member, Author) commented

Ah, yeah, that checks out. It's definitely expected for the raw data to be binary-encoded, so it's entirely safe to run it through atob. Inside cast is where we decode the binary data as UTF-8, if applicable.
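
Put differently, the two-step decode looks roughly like this (a sketch, assuming a runtime with atob and TextDecoder):

// atob recovers the raw bytes as a binary string; TextDecoder then
// interprets those bytes as UTF-8.
const bytes = Uint8Array.from(atob("Sm9zw6k="), (c) => c.charCodeAt(0));
console.log(new TextDecoder().decode(bytes)); // "José"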
