Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce DiplomatStr16 #368

Merged
merged 6 commits into from
Nov 22, 2023
Merged

Introduce DiplomatStr16 #368

merged 6 commits into from
Nov 22, 2023

Conversation

robertbastian
Copy link
Collaborator

@robertbastian robertbastian commented Nov 22, 2023

#57

This adds the DiplomatStr16 Rust type, and adds FFI bindings in JS (String), Dart (String), C2 (wchar_t), and CPP2 (std::wstring_view).

Based on #367

@robertbastian robertbastian changed the title Introduce DiplomatWtf16 Introduce DiplomatStr16 Nov 22, 2023
@@ -391,6 +391,13 @@ pub enum TypeName {
SelfType(PathType),
}

#[derive(Clone, PartialEq, Eq, Hash, Serialize, Deserialize, Debug, Copy)]
#[non_exhaustive]
pub enum StringEncoding {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: nice

tool/src/cpp2/ty.rs Outdated Show resolved Hide resolved
const buf = new Uint8Array(wasm.memory.buffer, ptr, len);
return (new TextDecoder("utf-8")).decode(buf)
}

export function readString16(wasm, ptr, len) {
const buf = new Uint16Array(wasm.memory.buffer, ptr, len);
return String.fromCharCode(buf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huzzah no decoding

too bad we can't pass it in directly, but whatever.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm no JS guru, maybe there's a better way to make a String from a WASM ptr and length, but this works for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close but not quite right. String.fromCharCode takes one or more individual arguments where each argument is a UTF-16 code unit.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/fromCharCode

console.log(String.fromCharCode(189, 43, 190, 61));
// Expected output: "½+¾="

You need to use .apply() on the function in order to convert the Uint16Array to an argument list and get the behavior you want. There's an example here:

https://developer.chrome.com/blog/how-to-convert-arraybuffer-to-and-from-string/

For example:

u16a = new Uint16Array([189, 43, 190, 61]);
String.fromCharCode.apply(null, u16a); // "½+¾="

(I think the code you wrote, without the .apply(), doesn't work at all; did you test it?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in #371

@robertbastian robertbastian merged commit 4e1046a into main Nov 22, 2023
5 checks passed
@robertbastian robertbastian deleted the str16 branch November 22, 2023 18:59
Copy link
Contributor

@sffc sffc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw someone mention JS string conversions so I looked at just that one file.

const buf = new Uint8Array(wasm.memory.buffer, ptr, len);
return (new TextDecoder("utf-8")).decode(buf)
}

export function readString16(wasm, ptr, len) {
const buf = new Uint16Array(wasm.memory.buffer, ptr, len);
return String.fromCharCode(buf)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close but not quite right. String.fromCharCode takes one or more individual arguments where each argument is a UTF-16 code unit.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/fromCharCode

console.log(String.fromCharCode(189, 43, 190, 61));
// Expected output: "½+¾="

You need to use .apply() on the function in order to convert the Uint16Array to an argument list and get the behavior you want. There's an example here:

https://developer.chrome.com/blog/how-to-convert-arraybuffer-to-and-from-string/

For example:

u16a = new Uint16Array([189, 43, 190, 61]);
String.fromCharCode.apply(null, u16a); // "½+¾="

(I think the code you wrote, without the .apply(), doesn't work at all; did you test it?)

Comment on lines +89 to +95
static str16 = (wasm, string) => {
return new DiplomatBuf(wasm, string.length, 2, buf => {
for (var i; i < string.length; i++) {
buf[i] = string.codePointAt(i);
}
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want .charCodeAt, not .codePointAt

@hsivonen
Copy link

C2 (wchar_t), and CPP2 (std::wstring_view).

Why wchar_t and std::wstring_view? Neither should ever be used in portable code, since wchar_t on Windows has UTF-16 semantics and on most other present-day-relevant platforms has UTF-32 semantics.

#240 deliberately asked for char16_t and std::u16string_view as these have UTF-16 semantics regardless of Windows vs. other.

@robertbastian
Copy link
Collaborator Author

That's an oversight. I didn't know that w wasn't portable. Will fix.

@robertbastian
Copy link
Collaborator Author

#379

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants