CESU-8 support in XS to address UTF-8/UTF-16 confusion risk #2118
Comments
No question - XS stores strings internally as UTF-8. How is that visible to JavaScript code? Our conformance document describes the places where strings in XS do not conform to ECMA-262. But it isn't necessarily the case that those issues are unsolvable when strings are stored as UTF-8, or that they make the fact that strings are stored as UTF-8 visible to scripts. Apologies if this seems overly precise. I only want to be certain what this issue is referring to.
I just mean things like
It doesn't necessarily allow JS code to tell that the internal encoding is UTF-8, but it's a difference from other implementations that is visible from JS code.
Precision is exactly what we need here. I don't know for certain that this difference represents an exploitable vulnerability. I just have nightmares from reading The Tangled Web, which is full of situations where differences in implementations turned into security vulnerabilities. For example, on p. 30:
I can imagine some JS code that splits at the 10th character, which could lead to problems like the ones derived from browsers splitting URLs at |
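To make the splitting concern concrete, here is a small sketch, assuming a spec-conforming engine such as V8 where string indices count UTF-16 code units. The helper names are hypothetical, and the sketch says nothing about how XS itself behaves.

```javascript
// Hypothetical helpers, not taken from any Agoric or Moddable code.

// Split by UTF-16 code units, which is what String.prototype.slice does per ECMA-262.
const takeCodeUnits = (s, n) => s.slice(0, n);

// Split by Unicode code points: string iteration keeps surrogate pairs together.
const takeCodePoints = (s, n) => Array.from(s).slice(0, n).join('');

// True if the string ends with a high (leading) surrogate, i.e. half of a pair.
const endsWithLoneHighSurrogate = s => {
  if (s.length === 0) return false;
  const last = s.charCodeAt(s.length - 1);
  return last >= 0xd800 && last <= 0xdbff;
};

// Nine ASCII characters followed by U+1F600, which needs two code units.
const sample = 'abcdefghi\u{1F600}!';

const byUnits = takeCodeUnits(sample, 10); // cuts the surrogate pair in half
const byPoints = takeCodePoints(sample, 10); // keeps the pair intact
```

Under that assumption, `endsWithLoneHighSurrogate(byUnits)` is true while `endsWithLoneHighSurrogate(byPoints)` is false, so code that splits "at the 10th character" can produce different strings depending on what an engine counts as a character.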
Just to emphasize, we do not expect this to be a problem and are not asking for it to be fixed. Rather, as @dckc says, it is a difference we need to stay aware of, in case it does lead to an unpleasant surprise. Just being thorough.
@erights - understood. To state the obvious: every engine has variances from ECMA-262. Scripts can detect those. But beyond that, there's nothing here which is obviously exploitable - though the title suggests there might be.
Some tangible impact: endojs/endo#737, endojs/endo#739 cc @kriskowal
@michaelfig what is this UTF-16 widget in our IBC stuff?
I rely on the accuracy of conversions in https://github.com/Agoric/agoric-sdk/tree/master/SwingSet/src/vats/network/bytes.js. @phoddie, does this hold under XS?
const octets = new Array(256).fill(0).map((_, i) => i);
// We receive some external JSON data (from Golang) in the following format:
const bstring1 = String.fromCharCode.apply(null, octets);
const bstring2 = JSON.parse(JSON.stringify(bstring1));
assert(bstring1 === bstring2);
// We want to extract the octets:
const octets2 = bstring2.split('').map(c => c.charCodeAt(0));
// And to be able to reencode them:
const bstring3 = String.fromCharCode.apply(null, octets2);
assert(bstring2 === bstring3);
I'm not saying that XS needs to support this usage, only that we need a coherent way to encode and transport bytes through our marshalling from Golang JSON. The existing implementation in
Again, this should be properly solved with the correct (and hopefully ES or Web JS standard) abstractions/APIs. Choosing how to do this is for people who understand XS and the standards. I don't have much choice on the Golang side, short of exploding the Golang
But if we have a good solution on the JS side that needs a change, I will adopt it and change Golang to match. I just don't want to exchange one hack for another.
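The round trip in this comment only works when every character of the string is a single code unit in the range 0 to 255. A minimal sketch of conversions that check that assumption explicitly could look like the following; the function names are hypothetical and are not the contents of bytes.js.

```javascript
// Hypothetical helpers for illustration only; the real bytes.js may differ.

// Build a "byte string": one UTF-16 code unit per octet.
const octetsToByteString = octets => {
  let out = '';
  for (const o of octets) {
    if (!Number.isInteger(o) || o < 0 || o > 0xff) {
      throw new RangeError(`not an octet: ${o}`);
    }
    out += String.fromCharCode(o);
  }
  return out;
};

// Recover the octets, rejecting any code unit that does not fit in a byte.
const byteStringToOctets = bstring => {
  const octets = new Uint8Array(bstring.length);
  for (let i = 0; i < bstring.length; i += 1) {
    const unit = bstring.charCodeAt(i);
    if (unit > 0xff) {
      throw new RangeError(`code unit out of byte range at index ${i}`);
    }
    octets[i] = unit;
  }
  return octets;
};
```

Failing loudly on out-of-range code units means an encoding mismatch between engines would surface as an error at the boundary rather than as silently altered bytes.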
Sorry, didn't mean to tag you, Peter, when one of us could easily check this equivalence on XS.
No problem. I hadn't caught up with this issue yet anyway. FWIW (and only tangentially relevant) - if the patch I shared with @kriskowal early today to |
Thanks for the test sketch. Yes, it holds.
I tweaked the test sketch to run with our other xsnap tests: 658830e. please excuse the temporary lack of branch discipline. Expect a cleaned up PR in due course. p.p.s. I wonder if using |
Ok, but what about the JS spec? Is it compatible with the JSON spec?
It looks like the relevant piece of the ECMA-262 spec is https://262.ecma-international.org/12.0/#sec-quotejsonstring. If I read this right, the spec says to do some quoting stuff so the whacky bits of the JavaScript string will round-trip through JSON, but the strings in the JSON itself won't contain the whacky bits. Not 100% sure I'm reading it right though.
I have to admit I don't understand the JS spec on that. It seems mostly to create a new UTF-16 string, escaping code points that are not valid and some of the ASCII chars that need to be escaped. I suppose that it's then used somewhere else and converted to UTF-8, since it can assume it's a valid UTF-16 encoded string?
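One way to check that reading is to try a few strings, assuming an engine that implements the QuoteJSONString algorithm linked above (ES2019 and later, for example recent V8). This is only a sketch with illustrative variable names, and it does not show what XS does.

```javascript
// A lone (unpaired) surrogate is written as a \uXXXX escape,
// so the JSON text itself contains no unpaired surrogate.
const loneSurrogate = JSON.stringify('\uD83D');

// A properly paired surrogate is left as-is, not escaped.
const paired = JSON.stringify('\uD83D\uDE00');

// Control characters below U+0020 are escaped as well.
const nul = JSON.stringify('\u0000');

// Parsing the escaped form restores the original code unit.
const roundTrip = JSON.parse(loneSurrogate);
```

On such an engine, `loneSurrogate` is the eight-character text `"\ud83d"` (quotes included), while `paired` still contains the two original code units between its quotes, and `roundTrip` equals the original one-code-unit string.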
Jan 10 update from @phoddie: "our investigation into CESU-8 support has gone well. We are working through the full set of changes now. ..."
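For readers unfamiliar with CESU-8, the following sketch shows how it differs from UTF-8: each UTF-16 code unit, including each half of a surrogate pair, is encoded separately using the one-to-three-byte UTF-8 pattern. The function names are hypothetical and this is not the XS implementation, only an illustration of the encoding.

```javascript
// Encode a single value below 0x10000 using the 1-3 byte UTF-8 pattern.
const encodeUpTo3Bytes = u => {
  if (u < 0x80) return [u];
  if (u < 0x800) return [0xc0 | (u >> 6), 0x80 | (u & 0x3f)];
  return [0xe0 | (u >> 12), 0x80 | ((u >> 6) & 0x3f), 0x80 | (u & 0x3f)];
};

// CESU-8: encode every UTF-16 code unit on its own, surrogates included.
const toCESU8 = s => {
  const bytes = [];
  for (let i = 0; i < s.length; i += 1) {
    bytes.push(...encodeUpTo3Bytes(s.charCodeAt(i)));
  }
  return bytes;
};

// UTF-8: encode whole code points; supplementary characters take 4 bytes.
// Simplification: a lone surrogate is emitted with the 3-byte pattern here,
// which strict UTF-8 encoders would reject or replace.
const toUTF8 = s => {
  const bytes = [];
  for (const ch of s) {
    const cp = ch.codePointAt(0);
    if (cp < 0x10000) {
      bytes.push(...encodeUpTo3Bytes(cp));
    } else {
      bytes.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f),
      );
    }
  }
  return bytes;
};

// Format bytes as space-separated lowercase hex for comparison.
const hex = bytes => bytes.map(b => b.toString(16).padStart(2, '0')).join(' ');
```

For U+1F600, `hex(toUTF8('\u{1F600}'))` gives `f0 9f 98 80` and `hex(toCESU8('\u{1F600}'))` gives `ed a0 bd ed b8 80`; for strings with no supplementary characters the two encodings produce identical bytes.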
Moddable has posted an xsnap release that integrates CESU-8. We either need a corresponding Breakdown:
This initial work would also unblock cross-testing other SES features like Compartment dynamic import.
Dan’s got a draft up for progress: |
You may know all this already, but just in case:
@kriskowal This does not have an area label that is covered by our weekly tech / planning meetings. Can you assign the proper label? We cover: agd, agoric-cosmos, amm, core economy, cosmic-swingset, endo, getrun, governance, installation-bundling, metering, run-protocol, staking, swingset, swingset-runner, token economy, wallet, zoe contract, |
@Tartuffo xsnap seems like the appropriate label and we do have weekly endo / SES meetings that cover it. @kriskowal and I touched base about this yesterday, in particular.
@dckc Agreed, I just added it to the ZH filter for the Zoe/ERTP meeting.
The |
I think this is addressed by #4832. I suppose it's cost-effective to add a test or two to confirm.
What is the Problem Being Solved?
Update: @erights noted that our kernel runs on V8, which uses UTF-16 strings, while xsnap uses UTF-8, and wondered about the risk of one side confusing the other, especially if the xsnap child can confuse the kernel.
Description of the Design
Evaluate confusion risk as strings etc. go between the parent node process and the child xsnap process.
Document it as a known difference between our platform and the JS spec.
Security Considerations
not clear; we're evaluating a confusion risk
Test Plan
Unit tests to check border crossings.
cc @kriskowal @warner @erights