Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

string_bytes: fix unaligned write in UCS2 #2480

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 66 additions & 12 deletions src/string_bytes.cc
Original file line number Diff line number Diff line change
Expand Up @@ -293,6 +293,46 @@ bool StringBytes::GetExternalParts(Isolate* isolate,
}


size_t StringBytes::WriteUCS2(char* buf,
size_t buflen,
size_t nbytes,
const char* data,
Local<String> str,
int flags,
size_t* chars_written) {
uint16_t* const dst = reinterpret_cast<uint16_t*>(buf);

size_t max_chars = (buflen / sizeof(*dst));
size_t nchars;
size_t alignment = reinterpret_cast<uintptr_t>(dst) % sizeof(*dst);
if (alignment == 0) {
nchars = str->Write(dst, 0, max_chars, flags);
*chars_written = nchars;
return nchars * sizeof(*dst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we figured out what circumstances are necessary to cause an unaligned pointer to be returned?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trevnorris the truth is harsh: https://github.com/nodejs/node/blob/master/lib/buffer.js#L67 . We do nothing to align them at the moment.

}

uint16_t* aligned_dst =
reinterpret_cast<uint16_t*>(buf + sizeof(*dst) - alignment);
ASSERT_EQ(reinterpret_cast<uintptr_t>(aligned_dst) % sizeof(*dst), 0);

// Write all but the last char
nchars = str->Write(aligned_dst, 0, max_chars - 1, flags);

// Shift everything to unaligned-left
memmove(dst, aligned_dst, nchars * sizeof(*dst));

// One more char to be written
uint16_t last;
if (nchars == max_chars - 1 && str->Write(&last, nchars, 1, flags) != 0) {
memcpy(buf + nchars * sizeof(*dst), &last, sizeof(last));
nchars++;
}

*chars_written = nchars;
return nchars * sizeof(*dst);
}


size_t StringBytes::Write(Isolate* isolate,
char* buf,
size_t buflen,
Expand Down Expand Up @@ -334,26 +374,40 @@ size_t StringBytes::Write(Isolate* isolate,
break;

case UCS2: {
uint16_t* const dst = reinterpret_cast<uint16_t*>(buf);
size_t nchars;

if (is_extern && !str->IsOneByte()) {
memcpy(buf, data, nbytes);
nchars = nbytes / sizeof(*dst);
nchars = nbytes / sizeof(uint16_t);
} else {
nchars = buflen / sizeof(*dst);
nchars = str->Write(dst, 0, nchars, flags);
nbytes = nchars * sizeof(*dst);
nbytes = WriteUCS2(buf, buflen, nbytes, data, str, flags, &nchars);
}
if (IsBigEndian()) {
// Node's "ucs2" encoding wants LE character data stored in
// the Buffer, so we need to reorder on BE platforms. See
// http://nodejs.org/api/buffer.html regarding Node's "ucs2"
// encoding specification
if (chars_written != nullptr)
*chars_written = nchars;

if (!IsBigEndian())
break;

// Node's "ucs2" encoding wants LE character data stored in
// the Buffer, so we need to reorder on BE platforms. See
// http://nodejs.org/api/buffer.html regarding Node's "ucs2"
// encoding specification

const bool is_aligned =
reinterpret_cast<uintptr_t>(buf) % sizeof(uint16_t);
if (is_aligned) {
uint16_t* const dst = reinterpret_cast<uint16_t*>(buf);
for (size_t i = 0; i < nchars; i++)
dst[i] = dst[i] << 8 | dst[i] >> 8;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is dst unaligned here as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... this is true. I'll fix it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kzc fixed, please take a look.

break;
}

ASSERT_EQ(sizeof(uint16_t), 2);
for (size_t i = 0; i < nchars; i++) {
char tmp = buf[i * 2];
buf[i * 2] = buf[i * 2 + 1];
buf[i * 2 + 1] = tmp;
}
if (chars_written != nullptr)
*chars_written = nchars;
break;
}

Expand Down
9 changes: 9 additions & 0 deletions src/string_bytes.h
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,15 @@ class StringBytes {
enum encoding encoding) {
return Encode(v8::Isolate::GetCurrent(), buf, buflen, encoding);
})

private:
static size_t WriteUCS2(char* buf,
size_t buflen,
size_t nbytes,
const char* data,
v8::Local<v8::String> str,
int flags,
size_t* chars_written);
};

} // namespace node
Expand Down