Skip to content

Commit

Permalink
Surface websocket protocol errors to user applications
Browse files Browse the repository at this point in the history
Currently, if a websocket client sends bad data[1] to an application, the
application just receives a generic error message with no indication
of what went wrong. Also, the client just gets an aborted[2] websocket
connection without any clue as to what they did wrong.

This commit changes two things: first, the application now gets an
error event describing what was wrong with the received data. This
gives the application's owner some clue what's going wrong.

Second, we now send a Close frame to the client with an appropriate
error code, as we SHOULD do[3]. In an ideal world, this will let the
client's owner figure out what they're doing wrong and fix it.

[1] Invalid according to RFC 6455, for example sending a continuation
frame without a preceding start frame or sending a frame with reserved
bits (RSV1, RSV2, and RSV3; see
https://datatracker.ietf.org/doc/html/rfc6455#section-5.2) set.

[2] The underlying TCP connection is closed without first sending a
websocket "Close" frame.

[3] https://datatracker.ietf.org/doc/html/rfc6455#section-7.1.7
  • Loading branch information
smerritt committed Jul 26, 2023
1 parent 32979be commit 155b33b
Show file tree
Hide file tree
Showing 6 changed files with 285 additions and 13 deletions.
10 changes: 5 additions & 5 deletions WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,10 @@ bazel_skylib_workspace()

http_archive(
name = "capnp-cpp",
sha256 = "133d70c0f7482eb36cb6a4c662445bccb6219677aa61e15d22b9ace67bc36aa3",
strip_prefix = "capnproto-capnproto-b38fc11/c++",
sha256 = "b3d756251af1861681c29aa81ee074dac09122a4429f87950ef9235b17122389",
strip_prefix = "capnproto-capnproto-495cebe/c++",
type = "tgz",
urls = ["https://github.com/capnproto/capnproto/tarball/b38fc110aed669be98195c957acf0d35fccb8252"],
urls = ["https://github.com/capnproto/capnproto/tarball/495cebe8673a0820e3202de66970059be70d4928"],
)

http_archive(
Expand Down Expand Up @@ -108,8 +108,8 @@ git_repository(

# tcmalloc requires Abseil.
#
# WARNING: This MUST appear before rules_fuzzing_depnedencies(), below. Otherwise,
# rules_fuzzing_depnedencies() will choose to pull in a different version of Abseil that is too
# WARNING: This MUST appear before rules_fuzzing_dependencies(), below. Otherwise,
# rules_fuzzing_dependencies() will choose to pull in a different version of Abseil that is too
# old for tcmalloc. Absurdly, Bazel simply ignores later attempts to define the same repo name,
# rather than erroring out. Thus this leads to confusing compiler errors in tcmalloc complaining
# that ABSL_ATTRIBUTE_PURE_FUNCTION is not defined.
Expand Down
63 changes: 57 additions & 6 deletions src/workerd/api/web-socket.c++
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,22 @@

namespace workerd::api {

kj::Maybe<WebSocketProtocolError> WebSocketProtocolError::fromException(const kj::Exception& ex) {
auto maybeContext = ex.getContext();
while (maybeContext != nullptr) {
auto& context = KJ_ASSERT_NONNULL(maybeContext);
if (magicFileValue == context.file) {
return WebSocketProtocolError(context.line, kj::str(context.description));
}
maybeContext = context.next;
}
return nullptr;
}

void WebSocketProtocolError::encodeToException(kj::Exception &ex) && {
ex.wrapContext(magicFileValue.cStr(), code, kj::mv(description));
}

kj::StringPtr KJ_STRINGIFY(const WebSocket::NativeState& state) {
// TODO(someday) We might care more about this `OneOf` than its which, that probably means
// returning a kj::String instead.
Expand All @@ -25,6 +41,20 @@ kj::StringPtr KJ_STRINGIFY(const WebSocket::NativeState& state) {
KJ_UNREACHABLE;
}

kj::Exception WebSocketErrorHandler::handleWebSocketProtocolError(kj::WebSocket::ProtocolError protocolError) {
KJ_REQUIRE(protocolError.description.size() <= 122);
// We're going to put this as the reason in a WebSocket Close frame, so it has to fit.
// RFC6455 section 5.5 puts a maximum payload length of 125 on control frames, of which Close
// is one type. The payload also needs room for a status code (2 bytes) and maybe a null
// terminator (1 byte), leaving 122 bytes for the description.
kj::Exception ex = KJ_EXCEPTION(FAILED, "worker_do_not_log: WebSocket protocol error");
api::WebSocketProtocolError wspe{static_cast<int>(protocolError.statusCode),
kj::heapString(protocolError.description)};
// The status codes are all 4-digit decimal numbers, so they'll easily fit in an int.
kj::mv(wspe).encodeToException(ex);
return ex;
}

IoOwn<WebSocket::Native> WebSocket::initNative(IoContext& ioContext, kj::WebSocket& ws) {
auto nativeObj = kj::heap<Native>();
nativeObj->state.init<Accepted>(Accepted::Hibernatable{.ws = ws}, *nativeObj, ioContext);
Expand Down Expand Up @@ -309,8 +339,16 @@ kj::Promise<DeferredProxy<void>> WebSocket::couple(kj::Own<kj::WebSocket> other)

auto& context = IoContext::current();

auto upstream = other->pumpTo(*self);
auto downstream = self->pumpTo(*other);
auto upstream = other->pumpTo(*self).catch_([](kj::Exception&& e) {
if (WebSocketProtocolError::fromException(e) == nullptr) {
kj::throwFatalException(kj::mv(e));
};
});
auto downstream = self->pumpTo(*other).catch_([](kj::Exception&& e) {
if (WebSocketProtocolError::fromException(e) == nullptr) {
kj::throwFatalException(kj::mv(e));
};
});

if (locality == LOCAL) {
// We're terminating the WebSocket in this worker, so the upstream promise (which pumps
Expand Down Expand Up @@ -349,7 +387,7 @@ void WebSocket::accept(jsg::Lock& js) {
"Can't accept() WebSocket after enabling hibernation.");
// Technically, this means it's okay to invoke `accept()` once a `new WebSocket()` resolves to
// an established connection. This is probably okay? It might spare the worker devs a class of
// errors they do not care care about.
// errors they do not care about.
return;
}

Expand Down Expand Up @@ -429,7 +467,7 @@ WebSocket::Accepted::~Accepted() noexcept(false) {
void WebSocket::startReadLoop(jsg::Lock& js) {
// If the kj::WebSocket happens to be an AbortableWebSocket (see util/abortable.h), then
// calling readLoop here could throw synchronously if the canceler has already been tripped.
// Using kj::evalNow() here let's us capture that and handle correctly.
// Using kj::evalNow() here lets us capture that and handle correctly.
//
// We catch exceptions and return Maybe<Exception> instead since we want to handle the exceptions
// in awaitIo() below, but we don't want the KJ exception converted to JavaScript before we can
Expand Down Expand Up @@ -464,12 +502,25 @@ void WebSocket::startReadLoop(jsg::Lock& js) {
(jsg::Lock& js, kj::Maybe<kj::Exception>&& maybeError) mutable {
auto& native = *farNative;
KJ_IF_MAYBE(e, maybeError) {
if (!native.closedIncoming && e->getType() == kj::Exception::Type::DISCONNECTED) {
KJ_IF_MAYBE(wspe, WebSocketProtocolError::fromException(*e)) {
// The client sent us an invalid websocket message.
if (!native.closedOutgoing) {
// Send a close message to the client if we can.
native.closedIncoming = true;
close(js, wspe->getCode(), kj::str(wspe->getDescription()));
}
// Report to the application as an error event.
jsg::Value errorDescription{js.v8Isolate, js.wrapString(wspe->getDescription())};
error = errorDescription.addRef(js);
dispatchEventImpl(js, jsg::alloc<ErrorEvent>(
kj::str("WebSocket code ", wspe->getCode()),
kj::mv(errorDescription), js.v8Isolate));
} else if (!native.closedIncoming && e->getType() == kj::Exception::Type::DISCONNECTED) {
// Report premature disconnect or cancel as a close event.
dispatchEventImpl(js, jsg::alloc<CloseEvent>(
1006, kj::str("WebSocket disconnected without sending Close frame."), false));
native.closedIncoming = true;
// If there are no further messages to send, so we can discard the underlying connection.
// If there are no further messages to send, we can discard the underlying connection.
tryReleaseNative(js);
} else {
native.closedIncoming = true;
Expand Down
33 changes: 33 additions & 0 deletions src/workerd/api/web-socket.h
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,39 @@ class ErrorEvent: public Event {
void visitForGc(jsg::GcVisitor& visitor);
};

class WebSocketProtocolError {
public:
// Exception-like thing for WebSocket protocol errors. Since kj::WebSocket indicates protocol
// errors by throwing an exception and since exceptions are caught by kj::Promise, we can't just
// throw WebSocketProtocolError. Instead, we smuggle the extra in the exception's context.
WebSocketProtocolError(int code, kj::String description)
: code(code), description(kj::mv(description)) {}

static kj::Maybe<WebSocketProtocolError> fromException(const kj::Exception& ex);
// Generates a WebSocketProtocolError from the exception's context, but only if the context
// actually holds appropriate data, i.e. someone previously used encodeToException on it.

int getCode() const { return code; }
kj::StringPtr getDescription() const { return description; }

void encodeToException(kj::Exception& ex) &&;
// Adds a context entry to the exception containing this object's data. This is only useful if
// you're going to retrieve it with fromException later.

private:
int code;
kj::String description;
static inline constexpr kj::StringPtr magicFileValue = "__WebSocketProtocolError_magicFileValue"_kj;
// Used as a sentinel in exception context frames; if frame.file == magicFileValue, then that
// frame contains data about a websocket protocol error. The exact value is unimportant; it just
// has to not look like a real file path.
};

class WebSocketErrorHandler : public kj::WebSocketErrorHandler {
kj::Exception handleWebSocketProtocolError(kj::WebSocket::ProtocolError protocolError) override;
};
// Handler for WebSocket protocol errors.

// The forward declaration is necessary so we can make some
// WebSocket methods accessible to WebSocketPair via friend declaration.
class WebSocket;
Expand Down
2 changes: 1 addition & 1 deletion src/workerd/io/io-context.c++
Original file line number Diff line number Diff line change
Expand Up @@ -372,7 +372,7 @@ void IoContext::logUncaughtExceptionAsync(UncaughtExceptionSource source,
// do still want to syslog if relevant, but we can do that without a lock.
if (!jsg::isTunneledException(exception.getDescription()) &&
!jsg::isDoNotLogException(exception.getDescription()) &&
// TODO(soon): Figure out why client disconncects are getting logged here if we don't
// TODO(soon): Figure out why client disconnects are getting logged here if we don't
// ignore DISCONNECTED. If we fix that, do we still want to filter these?
exception.getType() != kj::Exception::Type::DISCONNECTED) {
LOG_EXCEPTION("jsgInternalError", exception);
Expand Down
Loading

0 comments on commit 155b33b

Please sign in to comment.