The attach API implementation does not work as expected #1551

@thomas-fossati

Description

The Attach API implementation does not work as expected.

In particular, the status report sent by the Lambda

{"statusCode":...,"headers":{...},"multiValueHeaders":...,"body":...}

might end up being ignored, resulting in a 502 status response with the following message:

Function returned an invalid response (must include one of: body, headers, multiValueHeaders or statusCode in the response object)

Steps to reproduce

This may be tricky to reproduce. It never showed up for us on bare metal; we started seeing it consistently when running SAM inside a VM.

What we can do at this stage is give you the bill of materials (BoM) of our failing VM; see the "Additional environment details" section below.

Observed result

The Attach API implementation is split into two logically separate bits: first the client does a POST /containers/.../attach?... using the requests library machinery, asking to be attached to the running container. Then, after receiving the 101 "switching protocols" response from the Docker engine, the client pilfers the underlying socket descriptor from the requests.Response instance and consumes the binary protocol which muxes stdout and stderr coming from the container.

The problem is that at the time the socket descriptor is stolen, the requests library might have already read, either partially or entirely, the binary stream following the 101 from Docker. Therefore, when the actual raw read happens, there might be no data left to consume from the socket -- the bytes are securely stashed inside the requests.Response buffer (and can be looked at using peek()) but are ignored by the binary protocol processor.
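The effect can be reproduced in isolation. The following is a minimal sketch, not SAM's code: it assumes a local `socket.socketpair()` standing in for the Docker connection, and uses `socket.makefile("rb")` as the buffered reader (to my knowledge the same kind of reader that `http.client`, and therefore requests, puts in front of the socket). The function name and the payload are made up for illustration.

```python
import socket


def demo_swallowed_bytes():
    """Show a buffered reader draining the socket past the HTTP headers."""
    docker_side, client_side = socket.socketpair()

    # One Docker-style frame: 8-byte header (stream 1, length 5) + payload.
    frame = b"\x01\x00\x00\x00\x00\x00\x00\x05hello"
    docker_side.sendall(b"HTTP/1.1 101 UPGRADED\r\n\r\n" + frame)
    docker_side.close()  # nothing more will arrive

    # Buffered reader in front of the socket (assumption: this models
    # what the HTTP client layer does when parsing the 101 response).
    reader = client_side.makefile("rb")
    reader.readline()  # status line
    reader.readline()  # blank line that ends the headers

    buffered = reader.peek()   # bytes already pulled off the socket
    raw = client_side.recv(8)  # what a raw read on the descriptor sees

    reader.close()
    client_side.close()
    return buffered, raw


buffered, raw = demo_swallowed_bytes()
print("held by the buffered reader:", buffered)
print("seen by the raw socket read:", raw)
```

When the whole response is already sitting in the socket buffer, the frame ends up in `buffered` and the raw read returns an empty byte string, i.e. EOF, which is the situation in the unsuccessful trace further down.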

(Note that this behaviour has never been observed on bare metal. It started showing up in a VM where the socket read buffer is apparently coalesced more aggressively.)

The following extract is from a strace(1) of a successful invocation. The sendto is the client doing the POST to the attach API endpoint. The first of the subsequent recvfrom calls is done by the requests library. Note that ~580 bytes of "good" output get swallowed and won't be available to the attached client. It doesn't really matter here, though, because the next recvfrom (after sd=13 has been pilfered from the requests.Response) has data to consume at a sane boundary, and the only thing SAM cares about for declaring victory is the {"statusCode":...,"headers":{...},"multiValueHeaders":...,"body":...}, which is due to arrive later. So this invocation succeeds, at least partially.

[pid 15304] sendto(13, "POST /containers/027cc9a5f6d03c6"..., 302, 0, NULL, 0) = 302
[pid 15304] recvfrom(13, "HTTP/1.1 101 UPGRADED\r\nContent-T"..., 8192, 0, NULL, NULL) = 692
[pid 15304] recvfrom(13, "\2\0\0\0\0\0\0\214", 8, 0, NULL, NULL) = 8
[pid 15304] recvfrom(13, "[GIN-debug] GET    /provisioning"..., 140, 0, NULL, NULL) = 140
[pid 15304] recvfrom(13, "\2\0\0\0\0\0\0\214", 8, 0, NULL, NULL) = 8
[pid 15304] recvfrom(13, "[GIN-debug] GET    /provisioning"..., 140, 0, NULL, NULL) = 140
[pid 15304] recvfrom(13, "\2\0\0\0\0\0\0\216", 8, 0, NULL, NULL) = 8
[...]
[pid 15304] recvfrom(13, "{\"statusCode\":404,\"headers\":{\"Co"..., 271, 0, NULL, NULL) = 271
[pid 15304] recvfrom(13, "", 8, 0, NULL, NULL) = 0
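To read the trace: each 8-byte recvfrom is a frame header of Docker's multiplexed attach stream, made of one byte of stream type (1 = stdout, 2 = stderr), three zero bytes, and a 4-byte big-endian payload length. A small sketch of decoding it follows; the function name is hypothetical and only there for illustration.

```python
import struct


def parse_frame_header(header):
    """Decode an 8-byte attach frame header into (stream_type, payload_length)."""
    # B = stream type, xxx = three padding bytes, I = big-endian uint32 length
    stream_type, length = struct.unpack(">BxxxI", header)
    return stream_type, length


# Header bytes copied from the trace, using the same octal escapes strace prints.
stream_type, length = parse_frame_header(b"\2\0\0\0\0\0\0\214")
print(stream_type, length)  # 2 140
```

That is, the header announces 140 bytes on stderr, which matches the 140-byte recvfrom that follows it in the trace.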

Compare with a strace(1) of an unsuccessful request. Note that the first recvfrom consumes all the data (including the final JSON status report). The second recvfrom (triggered by _read_socket in attach_api.py) finds EOF, which cascades to a 502 for the Lambda client:

[pid 16048] sendto(19, "POST /containers/027cc9a5f6d03c6"..., 302, 0, NULL, 0) = 302
[pid 16048] recvfrom(19, "HTTP/1.1 101 UPGRADED\r\nContent-T"..., 8192, 0, NULL, NULL) = 5329
[pid 16048] recvfrom(19, "", 8, 0, NULL, NULL) = 0

Expected result

All the std{out,err} output from the container should be received by the attached client, so that the status code reported by the Lambda is forwarded instead of the 502.
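One possible direction, sketched under assumptions rather than taken from the SAM code base: demultiplex from a buffered file-like object (for instance the reader that already holds the bytes following the 101) instead of from the raw descriptor, so that anything already buffered is consumed first. `read_exactly`, `iter_frames` and `make_frame` are hypothetical names, `io.BytesIO` stands in for the real reader, and the stream types and payloads are illustrative.

```python
import io
import struct


def read_exactly(reader, n):
    """Read exactly n bytes from reader; return fewer only at EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = reader.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def iter_frames(reader):
    """Yield (stream_type, payload) for every complete frame in reader."""
    while True:
        header = read_exactly(reader, 8)
        if len(header) < 8:
            return  # EOF or truncated header
        stream_type, length = struct.unpack(">BxxxI", header)
        payload = read_exactly(reader, length)
        if len(payload) < length:
            return  # truncated payload
        yield stream_type, payload


def make_frame(stream_type, payload):
    """Build one frame for the example below (test helper, not part of a fix)."""
    return struct.pack(">BxxxI", stream_type, len(payload)) + payload


# Stand-in for a reader that has already buffered everything after the 101.
stream = io.BytesIO(
    make_frame(2, b"[GIN-debug] log line\n") + make_frame(1, b'{"statusCode":404}')
)
frames = list(iter_frames(stream))
for stream_type, payload in frames:
    print(stream_type, payload)
```

Because the reads go through the same buffered object that parsed the HTTP response, it no longer matters how many bytes the first recvfrom happened to pull in.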

Additional environment details (Ex: Windows, Mac, Amazon Linux etc)

A VM with the following software installed:

  • Ubuntu 18.04
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.3 LTS
Release:        18.04
Codename:       bionic
  • Docker 18.09
$ docker --version
Docker version 18.09.7, build 2d0083d
  • SAM 0.31.0
$ sam --version
SAM CLI, version 0.31.0
  • Python 3.7.3
$ python3 --version
Python 3.7.3
