http: Revert the default timeout to 60 seconds
In commit d5e9c75 (http: Configurable inactivity timeout) we
changed the default timeout from 60 seconds to 15. The reasoning was
that clients have no reason to connect and keep the connection idle for
a long time. Once a client connects, it is expected to start sending
requests. On the first request, the socket timeout is replaced by the
ticket inactivity timeout, set by the user creating the transfer.
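
The timeout replacement described above can be illustrated with a minimal sketch. This is not the ovirt-imageio code: `IDLE_TIMEOUT`, `INACTIVITY_TIMEOUT`, `Handler` and `second_request_status` are hypothetical names, and the values are shrunk so the demo runs quickly. It only assumes that the handler can call `settimeout()` on its connection socket once the first request arrives.

```python
import http.client
import http.server
import threading
import time

IDLE_TIMEOUT = 0.2        # hypothetical: short timeout for new connections
INACTIVITY_TIMEOUT = 5.0  # hypothetical: stands in for the ticket timeout


class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open between requests
    timeout = IDLE_TIMEOUT         # applied to the socket when it is accepted

    def do_GET(self):
        # The client sent a request; switch the socket to the larger timeout.
        self.connection.settimeout(INACTIVITY_TIMEOUT)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def second_request_status(pause):
    """Send a request, stay idle for `pause` seconds, then send another."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    con = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        con.request("GET", "/")
        con.getresponse().read()
        time.sleep(pause)
        con.request("GET", "/")
        res = con.getresponse()
        res.read()
        return res.status
    finally:
        con.close()
        server.shutdown()
        server.server_close()
```

Calling `second_request_status(0.6)` pauses for longer than `IDLE_TIMEOUT`, yet the second request should still succeed, because the first request already switched the socket to the larger timeout.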

It turns out that there is a valid use case for idle clients, and the
shorter timeout breaks downloads of big images (reproduced with an 8 TiB
image). The failure flow is:

1. The client connects and sends an EXTENTS request.
2. While the EXTENTS request is collecting data, the client opens
   multiple download connections.
3. The connected download threads wait on a queue for work, but since
   the EXTENTS request did not finish, the connections are idle.
4. After 15 seconds the server closes the idle connections.
5. When the EXTENTS request finishes, the client fails to send a request
   to the server.
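
The failure flow above can be reproduced in miniature with the standard library alone. The sketch below is an illustration, not the ovirt-imageio server: `Handler` and `idle_connection_survives` are hypothetical names, and the idle timeout is reduced to 0.2 seconds. The client connects, stays idle (as the download threads do while waiting for extents), and only then sends its first request.

```python
import http.client
import http.server
import threading
import time


class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"
    timeout = 0.2  # stands in for the 15 second default; short for the demo

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the "Request timed out" log line


def idle_connection_survives(idle_seconds):
    """Connect, stay idle, then report whether the first request works."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    con = http.client.HTTPConnection("127.0.0.1", server.server_port)
    try:
        con.connect()
        time.sleep(idle_seconds)  # e.g. waiting for EXTENTS to finish
        con.request("GET", "/")
        con.getresponse().read()
        return True
    except ConnectionError:
        # BrokenPipeError, ConnectionResetError and
        # http.client.RemoteDisconnected are all ConnectionError subclasses.
        return False
    finally:
        con.close()
        server.shutdown()
        server.server_close()
```

With these values, `idle_connection_survives(0.0)` should succeed, while `idle_connection_survives(1.0)` should fail because the server has already closed the idle socket. Which exception the client sees depends on timing and request size, so the sketch catches the common base class instead of `BrokenPipeError` alone.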

Here is an example failure on the client side from an ovirt-stress backup run:

      ...
      File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/io.py", line 288, in copy
        self._src.write_to(self._dst, req.length, self._buf)
      File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 207, in write_to
        res = self._get(length)
      File "/usr/lib64/python3.6/site-packages/ovirt_imageio/_internal/backends/http.py", line 432, in _get
        self._con.request("GET", self.url.path, headers=headers)
      File "/usr/lib64/python3.6/http/client.py", line 1273, in request
        self._send_request(method, url, body, headers, encode_chunked)
      File "/usr/lib64/python3.6/http/client.py", line 1319, in _send_request
        self.endheaders(body, encode_chunked=encode_chunked)
      File "/usr/lib64/python3.6/http/client.py", line 1268, in endheaders
        self._send_output(message_body, encode_chunked=encode_chunked)
      File "/usr/lib64/python3.6/http/client.py", line 1044, in _send_output
        self.send(msg)
      File "/usr/lib64/python3.6/http/client.py", line 1004, in send
        self.sock.sendall(data)
    BrokenPipeError: [Errno 32] Broken pipe

Looking at the server logs, we can see that an EXTENTS request took
about 24 seconds:

    2022-05-24 16:22:57,548 INFO    (Thread-75) [extents] [local]
    EXTENTS transfer=39f14719-2533-45d5-8315-9d0b577d5732 context=zero
    ...
    2022-05-24 16:23:44,907 INFO    (Thread-75) [http] CLOSE
    connection=75 client=local [connection 1 ops, 47.359750 s] [dispatch
    2 ops, 47.297816 s] [extents 2 ops, 47.296568 s]
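
One way to read the 24 second figure from the CLOSE line is to divide the time of each stats group by its op count. The sketch below assumes the `[name N ops, T s]` layout shown in the log and that both EXTENTS ops took a similar time; `average_seconds` is a hypothetical helper, not part of ovirt-imageio.

```python
import re

# Stats groups copied from the CLOSE log line above.
close_line = (
    "[connection 1 ops, 47.359750 s] [dispatch 2 ops, 47.297816 s] "
    "[extents 2 ops, 47.296568 s]"
)


def average_seconds(line):
    """Return {name: average seconds per op} for each stats group."""
    stats = {}
    pattern = r"\[(\w+) (\d+) ops, ([\d.]+) s\]"
    for name, ops, seconds in re.findall(pattern, line):
        stats[name] = float(seconds) / int(ops)
    return stats


averages = average_seconds(close_line)
print(f"{averages['extents']:.1f}")  # prints 23.6
```

Under that assumption each EXTENTS op took roughly 23.6 seconds, which is well above the 15 second idle timeout.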

Downloading an 8 TiB disk is an edge case, but this can happen with
smaller images on a very fragmented file system, or if there is another
reason that causes the EXTENTS request to be slow.

Revert the timeout back to the previous value used in ovirt 4.4.

We may shorten the timeout once we support partial extents:
https://bugzilla.redhat.com/1924940

Fixes oVirt#71

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
nirs committed May 24, 2022
1 parent 300480e commit f86933b
Showing 1 changed file with 3 additions and 1 deletion.
--- a/ovirt_imageio/_internal/http.py
+++ b/ovirt_imageio/_internal/http.py
@@ -179,7 +179,9 @@ class Connection(http.server.BaseHTTPRequestHandler):
     # connections. When the timeout expires we close the connection.
     # Authorized connections get a larger timeout using the ticket
     # inactivity timeout.
-    timeout = 15
+    # Note: Must not be less than the time to get image extents
+    # https://github.com/oVirt/ovirt-imageio/issues/71
+    timeout = 60
 
     # For generating connection ids. Start from 1 to match the connection
     # thread name.