Intermittent timeout with image previews #3019

Open
natrad100 opened this issue Mar 25, 2021 · 6 comments
Labels: bug (Something isn't working)

Comments

@natrad100

natrad100 commented Mar 25, 2021

My actions before raising this issue

Expected Behaviour

We have a docker-compose.override that mounts cvat_data on an NFS mount point. We expect that when we load any page that has a preview, the request to /api/v1/tasks/10/data?type=preview succeeds.

Current Behaviour

When we load a page such as Tasks or Projects, only about 10% of image preview fetches succeed. A fetch can take up to 60,000 ms before it times out, after which the page finally loads without any images and displays 'preview' in their place.

This data is reachable and is fetched from this mounted point on a NAS.

Note that it will work perfectly for a while, then it suddenly stops working and becomes unusable.

Possible Solution

We've changed the way the drives are mounted multiple times (mounted on the Linux host and accessed by directory, versus mounted within the docker-compose file), to no avail. The cause may be the network link between the NAS and the Docker container, but it could also be some other internal bug.
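One way to separate the NAS link from CVAT itself is to time raw file reads on the mounted path from inside the container. The following is only a rough sketch: `timeRead` is a hypothetical helper, and the path to probe (for example a file under /home/django/share, the mount target in the override file) is an assumption you would adjust to your setup.

```javascript
const fs = require("fs/promises");

// Read one file and report how long the read took, in milliseconds.
// A hung or very slow read here would point at the NFS mount rather than CVAT.
async function timeRead(path) {
    const started = process.hrtime.bigint();
    const data = await fs.readFile(path);
    const ms = Number(process.hrtime.bigint() - started) / 1e6;
    return { path, bytes: data.length, ms };
}

// Example (the file name is a placeholder):
// timeRead("/home/django/share/some-image.jpg").then(console.log);
```

If reads on the mount are consistently fast while the preview request still stalls, the network link is a less likely culprit.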

Steps to Reproduce (for bugs)

  1. Run the docker compose with the following override setup:
version: "3.3"

services:
  cvat_proxy:
    environment:
      CVAT_HOST: localhost
    ports:
      - "80:80"
  cvat:
    environment:
      CVAT_SHARE_URL: "Mounted from /home/gr directory"
    volumes:
      - cvat_share:/home/django/share:ro
volumes:
  cvat_share:
    driver_opts:
      type: "nfs"
      device: nas.local:/volume3/shared
      o: "addr=nas.local,nolock,soft,rw"

  cvat_data:
    driver_opts:
      type: "nfs"
      device:  ":/volume1/data"
      o: "addr=nas.local,nolock,soft,rw"
  2. Create a new project
  3. Create a new task using files from external cvat_share network
  4. Open task.
  5. Log out.
  6. Close window and open new window
  7. Login again
  8. Keep loading various preview pages whilst monitoring network requests
  9. Suddenly the page will take a long time to load, with the get request for the image previews for the project/task not responding.
    Note: sometimes it will use a cached response to display the preview images.
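The manual monitoring in the steps above could also be scripted. This is a minimal sketch, not part of CVAT: `timeRequest` and `probe` are hypothetical helpers, the URL is the preview endpoint quoted earlier in this issue, and any authentication header you pass is an assumption about your deployment. It needs a runtime with a global `fetch` (or you can pass your own via `fetchImpl`).

```javascript
// Send one request and record its status and duration.
// The request is aborted after `timeoutMs` so a stalled fetch is counted as a failure.
async function timeRequest(url, options = {}) {
    const { timeoutMs = 60000, fetchImpl = globalThis.fetch, headers = {} } = options;
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    const started = Date.now();
    try {
        const res = await fetchImpl(url, { headers, signal: controller.signal });
        return { ok: res.ok, status: res.status, ms: Date.now() - started };
    } catch (err) {
        return { ok: false, status: null, ms: Date.now() - started, error: String(err) };
    } finally {
        clearTimeout(timer);
    }
}

// Repeat the request sequentially and count how many attempts succeeded.
async function probe(url, attempts, options) {
    const results = [];
    for (let i = 0; i < attempts; i += 1) {
        results.push(await timeRequest(url, options));
    }
    const succeeded = results.filter((r) => r.ok).length;
    return { attempts, succeeded, results };
}

// Example (host and task id taken from this issue; add auth headers as needed):
// probe("http://localhost/api/v1/tasks/10/data?type=preview", 20).then(console.log);
```

Logging the per-request `ms` values over time would show whether the failures start abruptly, as described, or degrade gradually.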

Context

Just trying to get it working consistently

Your Environment

  • Docker version docker version (e.g. Docker 17.0.05): 20.10.5
  • Operating System and version (e.g. Linux, Windows, MacOS): Linux Ubuntu 20
  • Other diagnostic information / logs:
    docker logs cvat
2021-03-25 23:26:15,023 DEBG 'ssh-agent' stderr output:
debug2: fd 4 setting O_NONBLOCK
debug1: process_message: socket 1 (fd=4) type 11

2021-03-25 23:28:52,336 DEBG 'runserver' stderr output:
[Thu Mar 25 23:28:52.336043 2021] [wsgi:error] [pid 365:tid 139864464013056] [remote 172.28.0.9:43938] WARNING - 2021-03-25 23:28:52,335 - environment - Failed to import module 'tf_detection_api_format.converter.py': Can't import tensorflow. Test process exit code: -4. This is likely because your CPU does not support AVX instructions, which are required for tensorflow.

2021-03-25 23:29:46,357 DEBG 'rqworker_low' stderr output:
DEBUG - 2021-03-25 23:29:46,357 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:29:46,454 DEBG 'rqworker_default_1' stderr output:
DEBUG - 2021-03-25 23:29:46,453 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:29:49,665 DEBG 'rqworker_default_0' stderr output:
DEBUG - 2021-03-25 23:29:49,664 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:31,392 DEBG 'rqworker_low' stderr output:
DEBUG - 2021-03-25 23:36:31,389 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:31,489 DEBG 'rqworker_default_1' stderr output:
DEBUG - 2021-03-25 23:36:31,488 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

2021-03-25 23:36:34,699 DEBG 'rqworker_default_0' stderr output:
DEBUG - 2021-03-25 23:36:34,698 - worker - Sent heartbeat to prevent worker timeout. Next one should arrive within 480 seconds.

Thanks for the help in advance.

@natrad100
Author

We have patched this by sending a dummy image through the API. Obviously this is not the ideal solution, as no previews are shown, but CVAT works well with that change.

@azhavoro
Contributor

Do you have any issues with chunk loading?

@natrad100
Author

We don't seem to. However, we do use a chunk size of 1 because of the resolution of the images we are showing.

@nmanovic nmanovic added the bug Something isn't working label Apr 15, 2021
@nmanovic nmanovic added this to the Backlog milestone Apr 15, 2021
@kosehy

kosehy commented Aug 2, 2021

Hi @natrad100,
could you share how you send a dummy image through the API?
I have the same issue.

504 error with preview image


@natrad100
Author

natrad100 commented Aug 2, 2021

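Find getPreview() in the server proxy (cvat-core/src/server-proxy.js):

https://github.com/openvinotoolkit/cvat/blob/380f4d81612f45d071f9a4f70d407a5dd824d929/cvat-core/src/server-proxy.js#L782-L797

and make it return an empty string before it does anything else.

            async function getPreview(tid) {
                // Workaround: return early so the preview request is never sent.
                return "";
                // (the rest of the original function body stays in place, now unreachable)
            }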
That's it!
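A gentler variant of the same idea, shown here only as a sketch, is to keep requesting the preview but give up after a short timeout and fall back to the empty string. `withTimeout` and `getPreviewOrEmpty` are hypothetical names, and `fetchPreview` stands in for whatever the original getPreview() body does; none of this is taken from the CVAT source.

```javascript
// Race a promise against a timer; resolve to `fallback` if the timer wins.
function withTimeout(promise, ms, fallback) {
    let timer;
    const timeout = new Promise((resolve) => {
        timer = setTimeout(() => resolve(fallback), ms);
    });
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: `fetchPreview(tid)` is assumed to return a promise
// for the preview data. Slow or failed requests both yield an empty string.
async function getPreviewOrEmpty(fetchPreview, tid, ms = 5000) {
    try {
        return await withTimeout(fetchPreview(tid), ms, "");
    } catch (err) {
        return "";
    }
}
```

Note that this does not cancel the underlying request; it only stops the caller from waiting on it, so previews still appear whenever the server responds in time.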

@kosehy

kosehy commented Aug 3, 2021

I will try your solution.
Thank you for sharing!

@nmanovic nmanovic removed this from the Backlog milestone Nov 23, 2021
@bsekachev bsekachev self-assigned this Nov 2, 2022

6 participants