ClusterODM is dropping a high number of uploads #92

Closed · FJEANNOT opened this issue Feb 8, 2022 · 4 comments · Fixed by #93
FJEANNOT (Contributor) commented Feb 8, 2022:

What is the problem?

Since yesterday my WebODM has been failing every task after I restarted it. I noticed it pulled a newer image from Docker Hub, and there are no previous versions available on Docker Hub.
After investigating, I noticed that ClusterODM is closing a lot of POST HTTP requests on the route /task/new/upload/<task_id>.
The error message displayed in WebODM is sometimes Connection error: HTTPSConnectionPool(host='example.com', port=443): Read timed out. (read timeout=30) and other times just a 502.

Even the smallest jobs are failing; I had this issue with a dataset containing only 5 images.

On the ClusterODM web interface, I can still launch a task, but during the uploads I get a lot of messages saying Upload of IMG_NAME.jpg failed, retrying...

After seeing this, I did a clean install of my entire stack (WebODM web app & worker, ClusterODM and one locked NodeODM for the autoscaler) on completely different infrastructure and had the exact same problem.

What should be the expected behavior?

Uploading files through the WebODM or ClusterODM UI should work.

How can we reproduce this? (What steps did you do to trigger the problem? If applicable, please include multiple screenshots of the problem! Be detailed)

Install WebODM and ClusterODM and try to upload files to launch a task.
My current installation is on a Kubernetes cluster hosted on Scaleway. I can provide the manifests I'm using if needed.
WebODM version: 1.9.11
ClusterODM version: latest on Docker Hub

FJEANNOT (Contributor, Author) commented Feb 9, 2022:
I forked the project today and started troubleshooting.

Removing the handleClose function seems to be the solution for me. Maybe saveStream.close() or fs.unlink() is taking too long? See the sketch after the snippet below for the kind of guard I have in mind.

ClusterODM/libs/taskNew.js, lines 170 to 183 at 311dbb0:
```js
const handleClose = () => {
    if (saveStream){
        saveStream.close();
        saveStream = null;
    }
    if (fs.exists(saveTo, exists => {
        params.imagesCount--;
        fs.unlink(saveTo, err => {
            if (err) logger.error(err);
        });
    }));
};
req.on('close', handleClose);
req.on('abort', handleClose);
```
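
For illustration only, this is the kind of guard I have in mind (a sketch, not the code in the repo): skip the cleanup when the upload has already finished, so a late 'close' event cannot delete a file that was written successfully. The uploadFinished flag and the 'finish' hook are hypothetical additions of mine; saveStream, saveTo, params, logger and req are the variables from the snippet above.

```js
// Sketch only: guard the cleanup so that a 'close' event arriving after a
// successful upload does not decrement the image count or delete the file.
let uploadFinished = false;                        // hypothetical flag, set below
saveStream.on('finish', () => { uploadFinished = true; });

const handleClose = () => {
    if (uploadFinished) return;                    // upload completed, nothing to undo

    if (saveStream){
        saveStream.close();
        saveStream = null;
    }
    fs.exists(saveTo, exists => {                  // deprecated API, kept to match the snippet above
        if (exists){
            params.imagesCount--;
            fs.unlink(saveTo, err => {
                if (err) logger.error(err);
            });
        }
    });
};
req.on('close', handleClose);
req.on('abort', handleClose);
```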

pierotofy (Member) commented:
I wonder if this is due to the fact that the Docker image is based off node:lts; I remember there were some breaking changes in Node.js that would lead to issues in ClusterODM. Wonder what happens if you simply downgrade the Node version to 12 or 14, along the lines of the sketch below.
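
A sketch of what that change could look like, assuming the image is built from a Dockerfile whose base image line is currently node:lts (adjust the tag as needed):

```dockerfile
# Sketch: pin the base image to a known-good Node major instead of the moving lts tag
FROM node:14
```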

FJEANNOT (Contributor, Author) commented Feb 9, 2022:

Alright, I'm going to try this.

FJEANNOT (Contributor, Author) commented Feb 9, 2022:

Alright, it works like a charm with Node 14. I can submit a pull request to close this.
