FileNotFound error when running EVHR evhrToA with celery enabled #3

Open
cssprad1 opened this issue Nov 22, 2022 · 0 comments
Labels
bug Something isn't working

Comments

@cssprad1
Contributor

Issue: image_calc throws a GdalIO "file does not exist" error on celery-enabled EVHR runs

Environment:

Singularity Container: evhr_4.0.0.sif
System: Explore
Nodes: ilab2xx and gpu0xx
Multi-processing: Enabled

Description:

When running EVHR's EvhrToA with celery enabled, EvhrToA's mapproject system call intermittently fails to write <filename>-ortho-temp.tif. The subsequent image_calc call then errors out with:

GdalIO: "<filename>-ortho-temp.tif" does not exist in the file system, 
and is not recognized as a supported dataset name.  (code = 4)

Which file is affected appears to be random, and the failure only occurs when celery is enabled. This suggests a problem in the multi-processing back-end, where mapproject does not throw an error even though not all of its outputs were written.
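
One possible guard, sketched under assumptions: if mapproject can return without an error and still leave no output, the caller could check for the expected file before handing it to image_calc. The helper name `run_and_verify_output` is hypothetical and not part of EVHR, and it uses plain `subprocess` rather than EVHR's SystemCommand, whose interface is not shown in this report.

```python
import subprocess
from pathlib import Path


def run_and_verify_output(cmd, expected_output):
    """Run cmd, then fail loudly if expected_output was not produced.

    Hypothetical helper for illustration; not EVHR's SystemCommand.
    """
    expected = Path(expected_output)
    result = subprocess.run(cmd, capture_output=True, text=True)

    if result.returncode != 0:
        raise RuntimeError(
            f"Command failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )

    # A zero exit code alone does not prove the file was written.
    if not expected.is_file() or expected.stat().st_size == 0:
        raise FileNotFoundError(
            f"Command exited 0 but did not produce {expected}"
        )
    return expected
```

With a check like this, the failure would surface at the mapproject step and name the missing file, instead of appearing later as a GdalIO error from image_calc.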

Stack trace:

[2022-11-21 15:43:31,332: ERROR/ForkPoolWorker-8] evhr.model.EvhrToaCelery._runOneStrip[7a3e6e6f-4a54-46f4-9867-8b7e0f359595]: Encountered exception executing image_calc: A system command error occurred.  b"\t--> Setting number of processing threads to: 1\nError: GdalIO: `output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif' does not exist in the file system, and is not recognised as a supported dataset name.  (code = 4)\nGDAL: Failed to open output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif.\n"b''
In ILProcessController.__exit__() 16460
[2022-11-21 15:43:31,449: ERROR/ForkPoolWorker-8] Task evhr.model.EvhrToaCelery._runOneStrip[7a3e6e6f-4a54-46f4-9867-8b7e0f359595] raised unexpected: RuntimeError('Encountered exception executing image_calc: A system command error occurred.  b"\\t--> Setting number of processing threads to: 1\\nError: GdalIO: `output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif\' does not exist in the file system, and is not recognised as a supported dataset name.  (code = 4)\\nGDAL: Failed to open output/4-orthos/<wv_file>_BAND_B.r100-ortho-temp.tif.\\n"b\'\'')
Traceback (most recent call last):
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToA.py", line 647, in _orthoOne
    SystemCommand(cmd, logger, True)
  File "<FILE_SYSTEM_PATH_REDACTED>core/model/SystemCommand.py", line 70, in __init__
    raise RuntimeError(msg)
RuntimeError: A system command error occurred.  b"\t--> Setting number of processing threads to: 1\nError: GdalIO: `output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif' does not exist in the file system, and is not recognised as a supported dataset name.  (code = 4)\nGDAL: Failed to open output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif.\n"b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToaCelery.py", line 83, in _runOneStrip
    EvhrToA._stripToToa(imageForEachBandInStrip,
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToA.py", line 1099, in _stripToToa
    orthoBandDg = EvhrToA._orthoOne(stripBand,
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToA.py", line 656, in _orthoOne
    raise RuntimeError(msg)
RuntimeError: Encountered exception executing image_calc: A system command error occurred.  b"\t--> Setting number of processing threads to: 1\nError: GdalIO: `output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif' does not exist in the file system, and is not recognised as a supported dataset name.  (code = 4)\nGDAL: Failed to open output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif.\n"b''
Traceback (most recent call last):
  File "evhr/view/evhrToaCLV.py", line 150, in <module>
    sys.exit(main())
  File "evhr/view/evhrToaCLV.py", line 138, in main
    toa.run(env, dgScenes)
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToA.py", line 838, in run
    self.processStrips(stripsWithDgScenes,
  File "<FILE_SYSTEM_PATH_REDACTED>evhr/model/EvhrToaCelery.py", line 57, in processStrips
    result.get()    # Waits for wpi to finish.
  File "/usr/local/lib/python3.8/dist-packages/celery/result.py", line 677, in get
    return (self.join_native if self.supports_native_join else self.join)(
  File "/usr/local/lib/python3.8/dist-packages/celery/result.py", line 808, in join_native
    raise value
RuntimeError: Encountered exception executing image_calc: A system command error occurred.  b"\t--> Setting number of processing threads to: 1\nError: GdalIO: `output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif' does not exist in the file system, and is not recognised as a supported dataset name.  (code = 4)\nGDAL: Failed to open output/4-orthos/<FILE_NAME_REDACTED>_BAND_B.r100-ortho-temp.tif.\n"b''
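
Because the affected file seems to vary between runs, it may help to collect the missing paths from the worker logs of several runs. The following is a minimal sketch that assumes the error text keeps the form shown in the trace above; `missing_files` and `MISSING_FILE_RE` are hypothetical names, not part of EVHR. It returns distinct paths, since celery repeats the same message several times per failure.

```python
import re

# Matches the path quoted in the GdalIO message, for example:
#   GdalIO: `output/4-orthos/X-ortho-temp.tif' does not exist in the file system
# The optional backslash covers the escaped quote seen in repr()-style log lines.
MISSING_FILE_RE = re.compile(
    r"""GdalIO: [`'"](?P<path>[^`'"\\]+)\\?[`'"] does not exist"""
)


def missing_files(log_text):
    """Return the sorted, distinct file paths reported missing by GdalIO."""
    return sorted({m.group("path") for m in MISSING_FILE_RE.finditer(log_text)})
```

Running this over the logs of several celery-enabled runs would show whether the same strips or bands fail repeatedly or whether the failures really are random.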

Reconstruction:

singularity instance start -B <system_paths> evhr_4.0.0.sif evhr_instance
singularity exec instance://evhr_instance /usr/bin/python /usr/local/ilab/evhr/view/evhrToaCLV.py \
	-o output \
	--scenes_in_file <contact cssprad1 for input path> \
	--celery
@cssprad1 cssprad1 self-assigned this Nov 22, 2022
@cssprad1 cssprad1 added the bug Something isn't working label Nov 22, 2022