Large asset Ex OOM fix when in s3 asset mode #3598

Merged
merged 7 commits into from
Apr 30, 2024

sotojn
Contributor

@sotojn sotojn commented Apr 23, 2024

This PR makes the following changes:

  • Improves the s3 backend get() requests to grab assets in a more memory-efficient way
    • This resolves an issue where pulling and decompressing large assets from s3 would cause an OOM on the execution controller on job startup
  • Adds an error message, shown when the asset loader closes with an error, that advises what to do in the case of an OOM issue
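
For readers unfamiliar with the general technique, here is a minimal sketch of one memory-conscious way to read a large object body. It is not the code from this PR. It assumes the object body is available as a Node.js `Readable`, and the helper name `streamToBuffer` is hypothetical. The idea is to consume the stream chunk by chunk and concatenate once at the end, rather than passing through an intermediate string encoding that would hold extra copies of the data in memory.

```typescript
import { Readable } from 'node:stream';

/**
 * Hypothetical helper (not part of Teraslice): collect a readable stream
 * into a single Buffer without any intermediate string conversion.
 */
async function streamToBuffer(body: Readable): Promise<Buffer> {
    const chunks: Buffer[] = [];
    let total = 0;

    // Readable streams are async iterable, so each chunk is handled as it arrives.
    for await (const chunk of body) {
        const buf = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
        chunks.push(buf);
        total += buf.length;
    }

    // Passing the total length lets Buffer.concat allocate the result once.
    return Buffer.concat(chunks, total);
}
```

The final buffer still has to fit in memory, so this only avoids the extra copies. A very large asset may still need a higher memory limit on the process that loads it.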

Ref to issue #3595

@sotojn sotojn added the bug, enhancement, k8s (Applies to Teraslice in kubernetes cluster mode only.), and performance labels Apr 23, 2024
@sotojn sotojn self-assigned this Apr 23, 2024
@sotojn sotojn force-pushed the large-asset-s3-fix branch from 02a17d7 to f03c7c6 Compare April 23, 2024 22:19
@sotojn sotojn requested review from godber and busma13 April 24, 2024 14:40
@godber
Member

godber commented Apr 24, 2024

Using this branch with ES-backed asset storage, I started the cluster with this command:

yarn k8s:minio

I uploaded the asset fine:

earl assets deploy local -f autoload/common_processors-v0.13.1-node-18-bundle.zip
Asset posted to local: eabe46a623bc55886e1e81f3eefe74754a903fd1

When trying to run in ES mode I get the following error when the job is registered:

earl tjm register local examples/jobs/data_generator.json
Error Failure to get assets, caused by TSError: index_not_found_exception
    at _errorHandlerFn (/app/source/packages/elasticsearch-api/index.js:840:21)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Caused by: ResponseError: index_not_found_exception
    at IncomingMessage.<anonymous> (/app/source/node_modules/elasticsearch6/lib/Transport.js:310:25)
    at IncomingMessage.emit (node:events:529:35)
    at endReadableNT (node:internal/streams/readable:1400:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
If running out of memory, try consider increasing the memory allocation for the process by adding/modifying the "memory_execution_controller" or "resources_limits_memory" (for workers) field in the job file. registering job Data Generator on http://localhost:5678

@godber godber merged commit 1baff7a into master Apr 30, 2024
40 checks passed
@godber godber deleted the large-asset-s3-fix branch April 30, 2024 01:32