Some of the cosmos DAGs not loading #1361

Open
vemulagopal opened this issue Dec 3, 2024 · 3 comments
Labels
area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc area:performance Related to performance, like memory usage, CPU usage, speed, etc stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed

Comments

@vemulagopal

Hi @pankajkoti, I have an issue with Cosmos: a few of my Cosmos DAGs are not loading in my local environment. When I checked the logs, the DAG processor was being killed with exit code -9. This happens for Cosmos DAGs with a very large number of tasks. I increased the memory of the Docker Engine virtual machine (where Airflow is running) to 24 GB, with 4 CPUs, but the processor still exits. Could you please help me fix the issue?

{manager.py:1017} ERROR - Processor for /opt/airflow/dags/test_cosmos_dag.py exited with return code -9.

Here is my configuration:

scheduler:
  min_file_process_interval: 300
core:
  dagbag_import_timeout: 1600
  dag_file_processor_timeout: 1450

Thanks
Giridhar

@dosubot dosubot bot added area:execution Related to the execution environment/mode, like Docker, Kubernetes, Local, VirtualEnv, etc area:performance Related to performance, like memory usage, CPU usage, speed, etc labels Dec 3, 2024

dosubot bot commented Dec 3, 2024

Hi @vemulagopal, I'm here to help you with your Cosmos DAG issue. Let's get this sorted out together.

A return code of -9 means the process was terminated by signal 9 (SIGKILL). This typically happens when the kernel's out-of-memory (OOM) killer stops a process that is using too much memory. Here are a few suggestions to address this issue:

  1. Optimize DAG Processing: Ensure that you are using Cosmos's caching and partial parsing features to optimize DAG processing. These can significantly reduce the memory and processing time required for large DAGs. Make sure caching is enabled and configured properly in your airflow.cfg or through environment variables [1][2].

  2. Increase Memory Allocation: Although you've already increased the memory to 24GB, consider further increasing it if possible, as large DAGs can be very memory-intensive.

  3. Adjust Airflow Configurations: Review and adjust your Airflow configurations. The dag_file_processor_timeout and dagbag_import_timeout settings are already increased, which is good. You might also want to look into other settings that could help manage memory usage, such as parallelism and max_active_tasks_per_dag.

  4. Profile and Monitor Resource Usage: Use monitoring tools to profile the memory usage of your Airflow instance. This can help identify if there are specific tasks or operations that are consuming excessive resources.

  5. Review DAG Complexity: If possible, try to simplify the DAGs or break them into smaller, more manageable pieces. This can help reduce the load on the DAG processor.
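
As a rough sketch of how the settings in items 1 and 3 could be supplied through environment variables, the snippet below builds variable names using Airflow's `AIRFLOW__{SECTION}__{KEY}` convention. The helper `airflow_env_var` is a hypothetical name introduced here for illustration. The two `cosmos` options are assumptions about the Cosmos cache settings and should be checked against the Cosmos configuration docs for your version.

```python
def airflow_env_var(section: str, key: str) -> str:
    """Build the environment variable name Airflow reads for a config option.

    Hypothetical helper; it only applies the AIRFLOW__{SECTION}__{KEY} naming rule.
    """
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


# Values copied from the configuration posted in this issue.
reported_settings = {
    ("scheduler", "min_file_process_interval"): "300",
    ("core", "dagbag_import_timeout"): "1600",
    ("core", "dag_file_processor_timeout"): "1450",
}

# ASSUMPTION: Cosmos cache options; verify these names in the Cosmos docs.
assumed_cosmos_settings = {
    ("cosmos", "enable_cache"): "True",
    ("cosmos", "enable_cache_partial_parse"): "True",
}

# Merge both groups and translate each (section, key) pair into a variable name.
all_settings = {**reported_settings, **assumed_cosmos_settings}
env = {airflow_env_var(section, key): value
       for (section, key), value in all_settings.items()}

# Print shell export lines that could go into a docker-compose or .env setup.
for name, value in env.items():
    print(f"export {name}={value}")
```

Environment variables take precedence over values in airflow.cfg, so this is one way to try different settings in a Docker setup without editing the config file.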

By implementing these strategies, you should be able to mitigate the memory issues causing the DAG processor to exit with code -9.
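
For background on the return code itself, here is a minimal sketch (POSIX only, not Airflow code) showing that Python reports a child process killed by SIGKILL with a return code of -9. It assumes the DAG processor's return code follows the same negative-signal convention, which is how Python's process APIs report signal terminations.

```python
import signal
import subprocess
import sys

# Start a child Python process that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Deliver SIGKILL, the same signal the kernel OOM killer uses.
# SIGKILL cannot be caught or ignored, so the child gets no chance to log anything.
child.send_signal(signal.SIGKILL)
return_code = child.wait()

# On POSIX, a negative return code -N means "terminated by signal N".
print(return_code)
print(signal.Signals(-return_code).name)
```

Because the killed process cannot log its own death, the parse log for the DAG file usually just stops. The kernel log on the Docker host or VM (for example via `dmesg`) is the place to confirm whether the OOM killer was responsible.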


@tatiana
Collaborator

tatiana commented Dec 11, 2024

Hi @vemulagopal, thanks for reporting this problem.

We need additional information to be able to help you. Please answer the questions in our bug report template:
https://github.com/astronomer/astronomer-cosmos/issues/new?template=01-bug.yml


This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Jan 11, 2025
2 participants