How to use it with multi python projects on Airflow #97

Closed
gudata opened this issue Jan 25, 2023 · 1 comment
Assignees
Labels
question Further information is requested

Comments

@gudata

gudata commented Jan 25, 2023

Hello,
This project is excellent and adds a lot of usefulness and verbosity to Airflow pipelines.

My question is probably more about Airflow than about Cosmos.
I would like to know how to use Cosmos when all the project dependencies are self-contained in a Docker image.

An architecture where the DAGs and the tasks share the same Python runtime environment isn't scalable. One use case is needing a different version of Python. Another is having an old or new dbt version that is not compatible with the Airflow requirements. A third is Airflow running DAGs and tasks from different teams and technical stacks.

For those reasons, I chose to use Airflow as a scheduler and pack all the task code into Docker images.

Could you please share your experience of using this project in an Airflow installation where many teams run many pipelines with different stacks?

@jlaneve
Collaborator

jlaneve commented Jan 26, 2023

Hey @gudata! Good question. What we do internally is install dbt into a virtual environment in the Docker image - we've documented how to do it in the docs here. However, there's been some talk about running dbt/Cosmos tasks in a similar manner to the Kubernetes Pod Operator, where you build a Docker image specifically for the task. If that's what you're looking for, feel free to add an issue (or open a PR!) and if it gets enough traction we could prioritize it.
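To illustrate the virtual-environment approach, here is a minimal sketch of calling a dbt binary that lives in its own venv, so dbt's dependencies never touch the Airflow environment. It is not Cosmos code: the venv path `/usr/local/airflow/dbt_venv`, the helper names, and the POSIX `bin/` layout are all assumptions made for the example.

```python
import subprocess
from pathlib import Path
from typing import List

# Hypothetical location; the thread does not say where the venv lives.
DBT_VENV = Path("/usr/local/airflow/dbt_venv")


def dbt_executable(venv: Path = DBT_VENV) -> Path:
    """Return the dbt entry point inside the venv (POSIX layout assumed)."""
    return venv / "bin" / "dbt"


def build_dbt_command(subcommand: str, project_dir: str,
                      venv: Path = DBT_VENV) -> List[str]:
    """Build the argument list for a dbt subcommand such as 'run' or 'seed'."""
    return [str(dbt_executable(venv)), subcommand, "--project-dir", str(project_dir)]


def run_dbt(subcommand: str, project_dir: str,
            venv: Path = DBT_VENV) -> subprocess.CompletedProcess:
    """Run dbt from the isolated venv; raises CalledProcessError on failure."""
    return subprocess.run(
        build_dbt_command(subcommand, project_dir, venv),
        check=True,
        capture_output=True,
        text=True,
    )
```

Because the command points at the venv's own `dbt` script, the dbt version installed there can differ from anything pinned by Airflow. This only isolates Python packages, though; a different Python interpreter version or a different team's stack still needs a separate image.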

@chrishronek chrishronek self-assigned this Jan 26, 2023
@chrishronek chrishronek added the question Further information is requested label Jan 26, 2023
@chrishronek chrishronek assigned jlaneve and unassigned chrishronek Jan 26, 2023
dimberman added a commit that referenced this issue Mar 20, 2023
This pull request creates two operators, `DbtKubernetesBaseOperator` and
`DbtDockerBaseOperator`, to clone the logic in the `DbtBaseOperator`
with the same subclasses (`DbtLSOperator`, `DbtSeedOperator`,
`DbtRunOperator`, ...) but to use a `KubernetesPodOperator` or a
`DockerOperator`.
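The layering described above can be pictured roughly as follows. This is only a sketch of the idea, not the code from the pull request: the class names ending in `Sketch`, the `as_docker_command` helper, and the image name are hypothetical, and nothing here imports Airflow or Cosmos.

```python
from dataclasses import dataclass
from typing import ClassVar, List


@dataclass
class DbtCommandSketch:
    """Hypothetical base: subclasses only choose which dbt subcommand to run."""
    project_dir: str
    base_cmd: ClassVar[str] = ""

    def build_cmd(self) -> List[str]:
        return ["dbt", self.base_cmd, "--project-dir", self.project_dir]


class DbtLSSketch(DbtCommandSketch):
    base_cmd = "ls"


class DbtSeedSketch(DbtCommandSketch):
    base_cmd = "seed"


class DbtRunSketch(DbtCommandSketch):
    base_cmd = "run"


def as_docker_command(cmd: List[str], image: str) -> List[str]:
    """Wrap a dbt command so it executes inside a task-specific image."""
    return ["docker", "run", "--rm", image, *cmd]


# Same subcommand logic, different execution backend.
local_cmd = DbtRunSketch("/proj").build_cmd()
docker_cmd = as_docker_command(local_cmd, "my-dbt-image:latest")
```

The point of the split is that the subcommand classes stay identical while only the wrapper changes, which is why one set of subclasses can be reused across local, Docker, and Kubernetes execution.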

I'm trying to meet a community need (see
#128 or
#97).

My PR is not perfect at all I think, but I will open it to start
discussions and make it evolve to improve it as these discussions go on.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: dimerman <danielryan2430@gmail.com>