Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PythonVirtualOperator fails silently when virtualenv is not installed. #40420

Closed
1 of 2 tasks
thisiswhereitype opened this issue Jun 25, 2024 · 16 comments · Fixed by #43568
Closed
1 of 2 tasks

PythonVirtualOperator fails silently when virtualenv is not installed. #40420

thisiswhereitype opened this issue Jun 25, 2024 · 16 comments · Fixed by #43568

Comments

@thisiswhereitype
Copy link

thisiswhereitype commented Jun 25, 2024

Apache Airflow version

2.9.2

If "Other Airflow 2 version" selected, which one?

No response

What happened?

When using PythonVirtualOperator the venv is created by airflow as passed in the dag, in a tmp directory a standard python -m venv /tmp/.... is run. Later pip install -r ... is also run.

However, when virtualenv is not installed the failed command is not properly detected.

Later an uncaught OSError in subprocess.Popen (https://github.com/apache/airflow/blob/c310159bc2363c12110b11febd5febaab8670210/airflow/utils/process_utils.py#L184)` is raised by the absent .venv/bin/pip causing the exit logic not to fire and the TempDirectory.__exit__ then deletes the evidence.

See: subprocess Exceptions
Additionally stderr is not logged.

What you think should happen instead?

I see in the project file there is a core-all listing including virtualenv.

Shoudn't virtualenv be a dependecy of the pip install apache-airflow? I haven't used hatch

If it shouldn't I think there should be better error handing to explain the issue.

How to reproduce

Run a PythonVirtualOperator with airflow installed in a venv that has no virtualenv.

Operating System

ws2 ubuntu 2404

Versions of Apache Airflow Providers

No response

Deployment

Virtualenv installation

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@thisiswhereitype thisiswhereitype added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jun 25, 2024
Copy link

boring-cyborg bot commented Jun 25, 2024

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@potiuk potiuk added good first issue and removed needs-triage label for new issues that we didn't triage yet labels Jun 25, 2024
@potiuk
Copy link
Member

potiuk commented Jun 25, 2024

Marked it as good first issue for someone to handle.

@eladkal
Copy link
Contributor

eladkal commented Jun 25, 2024

But we do check

if not is_venv_installed():
raise AirflowException("PythonVirtualenvOperator requires virtualenv, please install it.")

@potiuk
Copy link
Member

potiuk commented Jun 25, 2024

I think @thisiswhereitype to check why it did not work in his case.

@thisiswhereitype
Copy link
Author

thisiswhereitype commented Jun 26, 2024

Thanks, the reason is that the is_venv_installed finds /usr/bin/virtualenv and proceeds but doesn't try to import it.

A module import would closely match what happens later.

(.venv_airflow) $ python  -c 'import importlib.util; importlib.util.find_spec("virtualenv")'
(.venv_airflow) $ which virtualenv
/usr/bin/virtualenv
(.venv_airflow) $ pip list | grep env
<no output>
(.venv_airflow) $

@potiuk
Copy link
Member

potiuk commented Jun 26, 2024

Feel free to fix it, otherwise it will have to wait for someone else to pick it up.

@STAR-173
Copy link

Hey @potiuk, could I take on this issue.

If I am correct, I should just remove the previous checks defined in the function is_venv_installed and add a new check for seeing if it venv is installed or not.
Or should I preserve the previous check and just add a new check with or

@potiuk
Copy link
Member

potiuk commented Jun 30, 2024

What do YOU think is better?

@STAR-173
Copy link

Removing the previous checks defined in the function is_venv_installed and add a new check for seeing if it venv is installed or not as the previous checks are not functioning as desired

@potiuk
Copy link
Member

potiuk commented Jul 2, 2024

Removing the previous checks defined in the function is_venv_installed and add a new check for seeing if it venv is installed or not as the previous checks are not functioning as desired

Not sure if I got it - but do what you think is best and let's discuss it.

@pateash
Copy link
Contributor

pateash commented Jul 15, 2024

Hi @eladkal ,
Please assign this to me if @STAR-173 is not working on it.

@STAR-173
Copy link

STAR-173 commented Jul 15, 2024

Hi @pateash ,
I really appreciate your concern regarding this issue but I think I will just continue to work on this. It's just been busy weeks for a bit.
I will be opening a PR within this week

@thisiswhereitype
Copy link
Author

Looks like the PR went stale

@potiuk
Copy link
Member

potiuk commented Oct 25, 2024

Looks like the PR went stale

Wopuld you like to take it over @thisiswhereitype ? Is this what the comment is about? Feel free to do it if you think it's a good idea. We have no problems with even similar people working in parallel on similar PRs - pretty much all the people contributing here are volunteers and they have day jobs, famiilies and hobbies, so they might or might not complete what they initially thought they will be able to do. Usually then someone who is eager to work on it, can take it over, and we can assign it to that person.

Shall I assign it to you @thisiswhereitype ?

@thisiswhereitype
Copy link
Author

@potiuk please do, see the PR

@potiuk
Copy link
Member

potiuk commented Oct 26, 2024

I commented there. You need way more to implement it.

potiuk added a commit to potiuk/airflow that referenced this issue Oct 31, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Oct 31, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit to potiuk/airflow that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
potiuk added a commit that referenced this issue Nov 1, 2024
The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: #40420
ellisms pushed a commit to ellisms/airflow that referenced this issue Nov 13, 2024
…he#43568)

The PythonVirtualenvOperator has been using virtualenv in order
to support Python 2.7 and pre Python 3.5, however we do not need
those any more - and we can use built-in venv instead.

Fixes: apache#40420
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment