Airflow Scheduler on WSL fails to execute Windows EXE #13108
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
Interesting problem. Did you try to run a bash script that would execute your binary via
|
@potiuk tried (even though it's intended for non-executables like |
Can it be this (32 vs. 64bit python) https://stackoverflow.com/questions/59429146/using-subprocess-in-with-variable-containing-the-dos-cmd |
@potiuk don't think so, all of my airflow environment (including Python) is installed within Ubuntu. |
@potiuk any other things to check? Is there a way to get the task runner / airflow scheduler to emit verbose output or to see what exactly did it do by attempting to run the command and receiving error 1? |
I am not a WSL2 user, unfortunately :(. Maybe others who use it can help? You can ask in troubleshooting, or try asking the question on Stack Overflow in general. |
Had already asked in Stack Overflow, with no luck: https://stackoverflow.com/questions/65319176/airflow-scheduler-on-wsl-fails-to-execute-windows-exe What about my other question about the scheduler? Also, I'm not 100% sure it's WSL2's fault. |
Too bad - maybe someone will answer it. I am not sure about the other question - maybe it makes sense to ask one question per thread? Then people will be able to focus on one thing and maybe someone will answer it. |
Thanks; I've just created a question, but may I note that the question is not separate from the issue; it's just a means to help the community and myself figure out the root cause; this will ultimately lead to a fix in either Airflow or WSL2, depending on the findings. It can indeed be a WSL2 issue but I suspect it's more likely something with Airflow, since the way it runs |
Furthermore, if I try to open an issue with WSL2, they will very rightfully say that since Bash executes these Windows programs successfully, it must be something within airflow itself, which executes Bash commands differently. So I think this place is the best avenue for trying to solve this... |
Sure - just please take a note, that Airflow is not "guaranteed" to work on WSL2. WSL2 might be used for development of Airflow itself, but this is not the "target" execution environment for any production use. So while you might get help from someone who finds the problem, it's not really "expected" that this case is going to work. |
The project is open source and not for profit... the whole idea is to make the product thrive by figuring out cases that should legitimately work but don't. More and more users are using WSL / WSL2. The only way to invoke a Windows program from airflow is via WSL. Meaning, you can't just put airflow on a Linux VM or bare metal and invoke Windows programs, because it won't work. By the same token, I probably shouldn't "expect" anyone to help with a genuine inner-Linux issue either, because the project is mainly volunteer based. Airflow is a product, a platform, for scheduling, triggering and executing DAGs, so it runs on its designed environment, Linux, but I think that by ruling out support for Windows targets (not Windows as an OS for airflow - Windows programs as targets of airflow tasks) it effectively sends a message: "run it inside the Linux world only". |
This is precisely what Airflow is right now. Linux is the only 'execution environment' for Airflow. There are open issues to improve Windows support - for example #12874 and #10388 - but those are not being actively worked on for now. We are going to discuss the scope of Airflow 2.1 right after New Year's, and I will make sure to mention Windows support as a possibly important topic to cover. But until this is picked up by someone in the community, the most you can get is support from other community members who are also on Windows. Sorry if that is not helpful, but this is the current state. Maybe you can also start a discussion thread on the Airflow devlist about that and see if there is interest in improving support for Windows - maybe there will be some other community members who would like to join forces and improve Windows support. |
Thanks for raising it. It works on a different environment so I think it's environment/permission based (even though I mentioned I'm root all the way) rather than code based. The other takeaway might be to add some more environment-related info when a Bash command fails, to STDERR / STDOUT. UPDATE: I've read the 2 issues you linked to; they talk about having Windows host airflow itself. |
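The takeaway above (emit more environment-related information when a Bash command fails) could look roughly like the sketch below. It is not Airflow code: `run_with_context` is a hypothetical helper name, and the choice of context fields (effective UID, working directory, PATH) is an assumption about what would be useful to log.

```python
import os
import subprocess


def run_with_context(bash_command: str) -> dict:
    """Run a command through `bash -c` and describe the outcome.

    Hypothetical helper, not part of Airflow. On a non-zero exit code the
    report also carries some context about the calling process.
    """
    proc = subprocess.run(
        ["bash", "-c", bash_command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so no message is lost
        text=True,
    )
    report = {"returncode": proc.returncode, "output": proc.stdout}
    if proc.returncode != 0:
        # Assumed-useful context; extend with whatever differs in your setup.
        report["context"] = {
            "euid": os.geteuid(),
            "cwd": os.getcwd(),
            "path": os.environ.get("PATH", ""),
        }
    return report
```

Calling `run_with_context("echo hi; exit 3")` returns the exit code and the captured output together with the context block, which could then be written to the task log.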
Yep. But I think those are very much related - i.e. the same people might be interested in solving both issues. BTW - maybe you can simply try to add this info yourself and make a PR. There are not many people in the community who are running Airflow on Windows - it is not very common. But that might be a good start for you to dig a bit deeper and try to add more logging yourself. Airflow is a community-driven project, so you might make it your first contribution if you find a fix to the problem. It's not as complex as you might think. This is all Python, so you can edit the Python code directly in your installation. The code is here for bash operator and depending on how you installed it you will find airflow installed under one of the paths displayed by Then you can add your own logging information there and do some more experimentation with it. |
Yes I've already inspected bash_operator.py. Saw that it's using a subprocess etc. but couldn't figure how to tweak it to emit more detailed logging or to expose what different environment a subprocess runs at. I'm not a Python developer; I can google of course and try debugging my way - which I'll do - but I was hoping someone within the project could immediately point the environmental difference when running via airflow scheduler / Python subprocess, vs. plain bash. |
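One way to look for the environmental difference asked about above, without knowing Airflow internals, is to dump the process environment once from an interactive shell and once from inside a task, then compare the two dumps. The sketch below uses only the standard library; `snapshot_env` and `diff_env` are hypothetical helper names, and it assumes the difference shows up in environment variables at all, which the thread does not confirm.

```python
import json
import os


def snapshot_env(path: str) -> None:
    """Write the current process environment to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def diff_env(a: dict, b: dict) -> dict:
    """Return the variable names that differ between two environment snapshots."""
    shared = set(a) & set(b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "changed": sorted(k for k in shared if a[k] != b[k]),
    }
```

A possible workflow: call `snapshot_env` from a plain bash session and again from a script launched by the failing task (writing to two different files), load both files with `json.load`, and pass them to `diff_env`.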
I don't think we have anyone who uses this setup in the core project team. We could also try googling (which I did) but I believe at this stage some more experimentation is needed. |
This is likely more related to the way Windows treats processes spawned as subprocesses of such a process - and this is not an "airflow" issue but rather an issue with subprocesses in this setup. We fork the processes when we are running the task - so this might be the issue. You can try to do |
Thanks for the idea! For now I've re-built a fresh Ubuntu machine with Airflow 2 and the problem doesn't exist. If it happens again in the future, I'll try that env flag. But I'm not sure it's a Windows thing, since I have another Windows machine (running Ubuntu 18.04, as opposed to my "problematic" WSL2 that was running Ubuntu 20.04) which works just fine. |
I had the same issue, and I believe the problem is how the BashOperator parses the shell script. In the Airflow BashOperator code, the execute function tries to run "bash" + command. And when you run "bash *.exe" in WSL you get the error: "cannot execute binary file". So adapting the BashOperator to handle this case (shell scripts that start with a Windows executable file), or creating your own operator (changing the execute function in the BashOperator code so that it runs only self.bash_command instead of ['bash', '-c', self.bash_command]), should solve the issue. |
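The difference described in the comment above, running the command through a bash wrapper versus executing it directly, can be illustrated outside Airflow with plain `subprocess`. This is a sketch only: `run_via_bash` and `run_direct` are hypothetical names, and using `shlex.split` to build the argument list is an assumption about how a custom operator might handle a command string.

```python
import shlex
import subprocess


def run_via_bash(command: str) -> subprocess.CompletedProcess:
    """Run the command wrapped in bash, i.e. ['bash', '-c', command]."""
    return subprocess.run(
        ["bash", "-c", command],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )


def run_direct(command: str) -> subprocess.CompletedProcess:
    """Run the command without a shell: split it into argv and execute it directly."""
    argv = shlex.split(command)  # honours quotes, e.g. a path containing spaces
    return subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
```

Note the trade-off: the direct variant gives up shell features such as pipes, redirection and variable expansion, so it only suits commands that are a single program plus arguments.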
@BrunoSerafim thanks, I'll try! |
I think this is one of the best explanations I have found for executing exe files via WSL in Airflow. Maybe a little bit late, but it might be of some help to others. |
Thanks for this @zarria , it helped me ! |
Apache Airflow version: 1.10.11
Kubernetes version (if you are using kubernetes) (use kubectl version): N/A
Environment:
Ubuntu 20.04.1 LTS (Focal Fossa), running within WSL 2 on Windows 10 Pro 20H2, OS Build 19042.685 and Windows Feature Experience Pack 120.2212.551.0
uname -a: Linux LifeMapPC 4.19.128-microsoft-standard #1 SMP Tue Jun 23 12:58:10 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
What happened:
I have a BashOperator task which invokes an .EXE on Windows (via /mnt/c/... or via symlink).
The task fails. Log shows:
And that's it. Return code 1 with no further useful info.
Running the very same EXE via bash works perfectly, with no error (I also tried it on my own program which emits something to the console - in bash it emits just fine, but via airflow scheduler it's the same error 1).
What you expected to happen:
Expected the Windows executable to run successfully via airflow scheduler, same as when I run it directly in Bash. That is: emit any output to the console and return success (error 0).
Alternatively, happy to learn a way to get more insight into the log produced by the airflow scheduler run, i.e. to see "what happened" that makes it return error 1 on certain commands.
How to reproduce it:
I do not know the circumstances / environment "fault" that is causing it so can't supply reproduction steps.
Anything else we need to know:
Some more data and things I've done to rule out any other issue:
airflow scheduler runs as root. I also confirmed it's running in a root context by putting a whoami command in my BashOperator, which indeed emitted root. (I should also note that all native Linux programs run just fine! Only the Windows programs don't.)
airflow test on a BashOperator task - it runs perfectly - emits output to the console and returns 0 (success).