Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: git clone on Windows #26105

Merged
merged 8 commits into from
Sep 19, 2022
Merged

Fix: git clone on Windows #26105

merged 8 commits into from
Sep 19, 2022

Conversation

kokorin
Copy link
Contributor

@kokorin kokorin commented Sep 1, 2022

Airflow repository can't be cloned on Windows due to some files containing symbols restricted on Windows.

closes: #24596
related to: #25865


Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg
Copy link

boring-cyborg bot commented Sep 1, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: dev@airflow.apache.org
    Slack: https://s.apache.org/airflow-slack

@potiuk
Copy link
Member

potiuk commented Sep 1, 2022

Good idea to add a test for it :). But you will also need to fix some tests to make it works I am afraid.

BTW. That will be really helpful (and should be part of #windows - which I tihnk very soon (pending few more PRs to complete https://github.com/apache/airflow/projects/13 ) will finally be possible.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 1, 2022

But you will also need to fix some tests to make it works I am afraid.

That's clear. I'm working on that.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 1, 2022

@potiuk after investigation it looks like the root of the problem is in FileTaskHandler as it writes logs to paths containing dates in ISO 8601format. But ISO 8601format won't work on Windows.

Rewrite Breeze and CI in Python project has no issue to use another date format to make log directory structure Windows-compatible. Probably it's because the project aimed at Breeze rewriting, not at been able to run Airflow on Windows.

Changing FileTaskHandler is out of scope of this PR and the only way I see is to add another class (in test) extending FileTaskHandler and overwriting log directory structure with Windows compatible. After that DEFAULT_LOGGING_CONFIG in test should be adjusted to use customized FileTaskHandler.

The class (customized FileTaskHandler) itself is not a problem, but I'm not an expert in Python and newbie Airflow developer. Could you tell me how to make Airflow/Breeze to load customized FileTaskHandler?

@uranusjr
Copy link
Member

uranusjr commented Sep 2, 2022

I commented elsewhere. Since this actually affects Windows support (which we want to achieve at some point), the root cause should also be fixed. I would suggest the following steps:

  1. Fix the actual FileTaskHandler to log in different a file name format in Windows.
  2. Delete those problematic files from the repository, and general those files dynamically when testing (under appropriate file names) instead in Pytest’s tmp_path.
  3. Modify tests to appropriately point the log directory to tmp_path.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 2, 2022

1. Fix the actual `FileTaskHandler` to log in different a file name format in Windows.

@uranusjr I like your suggestion, the only problem I see is that it's not possible to run FileTaskHandler tests on Windows before fixing all other Windows-related issues.

So if it's ok to have in FileTaskHandler windows-related code which is not tested on Windows - I will work on that.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 2, 2022

@uranusjr I have came to an idea that to fix git clone on Windows it would be enough to fix points 2 and 3:

  1. Delete those problematic files from the repository, and general those files dynamically when testing (under appropriate file names) instead in Pytest’s tmp_path.
  2. Modify tests to appropriately point the log directory to tmp_path.

While point 1 could be fixed in another PR:

  1. Fix the actual FileTaskHandler to log in different a file name format in Windows.

@uranusjr
Copy link
Member

uranusjr commented Sep 2, 2022

My proposal is not to make FileTaskHandler work on Windows, only to modify it in a way that is in line to the changes we want to make if we want to make it work. This makes the work more meaningful; the current patch you proposes will basically all be deleted when Windows support is implemented, which is more wasteful than just implement the proper logic in the first place.

@potiuk
Copy link
Member

potiuk commented Sep 2, 2022

My proposal is not to make FileTaskHandler work on Windows, only to modify it in a way that is in line to the changes we want to make if we want to make it work. This makes the work more meaningful; the current patch you proposes will basically all be deleted when Windows support is implemented, which is more wasteful than just implement the proper logic in the first place.

Yeah. Agree with @uranusjr that this is the right way. Regarding generation of the files, it might even be as simple as zipping the log files that are used for test and have two .zip files "windows" / "posix" - and decompress the one that is relevant for the OS that you are running the tests on. This way you could add the different behaviours in FileTaskHandler too.

@kokorin kokorin marked this pull request as ready for review September 5, 2022 09:32
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@potiuk
Copy link
Member

potiuk commented Sep 5, 2022

Very nice approach :).

@potiuk
Copy link
Member

potiuk commented Sep 5, 2022

@uranusjr - WDYT?

@kokorin
Copy link
Contributor Author

kokorin commented Sep 6, 2022

@potiuk @uranusjr 3 Helm tests failed during last CI run, are they flaky or should I fix them first?

Comment on lines 184 to 185
# delete logs to minimize tests interference
shutil.rmtree(log_path)
Copy link
Member

@uranusjr uranusjr Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed, pytest tracks tmp_path_factory and cleans things it creates automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uranusjr pytest cleans directories after exiting defined scope. log_path fixture requires module scope to be used by log_app fixture (which also has module scope). But log_path should be cleared after each test, because almost all tests reference to the same Dag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks. Would it be a good idea to only delete the generated log file instead though? (And add a comment explaining why)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uranusjr in the PR I tried not to make any assumptions regarding log file name. I will add a comment describing why manual log file deletion is required and will find more clear to delete created log file.

@potiuk
Copy link
Member

potiuk commented Sep 6, 2022

@potiuk @uranusjr 3 Helm tests failed during last CI run, are they flaky or should I fix them first?

@kokrin - your goal is to make it green. And whether to do it is something that you shoud be able to assess yourself I think

  1. assume it's you first and look at the actual log/error if you can figure if it is related
  2. if this is a once-off failure among multiple similar jobs and seems unrelated/networking problem - commit --amend, push and check if it is reproduced
  3. if other PRs and main are failing with similar problems - assume it is a shared/main problem and raise awareness (here or in #development slack)
  4. it the log seems releated and failure repeats in several similar jobs - reproduce locally and fix (the K8S tests have now very nice, reproducible path that you can run locally - see the exact recipe in TESTING.rst
  5. if you are lost after doing all the above - ping us :)

I think this is a good recipe and I wonder if following it will help in this case (I am using this a bit of a testing ground and if it works I will try to make contributors aware of this "approach" they should use - I am looking for a ways to scale the PR process better. Let me know @kokorin - and maybe if you have other suggestions how to improve the "process" that woudl be awesome.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 6, 2022

@potiuk sorry but my first question was if the tests in question were flaky. The same tests fail in #26162.
In both PR tests fail inside breeze: airflow_breeze/commands/kubernetes_commands.py
image

@potiuk
Copy link
Member

potiuk commented Sep 6, 2022

@potiuk sorry but my first question was if the tests in question were flaky. The same tests fail in #26162. In both PR tests fail inside breeze: airflow_breeze/commands/kubernetes_commands.py image

What I mean, if you already did points 1) 2) 3) - just mention it and when you ask if this is a failure just point to the other similar issues :) . Try to put. your self in our shoes. When you ask such question and you do not mention that you already looked at it, the first assumption I have is .. that you are lazy and you have not checked just try to offload the check to us :).
No judgment here, just imagine how it looks like:

a) @potiuk @uranus - should I fix it ?

vs.

b) It looks like the error here is similar to error in #26162 , and it does not loook related to my changes. @potiuk @uranusjr - can you help please?

The first looks like you have not made any effort to check and you are lazy. The second shows that you've "done your part". Context is everything. My reaction to b) is vastly different than a) :).

Hope no hard feelings - just wantted to show you how it looks like if you are too brief.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 7, 2022

CI steps "Build CI Images" fail with

  #7 [internal] load metadata for docker.io/library/python:3.10-slim-bullseye
  ------
   > pushing ghcr.io/apache/airflow/main/ci/python3.10:71f9e1d391969806a7ec44bbb2449072b655b6d7 with docker:
  ------
  error: denied: permission_denied: write_package

Probably, it's not related to changes in this MR. Please, let me know if it's not true.

@uranusjr
Copy link
Member

uranusjr commented Sep 7, 2022

The CI images job is failing for some other PRs as well, so perhaps it is related to some issues in GitHub. I have restarted the job to see if the issue is temporary.

@potiuk
Copy link
Member

potiuk commented Sep 7, 2022

The CI images job is failing for some other PRs as well, so perhaps it is related to some issues in GitHub. I have restarted the job to see if the issue is temporary.

Yeah. I think GitHub had a rough day yesterday ... Many things were failing here and there

@kokorin
Copy link
Contributor Author

kokorin commented Sep 7, 2022

@potiuk @uranusjr looking forward to merge :-)

@potiuk
Copy link
Member

potiuk commented Sep 9, 2022

I fixed the K8S test problems in #26235 . 🤞

@potiuk
Copy link
Member

potiuk commented Sep 9, 2022

Rebased to re-run.

@kokorin
Copy link
Contributor Author

kokorin commented Sep 13, 2022

@uranusjr @potiuk do you think this PR can be merged now? The only test failures are related to help and (probably) are flaky. The only failed static is related to documentation ("a/docs/apache-airflow/img/airflow_erd.sha256") and changes this PR should not be the reason of its failure.

@potiuk
Copy link
Member

potiuk commented Sep 18, 2022

@uranusjr @potiuk do you think this PR can be merged now? The only test failures are related to help and (probably) are flaky. The only failed static is related to documentation ("a/docs/apache-airflow/img/airflow_erd.sha256") and changes this PR should not be the reason of its failure.

In such cases, you can always rebase the PR to check (it's as easy as choosing "rebase" at the bottom right of the PR if there are no conflicts). I did it - but next time you can rebase yourself without waiting/pinging us (let's see if it succeed - please watch this and check and ping me if it does, otherwise I will come back to this one next time when I go through about 100/200 mails so it might take some time).

@kokorin
Copy link
Contributor Author

kokorin commented Sep 19, 2022

@potiuk all tests passed

@potiuk potiuk merged commit 9aa589e into apache:main Sep 19, 2022
@boring-cyborg
Copy link

boring-cyborg bot commented Sep 19, 2022

Awesome work, congrats on your first merged pull request!

@potiuk
Copy link
Member

potiuk commented Sep 19, 2022

Cool!

@kokorin kokorin deleted the fix/git-clone-on-windows branch September 19, 2022 14:23
ephraimbuddy pushed a commit that referenced this pull request Nov 10, 2022
(cherry picked from commit 9aa589e)
ephraimbuddy pushed a commit that referenced this pull request Nov 10, 2022
(cherry picked from commit 9aa589e)
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Nov 10, 2022
@ephraimbuddy ephraimbuddy added this to the Airflow 2.4.3 milestone Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants