@kikairoya
Contributor

@kikairoya kikairoya commented Oct 17, 2025

lit.util.mkdir and lit.util.mkdir_p were written during the Python 2.x era.
Since modern pathlib functions have similar functionality, we can simply use those instead.

If you encounter a path length issue after this change, the registry value LongPathsEnabled must be set as described in https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation . Note that the Python runtime is already marked as a longPathAware executable.

Background:

On Cygwin, a file named file_name.exe can be accessed without the suffix, simply as file_name, as shown below:

$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)

In this situation, while running mkdir file_name works as intended, checking for the existence of the target before calling mkdir incorrectly reports that it already exists and thus skips the directory creation.

$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory

Therefore, the existence pre-check should be skipped on Cygwin. Instead of adding a workaround, I refactored the helpers.
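
The failure mode can be reproduced on any platform with a small simulation. This is only a sketch: `cygwin_like_exists` is a made-up stand-in that imitates Cygwin's `.exe` aliasing, and the two `mkdir_*` helpers are hypothetical names, not lit functions.

```python
import os
import tempfile


def cygwin_like_exists(path):
    """Stand-in for Cygwin's lookup, where `name` also matches `name.exe`.

    Assumption for illustration only; real Cygwin does this inside the
    filesystem layer, not in Python.
    """
    return os.path.exists(path) or os.path.exists(path + ".exe")


def mkdir_with_precheck(path, exists=cygwin_like_exists):
    """Old pattern: skip creation when the pre-check says the path exists."""
    if exists(path):
        return
    os.mkdir(path)


def mkdir_without_precheck(path):
    """New pattern: just create it and tolerate an existing directory."""
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    # Create file_name.exe, then try to create a directory named file_name.
    open(os.path.join(tmp, "file_name.exe"), "w").close()
    target = os.path.join(tmp, "file_name")

    mkdir_with_precheck(target)
    skipped = not os.path.isdir(target)  # the pre-check suppressed creation

    mkdir_without_precheck(target)
    created = os.path.isdir(target)
```

With the pre-check, the directory is silently never created; without it, the creation call itself decides whether anything is in the way.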

@llvmbot
Member

llvmbot commented Oct 17, 2025

@llvm/pr-subscribers-testing-tools

Author: Tomohiro Kashiwada (kikairoya)

Changes

On Cygwin, a file named file_name.exe can be accessed without the suffix, simply as file_name, as shown below:

$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)

In this situation, while running mkdir file_name works as intended, checking for the existence of the target before calling mkdir incorrectly reports that it already exists and thus skips the directory creation.

$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory

Therefore, the existence pre-check should be skipped on Cygwin. If the target actually already exists, such an error will be ignored anyway.


Full diff: https://github.com/llvm/llvm-project/pull/163948.diff

1 Files Affected:

  • (modified) llvm/utils/lit/lit/util.py (+1-1)
diff --git a/llvm/utils/lit/lit/util.py b/llvm/utils/lit/lit/util.py
index ce4c3c2df3436..a5181ab20a7e1 100644
--- a/llvm/utils/lit/lit/util.py
+++ b/llvm/utils/lit/lit/util.py
@@ -164,7 +164,7 @@ def mkdir(path):
 def mkdir_p(path):
     """mkdir_p(path) - Make the "path" directory, if it does not exist; this
     will also make directories for any missing parent directories."""
-    if not path or os.path.exists(path):
+    if not path or (sys.platform != "cygwin" and os.path.exists(path)):
         return
 
     parent = os.path.dirname(path)

@kikairoya
Contributor Author

cc: @jeremyd2019 @mstorsjo

@arichardson
Member

So does that mean any use of os.path.exists() is broken on cygwin? Maybe we could change it to is_dir?

Member

@arichardson arichardson left a comment

Or maybe now that we depend on newer python we can just use the exist_ok=True parameter for mkdir?

@kikairoya
Contributor Author

> So does that mean any use of os.path.exists() is broken on cygwin? Maybe we could change it to is_dir?

I think that, in general, most such checks should be avoided (cf. TOCTOU, time-of-check to time-of-use).

> Or maybe now that we depend on newer python we can just use the exist_ok=True parameter for mkdir?

Sure. I'll make a change to use it.
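
To illustrate the TOCTOU point, here is a small sketch (not lit code) in which several threads create the same nested directory at once. The function name `create_concurrently` and the worker count are made up for the example. With `os.makedirs(..., exist_ok=True)`, a directory that another caller created in the meantime is tolerated, so no caller needs its own existence check.

```python
import os
import tempfile
import threading


def create_concurrently(path, workers=8):
    """Call os.makedirs on the same path from several threads.

    Returns the list of OSError exceptions raised by any worker.
    Hypothetical helper for illustration only.
    """
    errors = []

    def worker():
        try:
            # No os.path.exists() pre-check by the caller: the call itself
            # handles the "someone else created it first" case.
            os.makedirs(path, exist_ok=True)
        except OSError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "x", "y", "z")
    errors = create_concurrently(target)
    is_dir = os.path.isdir(target)
```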

On Cygwin, a file named `file_name.exe` can be accessed without the suffix,
simply as `file_name`, as shown below:

```
$ echo > file_name.exe

$ file file_name.exe
file_name.exe: very short file (no magic)

$ file file_name
file_name: very short file (no magic)
```

In this situation, while running `mkdir file_name` works as intended,
checking for the existence of the target before calling `mkdir`
incorrectly reports that it already exists and thus skips the directory creation.

```
$ test -e file_name && echo exists
exists

$ mkdir file_name && echo ok
ok

$ file file_name
file_name: directory
```

Therefore, the existence pre-check should be skipped on Cygwin.
If the target actually already exists, such an error will be ignored anyway.
@kikairoya
Contributor Author

mkdir_p now simply forwards to os.makedirs.
Should it be inlined? It is only called from two locations in TestRunner.py.

mkdir can also be inlined, as it is now only called from one location, but I'm not sure whether the call to CreateDirectoryW can be replaced with just an os.mkdir.

mkdir_p(parent)

mkdir(path)
os.makedirs(path, exist_ok=True)
Member

This looks much simpler, nice. Not a Windows expert but do we need to remap long paths using something like this?

Suggested change
-os.makedirs(path, exist_ok=True)
+if platform.system() == "Windows":
+    if not path.startswith("\\\\?\\"):
+        path = "\\\\?\\" + path
+os.makedirs(path, exist_ok=True)

Member

Not sure - do our Python scripts do such path mangling anywhere else?

Member

They do in the def mkdir(path): above.

It sounds like Python 3.6 and later support long paths if they are enabled in the registry, so there should be no need to work around it here anymore:
https://bugs.python.org/issue27731 -> https://hg.python.org/cpython/rev/26601191b368
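
For anyone who wants to check that registry setting, the following is a rough sketch using the standard `winreg` module. The key path is the one given in the Microsoft documentation on the maximum path length limitation; the function name `long_paths_enabled` is an assumption for this example, and on non-Windows platforms (including Cygwin's Python) it simply returns `None`.

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled value, or None if unknown.

    Hypothetical helper for illustration; returns None on non-Windows
    platforms or when the value cannot be read.
    """
    if sys.platform != "win32":
        return None
    import winreg  # only available on native Windows Python

    try:
        with winreg.OpenKey(
            winreg.HKEY_LOCAL_MACHINE,
            r"SYSTEM\CurrentControlSet\Control\FileSystem",
        ) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        # Key or value missing, or access denied.
        return None
    return value == 1


status = long_paths_enabled()
```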

Member

Oh, I see.

@arichardson
Member

From what I've read, pathlib.Path handles long Windows paths correctly, so maybe the best solution would be to migrate callers of these helpers towards pathlib?

@kikairoya
Contributor Author

OK, I'll try to replace both mkdir and mkdir_p with direct calls to the pathlib functions.

    os.mkdir(path)
except OSError:
    e = sys.exc_info()[1]
    # ignore EEXIST, which may occur during a race condition
Contributor Author

Previously, all EEXIST errors were ignored.
We should indeed ignore races between concurrent `mkdir target` calls, but I don't see any reason to tolerate a race between `touch target` and `mkdir target`.
Since pathlib's exist_ok=True behaves this way, the test shtest-glob.py has been updated to reflect the new behavior.
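
The behavior described here can be checked with a short sketch. It relies only on the documented `pathlib.Path.mkdir` semantics: with `exist_ok=True`, an existing directory is accepted, but an existing non-directory still raises `FileExistsError`. The file and directory names are placeholders.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    base = pathlib.Path(tmp)

    # Case 1: the target already exists as a directory -> accepted.
    existing_dir = base / "example_dir"
    existing_dir.mkdir()
    existing_dir.mkdir(exist_ok=True)
    dir_ok = existing_dir.is_dir()

    # Case 2: the target already exists as a regular file -> still an error.
    existing_file = base / "example_file"
    existing_file.touch()
    try:
        existing_file.mkdir(exist_ok=True)
        file_raised = False
    except FileExistsError:
        file_raised = True
```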

@kikairoya kikairoya changed the title from "[LIT][Cygwin] Skip pre-check for existence in mkdir-p" to "[LIT] replace lit.util.mkdir with pathlib.Path.mkdir" Oct 18, 2025
Contributor

@RoboTux RoboTux left a comment

LGTM otherwise

# RUN: mkdir %S/example_dir1.new
# RUN: mkdir %S/example_dir2.new

## This mkdir should succeed (so RUN should fail) because the `example_dir*.new`s are directories already exist.
Contributor

Nit: remove 'are'

Contributor Author

Thank you for the review.
I meant to write, "the example_dir*.news are directories" rather than "the example_dir*.news already exist".
Would "the example_dir*.news that already exist are directories." be correct?

Contributor

Yes that would be fine, thanks.

Contributor Author

Thanks!

@kikairoya
Contributor Author

I've verified that the latest version resolves my initial problem with *.exe files.
The title and the description have been updated to reflect the changes.

@kikairoya kikairoya requested a review from arichardson October 22, 2025 11:11
Contributor

@RoboTux RoboTux left a comment

LGTM thanks for the work

@kikairoya
Contributor Author

@arichardson Could you take another look?

Member

@arichardson arichardson left a comment

One comment on the test otherwise LGTM.

Comment on lines 4 to 10
# RUN: not mkdir %S/example_file*.input

# RUN: mkdir %S/example_dir1.new
# RUN: mkdir %S/example_dir2.new

## This mkdir should succeed (so RUN should fail) because the `example_dir*.news` that already exist are directories.
# RUN: not mkdir %S/example_dir*.new
Member

Suggested change
-# RUN: not mkdir %S/example_file*.input
-# RUN: mkdir %S/example_dir1.new
-# RUN: mkdir %S/example_dir2.new
-## This mkdir should succeed (so RUN should fail) because the `example_dir*.news` that already exist are directories.
-# RUN: not mkdir %S/example_dir*.new
+# RUN: rm -rf %t && mkdir %t
+# RUN: cp %S/example_file*.input %t
+# RUN: not mkdir %t/example_file*.input
+# RUN: mkdir %t/example_dir1.new
+# RUN: mkdir %t/example_dir2.new
+## This mkdir should succeed (so RUN should fail) because the `example_dir*.news` that already exist are directories.
+# RUN: not mkdir %t/example_dir*.new

I think we need to do something like this, since we shouldn't be writing to the source directory when running tests. It might be good to have a builder that bind-mounts the source directory as read-only.

Contributor Author

Thank you for the suggestion.

In these particular cases, the Inputs are copied to the build directory and the tests run from there, so they won't try to write to the source tree.

add_custom_target(prepare-check-lit
  COMMAND ${CMAKE_COMMAND} -E remove_directory "${CMAKE_CURRENT_BINARY_DIR}/tests"
  COMMAND ${CMAKE_COMMAND} -E copy_directory "${CMAKE_CURRENT_SOURCE_DIR}/tests" "${CMAKE_CURRENT_BINARY_DIR}/tests"
  COMMAND ${CMAKE_COMMAND} -E copy "${CMAKE_CURRENT_BINARY_DIR}/lit.site.cfg" "${CMAKE_CURRENT_BINARY_DIR}/tests"
  COMMENT "Preparing lit tests"
)

That said, I agree that it's not good that they appear to write to the source directory.
I'd like to avoid using another command in this test, so I've updated it to include the example_dirs in the source tree.

Member

Ah I was not aware that %S is actually in the build tree. LGTM

@mstorsjo
Member

mstorsjo commented Nov 3, 2025

So if everyone is OK with this, I guess we can merge it?

@kikairoya
Contributor Author

Yes, please -- thank you as always!

@mstorsjo mstorsjo merged commit 84cc2b0 into llvm:main Nov 4, 2025
10 checks passed
@kikairoya
Contributor Author

kikairoya commented Nov 4, 2025

https://lab.llvm.org/buildbot/#/builders/123/builds/30061
https://lab.llvm.org/buildbot/#/builders/210/builds/4853/steps/12/logs/stdio

Some bots fail in litsupport/test.py because lit.util.mkdir_p is missing, but that seems to be external code.
What should I do about this? Would restoring lit.util.mkdir_p and lit.util.mkdir as wrappers that delegate to pathlib be enough?
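
One possible shape for such a compatibility shim is sketched below. It only illustrates the question being asked: the thread does not show these wrappers being restored, and the function bodies are assumptions rather than code from lit.

```python
import pathlib
import tempfile


def mkdir(path):
    """Create a single directory; an existing directory is not an error.

    Sketch of a wrapper that delegates to pathlib (assumed, not lit code).
    """
    pathlib.Path(path).mkdir(exist_ok=True)


def mkdir_p(path):
    """Create `path` and any missing parents, like `mkdir -p`.

    Sketch of a wrapper that delegates to pathlib (assumed, not lit code).
    """
    pathlib.Path(path).mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    nested = pathlib.Path(tmp) / "out" / "logs"
    mkdir_p(nested)                  # creates out/ and out/logs/
    mkdir(nested.parent / "cache")   # parent out/ already exists
    results = (nested.is_dir(), (nested.parent / "cache").is_dir())
```

External callers that import `lit.util.mkdir_p` would keep working with wrappers like these, at the cost of keeping two trivial helpers around.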

CI is back to green.

kikairoya added a commit to kikairoya/llvm-test-suite that referenced this pull request Nov 4, 2025
The helper function `lit.util.mkdir_p` has been removed in
llvm/llvm-project#163948 .

Instead of that, call `pathlib` functions directly.
@kikairoya kikairoya deleted the cygwin-lit-mkdir-p branch November 4, 2025 12:17
jplehr pushed a commit to llvm/llvm-test-suite that referenced this pull request Nov 4, 2025
The helper function `lit.util.mkdir_p` has been removed in
llvm/llvm-project#163948 .

Instead of that, call `pathlib` functions directly.