Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

On Windows use tempname for mktemp #33913

Closed
wants to merge 2 commits into from
Closed

Conversation

musm
Copy link
Contributor

@musm musm commented Nov 21, 2019

On top of #33785

From https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-gettempfilenamea

Due to the algorithm used to generate file names, GetTempFileName can perform poorly when creating a large number of files with the same prefix. In such cases, it is recommended that you construct unique file names based on GUIDs.

Here we simply use tempname which uses GUIDs to generate unique file name, as recommended. Whether this actually addresses 'can perform poorly when creating a large number of files' is kinda tangential since we prefix jl_ in tempname. This I think more importantly addresses the uniqueness issue, since the UIUD's generated should be unique and there is a lower probability to clash.

@musm musm force-pushed the tempname branch 2 times, most recently from ba77003 to 4c3721a Compare November 22, 2019 05:03
base/file.jl Outdated Show resolved Hide resolved
@musm musm changed the title Refactor tempname and use it for mktemp as well On Windows use tempname for mktemp Nov 26, 2019
@musm
Copy link
Contributor Author

musm commented Nov 30, 2019

libuv PR to implement mkstemp libuv/libuv#2557

@StefanKarpinski
Copy link
Member

How does this end up formatting the temp file names? I should perhaps point out that many Windows systems have a 260 bytes path length limit, so using a full UUID string in the temp file name is quite likely to cause problems since it makes the path so long. Perhaps just crunch it down to some 8-10 chars chosen from "0-9a-zA-Z"?

@musm
Copy link
Contributor Author

musm commented Dec 4, 2019

Ideally we can upgrade to libuv 1.34 which would allow a cross-platform solution without special casing.

How does this end up formatting the temp file names? I should perhaps point out that many Windows systems have a 260 bytes path length limit, so using a full UUID string in the temp file name is quite likely to cause problems since it makes the path so long. Perhaps just crunch it down to some 8-10 chars chosen from "0-9a-zA-Z"?

In practice this isn't a huge concern. Other applications use UUID's for temporary folders, and microsoft's documentation recommends their usage for unique paths.

@StefanKarpinski
Copy link
Member

In practice this isn't a huge concern.

In practice, I tried using full uuids in paths when I was I was developing Pkg3 and it did cause problems because of the Windows path length. So, I'm a lot less optimistic that this will be fine.

@musm
Copy link
Contributor Author

musm commented Dec 4, 2019

In practice, I tried using full uuids in paths when I was I was developing Pkg3 and it did cause problems because of the Windows path length. So, I'm a lot less optimistic that this will be fine.

Is this because there were deeply nested folder structures within the temporary directory?

In any case I'm hoping we can upgrade to libuv 1.34

@StefanKarpinski
Copy link
Member

Yes, it's fairly common for packages, artifacts, etc. to have quite deeply nested paths. I'm kind of shocked that we don't hit the 260 limit more often without lopping off ~40 chars gratuitously. We can also just not use all of the UUID data to generate a random slug. Using 10 chars from 0-9a-zA-Z is plenty of randomness: it would take 60 million temp files to have a 50% chance of a collision even on a case insensitive file system and nearly a billion on a case sensitive file system.

@musm
Copy link
Contributor Author

musm commented Dec 4, 2019

I'm kind of shocked that we don't hit the 260 limit more often without lopping off ~40 chars gratuitously.

Yea,h on Win 10 the path limit is no longer an issue since the API has been extended to allow much longer paths, which is probably why we don't see it in practice as often. Libuv still doesn't gracefully support extended paths and by extension some function is Julia have troubles with dealing with extended paths.

But in any case, I'm in favor of closing this and using the updated libuv implementation when we upgrade.

@musm musm closed this Dec 4, 2019
@StefanKarpinski
Copy link
Member

The problems I did see when using longs paths in Pkg3 were actually from tools that did weird things in case of the 260 char path limit. For example, Conda changes the path to python in the shebang line of scripts where it is installed from being the correct path to its python to #!/usr/bin/env python if the path to Python is longer than 260 chars on Windows. This, of course, ends up calling the wrong Python binary, wreaking all sorts of havoc.

@StefanKarpinski
Copy link
Member

Why not just modify this PR to crunch the GUID down to 10 chars from 0-9a-zA-Z instead?

@musm
Copy link
Contributor Author

musm commented Dec 4, 2019

Should we do that until we upgrade libuv, as a temporary solution till that time?

@StefanKarpinski
Copy link
Member

I guess the libuv solution is better, so we can always revive this if that doesn't work out 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants