Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Racey Could not resolve file errors despite files existing #1101

Closed
airhorns opened this issue Apr 2, 2021 · 16 comments
Closed

Racey Could not resolve file errors despite files existing #1101

airhorns opened this issue Apr 2, 2021 · 16 comments

Comments

@airhorns
Copy link

airhorns commented Apr 2, 2021

When running a working esbuild config in a new environment, I get frequent but inconsistent Could not resolve <file> errors despite the files in question existing just fine.

I'm trying to get an esbuild configuration that works locally on my OS X machine and in a Github Actions Ubuntu runner onto another Linux based CI system. The exact same config works fine always in the other environments, but only sometimes works in the new environment. I am building a big TypeScript project for node by passing ~200 or so .ts entrypoint files to the JS build API.

The files that esbuild fails to resolve change each time, but each time it does fail, it's all of the files in a particular folder that fail to resolve:

error: Could not resolve "/app/packages/api/src/routes/edit/graphql/1_scalars.ts"
error: Could not resolve "/app/packages/api/src/routes/edit/graphql/connections.ts"
error: Could not resolve "/app/packages/api/src/routes/edit/graphql/scope.ts"
error: Could not resolve "/app/packages/api/src/routes/edit/graphql/auth.ts"
error: Could not resolve "/app/packages/api/src/routes/edit/graphql/liveTreeSchema.ts"

I don't have any resolve plugins set up and I am running on v0.11.1.

Debugging

I tried adding a resolve plugin that runs an fs.promises.access on each file and then returns undefined to prove that the same process invoking esbuild can access the file fine, and it indeed logged successful accesses. Because of this and because the build works sometimes I think the files really are there.

Curiously, strace-ing esbuild makes it fail to resolve some files almost every time I invoke it with strce. I don't see anything amiss in the strace, but here's an example trace for a file that fails to resolve:

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetMetaSchema.ts", O_RDONLY|O_CLOEXEC) = 6
4801  epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2300782936, u64=140580875348312}} <unfinished ...>
4801  <... epoll_ctl resumed> )         = 0
4801  fcntl(6, F_GETFL)                 = 0x8000 (flags O_RDONLY|O_LARGEFILE)
4801  fcntl(6, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
4801  getdents64(6, 0xc0005da000, 8192) = -1 ENOTDIR (Not a directory)
4801  epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc00019296c <unfinished ...>
4801  <... epoll_ctl resumed> )         = 0
4801  close(6)                          = 0
4801  write(2, "\33[1m > \33[31merror: \33[0m\33[1mCould"..., 124) = 124
4801  newfstatat(AT_FDCWD, "/app/packages/api/src/services/live-tree-server/listeners", {st_mode=S_IFDIR|0775, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0

To me it seems like opening the file succeeds just fine, and then esbuild writes the error: Could ... string to stderr.

I was running strace -f -e '!futex,read' yarn <my-esbuild-task> which ends up running esbuild with these options:

{
        entryPoints: [...<filenames>],
        incremental: true,
        bundle: false,
        platform: "node",
        format: "cjs",
        target: ["node14"],
        outdir: <dir>,
        outbase: <dir>,
        sourcemap: "inline",
}

see https://github.com/gadget-inc/esbuild-dev/blob/main/src/Compiler.ts#L98-L106 for the source

For contrast, here's what the strace looks like for esbuild reading a file successfully early:

4801  newfstatat(AT_FDCWD, "/app/tsconfig.json", {st_mode=S_IFREG|0664, st_size=422, ...}, AT_SYMLINK_NOFOLLOW) = 0
4801  newfstatat(AT_FDCWD, "/app/tsconfig.json",  <unfinished ...>
4801  <... newfstatat resumed> {st_mode=S_IFREG|0664, st_size=422, ...}, 0) = 0
4801  openat(AT_FDCWD, "/app/tsconfig.json", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2300782936, u64=140580875348312}} <unfinished ...>
4801  <... epoll_ctl resumed> )         = 0
4801  fcntl(6, F_GETFL)                 = 0x8000 (flags O_RDONLY|O_LARGEFILE)
4801  fcntl(6, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE <unfinished ...>
4801  <... fcntl resumed> )             = 0
4801  fstat(6, {st_mode=S_IFREG|0664, st_size=422, ...}) = 0
4801  epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc00013024c <unfinished ...>
4801  <... epoll_ctl resumed> )         = 0
4801  close(6 <unfinished ...>
4801  <... close resumed> )             = 0

I realize this isn't a super reproducible issue, but I am hoping that these are enough breadcrumbs that we could figure out what next steps to take to unearth more information. Also notable is that this seems similar to #348 , but ulimit -n is 65536 in this environment, and I am running as root.

Any other information I could gather to help figure out what might be causing this heisenbug?

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

This is odd... the successfully opened file is being queried as a directory:

4801  getdents64(6, 0xc0005da000, 8192) = -1 ENOTDIR (Not a directory)

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

Might you be able to strace all previous file descriptor names using 6 (or whatever the failing fd is in the new run)? If I had to guess, in incremental mode some goroutine thinks an old directory fd is still live.

@evanw
Copy link
Owner

evanw commented Apr 2, 2021

Thanks for all of the detail in this issue description. Unfortunately I'm not quite sure what the problem is here (although I agree ENOTDIR sounds potentially relevant). I'm going to add a verbose logging mode in the next release that might provide more information. When it's released, you'll be able to enable it with --log-level=debug.

Edit: It's been released in version 0.11.3. Please try it out and let me know if it provides any additional information for this problem.

@airhorns
Copy link
Author

airhorns commented Apr 2, 2021

Super useful, thanks Evan!

Here's a couple examples of the resolution phase failing (for two different runs):

 > debug: Resolving import "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" in directory "/app" of type "entry-point"
   note: The import "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" is being treated as an absolute paths
   note: Read 30 entries for directory "/app"
   note: Attempting to load "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" as a file
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.tsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.ts"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.jsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.js"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.css"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts.json"
   note: Attempting to load "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts" as a directory
   note: Read 7 entries for directory "/app/packages/api/spec/services"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.tsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.ts"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.jsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.js"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.css"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/index.json"
   note: Read 0 entries for directory "/app/packages/api/spec/services/docs-search"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.tsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.ts"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.jsx"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.js"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.css"
   note: Failed to find file "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts/index.json"
   note: Read 0 entries for directory "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"

 > error: Could not resolve "/app/packages/api/spec/services/docs-search/DocsSearchIndexer.spec.ts"
 > debug: Resolving import "/app/packages/api/spec/routes/render/utils.ts" in directory "/app" of type "entry-point"
   note: The import "/app/packages/api/spec/routes/render/utils.ts" is being treated as an absolute paths
   note: Read 30 entries for directory "/app"
   note: Attempting to load "/app/packages/api/spec/routes/render/utils.ts" as a file
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.tsx"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.ts"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.jsx"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.js"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.css"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts.json"
   note: Attempting to load "/app/packages/api/spec/routes/render/utils.ts" as a directory
   note: Read 1 entry for directory "/app/packages/api/spec/routes/render"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.tsx"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.ts"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.jsx"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.js"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.css"
   note: Failed to find file "/app/packages/api/spec/routes/render/utils.ts/index.json"
   note: Read 0 entries for directory "/app/packages/api/spec/routes/render/utils.ts"

 > error: Could not resolve "/app/packages/api/spec/routes/render/utils.ts"

Here's the actual file:

root@ubuntu:~# ls -lA "/app/packages/api/spec/routes/render/utils.ts"
-rw-r--r-- 1 root root 2750 Apr  2 14:42 /app/packages/api/spec/routes/render/utils.ts

Very suspicious to me is the note: Read 1 entry for directory "/app/packages/api/spec/routes/render" from that log -- there are 6 files in that directory.

root@ubuntu:~# ls -lA /app/packages/api/spec/routes/render/
total 32
-rw-rw-r-- 1 root root 7739 Apr  2 14:42 app-access.spec.ts
-rw-rw-r-- 1 root root 1255 Apr  2 14:42 client-side-app.spec.ts
-rw-rw-r-- 1 root root  630 Apr  2 14:42 docs-redirect.spec.ts
drwxrwxr-x 2 root root 4096 Apr  2 14:42 graphql
-rw-rw-r-- 1 root root 5705 Apr  2 14:42 login-logout.spec.ts
-rw-rw-r-- 1 root root 2750 Apr  2 14:42 utils.ts

For comparison, here's an example of a successful resolution from one of the same runs:

> debug: Resolving import "/app/packages/api/src/routes/edit/graphql/connections.ts" in directory "/app" of type "entry-point"
   note: The import "/app/packages/api/src/routes/edit/graphql/connections.ts" is being treated as an absolute paths
   note: Read 30 entries for directory "/app"
   note: Attempting to load "/app/packages/api/src/routes/edit/graphql/connections.ts" as a file
   note: Found file "/app/packages/api/src/routes/edit/graphql/connections.ts"
   note: Read 9 entries for directory "/app/packages/api/src/routes/edit/graphql"
   note: This import is under the effect of "/app/tsconfig.json"
   note: Primary path is "/app/packages/api/src/routes/edit/graphql/connections.ts" in namespace "file"

It would appear that the file isn't being found at that lowest level of resolution in loadAsFile.

Might you be able to strace all previous file descriptor names using 6 (or whatever the failing fd is in the new run)? If I had to guess, in incremental mode some goroutine thinks an old directory fd is still live.

It is always fd 6 and 6 is used for almost every read I see in the process doing. I think (but am not certain) that the esbuild code that reads files like tsconfig.json is different than the code that reads files to resolve them, and that all resolutions including successful ones do that getdents call. I think that is

dirInfo := r.dirInfoCached(path)
or
if entries, err, originalError := s.fs.ReadDirectory(dir); err == nil {
if entry, _ := entries.Get(base); entry != nil && entry.Kind(s.fs) == fs.FileEntry {
but I am not really sure.

Here's the whole strace if you are curious.

Also, I am equally suspicious of the new environment that I am running in as I am of esbuild -- it is the thing that is different between the case that works fine for me and the case that doesn't. It may be some other system level limit or something I haven't configured properly so apologies if this is just my fault, hopefully we can teach esbuild to emit an error message that makes it clear who is at fault though!

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

Very suspicious to me is the note: Read 1 entry for directory "/app/packages/api/spec/routes/render" from that log -- there are 6 files in that directory.

The strace log mirrors that observation, albeit with a different directory from a different run.

Here's the log fragment for the directory query of /app/packages/api/src/services/app-graphql-api/meta with some extraneous system calls removed:

4803  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta", O_RDONLY|O_CLOEXEC) = 6
4800  tgkill(4799, 4803, SIGURG <unfinished ...>
4803  getdents64(6 <unfinished ...>
4800  <... tgkill resumed> )            = 0
4803  <... getdents64 resumed> , /* 1 entries */, 8192) = 40
4803  --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4799, si_uid=0} ---
4803  rt_sigreturn({mask=[]})           = 40
4803  getdents64(6 <unfinished ...>
4803  <... getdents64 resumed> , /* 0 entries */, 8192) = 0
4803  close(6)                          = 0

Notice getdents64 had one directory entry reported before it was interrupted with SIGURG, which presumably is from the Go runtime or third party library because I don't see it in the esbuild code base. After that signal, 0 entries were reported indicating no more files in the directory. But it is notable that this same directory later has a number of files incorrectly queried as directories:

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetMetaSchema.ts", O_RDONLY|O_CLOEXEC) = 6
4801  getdents64(6, 0xc0005da000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetModelType.ts", O_RDONLY|O_CLOEXEC) = 6
4801  getdents64(6, 0xc0005e4000, 8192) = -1 ENOTDIR (Not a directory)

4807  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetPageType.ts", O_RDONLY|O_CLOEXEC) = 6
4807  getdents64(6, 0xc00035a000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetRoleType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  fcntl(6, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
4801  getdents64(6 <unfinished ...>
4801  <... getdents64 resumed> , 0xc0005e8000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/HydrationPlan.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  getdents64(6, 0xc0005ea000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/actionInputSDL.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  getdents64(6, 0xc0005ec000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetAuthenticationMethodType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  getdents64(6 <unfinished ...>
4801  <... getdents64 resumed> , 0xc0005f0000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetFieldType.ts", O_RDONLY|O_CLOEXEC <unfinished ...>
4801  <... openat resumed> )            = 6
4801  getdents64(6 <unfinished ...>
4801  <... getdents64 resumed> , 0xc0005f6000, 8192) = -1 ENOTDIR (Not a directory)

4801  openat(AT_FDCWD, "/app/packages/api/src/services/app-graphql-api/meta/GadgetGlobalActionType.ts", O_RDONLY|O_CLOEXEC) = 6
4801  getdents64(6 <unfinished ...>
4801  <... getdents64 resumed> , 0xc0005f8000, 8192) = -1 ENOTDIR (Not a directory)

So esbuild's directory resolving logic appears to be reasonable, but I suspect that the system was interrupted while interrogating the files in the directory and it may have given an incorrect result. So this might be a Go runtime issue related to interrupting getdents64 on Linux with SIGURG. Perhaps esbuild could defensively introduce code to invalidate the resolve cache for a (probable) file's parent directory in the presence of ENOTDIR.

@airhorns
Copy link
Author

airhorns commented Apr 2, 2021

Right. I think the reason those files are later queried is because they are passed in as entrypoints to the build explicitly -- the tool invoking esbuild does a glob for all the js/ts files on disk and passes them all in as entrypoints. So, that glob which uses fast-glob underneath discovered the files just fine.

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

Can you confirm that all the files in question existed in those directories prior to running esbuild?

@airhorns
Copy link
Author

airhorns commented Apr 2, 2021

I think for the glob to have found them they had to have existed, and to my knowledge there's nothing that would delete them while esbuild was running? Subsequent runs in the exact same place succeed as well so I don't think they are coming and going but hey -- this is really weird, who knows.

Here's a few more full straces of failed executions:

http://paste.ubuntu.com/p/JxQ9F4MrgS/ failed with

error: Could not resolve "/app/packages/api/src/services/edit-graphql-api/EditGraphQLContext.ts"
error: Could not resolve "/app/packages/api/src/services/edit-graphql-api/formatError.ts"
error: Could not resolve "/app/packages/api/src/__generated__/edit-resolvers.ts"
error: Could not resolve "/app/packages/api/src/routes/test-support/scope.ts"

http://paste.ubuntu.com/p/QY8PWyVwR7/ failed with

error: Could not resolve "/app/packages/api/spec/fixtures/createFixture.ts"
error: Could not resolve "/app/packages/api/spec/fixtures/fixtureLoader.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/DatabaseOperation.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/FieldManipulator.ts"
error: Could not resolve "/app/packages/api/src/services/app-database/InternalValidateRecord.ts"

http://paste.ubuntu.com/p/8ksHkVrD9N/ failed with

error: Could not resolve "/app/packages/api/src/routes/test-support/scope.ts"
error: Could not resolve "/app/packages/api/src/__generated__/edit-resolvers.ts"
error: Could not resolve "/app/packages/api/src/workers/AppDependenciesSetup.ts"
error: Could not resolve "/app/packages/api/src/workers/CreateUniqueIndexConverges.ts"
error: Could not resolve "/app/packages/api/src/workers/GadgetQueue.ts"

In the first run there this is the strace snippet for listing of the directory the missing files are in

4443  openat(AT_FDCWD, "/app/packages/api/src/services/edit-graphql-api", O_RDONLY|O_CLOEXEC <unfinished ...>
4443  <... openat resumed> )            = 6
4443  epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3797217304, u64=140706925834264}} <unfinished ...>
4443  <... epoll_ctl resumed> )         = -1 EPERM (Operation not permitted)
4443  epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc0002ae524 <unfinished ...>
4443  <... epoll_ctl resumed> )         = -1 EPERM (Operation not permitted)
4443  getdents64(6 <unfinished ...>
4443  <... getdents64 resumed> , /* 1 entries */, 8192) = 24
4443  --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4439, si_uid=0} ---
4443  rt_sigreturn({mask=[]} <unfinished ...>
4443  <... rt_sigreturn resumed> )      = 24
4443  getdents64(6 <unfinished ...>
4443  <... getdents64 resumed> , /* 0 entries */, 8192) = 0
4440  nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>
4443  close(6)                          = 0

and for the second run

4537  openat(AT_FDCWD, "/app/packages/api/spec/fixtures", O_RDONLY|O_CLOEXEC) = 6
4537  epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=350822616, u64=140218148135128}} <unfinished ...>
4537  <... epoll_ctl resumed> )         = -1 EPERM (Operation not permitted)
4537  epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc00022e524 <unfinished ...>
4537  <... epoll_ctl resumed> )         = -1 EPERM (Operation not permitted)
4537  getdents64(6 <unfinished ...>
4537  <... getdents64 resumed> , /* 1 entries */, 8192) = 32
4537  --- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=4534, si_uid=0} ---
4537  rt_sigreturn({mask=[]} <unfinished ...>
4537  <... rt_sigreturn resumed> )      = 32
4537  getdents64(6, /* 0 entries */, 8192) = 0
4535  <... nanosleep resumed> NULL)     = 0
4535  nanosleep({tv_sec=0, tv_nsec=20000},  <unfinished ...>

Theres that SIGURG in both examples! If I look at some of the surrounding getdents64 calls I don't see that signal being sent.

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

It does seem like a Go runtime issue, barring some unusual storage device or background rsync-like process running.

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

Possibly related: golang/go#39237

@kzc
Copy link
Contributor

kzc commented Apr 2, 2021

Do you still see the file errors with the workaround mentioned in golang/go#39237 (comment) ?

export GODEBUG=asyncpreemptoff=1

@airhorns
Copy link
Author

airhorns commented Apr 3, 2021

🎉 That resolves the issue for me! Nice find @kzc , thank you!

I talked a bit with the folks from this other CI provider (LayerCI) and they're actually using a FUSE mounted filesystem under the hood to serve the files to esbuild using go-fuse, which I think may suffer from the same issue itself. To me this helps confirm I am hitting the same issue linked to up there, just with a different FS than CIFS.

In any case, I think esbuild is not the culprit here so I will close this issue.

For anyone else coming along seeing strange resolve errors for files that seem like they do exist on disk, especially if running on top of a userspace filesystem via FUSE, try running esbuild with the GODEBUG=asyncpreemptoff=1 environment variable value set and see if your issues go away.

@airhorns airhorns closed this as completed Apr 3, 2021
@kzc
Copy link
Contributor

kzc commented Apr 3, 2021

Even though it's a Linux kernel bug, I think this issue should be be reopened and warrants an esbuild workaround.

It's a difficult to diagnose problem that only occurs on Linux with Go applications on certain filesystems since the implementation of the SIGURG based non-cooperative go routine preemption proposal. Applications written in other compiled or jitted languages including C, C++, Java, Rust, and JavaScript do not have this problem with getdents64 under normal circumstances with the file system in question. And as mentioned above, Go programs will not exhibit the getdents64 problem with the GODEBUG=asyncpreemptoff=1 environment setting.

The June 2020 linux kernel bug report hasn't generated any activity as of this writing. And the RedHat bug report seems to have been marked as CLOSED ERRATA.

I think esbuild could have a directory-specific resolve cache invalidation strategy in the event of ENOTDIR in some situations as a defensive measure to workaround this getdents64 bug in the linux kernel.

@evanw
Copy link
Owner

evanw commented Apr 3, 2021

One potential solution could be to force asyncpreemptoff to 1 on Linux. It doesn't look like there's a Go API for this though, so that might have to be logged as a feature request for the Go team. Either that or the Go team might want to add a workaround for this on their end.

Regardless, it would be good to create another bug on the Go issue tracker. The issue golang/go#39237 was closed even though the issue isn't fixed, and the last post says to create a new issue. I don't think it'd be appropriate for me to create one as I have no way of reproducing the problem myself.

@kzc
Copy link
Contributor

kzc commented Apr 3, 2021

@evanw I think there's enough information at hand for an esbuild workaround on Linux.

The Readdirnames err does not appear to be used?

entries, err := f.Readdirnames(-1)
// Unwrap to get the underlying error
if syscallErr, ok := err.(*os.SyscallError); ok {
err = syscallErr.Unwrap()
}
// Don't convert ENOTDIR to ENOENT here. ENOTDIR is a legitimate error
// condition for Readdirnames() on non-Windows platforms.
return entries, canonicalError, originalError

The erroneously truncated entries due to SIGURG interrupting getdents64 on Linux could probably be detected here - and Readdirnames could be retried.

@cruhl
Copy link

cruhl commented Apr 7, 2021

I just ran into this issue using a monorepo with yarn workspaces. It causes a symlinked local dependency to be unresolvable.

timostamm added a commit to timostamm/esbuild-plugin-postcss2-esbuild1101 that referenced this issue Aug 11, 2021
esbuild is unable to find the file, even though it does exist. This only happens for files in a directory with several other entries, so by creating a unique directory name per file on every build, we guarantee that there will only every be a single file present within the directory, circumventing the esbuild issue.
timostamm added a commit to timostamm/esbuild-plugin-postcss2-esbuild1101 that referenced this issue Aug 11, 2021
esbuild is unable to find the file, even though it does exist. This only happens for files in a directory with several other entries, so by creating a unique directory name per file on every build, we guarantee that there will only every be a single file present within the directory, circumventing the esbuild issue.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants