Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suppress errors on EBADF when unlocking files #35706

Conversation

sfc-gh-kstewart
Copy link
Contributor

@sfc-gh-kstewart sfc-gh-kstewart commented Oct 8, 2024

Description

This error is harmless and happens regularly when delete_after_read is set. This is because we acquire the lock right at the start of the ReadToEnd function and then defer the unlock, but that function also performs the delete. So, by the time it returns and the defer runs the file descriptor is no longer valid.

Testing

Tested manually on a laptop running MacOS. When using acquire_fs_locks in conjunction with delete_after_read I observe this error in the logs before applying this change:

2024-10-08T14:30:23.626-0700	error	reader/reader_unix.go:28	Failed to unlock	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "<redacted>", "error": "bad file descriptor"}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).unlockFile
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader_unix.go:28
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).ReadToEnd
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader.go:126
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Manager).consume.func1
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/file.go:160

Performing the same test with this changed applied eliminates the error.

…arly when delete_after_read is set and it's harmless.
@sfc-gh-kstewart sfc-gh-kstewart marked this pull request as ready for review October 8, 2024 21:34
@sfc-gh-kstewart sfc-gh-kstewart requested a review from a team as a code owner October 8, 2024 21:34
@djaglowski djaglowski merged commit 8c7c67b into open-telemetry:main Oct 9, 2024
155 checks passed
@github-actions github-actions bot added this to the next release milestone Oct 9, 2024
@sfc-gh-kstewart sfc-gh-kstewart deleted the filelog-receiver-fs-lock-handle-delete branch October 9, 2024 15:31
jmichalek132 pushed a commit to jmichalek132/opentelemetry-collector-contrib that referenced this pull request Oct 10, 2024
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This error is harmless and happens regularly when `delete_after_read` is
set. This is because we acquire the lock right at the start of the
`ReadToEnd` function and then defer the unlock, but that function also
performs the delete. So, by the time it returns and the defer runs the
file descriptor is no longer valid.

<!--Describe what testing was performed and which tests were added.-->
#### Testing

Tested manually on a laptop running MacOS. When using `acquire_fs_locks`
in conjunction with `delete_after_read` I observe this error in the logs
*before applying this change*:

```
2024-10-08T14:30:23.626-0700	error	reader/reader_unix.go:28	Failed to unlock	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "<redacted>", "error": "bad file descriptor"}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).unlockFile
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader_unix.go:28
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).ReadToEnd
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader.go:126
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Manager).consume.func1
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/file.go:160
```

Performing the same test with this changed applied eliminates the error.

---------

Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>
sbylica-splunk pushed a commit to sbylica-splunk/opentelemetry-collector-contrib that referenced this pull request Dec 17, 2024
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

This error is harmless and happens regularly when `delete_after_read` is
set. This is because we acquire the lock right at the start of the
`ReadToEnd` function and then defer the unlock, but that function also
performs the delete. So, by the time it returns and the defer runs the
file descriptor is no longer valid.

<!--Describe what testing was performed and which tests were added.-->
#### Testing

Tested manually on a laptop running MacOS. When using `acquire_fs_locks`
in conjunction with `delete_after_read` I observe this error in the logs
*before applying this change*:

```
2024-10-08T14:30:23.626-0700	error	reader/reader_unix.go:28	Failed to unlock	{"kind": "receiver", "name": "filelog", "data_type": "logs", "component": "fileconsumer", "path": "<redacted>", "error": "bad file descriptor"}
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).unlockFile
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader_unix.go:28
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer/internal/reader.(*Reader).ReadToEnd
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/internal/reader/reader.go:126
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/fileconsumer.(*Manager).consume.func1
	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza@v0.111.0/fileconsumer/file.go:160
```

Performing the same test with this changed applied eliminates the error.

---------

Co-authored-by: Bogdan Drutu <bogdandrutu@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants