Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/stanza/fileconsumer] - Duplication of logs #27037

Closed
VihasMakwana opened this issue Sep 20, 2023 · 4 comments · Fixed by #27064
Closed

[pkg/stanza/fileconsumer] - Duplication of logs #27037

VihasMakwana opened this issue Sep 20, 2023 · 4 comments · Fixed by #27064
Labels
bug Something isn't working pkg/stanza

Comments

@VihasMakwana
Copy link
Contributor

Component(s)

pkg/stanza

What happened?

Description

Consider the following scenario for fileconsumer:

  • We read a file during a poll cycle, and emit all the logs. Let's say we emitted 10 lines.
  • Before beginning the next poll cycle we do the following things in the background.
    - Write more logs to the file.
    - Copy that file, move it out of the pattern, and truncate it to 0.
    - Write more logs to the previous file. More than 10 lines (this is necessary)
    - Call next poll()

While performing the above steps, I noticed the excess logs that we wrote (after 10 lines), were emitted twice i.e. duplicated.

Steps to Reproduce

  • Used the following to produce this bug/issue.
  • The following test passes, unfortunately.
func TestOutOfPattern(t *testing.T) {
	tempDir := t.TempDir()
	cfg := NewConfig()
	cfg.Include = append(cfg.Include, fmt.Sprintf("%s/*.log1", tempDir))
	cfg.StartAt = "beginning"
	operator, emitCalls := buildTestManager(t, cfg)
	operator.persister = testutil.NewMockPersister("test")

	temp := openTempWithPattern(t, tempDir, "*.log1")
	writeString(t, temp, "testlog1\n")
	operator.poll(context.Background())
	waitForToken(t, emitCalls, []byte("testlog1"))

	// write more log, before next poll() begins
	writeString(t, temp, "testlog2\n")
	// copy the file to another file i.e. rotate, out of pattern
	temp2 := openTempWithPattern(t, tempDir, "*.log2")
	temp.Seek(0, 0)
	_, err := io.Copy(temp2, temp)
	require.NoError(t, err)

	temp.Seek(0, 0)
	temp.Truncate(0)
	temp.Write([]byte("testlog4\ntestlog5\n"))
	// begin next poll()
	fmt.Print("\n\n\nSecond poll\n")
	operator.poll(context.Background())

	// INCORRECT, should emit testLog5 only once.
	waitForTokens(t, emitCalls, [][]byte{[]byte("testlog5"), []byte("testlog4"), []byte("testlog5")})
}

Expected Result

It should only emit the logs once.

Actual Result

Duplication.

Proposed fix

  • The fix for this would be to compare previous fingerprints, and newer fingerprints and only emit more logs if they're the same.
  • I can work on a PR if that sounds okay.

Collector version

v0.85.0

Environment information

Environment

OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")

OpenTelemetry Collector configuration

No response

Log output

No response

Additional context

No response

@VihasMakwana VihasMakwana added bug Something isn't working needs triage New item requiring triage labels Sep 20, 2023
@github-actions
Copy link
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@VihasMakwana
Copy link
Contributor Author

VihasMakwana commented Sep 20, 2023

This is quite a rare scenario tbh. But it makes sense to fix this.

@djaglowski
Copy link
Member

Thanks for reporting @VihasMakwana, and especially for providing a test.

I looked into this and don't believe your suggested fix would work. The problem comes down to how we handle files which we know have been rotated. Once we know a file has been rotated, we can only assess whether or not it was moved or truncated by reading a new fingerprint from the old handle. If truncated, we should not read the remainder of the file. If moved, we can read the remainder (if there is any).

I've opened #27064 to address it.

(One thing I dislike here is that it was necessary to save the file name on the reader, because 1/50 times running the test we would hit a nil pointer on accessing the file, which means the reader had already been closed for some reason. I am ok with this for now but I still intend to clean up the way we manage readers across the board, so that we can avoid any such concerns.)

@VihasMakwana
Copy link
Contributor Author

VihasMakwana commented Sep 21, 2023

@djaglowski I see what you did in your PR! I kind of implemented the same in my PR here https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/25884/files#diff-a918db427e06075a5fa059d7893f54556bfa1504726ba77b3491b644a64ab818R228. But it was too lengthy.

I got rid of it and used your method instead, as your method is more compact and readable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working pkg/stanza
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants