Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tail channel closed, stopping tailer #6503

Open
yangvipguang opened this issue Jun 27, 2022 · 8 comments
Open

tail channel closed, stopping tailer #6503

yangvipguang opened this issue Jun 27, 2022 · 8 comments

Comments

@yangvipguang
Copy link

Describe the bug
A clear and concise description of what the bug is.
Promtail often fails to read logs. log error:
image
Restart promtail and return to normal
image

Version: 2.5.0
Platform: Centos7

@dannykopping
Copy link
Contributor

Hi @yangvipguang. I think this occurring when your file is rotated.

@jeschkies
Copy link
Contributor

This might be a known issue. I'll take a look.

@yangvipguang
Copy link
Author

yangvipguang commented Jun 28, 2022

@dannykopping yes, the node log files are rotated 。
app: node
log: log4js-node
image
lQLPJxZxUXJBXsLMh80CZLAluRzDOVBnNwK6ugzgALwA_612_135

@dannykopping
Copy link
Contributor

@yangvipguang are you actually missing logs when this occurs, or are you just asking about the error in the logs?

This looks like good behaviour because when a file it rotated it will be renamed (but still maintain the same inode) and then eventually deleted when the contents are compressed. I would expect these log lines to show up in those situations.

@yangvipguang
Copy link
Author

@dannykopping yes actually missing logs when the log is rolated

@frittentheke
Copy link
Contributor

It appears Promtail really only tracks the positions based on the filename

p.positions[path] = pos

and only recognizes the rotation of a file if it's actively running and watching during the rotation:

This is how this is logged:

level=info ts=2022-07-21T21:06:59.436123136Z caller=tailer.go:143 component=tailer msg="tail routine: started" path=/tmp/logs/log1.log
ts=2022-07-21T21:06:59.436160443Z caller=log.go:168 level=info msg="Seeked /tmp/logs/log1.log - &{Offset:12 Whence:0}"
level=info ts=2022-07-21T21:07:29.263830553Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/tmp/logs/log1.log.1 op=CREATE
ts=2022-07-21T21:07:29.293354294Z caller=log.go:168 level=info msg="Re-opening moved/deleted file /tmp/logs/log1.log ..."
ts=2022-07-21T21:07:29.293476537Z caller=log.go:168 level=info msg="Waiting for /tmp/logs/log1.log to appear..."
level=info ts=2022-07-21T21:07:29.436953384Z caller=tailer.go:202 component=tailer msg="skipping update of position for a file which does not currently exist" path=/tmp/logs/log1.log
level=info ts=2022-07-21T21:07:29.437157557Z caller=filetarget.go:291 msg="removing directory from watcher" directory=/tmp/logs
level=info ts=2022-07-21T21:07:29.437240091Z caller=tailer.go:116 component=tailer msg="position timer: exited" path=/tmp/logs/log1.log
level=info ts=2022-07-21T21:07:29.437294963Z caller=tailer.go:202 component=tailer msg="skipping update of position for a file which does not currently exist" path=/tmp/logs/log1.log
level=info ts=2022-07-21T21:07:29.43736309Z caller=tailer.go:159 component=tailer msg="tail routine: tail channel closed, stopping tailer" path=/tmp/logs/log1.log reason=null
level=info ts=2022-07-21T21:07:29.437418767Z caller=tailer.go:152 component=tailer msg="tail routine: exited" path=/tmp/logs/log1.log
level=info ts=2022-07-21T21:07:29.437451053Z caller=tailer.go:240 component=tailer msg="stopped tailing file" path=/tmp/logs/log1.log
level=info ts=2022-07-21T21:08:29.436230999Z caller=filetarget.go:278 msg="watching new directory" directory=/tmp/logs
level=info ts=2022-07-21T21:08:29.436281994Z caller=tailer.go:143 component=tailer msg="tail routine: started" path=/tmp/logs/log1.log
ts=2022-07-21T21:08:29.436300892Z caller=log.go:168 level=info msg="Seeked /tmp/logs/log1.log - &{Offset:0 Whence:0}"

Certainly Promtail tries to read the remaining data from the just "deleted" (actually moved) file, but what happens when that fails or does not finish until Promtail is stopped?

There is documenation on how Promtail behaves in case the file was truncated (or replaced by a new file of the same name during logroation): https://github.com/grafana/loki/blob/main/docs/sources/clients/promtail/troubleshooting.md#a-tailed-file-is-truncated-while-promtail-is-not-running=

There it's clearly stated that if the new file was already bigger than the last position saved for this file (by name) there is data loss (as in "not shipped" to Loki) as there is no way to reconcile.
There really seems no way around introducing the inode to positions to "abstract" away from the name of the file. The name simply changes for most cases / implementations of log rotation and the positions and filewatcher logic could be much simpler if it was just about finding candidates by pattern (read: name + potential wildcards) and then doing all of the position tracking using the inode as identifier. Adding some optimistic sorting on the creation date of the inode would even allow for backfilling of logs in (likely) the proper order.

Last but not least, using the inodes would also remove the current issue of the wildcard file pattern causing double (or multiple) ingests of the "same" file / data as described in: #3655, but make this a feature:

If the inode was used to identify each file a rotated away file could also be resumed to be read from, just like Elastic documents for their Filebeat shipper: https://www.elastic.co/guide/en/beats/filebeat/current/file-log-rotation.html#file-log-rotation

@frittentheke
Copy link
Contributor

@dannykopping would you potentially be open to discuss adding inode as a key to Promtail file tailing to avoid the known and documented issues with logroation?

@cstyan
Copy link
Contributor

cstyan commented Nov 16, 2023

the following may be relevant: #8039 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants