Spool to disk error #12174

Closed
sentient opened this issue May 10, 2019 · 5 comments
Labels
libbeat, Team:Integrations

Comments

sentient (Contributor) commented May 10, 2019

I'm getting errors when I use the queue.spool.file feature:

not enough memory to allocate 255 data page(s)

2019-05-10T18:28:10.029Z ERROR [publisher] spool/inbroker.go:544 Spool flush failed with: pq/writer-flush: txfile/tx-alloc-pages: file='/var/lib/statsdbeat/spool.dat' tx=0: transaction failed during commit: not enough memory to allocate 255 data page(s)

This is running on a server with 4 GB of memory. What configuration settings are there to reduce the number of data pages?

Thanks

faec (Contributor) commented Jan 28, 2020

I'm still investigating this, but some preliminary details: the error is misleading, as the real condition isn't a lack of system memory but a lack of free pages in the disk spool. I've tried many different constraints on both current and old beats versions; for example, a 4GB spool on a machine with 2GB of memory works fine.

The problem observed here is that a call to Writer.Flush() needs 255 pages (~1MB on disk) but there aren't that many free. This could mean an internal inconsistency that leads to fewer free blocks than there should be, or bad / misleading error handling that makes this a (seemingly?) fatal error instead of just retrying or blocking until the space frees up.

The default spool file size is only 100MB, so the (imperfect) configuration workaround would actually be to increase the spool size by adding e.g. size: 1G in the file configuration. (This wouldn't address the underlying bug, if there is one, but it would give a larger buffer so intermittent link saturation doesn't reach the limit so easily.)
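For reference, a minimal sketch of that workaround in the beat's YAML config, using the same queue.spool.file block that appears in the configs later in this thread (1GiB is just faec's example value, not a tuned recommendation):

queue.spool:
  file:
    # Raise the spool file size from the 100MB default so a single
    # flush is less likely to run out of free pages.
    size: 1GiB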

christophercutajar commented:

I've noticed the same error with filebeat 7.5.1

2020-09-14T15:29:27.478Z        ERROR   [publisher]     spool/inbroker.go:547   Spool flush failed with: pq/writer-flush: txfile/tx-alloc-pages: file='/var/lib/filebeat/spool.dat' tx=0: transaction failed during commit: not enough memory to allocate 54 data page(s)
2020-09-14T15:29:28.777Z        ERROR   [publisher]     spool/inbroker.go:547   Spool flush failed with: pq/writer-flush: txfile/tx-alloc-pages: file='/var/lib/filebeat/spool.dat' tx=0: transaction failed during commit: not enough memory to allocate 55 data page(s)

Having the below config:

# Settings to cache events to disk in case logstash is unreachable
queue.spool:
  file:
    size: 512MiB
    page_size: 16KiB
  write:
    buffer_size: 10MiB
    flush.timeout: 5s
    flush.events: 1024

Not sure what caused that ERROR, but after some time I can see data being ingested.
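Assuming faec's explanation above, the numbers here are consistent with the spool file simply being full rather than the host running out of memory (a rough back-of-envelope, not a measurement):

512 MiB spool file / 16 KiB page_size = 32768 pages total

A failed allocation of 54-55 pages therefore means nearly all pages were already in use; data flowing again after some time matches pages being freed as spooled events are shipped.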

rsdrakh commented Jan 27, 2021

Same here for journalbeat Version 7.6.1.
Config:

queue:
    spool:
        file:
            page_size: 16KiB
            path: /srv/beats/journalbeat/spool.dat
            prealloc: true
            size: 4GiB
        read:
            flush:
                timeout: 0s
        write:
            buffer_size: 5MiB
            flush:
                events: 2048
                timeout: 10s

I notice that a cron'ed systemctl restart journalbeat does not help the situation; after issuing systemctl stop journalbeat followed by systemctl start journalbeat, the contents of the spool file seem to be ingested correctly. The cronjob restarts journalbeat every 10 minutes due to problems with corrupt journal files (there is an open issue for that).

The logged ERROR appears sporadically during the day, somehow resolving itself (?), but at some point it persists and ingest comes to a complete halt.

EDIT: output is logstash, loadbalanced to two nodes, no further config.

Aloshi commented Jul 14, 2021

Seeing this as well with filebeat 7.7.1 and a 200MiB spool file. From the stack monitoring page, this log message seems to correlate with our beat no longer queuing new events. From /var/log/filebeat/filebeat:

...
2021-07-13T21:00:28.829Z        INFO    log/harvester.go:297    Harvester started for file: xxx.log
2021-07-13T21:00:42.492Z        ERROR   [publisher]     spool/inbroker.go:547   Spool flush failed with: pq/writer-flush: txfile/tx-alloc-pages: file='/var/lib/filebeat/spool.dat' tx=0: transaction failed during commit: not enough memory to allocate 44 data page(s)

[screenshot of the stack monitoring page omitted]

(times in log are UTC, screenshot is PST)

faec (Contributor) commented Oct 6, 2021

Closing: the disk spool is deprecated in favor of the disk queue.
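For anyone migrating away from the spool, a minimal sketch of the disk queue configuration that replaces it (queue.disk with max_size/path is the documented replacement; the values below are illustrative assumptions, not taken from this issue):

# Replaces the deprecated queue.spool block.
queue.disk:
  max_size: 10GB                   # upper bound on the on-disk queue
  path: "${path.data}/diskqueue"   # optional; defaults to a diskqueue directory under the beat's data path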

faec closed this as completed Oct 6, 2021