Feature: multi-line logs #74

ventris · 2018-12-12T14:11:03Z

Currently when an multi-line event is written to a logfile promtail will take this as each row is its own entry and send the separately to loki.

It would be great if Loki could handle multi-line events due to stacktraces.

This might also be a bug if promtail already should handle multi-lines.

tomwilkie · 2018-12-12T14:14:41Z

Agreed, thanks @ventris. Any idea how to do this though? I guess we could have the option to parse the log entries in promtail, and define each entry as starting with a time stamp?

ventris · 2018-12-12T14:17:26Z

Some applications require an "special" character before the multi-line event. I know that our current Wildfly setup puts #### before the multiline event.

But I think that parsing the log entries somehow and cut each entry with the timestamp as key is the better solution.

ventris · 2018-12-12T14:28:42Z

Splunk handles this in a smart way.

Source: extract-timestamps in splunk

Maybe somethink like that?

One thing to take into account with only look for timestamps is that they can occur within a stacktrace. That is why Splunk has a variable that you specify how many characters into a line it should look for the timestamp.

sandstrom · 2018-12-13T09:48:30Z

Here is another multi-line scenario

Our log files are JSONL (http://jsonlines.org/) and look like this:

{ "timestamp" : "2018-12-24T18:00:00", "user-id" : "abc", "request-id" : "123", "messages" : ["Parsing new user data", "Saving user data, valid model", "Storing user data in db"] }
{ "timestamp" : "2018-12-24T18:01:00", "user-id: " "def", "request-id" : "456", "messages" : ["Updating user record", "Updating search index"] }
{ "timestamp" : "2018-12-24T18:02:00", "user-id: " "abc", "request-id" : "789", "messages" : ["Keep alive ping from client", "Updated client with cluster state details"] }

We'd like to (1) associate the key-value metadata with each log line, (2) ingest the array of log lines from the "messages" key and (3) ideally group the associated log lines (so that a log reader client could easily show them together).

If it makes ingestion easier, we could easily keep a timestamp per message in the array instead. Something like this:

{ "timestamp" : "2018-12-24T18:02", "user-id: " "abc", "request-id" : "789", "messages" : [{ "t" : 2018-12-24T18:02:10", "m" : "Keep alive ping from client" }, { "t" : 2018-12-24T18:02:10", "m" : "Updated client with cluster state details" }] }

Also, Loki sounds awesome! 🏆

Have been looking for a lightweight alternative to the ELK stack for logging. We could easily live without full-text search inside log messages (distributed grep would be enough).

However, we'd need filtering by keywords and multi-line log handling to make it work.

tomwilkie · 2019-02-18T14:47:23Z

We recently merged support for having multiple entries with the same timestamp; when running the docker log parser, multi-line logs usually show up as entries with the same timestamp, and are therefore grouped together. Not a perfect solution, but helps.

m0wfo · 2019-03-11T16:08:15Z

When I was at Logentries we delimited on \r, \n, e.t.c but allowed loggers to concat multiple lines using the unicode line separator. The indexer would ignore it and treat the entire payload as one entry (it was then up to the UI to optionally reformat the code onto multiple lines). That's one version of an imperfect solution anyway :)

DanCech · 2019-03-11T16:28:05Z

My understanding is that loki doesn't treat the content of log "lines" specially, so it should be up to the process tailing the logfile to decide when a single loki log "line" should actually contain multiple newline-delimited "lines" of content (maybe it would have made more sense to call it a log entry but such is life).

Of course I may be misunderstanding the intent here, but it seems like something like a backtrace would be more useful as a single entry in loki rather than being split up.

davidhao3300 · 2019-03-15T22:03:53Z

From what I've seen in other similar log tailing (fluentd concat plugin), the tailer provides options for multi-line start regexp (lines matching the regex indicate the start of a new log entry, so flush the old one out), with a timeout/max buffer size (if x lines end up in the buffer without a new log entry appearing, flush the logs out anyways).

I implemented support for the multi-line start regex for a hackathon, and it seems to work decently. I'll try to clean up the code over the weekend and open a PR to show as an example. I didn't implement any sort of timeout/max buffer size though.

davidhao3300 · 2019-03-17T21:26:55Z

Opened a PR for this issue

steven-sheehy · 2019-04-01T16:25:48Z

Note that multi-line is different depending upon the EntryParser. With CRI, it will send multiple entries with different flags to indicate if it's a partial log or a full log. See this document.

cyriltovena · 2019-08-02T12:46:02Z

Currently investigating this.

cyriltovena · 2019-08-02T21:37:05Z

I looked at it #399. There would be a lot of refactoring involved if we want to get this feature in the current pipeline, I would love to hear more about how this is useful before getting started. Basically how merging lines together helps is it a visual problem? Or is to improve storage of those ? Good example of workload emitting multiline logs ?

I want to know a bit more.

ventris · 2019-08-04T07:38:27Z

Java and C# stacktraces are multiline.

One of the problems with not having access to the full event is that it makes it much harder to search for other similar events because that the linebreaks can cut javaclasses in half and thus does not paint a real picture of what happened in the logs on the server.

Another thing is if the example below is handled with multiple logs instead it is really hard to search for say:

I know that these two keywords should be present somewhere in the stacktrace. Say com.example.myproject.MyProjectServletException and javax.servlet.ServletException: Something bad happened.

If I search for them withing having the stacktrace in one entry I will just get the rows that matched my keywords and not the whole stacktrace. This makes troubleshooting java much harder with Loki if multiline events are handled as-is. Because I first need to search for the keywords and then try to find the first and last entry of the stacktrace so I can read it as a whole. And the stacktraces can be several hundred lines.

I think there are other multiline events that should benefit from having it all in the same entry, but I have only worked with Java that behaves like that so I can not talk for other languages.

Here is an example stacktrace from java:

javax.servlet.ServletException: Something bad happened
    at com.example.myproject.OpenSessionInViewFilter.doFilter(OpenSessionInViewFilter.java:60)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at com.example.myproject.ExceptionHandlerFilter.doFilter(ExceptionHandlerFilter.java:28)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at com.example.myproject.OutputBufferFilter.doFilter(OutputBufferFilter.java:33)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: com.example.myproject.MyProjectServletException
    at com.example.myproject.MyServlet.doPost(MyServlet.java:169)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
    at com.example.myproject.OpenSessionInViewFilter.doFilter(OpenSessionInViewFilter.java:30)
    ... 27 more
Caused by: org.hibernate.exception.ConstraintViolationException: could not insert: [com.example.myproject.MyEntity]
    at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:96)
    at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
    at org.hibernate.id.insert.AbstractSelectingDelegate.performInsert(AbstractSelectingDelegate.java:64)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2329)
    at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2822)
    at org.hibernate.action.EntityIdentityInsertAction.execute(EntityIdentityInsertAction.java:71)
    at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:268)
    at org.hibernate.event.def.AbstractSaveEventListener.performSaveOrReplicate(AbstractSaveEventListener.java:321)
    at org.hibernate.event.def.AbstractSaveEventListener.performSave(AbstractSaveEventListener.java:204)
    at org.hibernate.event.def.AbstractSaveEventListener.saveWithGeneratedId(AbstractSaveEventListener.java:130)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.saveWithGeneratedOrRequestedId(DefaultSaveOrUpdateEventListener.java:210)
    at org.hibernate.event.def.DefaultSaveEventListener.saveWithGeneratedOrRequestedId(DefaultSaveEventListener.java:56)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.entityIsTransient(DefaultSaveOrUpdateEventListener.java:195)
    at org.hibernate.event.def.DefaultSaveEventListener.performSaveOrUpdate(DefaultSaveEventListener.java:50)
    at org.hibernate.event.def.DefaultSaveOrUpdateEventListener.onSaveOrUpdate(DefaultSaveOrUpdateEventListener.java:93)
    at org.hibernate.impl.SessionImpl.fireSave(SessionImpl.java:705)
    at org.hibernate.impl.SessionImpl.save(SessionImpl.java:693)
    at org.hibernate.impl.SessionImpl.save(SessionImpl.java:689)
    at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.hibernate.context.ThreadLocalSessionContext$TransactionProtectionWrapper.invoke(ThreadLocalSessionContext.java:344)
    at $Proxy19.save(Unknown Source)
    at com.example.myproject.MyEntityService.save(MyEntityService.java:59) <-- relevant call (see notes below)
    at com.example.myproject.MyServlet.doPost(MyServlet.java:164)
    ... 32 more
Caused by: java.sql.SQLException: Violation of unique constraint MY_ENTITY_UK_1: duplicate value(s) for column(s) MY_COLUMN in statement [...]
    at org.hsqldb.jdbc.Util.throwError(Unknown Source)
    at org.hsqldb.jdbc.jdbcPreparedStatement.executeUpdate(Unknown Source)
    at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:105)
    at org.hibernate.id.insert.AbstractSelectingDelegate.performInsert(AbstractSelectingDelegate.java:57)
    ... 54 more

steven-sheehy · 2019-08-05T13:59:42Z

Above, plus Grafana only shows them in descending order by time, so stacktrace is upside down from how it should appear. Also makes emoji art impossible to see, which a lot of java apps print on startup. :)

tlinhart · 2019-08-13T12:56:55Z

Python stack traces are the same:

[2019-08-13 06:58:20,588] ERROR in app: Exception on /graphql [POST]
Traceback (most recent call last):
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/_compat.py", line 35, in reraise
    raise value
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/srv/fzapi/venv/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./app/graphql/views.py", line 66, in service
    field_path_to_str(error.path, False),
AttributeError: 'Exception' object has no attribute 'path'

It's all about the context. Only complete log record holds the whole context, not any single line.

khauser · 2019-08-22T14:57:14Z

Could you please encounter the following scenario or at least give your opinion:

Because of Loki we have changed the log format of our appenders to use the logstash JSON-layout-formatter in our logback configuration. That's why now we have an all information containing log entry with also the whole stacktrace (containing the already noted \n\t). So no need to join lines afterwards. That sounds good to me...

Here is a sample entry:

{"@timestamp":"2019-08-13T23:57:20.971+00:00","@version":"1","message":"Error in job timer thread occurred!","logger_name":"somecode.core.internal.job.Timer","thread_name":"JobTimer759282143","level":"ERROR","level_value":40000,"stack_trace":"somecode.core.capi.common.SystemException: java.sql.SQLException: Exception occurred while getting connection: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource: java.sql.SQLException: Listener refused the connection with the following error:\nORA-12505, TNS:listener does not currently know of SID given in connect descriptor\n \n\tat somecode.core.internal.jdbc.JDBCDataSourceMgrImpl.getConnection(JDBCDataSourceMgrImpl.java:783)\n\tat somecode.core.internal.jdbc.JDBCDataSourceMgrImpl.getConnection(JDBCDataSourceMgrImpl.java:796)\n\tat somecode.core.internal.jdbc.JDBCUtilsImpl.getConnection(JDBCUtilsImpl.java:925)\n\tat somecode.core.capi.jdbc.JDBCUtils.getConnection(JDBCUtils.java:560)\n\tat somecode.core.internal.jdbc.DBDate.currentDBDate(DBDate.java:75)\n\tat somecode.core.internal.job.Timer.run(Timer.java:102)\nCaused by: java.sql.SQLException: Exception occurred while getting connection: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource: java.sql.SQLException: Listener refused the connection with the following error:\nORA-12505, TNS:listener does not currently know of SID given in connect descriptor\n \n\tat oracle.ucp.util.UCPErrorHandler.newSQLException(UCPErrorHandler.java:479)\n\tat oracle.ucp.util.UCPErrorHandler.throwSQLException(UCPErrorHandler.java:154)\n\tat oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:1127)\n\tat oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:1031)\n\tat oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:1006)\n\tat somecode.core.internal.jdbc.JDBCDataSource.getConnection(JDBCDataSource.java:101)\n\tat somecode.core.internal.jdbc.JDBCDataSource.getConnection(JDBCDataSource.java:73)\n\tat somecode.core.internal.jdbc.JDBCDataSourceMgrImpl.getConnection(JDBCDataSourceMgrImpl.java:779)\n\t... 5 common frames omitted\nCaused by: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource: java.sql.SQLException: Listener refused the connection with the following error:\nORA-12505, TNS:listener does not currently know of SID given in connect descriptor\n \n\tat oracle.ucp.util.UCPErrorHandler.newUniversalConnectionPoolException(UCPErrorHandler.java:359)\n\tat oracle.ucp.util.UCPErrorHandler.throwUniversalConnectionPoolException(UCPErrorHandler.java:58)\n\tat oracle.ucp.jdbc.DataSourceConnectionFactoryAdapter.createConnection(DataSourceConnectionFactoryAdapter.java:102)\n\tat oracle.ucp.jdbc.oracle.OracleDataSourceConnectionFactoryAdapter.createConnection(OracleDataSourceConnectionFactoryAdapter.java:82)\n\tat oracle.ucp.common.Database.createPooledConnection(Database.java:268)\n\tat oracle.ucp.common.Cluster.create(Cluster.java:234)\n\tat oracle.ucp.common.Core.growBorrowed(Core.java:634)\n\tat oracle.ucp.common.UniversalConnectionPoolImpl.borrowConnectionWithoutCountingRequests(UniversalConnectionPoolImpl.java:230)\n\tat oracle.ucp.common.UniversalConnectionPoolImpl.borrowConnectionAndValidate(UniversalConnectionPoolImpl.java:139)\n\tat oracle.ucp.common.UniversalConnectionPoolImpl.borrowConnection(UniversalConnectionPoolImpl.java:120)\n\tat oracle.ucp.jdbc.JDBCConnectionPool.borrowConnection(JDBCConnectionPool.java:170)\n\tat oracle.ucp.jdbc.oracle.OracleJDBCConnectionPool.borrowConnection(OracleJDBCConnectionPool.java:849)\n\tat oracle.ucp.jdbc.oracle.OracleConnectionConnectionPool.borrowConnection(OracleConnectionConnectionPool.java:82)\n\tat oracle.ucp.jdbc.PoolDataSourceImpl.getConnection(PoolDataSourceImpl.java:1103)\n\t... 10 common frames omitted\nCaused by: java.sql.SQLException: Listener refused the connection with the following error:\nORA-12505, TNS:listener does not currently know of SID given in connect descriptor\n \n\tat oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:743)\n\tat oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:666)\n\tat oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)\n\tat oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:566)\n\tat oracle.jdbc.pool.OracleDataSource.getPhysicalConnection(OracleDataSource.java:317)\n\tat oracle.jdbc.pool.OracleDataSource.getConnection(OracleDataSource.java:241)\n\tat oracle.ucp.jdbc.DataSourceConnectionFactoryAdapter.createConnection(DataSourceConnectionFactoryAdapter.java:96)\n\t... 21 common frames omitted\nCaused by: oracle.net.ns.NetException: Listener refused the connection with the following error:\nORA-12505, TNS:listener does not currently know of SID given in connect descriptor\n \n\tat oracle.net.ns.NSProtocolStream.negotiateConnection(NSProtocolStream.java:275)\n\tat oracle.net.ns.NSProtocol.connect(NSProtocol.java:264)\n\tat oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1452)\n\tat oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:496)\n\t... 27 common frames omitted\n","executionsite":"root"}

But... In loki and the file log itself the stacktrace isn't readable anymore. That's not good. At least for the loki side maybe something like a configuration possiblity (fluentd plugin plus promtail) for fields would be good (e.g. multiline_fields: [stack_trace]). Then in grafana each multiline field could be represented in a well formated column maybe (just a simple thought).

I don't know if the one-json-line approach is the best one but just wanted to note it here. In theory it might also be well displayed. For us this would be the most needed Loki+Grafana feature since I wouldn't let other engineers work with a one-line stacktrace..

On the other side we also need to handle logs from older environments where multilines in spite of stacktraces would play a role:

[2019-08-13 20:36:44 GMT] - [Thread-138270] INFO  c.i.b.w.w.DeleteExpiredFilesProcess: finished deletion of expired files. Total files removed: 0 pages from 0/4 sites (enabled/disabled for indexing)
[2019-08-13 21:06:50 GMT] - [Thread-138713] INFO  c.i.b.w.w.DeleteExpiredFilesProcess: finished deletion of expired files. Total files removed: 0 pages from 0/4 sites (enabled/disabled for indexing)
[2019-08-13 21:36:56 GMT] - [Thread-139154] INFO  c.i.b.w.w.DeleteExpiredFilesProcess: finished deletion of expired files. Total files removed: 0 pages from 0/4 sites (enabled/disabled for indexing)
[2019-08-13 22:00:12 GMT] - [main] ERROR c.i.b.w.w.WebAdapterAgent: cycle failed:
java.lang.NumberFormatException: For input string: "-db error"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.parseInt(Integer.java:615)
	at somecode.webinterface.webadapteragent.PageCache.expirePages(PageCache.java:292)
	at somecode.webinterface.webadapteragent.WebAdapterAgent.loop(WebAdapterAgent.java:144)
	at somecode.webinterface.webadapteragent.WebAdapterAgent.main(WebAdapterAgent.java:40)
[2019-08-13 22:00:13 GMT] - [main] INFO  c.i.b.w.w.WebAdapterAgent: All services are now up and running

PS: Also this issue #616 also seems to be related with multilines.

khauser · 2019-08-23T07:26:39Z

We checked with Azure Log Analytics and with Kibana..

If a json formatted log entry stacktrace arrives in

Azure Log Analytics the table only shows the first characters of the stacktrace. But expanding the entry gives you a well formed one.
Kibana already displays the first 5 lines of the stacktrace and is also able to expand the log entry. Both outputs are well formed.

stale · 2019-09-22T08:23:35Z

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

pstibrany · 2019-09-25T12:13:06Z

Using timestamp stage in promtail with logs that contain lines without a timestamp (typically stack traces) results in a broken log stream.

If we use example from #74 (comment)

[2019-08-13 22:00:12 GMT] - [main] ERROR c.i.b.w.w.WebAdapterAgent: cycle failed:
java.lang.NumberFormatException: For input string: "-db error"
	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
	at java.lang.Integer.parseInt(Integer.java:580)
	at java.lang.Integer.parseInt(Integer.java:615)
	at somecode.webinterface.webadapteragent.PageCache.expirePages(PageCache.java:292)
	at somecode.webinterface.webadapteragent.WebAdapterAgent.loop(WebAdapterAgent.java:144)
	at somecode.webinterface.webadapteragent.WebAdapterAgent.main(WebAdapterAgent.java:40)
[2019-08-13 22:00:13 GMT] - [main] INFO  c.i.b.w.w.WebAdapterAgent: All services are now up and running

Then timestamp stage will extract correct timestamp from the first and last line, but other lines will be assigned 'current time' -- which will then cause subsequent lines to be out of order and rejected by Loki.

Now tracked in #1093

mnadeem · 2021-01-03T19:41:54Z

Unfortunately as of today this is still an issue with grafana/loki:2.1.0

cyriltovena · 2021-01-04T06:38:56Z

Have you tried to configured promtail to match those multi lines ? https://grafana.com/docs/loki/latest/clients/promtail/stages/multiline/

mnadeem · 2021-01-04T06:57:44Z

level=error ts=2021-01-04T06:56:02.891620396Z caller=main.go:106 msg="error creating promtail" error="failed to make file target manager: invalid multiline stage config: Unknown stage type: multiline"

          - job_name: spring-boot-app
            pipeline_stages:
              - multiline:
                  firstline: '^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\.\d{3}'
                  max_wait_time: 1s
            static_configs:
              - targets:
                  - localhost
                labels:
                  job: spring-boot-app
                  __path__: /app/log/spring-boot-app*.log

grafana/promtail:2.1.0

Further I have documented below, the way I have setup loki

https://reachmnadeem.wordpress.com/2021/01/02/logging-stack-for-spring-boot-application-using-grafan-loki-on-openshift-kubernetes/

cyriltovena · 2021-01-04T21:19:24Z

It has not been release into 2.1, it will be in 2.2. You can use grafana/promtail:master-ee9c629 in the meantime. Sorry for the confusion.

F1ch18 · 2021-01-13T09:40:42Z

grafana/promtail:master-ee9c629 doesn't work fine for me. I have crashloopbackoff and this in pod logs

goroutine 1115 [sleep]:
time.Sleep(0xee6b280)
/usr/local/go/src/runtime/time.go:188 +0xbf
github.com/hpcloud/tail/watch.(*PollingFileWatcher).ChangeEvents.func1(0xc000c432e0, 0xc0009cc1d8, 0xc000ca5740, 0x2fe6480, 0xc0019f6c30, 0xc000ca5760)
/src/loki/vendor/github.com/hpcloud/tail/watch/polling.go:68 +0x8f
created by github.com/hpcloud/tail/watch.(*PollingFileWatcher).ChangeEvents
/src/loki/vendor/github.com/hpcloud/tail/watch/polling.go:59 +0x18c

goroutine 1116 [chan receive]:
github.com/grafana/loki/pkg/promtail/targets/file.(*tailer).readLines(0xc000bfc000)
/src/loki/pkg/promtail/targets/file/tailer.go:137 +0x225
created by github.com/grafana/loki/pkg/promtail/targets/file.newTailer
/src/loki/pkg/promtail/targets/file/tailer.go:81 +0x53d

goroutine 1117 [select]:
github.com/grafana/loki/pkg/promtail/targets/file.(*tailer).updatePosition(0xc000bfc000)
/src/loki/pkg/promtail/targets/file/tailer.go:101 +0x145
created by github.com/grafana/loki/pkg/promtail/targets/file.newTailer
/src/loki/pkg/promtail/targets/file/tailer.go:82 +0x565

goroutine 1118 [select]:
github.com/grafana/loki/pkg/promtail/targets/file.(*FileTarget).run(0xc000ad6480)
/src/loki/pkg/promtail/targets/file/filetarget.go:177 +0x18a
created by github.com/grafana/loki/pkg/promtail/targets/file.NewFileTarget
/src/loki/pkg/promtail/targets/file/filetarget.go:124 +0x269

goroutine 1119 [syscall]:
syscall.Syscall6(0xe8, 0x195, 0xc00218fb6c, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0xfad, 0x100000002, 0xfae)
/usr/local/go/src/syscall/asm_linux_amd64.s:41 +0x5
golang.org/x/sys/unix.EpollWait(0x195, 0xc00218fb6c, 0x7, 0x7, 0xffffffffffffffff, 0xfb3, 0x100000002, 0xfb4)
/src/loki/vendor/golang.org/x/sys/unix/zsyscall_linux_amd64.go:76 +0x72
gopkg.in/fsnotify%2ev1.(*fdPoller).wait(0xc000a48360, 0x100000000, 0xfbe, 0x100000002)
/src/loki/vendor/gopkg.in/fsnotify.v1/inotify_poller.go:86 +0x91
gopkg.in/fsnotify%2ev1.(*Watcher).readEvents(0xc0018ea820)
/src/loki/vendor/gopkg.in/fsnotify.v1/inotify.go:192 +0x206
created by gopkg.in/fsnotify%2ev1.NewWatcher
/src/loki/vendor/gopkg.in/fsnotify.v1/inotify.go:59 +0x1a8

cyriltovena · 2021-01-13T12:16:19Z

The log message is cut off could you share the full error log ? Have you tried recent build I fixed a panic recently in promtail that was unreleased.

cmde-lusional · 2021-02-19T14:04:48Z

It has not been release into 2.1, it will be in 2.2. You can use grafana/promtail:master-ee9c629 in the meantime. Sorry for the confusion.

@cyriltovena can you tell me when 2.2 is gonna be released?

antoniocascais · 2021-02-23T10:55:10Z

Hi all.

I'm also facing similar issues as @mnadeem does.

I was following the docs (https://grafana.com/docs/loki/latest/clients/promtail/stages/multiline/) and added this configuration to my promtail values file (I'm installing the promtail helm chart in my cluster):

config:
  snippets:
    pipelineStages:
      - cri: {}
      - multiline:
          # Identify timestamps as first line of a multiline block. Note the string should be in single quotes.
          firstline: '^\[\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}\]'
          max_wait_time: 3s

But promtail doesn't start:

level=error ts=2021-02-23T10:52:53.883293805Z caller=main.go:106 msg="error creating promtail" error="failed to make file target manager: invalid multiline stage config: Unknown stage type: multiline"

Using promtail v2.1.0 and helm chart version 3.1.0:

appVersion: 2.1.0
version: 3.1.0

So, this will be fixed in v2.2.0? We need to wait for now?

triwats · 2021-02-23T16:01:15Z

@antoniocascais - I think so, it's lined up to be released in 2.2, but as the above comment states from @cyriltovena, you can use a different docker tag to pull in these features. This will be experimental:

It has not been release into 2.1, it will be in 2.2. You can use grafana/promtail:master-ee9c629 in the meantime. Sorry for the confusion.

cyriltovena · 2021-02-25T09:42:55Z

We wanted to release 2.2 this week. I don't know if we're on track but if not next week.

We wanted to include some read path bug in there too.

mdegat01 · 2021-03-29T15:26:43Z

Does this capability work when scraping the system journal or is it only when scraping files? I have an application dumping tracebacks line by line into the system journal, was hoping to pull those together into one log line in Loki. I've been trying to use it but seems like its getting ignored. Or do you have to combine it with output somehow to change the text content of the log?

cyriltovena · 2021-03-29T18:26:13Z

Yes it does

mdegat01 · 2021-03-29T20:26:11Z

Ah I figured it out. There was an escape sequence at the front of the text for terminal color-coding and that was causing my regex to fail. This is great! Much nicer to look at 👍

triwats · 2021-03-31T17:17:57Z

Just incase anyone else is watching this space for the update, this was all included in v2.2.0 which is now available and released: See here: https://github.com/grafana/loki/releases/tag/v2.2.0

omelnyk10 · 2021-11-12T15:44:49Z

If someone here is still alive, maybe you can help me. Is there an option to collapse the error stack trace, and if I need to see it I just click on the arrow sign and see a well-formatted stack trace?

Jojoooo1 · 2022-01-26T22:30:35Z

anyone had success multilining with logback and spring boot ?

jeschkies · 2022-01-27T10:25:47Z

👋 I wrote the multiline stage. @omelnyk10 and @Jojoooo1, you both want to collapse Java stack traces?

omelnyk10 · 2022-01-27T11:50:01Z

@jeschkies yes

Jojoooo1 · 2022-01-27T15:18:02Z

Exactly @jeschkies ! With the logback default encoder!

jeschkies · 2022-01-27T16:37:23Z

There is an example in the documentation for configuring logback. Does that help?

Jojoooo1 · 2022-01-27T22:16:05Z

Made it work with
firstline: '^\d{4}-\d{2}-\d{2}'

Update from upstream repository

dcerecedo · 2022-12-05T10:37:38Z

Anyone can share a sample grafana-agent.yml that makes use of the multiline feature? Not even sure if the multiline config even belongs there!

I have tried lots of options without success. This is my last piece of config:

logs:
  configs:
  - clients:
    - basic_auth:
        password: ${LOKI_SECRET}
        username: ${LOKI_USERNAME}
      url: ${LOKI_URL}
    name: integrations
    positions:
      filename: /tmp/positions.yaml
    scrape_configs:
    - job_name: integrations/docker
      docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
      relabel_configs:
      - action: replace
        replacement: integrations/docker
        source_labels:
        - __meta_docker_container_id
        target_label: job
      - action: replace
        replacement: ${HOSTNAME}
        source_labels:
        - __meta_docker_container_id
        target_label: instance
      - regex: /(.*)
        source_labels:
        - __meta_docker_container_name
        target_label: container
      - source_labels:
        - __meta_docker_container_log_stream
        target_label: stream
      pipeline_stages:
      - multiline:
        firstline: '^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{7}\s[+-]\d{2}:\d{2}\s\[\w+\]'
        max_wait_time: 3s
    target_config:
      sync_period: 10s

TBH, the grafana docs, in general are hard to follow. Docs are modular in the sense that they are scattered according to the respective config sections. You can get sample blocks but not a full sample of the agent configuration file. Therefore, it is sometimes hard to see where you are making a mistake.

rufreakde · 2024-01-08T09:57:10Z

Made it work with firstline: '^\d{4}-\d{2}-\d{2}'

I tried the following in my values.yaml:

snippets:
  pipelineStages:
  - cri: {}
  - multiline:
      firstline: '^\d{4}-\d{2}-\d{2}'
      max_wait_time: 3s

With helm chart:

- name: promtail
  version: 6.14.1
  repo: https://grafana.github.io/helm-charts

But unfortunately it is not working at all. Still all logs are collected line by line no summarisation on timestamp...

Any ideas does it still work for you? @Jojoooo1

tomwilkie mentioned this issue Dec 14, 2018

Design questions #110

Closed

tomwilkie added component/agent type/feature Something new we should do labels Dec 16, 2018

davidhao3300 mentioned this issue Mar 17, 2019

Add multi-line detection support to promtail #399

Closed

3 tasks

steven-sheehy mentioned this issue Jun 15, 2019

Explore: multiline log messages don't display nicely grafana/grafana#17584

Closed

cyriltovena pinned this issue Aug 2, 2019

cyriltovena unpinned this issue Aug 2, 2019

stale bot added the stale A stale issue or PR that will automatically be closed. label Sep 22, 2019

cyriltovena added the keepalive An issue or PR that will be kept alive and never marked as stale. label Sep 22, 2019

stale bot removed the stale A stale issue or PR that will automatically be closed. label Sep 22, 2019

cyriltovena added the hacktoberfest label Sep 26, 2019

cyriltovena closed this as completed in #3024 Dec 3, 2020

wy281704862 mentioned this issue Jan 15, 2021

Unknown stage type: multiline #3170

Closed

madhead mentioned this issue Feb 20, 2021

Use Promtail 2.2+ when it's released madhead/ImgMacroBot#18

Closed

periklis added a commit to periklis/loki that referenced this issue Dec 6, 2021

Add profiler support (grafana#74)

e769d2d

dbluxo mentioned this issue May 22, 2022

Promtail: Support CRI-O multi-lines #5854

Closed

periklis added a commit to periklis/loki that referenced this issue Dec 2, 2022

Merge pull request grafana#74 from grafana/main

245362c

Update from upstream repository

Feature: multi-line logs #74

Feature: multi-line logs #74

Comments

ventris commented Dec 12, 2018 • edited Loading

tomwilkie commented Dec 12, 2018

ventris commented Dec 12, 2018 • edited Loading

ventris commented Dec 12, 2018 • edited Loading

sandstrom commented Dec 13, 2018 • edited Loading

tomwilkie commented Feb 18, 2019

m0wfo commented Mar 11, 2019

DanCech commented Mar 11, 2019

davidhao3300 commented Mar 15, 2019 • edited Loading

davidhao3300 commented Mar 17, 2019

steven-sheehy commented Apr 1, 2019

cyriltovena commented Aug 2, 2019

cyriltovena commented Aug 2, 2019

ventris commented Aug 4, 2019 • edited Loading

steven-sheehy commented Aug 5, 2019

tlinhart commented Aug 13, 2019

khauser commented Aug 22, 2019 • edited Loading

khauser commented Aug 23, 2019

stale bot commented Sep 22, 2019

pstibrany commented Sep 25, 2019 • edited Loading

mnadeem commented Jan 3, 2021 • edited Loading

cyriltovena commented Jan 4, 2021

mnadeem commented Jan 4, 2021 • edited Loading

cyriltovena commented Jan 4, 2021 • edited Loading

F1ch18 commented Jan 13, 2021 • edited Loading

cyriltovena commented Jan 13, 2021

cmde-lusional commented Feb 19, 2021

antoniocascais commented Feb 23, 2021 • edited Loading

triwats commented Feb 23, 2021

cyriltovena commented Feb 25, 2021

mdegat01 commented Mar 29, 2021 • edited Loading

cyriltovena commented Mar 29, 2021

mdegat01 commented Mar 29, 2021

triwats commented Mar 31, 2021

omelnyk10 commented Nov 12, 2021

Jojoooo1 commented Jan 26, 2022

jeschkies commented Jan 27, 2022

omelnyk10 commented Jan 27, 2022

Jojoooo1 commented Jan 27, 2022

jeschkies commented Jan 27, 2022

Jojoooo1 commented Jan 27, 2022

dcerecedo commented Dec 5, 2022

rufreakde commented Jan 8, 2024

ventris commented Dec 12, 2018 •

edited

Loading

ventris commented Dec 12, 2018 •

edited

Loading

ventris commented Dec 12, 2018 •

edited

Loading

sandstrom commented Dec 13, 2018 •

edited

Loading

davidhao3300 commented Mar 15, 2019 •

edited

Loading

ventris commented Aug 4, 2019 •

edited

Loading

khauser commented Aug 22, 2019 •

edited

Loading

pstibrany commented Sep 25, 2019 •

edited

Loading

mnadeem commented Jan 3, 2021 •

edited

Loading

mnadeem commented Jan 4, 2021 •

edited

Loading

cyriltovena commented Jan 4, 2021 •

edited

Loading

F1ch18 commented Jan 13, 2021 •

edited

Loading

antoniocascais commented Feb 23, 2021 •

edited

Loading

mdegat01 commented Mar 29, 2021 •

edited

Loading