receive: v0.18.0 memory leak (v0.16.0 regression) #3726
Comments
Rechecked on:
Result still the same (v0.16.0 before 17:43):
Sorry for the late response @sepich, is this still an issue with the latest Thanos version?
Result for latest v0.18.0 is above.
Possible duplicate of #3471
I am also facing this issue with the same behavior, and v0.16.0 fixes the possible leak. As we can see below: until 9h40 all nodes were running v0.18.0, at which point I had all receive nodes restarted. At 9h45 all nodes were back up, still on v0.18.0, and we can see a rapid memory increase. At 10h17 I had 2 nodes downgraded (ip-10-184-125-10 to v0.17.2 and ip-10-184-125-13 to v0.16.0). At 10h35 the v0.17.2 node starts to increase its memory usage, following the v0.18.0 nodes' behavior, but the v0.16.0 node keeps memory usage stable. Here we can see the memory heap usage:
Ack, so this means something changed between 0.17.2 and 0.18? Let's check the v0.19.0-rc.0 I am cutting this week to see if anything helps there; then we can try to look closer at the commit log and bisect at the commit level, especially if you can amazingly reproduce the problem 🤗
@bwplotka I believe something changed between 0.16.0 and 0.17.2 and then apparently got worse in 0.18.0. On 0.17.2 we can see it takes longer to start building up memory usage, but it does start to increase, following the same behavior as 0.18.0. I'll update one of the instances to v0.19.0-rc.0 and get back to you with some more info. Right now I have all nodes on v0.16.0 and memory is as stable as it can get: at 11h18 I had all instances downgraded, and at 11h34 metrics ingestion was back on.
I would start bisecting the commits between 0.17.2 and 0.18.0 🤗 That would be helpful.
I just ran a couple more tests and did not have to go very far on the image tags to notice a pattern change in memory consumption. Here is a screenshot of the memory graphs: the green line is the test instance and the yellow line is the instance running 0.16.0. The first 3 big slopes are versions 0.17.0, 0.17.2 and 0.18.0 consecutively. Starting at 15h49 I'm using a different image tag; the pattern changes, and it looks like after every GC the consumption increases a little. I'll do some more testing tomorrow.
There are cool findings made by @svenwltr: |
We are experiencing the same issue (with both v0.17.2 and v0.18.0), but the downgrade to v0.16.0 seems to help as suggested. The fall in memory usage between 13:20 and 14:40 is caused by the instances being OOM-killed.
We are facing the same issue. Glad this issue exists. Looking forward to a fix!
Thanks a lot! Anyone can help, but I would love to find out what's wrong this week, ideally before 0.19.0; let's see. We got some profiles, so let's see if they are helpful:
Ideally we pinpoint the commit which introduced the regression 🤗 The problem will be if it's the TSDB update (it probably is).
Profiles captured via conprof:
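For context (this example is not from the thread): conprof scrapes the standard Go `net/http/pprof` endpoints, which Thanos components also serve on their HTTP port. A minimal, generic sketch of how any Go service exposes those endpoints; the address is arbitrary and this is not the actual Thanos wiring:

```go
package main

import (
	"log"
	"net/http"
	// Blank import registers the /debug/pprof/* handlers on http.DefaultServeMux.
	_ "net/http/pprof"
)

func main() {
	// Heap profiles are then available at http://localhost:6060/debug/pprof/heap
	// and can be scraped periodically by conprof or pulled ad hoc with `go tool pprof`.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```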
BTW, which metrics exactly are those memory graphs based on? (It matters, before Go 1.16.)
Memory used is
from cAdvisor and Memory Used based on thanos metrics is:
That metric is inflated, see https://www.bwplotka.dev/2019/golang-memory-monitoring/. Some amazing ideas from the Cortex experience are to use the following options (adding those env vars to the Thanos process):
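As background for the point above (this sketch is not from the thread): the Go runtime tracks several distinct memory numbers, and container-level RSS is not the same as the live heap, which is why a single "memory used" graph can be misleading. A minimal sketch of inspecting the runtime's own accounting:

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// HeapAlloc: bytes of live heap objects; closest to what the program actually uses.
	// HeapInuse: heap spans currently in use, so it includes fragmentation overhead.
	// HeapReleased: memory returned to the OS; depending on the Go version and how it was
	// returned (MADV_FREE vs MADV_DONTNEED), it may still be counted in RSS for a while.
	// Sys: total virtual memory obtained from the OS.
	fmt.Printf("HeapAlloc=%dMiB HeapInuse=%dMiB HeapReleased=%dMiB Sys=%dMiB\n",
		m.HeapAlloc>>20, m.HeapInuse>>20, m.HeapReleased>>20, m.Sys>>20)
}
```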
Is this 10 million samples per scrape or total? 0.19.0-rc.1 also leaks for me with roughly 1 million samples per scrape. (Prometheus 2.22.2)
I am not sure that this bugfix, which was merged into the 0.16.0 release branch, made it back into the main branch.
@jmichalek132 I think you're right. It was never merged back into v0.17+ but instead entirely replaced with the ZLabel. I'm wondering if @bwplotka based that work on the fixed label or ignored it and did an entire rewrite?
What's the latest? 🤗
We discussed some of this at today's Contributor Hours: https://docs.google.com/document/d/137XnxfOT2p1NcNUq6NWZjwmtlSdA6Wyti86Pd6cyQhs/edit#heading=h.dmnpchivqkn9 I will try to look into this a bit.
Investigation: I found that #3327 is on v0.16.0 but not on master; it looks like the merge back to master never happened. The long-term fix was merged, but apparently not properly, as this issue is exactly the same as #3265. Long-term fix attempt no. 1: #3279
Thank you all for your patience and help. It's kind of silly, but the fix was already in a PR that was never merged: #3334. The up-to-date fix is available here: #3943 and will be part of v0.19.0 🚀 Some learnings for when we investigate such issues:
So it was easy to tell that this "quick fix" never made it to master 🤗 The fix was then as simple as porting #3334 (stripped of unrelated changes) and ensuring we have profiles that back up our thinking, e.g. these. Before the fix: https://share.polarsignals.com/68255aa/ Notice the big 200MB chunk that no longer exists after the fix 🤗
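To make the class of bug concrete (illustration only, with made-up names; this is not the actual Thanos or #3334 code): retaining a small slice that aliases a large decoded request buffer keeps the entire buffer reachable, while copying the few bytes that are actually needed lets the buffer be garbage-collected.

```go
package main

// retained simulates long-lived storage, e.g. labels kept around after a request is handled.
var retained [][]byte

// keepAliased returns a tiny sub-slice that still shares the backing array of the
// large request buffer, so the garbage collector must keep the whole buffer alive
// for as long as the sub-slice is reachable.
func keepAliased(request []byte) []byte {
	return request[:8]
}

// keepCopied copies the bytes it needs, so the large request buffer becomes
// unreachable and collectable once the caller drops it.
func keepCopied(request []byte) []byte {
	out := make([]byte, 8)
	copy(out, request[:8])
	return out
}

func main() {
	big := make([]byte, 64<<20)                   // stand-in for a decoded remote-write request
	retained = append(retained, keepAliased(big)) // pins the full 64 MiB buffer
	retained = append(retained, keepCopied(big))  // pins only an 8-byte copy
}
```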
Awesome news @bwplotka ! Looking forward to v0.19 |
I've tested 0.19-rc2 and the issue is fixed.
Done then. I also see capacity to reduce resource usage a lot further after v0.19.0. Let's release it and iterate. Thanks all for the help.
Thanos, Prometheus and Golang version used:
thanosio/thanos:v0.17.2
Object Storage Provider:
GCS
What happened:
I'm trying to upgrade from v0.16.0 to v0.17.2 and see that thanos-receive memory is "leaking":
Here are 3 thanos-receive pods in a hashring with equal load; the red and blue lines are v0.16.0. At 18:30 I restart thanos-receive-0 (orange line) as v0.17.2, then at 19:33 I restart it back as v0.16.0.
GC load profile also differs between versions:
Related:
#3265
What you expected to happen:
Stable memory usage.
How to reproduce it (as minimally and precisely as possible):
Only reproducible in production with 80k samples/s per thanos-receive pod.
Full logs to relevant components:
Attaching pprof heap.zip, right before second restart.
heap.zip
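As an aside (not from the issue itself): a heap profile like the attached one is a standard Go runtime/pprof heap dump. For a running Thanos process it is typically fetched from the /debug/pprof/heap HTTP endpoint; in a plain Go program the equivalent dump can be written like this (file name and the explicit GC call are illustrative choices):

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	f, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	runtime.GC() // update heap statistics before dumping the profile
	if err := pprof.WriteHeapProfile(f); err != nil {
		log.Fatal(err)
	}
	// Inspect with: go tool pprof heap.pprof
}
```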