
Fix memory issue & improve processing speed #212

Merged
merged 12 commits into master on Aug 29, 2022
Conversation

mga-chka
Collaborator

@mga-chka mga-chka commented Aug 25, 2022

Description

There is a huge memory overhead when caching data in either redis or fscache (a regression introduced by the redis cache feature).
It was raised by 2 people, in this issue
#206
and this PR
#210

The aim of this PR is to fix it and improve (if possible) the query processing speed.

Here is a summary of the improvements.
All the scenarios are improved when cache is on:

  • scenario 1: file system cache on cache miss
  • scenario 2: file system cache on cache hit
  • scenario 3: redis cache on cache miss
  • scenario 4: redis cache on cache hit

And here are the results of a benchmark on my laptop (it consists of sequentially running 20 queries that each fetch 64MB of data from ClickHouse).
Before the fix:

  • scenario 1: heap size: 150MB | processing speed: 220MB/sec then* 70MB/sec
  • scenario 2: heap size: 150MB | processing speed: 1.2GB/sec
  • scenario 3: heap size: 750MB | processing speed: 120MB/sec then* 80MB/sec
  • scenario 4: heap size: 350MB | processing speed: 140MB/sec then* 70MB/sec

After the fix:

  • scenario 1: heap size: 10MB** | processing speed: 220MB/sec then* 70MB/sec
  • scenario 2: heap size: 10MB** | processing speed: 3.1GB/sec
  • scenario 3: heap size: 15MB** | processing speed: 140MB/sec
  • scenario 4: heap size: 17MB** | processing speed: 1.2GB/sec

** The heap size at startup is 7MB
* The throughput decreases a lot after 5-8 queries, likely because my SSD's speed degrades on bursts (because of some internal caches, I guess) or because of the turbo boost of Intel CPUs.

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

  • Linter passes correctly
  • Add tests which fail without the change (if possible)
  • All tests passing
  • Extended the README / documentation, if necessary

Does this introduce a breaking change?

  • Yes
  • No
    This PR makes objects stored in the fs & redis caches by previous versions unreachable, because the key generation has changed (the cached format is different)

Further comments

cache/cache.go Outdated
@@ -24,7 +24,7 @@ type ContentMetadata struct {

type CachedData struct {
ContentMetadata
Data io.Reader
Data io.ReadCloser
Contributor

I think it'd be worth leaving a comment here explaining why a client of this object is responsible for closing the reader

Comment on lines 103 to 106
conditionalStr := ""
if len(value) > 30 {
conditionalStr = "..."
}
Contributor

It's odd; I guess it's there to show that we only display a fraction of the data. Is it needed? If you think so, maybe we can extract 30 into a variable and rename conditionalStr to logSuffix?

Collaborator Author

it's needed because I added a test with a 4MB string; if that test fails, the logs would contain a 4MB string...
I'll do the changes requested
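Putting the reviewed snippet together with the requested renames, the truncation helper might look like this (the constant name is a hypothetical choice, not from the PR):

```go
package main

import "fmt"

// maxLoggedValueLength is a hypothetical name for the extracted constant
// the reviewer suggests instead of the literal 30.
const maxLoggedValueLength = 30

// truncateForLog logs at most maxLoggedValueLength bytes of a value and
// marks the cut with "...", so a failing test with a 4MB payload doesn't
// flood the logs.
func truncateForLog(value string) string {
	logSuffix := ""
	if len(value) > maxLoggedValueLength {
		logSuffix = "..."
		value = value[:maxLoggedValueLength]
	}
	return value + logSuffix
}

func main() {
	fmt.Println(truncateForLog("short"))
	fmt.Println(truncateForLog("a very long cached value that would otherwise flood the test logs"))
}
```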

@@ -91,11 +83,11 @@ func (r *redisCache) nbOfBytes() uint64 {
return uint64(cacheSize)
}

/*
Contributor

We should remove the comment

Collaborator Author

my bad, good catch

Comment on lines 131 to 135
if err != nil {
log.Errorf("failed to decode payload: %s , due to: %v ", payload.Payload, err)
log.Errorf("failed to get key %s with error: %s", key.String(), err)
return nil, ErrMissing
}
stringKey := key.String()
Contributor

You could move it before the err handling and use stringKey inside the log

func (m io_reader_decorator) Close() error {
return nil
}
func (r *redisCache) stringToBytes(s string) []byte {
Contributor

I think it's more than stringToBytes. The way I understand it, this function:

  • encodes the length of the string in the first 4 bytes
  • appends the string to the slice

Can we call it encodeString and leave a comment? Reading this function takes a moment

Contributor

Actually I just realised that it's the same code as in the filesystem cache, isn't it? (both encode and decode; there they're called writeHeader and readHeader, which is maybe better naming, except that there we write to a writer and read from a reader)

Collaborator Author

it really is transforming a string into a []byte for serialization purposes. I'll rename it to encodeString (though we lose the information that the string becomes a byte array). Yes, the logic is close to writeHeader/readHeader, except that it's more granular, since we're only dealing with a single string and not the whole headers.
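The length-prefix scheme described above (4-byte length, then the raw string bytes, with the decoder returning the value and the `4 + n` bytes consumed) can be sketched as follows. The byte order is an assumption for the sketch; the PR's exact encoding isn't shown in this excerpt.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeString writes a 4-byte big-endian length prefix followed by the
// raw string bytes, matching the scheme the reviewer describes.
func encodeString(s string) []byte {
	b := make([]byte, 4+len(s))
	binary.BigEndian.PutUint32(b, uint32(len(s)))
	copy(b[4:], s)
	return b
}

// decodeString is the inverse; it returns the string and the number of
// bytes consumed (4 + n), mirroring the stringFromBytes signature above.
func decodeString(b []byte) (string, int) {
	n := binary.BigEndian.Uint32(b)
	return string(b[4 : 4+n]), int(4 + n)
}

func main() {
	enc := encodeString("hello")
	s, consumed := decodeString(enc)
	fmt.Println(s, consumed)
}
```

Because neither function touches redis, they can live in a standalone marshalling file, as suggested below.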

return b
}

func (r *redisCache) stringFromBytes(bytes []byte) (string, int) {
Contributor

And then we could call that one decodeString. Also, I think neither of them is tightly coupled to the redis cache; I'd extract them out to a marshalling file or something like that

return string(s), int(4 + n)
}

func (r *redisCache) metadataToByte(contentMetadata *ContentMetadata) []byte {
Contributor

I'd suggest adding a method receiver on ContentMetadata instead of keeping it in the redis cache

Collaborator Author

what do you mean?
FYI I renamed this function to encodeMetadata in order to be consistent with encodeString
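What the reviewer means by "a method receiver on ContentMetadata" can be sketched like this; the field set and encoding are assumptions based on the diff context, not the PR's actual layout:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ContentMetadata fields are assumed from the surrounding diff context
// (Length is referenced later in the Get flow shown in this review).
type ContentMetadata struct {
	Length uint64
	Type   string
}

// encode attaches the serialization to ContentMetadata itself, instead of
// keeping a metadataToByte helper on redisCache: length, then a 4-byte
// length-prefixed Type string.
func (m *ContentMetadata) encode() []byte {
	b := make([]byte, 8+4+len(m.Type))
	binary.BigEndian.PutUint64(b, m.Length)
	binary.BigEndian.PutUint32(b[8:], uint32(len(m.Type)))
	copy(b[12:], m.Type)
	return b
}

func main() {
	m := ContentMetadata{Length: 64 << 20, Type: "text/csv"}
	fmt.Println(len(m.encode()))
}
```

The benefit is that any cache backend (fs or redis) can reuse the same serialization without duplicating it.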

}
// this struct is here because CachedData requires an io.ReadCloser
// but the logic in the Get function generates only an io.Reader
type io_reader_decorator struct {
Contributor

to be more golang compliant

Suggested change
type io_reader_decorator struct {
type ioReaderDecorator struct {

return r.expire, nil
}

func (r *redisCache) Name() string {
return r.name
}

func toBytes(stream io.Reader) ([]byte, error) {
buf := new(bytes.Buffer)
type RedisStreamReader struct {
Contributor

it can be private I think

Suggested change
type RedisStreamReader struct {
type redisStreamReader struct {

}
}

func (r *RedisStreamReader) Read(destBuf []byte) (n int, err error) {
Contributor

Very smart! :)

}

func TestRedisCacheMiss(t *testing.T) {
c := generateRedisClientAndServer(t)
cacheMissHelper(t, c)
}
func TestStringFromToByte(t *testing.T) {
c := generateRedisClientAndServer(t)
Contributor

It's not needed to test the encode and decode functions (once they're decoupled from the redis cache)

Collaborator Author

good point. I asked myself this question before writing this test, but I thought a dedicated test that only focuses on this part would make debugging easier if it contains a problem (it was at least very helpful while coding this part).
Even if it's redundant with other tests, it's still worth keeping IMHO

// But before that, since consuming the reader could take time and the object in redis could disappear between 2 fetches,
// we need to make sure the TTL is long enough to avoid nasty side effects
// nb: it would be better to retry the flow if such a failure happened, but this requires a huge refactoring of proxy.go
if ttl <= 5*time.Second {
Contributor

Can you extract it to a const next to the other timeouts?

Comment on lines 178 to 179
func (r *redisCache) decodeMetadata(b []byte) (*ContentMetadata, int) {
cLength := uint64(b[7]) | (uint64(b[6]) << 8) | (uint64(b[5]) << 16) | (uint64(b[4]) << 24) | uint64(b[3])<<32 | (uint64(b[2]) << 40) | (uint64(b[1]) << 48) | (uint64(b[0]) << 56)
Contributor

I think we're missing error handling here. What if a malicious actor tampers with payloads in redis? It would make chproxy panic. We could handle that more securely by ignoring the cached data. WDYT?

}

func (e *RedisCacheCorruptionError) Error() string {
return "chproxy can't decode the cached result from redis, it seems to have been corrupted"
Contributor

Maybe it'd be worth adding info about the key, in order to ease analysing the payload
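Carrying the key in the error could look like this; the Key field is an assumption about how it might be threaded through, not the PR's final shape:

```go
package main

import "fmt"

// RedisCacheCorruptionError includes the cache key, per the suggestion
// above, so the corrupted payload can be located and analysed in redis.
type RedisCacheCorruptionError struct {
	Key string
}

func (e *RedisCacheCorruptionError) Error() string {
	return fmt.Sprintf("chproxy can't decode the cached result for key %q from redis, it seems to have been corrupted", e.Key)
}

func main() {
	err := &RedisCacheCorruptionError{Key: "query-42"}
	fmt.Println(err.Error())
}
```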

@@ -126,17 +133,37 @@ func (r *redisCache) Get(key *Key) (*CachedData, error) {
}
// since the cached results in redis can be too big, we can't fetch them all at once because of the memory overhead.
// We create an io.Reader that fetches from redis chunk by chunk to reduce memory usage.
redisStreamreader := newRedisStreamReader(uint64(offset), r.client, stringKey, metadata.Length)
Contributor

maybe we could instantiate it after the check whether the ttl is small?

Collaborator Author

I don't see what it would change, since we need the redisStreamreader whether the TTL is small (it will be used to write into the tmp file) or long (it will be used to write the http response)

Contributor

oh sorry, I misread it. Forget about the comment :)

Contributor

@gontarzpawel left a comment

Thank you @mga-chka , great enhancement 💪

@mga-chka mga-chka merged commit 3f3288a into master Aug 29, 2022
@mga-chka mga-chka deleted the fix-memory-issue branch August 29, 2022 09:48