
Streaming queries are very inefficient #1195

Closed
bboreham opened this issue Jan 21, 2019 · 6 comments · Fixed by #4341
Labels
keepalive (Skipped by stale bot), type/performance

Comments

@bboreham
Contributor

I noticed high resource usage in the ruler and traced it back to a change where I turned on:

   - -querier.batch-iterators=true
   - -querier.ingester-streaming=true

Upon reverting this change, CPU went down to a third of what it was, memory down to a quarter and network traffic to a fifth.

Profiling suggests vast amounts of memory being used here:

github.com/cortexproject/cortex/pkg/querier/batch.newMergeIterator
/go/src/github.com/cortexproject/cortex/pkg/querier/batch/merge.go
  Total:      6.78TB     7.29TB (flat, cum) 38.52%
     20            .          .            
     21            .          .           	currErr error 
     22            .          .           } 
     23            .          .            
     24            .          .           func newMergeIterator(cs []chunk.Chunk) *mergeIterator { 
     25            .   151.63GB           	css := partitionChunks(cs) 
     26       5.39GB     5.39GB           	its := make([]*nonOverlappingIterator, 0, len(css)) 
     27            .          .           	for _, cs := range css { 
     28            .   365.93GB           		its = append(its, newNonOverlappingIterator(cs)) 
     29            .          .           	} 
     30            .          .            
     31            .          .           	c := &mergeIterator{ 
     32            .          .           		its:        its, 
     33      10.82GB    10.82GB           		h:          make(iteratorHeap, 0, len(its)), 
     34       3.36TB     3.36TB           		batches:    make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     35       3.40TB     3.40TB           		batchesBuf: make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     36            .          .           	} 
     37            .          .            
     38            .          .           	for _, iter := range c.its { 
     39            .          .           		if iter.Next(1) { 
     40            .          .           			c.h = append(c.h, iter) 

I am unclear why those sizes are multiplied by promchunk.BatchSize - they are allocating slices of Batch, which are already sized that big.
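For a rough sense of scale, here is a back-of-the-envelope sketch (not taken from the Cortex code; it assumes promchunk.BatchSize is 12 and models each Batch as BatchSize timestamps and BatchSize values plus two ints of bookkeeping):

// Back-of-the-envelope sketch (not Cortex code) of the up-front cost of
// newMergeIterator. The sizes below are assumptions: promchunk.BatchSize
// is taken to be 12, and each Batch is modelled as BatchSize timestamps
// (int64) plus BatchSize values (float64) plus two ints of bookkeeping.
package main

import "fmt"

const batchSize = 12 // assumed value of promchunk.BatchSize

func main() {
	const numIterators = 1000 // e.g. one nonOverlappingIterator per series

	perBatchBytes := batchSize*8 + batchSize*8 + 16 // ≈ 208 bytes per Batch
	batchesCap := numIterators * 2 * batchSize      // capacity asked for per batchStream

	// Two such streams (batches and batchesBuf) are allocated for every
	// mergeIterator, i.e. for every query that goes through this path,
	// regardless of how many samples are eventually read.
	fmt.Printf("one batchStream:   %d Batches ≈ %d KiB\n",
		batchesCap, batchesCap*perBatchBytes/1024)
	fmt.Printf("both batchStreams: ≈ %d KiB per query\n",
		2*batchesCap*perBatchBytes/1024)
}

Repeated on every query or rule evaluation, an up-front cost like that would be consistent with the TB-scale cumulative allocations in the profile above.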

@nschad
Contributor

nschad commented Jun 22, 2021

Is this still a thing?

@bboreham
Contributor Author

I haven't profiled recently, but the code still looks the same to me.

func newMergeIterator(cs []GenericChunk) *mergeIterator {
	css := partitionChunks(cs)
	its := make([]*nonOverlappingIterator, 0, len(css))
	for _, cs := range css {
		its = append(its, newNonOverlappingIterator(cs))
	}
	c := &mergeIterator{
		its:        its,
		h:          make(iteratorHeap, 0, len(its)),
		batches:    make(batchStream, 0, len(its)*2*promchunk.BatchSize),
		batchesBuf: make(batchStream, len(its)*2*promchunk.BatchSize),
	}

@nschad
Contributor

nschad commented Jun 22, 2021

I haven't profiled recently, but the code still looks the same to me.

func newMergeIterator(cs []GenericChunk) *mergeIterator {
	css := partitionChunks(cs)
	its := make([]*nonOverlappingIterator, 0, len(css))
	for _, cs := range css {
		its = append(its, newNonOverlappingIterator(cs))
	}
	c := &mergeIterator{
		its:        its,
		h:          make(iteratorHeap, 0, len(its)),
		batches:    make(batchStream, 0, len(its)*2*promchunk.BatchSize),
		batchesBuf: make(batchStream, len(its)*2*promchunk.BatchSize),
	}

Isn't this a major problem, when CPU/memory/bandwidth explode like you described? Especially since this is considered a good default (or even is the default)? 🤔

Edit: It is literally the default

@bboreham
Contributor Author

Depends on how much you use the ruler (or queries) compared to everything else.

Also for the blocks store (which we recommend over chunks), I don't think you will go through that path unless you also turn on the experimental -ingester.stream-chunks-when-using-blocks option.

@nschad
Contributor

nschad commented Jun 23, 2021

Depends how much you use the ruler (or queries) compared to everything else.

Also for the blocks store (which we recommend over chunks), I don't think you will go through that path unless you also turn on the experimental -ingester.stream-chunks-when-using-blocks option.

Ok, cool. I misunderstood; I thought this also applied to block storage users.

@bboreham
Contributor Author

bboreham commented Jul 6, 2021

I figured out what was happening in the code, and also why it hit the ruler more than the queriers - it is maximally inefficient for instant queries.
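One way to see why instant queries would be the worst case (a hypothetical illustration, not Cortex code; the constants are assumptions carried over from the earlier sketch): the pre-allocation is a fixed per-query cost paid before any sample is read, while an instant query only reads roughly one sample per series.

// Hypothetical illustration of allocation overhead per sample read for an
// instant query versus a range query. Constants are assumptions, not
// measurements from Cortex.
package main

import "fmt"

const (
	batchSize     = 12   // assumed promchunk.BatchSize
	numIterators  = 1000 // series touched by the query
	bytesPerBatch = 208  // rough Batch size from the earlier sketch
)

func main() {
	// Fixed cost of building the mergeIterator: two batchStreams, each with
	// capacity numIterators*2*batchSize, paid regardless of samples read.
	fixedBytes := 2 * numIterators * 2 * batchSize * bytesPerBatch

	instantSamples := numIterators * 1              // instant query: ~1 sample per series
	rangeSamples := numIterators * (24 * 3600 / 15) // e.g. 24h range at a 15s scrape interval

	fmt.Printf("fixed allocation: ~%d MiB\n", fixedBytes/(1024*1024))
	fmt.Printf("instant query: ~%d bytes allocated per sample read\n", fixedBytes/instantSamples)
	fmt.Printf("range query:   ~%d bytes allocated per sample read\n", fixedBytes/rangeSamples)
}

Under those assumptions the fixed allocation dwarfs the work an instant query actually does, which would explain why the ruler, issuing instant queries on every rule evaluation, was hit hardest.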
