This repository has been archived by the owner on Oct 22, 2021. It is now read-only.

Doppler out of memory when scaled #241

Closed
jkbschmid opened this issue Dec 9, 2019 · 8 comments · Fixed by #619
Labels: Priority: High, Size: 8, Status: Verification Needed, SUSE, Type: Bug

Comments

@jkbschmid

Describe the bug
When scaling to two Doppler instances and deploying an application, the memory consumption increases until either the k8s node crashes or Doppler is killed by the OOMKiller.

To Reproduce
Install kubecf master with cf-operator master.
Run cf push with the example Dora app.
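
For context, a minimal reproduction sketch (assuming Helm 3, charts built from the master branches, and the Dora app from cf-acceptance-tests/assets/dora; paths and namespaces are placeholders, not taken from this issue):

    # Install cf-operator and kubecf from locally built master charts
    # (paths are placeholders; operator options such as the watched namespace are omitted).
    kubectl create namespace cf-operator
    helm install cf-operator ./cf-operator.tgz --namespace cf-operator
    kubectl create namespace kubecf
    helm install kubecf ./kubecf.tgz --namespace kubecf

    # After targeting the deployed CF (cf api / cf login), push the example Dora app.
    cd cf-acceptance-tests/assets/dora
    cf push dora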

Expected behavior
Doppler should consume a reasonable amount of memory.

Environment

  • CF Version: v8 and v12
@loewenstein

@viovanov blocker for 1.0?

@f0rmiga added the Priority: Critical and Type: Bug labels on Dec 30, 2019
@f0rmiga added this to the 0.2.0 milestone on Dec 30, 2019
@f0rmiga self-assigned this on Jan 6, 2020
@fargozhu added the SUSE label on Jan 7, 2020
@f0rmiga (Member) commented Jan 23, 2020

The problem here seems to be with log-cache. In my initial investigation, I couldn't find the exact cause. For the 0.2 release, I'm going to restrict HA for doppler. We should get back to it for 1.0, though.

@f0rmiga modified the milestones: 0.2.0, 1.0.0 on Jan 23, 2020
@f0rmiga removed their assignment on Jan 23, 2020
@bikramnehra self-assigned this on Jan 24, 2020
@f0rmiga (Member) commented Jan 24, 2020

I changed the priority from Critical to High given that there is a temporary fix in place already.

@loewenstein

I wouldn't call "don't scale it" a temporary fix for "this can't be scaled".
What does it mean for logs if Doppler is a singleton? Potential delay, or loss of logs? If it is the latter, I would still judge it critical, I guess. Unless there are other issues left that are comparable to completely failing k8s nodes, of course.

@f0rmiga (Member) commented Jan 25, 2020

@loewenstein Do you have spare cycles to help debug this?

@bikramnehra (Member)

It seems the originally proposed solution is not going to work out. An issue has been filed upstream, though it might take some time to fix since the root cause hasn't been identified yet.

Since the problematic part is the log-cache job, one potential fix, as proposed by @f0rmiga, is to leave the log-cache job outside of the doppler instance group and avoid scaling it altogether.
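
For illustration, a workaround along those lines could pin doppler to a single replica via the Helm sizing values; the sizing.doppler.instances key below is an assumption about kubecf's values layout, not something confirmed in this thread:

    # Hypothetical override (key name assumed): keep doppler as a singleton
    # until log-cache is split out of the doppler instance group.
    helm upgrade kubecf ./kubecf.tgz --namespace kubecf \
      --set "sizing.doppler.instances=1"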

@f0rmiga self-assigned this on Mar 2, 2020
@f0rmiga (Member) commented Mar 6, 2020

As discussed with @viovanov, we will keep it as is for now.

@f0rmiga assigned viovanov and unassigned f0rmiga on Mar 6, 2020
@fargozhu modified the milestones: 1.0.0, 1.1.0 on Mar 12, 2020
@fargozhu modified the milestones: 1.0.1, 1.2.0 on Mar 19, 2020
@viovanov (Member) commented Mar 26, 2020

Based on discussions, upstream's VM deployments don't seem to be affected by this issue.

We should implement the split of log-cache from doppler.

@aduffeck self-assigned this on Mar 30, 2020
@f0rmiga mentioned this issue on Apr 3, 2020
@fargozhu modified the milestones: Next Release, 2.0.0 on Apr 16, 2020