Change Feed Pull Model Memory Issues #2087

Closed
abranaugh opened this issue Dec 21, 2020 · 7 comments
Labels: bug, ChangeFeed, Hotfix, QUERY

Comments

@abranaugh

We are continuously addressing and improving the SDK. If possible, make sure the problem persists in the latest SDK version.

Describe the bug
We are trying to update our SDK from 3.9.1 - preview to the latest preview version. We use the change feed pull model. After we updated to 3.15.2 - preview and deployed, our machines quickly ran out of RAM. According to the logs, memory use reached about 23 GB before crashing. I was able to reproduce this locally, where the RAM usage on my machine ballooned quickly.

I put together a console app to verify that the behavior wasn't caused by anything in our code; its only dependency is the Cosmos SDK. In my test, I read from the change feed for 10 minutes and monitored the diagnostics in VS2019. I tested several versions and took screenshots of the memory usage in each diagnostic session. These tests ran against a collection of about 300 GB.

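3.15.2 - Preview: about 1.8 GB after 10 minutes
[image: memory usage in the diagnostic session]

3.15.1 - Preview: 900 MB after 10 minutes
[image: memory usage in the diagnostic session]

3.15.0 - Preview: 600 MB after 10 minutes
[image: memory usage in the diagnostic session]

3.14.0 - Preview: 150 MB after 10 minutes
[image: memory usage in the diagnostic session]

3.13.0 - Preview: 150 MB after 10 minutes
[image: memory usage in the diagnostic session]

3.9.1 - Preview: 150 MB after 10 minutes
[image: memory usage in the diagnostic session]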

Here is the code for the methods I ran:

3.13 and up

public async Task RunFeed()
{
    var connection = "";
    var container = new CosmosClient(connection).GetContainer("", "");
    var totalDocs = 0;
    var options = new ChangeFeedRequestOptions
    {
        PageSizeHint = 5000
    };

    using (var iterator = container.GetChangeFeedStreamIterator(ChangeFeedStartFrom.Beginning(), options))
    {
        var watch = Stopwatch.StartNew();
        while (iterator.HasMoreResults && watch.Elapsed < TimeSpan.FromMinutes(10))
        {
            using (var response = await iterator.ReadNextAsync())
            {
                if (!response.IsSuccessStatusCode)
                {
                    break;
                }

                using (var changes = await JsonDocument.ParseAsync(response.Content))
                {
                    var count = changes.RootElement.GetProperty("_count").GetInt32();
                    Console.WriteLine($"Retrieved {count} documents");
                    totalDocs += count;
                }
            }
        }
        Console.WriteLine($"Found {totalDocs} total documents in {watch.Elapsed.TotalMinutes} minutes");
    }
}

3.9.1 - Preview

public async Task RunFeed()
{
    var connection = "";
    var container = new CosmosClient(connection).GetContainer("", "");
    var totalDocs = 0;
    
    var iterator = container.GetChangeFeedStreamIterator(changeFeedRequestOptions: new ChangeFeedRequestOptions
    {
        StartTime = DateTime.MinValue,
        MaxItemCount = 5000
    });
    
    var watch = Stopwatch.StartNew();
    while (iterator.HasMoreResults && watch.Elapsed < TimeSpan.FromMinutes(10))
    {
        using (var response = await iterator.ReadNextAsync())
        {
            using (var changes = await JsonDocument.ParseAsync(response.Content))
            {
                var count = changes.RootElement.GetProperty("_count").GetInt32();
                Console.WriteLine($"Retrieved {count} documents");
                totalDocs += count;
            }
        }
    }
    
    Console.WriteLine($"Found {totalDocs} total documents in {watch.Elapsed.TotalMinutes} minutes");
}
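
For anyone who wants to compare versions without the VS2019 diagnostic tools, here is a rough sketch of sampling memory from inside the process while either loop above runs. `MemorySampler` is a hypothetical helper written for illustration, not part of the Cosmos SDK, and the numbers it reports will not match the Visual Studio graphs exactly.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of the Cosmos SDK): samples process memory
// so growth can be compared across SDK versions without a profiler.
public static class MemorySampler
{
    // Returns the managed heap size and the process working set, both in bytes.
    public static (long ManagedBytes, long WorkingSetBytes) Sample()
    {
        using (var process = Process.GetCurrentProcess())
        {
            // GC.GetTotalMemory(false) reads the managed heap size without forcing a collection.
            return (GC.GetTotalMemory(false), process.WorkingSet64);
        }
    }

    // Formats a sample in megabytes using the invariant culture so logs are comparable.
    public static string Format((long ManagedBytes, long WorkingSetBytes) sample)
    {
        const double BytesPerMb = 1024 * 1024;
        return FormattableString.Invariant(
            $"managed={sample.ManagedBytes / BytesPerMb:F1} MB, workingSet={sample.WorkingSetBytes / BytesPerMb:F1} MB");
    }
}
```

Calling `Console.WriteLine(MemorySampler.Format(MemorySampler.Sample()));` after each `ReadNextAsync` page would give one line per page. A managed heap that keeps climbing points at retained objects, while a working set that grows with a flat managed heap would suggest native or pooled buffers.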

Environment summary
SDK Version: 3.15.0 - Preview, 3.15.1 - Preview, 3.15.2 - Preview
.NET Core 3.1

j82w added the bug, ChangeFeed, and Hotfix labels on Dec 22, 2020
j82w assigned j82w and sboshra, then unassigned j82w, on Dec 22, 2020
@j82w
Contributor

j82w commented Dec 22, 2020

Likely caused by #1933

@j82w
Contributor

j82w commented Dec 22, 2020

@abranaugh please contact support if this is impacting your production environment, so that a support ticket is created and the issue is properly prioritized.

@j82w
Contributor

j82w commented Dec 22, 2020

@abranaugh any chance you can provide a memory dump or perf trace to help find the root cause of the issue?

@abranaugh
Author

@j82w I can provide a memory dump. How would you like me to send it to you? The dump from the process running 3.15.2 - preview was about 1.8 GB. I don't want to post it publicly on GitHub because it could expose our connection strings and other secrets. I can open a support ticket and attach it there if that is easiest.

@j82w
Contributor

j82w commented Dec 22, 2020

Please open a support ticket and attach the memory dump; that is the easiest way.

@shibayan

I also encountered this problem with the latest preview version (3.15.2-preview). Based on the findings in this issue, I switched to 3.14.0-preview, but there I ran into issue #1918, so it is difficult to choose a version.

@j82w
Contributor

j82w commented Feb 8, 2021

Fix in PR #2129
