
Possible memory leak after updating to 1.0.7.0 #123

Closed
zlatinal opened this issue Apr 9, 2019 · 8 comments

@zlatinal

zlatinal commented Apr 9, 2019

Issue description

Snowflake.Data is throwing a System.OutOfMemoryException.

After updating to version 1.0.7.0, some of our automated tests started failing. These tests run daily and query Snowflake through snowflake-connector-net, retrieving anywhere from a few rows to 500000 rows. They were passing with version 1.0.4.0 and started to fail after the update to 1.0.7.0. We also updated to 1.0.8.0 hoping for a fix, but the issue persists.
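For reference, our tests hit the connector through the standard ADO.NET surface, roughly as sketched below (a minimal repro shape; the connection string and query text are placeholders, not our real values):

```csharp
using System.Data;
using Snowflake.Data.Client;

// Open a connection, run one query, drain the whole result set.
// ACCOUNT/USER/PASSWORD and the query are placeholders.
using (IDbConnection conn = new SnowflakeDbConnection())
{
    conn.ConnectionString = "account=ACCOUNT;user=USER;password=PASSWORD";
    conn.Open();

    using (IDbCommand cmd = conn.CreateCommand())
    {
        cmd.CommandText = "SELECT * FROM my_table LIMIT 50000";
        using (IDataReader reader = cmd.ExecuteReader())
        {
            long rows = 0;
            // The OutOfMemoryException in the log above surfaces
            // from inside Read(), while chunks are being downloaded.
            while (reader.Read())
            {
                rows++;
            }
        }
    }
}
```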

Error log

 System.AggregateException:
   at System.Threading.Tasks.Task`1.GetResultCore (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Threading.Tasks.Task`1.get_Result (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Snowflake.Data.Core.SFResultSet.Next (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   at Snowflake.Data.Client.SnowflakeDbDataReader.Read (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   ....
   ....
   ....
 Inner exception System.OutOfMemoryException handled at System.Threading.Tasks.Task`1.GetResultCore:
   at Snowflake.Data.Core.SFReusableChunk+BlockResultData.allocateArrays (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   at Snowflake.Data.Core.SFReusableChunk+BlockResultData.add (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   at Snowflake.Data.Core.ReusableChunkParser.ParseChunk (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   at Snowflake.Data.Core.SFBlockingChunkDownloaderV3.ParseStreamIntoChunk (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)
   at Snowflake.Data.Core.SFBlockingChunkDownloaderV3+<DownloadChunkAsync>d__15.MoveNext (Snowflake.Data, Version=1.0.8.0, Culture=neutral, PublicKeyToken=null)

Configuration

Driver version: 1.0.8.0

@howryu howryu self-assigned this Apr 9, 2019
@howryu
Contributor

howryu commented Apr 9, 2019

The goal of that change was to reduce memory usage. Do you run multiple tests within the same process? I did not explicitly call GC.Collect(); maybe I should.

@manigandham
Contributor

Please don't call GC.Collect() inside the library. It affects the entire running program and doesn't solve the memory leak if there is one.

@zlatinal
Author

The tests are only calling our API, which in turn queries Snowflake (using the connector). I don't expect the tests themselves to be the cause, as they are independent acceptance tests. I only mentioned them because they highlighted the issue quite well: one day they work, and the next day (after updating) they fail. The failure occurs when running 10 queries, each retrieving around 50000 rows, 500000 rows in total.

@ChTimTsubasa
Contributor

@zlatinal In your test case, are the queries happening sequentially or concurrently?

@ChTimTsubasa
Contributor

The change included in 1.0.7.0 switches to a new result chunk downloader that reuses its buffer when downloading chunks.

I created a micro-benchmark that runs 10 queries sequentially on top of both the new downloader and the old downloader. I ran two rounds:

  1. Each query scans 1000000 rows
     New downloader: [screenshot: memory profile]
     Old downloader: [screenshot: memory profile]

  2. Each query scans 5000 rows
     New downloader: [screenshot: memory profile]
     Old downloader: [screenshot: memory profile]
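A sequential benchmark of this shape can be sketched as follows (hypothetical reconstruction, not the actual benchmark code; GC.GetTotalMemory is used here as a rough stand-in for the profiler output shown above):

```csharp
using System;
using Snowflake.Data.Client;

static void RunBenchmark(string connectionString, string query)
{
    // 10 sequential queries, sampling the managed heap after each one.
    // connectionString and query are placeholders.
    for (int i = 0; i < 10; i++)
    {
        using (var conn = new SnowflakeDbConnection())
        {
            conn.ConnectionString = connectionString;
            conn.Open();
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = query;
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read()) { /* drain the result set */ }
                }
            }
        }
        long mb = GC.GetTotalMemory(forceFullCollection: false) / (1024 * 1024);
        Console.WriteLine($"run {i}: ~{mb} MB managed heap");
    }
}
```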

From the results of the first round I don't see a memory leak introduced by the new downloader; its memory footprint is actually better (less GC and lower memory usage on average). However, the second round clearly reveals a performance regression for that workload.

After some investigation, there is definitely more improvement that can be made to the new downloader to reduce memory for small workloads, but that would be a time-consuming task.

For now, you can do one of the following:

  1. Switch back to a driver version before 1.0.7.0.
  2. Wait for the next release, in which we will add a configurable option to switch back to the old downloader.
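For option 1, pinning the package to the last release before the new downloader can be done in the project file, e.g.:

```xml
<!-- Pin Snowflake.Data to the last version before the new chunk downloader. -->
<ItemGroup>
  <PackageReference Include="Snowflake.Data" Version="1.0.6.0" />
</ItemGroup>
```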

@zlatinal
Author

@ChTimTsubasa Thank you for the suggestions, we'll downgrade to 1.0.6.0 until the configurable option is implemented.

To answer your previous question: The queries are concurrent.

@CameronFiederer

Hey, has there been any movement on this? Our systems are using 1.0.6, but I think that blocks out a fairly sizeable chunk of RAM (at least 2 GB?). Ideally we'd like to reduce our memory footprint when using this driver.

@sfc-gh-jfan sfc-gh-jfan reopened this Jul 1, 2022
@github-actions github-actions bot closed this as completed Jul 2, 2022
@sfc-gh-jfan sfc-gh-jfan reopened this Jul 6, 2022
@sfc-gh-igarish
Collaborator

To clean up and re-prioritize more pressing bugs and feature requests we are closing all issues older than 6 months as of April 1, 2023. If there are any issues or feature requests that you would like us to address, please create them according to the new templates we have created. For urgent issues, opening a support case with this link Snowflake Community is the fastest way to get a response.

8 participants