Question: Re-using interactive shell for multiple notebook executions #97

We have a scenario where we are running the same notebook thousands of times, and we are seeing memory usage increase significantly as we progress. Initial investigation suggests the incremental memory comes primarily from importing pandas and numpy in the notebooks. We were thinking that if we could import pandas and numpy in the client._shell once and then re-use that shell, we might be able to keep our memory usage under control. I am looking into this now, but wondered if it is something you had already explored or even already support. Thank you.
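For illustration, a minimal sketch of the pattern described above: executing many notebooks in the same process with ploomber-engine's documented PloomberClient API (the notebook paths here are hypothetical). Each client builds its own IPython shell, which is the object the question proposes to create once and re-use.

```python
# Illustrative sketch only: run many notebooks in-process with ploomber-engine.
# Every PloomberClient builds a fresh IPython shell, so imports such as pandas
# and numpy are re-executed for each notebook.
from ploomber_engine.ipython import PloomberClient

notebooks = ["case_001.ipynb", "case_002.ipynb"]  # hypothetical test notebooks

for path in notebooks:
    client = PloomberClient.from_path(path)
    client.execute()  # returns the executed notebook object
```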
Hi, we haven't explored this. Are you running the notebook iterations one at a time? If you are and you're seeing memory usage increase, that might be a bug; a user reported something similar and fixed it (#75), but there might be other leaks yet. A quick way to fix this is to run ploomber-engine via the subprocess module; this way you'll ensure that each call completely wipes out its memory.
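A minimal sketch of that workaround (not from the thread; it assumes ploomber-engine's documented execute_notebook function and uses hypothetical notebook paths): each notebook runs in a fresh child process, so its memory is released when that process exits.

```python
# Sketch of the subprocess workaround: each notebook executes in its own
# Python process, so memory is fully reclaimed when the child exits.
import subprocess
import sys

notebooks = ["case_001.ipynb", "case_002.ipynb"]  # hypothetical paths

for nb in notebooks:
    code = (
        "from ploomber_engine import execute_notebook; "
        f"execute_notebook({nb!r}, {nb!r}.replace('.ipynb', '-out.ipynb'))"
    )
    # Pays interpreter/import startup cost per notebook, but nothing leaks
    # across runs because the whole child process is discarded afterwards.
    subprocess.run([sys.executable, "-c", code], check=True)
```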
Thanks for the quick reply, Eduardo. Our specific use case is that we use Jupyter notebooks for our application testing and have wrapped ploomber-engine in a test harness, so we run ~3000 separate notebooks one after the other from a single process. We see the memory steadily climb, and the thing that actually brought the issue to our attention was the 3-4 minute lag between the application finishing all the tests and the process exiting. During this period we see the process memory drop, so there is a significant time cost to Python's memory cleanup, above and beyond the memory utilization itself.
We don't want to run in a subprocess because that slows us down. We need these tests to be as fast as possible (this is what brought us to ploomber in the first place).
We will continue to investigate and let you know what we find.
Thanks again. And thanks for making ploomber-engine.
Jim
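As a side note on investigating the growth described above, a minimal sketch of per-run memory tracking (an illustration, not from the thread; it uses the standard-library tracemalloc together with ploomber-engine's documented execute_notebook, and the notebook path is hypothetical):

```python
# Illustrative sketch: quantify how much memory each in-process execution retains.
# tracemalloc only traces Python-level allocations, which is enough to spot a trend.
import gc
import tracemalloc

from ploomber_engine import execute_notebook

tracemalloc.start()

for i in range(10):
    execute_notebook("test_case.ipynb", f"out-{i}.ipynb")  # hypothetical notebook
    gc.collect()  # collect garbage so only truly retained objects are counted
    current, peak = tracemalloc.get_traced_memory()
    print(f"run {i}: retained={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
```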
Yeah, I guess that there is some memory leak somewhere. Another thing you can do to speed things up is to turn off our anonymous telemetry.
I'm unsure this will have a big effect, but let me know if it helps (we're thinking of removing it completely).
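The opt-out snippet itself didn't survive in this copy of the thread. One assumption, to be verified against the ploomber-engine docs, is that the usual Ploomber opt-out applies, i.e. the PLOOMBER_STATS_ENABLED environment variable:

```python
# Assumption (not confirmed in this thread): Ploomber projects honor the
# PLOOMBER_STATS_ENABLED environment variable as the anonymous-telemetry opt-out.
# Set it before importing ploomber_engine, or export it in the shell instead.
import os

os.environ["PLOOMBER_STATS_ENABLED"] = "false"
```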