-
@mccabete you can see an example of Python memory profiling implemented by @omshinde in https://github.com/orgs/MAAP-Project/discussions/863. @chuckwondo and I have discussed some other potential tools in the past. Perhaps the next step is an in-depth discussion session reviewing the current code and deciding which tools would be worth trying?
-
I will post a documentation example, possibly by tomorrow, explaining the steps for memory profiling a Python script.
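In the meantime, here is a minimal sketch of memory profiling using only the standard library's `tracemalloc`. It is not the documented MAAP approach; `build_rows` and `profile_memory` are hypothetical names standing in for whatever code you actually want to measure.

```python
import tracemalloc


def build_rows(n):
    """Hypothetical stand-in for the code being profiled."""
    return [{"id": i, "values": [float(i)] * 10} for i in range(n)]


def profile_memory(func, *args, top=3):
    """Run func under tracemalloc.

    Returns (result, peak_bytes, top_stats), where top_stats lists the
    source lines holding the most memory when func returned.
    """
    tracemalloc.start()
    try:
        result = func(*args)
        snapshot = tracemalloc.take_snapshot()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    top_stats = snapshot.statistics("lineno")[:top]
    return result, peak, top_stats


if __name__ == "__main__":
    rows, peak, stats = profile_memory(build_rows, 10_000)
    print(f"peak traced memory: {peak / 1e6:.2f} MB")
    for stat in stats:
        print(stat)
```

Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native libraries may be under-reported.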
-
Thanks for being willing to help @chuckwondo @omshinde and @wildintellect! Documentation would be helpful. For a more in-depth discussion, would it make sense to schedule a meeting? FEDS-specific difficulties that might crop up could include: profiling Dask-parallelized code, getting the profiler to run and report out on DPS jobs, and profiling across a wide range of data-ingest levels (the algorithm runs slower when there are more fires to keep track of). If a meeting makes sense, let me know the best platform to coordinate that: email, Slack, here, or something else.
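To make the "range of data-ingest levels" point concrete, one option is to run the same workload at several input sizes and record time and peak memory for each. The following is only a sketch: `track_fires` is a hypothetical toy workload, not FEDS code, and the sizes are arbitrary.

```python
import time
import tracemalloc


def track_fires(n_fires):
    """Hypothetical workload whose cost grows with the number of fires."""
    perimeters = [[(i, j) for j in range(20)] for i in range(n_fires)]
    return sum(len(p) for p in perimeters)


def measure(func, sizes):
    """Return one (size, seconds, peak_bytes) tuple per input size."""
    results = []
    for n in sizes:
        tracemalloc.start()
        t0 = time.perf_counter()
        func(n)
        elapsed = time.perf_counter() - t0
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append((n, elapsed, peak))
    return results


if __name__ == "__main__":
    for n, seconds, peak in measure(track_fires, [100, 1_000, 10_000]):
        print(f"{n:>6} fires: {seconds:.4f} s, peak {peak / 1e6:.2f} MB")
```

Tracing allocations slows the code down, so the timings here are only useful for comparing sizes against each other, not as absolute numbers.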
-
If it's helpful, we can invite these folks to next Monday's meeting on the 15th since we'll be talking about All The Work ™️ for the next quarter. Then folks can set up smaller meetings at different cadences if needed.
Another difficulty to mention here is that the Fire environment is pinned to old (and probably now unsupported) versions of some libraries, such as pandas, and some libs can't be upgraded b/c the existing code would then lose the ability to read the archived pickled objects that it created. Julia and I have a goal this PI to experiment with different data models and formats and recommend one to move away from pickle. But this could limit what can be installed or upgraded, FWIW.
-
@chuckwondo Summarizing your feedback from the call -- use the |
-
@wildintellect suggested that folks need a high-level diagram that specifies where the file reads, the ingest steps, etc. happen. @wildintellect is there a place with best practices for this, or an example? I'm not 100% sure what to include, or what the common parlance is.
-
We also need to design some "small tests". Need to design what those tests are. On discussion:
-
Scalene is a Python CPU+GPU+memory profiler that can even profile multi-threaded and multi-core programs. It's extremely easy to incorporate without requiring any code changes.
Initially, I suggest using a "reduced" profile to keep the profile output on the smaller side, ideally pinpointing some hotspots for further investigation. To do so with Scalene is straightforward. For example, assume you want to profile a script (such as from a DPS algorithm's "run" script) that is executed in the following form (with or without
In order to use
This will run your script like normal, but will also profile its execution and produce a profiling report in the file
The following is an example report that I produced from the GEDI Subsetter algorithm by adjusting the "run" script as described above. This file contains HTML, but attachments with a
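Since the inline commands did not survive in this thread, here is a hedged sketch of what such an invocation might look like, written as a small Python helper that only builds the command line. It assumes the Scalene 1.x command-line flags `--reduced-profile`, `--html`, and `--outfile`; newer Scalene releases may use a different interface, so check `scalene --help` for your installed version. The names `scalene_command`, `run.py`, and `profile.html` are hypothetical and are not taken from the original post.

```python
import shlex
import sys


def scalene_command(script, outfile="profile.html"):
    """Build an argv list that runs `script` under Scalene.

    Assumes Scalene 1.x flags; verify with `scalene --help` before use.
    """
    return [
        sys.executable, "-m", "scalene",
        "--reduced-profile",   # keep only lines with notable CPU/memory use
        "--html",              # write the report as HTML
        "--outfile", outfile,  # hypothetical report file name
        script,
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(scalene_command("run.py")))
```

To actually run it, pass the returned list to `subprocess.run`, or copy the printed command into the algorithm's "run" script.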
-
Those of us who are working on trying to improve the FEDS algorithm (me, @ranchodeluxe, @jsignell, and @eorland) have hit limits on how much we can improve the code without some profiling of how the algorithm uses memory, CPU, etc. What support exists for implementing that?
@wildintellect I know you mentioned that this was something worth following up on in the new year.