Add primitive support for link-time memory reports #781
parth-07 wants to merge 1 commit into qualcomm:main
This commit adds primitive support for link-time memory reports. The memory report contains memory usage information for most of the timers that we have in the codebase (eld::RegisterTimer).
The main motivation for link-time memory reports is to help find out which linker areas to focus on for reducing the link memory footprint.
The memory usage information contains the current resident set size, the change in resident set size during this timer, and the peak resident set size seen so far. All of this information is computed by parsing the virtual file '/proc/self/status'. As expected, this solution does not work on Windows, so the feature is only available for eld on Linux. The virtual file '/proc/self/status' may be formatted slightly differently across Linux distributions, so we might see some issues on some of them. Thus, this feature should be considered experimental for now.
Memory usage information is not recorded for timers that are created a large number of times, for example, VisitSymbol and VisitSections. This is because each read of the virtual file '/proc/self/status' involves system calls, and making a large number of them can take considerable time. This is fine because we can always improve or rearrange timers so that we get the memory information that we need.
Signed-off-by: Parth Arora <partaror@qti.qualcomm.com>
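For readers unfamiliar with '/proc/self/status', the parsing step described above could look roughly like the sketch below. It is not taken from the patch: `MemorySnapshot`, `parseStatus`, and `readSelfStatus` are hypothetical names, and the sketch assumes the `VmRSS` and `VmHWM` lines have the usual `Key: <number> kB` shape.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical snapshot type; the field names are illustrative only.
struct MemorySnapshot {
  std::uint64_t RSSKiB = 0;     // VmRSS: current resident set size
  std::uint64_t PeakRSSKiB = 0; // VmHWM: peak resident set size
  bool Valid = false;           // true only if both fields were found
};

// Parse "Key: <number> kB" lines from a /proc/<pid>/status-style stream.
MemorySnapshot parseStatus(std::istream &In) {
  MemorySnapshot Snap;
  bool FoundRSS = false, FoundPeak = false;
  std::string Line;
  while (std::getline(In, Line)) {
    std::istringstream Fields(Line);
    std::string Key;
    std::uint64_t Value = 0;
    if (!(Fields >> Key >> Value))
      continue; // Not a "Key: number" line (e.g. "Name: eld"); skip it.
    if (Key == "VmRSS:") {
      Snap.RSSKiB = Value;
      FoundRSS = true;
    } else if (Key == "VmHWM:") {
      Snap.PeakRSSKiB = Value;
      FoundPeak = true;
    }
  }
  Snap.Valid = FoundRSS && FoundPeak;
  return Snap;
}

// Read the current process's status file; Valid stays false if the file
// cannot be opened (for example, on a non-Linux host).
MemorySnapshot readSelfStatus() {
  std::ifstream In("/proc/self/status");
  return parseStatus(In);
}
```

Taking a stream rather than a path keeps the parser testable with canned input, which may help if the file's layout turns out to vary between distributions.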
Thanks for working on this! I would prefer a monitor thread that runs throughout the link, with RegisterTimer notifying the monitor thread of events that get recorded in JSON. We can have an event recorder and measure whatever we need.
From what I have seen, monitor threads are typically useful when we want to record application metrics by periodic sampling. Can you please explain the benefit of using a monitor thread in the current case, where we are using explicit hooks? I am concerned that the effort of creating and maintaining a thread-safe monitor thread is much greater than its benefit.
A monitor thread allows you to continuously measure memory growth over the course of the link. We cannot use the same hook that we use for profiling.
But this patch aims to measure memory consumption in the key link phases that we can control through the hooks. Why do |
I doubt that the extra granularity or complexity of that approach allows for any better analysis than the current one.
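To make the hook-based approach under discussion concrete, here is a minimal sketch of a scoped hook that records the RSS change across a link phase and can be switched off for very hot timers. All names (`MemoryRecord`, `ScopedMemoryHook`, the `Enabled` flag) are hypothetical and not taken from the patch or from eld::RegisterTimer; the sampler is injected so the sketch does not depend on any particular way of reading RSS, and peak RSS is left out for brevity.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one instrumented region.
struct MemoryRecord {
  std::string Name;
  std::uint64_t RSSKiB;  // RSS when the region ended
  std::int64_t DeltaKiB; // RSS change across the region
};

// Hypothetical RAII hook: samples RSS at construction and at destruction.
class ScopedMemoryHook {
public:
  using Sampler = std::function<std::uint64_t()>;

  ScopedMemoryHook(std::string RegionName, Sampler SampleFn,
                   std::vector<MemoryRecord> &Sink, bool IsEnabled = true)
      : Name(std::move(RegionName)), Sample(std::move(SampleFn)), Out(Sink),
        Enabled(IsEnabled), StartKiB(Enabled ? Sample() : 0) {}

  ScopedMemoryHook(const ScopedMemoryHook &) = delete;
  ScopedMemoryHook &operator=(const ScopedMemoryHook &) = delete;

  ~ScopedMemoryHook() {
    if (!Enabled)
      return; // Hot regions skip sampling to avoid the system-call cost.
    std::uint64_t EndKiB = Sample();
    Out.push_back({Name, EndKiB,
                   static_cast<std::int64_t>(EndKiB) -
                       static_cast<std::int64_t>(StartKiB)});
  }

private:
  // Declaration order matters: StartKiB is initialized after Sample/Enabled.
  std::string Name;
  Sampler Sample;
  std::vector<MemoryRecord> &Out;
  bool Enabled;
  std::uint64_t StartKiB;
};
```

A caller would wrap a phase in a block scope, for example `{ ScopedMemoryHook H("Layout", Sampler, Records); /* phase */ }`, and pass `false` as the last argument for timers that fire very frequently.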
The current approach to both profiling and memory utilization is not scalable. It is only usable by us. Moving to this model will make the system much more useful for analysis and debugging.
Can you explain how? In my opinion this patch works fine; we can see the memory consumption and change in memory consumption between important link stages. What additional benefit does continuous measurement provide? If that level of detail is needed, using an actual profiler may be more appropriate.
Who other than linker developers needs to profile the linker, especially at such granularity?
Yes, it can be extended to measure at random intervals.
There are many advantages to doing this in a separate thread.
What I am suggesting is to write a monitor thread that keeps dumping memory usage to JSON, and to add instrumentation points.
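For comparison, the monitor-thread idea could be sketched roughly as follows. This is only an illustration of the suggestion, not code from the patch or from eld: `MemoryMonitor`, `mark`, and the JSON field names are all hypothetical, the sampler is injected, and event names are assumed not to need JSON escaping.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical monitor: samples periodically and on explicit events.
class MemoryMonitor {
public:
  using Sampler = std::function<std::uint64_t()>;

  MemoryMonitor(Sampler SampleFn, std::chrono::milliseconds Period)
      : Sample(std::move(SampleFn)), Interval(Period),
        Worker([this] { run(); }) {}

  ~MemoryMonitor() { stop(); }

  // Instrumentation point: record a named event with an immediate sample.
  void mark(const std::string &Event) {
    std::lock_guard<std::mutex> Lock(Mu);
    Entries.emplace_back(Event, Sample());
  }

  // Ask the worker to exit and wait for it; safe to call more than once.
  void stop() {
    {
      std::lock_guard<std::mutex> Lock(Mu);
      if (Stopping)
        return;
      Stopping = true;
    }
    CV.notify_all();
    if (Worker.joinable())
      Worker.join();
  }

  // Serialize everything recorded so far as a JSON array.
  std::string toJSON() {
    std::lock_guard<std::mutex> Lock(Mu);
    std::ostringstream OS;
    OS << "[";
    for (std::size_t I = 0; I < Entries.size(); ++I) {
      if (I)
        OS << ",";
      OS << "{\"event\":\"" << Entries[I].first
         << "\",\"rss_kib\":" << Entries[I].second << "}";
    }
    OS << "]";
    return OS.str();
  }

private:
  // Worker loop: take a periodic sample, then sleep until the next tick
  // or until stop() wakes us. The lock serializes all calls to Sample.
  void run() {
    std::unique_lock<std::mutex> Lock(Mu);
    while (!Stopping) {
      Entries.emplace_back("sample", Sample());
      CV.wait_for(Lock, Interval, [this] { return Stopping; });
    }
  }

  Sampler Sample;
  std::chrono::milliseconds Interval;
  std::mutex Mu;
  std::condition_variable CV;
  bool Stopping = false;
  std::vector<std::pair<std::string, std::uint64_t>> Entries;
  std::thread Worker; // Declared last so it starts after the other members.
};
```

Even this small version needs a mutex, a condition variable, and careful member ordering, which gives a feel for the maintenance cost raised earlier in the thread; in exchange it captures memory growth between instrumentation points.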