-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libdrgn: improve C string reading efficiency #313
base: main
Are you sure you want to change the base?
Conversation
b3f4466
to
ce815d7
Compare
This will need some fixups if #310 lands first, as we'll need to canonicalize the address passed to |
The current C string reading implementation is inefficient, especially for low bandwidth remote targets, as it needs to do a separate segment read (including a fresh page table lookup) for each character read. A more efficient approach is to retain the page table between character reads, only discarding it when we hit the null terminator. Implement this approach by allowing segments to also specify a C string reading callback. The callback for page tables will preserve the page table iterator while reading characters. Signed-off-by: Peter Collingbourne <pcc@google.com>
I don't love the idea of adding another callback for this. I've thought about this inefficiency in the past, and the approach that I was considering was to have This might be less optimal in a few edge cases, but I think it's good enough, and it's a bit cleaner. There are other use cases that I'd like to allow partial reads for anyways (e.g., for batched reads of all of the First of all, do you think this approach would solve your problem? Secondly, would you be interested in implementing it? If so, I have some more specific thoughts about how to do it that I can share. No worries if not, I can get to it soon-ish. |
I also considered whether C string reads could be optimized by using/extending the existing callback. But it seemed to me that null-terminated strings were the only special case that could justify having a separate callback, given how common they are in the types of programs that drgn is used to debug, together with the low "granularity" (per byte) of the termination check, which increases the benefit of pushing down optimizations to a lower level. My concern with larger transfer sizes is similar to what I've seen with e.g. #312: these debug adapters can be highly bandwidth and latency constrained, creating the need to carefully optimize what we send over the wire to the adapter and thence to the target. At least in my experience the strings being transferred are typically relatively short which means that it is better to try to reduce bandwidth in this case. That's not to say that latency isn't important; quite the contrary, it can have a big impact on long strings. But with a C string specialized approach I think there is an opportunity to further optimize latency for long strings in the debug adapter case. The idea that I had was to put more intelligence into the debug adapter so that it can issue the necessary JTAG/SWD commands to read up to the null terminator without host involvement, so you don't need a USB round trip except when crossing a page boundary. (This could be done by extending an open source debug adapter firmware such as DAPLink, together with the associated protocol standard.) This would be precluded with a short read based approach. That all being said, I'm not fundamentally opposed to a short read based approach as long as it doesn't regress performance perceptibly in typical remote debugging scenarios. (If you knew anything about me, you would know that I am the last person to try to push for unnecessary optimizations, so I am happy to be proven wrong about my assumptions, but they are based on several months of experience working with these debug adapters.) If you would like to try implementing it, I can try running benchmarks on my setup. |
Ok, you've convinced me that the extra callback makes sense. In fact, it would also benefit the I'll come back to this one after #310 and #312, which have some bearing on this. |
The current C string reading implementation is inefficient, especially for low bandwidth remote targets, as it needs to do a separate segment read (including a fresh page table lookup) for each character read. A more efficient approach is to retain the page table between character reads, only discarding it when we hit the null terminator.
Implement this approach by allowing segments to also specify a C string reading callback. The callback for page tables will preserve the page table iterator while reading characters.