Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proto: Add comment on location address adjustment #882

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions proto/profile.proto
Original file line number Diff line number Diff line change
Expand Up @@ -186,6 +186,10 @@ message Location {
// for the corresponding mapping. A non-leaf address may be in the
// middle of a call instruction. It is up to display tools to find
// the beginning of the instruction if necessary.
// If Mapping.memory_start is 0, Mapping.memory_limit is 0 or the max uint64,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending the change!

I'm not sure I agree with the proposed text, some comments below.

What I expect tools to do is to use the Mapping.* values together with the Offset and VirtAddr for the executable segment in ELF program headers to translate the addresses to the objdump-compatible addresses.

In reality, if Offset == VirtAddr for the segment in the ELF headers, then setting Mapping.memory_start == Mapping.file_offset would mean effectively no translation, for any values (can be zero as in the proposed comment). I think we should document these mechanics.

I don't think we should give any special meaning to Mapping.memory_limit of zero or max uint64. I would expect it to be set to Mapping.memory_start + size of the VMA always.

I hope this makes sense, happy to discuss this more. I'm glad we are trying to improve the docs around these nuances.

$ readelf --segments --wide /bin/ls | grep -P "LOAD|Type"
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0037b8 0x0037b8 R   0x1000
  LOAD           0x004000 0x0000000000004000 0x0000000000004000 0x015549 0x015549 R E 0x1000
  LOAD           0x01a000 0x000000000001a000 0x000000000001a000 0x009090 0x009090 R   0x1000
  LOAD           0x0232b0 0x00000000000242b0 0x00000000000242b0 0x001330 0x002610 RW  0x1000

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case this is because debuginfo provided by distributions may have a different offset or virtaddr of the executable that the data was obtained from, this is because the debuginfo is stripped, meaning only the actual production executable has the correct values for these.

As the code mentions some tools may want to specifically signify that the location has already been adjusted. Do you have a suggestion on how we can document this behavior accurately?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Let me think about it and propose something concrete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, let me know if I can be helpful! Also happy to discuss on CNCF Slack if you want.

// and Mapping.file_offset is 0, then the location is already fully adjusted
// to the symbol table address space, and display tools should not attempt to
// further adjust it.
uint64 address = 3;
// Multiple line indicates this location has inlined functions,
// where the last entry represents the caller into which the
Expand Down