Add Backtrace Screen #1270

Open
wants to merge 6 commits into
base: main

Conversation

@MrCirdo MrCirdo commented Jul 19, 2023

Hello everyone!

This is my first big pull request 😃

The goal of this PR is to add a backtrace screen for a process or thread.
Here's what it looks like for a thread:
image

And for a process:
image

Behind the scenes, I use the eu-stack tool provided by elfutils.
Its standard output is parsed and printed to the screen.
Currently, I have implemented only the Refresh button. My work is inspired by TraceScreen and OpenFilesScreen.

I still have some work to do before my work is ready (Formatting, bug fixes, ...).
Currently, this is more of a demonstration than a cool feature 😄 .

What do you think about my work? Is it a feature that can be added?

@BenBE BenBE added the new feature Completely new feature label Jul 19, 2023
Member

@BenBE BenBE left a comment

Thank you very much for your PR. The feature itself looks interesting and fits quite nicely with existing functionality.

However, the use of external tools is a bit problematic and is best avoided. For retrieving stack traces, there are several libraries available that might be worth a look (e.g. libunwind). Also, I'm not sure whether the code as-is would run properly on e.g. FreeBSD or Darwin. Thus I strongly prefer a solution that allows splitting out these platform-dependent parts where necessary (in the case of lsof, things are portable enough across all our target platforms, so it's not an issue there; I'm not sure about eu-stack though).

While reading through your PR I noticed that this seems to handle debug information. As such, it would be nice to have the module, source file and line be available separately (where available). Also the setting of hiding path names should be respected for module filenames in order to be consistent with the rest of the UI. Taking the highlight basename setting into account would be nice too.

Another code refactoring task is related to a recent addition of the generalized columns code that @natoscott recently worked on. Please have a look there if you like.

Also there's a few further notes regarding the code which you can find below. Please feel free to rebase the fixes to those issues directly into the existing commits as you see fit.

If you need further assistance feel free to ask.

@MrCirdo
Author

MrCirdo commented Jul 20, 2023

Thank you for your quick response and your review.

I agree with you that eu-stack is not a good idea. I only used it to make a quick demo of my idea.
I'm also unsure whether eu-stack is supported on the BSD platforms.

I also think libunwind is more appropriate. I just have to deal with the DWARF information.
I will probably close all suggestions regarding the execution/parsing of eu-stack.

I had a very quick look at the work of @natoscott and I will wait until his work is finished.

@BenBE
Member

BenBE commented Jul 20, 2023

Thank you for your quick response and your review.

You're welcome.

I agree with you that eu-stack is not a good idea. I only used it to make a quick demo of my idea. I'm also unsure whether eu-stack is supported on the BSD platforms.

There seem to be some patches regarding FreeBSD, but I wouldn't actually call this support … ;-)

And that's leaving Darwin aside entirely … ;-)

I also think libunwind is more appropriate. I just have to deal with the DWARF information. I will probably close all suggestions regarding the execution/parsing of eu-stack.

Sure, go ahead. Maybe check for similar places elsewhere in case something similar was missed in other places of this PR.

I had a very quick look at the work of @natoscott and I will wait until his work is finished.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

@Explorer09
Contributor

Please don't merge the htop-dev:main branch. Rebase it instead. You know what git rebase is, don't you?

@MrCirdo
Author

MrCirdo commented Sep 5, 2023

The flake commit is just for me. I will remove it at the end.
I haven't started my work yet. However, I keep updating my branch, so it's not ready for review.

@MrCirdo
Author

MrCirdo commented Nov 16, 2023

Hey,

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking at the Row workflow, and I don't know whether it's a good idea to use it. Row_display uses the Settings object to get fields. Does that mean I have to make my own Settings object, with all the process settings (like highlightThreads)? And do I have to make my own ScreenSettings? Is that okay?

@BenBE
Member

BenBE commented Nov 16, 2023

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking at the Row workflow, and I don't know whether it's a good idea to use it. Row_display uses the Settings object to get fields. Does that mean I have to make my own Settings object, with all the process settings (like highlightThreads)? And do I have to make my own ScreenSettings? Is that okay?

The Settings and ScreenSettings objects are shared between all tabs. Settings provides the global settings, applicable to all screens, while ScreenSettings applies only to the current tab. As you are basically inheriting the settings from the tab you are opening the backtrace details on, you can just copy the pointer to those settings. These are held in the Screen mostly for convenience and to reduce dependency scope. I.e. if you don't need any of the values from the settings (and neither do any of the functions you're relying on), you could even skip these objects entirely.
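
A minimal sketch of what "just copy the pointer" could look like. The struct layouts below are heavily simplified stand-ins and `BacktraceScreenData` / `BacktraceScreenData_init` are hypothetical names, not htop's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for htop's real structures. */
typedef struct ScreenSettings_ {
   bool treeView;
} ScreenSettings;

typedef struct Settings_ {
   bool showProgramPath;     /* global option, shared by all screens */
   bool highlightBaseName;   /* global option, shared by all screens */
   ScreenSettings* ss;       /* settings of the tab that is currently active */
} Settings;

/* Hypothetical per-screen data: it borrows the settings and never owns them. */
typedef struct BacktraceScreenData_ {
   const Settings* settings;
   const ScreenSettings* screenSettings;
} BacktraceScreenData;

static void BacktraceScreenData_init(BacktraceScreenData* this, const Settings* settings) {
   assert(this && settings);
   this->settings = settings;            /* copy the pointer, not the object */
   this->screenSettings = settings->ss;  /* inherit the tab the screen was opened from */
}
```

Because only pointers are stored, the backtrace screen would see the same option values as the tab it was opened from, and nothing needs to be freed when the screen closes.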

@MrCirdo
Author

MrCirdo commented Nov 17, 2023

I didn't have a lot of time recently for this PR. But I have some questions.

If you prepare your PR to anticipate that change, e.g. by preparing the data structures in a way that makes this move easy, you could even prepare things now. Given the remaining aspects in that other PR will take some time to sort out, don't feel obliged to wait for those changes to land. If the structures introduced in this PR are clean enough there's not too much work later to move things over to this new interface. Most of the work actually has already been done.

I'm looking at the Row workflow, and I don't know whether it's a good idea to use it. Row_display uses the Settings object to get fields. Does that mean I have to make my own Settings object, with all the process settings (like highlightThreads)? And do I have to make my own ScreenSettings? Is that okay?

The Settings and ScreenSettings objects are shared between all tabs. Settings provides the global settings, applicable to all screens, while ScreenSettings applies only to the current tab. As you are basically inheriting the settings from the tab you are opening the backtrace details on, you can just copy the pointer to those settings. These are held in the Screen mostly for convenience and to reduce dependency scope. I.e. if you don't need any of the values from the settings (and neither do any of the functions you're relying on), you could even skip these objects entirely.

Thank you very much for the clarification. I was not sure I would be allowed to modify it (I mean, whether my changes would be accepted).

Member

@BenBE BenBE left a comment

Please no custom build system files from e.g. Flake …

@MrCirdo
Author

MrCirdo commented Jan 9, 2024

Hi @BenBE ,
I completely forgot to say to not review my code. I just updated my branch.
I deeply apologize.

@BenBE
Member

BenBE commented Jan 9, 2024

Hi @BenBE , I completely forgot to say to not review my code. I just updated my branch. I deeply apologize.

Don't worry. No need to apologize. The PR is marked as draft anyway, so the only things I remarked upon were what I noticed immediately. I also ticked off some of the previous comments that no longer apply to the current state of this PR. Basically some maintenance.

NB: This PR would be scheduled for 3.4.x at the earliest anyway, given that 3.3.0 is about to ship anytime soon.

@MrCirdo
Author

MrCirdo commented Mar 16, 2024

Hi,

This PR is ready for review.
I closed all previous conversations.

Currently, I have only added support for Linux.

@MrCirdo MrCirdo marked this pull request as ready for review March 16, 2024 17:35
@MrCirdo MrCirdo force-pushed the main branch 2 times, most recently from c53af06 to 792bdba Compare March 16, 2024 20:46
@BenBE
Member

BenBE commented Mar 17, 2024

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

@MrCirdo
Author

MrCirdo commented Mar 19, 2024

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

OK, I see what you mean. Is it a good idea to take inspiration from, for example, Machine_new?

@BenBE
Member

BenBE commented Mar 19, 2024

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

OK, I see what you mean. Is it a good idea to take inspiration from, for example, Machine_new?

A better candidate may be the way in which Platform_getNetworkIO and Platform_getProcessLocks are implemented in <platform>/*Platform.[ch]
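
As a rough sketch of that split (all names here are assumptions for illustration: `HAVE_BACKTRACE_SCREEN` stands in for the top-level feature flag, and `Platform_getBacktrace`, `BacktraceFrameData` and `BacktraceScreen_collect` are hypothetical, modelled loosely on the `Platform_*` hook style):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical frame record that the platform code fills in. */
typedef struct BacktraceFrameData_ {
   size_t address;
   const char* functionName;   /* may be NULL when no symbol is known */
} BacktraceFrameData;

/* HAVE_BACKTRACE_SCREEN is an assumed name for the top-level feature flag. */
#if defined(HAVE_BACKTRACE_SCREEN) && defined(__linux__)
/* The Linux implementation (e.g. on top of an unwinding library) would live in the platform code. */
size_t Platform_getBacktrace(pid_t pid, BacktraceFrameData* frames, size_t maxFrames);
#else
/* Fallback for builds or platforms without support: report zero frames. */
static size_t Platform_getBacktrace(pid_t pid, BacktraceFrameData* frames, size_t maxFrames) {
   (void)pid;
   (void)frames;
   (void)maxFrames;
   return 0;
}
#endif

/* The platform-independent screen code only ever calls the Platform_* hook. */
static size_t BacktraceScreen_collect(pid_t pid, BacktraceFrameData* frames, size_t maxFrames) {
   assert(frames || maxFrames == 0);
   return Platform_getBacktrace(pid, frames, maxFrames);
}
```

With this shape, the screen code compiles unchanged everywhere, and each platform can opt in by providing its own hook behind the flag.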

@MrCirdo
Author

MrCirdo commented Mar 20, 2024

Given that this feature will be available on multiple platforms and requires changes in different places I'd suggest to have a top-level feature flag for backtrace support and additional flags for the various libraries (iberty, eu-stack, …) The actual implementation for the stack traces should go in the platform code for each system, leaving the actual screen implementation mostly platform independent.

OK, I see what you mean. Is it a good idea to take inspiration from, for example, Machine_new?

A better candidate may be the way in which Platform_getNetworkIO and Platform_getProcessLocks are implemented in <platform>/*Platform.[ch]

Okay thank you, I'll take a look

@BenBE
Member

BenBE commented Nov 6, 2024

@MrCirdo
Author

MrCirdo commented Nov 7, 2024

For reference:
https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10

Thank you for the link, it's a goldmine of information.

I believe I have addressed all the comments. If so, my work is done. I hope it's ready to merge!


static void BacktraceFrame_displayHeader(const BacktraceFrame* frame, RichString* out) {
   const BacktracePanelPrintingHelper* printingHelper = &frame->backtracePanel->printingHelper;
   assert(printingHelper);
Contributor

Redundant assert? (Since an & operator is used for retrieving address for printingHelper)

Member

Also, if anything, you would check assert(frame);, which is better declared simply as ATTR_NONNULL_ALL, because you want the RichString* to be valid too.

Author

Good Catch! Thank you.

Is it a good idea to put ATTR_NONNULL on every function? It will catch a lot of bugs.

Member

Yes and no. In the htop sources we don't always do it, but in strategic places you will find these compiler annotations. Normally the non-NULL is implied by skipping the NULL check or placing an assert. The attribute is only rarely used.
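
To illustrate the two styles side by side, here is a small sketch. `EXAMPLE_NONNULL_ALL` and both functions are made-up names for this example; htop's own annotation macros may be defined differently:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro name; it expands to the GCC/Clang attribute where that exists. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_NONNULL_ALL __attribute__((nonnull))
#else
#define EXAMPLE_NONNULL_ALL /* no-op on compilers without the attribute */
#endif

/* With the attribute, passing a literal NULL can be flagged at compile time (-Wnonnull). */
EXAMPLE_NONNULL_ALL
static size_t Example_labelLength(const char* label) {
   return strlen(label);
}

/* Without the attribute, the non-NULL contract is documented by an assert instead. */
static size_t Example_labelLengthChecked(const char* label) {
   assert(label);
   return strlen(label);
}
```

One thing to keep in mind with the attribute: the compiler may also assume the argument is never NULL inside the function and drop checks against it, which is one reason to use it deliberately rather than everywhere.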

Author

Okay, I understand better. Thank you for the clarification @BenBE.

   char* basename = frame->backtracePanel->process->procExe + frame->backtracePanel->process->procExeBasenameOffset;
   char* object = line + objectPathStart;
   if (!basename || !object) {
Contributor

Looks like something is wrong with this conditional. Perhaps you mean (!frame->backtracePanel->process->procExe || !line) because you should ignore the character offset when the pointer is null already.

Member

Good catch, but also not quite correct, because object is a pointer addition too and thus should check line.

Alternative:

   char* basename = frame->backtracePanel->process->procExe ? frame->backtracePanel->process->procExe + frame->backtracePanel->process->procExeBasenameOffset : NULL;
   char* object = line ? line + objectPathStart : NULL;
   if (!basename || !object) {

In which case the current check would work just fine.

You could even do a small macro #define PTR_ADD_OFFSET(ptr, offset) ((ptr) ? ((ptr) + (offset)) : NULL).
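
A minimal sketch of such a macro in use, written with the whole expansion wrapped in parentheses so it behaves as a single expression wherever it appears. `Example_basename` is a hypothetical helper for illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Fully parenthesized so the expansion is safe inside larger expressions. */
#define PTR_ADD_OFFSET(ptr, offset) ((ptr) ? ((ptr) + (offset)) : NULL)

/* Hypothetical helper: returns the basename part of a path, or NULL if there is no path. */
static const char* Example_basename(const char* path, size_t basenameOffset) {
   assert(!path || basenameOffset <= 4096);   /* arbitrary sanity bound for this sketch */
   return PTR_ADD_OFFSET(path, basenameOffset);
}
```

With a NULL input the offset is never added, so a later `if (!basename)` check sees a real NULL rather than a small bogus pointer.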

Author

I assumed that if procExe is null, then procExeBasenameOffset equals 0. But that may not be the case.
Good catch! Thank you!

#define BACTRACE_PANEL_HEADER_NUMBER_FRAME "#"
#define BACTRACE_PANEL_HEADER_ADDRESS "ADDRESS"
#define BACTRACE_PANEL_HEADER_NAME "NAME"
#define BACTRACE_PANEL_HEADER_PATH "PATH"
Contributor

Any reason to have these strings defined as macros?

Member

Unless used in multiple places, I think they should be removed.

Author

I made this decision because it makes it easier to change the name of a column header. Currently only BACTRACE_PANEL_HEADER_NUMBER_FRAME is used twice.
I recognize the names can be shortened.

Contributor

@MrCirdo The header names in htop are usually stored in an array or a table structure, so it's unlikely you need macro tokens for these.

Author

OK, interesting. I don't want to use magic values, so I will use an array of strings. Thank you for the suggestion.

Member

Just take a look at how things are defined in linux/LinuxProcess.c for example …
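
As a rough idea of the table approach (this is only a sketch: the enum, struct and function names are invented for this example and do not mirror the actual definitions in linux/LinuxProcess.c):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical column identifiers for the backtrace panel. */
typedef enum BacktraceColumn_ {
   BT_COL_FRAME,
   BT_COL_ADDRESS,
   BT_COL_NAME,
   BT_COL_PATH,
   BT_COL_COUNT
} BacktraceColumn;

typedef struct BacktraceColumnInfo_ {
   const char* title;
   bool rightAligned;
} BacktraceColumnInfo;

/* One table indexed by column instead of one macro per header string. */
static const BacktraceColumnInfo Backtrace_columns[BT_COL_COUNT] = {
   [BT_COL_FRAME]   = { .title = "#",       .rightAligned = true  },
   [BT_COL_ADDRESS] = { .title = "ADDRESS", .rightAligned = false },
   [BT_COL_NAME]    = { .title = "NAME",    .rightAligned = false },
   [BT_COL_PATH]    = { .title = "PATH",    .rightAligned = false },
};

static const char* Backtrace_columnTitle(BacktraceColumn col) {
   assert(col < BT_COL_COUNT);
   return Backtrace_columns[col].title;
}
```

Renaming a header then means touching a single table entry, and per-column properties such as alignment can sit next to the title.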

(int)printingHelper->maxFrameNumLen, BACTRACE_PANEL_HEADER_NUMBER_FRAME,
(int)printingHelper->maxAddrLen, BACTRACE_PANEL_HEADER_ADDRESS,
(int)maxFunctionNameLength, BACTRACE_PANEL_HEADER_NAME,
(int)printingHelper->maxObjPathLen, BACTRACE_PANEL_HEADER_PATH
Contributor

I feel a little uncomfortable when length values that are originally of type size_t have to be downcast to int here. This suggests to me that the original values should have been of type int already, not size_t.

I know this question should be up to @BenBE to answer.

Member

Technically, the C standard is at fault here. printf should have defined this to be size_t (or ssize_t for that matter), but this would have broken backward compatibility and far too much code.

Instead, all references to these variables should ensure they stay within the limits of int; consult limits.h for some useful macros. This is in practice no problem, as going beyond INT_MAX or even INT16_MAX will go outside of what curses can handle (pun intended).

Put simply: place some asserts for these limits in strategic places and we should be fine.
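
For illustration, a minimal sketch of such a strategic assert. The helper names `Example_widthToInt` and `Example_formatFrame` are hypothetical and not part of the PR:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: converts a size_t column width to the int that printf's '*' expects. */
static int Example_widthToInt(size_t width) {
   assert(width <= (size_t)INT_MAX);
   return (int)width;
}

/* Formats "<frame> <name>" with a right-aligned frame number of the given width. */
static int Example_formatFrame(char* buf, size_t bufSize, size_t numWidth, int index, const char* name) {
   return snprintf(buf, bufSize, "%*d %s", Example_widthToInt(numWidth), index, name);
}
```

Funnelling every cast through one helper keeps the limit check in a single place instead of scattering bare `(int)` casts across the format calls.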

Contributor

@Explorer09 Explorer09 Nov 13, 2024

For everyone's information, the field width arguments of printf have to be of a signed type, as negative widths are treated as left alignment for the fields. In other words, ssize_t (although this type is not part of ISO C but defined in POSIX) might be the best type for a printf field width argument here.

Update: Da**it. The POSIX definition of ssize_t is ill-suited for this purpose. It only guarantees that values in the range [-1, SSIZE_MAX] can be stored in it, that is, not arbitrary negative values.
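
A tiny sketch of the negative-width behaviour, using a made-up helper `Example_formatWidth`:

```c
#include <assert.h>
#include <stdio.h>

/* Formats value into buf using a run-time field width; a negative width means left alignment. */
static int Example_formatWidth(char* buf, size_t bufSize, int width, int value) {
   assert(buf && bufSize > 0);
   return snprintf(buf, bufSize, "[%*d]", width, value);
}
```

Calling it with width 4 and value 42 yields `[  42]`, while width -4 yields `[42  ]`, which is why the `*` argument is specified as a signed int.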

Author

Okay, I see the problem. Thank you @Explorer09 for the information 😄

Put simply: place some asserts for these limits in strategic places and we should be fine.

Got it!

   char* line = NULL;
   int objectPathStart = -1;
   int len = xAsprintf(&line, "%-*d 0x%0*zx %-*s %n%-*s",
      (int)printingHelper->maxFrameNumLen, frame->index,
Contributor

How about right aligning the frame number field?

Author

@MrCirdo MrCirdo Nov 14, 2024

I'm not a fan of right alignment alone. Here are all the alignments:

  1. Right alignment:
    image
  2. Pad with 0:
    image
  3. Left alignment:
    image

I prefer option 3 (left alignment); it's aligned with the header.

Contributor

You know the header can be right-aligned, don't you?

Author

Ah yes, I updated my comment.

Member

I prefer the first one …

Author

Let's go with option 1 (right alignment) 😄

   if (frame->functionName) {
      size_t functionNameLength = strlen(frame->functionName) + digitOfOffsetFrame;
      printingHelper->maxFuncNameLen = MAXIMUM(functionNameLength, printingHelper->maxFuncNameLen);
Contributor

What should happen if both frame->functionName and frame->demangleFunctionName are NULL? It seems that digitOfOffsetFrame would be unused. Was that intentional?

Author

Yes, I think the code is more readable this way than checking that frame->functionName and frame->demangleFunctionName are both not null, then checking that frame->functionName is not null, then that frame->demangleFunctionName is not null.


   size_t addressLength = MAX_HEX_ADDR_STR_LEN_32;
   if (longestAddress > MAX_ADDR_32) {
      addressLength = MAX_HEX_ADDR_STR_LEN_64;
Contributor

How about this for more-than-32-bit addresses?

addressLength = strlen("0x") + countDigits(longestAddress, 16);

Member

For hexadecimal numbers, using CLZ might be the faster choice.

Contributor

#ifndef ULLONG_WIDTH
#define ULLONG_WIDTH ((int)sizeof(unsigned long long) * CHAR_BIT)
#endif

addressLength = strlen("0x") + ((ULLONG_WIDTH / 4) - __builtin_clzll(longestAddress | 0x1) / 4);

The catch: If CLZ is not supported as a built-in, then we need an alternate implementation.
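
One possible shape for that alternate implementation, as a sketch only. `Example_hexDigits` and `Example_hexDigitsLoop` are hypothetical names, and the compiler check is an assumption about how availability of the builtin might be detected:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Portable fallback: counts hexadecimal digits by shifting; zero counts as one digit. */
static size_t Example_hexDigitsLoop(unsigned long long value) {
   size_t digits = 1;
   while ((value >>= 4) != 0)
      digits++;
   return digits;
}

/* Uses the GCC/Clang CLZ builtin where available, otherwise the loop above. */
static size_t Example_hexDigits(unsigned long long value) {
#if defined(__GNUC__) || defined(__clang__)
   const int width = (int)sizeof(unsigned long long) * CHAR_BIT;
   /* OR-ing with 1 avoids the undefined result of __builtin_clzll(0). */
   const int usedBits = width - __builtin_clzll(value | 1ULL);
   assert(usedBits >= 1 && usedBits <= width);
   return (size_t)((usedBits + 3) / 4);   /* round up to whole hex digits */
#else
   return Example_hexDigitsLoop(value);
#endif
}
```

The address column width would then be `strlen("0x") + Example_hexDigits(longestAddress)`; the loop runs at most 16 iterations for a 64-bit value, so the fallback is cheap as well.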

Author

I made this choice because I like how GDB prints backtrace frames, especially the addresses (with a lot of zeros).

MrCirdo and others added 2 commits November 14, 2024 21:54
Co-authored-by: Benny Baumann <BenBE@geshi.org>
Co-authored-by: Kang-Che Sung <explorer09@gmail.com>
Co-authored-by: Benny Baumann <BenBE@geshi.org>
Co-authored-by: Kang-Che Sung <explorer09@gmail.com>
MrCirdo and others added 3 commits November 14, 2024 22:07
Co-authored-by: Benny Baumann <BenBE@geshi.org>
Co-authored-by: Kang-Che Sung <explorer09@gmail.com>
Co-authored-by: Benny Baumann <BenBE@geshi.org>
Co-authored-by: Kang-Che Sung <explorer09@gmail.com>
Co-authored-by: Benny Baumann <BenBE@geshi.org>
Co-authored-by: Kang-Che Sung <explorer09@gmail.com>