Feature Request: Byte-level full-file histogram statistics #140

Closed
christopherball opened this issue Dec 16, 2021 · 7 comments
Labels
feature A new feature

Comments

@christopherball

In some hex editors, I've seen a feature that renders a histogram of the full contents of a file by counting every byte in it. The x-axis shows each unique byte value, and the y-axis shows the total count for that value. This can be very helpful for identifying changes in a file when file comparison approaches don't necessarily work (such as with dynamically generated files).

Thoughts?
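
For readers following along, here is a minimal Python sketch of the kind of byte-level histogram described above. It is only an illustration, not code from this project or from any hex editor; the function name `byte_histogram` and the sample data are made up for the example.

```python
def byte_histogram(data: bytes) -> list[int]:
    """Return a 256-entry list where index b holds the count of byte value b."""
    counts = [0] * 256
    for b in data:  # iterating over bytes yields integers 0..255
        counts[b] += 1
    return counts


# Hypothetical sample data standing in for a file's contents.
sample = b"\x00\x00\x5f\xff\x5f\x5f"
counts = byte_histogram(sample)

# Crude text rendering: one row per byte value that actually occurs.
for value, count in enumerate(counts):
    if count:
        print(f"{value:02X}: {'#' * count} ({count})")
```

Comparing two such 256-entry lists is one way to spot that a file changed even when a positional diff is not meaningful.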

@solemnwarning
Owner

Sounds useful. Is it done as a low-resolution overview, kind of like the preview-ish bit you get down the side of some text editors, or does it zoom in right down to byte precision?

@solemnwarning solemnwarning added the feature A new feature label Dec 16, 2021
@christopherball
Author

The general workflow lets you hover the mouse over a column of the histogram (with perhaps a tooltip or something signaling which byte value you're hovering over, such as "5F"). If you click that column, it cycles your cursor selection through each instance of that byte within the file (and probably visually highlights all of those instances as well).
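
The click-to-cycle behaviour could be sketched roughly like this in Python. This is an assumption about how such a feature might work, not HxD's or this project's implementation; `find_offsets` is a hypothetical helper.

```python
from itertools import cycle


def find_offsets(data: bytes, value: int) -> list[int]:
    """Return the offsets of every occurrence of a single byte value."""
    needle = bytes([value])
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


data = b"A_B__C"                    # "_" is byte value 0x5F
offsets = find_offsets(data, 0x5F)  # all positions to highlight

# Each "click" on the 5F column moves the cursor to the next occurrence,
# wrapping around to the first one after the last.
clicks = cycle(offsets)
first_four = [next(clicks) for _ in range(4)]
print(offsets, first_four)
```

Note that `cycle` over an empty list raises `StopIteration` on the first `next`, so a real UI would need to handle a column with zero occurrences separately.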

HxD is one example of a hex editor that includes this type of "statistics" feature, as they call it. I saw it used in some YouTube videos and immediately wished the feature was included in this tool :).

@solemnwarning solemnwarning added this to the 0.6.0 milestone Apr 26, 2022
@solemnwarning
Owner

Note to self: When I get to this, also make it easy to generate stats for fields in a struct (i.e. have a data type and stride setting, or maybe selecting a column in the input or something...)

@solemnwarning
Owner

Figured I'd post a screenshot/teaser of this:

image

Not finished yet, but it's available on the data-histogram branch. It counts 8/16/32/64-bit values in the file with a given stride and sorts them into a selectable number of buckets/columns (16 shown).
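
As a rough illustration of counting fixed-width values at a stride, here is a Python sketch. It is not the code on the data-histogram branch; it assumes the stride is measured in bytes, the values are unsigned and little-endian, and the buckets are equal-width, none of which is stated above. The function name `bucketed_counts` is hypothetical.

```python
import struct


def bucketed_counts(data: bytes, width: int, stride: int, buckets: int,
                    offset: int = 0, byteorder: str = "little") -> list[int]:
    """Count unsigned `width`-byte values read every `stride` bytes.

    Values are grouped into `buckets` equal-width ranges covering the
    full range of the data type.
    """
    if width not in (1, 2, 4, 8):
        raise ValueError("width must be 1, 2, 4 or 8 bytes")
    if stride < 1:
        raise ValueError("stride must be at least 1")
    value_range = 1 << (8 * width)
    if buckets < 1 or value_range % buckets:
        raise ValueError("buckets must evenly divide the value range")
    bucket_size = value_range // buckets

    counts = [0] * buckets
    for pos in range(offset, len(data) - width + 1, stride):
        value = int.from_bytes(data[pos:pos + width], byteorder)
        counts[value // bucket_size] += 1
    return counts


# Hypothetical 4-byte records: a 16-bit field followed by 2 bytes of padding.
records = b"".join(struct.pack("<HH", v, 0)
                   for v in (0x0001, 0x1000, 0xFFFF, 0x1234))

# Chart only the 16-bit field: width 2, stride 4 (the record size), 16 columns.
counts = bucketed_counts(records, width=2, stride=4, buckets=16)
print(counts)
```

Setting the stride to the record size and the offset to the field's position within the record picks out one field of a repeated struct.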

If a column encompasses more than one value, then clicking it descends into a new chart separating out the column's range into a new set of columns. You can keep doing this until each column is a single value.
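
The drill-down could be sketched along these lines in Python. This is only a guess at the idea, not the actual implementation; `split_range` and `chart` are hypothetical names, and the values are assumed to be already decoded into integers.

```python
def split_range(lo: int, hi: int, columns: int) -> list[tuple[int, int]]:
    """Split the inclusive range [lo, hi] into at most `columns` sub-ranges."""
    span = hi - lo + 1
    columns = min(columns, span)
    step = -(-span // columns)  # ceiling division
    return [(start, min(start + step - 1, hi))
            for start in range(lo, hi + 1, step)]


def chart(values, lo: int, hi: int, columns: int):
    """Return [((range_lo, range_hi), count), ...] for values within [lo, hi]."""
    ranges = split_range(lo, hi, columns)
    step = ranges[0][1] - ranges[0][0] + 1
    counts = [0] * len(ranges)
    for v in values:
        if lo <= v <= hi:
            counts[(v - lo) // step] += 1
    return list(zip(ranges, counts))


values = [0x00, 0x03, 0x10, 0x11, 0x11, 0xF0]

# Top-level chart: 256 possible values in 16 columns of 16 values each.
top = chart(values, 0, 255, 16)

# "Clicking" the second column descends into its range with a new chart.
(sub_lo, sub_hi), _ = top[1]
detail = chart(values, sub_lo, sub_hi, 16)

# Descent stops once every column covers exactly one value.
done = all(r_lo == r_hi for (r_lo, r_hi), _ in detail)
print(top[1], detail[:2], done)
```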

Also planning on making it support working on selections, so you can get a chart of values from a particular byte range in the file, or a particular field of a repeated struct using the stride option (for example). Needs more thought as I want it to integrate properly with the virtual sections rather than just using file offsets like most of the existing tools.

@christopherball
Author

Sweet - glad to see this old feature request of mine is going to see the light of day! Looking pretty promising so far.

@solemnwarning
Owner

solemnwarning commented Nov 13, 2022

Unfortunately the preview I posted before had some usability issues, and also some technical issues when I tried improving it, so I wound up simplifying it down to only working with single-byte values for now :(

image

It can accumulate bytes from the whole file or a specific region, and you can zoom/pan the chart to see more detail, but that's about it for now.
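
For illustration, accumulating byte counts over either a whole file or a specific region might look something like the Python sketch below. It works on plain file offsets and reads in chunks; it is not the merged implementation, and `region_histogram` and its parameters are hypothetical.

```python
import os
import tempfile


def region_histogram(path, offset=0, length=None, chunk_size=64 * 1024):
    """Count byte values in `length` bytes starting at `offset`.

    With length=None, counting continues to the end of the file.
    """
    counts = [0] * 256
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = f.read(want)
            if not chunk:
                break  # reached end of file
            for value in set(chunk):
                counts[value] += chunk.count(value)
            if remaining is not None:
                remaining -= len(chunk)
    return counts


# Demo on a small temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4 + b"\xAA" * 3 + b"\xFF" * 2)
    demo_path = tmp.name

whole = region_histogram(demo_path)                     # entire file
part = region_histogram(demo_path, offset=4, length=3)  # just the 0xAA run
os.remove(demo_path)
print(whole[0x00], whole[0xAA], whole[0xFF], sum(part))
```

Reading in chunks keeps memory use flat for large files, at the cost of a loop that a memory-mapped approach would not need.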

EDIT: Also, clicking a column cycles through occurrences of that byte (using the search feature).

@solemnwarning
Owner

Merged in 758743d.
