Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the hashutil functions to properly deal with path record field types #432

Merged
merged 1 commit into from
Nov 7, 2023

Conversation

pyrco
Copy link
Contributor

@pyrco pyrco commented Nov 1, 2023

(DIS-2526)

@pyrco pyrco requested review from Schamper and Miauwkeru November 1, 2023 11:50
@pyrco
Copy link
Contributor Author

pyrco commented Nov 1, 2023

This should fix #413

@pyrco
Copy link
Contributor Author

pyrco commented Nov 1, 2023

I made hash_uri() and hash_uri_records() deprecated and let them point to the new functions for now. Any thoughts on removing them completely already in this PR @Schamper, @Miauwkeru?

Copy link

codecov bot commented Nov 1, 2023

Codecov Report

Merging #432 (c181dbe) into main (ba5fed3) will increase coverage by 0.12%.
The diff coverage is 92.10%.

@@            Coverage Diff             @@
##             main     #432      +/-   ##
==========================================
+ Coverage   73.99%   74.12%   +0.12%     
==========================================
  Files         258      258              
  Lines       20559    20622      +63     
==========================================
+ Hits        15213    15286      +73     
+ Misses       5346     5336      -10     
Flag Coverage Δ
unittests 74.12% <92.10%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
dissect/target/plugins/filesystem/resolver.py 100.00% <100.00%> (ø)
dissect/target/tools/query.py 64.51% <0.00%> (ø)
dissect/target/helpers/hashutil.py 86.15% <94.11%> (+0.96%) ⬆️

... and 39 files with indirect coverage changes

📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

@pyrco pyrco linked an issue Nov 1, 2023 that may be closed by this pull request
@Miauwkeru
Copy link
Contributor

I made hash_uri() and hash_uri_records() deprecated and let them point to the new functions for now. Any thoughts on removing them completely already in this PR @Schamper, @Miauwkeru?

There might be some folks that specifically use that utility, so I'd set a version when it will be deprecated (not the next release, but the one after that) so they have at least ample time to change their usage?

@pyrco pyrco requested a review from Miauwkeru November 1, 2023 13:37
@pyrco pyrco force-pushed the feature/dis-2526_fix-hash-records branch 2 times, most recently from 6b90733 to 22b2c02 Compare November 2, 2023 15:44
warnings.warn(
(
"The function passing in a path field did not resolve that path. ",
"The autmatic resolving of path fields will be removed in the future.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"The autmatic resolving of path fields will be removed in the future.",
"The automatic resolving of path fields will be removed in the future.",

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why, by the way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this part of the code should not have any knowledge about what the meaning of the content of a field is. That is the responsibility of the plugin creating the record. The plugin knows whether it is useful to resolve a path or not. It's correct separation of responsibilities.

Copy link
Member

@Schamper Schamper Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about when for preservation of the original data I want to put the original path from e.g. the registry in the record which might include an environment variable, but want to hash that file? Should I create a second field like real_path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point. You do want to preserve original values. Though putting unresolved paths with e.g. env variables feels problematic. In my mind a (pathlib style) path type means a literal path, where percentages, dollars and all are part of the actual path, not of potential env variables, otherwise you would still be guessing if it is an actual path or if it still needs to be resolved.

In an ideal world I think you would have 2 values, a literal string which contains whatever was parsed from the artifact and a path which is resolved and present on the target. You would still have the side cases of: 1. what if the path correctly resolves but is not present in the target and 2. what if the path does not or partially resolve how can you be sure the dollar/percentage is part of the path or an unresolved env variable.

Although the last side case won't be solvable in practice, I do think we need to move to a system where a path is as fully resolved as possible and if the literal value, is of value it should be present in the record as a second (string) value. Also because people tend to want to put paths that come out of target-query into something like target-fs to get to the file, which will fail if it is not resolved (we could of course change target-fs to (have the option to) resolve paths).

But given that this is more a policy discussion then something that should be realized by this PR, I'll revert the deprecation warning on unresolved paths for now.

@pyrco pyrco force-pushed the feature/dis-2526_fix-hash-records branch 2 times, most recently from 2179a0d to 8ccfd8d Compare November 3, 2023 09:23
@pyrco pyrco requested review from Schamper and Miauwkeru November 3, 2023 12:16
@pyrco pyrco force-pushed the feature/dis-2526_fix-hash-records branch 2 times, most recently from b8556d3 to 3717028 Compare November 6, 2023 09:59
@pyrco pyrco force-pushed the feature/dis-2526_fix-hash-records branch from 3717028 to c181dbe Compare November 7, 2023 09:21
@pyrco pyrco merged commit 6e10501 into main Nov 7, 2023
@pyrco pyrco deleted the feature/dis-2526_fix-hash-records branch November 7, 2023 09:39
Poeloe pushed a commit that referenced this pull request Feb 29, 2024
Zawadidone pushed a commit to Zawadidone/dissect.target that referenced this pull request Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

target-query --hash function broken
3 participants