Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix matched string #363

Merged
merged 9 commits into from
Jul 6, 2022
Merged

Conversation

sushantmimani
Copy link
Contributor

To help us get this pull request reviewed and merged quickly, please be sure to include the following items:

  • Tests (if applicable)
  • Documentation (if applicable)
  • Changelog entry
  • A full explanation here in the PR description of the work done

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Tests
  • Other

Backward Compatibility

Is this change backward compatible with the most recently released version? Does it introduce changes which might change the user experience in any way? Does it alter the API in any way?

  • Yes (backward compatible)
  • No (breaking changes)

Issue Linking

Fixes #347 #183

What's new?

  • Fixes the leading +/- in matched_strings bug. Displays a deprecation warning for signatures generated using older versions of Tartufo where the actual string was prefixed by a + or - and provides a new signature to replace it with.

sushantmimani and others added 7 commits June 29, 2022 10:32
It's not actually true that *all* chunks are diff output; local folder
scans have just raw content (prefixed with filename). To fix this:

* Add a new "is_diff" property to the Chunk type. We do this instead of
  adding this to the existing metadata property so that we do not
  perturb user-visible output (where we blindly render every member).
* Fix all Chunk generators to set .is_diff properly.
* Adjust scan_entropy() so that it removes the first character from each
  line of a chunk when .is_diff is True, and leaves it alone otherwise.
  The first character is preserved in expectation that we might use it
  for some backwards-compatibility kludge (but right now it just pisses
  off pylint because it's never consumed).

Unit tests updated, mostly to just add the missing constructor
parameter. The local folder scan tests now pass again.

Conundrums left for another day:

* Technically, scan_regex() should be doing the same thing, but this
  also would introduce a compatiblity break and it's more effort because
  we presently don't do any chunk interpretation. It's probably better
  to just leave it as-is.
* I notice that bizarrely one of the pre-commit unit tests fails if you
  have staged files present. It works fine as long as you don't have any
  such files. Based on what it claims to be doing, this behavior
  shouldn't be occurring in the first place.
I think. This code isn't tested yet. Overhead should be relatively small
(aka manageable) because we go to the effort of checking only if:
* There is an entropy finding
* It isn't excluded
* It appears at the beginning of a line

In that situation, we retry the exclusion based on the string that
tartufo used to use. If THAT gets excluded, then we:
* Do not generate an issue
* We do create issues (temporarily) for both the old and new strings
* We report the old and new signatures and tell users to update their
  configuration
All instances of line should have been replaced by analyze.
Spotted by @sushantmimani
Fix chunk handling for entropy scanner
Copy link
Contributor

@rbailey-godaddy rbailey-godaddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One stupid "Tartufo" -> "tartufo" change. Everything else looks fine.

tartufo/scanner.py Outdated Show resolved Hide resolved
Co-authored-by: Scott Bailey <72747501+rbailey-godaddy@users.noreply.github.com>
Copy link
Contributor

@emayuri-godaddy emayuri-godaddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@sushantmimani sushantmimani merged commit 6efe52f into godaddy:main Jul 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

matched_string contains "+" character, because of that BFG is unable to clean git history
4 participants