feat(heuristics): add whitespace check to detect excessive spacing and invisible characters for malware check #1086
base: main
src/macaron/malware_analyzer/pypi_heuristics/sourcecode/white_spaces.py
…d invisible characters Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
db0e35c to 6978bd7
src/macaron/slsa_analyzer/checks/detect_malicious_metadata_check.py
Signed-off-by: Ben Selwyn-Smith <benselwynsmith@googlemail.com>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
…ackages (oracle#965) Include support for using Semgrep for analysis of source code to detect malicious code patterns, specified using Semgrep's YAML files. Signed-off-by: Carl Flottmann <carl.flottmann@oracle.com>
This PR allows Macaron to discover GitHub attestation. To retrieve these attestations, the SHA256 hash of the related artefact is required. Hashes are computed from local artefact files if available, or from downloaded ones otherwise. Signed-off-by: Ben Selwyn-Smith <benselwynsmith@googlemail.com>
…acle#1096) This PR replaces the Go shared library previously used via C-bindings in Python with a standalone binary for the cuevalidator component. The binary can now be invoked as a subprocess, simplifying integration and improving portability. Signed-off-by: behnazh-w <behnaz.hassanshahi@oracle.com>
…e. (oracle#1102) The detail info containing inspector links now contains links as keys regardless of whether they are reachable, and includes a boolean value for reachability. Signed-off-by: Carl Flottmann <carl.flottmann@oracle.com>
…torial (oracle#1101) Signed-off-by: Carl Flottmann <carl.flottmann@oracle.com>
oracle#1097) Signed-off-by: Amine <amine.raouane@enim.ac.ma>
…d invisible characters Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Signed-off-by: Amine <amine.raouane@enim.ac.ma>
tests/malware_analyzer/pypi/resources/sourcecode_samples/obfuscation/inline_imports.py
…tion threshold Signed-off-by: Amine <amine.raouane@enim.ac.ma>
The CI test seems to be failing due to detecting excessive whitespace in |
Dismissing approval until integration test failure is resolved.
I have investigated this problem. in
In
Both of these examples are triggered by the excessive whitespace Semgrep rule as there are over 50 spaces before some of the indented lines. Both of these examples occur in docstrings, so my proposed solution (which I have tested does not trigger on
I have used
Something we may have to be wary of is benign code blocks that are excessively indented, which will cause this rule to trigger. Most projects will not encounter this, as their indentation will not exceed 50 spaces and/or code linters will prevent it from happening, so I don't expect many false positives, but it is a possibility.
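To make the docstring exclusion concrete, here is a minimal Python sketch of the same idea. It is not the Semgrep rule from this PR: it uses the standard `ast` module as a stand-in, and the function names and the `max_spaces` parameter are hypothetical. It flags lines indented by more than 50 spaces unless they fall inside a docstring.

```python
import ast


def docstring_lines(source: str) -> set[int]:
    """Return the 1-based line numbers covered by docstrings in the source."""
    lines: set[int] = set()
    owners = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, owners) or not node.body:
            continue
        first = node.body[0]
        # A docstring is a bare string expression that opens the body.
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            lines.update(range(first.lineno, first.end_lineno + 1))
    return lines


def over_indented_lines(source: str, max_spaces: int = 50) -> list[int]:
    """Return line numbers indented by more than max_spaces, skipping docstrings.

    max_spaces=50 mirrors the figure mentioned in the discussion; it is an
    assumed default here, not a value read from the PR's rule file.
    """
    skip = docstring_lines(source)
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in skip or not line.strip():
            continue  # ignore docstring lines and blank lines
        indent = len(line) - len(line.lstrip(" "))
        if indent > max_spaces:
            flagged.append(number)
    return flagged
```

As noted above, deeply indented benign code outside docstrings would still be flagged by a check of this shape.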
Resolve CI errors by not considering large whitespace text present in function docstrings. Signed-off-by: Carl Flottmann <carlflottmann@gmail.com>
The pattern-not-inside was found to not behave as expected. This was resolved, along with formatting. Signed-off-by: Carl Flottmann <carl.flottmann@oracle.com>
The CI failure has been resolved by ignoring excessive spacing present in docstrings.
Summary
This PR adds a new heuristic that analyzes code to detect suspicious use of excessive spaces and invisible characters. It checks whether the amount of spacing and invisible Unicode characters exceeds a defined threshold.
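A rough sketch of what such a threshold check could look like is given below. It is illustrative only: the set of invisible code points, the function names, and the default thresholds are assumptions, not the values or the implementation used in this PR.

```python
import re

# Assumed sample of invisible code points (zero-width space, zero-width
# non-joiner, zero-width joiner, word joiner, byte-order mark).
INVISIBLE_CHARS = frozenset("\u200b\u200c\u200d\u2060\ufeff")


def count_invisible(source: str) -> int:
    """Count characters in source that belong to the assumed invisible set."""
    return sum(1 for ch in source if ch in INVISIBLE_CHARS)


def count_long_space_runs(source: str, max_spaces: int = 50) -> int:
    """Count runs of consecutive spaces longer than max_spaces."""
    pattern = re.compile(r" {%d,}" % (max_spaces + 1))
    return len(pattern.findall(source))


def is_suspicious(
    source: str,
    max_invisible: int = 0,
    max_runs: int = 0,
    max_spaces: int = 50,
) -> bool:
    """Return True when either count exceeds its (hypothetical) threshold."""
    return (
        count_invisible(source) > max_invisible
        or count_long_space_runs(source, max_spaces) > max_runs
    )
```

Keeping the two counts as separate functions lets each threshold be tuned independently.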
Description of changes
- WhiteSpaces heuristic in a new Python module.
- heuristics.py file.
- WhiteSpacesAnalyzer heuristic.
- detect_malicious_metadata_check.py to integrate and execute the new heuristic logic during analysis.
Related issues
None
Checklist
verified label should appear next to all of your commits on GitHub.