Only check for @generated-like markers in file header (#2654)
* Only check for `@generated`-like markers in file header

Check a large but limited buffer from the start of the file.
Do not assume UTF-8 encoding or decode the text at all; search for the marker as a byte string instead.
Check for the highest-priority marker first, so the function can return early when it is found.

This prevents performance problems with large files.

* Fix CSpell linter fault

---------

Co-authored-by: nvuillam <nicolas.vuillamy@gmail.com>
sanmai-NL and nvuillam authored Oct 14, 2023
1 parent 59e6cfb commit 4f5dcd5
Showing 3 changed files with 8 additions and 5 deletions.
3 changes: 2 additions & 1 deletion .cspell.json
@@ -405,6 +405,7 @@
"SHFMT",
"SNAKEFMT",
"SOQL",
+"SOURCEFILEHEADER",
"SOURCEPATHS",
"SQLFLUFF",
"STDLIB",
@@ -1466,4 +1467,4 @@
"zaach",
"zricethezav"
]
-}
+}
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ Note: Can be used with `oxsecurity/megalinter@beta` in your GitHub Action mega-l
- Core
- mega-linter-runner: Convert to ES6 and upgrade npm dependencies
- Add rust to checkov as it is a required dependency (to do that, allow to define empty string packages as cargo dependencies in descriptors)
+- Optimize `@generated` marker scanning ([#2654](https://github.com/oxsecurity/megalinter/pull/2654))

- Media
- [Achieve Code Consistency: MegaLinter Integration in Azure DevOps](https://techcommunity.microsoft.com/t5/azure-devops-blog/achieve-code-consistency-megalinter-integration-in-azure-devops/ba-p/3939448), by [Don Koning](https://techcommunity.microsoft.com/t5/user/viewprofilepage/user-id/2039143#profile) on [Microsoft Tech Community](https://techcommunity.microsoft.com/)
9 changes: 5 additions & 4 deletions megalinter/utils.py
@@ -14,6 +14,8 @@
from megalinter import config
from megalinter.constants import DEFAULT_DOCKER_WORKSPACE_DIR

+SIZE_MAX_SOURCEFILEHEADER = 1024
+
REPO_HOME_DEFAULT = (
    DEFAULT_DOCKER_WORKSPACE_DIR
    if os.path.isdir(DEFAULT_DOCKER_WORKSPACE_DIR)
@@ -278,10 +280,9 @@ def file_contains(file_name: str, regex_object: Optional[Pattern[str]]) -> bool:


def file_is_generated(file_name: str) -> bool:
-    with open(file_name, "r", encoding="utf-8", errors="ignore") as f:
-        content = f.read()
-    is_generated = "@generated" in content and "@not-generated" not in content
-    return is_generated
+    with open(file_name, "rb") as f:
+        content = f.read(SIZE_MAX_SOURCEFILEHEADER)
+    return b"@generated" in content and b"@not-generated" not in content


def decode_utf8(stdout):