Attempt to add fixing of BOMs #522

Merged: 1 commit, Oct 11, 2020
10 changes: 8 additions & 2 deletions .pre-commit-hooks.yaml
@@ -17,8 +17,8 @@
     language: python
     types: [python]
 -   id: check-byte-order-marker
-    name: Check for byte-order marker
-    description: Forbid files which have a UTF-8 byte-order marker
+    name: 'check BOM - deprecated: use fix-byte-order-marker'

Review comment (Member): I shortened the name here so it doesn't wrap -- I should probably add a test for this

+    description: forbid files which have a UTF-8 byte-order marker
     entry: check-byte-order-marker
     language: python
     types: [text]
@@ -131,6 +131,12 @@
     entry: file-contents-sorter
     language: python
     files: '^$'
+-   id: fix-byte-order-marker

Review comment (Member): I renamed this to fix-byte-order-marker so it's more clear what it does

+    name: fix UTF-8 byte order marker
+    description: removes UTF-8 byte order marker
+    entry: fix-byte-order-marker
+    language: python
+    types: [text]
 -   id: fix-encoding-pragma
     name: Fix python encoding pragma
     language: python
7 changes: 4 additions & 3 deletions README.md
@@ -42,9 +42,6 @@ Require literal syntax when initializing empty or zero Python builtin types.
 - Ignore this requirement for specific builtin types with `--ignore=type1,type2,…`.
 - Forbid `dict` keyword syntax with `--no-allow-dict-kwargs`.

-#### `check-byte-order-marker`
-Forbid files which have a UTF-8 byte-order marker
-
 #### `check-case-conflict`
 Check for files with names that would conflict on a case-insensitive filesystem like MacOS HFS+ or Windows FAT.

@@ -102,6 +99,9 @@ This hook replaces double quoted strings with single quoted strings.
 #### `end-of-file-fixer`
 Makes sure files end in a newline and only a newline.

+#### `fix-byte-order-marker`
+removes UTF-8 byte order marker
+
 #### `fix-encoding-pragma`
 Add `# -*- coding: utf-8 -*-` to the top of python files.
 - To remove the coding pragma pass `--remove` (useful in a python3-only codebase)
@@ -183,6 +183,7 @@ Trims trailing whitespace.
 [mirrors-autopep8](https://github.com/pre-commit/mirrors-autopep8)
 - `pyflakes`: instead use `flake8`
 - `flake8`: instead use [upstream flake8](https://gitlab.com/pycqa/flake8)
+- `check-byte-order-marker`: instead use fix-byte-order-marker

 ### As a standalone package

30 changes: 30 additions & 0 deletions pre_commit_hooks/fix_byte_order_marker.py
@@ -0,0 +1,30 @@
+import argparse
+from typing import Optional
+from typing import Sequence
+
+
+def main(argv: Optional[Sequence[str]] = None) -> int:
+    parser = argparse.ArgumentParser()
+    parser.add_argument('filenames', nargs='*', help='Filenames to check')

Review comment (Member): I removed the --fix=no -- I'd rather have formatters-only

+    args = parser.parse_args(argv)
+
+    retv = 0
+
+    for filename in args.filenames:
+        with open(filename, 'rb') as f_b:
+            bts = f_b.read(3)

Review comment (Member): there was a leaking file descriptor here, I fixed that

+
+        if bts == b'\xef\xbb\xbf':
+            with open(filename, newline='', encoding='utf-8-sig') as f:
+                contents = f.read()
+            with open(filename, 'w', newline='', encoding='utf-8') as f:
+                f.write(contents)
+
+            print(f'{filename}: removed byte-order marker')
+            retv = 1
+
+    return retv
+
+
+if __name__ == '__main__':
+    exit(main())
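
For readers who want to try the rewrite step outside the hook, here is a minimal, self-contained sketch of the same idea. It assumes nothing beyond the standard library. `strip_bom` and `demo` are hypothetical names used only for illustration and are not part of this PR. The sketch reads the first three bytes, and if they are the UTF-8 BOM it re-reads the file as `utf-8-sig` (which drops the BOM) and writes it back as plain `utf-8`. Passing `newline=''` on both opens is what keeps existing line endings such as `\r\n` untouched.

```python
import os
import tempfile

BOM = b'\xef\xbb\xbf'  # the UTF-8 byte-order marker


def strip_bom(path: str) -> bool:
    """Hypothetical helper mirroring the hook's per-file logic.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, 'rb') as f:
        if f.read(3) != BOM:
            return False
    # Decoding as utf-8-sig discards a leading BOM; newline='' disables
    # newline translation so line endings are preserved as-is.
    with open(path, newline='', encoding='utf-8-sig') as f:
        contents = f.read()
    with open(path, 'w', newline='', encoding='utf-8') as f:
        f.write(contents)
    return True


def demo() -> bytes:
    """Write a BOM-prefixed CRLF file, fix it, and return the final bytes."""
    fd, path = tempfile.mkstemp(suffix='.txt')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(BOM + b'ohai\r\n')
        assert strip_bom(path) is True   # first pass rewrites the file
        assert strip_bom(path) is False  # second pass finds nothing to do
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


if __name__ == '__main__':
    print(demo())
```

The second call returning `False` illustrates why a fixer-style hook settles after one run: once the file is rewritten there is nothing left to change.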
1 change: 1 addition & 0 deletions setup.cfg
@@ -48,6 +48,7 @@ console_scripts =
     double-quote-string-fixer = pre_commit_hooks.string_fixer:main
     end-of-file-fixer = pre_commit_hooks.end_of_file_fixer:main
     file-contents-sorter = pre_commit_hooks.file_contents_sorter:main
+    fix-byte-order-marker = pre_commit_hooks.fix_byte_order_marker:main
     fix-encoding-pragma = pre_commit_hooks.fix_encoding_pragma:main
     forbid-new-submodules = pre_commit_hooks.forbid_new_submodules:main
     mixed-line-ending = pre_commit_hooks.mixed_line_ending:main
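
As a rough illustration of what the `console_scripts` line means, each entry maps a command name to a `module:function` reference, and the generated script passes the function's return value to `sys.exit`. The sketch below is a simplified stand-in, not how setuptools actually resolves entry points. `resolve_entry_point` is a hypothetical name, and it resolves `json:dumps` from the standard library so that it runs without `pre_commit_hooks` installed.

```python
import importlib
from typing import Callable


def resolve_entry_point(spec: str) -> Callable[..., object]:
    """Resolve a 'package.module:function' string to the callable it names.

    Simplified on purpose: real entry points also allow dotted attributes.
    """
    module_name, sep, attr = spec.partition(':')
    if not sep:
        raise ValueError(f'expected "module:function", got {spec!r}')
    return getattr(importlib.import_module(module_name), attr)


# Standard-library stand-in for
# 'pre_commit_hooks.fix_byte_order_marker:main'.
dumps = resolve_entry_point('json:dumps')
```

Under that reading, the `1` that `main` returns after rewriting a file becomes the command's non-zero exit status.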
13 changes: 13 additions & 0 deletions tests/fix_byte_order_marker_test.py
@@ -0,0 +1,13 @@
+from pre_commit_hooks import fix_byte_order_marker
+
+
+def test_failure(tmpdir):
+    f = tmpdir.join('f.txt')
+    f.write_text('ohai', encoding='utf-8-sig')
+    assert fix_byte_order_marker.main((str(f),)) == 1
+
+
+def test_success(tmpdir):
+    f = tmpdir.join('f.txt')
+    f.write_text('ohai', encoding='utf-8')
+    assert fix_byte_order_marker.main((str(f),)) == 0
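
The two fixtures above differ only in the encoding passed to `write_text`. A short sketch of what that difference amounts to at the byte level may help. `encode_fixture` is a hypothetical helper introduced here for illustration. The point is that `utf-8-sig` prepends the three BOM bytes on encode and strips them on decode, while plain `utf-8` leaves a BOM in the text as U+FEFF.

```python
import codecs


def encode_fixture(text: str, with_bom: bool) -> bytes:
    """Hypothetical helper: encode text the way the two test fixtures do."""
    return text.encode('utf-8-sig' if with_bom else 'utf-8')


with_bom = encode_fixture('ohai', with_bom=True)
without_bom = encode_fixture('ohai', with_bom=False)

# utf-8-sig prepends the BOM on encode and removes it on decode.
assert with_bom == codecs.BOM_UTF8 + b'ohai'
assert with_bom.decode('utf-8-sig') == 'ohai'

# Decoding BOM-less bytes as utf-8-sig is harmless.
assert without_bom.decode('utf-8-sig') == 'ohai'

# Plain utf-8 keeps the BOM as a leading U+FEFF character.
assert with_bom.decode('utf-8') == '\ufeffohai'
```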