Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Writing German characters into form fields #2047

Merged
merged 18 commits into from
Aug 13, 2023

Conversation

pubpub-zz
Copy link
Collaborator

closes #2035
closes #2021

@codecov
Copy link

codecov bot commented Jul 30, 2023

Codecov Report

Patch coverage: 66.66% and project coverage change: -0.11% ⚠️

Comparison is base (fb4f466) 94.28% compared to head (d4cb793) 94.17%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2047      +/-   ##
==========================================
- Coverage   94.28%   94.17%   -0.11%     
==========================================
  Files          41       41              
  Lines        7364     7386      +22     
  Branches     1451     1458       +7     
==========================================
+ Hits         6943     6956      +13     
- Misses        262      267       +5     
- Partials      159      163       +4     
Files Changed Coverage Δ
pypdf/_writer.py 87.62% <62.96%> (-0.56%) ⬇️
pypdf/_cmap.py 94.40% <100.00%> (+0.06%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

pypdf/_cmap.py Outdated Show resolved Hide resolved
pypdf/_cmap.py Outdated Show resolved Hide resolved
pubpub-zz and others added 5 commits July 30, 2023 22:40
Co-authored-by: Martin Thoma <info@martin-thoma.de>
Co-authored-by: Martin Thoma <info@martin-thoma.de>
@pubpub-zz
Copy link
Collaborator Author

Test converage is not wonderfull but I miss ideas

@MartinThoma all yours

@pubpub-zz
Copy link
Collaborator Author

PS : I will need this PR for a new one (which will deal with full unicode(arabic and asian chararacter sets) ) 😉

@pubpub-zz
Copy link
Collaborator Author

stdby

pypdf/_cmap.py Outdated Show resolved Hide resolved
pypdf/_writer.py Outdated
Comment on lines 887 to 889
raise AssertionError("can not find font dictionary")
logger_warning(f"can not find font dictionary for {font_name}", __name__)
font_full_rev = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code after the exception will never be executed

I would use ValueError instead of AssertionError: https://docs.python.org/3/library/exceptions.html#exception-hierarchy - but I cannot give a good reason why 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that using AssertionErrors usually is a bad approach. Why raising an exception at all when there apparently is a way to continue afterwards after having logged a (recoverable) violation/issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion error is here to detect issues when using pytest. In operational condition it is the logger_warning (non blocking) that should be raised

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand that. The Exception will also be raised in prod code, not only in pytest.

When you execute the following with python:

print("First")
raise AssertionError("the exception")
print("Second")

you get:

First
Traceback (most recent call last):
  File "/home/moose/foo.py", line 2, in <module>
    raise AssertionError("the exception")
AssertionError: the exception

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see my mistake. I originally used assert but ruff changed to the exception only.
I propose this code:

if __debug__:
            raise AssertionError("can not find font dictionary")
else:
            logger_warning(f"can not find font dictionary for {font_name}", __name__)

your opinion ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I didn't know about __debug__ - thanks for sharing 🙏

Sadly, I think it cannot be used like that. I think more than 90% of all users don't use/know the -O flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within pytest, we can make warnings which are not captured via with pytest.warns be treated as errors with pytest -W error: https://docs.python.org/3/using/cmdline.html#cmdoption-W

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is probably a good idea as default as well 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within pyproject.toml there is already:

[tool.pytest.ini_options]
addopts = "--disable-socket"
filterwarnings = ["error"]

isn't this sufficient consider warnings as errors ?
for the moment I removed the raise line

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that should do it. Didn't it work?

@pubpub-zz
Copy link
Collaborator Author

Coverage should improve with further PRs
@MartinThoma all yours

@pubpub-zz
Copy link
Collaborator Author

conflict solved

pypdf/_writer.py Outdated Show resolved Hide resolved
@MartinThoma MartinThoma changed the title BUG : writing german characters into fields BUG: Writing German characters into form fields Aug 13, 2023
@MartinThoma MartinThoma added the soon PRs that are almost ready to be merged, issues that get solved pretty soon label Aug 13, 2023
@MartinThoma MartinThoma merged commit 46ac7ad into py-pdf:main Aug 13, 2023
12 of 14 checks passed
MartinThoma added a commit that referenced this pull request Aug 13, 2023
## What's new

### Performance Improvements (PI)
-  optimize _decode_png_prediction (#2068)

### Bug Fixes (BUG)
-  Fix incorrect tm_matrix in call to visitor_text (#2060)
-  Writing German characters into form fields (#2047)
-  Prevent stall when accessing image in corrupted pdf (#2081)
-  append() fails when articles do not have /T (#2080)

### Robustness (ROB)
-  Cope with xref not followed by separator (#2083)

[Full Changelog](3.15.0...3.15.1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
soon PRs that are almost ready to be merged, issues that get solved pretty soon
Projects
None yet
3 participants