@j-t-1
Contributor

@j-t-1 j-t-1 commented Jul 29, 2025

No description provided.

@codecov

codecov bot commented Jul 29, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.95%. Comparing base (2a91bd4) to head (f0cd742).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3403      +/-   ##
==========================================
+ Coverage   96.94%   96.95%   +0.01%     
==========================================
  Files          55       55              
  Lines        9333     9341       +8     
  Branches     1708     1708              
==========================================
+ Hits         9048     9057       +9     
+ Misses        170      169       -1     
  Partials      115      115              


@j-t-1
Contributor Author

j-t-1 commented Jul 29, 2025

Would adding BlackIs1 decrease code coverage?

@stefan6419846
Collaborator

> Would adding BlackIs1 decrease code coverage?

This parameter is already implemented globally and IMHO should not be required here as it requires the actual image anyway:

pypdf/pypdf/filters.py

Lines 956 to 958 in 2a91bd4

if lfilters == FT.CCITT_FAX_DECODE and decode_parms.get("/BlackIs1", BooleanObject(False)).value is True:
    from PIL import ImageOps  # noqa: PLC0415
    img = ImageOps.invert(img)
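
For readers without the pypdf source at hand, the conditional inversion quoted above can be approximated in plain Python. This is only a sketch: `apply_black_is_1` is a hypothetical helper, a plain mapping stands in for pypdf's `DictionaryObject`/`BooleanObject`, and a byte-wise XOR on packed 1-bit data stands in for `ImageOps.invert`.

```python
from typing import Any, Mapping


def apply_black_is_1(bits: bytes, decode_parms: Mapping[str, Any]) -> bytes:
    """Invert packed 1-bit pixel data when /BlackIs1 is true (hypothetical helper)."""
    # /BlackIs1 defaults to false, so a missing key leaves the data untouched.
    if decode_parms.get("/BlackIs1", False) is True:
        # Flip every bit; note that any padding bits at the end of a row flip too.
        return bytes(byte ^ 0xFF for byte in bits)
    return bits
```

With an empty mapping the data comes back unchanged; with `{"/BlackIs1": True}` every bit is flipped.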

@j-t-1
Contributor Author

j-t-1 commented Jul 29, 2025

You want to have them all in one place; leaving it out suggests that its non-inclusion is because of something elsewhere.

@stefan6419846
Collaborator

Sorry, but I do not understand what you mean by your last comment.

@j-t-1
Contributor Author

j-t-1 commented Jul 29, 2025

> Sorry, but I do not understand what you mean by your last comment.

I think it should be at the top of CCITTParameters, together with the rest of them. It is the only one excluded. The other attributes also require the image.
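
To make the proposal concrete, here is a rough sketch of a parameter container that lists BlackIs1 next to the other optional CCITTFaxDecode entries. `CCITTParametersSketch` and `parameters_from` are hypothetical names, not the actual pypdf class or API; the field names and defaults are taken from the PDF specification's table of optional CCITTFaxDecode parameters, and a plain mapping stands in for the decode parameters dictionary.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping, Optional


@dataclass
class CCITTParametersSketch:
    """Hypothetical stand-in; defaults follow the PDF specification."""

    K: int = 0
    EndOfLine: bool = False
    EncodedByteAlign: bool = False
    Columns: int = 1728
    Rows: int = 0
    EndOfBlock: bool = True
    BlackIs1: bool = False
    DamagedRowsBeforeError: int = 0


def parameters_from(parms: Optional[Mapping[str, Any]]) -> CCITTParametersSketch:
    """Overlay the entries present in parms on top of the defaults."""
    result = CCITTParametersSketch()
    if parms is None:
        return result
    for field in fields(result):
        key = f"/{field.name}"  # PDF names carry a leading slash
        if key in parms:
            setattr(result, field.name, parms[key])
    return result
```

Keeping every entry in one place means a reader can see the full parameter set and its defaults at a glance, even if some fields are not consumed by the decoder.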

@stefan6419846
Collaborator

I am still a bit confused about the benefits. Does this have any advantage besides just having it there? Some of the other values might be valuable for the decoder, but BlackIs1 is purely a color transformation.

@j-t-1
Contributor Author

j-t-1 commented Jul 30, 2025

> I am still a bit confused about the benefits. Does this have any advantage besides just having it there? Some of the other values might be valuable for the decoder, but BlackIs1 is purely a color transformation.

The advantage is consistency: some of the others are not used either, yet they are there.

We also have a comment: 0 = WhiteIsZero.
https://github.com/py-pdf/pypdf/blob/2a91bd4d0b5bda90f2eae741e383813b6cda9721/pypdf/filters.py#L656C47-L656C58

This means BlackIsOne.

From the specification: BlackIs1 has default value: false.

So, I am unsure if the comment is correct. Should it be BlackIs1 = false?

@stefan6419846
Collaborator

Regarding WhiteIsZero: You are of course invited to attempt to refactor the current BlackIs1 implementation to address this at the Pillow level.

@j-t-1
Contributor Author

j-t-1 commented Jul 30, 2025

> Regarding WhiteIsZero: You are of course invited to attempt to refactor the current BlackIs1 implementation to address this at the Pillow level.

@stefan6419846 I barely understand this! Is that code directly from Pillow?

@stefan6419846
Collaborator

Evaluating the BlackIs1 parameter in

0, # Thresholding, SHORT, 1, 0 = WhiteIsZero
might allow us to drop the explicit manual processing at

pypdf/pypdf/filters.py

Lines 956 to 958 in 2a91bd4

if lfilters == FT.CCITT_FAX_DECODE and decode_parms.get("/BlackIs1", BooleanObject(False)).value is True:
    from PIL import ImageOps  # noqa: PLC0415
    img = ImageOps.invert(img)
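
As a rough illustration of this idea, the sketch below builds a single little-endian TIFF directory entry whose value depends on BlackIs1. It rests on assumptions the thread does not verify: `photometric_entry` is a hypothetical helper, and it assumes that switching the TIFF PhotometricInterpretation tag (262) from 0 (WhiteIsZero) to 1 (BlackIsZero) is equivalent to the manual inversion for this use case.

```python
import struct

PHOTOMETRIC_INTERPRETATION = 262  # TIFF 6.0 tag number
TIFF_SHORT = 3  # TIFF field type for 16-bit unsigned values
WHITE_IS_ZERO = 0
BLACK_IS_ZERO = 1


def photometric_entry(black_is_1: bool) -> bytes:
    """Pack one 12-byte IFD entry: tag, type, count, value (hypothetical helper)."""
    # Assumption: BlackIs1 = true maps to BlackIsZero, mirroring the inversion.
    value = BLACK_IS_ZERO if black_is_1 else WHITE_IS_ZERO
    # "<" = little endian; a SHORT value occupies the low bytes of the 4-byte slot.
    return struct.pack("<HHLL", PHOTOMETRIC_INTERPRETATION, TIFF_SHORT, 1, value)
```

Whether Pillow then renders the result identically to the inverted image would still need to be checked against real CCITT test files.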

@j-t-1
Contributor Author

j-t-1 commented Jul 30, 2025

I have opened some other PRs to make it easier to understand.

I will implement this in a new PR. Using decode_parms if it is not None is going to be necessary?

pypdf/pypdf/filters.py

Lines 616 to 627 in 01c98a5

def decode(
    data: bytes,
    decode_parms: Optional[DictionaryObject] = None,
    height: int = 0,
    **kwargs: Any,
) -> bytes:
    # decode_parms is unused here
    if isinstance(decode_parms, ArrayObject):  # deprecated
        deprecation_no_replacement(
            "decode_parms being an ArrayObject", removed_in="3.15.5"
        )
    params = CCITTFaxDecode._get_parameters(decode_parms, height)

@stefan6419846
Collaborator

Sorry, but I do not understand what you mean.

What are you planning to implement in a new PR?

> Using decode_parms if it is not None is going to be necessary?

It already is necessary to process some of the parameters. What do you intend to change?
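
For context, handling an optional decode_parms usually amounts to falling back to defaults when it is None. The sketch below is not the actual `_get_parameters` implementation: `read_k_and_columns` is a hypothetical helper, a plain dict stands in for `DictionaryObject`, and the defaults (K = 0, Columns = 1728) are the ones given in the PDF specification.

```python
from typing import Any, Optional


def read_k_and_columns(decode_parms: Optional[dict[str, Any]]) -> tuple[int, int]:
    """Return (K, Columns), using the specification defaults when absent."""
    k, columns = 0, 1728  # defaults from the PDF specification
    if decode_parms is not None:
        k = int(decode_parms.get("/K", k))
        columns = int(decode_parms.get("/Columns", columns))
    return k, columns
```

Calling it with `None` yields the defaults, while a dictionary overrides only the keys it contains.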

@j-t-1
Contributor Author

j-t-1 commented Aug 1, 2025

> Sorry, but I do not understand what you mean.
>
> What are you planning to implement in a new PR?
>
> > Using decode_parms if it is not None is going to be necessary?
>
> It already is necessary to process some of the parameters. What do you intend to change?

Let's get the existing PRs done, and then see where we are. Is this PR good to go?

@stefan6419846
Collaborator

We previously discussed moving BlackIs1 processing into the generated list instead of implementing this with Pillow manually, as we are adding it to the parameter set. Did you try this and were you successful? If this is feasible without having to analyze it further, I vote to change this here directly.

@j-t-1
Contributor Author

j-t-1 commented Aug 2, 2025

Yes, it is better in this PR, where I have created the BlackIs1 processing. The removal of the part in _xobj_to_image depends on #3415. If that is okay, can we then merge that into this one?

@stefan6419846
Collaborator

stefan6419846 commented Aug 4, 2025

Could you please rebase your changes on the current main branch? Additionally, if we want to move the parameter, we should remove the section from _xobj_to_image and properly evaluate the decode_parms.

stefan6419846 and others added 7 commits August 5, 2025 09:31
…pdf#3411)

Related to py-pdf#3051 and py-pdf#3408.

This updates some of the binary dependencies as well to avoid side effects on Python 3.14.
Nevertheless, Pillow 11.1.0 would indeed introduce a side effect, which required us to change
the tests to check pixel data instead of byte data for the PNG file comparison.
Additionally includes the related changes required for getting a clean CI with these library changes.
Use modern formatting.
UP035: checks for uses of deprecated imports based on the minimum supported Python version.
@j-t-1
Contributor Author

j-t-1 commented Aug 5, 2025

> Could you please rebase your changes on the current main branch? Additionally, if we want to move the parameter, we should remove the section from _xobj_to_image and properly evaluate the decode_parms.

I did the rebase, but I am unsure whether it was correct.

@j-t-1 j-t-1 closed this Aug 5, 2025
@j-t-1 j-t-1 deleted the CCITTParameters branch August 5, 2025 09:38
@stefan6419846
Collaborator

It seems like the rebase includes too many changes, but as you deleted the branch anyway, feel free to open a new clean PR for this to avoid confusion.
