Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add ability to add hex encoded colors to outline items #1186

Merged
merged 8 commits into from
Jul 31, 2022

Conversation

mtd91429
Copy link
Contributor

Add hexstring and basic named color parsing to PdfWriter method add_outline_item()

@MartinThoma
Copy link
Member

I fear that this might overall do more harm than good:

  1. Consistency: people can now expect those two types of color specifications in every place where PyPDF2 is accepting a color argument
  2. Simplicity: it's clear what color a hex string maps to, but which shade of gray is "gray"?
  3. Spelling: gray? Or Gray? GRAY? Grey?
  4. Interface : I like to keep parameters simple - giving different types and different logic around those types to one parameter is something I try to avoid.

Those disadvantages can be avoided when you would create a Color helper class. That helper class could have class attributes that map to the hex value. Or even better they are returning a Color object which has a ".tohex()" function... This way you could also add methods like darken / mix / interpolate to combine colors.

Maybe there are even good 3rd party libraries that offer this?

@MartinThoma
Copy link
Member

Or we really add just a minimal color class that has those color names as class attributes and the hey codes as values... That might be a good middle ground

@MartinThoma
Copy link
Member

What do you think about https://pypi.org/project/colour/?

@mtd91429
Copy link
Contributor Author

You raise some very good points, many of which I thought about but wasn't sure what was the best path forward.

My overall aim goal was to reduce the clunkiness of the current color attribute within the outline item (with the potential to expand it to other aspects of the library). The RGB tuple from 0-1 is not very intuitive or common for encoding colors. At first, I simply wanted to add the hex values in basic 6 digit format (ex.,#800080). I think those are the most straight forward, common due to the web, and easier to work-with.

But then I thought that sometimes, you don't want to look up a hex code for a basic color and thought using color names might be reasonable. For example, simply color="blue". As I started to dig into named color formats, I realized that was a larger can of worms than I wanted. See the Note in W3C docs, xkcd color survery. For middle ground, I settled on the basic 16 colors from the Web specification, then added Orange because it seemed like a notable exception for a basic color within the English language. 🤷‍♂️

What do you think about https://pypi.org/project/colour/?

That was the logical step if we wanted to go "all in" on color features. But I don't think that colors are that central to the primary purpose of PyPDF2. I think it is overkill for the needs of the library.

a minimal color class that has those color names as class attributes and the hex codes as values

Yea, that may be a better idea. Keep it simple.

I think in retrospect either 1) ditching the naming convention entirely and only supporting hexstrings or 2) use a minimal class to serve essentially as a dictionary for convenience.

@MartinThoma
Copy link
Member

At first, I simply wanted to add the hex values in basic 6 digit format (ex.,#800080). I think those are the most straight forward, common due to the web, and easier to work-with.

Yes!

What do you think about https://pypi.org/project/colour/?

That was the logical step if we wanted to go "all in" on color features. But I don't think that colors are that central to the primary purpose of PyPDF2. I think it is overkill for the needs of the library.

I agree. I was thinking more about having a look at this library and pointing to it in the documentation :-)

  1. ditching the naming convention entirely and only supporting hexstrings or 2) use a minimal class to serve essentially as a dictionary for convenience

Both make sense to me. I think we could do both at the same time.

However, we would need to support both, the (r, g, b) tuple and the hex-string, for a while.

@mtd91429 mtd91429 changed the title ENH: Add basic color names and ability to parse hex encoded colors to outline items ENH: Add parse hex encoded colors to outline items Jul 31, 2022
@mtd91429 mtd91429 changed the title ENH: Add parse hex encoded colors to outline items ENH: Add ability to add hex encoded colors to outline items Jul 31, 2022
@mtd91429
Copy link
Contributor Author

I slept on it and think that removing the named colors is probably the best route forward. It's more complicated than it is worth. Rather, I'm proposing to add the ability to parse hex strings instead. I've only implemented this for the outline items but think this could be expanded to other areas of the PDF where the color attribute is employed. For now, I think this is only within the outline item workflow, but is implied for other components (such as annotations) via a dictionary

@codecov
Copy link

codecov bot commented Jul 31, 2022

Codecov Report

Merging #1186 (8c8c3a8) into main (42ae312) will decrease coverage by 0.03%.
The diff coverage is 80.00%.

@@            Coverage Diff             @@
##             main    #1186      +/-   ##
==========================================
- Coverage   92.14%   92.10%   -0.04%     
==========================================
  Files          24       24              
  Lines        4911     4913       +2     
  Branches     1016     1017       +1     
==========================================
  Hits         4525     4525              
  Misses        244      244              
- Partials      142      144       +2     
Impacted Files Coverage Δ
PyPDF2/_writer.py 88.16% <ø> (ø)
PyPDF2/generic.py 91.59% <80.00%> (-0.19%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 42ae312...8c8c3a8. Read the comment docs.

PyPDF2/generic.py Outdated Show resolved Hide resolved
tests/test_writer.py Outdated Show resolved Hide resolved
@MartinThoma MartinThoma merged commit 7c7ef77 into py-pdf:main Jul 31, 2022
@MartinThoma
Copy link
Member

@mtd91429 Thank you for all the work you continuously put into PyPDF2 🤗

MartinThoma added a commit that referenced this pull request Jul 31, 2022
New Features (ENH):
-  Add ability to add hex encoded colors to outline items (#1186)
-  Add support for pathlib.Path in PdfMerger.merge (#1190)
-  Add link annotation (#1189)
-  Add capability to filter text extraction by orientation  (#1175)

Bug Fixes (BUG):
-  Named Dest in PDF1.1 (#1174)
-  Incomplete Graphic State save/restore (#1172)

Documentation (DOC):
-  Update changelog url in package metadata (#1180)
-  Table extraction (#1179)
-  Mention pyHanko for signing PDF documents (#1178)
-  We now have CMAP support (#1177)

Maintenance (MAINT):
-  Consistant usage of warnings / log messages (#1164)
-  Consistent terminology for outline items (#1156)

Code Style (STY):
-  Apply pre-commit (#1188)

Full Changelog: 2.8.1...2.9.0
@mtd91429 mtd91429 deleted the outline-item-color branch August 1, 2022 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants