WARNING: Unsupported annotation subtype: /'Square' #46
Comments
Do you have a sample annotated PDF you can share? |
I took a quick look at the spec. A "Square" annotation (which despite the name is really a rectangle) could probably be treated as a more limited form of highlight where we capture any text inside the rectangle. Would that make sense for your use-case? |
Yes. The attached PDF contains both "Highlights" and the Square annotation type that Apple uses with its highlighter-style annotation palette on all iOS devices. |
Yes, it would. I would absolutely prefer to also have the text returned within the Square. :-) |
The PDF you uploaded contains a single Square annotation, but the "Rect" property for that annotation (which defines the region it covers on the page) begins at "E - Express" and continues until the end of "... you could pluck bits of". It includes the twitter link and the paragraph below it that are not actually highlighted in the pale yellow colour, so I don't think capturing all that text is what you want or what the program intended. I also noticed that other programs (e.g. the Chrome PDF reader, and SumatraPDF) don't interact with these annotations in the same way they do with true Highlight annotations, so I'm at a bit of a loss for what to do about them. |
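To illustrate the problem described above, here is a minimal sketch of what "capture any text inside the rectangle" would amount to. It is not pdfannots code: `Box`, `contains`, and `text_in_rect` are hypothetical names, the coordinates are invented, and it assumes each word already comes with a bounding box in the same page coordinates as the annotation's "Rect".

```python
from typing import List, NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box in PDF page coordinates (lower-left, upper-right)."""
    x0: float
    y0: float
    x1: float
    y1: float


def contains(outer: Box, inner: Box) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return (outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and inner.x1 <= outer.x1 and inner.y1 <= outer.y1)


def text_in_rect(rect: Box, words: List[Tuple[str, Box]]) -> str:
    """Join every word whose box falls inside the annotation rectangle."""
    return " ".join(text for text, box in words if contains(rect, box))


# Invented layout: two words inside the rectangle, one far below it.
rect = Box(50, 400, 550, 700)
words = [
    ("E", Box(60, 680, 70, 695)),
    ("Express", Box(80, 680, 130, 695)),
    ("footer", Box(60, 30, 100, 45)),
]
captured = text_in_rect(rect, words)
```

Because the test is purely geometric, everything inside the rectangle is captured, whether or not the visible yellow markup actually covers it. That is exactly the over-capture seen with the uploaded file.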
Your annotations also have extra information in some proprietary fields under |
Going back to your original request, perhaps it's best if we treat "Square" as another type of note, so pdfannots can tell you that it's there on a certain page, and can emit the contents (text comment associated with the annotation) if it has any, but doesn't attempt to capture text covered by the annotation. |
Of course. It's Apple, so I understand. Having pdfannots recognize the existence of the annotation, along with any notes, as you suggested, is a good solution. |
Per issue #46, Apple tools support "highlighting" PDFs where the highlights are emitted as a Square annotation with a custom appearance that renders the markup. Figuring out the affected text would be a major undertaking, but with this change we (1) recognise the existence of the annotation rather than emitting an "unsupported annotation" warning, and (2) capture the contents (text note) of the annotation, if any.
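The behaviour described in this change could be sketched roughly as follows. This is an illustration only, not the actual pdfannots implementation: the function `summarise`, the two subtype sets, and the returned dictionary keys are all hypothetical.

```python
import warnings
from typing import Optional

# Hypothetical grouping: subtypes whose covered text is extracted,
# and subtypes that are only reported together with their note.
CAPTURES_TEXT = {"Highlight"}
NOTE_ONLY = {"Square"}


def summarise(subtype: str, page: int,
              contents: Optional[str] = None) -> Optional[dict]:
    """Describe one annotation, or warn and return None if unsupported."""
    if subtype in CAPTURES_TEXT:
        return {"type": subtype, "page": page,
                "contents": contents, "captures_text": True}
    if subtype in NOTE_ONLY:
        # Recognised, but no attempt is made to work out the covered text.
        return {"type": subtype, "page": page,
                "contents": contents, "captures_text": False}
    warnings.warn(f"Unsupported annotation subtype: {subtype}")
    return None


square = summarise("Square", page=3)
```

With this shape, a Square annotation without a comment still yields a record carrying its page number, even though there is no text to show.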
Square annotations are now recognised; however, depending on the output format, you may not see anything, or you may just get a warning, if there is no actual "content" text. They'll always be there in the json format ( |
Great. How can I install the version that has this change? I understand this change has not yet been made part of a release, so I cannot use "pip install pdfannots". I've tried a few things after downloading the zip, but I've been unsuccessful. |
Per #35 I think you can do:
(but I haven't tested it myself.) |
That seems to have installed the update because I no longer get the error message. But, neither do I get any information about the existence of the square. Using the file I provided, what output should I expect? |
That's what I alluded to. If there is no comment, there's nothing meaningful to output as markdown, but json has it.
|
pdfannots --no-group
Yes... that's what I was missing. I see it now. Thank you. |
For the sake of completeness, this is the check that now works for me. I use it in Hazel, to check if a PDF has any annotations in it. If yes, I add a Finder tag so that I can easily find the file (on my Mac, iPad, or iPhone) for further follow-up. I have a separate script, using pdfannots, to pull the highlighted text for further distillation.
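A check of this kind could look roughly like the sketch below. It is not the poster's script. It assumes that `pdfannots --format json FILE` writes a JSON array to standard output with one entry per annotation; both the flag spelling and the output shape are assumptions to verify against `pdfannots --help` for the installed version. The function names are hypothetical.

```python
import json
import subprocess


def has_annotations(json_text: str) -> bool:
    """True if the text is a non-empty JSON array (assumed: one entry per annotation)."""
    data = json.loads(json_text)
    return isinstance(data, list) and len(data) > 0


def pdf_has_annotations(path: str) -> bool:
    """Run pdfannots on `path` and report whether it found any annotation."""
    # Assumption: the JSON output goes to stdout when no output file is given.
    result = subprocess.run(
        ["pdfannots", "--format", "json", path],
        capture_output=True, text=True, check=True,
    )
    return has_annotations(result.stdout)
```

A wrapper script could call `pdf_has_annotations` on the file it is given and exit with status 0 or 1 accordingly, which is the usual way for a shell-script condition to signal a match.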
|
How can I have pdfannots return any value at all - page number, etc. - for annotation subtype: /'Square'?
Here's why: the markup annotation used in iOS (the marker highlighter annotation) is apparently reported by pdfannots as subtype: /'Square', although my PDF app, PDF Expert, reports this annotation type as "Rectangle".
For this use case, I simply need pdfannots to return even just a page number, for any type of annotation, at all, on my PDF.