Content between xref section and trailer #511
Labels: documentation, Parked (e.g. passed to another TWG, next ISO spec)
The specification is somewhat ambiguous about the mutual position of a cross-reference section and the corresponding file trailer. That the latter should directly follow the former is, to the best of my knowledge, only made explicit in 7.5.1 in the form of a list (presumably calling the only cross-reference section a "table"), and then in Figures 2 and 3.
The next mention of their mutual position is in the new text of 7.5.4, saying "PDF comments shall not be included in a cross-reference table between the keywords `xref` and `trailer`". This, however, has two issues:
- `trailer` is not a part of the cross-reference table, so it is a strange thing to say "in" the table "between" the keywords when the latter keyword itself is not in it.
- [...] `trailer` whatsoever, derailing a person reading the specification in a linear fashion.
Note that the other similar mention near the end of 7.5.8.4, stating that "PDF comments shall not be included in a cross-reference table or in cross-reference streams", speaks of the content of the table (section?) itself, i.e., in the case of a "traditional" table, the subsection headers and the 20-character entries. (I'm having a hard time trying to imagine where a comment could ever fit inside the binary content of a cross-reference stream.) But my question is specifically about the space between the end of a cross-reference section (after its last subsection) and the beginning of a trailer.
This opens several questions. "Comments shall not be included": why comments, specifically? What situation does this intend to prevent (within the already existing constraints)? Are other forms of whitespace acceptable? If so, are they OK in any amount, as long as they do not include a PDF comment? If a PDF reader already has to be prepared to skip an arbitrary amount of whitespace, what extra complication would it be to skip comments too?
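To make the last question concrete, here is a minimal Python sketch (not taken from any existing reader; the function names are made up for illustration). It assumes the six white-space characters defined by the PDF specification (NUL, HT, LF, FF, CR, SP) and that a comment runs from `%` to the next CR or LF. Skipping comments on top of whitespace then seems to cost only one extra loop:

```python
# The six PDF white-space bytes: NUL, HT, LF, FF, CR, SP.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def skip_whitespace(data: bytes, pos: int) -> int:
    """Return the index of the first non-white-space byte at or after pos."""
    while pos < len(data) and data[pos] in PDF_WHITESPACE:
        pos += 1
    return pos


def skip_whitespace_and_comments(data: bytes, pos: int) -> int:
    """Like skip_whitespace, but also step over '%' comments (hypothetical helper)."""
    while True:
        pos = skip_whitespace(data, pos)
        if pos >= len(data) or data[pos] != ord("%"):
            return pos
        # A comment extends up to, but not including, the next CR or LF.
        while pos < len(data) and data[pos] not in b"\r\n":
            pos += 1
```

Called at the end of the last 20-byte entry, the first function stops at a `%` if one is present, while the second continues to whatever follows the comment, e.g. the `trailer` keyword.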
From a practical point of view, the location of the trailer is specified in 7.5.5 as given by its preceding the `startxref` keyword, which is impractical for implementations, given that the trailer spans an a priori unknown number of bytes or lines and could in principle happen to contain the characters "trailer[whitespace]<<" by coincidence (e.g. as part of the `/ID` strings if they are written as literal strings). So in order to find the trailer quickly and unambiguously, reading past the end of the cross-reference section is the only way.
A possible reading of the list in 7.5.1 is that nothing can appear between the cross-reference section and its trailer, because nothing is listed between the two. Together with #112 (with the EOL directly before `trailer` being the final EOL of the last subsection), this would be a very useful guarantee, because the keyword would act as a sentinel for the end of the xref section, which itself has no explicit EOD marker. (The other option is to stop reading the section when a subsection is finished and followed by a line not conforming to the format of another subsection header. The sentinel would allow checking for string equality rather than for failure to match a pattern.) Moreover, the PDF reader could immediately proceed to reading the trailer, as opposed to having to find where it starts.
NB that the latter would exclude the possibility of a blank line preceding `trailer`, which, albeit uncommon, does appear in the PDF Association's own examples, see e.g. the final trailer of `PDF 2.0 via incremental save.pdf` in pdf-association/pdf20examples, so maybe that's too much to ask for, but the rest of the proposal stands. (Nevertheless, I haven't ever seen that done in a PDF from any other source, but I don't mind being shown otherwise.)
Proposed solution
- [...] `trailer` from the last cross-reference section, or similar.
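To illustrate what the sentinel reading would buy an implementation, here is a minimal Python sketch under the strict interpretation discussed above: nothing may appear between the last subsection and `trailer`, and the keyword sits alone on its line. The names `read_line` and `read_xref_section` are hypothetical, and the sketch only handles a "traditional" cross-reference section with 20-byte entries.

```python
import re

SUBSECTION_HEADER = re.compile(rb"(\d+) (\d+)")
ENTRY = re.compile(rb"(\d{10}) (\d{5}) ([nf])")
ENTRY_EOLS = (b" \n", b" \r", b"\r\n")  # the permitted 2-byte entry endings


def read_line(data: bytes, pos: int) -> tuple[bytes, int]:
    """Return the line starting at pos (without EOL) and the index after its EOL."""
    end = pos
    while end < len(data) and data[end] not in b"\r\n":
        end += 1
    nxt = end
    if data[nxt:nxt + 2] == b"\r\n":
        nxt += 2
    elif nxt < len(data):
        nxt += 1
    return data[pos:end], nxt


def read_xref_section(data: bytes, pos: int):
    """Parse one xref section; return (entries, index of the `trailer` keyword)."""
    line, pos = read_line(data, pos)
    if line != b"xref":
        raise ValueError("expected 'xref'")
    entries = {}
    while True:
        line_start = pos
        line, pos = read_line(data, pos)
        if line == b"trailer":  # sentinel: a plain equality test ends the section
            return entries, line_start
        header = SUBSECTION_HEADER.fullmatch(line)
        if header is None:
            # Strict reading (an assumption): blank lines and comments are errors.
            raise ValueError(f"unexpected content before trailer: {line!r}")
        first, count = int(header[1]), int(header[2])
        for obj_num in range(first, first + count):
            raw = data[pos:pos + 20]  # entries are fixed-width, 20 bytes each
            entry = ENTRY.fullmatch(raw[:18])
            if entry is None or raw[18:] not in ENTRY_EOLS:
                raise ValueError(f"malformed entry for object {obj_num}")
            entries[obj_num] = (int(entry[1]), int(entry[2]), entry[3].decode())
            pos += 20
```

With this reading the parser never has to search for the trailer: the returned index is where it starts. A blank line before `trailer`, as in the example file mentioned above, would be rejected by this sketch, which is exactly the trade-off in question.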