structMap[@TYPE=OCR-D-LOGICAL] / FULLDOWNLOAD #154

Closed
wants to merge 16 commits into from

@kba kba commented Jun 2, 2020

No description provided.

@kba kba changed the title from "Issues 142" to "structMap[@TYPE=OCR-D-LOGICAL] / FULLDOWNLOAD" Jun 7, 2020
debug: smLink
@@ -160,6 +199,83 @@ encodings of the same page.
</mets:structMap>
```

## OCR-D structMap
Collaborator

It does not make sense to put this section after Grouping files by page – the latter should be integrated into the former as a subsection!

Contributor

Here is my proposal for a new document structure:

Requirements on handling METS/PAGE

  1. Metadata
    1.1 Unique @ID for the document processed

  2. Images
    2.1 Pixel density of images must be explicit and high enough
    2.2 No multi-page images
    2.3 Image coordinates
    2.4 If in PAGE then in METS

  3. File Group mets:fileGrp
    3.1 @USE syntax
    Examples

  4. File mets:file
    4.1 @ID syntax
    Examples
    4.2 @MIMETYPE syntax
    Examples
    Examples (Media Type for PAGE XML)

  5. Grouping files by page mets:structMap
    Example
    5.1 @TYPE syntax
    Example

  6. Range of pages mets:structLink
    Example

  7. Paths
    7.1 Always use URL or relative filenames
    Example

  8. Recording processing information in METS

Collaborator

In that case, I would:

  • subsume 8 (processing information) under 1 (metadata)
  • abandon 2.3 (frankly, I don't know why this resides here and not just in PAGE.md)
  • replace 2.3 with a general note about original/derived images (what is now in PAGE.md, but including new language from Alternative image same folder #164)

But I wonder: where in that outline did Fulldownload go? Is it still subsumed under 4.1 for you? (We discussed this elsewhere: then you cannot make these subsections self-contained.)

Contributor

> • abandon 2.3 (frankly, I don't know why this resides here and not just in PAGE.md)
> • replace 2.3 with a general note about original/derived images (what is now in PAGE.md, but including new language from Alternative image same folder #164)

That's right, I think 2.3 belongs in page.md.

> • subsume 8 (processing information) under 1 (metadata)

That's a good proposal.

> But I wonder: where in that outline did Fulldownload go? Is it still subsumed under 4.1 for you? (We discussed this elsewhere: then you cannot make these subsections self-contained.)

Yes, Fulldownload is a section/part under 4.1.

Contributor

@tboenig tboenig Jun 18, 2020

Requirements on handling METS/PAGE

1 Metadata
1.1 Recording processing information in METS
1.2 Unique @ID for the document processed

2 Images
2.1 Pixel density of images must be explicit and high enough
2.2 No multi-page images
2.3 If in PAGE then in METS

3 File Group mets:fileGrp
3.1 @USE syntax
Examples
3.2 @USE="FULLDOWNLOAD_..."
Examples

4 File mets:file
4.1 @ID syntax
Examples
4.2 @MIMETYPE syntax
Examples
Examples (Media Type for PAGE XML)

5 Grouping files by page mets:structMap
Example
5.1 @TYPE syntax
Example

6 Range of pages mets:structLink
Example

7 Paths
7.1 Always use URL or relative filenames
Example

Member Author

Is this resolved?

Collaborator

> Is this resolved?

No, AFAICS it is not. The section Fulldownload should still be part of the file ID syntax (differentiating between page-local and document-global naming scheme, but not trying to formulate this "self-contained"). Also, the section about grouping files by structMap should come below fileGrp and file ID sections.

Collaborator

(Or did you want to do all that in a separate PR, or just wait for the merge with master?)

Collaborator

> I think also 7.1 could be 1.3 instead.

Yes!

tboenig and others added 2 commits June 16, 2020 13:28
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
tboenig and others added 6 commits June 17, 2020 08:53
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
@tboenig tboenig requested review from tboenig and Boenig and removed request for tboenig June 17, 2020 15:22
kba and others added 2 commits July 21, 2020 14:20
Co-authored-by: Robert Sachunsky <38561704+bertsky@users.noreply.github.com>
Member

@cneud cneud left a comment

I would agree with @bertsky to implement these changes:

> replace 2.3 with a general note about original/derived images (what is now in PAGE.md, but including new language from #164)

> The section Fulldownload should still be part of the file ID syntax (differentiating between page-local and document-global naming scheme, but not trying to formulate this "self-contained").

> Also, the section about grouping files by structMap should come below fileGrp and file ID sections.

I think also 7.1 could be 1.3 instead.

@tboenig tboenig self-requested a review August 2, 2022 11:22
Member Author

kba commented Aug 2, 2022

Superseded by #207, @tboenig please have a look there.

@kba kba closed this Aug 2, 2022