prefix option is ignored since latest FoLiA-hocr changes #52

Open
proycon opened this issue Dec 12, 2020 · 5 comments

@proycon
Member

proycon commented Dec 12, 2020

When running: FoLiA-hocr --prefix "FH-" -O ./ -t 1 "OllevierGeets-5.hocr"

I get the output file OllevierGeets-5.tif.folia.xml instead of FH-OllevierGeets-5.tif.folia.xml as before. This breaks the current PICCL pipeline. The prefix was introduced in LanguageMachines/PICCL@ebd16ff for LanguageMachines/PICCL#30.
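
As an illustration of how a downstream script could tolerate both naming schemes, here is a minimal Python sketch. It is not taken from PICCL or FoLiA-hocr; the helper name `find_output` and its parameters are hypothetical, and it only assumes that the output file carries either the prefixed or the unprefixed name reported above.

```python
from pathlib import Path
from typing import Optional


def find_output(outdir: Path, basename: str, prefix: str = "FH-") -> Optional[Path]:
    """Return the output file under either naming scheme, or None.

    Hypothetical helper: tries the prefixed name first (old behavior),
    then the unprefixed name (behavior reported in this issue).
    """
    for name in (prefix + basename, basename):
        candidate = outdir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # File name taken from the report above; the directory is the -O target.
    result = find_output(Path("./"), "OllevierGeets-5.tif.folia.xml")
    print(result if result is not None else "no output file found")
```

Checking the prefixed name first keeps the old pipeline behavior as the default and only falls back to the new name when needed.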

@kosloot kosloot self-assigned this Dec 12, 2020
@kosloot
Contributor

kosloot commented Dec 12, 2020

Hmm, yes. It's not a bug, but a changed feature :)
The prefix is now only added to the document IDs (to avoid invalid NCNames), no longer to the output filenames.
The latter is in fact pointless.
But this might indeed break scripts.

Maybe it is better to stick to the old behavior.
The same applies to some other modules.
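
To illustrate why a prefix can matter for document IDs but not for filenames, here is a rough Python sketch. It assumes a simplified, ASCII-only approximation of the XML NCName rule (must start with a letter or underscore, no colon); the function names and the example stems are hypothetical and do not reflect the actual FoLiA-hocr implementation.

```python
import re

# Simplified ASCII-only approximation of an XML NCName:
# starts with a letter or underscore, followed by letters,
# digits, '.', '-' or '_'. The real production also allows
# many non-ASCII characters.
_NCNAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_ncname(value: str) -> bool:
    """Check a string against the simplified NCName pattern."""
    return _NCNAME_RE.fullmatch(value) is not None


def make_document_id(stem: str, prefix: str = "FH-") -> str:
    """Hypothetical: build a document ID by prepending the prefix."""
    return prefix + stem


if __name__ == "__main__":
    for stem in ("OllevierGeets-5.tif", "5-scan.tif"):
        doc_id = make_document_id(stem)
        print(stem, is_ncname(stem), doc_id, is_ncname(doc_id))
```

A stem that starts with a digit fails the check on its own but passes once prefixed, while the filename on disk has no such constraint.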

@proycon
Member Author

proycon commented Dec 12, 2020

OK, no problem, that sounds good. I'll simply adapt the pipeline then.

@proycon
Member Author

proycon commented Dec 12, 2020

Wait, I think we're going in different directions, aren't we? I just saw some commits where you reinstated the old naming.

@proycon proycon reopened this Dec 12, 2020
@kosloot
Contributor

kosloot commented Dec 12, 2020

Indeed :)

proycon added a commit to LanguageMachines/PICCL that referenced this issue Dec 12, 2020
@kosloot
Contributor

kosloot commented Dec 12, 2020

@proycon I still think adding prefixes to output files is not the best solution, as it may come as a surprise to users.
One advantage might be that it directly shows which tool created the file.

But it is maybe better to avoid this, and then clearly for ALL tools.
We might agree on changing this in the next (major) release, but it is not an urgent task, I think.
Agreed?
