Implement processing worker in bashlib #1023

kba · 2023-03-24T16:18:30Z

https://github.com/OCR-D/core/pull/974/files#r1140759640

Well, among the recent changes, @joschrew introduced is_bashlib_processor. Instead of trying to look inside the executable (which is error-prone, and since the OCR-D CLI also allows pure program code, i.e. binaries, would sometimes not work at all), I recommend simply adding the Processing Server functionality to bashlib's ocrd__wrap, so that bashlib-based processors behave like Pythonic processors. Of course, this would simply delegate to ocrd processing-worker internally.

kba · 2023-03-24T16:22:02Z

Implementing this in bashlib processors will be very difficult. Honestly, at this point, I'm wondering whether we want to / can continue to support bashlib fully or whether it wouldn't be easier to convert ocrd_olena, ocrd_fileformat etc. to Python.

bertsky · 2023-03-24T17:44:03Z

> Implementing this in bashlib processors will be very difficult.

This is absolutely not about (re-)implementing the Processing Server. It's merely about delegating to the ocrd processing-worker subcommand, implemented as part of #974, to make it available in bashlib, so I can do e.g. …

ocrd-olena-binarize --queue amqp://admin:admin@localhost:5672 --database mongodb://localhost:27018

…instead of…

ocrd processing-worker ocrd-olena-binarize --queue amqp://admin:admin@localhost:5672 --database mongodb://localhost:27018

The reason is not to save you the few extra characters to type, but to implement the (extended) OCR-D CLI fully within bashlib, so as a user/caller you would not need to know whether a certain processor to call is bashlib or Python – as has happened in https://github.com/OCR-D/core/pull/974/files#r1140759640
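
The delegation described above can be sketched as follows. This is only an illustration of the intended logic, written in Python for readability rather than in bash; the function name `build_command` is hypothetical, and the actual `ocrd__wrap` code is not shown in this thread. It assumes nothing beyond the `--queue` and `--database` options and the `ocrd processing-worker` subcommand quoted above.

```python
from typing import List


def build_command(processor: str, args: List[str]) -> List[str]:
    """Return the command a wrapper would run for the given processor call.

    Hypothetical helper: if both --queue and --database are present, the call
    is handed over to `ocrd processing-worker`; otherwise it is left unchanged.
    """
    if "--queue" in args and "--database" in args:
        return ["ocrd", "processing-worker", processor, *args]
    return [processor, *args]


if __name__ == "__main__":
    print(build_command(
        "ocrd-olena-binarize",
        ["--queue", "amqp://admin:admin@localhost:5672",
         "--database", "mongodb://localhost:27018"],
    ))
```

With this shape, the caller uses the same command line regardless of whether the processor is bashlib or Python; only the wrapper decides whether to hand over to the worker.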

bertsky · 2023-03-24T17:47:55Z

> I'm wondering whether we want to / can continue to support bashlib fully or whether it wouldn't be easier to convert ocrd_olena, ocrd_fileformat etc. to Python.

I've said this before, but I think it cannot be overstated: Keeping bashlib is as important as it gets. We want core to be a framework for writing OCR-D compliant processors, which can mean implementing them in Python (if possible), but also merely integrating them via a tiny shell wrapper. The latter must always be possible, otherwise developers will have to reimplement our conventions for Java, C++, Go, what have you.

kba · 2023-03-26T13:18:42Z

> The latter must always be possible, otherwise developers will have to reimplement our conventions for Java, C++, Go, what have you.

I'm very fond of bashlib, don't get me wrong. But you could easily reimplement bashlib processors in Python by just delegating the calls of the actual tools to subprocess.run and have the full expressivity and OCR-D/core support of Python.

Be that as it may, I just misunderstood the problem: #1024 is indeed straightforward, with no need for a mongo/queue CLI in bashlib as I had feared.

bertsky · 2023-03-26T13:33:22Z

> But you could easily reimplement bashlib processors in Python by just delegating the calls of the actual tools to subprocess.run and have the full expressivity and OCR-D/core support of Python.

You mean provide a "wildcard processor" in Python which can do system calls to external tools – like ocrd_wrap's ocrd-preprocess-image, but not just for image processing, and as a base class to inherit from?

Ok, perhaps avoiding bash would make our life easier (only maintaining Python parts, being able to use the API). I thought that keeping it would help external modules (no knowledge of Python necessary). Of course, ATM our bashlib-enabled processor boilerplate (page looping, PAGE handling with xmlstarlet) is so huge that it's not feasible for outsiders anyway. (But perhaps this model can help overcome this.)
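
A minimal sketch of what the core of such a wrapper might look like, using only the Python standard library. The function name `run_tool` and the `{input}`/`{output}` placeholders are assumptions made for illustration and are not part of OCR-D/core; a real processor would additionally need the page looping and PAGE handling mentioned above.

```python
import shlex
import subprocess
import sys
from typing import List


def run_tool(command_template: str, input_path: str,
             output_path: str) -> subprocess.CompletedProcess:
    """Run a preconfigured external command on one input/output pair.

    The template is split into arguments first and the placeholders are
    substituted afterwards, so paths containing spaces stay single
    arguments and no shell is involved.
    """
    argv: List[str] = [
        part.replace("{input}", input_path).replace("{output}", output_path)
        for part in shlex.split(command_template)
    ]
    # check=True raises CalledProcessError if the tool exits non-zero.
    return subprocess.run(argv, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in for a real external tool: the Python interpreter echoes both paths.
    template = (shlex.quote(sys.executable)
                + ' -c "import sys; print(sys.argv[1], sys.argv[2])" {input} {output}')
    result = run_tool(template, "in.png", "out.png")
    print(result.stdout.strip())
```

A module author would then only supply the command template, while everything else stays in Python.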

bertsky · 2023-06-04T10:27:37Z

The original issue was resolved via #1024. So do we want to repurpose (and rename) this for your new idea of providing a Python-only general purpose wrapper doing preconfigured shell calls, @kba?

kba mentioned this issue Mar 24, 2023

Processing-Server #974

Merged
