vipulg13/grobid-quantities-python-client

Python client for Grobid Quantities

Python client to query the Grobid Quantities service API. For more information about Grobid Quantities, please check the Grobid Quantities Documentation.

Installation

The client can be installed using pip:

pip install grobid-quantities-client

Command Line Interface (CLI)

The CLI accepts the following parameters:

python -m grobid_quantities.quantities --help

usage: quantities.py [-h] --input INPUT [--output OUTPUT] [--base-url BASE_URL] [--config CONFIG] [--n N] [--force] [--verbose]

Client for the Grobid-quantities service

optional arguments:
  -h, --help           show this help message and exit
  --input INPUT        path to the directory containing PDF files or .txt (for processCitationList only, one reference per line) to process
  --output OUTPUT      path to the directory where to put the results (optional)
  --base-url BASE_URL  Base url of the service
  --config CONFIG      path to the config file, default is ./config.json
  --n N                concurrency for service usage
  --force              force re-processing pdf input files when tei output files already exist
  --verbose            print information about processed files in the console
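When the CLI needs to be driven from a script, the flags listed above can be assembled into an argument list. The following is a minimal sketch, not part of the client itself: the helper name `build_cli_command` and the example paths are hypothetical, and only the module name and flags shown in the help output are taken from this README.

```python
import sys
from typing import List, Optional


def build_cli_command(
    input_dir: str,
    output_dir: Optional[str] = None,
    base_url: Optional[str] = None,
    config: Optional[str] = None,
    concurrency: Optional[int] = None,
    force: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for the CLI (hypothetical helper)."""
    # --input is the only required argument according to the usage line.
    cmd = [sys.executable, "-m", "grobid_quantities.quantities", "--input", input_dir]
    if output_dir is not None:
        cmd += ["--output", output_dir]
    if base_url is not None:
        cmd += ["--base-url", base_url]
    if config is not None:
        cmd += ["--config", config]
    if concurrency is not None:
        cmd += ["--n", str(concurrency)]
    if force:
        cmd.append("--force")
    if verbose:
        cmd.append("--verbose")
    return cmd


# Example paths are placeholders.
cmd = build_cli_command("./pdfs", output_dir="./out", concurrency=4, verbose=True)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed and a Grobid Quantities service is reachable.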

API Usage

Initialisation

from grobid_quantities.quantities import QuantitiesAPI

client = QuantitiesAPI(base_url="<server_url>:<port>")

Process raw text:

client.process_text("I lost two minutes")

Process PDF document

client.process_pdf(pdfFile)

Parse the measurements

client.parse_measures({"from": "10", "to": "20", "unit": "km"})

The response is a tuple whose first element is the status code and whose second element is the response body as a dictionary. Here is an example:

(
    200,
    {
        "runtime": 123,
        "measurements": [
            {
                "type": "value",
                "quantity": {
                    "type": "time",
                    "rawValue": "two",
                    "rawUnit": {
                        "name": "minutes",
                        "type": "time",
                        "system": "non SI",
                        "offsetStart": 11,
                        "offsetEnd": 18
                    },
                    "parsedValue": {
                        "numeric": 2,
                        "structure": {
                            "type": "ALPHABETIC",
                            "formatted": "two"
                        },
                        "parsed": "two"
                    },
                    "normalizedQuantity": 120,
                    "normalizedUnit": {
                        "name": "s",
                        "type": "time",
                        "system": "SI base"
                    },
                    "offsetStart": 7,
                    "offsetEnd": 11
                }
            }
        ]
    }
)
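A response of this shape can be unpacked and flattened as in the sketch below. The helper name `summarize_measurements` is hypothetical and not part of the client. The sketch assumes the tuple layout and keys shown in the example above, and it skips any measurement that has no "quantity" key, since other measurement types may be structured differently.

```python
from typing import Any, Dict, List, Tuple


def summarize_measurements(response: Tuple[int, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten a (status, body) response into one row per measurement (hypothetical helper)."""
    status, body = response
    if status != 200:
        raise RuntimeError(f"Request failed with status code {status}")

    rows = []
    for measurement in body.get("measurements", []):
        quantity = measurement.get("quantity")
        if quantity is None:
            # Assumption: measurements without a "quantity" key are skipped.
            continue
        rows.append({
            "raw_value": quantity.get("rawValue"),
            "raw_unit": quantity.get("rawUnit", {}).get("name"),
            "normalized_value": quantity.get("normalizedQuantity"),
            "normalized_unit": quantity.get("normalizedUnit", {}).get("name"),
        })
    return rows


# Abbreviated copy of the example response above, used here instead of a live call.
sample_response = (
    200,
    {
        "runtime": 123,
        "measurements": [
            {
                "type": "value",
                "quantity": {
                    "type": "time",
                    "rawValue": "two",
                    "rawUnit": {"name": "minutes", "type": "time", "system": "non SI"},
                    "normalizedQuantity": 120,
                    "normalizedUnit": {"name": "s", "type": "time", "system": "SI base"},
                },
            }
        ],
    },
)

rows = summarize_measurements(sample_response)
print(rows)
```

With a running service, the same helper could be applied directly to the return value of `client.process_text(...)`.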
