This repository has been archived by the owner on Dec 10, 2024. It is now read-only.

Improvements to documentation (#2)
Adds examples and API reference
pipliggins authored Nov 21, 2024
1 parent c167219 commit 4884112
Showing 14 changed files with 844 additions and 24 deletions.
16 changes: 16 additions & 0 deletions docs/api/create_mapping.md
@@ -0,0 +1,16 @@
# Mapping Functions

The following functions can be used to create the intermediate mapping CSV required to generate a parser.

```{eval-rst}
.. autofunction:: autoparser.create_mapping
```

## Class definitions

You can also interact with the base class `Mapper`.

```{eval-rst}
.. autoclass:: autoparser.Mapper
    :members:
```
20 changes: 20 additions & 0 deletions docs/api/dict_writer.md
@@ -0,0 +1,20 @@
# Data Dictionary Functions

The following functions can be used to create a data dictionary and add descriptions to it.

```{eval-rst}
.. autofunction:: autoparser.create_dict
    :noindex:
.. autofunction:: autoparser.generate_descriptions
    :noindex:
```

## Class definitions

You can also interact with the base class `DictWriter`.

```{eval-rst}
.. autoclass:: autoparser.DictWriter
    :members:
```
9 changes: 9 additions & 0 deletions docs/api/index.md
@@ -0,0 +1,9 @@
# API

This section describes the public API for AutoParser.

```{toctree}
dict_writer
create_mapping
make_toml
```
17 changes: 17 additions & 0 deletions docs/api/make_toml.md
@@ -0,0 +1,17 @@
# Parser Functions

The following functions can be used to create the final TOML parser file.

```{eval-rst}
.. autofunction:: autoparser.create_parser
    :noindex:
```

## Class definitions

You can also interact with the base class `ParserGenerator`.

```{eval-rst}
.. autoclass:: autoparser.ParserGenerator
    :members:
```
6 changes: 6 additions & 0 deletions docs/conf.py
@@ -32,4 +32,10 @@
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
html_theme = "sphinx_book_theme"
html_logo = "images/logo.png"
html_title = "AutoParser"
html_static_path = ["_static"]

html_theme_options = {
    "repository_url": "https://github.com/globaldothealth/autoparser",
    "use_repository_button": True,
}
47 changes: 47 additions & 0 deletions docs/examples/cli_example.md
@@ -0,0 +1,47 @@
# CLI Parser construction

This file describes how to run the same parser generation pipeline as the
[parser construction](example) notebook, but using the command line interface. It
constructs a parser file for the `animal_data.csv` test data, and assumes all commands
are run from the root of the `autoparser` package.

Note: As a reminder, you will need an API key for OpenAI or Google. This example uses the OpenAI LLM.
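
Because the commands below read the key from `$OPENAI_API_KEY`, it can help to confirm that the variable is set before starting. The following is a minimal Python sketch and is not part of `autoparser`; the helper name `require_api_key` is hypothetical.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment variable `var_name`.

    Raises RuntimeError if the variable is unset or only whitespace.
    `require_api_key` is an illustrative helper, not an autoparser function.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running autoparser"
        )
    return key
```

Calling `require_api_key()` at the top of a wrapper script fails early with a readable message instead of passing an empty key to the CLI.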

## Generate a data dictionary
This example generates a data dictionary and adds descriptions to it in a single step. The CLI command follows this syntax:


```bash
autoparser create-dict data language [-d] [-k api_key] [-l llm_choice] [-c config_file] [-o output_name]
```
For the `animal_data.csv` file, the command to generate a data dictionary
with descriptions is:

```bash
autoparser create-dict tests/sources/animal_data.csv "fr" -d -k $OPENAI_API_KEY -c tests/test_config.toml -o "animal_dd"
```
This creates an `animal_dd.csv` data dictionary to use in the next step.
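
Before moving on, you may want to check that the data dictionary was written as expected. The sketch below uses only the Python standard library and makes no assumption about which columns `autoparser` writes; the function name `summarise_dictionary` is hypothetical.

```python
import csv
from typing import Iterable


def summarise_dictionary(lines: Iterable[str]) -> tuple[list[str], int]:
    """Return the column headers and the number of non-empty data rows.

    `lines` can be an open text file or any iterable of CSV lines.
    Illustrative helper only; it is not part of autoparser.
    """
    reader = csv.reader(lines)
    headers = next(reader, [])  # first row holds the column names
    n_rows = sum(1 for row in reader if row)  # skip completely blank lines
    return headers, n_rows


# Example usage (assumes the file from the previous step exists):
# with open("animal_dd.csv", newline="", encoding="utf-8") as f:
#     headers, n_rows = summarise_dictionary(f)
#     print(headers, n_rows)
```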

## Create intermediate mapping file
The next step is to create an intermediate CSV for you to inspect, mapping the fields and values in the raw data to the target schema. This is the CLI syntax:

```bash
autoparser create-mapping dictionary schema language api_key [-l llm_choice] [-c config_file] [-o output_name]
```
so we can run
```bash
autoparser create-mapping animal_dd.csv tests/schemas/animals.schema.json "fr" $OPENAI_API_KEY -c tests/test_config.toml -o animal_mapping
```
to create the intermediate mapping file `animal_mapping.csv` for you to inspect for any errors.
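
One way to start the inspection is to list the rows that contain empty cells. This is a rough standard-library sketch rather than an `autoparser` feature: the function name `rows_with_blanks` is hypothetical, no particular column names are assumed, and an empty cell is not necessarily an error.

```python
import csv
from typing import Iterable


def rows_with_blanks(lines: Iterable[str]) -> list[tuple[int, list[str]]]:
    """Return (line_number, blank_columns) for each row with an empty cell.

    `lines` can be an open text file or any iterable of CSV lines.
    Illustrative helper only; it is not part of autoparser.
    """
    reader = csv.DictReader(lines)
    flagged = []
    for row in reader:
        blanks = [
            col
            for col, value in row.items()
            # Surplus cells are stored under the key None; ignore them here.
            if col is not None and not (value or "").strip()
        ]
        if blanks:
            flagged.append((reader.line_num, blanks))
    return flagged


# Example usage (assumes the mapping file from this step exists):
# with open("animal_mapping.csv", newline="", encoding="utf-8") as f:
#     for line_number, columns in rows_with_blanks(f):
#         print(f"line {line_number}: empty {columns}")
```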

## Write the parser file
Finally, the parser file for ADTL is written out based on the contents of `animal_mapping.csv`. Once you've made any changes you want to the mapping, use the `create-parser` command

```bash
autoparser create-parser mapping schema_path [-n parser_name] [--description parser_description] [-c config_file]
```
as
```bash
autoparser create-parser animal_mapping.csv tests/schemas -n animal_parser -c tests/test_config.toml
```
which writes out the TOML parser as `animal_parser.toml` ready for use in ADTL.