Merge pull request #140 from DanCardin/dc/parse
fix: Refactor parser combinators into dedicated module, and document the behavior more thoroughly.
DanCardin authored Aug 13, 2024
2 parents 7188941 + 5aa5b7b commit ccc5766
Showing 9 changed files with 146 additions and 53 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,10 @@

## 0.22

### 0.22.5

- fix: Refactor parser combinators into dedicated module, and document the behavior more thoroughly.

### 0.22.4

- fix: Avoid applying annotated type parsing to default value.
2 changes: 1 addition & 1 deletion docs/source/annotation.md
@@ -184,7 +184,7 @@ As such you can again, opt out of the "Mapping inference" entirely, by supplying
your own `parse` function.

```{note}
Mapping inference is built up out of component functions defined in `cappa.annotation`,
Mapping inference is built up out of component functions defined in `cappa.parse`,
such as `parse_list`, which know how to translate `list[int]` and a source list of raw
parser strings into a list of ints.

64 changes: 61 additions & 3 deletions docs/source/arg.md
@@ -61,7 +61,7 @@ default inferred behavior.
```{note}
This feature is currently experimental, in particular because the parser state
available to either backend's callable is radically different. However, for an
action callable which accepts no arguments, behaviors is unlikely to change.
action callable which accepts no arguments, the behavior is unlikely to change.
```

In addition to one of the literal `ArgAction` variants, the provided action can
@@ -73,7 +73,7 @@ argument.
```{note}
Custom actions may interfere with general inference features, depending on what you're
doing (given that you're taking over the parser's duty of determining how the
code ought to handle the argument).
code ought to handle the argument in question).
As such, you may need to specify options like `num_args`, where you wouldn't have otherwise
needed to.
@@ -97,7 +97,8 @@ The set of available objects to inject include:
original input.

The above set of objects is of potentially limited value. More parser state will
likely be exposed through this interface in the future.
likely be exposed through this interface in the future. If you think some specific
bit of parser state is missing and could be useful to you, please raise an issue!

For example:

@@ -289,3 +290,60 @@ is not allowed with argument '-v'`
An explicit `group=` can still be used in concert with the above syntax to control
the `order` and name of the resultant group.
```

## Parse

`Arg.parse` can be used to provide **specific** instruction to cappa as to how to
handle the raw value given from the CLI parser backend.

In _general_, this argument shouldn't need to be specified, because the annotated
type will usually _imply_ how that particular value ought to be parsed, especially
for built-in Python types.

However, there will inevitably be cases where the type itself is not enough to infer
the specific parsing required. Take for example:

```python
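from dataclasses import dataclass
from datetime import date, datetime
from typing import Annotated

import cappa

@dataclass
class Example:
    iso_date: date
    # `date` has no `strptime`, so parse via `datetime` and keep the date part.
    american_date: Annotated[date, cappa.Arg(parse=lambda date_str: datetime.strptime(date_str, '%m/%d/%y').date())]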
```

Cappa's default date parsing assumes an ISO-format input string. However, you might instead
want a specific alternate parsing behavior, and `parse=` is how that is achieved.
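
A custom date parser can also be written as a named function instead of a lambda. The following is a minimal sketch that uses only the standard library; `parse_american_date` is a hypothetical name, and the month/day/two-digit-year format is an assumption about what "American" input looks like.

```python
from datetime import date, datetime


def parse_american_date(date_str: str) -> date:
    """Parse a month/day/two-digit-year string such as "08/13/24"."""
    # `strptime` raises ValueError when the input does not match the format.
    return datetime.strptime(date_str, "%m/%d/%y").date()


print(parse_american_date("08/13/24"))  # 2024-08-13
```

A function like this would then be supplied as `cappa.Arg(parse=parse_american_date)`.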

Further, `parse=` is likely even more useful for custom classes whose constructors
don't accept a simple, single string argument.
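
As a rough illustration of the custom-class case, the sketch below converts a raw `"X,Y"` string into a two-field class. `Point` and `parse_point` are hypothetical names invented for this example and are not part of cappa.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def parse_point(raw: str) -> Point:
    """Turn a string like "3,4" into a Point."""
    x_str, sep, y_str = raw.partition(",")
    if not sep:
        # No comma found, so the input cannot be split into two coordinates.
        raise ValueError(f"expected 'X,Y', got {raw!r}")
    return Point(int(x_str), int(y_str))


print(parse_point("3,4"))  # Point(x=3, y=4)
```

Such a function could be passed as `parse=parse_point` on the relevant `cappa.Arg`.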

```{note}
Cappa itself contains a number of component `parse_*` functions inside the `cappa.parse`
module, which can be used in combination with your own custom `parse` functions.
```

### Parsing JSON

Another potentially useful application is parsing JSON string input.
For example:

```python
import json
from dataclasses import dataclass
from typing import Annotated

import cappa

@dataclass
class Example:
    # `json.loads` turns the raw CLI string into a dict.
    todo: Annotated[dict[str, str], cappa.Arg(parse=json.loads)]

example = cappa.parse(Example)
print(example)
```

Natively (at present), cappa doesn't have any specific `dict` type inference because it's
ambiguous what CLI input shape that ought to map to. However, by combining the `dict` annotation
with a dedicated `parse=json.loads`, `example.py '{"foo": "bar"}'` now yields
`Example(todo={'foo': 'bar'})`.
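
Note that `json.loads` accepts any JSON document, not only objects. If stricter input is wanted, a small wrapper could validate the decoded shape first. This is a minimal sketch under that assumption; `parse_str_mapping` is a hypothetical helper and not part of cappa.

```python
import json


def parse_str_mapping(raw: str) -> dict[str, str]:
    """Decode a JSON object whose values are all strings."""
    value = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(value, dict):
        raise ValueError(f"expected a JSON object, got {type(value).__name__}")
    if not all(isinstance(item, str) for item in value.values()):
        raise ValueError("expected every value to be a string")
    return value


print(parse_str_mapping('{"foo": "bar"}'))  # {'foo': 'bar'}
```

It would be used in place of `json.loads`, as `cappa.Arg(parse=parse_str_mapping)`.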
35 changes: 35 additions & 0 deletions docs/source/backends.md
@@ -1,5 +1,12 @@
# Parser Backends

```{note}
If you're looking for custom parsing of individual arguments, you probably want
[Arg.parse](./arg.md#parse) or [Arg.action](./arg.md#action).
This document is concerned with the actual end-to-end CLI parsing process.
```

Cappa is designed in two parts:

- The "frontend", which is the vast majority of the public API. All the
@@ -68,3 +75,31 @@ Some potential reasons you want want to use the argparse backend:
guarantee an arbitrary argparse extension will function correctly with cappa,
but to the extent possible, it's a goal that they should be supported if
possible.

## Custom Backends

`cappa.invoke`/`cappa.parse` both accept a `backend=` argument which is used to
select between the existing two backends shipped with Cappa.

Technically, you could use this `backend` argument to author an entirely new
backend on top of a different argument parser, like `click` for example (although
a click backend was attempted at some point and later abandoned due to unforeseen
complexities). This would allow you to retain all of cappa's pre-parsing and inference
capabilities, as well as the post-processing, mapping, and invoke/dependency injection
infrastructure.

With that said, it's much more likely to be useful to use the backend argument to
**wrap** one of the existing backends, and mutate the resultant output structure of
the backend before it's passed further downstream. This is a somewhat unusual
use case, and again, if you find yourself making use of this in order
to work around potential upstream deficiencies in cappa, please bring it up in an
issue/discussion!
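
The wrapping idea can be sketched generically. The example below deliberately assumes nothing about cappa's real backend signature or return shape: `wrap_backend`, `fake_backend`, and `uppercase_values` are all hypothetical names, and the stand-in backend exists only to make the sketch runnable.

```python
from functools import wraps
from typing import Any, Callable


def wrap_backend(
    backend: Callable[..., Any],
    transform: Callable[[Any], Any],
) -> Callable[..., Any]:
    """Return a callable that runs `backend`, then post-processes its output."""

    @wraps(backend)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        # Forward all arguments untouched; only the result is modified.
        return transform(backend(*args, **kwargs))

    return wrapped


# Hypothetical stand-in for a real backend; cappa's actual backend
# signature and output structure are NOT assumed here.
def fake_backend(argv: list[str]) -> dict[str, str]:
    return {"name": argv[0]}


def uppercase_values(parsed: dict[str, str]) -> dict[str, str]:
    return {key: value.upper() for key, value in parsed.items()}


backend = wrap_backend(fake_backend, uppercase_values)
print(backend(["alice"]))  # {'name': 'ALICE'}
```

Forwarding `*args` and `**kwargs` keeps the wrapper independent of the exact backend interface, which matters given that the interface may change.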

```{note}
The backend **interface** is currently not set in stone. Before relying on the specific
details of the input/output shape of a backend, please bring it up in an issue/discussion
in hopes that customizing the backend may be made unnecessary!
Further, it's likely that the backend interface will be more formalized at some point
in the future, at which point it may break those assumptions.
```
6 changes: 3 additions & 3 deletions docs/source/manual_construction.md
@@ -12,7 +12,7 @@ from the class in question, if much more manually.
from dataclasses import dataclass

import cappa
from cappa.annotation import parse_list
from cappa.parse import parse_list


@dataclass
@@ -37,14 +37,14 @@ result = cappa.parse(command, argv=["one", "2", "3"])
```

There are a number of built-in parser functions used to build up the existing
inference system. [parse_value](cappa.annotation.parse_value) is the main
inference system. [parse_value](cappa.parse.parse_value) is the main
entrypoint used by cappa internally, but each of the parser factory functions
below make up the component built-in parsers for each type.

For inherent types like `int`, `float`, etc., their constructor may serve as
their own parser.

```{eval-rst}
.. autoapimodule:: cappa.annotation
.. autoapimodule:: cappa.parse
   :members: parse_value, parse_list, parse_tuple, parse_literal, parse_none, parse_set, parse_union
```
9 changes: 3 additions & 6 deletions src/cappa/arg.py
@@ -9,23 +9,20 @@

from typing_inspect import is_optional_type

from cappa.annotation import (
    detect_choices,
    is_sequence_type,
    parse_optional,
    parse_value,
)
from cappa.class_inspect import Field, extract_dataclass_metadata
from cappa.completion.completers import complete_choices
from cappa.completion.types import Completion
from cappa.env import Env
from cappa.parse import parse_optional, parse_value
from cappa.typing import (
    MISSING,
    NoneType,
    T,
    detect_choices,
    find_type_annotation,
    get_optional_type,
    is_of_type,
    is_sequence_type,
    is_subclass,
    is_union_type,
    missing,
40 changes: 2 additions & 38 deletions src/cappa/annotation.py → src/cappa/parse.py
@@ -1,14 +1,13 @@
from __future__ import annotations

import enum
import types
import typing
from datetime import date, datetime, time

from typing_inspect import get_origin, is_literal_type, is_optional_type
from typing_inspect import is_literal_type

from cappa.file_io import FileMode
from cappa.typing import T, get_optional_type, is_none_type, is_subclass, is_union_type
from cappa.typing import T, is_none_type, is_subclass, is_union_type, repr_type

__all__ = [
    "parse_value",
@@ -18,7 +17,6 @@
    "parse_union",
    "parse_tuple",
    "parse_none",
    "detect_choices",
]


@@ -217,37 +215,3 @@ def file_io_mapper(value: str):
        return file_mode(value)

    return file_io_mapper


def detect_choices(annotation: type) -> list[str] | None:
    if is_optional_type(annotation):
        annotation = get_optional_type(annotation)

    origin = typing.get_origin(annotation) or annotation
    type_args = typing.get_args(annotation)
    if is_subclass(origin, enum.Enum):
        return [v.value for v in origin]  # type: ignore

    if is_subclass(origin, (tuple, list, set)):
        origin = typing.cast(type, type_args[0])
        type_args = typing.get_args(type_args[0])

    if is_union_type(origin):
        if all(is_literal_type(t) for t in type_args):
            return [str(typing.get_args(t)[0]) for t in type_args]

    if is_literal_type(origin):
        return [str(t) for t in type_args]

    return None


def is_sequence_type(typ):
    return is_subclass(get_origin(typ) or typ, (typing.List, typing.Tuple, typing.Set))


def repr_type(t):
    if isinstance(t, type) and not typing.get_origin(t):
        return str(t.__name__)

    return str(t).replace("typing.", "")
37 changes: 36 additions & 1 deletion src/cappa/typing.py
@@ -1,5 +1,6 @@
from __future__ import annotations

import enum
import sys
import types
import typing
@@ -9,7 +10,7 @@
import typing_extensions
import typing_inspect
from typing_extensions import Annotated, get_args, get_origin
from typing_inspect import is_literal_type
from typing_inspect import is_literal_type, is_optional_type

try:
    from typing_extensions import Doc
@@ -159,6 +160,40 @@ def is_of_type(annotation, types):
    return False


def detect_choices(annotation: type) -> list[str] | None:
    if is_optional_type(annotation):
        annotation = get_optional_type(annotation)

    origin = typing.get_origin(annotation) or annotation
    type_args = typing.get_args(annotation)
    if is_subclass(origin, enum.Enum):
        return [v.value for v in origin]  # type: ignore

    if is_subclass(origin, (tuple, list, set)):
        origin = typing.cast(type, type_args[0])
        type_args = typing.get_args(type_args[0])

    if is_union_type(origin):
        if all(is_literal_type(t) for t in type_args):
            return [str(typing.get_args(t)[0]) for t in type_args]

    if is_literal_type(origin):
        return [str(t) for t in type_args]

    return None


def is_sequence_type(typ):
    return is_subclass(get_origin(typ) or typ, (typing.List, typing.Tuple, typing.Set))


def repr_type(t):
    if isinstance(t, type) and not typing.get_origin(t):
        return str(t.__name__)

    return str(t).replace("typing.", "")


if sys.version_info >= (3, 10):
    _get_type_hints = typing.get_type_hints

2 changes: 1 addition & 1 deletion tests/test_manually_built.py
@@ -4,8 +4,8 @@

import cappa
import pytest
from cappa.annotation import parse_list, parse_value
from cappa.output import Exit
from cappa.parse import parse_list, parse_value

from tests.utils import backends, parse

