Add basic skeleton for Integrations CLI with type checking command #32
Merged
Changes from 19 commits
b9e0d98 Stub diff CLI (Swiddis)
3c8b433 Move diff to own file (Swiddis)
1151899 Add mapping loading to diff (Swiddis)
7b26733 Add README (Swiddis)
242a7cc Replace set with update (Swiddis)
1e03dcf Add basic type checking system (Swiddis)
f1bb8b1 Add diff output (Swiddis)
fad5f61 Add integer type checking (Swiddis)
892db65 Add ability to show empty mapping fields (Swiddis)
aab73ea Add better handling for missing type field (Swiddis)
96ddd57 Ignore aliases in no-optional mode (Swiddis)
702ecd1 Fix error and warning colors (Swiddis)
2b24d58 Remove accidental .DS_Store (Swiddis)
7b099ef Rename no-optional (Swiddis)
67d5889 Flatten JSON output (Swiddis)
ca72763 Fill out remainder of README (Swiddis)
6b29c7d Apply black formatting (Swiddis)
e146a78 Propagate show_missing through do_check (Swiddis)
181eb84 Add functionality to unwrap data lists (Swiddis)
b7b56ff Rename files for more standard python package structure (Swiddis)
4d75e7e Add more detail to documentation (Swiddis)
bac5b6b Correct doc link to permalink (Swiddis)
39e1ba1 Remove .DS_Store (Swiddis)
4f88a1b Add colored diff output to diff (Swiddis)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,176 @@ | ||
# Created by https://www.toptal.com/developers/gitignore/api/python | ||
# Edit at https://www.toptal.com/developers/gitignore?templates=python | ||
|
||
### Python ### | ||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
*.py,cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
cover/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
.pybuilder/ | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
# For a library or package, you might want to ignore these files since the code is | ||
# intended to run in multiple environments; otherwise, check them in: | ||
# .python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# poetry | ||
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. | ||
# This is especially recommended for binary packages to ensure reproducibility, and is more | ||
# commonly ignored for libraries. | ||
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control | ||
#poetry.lock | ||
|
||
# pdm | ||
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. | ||
#pdm.lock | ||
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it | ||
# in version control. | ||
# https://pdm.fming.dev/#use-with-ide | ||
.pdm.toml | ||
|
||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm | ||
__pypackages__/ | ||
|
||
# Celery stuff | ||
celerybeat-schedule | ||
celerybeat.pid | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
# pytype static type analyzer | ||
.pytype/ | ||
|
||
# Cython debug symbols | ||
cython_debug/ | ||
|
||
# PyCharm | ||
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can | ||
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore | ||
# and can be added to the global gitignore or merged into this file. For a more nuclear | ||
# option (not recommended) you can uncomment the following to ignore the entire idea folder. | ||
#.idea/ | ||
|
||
### Python Patch ### | ||
# Poetry local configuration file - https://python-poetry.org/docs/configuration/#local-configuration | ||
poetry.toml | ||
|
||
# ruff | ||
.ruff_cache/ | ||
|
||
# LSP config files | ||
pyrightconfig.json | ||
|
||
# End of https://www.toptal.com/developers/gitignore/api/python |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# OsInts: the OpenSearch Integrations CLI | ||
|
||
OsInts is a command-line utility for developing integrations with OpenSearch Integrations. | ||
It provides the following commands: | ||
|
||
- `diff`: Type check your integration given a sample data record and the appropriate SS4O schema. | ||
|
||
## Installation | ||
|
||
Use the package manager [pip](https://pip.pypa.io/en/stable/) to install the CLI. | ||
|
||
```bash | ||
$ cd cli | ||
$ pip install . | ||
... | ||
Successfully installed osints-0.1.0 | ||
``` | ||
|
||
## Usage | ||
|
||
```bash | ||
$ osints --help | ||
Usage: osints [OPTIONS] COMMAND [ARGS]... | ||
|
||
  Various tools for working with OpenSearch Integrations. | ||
|
||
Options: | ||
  --help  Show this message and exit. | ||
|
||
Commands: | ||
  diff  Diff between a mapping and some data. | ||
``` |
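For readers new to the tool, the check that `diff` performs can be approximated in a few lines of plain Python. This is a simplified, stdlib-only sketch rather than the code in this PR: it skips aliases and `--show-missing`, and the `http.status`, `http.method`, `message`, and `extra` fields are made-up sample data, not part of any SS4O schema.

```python
# Simplified, stdlib-only sketch of the type check; not the PR's implementation.
# All field names below are hypothetical sample data.

def flat_check(expect: str, actual: object) -> dict:
    """Return a mismatch record, or {} when `actual` fits the mapping type."""
    accepted = {
        "text": (str,),
        "keyword": (str,),
        "long": (int,),
        "integer": (int,),
        "date": (str, int),
    }
    types = accepted.get(expect)
    # Unknown types are not reported as mismatches in this sketch.
    if types is None or isinstance(actual, types):
        return {}
    return {"expected": expect, "actual": actual}


def check(mapping: dict, data: dict) -> dict:
    """Recursively compare a record against a mapping's `properties`."""
    result = {}
    for key, spec in mapping.items():
        if key not in data:
            continue  # missing fields are ignored here
        if "properties" in spec and isinstance(data[key], dict):
            nested = check(spec["properties"], data[key])
            if nested:
                result[key] = nested
        elif "type" in spec:
            mismatch = flat_check(spec["type"], data[key])
            if mismatch:
                result[key] = mismatch
    # Fields present in the data but absent from the mapping are reported too.
    for key, value in data.items():
        if key not in mapping:
            result[key] = {"expected": None, "actual": value}
    return result


mapping = {
    "http": {"properties": {"status": {"type": "long"}, "method": {"type": "keyword"}}},
    "message": {"type": "text"},
}
record = {"http": {"status": "200", "method": "GET"}, "message": "ok", "extra": 1}
report = check(mapping, record)
print(report)
```

Here `http.status` is reported because the record holds the string `"200"` where the mapping expects a `long`, and `extra` is reported because the mapping does not define it. One caveat that applies to any `isinstance(value, int)` check in Python: `bool` is a subclass of `int`, so a JSON `true` passes as a `long`.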
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,154 @@ | ||
from beartype import beartype | ||
import click | ||
import json | ||
import os.path | ||
import glob | ||
|
||
|
||
@beartype | ||
def load_mapping(mapping: str) -> dict[str, dict]: | ||
    with open(mapping, "r") as mapping_file: | ||
        data = json.load(mapping_file) | ||
    properties = data.get("template", {}).get("mappings", {}).get("properties") | ||
    if properties is None: | ||
        return {} | ||
    composed_of = data.get("composed_of", []) | ||
    curr_dir = os.path.dirname(mapping) | ||
    for item in composed_of: | ||
        item_glob = glob.glob(os.path.join(curr_dir, f"{item}*")) | ||
        if len(item_glob) == 0: | ||
            click.secho( | ||
                f"ERROR: mapping file {mapping} references component {item}, which does not exist.", | ||
                err=True, | ||
                fg="red", | ||
            ) | ||
            raise click.Abort() | ||
        if properties.get(item) is not None: | ||
            click.secho( | ||
                f"ERROR: mapping file {mapping} references component {item} and defines conflicting key '{item}'", | ||
                err=True, | ||
                fg="red", | ||
            ) | ||
            raise click.Abort() | ||
        # Greedily take any mapping that matches the name for now. | ||
        # Later, configuration will need to be implemented. | ||
        if len(item_glob) > 1: | ||
            click.secho( | ||
                f"WARNING: found more than one mapping for component {item}. Assuming {item_glob[0]}.", | ||
                err=True, | ||
                fg="yellow", | ||
            ) | ||
        properties.update(load_mapping(item_glob[0])) | ||
    return properties | ||
|
||
|
||
@beartype | ||
def flat_type_check(expect: str, actual: object) -> dict[str, object]: | ||
    match expect: | ||
        case "text" | "keyword": | ||
            if not isinstance(actual, str): | ||
                return {"expected": expect, "actual": actual} | ||
        case "long" | "integer": | ||
            if not isinstance(actual, int): | ||
                return {"expected": expect, "actual": actual} | ||
        case "alias": | ||
            # We assume aliases were already unwrapped by the caller and ignore them. | ||
            return {} | ||
        case "date": | ||
            if not isinstance(actual, str) and not isinstance(actual, int): | ||
                return {"expected": expect, "actual": actual} | ||
        case _: | ||
            click.secho(f"WARNING: unknown type '{expect}'", err=True, fg="yellow") | ||
    # No mismatch found, or the type is unknown and cannot be checked. | ||
    return {} | ||
|
||
|
||
@beartype | ||
def get_type(mapping: dict) -> str | dict: | ||
    if mapping.get("properties"): | ||
        return { | ||
            key: get_type(value) for key, value in mapping.get("properties").items() | ||
        } | ||
    return mapping.get("type", "unknown") | ||
|
||
|
||
@beartype | ||
def do_check( | ||
    mapping: dict[str, dict], data: dict[str, object], show_missing: bool = False | ||
) -> dict[str, dict]: | ||
    result = {} | ||
    for key, value in mapping.items(): | ||
        if key not in data: | ||
            if show_missing and value.get("type") != "alias": | ||
                result[key] = {"expected": get_type(value), "actual": None} | ||
            continue | ||
        elif "properties" in value and isinstance(data[key], dict): | ||
            check = do_check(value["properties"], data[key], show_missing) | ||
            if check != {}: | ||
                result[key] = check | ||
        elif value.get("type") == "alias": | ||
            # Unwrap aliases split by '.' | ||
            value_path = value["path"].split(".") | ||
            curr_data = data | ||
            for step in value_path[:-1]: | ||
                if step not in curr_data: | ||
                    curr_data[step] = {} | ||
                curr_data = curr_data[step] | ||
            curr_data[value_path[-1]] = data[key] | ||
        elif "type" in value: | ||
            check = flat_type_check(value["type"], data[key]) | ||
            if check != {}: | ||
                result[key] = check | ||
    for key, value in data.items(): | ||
        if key not in mapping: | ||
            result[key] = {"expected": None, "actual": value} | ||
    return result | ||
|
||
|
||
@beartype | ||
def output_diff(difference: dict[str, object], prefix: str = "") -> None: | ||
    for key, value in sorted(difference.items()): | ||
        out_key = prefix + key | ||
        if "expected" not in value and "actual" not in value: | ||
            output_diff(value, f"{prefix}{key}.") | ||
        if value.get("actual") is not None: | ||
            click.echo(f"- {out_key}: {json.dumps(value.get('actual'))}") | ||
        if value.get("expected") is not None: | ||
            click.echo(f"+ {out_key}: {json.dumps(value.get('expected'))}") | ||
|
||
|
||
@click.command() | ||
@click.option( | ||
    "--mapping", | ||
    type=click.Path(exists=True, readable=True), | ||
    required=True, | ||
    help="The mapping for the format the data should have", | ||
) | ||
@click.option( | ||
    "--data", | ||
    type=click.Path(exists=True, readable=True), | ||
    required=True, | ||
    help="The location of data to validate", | ||
) | ||
@click.option( | ||
    "--json", | ||
    "output_json", | ||
    is_flag=True, | ||
    help="Output machine-readable JSON instead of the default diff format", | ||
) | ||
@click.option( | ||
    "--show-missing", | ||
    "show_missing", | ||
    is_flag=True, | ||
    help="Output fields that are expected in the mappings but missing in the data", | ||
) | ||
def diff(mapping, data, output_json, show_missing): | ||
    """Type check your integration given a sample data record and the appropriate SS4O schema.""" | ||
    properties = load_mapping(mapping) | ||
    with open(data, "r") as data_file: | ||
        data_json = json.load(data_file) | ||
    if isinstance(data_json, list): | ||
        # Unwrap list of data, assume first record is representative | ||
        data_json = data_json[0] | ||
    check = do_check(properties, data_json, show_missing) | ||
    if output_json: | ||
        click.echo(json.dumps(check, sort_keys=True)) | ||
    else: | ||
        output_diff(check) |
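The alias branch of `do_check` is the least obvious step in this file, so here is a standalone sketch of what it does to the record. It assumes the same behaviour as the loop in `do_check` (intermediate objects are created on demand); `unwrap_alias` and the `http.response.status_code` path are hypothetical names used only for illustration.

```python
# Sketch of the alias-unwrapping step; helper name and sample path are hypothetical.

def unwrap_alias(data: dict, alias_key: str, path: str) -> None:
    """Copy data[alias_key] to the nested location named by a dotted path."""
    steps = path.split(".")
    current = data
    for step in steps[:-1]:
        # Create any intermediate object that the record does not have yet.
        current = current.setdefault(step, {})
    current[steps[-1]] = data[alias_key]


record = {"status": 200}
unwrap_alias(record, "status", "http.response.status_code")
print(record)
# {'status': 200, 'http': {'response': {'status_code': 200}}}
```

The original key stays in the record. As far as the loop order in `do_check` suggests, the copied value is only type checked if its top-level key (`http` in this sample) appears later in the mapping than the alias, since keys visited earlier are not revisited; that ordering dependence may be worth a follow-up.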
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
import click | ||
from .diff import diff | ||
|
||
|
||
@click.group() | ||
def cli(): | ||
    """Various tools for working with OpenSearch Integrations.""" | ||
    pass | ||
|
||
|
||
cli.add_command(diff) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
from setuptools import setup, find_packages | ||
|
||
setup( | ||
    name="osints", | ||
    version="0.1.0", | ||
    packages=find_packages(), | ||
    include_package_data=True, | ||
    install_requires=[ | ||
        "beartype", | ||
        "click", | ||
    ], | ||
    python_requires=">=3.10", | ||
    entry_points={"console_scripts": ["osints = osints.main:cli"]}, | ||
) |
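The `console_scripts` entry maps the `osints` command to the `cli` group in the `osints.main` module. A small sketch, using only `importlib.metadata` from the standard library, shows how that string splits into a module and an attribute; it inspects the string only and does not import or install the package.

```python
from importlib.metadata import EntryPoint

# The string as written in setup.py: "<command> = <module>:<attribute>"
spec = "osints = osints.main:cli"
name, _, target = (part.strip() for part in spec.partition("="))

entry = EntryPoint(name=name, value=target, group="console_scripts")
print(entry.name)    # command name placed on PATH at install time
print(entry.module)  # module imported when the command runs
print(entry.attr)    # object inside that module that gets called
```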
What’s cli for?
See: #31