GitHub - SAP/json-data-and-query-generator: A generator of JSON data and queries, e.g., for benchmarking JSON document stores.

JSON Data and Query Generator

The growing popularity of JSON as an exchange and storage format in business and analytical applications has led to its rapid dissemination, making timely storage and processing of JSON documents crucial for organizations. Consequently, specialized JSON document stores are used ubiquitously for diverse domain-specific workloads, yet a JSON-specific benchmark is missing.

In this repository, we provide an example implementation of DeepBench, an extensible, scalable benchmark that addresses nested JSON data as well as queries over JSON documents. DeepBench features configurable domain-independent scale levels (e.g., varying document sizes, concurrent users) and JSON-specific scale levels (e.g., object and array nesting).

The package json_data_and_query_generator contains tools to generate random JSON data and corresponding SQL queries. Each tool takes as input a configuration, in the form of a JSON document, that describes the fixed structure of the data and the characteristics of the generated queries.
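To make the idea of configuration-driven, nested JSON generation concrete, here is a minimal sketch in plain Python. It does not use this package, and the configuration keys (`object_depth`, `array_length`, `seed`) as well as the helper names are hypothetical. They are chosen only for illustration and are not the format read by json_data_and_query_generator.

```python
import json
import random

# Hypothetical configuration: these keys are illustrative only and are NOT
# the configuration format expected by json_data_and_query_generator.
config = {"object_depth": 3, "array_length": 2, "seed": 42}


def generate_document(depth, array_length, rng):
    """Build a random document whose objects nest `depth` levels below the root."""
    if depth == 0:
        # Innermost object: a single random scalar.
        return {"value": rng.randint(0, 100)}
    return {
        "id": rng.randint(0, 10**6),
        "child": generate_document(depth - 1, array_length, rng),
        "items": [rng.randint(0, 100) for _ in range(array_length)],
    }


def nesting_depth(doc):
    """Count the object levels below the root along the `child` path."""
    depth = 0
    while "child" in doc:
        doc = doc["child"]
        depth += 1
    return depth


# A seeded generator keeps the output reproducible between runs.
rng = random.Random(config["seed"])
document = generate_document(config["object_depth"], config["array_length"], rng)
print(json.dumps(document, indent=2))
```

Raising `object_depth` or `array_length` in this toy configuration grows the nesting and array size of the generated document, which loosely mirrors what a JSON-specific scale level controls.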

Setup

Install prerequisites:

  pip install .

To run data and query generation with five processes, based on the example scenario in examples (the default):

  python -m json_data_and_query_generator --num-proc 5
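If the generator is launched from a Python script rather than a shell, the same invocation can be wrapped as in the sketch below. It relies only on the module name and the --num-proc option shown above. The helper names `build_command` and `run_generator` are hypothetical and not part of the package.

```python
import subprocess
import sys


def build_command(num_proc):
    """Return the argument list for the documented module invocation."""
    if num_proc < 1:
        raise ValueError("num_proc must be at least 1")
    return [
        sys.executable,  # reuse the interpreter the package was installed into
        "-m",
        "json_data_and_query_generator",
        "--num-proc",
        str(num_proc),
    ]


def run_generator(num_proc):
    """Run the generator; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_command(num_proc), check=True)
```

Calling `run_generator(5)` corresponds to the five-process command above and assumes the package has already been installed with pip install .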

To run other scenarios, specify the paths to schema.txt, data.txt, and config.json as described in pipeline.py --help.

Support, Feedback, Contributing

This project is open to feature requests/suggestions, bug reports etc. via GitHub issues. Contribution and feedback are encouraged and always welcome. For more information about how to contribute, the project structure, as well as additional contribution information, see our Contribution Guidelines.

Code of Conduct

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone. By participating in this project, you agree to abide by its Code of Conduct at all times.

Licensing

Copyright 2022-2023 SAP SE or an SAP affiliate company and contributors. Please see our LICENSE for copyright and license information. Detailed information including third-party components and their licensing/copyright information is available via the REUSE tool.

Citation

For further documentation, see the publications below. If you find this work useful for your research, please cite:

@inproceedings{DBLP:conf/dbtest-ws/Belloni0SR22,
  author       = {Stefano Belloni and
                  Daniel Ritter and
                  Marco Schr{\"{o}}der and
                  Nils R{\"{o}}rup},
  editor       = {Manuel Rigger and
                  Pinar T{\"{o}}z{\"{u}}n},
  title        = {DeepBench: Benchmarking {JSON} Document Stores},
  booktitle    = {DBTest@SIGMOD '22: Proceedings of the 9th International Workshop of
                  Testing Database Systems, Philadelphia, PA, USA, 17 June 2022},
  pages        = {1--9},
  publisher    = {{ACM}},
  year         = {2022},
  url          = {https://doi.org/10.1145/3531348.3532176},
  doi          = {10.1145/3531348.3532176},
  timestamp    = {Sun, 02 Oct 2022 15:58:56 +0200},
  biburl       = {https://dblp.org/rec/conf/dbtest-ws/Belloni0SR22.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

and/or, for its usage in systems:

@article{DBLP:journals/dbsk/BelloniR22,
  author       = {Stefano Belloni and
                  Daniel Ritter},
  title        = {Benchmarking {JSON} Document Stores in Practice},
  journal      = {Datenbank-Spektrum},
  volume       = {22},
  number       = {3},
  pages        = {217--226},
  year         = {2022},
  url          = {https://doi.org/10.1007/s13222-022-00425-y},
  doi          = {10.1007/s13222-022-00425-y},
  timestamp    = {Sat, 25 Feb 2023 21:35:08 +0100},
  biburl       = {https://dblp.org/rec/journals/dbsk/BelloniR22.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
