json-repair: A library for fixing JSON anomalies generated by LLMs.


What is json-repair?

json-repair is a Java library that repairs, at the application layer, malformed JSON generated by large language models (LLMs). It aims to make that repair simple, efficient, and accurate.

Getting started

To get started with json-repair, first add it as a dependency in your Java project. If you're using Maven, that looks like this:

<dependency>
    <groupId>io.github.haibiiin</groupId>
    <artifactId>json-repair</artifactId>
    <version>0.2.2</version>
</dependency>

If you're using Gradle, that looks like this:

implementation 'io.github.haibiiin:json-repair:0.2.2'

Next, instantiate a JSONRepair object and call its handle() method to repair a JSON string:

JSONRepair repair = new JSONRepair();
String correctJSON = repair.handle(mistakeJSON);
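In application code it is often useful to run the repair step only when the model output fails validation. The following is a minimal sketch of that pattern, not part of json-repair: the class RepairOnDemand, the method validOrRepaired, and the toy validator and repairer in main are hypothetical stand-ins so that the example runs without any dependency. With the library on the classpath, the repairer argument could be the handle() method shown above, and the validator could be whatever JSON parser the application already uses.

```java
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical helper for illustration; not part of json-repair.
class RepairOnDemand {

    // Returns the input unchanged if it is already valid; otherwise applies
    // the repairer once and verifies the result before returning it.
    static String validOrRepaired(String raw,
                                  Predicate<String> isValidJson,
                                  UnaryOperator<String> repairer) {
        if (isValidJson.test(raw)) {
            return raw;
        }
        String repaired = repairer.apply(raw);
        if (!isValidJson.test(repaired)) {
            throw new IllegalArgumentException("JSON is still invalid after repair");
        }
        return repaired;
    }

    public static void main(String[] args) {
        // Toy stand-ins (assumptions): a real application would plug in a
        // JSON parser as the validator and repair::handle as the repairer.
        Predicate<String> looksLikeObject = s -> s.startsWith("{") && s.endsWith("}");
        UnaryOperator<String> appendBrace = s -> s + "}";

        System.out.println(validOrRepaired("{\"f\":\"v\"", looksLikeObject, appendBrace));
    }
}
```

Passing the validator and repairer as functions keeps the sketch independent of any particular JSON parser and makes the fallback path easy to unit test.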

Features

The test case dataset and the test report list every JSON anomaly that the current version, 0.2.2, can repair.

Version 0.2.2 provides the following:

  • Basic repair of JSON strings:
    • Adding a missing closing brace;
    • Adding a missing closing square bracket;
    • Removing redundant commas inside array values;
    • Filling in missing values with "null";
    • Adding a missing opening brace;
    • Adding the missing outermost braces;
    • Adding missing quotation marks around strings in certain scenarios;
    • Configuring the number of repair attempts.
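To make the first two repairs in the list concrete, here is a simplified sketch of how missing closing delimiters can be appended to truncated JSON. It is an illustration only and is not how json-repair is implemented (the library builds on an ANTLR grammar); the class BracketCloser and the method closeOpenContainers are hypothetical names. The sketch tracks open containers with a stack and ignores delimiters that appear inside string literals.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; not json-repair's actual implementation.
class BracketCloser {

    // Appends the closing braces and brackets that a truncated JSON string lacks.
    static String closeOpenContainers(String json) {
        Deque<Character> expectedClosers = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Inside a string literal: only look for its terminating quote.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': expectedClosers.push('}'); break;
                case '[': expectedClosers.push(']'); break;
                case '}':
                case ']':
                    // Pop only when the closer matches the innermost open container.
                    if (!expectedClosers.isEmpty() && expectedClosers.peek() == c) {
                        expectedClosers.pop();
                    }
                    break;
                default: break;
            }
        }

        // Close the remaining containers, innermost first.
        StringBuilder repaired = new StringBuilder(json);
        while (!expectedClosers.isEmpty()) {
            repaired.append(expectedClosers.pop());
        }
        return repaired.toString();
    }

    public static void main(String[] args) {
        System.out.println(closeOpenContainers("{\"f\":\"v\", \"f2\":\"v2\"")); // {"f":"v", "f2":"v2"}
        System.out.println(closeOpenContainers("{\"f\":\"v\", \"a\":[1"));      // {"f":"v", "a":[1]}
    }
}
```

This sketch does not handle the other cases in the list, such as missing opening braces, trailing commas, missing values, or unquoted strings, and it does not terminate an unclosed string literal.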

Benchmark

To measure performance in additional scenarios, run BenchmarkTests.

Benchmark results for the current version, 0.2.2, are as follows:

--AverageTime --NANOSECONDS --Warmup-5-1-SECONDS
Benchmark                                                          (anomalyJSON)    Mode     Cnt          Score         Error   Units
BenchmarkTests.testSimpleRepairStrategy                      {"f":"v", "f2":"v2"    avgt       5       9077.407 ±    5268.919   ns/op
BenchmarkTests.testSimpleRepairStrategy                         {"f":"v", "a":[1    avgt       5      21058.074 ±    2600.312   ns/op
BenchmarkTests.testSimpleRepairStrategy  {"f":"v", "a":[1,2], "o1":{"f1":"v1"},     avgt       5      18696.069 ±    3740.596   ns/op
BenchmarkTests.testSimpleRepairStrategy     "f":"v", "a":[1,2], "o1":{"f1":"v1"}    avgt       5      21853.925 ±     343.950   ns/op
BenchmarkTests.testSimpleRepairStrategy                                      f:v    avgt       5      45642.245 ±   11680.611   ns/op

--AverageTime --MILLISECONDS --Warmup-5-1-SECONDS
Benchmark                                                          (anomalyJSON)    Mode     Cnt          Score         Error   Units
BenchmarkTests.testSimpleRepairStrategy                      {"f":"v", "f2":"v2"    avgt       5          0.012 ±       0.008   ms/op
BenchmarkTests.testSimpleRepairStrategy                         {"f":"v", "a":[1    avgt       5          0.061 ±       0.112   ms/op
BenchmarkTests.testSimpleRepairStrategy  {"f":"v", "a":[1,2], "o1":{"f1":"v1"},     avgt       5          0.037 ±       0.048   ms/op
BenchmarkTests.testSimpleRepairStrategy     "f":"v", "a":[1,2], "o1":{"f1":"v1"}    avgt       5          0.035 ±       0.054   ms/op
BenchmarkTests.testSimpleRepairStrategy                                      f:v    avgt       5          0.094 ±       0.151   ms/op

Coverage

See the coverage report for details of the project's test coverage.

Roadmap

  • Make more accurate corrections to abnormal JSON by providing reference JSON styles.

Development Guide

After cloning the repository, navigate to the project's root directory, then build and install all modules. This installs the modules into the local Maven repository and generates the parser's Java classes from the ANTLR grammar (.g4) files, which prevents parser compile errors in the IDE.

mvnw install

License

json-repair is licensed under the Apache-2.0 license.