HAibiiin/json-repair

json-repair: A library for fixing JSON anomalies generated by LLMs.

What is json-repair?

json-repair is a Java library that repairs abnormal JSON generated by LLMs (Large Language Models) at the application layer. With json-repair, abnormal JSON can be repaired simply, efficiently, and accurately.

Getting started

To get started with json-repair, first add it as a dependency in your Java project. If you're using Maven, that looks like this:

<dependency>
    <groupId>io.github.haibiiin</groupId>
    <artifactId>json-repair</artifactId>
    <version>0.2.2</version>
</dependency>

If you're using Gradle, that looks like this:

implementation 'io.github.haibiiin:json-repair:0.2.2'

Next, instantiate a JSONRepair object and call its handle() method to repair a JSON string, like so:

JSONRepair repair = new JSONRepair();
String correctJSON = repair.handle(mistakeJSON);

Features

To see every JSON anomaly that json-repair 0.2.2 can repair, check the test case dataset or the test report.

Version 0.2.2 provides the following functions:

  • Basic repair functions for JSON strings:
    • Adding a missing closing brace (});
    • Adding a missing closing square bracket (]);
    • Removing redundant commas in array values;
    • Filling in missing values with null;
    • Adding a missing opening brace ({);
    • Adding missing outermost braces;
    • Adding missing quotation marks around strings in certain scenarios;
    • Supporting a configurable number of repair attempts.
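
To make the first two items in the list above concrete, here is a minimal, self-contained sketch of one way to append missing closing braces and square brackets to a truncated JSON string. It is not json-repair's actual implementation, which this README does not describe. The class BracketCloser and the method closeOpenBrackets are hypothetical names used only for illustration. The sketch assumes the input was cut off outside of a string literal and does not attempt any of the other repairs listed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper for illustration only; NOT part of the json-repair API.
final class BracketCloser {

    // Appends the closing braces and square brackets that a truncated JSON
    // string is missing. Assumes the input does not end inside a string literal.
    static String closeOpenBrackets(String json) {
        Deque<Character> expectedClosers = new ArrayDeque<>();
        boolean inString = false;
        boolean escaped = false;

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Inside a string literal, brackets are plain text.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            switch (c) {
                case '"': inString = true; break;
                case '{': expectedClosers.push('}'); break;
                case '[': expectedClosers.push(']'); break;
                case '}':
                case ']':
                    // Pop only when the closer matches the innermost open bracket.
                    if (!expectedClosers.isEmpty() && expectedClosers.peek() == c) {
                        expectedClosers.pop();
                    }
                    break;
                default: break;
            }
        }

        // Close whatever is still open, innermost first.
        StringBuilder repaired = new StringBuilder(json);
        while (!expectedClosers.isEmpty()) {
            repaired.append(expectedClosers.pop());
        }
        return repaired.toString();
    }

    public static void main(String[] args) {
        System.out.println(closeOpenBrackets("{\"f\":\"v\", \"f2\":\"v2\""));
        System.out.println(closeOpenBrackets("{\"f\":\"v\", \"a\":[1"));
    }
}
```

Run on the first two benchmark inputs shown later in this README, the sketch prints {"f":"v", "f2":"v2"} and {"f":"v", "a":[1]}. A stack is used because closers must be emitted in the reverse order of the brackets that were opened.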

Benchmark

You can conduct performance tests in more scenarios by running BenchmarkTests.

The benchmark results for the current version 0.2.2 are as follows:

--AverageTime --NANOSECONDS --Warmup-5-1-SECONDS
Benchmark                                                          (anomalyJSON)    Mode     Cnt          Score         Error   Units
BenchmarkTests.testSimpleRepairStrategy                      {"f":"v", "f2":"v2"    avgt       5       9077.407 ±    5268.919   ns/op
BenchmarkTests.testSimpleRepairStrategy                         {"f":"v", "a":[1    avgt       5      21058.074 ±    2600.312   ns/op
BenchmarkTests.testSimpleRepairStrategy  {"f":"v", "a":[1,2], "o1":{"f1":"v1"},     avgt       5      18696.069 ±    3740.596   ns/op
BenchmarkTests.testSimpleRepairStrategy     "f":"v", "a":[1,2], "o1":{"f1":"v1"}    avgt       5      21853.925 ±     343.950   ns/op
BenchmarkTests.testSimpleRepairStrategy                                      f:v    avgt       5      45642.245 ±   11680.611   ns/op

--AverageTime --MILLISECONDS --Warmup-5-1-SECONDS
Benchmark                                                          (anomalyJSON)    Mode     Cnt          Score         Error   Units
BenchmarkTests.testSimpleRepairStrategy                      {"f":"v", "f2":"v2"    avgt       5          0.012 ±       0.008   ms/op
BenchmarkTests.testSimpleRepairStrategy                         {"f":"v", "a":[1    avgt       5          0.061 ±       0.112   ms/op
BenchmarkTests.testSimpleRepairStrategy  {"f":"v", "a":[1,2], "o1":{"f1":"v1"},     avgt       5          0.037 ±       0.048   ms/op
BenchmarkTests.testSimpleRepairStrategy     "f":"v", "a":[1,2], "o1":{"f1":"v1"}    avgt       5          0.035 ±       0.054   ms/op
BenchmarkTests.testSimpleRepairStrategy                                      f:v    avgt       5          0.094 ±       0.151   ms/op

Coverage

You can check the coverage report for details of the project's test coverage.

Roadmap

  • Improve repair accuracy for abnormal JSON by supporting user-provided reference JSON styles.

License

json-repair is licensed under the Apache-2.0 license.