Results for the SPARQL tasks of the LLM-KG-Bench framework as described in the article "Assessing SPARQL capabilities of Large Language Models" by L.-P. Meyer et al., in the proceedings of the NLP4KGC workshop at SEMANTICS 2024.
The code used for this run is archived at Zenodo as . These results are archived at Zenodo as .
Overview of the task types and their inputs and outputs:
- SparqlSyntaxFixing: fixing syntax errors in SPARQL SELECT queries
- Text2Sparql: generating SPARQL SELECT queries from textual questions
- Text2Answer: generating the answer to a textual question over a given knowledge graph
- Sparql2Answer: generating the answer for a SPARQL SELECT query over a given knowledge graph
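To make the task setup concrete, here is a minimal, hypothetical illustration of a SparqlSyntaxFixing instance. The broken query, the defect, and the expected fix are invented for illustration only and are not taken from the benchmark data:

    # Hypothetical illustration of a SparqlSyntaxFixing task instance
    # (invented example, not actual benchmark data).

    # Input given to the LLM: a SPARQL SELECT query with a syntax error
    # (the closing brace of the WHERE clause is missing).
    broken_query = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name WHERE {
      ?person a foaf:Person ;
              foaf:name ?name .
    """

    # Expected output from the LLM: the repaired, syntactically valid query.
    fixed_query = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name WHERE {
      ?person a foaf:Person ;
              foaf:name ?name .
    }
    """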
Files contained in this archive:
- Result files generated, in different serialization formats containing the same information (see the loading sketch after this list):
    *_run-[YYYY-mm-DD_HH-MM-ss]_result.json
    *_run-[YYYY-mm-DD_HH-MM-ss]_result.yaml
    *_run-[YYYY-mm-DD_HH-MM-ss]_result.txt
- Model log containing all text exchanged between the benchmark framework and the LLM models (also covered by the sketch below):
    *_run-[YYYY-mm-DD_HH-MM-ss]_modelLog.jsonl
- Debug log with extensive log messages:
    *_run-[YYYY-mm-DD_HH-MM-ss]_debug-log.log
- CSV/XLSX summaries of all results for a task (see the CSV sketch after this list):
    *.csv
    *.xlsx
- Boxplots of all results for a task:
    *boxplots*.png
- Benchmark framework configuration file used:
    configuration-2024-05-sparql.yml
- Counts of experiments for each task and model combination present in the result files:
    sparql6-boxplots__stats.csv
- Logs of the Matrix-Run executions that generated the result files:
    MatrixRun-Logs/
- The benchmark framework supports re-evaluation of given result files via the --reeval parameter.
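As a minimal sketch for inspecting the result and model log files, the snippet below loads one *_result.json file and iterates over one *_modelLog.jsonl file using only the Python standard library. The concrete file names are placeholders for the patterns listed above, and nothing is assumed about the internal schema beyond JSON and JSON Lines syntax:

    import json

    # Load one result file (placeholder name following the pattern
    # *_run-[YYYY-mm-DD_HH-MM-ss]_result.json listed above).
    with open("example_run-2024-05-01_12-00-00_result.json", encoding="utf-8") as f:
        result = json.load(f)
    print(type(result).__name__)
    if isinstance(result, dict):
        # Show the top-level keys without assuming a specific schema.
        print("top-level keys:", sorted(result))

    # The model log is JSON Lines: one JSON object per line.
    with open("example_run-2024-05-01_12-00-00_modelLog.jsonl", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            print(list(entry) if isinstance(entry, dict) else entry)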
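Similarly, the per-task CSV summaries and the sparql6-boxplots__stats.csv counts file can be inspected with the csv module. The sketch only assumes standard CSV with a header row, not a particular set of columns:

    import csv

    # Inspect the experiment-count summary shipped with this archive.
    with open("sparql6-boxplots__stats.csv", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        print("columns:", reader.fieldnames)
        print("rows:", sum(1 for _ in reader))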
The result files collected here contain test data. Please do not use them to train LLMs. If you are interested in training data, please contact us, e.g. via email.