results of SPARQL LLM-KG-Bench tasks for article "Assessing SPARQL capabilities of Large Language Models"

AKSW/LLM-Sparql-Results-2024

LLM-KG-Bench SPARQL Results 2024

Results for the SPARQL tasks of the LLM-KG-Bench framework, as described in the article "Assessing SPARQL capabilities of Large Language Models" by L.-P. Meyer et al. in the proceedings of the NLP4KGC workshop at SEMANTICS 2024.

The results cover four task types related to SPARQL SELECT queries:

  • SparqlSyntaxFixing: Fixing syntax errors in SPARQL SELECT queries
  • Text2Sparql: Generate SPARQL SELECT queries from textual questions
  • Text2Answer: Generate the answer for a textual question with a given knowledge graph
  • Sparql2Answer: Generate the answer for a SPARQL SELECT query with a given knowledge graph

Overview of the task types and their input and output:

[Figure: Overview of the task types]

Files generated for each run:

  • result files, in different serialization formats containing the same information:
    • *_run-[YYYY-mm-DD_HH-MM-ss]_result.json
    • *_run-[YYYY-mm-DD_HH-MM-ss]_result.yaml
    • *_run-[YYYY-mm-DD_HH-MM-ss]_result.txt
  • model log containing all text exchanged between the benchmark framework and the LLMs: *_run-[YYYY-mm-DD_HH-MM-ss]_modelLog.jsonl
  • debug log with extensive log messages: *_run-[YYYY-mm-DD_HH-MM-ss]_debug-log.log
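
The naming scheme above makes it possible to collect all files that belong to one run. The following is a minimal sketch, not part of the benchmark framework: the helper name `group_run_files` is hypothetical, and it assumes the square brackets in the patterns above are placeholder notation around the timestamp (the regular expression also tolerates literal brackets).

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: "[YYYY-mm-DD_HH-MM-ss]" in the README is placeholder notation,
# so real names look like "<prefix>_run-2024-05-01_12-30-00_result.json".
# Literal square brackets around the timestamp are accepted as well.
RUN_FILE = re.compile(
    r"^(?P<prefix>.*)_run-\[?(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\]?"
    r"_(?P<kind>result|modelLog|debug-log)\.(?P<ext>json|yaml|txt|jsonl|log)$"
)


def group_run_files(names):
    """Group per-run file names by (prefix, run timestamp).

    Returns a dict mapping (prefix, timestamp) to a dict that maps
    "<kind>.<ext>" (e.g. "result.json") to the original file name.
    """
    runs = defaultdict(dict)
    for name in names:
        match = RUN_FILE.match(Path(name).name)
        if match is None:
            continue  # not a per-run file, e.g. a CSV summary or a plot
        key = (match["prefix"], match["stamp"])
        runs[key][f'{match["kind"]}.{match["ext"]}'] = name
    return dict(runs)
```

For example, `group_run_files(str(p) for p in Path(".").rglob("*_run-*"))` would group every per-run file found below the current directory.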

Stats and plots generated per task:

  • CSV/XLSX summary of all results for a task: *.csv/*.xlsx
  • boxplot of all results for a task: *boxplots*.png
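
The column layout of the CSV summaries is not documented in this README, so a first step is to inspect it. The sketch below makes no assumption about column names; the helper name `summarize_csv` is hypothetical, and a comma delimiter with UTF-8 encoding is assumed.

```python
import csv


def summarize_csv(path):
    """Return (column names, number of data rows) for a summary CSV file."""
    # Assumption: comma-separated, UTF-8 encoded, first row is the header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        return list(reader.fieldnames or []), row_count
```

Calling `summarize_csv` on one of the `*.csv` files returns the header names together with the number of result rows, which is enough to decide how to load the file for further analysis.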

Other files:

  • Benchmark framework configuration file used: configuration-2024-05-sparql.yml
  • Number of experiments per task and model combination present in the result files: sparql6-boxplots__stats.csv
  • Logs of Matrix-Run executions generating the given result files: MatrixRun-Logs/

Re-evaluation of the given result files:

  • The benchmarking framework supports re-evaluating existing result files via the --reeval parameter
