LLMSteer: The Unreasonable Effectiveness of LLMs for Query Optimization

This repository contains code, embeddings of SQL queries, models, and result artifacts from the preliminary work on LLMSteer. Analysis was done using Python 3.11.4; see requirements.txt for the list of dependencies. Queries were executed using PostgreSQL 16.1.

Abstract: Recent work in database query optimization has used complex machine learning strategies, such as customized reinforcement learning schemes. Surprisingly, we show that LLM embeddings of query text contain useful semantic information for query optimization. Specifically, we show that a simple binary classifier deciding between alternative query plans, trained only on a small number of labeled embedded query vectors, can outperform existing heuristic systems. Although we only present some preliminary results, an LLM-powered query optimizer could provide significant benefits, both in terms of performance and simplicity.
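To make the idea in the abstract concrete, here is a minimal sketch of a binary classifier over embedded query vectors. It is not the code from this repository: the data is synthetic, the 8-dimensional vectors stand in for real LLM embeddings, and the choice of plain logistic regression (along with the names `train_logistic` and `predict`) is an assumption made purely for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x) -> int:
    """Return 1 to choose the alternative plan, 0 to keep the default plan."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5)


# Synthetic stand-in for embedded queries (assumption: real embeddings
# would come from an LLM and have far more dimensions).
rng = random.Random(0)
DIM = 8


def make_point(label: int):
    center = 1.0 if label == 1 else -1.0
    return [rng.gauss(center, 0.5) for _ in range(DIM)]


labels = [i % 2 for i in range(60)]  # 1 = alternative plan was faster
embeddings = [make_point(label) for label in labels]

w, b = train_logistic(embeddings, labels)
accuracy = sum(predict(w, b, x) == l for x, l in zip(embeddings, labels)) / len(labels)
print(f"training accuracy: {accuracy:.2f}")
```

In a real setting the label for each query would come from comparing measured execution times of the two plans, and accuracy would be evaluated on held-out queries rather than on the training set.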

The data used in this work is accessible here and can be extracted with the following command: `tar --zstd -xvf llmsteer_data.tar.zst`

The list of hints used in this work originates from Bao and can be found in the online appendix here.
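As a rough illustration of what applying a hint can look like, the sketch below assumes that a hint is a set of PostgreSQL planner settings (such as `enable_nestloop`) to switch off before running a query. The helper names `hint_set_to_sql` and `steer_query` are hypothetical, and the example hint set is illustrative rather than taken from the appendix.

```python
# PostgreSQL planner settings that enable or disable individual operators.
PLANNER_SETTINGS = (
    "enable_hashjoin",
    "enable_mergejoin",
    "enable_nestloop",
    "enable_indexscan",
    "enable_seqscan",
    "enable_indexonlyscan",
)


def hint_set_to_sql(disabled):
    """Render one SET statement per planner setting (hypothetical helper)."""
    disabled = set(disabled)
    unknown = disabled - set(PLANNER_SETTINGS)
    if unknown:
        raise ValueError(f"unknown planner settings: {sorted(unknown)}")
    return [
        f"SET {name} TO {'off' if name in disabled else 'on'};"
        for name in PLANNER_SETTINGS
    ]


def steer_query(sql, disabled):
    """Prefix a query with the SET statements for the chosen hint set."""
    return "\n".join(hint_set_to_sql(disabled) + [sql])


# Illustrative hint set: discourage nested-loop joins for this query.
statements = hint_set_to_sql({"enable_nestloop"})
script = steer_query("SELECT count(*) FROM title;", {"enable_nestloop"})
print(script)
```

The settings are session-level, so a script like this would normally run on a dedicated connection or reset the settings afterwards.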

The full paper can be found on arXiv here.

You can find me at peterai.me