@ayman-openai ayman-openai commented Sep 6, 2025

Summary

This PR adds a new notebook that introduces how to integrate OpenAI embeddings into DuckDB by defining an OpenAI embeddings UDF (user-defined function) that can be called from DuckDB SQL. The example is straightforward: it walks through a simple end-to-end semantic (vector) search over an arXiv abstracts dataset.

Motivation

I noticed that we don't have any guides on using DuckDB with OpenAI APIs, so I thought I'd kick that off with a simple introduction. DuckDB is a popular, lightweight OLAP database used mainly for analytics workloads. For many analytics use cases, combining embeddings and other OpenAI endpoints with native SQL analysis can be quite powerful.


For new content

When contributing new content, read through our contribution guidelines, and mark the following action items as completed:

  • I have added a new entry in registry.yaml (and, optionally, in authors.yaml) so that my content renders on the cookbook website.
  • I have conducted a self-review of my content based on the contribution guidelines:
    • Relevance: This content is related to building with OpenAI technologies and is useful to others.
    • Uniqueness: I have searched for related examples in the OpenAI Cookbook, and verified that my content offers new insights or unique information compared to existing documentation.
    • Spelling and Grammar: I have checked for spelling or grammatical mistakes.
    • Clarity: I have done a final read-through and verified that my submission is well-organized and easy to understand.
    • Correctness: The information I include is correct and all of my code executes successfully.
    • Completeness: I have explained everything fully, including all necessary references and citations.

We will rate each of these areas on a scale from 1 to 4, and will only accept contributions that score 3 or higher on all areas. Refer to our contribution guidelines for more details.
