title	summary
Get Started with Vector Search via SQL	Learn how to quickly get started with Vector Search in TiDB Cloud using SQL statements and power the generative AI application.

Get Started with Vector Search via SQL

TiDB extends MySQL syntax to support Vector Search and introduce new Vector data types and several vector functions.

This tutorial demonstrates how to get started with TiDB Vector Search just using SQL statements. You will learn how to use the MySQL command-line client to:

Connect to your TiDB cluster.
Create a vector table.
Store vector embeddings.
Perform vector search queries.

Note

TiDB Vector Search is currently in beta and only available for TiDB Serverless clusters.

Prerequisites

To complete this tutorial, you need:

MySQL command-line client (MySQL CLI) installed on your machine.
A TiDB Serverless cluster. Follow creating a TiDB Serverless cluster to create your own TiDB Cloud cluster if you don't have one.

Get started

Step 1. Connect to the TiDB cluster

Navigate to the Clusters page, and then click the name of your target cluster to go to its overview page.
Click Connect in the upper-right corner. A connection dialog is displayed.
In the connection dialog, select MySQL CLI from the Connect With drop-down list and keep the default setting of the Connection Type as Public.
If you have not set a password yet, click Generate Password to generate a random password.

Copy the connection command and paste it into your terminal. The following is an example for macOS:

mysql -u '<prefix>.root' -h '<host>' -P 4000 -D 'test' --ssl-mode=VERIFY_IDENTITY --ssl-ca=/etc/ssl/cert.pem -p'<password>'

Step 2. Create a vector table

With vector search support, you can use the VECTOR type column to store vector embeddings in TiDB.

To create a table with a three-dimensional VECTOR column, execute the following SQL statements using your MySQL CLI:

USE test;
CREATE TABLE embedded_documents (
    id        INT       PRIMARY KEY,
    -- Column to store the original content of the document.
    document  TEXT,
    -- Column to store the vector representation of the document.
    embedding VECTOR(3)
);

The expected output is as follows:

Query OK, 0 rows affected (0.27 sec)

Step 3. Store the vector embeddings

Insert three documents with their vector embeddings into the embedded_documents table:

INSERT INTO embedded_documents
VALUES
    (1, 'dog', '[1,2,1]'),
    (2, 'fish', '[1,2,4]'),
    (3, 'tree', '[1,0,0]');

The expected output is as follows:

Query OK, 3 rows affected (0.15 sec)
Records: 3  Duplicates: 0  Warnings: 0

Note

This example simplifies the dimensions of the vector embeddings and uses only 3-dimensional vectors for demonstration purposes.

In real-world applications, embedding models often produce vector embeddings with hundreds or thousands of dimensions.

Step 4. Query the vector table

To verify that the documents have been inserted correctly, query the embedded_documents table:

SELECT * FROM embedded_documents;

The expected output is as follows:

+----+----------+-----------+
| id | document | embedding |
+----+----------+-----------+
|  1 | dog      | [1,2,1]   |
|  2 | fish     | [1,2,4]   |
|  3 | tree     | [1,0,0]   |
+----+----------+-----------+
3 rows in set (0.15 sec)

Step 5. Perform a vector search query

Similar to full-text search, users provide search terms to the application when using vector search.

In this example, the search term is "a swimming animal", and its corresponding vector embedding is [1,2,3]. In practical applications, you need to use an embedding model to convert the user's search term into a vector embedding.

Execute the following SQL statement and TiDB will identify the top three documents closest to the search term by calculating and sorting the cosine distances (vec_cosine_distance) between the vector embeddings.

SELECT id, document, vec_cosine_distance(embedding, '[1,2,3]') AS distance
FROM embedded_documents
ORDER BY distance
LIMIT 3;

The expected output is as follows:

+----+----------+---------------------+
| id | document | distance            |
+----+----------+---------------------+
|  2 | fish     | 0.00853986601633272 |
|  1 | dog      | 0.12712843905603044 |
|  3 | tree     |  0.7327387580875756 |
+----+----------+---------------------+
3 rows in set (0.15 sec)

From the output, the swimming animal is most likely a fish, or a dog with a gift for swimming.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vector-search-get-started-using-sql.md

vector-search-get-started-using-sql.md

Get Started with Vector Search via SQL

Prerequisites

Get started

Step 1. Connect to the TiDB cluster

Step 2. Create a vector table

Step 3. Store the vector embeddings

Step 4. Query the vector table

Step 5. Perform a vector search query

See also

Files

vector-search-get-started-using-sql.md

Latest commit

History

vector-search-get-started-using-sql.md

File metadata and controls

Get Started with Vector Search via SQL

Prerequisites

Get started

Step 1. Connect to the TiDB cluster

Step 2. Create a vector table

Step 3. Store the vector embeddings

Step 4. Query the vector table

Step 5. Perform a vector search query

See also