This is a basic introduction to Milvus using milvus-sdk-node. All functions return a promise, so we can use async/await to get the result.
Milvus: v1.x Node: v14+
npm install @zilliz/milvus-sdk-node
Before we start, there are some prerequisites.
Make sure that:
- You have a running Milvus instance.
- milvus-sdk-node is correctly installed.
- First of all, we need to import @zilliz/milvus-sdk-node.
import { MilvusNode } from "@zilliz/milvus-sdk-node";
- Then we can connect to the Milvus server. By default, Milvus runs on localhost on port 19530, so you can use the default values to connect.
const IP = "127.0.0.1:19530";
const milvusClient = new MilvusNode(IP);
- After connecting, we can communicate with Milvus in the following ways. If you are confused about the terminology, see Milvus Terminology for explanations.
Now let's create a new collection. Before we start, we can list all the collections that already exist. For a brand-new Milvus instance, the result should be empty.
const collections = await milvusClient.showCollections();
// return data
// {
// collection_names: [ 'test_01' ],
// status: { error_code: 'SUCCESS', reason: 'OK' }
// }
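The responses in this tutorial come in two shapes: some carry a nested status object (as above), while others are the bare { error_code, reason } pair. A minimal sketch of a check that handles both could look like the following. The names isSuccess and assertSuccess are hypothetical helpers, not part of milvus-sdk-node.

```javascript
// Hypothetical helpers (not SDK functions) for the two response shapes
// shown in this tutorial.
function isSuccess(res) {
  // Some calls return { status: { error_code, reason }, ... },
  // others return { error_code, reason } directly.
  const status = res && res.status ? res.status : res;
  return Boolean(status) && status.error_code === "SUCCESS";
}

function assertSuccess(res) {
  if (!isSuccess(res)) {
    const status = (res && res.status) || res || {};
    throw new Error(`Milvus call failed: ${status.error_code} (${status.reason})`);
  }
  return res;
}

console.log(isSuccess({ collection_names: [], status: { error_code: "SUCCESS", reason: "OK" } })); // true
console.log(isSuccess({ error_code: "SUCCESS", reason: "OK" })); // true
```

With such a helper, a call can be wrapped as assertSuccess(await milvusClient.showCollections()) so that failures surface as exceptions.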
To create a collection, we need to provide collection parameters. CollectionSchema consists of 4 components: collection_name, dimension, index_file_size and metric_type.
- collection_name: The name of the collection. It must be a string that is unique among the collections that already exist.
- dimension: For a float vector, dimension should be equal to the length of the vector; for a binary vector, dimension should be equal to the bit size of the vector.
- index_file_size: Milvus controls the size of data segments according to index_file_size. Refer to Storage Concepts for more information about segments and index_file_size.
- metric_type: The metric Milvus uses to compute the distance between two vectors. We can use the getMetricType function to get all metric types. Refer to Distance Metrics for more information.
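To make dimension and the IP (inner product) metric more concrete, here is a small client-side sketch. It is illustrative only: Milvus computes distances on the server, and innerProduct and hasDimension are hypothetical helper names that do not come from the SDK.

```javascript
// Illustrative only: Milvus computes distances on the server.
// Inner product of two float vectors of the same length.
function innerProduct(a, b) {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// For float vectors, `dimension` must equal the vector length.
function hasDimension(vector, dimension) {
  return Array.isArray(vector) && vector.length === dimension;
}

console.log(innerProduct([1, 2, 3], [4, 5, 6])); // 32
console.log(hasDimension(new Array(8).fill(0), 8)); // true
```

A check like hasDimension can be run on vectors before inserting them, so that a length mismatch is caught on the client side.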
Now we can create a collection:
const metricTypes = milvusClient.getMetricType();
const res = await milvusClient.createCollection({
collection_name: "demo_milvus_tutorial",
dimension: 8,
metric_type: metricTypes.IP,
index_file_size: 1024,
});
// { error_code: 'SUCCESS', reason: 'OK' }
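Then you can list collections, and 'demo_milvus_tutorial' will be in the result.
You can also get information about the collection. In the snippets below, COLLECTION_NAME is assumed to be a constant holding the collection name created above ("demo_milvus_tutorial").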
const collectionInfo = await milvusClient.showCollectionsInfo({
collection_name: COLLECTION_NAME,
});
// {
// status: { error_code: 'SUCCESS', reason: 'OK' },
// json_info: '{"partitions":[{"row_count":0,"segments":null,"tag":"_default"}],"row_count":0}'
// }
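Note that json_info is a JSON string rather than an object, so it has to be parsed before use. The sketch below assumes the layout matches the sample output above (a partitions array plus a top-level row_count); summarizeCollectionInfo is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper; the json_info layout is assumed to match the
// sample output shown in this tutorial.
function summarizeCollectionInfo(collectionInfo) {
  const info = JSON.parse(collectionInfo.json_info);
  const partitions = (info.partitions || []).map((p) => ({
    tag: p.tag,
    rowCount: p.row_count,
    // `segments` is null when a partition has no segments yet.
    segmentCount: Array.isArray(p.segments) ? p.segments.length : 0,
  }));
  return { rowCount: info.row_count, partitions };
}

const sampleInfo = {
  status: { error_code: "SUCCESS", reason: "OK" },
  json_info:
    '{"partitions":[{"row_count":0,"segments":null,"tag":"_default"}],"row_count":0}',
};
console.log(summarizeCollectionInfo(sampleInfo));
```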
This is a basic introductory tutorial, so building indexes is not covered here.
If you want to go further into Milvus with indexes, we recommend checking our Full examples.
Furthermore, if you want a thorough view of indexes, see Vector Index on our official website.
If you don't create a partition, there will be a default one called "_default", and all entities will be inserted into the "_default" partition. You can check it with showPartitions:
const partitions = await milvusClient.showPartitions({
collection_name: COLLECTION_NAME,
});
// {
// partition_tag_array: [ '_default' ],
// status: { error_code: 'SUCCESS', reason: 'OK' }
// }
You can provide a partition tag to create a new partition.
const res = await milvusClient.createPartition({
collection_name: COLLECTION_NAME,
tag: PARTITION_TAG,
});
// { error_code: 'SUCCESS', reason: 'OK' }
An entity is a group of fields that corresponds to a real-world object. In the current version, an entity in Milvus only contains a vector field.
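- Prepare 3 entities
// Build each vector separately so the 3 entities do not share one array or one random value
const entities = Array.from({ length: 3 }, () => Array.from({ length: 8 }, () => Math.random() * 100));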
- Insert Entities
If the entities are inserted successfully, the ids of the inserted entities will be returned.
const res = await milvusClient.insert({
collection_name: COLLECTION_NAME,
partition_tag: PARTITION_TAG,
records: entities.map((v, i) => ({
value: v,
})),
record_type: "float",
});
// {
// vector_id_array: [
// '1618818108058974000',
// '1618818108058974001',
// '1618818108058974002'
// ],
// status: { error_code: 'SUCCESS', reason: 'OK' }
// }
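The generated ids are returned as strings. The sample values above are larger than Number.MAX_SAFE_INTEGER, so converting them with Number() can make distinct ids compare equal. A minimal sketch of a safer comparison, using the built-in BigInt, follows; compareIds is a hypothetical helper name.

```javascript
// Hypothetical helper: compare two id strings without losing precision.
// Returns -1, 0 or 1, like a sort comparator.
function compareIds(a, b) {
  const x = BigInt(a);
  const y = BigInt(b);
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

// Ids copied from the sample output above.
const firstId = "1618818108058974000";
const secondId = "1618818108058974001";

// Converted to Number, both ids collapse onto the same floating-point value.
console.log(Number.isSafeInteger(Number(firstId))); // false
console.log(Number(firstId) === Number(secondId)); // true

// Compared as BigInt, they stay distinct.
console.log(compareIds(firstId, secondId)); // -1
```

Keeping the ids as strings (or BigInt values) end to end avoids this problem.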
You can also provide the entity ids yourself:
const res = await milvusClient.insert({
collection_name: COLLECTION_NAME,
partition_tag: PARTITION_TAG,
records: entities.map((v, i) => ({
id: i + 1,
value: v,
})),
record_type: "float",
});
// {
// vector_id_array: [
// '1',
// '2',
// '3'
// ],
// status: { error_code: 'SUCCESS', reason: 'OK' }
// }
If id is not passed the first time insert() is invoked, it must not be passed in any later insert() call either; otherwise the server returns an error and the insertion fails. The reverse also holds: if ids are passed the first time, they must be passed every time.
If partition_tag isn't provided, the entities are inserted into the "_default" partition; otherwise, they are inserted into the specified partition.
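One way to keep the "all ids or no ids" rule consistent within a batch is to build the records array in a single place. The sketch below is an assumption about how one might organize client code; toRecords is a hypothetical helper, and the record shape ({ id, value }) is taken from the insert examples above.

```javascript
// Hypothetical helper, not part of the SDK. Builds the `records` array
// either entirely with ids or entirely without them.
function toRecords(vectors, ids) {
  if (ids === undefined) {
    // No ids supplied: let Milvus generate them.
    return vectors.map((value) => ({ value }));
  }
  if (ids.length !== vectors.length) {
    throw new Error(`expected ${vectors.length} ids, got ${ids.length}`);
  }
  return vectors.map((value, i) => ({ id: ids[i], value }));
}

console.log(toRecords([[0.1, 0.2]]));
console.log(toRecords([[0.1, 0.2], [0.3, 0.4]], [1, 2]));
```

The result can be passed as the records field of insert(), for example records: toRecords(entities, [1, 2, 3]).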
After successfully inserting 3 entities into Milvus, we can flush data from memory to disk so that we can retrieve them. Milvus also performs an automatic flush at a fixed interval (configurable, default 1 second); see Data Flushing.
You can flush multiple collections at one time, so be aware that the parameter is a list.
const res = await milvusClient.flush({
collection_name_array: [COLLECTION_NAME],
});
// { error_code: 'SUCCESS', reason: 'OK' }
We can also count how many entities there are in the collection.
const count = await milvusClient.countCollection({
collection_name: COLLECTION_NAME,
});
// {
// status: { error_code: 'SUCCESS', reason: 'OK' },
// collection_row_count: '6'
// }
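As the sample output shows, collection_row_count comes back as a string. If a number is needed, a small conversion step such as the following sketch could be used; parseRowCount is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: convert the string row count into a number,
// rejecting anything that is not a non-negative safe integer.
function parseRowCount(countResponse) {
  const n = Number(countResponse.collection_row_count);
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new Error(`unexpected row count: ${countResponse.collection_row_count}`);
  }
  return n;
}

console.log(
  parseRowCount({
    status: { error_code: "SUCCESS", reason: "OK" },
    collection_row_count: "6",
  })
); // 6
```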
You can get entities by their ids.
const res = await milvusClient.getVectorsByID({
collection_name: COLLECTION_NAME,
id_array: [1, 2],
});
console.log("--- get vectors by id ---", res);
If an id exists, the corresponding entity is returned. If an id doesn't exist, [] is returned in its place.
For the example above, if only one of the two ids exists in demo_milvus_tutorial, the result has only one entity and the other is [].
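You can get entities by vector similarity. Assume we have the two query vectors films_a below, and we want to get the entity that is most similar to each of them.
// Build each query vector separately so the 2 queries do not share one array or one random value
const films_a = Array.from({ length: 2 }, () => Array.from({ length: 8 }, () => Math.random() * 100));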
- If an index has been built on the collection, the user needs to specify search parameters and pass them in the search call, as extra_params does in the example below.
- If the parameter partition_tags is specified, Milvus executes the search request on these partitions instead of the whole collection.
- Because the vectors are randomly generated, the retrieved vector ids and distances may differ.
const res = await milvusClient.search({
collection_name: COLLECTION_NAME,
topk: 1,
extra_params: { nprobe: 16 },
query_record_array: films_a.map((v) => ({
float_data: v,
})),
});
// {
// ids: [
// '1618819557627387001',
// '1618819557627387000',
// ],
// distances: [
// 7.619400501251221,
// 7.619400501251221,
// ],
// status: { error_code: 'SUCCESS', reason: 'OK' },
// row_num: '2',
// data: [
// [
// { id: '1618819557627387001', distance: 7.619400501251221 },
// ],
// [
// { id: '1618819557627387000', distance: 7.619400501251221 }
// ],
// ]
// }
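The data field groups the hits per query vector, in the same order as query_record_array. A minimal sketch of flattening it into one row per hit is shown below. It assumes the response layout matches the sample output above; flattenHits is a hypothetical helper, and position is simply the index of a hit within its group.

```javascript
// Hypothetical helper: flatten `data` (one array of hits per query
// vector) into a single list of rows.
function flattenHits(searchResponse) {
  const rows = [];
  searchResponse.data.forEach((hits, queryIndex) => {
    hits.forEach((hit, position) => {
      rows.push({ queryIndex, position, id: hit.id, distance: hit.distance });
    });
  });
  return rows;
}

// Sample shaped like the output above.
const sampleSearch = {
  data: [
    [{ id: "1618819557627387001", distance: 7.619400501251221 }],
    [{ id: "1618819557627387000", distance: 7.619400501251221 }],
  ],
};
console.log(flattenHits(sampleSearch));
```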
Finally, let's move on to deletion in Milvus. We can delete entities by ids, drop a whole partition, or drop the entire collection.
You can delete entities by their ids.
const res = await milvusClient.deleteByIds({
id_array: [1, 2],
collection_name: COLLECTION_NAME,
});
// { error_code: 'SUCCESS', reason: 'OK' }
If no entity corresponds to a specified id, Milvus ignores it and proceeds with the next deletion. In this case, the client still returns an OK status unless an exception occurs.
You can also drop a partition.
Once you drop a partition, all the data in this partition will be deleted too.
const res = await milvusClient.dropPartition({
collection_name: COLLECTION_NAME,
tag: PARTITION_TAG,
});
// { error_code: 'SUCCESS', reason: 'OK' }
Finally, you can drop an entire collection.
Once you drop a collection, all the data in this collection will be deleted too.
await milvusClient.dropCollection({
collection_name: COLLECTION_NAME,
});
// { error_code: 'SUCCESS', reason: 'OK' }