"Data Preview" for DiskANN index doesn't work #333

Closed
yhmo opened this issue Dec 12, 2023 · 5 comments · Fixed by #340

yhmo commented Dec 12, 2023

Describe the bug:
"Data Preview" for DiskANN index doesn't work.

Steps to reproduce:

  1. Create a collection with a DiskANN index:
import random
import time
import numpy as np

from pymilvus import (
    connections,
    FieldSchema, CollectionSchema, DataType,
    Collection,
    utility,
)

connections.connect(host='localhost', port='19530')
print(utility.get_server_version())

collection_name = "test"
dim = 128
metric_type = "L2"

# create collection
id_field = FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False)
vector_field = FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim)
name_field = FieldSchema(name="name", dtype=DataType.VARCHAR, max_length=256)

schema = CollectionSchema(fields=[id_field, vector_field, name_field])

if utility.has_collection(collection_name):
    utility.drop_collection(collection_name)

collection = Collection(name=collection_name, schema=schema)
print(f"Collection '{collection_name}' created")

batch_size = 10000
data = [
    [k for k in range(batch_size)],
    [[random.random() for _ in range(dim)] for _ in range(batch_size)],
    [f"name_{k}" for k in range(batch_size)]
]
for i in range(3):
    print("insert batch", i)
    collection.insert(data)

collection.flush()
print(collection.num_entities)

index_params = {
    'metric_type': metric_type,
    'index_type': "DISKANN",
    'params': {},
}
collection.create_index(field_name="vector", index_params=index_params)
print("index created")

collection.load()
print("collection loaded")
  2. Open the Attu dashboard
  3. Choose the collection and click "Data Preview"

The error shows "Param 'search_list_size'(20) is not in range [100, 4294967295] err %!w()"

Attu version:
Milvus v2.2.14
Attu v2.2.8

yhmo commented Dec 12, 2023

[Image: Unsaved Image 1]

yhmo commented Dec 12, 2023

The root cause is:
The disk index doc (https://milvus.io/docs/disk_index.md#Prerequisites) says the valid "search_list" range is [top_k, min(10 * top_k, 65535)] for top_k > 20 and [top_k, 200] for top_k <= 20.
But Attu passes search_list:20 and topk:100 to the server, so search_list is below the lower bound.

Changing search_list to 100 fixes this error.
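
For illustration, the range rule quoted from the doc can be sketched as a small helper. This is only a sketch of the rule as stated above, not Attu's actual fix; the function names (`valid_search_list_range`, `clamp_search_list`, `build_search_param`) are hypothetical. Note that the server error reports a different upper bound, so the exact limits may vary by Milvus version.

```python
from typing import Dict, Tuple

# Upper cap as quoted from the Milvus disk index doc (assumption: may differ by version).
MAX_SEARCH_LIST = 65535


def valid_search_list_range(top_k: int) -> Tuple[int, int]:
    """Return the (low, high) bounds for search_list per the rule quoted above."""
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    if top_k <= 20:
        return top_k, 200
    return top_k, min(10 * top_k, MAX_SEARCH_LIST)


def clamp_search_list(search_list: int, top_k: int) -> int:
    """Move search_list into the valid range for the given top_k."""
    low, high = valid_search_list_range(top_k)
    return max(low, min(search_list, high))


def build_search_param(metric_type: str, top_k: int, search_list: int) -> Dict:
    """Build a pymilvus-style search param dict with a clamped search_list."""
    return {
        "metric_type": metric_type,
        "params": {"search_list": clamp_search_list(search_list, top_k)},
    }


# The values Attu sends (search_list=20, topk=100) get raised to the lower bound.
param = build_search_param("L2", top_k=100, search_list=20)
print(param)
```

With the collection from the reproduction script loaded, such a dict could be passed as `collection.search(data=[query_vector], anns_field="vector", param=param, limit=100)`, where `query_vector` is any 128-dimensional list of floats.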

shanghaikid (Collaborator) commented

Yeah, I will fix this in the next release.

@shanghaikid shanghaikid self-assigned this Dec 12, 2023
@shanghaikid shanghaikid added the bug Something isn't working label Dec 12, 2023

naty88 commented Dec 12, 2023

Great, thanks!
Would it be possible to fix it in version v2.2.8, since it is more suitable for Milvus v2.2.14?
https://discordapp.com/channels/1160323594396635310/1183736071234793494/1183986773081198672

shanghaikid (Collaborator) commented

> Great, thanks! Would it be possible to fix it in version v2.2.8, since it is more suitable for Milvus v2.2.14? https://discordapp.com/channels/1160323594396635310/1183736071234793494/1183986773081198672

Why not just use the latest version? If you have any issues, just let me know and I will fix them.
