Add maximum value validation for HNSW parameters #424

Merged: 2 commits, Apr 5, 2023

@Jeadie (Contributor) commented Apr 4, 2023

Changes

  • Validate that HNSW parameters cannot exceed configured maximum values.
    • For ef_construction, the maximum can be set via the environment variable MARQO_EF_CONSTRUCTION_MAXIMUM_VALUE.
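
A minimal sketch of the kind of check described above, assuming the limit is read from the environment at validation time. The function names, the fallback default of 4096, and the use of ValueError are illustrative assumptions, not the code merged in this PR.

```python
import os

# Env var name as given in the PR description.
EF_CONSTRUCTION_MAX_ENV = "MARQO_EF_CONSTRUCTION_MAXIMUM_VALUE"
# Hypothetical fallback used when the env var is unset; not taken from the PR.
ASSUMED_DEFAULT_MAX = 4096


def get_ef_construction_max() -> int:
    """Read the configured maximum, falling back to the assumed default."""
    raw = os.environ.get(EF_CONSTRUCTION_MAX_ENV)
    return int(raw) if raw is not None else ASSUMED_DEFAULT_MAX


def validate_ef_construction(value: int) -> int:
    """Return the value unchanged if it is within bounds, else raise ValueError."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise ValueError(f"ef_construction must be an integer, got {value!r}")
    maximum = get_ef_construction_max()
    if value < 1 or value > maximum:
        raise ValueError(
            f"ef_construction must be between 1 and {maximum}, got {value}"
        )
    return value
```

Reading the env var on each call, rather than once at import time, keeps the sketch easy to exercise in tests that change the limit.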

Testing

  • Unit tests
  • Local test
  • Upper limit testing

Local Test

export MARQO_EF_CONSTRUCTION_MAXIMUM_VALUE="2"

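Ensure the following fails:

import marqo

# Assumed local setup; adjust the URL and index name to your environment.
client = marqo.Client(url="http://localhost:8882")
index_name = "ef-construction-limit-test"

# Expected to fail: ef_construction (3) exceeds the configured maximum (2).
client.create_index(index_name, settings_dict={
    "index_defaults": {
        "ann_parameters": {
            "name": "hnsw",
            "space_type": "l2",
            "engine": "lucene",
            "parameters": {
                "ef_construction": 3,
                "m": 50  # acceptable value
            }
        },
    },
    "number_of_shards": 1
})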
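
The local test above expects index creation to be rejected because ef_construction (3) exceeds the configured maximum (2). The same expectation can be sketched without a running Marqo instance. Here `check_ann_parameters` and its `limits` argument are hypothetical stand-ins for the server-side validation, and the limit of 100 for m is an assumed placeholder.

```python
def check_ann_parameters(settings: dict, limits: dict) -> list:
    """Return one message per HNSW parameter that exceeds its limit."""
    params = (
        settings.get("index_defaults", {})
        .get("ann_parameters", {})
        .get("parameters", {})
    )
    violations = []
    for name, maximum in limits.items():
        value = params.get(name)
        if value is not None and value > maximum:
            violations.append(f"{name}={value} exceeds maximum {maximum}")
    return violations


# Same settings as the local test in the PR description.
settings = {
    "index_defaults": {
        "ann_parameters": {
            "name": "hnsw",
            "space_type": "l2",
            "engine": "lucene",
            "parameters": {"ef_construction": 3, "m": 50},
        },
    },
    "number_of_shards": 1,
}

# ef_construction limit mirrors the exported value; the m limit is assumed.
limits = {"ef_construction": 2, "m": 100}
violations = check_ann_parameters(settings, limits)
print(violations)  # ['ef_construction=3 exceeds maximum 2']
```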

Upper limit testing

  • Create index with ef_construction
  • Attempt to add 200 simple documents
  • Result: MARQO_EF_CONSTRUCTION_MAXIMUM_VALUE = 10,000,000 is a safe upper bound.
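
One way the "200 simple documents" step could be prepared is sketched below. The document shape, the helper names, and the batch size are assumptions for illustration; the PR does not show the actual test data or the ef_construction value used.

```python
def make_simple_documents(count: int = 200) -> list:
    """Build `count` small documents with a unique _id and one text field."""
    return [
        {"_id": f"doc-{i}", "title": f"Simple document {i}"}
        for i in range(count)
    ]


def batched(items: list, size: int):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


documents = make_simple_documents(200)

# Against a running Marqo instance, each batch would then be indexed,
# for example (assumed client and index name, not executed here):
#   for batch in batched(documents, 64):
#       client.index(index_name).add_documents(batch)
```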

@@ -101,7 +101,7 @@ class EnvVars:
MARQO_ENABLE_THROTTLING = "MARQO_ENABLE_THROTTLING"
MARQO_LOG_LEVEL = "MARQO_LOG_LEVEL"
MARQO_ROOT_PATH = "MARQO_ROOT_PATH"

EF_CONSTRUCTION_MAXIMUM_VALUE = "MARQO_EF_CONSTRUCTION_MAXIMUM_VALUE"
Collaborator

Can we keep the name of the variable and the value identical (in line with the other env vars)? It would be good to keep the pattern unless there's a pressing reason to change it.

Collaborator

Also, we can use MAX rather than MAXIMUM for conciseness.

Contributor Author

Oh sorry, I only saw the pattern on the RHS.
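
The naming pattern discussed in this review thread (attribute name identical to the env var string) can be checked mechanically. In the sketch below, the first three entries come from the diff context, while `MARQO_EF_CONSTRUCTION_MAX_VALUE` is an assumed name that applies both review suggestions; the name actually merged is not shown here. The helper `mismatched_env_vars` is hypothetical.

```python
class EnvVars:
    MARQO_ENABLE_THROTTLING = "MARQO_ENABLE_THROTTLING"
    MARQO_LOG_LEVEL = "MARQO_LOG_LEVEL"
    MARQO_ROOT_PATH = "MARQO_ROOT_PATH"
    # Assumed name: identical on both sides, with MAX instead of MAXIMUM.
    MARQO_EF_CONSTRUCTION_MAX_VALUE = "MARQO_EF_CONSTRUCTION_MAX_VALUE"


def mismatched_env_vars(cls) -> list:
    """Return the attribute names whose string value differs from the name."""
    return [
        name
        for name, value in vars(cls).items()
        if not name.startswith("_") and name != value
    ]


print(mismatched_env_vars(EnvVars))  # []
```

A check like this could run as a unit test so that a mismatch such as the one flagged above is caught before review.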

@pandu-k temporarily deployed to marqo-test-suite with GitHub Actions on April 5, 2023 00:10 (inactive)
@pandu-k merged commit 2a6d588 into mainline on Apr 5, 2023
@pandu-k deleted the jack/issue-206 branch on April 5, 2023 23:25