Fix embed not using cuda as default device when available 2.11 #941

Merged on Aug 20, 2024 (18 commits)
4 changes: 3 additions & 1 deletion .gitignore
@@ -149,4 +149,6 @@ dump.rdb
.DS_Store

# Tester app for unit tests
-scripts/vespa_local/vespa_tester_app.zip
+scripts/vespa_local/vespa_tester_app.zip
+
+src/marqo/tensor_search/cache_dir/*
10 changes: 10 additions & 0 deletions RELEASE.md
@@ -1,3 +1,8 @@
# Release 2.11.2

## Bug fixes and minor changes
- Fix an issue where CUDA was not automatically selected as the default device for the `embed` endpoint, even when available [#941](https://github.com/marqo-ai/marqo/pull/941).

# Release 2.11.1

## Bug fixes and minor changes
@@ -22,6 +27,11 @@
- Huge shoutout to all our 4.4k stargazers! We’ve come a long way as a team and as a community, so a huge thanks to everyone who continues to support Marqo.
- Feel free to keep on sharing questions and feedback on our [forum](https://community.marqo.ai/) and [Slack channel](https://marqo-community.slack.com/join/shared_invite/zt-2b4nsvbd2-TDf8agPszzWH5hYKBMIgDA#/shared-invite/email)! If you have any more inquiries or thoughts, please don’t hesitate to reach out.

# Release 2.10.2

## Bug fixes and minor changes
- Fix an issue where CUDA was not automatically selected as the default device for the `embed` endpoint, even when available [#941](https://github.com/marqo-ai/marqo/pull/941).

# Release 2.10.1

## Bug fixes and minor changes
6 changes: 3 additions & 3 deletions src/marqo/core/embed/embed.py
@@ -13,6 +13,7 @@
from marqo.tensor_search.tensor_search_logging import get_logger
from marqo.core.utils.prefix import determine_text_prefix, DeterminePrefixContentType
from marqo.vespa.vespa_client import VespaClient
from marqo.tensor_search import utils

logger = get_logger(__name__)

@@ -61,11 +62,10 @@ def embed_content(
        temp_config = config.Config(
            vespa_client=self.vespa_client,
        )

        # Set default device if not provided
        if device is None:
-            device = self.default_device
+            device = utils.read_env_vars_and_defaults("MARQO_BEST_AVAILABLE_DEVICE")

        # Content validation is done in API model layer
        t0 = timer()
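
The change above makes an unspecified device fall back to the `MARQO_BEST_AVAILABLE_DEVICE` environment variable instead of `self.default_device`. A minimal standalone sketch of that resolution order follows. It is not Marqo's implementation: `resolve_device` is a hypothetical helper, it reads the environment mapping directly rather than calling `utils.read_env_vars_and_defaults`, and the `"cpu"` fallback for an unset variable is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_device(
    device: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    fallback: str = "cpu",  # assumption: value used when the variable is unset
) -> str:
    """Hypothetical helper: an explicit device wins, otherwise use the env var."""
    if device is not None:
        # The caller asked for a specific device, so respect it.
        return device
    # No device given: use the best available device recorded in the environment.
    return env.get("MARQO_BEST_AVAILABLE_DEVICE", fallback)


if __name__ == "__main__":
    print(resolve_device("cuda:1"))
    print(resolve_device(None, env={"MARQO_BEST_AVAILABLE_DEVICE": "cuda"}))
```

Passing the environment as a parameter keeps the helper easy to test without patching `os.environ`.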
2 changes: 1 addition & 1 deletion src/marqo/version.py
@@ -1,4 +1,4 @@
-__version__ = "2.11.1"
+__version__ = "2.11.2"

def get_version() -> str:
    return f"{__version__}"
40 changes: 40 additions & 0 deletions tests/tensor_search/integ_tests/test_embed.py
@@ -20,6 +20,7 @@
from marqo.vespa.models.query_result import Root, Child, RootFields
from marqo.tensor_search.models.private_models import S3Auth, ModelAuth, HfAuth
from marqo.api.models.embed_request import EmbedRequest
from marqo.tensor_search import utils
import os
import pprint
import unittest
@@ -150,6 +151,45 @@ def tearDown(self) -> None:
        super().tearDown()
        self.device_patcher.stop()

    def test_embed_content_cuda_device_as_default(self):
        """
        Test that embed_content uses the default device when no device is specified.
        """
        for index in [self.unstructured_default_text_index, self.structured_default_text_index]:
            with self.subTest(index=index.type):
                expected_devices = ["cuda", "cpu"]
                for expected_device in expected_devices:
                    with patch.dict(os.environ, {"MARQO_BEST_AVAILABLE_DEVICE": expected_device}):
                        with patch('marqo.tensor_search.tensor_search.run_vectorise_pipeline') as mock_vectorise:
                            mock_vectorise.return_value = {0: [0.1, 0.2, 0.3]}

                            embed_res = embed(
                                marqo_config=self.config,
                                index_name=index.name,
                                embedding_request=EmbedRequest(
                                    content=["This is a test document"]
                                ),
                                device=None
                            )

                            # Check that run_vectorise_pipeline was called
                            mock_vectorise.assert_called_once()

                            # Get the arguments passed to run_vectorise_pipeline
                            args, kwargs = mock_vectorise.call_args

                            # Print the args and kwargs for debugging
                            print(f"args passed to run_vectorise_pipeline: {args}")
                            print(f"kwargs passed to run_vectorise_pipeline: {kwargs}")

                            # Check that the device passed to run_vectorise_pipeline matches the expected value
                            self.assertEqual(args[2], expected_device)

                            # Check the result
                            self.assertEqual(embed_res["content"], ["This is a test document"])
                            self.assertIsInstance(embed_res["embeddings"][0], list)
                            self.assertEqual(embed_res["embeddings"][0], [0.1, 0.2, 0.3])

    def test_embed_equivalent_to_add_docs(self):
        """
        Ensure that the embedding returned by embed endpoint matches the one created by add_docs.
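
The new test combines two `unittest.mock` techniques: `patch.dict` to override `MARQO_BEST_AVAILABLE_DEVICE` temporarily, and `call_args` to inspect which device reached the mocked pipeline. The sketch below shows the same pattern in isolation. Everything in it other than the environment variable name is a stand-in: `embed_sketch`, `device_seen_by_pipeline`, and the `"my-index"` argument are hypothetical and do not mirror Marqo's real signatures. It also accepts the device either positionally or by keyword, which is one possible way to make the `args[2]` check less sensitive to how the call is written.

```python
import os
from unittest.mock import MagicMock, patch


def embed_sketch(content, pipeline, device=None):
    """Hypothetical stand-in for an embed call that forwards a device."""
    if device is None:
        # Assumption: fall back to the env var, then to "cpu" if it is unset.
        device = os.environ.get("MARQO_BEST_AVAILABLE_DEVICE", "cpu")
    return pipeline(content, "my-index", device)


def device_seen_by_pipeline(env_value):
    """Run embed_sketch with the env var patched and report the device used."""
    pipeline = MagicMock(return_value={0: [0.1, 0.2, 0.3]})
    # patch.dict restores os.environ to its previous state on exit.
    with patch.dict(os.environ, {"MARQO_BEST_AVAILABLE_DEVICE": env_value}):
        embed_sketch(["This is a test document"], pipeline)
    pipeline.assert_called_once()
    args, kwargs = pipeline.call_args
    # Accept the device whether it was passed by keyword or as the third positional.
    positional = args[2] if len(args) > 2 else None
    return kwargs.get("device", positional)


if __name__ == "__main__":
    for expected in ["cuda", "cpu"]:
        assert device_seen_by_pipeline(expected) == expected
```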