
Error when sharing environment from workspace to registry #39502

Closed
bastbu opened this issue Jan 31, 2025 · 4 comments
Assignees
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning needs-author-feedback Workflow: More information is needed from author to address the issue. Service Attention Workflow: This issue is responsible by Azure service team.

Comments


bastbu commented Jan 31, 2025

  • Package Name: azure-ai-ml
  • Package Version: 1.23.0
  • Operating System: Ubuntu
  • Python Version: 3.11.11

Describe the bug

We are trying to use the AML Registry to share components across workspaces. As indicated in the documentation for a secure AML Registry, we must build the environment in the workspace before we share it to the AML Registry.

When we try to onboard different components that share the same environment concurrently, this often results in:

AttributeError: 'AzureMachineLearningWorkspaces' object has no attribute 'workspaces'

To wait until the environment is built, we rely on an error message raised by the AML Registry, since the v2 SDK provides no way to wait for the environment build to complete. Is there another way to make sharing components via an AML Registry work when using the v2 SDK? We cannot publish the components directly to the AML Registry due to the bug reported here.

To Reproduce

I used the following script to reproduce the issue, and for me it consistently occurs after one or two tries. Make sure that the requirements.txt is modified between runs to avoid caching of an already built environment/image.

from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from time import sleep
from uuid import uuid4

from azure.ai.ml import MLClient
from azure.ai.ml.entities import BuildContext, Environment
from azure.core.exceptions import HttpResponseError


def wait_register_environment(
    *, ws_client: MLClient, registry_name: str, environment: Environment
) -> Environment:
    created = ws_client.environments.create_or_update(environment)

    while True:
        try:
            return ws_client.environments.share(
                name=created.name,
                version=created.version,
                share_with_name=created.name,
                share_with_version=created.version,
                registry_name=registry_name,
            )
        except HttpResponseError as exc_info:
            if "AssetNotReadyForPublishFromSource" in exc_info.message:
                print("Not yet ready, waiting...")
                sleep(5)
            else:
                raise


def main(ml_client: MLClient, registry_name: str):
    with ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(
                lambda: wait_register_environment(
                    ws_client=ml_client,
                    registry_name=registry_name,
                    environment=Environment(
                        name="share_env_repro",
                        version=uuid4().hex,
                        build=BuildContext(path=Path(__file__).parent / "environment"),
                    ),
                )
            )
            for _ in range(5)
        ]

    for future in futures:
        print(future.result())

With the following build context:

Dockerfile:

FROM python:3.11-bookworm

COPY requirements.txt .

RUN pip install -r requirements.txt

requirements.txt:

argparse

We share a single global MLClient instance across all threads, since MLClient was indicated to be immutable.

The full error is as follows:

Not yet ready, waiting...
Not yet ready, waiting...
Not yet ready, waiting...
Not yet ready, waiting...
Traceback (most recent call last):
  File "/home/vscode/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/runpy.py", line 198, in _run_module_as_main
    return _run_code(code, main_globals, None,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/runpy.py", line 88, in _run_code
    exec(code, run_globals)
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/__main__.py", line 71, in <module>
    cli.main()
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 501, in main
    run()
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher/../../debugpy/../debugpy/server/cli.py", line 351, in run_file
    runpy.run_path(target, run_name="__main__")
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 310, in run_path
    return _run_module_code(code, init_globals, run_name, pkg_name=pkg_name, script_name=fname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 127, in _run_module_code
    _run_code(code, mod_globals, init_globals, mod_name, mod_spec, pkg_name, script_name)
  File "/home/vscode/.vscode-server/extensions/ms-python.debugpy-2024.14.0-linux-x64/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_runpy.py", line 118, in _run_code
    exec(code, run_globals)
  File "/workspaces/repros/share_environment/main.py", line 81, in <module>
    main(dependencies.workspace_client(), dependencies.registry_name())
  File "/workspaces/repros/share_environment/main.py", line 77, in main
    print(future.result())
          ^^^^^^^^^^^^^^^
  File "/home/vscode/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/vscode/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/vscode/.local/share/uv/python/cpython-3.11.11-linux-x86_64-gnu/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/repros/share_environment/main.py", line 63, in <lambda>
    lambda: wait_register_environment(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/repros/share_environment/main.py", line 36, in wait_register_environment
    return ws_client.environments.share(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/.venv/lib/python3.11/site-packages/azure/ai/ml/_telemetry/activity.py", line 292, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/workspaces/.venv/lib/python3.11/site-packages/azure/ai/ml/_utils/_experimental.py", line 100, in wrapped
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/.venv/lib/python3.11/site-packages/azure/ai/ml/operations/_environment_operations.py", line 508, in share
    workspace = self._service_client.workspaces.get(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'AzureMachineLearningWorkspaces' object has no attribute 'workspaces'

Steps to reproduce the behavior:

  1. Run the code described above

Expected behavior

All environments should be successfully published to the AML Registry after a while.

@github-actions github-actions bot added customer-reported Issues that are reported by GitHub users external to the Azure organization. needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jan 31, 2025
@xiangyan99 xiangyan99 added bug This issue requires a change to an existing behavior in the product in order to be resolved. Machine Learning Service Attention Workflow: This issue is responsible by Azure service team. and removed needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Jan 31, 2025

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

@github-actions github-actions bot added needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team labels Jan 31, 2025
@kristapratico kristapratico added the Client This issue points to a problem in the data-plane of the library. label Feb 4, 2025
@achauhan-scc
Member

The current SDK implementation has a limitation: the share method overwrites the workspace service client with a registry client, which breaks multi-threaded applications.
As a workaround, pass a separate copy of MLClient to each thread.
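The workaround above can be sketched as follows. This is a minimal, hedged illustration of the "one client per thread" pattern only: `FakeMLClient` and `make_client` are hypothetical stand-ins (not part of azure-ai-ml) so the sketch runs without Azure credentials; in real code `make_client` would construct a fresh `MLClient(...)` per task instead of reusing a shared instance.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeMLClient:
    """Stand-in for azure.ai.ml.MLClient, used so this sketch runs offline.

    The pattern matters, not the class: share() mutates the client's
    internal service client, so a shared instance is unsafe under threads.
    """

    def __init__(self) -> None:
        self.environments = None  # would be EnvironmentOperations in the SDK


def make_client() -> FakeMLClient:
    # In real code this would construct a fresh client, e.g.:
    #   MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
    return FakeMLClient()


def register_with_own_client() -> FakeMLClient:
    client = make_client()  # one client per task, never shared across threads
    # ... create_or_update / share calls would go here ...
    return client


with ThreadPoolExecutor() as executor:
    futures = [executor.submit(register_with_own_client) for _ in range(5)]
    clients = [f.result() for f in futures]

# Each task constructed its own client, so no task can clobber
# another task's internal service-client state.
assert len({id(c) for c in clients}) == 5
```

Because every task builds its own client, the share method's client swap only affects that task's instance.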

@achauhan-scc achauhan-scc added needs-author-feedback Workflow: More information is needed from author to address the issue. and removed needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team labels Feb 4, 2025

github-actions bot commented Feb 4, 2025

Hi @bastbu. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

@bastbu
Author

bastbu commented Feb 4, 2025

Thanks @achauhan-scc for the workaround, will implement it and I'll report back/reopen if there are any issues.

@bastbu bastbu closed this as completed Feb 4, 2025