[FEATURE] `uv` provider #164

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

burtenshaw wants to merge 3 commits into main from uv-provider-for-hub

Collaborator

burtenshaw commented Nov 7, 2025 •

edited

Loading

This PR adds a uv provider to open env core which can be used to run envs on spaces without docker. The main intention here is to make a pure python option for notebook environments like colab.

!pip install git+https://huggingface.co/spaces/burtenshaw/echo-cli

from envs.echo_env.client import EchoEnv, EchoAction

# Pulls `openenv/echo_env` from the Hub and runs it via `uv run`.
client = EchoEnv.from_hub(
    "burtenshaw/echo-cli",
    host="127.0.0.1",
    port=8016,
    reload=False,
    use_docker=False,
)

# use it
reset_result = client.reset()
step_result = client.step(EchoAction(message="Hello from the Hub!"))
print("Echoed:", step_result.observation.echoed_message)
print("Reward:", step_result.reward)

# close it
client.close()  # stops the uv subprocess cleanly

The above example, depends on this specific env, which is the result of #160 .

burtenshaw added 3 commits

November 7, 2025 09:51


          implement a uv subprocess provider based on docker provider

939de3c


          add uv provider to http client

1108d8d


          expose uv provider

01a5835

meta-cla bot added the CLA Signed label

Collaborator Author

burtenshaw commented Nov 7, 2025

@Wauplin is the maintain of hf and he agreed to take a look at this PR.

burtenshaw requested review from pankit-eng and zkwentz

November 7, 2025 09:31

Wauplin reviewed

View reviewed changes

Wauplin left a comment

Hey @burtenshaw , thanks for the ping! Happy to review a PR :) I've left quite a lot of comments on how I would implement things if starting the project right now. Feel free to ignore when not applicable or if I did not take into account other relevant parts of the library. In general my main comment is that I always tend to reduce the number of available parameters ("if no-one asked for it, let's not implement it") and isolate logic ("let's not have Hub-logic or docker-logic in uv provider").

Given that it's the second provider implemented, I think it's good to take some time to think about which API makes the most sense for the future of the library.

Let me know if you have any questions / comments 🤗

src/core/containers/runtime/uv_provider.py

    
              import requests

              from .providers import ContainerProvider

Wauplin Nov 7, 2025

Since uv is not per-se a container, I would rename the abstract class RuntimeProvider (or simply Runtime?) and the abstract methods start, stop, and wait_for_ready (instead of start_container, etc.). I feel that semantically it would be more accurate.

(just my 2c, feel free to ignore 🤗 )

src/core/containers/runtime/uv_provider.py

Comment on lines +23 to +27

    
                          response = requests.get(health_url, timeout=2.0)

                          if response.status_code == 200:

                              return

                      except requests.RequestException:

                          pass

Wauplin Nov 7, 2025

Suggested change

      
                        response = requests.get(health_url, timeout=2.0)
          
                        if response.status_code == 200:
          
                            return
          
                    except requests.RequestException:
          
                        pass
          
                        timeout = max(0.0001, min(deadline - time.time(), 2.0))
          
                        response = requests.get(health_url, timeout=timeout)
          
                        if response.status_code == 200:
          
                            return
          
                    except requests.RequestException:
          
                        continue

(nit) to avoid waiting 2s + 0.5s if timeout_s is less than 2s.

src/core/containers/runtime/uv_provider.py

    
              @dataclass

              class UVProvider(ContainerProvider):

Wauplin Nov 7, 2025

Suggested change

      
            class UVProvider(ContainerProvider):
          
            class UVRuntime(RuntimeProvider):

(naming suggestion to be aligned with the "runtime/" folder)

src/core/containers/runtime/uv_provider.py

    
                  host: str = "0.0.0.0"

                  port: Optional[int] = None

                  reload: bool = False

                  project_url: Optional[str] = None

Wauplin Nov 7, 2025

What is project_url for compared to repo_id? I find this redundancy a bit confusing. What about having a single project: str argument which will then be forwarded to uv? This project can be either a local path or a git+... url.

If we do so, UVProvider becomes independent from HF Hub and all the Hub-related logic is delegated to HTTPEnvClient.fromHub (i.e. we build the project url based on repo_id in fromHub and that's it).

src/core/containers/runtime/uv_provider.py

    
                  return command

              @dataclass

Wauplin Nov 7, 2025

Personal opinion but I feel that using a dataclass for a subprocess runtime is not very Pythonic (and not aligned with ContainerProvider / LocalDockerProvider). Maybe better to remove it and define a proper __init__ method?

src/core/containers/runtime/uv_provider.py

Comment on lines +114 to +116

    
                      client_host = self.connect_host or (

                          "127.0.0.1" if self.host in {"0.0.0.0", "::"} else self.host

                      )

Wauplin Nov 7, 2025

Is there a big difference between host and a connect_host? Like in other comments, I would suggest exposing a single argument to avoid bloated signature.

src/core/http_env_client.py

    
                      connect_host: Optional[str] = None,

                      extra_env: Optional[Dict[str, str]] = None,

                      **provider_kwargs: Any,

                  ) -> EnvClientT:

Wauplin Nov 7, 2025

I find this signature very hard to understand without context since it's mixing kwargs for docker and for uv. Also I don't think one needs use_docker, provider, and runner.

Wauplin Nov 7, 2025

My suggestion would be to remove provider and runner. Also remove project_url and connect_host. Have a single env (instead of env for docker and extra_env for uv). And use some typed dict + typing.overload so that IDEs can have correct autocompletion. Here is a simplified example:

from typing import Any, Dict, NotRequired, TypedDict, Unpack, overload


class DockerKwargs(TypedDict, total=False):
    tag: NotRequired[str]
    env: NotRequired[Dict[str, str]]


class UVProvider(TypedDict):
    host: NotRequired[str]
    port: NotRequired[int]
    reload: NotRequired[bool]
    timeout_s: NotRequired[float]
    env: NotRequired[Dict[str, str]]


@overload
def from_hub(repo_id: str, *, use_docker: bool = True, **kwargs: Unpack[DockerKwargs]) -> str: ...


@overload
def from_hub(repo_id: str, *, use_docker: bool = False, **kwargs: Unpack[UVProvider]) -> str: ...


def from_hub(repo_id: str, *, use_docker: bool = False, **kwargs: Any) -> str:
    raise NotImplementedError()

Wauplin Nov 7, 2025

Note that I'm not a fan of overloads and typed dict but it's the only solution I see to correctly document the signature while keeping a single method.

Another solution is to have from_hub_docker and from_hub_uv (more explicit but less elegant)

src/core/http_env_client.py

    
                          tag = provider_kwargs.pop("tag", "latest")

                          image = provider_kwargs.pop(

                              "image",

Wauplin Nov 7, 2025

Let's not have an image option in from_hub (if user wants to define their own image, they should use from_docker_image)

src/core/http_env_client.py

Comment on lines +159 to +170

    
                      base_url = uv_runner.start_container(

                          repo_id,

                          port=port,

                          env_vars=env_vars,

                          **non_docker_kwargs,

                      )

                      try:

                          uv_runner.wait_for_ready(base_url, timeout_s=timeout_s)

                      except Exception:

                          uv_runner.stop_container()

                          raise

Wauplin Nov 7, 2025

Once the runner is defined and instantiated, I feel that the start and wait calls should be made in the __init__ instead. This way you get the same behavior both in from_hub and from_docker. With current implementation, one stops the runner if failing to start, the other don't (which is not consistent).

src/core/http_env_client.py

Comment on lines +146 to +154

    
                      uv_runner = runner or UVProvider(

                          repo_id=repo_id,

                          host=host,

                          port=port,

                          reload=reload,

                          project_url=project_url,

                          connect_host=connect_host,

                          extra_env=extra_env,

                      )

Wauplin Nov 7, 2025

Suggested change

      
                    uv_runner = runner or UVProvider(
          
                        repo_id=repo_id,
          
                        host=host,
          
                        port=port,
          
                        reload=reload,
          
                        project_url=project_url,
          
                        connect_host=connect_host,
          
                        extra_env=extra_env,
          
                    )
          
                    runner = UVProvider(
          
                        project=f"git+https://huggingface.co/spaces/{repo_id}",
          
                        host=host,
          
                        port=port,
          
                        reload=reload,
          
                        env=env,
          
                    )

With all the suggestions above, that would be the runner call (i.e. provide a project url, remove connect_host, remove project_url, rename extra_env to env, and remove runner). I don't think final user loose much in this simplification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels