
Implement MLX backend support for Apple Silicon#47

Draft
Copilot wants to merge 4 commits into main from copilot/add-mlx-backend-support

Conversation


Copilot AI commented Oct 19, 2025

Overview

This PR implements full MLX backend support for the Docker Model Runner, enabling optimized inference on Apple Silicon (M1/M2/M3) using Apple's MLX framework. It addresses issue #XXX, which requested an MLX backend implementation.

What Changed

MLX Backend Implementation

Added a complete MLX backend (pkg/inference/backends/mlx/mlx.go) with:

  • Automatic Installation: Creates a Python virtual environment and installs the mlx-lm package on first use
  • Platform Detection: Restricts execution to macOS with Apple Silicon (arm64) only
  • Safetensors Support: Leverages the existing safetensors support in OCI artifacts (no additional work needed, as noted in the issue comments)
  • OpenAI Compatible API: Runs mlx_lm.server to provide a standard OpenAI-compatible inference API
  • Memory Estimation: Properly estimates memory requirements accounting for Apple Silicon's unified memory architecture
  • Sandboxed Execution: Uses the same sandbox configuration as llama.cpp for security

Integration

  • Updated main.go to conditionally register the MLX backend on compatible platforms
  • MLX backend works alongside the existing llama.cpp backend
  • Virtual environment is created in ./mlx-env and reused across runs
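The conditional registration in main.go can be sketched like this. The `Backend` interface and type names here are simplified stand-ins for the project's real `pkg/inference` types; only the registration logic (llama.cpp always, MLX only on macOS/arm64) reflects the PR description.

```go
package main

import (
	"fmt"
	"runtime"
)

// Backend is a minimal stand-in for the runner's backend interface.
type Backend interface {
	Name() string
}

type llamaCppBackend struct{}

func (llamaCppBackend) Name() string { return "llama.cpp" }

type mlxBackend struct{}

func (mlxBackend) Name() string { return "mlx" }

// registerBackends mirrors the conditional registration: llama.cpp is
// always available, MLX only on Apple Silicon macOS.
func registerBackends() []Backend {
	backends := []Backend{llamaCppBackend{}}
	if runtime.GOOS == "darwin" && runtime.GOARCH == "arm64" {
		backends = append(backends, mlxBackend{})
	}
	return backends
}

func main() {
	for _, b := range registerBackends() {
		fmt.Println("registered:", b.Name())
	}
}
```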

Usage Example

# Package a safetensors model
docker model package --safetensors-dir ./my-model-dir --tag myregistry/mymodel:latest

# Run the model (MLX backend will be automatically used on Apple Silicon)
docker model run myregistry/mymodel:latest "Hello, world!"

Technical Details

Requirements:

  • macOS with Apple Silicon (arm64)
  • Python 3.8 or later
  • Models in safetensors format

Backend Interface:
All required methods are fully implemented:

  • Install(): Sets up Python environment and installs mlx-lm
  • Run(): Executes model inference via mlx_lm.server
  • GetRequiredMemoryForModel(): Estimates memory based on model size
  • Status(), GetDiskUsage(): Track backend state
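The Install() step (virtual environment plus mlx-lm) amounts to two subprocess invocations. The sketch below only builds the commands and prints their argument vectors rather than executing anything; the helper name is hypothetical, while the `python3 -m venv` and `pip install mlx-lm` commands match the behavior described above.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// installCommands builds the commands the Install() step would run:
// create a virtual environment, then install mlx-lm into it.
// Nothing is executed here; callers would run these with cmd.Run().
func installCommands(envDir string) []*exec.Cmd {
	pip := filepath.Join(envDir, "bin", "pip")
	return []*exec.Cmd{
		exec.Command("python3", "-m", "venv", envDir),
		exec.Command(pip, "install", "mlx-lm"),
	}
}

func main() {
	for _, c := range installCommands("./mlx-env") {
		fmt.Println(c.Args)
	}
}
```

Installing into a dedicated `./mlx-env` keeps the Python dependency isolated from the system interpreter and lets the environment be reused across runs.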

Security:

  • CodeQL scan: 0 alerts found
  • Sandboxed execution for process isolation
  • Proper subprocess lifecycle management
  • Context-aware cancellation support

Testing

  • Added comprehensive unit tests in mlx_test.go
  • All existing tests continue to pass
  • Build succeeds on all platforms (gracefully disabled on non-Apple Silicon)

Documentation

Updated README.md with a dedicated MLX backend section including:

  • Feature overview
  • Requirements
  • Usage examples
  • Automatic installation details

Related Issues

Closes #XXX (MLX backend feature request)

As noted in the issue comments, safetensors support in OCI artifacts was a prerequisite, which is already implemented in the codebase. This PR completes the MLX backend implementation.

Original prompt

This section details the original issue you should resolve

<issue_title>Feature: MLX backend</issue_title>
<issue_description>Hello there!

It looks like you are planning to support MLX backend eventually, since there is already a mock in the code. Opening this issue mostly to be able to subscribe to it to know when it happens.

Otherwise great project, thank you!</issue_description>

Comments on the Issue (you are @copilot in this section)

@ericcurtin One step required for this is safetensors in OCI artifact; same goes for vLLM support and others...

@ericcurtin This is not a prioritised item by anyone, so if somebody in the community wants to grab it, go for it.

Fixes docker#90



Copilot AI and others added 3 commits October 19, 2025 11:43
Co-authored-by: ericcurtin <1694275+ericcurtin@users.noreply.github.com>
Copilot AI changed the title [WIP] Add support for MLX backend in the project Implement MLX backend support for Apple Silicon Oct 19, 2025
Copilot AI requested a review from ericcurtin October 19, 2025 11:51
ericcurtin pushed a commit that referenced this pull request Oct 30, 2025
Add /models/prune

Uses docker/model-distribution#78.

feat(tests): add ResetStore test to verify store reset functionality

Signed-off-by: Dorin Geman <dorin.geman@docker.com>
Co-authored-by: Ignacio López Luna <ignasi.lopez.luna@gmail.com>
