pyscript · Mar 4, 2025
diff --git a/‎docs/api.md
Lines changed: 113 additions & 0 deletions b/‎docs/api.md
Lines changed: 113 additions & 0 deletions
diff --git a/‎docs/user-guide/media.md
Lines changed: 372 additions & 0 deletions b/‎docs/user-guide/media.md
Lines changed: 372 additions & 0 deletions
@@ -423,6 +423,119 @@ Such named modules will always then be available under the
     Please see the documentation (linked above) about restrictions and gotchas
     when configuring how JavaScript modules are made available to PyScript.
 
+### `pyscript.media`
+
+The `pyscript.media` namespace provides classes and functions for interacting
+with media devices and streams in a web browser. This module enables you to work
+with cameras, microphones, and other media input/output devices directly from
+Python code.
+
+#### `pyscript.media.Device`
+
+A class that represents a media input or output device, such as a microphone,
+camera, or headset.
+
+```python title="Creating a Device object"
+from pyscript.media import Device, list_devices
+
+# List all available media devices
+devices = await list_devices()
+# Get the first available device
+my_device = devices[0]
+```
+
+The `Device` class has the following properties:
+
+* `id` - a unique string identifier for the represented device.
+* `group` - a string group identifier for devices belonging to the same physical device.
+* `kind` - an enumerated value: "videoinput", "audioinput", or "audiooutput".
+* `label` - a string describing the device (e.g., "External USB Webcam").
+
+The `Device` class also provides the following methods:
+
+##### `Device.load(audio=False, video=True)`
+
+A class method that loads a media stream with the specified options.
+
+```python title="Loading a media stream"
+# Load a video stream (default)
+stream = await Device.load()
+
+# Load an audio stream only
+stream = await Device.load(audio=True, video=False)
+
+# Load with specific video constraints
+stream = await Device.load(video={"width": 1280, "height": 720})
+```
+
+Parameters:
+* `audio` (bool, default: False) - Whether to include audio in the stream.
+* `video` (bool or dict, default: True) - Whether to include video in the
+  stream. Can also be a dictionary of video constraints.
+
+Returns:
+* A media stream object that can be used with HTML media elements.
+
+##### `get_stream()`
+
+An instance method that gets a media stream from this specific device.
+
+```python title="Getting a stream from a specific device"
+# Find a video input device
+video_devices = [d for d in devices if d.kind == "videoinput"]
+if video_devices:
+    # Get a stream from the first video device
+    stream = await video_devices[0].get_stream()
+```
+
+Returns:
+* A media stream object from the specific device.
+
+#### `pyscript.media.list_devices()`
+
+An async function that returns a list of all currently available media input and
+output devices.
+
+```python title="Listing all media devices"
+from pyscript.media import list_devices
+
+devices = await list_devices()
+for device in devices:
+    print(f"Device: {device.label}, Kind: {device.kind}")
+```
+
+Returns:
+* A list of `Device` objects representing the available media devices.
+
+!!! Note
+
+    The returned list will omit any devices that are blocked by the document
+    Permission Policy or for which the user has not granted permission.
+
+### Simple Example
+
+```python title="Basic camera access"
+from pyscript import document
+from pyscript.media import Device
+
+async def init_camera():
+    # Get a video stream
+    stream = await Device.load(video=True)
+    
+    # Set the stream as the source for a video element
+    video_el = document.getElementById("camera")
+    video_el.srcObject = stream
+    
+# Initialize the camera
+init_camera()
+```
+
+!!! warning
+
+    Using media devices requires appropriate permissions from the user.
+    Browsers will typically show a permission dialog when `list_devices()` or
+    `Device.load()` is called.
+
 ### `pyscript.storage`
 
 The `pyscript.storage` API wraps the browser's built-in
 
@@ -0,0 +1,372 @@
+# PyScript and Media Devices
+
+For web applications to interact with cameras, microphones, and other media
+devices, there needs to be a way to access these hardware components through the
+browser. PyScript provides a media API that enables your Python code to interact
+with media devices directly from the browser environment.
+
+This section explains how PyScript interacts with media devices and how you can
+use these capabilities in your applications.
+
+## Media Device Access
+
+PyScript interacts with media devices through the browser's [MediaDevices
+API](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices). This API
+provides access to connected media input devices like cameras and microphones,
+as well as output devices like speakers.
+
+When using PyScript's media API, it's important to understand:
+
+1. Media access requires **explicit user permission**. The browser will show a
+   permission dialog when your code attempts to access cameras or microphones.
+2. Media access is only available in **secure contexts** (HTTPS or localhost).
+3. All media interactions happen within the **browser's sandbox**, following the
+   browser's security policies.
+
+## The `pyscript.media` API
+
+PyScript provides a Pythonic interface to media devices through the
+`pyscript.media` namespace. This API includes two main components:
+
+1. The `Device` class - represents a media device and provides methods to
+   interact with it
+2. The `list_devices()` function - discovers available media devices
+
+### Listing Available Devices
+
+To discover what media devices are available, use the `list_devices()` function:
+
+```python
+from pyscript.media import list_devices
+
+async def show_available_devices():
+    devices = await list_devices()
+    for device in devices:
+        print(f"Device: {device.label}, Type: {device.kind}, ID: {device.id}")
+
+# List all available devices
+show_available_devices()
+```
+
+This function returns a list of `Device` objects, each representing a media
+input or output device. Note that the browser will typically request permission
+before providing this information.
+
+### Working with the Camera
+
+The most common use case is accessing the camera to display a video stream:
+
+```python
+from pyscript import when
+from pyscript.media import Device
+from pyscript.web import page
+
+async def start_camera():
+    # Get a video stream (defaults to video only, no audio)
+    stream = await Device.load(video=True)
+    
+    # Connect the stream to a video element in your HTML
+    video_element = page["#camera"][0]._dom_element
+    video_element.srcObject = stream
+    
+    return stream
+
+# Start the camera
+camera_stream = start_camera()
+```
+
+The `Device.load()` method is a convenient way to access media devices without
+first listing all available devices. You can specify options to control which
+camera is used:
+
+```python
+# Prefer the environment-facing camera (often the back camera on mobile)
+stream = await Device.load(video={"facingMode": "environment"})
+
+# Prefer the user-facing camera (often the front camera on mobile)
+stream = await Device.load(video={"facingMode": "user"})
+
+# Request specific resolution
+stream = await Device.load(video={
+    "width": {"ideal": 1280},
+    "height": {"ideal": 720}
+})
+```
+
+### Capturing Images from the Camera
+
+To capture a still image from the video stream:
+
+```python
+def capture_image(video_element):
+    # Get the video dimensions
+    width = video_element.videoWidth
+    height = video_element.videoHeight
+    
+    # Create a canvas to capture the frame
+    canvas = document.createElement("canvas")
+    canvas.width = width
+    canvas.height = height
+    
+    # Draw the current video frame to the canvas
+    ctx = canvas.getContext("2d")
+    ctx.drawImage(video_element, 0, 0, width, height)
+    
+    # Get the image as a data URL
+    image_data = canvas.toDataURL("image/png")
+    
+    return image_data
+```
+
+For applications that need to process images with libraries like OpenCV, you
+need to convert the image data to a format these libraries can work with:
+
+```python
+import numpy as np
+import cv2
+
+def process_frame_with_opencv(video_element):
+    # Get video dimensions
+    width = video_element.videoWidth
+    height = video_element.videoHeight
+    
+    # Create a canvas and capture the frame
+    canvas = document.createElement("canvas")
+    canvas.width = width
+    canvas.height = height
+    ctx = canvas.getContext("2d")
+    ctx.drawImage(video_element, 0, 0, width, height)
+    
+    # Get the raw pixel data
+    image_data = ctx.getImageData(0, 0, width, height).data
+    
+    # Convert to numpy array for OpenCV
+    frame = np.asarray(image_data, dtype=np.uint8).reshape((height, width, 4))
+    
+    # Convert from RGBA to BGR (OpenCV's default format)
+    frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGBA2BGR)
+    
+    # Process the image with OpenCV
+    # ...
+    
+    return frame_bgr
+```
+
+### Managing Camera Resources
+
+It's important to properly manage media resources, especially when your
+application no longer needs them. Cameras and microphones are shared resources,
+and failing to release them can impact other applications or cause unexpected
+behavior.
+
+### Stopping the Camera
+
+To stop the camera and release resources:
+
+```python
+from pyscript.web import page
+
+def stop_camera(stream):
+    # Stop all tracks on the stream
+    if stream:
+        tracks = stream.getTracks()
+        for track in tracks:
+            track.stop()
+        
+        # Clear the video element's source
+        video_element = page["#camera"][0]._dom_element
+        if video_element:
+            video_element.srcObject = None
+```
+
+### Switching Between Cameras
+
+For devices with multiple cameras, you can implement camera switching:
+
+```python
+from pyscript.media import Device, list_devices
+from pyscript.web import page
+
+class CameraManager:
+    def __init__(self):
+        self.cameras = []
+        self.current_index = 0
+        self.active_stream = None
+        self.video_element = page["#camera"][0]._dom_element
+    
+    async def initialize(self):
+        # Get all video input devices
+        devices = await list_devices()
+        self.cameras = [d for d in devices if d.kind == "videoinput"]
+        
+        # Start with the first camera
+        if self.cameras:
+            await self.start_camera(self.cameras[0].id)
+    
+    async def start_camera(self, device_id=None):
+        # Stop any existing stream
+        await self.stop_camera()
+        
+        # Start a new stream
+        video_options = (
+            {"deviceId": {"exact": device_id}} if device_id 
+            else {"facingMode": "environment"}
+        )
+        self.active_stream = await Device.load(video=video_options)
+        
+        # Connect to the video element
+        if self.video_element:
+            self.video_element.srcObject = self.active_stream
+    
+    async def stop_camera(self):
+        if self.active_stream:
+            tracks = self.active_stream.getTracks()
+            for track in tracks:
+                track.stop()
+            self.active_stream = None
+            
+            if self.video_element:
+                self.video_element.srcObject = None
+    
+    async def switch_camera(self):
+        if len(self.cameras) <= 1:
+            return
+        
+        # Move to the next camera
+        self.current_index = (self.current_index + 1) % len(self.cameras)
+        await self.start_camera(self.cameras[self.current_index].id)
+```
+
+## Working with Audio
+
+In addition to video, the PyScript media API can access audio inputs:
+
+```python
+# Get access to the microphone (audio only)
+audio_stream = await Device.load(audio=True, video=False)
+
+# Get both audio and video
+av_stream = await Device.load(audio=True, video=True)
+```
+
+## Best Practices
+
+When working with media devices in PyScript, follow these best practices:
+
+### Permissions and User Experience
+
+1. **Request permissions contextually**:
+   - Only request camera/microphone access when needed
+   - Explain to users why you need access before requesting it
+   - Provide fallback options when permissions are denied
+
+2. **Clear user feedback**:
+   - Indicate when the camera is active
+   - Provide controls to pause/stop the camera
+   - Show loading states while the camera is initializing
+
+### Resource Management
+
+1. **Always clean up resources**:
+   - Stop media tracks when they're not needed
+   - Clear `srcObject` references from video elements
+   - Be especially careful in single-page applications
+
+2. **Handle errors gracefully**:
+   - Catch exceptions when requesting media access
+   - Provide meaningful error messages
+   - Offer alternatives when media access fails
+
+### Performance Optimization
+
+1. **Match resolution to needs**:
+   - Use lower resolutions when possible
+   - Consider mobile device limitations
+   - Adjust video constraints based on the device
+
+2. **Optimize image processing**:
+   - Process frames on demand rather than continuously
+   - Use efficient algorithms
+   - Consider downsampling for faster processing
+
+## Example Application: Simple Camera Capture
+
+Here's a simplified example that shows how to capture and display images from a
+camera:
+
+```python
+from pyscript import when, window
+from pyscript.media import Device
+from pyscript.web import page
+
+class CameraCapture:
+    def __init__(self):
+        # Get UI elements
+        self.video = page["#camera"][0]
+        self.video_element = self.video._dom_element
+        self.capture_button = page["#capture-button"]
+        self.snapshot = page["#snapshot"][0]
+        
+        # Start camera
+        self.initialize_camera()
+    
+    async def initialize_camera(self):
+        # Prefer environment-facing camera on mobile devices
+        stream = await Device.load(video={"facingMode": "environment"})
+        self.video_element.srcObject = stream
+    
+    def take_snapshot(self):
+        """Capture a frame from the camera and display it"""
+        # Get video dimensions
+        width = self.video_element.videoWidth
+        height = self.video_element.videoHeight
+        
+        # Create canvas and capture frame
+        canvas = window.document.createElement("canvas")
+        canvas.width = width
+        canvas.height = height
+        
+        # Draw the current video frame to the canvas
+        ctx = canvas.getContext("2d")
+        ctx.drawImage(self.video_element, 0, 0, width, height)
+        
+        # Convert the canvas to a data URL and display it
+        image_data_url = canvas.toDataURL("image/png")
+        self.snapshot.setAttribute("src", image_data_url)
+
+# HTML structure needed:
+# <video id="camera" autoplay playsinline></video>
+# <button id="capture-button">Take Photo</button>
+# <img id="snapshot">
+
+# Usage:
+# camera = CameraCapture()
+# 
+# @when("click", "#capture-button")
+# def handle_capture(event):
+#     camera.take_snapshot()
+```
+
+This example demonstrates:
+- Initializing a camera with the PyScript media API
+- Accessing the camera stream and displaying it in a video element
+- Capturing a still image from the video stream when requested
+- Converting the captured frame to an image that can be displayed
+
+This simple pattern can serve as the foundation for various camera-based
+applications and can be extended with image processing libraries as needed for
+more complex use cases.
+
+
+## Conclusion
+
+The PyScript media API provides a powerful way to access and interact with
+cameras and microphones directly from Python code running in the browser. By
+following the patterns and practices outlined in this guide, you can build
+sophisticated media applications while maintaining good performance and user
+experience.
+
+Remember that media access is a sensitive permission that requires user consent
+and should be used responsibly. Always provide clear indications when media
+devices are active and ensure proper cleanup of resources when they're no longer
+needed.