
Commit 76435dc

Authored Mar 4, 2025
Merge pull request #176 from danyeaw/add-media-docs
Add docs for the media API
2 parents 9dad36a + e14e54b commit 76435dc


2 files changed

+485
-0
lines changed


‎docs/api.md

Lines changed: 113 additions & 0 deletions
@@ -423,6 +423,119 @@ Such named modules will always then be available under the
423423
Please see the documentation (linked above) about restrictions and gotchas
424424
when configuring how JavaScript modules are made available to PyScript.
425425

426+
### `pyscript.media`
427+
428+
The `pyscript.media` namespace provides classes and functions for interacting
429+
with media devices and streams in a web browser. This module enables you to work
430+
with cameras, microphones, and other media input/output devices directly from
431+
Python code.
432+
433+
#### `pyscript.media.Device`
434+
435+
A class that represents a media input or output device, such as a microphone,
436+
camera, or headset.
437+
438+
```python title="Creating a Device object"
from pyscript.media import Device, list_devices

# List all available media devices
devices = await list_devices()
# Get the first available device
my_device = devices[0]
```
446+
447+
The `Device` class has the following properties:
448+
449+
* `id` - a unique string identifier for the represented device.
450+
* `group` - a string group identifier for devices belonging to the same physical device.
451+
* `kind` - an enumerated value: "videoinput", "audioinput", or "audiooutput".
452+
* `label` - a string describing the device (e.g., "External USB Webcam").
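
As an illustration of how these properties might be used, the sketch below groups device-like objects by their `kind`. The `group_by_kind` helper and the sample objects are hypothetical stand-ins (they are not part of `pyscript.media`), so the snippet runs outside the browser; in a PyScript page the list would presumably come from `await list_devices()`.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_kind(devices):
    """Group device-like objects by their `kind` attribute."""
    grouped = defaultdict(list)
    for device in devices:
        grouped[device.kind].append(device)
    return dict(grouped)


# Hypothetical stand-ins mirroring the four documented properties.
sample = [
    SimpleNamespace(id="a1", group="g1", kind="videoinput", label="External USB Webcam"),
    SimpleNamespace(id="b2", group="g1", kind="audioinput", label="Webcam Microphone"),
    SimpleNamespace(id="c3", group="g2", kind="audiooutput", label="Speakers"),
]

by_kind = group_by_kind(sample)
print([device.label for device in by_kind["videoinput"]])
```

Devices that share a `group` value (here `g1`) belong to the same physical device, which can help pair a webcam with its built-in microphone.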
453+
454+
The `Device` class also provides the following methods:
455+
456+
##### `Device.load(audio=False, video=True)`
457+
458+
A class method that loads a media stream with the specified options.
459+
460+
```python title="Loading a media stream"
from pyscript.media import Device

# Load a video stream (default)
stream = await Device.load()

# Load an audio stream only
stream = await Device.load(audio=True, video=False)

# Load with specific video constraints
stream = await Device.load(video={"width": 1280, "height": 720})
```
470+
471+
Parameters:
* `audio` (bool, default: False) - Whether to include audio in the stream.
* `video` (bool or dict, default: True) - Whether to include video in the
  stream. Can also be a dictionary of video constraints.

Returns:
* A media stream object that can be used with HTML media elements.
478+
479+
##### `get_stream()`
480+
481+
An instance method that gets a media stream from this specific device.
482+
483+
```python title="Getting a stream from a specific device"
# Find a video input device
video_devices = [d for d in devices if d.kind == "videoinput"]
if video_devices:
    # Get a stream from the first video device
    stream = await video_devices[0].get_stream()
```
490+
491+
Returns:
492+
* A media stream object from the specific device.
493+
494+
#### `pyscript.media.list_devices()`
495+
496+
An async function that returns a list of all currently available media input and
497+
output devices.
498+
499+
```python title="Listing all media devices"
from pyscript.media import list_devices

devices = await list_devices()
for device in devices:
    print(f"Device: {device.label}, Kind: {device.kind}")
```
506+
507+
Returns:
508+
* A list of `Device` objects representing the available media devices.
509+
510+
!!! Note

    The returned list will omit any devices that are blocked by the document
    Permissions Policy or for which the user has not granted permission.
514+
515+
### Simple Example
516+
517+
```python title="Basic camera access"
from pyscript import document
from pyscript.media import Device

async def init_camera():
    # Get a video stream
    stream = await Device.load(video=True)

    # Set the stream as the source for a video element
    video_el = document.getElementById("camera")
    video_el.srcObject = stream

# Initialize the camera (the coroutine only runs once it is awaited)
await init_camera()
```
532+
533+
!!! warning

    Using media devices requires appropriate permissions from the user.
    Browsers will typically show a permission dialog when `list_devices()` or
    `Device.load()` is called.
538+
426539
### `pyscript.storage`
427540

428541
The `pyscript.storage` API wraps the browser's built-in

‎docs/user-guide/media.md

Lines changed: 372 additions & 0 deletions
@@ -0,0 +1,372 @@
1+
# PyScript and Media Devices
2+
3+
For web applications to interact with cameras, microphones, and other media
4+
devices, there needs to be a way to access these hardware components through the
5+
browser. PyScript provides a media API that enables your Python code to interact
6+
with media devices directly from the browser environment.
7+
8+
This section explains how PyScript interacts with media devices and how you can
9+
use these capabilities in your applications.
10+
11+
## Media Device Access
12+
13+
PyScript interacts with media devices through the browser's [MediaDevices
14+
API](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices). This API
15+
provides access to connected media input devices like cameras and microphones,
16+
as well as output devices like speakers.
17+
18+
When using PyScript's media API, it's important to understand:
19+
20+
1. Media access requires **explicit user permission**. The browser will show a
   permission dialog when your code attempts to access cameras or microphones.
2. Media access is only available in **secure contexts** (HTTPS or localhost).
3. All media interactions happen within the **browser's sandbox**, following the
   browser's security policies.
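
The secure-context requirement can be checked before any media call is made. The following is a minimal sketch, assuming a `window`-like object that exposes the standard browser properties `isSecureContext` and `navigator.mediaDevices`. The `media_supported` helper is a hypothetical name, and the stand-in objects exist only so the snippet runs outside a browser.

```python
from types import SimpleNamespace


def media_supported(win):
    """Return True when `win` looks able to provide media devices."""
    # Browsers only expose media devices in secure contexts.
    if not getattr(win, "isSecureContext", False):
        return False
    navigator = getattr(win, "navigator", None)
    return getattr(navigator, "mediaDevices", None) is not None


# Hypothetical stand-ins for `pyscript.window` on an HTTPS and an HTTP page.
secure_page = SimpleNamespace(
    isSecureContext=True,
    navigator=SimpleNamespace(mediaDevices=object()),
)
insecure_page = SimpleNamespace(isSecureContext=False, navigator=SimpleNamespace())

print(media_supported(secure_page), media_supported(insecure_page))
```

In a PyScript page, the object passed in would presumably be `window` imported from `pyscript`.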
25+
26+
## The `pyscript.media` API
27+
28+
PyScript provides a Pythonic interface to media devices through the
29+
`pyscript.media` namespace. This API includes two main components:
30+
31+
1. The `Device` class - represents a media device and provides methods to
   interact with it
2. The `list_devices()` function - discovers available media devices
34+
35+
### Listing Available Devices
36+
37+
To discover what media devices are available, use the `list_devices()` function:
38+
39+
```python
from pyscript.media import list_devices

async def show_available_devices():
    devices = await list_devices()
    for device in devices:
        print(f"Device: {device.label}, Type: {device.kind}, ID: {device.id}")

# List all available devices (await the coroutine so that it actually runs)
await show_available_devices()
```
50+
51+
This function returns a list of `Device` objects, each representing a media
52+
input or output device. Note that the browser will typically request permission
53+
before providing this information.
54+
55+
### Working with the Camera
56+
57+
The most common use case is accessing the camera to display a video stream:
58+
59+
```python
from pyscript.media import Device
from pyscript.web import page

async def start_camera():
    # Get a video stream (defaults to video only, no audio)
    stream = await Device.load(video=True)

    # Connect the stream to a video element in your HTML
    video_element = page["#camera"][0]._dom_element
    video_element.srcObject = stream

    return stream

# Start the camera; awaiting yields the stream rather than a coroutine object
camera_stream = await start_camera()
```
77+
78+
The `Device.load()` method is a convenient way to access media devices without
79+
first listing all available devices. You can specify options to control which
80+
camera is used:
81+
82+
```python
# Prefer the environment-facing camera (often the back camera on mobile)
stream = await Device.load(video={"facingMode": "environment"})

# Prefer the user-facing camera (often the front camera on mobile)
stream = await Device.load(video={"facingMode": "user"})

# Request specific resolution
stream = await Device.load(video={
    "width": {"ideal": 1280},
    "height": {"ideal": 720}
})
```
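
When several of these options are combined, a small helper can keep the constraint dictionaries consistent. This is only a sketch: `video_constraints` is a hypothetical helper, not part of `pyscript.media`, and it assumes that `ideal` values are wanted for the resolution, as in the example above.

```python
def video_constraints(width=None, height=None, facing_mode=None):
    """Build a value suitable for the `video` argument of `Device.load()`."""
    constraints = {}
    if width is not None:
        constraints["width"] = {"ideal": width}
    if height is not None:
        constraints["height"] = {"ideal": height}
    if facing_mode is not None:
        constraints["facingMode"] = facing_mode
    # With nothing requested, fall back to plain `True` (any camera).
    return constraints or True


print(video_constraints(width=1280, height=720, facing_mode="environment"))
```

The result could then be passed along as `await Device.load(video=video_constraints(...))`.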
95+
96+
### Capturing Images from the Camera
97+
98+
To capture a still image from the video stream:
99+
100+
```python
from pyscript import document

def capture_image(video_element):
    # Get the video dimensions
    width = video_element.videoWidth
    height = video_element.videoHeight

    # Create a canvas to capture the frame
    canvas = document.createElement("canvas")
    canvas.width = width
    canvas.height = height

    # Draw the current video frame to the canvas
    ctx = canvas.getContext("2d")
    ctx.drawImage(video_element, 0, 0, width, height)

    # Get the image as a data URL
    image_data = canvas.toDataURL("image/png")

    return image_data
```
120+
121+
For applications that need to process images with libraries like OpenCV, you
122+
need to convert the image data to a format these libraries can work with:
123+
124+
```python
import cv2
import numpy as np
from pyscript import document

def process_frame_with_opencv(video_element):
    # Get video dimensions
    width = video_element.videoWidth
    height = video_element.videoHeight

    # Create a canvas and capture the frame
    canvas = document.createElement("canvas")
    canvas.width = width
    canvas.height = height
    ctx = canvas.getContext("2d")
    ctx.drawImage(video_element, 0, 0, width, height)

    # Get the raw pixel data (a flat sequence of RGBA bytes)
    image_data = ctx.getImageData(0, 0, width, height).data

    # Convert to numpy array for OpenCV: rows x columns x 4 channels
    frame = np.asarray(image_data, dtype=np.uint8).reshape((height, width, 4))

    # Convert from RGBA to BGR (OpenCV's default format)
    frame_bgr = cv2.cvtColor(frame, cv2.COLOR_RGBA2BGR)

    # Process the image with OpenCV
    # ...

    return frame_bgr
```
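
The reshape to `(height, width, 4)` works because canvas pixel data is a flat, row-major sequence of RGBA bytes. The pure-Python sketch below illustrates that layout without NumPy. The `pixel_at` helper and the 2x2 sample buffer are illustrative assumptions, not part of any library.

```python
def pixel_at(data, width, x, y):
    """Return the (r, g, b, a) tuple at column x, row y of a flat RGBA buffer."""
    offset = (y * width + x) * 4  # four bytes per pixel, rows stored in order
    return tuple(data[offset:offset + 4])


# A 2x2 image: red, green on the first row; blue, white on the second.
data = bytes([
    255, 0, 0, 255,      0, 255, 0, 255,
    0, 0, 255, 255,      255, 255, 255, 255,
])

print(pixel_at(data, width=2, x=1, y=0))  # (0, 255, 0, 255)
```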
154+
155+
### Managing Camera Resources
156+
157+
It's important to properly manage media resources, especially when your
158+
application no longer needs them. Cameras and microphones are shared resources,
159+
and failing to release them can impact other applications or cause unexpected
160+
behavior.
161+
162+
### Stopping the Camera
163+
164+
To stop the camera and release resources:
165+
166+
```python
from pyscript.web import page

def stop_camera(stream):
    # Stop all tracks on the stream
    if stream:
        tracks = stream.getTracks()
        for track in tracks:
            track.stop()

    # Clear the video element's source
    video_element = page["#camera"][0]._dom_element
    if video_element:
        video_element.srcObject = None
```
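
One way to make sure tracks are always stopped, even when an error occurs, is to wrap the stream in an async context manager. This is a sketch under stated assumptions: `open_stream` is a hypothetical helper, and the fake stream and track classes only imitate the `getTracks()` and `stop()` calls used above so that the snippet runs outside a browser.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def open_stream(loader):
    """Yield a stream from `loader()` and stop all of its tracks on exit."""
    stream = await loader()
    try:
        yield stream
    finally:
        for track in stream.getTracks():
            track.stop()


class FakeTrack:
    """Stand-in for a browser media track."""
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class FakeStream:
    """Stand-in for a browser media stream with two tracks."""
    def __init__(self):
        self.tracks = [FakeTrack(), FakeTrack()]

    def getTracks(self):
        return self.tracks


async def fake_loader():
    return FakeStream()


async def demo():
    async with open_stream(fake_loader) as stream:
        pass  # use the stream here
    return stream


stream = asyncio.run(demo())
print(all(track.stopped for track in stream.getTracks()))
```

In a PyScript page, the loader would presumably be `Device.load`, used as `async with open_stream(Device.load) as stream:`.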
181+
182+
### Switching Between Cameras
183+
184+
For devices with multiple cameras, you can implement camera switching:
185+
186+
```python
from pyscript.media import Device, list_devices
from pyscript.web import page

class CameraManager:
    def __init__(self):
        self.cameras = []
        self.current_index = 0
        self.active_stream = None
        self.video_element = page["#camera"][0]._dom_element

    async def initialize(self):
        # Get all video input devices
        devices = await list_devices()
        self.cameras = [d for d in devices if d.kind == "videoinput"]

        # Start with the first camera
        if self.cameras:
            await self.start_camera(self.cameras[0].id)

    async def start_camera(self, device_id=None):
        # Stop any existing stream
        await self.stop_camera()

        # Start a new stream
        video_options = (
            {"deviceId": {"exact": device_id}} if device_id
            else {"facingMode": "environment"}
        )
        self.active_stream = await Device.load(video=video_options)

        # Connect to the video element
        if self.video_element:
            self.video_element.srcObject = self.active_stream

    async def stop_camera(self):
        if self.active_stream:
            tracks = self.active_stream.getTracks()
            for track in tracks:
                track.stop()
            self.active_stream = None

        if self.video_element:
            self.video_element.srcObject = None

    async def switch_camera(self):
        if len(self.cameras) <= 1:
            return

        # Move to the next camera, wrapping around to the first
        self.current_index = (self.current_index + 1) % len(self.cameras)
        await self.start_camera(self.cameras[self.current_index].id)
```
239+
240+
## Working with Audio
241+
242+
In addition to video, the PyScript media API can access audio inputs:
243+
244+
```python
# Get access to the microphone (audio only)
audio_stream = await Device.load(audio=True, video=False)

# Get both audio and video
av_stream = await Device.load(audio=True, video=True)
```
251+
252+
## Best Practices
253+
254+
When working with media devices in PyScript, follow these best practices:
255+
256+
### Permissions and User Experience
257+
258+
1. **Request permissions contextually**:
    - Only request camera/microphone access when needed
    - Explain to users why you need access before requesting it
    - Provide fallback options when permissions are denied

2. **Clear user feedback**:
    - Indicate when the camera is active
    - Provide controls to pause/stop the camera
    - Show loading states while the camera is initializing
267+
268+
### Resource Management
269+
270+
1. **Always clean up resources**:
    - Stop media tracks when they're not needed
    - Clear `srcObject` references from video elements
    - Be especially careful in single-page applications

2. **Handle errors gracefully**:
    - Catch exceptions when requesting media access
    - Provide meaningful error messages
    - Offer alternatives when media access fails
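
The error-handling advice above could be sketched as a wrapper that turns a failed request into a readable message. `load_safely` and `MESSAGES` are hypothetical names. The error names are the standard ones browsers use for failed media requests, but the assumption that the raised Python exception exposes a `name` attribute depends on the interpreter and should be verified. The fake loader exists only so the snippet runs outside a browser.

```python
import asyncio

# Messages keyed by the error names browsers use for failed media requests.
MESSAGES = {
    "NotAllowedError": "Camera access was denied. Please allow access and try again.",
    "NotFoundError": "No matching camera or microphone was found.",
    "NotReadableError": "The device is already in use or unavailable.",
}


async def load_safely(loader):
    """Return (stream, None) on success or (None, message) on failure."""
    try:
        return await loader(), None
    except Exception as exc:
        # Assumption: the exception carries the browser error name as `name`.
        name = getattr(exc, "name", type(exc).__name__)
        return None, MESSAGES.get(name, f"Could not access media devices ({name}).")


class FakeDenied(Exception):
    """Stand-in for the error raised when the user denies permission."""
    name = "NotAllowedError"


async def denied_loader():
    raise FakeDenied("permission denied")


stream, error = asyncio.run(load_safely(denied_loader))
print(error)
```

In a PyScript page the call might look like `stream, error = await load_safely(Device.load)`, with `error` shown to the user when it is not `None`.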
279+
280+
### Performance Optimization
281+
282+
1. **Match resolution to needs**:
    - Use lower resolutions when possible
    - Consider mobile device limitations
    - Adjust video constraints based on the device

2. **Optimize image processing**:
    - Process frames on demand rather than continuously
    - Use efficient algorithms
    - Consider downsampling for faster processing
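
As a rough illustration of downsampling, the sketch below computes a reduced capture size that preserves the aspect ratio. `fit_within` is a hypothetical helper. Its result could be used as the canvas dimensions when drawing a video frame, so that fewer pixels are processed.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so that the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1280, 720, 640))  # (640, 360)
```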
291+
292+
## Example Application: Simple Camera Capture
293+
294+
Here's a simplified example that shows how to capture and display images from a
295+
camera:
296+
297+
```python
from pyscript import when, window
from pyscript.media import Device
from pyscript.web import page

class CameraCapture:
    def __init__(self):
        # Get UI elements
        self.video = page["#camera"][0]
        self.video_element = self.video._dom_element
        self.capture_button = page["#capture-button"]
        self.snapshot = page["#snapshot"][0]

        # The camera is started separately: __init__ cannot await, so call
        # `await camera.initialize_camera()` after construction (see Usage).

    async def initialize_camera(self):
        # Prefer environment-facing camera on mobile devices
        stream = await Device.load(video={"facingMode": "environment"})
        self.video_element.srcObject = stream

    def take_snapshot(self):
        """Capture a frame from the camera and display it"""
        # Get video dimensions
        width = self.video_element.videoWidth
        height = self.video_element.videoHeight

        # Create canvas and capture frame
        canvas = window.document.createElement("canvas")
        canvas.width = width
        canvas.height = height

        # Draw the current video frame to the canvas
        ctx = canvas.getContext("2d")
        ctx.drawImage(self.video_element, 0, 0, width, height)

        # Convert the canvas to a data URL and display it
        image_data_url = canvas.toDataURL("image/png")
        self.snapshot.setAttribute("src", image_data_url)

# HTML structure needed:
# <video id="camera" autoplay playsinline></video>
# <button id="capture-button">Take Photo</button>
# <img id="snapshot">

# Usage:
# camera = CameraCapture()
# await camera.initialize_camera()
#
# @when("click", "#capture-button")
# def handle_capture(event):
#     camera.take_snapshot()
```
349+
350+
This example demonstrates:
351+
- Initializing a camera with the PyScript media API
352+
- Accessing the camera stream and displaying it in a video element
353+
- Capturing a still image from the video stream when requested
354+
- Converting the captured frame to an image that can be displayed
355+
356+
This simple pattern can serve as the foundation for various camera-based
357+
applications and can be extended with image processing libraries as needed for
358+
more complex use cases.
359+
360+
361+
## Conclusion
362+
363+
The PyScript media API provides a powerful way to access and interact with
364+
cameras and microphones directly from Python code running in the browser. By
365+
following the patterns and practices outlined in this guide, you can build
366+
sophisticated media applications while maintaining good performance and user
367+
experience.
368+
369+
Remember that media access is a sensitive permission that requires user consent
370+
and should be used responsibly. Always provide clear indications when media
371+
devices are active and ensure proper cleanup of resources when they're no longer
372+
needed.
