Conversation

@GerdsenAI-Admin
Contributor

This pull request updates the documentation and Dockerfile to improve deployment, troubleshooting, and performance guidance for the Depth Anything 3 ROS2 wrapper on NVIDIA Jetson Orin AGX. The changes clarify environment detection, deployment procedures, and the host-container architecture, and address compatibility issues with Jetson hardware and software. The Dockerfile is updated to fix OpenCV, PyTorch, and pip configuration issues on Jetson L4T r36.x, and the documentation now provides clearer guidance for users and agents and details the current performance bottlenecks.

Deployment & Environment Setup Improvements

  • Expanded CLAUDE.md with detailed environment detection steps, SSH/MCP usage, and one-click Jetson deployment instructions, including preferred git-based deployment and troubleshooting for X11 GUI forwarding.
  • Added JetPack/L4T version notes, Docker build known issues, and host-container TensorRT architecture explanation, including file-based IPC details and performance status table.
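
As a rough illustration of the file-based IPC mentioned above, the exchange between the container node and the host TensorRT service can be thought of as dropping numpy files into a shared directory. The sketch below is a minimal, hypothetical version; the paths, file names, and polling behaviour are assumptions, not the repository's actual protocol.

```python
import time
from pathlib import Path

import numpy as np

SHARED_DIR = Path("/tmp/da3_shared")     # assumed shared volume between host and container
INPUT_PATH = SHARED_DIR / "input.npy"    # container -> host: frame to run inference on
OUTPUT_PATH = SHARED_DIR / "output.npy"  # host -> container: resulting depth map


def request_depth(image: np.ndarray, timeout_s: float = 2.0) -> np.ndarray:
    """Send one frame to the host-side TRT service and poll for the depth result."""
    if OUTPUT_PATH.exists():
        OUTPUT_PATH.unlink()                        # drop any stale result
    np.save(INPUT_PATH, image, allow_pickle=False)  # the service watches this path
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if OUTPUT_PATH.exists():
            return np.load(OUTPUT_PATH, allow_pickle=False)
        time.sleep(0.005)
    raise TimeoutError("host TensorRT service did not produce a result in time")
```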

Critical Design Principles & Testing

  • Clarified camera-agnostic design and ROS2 patterns as non-negotiable principles, and expanded the testing section to highlight mocked model tests and camera-agnostic functionality.
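
For readers unfamiliar with the mocked-model approach, the sketch below shows one way such a test could look: the heavy inference call is stubbed out so the camera-agnostic contract (any input resolution yields a matching depth map) can be checked without a GPU, a TensorRT engine, or camera hardware. The DepthEstimator class and function names are illustrative, not the package's actual API.

```python
from unittest import mock

import numpy as np


class DepthEstimator:
    """Stand-in for a model wrapper; infer() would normally run the network."""

    def infer(self, image: np.ndarray) -> np.ndarray:
        raise NotImplementedError("real model weights are not loaded in unit tests")


def estimate_depth(estimator: DepthEstimator, image: np.ndarray) -> np.ndarray:
    """Camera-agnostic contract: the depth map matches the input resolution."""
    depth = estimator.infer(image)
    assert depth.shape == image.shape[:2], "depth map must match input resolution"
    return depth


def test_estimate_depth_is_camera_agnostic():
    estimator = DepthEstimator()
    # Any camera resolution should be accepted; no GPU or /dev/video* required.
    for height, width in [(480, 640), (720, 1280), (1080, 1920)]:
        frame = np.zeros((height, width, 3), dtype=np.uint8)
        with mock.patch.object(estimator, "infer",
                               return_value=np.ones((height, width), np.float32)):
            depth = estimate_depth(estimator, frame)
        assert depth.shape == (height, width)
```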

Troubleshooting & Agent Guidance

  • Added a troubleshooting section referencing key docs, and improved agent selection guidance with a detailed table and proactive usage instructions for Jetson and NVIDIA experts.

Dockerfile Compatibility Fixes

  • Updated base image selection to use humble-desktop for Jetson L4T r36.x, improved OpenCV version checks, and fixed pip configuration to use PyPI instead of unreliable Jetson servers.
  • Skipped the torchvision source build for Jetson, installing CPU-only torchvision from PyPI for the host-container TRT architecture, and improved ROS2 workspace sourcing for non-interactive shells.


GerdsenAI-Admin and others added 20 commits February 3, 2026 00:36
Dockerfile: Ensure the ROS2 workspace setup sourcing is added before the PS1 guard in ~/.bashrc (using sed when the PS1 return line exists) so the setup runs for non-interactive shells (e.g. docker exec). Use the install/setup.bash path with a fallback and add equivalent lines to /etc/bash.bashrc and /etc/profile.d/ros2.sh.

Code: Add atomic writes for numpy files in depth_anything_3_ros2/da3_inference.py and scripts/trt_inference_service.py (write to a temp file, flush, fsync, then rename) to prevent partial reads by the inference service. In trt_inference_service.py, also validate the input tensor size against the engine's expected shape and raise a clear ValueError on mismatch. np.save is called with allow_pickle=False for safety.
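
A minimal sketch of the atomic-write pattern described in this commit, plus the shape validation, assuming generic file paths (the repository's actual helpers may be named and structured differently):

```python
import os
import tempfile

import numpy as np


def atomic_np_save(path: str, array: np.ndarray) -> None:
    """Write `array` so readers see either the previous file or the complete new one."""
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".npy.tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            np.save(f, array, allow_pickle=False)  # allow_pickle=False avoids unpickling risks
            f.flush()
            os.fsync(f.fileno())                   # ensure bytes hit disk before the rename
        os.replace(tmp_path, path)                 # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def validate_input_size(array: np.ndarray, engine_shape: tuple) -> None:
    """Fail loudly when the tensor cannot match the engine's expected input binding."""
    expected = int(np.prod(engine_shape))
    if array.size != expected:
        raise ValueError(
            f"input tensor has {array.size} elements but the engine expects "
            f"shape {engine_shape} ({expected} elements)")
```
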
Add two scripts to run a live depth visualization demo:
- scripts/demo_depth_viewer.py: ROS2-based viewer showing a side-by-side camera feed and colorized TensorRT depth, with an FPS toggle, frame saving to demo_captures, and a helper to start the TRT inference service.
- scripts/run_demo.sh: convenience runner that starts the TRT service, camera driver, and depth node in the da3_ros2_jetson container, then launches the viewer with X11.

Notes: requires ROS2, a built TensorRT engine at models/tensorrt/da3-small-fp16.engine, a camera at /dev/video0, and a display (Jetson). The runner waits for the TRT service status file and cleans up processes on exit.
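
The colorized, side-by-side display used by the viewer can be illustrated with a short OpenCV sketch; the colormap choice and function names below are assumptions, not the script's actual implementation:

```python
import cv2
import numpy as np


def colorize_depth(depth: np.ndarray) -> np.ndarray:
    """Normalize a float depth map to 8-bit and apply a colormap for display."""
    d_min, d_max = float(depth.min()), float(depth.max())
    scale = 255.0 / (d_max - d_min) if d_max > d_min else 1.0
    depth_u8 = ((depth - d_min) * scale).astype(np.uint8)
    return cv2.applyColorMap(depth_u8, cv2.COLORMAP_TURBO)


def side_by_side(frame_bgr: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack the camera frame and the colorized depth horizontally for the viewer."""
    depth_color = colorize_depth(depth)
    if depth_color.shape[:2] != frame_bgr.shape[:2]:
        depth_color = cv2.resize(depth_color, (frame_bgr.shape[1], frame_bgr.shape[0]))
    return np.hstack([frame_bgr, depth_color])
```
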
Stop auto-starting the TensorRT inference service from the viewer and instead just verify its status:
- demo_depth_viewer.py: replace start_trt_service() with check_trt_service(), which inspects the shared status file and emits a warning if the service isn't present; remove the process spawning/cleanup logic so the service is expected to be managed externally.
- scripts/jetson_demo.sh (new): helper to run the full pipeline on a Jetson, starting the TRT service on the host, preparing the shared dir, starting the container ROS nodes, and launching the viewer with X11.
- scripts/run_demo.sh: improve X11 access handling (xhost), handle SSH sessions by printing instructions and showing TRT stats instead of trying to open a GUI, and launch the viewer in-container with QT_X11_NO_MITSHM set when running locally.

These changes decouple the service lifecycle from the viewer and provide a dedicated Jetson entrypoint for systems with a local display.
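
A hedged sketch of the check-don't-start behaviour: the viewer only inspects the shared status file and warns if the service is absent or stale, never spawning it. The status-file path, JSON fields, and staleness threshold are assumptions:

```python
import json
import logging
import time
from pathlib import Path

STATUS_PATH = Path("/tmp/da3_shared/status.json")  # hypothetical shared status file
STALE_AFTER_S = 10.0                               # assumed heartbeat staleness limit


def check_trt_service() -> bool:
    """Return True only if the status file exists, parses, and looks fresh."""
    if not STATUS_PATH.exists():
        logging.warning("TRT inference service status file not found; start the "
                        "service on the host before launching the viewer.")
        return False
    try:
        status = json.loads(STATUS_PATH.read_text())
    except (json.JSONDecodeError, OSError):
        logging.warning("TRT status file unreadable (possibly mid-write); try again shortly.")
        return False
    if time.time() - float(status.get("timestamp", 0)) > STALE_AFTER_S:
        logging.warning("TRT status file is stale; the service may have stopped.")
        return False
    return True
```
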
Add Phase 5 'Live Demo System' section to TODO.md. Documents new demo components (scripts/demo_depth_viewer.py, scripts/run_demo.sh, scripts/jetson_demo.sh), atomic IO for numpy files, and a Dockerfile ROS2 sourcing fix. Lists demo features (side-by-side camera and colorized TensorRT depth, FPS toggle, frame capture, X11 with SSH fallback), usage examples for Jetson and container runs, and notes a pending merge of the TensorRT-Testing branch. Updates the Last Updated date to 2026-02-03.
…d known issues

- Update SSH commands to use -i ~/.ssh/jetson_j4012 identity file
- Add git clone as preferred deployment method (preserves history)
- Document deploy_jetson.sh script usage
- Add JetPack/L4T version compatibility table (r36.2.0 vs r36.4.0)
- Document Docker build known issues (pip.conf, OpenCV, cuDNN, base image)
…ecture

- Switch base image from humble-pytorch to humble-desktop (r36.x compatible)
- Remove dustynv pip.conf that uses unreliable jetson.webredirect.org
- Add OpenCV 4.10.x support for L4T r36.4.0
- Replace torchvision source build with CPU-only PyPI install
- Add explicit PyTorch dependencies (filelock, sympy, etc.)
…oyment

- Update Jetson demo to use git clone (preserves history)
- Add SSH identity file to example commands
- Add troubleshooting for humble-pytorch, pip.conf, cuDNN issues
Match container base image to host L4T R36.4.x environment.
Update to note humble-desktop is used because humble-pytorch doesn't exist for r36.x.
Removed redundant demo scripts:
- scripts/deploy_jetson.sh (merged into run.sh)
- scripts/jetson_demo.sh (merged into run.sh)
- scripts/run_demo.sh (merged into run.sh)

Fixed a TRT inference service race condition (a hedged reader-side sketch follows this commit note):
- Handle an empty REQUEST_PATH file during the atomic write
- Make the REQUEST_PATH write atomic on the container side
- Prevents "could not convert string to float" errors

Updated scripts/demo.sh with deprecation notice pointing to run.sh

Remaining scripts (11 total):
- Setup: install_dependencies.sh, setup_models.py
- Core: trt_inference_service.py, build_tensorrt_engine.py
- Utilities: detect_cameras.sh, performance_monitor.sh
- Viewer: demo_depth_viewer.py
- Testing: benchmark_models.sh, test_trt10.3_host.sh, thermal_stability_test.sh
- Legacy: demo.sh (deprecated)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
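
As referenced in the race-condition fix above, a reader-side sketch of tolerating an empty or half-written request file might look like the following; the file name and its contents (a single float timestamp) are assumptions about the protocol:

```python
import time
from pathlib import Path
from typing import Optional

REQUEST_PATH = Path("/tmp/da3_shared/request.txt")  # hypothetical request marker file


def read_request_timestamp(retries: int = 10, delay_s: float = 0.05) -> Optional[float]:
    """Read a float from the request file, retrying on missing, empty, or partial content."""
    for _ in range(retries):
        try:
            text = REQUEST_PATH.read_text().strip()
            if text:                       # an empty file means the writer isn't done yet
                return float(text)
        except (FileNotFoundError, ValueError):
            pass                           # missing or half-written; retry after a short sleep
        time.sleep(delay_s)
    return None
```
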
- Update TensorRT Demo section to use ./run.sh
- Update Quick Start to use ./run.sh instead of deploy_jetson.sh
- Update Key Files table
- Simplify demo script options documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add video group, /dev mount, and device cgroup rule for proper
v4l2 camera access. Fixes 'Failed mapping device memory' error.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add quick reference block at top of file with host, user, and identity file info for easy SSH access to Jetson device.
- Add blank lines around headings (MD022)
- Remove trailing period from heading (MD026)
- Add blank lines around code fences (MD031)
- Add blank lines around lists (MD032)
- Fix table column alignment (MD060)
@GerdsenAI-Admin merged commit 4337a74 into main on Feb 4, 2026
1 of 2 checks passed