D455 on Jetson Xavier very slow #2396
Comments
Hi @GEngels Problems with pointcloud generation, and also with enabling align_depth, on Jetson boards specifically are a known issue in the RealSense ROS wrapper. Symptoms can include very low FPS or missing color. The issue does not occur on non-Jetson computers such as laptop and desktop PCs. Detailed information about it can be found at #1967 For ROS1 (Kinetic, Melodic, Noetic) the current best solution available is to use librealsense version 2.43.0 and ROS1 wrapper version 2.2.23.
Hi @MartyG-RealSense, I am a colleague of @GEngels.
Hi @Doch88 The older RealSense ros2 wrapper branch would have to be used with older SDK versions, as the current ros2_beta wrapper branch is for SDK 2.50.0. https://github.com/IntelRealSense/realsense-ros/tree/ros2 If SDK 2.43.0 is being used with ROS2 then it would have to be matched with ROS2 wrapper version 3.1.5 https://github.com/IntelRealSense/realsense-ros/releases/tag/3.1.5 This wrapper version only supports Foxy, Eloquent and Dashing though, so it would not be suitable for newer ROS2 versions such as Galactic and Rolling, as support for those two ROS2 versions was introduced in more recent wrapper versions. As far as I am aware, Jetson pointcloud generation has not been tested with the combination of SDK 2.43.0 and ROS2 wrapper 3.1.5, so I cannot offer a guarantee about pointcloud performance with that configuration.
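For reference, pinning those matched versions when building from source might look like this (a sketch using standard git commands; the tag names are taken from the links above):

```
# Check out librealsense v2.43.0 and the matching ROS2 wrapper tag 3.1.5
git clone https://github.com/IntelRealSense/librealsense.git
cd librealsense && git checkout v2.43.0 && cd ..
git clone https://github.com/IntelRealSense/realsense-ros.git
cd realsense-ros && git checkout 3.1.5
```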
Hi @MartyG-RealSense, I built the SDK 2.43.0 version from source (using RSUSB backend), with CUDA support, and I also used the ROS2 wrapper version 3.1.5 (with Foxy), built from source too.
What looks strange to me is that the 15W mode is performing better than the 30W one; I would guess that it depends on some specific power limitation set by these modes on the USB port or on the CPU cores. EDIT: With the realsense-viewer everything works perfectly.
Thanks so much for the feedback from your tests! When you refer to occluding an infrared camera, do you mean having it disabled in the ROS wrapper or covering over the lens on the outside of the camera?
Sorry, I mean covering the lens on the outside of the camera.
New update: The Realsense-viewer works perfectly in the 3D view with all the post-processing filters enabled, the same resolutions, and the FPS set to 30.
How does it perform if depth and color are both set to 848x480 and 30 FPS?
MAXN: It works well with that resolution and FPS. However, we need at least a resolution of 1280x720 with 15 FPS, preferably using MODE 30W ALL.
How about using 1280x720 and applying a Decimation filter to reduce the complexity of the depth scene? The Decimation filter will not work with align_depth = true though if you are using alignment.
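In the ROS2 wrapper, the Decimation filter can be switched on at launch. A minimal sketch, using the decimation_filter.enable parameter that appears later in this thread:

```
ros2 launch realsense2_camera rs_launch.py decimation_filter.enable:=true
```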
Yes, we are using alignment, and the Decimation filter would also reduce the resolution of the depth image, which is not something we want.
In your project, is 1280x720 a requirement for both depth and color or only for depth?
Yes, unfortunately it is a requirement for both frames.
Can you confirm please that when you built librealsense from source code with RSUSB, you included the build flag -DBUILD_WITH_CUDA=TRUE in the CMake build instruction? An RSUSB build of librealsense will automatically support metadata but requires the -DBUILD_WITH_CUDA flag in order to add CUDA support to the build too.
I used this command to build the library:
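The exact build command was not captured in this thread. A plausible reconstruction, assuming the RSUSB and CUDA flags discussed above, would be:

```
# Hypothetical reconstruction of an RSUSB + CUDA build of librealsense
cmake .. -DFORCE_RSUSB_BACKEND=true -DBUILD_WITH_CUDA=true -DCMAKE_BUILD_TYPE=Release
make -j$(nproc) && sudo make install
```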
I checked the differences between MODE 30W ALL and MODE 15W DESKTOP. I also looked a bit into the code of the realsense-viewer, and at first sight I noticed that a different thread is used for the post-processing part of the code, while in the node (as I understand it) everything is done in the same thread (except for some parts not related to filtering and point cloud generation).
You mentioned earlier at #2396 (comment) that you could run well with 1280x720 depth and 1280x800 color (both at 15 FPS) but with spatial and temporal filters disabled. Are these filters vital to your ROS project, please? Normally, Infra and Infra2 are disabled by default when using the rs_launch.py launch file. Are you using that launch file or a custom launch file, and if you are using a custom launch file, are Infra and Infra2 enabled in it?
It runs better, but there is still a bit of delay which we would like to avoid. By the way, thank you for your support; I really like the fact that you are trying to reply as quickly and precisely as possible, so I am sorry if I am being a bit annoying!
We are using a custom launch file that wraps rs_launch.py; we are not touching the settings regarding infra and infra2. These are the launch arguments (many of them are irrelevant for this camera but are still provided for internal purposes):
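The argument list itself was not captured in this thread. For illustration only, a launch invocation of the kind being discussed might look like the following (pointcloud.enable, align_depth.enable and ordered_pc are parameter names used elsewhere in this thread; the profile parameters and their values are assumptions):

```
ros2 launch realsense2_camera rs_launch.py \
  depth_module.profile:=1280x720x30 rgb_camera.profile:=1280x720x30 \
  pointcloud.enable:=true align_depth.enable:=true ordered_pc:=true
```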
It's no trouble at all :) If librealsense is built with the flag -DBUILD_WITH_OPENMP=TRUE then YUY to RGB color conversion and depth-color alignment can take advantage of multiple CPU cores, at the expense of slightly higher CPU percentage utilization. rs2::gl, also known as GLSL, is an alternative acceleration system to CUDA that is built into librealsense and works with any graphics GPU brand instead of just Nvidia. It is not involved in CUDA processing. There would not be a significant performance advantage to using GLSL over CUDA, and since GLSL only works with rs2:: instructions in librealsense scripts, CUDA will be the best choice for ROS. Unless you need an ordered pointcloud instead of the default unordered one, you could try removing the ordered_pc instruction to see if it makes a positive difference. As you are using both depth and color, if auto-exposure is enabled then you could also try disabling an RGB setting called auto_exposure_priority to force the FPS of the two streams to remain at the defined FPS (such as '30') instead of being permitted to drop to a lower FPS value. Information about defining this setting in a launch file as a rosparam is at #2308 (comment)
Thank you for your clarifications! Yes, we need ordered_pc, and I have also tried disabling it and performance did not visibly improve. I cannot find how to set that parameter with ROS2; that setting in the launch file seems to be a ROS1 one. Also, I cannot find the definition of that parameter inside the Realsense node. Can you suggest a way to do that with ROS2?
Are you able to test disabling auto_exposure_priority by opening the rqt_reconfigure interface in the 3.x ROS2 wrapper with the command ros2 run rqt_reconfigure rqt_reconfigure? If you can open the rqt_reconfigure interface and its side-panel has an rgb_camera category (camera > rgb_camera), then the auto_exposure_priority option should be under that section's options.
For some reason rqt_reconfigure was not working, but I managed to change the setting with … If I set the camera to run at 15 FPS, the publishing rate is around 10 Hz; if I set the camera to run at 30 FPS, the publishing rate is around 20 Hz (1280x720 in both cases). Don't know then if it is a problem of …
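For anyone hitting the same rqt_reconfigure issue: a setting like this can typically be changed with ros2 param set. A sketch, assuming the wrapper's default node name and that the parameter lives under rgb_camera:

```
ros2 param set /camera/camera rgb_camera.auto_exposure_priority false
```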
Thanks very much for your tests. When auto_exposure_priority is set to False, is the RGB auto_exposure also set to True? If auto_exposure is true and auto_exposure_priority is false then the FPS rate should be forced to remain constant.
GLSL differs from CUDA in that programs will not make use of GLSL acceleration unless support for doing so has been deliberately coded into that program, as it is in the RealSense Viewer. So the ROS wrapper would not be benefiting from GLSL. CUDA acceleration of pointclouds, alignment and color conversion is provided automatically though if the librealsense SDK has CUDA support enabled (by installing the SDK from packages, or building from source code with the -DBUILD_WITH_CUDA=true flag included). CUDA support is not included though if the SDK and the ROS wrapper were installed together from packages at the same time using the wrapper's Method 1 instructions.
Then I do not really understand this performance difference between the node and the realsense-viewer.
I would expect it to be around 700 mA, as it is when I run it on a laptop.
The RealSense Viewer tool runs directly in the RealSense SDK, whilst the ROS wrapper runs as a layer atop the SDK in the background. The wrapper also handles some functions differently from librealsense in order to maintain compatibility with ROS standards. So the camera may work correctly in the Viewer but have some performance issues in the ROS wrapper for some RealSense users. A Jetson user at #1964 who was also using MAXN mode had performance issues such as very high CPU % utilization, and provided extensive testing logs in that case. In the end, they found that performance improved if they removed the ROS wrapper with the command sudo apt remove ros-$ROS_DISTRO-librealsense2 and rebuilt it from source code.
Hi @MartyG-RealSense
Hi @iraadit Do you have an update about this case that you can provide, please? Thanks! |
Hi @MartyG-RealSense, I spent the last few days installing Jetpack 5 and Isaac ROS (Humble with Docker). Among the discoveries of the last weeks: …
It can be seen that gl::pointcloud is faster than pointcloud (6 ms vs 9 ms, with maxima of 38 ms vs 609 ms).
I'll be on vacation until the start of September and will continue to work on this when I am back. Please, do not close the ticket :) Thank you
Thanks so much @iraadit for the highly detailed feedback of your findings. It is no problem at all to keep the ticket open whilst you are on vacation and resume when you return.
@iraadit Thank you for sharing
Hi @AndreV84 Humble is now supported by the RealSense ros2_beta wrapper, though not by earlier wrapper branches (ROS1 and ros2). At the time of writing this, only building from source code is supported for Humble. Ubuntu 22.04 (Jammy) Debians are on the roadmap. IntelRealSense/librealsense#10439 (comment) and the comments beneath it provide further information.
Glad it can help you @AndreV84 :) I've been able to install the latest librealsense v2.51.1 as well as the ros2-beta branch of realsense-ros inside a docker container created with Nvidia Isaac ROS (with ROS 2 Humble). I could install following the first method of the Jetson install (with FORCE_RSUSB_BACKEND=false), without needing to apply the kernel patching. Running on a Nvidia Jetson AGX Xavier. Sadly, performance stays the same. ROS2 logs are now working correctly though, and I observed something in the output of the command …
The following snippets are cleaned for brevity. Scenario 1 is the best case, with depth and color "arriving at the same time": …
But two other scenarios can happen, and they are not desirable. Scenario 2 has only depth before filters: …
Scenario 3 has only color before filters: …
Perfect execution would always be scenario 1, always having pointcloud data with color texture available as output. Would you have any idea how to avoid the other two scenarios? Thank you
@iraadit You should not need to set enable_sync to true in your launch instruction as the ROS wrapper documentation for this parameter states that it happens automatically when filters such as pointcloud are enabled. Likewise, the decimation filter is false by default in the rs_launch.py file and so should not need setting to false in the launch instruction. How does it perform if you remove depth_module.exposure:=32000 depth_module.enable_auto_exposure:=false and use auto-exposure?
enable_sync as well as decimation_filter.enable are in the launch command because it is easier for me to try out different parameters that way (I don't have to remember their names when I want to try them, just change true to false and vice versa). By the way, I think that pointcloud.enable:=true must also turn align_depth.enable to true; what happens then if I add align_depth.enable:=false in the launch command? Is it ignored? As explained in point 1 of #2396 (comment), I have to add depth_module.exposure:=32000 depth_module.enable_auto_exposure:=false to be able to get 30 FPS in input for depth; if not, it saturates at 15 FPS. By default, depth_module.enable_auto_exposure is true, but it doesn't seem to work. (???) If I set depth_module.enable_auto_exposure to false, it will use the default value of depth_module.exposure, which is 33000 (33 ms) and seems to be too much to get 30 FPS. If I change the value of depth_module.exposure to be lower (here 32000), it can run at 30 FPS. These FPS values are seen thanks to the /diagnostics topic.
I would assume that a 'false' state of align_depth will be overridden to 'true' if the pointcloud filter is enabled. Your colleague @Doch88 reported earlier in the discussion that enabling auto_exposure and disabling auto_exposure_priority (instead of using manual exposure) in order to enforce a constant FPS did not seem to work. #2396 (comment) states that you cannot reduce your resolution from 1280x720 to 848x480 to improve performance. I believe we have covered almost all of the available possibilities for improving performance during this two-month discussion, unfortunately.
Hi @MartyG-RealSense,
Thanks so much @iraadit :) Please do feed back any findings that you make.
Hi @iraadit Do you have an update about this case that you can provide, please? Thanks!
I finally found a way to get a (nearly) 30FPS textured pointcloud, but it is suboptimal. It consists of these different elements:
With that, I can get nearly 30 FPS (more than 25) on color, depth and pointcloud.

topic_hz.py
With rviz2, I could see the stream framerates dropping when I was using ros2 topic hz. Searching for it on the internet, I found this post describing the same thing, started to write my own subscriber to calculate the FPS, and then discovered it had already been done in the realsense-ros repo. For reference, the script is here: realsense2_camera/scripts/topic_hz.py. Using this script, I can (finally) get good framerate values. The script is only present in the ros2-beta branch (as it seems it wasn't a problem with ROS 1). (A minimal sketch of such a subscriber appears after this comment.)

Depth module
32000 instead of 33000 or auto (because it otherwise saturates at 15 FPS). For the exposure of the depth module, would there be something similar to the color stream's auto_exposure_priority?

Preset
I got the preset from the D400-Series-Visual-Presets wiki page on the librealsense repo.

Decimation
On this wiki page, it is also said: …
What would be the recommended default for the D455? Concerning decimation, I also saw in the intelrealsense web documentation that there would be a better implementation of decimation for collision-avoidance problems. I'll add it to our code.
OpenGL
I also have a modification of the code to execute part of the pointcloud calculation with OpenGL. I should test without this modification to see if I still get the nearly 30 FPS. (CPU usage is definitely down with this modification, and pointcloud processing is faster.)

Conclusion
Can you help me with any of the questions I have? I will have to put this case to rest in the coming days to work on other things. I'm still not satisfied with the result, and it can be said for sure that the combination of D455, ROS 2 (Humble - Isaac), librealsense 2.51.1, realsense-ros (ros2-beta), and trying to get a textured pointcloud at the highest resolution possible on a Jetson AGX Xavier (Jetpack 4 or 5) gives subpar results for now. I will submit some pull requests in the coming days. Thank you for your help
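As a rough illustration of the "write my own subscriber to calculate the FPS" idea mentioned under topic_hz.py above (this is not the actual topic_hz.py script; the topic name and the sensor-data QoS profile are assumptions):

```python
# Minimal sketch: measure the publish rate of an image topic with rclpy.
import time

import rclpy
from rclpy.node import Node
from rclpy.qos import qos_profile_sensor_data
from sensor_msgs.msg import Image


class TopicHz(Node):
    def __init__(self):
        super().__init__('topic_hz')
        self.count = 0
        self.start = time.monotonic()
        # Hypothetical topic name; adjust to the stream being measured.
        self.create_subscription(
            Image, '/camera/color/image_raw', self.callback,
            qos_profile_sensor_data)
        # Report the average rate once per second.
        self.create_timer(1.0, self.report)

    def callback(self, msg):
        self.count += 1

    def report(self):
        elapsed = time.monotonic() - self.start
        if elapsed > 0.0:
            self.get_logger().info(f'{self.count / elapsed:.1f} Hz')
        self.count = 0
        self.start = time.monotonic()


def main():
    rclpy.init()
    node = TopicHz()
    try:
        rclpy.spin(node)
    finally:
        node.destroy_node()
        rclpy.shutdown()


if __name__ == '__main__':
    main()
```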
Thanks so much @iraadit for sharing such detailed feedback of your tests! Using the Medium Density preset - MedDensityPreset.json - may provide a better image than High Accuracy, as HA tends to greatly reduce the amount of detail on the depth image due to confidence-filtering of depth coordinates, whilst Medium Density provides a good balance between accuracy and the amount of detail on the image. Regarding a similar option to the RGB-only auto_exposure_priority for the depth stream: when auto-exposure is disabled then a constant FPS can be enforced if the manual exposure value is within a certain range, as mentioned earlier in this discussion at #2396 (comment), but there is not a direct depth-stream equivalent for auto_exposure_priority. The recommended resolution setting for optimal depth accuracy on the D455 camera model is the same as for the D435 / D435i - 848x480 depth at 30 FPS.
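For reference, a preset JSON such as MedDensityPreset.json can typically be loaded at launch through the wrapper's json_file_path parameter. A sketch, with a hypothetical file path:

```
ros2 launch realsense2_camera rs_launch.py json_file_path:=/path/to/MedDensityPreset.json
```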
Does anyone who commented on this case require further assistance, please? Thanks! |
Not from my side. Thanks for following up though.
Thanks very much, @AndreV84 :)
@MartyG-RealSense do you know by any chance |
@AndreV84 If you mean using .ply pointcloud files, then PyTorch3D might be a suitable option for object detection if you are able to use Python: https://ai.facebook.com/blog/building-3d-deep-learning-models-with-pytorch3d/ (see also "Loading ply files into PyTorch3D").
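A minimal sketch of reading a .ply file into PyTorch3D tensors (the file name is hypothetical; pointcloud-style .ply files may come back with an empty face list):

```python
# Load vertex and face tensors from a .ply file with PyTorch3D.
from pytorch3d.io import load_ply

verts, faces = load_ply("cloud.ply")  # hypothetical file path
print(verts.shape, faces.shape)
```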
Sorry, I've been busy on something else for the last two weeks. I'll work again on the realsense pointcloud optimization this week.
No problem at all, @iraadit - thanks very much for the update and good luck!
Hi @iraadit Do you have an update about this case that you can provide, please? Thanks!
Hi @MartyG-RealSense,
Thanks very much @iraadit for the update! As you are happy to close the ticket and there have not been further comments from other RealSense users on this discussion, I will close it. Thanks again!
I am trying to use the Intel RealSense camera on the Jetson Xavier AGX. When I only run the RGB camera and depth stream, it seems to run quite fast (30 FPS). However, when I enable the pointcloud it becomes very slow.
I tried to run it at 15 FPS for RGB and depth at various resolutions, but the result is the same. The RGB camera freezes and no point cloud is displayed in RViz. When I block part of the depth camera (so that it doesn't have to create as many points in the point cloud), it seems to work again. The creation of the pointcloud in the realsense node seems to be the bottleneck. I am running the Jetson at max performance.
Some of the errors and warnings that appear are the following:
"incomplete frame received: Incomplete video frame detected! Size 99080 out of 2048255 bytes (4%):"
"Hardware Notification:Depth stream start failure,1.65668e+12,Error,Hardware Error"
"Out of frame resources"
I wonder if it is expected behaviour that you can only run it with maybe 5 FPS on the Jetson on max settings.