BUG REPORT [ the NullPtr bug within communication between two pkgs -- nav2_planner & nav2_costmap_2d] #3940
Comments
I don't understand what you think is the problem.
They are not, however, thread locked it does not appear, so perhaps you're referring to memory corruptions with Your report shows that one of the traces shows Thus, your report to me is unclear and I don't see a potential problem / resolution without some more information from you @GoesM
we've confirmed that sometimes members in
bool isCurrent()
{
  return layered_costmap_->isCurrent();
}
Sorry that our question wasn't specific enough. We will check and provide more details. Thanks for replying.
Here's our PR #3958, which describes the bug more clearly along with the solution. Please check it : )
Closing ticket, moving discussion to PR which should be merged after you fix some linting / rebase
<1> Bug Description
Bug Type: NullPtr referenced
nav2_planner dereferenced a null pointer owned by nav2_costmap_2d
workstation_environment set
code location
/navigation2/nav2_costmap_2d/src/layered_costmap.cpp
/navigation2/nav2_planner/src/planner_server.cpp
function
isCurrent() from /navigation2/nav2_costmap_2d/src/layered_costmap.cpp
isCurrent() is accessed by /navigation2/nav2_planner/src/planner_server.cpp
<2> References [ log_files ]
More details are provided here.
function calling stack [by Asan report]
planner_server work_log [by ros_node_log]
We hit the same bug more than 50 times in a single week.
Each time, planner_server hit the bug right after an [INFO] message reading "Received request to clear entirely the global_costmap".
The following is a representative planner_server log file from these occurrences:
At this line, planner_server shut down abruptly.
<3> Analysis
In what situations bugs would happen?
When executing an instruction [ a nav2_goal action sent by the user ], the planner node needs the current costmap result, so it accesses the pointer variable [ costmap_ros_ ] from nav2_costmap_2d.
However, planner_server does not seem to check the pointer before accessing it.
A coincidental collision may cause the bug:
The nav2_costmap_2d node happens to have reached the end of its lifecycle, or is recalculating due to changes in sensor messages (such as odom and scan). At this stage the pointer may already have been released. If planner_server happens to use the pointer at that moment in order to execute an action, the result is a null pointer access.
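The missing check described above can be sketched as follows. This is only an illustration under stated assumptions, not the nav2 code and not necessarily what PR #3958 does: `LayeredCostmapStub`, `CostmapWrapperStub`, `CostmapState`, `activate()` and `cleanup()` are hypothetical names invented for the example. The idea is that the reader takes a snapshot of the pointer under a lock and tests it before dereferencing.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for nav2_costmap_2d::LayeredCostmap; only the
// isCurrent() query is modelled.
class LayeredCostmapStub
{
public:
  explicit LayeredCostmapStub(bool current) : current_(current) {}
  bool isCurrent() const { return current_; }

private:
  bool current_;
};

// Result of a guarded query (hypothetical; nav2 returns a plain bool).
enum class CostmapState { Unavailable, Stale, Current };

// Hypothetical stand-in for the wrapper that owns layered_costmap_.
class CostmapWrapperStub
{
public:
  // Models activation: the layered costmap becomes available.
  void activate(bool current)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    layered_costmap_ = std::make_shared<LayeredCostmapStub>(current);
  }

  // Models cleanup or reset: the pointer is released.
  void cleanup()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    layered_costmap_.reset();
  }

  // Guarded query: reports Unavailable instead of dereferencing a null pointer.
  CostmapState isCurrent() const
  {
    std::shared_ptr<LayeredCostmapStub> snapshot;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      snapshot = layered_costmap_;  // the copy keeps the object alive after unlock
    }
    if (!snapshot) {
      return CostmapState::Unavailable;
    }
    return snapshot->isCurrent() ? CostmapState::Current : CostmapState::Stale;
  }

private:
  mutable std::mutex mutex_;
  std::shared_ptr<LayeredCostmapStub> layered_costmap_;
};
```

A caller in the planner's position would treat `CostmapState::Unavailable` the same way as a costmap that is not yet current, for example by waiting or aborting the request, rather than crashing.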
<4> POC design
This appears to be a concurrency problem caused by multithreaded execution. It occurs frequently but depends on timing, so we cannot provide a POC that succeeds 100% of the time. We can, however, offer some ideas for triggering the bug:
Method 1:
At present, it seems that certain situations (such as changes in sensor information or the end of a lifecycle) lead nav2_costmap_2d to perform the operation "clear entirely the global_costmap".
Therefore, one could try the following:
While the program is executing a nav2_goal action normally, send interfering sensor messages that constantly force nav2_costmap_2d to clear or recalculate, and observe whether the same bug occurs.
Method 2:
This method is only for checking the bug; it is not a real POC:
While the program is executing a nav2_goal action normally, try restarting the nodes related to nav2_costmap_2d.
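The restart idea in Method 2 can also be imitated outside ROS with a small stress harness. The sketch below is an assumption-laden model, not a reproduction against nav2: `CostmapStub`, `CostmapSlot`, `StressResult` and `runStress` are hypothetical names. One thread keeps resetting and recreating the shared object while the calling thread performs planner-style lookups. Because every lookup takes a checked snapshot, the harness counts rejected lookups instead of crashing; removing the lock and the check would turn the same loop into the unsafe pattern described in the analysis. It may need to be built with `-pthread`.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the object behind costmap_ros_; not the nav2 class.
struct CostmapStub
{
  bool isCurrent() const { return true; }
};

// Shared slot that a "costmap" thread repeatedly resets and recreates.
class CostmapSlot
{
public:
  void recreate()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ptr_ = std::make_shared<CostmapStub>();
  }

  void reset()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    ptr_.reset();
  }

  // Returns a copy of the pointer; it may be null if the costmap was released.
  std::shared_ptr<CostmapStub> snapshot() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return ptr_;
  }

private:
  mutable std::mutex mutex_;
  std::shared_ptr<CostmapStub> ptr_;
};

struct StressResult
{
  std::size_t served = 0;    // lookups that found a live costmap
  std::size_t rejected = 0;  // lookups that found the pointer released
};

// Performs `queries` planner-side lookups while another thread keeps
// resetting and recreating the costmap.
StressResult runStress(std::size_t queries)
{
  CostmapSlot slot;
  slot.recreate();
  std::atomic<bool> stop{false};

  std::thread restarter([&slot, &stop] {
    while (!stop.load()) {
      slot.reset();
      slot.recreate();
    }
  });

  StressResult result;
  for (std::size_t i = 0; i < queries; ++i) {
    std::shared_ptr<CostmapStub> costmap = slot.snapshot();
    if (costmap && costmap->isCurrent()) {
      ++result.served;
    } else {
      ++result.rejected;
    }
  }

  stop.store(true);
  restarter.join();
  return result;
}
```

The split between `served` and `rejected` varies from run to run, which mirrors the report's point that the collision depends on timing.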