Introduce a RMF transportation workcell #42

luca-della-vedova · 2024-12-11T08:23:43Z

This PR introduces RMF integration into nexus, where RMF is a workcell, managed by the workcell orchestrator, that is capable of executing transportation tasks through a new behavior tree and set of capabilities.

This is at a simple demo stage. I brought in a modified office world, where the only modification is renaming the dispensers to match the workcell names. I also added a new launch file to nexus_integration_tests that launches RMF together with Nexus, and changed the movement of items to be based on an AMR rather than a mock transporter.

Test it!

Clone, build and run:

ros2 launch nexus_integration_tests depot.launch.xml headless:=false

Submit a task:

ros2 action send_goal /system_orchestrator/execute_order nexus_orchestrator_msgs/action/ExecuteWorkOrder "{order: {id: '23', work_order: '$(cat config/pick_and_place.json)'}}"

You should see the transportation happening:

Screencast.from.2024-12-12.18-10-14.webm

PR breakdown

The PR is large, but I'll try to summarize the main (and potentially controversial) design decisions I made.

nexus_integration_tests vs nexus_demos

It would be more natural to create a new nexus_demos package that contains the bringup. I got halfway there before realising it would make the diff explode even further, so I went for an initial approach that minimizes the changes to nexus_integration_tests. We can then do a follow-up PR that splits the package into nexus_demos and nexus_integration_tests, or maybe just renames it.

Task cancellation

As noted in #40, the cancellation behavior of the workcell can't be customized and defaults to letting tasks run to completion. This means that the RMF task will not be cancelled, and if a robot is halfway through a long task and waiting for a workcell whose task was cancelled, it will wait indefinitely. Once #40 is addressed we should add task cancellation to the TransportAmr capability.

Is task doable / navgraph checking

As noted in #41, the payload can't be used for verifying task capability. Transportation tasks have a payload with a list of destinations, and the check currently always returns true regardless of whether the destinations exist. A more advanced capability check that, for example, verifies that the waypoints exist in the fleet's navgraph would be a better design.
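
A minimal sketch of what such a check could look like, assuming the destinations and the navgraph waypoint names are both available as plain strings. The function names and the data shapes are hypothetical; the PR does not specify the payload schema or how the navgraph would be queried.

```python
def missing_destinations(destinations, navgraph_waypoints):
    """Return the destinations that are not waypoints in the navgraph, in order."""
    known = set(navgraph_waypoints)
    return [d for d in destinations if d not in known]


def is_task_doable(destinations, navgraph_waypoints):
    """Hypothetical capability check: doable only if every destination exists."""
    destinations = list(destinations)
    # An empty destination list is treated as not doable in this sketch.
    return bool(destinations) and not missing_destinations(
        destinations, navgraph_waypoints
    )
```

Returning the missing names separately makes it possible to report which destination caused the rejection instead of only returning false.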

Map annotation

Visualizing the workcell requires its position to be populated, but Nexus (and the workcell orchestrator) currently has no way to populate this information.
For now, just for the sake of visualization, I wrote a node that subscribes to the /map topic, looks for all waypoints with the pickup_dispenser property and uses their locations to populate markers. It then subscribes to states and updates the markers.
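
The waypoint lookup described above can be sketched roughly as follows. This is an illustration only: the Waypoint class and its properties dict are hypothetical stand-ins for whatever the /map message actually carries, and the real node is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Waypoint:
    # Hypothetical simplified waypoint: a position plus named properties.
    x: float
    y: float
    properties: dict = field(default_factory=dict)


def workcell_marker_positions(waypoints):
    """Map each pickup_dispenser name to the (x, y) of its waypoint."""
    positions = {}
    for wp in waypoints:
        name = wp.properties.get("pickup_dispenser")
        if name:  # skip waypoints without a dispenser property
            positions[name] = (wp.x, wp.y)
    return positions
```

Since the dispensers in the demo world are named after the workcells, the resulting keys double as workcell names for the markers.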

A better long-term design would involve passing the workcell's location to the workcell orchestrator, forwarding it to the system orchestrator at registration, and refactoring the visualization node to regularly call the /list_workcells service to discover new workcells. I deferred this to avoid adding a large diff to the workcell orchestrator node and to keep changes strictly additive for review simplicity.

Signaling

I introduced the capability of receiving signals in the system orchestrator, and changed the default behavior tree to wait for the AMR before starting the workcell, rather than halfway through. This was done to improve reliability in case of parallel tasks (i.e. there is no risk of a workcell starting a task, just for the wrong AMR to come in), but parallel tasks are still not quite there, so I'm not sure if it is still needed. An example of a behavior tree that implements this new logic is here.

What's next

Many things! But this PR is already at a very large size and I tried to keep the diff minimal (where I liberally define "diff" as pre-existing files that are changed and risk breaking existing behavior, not new additions that are more likely to be safe).

Create a Gazebo simulation that includes workcells together with AMRs

Right now the workcells are not simulated in Gazebo. It would be great to have a proper simulation world so users can inspect what is happening.
Often these workcells have conveyor belts to feed the items to / from the AMRs, these would also be valuable additions.

Simulate humans for workcells that are manually operated

In real life, not all workcells are automated and some are just operated by humans. We could mock this in simulation by just having a human in the dropoff point and a special behavior tree that just waits for an input.

Task parallelism

Currently submitting parallel tasks can risk deadlocking the system, since RMF and Nexus are somewhat independent. We should revisit the implementation to make sure we can have parallel tasks.

SKU Tracking

It would be interesting to show the position and status of the SKUs in rviz. This is especially useful to know their state as they are being moved throughout the facility.

Better handling of workcell location and registration

As noted in the Map annotation section of the PR description, populate the information at workcell registration time and not by subscribing to a /map topic.

Post processing of waypoints for AMR tasks

Currently, whenever a work order is received, an AMR task that goes through all the workcells will be generated and each workcell will only be signaled to start when the AMR arrives.
This however, will be suboptimal in two corner cases:

If there is only one workcell and we don't want to use an AMR to transport, we will still request an AMR to the location which is unnecessary.
If there are multiple tasks being done by the same workcell, the AMR will have multiple "pickup" phases, although I believe this should be innocuous and just introduce some extra signaling.

It is a bit tricky to design a single behavior tree that works for all cases, so I would suggest using different behavior trees for different purposes, such as the first case.

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova · 2025-01-14T09:36:08Z

In f55bab2 I reverted the signaling at the system orchestrator level.
Now the workcell behavior trees are exactly the same regardless of whether it is a pick and place on a conveyor or on an AMR.
Sadly remapping gets in the way since we need to make sure we run the same workcell behavior tree but a different system orchestrator behavior tree (and not only the main.xml, but also the one that is loaded, pick_and_place.xml).

Furthermore, I explored the idea of removing all the duplicated behavior trees / work orders altogether in f9705a7. The idea is that if we just expose the remap_task_types parameter to the launch files, we can use it to make sure the same work order results in a different system orchestrator behavior tree but the same workcell behavior tree. I'm happy to revert it if this is not desirable.

aaronchongth · 2025-01-20T08:11:40Z

Although integration tests were passing, I noticed a weird behavior when running non-headless: workcell_2 seems to start moving just as the AMR starts leaving workcell_1. CI shows the same thing too.

I've managed to narrow it down to this flag, where despite the initialization to false, it appears to be true when WaitForAmr::onStart gets triggered.

This can be verified by checking out 5d3daf8 and commenting out this line. The printouts will show that _amr_ready is already true despite no dispenser requests being sent. Watch out for:

[nexus_workcell_orchestrator-14] [INFO] [1737357565.631919038] [rmf_nexus_transporter]: CHECKING: workcell [workcell_2], rmf_task_id [compose.dispatch-9ccd185db2], amr_ready: [amr ready]

Other than this line, everything else seems good.

edit: my guess is that the WaitForAmr BT node is being re-used within LoopDestination. When the AMR is at workcell_1, the flag is set back to false, but the callback is still triggered and sets it back to true (RMF continuously sends out dispenser requests every second until a result is provided).
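
To make the guess above concrete, here is a toy model of a re-used wait node. It is not the Nexus implementation: the class, its methods and the destination filter are all hypothetical, and the filter is only one possible mitigation, not necessarily what the PR does.

```python
class WaitForAmrSketch:
    """Toy stand-in for a wait node that is re-used across loop iterations."""

    def __init__(self, filter_by_destination):
        self.filter_by_destination = filter_by_destination
        self.destination = None
        self.amr_ready = False

    def on_start(self, destination):
        # Returns the flag value observed at the moment the node starts.
        self.destination = destination
        return self.amr_ready

    def on_finish(self):
        # Reset state once the current destination has been handled.
        self.destination = None
        self.amr_ready = False

    def on_dispenser_request(self, target):
        # Requests are repeated until a result is sent, so one for the
        # previous destination can still arrive after on_finish().
        if self.filter_by_destination and target != self.destination:
            return
        self.amr_ready = True


def flag_seen_at_second_start(node):
    node.on_start("workcell_1")
    node.on_dispenser_request("workcell_1")  # AMR arrives at workcell_1
    node.on_finish()
    node.on_dispenser_request("workcell_1")  # late, repeated request
    return node.on_start("workcell_2")
```

Without the filter the second start already sees the flag as true, which matches the printout; with the filter the stale request is ignored.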

* Working with the same commands
* Basic demo works with models
* Moving enclosures, removing in-between, using nested models, moving camera
* Use new released rmf_building_map_tools args, clean up, moved waypoints, added dispenser/ingestor
* Clean up duplicated and unused files, use rmf_transporter
* Remove duplicated depot

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

aaronchongth · 2025-01-22T07:18:04Z

Per discussion, there may exist an issue regarding multiple work orders at the same time. After investigating more and trying it out, I found that they were working as expected 🤔, with some observations,

cd nexus_integration_tests

# First order accepted
ros2 action send_goal /system_orchestrator/execute_order nexus_orchestrator_msgs/action/ExecuteWorkOrder "{order: {id: '23', work_order: '$(cat config/pick_and_place.json)'}}"

# While the first order is still being executed, run a second same order with a different order ID
ros2 action send_goal /system_orchestrator/execute_order nexus_orchestrator_msgs/action/ExecuteWorkOrder "{order: {id: '24', work_order: '$(cat config/pick_and_place.json)'}}"
# This gets rejected/aborted, saying failed to assign task to workcells as the task ID [2] already exists (this refers to the step ID)

# Save this new order somewhere, https://gist.github.com/aaronchongth/b5b92f140d539c33e0d0ec23b414d70c
# This new order just modifies the step IDs to 3.0 and 4.0
# While the first order is still being executed, send this new order
ros2 action send_goal /system_orchestrator/execute_order nexus_orchestrator_msgs/action/ExecuteWorkOrder "{order: {id: '24', work_order: '$(cat config/new_pick_and_place.json)'}}"
# This order gets accepted and starts being executed after order ID 23 is done

# When the order ID 23 has been completed, send in the same original order, with a different order ID,
ros2 action send_goal /system_orchestrator/execute_order nexus_orchestrator_msgs/action/ExecuteWorkOrder "{order: {id: '25', work_order: '$(cat config/pick_and_place.json)'}}"
# Order gets accepted, and only starts after order ID 24 is done

I haven't been able to replicate the behavior we discussed, where work orders interfere with each other during completion. However, to summarize some observations:

when a work order is being executed, sending in more work orders with the same step ID will be rejected (the above scenario)
when a work order is being executed, sending in work orders with different step IDs will be accepted
after a work order has been completed, sending in the same work order with a different order ID containing the same step IDs will be accepted
step ID is parsed as double, but retrieved as int, which causes step IDs of 1.0 and 1.1 to be treated as the same, so the second one gets rejected. Is there a reason we use doubles but parse as int?
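
The last observation comes down to truncation. A small sketch, assuming the int conversion simply drops the fractional part (the helper name is hypothetical and this does not reproduce the actual Nexus parsing code):

```python
def colliding_step_ids(step_ids):
    """Return pairs of step IDs that map to the same integer key."""
    seen = {}
    collisions = []
    for sid in step_ids:
        key = int(sid)  # int() truncates toward zero, so 1.1 becomes 1
        if key in seen:
            collisions.append((seen[key], sid))
        else:
            seen[key] = sid
    return collisions
```

With step IDs 1.0, 1.1 and 2.0, the first two share the key 1 and are reported as a collision, while 2.0 is unaffected.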

Yadunund · 2025-01-22T21:30:34Z

Thanks for investigating further.

I do think the current behavior where Step IDs also need to be unique is not ideal. Let's open a ticket and update the behavior? The Work Order ID should be unique. The id we pass to the workcell in the WorkcellTask request can be a unique combination of the Work Order ID and the Step ID.
We should parse the step ID as an integer and not a float (and update the work order definitions).
Regarding the parallel behavior:
- Even with unique Step IDs, it seems one Work Order needs to complete execution before another can begin. This is not the case with the conveyor plugin. If a workcell needed for the second order is available, it should be able to process the step for the second order. I just tried submitting two separate jobs (unique Step IDs) with use_rmf_transporter:=False and the system executes both in parallel, i.e. while Step 2 in Work Order 1 is being processed by workcell_2, workcell_1 is processing Step 1 in Work Order 2. I believe this is not possible with the RMF integration given the BT definition and might be related to the point about signaling that Luca made above.
- I would expect a second RMF bid to go out which gets assigned to the second AMR but the AMR is dispatched only when the required workcell is available.
- Could you check if there is a quick fix by modifying the BTs? Else let's open a ticket and tackle this in a follow up PR.
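
As a rough illustration of the ID suggestion at the top of this comment, the WorkcellTask id could be derived from both values. The function name and the "/" separator are hypothetical choices, not something decided in this PR:

```python
def workcell_task_id(work_order_id, step_id):
    """Combine a work order ID and an integer step ID into one task ID."""
    # Hypothetical format; assumes work order IDs do not contain "/".
    return f"{work_order_id}/{int(step_id)}"
```

With this scheme, step 2 of order '23' and step 2 of order '24' produce different task IDs, so the same work order definition could be submitted twice without the step IDs clashing.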

aaronchongth · 2025-01-23T10:09:17Z

Gotcha, I was actually looking into the issue that you showed me where, with 2 work orders in parallel, completion of the first work order somehow completed the second work order as well. But at least that weird scenario does not seem to be happening.

> I would expect a second RMF bid to go out which gets assigned to the second AMR but the AMR is dispatched only when the required workcell is available.

Thanks for flagging this other parallel work order scenario. Yeah, that is happening due to how the RMF transportation workcell's BT is currently designed, where the root BT takes care of dispatching an RMF task as well as keeping track of each looped destination in the RMF task (to handle the dispenser requests). IIUC, this means the RMF transportation workcell is never "done" until the whole work order is completed, so it can't dispatch another robot before then.

From the Signaling section in this PR's description, it looks like this is by design, to prevent the wrong AMR from reaching a workcell that is waiting for another AMR. Unfortunately I can't confidently think of a way to resolve this at the BT level. I have some ideas and will open a ticket for this particular situation to discuss more.

edit: opened #63 and #64

Yadunund

Thanks for trailblazing the approach of integrating RMF as a workcell that provides transportation services!

The experience here has been invaluable in better understanding the pros and cons of this approach vs integration via nexus_transporter. The main issue uncovered is that we can't run multiple work orders in parallel since workcells don't have the ability to run tasks in parallel yet. Further, the implementation here implicitly defines the task.type for workcells that perform transportation services, i.e. task.type = transportation, with an internal schema for the task params to include destinations/pickups etc. Lastly, all transportation steps required for a job are performed by the same transporter workcell. In practice, however, we might distribute these among different transporters (e.g. a conveyor for some segments and AMRs, or even different AMRs, for others).
I've opened a meta-ticket, #67, to track the things we need to implement to better support workcells as transporters in general.

For now we can merge this PR in and iterate in subsequent PRs.

luca-della-vedova added 9 commits December 11, 2024 16:19

Migrate to ros2dds bridge

308a4d8

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Bump ros2dds bridge version to fix warning spam

0cbb8ec

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Add source to transporter API

dc87dbf

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

WIP first draft of RMF integration

2871130

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Change signals to contain task ids

89fbbe4

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Move signaling from workcell to system orchestrator

a45f187

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Change RMF transporter to be a workcell instead

d9e80fe

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Fix cancellation, feedback

f5cd789

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Add TransportAmr capability and RMF workcell

89203ff

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova force-pushed the luca/rmf_transporter branch from 67b0faa to 89203ff Compare December 11, 2024 08:23

Revert transporter changes

7574be9

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova force-pushed the luca/rmf_transporter branch from 66d911d to 28bbd6b Compare December 11, 2024 09:18

Add missing dependency

cd666b0

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova force-pushed the luca/rmf_transporter branch from 28bbd6b to cd666b0 Compare December 12, 2024 03:10

luca-della-vedova added 6 commits December 12, 2024 11:30

Reintroduce signal queueing, cleanup debugs

9ee9b55

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Go back to task signaling

5802441

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Remove backup files

167061c

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Add visualization package

d3c54a3

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Add demo package based on rmf_demos

cac92bd

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Remove printout

08bc9d2

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova force-pushed the luca/rmf_transporter branch from 6dc0918 to 5eb6449 Compare December 13, 2024 07:23

luca-della-vedova changed the title ~~WIP: introduce a RMF transportation workcell~~ Introduce a RMF transportation workcell Dec 13, 2024

Move to nexus_integration_tests instead

3452bf7

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova force-pushed the luca/rmf_transporter branch from 5eb6449 to 3452bf7 Compare December 13, 2024 08:12

luca-della-vedova added 4 commits December 13, 2024 16:27

Fix integration test

e12f7b6

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Fix repos file, reintroduce comprehensive test

4e83aa3

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Make sure AMRs are up before sending task

4fe4811

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Fix copyrights for new files

8540f3a

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Base automatically changed from luca/ros2dds_bridge to main December 26, 2024 07:59

Merge branch 'main' into luca/rmf_transporter

aefa7a7

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Yadunund and others added 3 commits January 13, 2025 21:51

Merge branch 'main' into luca/rmf_transporter

f6dc652

Remove signaling at the system orchestrator level

f55bab2

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Remove duplicated behavior trees and use remapping

f9705a7

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

luca-della-vedova and others added 2 commits January 14, 2025 17:37

Remove unnecessary variable

0ad6bea

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Update copyrights

541083e

Signed-off-by: Yadunund <yadunund@gmail.com>

aaronchongth mentioned this pull request Jan 15, 2025

Set up depot map #61

Closed

3 tasks

Fix launch

bc73f23

Signed-off-by: Yadunund <yadunund@gmail.com>

Yadunund force-pushed the luca/rmf_transporter branch from 5c714dd to bc73f23 Compare January 18, 2025 01:49

aaronchongth added 2 commits January 20, 2025 15:52

Fix potential UB with comments, updated README to build with rmf.repos

5d3daf8

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

Removed comments

1795826

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

Use new DeliveryRobotWithConveyor model from fuel

5aa3daa

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

Yadunund added 3 commits January 22, 2025 22:01

Make remap_task_types and rviz_config launch args

285def7

Signed-off-by: Yadunund <yadunund@gmail.com>

Move maps into config/rmf

43ec932

Signed-off-by: Yadunund <yadunund@gmail.com>

Also make bt_path and max_jobs launch args

3e9a291

Signed-off-by: Yadunund <yadunund@gmail.com>

aaronchongth mentioned this pull request Jan 23, 2025

WorkcellTask ID should be unique, and not based on the work order step ID #63

Open

rmf_demos_fleet_adapter available via rosdep

43978e0

Signed-off-by: Yadunund <yadunund@gmail.com>

Yadunund force-pushed the luca/rmf_transporter branch from f58c48d to 43978e0 Compare January 23, 2025 04:00

Set nested models as static as well

5fff133

Signed-off-by: Aaron Chong <aaronchongth@gmail.com>

aaronchongth mentioned this pull request Jan 23, 2025

RMF integration does not yet support parallel work orders #64

Open

Rename dispatch_transporter to assign_transporter_workcell

6c6de52

Signed-off-by: Yadunund <yadunund@gmail.com>

Yadunund mentioned this pull request Jan 25, 2025

Towards supporting transporters as workcells #67

Open

4 tasks

Yadunund approved these changes Jan 25, 2025

Yadunund merged commit 7479795 into main Jan 25, 2025
3 checks passed

Yadunund mentioned this pull request Jan 27, 2025

Support multiple destinations with actions in nexus_transporter #69

Open
