Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Executing action server but nothing is ready #2226

Open
EvgeniNF opened this issue Jun 24, 2023 · 18 comments
Open

Executing action server but nothing is ready #2226

EvgeniNF opened this issue Jun 24, 2023 · 18 comments
Labels
more-information-needed Further information is required

Comments

@EvgeniNF
Copy link

EvgeniNF commented Jun 24, 2023

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
  • Installation type:
    • from source
  • Version or commit hash:
    • iron
  • DDS implementation:
    • Cyclone dds
  • Client library (if applicable):
    • rclcpp_action

Steps to reproduce issue

I am do stress test for action server and catch this error on random request. How i can fix that?

using Exec = devicelib_msgs::action::Exec;
using GoalHandle = rclcpp_action::ServerGoalHandle<Exec>;

class Server final : public rclcpp::Node {
   public: 
    Server() : rclcpp::Node("action_server") {
        using namespace std::placeholders;

        m_cb = create_callback_group(rclcpp::CallbackGroupType::Reentrant);
        m_server = rclcpp_action::create_server<Exec>(
            this,
            "/test/server",
            std::bind(&Server::goalHandle, this, _1, _2),
            std::bind(&Server::handleCancel, this, _1),
            std::bind(&Server::handleAccepted, this, _1),
            rcl_action_server_get_default_options(),
            m_cb
        );
    }

    Server(Server const &) = delete;
    Server(Server &&) noexcept = delete;
    Server& operator=(Server const &) = delete;
    Server& operator=(Server &&) noexcept = delete;
    ~Server() noexcept = default;

   private:
    rclcpp_action::GoalResponse goalHandle(rclcpp_action::GoalUUID const &uuid,
                                           Exec::Goal::ConstSharedPtr goal) {
        RCLCPP_INFO_STREAM(get_logger(), "Goal handle");
        return rclcpp_action::GoalResponse::ACCEPT_AND_EXECUTE;
    }
    
    rclcpp_action::CancelResponse handleCancel(std::shared_ptr<GoalHandle> const goalHandle) {
        RCLCPP_INFO_STREAM(get_logger(), "Handle cancel");
        return rclcpp_action::CancelResponse::ACCEPT;
    }

    void handleAccepted(std::shared_ptr<GoalHandle> const goalHandle) {
        RCLCPP_INFO_STREAM(get_logger(), "Handle accepted");
        std::thread([goal = goalHandle, this] () -> void {
            this->execute(std::move(goal));       
        }).detach();
    }

    void execute(std::shared_ptr<GoalHandle> const goalHandle) {
        RCLCPP_INFO_STREAM(get_logger(), "Start execute");
        auto const goal = goalHandle->get_goal();
        auto const feedback = std::make_shared<Exec::Feedback>();
        auto const result = std::make_shared<Exec::Result>();

        size_t counter = 0;
        for (auto const &_: goal->details) {
            if (goalHandle->is_canceling()) {
                result->message = "Canceled by peer";
                goalHandle->canceled(result);
                RCLCPP_INFO(get_logger(), "Goal canceled");
                return;
            }

            RCLCPP_INFO_STREAM(get_logger(), "Start detail: " << counter);
            std::this_thread::sleep_for(std::chrono::milliseconds(100));

            feedback->num_executed_details = counter;
            feedback->command_status.command_name = "command";
            counter++;
            feedback->command_status.seconds = goal->details.size() - counter;
            goalHandle->publish_feedback(feedback);
            RCLCPP_INFO_STREAM(get_logger(), "Publish feedback");
        } 

        if (rclcpp::ok()) {
            goalHandle->succeed(result);
            RCLCPP_INFO_STREAM(get_logger(), "Goal success");
        }
    }

   private:
    rclcpp_action::Server<Exec>::SharedPtr m_server;
    rclcpp::CallbackGroup::SharedPtr m_cb;
};

int main(int argc, char** argv) {
    rclcpp::init(argc, argv);

    auto const server = std::make_shared<Server>();

    rclcpp::executors::MultiThreadedExecutor executor;
    executor.add_node(server);
    executor.spin();

    rclcpp::shutdown();
}

Expected behavior

Normal client/server action behavior where each client sends a goal and receives a result from the server.

Actual behavior

Node Server throws an exception and crashes

terminate called after throwing an instance of 'std::runtime_error'
  what():  Executing action server but nothing is ready

Additional information


Feature request

Feature description

Implementation considerations

@EvgeniNF EvgeniNF reopened this Jun 24, 2023
fujitatomoya added a commit to fujitatomoya/ros2_test_prover that referenced this issue Jun 26, 2023
  ros2/rclcpp#2226

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya
Copy link
Collaborator

@EvgeniNF thanks for creating issue.

i tried to use your sample here fujitatomoya/ros2_test_prover@1938d8d, i did modify several places to build it with rolling.
so far i do not have the problem described here.

root@tomoyafujita:~/ros2_ws/colcon_ws# source install/local_setup.bash 
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclcpp rclcpp_2226 
[INFO] [1687800854.360839751] [action_server]: Goal handle
[INFO] [1687800854.361871421] [action_server]: Handle accepted
[INFO] [1687800854.361973245] [action_server]: Executing goal
[INFO] [1687800854.362307637] [action_server]: Publish feedback
[INFO] [1687800855.362249651] [action_server]: Publish feedback
[INFO] [1687800856.362233943] [action_server]: Publish feedback
[INFO] [1687800857.362178916] [action_server]: Publish feedback
[INFO] [1687800858.362253362] [action_server]: Publish feedback
[INFO] [1687800859.362238926] [action_server]: Publish feedback
[INFO] [1687800860.362247690] [action_server]: Publish feedback
[INFO] [1687800861.362286352] [action_server]: Publish feedback
[INFO] [1687800862.362336140] [action_server]: Publish feedback
^C[INFO] [1687800972.893649435] [rclcpp]: signal_handler(signum=2)

Ubuntu 20.04

This is not supported platform Iron, https://docs.ros.org/en/rolling/Releases/Release-Iron-Irwini.html#supported-platforms. can you try with Ubuntu 22.04?

@fujitatomoya fujitatomoya added the more-information-needed Further information is required label Jun 26, 2023
@EvgeniNF
Copy link
Author

@fujitatomoya Ok, thanks for reply, i try run this code on ubuntu 22.04

@EvgeniNF
Copy link
Author

EvgeniNF commented Jun 27, 2023

@fujitatomoya I have the same problem on ubuntu 22.04.02 (64-bit). I noticed what on Fast DDS this error occurs less frequently.

[INFO] [1687871330.810723412] [action_server]: Goal success
[INFO] [1687871330.907057276] [action_server]: Publish feedback
[INFO] [1687871331.007894935] [action_server]: Publish feedback
[INFO] [1687871331.108666245] [action_server]: Publish feedback
[INFO] [1687871331.209445981] [action_server]: Publish feedback
[INFO] [1687871331.211895488] [action_server]: Goal success
[INFO] [1687871331.270255007] [action_server]: Goal handle
[INFO] [1687871331.271282290] [action_server]: Handle accepted
[INFO] [1687871331.271716174] [action_server]: Start execute
[INFO] [1687871331.273183100] [action_server]: Goal handle
[INFO] [1687871331.274387011] [action_server]: Handle accepted
[INFO] [1687871331.274578657] [action_server]: Goal handle
[INFO] [1687871331.274843450] [action_server]: Start execute
[INFO] [1687871331.275820549] [action_server]: Handle accepted
[INFO] [1687871331.276691501] [action_server]: Start execute
terminate called after throwing an instance of 'std::runtime_error'
  what():  Executing action server but nothing is ready
[ros2run]: Aborted

@EvgeniNF
Copy link
Author

I think it is race condition, because i catch this error on random request, but i don't understand when called method take_data(). Could you explain me how it's work, please. I want help close this problem

@EvgeniNF
Copy link
Author

I noticed one more thing. When i am using single thread executor for spin services node, i don't catch error. Now i am testing this thing on ubuntu 20.04

@fujitatomoya
Copy link
Collaborator

with fujitatomoya/ros2_test_prover@1938d8d, and the following stress test for hour, i cannot reproduce this issue in my local environment.

  • terminal-1
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclcpp rclcpp_2226 
[INFO] [1687988184.470012330] [action_server]: Goal handle
[INFO] [1687988184.470281262] [action_server]: Handle accepted
[INFO] [1687988184.470335165] [action_server]: Executing goal
[INFO] [1687988184.470407129] [action_server]: Publish feedback
[INFO] [1687988185.470686883] [action_server]: Publish feedback
[INFO] [1687988186.470699443] [action_server]: Publish feedback
[INFO] [1687988187.470627795] [action_server]: Publish feedback
[INFO] [1687988188.470713334] [action_server]: Publish feedback
...
  • terminal-2
root@tomoyafujita:~/ros2_ws/colcon_ws# while true; do ros2 action send_goal /test/server action_tutorials_interfaces/action/Fibonacci order:\ 10; done
Waiting for an action server to become available...
Sending goal:
     order: 10

Goal accepted with ID: d4cdf1b178ce4c06a55702b1e514d845
...

@EvgeniNF
Copy link
Author

@fujitatomoya try to calling the action server with multiple clients in same time

@EvgeniNF
Copy link
Author

@fujitatomoya and i noticed in test case you don't call goalHandle->succeed(result);

fujitatomoya added a commit to fujitatomoya/ros2_test_prover that referenced this issue Jun 29, 2023
  ros2/rclcpp#2226

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
@fujitatomoya
Copy link
Collaborator

try to calling the action server with multiple clients in same time

  • action server
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run prover_rclcpp rclcpp_2226
...
  • action clients

make sure we have more process spaces that can be ON_PROC to the server.

root@tomoyafujita:~/ros2_ws/colcon_ws# cat /proc/cpuinfo | grep processor | wc -l
16

root@tomoyafujita:~/ros2_ws/colcon_ws# cat test.sh
#!/bin/bash

for i in {1..32}
do
    echo action request sending $i ---
    ros2 action send_goal /test/server action_tutorials_interfaces/action/Fibonacci order:\ 10 &
done

root@tomoyafujita:~/ros2_ws/colcon_ws# ./test.sh
...<snip>

all 32 goals have been completed and confirmed.

btw, i had the following different error on the server side.

terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
  what():  client will not receive response, at /root/ros2_ws/colcon_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_response.cpp:149, at /root/ros2_ws/colcon_ws/src/ros2/rcl/rcl/src/rcl/service.c:379

this is know issue, ros2/ros2#1253

@EvgeniNF
Copy link
Author

EvgeniNF commented Jun 30, 2023

@fujitatomoya Ok, i will try your test, maybe my client code is bad...

evgeniy@evgeniy-GL753VD:~$ cat /proc/cpuinfo | grep processor | wc -l
8

My client code:

#include <functional>
#include <future>
#include <memory>
#include <string>
#include <rclcpp/rclcpp.hpp>
#include <example_interfaces/srv/add_two_ints.hpp>

using Srv = example_interfaces::srv::AddTwoInts;
using Request = Srv::Request::SharedPtr;
using Responce = Srv::Response::SharedPtr;

class Client {
   public: 
    Client(rclcpp::Node::SharedPtr node) : m_node(std::move(node)) {
          if (not m_node->has_parameter("path")) {
             m_node->declare_parameter("path", "/test/service");
          }

          m_client = m_node->create_client<Srv>(
          m_node->get_parameter("path").as_string()
       );
    }
    Client(Client const &) = delete;
    Client(Client &&) noexcept = delete;
    Client& operator=(Client const &) = delete;
    Client& operator=(Client &&) noexcept = delete;
    ~Client() noexcept;

   public:
    std::shared_future<Responce> sendRequest(Request goal) {
      if (not m_client->wait_for_service(std::chrono::seconds(2))) {
          throw std::runtime_error("Service is not ready");
      }

      auto responce = m_client->async_send_request(goal);
      return responce;
    }

   private:
    rclcpp::Node::SharedPtr m_node;
    rclcpp::Client<Srv>::SharedPtr m_client;
};

int main(int argc, char** argv) {
    rclcpp::init(argc, argv);
    auto node = rclcpp::Node::make_shared("client");
    auto client1 = std::make_shared<Client>(node);
    auto client2 = std::make_shared<Client>(node);
    auto client3 = std::make_shared<Client>(node);
    

    while (rclcpp::ok()) {
        try {
            std::vector<std::shared_future<Responce>> futures;
            auto const req = std::make_shared<Srv::Request>();
            
            req->a = 1;
            req->b = 1;
            futures.push_back(client1->sendRequest(req));

            req->a = 2;
            req->b = 2;
            futures.push_back(client2->sendRequest(req));

            req->a = 3;
            req->b = 3;
            futures.push_back(client3->sendRequest(req));

            for (auto &future: futures) {
                rclcpp::spin_until_future_complete(node, future);
                auto result = future.get();

                if (not result) {
                    throw std::runtime_error("result is empty");
                }
            }
            
            RCLCPP_INFO_STREAM(rclcpp::get_logger("client"), "Result is success");
        } catch (std::exception const &error) {
            RCLCPP_ERROR_STREAM(rclcpp::get_logger("client"), error.what());
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    }
    rclcpp::shutdown();    
}

@EvgeniNF
Copy link
Author

@fujitatomoya With SingleThreadExecuter i don't have any problems on server. I tested it all day yesterday

@EvgeniNF
Copy link
Author

@fujitatomoya About error: I am using CycloneDDS and i saw in this issue https://github.com/ros2/ros2/issues/1253 it happens if you use FastDDS.

@jmachowinski
Copy link
Contributor

This bug and
#2242 are related.
#2250 should also fix this.
@EvgeniNF can you try this out, you need to compile your own ros though.

@tonynajjar
Copy link
Contributor

I also have this issue with this setup:

OS: ubuntu 22
Architecture: ARM
DDS: Cyclone
Executor: EventsExecutor (https://github.com/irobot-ros/events-executor because the rclcpp version is not available in humble)

I was able to catch this backtrace if it's useful for you

[behavior_server-5] terminate called after throwing an instance of 'std::runtime_error'
[behavior_server-5]   what():  Taking data from action server but nothing is ready
[behavior_server-5] Stack trace (most recent call last):
[behavior_server-5] #15   Object "/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1", at 0xffffffffffffffff, in 
[behavior_server-5] #14   Object "/code/ros2_ws/build/nav2_behaviors/behavior_server", at 0x559047856f, in _start
[behavior_server-5] #13   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0x7f7ce574cb, in __libc_start_main
[behavior_server-5] #12   Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0x7f7ce573fb, in 
[behavior_server-5] #11   Source "/code/ros2_ws/src/navigation2/nav2_behaviors/src/main.cpp", line 30, in main [0x5590478753]
[behavior_server-5]          27:   auto executor = std::make_shared<rclcpp::executors::EventsExecutor>();
[behavior_server-5]          28: 
[behavior_server-5]          29:   executor->add_node(recoveries_node->get_node_base_interface());
[behavior_server-5]       >  30:   executor->spin();
[behavior_server-5]          31:   rclcpp::shutdown();
[behavior_server-5]          32: 
[behavior_server-5]          33:   return 0;
[behavior_server-5] #10   Object "/code/ros2_ws/install/irobot_events_executor/lib/libirobot_events_executor.so", at 0x7f7d4f7723, in rclcpp::executors::EventsExecutor::spin()
[behavior_server-5] #9    Object "/code/ros2_ws/install/irobot_events_executor/lib/libirobot_events_executor.so", at 0x7f7d4f6e37, in rclcpp::executors::EventsExecutor::execute_event(rclcpp::executors::ExecutorEvent const&)
[behavior_server-5] #8    Object "/opt/ros/humble/lib/librclcpp_action.so", at 0x7f7c1d9dab, in rclcpp_action::ServerBase::take_data_by_entity_id(unsigned long)
[behavior_server-5] #7    Object "/opt/ros/humble/lib/librclcpp_action.so", at 0x7f7c1dcef3, in rclcpp_action::ServerBase::take_data()
[behavior_server-5] #6    Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0x7f7d0b2d23, in __cxa_throw
[behavior_server-5] #5    Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0x7f7d0b2a3f, in std::terminate()
[behavior_server-5] #4    Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0x7f7d0b29db, in 
[behavior_server-5] #3    Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0x7f7d0b51fb, in __gnu_cxx::__verbose_terminate_handler()
[behavior_server-5] #2    Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0x7f7ce5712f, in abort
[behavior_server-5] #1    Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0x7f7ce6a67b, in raise
[behavior_server-5] #0    Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0x7f7ceaf200, in 
[behavior_server-5] Aborted (Signal sent by tkill() 116 0)

I will try to see if I can reproduce with the SingleThreadedExecutor.

@EvgeniNF
Copy link
Author

@tonynajjar If you will use SingleThreadedExecutor or callback group for sync action service this error don't happen. i solved it with Mutually Exclusive Callback Group

@tonynajjar
Copy link
Contributor

i solved it with Mutually Exclusive Callback Group

How did you solve it?

@EvgeniNF
Copy link
Author

EvgeniNF commented Aug 29, 2023

@tonynajjar O sorry, simply i use for action server Mutually Exclusive Callback Group.

m_cb = create_callback_group(rclcpp::CallbackGroupType::MutuallyExclusive);
        m_server = rclcpp_action::create_server<Exec>(
            this,
            "/test/server",
            std::bind(&Server::goalHandle, this, _1, _2),
            std::bind(&Server::handleCancel, this, _1),
            std::bind(&Server::handleAccepted, this, _1),
            rcl_action_server_get_default_options(),
            m_cb
        );

@jmachowinski
Copy link
Contributor

This should be fixed by the merge of #2531.
Can anyone affected by this bug please check and confirm ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
more-information-needed Further information is required
Projects
None yet
Development

No branches or pull requests

4 participants