[ergoCubSN000] Robot not starting from yarpmanager #1543

Open
lrapetti opened this issue Apr 20, 2023 · 30 comments
Assignees
Labels
ergoCub 1.0 S/N:000 ergoCub robot (prototype) pinned This label prevents an issue from being closed automatically

Comments

@lrapetti
Member

lrapetti commented Apr 20, 2023

Device name 🤖

ergoCubSN000

Request/Failure description

When starting the robot from the yarpmanager we are getting the error:

hostTransceiver()::parse() detected an ERROR in sequence number from IP = 10.0.1.4. Expected: 4508, Received: 4888, Missing: 380, Prev Frame TX at 234388298 us, This Frame TX at 235150298 us

This has also been happening over the last few days.
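
The numbers in the error message can be cross-checked with a short script. The following is only a minimal sketch: the regular expression is written against the single line quoted in this issue, and the names `PATTERN` and `summarize_gap` are hypothetical helpers, not part of YARP or icub-main. It extracts the fields and computes the sequence gap and the time between the two transmission timestamps.

```python
import re

# Regex written against the one error line quoted in this issue (an assumption:
# other firmware or software versions may format the message differently).
PATTERN = re.compile(
    r"from IP = (?P<ip>\d+\.\d+\.\d+\.\d+)\. "
    r"Expected: (?P<expected>\d+), Received: (?P<received>\d+), "
    r"Missing: (?P<missing>\d+), "
    r"Prev Frame TX at (?P<prev_us>\d+) us, "
    r"This Frame TX at (?P<this_us>\d+) us"
)


def summarize_gap(line):
    """Return the board IP, the sequence gap and the TX time gap, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    expected = int(match["expected"])
    received = int(match["received"])
    prev_us = int(match["prev_us"])
    this_us = int(match["this_us"])
    return {
        "ip": match["ip"],
        "missing": int(match["missing"]),
        # Difference between the received and the expected sequence numbers.
        "sequence_gap": received - expected,
        # Time between the two frame transmissions, converted from us to ms.
        "gap_ms": (this_us - prev_us) / 1000.0,
    }


LINE = (
    "hostTransceiver()::parse() detected an ERROR in sequence number "
    "from IP = 10.0.1.4. Expected: 4508, Received: 4888, Missing: 380, "
    "Prev Frame TX at 234388298 us, This Frame TX at 235150298 us"
)

summary = summarize_gap(LINE)
print(summary)
```

For the line quoted above, the sequence gap matches the reported `Missing` value (380), and the two transmission timestamps are 762 ms apart, i.e. roughly 2 ms per missing frame.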

Detailed context

Here is a log saved from yarplogger

yarprunlog_20_04_2023_16_49_52.log

Additional context

No response

How does it affect you?

No response

cc @CarlottaSartore @DanielePucci

@lrapetti
Member Author

This was observed also by @AntonioConsilvio

@sgiraz
Contributor

sgiraz commented Apr 20, 2023

It may be a network issue. We'll check that asap.
Stay tuned 📢

@AntonioConsilvio
Contributor

Hi @lrapetti, @sgiraz and I have updated all the robot boards and the robot software. Now the robot seems to start from yarpmanager without any problems.

Please give us feedback on how the robot behaves!

@lrapetti
Member Author

As per #1544 (comment)

@sgiraz
Contributor

sgiraz commented Apr 23, 2023

Closing!

@sgiraz sgiraz closed this as completed Apr 23, 2023
@lrapetti lrapetti reopened this Apr 26, 2023
@lrapetti
Member Author

Reopening since the issue happened again today.

Here are a few logs
yarprunlog_26_04_2023_16_11_29.log
yarprunlog_26_04_2023_16_17_59.log
yarprunlog_26_04_2023_16_15_34.log
Again, when running from the terminal, the problem did not happen.

Note that this time we did not experience something like #1544

@sgiraz
Contributor

sgiraz commented Apr 26, 2023

cc @AntonioConsilvio

@GiulioRomualdi
Member

The problem just happened now.

Here is another log:
log_ergocub-torso_yarprobotinterface_2764.txt

@pattacini
Member

Hi @GiulioRomualdi @lrapetti

When launching yarprobotinterface from the console, do you spot any error/warning that can be related somehow?

@traversaro @maggia80 and I remember that something similar was occurring on iCub3 in L.A. (or even a bit earlier), although we haven't tracked it down in the proper issue yet. Perhaps @S-Dafarra has a reference to share.

@GiulioRomualdi
Member

Hi @pattacini, we noticed that when the interface is started from the manager, we get a lot of host transceiver errors. This does not happen if it is started from the terminal.

@Nicogene
Member

Nicogene commented May 2, 2023

Hi @pattacini, we noticed that when the interface is started from the manager, we get a lot of host transceiver errors. This does not happen if it is started from the terminal.

Hi @GiulioRomualdi @lrapetti,

Is yarplogger running when you try to run yarprobotinterface?

An interesting test could be disabling the log from yarprun:

[image]

because basically, the only difference between running it with yarpmanager or from terminal should be the streaming of the log

@mebbaid

mebbaid commented May 4, 2023

It seems also yarprun on the head is not able to start from the yarpmanager. However I am able to start it from the terminal. Not sure if this is related

@sgiraz sgiraz added this to the ICRA 2023 💂🏻‍♀️ milestone May 4, 2023
@mebbaid

mebbaid commented May 5, 2023

It seems also yarprun on the head is not able to start from the yarpmanager. However I am able to start it from the terminal. Not sure if this is related

Today, I was able to start yarprun from the yarpmanager. I am not sure whether the issue was related to multiple yarprun processes running on the head.

@sgiraz
Contributor

sgiraz commented May 10, 2023

OK @mebbaid let us know if the problem persists.

Not sure if the issue was related to multiple yarprun processes running on the head.

To check this, you may try to open multiple yarprun instances from the terminal on the same SERVERPORT used by the yarpmanager and see what happens.

@S-Dafarra

Hold on, we are discussing two different issues. The issue @mebbaid mentioned is related to launching yarprun on the Xavier. I think this is due to the long time it takes to log in via ssh, which sometimes causes yarpmanager to think that yarprun did not start. This is unrelated to the initial problem, where the yarprobotinterface refuses to start when launched from yarpmanager on the torso (hence a different machine).

I tried launching it from the terminal with YARP_LOG_FORWARD_ENABLE and it works. I guess that the only difference is that, when launched from the terminal, the yarprobotinterface is slowed down a bit by the terminal output. It seems fishy to me: it is as if the initial communication with some boards were slower than usual.

@sgiraz
Contributor

sgiraz commented May 17, 2023

Hi guys,

@AntonioConsilvio spotted that it happens when you try to run both yarprobotinterface and yarplogger at the same time. It works fine if you run the yarprobotinterface first, then run the yarplogger after a while.
As suggested (💡) by @davidelasagna we may try to use the wired ETH connection and check if the problem persists when they are launched together. If yes, it may be either a bandwidth or router configuration issue.

Notes:

  • yarpmanager has been launched from the laptop.
  • ⚠️ The solution to run the yarprobotinterface and then the yarplogger (e.g. after the calibration) doesn't allow us to catch all the possible errors/warnings that happen during the startup of the robot.

cc @Nicogene @S-Dafarra @lrapetti

@sgiraz sgiraz removed this from the ICRA 2023 💂🏻‍♀️ milestone Jun 16, 2023
@sgiraz
Contributor

sgiraz commented Jul 5, 2023

This issue has been open for a long time without any recent activity, and it appears that a solution has been found (or at least identified).
Therefore, I will proceed to close it. However, please feel free to reopen it if necessary.

@sgiraz sgiraz closed this as completed Jul 5, 2023
@GiulioRomualdi
Member

GiulioRomualdi commented Jul 6, 2023

I would avoid closing it, since we have to run the interface from the terminal in order to have the robot running, and this is not the standard way to use the robot.
If you think this is not the right place for the issue, we can move it somewhere else.

@lrapetti
Member Author

I would avoid closing it, since we have to run the interface from the terminal in order to have the robot running, and this is not the standard way to use the robot. If you think this is not the right place for the issue, we can move it somewhere else.

I agree with @GiulioRomualdi, the issue does not seem to be solved. Let us know if you want to track the problem in a different location.

@lrapetti lrapetti reopened this Jul 10, 2023
@github-actions

This issue has been automatically marked as stale because it did not have recent activity. It will be closed if no further activity occurs.

@github-actions github-actions bot added the stale This issue will be soon closed automatically label Sep 19, 2023
@S-Dafarra

This is still happening I guess

@pattacini pattacini added pinned This label prevents an issue from being closed automatically and removed stale This issue will be soon closed automatically labels Sep 19, 2023
@pattacini
Member

Added the pinned label (it prevents an issue from being closed automatically).

@S-Dafarra

Now the issue also seems to happen, from time to time, when running the robot from the terminal. We experienced this during the IIT20y demo together with @AntonioConsilvio. Unfortunately, we do not have any logs due to #1645.

@S-Dafarra

@lrapetti @GiulioRomualdi @mebbaid @CarlottaSartore please try to summarize the cases in which this happens

@lrapetti
Member Author

lrapetti commented Oct 2, 2023

Until last week the situation was the following (it is documented in the comments above, up to #1543 (comment)):

  • from yarpmanager: if you launched the yarplogger and then started the robot's yarprobotinterface, it always failed with the error above.
  • from yarpmanager: if you launched the yarprobotinterface, waited for the robot to start the calibration, and then started the yarplogger, it was always able to start.
  • from terminal: if you launched the yarprobotinterface from the terminal, it was always able to start.

Since last week a new behaviour has been observed, as documented in #1543 (comment). Basically, sometimes the robot is not able to start even when launched from the terminal (I don't know if the failure is due to the same error).

@Nicogene
Member

Nicogene commented Nov 7, 2023

Related or at least similar to:

It would be interesting to see whether this is also happening on SN001.

cc @pattacini @marcoaccame

@Nicogene
Member

Nicogene commented Nov 8, 2023

It would be interesting to see whether this is also happening on SN001.

Today I had the chance to test this issue both on SN000 and SN001, and on SN001 it is not happening.
In theory, the only difference between the two robots is the fact that the SN001 is missing the forearm.
Talking w/ @maggia80 and @marcoaccame, it emerged that this is unlikely to be due to a damaged ethernet connector in the chain; otherwise, we would also have this problem when running the yarprobotinterface from the terminal.

Running it from yarpmanager w/ yarplogger requires that the COM express handles not only the eth traffic to/from the boards but also the traffic of the log to the laptop. There is the possibility that the COM express is running at the limit of its CPU capabilities w/ the BIOS power consumption setting we chose to solve the overheating problem.

We should check the CPU usage of the COM express with htop on both robots, as well as the BIOS configuration.
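
To compare the two robots with numbers rather than by eye, the CPU load could also be sampled programmatically. The following is a minimal sketch, assuming the COM express runs Linux and exposes the standard `/proc/stat` file; the function names are hypothetical helpers and this is not a replacement for htop. It computes the aggregate CPU usage between two snapshots.

```python
import time


def parse_cpu_line(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Columns: user nice system idle iowait irq softirq steal ...
            # idle and iowait are both counted as idle time.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Only the first 8 columns are summed: guest times are already
            # included in user/nice.
            total = sum(values[:8])
            return idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_usage_percent(before, after):
    """CPU usage in percent between two /proc/stat snapshots (as text)."""
    idle_before, total_before = parse_cpu_line(before)
    idle_after, total_after = parse_cpu_line(after)
    total_delta = total_after - total_before
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle_after - idle_before) / total_delta)


def sample_proc_stat(interval_s=1.0, path="/proc/stat"):
    """Read /proc/stat twice, interval_s apart, and return the usage in percent."""
    with open(path) as stat_file:
        before = stat_file.read()
    time.sleep(interval_s)
    with open(path) as stat_file:
        after = stat_file.read()
    return cpu_usage_percent(before, after)
```

Calling `sample_proc_stat()` in a loop on the torso of each robot while the yarprobotinterface starts (with and without the yarplogger) would give a rough load trace to compare between SN000 and SN001.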

The suspicious thing seems that this problem came out after

This could put a lot of stress on the eth traffic and, in turn, on the CPU, and the calibration phase is critical in this sense.

Note that without the forearm we miss several boards, so maybe this is the reason why ergoCub SN001 starts fine

cc @GiulioRomualdi @lrapetti @Fabrizio69 @pattacini @sgiraz @AntonioConsilvio

@S-Dafarra

S-Dafarra commented Nov 8, 2023

Running it from yarpmanager w/ yarplogger requires that the COM express handles not only the eth traffic to/from the boards but also the traffic of the log to the laptop.

Note that we used to run the yarprobotinterface with YARP_LOG_FORWARD_ENABLE. See the comment above. Hence, there should be no difference on this side. What about the opposite instead? Printing in a terminal requires time, and there are a ton of messages. This might slow down the yarprobotinterface process when running it from the terminal. Is it possible that there is a concurrency issue when launching the different devices?

@Nicogene
Member

Nicogene commented Nov 8, 2023

Note that we used to run the yarprobotinterface with YARP_LOG_FORWARD_ENABLE. See #1543 (comment). Hence, there should be no difference on this side. What about the opposite instead? Printing in a terminal requires time, and there are a ton of messages. This might slow down the yarprobotinterface process when running it from the terminal. Is it possible that there is a concurrency issue when launching the different devices?

I am not sure that running yarprobotinterface w/ YARP_LOG_FORWARD_ENABLE and running it via yarprun follow the same code paths inside YARP. Maybe one is more efficient than the other?

@davidelasagna noticed that we have this setting in the documentation

[image]

Since the image of ergoCub SN000 was created starting from an iCub one, these settings were still set to the user icub.
We should change the user to ergocub, reboot, and see if the problem persists.

The buffer size may also need to be revised: it is quite old and possibly obsolete.

@Nicogene
Member

Nicogene commented Nov 8, 2023

We tried both adding this configuration and setting the RXRate in pc104.xml to 1 ms, but unfortunately the behaviour is the same.


8 participants