Fix docker nightly build #2643

Open
wants to merge 9 commits into
base: gz-sim9
from

Conversation

@ntfshard ntfshard commented Oct 9, 2024

🦟 Bug fix

Fixes #

Summary

A clean build following the instructions in docker/README.md does not work: the base image builds fine, but the next step fails.

Ubuntu Focal (20.04) is too old to build the gz-sim9 branch. It does not provide the required GZ packages.

E: Package 'libgz-cmake4-dev' has no installation candidate
E: Unable to locate package libgz-common6-dev
E: Unable to locate package libgz-fuel-tools10-dev
E: Unable to locate package libgz-math8-eigen3-dev
E: Unable to locate package libgz-plugin3-dev
E: Unable to locate package libgz-physics8-dev
E: Unable to locate package libgz-rendering9-dev
E: Unable to locate package libgz-transport14-dev
E: Unable to locate package libgz-gui9-dev
E: Unable to locate package libgz-msgs11-dev
E: Unable to locate package libgz-sensors9-dev
E: Unable to locate package libsdformat15-dev
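
The failure above can be checked ahead of the install step by asking `apt-cache policy` whether each package has an installable candidate on the chosen Ubuntu release. The following is a minimal sketch, assuming the nightly repository has already been enabled and `apt-get update` has run; the helper name `has_candidate` is hypothetical and not part of this PR.

```shell
# Hypothetical helper, not part of this PR: reads the output of
# `apt-cache policy <package>` on stdin and succeeds only when apt
# reports an installable candidate version.
has_candidate() {
  policy=$(cat)
  case "$policy" in
    *"Candidate: (none)"*) return 1 ;;  # name is known, nothing installable
    *"Candidate: "*)       return 0 ;;  # a concrete version is available
    *)                     return 1 ;;  # no output: package unknown to apt
  esac
}

# Example (package names taken from the error log above):
#   for pkg in libgz-cmake4-dev libsdformat15-dev; do
#     apt-cache policy "$pkg" | has_candidate || echo "missing: $pkg"
#   done
```

Reading the policy text from stdin keeps the parsing separate from the apt call, so the check can be tried on saved output from another release.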

GCC 8 is a very old compiler. I believe installing it and replacing the default compiler was done for Ubuntu 18.04 (whose default is GCC 7). I switched to GCC 11 (the default for Ubuntu 22.04). IMHO it would be better not to pin the compiler version at all in this case, but I kept the change as close to the existing setup as possible.
Minor improvements to the README and the image.
AFAIU the docker/run.bash script does not work well on modern systems, but that is out of scope for this PR.

Slightly out of scope: a couple of months ago I tried the same on the gz-sim8 branch and ran into a problem with two missing gz libraries. If this patch is accepted, I'll make similar changes on that branch too.

Checklist

  • Signed all commits for DCO
  • Added tests
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • Consider updating Python bindings (if the library has them)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

Signed-off-by: Maksim Derbasov <ntfs.hard@gmail.com>
@github-actions github-actions bot added the 🏛️ ionic Gazebo Ionic label Oct 9, 2024
Signed-off-by: Maksim Derbasov <ntfs.hard@gmail.com>
@mjcarroll
Contributor

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

@ntfshard
Author

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

I'm not an expert in GitHub Actions, but I could try, I suppose.

@ntfshard
Author

ntfshard commented Oct 10, 2024

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

I'm not an expert in GitHub Actions, but I could try, I suppose.

Maybe in another PR? As far as I can see, it does not fit very well into the widely used pipelines. Actually, I was planning to fix the annoying bug #2506, which affects us (we use an older version, to which I'd like to propagate the fix too).
[In other words: propagate these changes to the previous branches (gz-sim8 and gz-sim7), then fix the bug mentioned above on this branch and propagate that fix to the same branches too. Does that sound reasonable?]

@mjcarroll
Contributor

Maybe in another PR?

Absolutely, would just be a nice addition to keep these dockerfiles fresh and not regressing.

Does that sound reasonable?

Good with me.

@ntfshard
Author

OK. The last mystery is what happened in Jenkins. It looks like something went wrong with a couple of Python tests. I can't restart the build there because I don't have permission.

@iche033
Contributor

iche033 commented Oct 10, 2024

OK. The last mystery is what happened in Jenkins. It looks like something went wrong with a couple of Python tests. I can't restart the build there because I don't have permission.

The failing Python tests are likely due to osrf/homebrew-simulation#2834, which is currently affecting all gz packages, so they are not the fault of this PR.

@ntfshard
Author

Thank you for the explanation. What is the next step for this PR?
And what is the right way to propagate these changes to other branches? Cherry-pick the commit after the merge, or just reimplement the same changes, since there will still be a merge conflict?

@iche033
Contributor

iche033 commented Oct 15, 2024

Yeah, once this is merged, you can cherry-pick the changes for backports or use Mergify, e.g. add the comment @mergifyio backport gz-sim<N>
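
For the manual route, a backport comes down to branching off the target branch and cherry-picking the merged commit. The sketch below assumes the PR was squash-merged into a single commit and that the target branch exists locally; the helper name `backport` and the `backport-<branch>` naming are illustrative only.

```shell
# Hypothetical helper: cherry-pick commit $2 onto a new branch based on $1.
backport() {
  target=$1
  commit=$2
  # -x records the original commit hash in the new commit message,
  # -s appends a Signed-off-by trailer (needed for the DCO check).
  git checkout -b "backport-$target" "$target" &&
    git cherry-pick -x -s "$commit"
}

# Example (placeholders, not real values):
#   backport gz-sim8 <sha-of-the-squashed-commit>
# If the cherry-pick stops on a conflict, resolve it and run
# `git cherry-pick --continue` before pushing the branch.
```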

@ntfshard
Author

OK, I'll try. Thank you!
But right now, while CI is red, it can't be merged.

@@ -1,4 +1,4 @@
- FROM ubuntu:focal
+ FROM ubuntu:jammy
Contributor

The officially supported platform for gz-sim9 is Ubuntu:noble. https://gazebosim.org/docs/latest/install/#supported-platforms

@@ -10,7 +10,7 @@ COPY docker/scripts/enable_nightly.sh scripts/enable_nightly.sh
RUN scripts/enable_nightly.sh

RUN apt-get update \
- && apt-get install -y \
+ && apt-get install --no-install-recommends -y \
Contributor

Have you verified this works? My recollection is that some of our packages, such as gz-*-python and gz-*-cli, are suggested/recommended packages and may not be installed if we add --no-install-recommends.

Author

I was just trying to make the image smaller. It works, but yes, some tools, like 'gz topic', may be missing. I'll remove this line.
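
One way to see in advance what such a flag would drop is to list the weak dependencies that `apt-cache depends` reports for a package. The sketch below only parses that output; the helper name `weak_deps` is hypothetical, and the package name in the usage comment is an assumption chosen for illustration. Note that only Recommends entries are affected by `--no-install-recommends`, since apt does not install Suggests entries by default either way.

```shell
# Hypothetical helper: reads `apt-cache depends <package>` output on stdin
# and prints the package names listed as Recommends or Suggests.
# A leading "|" marks an alternative in apt's output and is accepted too.
weak_deps() {
  sed -n -E 's/^[[:space:]]*\|?(Recommends|Suggests):[[:space:]]*(.*)$/\2/p'
}

# Example (the package name is an assumption, not taken from this PR):
#   apt-cache depends libgz-transport14-dev | weak_deps
```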

@@ -17,16 +17,17 @@ sudo apt-get install --no-install-recommends -y \
cppcheck \
curl \
git \
g++-8 \
pkg-config \
g++11 \
Contributor

We can just install the default version

Suggested change
- g++11 \
+ g++ \

Author

Sure. I was just trying to keep the changes in the spirit of the previous style.
The same goes for Clang, I suppose.

docker/README.md Outdated
@@ -73,21 +75,21 @@ distribution using debians.
image of Gazebo Garden:

```
- ./build.bash gz-garden ./Dockerfile.gz
+ ./build.bash gz ./Dockerfile.gz
Contributor

The instruction above says the first argument must be the name of the Gazebo distribution. Should this change be reverted?

Author

I guess I used auto-replacement here.

I'll check everything and send an update.

Signed-off-by: Maksim Derbasov <ntfs.hard@gmail.com>
docker/run.bash Outdated
@@ -40,7 +31,6 @@ docker run -it \
-v "/etc/localtime:/etc/localtime:ro" \
-v "/dev/input:/dev/input" \
--rm \
- --gpus all \
Author

After this change, the script started working fine on an Ubuntu 22.04 host.

Contributor

I think this is needed to get GPU acceleration when using NVIDIA GPUs. Maybe leave it as is and add a statement below to remove it if not using a GPU?

Author

I'll roll this back. I'm not sure how to switch automatically between the GPU and no-GPU cases. In my environment (VirtualBox) it leads to this error:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
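
One possible way to handle both cases is to add the flag only when a GPU probe succeeds on the host. This is a rough sketch, not what the PR does: the helper name `gpu_args` and the `GPU_PROBE` variable are hypothetical, and a working `nvidia-smi` is used here only as a proxy. It does not guarantee that the NVIDIA container runtime is installed.

```shell
# Hypothetical helper: prints "--gpus all" only when the probe command
# exists and exits successfully; prints nothing otherwise.
# GPU_PROBE defaults to nvidia-smi and can be overridden for testing.
gpu_args() {
  probe=${GPU_PROBE:-nvidia-smi}
  if command -v "$probe" >/dev/null 2>&1 && "$probe" >/dev/null 2>&1; then
    echo "--gpus all"
  fi
}

# Example ($IMG is a placeholder for the image name). The substitution is
# left unquoted on purpose so that it expands to two words or to nothing:
#   docker run -it --rm $(gpu_args) "$IMG"
```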

Contributor

@azeey azeey left a comment

Thanks for the changes. I have a few more comments.

@@ -18,17 +18,8 @@ ARGS=("$@")
# This is necessary so Gazebo can create a context for OpenGL rendering
# (even headless).
XAUTH=/tmp/.docker.xauth
if [ ! -f $XAUTH ]
Contributor

Was this change necessary?

Author

Yes, the previous code does not work. Here is an example on a system with an empty /tmp:

xauth:  /tmp/.docker.xauth not writable, changes will be ignored
xauth: (argv):1:  unable to read any entries from file "(stdin)"
chmod: changing permissions of '/tmp/.docker.xauth': Operation not permitted
Authorization required, but no authorization protocol specified

In addition, the previous code somehow creates /tmp/.docker.xauth as a directory.
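
A likely explanation for the directory, offered as an assumption rather than something confirmed in this thread: when a `-v` bind mount points at a host path that does not exist, Docker creates that path as a directory, so a failed xauth step followed by `docker run` leaves a directory behind. The sketch below shows one defensive way to prepare the file before mounting it. The helper name `ensure_xauth_file` is hypothetical, and this is not necessarily the change made in the PR.

```shell
# Hypothetical helper: make sure $1 is a regular, readable file before
# it is bind-mounted into the container.
ensure_xauth_file() {
  xauth_file=$1
  # Remove a stale (empty) directory left at this path, if any.
  if [ -d "$xauth_file" ]; then
    rmdir "$xauth_file" || return 1
  fi
  # Create the file up front so xauth has something writable to merge into.
  touch "$xauth_file" || return 1
  # Copy the current display's cookie when xauth and DISPLAY are available,
  # rewriting the family field to the wildcard value ffff.
  if command -v xauth >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    xauth nlist "$DISPLAY" 2>/dev/null | sed -e 's/^..../ffff/' |
      xauth -f "$xauth_file" nmerge - 2>/dev/null || true
  fi
  chmod a+r "$xauth_file"
}

# Example:
#   ensure_xauth_file /tmp/.docker.xauth
```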

Contributor

@j-rivero any thoughts on this?

Author

It is still quite simple to verify. However, the result can be a false positive (the error mentioned above is printed, but the GUI works) if xhost +local was executed beforehand (I'm not sure, maybe it should be local:root; it's still a piece of ancient magic).

docker/README.md Outdated
@@ -75,7 +75,7 @@ distribution using debians.
image of Gazebo Garden:

```
- ./build.bash gz ./Dockerfile.gz
+ ./build.bash gz-ionic ./Dockerfile.gz
Contributor

Okay, but now we need to update the instruction right above to say "Gazebo Ionic"

Author

I'll roll this back to garden here, and in the lines below too, because AFAIU build.bash does not work properly with Ionic: it builds Dockerfile.gz, which is based on the nvidia/opengl:1.2-glvnd-devel-ubuntu20.04 image.

Signed-off-by: Maksim Derbasov <ntfs.hard@gmail.com>
Signed-off-by: Maksim Derbasov <ntfs.hard@gmail.com>
@ntfshard
Author

Would it additionally make sense to introduce a CI job to make sure that this docker file can build on, for instance, a nightly basis?

Added, thanks to @skorobogatydmitry for the help.

Labels
🏛️ ionic Gazebo Ionic
Projects
Status: In review
4 participants