[🚀 Feature]: Move "seleniarm" multi-arch images to the same Docker namespace as "selenium" amd64 images #1847
Comments
Thank you for this detailed write-up, @jamesmortensen! A few questions and comments:
One way to go could be to keep the same name and have users specify the platform of the image they want to pull, with that properly documented. Would that make sense?
Can you please explain again why we need to use CircleCI? Can't I build an arm64 image on an amd64 host?
Let's say we release Firefox 112.0 on May 17th for amd64, and then on June 12th Firefox 112.0 becomes available for ARM. Currently, we tag images with the browser and driver versions, so we need to figure out how the browser tagging would work. We can see an example in the experiment I did where I built the image from both Dockerfiles. It shows Firefox 112.0.1 for amd64 and 112.0 for arm/v7 on the same tag. The difference is slight in this case, but it shows the discrepancy:
Additionally, the geckodriver versions are different:
We could convey this in the release notes, but I don't know how it would work if we attempted to tag the images with the browser and driver versions. Maybe we could outline what the release notes could look like in the case where the browser versions are different.
It's possible to build multi-arch images on GitHub Actions amd64 runners, but QEMU emulation is incredibly slow. This creates two problems:
We can't test browsers under emulation
It's not possible to test the images because the browsers won't start up under emulation. On CircleCI, since there are both arm64 and amd64 runners natively, I can test browsers on both of those platforms. I cannot yet test armv7l since no cloud provider offers this architecture as an option. I tried downloading an armv7 VM and running it in QEMU, but that didn't really work in CI.
The build process is painfully slow under QEMU emulation
Building the Chromium and Firefox images from Base to Standalone takes 4 to 6 minutes natively, but it can take 20 to 40 minutes under emulation. The last full build of all images was the fastest ever at 17 minutes, and that was because of splitting the build process between multiple runners and executing in parallel.
If we could use a self-hosted ARM64 runner, perhaps one of Oracle's free ARM Ampere virtual machines, we could do it on GitHub Actions, but there's no guarantee Oracle will keep that as a free solution.
Maybe something like this:
Release Notes
selenium/standalone-firefox:4.9.1-20230517
amd64:
arm64:
arm/v7:
selenium/standalone-chrome:4.9.1-20230517
amd64:
selenium/standalone-chromium:4.9.1-20230517
arm64:
arm/v7:
amd64:
selenium/standalone-edge:4.9.1-20230517
amd64:
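To make the per-architecture layout concrete, here is a minimal Python sketch of how such release notes could be rendered from a mapping of architecture to component versions. The function name `render_release_notes` and the output layout are hypothetical. The Firefox versions are the ones from the experiment mentioned earlier in this thread; pairing them with this particular tag is only for illustration.

```python
def render_release_notes(tag, versions_by_arch):
    """Render one release-notes entry: the image tag, then one line per architecture.

    versions_by_arch maps an architecture name to a dict of component -> version.
    The layout is a hypothetical suggestion, not an existing release format.
    """
    lines = [tag]
    for arch, components in versions_by_arch.items():
        details = ", ".join(f"{name} {version}" for name, version in components.items())
        lines.append(f"  {arch}: {details}")
    return "\n".join(lines)


# Illustrative data only: Firefox versions observed in the experiment above.
notes = render_release_notes(
    "selenium/standalone-firefox:4.9.1-20230517",
    {
        "amd64": {"Firefox": "112.0.1"},
        "arm/v7": {"Firefox": "112.0"},
    },
)
print(notes)
```

Listing versions per architecture under a single tag keeps the tag itself free of browser versions, which sidesteps the question of which architecture's version the tag should carry.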
@diemol I'm thinking of doing an experiment where I mirror all of the releases in a test namespace, in forked repos, so we can see how combined images play out in isolation. Let's iterate on this idea a little more and tease out some more questions that we may need to tackle. Please let me know if the information in the previous comments is helpful. If you have more questions, please keep them coming! :)
Sorry for the high volume of comments, but I discovered that the latest browser and driver binaries are indeed available on Ubuntu arm64. I just tried in an Ubuntu 20.04.6 arm64 VM to install both browser images and drivers, and it seems I got the latest versions of everything we need, except Google Chrome. Here's the terminal output from the VM:
I also verified, using test scripts from https://github.com/jamesmortensen/debug-tools-for-docker-selenium/tree/main/selenium-webdriver-demo-javascript, that both browsers can open and be automated. In the interests of being thorough, I should try the same experiment on an armv7l VM to make sure everything is available there too. I do know geckodriver won't be there, but I'm already building that myself. The important thing is that Chromium, Firefox, and chromedriver be there. Hope this information is helpful!
I think this is good enough.
Then maybe we can move completely to CircleCI? Just the build process, and keep the release logic in GitHub Actions.
Nice 🎉
Feature and motivation
Improving visibility of Multi-Arch Images (also known as docker-seleniarm)
We are seeking information from the community to learn how we can bring more awareness to the existence of the multiple architecture Docker container images, reduce the learning curve for those beginning to use the container images, and enable everyone to get started faster.
One idea we're considering is to move the images in the Docker Hub seleniarm namespace to the official selenium namespace.
In order to get feedback, I want to first provide more insight and transparency on how the container images are built, as well as the challenges we face in building and maintaining them. This gives us a platform for brainstorming ideas by first seeking to understand. First, let's look at the browser images supported by Selenium and Seleniarm (multi-arch). We'll also look at how we source browser binaries, the driver binaries, as well as operating systems.
Namespaces
Currently, we have two namespaces in Docker Hub:
Selenium Images - amd64/Intel only
Seleniarm Container Images - Multiple Architectures (amd64 / arm64 / armv7l)
Browser Binaries
Let's take a moment to look at the browser binaries which are available for various architectures:
Google Chrome
Google currently does not build binaries for Chrome on Linux for ARM. Chrome is only available for the Intel/x86_64 (amd64) architecture. For multi-arch images, we substitute Google Chrome with the upstream open source Chromium browser. While they share much of the same code and the same browser engine, there are subtle differences in how the browsers function.
Mozilla Firefox
Although Mozilla doesn't provide official binaries for ARM on Linux, the Debian community maintains binaries for various architectures, which are available on Debian Sid (the unstable Debian). We build and maintain container images for amd64, arm64, and armv7l. Here is where binaries for Mozilla Firefox are sourced from when building container images:
Chromium
Since Google does not provide binaries for ARM, and since Google Chrome is not open source (only Chromium is), the next best solution to providing a Selenium browser automation solution on ARM is to use Chromium binaries. The Debian community maintains the chromium package and keeps it up to date.
While Chromium and Google Chrome are similar and share the same browser engines, at the end of the day, they're not the same browsers. This may be fine in some use cases and problematic in others.
Edge
Currently, Microsoft does not publish binaries for ARM. Instead, open source Chromium is our next best solution.
Browser Drivers
Locating up-to-date browser drivers for ARM, such as chromedriver and geckodriver, is also a challenge. I maintained geckodriver binaries for arm64 until October 2022, when Mozilla stepped forward and took responsibility for building and supporting arm64 geckodriver binaries. At this time, I still maintain an unofficial binary for geckodriver on armv7l so that developers and testers using Raspberry Pi devices can run their Selenium Grids and run automation scripts that drive Mozilla Firefox.
Operating Systems
Let's also look at how the base images for docker-selenium and docker-seleniarm differ. Selenium images are built on top of a stable Ubuntu 20.04 base image, while Seleniarm images are built on Debian Sid. Debian Sid is the unstable version of the Debian operating system. It's one of the only distributions I could find that offered the latest binaries for both browsers.
Although Ubuntu is based on Debian, these are different distributions of Linux. Moreover, Debian Sid is considered to be on the bleeding edge, while Ubuntu 20.04 is an LTS (long-term support) release and is of course quite stable.
All of this information leads us into the next question: How do we name the images so that it's clear which flavor of the container images you're using?
How the Docker Images are used
Currently, I suggest teams who exclusively use Intel based machines use images from the "selenium" namespace due to the stability of the container images. For teams using ARM hardware, the best choice is to use images from the "seleniarm" namespace.
But one question that occasionally pops up is how teams should handle scenarios where half the team uses Intel machines and the other half uses ARM. There are two ways I've seen teams handle this:
Everyone uses seleniarm
Since the docker-seleniarm images are also compatible with Intel machines, some teams choose to have everyone use the seleniarm images. The advantage is that this ensures a single docker-compose.yml file works on all architectures. However, one potential disadvantage is it also means no one is automating Google Chrome, only Chromium. Maybe that's okay, and maybe it's not. It depends.
Intel users use selenium, M1/ARM users use seleniarm
The other option I've suggested is to detect the architecture of the machine and then load the best image for that architecture. Here's an example I've taken from an article I wrote on combining multi-arch Docker images from different sources:
If the developer or tester uses an Intel machine, they'll pull selenium/standalone-chrome, with Ubuntu, Google Chrome, and stability. If they're using an M1, then they'll pull seleniarm/standalone-chromium, which includes Debian Sid, open source Chromium, and may involve some instability.
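The detection idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the example from the article: the `IMAGES_BY_ARCH` table and the `pick_image` helper are hypothetical names, and the set of architecture strings is an assumption about what `platform.machine()` commonly reports. The image names are the ones discussed above.

```python
import platform

# Hypothetical lookup table: architecture string -> image to pull.
# The architecture spellings are assumptions about common platform.machine() values.
IMAGES_BY_ARCH = {
    "x86_64": "selenium/standalone-chrome",
    "amd64": "selenium/standalone-chrome",
    "arm64": "seleniarm/standalone-chromium",
    "aarch64": "seleniarm/standalone-chromium",
    "armv7l": "seleniarm/standalone-chromium",
}


def pick_image(machine=None):
    """Return the image name for the given (or current) machine architecture."""
    arch = (machine or platform.machine()).lower()
    if arch not in IMAGES_BY_ARCH:
        raise ValueError(f"no image mapping for architecture {arch!r}")
    return IMAGES_BY_ARCH[arch]
```

Calling `pick_image()` with no argument uses the architecture of the current machine. The result could then be exported as an environment variable that a docker-compose.yml interpolates into its `image:` field, so the same compose file works for both halves of the team.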
Moving Seleniarm to Selenium
To move Seleniarm container images to Selenium, there are a few things we need to take into consideration:
We'd like to get feedback from the community to brainstorm some ideas for what to do. Please comment below with any suggestions on how we can proceed forward.
Usage example
The problem we're looking to solve is to make multi-arch images easier to maintain, and easier for the community to find and use.