Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal for build multi arch images #159

Closed
wants to merge 6 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ Workgroup is a virtual team of aggregating the efforts of all the interested par
|[Scanning](./workgroups/wg-scanning/README.md)|The obejct of workgroup is centralizing and focusing efforts on supporting the ability of Harbor users to choose which container image scanner they want to use with Harbor directly|(Zach Hill@Anchore [zhill](https://github.com/zhill))|Daniel Pacak@Aqua Security ([danielpacak](https://github.com/danielpacak)) / Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))|Liz Rice@Aqua Security ([lizrice](https://github.com/lizrice)), Daniel Pacak@Aqua Security ([danielpacak](https://github.com/danielpacak)), Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))|
|[P2P Distribution](https://github.com/goharbor/community/tree/master/workgroups/wg-p2p)|The objective of this working group is to aggregate efforts to discuss, design and finally deliver suitable P2P image distribution solutions for Harbor.|Fanjian Kong@Qihoo ([kofj](https://github.com/kofj))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou)) |[5 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-p2p#members)|
|[Operator](./workgroups/wg-operator/README.md)|Aggregate the efforts from all the interested parties to discuss, design and finally deliver suitable Operator solutions for Harbor|Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))| - | [14 Members](https://github.com/goharbor/community/blob/master/workgroups/wg-operator/README.md#members)|Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))| - |[5 members](https://github.com/goharbor/community/tree/master/workgroups/wg-p2p#members)|
|[Multi Architecture](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch)|The purpose of this working group is to enable Harbor to support multiple architectures, such as x86 arm.|ZhiPeng Yu@Alauda ([yuzp1996](https://github.com/yuzp1996))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou)) |[8 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch#members)|
|[Multi Architecture](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch)|The purpose of this working group is to enable Harbor to support multiple architectures, such as x86 arm.|ZhiPeng Yu@Alauda ([yuzp1996](https://github.com/yuzp1996))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou)) |[13 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-multiarch#members)|
|[Performance](https://github.com/goharbor/community/tree/master/workgroups/wg-performance)|The purpose of this work group is to improve the performance and scalability of harbor in the use case of large data volumes.|ChenYu Zhang@Alauda ([chlins](https://github.com/chlins))| Steven Zou@VMware ([steven-zou](https://github.com/steven-zou))/ Daniel Morinigo@Alauda ([danielfbm](https://github.com/danielfbm)) |[11 Members](https://github.com/goharbor/community/tree/master/workgroups/wg-performance#members)|

## Structure
Expand Down
123 changes: 123 additions & 0 deletions proposals/new/multi-arch.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
# Proposal: `Multi Arch Support`

Author: `Zhipeng Yu / yuzp1996`

Discussion: `none`

## Background

More and more companies need to support multi-architecture solutions, some of the more mainstream x86 and arm64 architectures, this proposal is to provide a solution to produce multi-architecture images

## Plan

When the release 2.3 of Harbor is released, the arm64 Harbor can be issued at the same time

Release 2.3 of Harbor is tentatively scheduled to be released in mid-June 2021

## Target

### Short Term Target

We can provide support for different architectures in a different Harbor sub-projects.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a table describing an overview of the maturity model? For example:

Level Description Goals Challenges
Dev Introduce a new architecture into the ecosystem as independent repo and images - Build the upstrea's master branch code
- Introduce development practices
- Build platform and code
- Upstream changes trigger downstream build
Test Uses Harbor's standard testing procedures to test the new architectures version - Validate the new architecture's artifacts making sure it conforms to Harbor's standard test suite
- Ensure upstream master code have at least a daily build and test
- Infrastructure to deploy and run tests
Alpha Release Onboard architecture first release together with Harbor's release - Make sure release candidates and other release builds on upstream can execute successfully on downstream
- Run release tests in the new architecture
- Sync with Harbor release timing
Integration Integrate new architecture within Upstream code. PS: only available for Mainstream architectures - Enable upstream to integrate build methods
- Enable master build and PR build checks
- Add build process on upstream without affecting contributors and maintainers experience
- Active maintainance on upstream
Beta Release First architecture release after integration - Enable official upstream release with new architecture
- Offer official multi-architecture tag for users
- Make sure release time and processes will not be affected by new architecture
GA release After the first beta release any other releases will be considered GA for the new architecture - Offline installer integration - Offline code changes
- Offiline installer build process and architecture

For example, in the harbor-arm repo, we can maintain the build logic and other related information for arm64 images
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it need to be in separate repositories?

(I've built and run Harbor on arm64 successfully, and it didn't require any code changes -- just a base image that was supported, as seen in goharbor/harbor#13788)

I think the PostgreSQL major version upgrade problem we discussed over there in goharbor/harbor#13788 (comment) is very relevant to this work -- without some way to upgrade between major versions of PostgreSQL, this is going to get very complicated very quickly (and is the reason that PR got more complicated and added a separate Dockerfile for arm64 - it has to upgrade to Photon 3.0 to get arm64 support but that also comes with newer PostgreSQL which will break existing installs). See docker-library/postgres#37 for a really long discussion around this particular problem and why it's so complicated to solve nicely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will discuss major version upgrades of PostgreSQL at a meeting every Wednesday at 5 pm. You are welcome to join in. The meeting link is https://VMware.zoom.com.cn/j/94528344426?pwd=aVpmMWtweHNXenRBOFhQNjgvQlZ0UT09

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice! Which timezone is that in?
(it looks like maybe today's already happened? I'm in Las Vegas, so even listening in on today's community call at 6am was a bit rough for me 😬)

Copy link
Contributor Author

@yuzp1996 yuzp1996 Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't realize this. 😂 We are in Beijing now. We discussed your mention of PostgreSQL in comment. As far as I know Harbor Team will provide some solution to resolve this problem in two weeks.

But we are not sure if we will encounter the same problem when building arm64 image base photon 4.0. We need do a test about this.

If Harbor Team can provide a solution to solve this problem within two weeks, then this may not be a problem any more

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, fair enough -- if upstream stays on photon 2.0 and downstream is on photon 4.0, neither will have an issue today, but it's one more delta between upstream and downstream, and it will be a serious issue for both eventually (photon 2.0 won't be maintained forever, and even it is, the PostgreSQL version in it won't be 😅).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry ! I am not sure if I understand the delta between upstream and downstream that you have mentioned. 😭

The base images in upstream have upgrade to photon 4.0 in this PR which has been merged to master.

https://github.com/goharbor/harbor/pull/14551/files

I think we will develop based on upstream master branch. Then the arm images will use the latest version of PostgreSQL in photon 4.0 so maybe we don't need to care about the major version problem for now?

If I did not understand the question you mentioned correctly please feel free to correct me

Thank you very much! ☺️☺️☺️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh, I hadn't seen goharbor/harbor#14551 yet 😬🙈

That's definitely going to be a problem for upgrades unless the PostgreSQL upgrade problem is solved. 😬😬😬


### Long Term Target

When the harbor is released, images of different architectures can be directly generated and push to registry. When deploying harbor, users do not need to distinguish the difference in architecture

## Implementation

### Glossary

Upstream: github repository goharbor/harbor

Downstream: github repository like harbor-arm harbor-loongson and so on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, not understanding the need for these to be separate? If they're treated as downstreams, they'll always be second-class citizens. If they require code changes, the ideal would be to get those upstream, and have those tested upstream too.

Copy link
Contributor Author

@yuzp1996 yuzp1996 Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The solution you mentioned is an ideal solution. However, Harbor Team has no plans to do multi-architecture support in the upstream this year, because the workload brought about by this is huge and it is easy to cause delays in release.
The current solution is a relatively fast method. We’ve weighed and decided to use the current way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a reasonable middle-ground would be adding a GitHub Action upstream that just verifies that the code still compiles successfully for any architectures that get built/tested downstream? I think Go's cross-compilation support is probably enough to accomplish that without QEMU or any major slowdown, and it would help make sure that the downstream project is more likely to have success.


### Upstream changes

#### Goal

Provides the ability to compile binary and image of a specific architecture based on environment variable parameters. (In the initial stage, the upstream only guarantees that the image of the amd64 architecture can be successfully built, and other versions are guaranteed by the corresponding downstream repository)

#### Changes

1. Increase the environment variable GOARCH, the default is amd64, which can be overwritten during execution

2. When executing make compile to generate binary, it can generate binary of different architectures according to the environment variable GOARCH

3. Add a file named adapter.sh, the content of the file is empty. Before executing compile and build commands, please call adapter.sh scripts. This shell script file can be overwritten by downstream users, and downstream users can write their own modifications in this file, such as overriding the addresses of third-party dependencies or replacing Dockerfile.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really strange IMO -- Go has top-notch, first-class support for multiple architectures and cross-compiling, so it really feels like a backwards approach to treat them as secondary and even downstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upstream will be responsible for compiling different architecture binaries, while the downstream will not be responsible for this

The downstream will use the way that upstream provided to get the different architecture binary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand why we need to put this upstream. Could you give a concrete example as to what this script may look like and what it does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file will not perform any operations in the upstream. This is just an empty file. It looks like a placeholder and should be overwritten by downstream repositories.

Copy link

@danielfbm danielfbm Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe what @tianon and @reasonerjt mean is that this script should not be called in the upstream at all. Since the downstream repository is responsible for executing the whole build process, it is enough to have the Makefile have different instructions for compiling go binaries, and to build docker images. Then downstream projects can control the flow and decide whether they need or not to call a script or not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no problem. I agree with this


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any considerations, as to what upstream should be careful with, in order not to break the down stream workflows?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing that upstream to care about because it is an empty file and should be overwritten by the downstream repository. Downstream will care about the content of this file.

In the upstream repository just need call this shell file before executing the compilation and build tasks in the makefile to ensure that this shell can be executed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the focus of the question here is not regarding the script, but regarding any general changes in the core project, for example:

  • Adding or removing components
  • Changing base images
  • Changing Dockerfile in general
  • Other environmental dependency changes
    ?

### Downstream changes

#### Goal

Maintain the build and test of an architecture images

#### Change

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readme in upstream can link to there

1. Introduce the harbor repository into the downstream repository through git Submodules
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be personal, but I've never had anything but a bad time using Git Submodules, especially at larger scales (which I think is another strong argument against having other architectures support be totally separated from the core product code).

One concrete issue is that they're tied to a specific commit without any real history or notation of intent, so it becomes difficult to determine which branch that commit comes from and why that branch was chosen, for example.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can suggest a better way to achieve the same goals?

I believe that the requirement here is to build downstream repos on upstream have the following changes:

  • on every master commit
  • every release-* commit/tag

using Github Actions we want to be able to compile the core commit and use the downstream project's scripts (Makefile, Dockerfile, etc.)

@tianon any suggestion how can we achieve that using Github Actions will be of great help

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can find, the only way to get the downstreams to build automatically whenever there's a commit on the upstream is to add something in the upstream's GitHub actions with an access token for the downstream and have it use that token to explicitly trigger an event like workflow_dispatch or repository_dispatch on the downstream repository directly. The other option would be something like a scheduled action watching for changes, or maintaining some kind of external infrastructure that listens to webhooks and does the triggering (but in that case it seems much more reliable to just use GitHub Actions on the main repository for that).

No matter how this downstream repository is consuming the upstream code, it's going to be complicated to coordinate -- I imagine each release of Harbor is going to require slightly different downstream code (like the addition of the new metrics service in 2.2, for example), so I'm not sure what the best solution for mapping those and keeping them up to date is going to be. 😬


2. Add a file named adapter.sh, responsible for making changes to areas that cannot be covered by upstream changes

3. Add a Makefile (which can be called when performing github actions) to achieve such as overwriting adapter.sh or other functions
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe there will be many dockefiles


4. Add Dockerfile if necessary

5. Add github action files to maintain the build and test in them. The content of the github action should be modified based on the upstream github action, but it should be kept in sync at all times

### Test

We want to use the upstream test method to test, use docker-compose to deploy Harbor and then use the shell script to execute the test

However, there are currently the following issues that need to be resolved

1. During testing, docker-compose will be used to deploy Harbor and test, but it seems that docker-compose [only provides x86 binary files](https://github.com/docker/compose/releases), and we need to compile other versions by ourselves. We can try to install docker-compose with pip

2. Images of different architectures may not be able to run using the runner provided by github action. [Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners) should be used to ensure that images of different architectures can run tests.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block issue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, Travis has support for a few alternative architectures, but the product quality/reliability has seen a steady decline since their acquisition.

We might be able to use the binfmt-based user-mode QEMU emulation support for testing, but it has traditionally struggled with supporting Go binaries, so might not be sufficient for real testing.

Alternatively, now that Harbor is part of the CNCF sandbox, we might be able to convince them to sponsor something like an Amazon A1 instance for a self-hosted arm64 runner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your advice.

We may solve this problem through the third comment you mentioned.



### Image strategy

#### Repository

The image will be hosted under the goharbor project of Docker Hub

#### Tag

Add the architecture name after the generated image version. For example: goharbor/harbor-core:v2.2.1-arm64


### Related Implementation

There are already some PRs and repository that have implemented this function, we can make full use of these contents

Changes provided by this [PR](https://github.com/goharbor/harbor/pull/13788) can build amd64 and arm64 image simultaneously

This repository https://github.com/querycap/harbor have created arm64 and amd64 image successfully



## Development Iteration

After several iterations, we need to ensure that the content in adapter.sh and github actions files in downstream repository is as small as possible,After each iteration, more code can be extracted and merge to upstream。

After the adapter.sh and github actions files of the downstream code disappear, we can no longer maintain the downstream repository.



## Release

When the goharbor/harbor is released, check out the same release branch under the downstream repository, and check the corresponding commit in git Submodules, execute the pipeline to generate the corresponding image and push it to the docker hub

Readme.md or release notes needs to be updated in different branches to inform the images related information, such as location and corresponding tags


## Notify Users

We provide repository addresses that support other architectures in goharbor/harbor release notes. Users can directly find repository addresses of different architectures according to the link.


## Block Item

1. Images of different architectures may not be able to run using the runner provided by github action. [Self-hosted runners](https://docs.github.com/en/actions/hosting-your-own-runners) should be used to ensure that images of different architectures can run tests.
4 changes: 3 additions & 1 deletion workgroups/wg-multiarch/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,12 +12,14 @@ The purpose of this working group is to enable Harbor to support multiple archit

- ChenYu Zhang @Alauda ([chlins](https://github.com/chlins))

- Daniel @Alauda ([danielfbm](https://github.com/danielfbm))
- Daniel Filipe Bruehmueller Morinigo @Alauda ([danielfbm](https://github.com/danielfbm))

- JingTao Cheng @Alauda ([jtcheng](https://github.com/chengjingtao))

- ZhiPeng Yu @Alauda ([yuzp1996](https://github.com/yuzp1996))

- Bo Zhu @Alauda ([Jeremy-boo](https://github.com/Jeremy-boo))

- WenLong Zhang@Loongson ([zhangwenlong8911](https://github.com/zhangwenlong8911))

- JingYun Hua@Loongson ([huajingyun01](https://github.com/huajingyun01))
Expand Down