Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process to adopt paddle/operator #520

Closed
Bobgy opened this issue Jun 21, 2021 · 53 comments
Closed

Process to adopt paddle/operator #520

Bobgy opened this issue Jun 21, 2021 · 53 comments
Assignees

Comments

@Bobgy
Copy link
Contributor

Bobgy commented Jun 21, 2021

Because we accepted #502 paddle operator proposal.

Now we can start the process to adopt the external repo.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

/cc @kubeflow/wg-training-leads @tizhou86

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

Plan:

  1. Stage current https://github.com/PaddleFlow/paddle-operator repo into google internal git.
  2. Google internal release process (there might be back and forths on repo requirements, I'll sync with @tizhou86 to resolve them).
  3. Once the kubeflow/paddle-opeartor repo is public, I can add OWNERS for @tizhou86, @kuizhiqing and @kubeflow/wg-training-leads.
  4. Push new commits to the repo during the process.
  5. Archive the existing repo.

Note, I cannot rename the existing github repo directly due to policy issues, so issue transfer etc is not possible. Luckily, we don't have many issues yet: https://github.com/PaddleFlow/paddle-operator/issues

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

That's my rough plan. Any concerns?

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

For license requirement

@tizhou86 May I confirm, is all the code in https://github.com/PaddleFlow/paddle-operator currently wrote by your company? Are there snippets or files copied from other projects?

If you have copied snippets, it's not a big problem, but you'll need to move them into a "third_party" folder at the root of the repo and mention original copyright and license, example: https://github.com/kubeflow/pipelines/tree/master/third_party.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

The copyright headers do not reference a company name. @tizhou86 do you want to update?
example:

https://github.com/PaddleFlow/paddle-operator/blob/0f1bab781dcd45059fd3e336a3aa7ecab9eaf601/controllers/paddlejob_controller.go#L2

Also can you add license headers for all the files (including yaml)?
You may use https://github.com/google/addlicense to add licenses easily.

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

For new open source projects, we will be enforcing inclusive language policies.
I found that there are several occurrences of "master" in the repo: https://github.com/PaddleFlow/paddle-operator/search?q=master.

reference:

Can you investigate whether it's possible to avoid using "master" in code and API?

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 21, 2021

These are problems I found for the initial pass.

@Bobgy Bobgy self-assigned this Jun 21, 2021
@tizhou86
Copy link
Contributor

Hi Bobgy, I'll talk to some of our team members and resolve all the problems you mentioned asap, thanks!

@tizhou86
Copy link
Contributor

/cc @kuizhiqing

@tizhou86
Copy link
Contributor

Hi @Bobgy , All mentioned problems above are resolved in PRs: PaddleFlow/paddle-operator#48 and PaddleFlow/paddle-operator#49 , please let us know if anything is needed from our side, thanks!

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 24, 2021

@tizhou86 thanks!
LGTM except some minor problems:

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 24, 2021

Another Q: did anyone file any patents around paddle-operator?

@kuizhiqing
Copy link
Member

Hi @Bobgy , No, no patent is filed for paddle-operator now.

@kuizhiqing
Copy link
Member

Hi @Bobgy , I've add license for the yaml files with PaddleFlow/paddle-operator#53 .
BTW, can you also add me as OWNERS in third step of your Plan.

@tizhou86
Copy link
Contributor

Hi @Bobgy , I've add license for the yaml files with PaddleFlow/paddle-operator#53 .
BTW, can you also add me as OWNERS in third step of your Plan.

Hi Bobgy, please also add kuizhiqing as the owner of this project, thanks! :-)

@Bobgy
Copy link
Contributor Author

Bobgy commented Jun 28, 2021

Sure, updated plan

@Bobgy
Copy link
Contributor Author

Bobgy commented Aug 24, 2021

A required process after paddle-operator is migrated to Kubeflow org:
All released images should comply with open source licenses of its dependencies: https://github.com/kubeflow/community/blob/master/guidelines/application_requirements.md#docker-images.

There were some documentation for how this can be done for go binaries in https://github.com/kubeflow/testing/tree/master/py/kubeflow/testing/go-license-tools. However, I wrote that tool a while back, now I'd recommend a new license tool I just built, refer to https://github.com/Bobgy/go-licenses/tree/main/v2.

I'm not opinionated on the choice, you can decide. Feel free to ask me if anything is not clear.

@Jeffwan
Copy link
Member

Jeffwan commented Aug 24, 2021

@Bobgy Thanks for the update. Is it project owner's responsibility or WG's responsibility to make sure license files are included in all the images? Do you know if there's way we can scan licences just like vulnerability scanner for docker images?

@Bobgy
Copy link
Contributor Author

Bobgy commented Aug 25, 2021

Hi @Jeffwan, I'd say it's primarily project owners' responsibilities.

@Bobgy
Copy link
Contributor Author

Bobgy commented Aug 25, 2021

I am not aware of such tools like vulnerability scanning, because it's a little hard understanding how each binary was built.

So far existing tools are around integrating license compliance process in dev/release process.

@kuizhiqing
Copy link
Member

Hi @Bobgy @Jeffwan , thanks for your clarification. We are working on it now.

@kuizhiqing
Copy link
Member

Hi @Bobgy , I've add licenses file in the repo and copy it into image, cf pr PaddleFlow/paddle-operator#73.

For the image issue, kubebuilder provide us with base image gcr.io/distroless which we are replacing with bitnami/minideb:stretch due to the accessibility.

Is it ok or maybe we should use registry.access.redhat.com/ubi8/ubi as base image just like pytorch-operator did ?

@Jeffwan
Copy link
Member

Jeffwan commented Sep 1, 2021

@kuizhiqing What's your plan on kubeflow/paddle-operator? Seems you still use PaddleFlow/paddle-operator as main dev repo?

@terrytangyuan
Copy link
Member

Should we aim to migrate the paddle-operator code base to the unified operator instead of standalone new repo?

@kuizhiqing
Copy link
Member

Hi all, we are aware of your advance in working on unified operator, we would like to making paddle work under unified operator, though, we are now working on repo paddle-operator and it's on the process of adoption. I'm considering that shall we possibly continue the process and plan to work with unified operator later ?

@Jeffwan
Copy link
Member

Jeffwan commented Sep 7, 2021

@kuizhiqing Please have a look at tf-operator code base and see if that's possible to make it easily. We heavily reuse codes from kubeflow/common

@kuizhiqing
Copy link
Member

Hi @Jeffwan , I've read your recent code in tf-operator, it make sense for paddlepaddle to take a place in it but I am afraid maybe we can do it later since it is not straight forward to make it right away.

@Bobgy
Copy link
Contributor Author

Bobgy commented Sep 9, 2021

Understood, I don't think we have a strong opinion on the base image. At the end, users will only care about whether there can be vulnerabilities in the base image, I think you can decide by yourself based on this.

@Bobgy
Copy link
Contributor Author

Bobgy commented Sep 15, 2021

Hi @Jeffwan @gaocegege, what's your opinion on #520 (comment)?
This is the remaining blocker.

@gaocegege
Copy link
Member

gaocegege commented Sep 16, 2021

I think we could give admin permission. The security issues can be guaranteed by code reviews. WDYT @Jeffwan @johnugeorge

BTW, prefer using the same base images with other training operators. It can be easy to maintain.

@Jeffwan
Copy link
Member

Jeffwan commented Sep 23, 2021

@gaocegege Just to double confirm. Do you mean you prefer second option and would like to grant permission to paddle maintainers? Our training operator image is hosted in AWS ECR and we use our own CI pipeline to build images.

We also own kubeflow dockerhub but that's a free account with rated limited issue

@gaocegege
Copy link
Member

We also own kubeflow dockerhub but that's a free account with rated limited issue

I mean that we can grant docker hub permission.

@kuizhiqing WDYT

@kuizhiqing
Copy link
Member

We also own kubeflow dockerhub but that's a free account with rated limited issue

I mean that we can grant docker hub permission.

@kuizhiqing WDYT

OK, it works for us.

@kuizhiqing
Copy link
Member

Hi @@kubeflow/wg-training-leads @Bobgy @gaocegege @Jeffwan,
as you may note that we have update our image register (pr) which is hosted on baidu cloud, and we will grant permission for you.

If that is OK with you, we may continue the process.

@terrytangyuan
Copy link
Member

I don't have a strong opinion on the registry as long as Training WG has permission and can take actions when necessary. Any objections from others?

@gaocegege
Copy link
Member

I have no objection to it.

@terrytangyuan
Copy link
Member

terrytangyuan commented Oct 23, 2021

@Bobgy @kubeflow/project-steering-group I believe we can start the transfer. What steps are involved for transferring the repo over to Kubeflow org? I believe @kuizhiqing has to give owner permission of https://github.com/PaddleFlow to @kubeflow/project-steering-group?

@tizhou86
Copy link
Contributor

Hi @Bobgy Please let us know if we need to add owner permission for kubeflow project steering group mailing list, thanks!

@terrytangyuan
Copy link
Member

Ping @kubeflow/project-steering-group. Let's move forward with this. Please provide instructions on the next steps.

@terrytangyuan
Copy link
Member

Also cc @zijianjoy

@terrytangyuan
Copy link
Member

terrytangyuan commented Jan 23, 2022

Since @Bobgy is no longer with the team, @zijianjoy would you be able to take over from here?

@Bobgy
Copy link
Contributor Author

Bobgy commented Jan 23, 2022

Thanks for the reminder!
@theadactyl and @kramachandran in @kubeflow/project-steering-group will be the main POCs for this.

@stale
Copy link

stale bot commented Apr 28, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@terrytangyuan
Copy link
Member

/assign @theadactyl

@zijianjoy
Copy link
Contributor

/lifecycle frozen

@Windfarer
Copy link

Any update about the paddle operator?

@johnugeorge
Copy link
Member

Since we have a unified training operator, we don't need a separate repo

@johnugeorge
Copy link
Member

This can be closed as #1675 is merged into training operator

@terrytangyuan
Copy link
Member

/close

@google-oss-prow
Copy link

@terrytangyuan: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants