
Make plugins importable for other projects #91983

Closed
hzxuzhonghu opened this issue Jun 10, 2020 · 37 comments

Labels
kind/feature Categorizes issue or PR as related to a new feature. kind/support Categorizes issue or PR as a support question. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

@hzxuzhonghu (Member)

Context

The Volcano scheduler makes use of Kubernetes' scheduler algorithms, but during a k8s dependency upgrade we found that the scheduler framework had been refactored.

We need to use pkg/scheduler/internal/cache.NewSnapshot to initialize the interface required to call the related plugins. Since internal packages cannot be imported, I had to copy the code.

That is hard to maintain, so I wonder whether we can remove the internal restriction in k/k.
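
For reference, the restriction is Go's `internal` package rule, not anything scheduler-specific: a package under an `internal/` directory can only be imported from within the subtree rooted at `internal/`'s parent. A minimal illustration (the import path is real; the consuming module is hypothetical):

```go
// In an out-of-tree module such as the volcano scheduler, this does not compile.
package main

import (
	// The compiler rejects this with:
	//   use of internal package k8s.io/kubernetes/pkg/scheduler/internal/cache not allowed
	internalcache "k8s.io/kubernetes/pkg/scheduler/internal/cache"
)

func main() {
	_ = internalcache.NewSnapshot(nil, nil)
}
```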

/sig-scheduling

@hzxuzhonghu hzxuzhonghu added the kind/support Categorizes issue or PR as a support question. label Jun 10, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 10, 2020
@hzxuzhonghu (Member Author)

/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 10, 2020
@hzxuzhonghu (Member Author)

cc @k82cn

@k82cn (Member) commented Jun 10, 2020

/cc @kubernetes/sig-scheduling-feature-requests

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 10, 2020
@BenTheElder (Member)

If we want to support something for external consumption it should probably be staged. If you have to import k8s.io/kubernetes then it's generally been considered unsupported.

@k82cn (Member) commented Jun 11, 2020

If we want to support something for external consumption it should probably be staged. If you have to import k8s.io/kubernetes then it's generally been considered unsupported.

IMO, staging is better; if we do not support that, we have to copy the code under the Apache License :(

@Huang-Wei (Member)

@k82cn IIRC, it was intentional to move the cache/queue code to an internal package, in a release led by @misterikkit (he/him).

The intention was that cache/queue are internal mechanics of the upstream scheduler and shouldn't support external consumption.

So personally, I doubt it's desirable to make the internal packages public again.

cc/ @ahg-g @alculquicondor @damemi

@k82cn (Member) commented Jun 11, 2020

The intention was that cache/queue are internal mechanics of the upstream scheduler and shouldn't support external consumption.

Honestly, I don't care about the cache, but the upstream algorithms depend on it :(
It's better to decouple the algorithms from the framework, so other developers and users can use them on demand.

@Huang-Wei (Member)

It's better to decouple the algorithms from the framework

#91029 was created to track separating the framework from its runtime, but the overall decoupling needs further effort, such as #89930.

@k82cn (Member) commented Jun 11, 2020

That would be great if we can get it ready ASAP :) Until then, we have to hard-copy the code under the Apache License.

@k82cn (Member) commented Jun 11, 2020

@hzxuzhonghu, I think we need to follow the Apache License and call out this hard copy.


@alculquicondor (Member)

To clarify the reasoning for the snapshot being internal:

We envisioned the scheduler framework being used by simulators (such as the cluster autoscaler) to predict how pods would get scheduled if conditions changed. Those only need to implement the SharedLister interface. I actually proposed at some point to keep a shared implementation, but the risk was increasing the maintenance burden.

The snapshot is highly associated with the scheduler cache, which should remain internal. Perhaps you should make your own cache implement the SharedLister interface.
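
A minimal sketch of what that could look like (the SharedLister/NodeInfoLister method set below reflects the tree around this time and shifts between releases; the map-backed lister is a toy illustration, not volcano's actual cache):

```go
package volcanocache

import (
	"fmt"

	framework "k8s.io/kubernetes/pkg/scheduler/framework/v1alpha1"
)

// mapLister is a toy SharedLister backed by a plain map; a real scheduler
// cache would populate and invalidate it from informer events.
type mapLister struct {
	nodes map[string]*framework.NodeInfo
}

// NodeInfos satisfies framework.SharedLister.
func (l *mapLister) NodeInfos() framework.NodeInfoLister { return l }

// List returns every tracked node.
func (l *mapLister) List() ([]*framework.NodeInfo, error) {
	out := make([]*framework.NodeInfo, 0, len(l.nodes))
	for _, n := range l.nodes {
		out = append(out, n)
	}
	return out, nil
}

// HavePodsWithAffinityList returns nodes hosting pods that declare affinity;
// a real cache maintains this subset incrementally, the toy returns all nodes.
func (l *mapLister) HavePodsWithAffinityList() ([]*framework.NodeInfo, error) {
	return l.List()
}

// Get looks up a single node by name.
func (l *mapLister) Get(name string) (*framework.NodeInfo, error) {
	if n, ok := l.nodes[name]; ok {
		return n, nil
	}
	return nil, fmt.Errorf("nodeinfo for node %q not found", name)
}
```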

@damemi (Contributor) commented Jun 11, 2020

It's better to decouple the algorithms from the framework

#91029 was created to track separating the framework from its runtime, but the overall decoupling needs further effort, such as #89930.

Yeah, #89930 is intended to make the framework more portable by breaking internal k/k dependencies and is basically the first step we need to take to do that.

@k82cn (Member) commented Jun 12, 2020

The snapshot is highly associated with the scheduler cache, which should remain internal. Perhaps you should make your own cache implement the SharedLister interface.

But we cannot import the algorithms directly; it's weird that we depend on upstream algorithms while those algorithms depend on our implementation :(.

@alculquicondor (Member)

I don't follow. By algorithm, do you mean plugins? The SharedLister interface is in the framework package.

@misterikkit

One goal of the Scheduler Framework is to have other scheduler implementations import certain parts of the scheduler; staging seems like the best way to make that importable.

@k82cn when you say algorithm, are you just referring to the Go package that used to have that name, with all the predicates and priorities?

Another intent of the refactoring effort was to improve the way other projects (like the autoscaler) import our predicate implementations. In theory, each predicate should be a simple piece of logic with very few imports. That function could be imported by other projects, but the scheduler will wrap it in some boilerplate so it can fit into the Framework.
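
To make that concrete, a rough sketch of the wrapping (the FilterPlugin shape matches the framework around this time; the predicate and plugin names are made up for illustration):

```go
package exampleplugin

import (
	"context"

	v1 "k8s.io/api/core/v1"
	framework "k8s.io/kubernetes/pkg/scheduler/framework/v1alpha1"
)

// fitsExample is the "simple piece of logic with very few imports":
// a hypothetical predicate with no framework machinery of its own.
func fitsExample(pod *v1.Pod, nodeInfo *framework.NodeInfo) bool {
	// ...pure decision logic over the pod and node...
	return true
}

// ExamplePlugin is the boilerplate that adapts the predicate to the Framework.
type ExamplePlugin struct{}

// Compile-time check that the plugin satisfies the extension point.
var _ framework.FilterPlugin = &ExamplePlugin{}

// Name identifies the plugin in scheduler configuration.
func (p *ExamplePlugin) Name() string { return "Example" }

// Filter translates the predicate's boolean result into a framework Status.
func (p *ExamplePlugin) Filter(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	if !fitsExample(pod, nodeInfo) {
		return framework.NewStatus(framework.Unschedulable, "example predicate failed")
	}
	return nil
}
```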

@k82cn (Member) commented Jun 15, 2020

I'd like to import predicates and priorities :)

SharedLister interface is in the framework package.

IMO, predicates & priorities should be self-contained for other projects, e.g. the autoscaler as misterikkit@ mentioned; if SharedLister is an interface, it's better to have a default implementation for predicates & priorities. By 'self-contained', I mean predicates & priorities can be imported directly without additional effort; if there is no SharedLister implementation, the other project has to learn and implement it themselves, and it's hard to keep backward compatibility.

@k82cn (Member) commented Jun 15, 2020

That function could be imported by other projects, but the scheduler will wrap it in some boilerplate so it can fit into the Framework.

+1

One goal of the Scheduler Framework is to have other scheduler implementations import certain parts of the scheduler; staging seems like the best way to make that importable.

Staging seems better, but I'm open on that part :)

@k82cn k82cn changed the title Make pkg/scheduler/internal/cache public Make predicates & priorities importable for other projects Jun 15, 2020
@Huang-Wei (Member)

@hzxuzhonghu @k82cn If you just want to import the plugins (predicates and priorities have all been refactored into plugins), I'm confused by this requirement:

use pkg/scheduler/internal/cache.NewSnapshot to initialize the interface required to call the related plugins

Why not import the plugins directly? I don't get the point of "use NewSnapshot to initialize the interface ... to call plugins".

I checked the latest code; no plugin code (excluding unit tests) depends on the internal "cache" or "queue" pkg. If you spot some, can you show me an example?

@Huang-Wei (Member)

The only case I can think of is using pkg/scheduler/internal/cache.NewSnapshot to build tests, but other than that, I don't see anything preventing users from importing the plugins.
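
For reference, that test-only usage is roughly the following (a sketch; it is only compilable from inside k/k itself, which is exactly the limitation being discussed):

```go
package myplugin_test

import (
	"testing"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	internalcache "k8s.io/kubernetes/pkg/scheduler/internal/cache"
)

func TestFilterWithSnapshot(t *testing.T) {
	nodes := []*v1.Node{{ObjectMeta: metav1.ObjectMeta{Name: "node-1"}}}
	pods := []*v1.Pod{{ObjectMeta: metav1.ObjectMeta{Name: "pod-1"}}}

	// NewSnapshot assembles a SharedLister-compatible snapshot from in-memory
	// pods and nodes, which is what makes it so convenient in plugin tests.
	snapshot := internalcache.NewSnapshot(pods, nodes)
	if _, err := snapshot.NodeInfos().Get("node-1"); err != nil {
		t.Fatalf("expected node-1 in snapshot: %v", err)
	}
}
```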

@hzxuzhonghu (Member Author)

The snapshot package is used to build the FrameworkHandle interface.

@ahg-g (Member) commented Jun 15, 2020

There is no concept of predicates and priorities in the scheduler anymore; please update the title. The description of the issue is also not quite related to the title.

If the ask is to be able to import plugins, then this will be doable as part of the long-term plan of breaking up most of the scheduler code and moving it to the staging repo: https://docs.google.com/document/d/1WO-ixERpqkCSEXEq30YtEH_z_G-BoLKeCbkRJcKq3xA/edit

Once in staging, you are guaranteed that dependencies on internal pkgs, or on the specific implementation of the default scheduler, will not exist.

@alculquicondor (Member)

The snapshot package is used to build the FrameworkHandle interface.

No? Please refer me to the line where it is required.
You only need to provide an implementation of the SharedLister interface.

if SharedLister is an interface, it's better to have a default implementation

If we provide an implementation, I'm quite sure you will eventually find it limiting, especially in the Volcano use case, where you have to simulate multiple pods being added and rerun the plugins.
I honestly think you are better off implementing the interface.

Also, are you sure you need to directly instantiate plugins? In general, you should be fine instantiating a Framework and using its exposed functions to run the plugins for you.
For reference, here is the instantiation that the cluster autoscaler does: https://github.com/kubernetes/autoscaler/blob/a63c7abbe47773be4eec49b13210a3cd9c4a719f/cluster-autoscaler/simulator/scheduler_based_predicates_checker.go#L52-L58
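
Distilled, that wiring looks roughly like the following (a sketch against the framework/v1alpha1 API of this era; signatures have moved around since, so treat the exact names as illustrative):

```go
package simulator

import (
	"k8s.io/client-go/informers"
	schedulerconfig "k8s.io/kubernetes/pkg/scheduler/apis/config"
	schedulerplugins "k8s.io/kubernetes/pkg/scheduler/framework/plugins"
	schedulerframework "k8s.io/kubernetes/pkg/scheduler/framework/v1alpha1"
)

// newFramework mirrors the cluster-autoscaler wiring linked above: build a
// Framework from the in-tree plugin registry instead of instantiating
// individual plugins by hand.
func newFramework(plugins *schedulerconfig.Plugins, informerFactory informers.SharedInformerFactory, sharedLister schedulerframework.SharedLister) (schedulerframework.Framework, error) {
	return schedulerframework.NewFramework(
		schedulerplugins.NewInTreeRegistry(),
		plugins, // selects which plugins to enable per extension point
		nil,     // per-plugin args; nil falls back to defaults
		schedulerframework.WithInformerFactory(informerFactory),
		schedulerframework.WithSnapshotSharedLister(sharedLister),
	)
}
```

From there, calls like `fwk.RunFilterPlugins(ctx, state, pod, nodeInfo)` drive the plugins without ever touching the internal cache.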

@ahg-g (Member) commented Jun 15, 2020

The snapshot package is used to build the FrameworkHandle interface.

No? Please refer me to the line where it is required.
You only need to provide an implementation of the SharedLister interface.

I think the dependency they are referring to comes from the plugins' unit tests; we should create a separate testing pkg and cut the dependency on the internal pkg in tests.

@k82cn (Member) commented Jun 15, 2020

There is no concept of predicates and priorities in the scheduler anymore; please update the title. The description of the issue is also not quite related to the title.

hm... which term are we going to use for that, plugins?

If the ask is to be able to import plugins, then this will be doable as part of the long-term plan of breaking up most of the scheduler code and moving it to the staging repo: https://docs.google.com/document/d/1WO-ixERpqkCSEXEq30YtEH_z_G-BoLKeCbkRJcKq3xA/edit

That's great that we have a long-term plan for that :)

@ahg-g (Member) commented Jun 16, 2020

hm... which term are we going to use for that, plugins?

Yes, plugins.

To summarize:

  1. The short-term plan is to remove the unit tests' dependency on the internal pkg by offering a testing pkg inside the framework to create a Snapshot/SharedLister. However, this is best effort from SIG Scheduling, so no guarantees are offered here.

  2. The long-term plan is to move the scheduler framework and plugins to the staging repo.

@k82cn k82cn changed the title Make predicates & priorities importable for other projects Make plugins importable for other projects Jun 18, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2020
@ahg-g (Member) commented Sep 16, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 16, 2020
@damemi (Contributor) commented Sep 16, 2020

I don't see it referenced here yet, but this is currently blocked by #92507, which initializes an external "component-helpers" repo with the goal of migrating internal/duplicate helpers there in order to break the framework's internal dependencies.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 15, 2020
@ingvagabund (Contributor)

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 5, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 5, 2021
@ingvagabund (Contributor)

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 6, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 5, 2021
@k8s-triage-robot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 4, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
