From c6aa802865a2f35cb9645bd3155a79bf14ee30a9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Frederico=20Mu=C3=B1oz?= <frederico.munoz@sas.com>
Date: Fri, 26 Jul 2024 15:08:42 +0100
Subject: [PATCH] Add SIG Scheduling spotlight

Closes: #519
Co-authored-by: Arvind Parekh <aruparekh@gmail.com>
---
 .../en/blog/2024/sig-scheduling-spotlight.md  | 384 ++++++++++++++++++
 1 file changed, 384 insertions(+)
 create mode 100644 content/en/blog/2024/sig-scheduling-spotlight.md

diff --git a/content/en/blog/2024/sig-scheduling-spotlight.md b/content/en/blog/2024/sig-scheduling-spotlight.md
new file mode 100644
index 000000000..4764548f9
--- /dev/null
+++ b/content/en/blog/2024/sig-scheduling-spotlight.md
@@ -0,0 +1,384 @@
+---
+layout: blog
+title: "Spotlight on SIG Scheduling"
+slug: sig-scheduling-spotlight-2024
+date: 2024-09-06
+author: "Arvind Parekh"
+---
+
+In this SIG Scheduling spotlight we talked with [Kensei Nakada](https://github.com/sanposhiho/), an
+approver in SIG Scheduling.
+
+## Introductions
+
+**Arvind:** **Hello, thank you for the opportunity to learn more about SIG Scheduling! Would you
+like to introduce yourself and tell us a bit about your role, and how you got involved with
+Kubernetes?**
+
+**Kensei**: Hi, thanks for the opportunity! I’m Kensei Nakada
+([@sanposhiho](https://github.com/sanposhiho/)), a software engineer at
+[Tetrate.io](https://tetrate.io/). I have been contributing to Kubernetes in my free time for more
+than 3 years, and now I’m an approver of SIG-Scheduling in Kubernetes. Also, I’m a founder/owner of
+two SIG subprojects,
+[kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator) and
+[kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension).
+
+## About SIG Scheduling
+
+**AP: That's awesome! You've been involved with the project for a long time. Can you provide a
+brief overview of SIG Scheduling and explain its role within the Kubernetes ecosystem?**
+
+**K**: As the name implies, our responsibility is to enhance scheduling within
+Kubernetes. Specifically, we develop the components that determine which Node is the best place for
+each Pod. In Kubernetes, our main focus is on maintaining the
+[kube-scheduler](https://kubernetes.io/docs/concepts/scheduling-eviction/kube-scheduler/), along
+with other scheduling-related components as part of our SIG subprojects.
+
+**AP: I see, got it! That makes me curious--what recent innovations or developments has SIG
+Scheduling introduced to Kubernetes scheduling?**
+
+**K**: From a feature perspective, there have been [several
+enhancements](https://kubernetes.io/blog/2023/04/17/fine-grained-pod-topology-spread-features-beta/)
+to `PodTopologySpread` recently. `PodTopologySpread` is a relatively new feature in the scheduler,
+and we are still in the process of gathering feedback and making improvements.
+
+Most recently, we have been focusing on a new internal enhancement called
+[QueueingHint](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/4247-queueinghint/README.md)
+which aims to enhance scheduling throughput. Throughput is one of our crucial metrics in
+scheduling. Traditionally, we have primarily focused on optimizing the latency of each scheduling
+cycle. QueueingHint takes a different approach, optimizing when to retry scheduling, thereby
+reducing the likelihood of wasting scheduling cycles.
+
+**AP: That sounds interesting! Are there any other interesting topics or projects you are currently
+working on within SIG Scheduling?**
+
+**K**: I’m leading the development of `QueueingHint`, which I just mentioned. Since it’s a big new
+undertaking for us, we’ve been facing many unexpected challenges, especially around scalability,
+and we’re working through each of them so that we can eventually enable it by default.
+
+Also, I believe
+[kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension)
+(a SIG subproject), which I started last year, would be interesting to many people. Kubernetes has
+various extensions from many components. Traditionally, extensions are provided via webhooks
+([extender](https://github.com/kubernetes/design-proposals-archive/blob/main/scheduling/scheduler_extender.md)
+in the scheduler) or Go SDK ([Scheduling
+Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) in the
+scheduler). However, these come with drawbacks - performance issues with webhooks and the need to
+rebuild and replace schedulers with Go SDK, posing difficulties for those seeking to extend the
+scheduler but lacking familiarity with it.  The project is trying to introduce a new solution to
+this general challenge - a [WebAssembly](https://webassembly.org/) based extension. Wasm allows
+users to build plugins easily, without worrying about recompiling or replacing their scheduler, and
+sidestepping performance concerns.
+
+Through this project, sig-scheduling has been gaining valuable insights about WebAssembly's
+interaction with large Kubernetes objects. And I believe the experience that we’re gaining should be
+useful broadly within the community, beyond sig-scheduling.
+
+**AP: Definitely! Now, there are currently 8 subprojects inside SIG Scheduling. Would you like to
+talk about them? Are there some interesting contributions by those teams you want to highlight?**
+
+**K**: Let me pick three subprojects: Kueue, KWOK, and descheduler.
+
+[Kueue](https://github.com/kubernetes-sigs/kueue):
+: Recently, many people have been trying to manage batch workloads with Kubernetes, and in 2022,
+the Kubernetes community founded
+[WG-Batch](https://github.com/kubernetes/community/blob/master/wg-batch/README.md) for better
+support for such batch workloads in Kubernetes.  [Kueue](https://github.com/kubernetes-sigs/kueue)
+is a project that plays a crucial role in it. It’s a job queueing controller, deciding when a job
+should wait, when a job should be admitted to start, and when a job should be preempted. Kueue aims
+to be installed on a vanilla Kubernetes cluster while cooperating with existing mature controllers
+(scheduler, cluster-autoscaler, kube-controller-manager, etc).
+
+[KWOK](https://github.com/kubernetes-sigs/kwok):
+: KWOK is a component with which you can create a cluster of thousands of Nodes in seconds. It’s
+  mostly useful for simulation/testing as a lightweight cluster, and in fact another SIG
+  subproject, [kube-scheduler-simulator](https://github.com/kubernetes-sigs/kube-scheduler-simulator),
+  uses KWOK in the background.
+
+[descheduler](https://github.com/kubernetes-sigs/descheduler):
+: Descheduler is a component that recreates Pods running on undesired Nodes.  In Kubernetes,
+scheduling constraints (`PodAffinity`, `NodeAffinity`, `PodTopologySpread`, etc) are honored only at
+Pod scheduling time, and there is no guarantee that the constraints remain satisfied afterwards.
+Descheduler evicts Pods violating their scheduling constraints (or other undesired conditions) so
+that they’re recreated and rescheduled.
+
+[Descheduling Framework](https://github.com/kubernetes-sigs/descheduler/blob/master/keps/753-descheduling-framework/README.md):
+: One very interesting ongoing project, similar to [Scheduling
+  Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) in the
+  scheduler, aiming to make descheduling logic extensible and allow maintainers to focus on building
+  a core engine of descheduler.
+
+**AP: Thank you for letting us know! And I have to ask, what are some of your favorite things about
+this SIG?**
+
+**K**: What I really like about this SIG is how actively engaged everyone is. We come from various
+companies and industries, bringing diverse perspectives to the table. Instead of these differences
+causing division, they actually generate a wealth of opinions. Each view is respected, and this
+makes our discussions both rich and productive.
+
+I really appreciate this collaborative atmosphere, and I believe it has been key to continuously
+improving our components over the years.
+
+## Contributing to SIG Scheduling
+
+**AP: Kubernetes is a community-driven project. Any recommendations for new contributors or
+beginners looking to get involved and contribute to SIG scheduling? Where should they start?**
+
+**K**: Let me start with a general recommendation for contributing to any SIG: a common approach is
+to look for
+[good-first-issue](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).
+However, you'll soon realize that many people worldwide are trying to contribute to the Kubernetes
+repository.
+
+I suggest starting by examining the implementation of a component that interests you. If you have
+any questions about it, ask in the corresponding Slack channel (e.g., #sig-scheduling for the
+scheduler, #sig-node for kubelet, etc).  Once you have a rough understanding of the implementation,
+look at issues within the SIG (e.g.,
+[sig-scheduling](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue+label%3Asig%2Fscheduling)),
+where you'll find more unassigned issues compared to good-first-issue ones.  You may also want to
+filter issues with the
+[kind/cleanup](https://github.com/kubernetes/kubernetes/issues?q=is%3Aopen+is%3Aissue++label%3Akind%2Fcleanup+)
+label, which often indicates lower-priority tasks and can be starting points.
+
+Specifically for SIG Scheduling, you should first understand the [Scheduling
+Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), which is
+the fundamental architecture of kube-scheduler.  Most of the implementation is found in
+[pkg/scheduler](https://github.com/kubernetes/kubernetes/tree/master/pkg/scheduler). I suggest
+starting with the
+[ScheduleOne](https://github.com/kubernetes/kubernetes/blob/0590bb1ac495ae8af2a573f879408e48800da2c5/pkg/scheduler/schedule_one.go#L66)
+function and then exploring deeper from there.
+
+Additionally, apart from the main kubernetes/kubernetes repository, consider looking into
+sub-projects. These typically have fewer maintainers and offer more opportunities to make a
+significant impact. Despite being called "sub" projects, many have a large number of users and a
+considerable impact on the community.
+
+And last but not least, remember that contributing to the community isn’t just about code.  While I
+talked a lot about code contributions, there are many ways to contribute, and each one
+is valuable. One comment on an issue, one piece of feedback on an existing feature, one review comment on a PR,
+one clarification on the documentation; every small contribution helps drive the Kubernetes
+ecosystem forward.
+
+**AP: Those are some pretty useful tips! And if I may ask, how do you assist new contributors in
+getting started, and what skills are contributors likely to learn by participating in SIG
+Scheduling?**
+
+**K**: Our maintainers are available to answer your questions in the #sig-scheduling Slack
+channel. By participating, you'll gain a deeper understanding of Kubernetes scheduling and have the
+opportunity to collaborate and network with maintainers from diverse backgrounds. You'll learn not
+just how to write code, but also how to maintain a large project, design and discuss new features,
+address bugs, and much more.
+
+## Future Directions
+
+**AP: What are some Kubernetes-specific challenges in terms of scheduling? Are there any particular
+pain points?**
+
+**K**: Scheduling in Kubernetes can be quite challenging because of the diverse needs of different
+organizations with different business requirements. Supporting all possible use cases in
+kube-scheduler is impossible. Therefore, extensibility is a key focus for us. A few years ago, we
+rearchitected kube-scheduler with [Scheduling
+Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), which
+offers flexible extensibility for users to implement various scheduling needs through plugins. This
+allows maintainers to focus on the core scheduling features and the framework runtime.
+
+Another major issue is maintaining sufficient scheduling throughput. Typically, a Kubernetes cluster
+has only one kube-scheduler, so its throughput directly affects the overall scheduling scalability
+and, consequently, the cluster's scalability. Although we have an internal performance test
+([scheduler_perf](https://github.com/kubernetes/kubernetes/tree/master/test/integration/scheduler_perf)),
+unfortunately, we sometimes overlook performance degradation in less common scenarios. It’s
+difficult as even small changes, which look irrelevant to performance, can lead to degradation.
+
+**AP: What are some upcoming goals or initiatives for SIG Scheduling? How do you envision the SIG evolving in the future?**
+
+**K**: Our primary goal is always to build and maintain an _extensible_ and _stable_ scheduling
+runtime, and I bet this goal will remain unchanged forever.
+
+As already mentioned, extensibility is key to solving the challenge of the diverse needs of
+scheduling. Rather than trying to support every different use case directly in kube-scheduler, we
+will continue to focus on enhancing extensibility so that it can accommodate various use
+cases. [kube-scheduler-wasm-extension](https://github.com/kubernetes-sigs/kube-scheduler-wasm-extension)
+that I mentioned is also part of this initiative.
+
+Regarding stability, introducing new optimizations like QueueingHint is one of our
+strategies. Additionally, maintaining throughput is also a crucial goal towards the future. We’re
+planning to enhance our throughput monitoring
+([ref](https://github.com/kubernetes/kubernetes/issues/124774)), so that we can notice degradation
+as much as possible on our own before releasing. But, realistically, we can't cover every possible
+scenario. We highly appreciate any attention the community can give to scheduling throughput and
+encourage feedback and alerts regarding performance issues!
+
+## Closing Remarks
+
+**AP: Finally, what message would you like to convey to those who are interested in learning more
+about SIG Scheduling?**
+
+**K**: Scheduling is one of the most complicated areas in Kubernetes, and you may find it difficult
+at first. But, as I shared earlier, you can find many opportunities for contributions, and many
+maintainers are willing to help you understand things. We know your unique perspective and skills
+are what make our open source so powerful :)
+
+Feel free to reach out to us in Slack
+([#sig-scheduling](https://kubernetes.slack.com/archives/C09TP78DV)) or
+[meetings](https://github.com/kubernetes/community/blob/master/sig-scheduling/README.md#meetings).
+I hope this article interests everyone and we can see new contributors!
+
+**AP: Thank you so much for taking the time to do this! I'm confident that many will find this
+information invaluable for understanding more about SIG Scheduling and for contributing to the SIG.**
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+-----------------------------------
+
+In this SIG Architecture spotlight I talked with [Madhav Jivrajani](https://github.com/MadhavJivrajani)
+(VMware), a member of the Code Organization subproject.
+
+## Introducing the Code Organization subproject
+
+**Frederico (FSM)**: Hello Madhav, thank you for your availability. Could you start by telling us a
+bit about yourself, your role and how you got involved in Kubernetes?
+
+**Madhav Jivrajani (MJ)**: Hello! My name is Madhav Jivrajani, I serve as a technical lead for SIG
+Contributor Experience and a GitHub Admin for the Kubernetes project. Apart from that I also
+contribute to SIG API Machinery and SIG Etcd, but more recently, I’ve been helping out with the work
+that is needed to help Kubernetes [stay on supported versions of
+Go](https://github.com/kubernetes/enhancements/tree/cf6ee34e37f00d838872d368ec66d7a0b40ee4e6/keps/sig-release/3744-stay-on-supported-go-versions),
+and it is through this that I am involved with the Code Organization subproject of SIG Architecture.
+
+**FSM**: A project the size of Kubernetes must have unique challenges in terms of code organization
+-- is this a fair assumption?  If so, what would you pick as some of the main challenges that are
+specific to Kubernetes?
+
+**MJ**: That’s a fair assumption! The first interesting challenge comes from the sheer size of the
+Kubernetes codebase. We have ≅2.2 million lines of Go code (which is steadily decreasing thanks to
+[dims](https://github.com/dims) and other folks in this sub-project!), and a little over 240
+dependencies that we rely on either directly or indirectly, which is why having a sub-project
+dedicated to helping out with dependency management is crucial: we need to know what dependencies
+we’re pulling in, what versions these dependencies are at, and tooling to help make sure we are
+managing these dependencies across different parts of the codebase in a consistent manner.
+
+Another interesting challenge with Kubernetes is that we publish a lot of Go modules as part of the
+Kubernetes release cycles, one example of this is
+[`client-go`](https://github.com/kubernetes/client-go). However, we as a project would also like the
+benefits of having everything in one repository to get the advantages of using a monorepo, like
+atomic commits... so, because of this, code organization works with other SIGs (like SIG Release) to
+automate the process of publishing code from the monorepo to downstream individual repositories
+which are much easier to consume, and this way you won’t have to import the entire Kubernetes
+codebase!
+
+## Code organization and Kubernetes
+
+**FSM**: For someone just starting contributing to Kubernetes code-wise, what are the main things
+they should consider in terms of code organization? How would you sum up the key concepts?
+
+**MJ**: I think one of the key things to keep in mind at least as you’re starting off is the concept
+of staging directories. In the [`kubernetes/kubernetes`](https://github.com/kubernetes/kubernetes)
+repository, you will come across a directory called
+[`staging/`](https://github.com/kubernetes/kubernetes/tree/master/staging). The sub-folders in this
+directory serve as a bunch of pseudo-repositories. For example, the
+[`kubernetes/client-go`](https://github.com/kubernetes/client-go) repository that publishes releases
+for `client-go` is actually a [staging
+repo](https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/client-go).
+
+**FSM**: So the concept of staging directories fundamentally impacts contributions?
+
+**MJ**: Precisely, because if you’d like to contribute to any of the staging repos, you will need to
+send in a PR to its corresponding staging directory in `kubernetes/kubernetes`. Once the code merges
+there, we have a bot called the [`publishing-bot`](https://github.com/kubernetes/publishing-bot)
+that will sync the merged commits to the required staging repositories (like
+`kubernetes/client-go`). This way we get the benefits of a monorepo but we also can modularly
+publish code for downstream consumption. PS: The `publishing-bot` needs more folks to help out!
+
+For more information on staging repositories, please see the [contributor
+documentation](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/staging.md).
+
+**FSM**: Speaking of contributions, the very high number of contributors, both individuals and
+companies, must also be a challenge: how does the subproject operate in terms of making sure that
+standards are being followed?
+
+**MJ**: When it comes to dependency management in the project, there is a [dedicated
+team](https://github.com/kubernetes/org/blob/a106af09b8c345c301d072bfb7106b309c0ad8e9/config/kubernetes/org.yaml#L1329)
+that helps review and approve dependency changes. These are folks who have helped lay the foundation
+of much of the
+[tooling](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/vendor.md)
+that Kubernetes uses today for dependency management. This tooling helps ensure there is a
+consistent way that contributors can make changes to dependencies. The project has also worked on
+additional tooling to signal statistics of dependencies that are being added or removed:
+[`depstat`](https://github.com/kubernetes-sigs/depstat).
+
+Apart from dependency management, another crucial task that the project does is management of the
+staging repositories. The tooling for achieving this (`publishing-bot`) is completely transparent to
+contributors and helps ensure that the staging repos get a consistent view of contributions that are
+submitted to `kubernetes/kubernetes`.
+
+Code Organization also works towards making sure that Kubernetes [stays on supported versions of
+Go](https://github.com/kubernetes/enhancements/tree/cf6ee34e37f00d838872d368ec66d7a0b40ee4e6/keps/sig-release/3744-stay-on-supported-go-versions). The
+linked KEP provides more context on why we need to do this. We collaborate with SIG Release to
+ensure that we are testing Kubernetes as rigorously and as early as we can on Go releases and
+working on changes that break our CI as a part of this. An example of how we track this process can
+be found [here](https://github.com/kubernetes/release/issues/3076).
+
+## Release cycle and current priorities
+
+**FSM**: Is there anything that changes during the release cycle?
+
+**MJ**: During the release cycle, specifically before code freeze, there are often changes that go in
+that add/update/delete dependencies or fix code that needs fixing as part of our effort to stay on
+supported versions of Go.
+
+Furthermore, some of these changes are also candidates for
+[backporting](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-release/cherry-picks.md)
+to our supported release branches.
+
+**FSM**: Is there any major project or theme the subproject is working on right now that you would
+like to highlight?
+
+**MJ**: I think one very interesting and immensely useful change that
+has been recently added (and I take the opportunity to specifically
+highlight the work of [Tim Hockin](https://github.com/thockin) on
+this) is the introduction of [Go workspaces to the Kubernetes
+repo](/blog/2024/03/19/go-workspaces-in-kubernetes/). A lot of our
+current tooling for dependency management and code publishing, as well
+as the experience of editing code in the Kubernetes repo, can be
+significantly improved by this change.
+
+## Wrapping up
+
+**FSM**: How would someone interested in the topic start helping the subproject?
+
+**MJ**: The first step, as is the first step with any project in Kubernetes, is to join our Slack:
+[slack.k8s.io](https://slack.k8s.io), and after that join the `#k8s-code-organization` channel. There are also
+[code-organization office
+hours](https://github.com/kubernetes/community/tree/master/sig-architecture#meetings) that you can
+choose to attend. Timezones are hard, so feel free to also look at the recordings
+or meeting notes and follow up on Slack!
+
+**FSM**: Excellent, thank you! Any final comments you would like to share?
+
+**MJ**: The Code Organization subproject always needs help! Especially areas like the publishing
+bot, so don’t hesitate to get involved in the `#k8s-code-organization` Slack channel.