Kubeflow Steering Committee Elections - Testimonial Phase - Johnu George #675

Closed
akgraner opened this issue Dec 25, 2023 · 2 comments

Comments

@akgraner
Contributor

akgraner commented Dec 25, 2023

Johnu George

LinkedIn: https://www.linkedin.com/in/johnu-george-83036610/

Github: https://github.com/johnugeorge

Q: Why do you think you would be a good candidate for the Kubeflow Steering Committee?

Kubeflow is one of the most mature open-source cloud-native MLOps platforms used in production at various companies. I started my contributions in 2017 with TF Operator, one of the initial projects in Kubeflow for distributed TensorFlow training. With my decade-long MLOps experience, I have been active in the community with code contributions, leading Kubeflow Working Groups, maintaining Kubeflow core components, creating component releases, setting quality checks in CI/CD, mentoring new developers, providing user support and creating roadmaps with a long-term vision based on usage feedback. I have made significant contributions to the overall Kubeflow customer user journey - "Build model -> Train model -> Tune model -> Serve model" with a consistent machine learning workflow from the local environment to on-premise/cloud deployment.
The challenges in the MLOps domain are quite different from when Kubeflow started five years ago. Further, technologies and infrastructure requirements have evolved. For example, with the recent GenAI/LLM wave, MLOps tools should be capable of handling large models with optimized accelerator utilization. In the LLM space, I lead directional changes to steer Kubeflow as a front-runner and a first choice for MLOps needs. As part of this initiative, I have started focused efforts in the Training Operator component for large-scale LLM training/fine-tuning (https://github.com/kubeflow/training-operator/blob/master/docs/proposals/train_api_proposal.md), Katib and KServe (kserve/open-inference-protocol#18).

I have led/participated in numerous Kubeflow community meetups, Kubeflow contributor summits (2018-), user/developer feedback sessions and public tutorials highlighting enterprise Kubeflow use cases. Investing in outreach programmes to highlight enterprise readiness is essential for a community project. I contributed to one of the early PM and Outreach efforts (http://bit.ly/kf-outreach-meeting-notes). I am engaged with numerous Kubeflow PM and Community efforts (Thanks, Josh and Amber).

I am involved with other OSS and ML communities like Apache, MLCommons and CNCF. Integrating Kubeflow with third-party applications is essential to build a robust ecosystem that addresses the gaps in the MLOps domain. For example, I have added a Kubeflow runner for MLCube (mlcommons/mlcube#202), echoing the Kubeflow mission to run consistent ML workflows from the local environment to the Kubernetes environment.

I was involved in one of the early Kubeflow metadata efforts. Keeping Kubeflow attractive and building confidence among users and developers is essential to drive next-gen growth and investment. I am working towards providing tighter integrations with core Kubeflow components on the new model registry proposal (https://docs.google.com/document/d/1T3KfOqIfJohp0s1koQ2XrJJQQhj7TECO-m2xPsW59_c/edit#heading=h.t7b42oqmiz2y).

CNCF incubation is the growth catalyst for the next phase. I was involved in the early feasibility conversations and in driving incubation efforts (https://docs.google.com/document/d/1HXAl6ew5ZUgQaAnEHS1qEPxA5puUz2knUwXOZHU39sA/edit). Huge shoutout to all my colleagues for making this a reality.

Q: How long have you been involved in the Kubeflow Community? What projects have you been actively involved with?

I was part of the initial Kubeflow community, involved in the early conversations with Google in 2017. Since then, I have led/maintained various Kubeflow components and am one of the all-time top contributors across all Kubeflow repositories. I have been leading the Training WG (Training operators) and the AutoML WG (Katib) since their formation. My notable contributions (non-code contributions excluded) are:

  1. Designed and implemented the first version of Training Operator. The current stable v1 version is based on these changes. (Refactoring TF operator code, training-operator#767)
  2. Designed and implemented the PyTorch Operator CRD for distributed PyTorch jobs. The current PyTorch job support in Training Operator is based on these changes. (Pytorch v1alpha2 api implementation, pytorch-operator#54)
  3. Designed a common JobController framework to support distributed training of arbitrary ML frameworks. All frameworks, including MXNet, XGBoost, and PaddlePaddle, are based on this common framework. (Shared implementation of operator code, training-operator#773)
  4. Designed and implemented the first version of Katib. The current beta v1beta1 version is based on these changes. (Adding initial v1alpha2 API controller, katib#457)
  5. Contributed to numerous Kubeflow repositories, including Kubeflow Core, KServe, Manifests, etc.

Q: How long have you been involved in open-source?

I have been active in open source for over ten years, starting with core contributions to Ceph Storage (OSS Storage project). I am currently leading various open-source initiatives in MLCommons and Apache. I am an early contributor to the Knative Serving core for Serverless workloads. I am on the co-founding team of the MLPerf Storage benchmark aimed at characterizing storage for ML workloads and a technical lead of MedPerf for medical benchmarks. I am an Apache PMC member, contributing to Apache projects including Mnemonic.

Q: Is there anything else you would like the community to know about you that you believe would help eligible voters make their decision?

Here are some of my critical focus areas, based on user feedback:

  1. Support for large models for data center deployments, compact/quantized models for edge deployments
  2. Focus on transformer-based models with optimizations like tensor/model parallelism, flash attention, etc.
  3. Support for better data handling and I/O processing pipelines for efficient training workflows
  4. Fill the current Kubeflow gaps in metadata tracking, model and artifact registry, unified SDK experience, etc.
  5. Better separation of the Kubeflow views for the data scientist and ML admin personas. Data scientists should have a pure notebook experience, while ML admins should have better system visibility with observability.

Q: Links to any external sites (projects, hobbies, etc.) that you would like to share that would help people make a decision.

CNCF Kubeflow talk (David Aronchick, Elvira Dzhuraeva, Johnu George) - https://www.youtube.com/watch?v=B4soMk6AzOk

CNCF Cloud Native AI WG Charter contributor - https://docs.google.com/document/d/1yl4NJFMaOq8LXK-m6tp0gcWjPT5KIV1XSAaoFcWicWk/edit#heading=h.rw7143kar1zv

MLPerf Storage Co-chair - https://mlcommons.org/working-groups/benchmarks/storage/

MedPerf Technical Lead - https://mlcommons.org/working-groups/data/medical/

Katib Paper Author - A Scalable and Cloud-Native Hyperparameter Tuning System - https://arxiv.org/abs/2006.02085

Kubeflow MLCube integration - mlcommons/mlcube#202

Apache Bigdata talk - http://events17.linuxfoundation.org/sites/events/files/slides/Mnemonic.pdf

KubeCon Presentation - https://www.youtube.com/watch?v=OkAoiA6A2Ac

CODS-COMAD Kubeflow Tutorial - https://cods-comad.in/2022/end-to-end-machine-learning-using-kubeflow/

ODSC Kubeflow Presentation - https://www.youtube.com/watch?v=32UNhcXSDJc

AI/ML Systems GenAI Invited Panelist - https://www.aimlsystems.org/2023/panels/

Training and AutoML Summit Organizer - https://docs.google.com/document/d/1Xg5v9pMzJMkWxDbwlVqx9bPqm87pVEd_rmPoaYCg5p4/edit

@jbottum
Contributor

jbottum commented Jan 3, 2024

Johnu stands as a distinguished multi-year contributor to the Kubeflow community, leaving a lasting impact through critical contributions. As a dedicated code contributor in two essential working groups, Johnu’s expertise extends to comprehensive documentation efforts. Displaying leadership qualities, Johnu has taken on the role of working group lead for two separate teams, showcasing a profound commitment to the success of Kubeflow. Additionally, as a Working Group Lead, Johnu exemplifies organizational and coordination skills, while actively mentoring to nurture talent within the community. Johnu has shared valuable insights as a Kubeflow speaker at a variety of conferences and venues, contributing significantly to the dissemination of knowledge. Notably, Johnu's unwavering commitment to the Kubeflow community persists through changes in employment. Based in India, Johnu brings a crucial international perspective to the collaborative and global efforts of the Kubeflow ecosystem. These qualifications make Johnu a great candidate for the Kubeflow Steering Committee.

@andreyvelich
Member

I have worked with Johnu on the Kubeflow project for the last 5 years, and he played a pivotal role in my joining the community in 2018.

Over the past 6 years, Johnu has been actively involved in every Kubeflow component. He contributed to Kubeflow releases, developed and improved core Kubeflow components, wrote blog posts and research papers, spoke about Kubeflow at various conferences, and consistently fostered the growth of the Kubeflow community.

Johnu’s recent efforts in implementing LLM features in Kubeflow allow users to easily build their Generative AI models and applications on scalable Kubeflow infrastructure.

All of these demonstrate Johnu’s long-term commitment to the future of the Kubeflow project.
