Image Selection for ScaledJob based on Event Data #5100

benhid · 2023-10-19T09:37:40Z

Proposal

Current functionality allows us to scale and execute jobs based on events but with a fixed job template. Instead of a fixed image specified in the ScaledJob, deciding the image to run dynamically would be beneficial.

For example, consider a scenario where a unified message queue contains events related to image processing and data validation. Each type of event demands its own processing logic and, hence, its own Docker image / Job.

Currently, we'd have to set up two distinct ScaledJobs, each listening to the same queue but filtering events and scaling based on their specific criteria.

With the proposed functionality, a single ScaledJob could listen to the unified queue. Upon receiving an event, it would inspect the event type or payload and dynamically decide whether to use the Docker image for image processing or the one for data validation.

Use-Case

In our organization, we handle diverse event-driven tasks that require different processing logic. While these tasks share a common trigger mechanism (via the same message source, e.g., RabbitMQ), the processing logic, and thus the Docker image, differs depending on the specifics of the event.

As our system evolves and introduces new event types with distinct processing logic, this feature offers the flexibility to accommodate these changes without creating a separate ScaledJob.

Is this a feature you are interested in implementing yourself?

No

Anything else?

I'm unsure whether KEDA's current architecture can implement the proposed feature. A more straightforward approach might involve enabling the scaler to choose the target reference job based on the topic name. For instance, Kafka can structure topics using a dot (.) separator. Taking a topic pattern such as "events.*" as an example, the Kafka scaler could determine which job to trigger based on whether the event arrives at "events.busybox" or "events.python3".
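To illustrate the topic-based idea, here is a minimal sketch of how a topic name could be resolved to an image. It is not KEDA code: the `image_for_topic` function, the `IMAGE_BY_SUFFIX` table, and the image tags are all hypothetical and only show the lookup itself.

```python
# Hypothetical lookup table: topic suffix -> container image.
IMAGE_BY_SUFFIX = {
    "busybox": "busybox:1.36",
    "python3": "python:3.12-slim",
}


def image_for_topic(topic, prefix="events."):
    """Return the image mapped to a topic under `prefix`, or None if there is none."""
    if not topic.startswith(prefix):
        return None  # topic is outside the "events.*" pattern
    suffix = topic[len(prefix):]
    return IMAGE_BY_SUFFIX.get(suffix)


print(image_for_topic("events.busybox"))
print(image_for_topic("events.python3"))
```

Note that even this simple version still needs a table relating topics to images, or a convention that derives the image directly from the topic name.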

JorTurFer · 2023-10-19T20:04:07Z

Hi
Nice proposal, but IMHO we shouldn't do this because it means that we would get extra information about the messages, and I think that we shouldn't cross the line of inspecting message contents. Currently, I'd say that you could already do this using multiple ScaledJobs with different specs, filtering by topic, couldn't you?

WDYT @kedacore/keda-contributors ?

zroubalik · 2023-10-19T20:11:56Z

This would be very hard to implement in a generic way. We would need to cover all the different technologies and transport protocols (I wish everyone used CloudEvents :) ). Also, inspecting the actual data raises security concerns.

I think that the current approach with multiple ScaledJobs is not such a big overhead and solves the problem. Or is there anything in particular?

benhid · 2023-10-19T20:35:52Z

Thank you for your feedback! We are indeed using multiple ScaledJobs to achieve this (>100 at the moment). However, our users have the flexibility to execute jobs using any base image, which is not feasible with the current setup. We don't know all the potential images they might choose in advance, making it challenging to predefine ScaledJobs for each one.

zroubalik · 2023-10-19T20:54:47Z

@benhid gotcha. But even if we somehow implemented this, you would still need to define the relation between images and message source, wouldn't you?

benhid · 2023-10-20T07:23:15Z

> @benhid gotcha. But even if we somehow implemented this, you would still need to define the relation between images and message source, wouldn't you?

In fact, having to define that relation is what I'm trying to avoid.

I keep thinking about this, and I can't come up with a proper solution. Perhaps KEDA isn't the right tool for this specific use case, and even if it is, implementing this feature might introduce too much complexity. 😟

Let me know what you think 👍

SpiritZhou · 2023-10-20T08:05:21Z

> Thank you for your feedback! We are indeed using multiple ScaledJobs to achieve this (>100 at the moment). However, our users have the flexibility to execute jobs using any base image, which is not feasible with the current setup. We don't know all the potential images they might choose in advance, making it challenging to predefine ScaledJobs for each one.

I have a question. Is it possible to create a ScaledJob immediately according to your user's image choice?

benhid · 2023-10-20T08:16:37Z

> I have a question. Is it possible to create a ScaledJob immediately according to your user's image choice?

Yes, I think so. It would involve automating the creation of the ScaledJobs using Kubernetes APIs. However, I'm concerned about users creating ScaledJob resources for one-off tasks and then not cleaning them up.
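As a rough sketch of what that automation could generate, the snippet below builds one ScaledJob manifest per image. The helper name `scaled_job_manifest`, the naming scheme, the `managed-by` label value, and the queue name are assumptions made for illustration; connection and authentication settings for the RabbitMQ trigger are deliberately left out.

```python
import hashlib
import json


def scaled_job_manifest(image, queue_name, namespace="default"):
    """Build a ScaledJob manifest (as a dict) that runs `image` for `queue_name`."""
    # Derive a stable, DNS-safe name from the image reference (hypothetical scheme).
    digest = hashlib.sha256(image.encode()).hexdigest()[:10]
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledJob",
        "metadata": {
            "name": f"job-{digest}",
            "namespace": namespace,
            # Label used later to find generated objects (assumed value).
            "labels": {"app.kubernetes.io/managed-by": "scaledjob-generator"},
        },
        "spec": {
            "jobTargetRef": {
                "template": {
                    "spec": {
                        "containers": [{"name": "worker", "image": image}],
                        "restartPolicy": "Never",
                    }
                }
            },
            "triggers": [
                {
                    "type": "rabbitmq",
                    # Host and auth settings omitted; they depend on the cluster.
                    "metadata": {
                        "queueName": queue_name,
                        "mode": "QueueLength",
                        "value": "1",
                    },
                }
            ],
        },
    }


print(json.dumps(scaled_job_manifest("python:3.12-slim", "events.python3"), indent=2))
```

The resulting JSON could then be submitted through the Kubernetes API or `kubectl apply -f`. Because the name is derived from the image, requesting the same image twice maps to the same object rather than a new one.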

zroubalik · 2023-10-20T09:23:26Z

> > I have a question. Is it possible to create a ScaledJob immediately according to your user's image choice?

> Yes, I think so. It would involve automating the creation of the ScaledJobs using Kubernetes APIs. However, I'm concerned about users creating ScaledJob resources for one-off tasks and then not cleaning them up.

That could be solved by a very simple operator imho or even a cron job.
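A minimal sketch of the selection step such a cleanup job might run is shown below. It assumes the ScaledJob objects have already been listed from the API as plain dicts, and it uses a purely age-based rule with a hypothetical 7-day TTL; the function name `stale_scaled_jobs` is illustrative, and a real cleanup would likely also check whether the queue still has work.

```python
from datetime import datetime, timedelta, timezone


def stale_scaled_jobs(items, now, ttl=timedelta(days=7)):
    """Return the names of ScaledJobs whose age exceeds `ttl`."""
    stale = []
    for item in items:
        meta = item["metadata"]
        # Kubernetes serializes creationTimestamp as UTC, e.g. 2023-10-20T09:23:26Z.
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - created > ttl:
            stale.append(meta["name"])
    return stale


# Example input shaped like the `items` field of a list response.
example_items = [
    {"metadata": {"name": "old-job", "creationTimestamp": "2023-10-20T09:23:26Z"}},
    {"metadata": {"name": "new-job", "creationTimestamp": "2023-10-29T00:00:00Z"}},
]
example_now = datetime(2023, 10, 30, tzinfo=timezone.utc)
print(stale_scaled_jobs(example_items, example_now))
```

The returned names would then be deleted through the Kubernetes API by whatever runs the job, whether a CronJob or a small operator.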

benhid · 2023-10-22T19:29:51Z

> That could be solved by a very simple operator imho or even a cron job.

I'm not entirely convinced that a cron job would be optimal. It doesn't seem like a particularly robust solution, especially if we start scaling and have a high volume of these jobs. The overhead of constantly creating, checking, and cleaning up these resources might be substantial. Could you elaborate more on how you envision this operator working?

stale · 2023-12-22T07:55:01Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale · 2023-12-29T09:59:43Z

This issue has been automatically closed due to inactivity.

benhid added the feature-request and needs-discussion labels Oct 19, 2023

stale bot added the stale label Dec 22, 2023

stale bot closed this as completed Dec 29, 2023
