
[Feature]: Support Inference Overrides for mm_processor_kwargs #8742

Closed
1 of 3 tasks
alex-jw-brooks opened this issue Sep 23, 2024 · 1 comment · Fixed by #9131

alex-jw-brooks commented Sep 23, 2024

🚀 The feature, motivation and pitch

Follow-up on #8657, which added support for passing initialization-time mm_processor_kwargs to the input mapper / input processor / max token count calculations / dummy data, for models whose architecture-specific implementations accept them as keyword arguments. It would also be nice to be able to pass such kwargs at inference time as part of the multi-modal data, e.g.:

llm.generate({"multi_modal_data": {"image": {"data": image, "mm_processor_kwargs": image_kwargs}}})

For models that support additional mm_processor_kwargs, the precedence would be:

  • The initialization time mm_processor_kwargs take priority over the config values
  • The inference time mm_processor_kwargs take priority over the config values and the initialization mm_processor_kwargs

Alternatives

Keep mm_processor_kwargs as initialization-time only.

Additional context

Per-request mm_processor_kwargs need to be handled correctly:

  • In the input mapper
  • In the input processor

Some care needs to be taken around the input mapper, which falls back to a wrapper around HF resources, e.g., image processors, since those resources may pull values out of the config. More specifically:

  • We should avoid initializing and managing multiple multimodal processors with different processor kwargs if possible
  • Init-time processor kwargs and per-request processor kwargs should behave identically - this likely requires the HF resource's preprocess signature to closely match its init signature by default
    • If for whatever reason init/preprocess are not well-aligned, the mapper / processor can be implemented in the vLLM model class as a backup plan to fix it
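One way to keep init-time and per-request kwargs behaving identically is to filter any override dict down to the keyword arguments the target callable (e.g., an HF image processor's preprocess method) actually accepts, so mismatched keys are dropped rather than raising. A hedged sketch, with illustrative names:

```python
# Filter a kwargs dict against a callable's signature so only keys the
# callable accepts survive. Callables taking **kwargs would need extra
# VAR_KEYWORD handling, omitted here for brevity.
import inspect


def filter_allowed_kwargs(callable_obj, overrides: dict) -> dict:
    """Drop override keys that the callable's signature does not accept."""
    params = inspect.signature(callable_obj).parameters
    return {k: v for k, v in overrides.items() if k in params}


# Hypothetical stand-in for an HF image processor's preprocess method.
def preprocess(image, num_crops: int = 4, do_resize: bool = True):
    ...


filtered = filter_allowed_kwargs(preprocess, {"num_crops": 16, "bogus_key": 1})
# filtered == {"num_crops": 16}; "bogus_key" is silently dropped
```

Applying the same filter to both init-time and per-request overrides is one way to guarantee they are accepted or rejected by identical rules.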


alex-jw-brooks commented Sep 23, 2024

I think this should be straightforward to implement - I plan to try it in the next week or so 🤞
