Sweep job support for pipeline component in Azure Machine Learning #30929

Open
donin1129 opened this issue Mar 3, 2025 · 1 comment
Labels
Auto-Assign Auto assign by bot customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning az ml Service Attention This issue is responsible by Azure service team.

Comments

@donin1129

Related command
az ml component create -f XXX.yaml
az ml job create -f XXX.yaml

Is your feature request related to a problem? Please describe.
No.

Describe the solution you'd like
We are implementing a Retrieval-Augmented Generation (RAG) system using Azure Machine Learning, structured into multiple components such as:

  • process-data-input
  • retrieve_docs_from_search_index
  • build_prompt
  • query_llm
  • evaluate

We want to perform hyperparameter tuning across these components. For example, we aim to optimize parameters such as:

  • The number of documents retrieved from the search index
  • Different prompt templates
  • Various q values in LLM queries

However, AzureML Sweep Jobs currently seem to support only command components, which makes it impossible to tune hyperparameters across components. We tried grouping our command components into a pipeline component, but the sweep job does not start correctly. It would be highly beneficial if Sweep Jobs also supported pipeline components, so that we could bundle our components into a single pipeline component and tune hyperparameters efficiently.
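
To make the request concrete, here is a minimal local sketch in plain Python of what "sweeping across components" would mean. It is not Azure ML SDK or CLI code and calls no Azure service. The stage functions are hypothetical stand-ins named after the components listed above, and the parameter names (`top_k`, `prompt_template`, `q`), their values, the sample documents, and the toy metric are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical search space: names and values are illustrative assumptions.
SEARCH_SPACE = {
    "top_k": [3, 5],
    "prompt_template": [
        "Answer briefly: {question}\n{context}",
        "Context:\n{context}\nQuestion: {question}",
    ],
    "q": [0.1, 0.9],
}

# Stand-in for a search index; no real index is queried.
DOCS = ["doc-a", "doc-b", "doc-c", "doc-d", "doc-e"]


def process_data_input(question):
    # Stand-in for the process-data-input component.
    return question.strip()


def retrieve_docs_from_search_index(question, top_k):
    # Stand-in: returns the first top_k documents regardless of the question.
    return DOCS[:top_k]


def build_prompt(question, docs, prompt_template):
    return prompt_template.format(question=question, context="\n".join(docs))


def query_llm(prompt, q):
    # Stand-in: no model is called; the result only records its inputs.
    return {"prompt_chars": len(prompt), "q": q}


def evaluate(answer):
    # Toy metric, chosen only so that trials produce different scores.
    return answer["prompt_chars"] * answer["q"]


def run_pipeline(question, top_k, prompt_template, q):
    # Each hyperparameter is consumed by a different stage of the pipeline.
    cleaned = process_data_input(question)
    docs = retrieve_docs_from_search_index(cleaned, top_k)
    prompt = build_prompt(cleaned, docs, prompt_template)
    answer = query_llm(prompt, q)
    return evaluate(answer)


def sweep(question, search_space):
    # Grid search: one full pipeline run per parameter combination.
    names = list(search_space)
    trials = []
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((params, run_pipeline(question, **params)))
    return trials


trials = sweep("  What is a sweep job?  ", SEARCH_SPACE)
best_params, best_score = max(trials, key=lambda trial: trial[1])
```

The point of the sketch is that each trial is one run of the whole pipeline, with every swept parameter routed to the stage that uses it. That is the behavior we would like a sweep job over a pipeline component to provide, with each stage remaining its own component.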

Describe alternatives you've considered
Our current approach involves creating a standalone command component that pulls in the source code of all the other components. However, this:

  • Defeats the purpose of separating concerns across multiple components
  • Violates the Single Responsibility Principle in system design
  • Increases maintenance complexity

Additional context
Extending Sweep Job support to pipeline components would greatly improve modularity, maintainability, and efficiency in hyperparameter tuning.

@yonzhan
Collaborator

yonzhan commented Mar 3, 2025

Thank you for opening this issue, we will look into it.

@microsoft-github-policy-service microsoft-github-policy-service bot added customer-reported Issues that are reported by GitHub users external to the Azure organization. Auto-Assign Auto assign by bot Service Attention This issue is responsible by Azure service team. Machine Learning az ml labels Mar 3, 2025