feat: add documentation for mlflow autologging on website #1508
Conversation
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master    #1508      +/-   ##
==========================================
- Coverage   84.30%   82.75%   -1.55%
==========================================
  Files         296      296
  Lines       14906    14906
  Branches      717      717
==========================================
- Hits        12566    12336     -230
- Misses       2340     2570     +230
Continue to review full report at Codecov.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Magnifique, one small Q
website/docs/mlflow/autologging.md (Outdated)
2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
3. Set Spark configuration:
```
spark.conf.set("spark.mlflow.pysparkml.autolog.logModelAllowlistFile", "/dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt")
```
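(For context, the allowlist file referenced here is a plain-text list of fully qualified model class names, one per line. A minimal illustrative sketch of a customized file; these entries are examples, not a complete list:)

```
pyspark.ml.classification.LogisticRegressionModel
pyspark.ml.classification.RandomForestClassificationModel
pyspark.ml.regression.LinearRegressionModel
```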
does this accept URLs? Perhaps we can host a reasonable default in our blob!
We might also want to mention this can be set in cluster configs too
Thanks for this idea! I'll raise a PR to mlflow to make URLs work, lol, this sounds so reasonable. I also just tested that the above `spark.conf.set` doesn't work because the cluster is already started, so I'll change the docs to set the Spark config inside the cluster configuration instead.
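(For reference, setting this at cluster creation means adding the key to the cluster's Spark config in the Databricks UI rather than calling `spark.conf.set` in a notebook. A sketch of the entry, using the same key as above and an illustrative dbfs path:)

```
spark.mlflow.pysparkml.autolog.logModelAllowlistFile /dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt
```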
## Configuration process in Databricks as an example

1. Install MLflow via `%pip install mlflow`
2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
if we can pass a URL in this param, perhaps we can make this platform agnostic or at least give advice for both Synapse and Databricks
Sure, I'll add an example in step 2 of the section "To enable autologging for SynapseML".
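(To make the thread concrete: once the allowlist is configured, enabling autologging in a notebook goes through MLflow's standard PySpark ML entry point. A minimal sketch, assuming MLflow is already installed on the cluster:)

```python
import mlflow

# Enable autologging for pyspark.ml estimators; fitted models are only
# logged if their class appears in the configured log_model_allowlist.txt.
mlflow.pyspark.ml.autolog()
```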
Love this! Left a few copy edits and some minor questions and suggestions. Can't wait to get this on the website, and thanks for building this out, you freakin' PM/Dev combo.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Summary
Add documentation for MLflow autologging on the website.
Tests
Rendered the website and checked the page.
Dependency changes
None.
AB#1785118