feat: add documentation for mlflow autologging on website #1508
Conversation
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Codecov Report
@@            Coverage Diff             @@
##           master    #1508      +/-   ##
==========================================
- Coverage   84.30%   82.75%   -1.55%
==========================================
  Files         296      296
  Lines       14906    14906
  Branches      717      717
==========================================
- Hits        12566    12336     -230
- Misses       2340     2570     +230
Continue to review full report at Codecov.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Magnifique, one small Q
website/docs/mlflow/autologging.md (Outdated)
2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
3. Set Spark configuration:
```
spark.conf.set("spark.mlflow.pysparkml.autolog.logModelAllowlistFile", "/dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt")
```
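(For context, the allowlist file referenced here is a plain-text list of fully qualified model class names, one per line. A minimal illustrative sketch of a customized file; these entries are examples, not a complete list:)

```
pyspark.ml.classification.LogisticRegressionModel
pyspark.ml.classification.RandomForestClassificationModel
pyspark.ml.regression.LinearRegressionModel
```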
does this accept URLs? Perhaps we can host a reasonable default in our blob!
We might also want to mention this can be set in cluster configs too
Thanks for this idea! I'll raise a PR to mlflow to make URLs work, lol, this sounds so reasonable. I also just tested that the above `spark.conf.set` doesn't work because the cluster is already started, so I'll change the docs to set the Spark config inside the cluster configuration instead.
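(For reference, setting this at cluster creation means adding the key to the cluster's Spark config in the Databricks UI rather than calling `spark.conf.set` in a notebook. A sketch of the entry, using the same key as above and an illustrative dbfs path:)

```
spark.mlflow.pysparkml.autolog.logModelAllowlistFile /dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt
```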
## Configuration process in Databricks as an example

1. Install MLflow via `%pip install mlflow`
2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
if we can pass a URL in this param, perhaps we can make this platform agnostic or at least give advice for both Synapse and Databricks
Sure, I'll add an example in step 2 of the section "To enable autologging for SynapseML".
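(To make the thread concrete: once the allowlist is configured, enabling autologging in a notebook goes through MLflow's standard PySpark ML entry point. A minimal sketch, assuming MLflow is already installed on the cluster:)

```python
import mlflow

# Enable autologging for pyspark.ml estimators; fitted models are only
# logged if their class appears in the configured log_model_allowlist.txt.
mlflow.pyspark.ml.autolog()
```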
Love this! Left a few copy edits and some minor questions and suggestions. Can't wait to get this on the website, and thanks for building this out, you freakin' PM/Dev combo.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Summary
Add documentation for MLflow autologging on the website.
Tests
Rendered the website and checked the page.
Dependency changes
None.
AB#1785118