feat: Add support for ContextualBandit in the VW module #896

jackgerrits · 2020-07-16T15:15:49Z

Add VowpalWabbitContextualBandit, VowpalWabbitContextualBanditModel, and ColumnVectorSequencer classes
Update com.github.vowpalwabbit dependency version for CB support
Add tests in Scala and Python for the new functionality
Other featurizer improvements from @eisber

src/main/python/mmlspark/vw/VowpalWabbitContextualBandit.py

src/main/scala/com/microsoft/ml/spark/vw/VowpalWabbitClassifier.scala

eisber · 2020-07-16T15:48:43Z

src/main/scala/com/microsoft/ml/spark/vw/VowpalWabbitContextualBandit.scala

+  def add_example(p_log: Double, reward: Double, p_pred: Double, count: Int = 1): Unit = {
+    total_events += count
+    if (p_pred > 0) {
+      val p_over_p = p_pred / p_log


@marco-rossi29 can you review?
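
The `add_example` fragment above computes the importance weight `p_pred / p_log`, which is the core of an inverse propensity scoring (IPS) estimate. The diff excerpt stops before the accumulation step, so the following is only a minimal sketch of how such an accumulator could be completed, assuming the remainder sums importance-weighted rewards. The class `IpsMetricsSketch`, the fields `weighted_reward` and `weight_sum`, and the methods `ips` and `snips` are hypothetical names for illustration, not the PR's API.

```python
class IpsMetricsSketch:
    """Hypothetical off-policy metrics accumulator (illustration only)."""

    def __init__(self):
        self.total_events = 0
        self.weighted_reward = 0.0  # assumed: sum of (p_pred / p_log) * reward
        self.weight_sum = 0.0       # assumed: sum of (p_pred / p_log)

    def add_example(self, p_log, reward, p_pred, count=1):
        # Mirrors the visible part of the Scala method: always count the event,
        # but only accumulate when the evaluated policy could pick the action.
        self.total_events += count
        if p_pred > 0:
            p_over_p = p_pred / p_log
            self.weighted_reward += p_over_p * reward * count
            self.weight_sum += p_over_p * count

    def ips(self):
        # IPS: importance-weighted reward averaged over all logged events.
        if self.total_events == 0:
            return 0.0
        return self.weighted_reward / self.total_events

    def snips(self):
        # Self-normalized IPS: divide by the sum of weights instead.
        if self.weight_sum == 0:
            return 0.0
        return self.weighted_reward / self.weight_sum
```

For example, adding one event with `p_log=0.5, reward=1.0, p_pred=1.0` and one with `p_log=0.25, reward=0.0, p_pred=0.5` gives both events a weight of 2, so `ips()` returns 1.0 and `snips()` returns 0.5.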

src/main/scala/com/microsoft/ml/spark/vw/VowpalWabbitContextualBandit.scala

src/main/scala/com/microsoft/ml/spark/vw/VowpalWabbitUtil.scala

src/test/python/mmlsparktest/vw/test_vw_cb.py

src/test/scala/com/microsoft/ml/spark/vw/VerifyVowpalWabbitFeaturizer.scala

jackgerrits · 2020-07-16T16:22:20Z

/azp run

azure-pipelines · 2020-08-26T19:10:56Z

Azure Pipelines successfully started running 1 pipeline(s).

jackgerrits · 2020-08-26T21:07:25Z

/azp run

azure-pipelines · 2020-08-26T21:07:40Z

Azure Pipelines successfully started running 1 pipeline(s).

jackgerrits · 2020-08-27T16:26:37Z

/azp run

azure-pipelines · 2020-08-27T16:26:50Z

Azure Pipelines successfully started running 1 pipeline(s).

jackgerrits · 2020-08-27T16:41:40Z

/azp run

azure-pipelines · 2020-08-27T16:41:58Z

Azure Pipelines successfully started running 1 pipeline(s).

jackgerrits · 2020-08-27T17:22:37Z

@eisber, considering this only really supports cb_explore_adf, should I update the class naming to reflect that? Does it seem like an issue that it is scoped to that?

jackgerrits · 2020-08-27T18:26:28Z

/azp run

azure-pipelines · 2020-08-27T18:26:43Z

Azure Pipelines successfully started running 1 pipeline(s).

mhamilton723

Great work! A few minor nits on the tests and I think it's ready to roll!

mhamilton723 · 2020-08-28T00:34:47Z

src/test/python/mmlsparktest/vw/test_vw_cb.py

+class VowpalWabbitSpec(unittest.TestCase):
+    def get_data(self):
+        # create sample data
+        schema = StructType([


You can just pass the column names if you are okay with the standard schema inference.

jackgerrits · 2020-08-31

By default, schema inference seems to make whole numbers Longs, whereas Integers are expected. I could extend the internal conversions to convert Long to Int, but then a number that is too large would cause a conversion exception rather than a schema validation failure. It seems better to limit the schema.

mhamilton723 · 2020-09-03

No worries!
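
The trade-off discussed in this thread can be sketched in plain Python. This is an illustration under stated assumptions, not code from the PR: `validate_schema` and `to_int32_checked` are hypothetical helpers, and the column name `action` is made up. The first function rejects a mismatched declared type before any row is read; the second shows the alternative, where a Long is narrowed per value and an out-of-range number only fails at conversion time.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def validate_schema(column_types, expected):
    """Fail up front if a column's declared type differs from the expected one."""
    for name, wanted in expected.items():
        actual = column_types.get(name)
        if actual != wanted:
            raise TypeError(
                f"column {name!r}: expected {wanted}, got {actual}"
            )


def to_int32_checked(value):
    """Narrow a 64-bit style integer to 32 bits, failing only when a value overflows."""
    if not INT32_MIN <= value <= INT32_MAX:
        raise OverflowError(f"{value} does not fit in a 32-bit integer")
    return value
```

With the strict schema, a DataFrame whose `action` column was inferred as `long` is rejected immediately. With the lenient conversion, the same data is accepted and a failure appears only if some row holds a value such as `2**31`.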

mhamilton723 · 2020-08-28T00:35:18Z

src/test/python/mmlsparktest/vw/test_vw_cb.py

+            StructField("probability", DoubleType())
+        ])
+
+        data = pyspark.sql.SparkSession.builder.getOrCreate().createDataFrame([


Spark should already be available to you because of the imports.

mhamilton723 · 2020-08-28T00:35:30Z

src/test/python/mmlsparktest/vw/test_vw_cb.py

+            StructField("probability", DoubleType())
+        ])
+
+        data = pyspark.sql.SparkSession.builder.getOrCreate().createDataFrame([


likewise here

mhamilton723 · 2020-08-28T00:35:36Z

src/test/python/mmlsparktest/vw/test_vw_cb.py

+
+    def get_data_two_shared(self):
+        # create sample data
+        schema = StructType([


mhamilton723 · 2020-08-28T00:37:37Z

src/test/scala/com/microsoft/ml/spark/vw/VerifyVowpalWabbitContextualBanditFuzzing.scala

+  }
+}
+
+class VerifyVowpalWabbitContextualBanditFuzzing extends EstimatorFuzzing[VowpalWabbitContextualBandit] {


This doesn't need to be a separate class from your contextual bandit tests.

mhamilton723 · 2020-09-03T18:54:57Z

/azp run

azure-pipelines · 2020-09-03T18:55:12Z

Azure Pipelines successfully started running 1 pipeline(s).

eisber and others added 25 commits December 3, 2019 20:52

initial JSON format extension

4c808b4

continued featurization

68ce4e1

added min_data_in_leaf parameter

73bb3fc

added CB prototype

7cc3868

Merge branch 'master' into ilmat/add-min-data-leaf-param

3b17938

added contextual bandit impl and simulator based on SparkML regressor…

e55f590

… using MTR/IWR

improvements

e60051a

Merge branch '760' into marcozo/personalizer

83875f9

fixed data generation

30cead2

added lightgbm continuous training

b125518

further improvements

86710b2

continue to add multi-line support

4ecafb2

fixed off-by-1 for actions

ed98e89

refactored VW base for CB. first non-crashing version

fa12d51

Merge branch 'master' into marcozo/personalizer_merged_with_upstream_…

fd1cf9a

…master

Implement action merger

fc9d264

Add estimator

538d90b

Cleanup branch

95b758a

Implement transform schema

7bfe7d9

Change names in ContextualBanditMetrics

7986148

Add Python components

5cdf920

Update to released JAR

ceaf910

Add Python tests

a5a1a18

Test fixes and fix required for python test

525f4b5

fix copyright header

51b2e46

jackgerrits requested review from eisber and mhamilton723 as code owners July 16, 2020 15:15

jackgerrits changed the title ~~Feat: Add support for ContextualBandit in the VW module~~ feat: Add support for ContextualBandit in the VW module Jul 16, 2020

eisber requested changes Jul 16, 2020

jackgerrits added 2 commits August 26, 2020 16:59

Remove usage of structural type

b69cb55

Use the numeric zero field

0c0016d

jackgerrits added 5 commits August 27, 2020 11:52

Remove unused code

dacc059

Style fixes

22e8974

Make model ComplexParamsWritable

00f6385

Add fuzz tests

9886180

Use bundled dataset

d7fd819

jackgerrits added 2 commits August 27, 2020 12:41

Add exemption for mixin test

fe408c8

style fixes

18cc39a

Remove readable/writable from base trait and update leaves to complex

3e0be74

Check for incompatible options

7eab0db

jackgerrits requested review from mhamilton723 and eisber August 27, 2020 20:37

mhamilton723 requested changes Aug 28, 2020

Address feedback

76ef247

mhamilton723 approved these changes Sep 3, 2020

Merge branch 'master' into marcozo/personalizer_merged_with_upstream_…

43e86c0

…master

mhamilton723 merged commit e9d8802 into microsoft:master Sep 3, 2020
