Potential problem in EpsilonGreedy.py? #336

Closed
sasvaritoni opened this issue Dec 7, 2018 · 5 comments

@sasvaritoni
Contributor

I created a MAB using the Kubeflow template seldon-mab-v1alpha2.
The router seems to be stuck at best branch 0, no matter what feedback I send in. The counters never change; they are always printed as 0:

10.233.75.41 - - [07/Dec/2018 13:51:22] "POST /send-feedback HTTP/1.1" 200 -
Routing
Current best branch: 0
Selected branch: 1

10.233.75.41 - - [07/Dec/2018 13:51:23] "POST /route HTTP/1.1" 200 -
Training
Prev success # [0, 0]
Prev tries # [0, 0]
Prev best branch: 0
New success # [0, 0]
New tries # [0, 0]
New best branch: 0

10.233.75.41 - - [07/Dec/2018 13:51:23] "POST /send-feedback HTTP/1.1" 200 -
Routing
Current best branch: 0
Selected branch: 1

10.233.75.41 - - [07/Dec/2018 13:51:23] "POST /route HTTP/1.1" 200 -
Training
Prev success # [0, 0]
Prev tries # [0, 0]
Prev best branch: 0
New success # [0, 0]
New tries # [0, 0]
New best branch: 0

This is the feedback sample I am sending in:


{'response': 
 {
	"meta": {
		"puid": "6dho003ic6p177pn2s3gv898vt",
		"tags": {
		},
		"routing": {
			"eg-router": 1
			
		}
	},
	"data": {
		"names": [
			"t:0",
			"t:1",
			"t:2"
		],
		"ndarray": [
			[
				"n02504013",
				"Indian_elephant",
				"0.8157868"
			],
			[
				"n01871265",
				"tusker",
				"0.11059379"
			],
			[
				"n02504458",
				"African_elephant",
				"0.071501344"
			]
		]
	}
},
'reward': '0.9'} 

Any hints? Has anyone tested the MAB router before?

@ukclivecox
Contributor

Try

  • Sending your request as well. This is where the epsilonGreedy router currently gets the number of predictions from.
  • Sending a binary reward (1 or 0)
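
The two hints above can be sketched as a feedback message that carries a placeholder request alongside the response and a binary reward. This is a minimal sketch, not the router's own code: the helper name `build_feedback` and the shape of the dummy request are assumptions made for illustration, and the response is trimmed from the sample earlier in the thread.

```python
import json


def build_feedback(response, reward):
    """Assemble a single-sample feedback message (hypothetical helper)."""
    if reward not in (0, 1):
        raise ValueError("this sketch only accepts a binary reward (0 or 1)")
    # Assumption: a one-row dummy request is enough for the router to count
    # one prediction, so the original image does not have to be resent.
    dummy_request = {"data": {"ndarray": [[0]]}}
    return {"request": dummy_request, "response": response, "reward": reward}


# Response trimmed from the sample in this thread; the routing entry tells
# the router which branch served the prediction.
response = {
    "meta": {"routing": {"eg-router": 1}},
    "data": {
        "names": ["t:0", "t:1", "t:2"],
        "ndarray": [["n02504013", "Indian_elephant", "0.8157868"]],
    },
}

feedback = build_feedback(response, reward=1)
print(json.dumps(feedback, indent=2))
```

The printed JSON is the body that would be posted to the send-feedback endpoint; how it is posted depends on the deployment and is not shown here.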

@sasvaritoni
Contributor Author

Wow, sending the request as well did the trick indeed!
Actually I sent in a dummy request, because I did not want to send in the whole image data again.

For the AB-test, I only sent the response and the reward, and that was enough for the feedback counters to work with Prometheus; that's why I did not consider sending the request here either.
I think in some cases the request data may no longer be available (or may be too big) to send in again with the feedback.

Anyway, it works now, thank you!

@sasvaritoni
Contributor Author

I just noticed that you were right about the 2nd hint as well; it seems that only a binary reward works.

Isn't it supposed to accept any real number between 0 and 1?
At least at the orchestrator service level, real numbers are accepted for both AB-test and MAB.
It seems that the router itself can only interpret binary values.

@jklaise
Contributor

jklaise commented Dec 10, 2018

@sasvaritoni I'm in the process of overhauling the example router components; you can find a pull request with an updated e-greedy implementation at #335.

On the second point, the reward can be any real number, and for the e-greedy example it is assumed to be a real number in [0,1]. The main assumption/limitation is that, because the send-feedback endpoint supports batch requests, the reward is interpreted as the proportion of successful (in a binary sense) samples in the batch. That is, if the batch contains only one sample, the reward can only be 0 or 1; if it contains 2 samples, it can be 0, 0.5 or 1, and so on (the logic is in the n_success_failures method in the e-greedy implementation). If, for example, your application only ever sends one sample per batch and your rewards are arbitrary floats, I would suggest modifying the e-greedy router component to your needs (i.e. get rid of the n_success_failures method and use the raw reward).
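
As a rough illustration of the batch interpretation described above, the sketch below converts a proportion-style reward into success and failure counts, then shows a raw-reward alternative. It is not the code from the repository: the signatures are simplified, `update_raw` and `best_branch` are hypothetical names, and truncating with `int()` is an assumption, chosen because it would be consistent with a reward of 0.9 on a single sample not registering as a success.

```python
def n_success_failures(n_samples, reward):
    """Interpret reward as the proportion of successes in a batch."""
    # Assumption: the count is truncated, so 0.9 on one sample gives 0 successes.
    n_success = int(reward * n_samples)
    return n_success, n_samples - n_success


def update_raw(reward_sums, tries, branch, reward):
    """Raw-reward alternative: accumulate the float reward for one sample."""
    reward_sums[branch] += reward
    tries[branch] += 1


def best_branch(reward_sums, tries):
    """Pick the branch with the highest mean reward (untried branches score 0)."""
    means = [s / t if t else 0.0 for s, t in zip(reward_sums, tries)]
    return means.index(max(means))


# Proportion interpretation: one sample with reward 0.9 counts as a failure.
single = n_success_failures(1, 0.9)
pair = n_success_failures(2, 0.5)

# Raw interpretation: the same feedback moves branch 1 ahead.
reward_sums, tries = [0.0, 0.0], [0, 0]
update_raw(reward_sums, tries, branch=1, reward=0.9)
chosen = best_branch(reward_sums, tries)
```

With the proportion interpretation, arbitrary float rewards on single-sample batches are effectively discarded, which is why the raw-reward variant may suit applications that only ever send one sample per batch.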

@sasvaritoni
Contributor Author

Thank you for the help and explanation, this was really useful!

agrski added a commit that referenced this issue Dec 2, 2022
* Add tag pattern to ignore RC builds in generated release notes

* Add changelog handling for core versions vs. RC builds, nightlies, etc.

* Remove duplicate release notes generation command

* Use GITHUB_ENV instead of exporting vars

* Use simple toggle not interpolated auto-changelog args