README.md (+45 -43)
@@ -5,7 +5,7 @@ ucb

 **An upper confidence bounds algorithm for multi-armed bandit problems**

-This implementation is based on [<em>Bandit Algorithms for Website Optimization</em>](http://shop.oreilly.com/product/0636920027393.do) and related empirical research in ["Algorithms for the multi-armed bandit problem"](http://www.cs.mcgill.ca/~vkules/bandits.pdf). In addition, this module conforms to the [BanditLab/2.0 specification](https://github.com/kurttheviking/banditlab-spec/releases).
+This implementation is based on [<em>Bandit Algorithms for Website Optimization</em>](http://shop.oreilly.com/product/0636920027393.do) and related empirical research in ["Algorithms for the multi-armed bandit problem"](http://www.cs.mcgill.ca/~vkules/bandits.pdf). In addition, this module conforms to the [BanditLab/2.0 specification](https://github.com/kurttheviking/banditlab-spec/releases). Now written in Typescript!


 ## Get started
@@ -33,69 +33,71 @@ This implementation often encounters extended floating point numbers. Arm select
 1. Create an optimizer with `3` arms:

 ```js
-const Algorithm = require('ucb');
+const { Ucb } = require('ucb');

-const algorithm = new Algorithm({
+const ucb = new Ucb({
   arms: 3
 });
 ```

-2. Select an arm (exploits or explores, determined by the algorithm):
+2. Select an arm (exploits or explores, determined by ucb):
 The `config` object supports two optional parameters:

 - `arms` (`Number`, Integer): The number of arms over which the optimization will operate; defaults to `2`

-Alternatively, the `state` object resolved from [`Algorithm#serialize`](https://github.com/kurttheviking/ucb-js#algorithmserialize) can be passed as `config`.
+Alternatively, the `state` object resolved from [`Ucb#serialize`](https://github.com/kurttheviking/ucb-js#algorithmserialize) can be passed as `config`.

 #### Returns

-An instance of the ucb optimization algorithm.
+An instance of Ucb.

 #### Example

 ```js
-const Algorithm = require('ucb');
-const algorithm = new Algorithm();
+const { Ucb } = require('ucb');
+const ucb = new Ucb();

-assert.equal(algorithm.arms, 2);
+assert.equal(ucb.arms, 2);
 ```

 Or, with a passed `config`:

 ```js
-const Algorithm = require('ucb');
-const algorithm = new Algorithm({ arms: 4 });
+const { Ucb } = require('ucb');
+const ucb = new Ucb({ arms: 4 });

-assert.equal(algorithm.arms, 4);
+assert.equal(ucb.arms, 4);
 ```

-### `Algorithm#select()`
+### `Ucb#select()`

-Choose an arm to play, according to the optimization algorithm.
+Choose an arm to play, according to Ucb.

 #### Arguments

@@ -108,53 +110,61 @@ A `Promise` that resolves to a `Number` corresponding to the associated arm inde
 #### Example

 ```js
-const Algorithm = require('ucb');
-const algorithm = new Algorithm();
+const { Ucb } = require('ucb');
+const ucb = new Ucb();

-algorithm.select().then(arm => console.log(arm));
+ucb.select().then(arm => console.log(arm));
+// or
+const arm = ucb.selectSync();
 ```

-### `Algorithm#reward(arm, reward)`
+### `Ucb#reward(arm, reward)`

-Inform the algorithm about the payoff from a given arm.
+Inform Ucb about the payoff from a given arm.

 #### Arguments

-- `arm` (`Number`, Integer): the arm index (provided from `Algorithm#select()`)
+- `arm` (`Number`, Integer): the arm index (provided from `Ucb#select()`)
 - `reward` (`Number`): the observed reward value (which can be 0 to indicate no reward)

 #### Returns

-A `Promise` that resolves to an updated instance of the algorithm. (The original instance is mutated as well.)
+A `Promise` that resolves to an updated instance of ucb. (The original instance is mutated as well.)
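
The select/reward cycle this README documents can be illustrated with a minimal, self-contained sketch. Everything below is an assumption for illustration: `SketchUcb`, its synchronous methods, the `counts` and `values` fields, and the toy `payoffs` array are hypothetical stand-ins, not the `ucb` package's source or API. The scoring rule is the textbook UCB1 index (running mean plus `sqrt(2 * ln(totalPulls) / armPulls)`), which may differ from what the module actually computes.

```javascript
// Hypothetical stand-in for illustration only; not the `ucb` package's source.
class SketchUcb {
  constructor({ arms = 2 } = {}) {
    this.arms = arms;
    this.counts = new Array(arms).fill(0); // number of pulls per arm
    this.values = new Array(arms).fill(0); // running mean reward per arm
  }

  // Try each arm once; afterwards pick the arm with the highest
  // mean reward plus exploration bonus (textbook UCB1).
  select() {
    const untried = this.counts.indexOf(0);
    if (untried !== -1) return untried;

    const total = this.counts.reduce((sum, n) => sum + n, 0);
    let best = 0;
    let bestScore = -Infinity;
    for (let arm = 0; arm < this.arms; arm += 1) {
      const bonus = Math.sqrt((2 * Math.log(total)) / this.counts[arm]);
      const score = this.values[arm] + bonus;
      if (score > bestScore) {
        best = arm;
        bestScore = score;
      }
    }
    return best;
  }

  // Fold one observed reward into the arm's running mean.
  reward(arm, value) {
    this.counts[arm] += 1;
    this.values[arm] += (value - this.values[arm]) / this.counts[arm];
    return this;
  }
}

// Deterministic toy payoffs (assumed for the example): arm 2 pays best.
const payoffs = [0.1, 0.3, 0.9];
const sketch = new SketchUcb({ arms: 3 });
for (let round = 0; round < 200; round += 1) {
  const arm = sketch.select();
  sketch.reward(arm, payoffs[arm]);
}
console.log(sketch.counts);
```

In this sketch the exploration bonus shrinks as an arm accumulates pulls, so the lower-paying arms are still revisited occasionally while most rounds go to the best-paying arm.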