Commit 3a24be0

Update several machine learning data structures
1 parent 010bea3 commit 3a24be0

File tree

6 files changed

+219
-32
lines changed

Linear Regression/Images/graph1.png

22.4 KB

Linear Regression/Images/graph2.png

42.5 KB

Linear Regression/Images/graph3.png

30.7 KB

Linear Regression/README.markdown

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
1+
# Linear Regression
2+
3+
Linear regression is a technique for creating a model of the relationship between two (or more) variable quantities.
4+
5+
For example, let's say we are planning to sell a car. We are not sure how much money to ask for. So we look at recent advertisements for the asking prices of other cars. There are a lot of variables we could look at - for example: make, model, engine size. To simplify our task, we collect data on just the age of the car and the price:
6+
7+
Age (in years)| Price (in £)
8+
--------------|-------------
9+
10 | 500
10+
8 | 400
11+
3 | 7,000
12+
3 | 8,500
13+
2 | 11,000
14+
1 | 10,500
15+
16+
Our car is 4 years old. How can we set a price for our car based on the data in this table?
17+
18+
Let's start by looking at the data plotted out:
19+
20+
![graph1](Images/graph1.png)
21+
22+
We could imagine a straight line drawn through the points on this graph. It's not (in this case) going to go exactly through every point, but we could place the line so that it goes as close to all the points as possible.
23+
24+
To say this in another way, we want to make the vertical distance from the line to each point as small as possible. This is most often done by minimising the sum of the squares of these distances (a *least squares* fit).
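The idea of a squared distance can be made concrete with a short sketch. The names `ages`, `prices` and `sumOfSquaredErrors` are illustrative and are not part of the article's code; the data is copied from the table above, and the second candidate line is just a guess chosen for comparison.

```swift
// Data from the table above.
let ages: [Double] = [10, 8, 3, 3, 2, 1]
let prices: [Double] = [500, 400, 7000, 8500, 11000, 10500]

/// Sum of the squared vertical distances between a candidate line and the data points.
func sumOfSquaredErrors(intercept: Double, slope: Double) -> Double {
    var total = 0.0
    for (age, price) in zip(ages, prices) {
        let error = price - (intercept + slope * age)
        total += error * error
    }
    return total
}

// A flat line at zero versus a hand-picked downward sloping line.
let flatLine = sumOfSquaredErrors(intercept: 0, slope: 0)
let guessedLine = sumOfSquaredErrors(intercept: 12000, slope: -1250)
print(flatLine, guessedLine)
```

The smaller the returned value, the closer the line sits to the points, so fitting the line amounts to finding the intercept and slope that make this value as small as possible.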
25+
26+
We can describe the straight line in terms of two variables:
27+
28+
1. The point at which it crosses the y-axis i.e. the predicted price of a brand new car. This is the *intercept*.
29+
2. The *slope* of the line - i.e. for every year of age, how much does the price change.
30+
31+
This is the equation for our line:
32+
33+
`carPrice = slope * carAge + intercept`
34+
35+
36+
How can we find the best values for the intercept and the slope? Let's look at two different ways to do this.
37+
38+
## An iterative approach
39+
One approach is to start with some arbitrary values for the intercept and the slope. We work out what small changes we make to these values to move our line closer to the data points. Then we repeat this multiple times. Eventually our line will approach the optimum position.
40+
41+
First let's set up our data structures. We will use two Swift arrays for the car age and the car price:
42+
43+
```swift
44+
let carAge: [Double] = [10, 8, 3, 3, 2, 1]
45+
let carPrice: [Double] = [500, 400, 7000, 8500, 11000, 10500]
46+
```
47+
48+
This is how we can represent our straight line:
49+
50+
```swift
51+
var intercept = 0.0
52+
var slope = 0.0
53+
func predictedCarPrice(_ carAge: Double) -> Double {
54+
return intercept + slope * carAge
55+
}
56+
57+
```
58+
Now for the code which will perform the iterations:
59+
60+
```swift
61+
let numberOfCarAdvertsWeSaw = carPrice.count
62+
let numberOfIterations = 100
63+
let alpha = 0.0001
64+
65+
for _ in 1...numberOfIterations {
66+
for i in 0..<numberOfCarAdvertsWeSaw {
67+
let difference = carPrice[i] - predictedCarPrice(carAge[i])
68+
intercept += alpha * difference
69+
slope += alpha * difference * carAge[i]
70+
}
71+
}
72+
```
73+
74+
```alpha``` is the step size: it determines how far we move towards the correct solution with each update. If this factor is too large, each update overshoots and our program will not converge on the correct solution. If it is too small, the program will need many more iterations to get there.
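To see the effect of this factor, here is a minimal sketch that wraps the same update loop in a function and compares two values of `alpha`. The helper names `fitLine` and `squaredError`, and the larger value `0.15`, are assumptions chosen for illustration.

```swift
let carAge: [Double] = [10, 8, 3, 3, 2, 1]
let carPrice: [Double] = [500, 400, 7000, 8500, 11000, 10500]

/// Runs the same per-point update loop as above and returns the fitted line.
func fitLine(alpha: Double, iterations: Int) -> (intercept: Double, slope: Double) {
    var intercept = 0.0
    var slope = 0.0
    for _ in 0..<iterations {
        for i in 0..<carPrice.count {
            let difference = carPrice[i] - (intercept + slope * carAge[i])
            intercept += alpha * difference
            slope += alpha * difference * carAge[i]
        }
    }
    return (intercept, slope)
}

/// Sum of squared differences between the line and the data.
func squaredError(_ line: (intercept: Double, slope: Double)) -> Double {
    return zip(carAge, carPrice).reduce(0.0) { sum, point in
        let error = point.1 - (line.intercept + line.slope * point.0)
        return sum + error * error
    }
}

let startingError = squaredError((intercept: 0, slope: 0))
let smallStepError = squaredError(fitLine(alpha: 0.0001, iterations: 100))
let largeStepError = squaredError(fitLine(alpha: 0.15, iterations: 100))
print(startingError, smallStepError, largeStepError)
```

With the small step the error shrinks relative to the starting line. With the large step each update overshoots, so the error grows instead of shrinking and may end up as infinity or NaN.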
75+
76+
The program loops through each data point (each car age and car price). For each data point it adjusts the intercept and the slope to bring them closer to the correct values. The equations used in the code to adjust the intercept and the slope are based on moving in the direction that most rapidly reduces the error between the line and the data. This is *gradient descent*.
77+
78+
We want to minimise the square of the distance between the line and the points. We define a function `J` which represents this distance - for simplicity we consider only one point here. This function `J` is proportional to `((slope * carAge + intercept) - carPrice) ^ 2`.
79+
80+
In order to move in the direction of maximal reduction, we take the partial derivative of this function with respect to the slope, and similarly for the intercept. We multiply these derivatives by our factor alpha and then use them to adjust the values of slope and intercept on each iteration.
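The derivatives can be written out explicitly. This is a sketch under one assumption: the constant of proportionality in `J` is taken to be 1/2, which only rescales `alpha`. Here `difference` is `carPrice - predictedCarPrice(carAge)`, as in the code above.

```latex
\begin{aligned}
J &= \tfrac{1}{2}\bigl((\text{slope}\cdot\text{carAge} + \text{intercept}) - \text{carPrice}\bigr)^2 \\
\frac{\partial J}{\partial\,\text{intercept}} &= (\text{slope}\cdot\text{carAge} + \text{intercept}) - \text{carPrice} = -\,\text{difference} \\
\frac{\partial J}{\partial\,\text{slope}} &= \bigl((\text{slope}\cdot\text{carAge} + \text{intercept}) - \text{carPrice}\bigr)\cdot\text{carAge} = -\,\text{difference}\cdot\text{carAge} \\
\text{intercept} &\leftarrow \text{intercept} - \alpha\,\frac{\partial J}{\partial\,\text{intercept}} = \text{intercept} + \alpha\cdot\text{difference} \\
\text{slope} &\leftarrow \text{slope} - \alpha\,\frac{\partial J}{\partial\,\text{slope}} = \text{slope} + \alpha\cdot\text{difference}\cdot\text{carAge}
\end{aligned}
```

The last two lines are exactly the two `+=` statements inside the inner loop.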
81+
82+
Looking at the code, it intuitively makes sense - the larger the difference between the current predicted car price and the actual car price, and the larger the value of ```alpha```, the greater the adjustments to the intercept and the slope.
83+
84+
It can take a lot of iterations to approach the ideal values. Let's look at how the intercept and slope change as we increase the number of iterations:
85+
86+
Iterations | Intercept | Slope | Predicted value of a 4 year old car
87+
:---------:|:---------:|:-----:|:------------------------:
88+
0 | 0 | 0 | 0
89+
2000 | 4112 | -113 | 3659
90+
6000 | 8564 | -764 | 5507
91+
10000 | 10517 | -1049 | 6318
92+
14000 | 11374 | -1175 | 6673
93+
18000 | 11750 | -1230 | 6829
94+
95+
Here is the same data shown as a graph. Each of the blue lines on the graph represents a row in the table above.
96+
97+
![graph2](Images/graph2.png)
98+
99+
After 18,000 iterations it looks as if the line is getting closer to what we would expect (just by looking) to be the correct line of best fit. Also, each additional 2,000 iterations has less and less effect on the final result - the values of the intercept and the slope are converging on the correct values.
100+
101+
## A closed form solution
102+
103+
There is another way we can calculate the line of best fit, without having to do multiple iterations. We can solve the equations describing the least squares minimisation and just work out the intercept and slope directly.
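As a sketch of where the direct formulas come from: write x for the car age, y for the car price, m for the slope and c for the intercept (these symbols are shorthand introduced here), and let a bar denote the mean over all data points. Setting both partial derivatives of the summed squared error to zero gives:

```latex
\begin{aligned}
J(m, c) &= \sum_i \bigl(m x_i + c - y_i\bigr)^2 \\
\frac{\partial J}{\partial c} = 0 \;&\Rightarrow\; c = \bar{y} - m\,\bar{x} \\
\frac{\partial J}{\partial m} = 0 \;&\Rightarrow\; m = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2}
\end{aligned}
```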
104+
105+
First we need some helper functions. This one calculates the average (the mean) of an array of Doubles:
106+
107+
```swift
108+
func average(_ input: [Double]) -> Double {
109+
return input.reduce(0, +) / Double(input.count)
110+
}
111+
```
112+
We are using the ```reduce``` Swift function to sum up all the elements of the array, and then divide that by the number of elements. This gives us the mean value.
113+
114+
We also need to be able to multiply each element in an array by the corresponding element in another array, to create a new array. Here is a function which will do this:
115+
116+
```swift
117+
func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
118+
return zip(a,b).map(*)
119+
}
120+
```
121+
122+
We are using ```zip``` to pair up the corresponding elements of the two arrays, and the ```map``` function to multiply each pair.
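A quick check of the two helpers on small inputs, as a sketch. The functions are repeated so the snippet stands alone, and `multiply` is written with an explicit closure, which behaves the same as passing `*` directly.

```swift
func average(_ input: [Double]) -> Double {
    // Note: an empty array gives 0 / 0, which is NaN rather than a crash.
    return input.reduce(0, +) / Double(input.count)
}

func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
    // zip stops at the end of the shorter array if the lengths differ.
    return zip(a, b).map { $0 * $1 }
}

let products = multiply([1, 2, 3], [4, 5, 6])
let mean = average([1, 2, 3, 4])
print(products, mean)   // [4.0, 10.0, 18.0] 2.5
```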
123+
124+
Finally, the function which fits the line to the data:
125+
126+
```swift
127+
func linearRegression(_ xs: [Double], _ ys: [Double]) -> (Double) -> Double {
128+
let sum1 = average(multiply(ys, xs)) - average(xs) * average(ys)
129+
let sum2 = average(multiply(xs, xs)) - pow(average(xs), 2)
130+
let slope = sum1 / sum2
131+
let intercept = average(ys) - slope * average(xs)
132+
return { x in intercept + slope * x }
133+
}
134+
```
135+
This function takes as arguments two arrays of Doubles, and returns a function which is the line of best fit. The formulas to calculate the slope and the intercept can be derived from our definition of the function `J`. Let's see how the output from this line fits our data:
136+
137+
![graph3](Images/graph3.png)
138+
139+
Using this line, we would predict a price for our 4 year old car of about £6,953.
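For reference, this is roughly how the pieces fit together to produce that prediction. It is a self-contained sketch that repeats the functions above; the name `predictPrice` is illustrative, and `Foundation` is imported because `pow` is not part of the Swift standard library.

```swift
import Foundation

func average(_ input: [Double]) -> Double {
    return input.reduce(0, +) / Double(input.count)
}

func multiply(_ a: [Double], _ b: [Double]) -> [Double] {
    return zip(a, b).map { $0 * $1 }
}

func linearRegression(_ xs: [Double], _ ys: [Double]) -> (Double) -> Double {
    let sum1 = average(multiply(ys, xs)) - average(xs) * average(ys)
    let sum2 = average(multiply(xs, xs)) - pow(average(xs), 2)
    let slope = sum1 / sum2
    let intercept = average(ys) - slope * average(xs)
    return { x in intercept + slope * x }
}

let carAge: [Double] = [10, 8, 3, 3, 2, 1]
let carPrice: [Double] = [500, 400, 7000, 8500, 11000, 10500]

// The returned closure is the fitted line; call it with an age to get a price.
let predictPrice = linearRegression(carAge, carPrice)
let priceOfOurCar = predictPrice(4)
print(priceOfOurCar)   // roughly 6952.93
```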
140+
141+
142+
## Summary
143+
We've seen two different ways to implement a simple linear regression in Swift. An obvious question is: why bother with the iterative approach at all?
144+
145+
Well, the line we've found doesn't fit the data perfectly. For one thing, the graph includes some negative values at high car ages! Possibly we would have to pay someone to tow away a very old car... but really these negative values just show that we have not modelled the real life situation very accurately. The relationship between the car age and the car price is not linear but instead is some other function. We also know that a car's price is not just related to its age but also other factors such as the make, model and engine size of the car. We would need to use additional variables to describe these other factors.
146+
147+
It turns out that in some of these more complicated models, the iterative approach is the only viable or efficient approach. This can also occur when the arrays of data are very large and may be sparsely populated with data values.
148+
149+
*Written for Swift Algorithm Club by James Harrop*

README.markdown

Lines changed: 34 additions & 32 deletions
@@ -30,15 +30,15 @@
3030

3131
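## Important links
3232

33-
[What are algorithms and data structures?](What%20are%20Algorithms.markdown) Pancakes!
33+
[What are algorithms and data structures?](What%20are%20Algorithms.markdown) - Pancakes!
3434

35-
[Why learn algorithms?](Why%20Algorithms.markdown) Still worried this isn't your cup of tea? Then read this.
35+
[Why learn algorithms?](Why%20Algorithms.markdown) - Still worried this isn't your cup of tea? Then read this.
3636

37-
[Big-O notation](Big-O%20Notation.markdown) We often hear things like, "This algorithm is **O(n)**." If you don't know what that means, read this.
37+
[Big-O notation](Big-O%20Notation.markdown) - We often hear things like, "This algorithm is **O(n)**." If you don't know what that means, read this.
3838

39-
[Algorithm design techniques](Algorithm%20Design.markdown) How do you design your own algorithms?
39+
[Algorithm design techniques](Algorithm%20Design.markdown) - How do you design your own algorithms?
4040

41-
Contributions are welcome leave feedback by opening an issue, or submit a pull request.
41+
Contributions are welcome - leave feedback by opening an issue, or submit a pull request.
4242

4343
## Where to start?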
4444

@@ -117,22 +117,24 @@
117117

118118
### Mathematics
119119

120-
- [Greatest Common Divisor (GCD)](GCD/)Special bonus: least common multiple.
121-
- [Permutations and Combinations](Combinatorics/)Remember the combinatorics you learned in high school?
122-
- [Shunting Yard Algorithm](Shunting%20Yard/)The classic algorithm for converting infix expressions to postfix.
120+
- [Greatest Common Divisor (GCD)](GCD/) - Special bonus: least common multiple.
121+
- [Permutations and Combinations](Combinatorics/) - Remember the combinatorics you learned in high school?
122+
- [Shunting Yard Algorithm](Shunting%20Yard/) - The classic algorithm for converting infix expressions to postfix.
123123
- [Karatsuba Multiplication](Karatsuba%20Multiplication/). Another take on elementary multiplication.
124124
- [Haversine Distance](HaversineDistance/). Calculating the distance between 2 points on a sphere.
125125
- [⏳Strassen's Multiplication Matrix](Strassen%20Matrix%20Multiplication/). Efficient way to handle matrix multiplication.
126126

127127
### Machine learning
128128

129-
- [k-Means Clustering](K-Means/)Unsupervised classifier that partitions data into K clusters.
129+
- [k-Means Clustering](K-Means/) - Unsupervised classifier that partitions data into K clusters.
130130
- ⏳k-Nearest Neighbors
131-
- ⏳Linear Regression
131+
- [⏳Linear Regression](Linear%20Regression/). A technique for creating a model of the relationship between two (or more) variable quantities.
132132
- ⏳Logistic Regression
133133
- ⏳Neural Networks
134134
- ⏳PageRank
135135
- [⏳Naive Bayes Classifier](Naive%20Bayes%20Classifier/)
136+
- [⏳Simulated annealing](Simulated%20annealing/). Probabilistic technique for approximating the global optimum of a large (often discrete) search space.
137+
136138

137139

138140
## Data structures
@@ -147,40 +149,40 @@
147149

148150
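### Variations on arrays
149151

150-
- [Array2D](Array2D/)A two-dimensional array with fixed dimensions. Useful for board games.
151-
- [Bit Set](Bit%20Set/)A fixed-size sequence of **n** bits.
152+
- [Array2D](Array2D/) - A two-dimensional array with fixed dimensions. Useful for board games.
153+
- [Bit Set](Bit%20Set/) - A fixed-size sequence of **n** bits.
152154
- [Fixed Size Array](Fixed%20Size%20Array/) - When you know exactly how large your data will be, an old-fashioned fixed-size array can be more efficient.
153-
- [Ordered Array](Ordered%20Array/)An array that is always sorted.
155+
- [Ordered Array](Ordered%20Array/) - An array that is always sorted.
154156
- [Rootish Array Stack](Rootish%20Array%20Stack/). A space and time efficient variation on Swift arrays.
155157

156158
### Queues
157159

158-
- [Stack](Stack/)Last-in, first-out!
159-
- [Queue](Queue/)First-in, first-out!
160+
- [Stack](Stack/) - Last-in, first-out!
161+
- [Queue](Queue/) - First-in, first-out!
160162
- [Deque](Deque/)
161-
- [Priority Queue](Priority%20Queue)A queue that always keeps the most important element at the front.
162-
- [Bounded Priority Queue](Bounded%20Priority%20Queue)A priority queue with a limit on the maximum number of elements. :construction:
163-
- [Ring Buffer](Ring%20Buffer/)Conceptually a fixed-size circular buffer, implemented as a one-dimensional sequence whose end wraps around to its start.
163+
- [Priority Queue](Priority%20Queue) - A queue that always keeps the most important element at the front.
164+
- [Bounded Priority Queue](Bounded%20Priority%20Queue) - A priority queue with a limit on the maximum number of elements. :construction:
165+
- [Ring Buffer](Ring%20Buffer/) - Conceptually a fixed-size circular buffer, implemented as a one-dimensional sequence whose end wraps around to its start.
164166

165167
### Lists
166168

167-
- [Linked List](Linked%20List/)A sequence of data items connected through links. Covers both singly and doubly linked lists.
168-
- [Skip List (跳跃列表)](Skip-List/) - A skip list is a probabilistic data structure with the same logarithmic time bounds and efficiency as AVL or red-black trees, and offers a clever compromise that efficiently supports search and update operations.
169+
- [Linked List](Linked%20List/) - A sequence of data items connected through links. Covers both singly and doubly linked lists.
170+
- [Skip List (跳表)](Skip-List/) - A skip list is a probabilistic data structure with the same logarithmic time bounds and efficiency as AVL or red-black trees, and offers a clever compromise that efficiently supports search and update operations.
169171

170172
### Trees
171173

172-
- [Tree](Tree/)A general-purpose tree structure.
173-
- [Binary Tree](Binary%20Tree/)-A tree in which each node has at most two child nodes
174-
- [Binary Search Tree (BST)](Binary%20Search%20Tree/)A binary tree that orders its nodes in a way that allows for fast queries.
175-
- [AVL Tree](AVL%20Tree/)A binary search tree that balances itself using rotations. :construction:
174+
- [Tree](Tree/) - A general-purpose tree structure.
175+
- [Binary Tree](Binary%20Tree/) - A tree in which each node has at most two children
176+
- [Binary Search Tree (BST)](Binary%20Search%20Tree/) - A binary tree that orders its nodes in a way that allows for fast queries.
177+
- [AVL Tree](AVL%20Tree/) - A binary search tree that balances itself using rotations. :construction:
176178
- [Red-Black Tree](Red-Black%20Tree/). A self balancing binary search tree.
177179
- [Splay Tree](Splay%20Tree/). A self balancing binary search tree that enables fast retrieval of recently updated elements.
178180
- [Threaded Binary Tree](Threaded%20Binary%20Tree/). A binary tree that maintains a few extra variables for cheap and fast in-order traversals.
179-
- [Segment Tree](Segment%20Tree/)Can quickly compute a function over an interval.
181+
- [Segment Tree](Segment%20Tree/) - Can quickly compute a function over an interval.
180182
- [Lazy Propagation](https://github.com/raywenderlich/swift-algorithm-club/tree/master/Segment%20Tree/LazyPropagation)
181183
- ⏳k-d Tree
182184
- [Sparse Table (ST)](Sparse%20Table/). Another take on quickly computing a function over a portion of an array, but this time we'll make it even quicker!
183-
- [Heap](Heap/)A binary tree stored in a one-dimensional array, so it doesn't need pointers. Makes a great priority queue.
185+
- [Heap](Heap/) - A binary tree stored in a one-dimensional array, so it doesn't need pointers. Makes a great priority queue.
184186
- ⏳Fibonacci Heap
185187
- [Trie](Trie/). A special type of tree used to store associative data structures.
186188
- [B-Tree](B-Tree/). A self-balancing search tree, in which nodes can have more than two children.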
@@ -189,25 +191,25 @@
189191

190192
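### Hashing
191193

192-
- [Hash Table](Hash%20Table/)Lets you store and look up data by a key. Dictionaries are usually implemented on top of hash tables.
194+
- [Hash Table](Hash%20Table/) - Lets you store and look up data by a key. Dictionaries are usually implemented on top of hash tables.
193195
- ⏳Hash Functions
194196

195197
### Sets
196198

197-
- [Bloom Filter](Bloom%20Filter/)A constant-memory data structure that probabilistically tests whether an element is in a set.
198-
- [Hash Set](Hash%20Set/)A set implemented using a hash table.
199+
- [Bloom Filter](Bloom%20Filter/) - A constant-memory data structure that probabilistically tests whether an element is in a set.
200+
- [Hash Set](Hash%20Set/) - A set implemented using a hash table.
199201
- [Multiset](Multiset/). A set where the number of times an element is added matters. (Also known as a bag.)
200-
- [Ordered Set](Ordered%20Set/)A set where the order of the elements matters.
202+
- [Ordered Set](Ordered%20Set/) - A set where the order of the elements matters.
201203

202204
### Graphs
203205

204206
- [Graph](Graph/)
205207
- [Breadth-First Search (BFS)](Breadth-First%20Search/)
206208
- [Depth-First Search (DFS)](Depth-First%20Search/)
207-
- [Shortest Path](Shortest%20Path%20%28Unweighted%29/)Operates on an unweighted tree.
209+
- [Shortest Path](Shortest%20Path%20%28Unweighted%29/) - Operates on an unweighted tree.
208210
- [Single-Source Shortest Paths](Single-Source%20Shortest%20Paths%20(Weighted)/)
209211

210-
- [Minimum Spanning Tree (Unweighted)](Minimum%20Spanning%20Tree%20%28Unweighted%29/)Operates on an unweighted tree.
212+
- [Minimum Spanning Tree (Unweighted)](Minimum%20Spanning%20Tree%20%28Unweighted%29/) - Operates on an unweighted tree.
211213
- [Minimum Spanning Tree (Weighted)](Minimum%20Spanning%20Tree/)
212214

213215
- [All-Pairs Shortest Paths](All-Pairs%20Shortest%20Paths/)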

Simulated annealing/README.md

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
# Simulated annealing
2+
3+
Simulated annealing is a nature-inspired global optimization technique: a metaheuristic for approximating the global optimum of a large (often discrete) search space. The name comes from the process of annealing in metallurgy, where a material is heated and then cooled under controlled conditions in order to improve its strength and durability. The objective is to find a minimum-cost solution in the search space by exploiting properties of a thermodynamic system.
4+
Unlike hill climbing techniques, which usually get stuck in a local maximum (downward moves are not allowed), simulated annealing can escape local maxima. Its interesting property is that the probability of allowing a downward move is high at high temperatures and is gradually reduced as the system cools down. In other words, a high temperature relaxes the acceptance criterion and makes the acceptance function behave chaotically (e.g. in the initial, high-temperature stages), which makes it possible to escape from local maxima, while cooler temperatures narrow the criterion and focus the search on improvements.
5+
6+
Pseudocode
7+
8+
	Input: initial, temperature, coolingRate, acceptance
9+
	Output: Sbest
10+
	Scurrent <- CreateInitialSolution(initial)
11+
	Sbest <- Scurrent
12+
	while temperature is not minimum:
13+
		Snew <- FindNewSolution(Scurrent)
14+
		if acceptance(Energy(Scurrent), Energy(Snew), temperature) > Rand():
15+
			Scurrent = Snew
16+
		if Energy(Scurrent) < Energy(Sbest):
17+
			Sbest = Scurrent
18+
		temperature = temperature * (1-coolingRate)
19+
20+
A common acceptance criterion:
21+
22+
P(accept) <- exp((e-ne)/T) where
23+
e is the current energy ( current solution ),
24+
ne is new energy ( new solution ),
25+
T is current temperature.
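The pseudocode and acceptance criterion can be sketched in Swift as follows. This is a minimal illustration, not the code from `simann_example.swift`: the toy problem (minimising `(x - 3)^2` over the integers), the neighbour move, the stopping temperature of `0.001` and all function names are assumptions made for the example.

```swift
import Foundation

/// Acceptance value from the criterion above: exp((e - ne) / T).
/// A value of 1 or more means the new solution is always accepted.
func acceptance(_ energy: Double, _ newEnergy: Double, _ temperature: Double) -> Double {
    return exp((energy - newEnergy) / temperature)
}

/// Hypothetical toy problem: the energy of x is (x - 3)^2, lowest at x = 3.
func energy(_ x: Int) -> Double {
    let distance = Double(x - 3)
    return distance * distance
}

func anneal(initial: Int, temperature startTemperature: Double, coolingRate: Double) -> Int {
    var temperature = startTemperature
    var current = initial
    var best = current
    while temperature > 0.001 {
        // Neighbouring solution: one step to the left or to the right.
        let candidate = current + (Bool.random() ? 1 : -1)
        if acceptance(energy(current), energy(candidate), temperature) > Double.random(in: 0..<1) {
            current = candidate
        }
        // Remember the lowest-energy solution seen so far.
        if energy(current) < energy(best) {
            best = current
        }
        temperature *= 1 - coolingRate
    }
    return best
}

let result = anneal(initial: 40, temperature: 100, coolingRate: 0.01)
print(result, energy(result))
```

Because the run is randomised, the printed result can differ between runs, but the returned solution is never worse than the initial one since `best` is only replaced by a lower-energy solution.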
26+
27+
28+
We use this algorithm to solve an instance of the travelling salesman problem with 20 cities. The code is in `simann_example.swift`.
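For the travelling salesman problem, a natural choice is to use the length of the tour as the energy and a small change to the visiting order as the neighbouring solution. The sketch below illustrates that idea only; the `City` type and both functions are hypothetical and may differ from what `simann_example.swift` actually does.

```swift
struct City {
    let x: Double
    let y: Double
}

/// Energy of a tour: total length of the closed route visiting the cities in order.
func tourLength(_ tour: [City]) -> Double {
    var total = 0.0
    for i in 0..<tour.count {
        let a = tour[i]
        let b = tour[(i + 1) % tour.count]   // wrap around to return to the start
        let dx = a.x - b.x
        let dy = a.y - b.y
        total += (dx * dx + dy * dy).squareRoot()
    }
    return total
}

/// A neighbouring solution: the same tour with two randomly chosen cities swapped.
func neighbour(of tour: [City]) -> [City] {
    guard tour.count > 1 else { return tour }
    var next = tour
    let i = Int.random(in: 0..<tour.count)
    let j = Int.random(in: 0..<tour.count)
    next.swapAt(i, j)
    return next
}

let square = [City(x: 0, y: 0), City(x: 1, y: 0), City(x: 1, y: 1), City(x: 0, y: 1)]
print(tourLength(square))   // 4.0
print(tourLength(neighbour(of: square)))
```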
29+
30+
# See also
31+
32+
[Simulated annealing on Wikipedia](https://en.wikipedia.org/wiki/Simulated_annealing)
33+
34+
[Travelling salesman problem](https://en.wikipedia.org/wiki/Travelling_salesman_problem)
35+
36+
Written for Swift Algorithm Club by [Mike Taghavi](https://github.com/mitghi)
