TODO
. Remove the large testing data from git. (heavy and useless)
. It has been observed that using the hyperbolic tangent requires
a very small initial learning rate (about 10 times smaller than with
the sigmoid). If the initial learning rate is kept at 1.0 (like for
the sigmoid), the weight corrections rapidly converge towards 0,
leading to stagnation.
This is probably related to the fact that the activation (kernel) function's
output range is wider with the hyperbolic tangent (2.0 wide instead of 1.0).
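The effect can be seen by comparing the slopes of the two functions; the
snippet below is only an illustrative sketch (standard library code, not
part of this project):

    import math

    def d_sigmoid(x):
        s = 1.0 / (1.0 + math.exp(-x))
        return s * (1.0 - s)            # maximum 0.25, reached at x = 0

    def d_tanh(x):
        return 1.0 - math.tanh(x) ** 2  # maximum 1.0, reached at x = 0

    print(d_sigmoid(0.0), d_tanh(0.0))  # 0.25 1.0
    # For the same learning rate the tanh network takes roughly 4x larger
    # steps, quickly driving neurons into saturation where the derivative,
    # and with it the weight correction, collapses towards 0.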
***********************************ARCHIVE******************************************
. Implement parsing of a "weight" file containing the values of the weights in the
network. This allows one to save a trained network and load it again.
The extension will be '.wt' and the grammar will be:
<expr> := <couple> <value>
<couple> := ( <index> , <index> )
<index> := Int
<value> := double
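A possible reader/writer for that grammar, assuming one "(i, j) value"
entry per line (the exact file layout is not fixed by the note above, so
treat this as a sketch only):

    import re

    ENTRY = re.compile(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*([-+0-9.eE]+)")

    def load_weights(path):
        """Map each (i, j) index couple to its weight value."""
        weights = {}
        with open(path) as f:
            for line in f:
                m = ENTRY.search(line)
                if m:
                    weights[(int(m.group(1)), int(m.group(2)))] = float(m.group(3))
        return weights

    def save_weights(path, weights):
        """Write the couples back out in the same '(i, j) value' form."""
        with open(path, "w") as f:
            for (i, j), value in sorted(weights.items()):
                f.write("(%d, %d) %.17g\n" % (i, j, value))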
. Write a general-purpose program allowing one to learn any data
from a standard file.
. The error propagation needs verification. Use the formula given in Neural Networks
by Simon Haykin.
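For reference, the delta rule to check against (the standard
back-propagation form, the same one given by Haykin; eta is the learning
rate, phi the activation, v_j the induced local field, y_i the input
coming from neuron i, e_j the output error):

    delta_j = e_j * phi'(v_j)                       for an output neuron j
    delta_j = phi'(v_j) * sum_k( delta_k * w_kj )   for a hidden neuron j
    dw_ji   = eta * delta_j * y_i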
. Find out why the learning process gets completely out of control
-> The problem is related to the fact that the learning coefficient
should adapt to how far along the network is in the learning process.
-> The program should distinguish between two main phases:
* A broader phase where the coefficient should be large enough to
speed up convergence towards the local minimum (in error).
* A narrower phase where the coefficient should be chosen carefully
in order to pin down the local minimum within a small interval
(see the sketch after this item for one simple adaptive rule).
-> A solution involves learning from a cross problem and using the
second derivative and Hessian matrices...
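One simple way to approximate those two phases (an illustrative sketch
only, not the second-derivative/Hessian approach alluded to above) is an
adaptive rule such as the classic "bold driver" heuristic:

    def adapt_learning_rate(rate, previous_error, error, grow=1.05, shrink=0.5):
        # Broad phase: while the error keeps dropping, enlarge the step a little.
        if error < previous_error:
            return rate * grow
        # Narrow phase: the error rose, so the last step overshot; back off sharply.
        return rate * shrink

    # Hypothetical usage inside a training loop (train_one_epoch and
    # epoch_error are placeholder names, not functions of this project):
    # previous = epoch_error()
    # for epoch in range(max_epochs):
    #     train_one_epoch(rate)
    #     current = epoch_error()
    #     rate = adapt_learning_rate(rate, previous, current)
    #     previous = current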