-
Notifications
You must be signed in to change notification settings - Fork 29
/
Copy pathfaq.html
131 lines (130 loc) · 6.97 KB
/
faq.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
---
layout: default
title: "Frequently Asked Questions"
comments: true
---
<div id="faq" class="fixed">
<h1>
How fast is Support Vector Machine (SVM) learning?
</h1>
<p>
Users often find the SVM learning algorithm to take too much time. This, however,
may be a reflection of the particular choice of learning parameters. For instance,
attempting to create a hard-margin SVM (with very high <a href="http://accord-net.github.io/docs/html/P_Accord_MachineLearning_VectorMachines_Learning_SequentialMinimalOptimization_Complexity.htm">
complexity parameter C
</a>) using a linear kernel in a non-linearly separable
dataset will diverge and possibly lead to an infinite loop. In those cases, consider
using the automatic parameter estimation routines in <a href="http://accord-net.github.io/docs/html/T_Accord_MachineLearning_VectorMachines_Learning_SequentialMinimalOptimization.htm">
SequentialMinimalOptimization
</a> and in the chosen Kernel function. Values
returned by those routines may lead to more sparse machines and faster training
times.
</p>
<p>
For best initial results, please set the <a href="/docs/html/P_Accord_MachineLearning_VectorMachines_Learning_SequentialMinimalOptimization_UseComplexityHeuristic.htm">
UseComplexityHeuristic
</a> property of the <a href="http://accord-net.github.io/docs/html/T_Accord_MachineLearning_VectorMachines_Learning_SequentialMinimalOptimization.htm">
SMO learning algorithm
</a> to true when learning your initial machine. This
will give you an indication of an appropriate running time for your problem. Also,
if you are using <a href="http://accord-net.github.io/docs/html/T_Accord_Statistics_Kernels_Gaussian.htm">
Gaussian kernels
</a>, try estimating the kernel parameters from the data using
the <a href="http://accord-net.github.io/docs/html/M_Accord_Statistics_Kernels_Gaussian_Estimate.htm">
Gaussian.Estimate
</a> static method of your kernel's class. The initial values
returned from those heuristics can be used as a starting point for hyperparameter
tuning.
</p>
<h1>
Why I am getting <a href="http://msdn.microsoft.com/en-us/library/system.outofmemoryexception.aspx">
OutOfMemoryExceptions
</a> when training my Vector Machines?
</h1>
<p>
The framework automatically builds a kernel function cache to help speed up computations
during SVM learning. However, there are cases that this cache may take too much
memory and lead to such exceptions. To make a balance between memory consumption
and CPU speed, set the <a href="http://accord-net.github.io/docs/html/P_Accord_MachineLearning_VectorMachines_Learning_SequentialMinimalOptimization_CacheSize.htm">
CacheSize
</a> property to a lower value. The default is to store all input vectors
in the cache; setting it to something lower (such as 1/20 the number of training
samples) might help.
</p>
<h1>
Why can't I open Excel worksheets using
<a href="http://accord-framework.net/docs/html/T_Accord_Statistics_Formats_ExcelReader.htm">
ExcelReader
</a>?
</h1>
<p>
To open Excel files, you have to install the <a href="http://www.microsoft.com/en-us/download/details.aspx?id=13255">
Microsoft Access Database Engine 2010 Redistributable
</a>, available <a href="http://www.microsoft.com/en-us/download/details.aspx?id=13255">
here
</a>. This engine, also known as ACE, is a replacement for JET, the
old ODBC driver which could be used to open Excel 2003 workbooks. In order to use
ACE in both 32-bit and 64-bit applications, you need to install both redistributables
from the Microsoft website. To install them both, you need to use the /passive command
line switch to prevent the installation from failing once it detects another component
version already installed. After downloading the executables, run:
</p>
<ul>
<li>C:\Users\You\Downloads\AccessDatabaseEngine.exe /passive</li>
<li>C:\Users\You\Downloads\AccessDatabaseEngine_x64.exe /passive</li>
</ul>
<h1>
Why am I getting "non-positive definite" or "variance is zero" exceptions?
</h1>
<p>
If you encounter any of the exceptions:
</p>
<ul>
<li>
"Variance is zero. Try specifying a regularization constant in the fitting options",
or
</li>
<li>
"Covariance matrix is not positive definite. Try specifying a regularization constant
in the fitting options"
</li>
</ul>
<p>
This might be a sign that the problem you are trying to solve is a tricky one, or
the model that is being assumed is not a very likely candidate for the samples you
have.
</p>
<p>
One of the possible reasons for those exceptions is the presence of a constant column
in your data (i.e. a column that only has the same value). This is a problem because
the standard deviation (or variance) for such variables will be zero, and any methods
making the assumption that the variance will be strictly positive (i.e. Gaussian
models) will fail. One possible way to overcome this situation is to add a regularization
constant to the variance calculations; this way the variance will be very small,
but will be different from zero.
</p>
<p>
Please note that the addition of this constant will likely reduce the accuracy of
your models. But since the models were already making wrong assumptions about your
data, this could be an indication that other models may be more indicated for your
particular problem after all.
</p>
<p>
On a final note, the way on how to add this regularization constant depends on the
model being estimated; but most likely the solution for this problem would involve
creating a <a href="/docs/html/AllMembers_T_Accord_Statistics_Distributions_Fitting_NormalOptions.htm">
NormalOptions
</a> object, setting its <a href="/docs/html/P_Accord_Statistics_Distributions_Fitting_NormalOptions_Regularization.htm">
Regularization
</a> property to some small value, and pass it along to the
model fitting algorithm. For examples involving Hidden Markov Models, please see
<a href="/docs/html/T_Accord_Statistics_Models_Markov_Learning_BaumWelchLearning_1.htm">
the examples on the BaumWelchLearning
</a> documentation page.
</p>
<p>
Update: On the next version after 2.11, a new way to work with non-positive definite
matrices within learning algorithms will be added in the framework. This new approach
may help improve the accuracy of such models.
</p>
</div>