More robust fixes for numerical issues #229

fumoboy007 · 2024-12-27T02:07:35Z

Commit 1: Fix inconsistency that could cause the optimization algorithm to oscillate.

Fixes #225.

Background

The optimization algorithm has three main calculations:

1. Select the working set {i, j} that minimizes the decrease in the objective function.
2. Change alpha[i] and alpha[j] to minimize the decrease in the objective function while respecting constraints.
3. Update the gradient of the objective function according to the changes to alpha[i] and alpha[j].

All three calculations make use of the matrix Q, which is represented by the QMatrix class. The QMatrix class has two main methods:

- get_Q, which returns an array of values for a single column of the matrix; and
- get_QD, which returns an array of diagonal values.
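To make the two accessors concrete, here is a minimal sketch of a matrix type with the same two methods. It is not the actual libsvm declaration: `SimpleQMatrix`, `DenseQ`, and the constructor argument are hypothetical names for illustration, and the real class has further members and caching. The sketch stores columns as `Qfloat` and the diagonal as `double`, which is the arrangement this commit changes.

```cpp
#include <cstddef>
#include <vector>

typedef float Qfloat;  // the current definition, per the description above

// Hypothetical, simplified stand-in for the QMatrix interface.
class SimpleQMatrix {
public:
    virtual ~SimpleQMatrix() {}
    // Values of one column of Q, stored as Qfloat.
    virtual const Qfloat *get_Q(int column) const = 0;
    // Diagonal of Q, stored as double (the pre-fix arrangement).
    virtual const double *get_QD() const = 0;
};

// Toy dense implementation built from a full matrix of double values.
class DenseQ : public SimpleQMatrix {
public:
    explicit DenseQ(const std::vector<std::vector<double> > &k)
        : columns_(k.size(), std::vector<Qfloat>(k.size())),
          diagonal_(k.size())
    {
        for (std::size_t i = 0; i < k.size(); ++i) {
            diagonal_[i] = k[i][i];  // kept at double precision
            for (std::size_t j = 0; j < k.size(); ++j)
                columns_[i][j] = static_cast<Qfloat>(k[j][i]);  // rounded to Qfloat
        }
    }
    const Qfloat *get_Q(int column) const override { return columns_[column].data(); }
    const double *get_QD() const override { return diagonal_.data(); }

private:
    std::vector<std::vector<Qfloat> > columns_;
    std::vector<double> diagonal_;
};
```

With this layout, `get_Q(i)[i]` and `get_QD()[i]` describe the same matrix element but can hold different values whenever the element is not exactly representable as a `float`.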

Problem

Q values are of type Qfloat while QD values are of type double. Qfloat is currently defined as float, so there can be inconsistency in the diagonal values returned by get_Q and get_QD. For example, in #225, one of the diagonal values is 181.05748749793070829 as double and 180.99411909539512067 as float.

The first two calculations of the optimization algorithm access the diagonal values via get_QD. However, the third calculation accesses the diagonal values via get_Q. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by #225.
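The effect of the mismatch can be sketched for a single unconstrained two-variable step in the case y[i] == y[j]. The function below is an illustration written for this description, not code from svm.cpp: the name `gradient_gap_after_step` and its parameters are hypothetical, and box constraints are ignored. It computes the step from one pair of diagonal values and updates the gradient with another pair, then returns G[i] - G[j], which an exact step should drive to zero.

```cpp
#include <cassert>

// Returns G[i] - G[j] after one unconstrained two-variable step (y[i] == y[j]).
// step_q* are the diagonal values seen by the step calculation (as from get_QD);
// grad_q* are the diagonal values seen by the gradient update (as from get_Q).
double gradient_gap_after_step(double step_qii, double step_qjj,
                               double grad_qii, double grad_qjj,
                               double q_ij, double g_i, double g_j)
{
    // Step calculation: curvature along the direction (-1, +1).
    const double quad_coef = step_qii + step_qjj - 2 * q_ij;
    assert(quad_coef > 0);  // this sketch only handles positive curvature
    const double delta = (g_i - g_j) / quad_coef;
    const double delta_alpha_i = -delta;
    const double delta_alpha_j = delta;

    // Gradient update: G += Q_i * delta_alpha_i + Q_j * delta_alpha_j.
    g_i += grad_qii * delta_alpha_i + q_ij * delta_alpha_j;
    g_j += q_ij * delta_alpha_i + grad_qjj * delta_alpha_j;
    return g_i - g_j;
}
```

When both calculations see the same diagonal, the returned gap is zero up to rounding. Passing 181.05748749793070829 to the step calculation and 180.99411909539512067 to the gradient update (the two values quoted above, with the other inputs chosen arbitrarily) leaves a nonzero gap, so the tracked gradient no longer agrees with the step that was just taken. The sketch shows only this disagreement; it does not reproduce the oscillation reported in #225.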

Solution

We change the type of QD values from double to Qfloat. This guarantees that all calculations are using the same values for the diagonal elements, eliminating the inconsistency.

Note that this reverts the past commit 1c80a42. That commit changed the type of QD values from Qfloat to double to address a numerical issue. In a follow-up commit, we will allow Qfloat to be defined as double at runtime as a more general fix for numerical issues.

Future Changes

The Java code will be updated similarly in a separate commit.

Commit 2: Add a runtime parameter to specify the floating-point precision of kernel values.

This will make it easier for users to try double precision kernel values when they run into numerical issues.
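As a rough illustration of what a runtime precision choice means for a kernel value, consider the sketch below. The enum, its values, and the function name are all hypothetical; the actual parameter name and mechanism are defined by the commit itself and are not shown here.

```cpp
// Hypothetical names; the real parameter introduced by the commit may differ.
enum KernelPrecision { PRECISION_SINGLE, PRECISION_DOUBLE };

// Rounds a kernel value to the requested precision and returns it as double,
// so callers can compare the two settings directly.
inline double round_kernel_value(double value, KernelPrecision precision)
{
    if (precision == PRECISION_SINGLE)
        return static_cast<double>(static_cast<float>(value));  // loses low-order bits
    return value;  // double precision: unchanged
}
```

A value such as 0.5 is unaffected by the single-precision setting, while a value such as 0.1 changes slightly, which is the kind of difference a user would remove by selecting double precision.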

fumoboy007 added 2 commits December 25, 2024 16:09

bdc5e28 Add a runtime parameter to specify the floating-point precision of kernel values.

fumoboy007 mentioned this pull request Dec 27, 2024

Training gets stuck on a specific dataset #225

Closed
