More robust fixes for numerical issues #229

Open
fumoboy007 wants to merge 2 commits into master

Commit 1: Fix inconsistency that could cause the optimization algorithm to oscillate.

Fixes #225.

Background

The optimization algorithm has three main calculations:

  1. Select the working set {i, j} that maximizes the decrease in the objective function.
  2. Change alpha[i] and alpha[j] to minimize the objective function while respecting constraints.
  3. Update the gradient of the objective function according to the changes to alpha[i] and alpha[j].

All three calculations make use of the matrix Q, which is represented by the QMatrix class. The QMatrix class has two main methods:

  • get_Q, which returns an array of values for a single column of the matrix; and
  • get_QD, which returns an array of diagonal values.
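
As a rough illustration of that interface, here is a simplified dense stand-in. The class DenseQ and its constructor are hypothetical and not part of svm.cpp; the real QMatrix is an abstract class whose implementations compute and cache columns on demand, and its methods may take additional arguments. The sketch only keeps the types described in this pull request: Qfloat for columns and double for the diagonal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef float Qfloat;  // mirrors the current definition described in this PR

// Hypothetical dense stand-in for a QMatrix implementation (illustration only).
class DenseQ {
public:
    // K is a square kernel matrix given in double precision.
    explicit DenseQ(const std::vector<std::vector<double> >& K)
        : n_(K.size()), columns_(n_ * n_), diagonal_(n_) {
        for (std::size_t r = 0; r < n_; ++r)
            assert(K[r].size() == n_);  // every row must have n_ entries
        for (std::size_t j = 0; j < n_; ++j) {
            for (std::size_t i = 0; i < n_; ++i)
                columns_[j * n_ + i] = static_cast<Qfloat>(K[i][j]);  // rounded to Qfloat
            diagonal_[j] = K[j][j];  // kept in double
        }
    }

    // Returns one column of Q as Qfloat values.
    const Qfloat* get_Q(int column) const {
        return columns_.data() + static_cast<std::size_t>(column) * n_;
    }

    // Returns the diagonal of Q as double values.
    const double* get_QD() const { return diagonal_.data(); }

private:
    std::size_t n_;
    std::vector<Qfloat> columns_;   // column-major storage
    std::vector<double> diagonal_;
};
```

With this layout, the diagonal element of row i can be read either as get_Q(i)[i] (a Qfloat) or as get_QD()[i] (a double).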

Problem

Q values are of type Qfloat while QD values are of type double. Qfloat is currently defined as float, so the diagonal values returned by get_Q and get_QD can disagree. For example, in #225, one of the diagonal values is 181.05748749793070829 as double and 180.99411909539512067 as float.

The first two calculations of the optimization algorithm access the diagonal values via get_QD. However, the third calculation accesses the diagonal values via get_Q. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by #225.
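
To see why such a mismatch matters, consider a one-dimensional simplification (an assumption made for illustration, not the actual two-variable subproblem in svm.cpp): minimize f(x) = 0.5*q*x*x + g*x starting from x = 0. If the step is computed with one value of q but the gradient is updated with another, the updated gradient does not reach zero. The function name gradient_after_step is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Gradient of f(x) = 0.5*q*x*x + g*x after one step from x = 0, where the
// step size is computed with q_step but the gradient is updated with q_update.
double gradient_after_step(double g, double q_step, double q_update) {
    assert(q_step > 0.0);             // the quadratic must be strictly convex
    const double step = -g / q_step;  // exact minimizer if the curvature were q_step
    return g + q_update * step;       // gradient update using q_update instead
}
```

Using the two diagonal values quoted above, gradient_after_step(1.0, 181.05748749793070829, 180.99411909539512067) leaves a residual on the order of 1e-4, whereas passing the same value for both arguments leaves a residual at the level of rounding error. A persistent residual like this is the kind of disagreement that can keep an iterative solver from settling.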

Solution

We change the type of QD values from double to Qfloat. This guarantees that all calculations use the same values for the diagonal elements, eliminating the inconsistency.
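
A minimal sketch of the idea, assuming the diagonal is rounded to Qfloat once when it is built. The helpers build_QD and diagonal_matches are hypothetical names for illustration and do not appear in svm.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef float Qfloat;  // mirrors the current definition described in this PR

// Hypothetical helper: store the diagonal with the same rounding as the columns.
std::vector<Qfloat> build_QD(const std::vector<double>& kernel_diagonal) {
    std::vector<Qfloat> qd(kernel_diagonal.size());
    for (std::size_t i = 0; i < kernel_diagonal.size(); ++i)
        qd[i] = static_cast<Qfloat>(kernel_diagonal[i]);
    return qd;
}

// True when entry i of column i equals the stored diagonal entry i.
bool diagonal_matches(const std::vector<Qfloat>& column_i,
                      const std::vector<Qfloat>& qd,
                      std::size_t i) {
    assert(i < column_i.size() && i < qd.size());
    return column_i[i] == qd[i];
}
```

Because both the column entry and the stored diagonal go through the same conversion to Qfloat, they compare equal, even though neither equals the original double value.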

Note that this reverts the past commit 1c80a42. That commit changed the type of QD values from Qfloat to double to address a numerical issue. In a follow-up commit, we will allow Qfloat to be defined as double at runtime as a more general fix for numerical issues.

Future Changes

The Java code will be updated similarly in a separate commit.

Commit 2: Add a runtime parameter to specify the floating-point precision of kernel values.

This will make it easier for users to try double-precision kernel values when they run into numerical issues.

Successfully merging this pull request may close these issues.

Training gets stuck on a specific dataset