Fix an issue with the relative difference prior #696

robbietuk · 2020-09-24T15:58:35Z

The inclusion of the epsilon term in the denominator square

…the epsilon term in the denominator square.

robbietuk · 2020-09-24T16:04:59Z

From my notes, we have to consider the RDP with the addition of the epsilon term to prevent divide by zero.

AnderBiguri · 2020-09-24T16:11:56Z

Hum, you just changed the location of the epsilon right? if its only there to avoid division by zero, then its value must be much smaller than the expected values for the rest of the numerator and denominator (otherwise it would have a numerical impact, and we just want to avoid division by zero).

If that is the case, it does not really matter for it to be mathematically correct on its derivative. You just need to add epsilon to any division, and thats it. In theory, it will only have an impact when everything else is zero, so the error you are introducing by not being mathematically correct is neglectable (and if its not, then your epsilon is too big!)

Is that the case here? I am missing a but the context, so maybe I am talking out of place!

robbietuk · 2020-09-24T16:21:17Z

Yes, the purpose is to prevent the divide by zeros and it also exists in the computation of the function

STIR/src/recon_buildblock/RelativeDifferencePrior.cxx

Lines 316 to 319 in ef435cc

    
           current = weights[dz][dy][dx] * 0.5 * 
        
                   (pow(current_image_estimate[z][y][x]-current_image_estimate[z+dz][y+dy][x+dx],2)/ 
        
                   (current_image_estimate[z][y][x]+current_image_estimate[z+dz][y+dy][x+dx] 
        
                   + this->gamma * abs(current_image_estimate[z][y][x]-current_image_estimate[z+dz][y+dy][x+dx]) + this->epsilon ));

In general I would agree with you as generally the function and gradient are slightly decoupled. However, I am trying to be mathmatically correct as I am using line searches to find the MAP solution and therefore discrepancies between the function and its gradient can lead to problems (which is what I think I am finding). Numerical impact may infact be a problem but that just means it isnt the "true" RDP.

I could consider discarding the epsilon and instead use if x == 0 and y == 0: return 0. This would let it be both mathmatically correct and handle divide by zeros.

AnderBiguri · 2020-09-24T16:28:31Z

@robbietuk fair, fair. However, maybe you are just using an epsilon too large? As I said in the teams chat, if the value you are using as "epsilon" is large enough to have influence in your maths, either it is too large on value, or you have other serious issues on the math itself, as its too sensitive to numerical accuracy.

robbietuk · 2020-09-24T16:38:01Z

I agree, I feel that epsilon is causing a large issue right now. I want to try and replace it with the if statement.

AnderBiguri · 2020-09-24T16:45:25Z

Yes, that works too! But I guess my point is that if those microscopic epsilons are causing a problem, I don't expect your problems to go away... If the epsilon you are imputing is indeed small enough to be on the "noise" fraction of your data, and that impacts the numerical method, then any other "noise" (i.e. floating point error) that may arise just naturally on your computations will cause the same problems in the method!

robbietuk · 2020-09-24T17:16:08Z

Sorry, I dont think I phrased the above correctly. I think the epsilon I am using right now is too large for normalised data. That and the "decoupled" mathmatical function and gradient, and there is a problem.

STIR/src/recon_buildblock/RelativeDifferencePrior.cxx

Line 142 in ef435cc

this->epsilon = 0.0001;

Yes there is an computational problem of using very small epsilons and I think we are both in agreement that moving away from it is the correct thing?

AnderBiguri · 2020-09-24T17:25:21Z

No! That is not what I meant.

I think there is absolutely no issue with using very small epsilons, if their purpose is only avoiding by zero. They just need to be marginally bigger than zero, but just that. so 1e-7 should also work. In fact, you want epsilon to be as small as possible, in such a way that it can be considered noise for your purposes. Yo only want it so when our "incomplete" numerical system of using floating point arithmetic fails because it does not know how to divide by zero (and give you Inf), it doesn't, and gives you a "practical" Inf, something very large. Or, if the numerator is also zero, then for your division to give something that is essentially zero, but not, and avoid NaN.

In some sense, it has nothing to do with math, epsilon is not there to do something mathematically smart, its just there to help us overcome a limitation of using computers. This is why I was surprised when you used it in the gradient, I don't think its necessary. You just want to replace your divisions of a/b by a/(b+eps), not change your equations. Its not a regularizer, or some other mathematical construct, its just a "hack".

So, in short, before changing anything and going crazy with equations, I suggest you just do this->epsilon = 0.00000001; and try your maths again.

robbietuk · 2020-09-24T17:56:35Z

Yes, as epsilon > 0 tends to 0, the function and gradient behave like their non-epsilon counterparts (except when voxels are 0). The trouble is, what is the correct value of epsilon for the general case?

Removing epsilon, adding a check on the two voxel values being 0 (then return 0) becomes equiviant to an optimal epsion implementation, and we dont have to answer that question.

KrisThielemans · 2020-09-24T17:59:56Z

I strongly recommend making the gradient consistent "formula-wise" with the objective function. I think that should be the case for whatever value of epsilon you use.

I don't think checking with if is going to work. The division will likely explode in other cases as well.

We could check with Johan Nuyts et al. if you like

AnderBiguri · 2020-09-24T18:21:16Z

@robbietuk

The trouble is, what is the correct value of epsilon for the general case?

As close as zero without being zero. For any case. You just want a small enough perturbation to avoid division by zero.

@KrisThielemans

I strongly recommend making the gradient consistent "formula-wise" with the objective function.

While you are correct, and doing this will have no negative impact, in all cases in the past where I could do this but instead I just replaced divisions, there was no measurable difference with the simple "non-formula" way. In fact, one could argue that if there is a difference when using the formula, then your epsilon is too big in value. Conceptually you just want to introduce a value in the noise range of your computation, and by definition that should mean that it has no impact on your results. Its the same as adding doubles vs singles for all this. You are adding more numbers in the noise level of your method, thus it does not matter.

In fact, I would argue that if you do the "proper formula way" vs the "just adding it in the division", and there is a measurable impact in your algorithm, then your epsilon is wrong.

But of course, there should not be an issue with adding the proper way.

robbietuk · 2020-09-25T09:29:22Z

Okay I did some numerical test on the RDP. The trouble with using an epsilon in the denominator is that the equation and its gradient are significantly sensitive to poorly selected values. Unlike a function, such as the QP = (x - y)^2 , the RDP is sensitive to differences between voxels relatvie to the voxel values. I guess the clue is in the name.

I wrote some python to investigate this. One set of functions (RDP_e and grad_e have an epsilon = 1e-10) and the formulee are equivilant to that of my first comment (#696 (comment)).

The second set of RDP tests of this use an epsilon = 0.0 but have the addition of an if statement if x == 0.0 and y == 0.0 and epsilon == 0.0: return 0.0 for both function and its gradient computation. This is only to prevent the evaluation of 0.0/0.0, which I have assumed to equal 0.

x and y are a set of values given by the following python code [0.0] + [math.e ** i for i in range(-30, 5)] (i.e. [0.0, 9e-14, 2e-13,..., 54]) so voxel values both above and below the epsilon values are compared for this experiment. On the below images, the vertical and horizontal axis values are equivilant. Each pixel of the 2D images are the function or gradient value for the comparison between the respective x,y values.

I will compare these two sets of functions and show that the if statement version is equivilant to an optimally selected epsilon value of infintesimally small but positive value and therefore I believe we should use the if statements and do away with epsilon.

Function

The RDP and RDP_e are visually comparable (RDP shown below).

However subtracting one from the other (RDP - RDP_e) we can see the discrepancies, even if they are very small.

As the epsilon value tend to 0 (e.g. epsilon=1e-40) then this difference image becomes uniform 0's.

Gradient

The gradient is another matter entirely. The gradient of the RDP (grad) is given.

Only once is the if statement True, at x=0 and y=0. As is clear, the RDP at low voxel values is sensitive.

Now compare to grad_e, with the inclusion of epsion=1e-10 and we see that the sensitivity at lower values (<epsilon) has been reduced as the gradient tends to 0.

Lowering the epsilon value to 1e-40, << the smallest non-zero x and y, leads to virtually equivilant behaviours between grad and grad_e.

Remarks

While I agree we should set epision to be as small as possible, which should be much less than the smallest value of x, y, I think I have shown that its usage is redundant. If epsilon is too large, then it does have significant impact on the gradient computation and the selection of its value may become dangerous.

P.S. While not rediculously slow, the RDP is a large formula to recompute many times. We can possibly speed it it up with a queery using the if statement.

robbietuk · 2020-09-25T09:44:06Z

I guess then that the answer to my problem is that the current epsilon is too large for normalised data and that is why I am seeing weird behaviour in my line searches. I have have not seen this before with the RDP as I did not use normalised data, thus the voxel values were of the order of magnetude 1e3.

AnderBiguri · 2020-09-25T09:45:00Z

Very nice analysis!

robbietuk · 2020-09-25T10:21:12Z

Drafted the removal of epsilon, if we dont choose to use it. We can just revert the commit. It build for me but I want to ensure travis is happy.

robbietuk · 2020-10-01T16:44:08Z

I can't see any more issues, I am happy to merge.

robbietuk · 2020-10-02T15:25:10Z

After discussion with @KrisThielemans, we decided to go for a comprimise. We will keep the epsilon but by default set it to 0. The if statements will handle the case when epsilon=0, x=0, and y=0 by returning 0. This allows a user to set the value of epsilon.

Also corrected default gamma to the suggested value of 2.

robbietuk · 2020-10-04T16:47:02Z

As discussed via email, it is not obvious how to extend the RPD allow for negitive number and a reliable concave function. In the presence of a non-negitiviety constraint, the RDP will now work as expected, even with epsilon =0.

One idea is to incorporate an assert(x>=0) and assert(y>=0) onto every voxel evaluation. However, I don't know how well this would work in practice (it may be simplified by assert(min(volume)>=0)). Do we need to provide this hard boundary? Does STIR ever allow negitive voxel values in its reconstruction algoriothms. Should this assert really be a if statement that outputs a warning?

KrisThielemans

looks good. would be nice to add an example like \examples\samples\OSMAPOSL_QuadraticPrior.par. if you do, please do it with [ci skip] in the first commit line

Fix an issue with the relative difference prior and the inclusion of …

ef435cc

…the epsilon term in the denominator square.

Set default epsilon to 0.0. If epsilon, use the correct func. and grad.

0fb4cbe

robbietuk force-pushed the RDP_grad_fix branch from fc05869 to 0fb4cbe Compare October 2, 2020 15:10

RDP update documentation and spell checks

c2a2463

KrisThielemans approved these changes Oct 6, 2020

View reviewed changes

robbietuk and others added 2 commits October 6, 2020 13:55

Adds example OSMAPOSL implementation with RDP [ci skip]

c6391ad

update example RDP file to indicate default values [ci skip]

127ebed

KrisThielemans merged commit d45d7ed into UCL:release_4 Oct 6, 2020

KrisThielemans added this to the v4.1 milestone Oct 6, 2020

robbietuk deleted the RDP_grad_fix branch October 8, 2020 15:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix an issue with the relative difference prior #696

Fix an issue with the relative difference prior #696

robbietuk commented Sep 24, 2020

robbietuk commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020

robbietuk commented Sep 24, 2020 •

edited

Loading

AnderBiguri commented Sep 24, 2020

robbietuk commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020 •

edited

Loading

robbietuk commented Sep 24, 2020 •

edited

Loading

AnderBiguri commented Sep 24, 2020 •

edited

Loading

robbietuk commented Sep 24, 2020

KrisThielemans commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020 •

edited

Loading

robbietuk commented Sep 25, 2020 •

edited

Loading

robbietuk commented Sep 25, 2020 •

edited

Loading

AnderBiguri commented Sep 25, 2020

robbietuk commented Sep 25, 2020 •

edited

Loading

robbietuk commented Oct 1, 2020

robbietuk commented Oct 2, 2020 •

edited

Loading

robbietuk commented Oct 4, 2020

KrisThielemans left a comment

Fix an issue with the relative difference prior #696

Fix an issue with the relative difference prior #696

Conversation

robbietuk commented Sep 24, 2020

robbietuk commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020

robbietuk commented Sep 24, 2020 • edited Loading

AnderBiguri commented Sep 24, 2020

robbietuk commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020 • edited Loading

robbietuk commented Sep 24, 2020 • edited Loading

AnderBiguri commented Sep 24, 2020 • edited Loading

robbietuk commented Sep 24, 2020

KrisThielemans commented Sep 24, 2020

AnderBiguri commented Sep 24, 2020 • edited Loading

robbietuk commented Sep 25, 2020 • edited Loading

Function

Gradient

Remarks

robbietuk commented Sep 25, 2020 • edited Loading

AnderBiguri commented Sep 25, 2020

robbietuk commented Sep 25, 2020 • edited Loading

robbietuk commented Oct 1, 2020

robbietuk commented Oct 2, 2020 • edited Loading

robbietuk commented Oct 4, 2020

KrisThielemans left a comment

Choose a reason for hiding this comment

robbietuk commented Sep 24, 2020 •

edited

Loading

AnderBiguri commented Sep 24, 2020 •

edited

Loading

robbietuk commented Sep 24, 2020 •

edited

Loading

AnderBiguri commented Sep 24, 2020 •

edited

Loading

AnderBiguri commented Sep 24, 2020 •

edited

Loading

robbietuk commented Sep 25, 2020 •

edited

Loading

robbietuk commented Sep 25, 2020 •

edited

Loading

robbietuk commented Sep 25, 2020 •

edited

Loading

robbietuk commented Oct 2, 2020 •

edited

Loading