Fix an issue with the relative difference prior #696
…the epsilon term in the denominator square.
Hmm, you just changed the location of the epsilon, right? If it's only there to avoid division by zero, then its value must be much smaller than the expected values of the rest of the numerator and denominator (otherwise it would have a numerical impact, and we just want to avoid division by zero). If that is the case, it does not really matter whether its derivative is mathematically correct. You just need to add epsilon to any division, and that's it. In theory, it will only have an impact when everything else is zero, so the error you introduce by not being mathematically correct is negligible (and if it's not, then your epsilon is too big!). Is that the case here? I am missing a bit of the context, so maybe I am talking out of place!
Yes, the purpose is to prevent division by zero, and it also exists in the computation of the function: STIR/src/recon_buildblock/RelativeDifferencePrior.cxx, lines 316 to 319 in ef435cc
In general I would agree with you, as the function and gradient are generally slightly decoupled. However, I am trying to be mathematically correct because I am using line searches to find the MAP solution, so discrepancies between the function and its gradient can lead to problems (which is what I think I am finding). Numerical impact may in fact be a problem, but that just means it isn't the "true" RDP. I could consider discarding the epsilon and instead use […]
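The concern about line searches can be made concrete with a finite-difference check. The sketch below is only an illustration: it assumes the commonly quoted single-pair form of the RDP with epsilon in the denominator, and `grad_decoupled` is a hypothetical "epsilon just added to the division" variant, not the code that was in STIR. All function names and the test values are assumptions.

```python
def rdp_pair(a, b, gamma=2.0, eps=1e-2):
    """RDP potential for one voxel pair, with eps in the denominator (assumed form)."""
    d = a - b
    return d * d / (a + b + gamma * abs(d) + eps)


def grad_consistent(a, b, gamma=2.0, eps=1e-2):
    """d/da of rdp_pair: eps appears in the numerator and inside the squared denominator."""
    d = a - b
    den = a + b + gamma * abs(d) + eps
    return d * (a + 3.0 * b + gamma * abs(d) + 2.0 * eps) / (den * den)


def grad_decoupled(a, b, gamma=2.0, eps=1e-2):
    """Hypothetical variant: eps-free derivative with eps added to the division only."""
    d = a - b
    den = a + b + gamma * abs(d)
    return d * (a + 3.0 * b + gamma * abs(d)) / (den * den + eps)


def central_difference(f, a, b, h=1e-7):
    """Numerical d/da of f(a, b)."""
    return (f(a + h, b) - f(a - h, b)) / (2.0 * h)


# Values of the same order as eps, as might occur with normalised data.
a, b = 0.01, 0.02
fd = central_difference(rdp_pair, a, b)
err_consistent = abs(fd - grad_consistent(a, b))
err_decoupled = abs(fd - grad_decoupled(a, b))
print("finite difference:", fd)
print("error of consistent gradient:", err_consistent)
print("error of decoupled gradient:", err_decoupled)
```

When the voxel values are comparable to epsilon, only the consistent gradient agrees with the slope that a line search actually sees along the function.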
@robbietuk fair, fair. However, maybe you are just using an epsilon that is too large? As I said in the Teams chat, if the value you are using as "epsilon" is large enough to influence your maths, either its value is too large or you have other serious issues in the maths itself, as it is too sensitive to numerical accuracy.
I agree, I feel that epsilon is causing a large issue right now. I want to try to replace it with the if statement.
Yes, that works too! But I guess my point is that if those microscopic epsilons are causing a problem, I don't expect your problems to go away... If the epsilon you are inputting is indeed small enough to be in the "noise" fraction of your data, and that impacts the numerical method, then any other "noise" (i.e. floating-point error) that arises naturally in your computations will cause the same problems in the method!
Sorry, I don't think I phrased the above correctly. I think the epsilon I am using right now is too large for normalised data. Combine that with the "decoupled" mathematical function and gradient, and there is a problem.
Yes, there is a computational problem with using very small epsilons, and I think we are both in agreement that moving away from it is the correct thing?
No! That is not what I meant. I think there is absolutely no issue with using very small epsilons if their only purpose is avoiding division by zero. They just need to be marginally bigger than zero, and that's it, so 1e-7 should also work. In fact, you want epsilon to be as small as possible, so that it can be considered noise for your purposes. You only want it so that when our "incomplete" numerical system of floating-point arithmetic fails because it does not know how to divide by zero (and gives you Inf), it doesn't, and instead gives you a "practical" Inf, something very large. Or, if the numerator is also zero, so that your division gives something that is essentially zero and avoids NaN. In some sense, it has nothing to do with the maths. So, in short, before changing anything and going crazy with equations, I suggest you just do […]
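A minimal sketch of the "practical Inf" and "essentially zero" behaviour described above. The helper name `guarded_ratio` is hypothetical; the value 1e-7 is the one mentioned in the comment. Note that plain Python raises `ZeroDivisionError` on a float division by zero, whereas C++ with IEEE 754 floats would typically return Inf or NaN.

```python
import math


def guarded_ratio(num, den, eps=1e-7):
    """Divide with a tiny eps added to the denominator (hypothetical helper)."""
    return num / (den + eps)


# Unguarded division by zero: Python raises instead of returning Inf/NaN.
try:
    1.0 / 0.0
    raised = False
except ZeroDivisionError:
    raised = True

practical_inf = guarded_ratio(1.0, 0.0)   # very large, but finite
practical_zero = guarded_ratio(0.0, 0.0)  # 0.0 rather than NaN

# For ordinary values the perturbation is far below typical data noise.
typical = guarded_ratio(3.0, 2.0)
relative_change = abs(typical - 1.5) / 1.5

print(raised, practical_inf, practical_zero, relative_change)
print(math.isfinite(practical_inf))
```

The relative change for ordinary values is roughly eps divided by the denominator, which is the sense in which a small enough epsilon sits in the noise.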
Yes, as […] Removing […]
I strongly recommend making the gradient consistent "formula-wise" with the objective function. I think that should be the case for whatever value of epsilon you use. I don't think checking with […] We could check with Johan Nuyts et al. if you like.
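For reference, a sketch of what "formula-wise" consistency would look like, assuming the commonly quoted single-pair form of the RDP with the epsilon in the denominator. The symbols $\lambda_j$, $\lambda_k$, $\gamma$ and $\epsilon$ are notation chosen here, not taken from the STIR source.

```latex
\phi(\lambda_j,\lambda_k)
  = \frac{(\lambda_j-\lambda_k)^2}
         {\lambda_j+\lambda_k+\gamma\,|\lambda_j-\lambda_k|+\epsilon},
\qquad
\frac{\partial\phi}{\partial\lambda_j}
  = \frac{(\lambda_j-\lambda_k)\,
          \bigl(\lambda_j+3\lambda_k+\gamma\,|\lambda_j-\lambda_k|+2\epsilon\bigr)}
         {\bigl(\lambda_j+\lambda_k+\gamma\,|\lambda_j-\lambda_k|+\epsilon\bigr)^2}
```

Under this assumption, differentiating the function places the epsilon inside the squared denominator and adds a $2\epsilon$ term to the numerator.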
As close to zero as possible without being zero, in any case. You just want a perturbation small enough to avoid division by zero.
While you are correct, and doing this will have no negative impact, in all past cases where I could have done this but instead just replaced the divisions, there was no measurable difference from the simple "non-formula" way. In fact, one could argue that if there is a difference when using the formula, then your epsilon is too big. Conceptually, you just want to introduce a value in the noise range of your computation, and by definition that should mean it has no impact on your results. It's the same as using doubles vs. singles for all this: you are adding more numbers at the noise level of your method, so it does not matter. In fact, I would argue that if you compare the "proper formula way" with "just adding it in the division" and there is a measurable impact on your algorithm, then your epsilon is wrong. But of course, there should be no issue with adding it the proper way.
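The claim that the two approaches only differ measurably when epsilon is too large can be checked numerically. This is a sketch under assumptions: the single-pair RDP form with gamma = 2, and a hypothetical `grad_simple` that takes the epsilon-free derivative and adds epsilon to the division. Neither function is taken from STIR.

```python
def grad_proper(a, b, eps, gamma=2.0):
    """Derivative of (a-b)^2 / (a + b + gamma*|a-b| + eps) with respect to a."""
    d = a - b
    den = a + b + gamma * abs(d) + eps
    return d * (a + 3.0 * b + gamma * abs(d) + 2.0 * eps) / (den * den)


def grad_simple(a, b, eps, gamma=2.0):
    """Hypothetical 'non-formula' way: eps only guards the division."""
    d = a - b
    den = a + b + gamma * abs(d)
    return d * (a + 3.0 * b + gamma * abs(d)) / (den * den + eps)


a, b = 1.0, 2.0
eps_values = [1e-1, 1e-4, 1e-7]
# Absolute gap between the two gradients for each eps.
gaps = [abs(grad_proper(a, b, eps) - grad_simple(a, b, eps)) for eps in eps_values]

for eps, gap in zip(eps_values, gaps):
    print(f"eps={eps:g}  gap={gap:.3e}")
```

For these values the gap shrinks roughly in proportion to epsilon, so a measurable gap is a sign that epsilon is no longer small relative to the data.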
Okay, I did some numerical tests on the RDP. The trouble with using an […] I wrote some Python to investigate this. One set of functions ([…] The second set of RDP tests uses an […]
I will compare these two sets of functions and show that the if-statement version is equivalent to an optimally selected epsilon, i.e. an infinitesimally small but positive value, and therefore I believe we should use the if statements and do away with […]
Function
The […]
Gradient
The gradient is another matter entirely. The gradient of the RDP ([…] Now compare to […] Lowering the […]
Remarks
While I agree we should set epsilon to be as small as possible, which should be much less than the smallest value of […]
P.S. While not ridiculously slow, the RDP is a large formula to recompute many times. We can possibly speed it up with a query using the if statement.
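The comparison described above can be sketched as follows. The original Python from this comment is not reproduced in the thread, so this is an independent illustration: the function names, the single-pair RDP form, and the exact condition in the `if` are all assumptions.

```python
def rdp_pair_eps(a, b, eps, gamma=2.0):
    """RDP potential for one voxel pair, guarded by eps in the denominator."""
    d = a - b
    return d * d / (a + b + gamma * abs(d) + eps)


def rdp_pair_if(a, b, gamma=2.0):
    """RDP potential guarded by an if statement instead of eps (assumed condition)."""
    d = a - b
    den = a + b + gamma * abs(d)
    if den == 0.0:
        # For a, b >= 0 this only happens when a == b == 0, where the
        # numerator is also zero, so the penalty is taken to be zero.
        return 0.0
    return d * d / den


a, b = 1.0, 2.0
eps_values = [1e-2, 1e-5, 1e-8]
# Distance between the eps version and the if version as eps shrinks.
diffs = [abs(rdp_pair_eps(a, b, eps) - rdp_pair_if(a, b)) for eps in eps_values]

for eps, diff in zip(eps_values, diffs):
    print(f"eps={eps:g}  |eps version - if version|={diff:.3e}")
print(rdp_pair_if(0.0, 0.0), rdp_pair_eps(0.0, 0.0, 1e-8))
```

In this sketch the epsilon version approaches the if-statement version as epsilon shrinks, and both return zero when both voxels are zero.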
I guess then that the answer to my problem is that the current […]
Very nice analysis!
Drafted the removal of epsilon. If we don't choose to use it, we can just revert the commit. It builds for me, but I want to ensure Travis is happy.
I can't see any more issues, I am happy to merge.
fc05869 to 0fb4cbe
After discussion with @KrisThielemans, we decided to go for a compromise. We will keep the […] Also corrected the default gamma to the suggested value of 2.
As discussed via email, it is not obvious how to extend the RDP to allow for negative numbers and a reliable concave function. In the presence of a non-negativity constraint, the RDP will now work as expected, even with […] One idea is to incorporate an […]
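A small illustration of why negative values are awkward, assuming the single-pair RDP form without epsilon and gamma = 2. The function names are hypothetical. With negative inputs the denominator can become negative, or zero even when the two voxels differ.

```python
def rdp_denominator(a, b, gamma=2.0):
    """Denominator of the assumed single-pair RDP form, without eps."""
    return a + b + gamma * abs(a - b)


def rdp_pair(a, b, gamma=2.0):
    """Single-pair RDP potential; raises ZeroDivisionError if the denominator is zero."""
    d = a - b
    return d * d / rdp_denominator(a, b, gamma)


positive_case = rdp_pair(1.0, 2.0)           # denominator is 5, penalty is positive
negative_case = rdp_pair(-2.0, -1.0)         # denominator is -1, penalty is negative
singular_den = rdp_denominator(-3.0, -1.0)   # denominator is 0 although a != b

print(positive_case, negative_case, singular_den)
```

With a non-negativity constraint the denominator is zero only when both voxels are zero, which is the case the epsilon or the if statement is meant to handle.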
Looks good. It would be nice to add an example like \examples\samples\OSMAPOSL_QuadraticPrior.par. If you do, please do it with [ci skip] in the first commit line.
The inclusion of the epsilon term in the denominator square