
Better integration #608

Open
wants to merge 12 commits into base: master
Conversation

@davidnwobi commented Aug 29, 2024

Updated the models that require 1D integration to use the new integration system. The new models have "_high_res" appended to their names.

The updated models are:

  • Core-Shell Bicelle
  • Core-Shell Cylinder
  • Core-Shell Ellipsoid
  • Cylinder
  • Ellipsoid
  • Flexible Cylinder with Elliptical Cross-section
  • Hollow Cylinder

Details of the method can be found in this repo.

@pkienzle (Contributor) commented Sep 6, 2024

(1) Using Δq = 2π/d_max as the period spacing and selecting, say, k = 5 sample points per period, you should get a ballpark estimate for the number of integration points by looking at the arc length at radius |q|. That is, Δq = |q| k Δθ, so the step size in theta would be Δθ = Δq/(k |q|), and the number of steps would be n = (π/2)/Δθ = k |q| d_max / 4.

How does this value compare to the n estimated by Gauss-Kronrod rule? If it is good enough, then we can apply it more easily across all the shape models without a bespoke spline fit for each model type.
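To make that comparison concrete, the estimate can be computed directly. A minimal sketch (the function name and the rounding up to a whole number of steps are my own; k and d_max follow the notation above):

import math

def estimated_theta_points(q, d_max, k=5):
    # Ballpark number of theta steps over [0, pi/2] at magnitude |q| > 0,
    # using delta_q = 2*pi/d_max as the period spacing and k sample points
    # per period.  Algebraically this is n = k*|q|*d_max/4.
    delta_q = 2.0 * math.pi / d_max
    delta_theta = delta_q / (k * abs(q))   # from delta_q = |q|*k*delta_theta
    return math.ceil((math.pi / 2.0) / delta_theta)

For example, with d_max = 2000 Å and |q| = 1 Å⁻¹ this gives n = 2500.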

(2) Rather than generic Gauss-Legendre, we could look at the structure of the function and select more points where it is larger (importance sampling). The problematic shapes have high eccentricity, which will show up near θ=0° or θ=90°.
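One deterministic way to put more points where those features live (a sketch of the idea only, not a proposal for the kernels) is a change of variables whose Jacobian vanishes at the endpoints, which clusters Gauss-Legendre nodes near θ = 0 and θ = π/2:

import numpy as np

def clustered_theta_quadrature(f, n):
    # Integrate f(theta) over [0, pi/2], with nodes clustered near the
    # endpoints via theta = (pi/4)*(1 - cos(u)), u in [0, pi].
    # f should accept a numpy array of theta values.
    u, w = np.polynomial.legendre.leggauss(n)
    u = 0.5 * np.pi * (u + 1.0)            # map [-1, 1] -> [0, pi]
    w = 0.5 * np.pi * w
    theta = 0.25 * np.pi * (1.0 - np.cos(u))
    jacobian = 0.25 * np.pi * np.sin(u)    # dtheta/du, small near both ends,
                                           # so theta nodes pack densely there
    return np.sum(w * jacobian * f(theta))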

(3) There are analytic approximations for large disks and long rods (#109). Can these be used for large q? Can they be extended to other models?

(4) The current code cannot run in OpenCL since the arrays are too large to put in constant memory. Not sure how to allocate them in main memory and make them accessible to the kernels.

(5) We could unroll the loops, computing the different (q, θ) values in parallel, then summing afterward. This would speed up the kernels a lot on high end graphics cards.

(6) Ideally we would update all models to use dynamic integration rather than having normal and high-resolution versions. It would be nice to have a mechanism to set the precision.

(7) What is the target precision on your models? Do you really need 32000 angles for them? You may need to be more careful about summing the terms to avoid loss in precision when adding many small numbers to a large total.
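On that last point, one standard guard against the loss of precision (a sketch only; nothing in the current kernels does this) is compensated (Kahan) summation of the quadrature terms:

def kahan_sum(terms):
    # Compensated (Kahan) summation: carry a running correction so that
    # small terms are not swallowed when added to a large running total.
    total = 0.0
    compensation = 0.0
    for term in terms:
        y = term - compensation
        t = total + y                      # low-order bits of y may be lost here
        compensation = (t - total) - y     # recover the lost bits
        total = t
    return total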

@pkienzle (Contributor) commented Sep 6, 2024

Note that you can adjust the number of points in the loop for an existing model. You can see this with

python -m sasmodels.compare -ngauss=512 ellipsoid

This is roughly equivalent to the following [untested]:

from sasmodels.core import load_model_info
from sasmodels.generate import set_integration_size

# Load the model definition and raise the number of points used in its
# integration loop before the kernel is built.
model_info = load_model_info('ellipsoid')
set_integration_size(model_info, 4096)

SasView could do this now, giving the number of theta points as a control parameter and sending that to sasmodels before evaluating. Not as good as having the model set the number of theta points as done in this PR, but it immediately applies to all models.

@davidnwobi (Author) commented Sep 6, 2024

Thank you for your feedback; there are several aspects here that I should have considered earlier.

  1. Comparison of integration points: I'll take a closer look and compare your method to the Gauss-Kronrod rule. While I'm no longer working with STFC, I do have some free time, so I'll run the comparisons and see how they stack up.

  2. Importance sampling: I completely agree. For the cylinder model, peaks do occur near the ends of the interval, so targeting those areas would definitely improve efficiency. While it might involve more work, the payoff could be significant.

  3. Analytical approximations: They can be used for large values of q, but it also depends on the values of the other parameters. For example, for the cylinder model there are good approximations when either $q R$ or $\frac{q L}{2}$ is small. For other models, we might be able to find regions where certain parameters are small and see if the remaining part of the function can be integrated analytically.

  4. OpenCL compatibility: I've had similar issues with OpenCL. I haven't been able to get either the updated or default models running with OpenCL, whether in sasmodels or SasView, which might be a bug. Interestingly, CUDA works fine, and AMD's parallel processing works but runs out of memory for the updated models (integrated graphics, though). OpenCL definitely needs further investigation.

  5. Loop unrolling: I could check whether it's at least being done with theta at the moment.

  6. Having precision control would be ideal, especially if the tolerance can vary based on the model or region. This looks doable, but it would require some more work.

  7. What is the target precision on your models?

Target tolerance is a relative tolerance of $1 \times 10^{-3}$.
As for needing 32,000 angles: no, not for the entire parameter space. Gauss quadrature struggles with highly oscillatory functions; its convergence becomes very slow as the integrand becomes more oscillatory. Quick example for a decaying oscillatory function (a sketch reproducing this kind of convergence test follows the tables below):
$e^{-\sqrt{k} x} \sin(k x)$

With $k = 4000$:

  • $n = 512$; Rel error = $3.547640 \times 10^{-1}$
  • $n = 600$; Rel error = $4.000811 \times 10^{-2}$
  • $n = 700$; Rel error = $4.470038 \times 10^{-4}$
  • $n = 800$; Rel error = $1.042314 \times 10^{-4}$
  • $n = 900$; Rel error = $5.589838 \times 10^{-7}$
  • $n = 1024$; Rel error = $8.715261 \times 10^{-12}$

With $k = 400000$:

  • $n = 16384$; Rel error = $2.155721 \times 10^{0}$
  • $n = 20768$; Rel error = $1.492304 \times 10^{-1}$
  • $n = 22768$; Rel error = $5.929450 \times 10^{-2}$
  • $n = 24768$; Rel error = $7.977614 \times 10^{-3}$
  • $n = 26768$; Rel error = $2.309563 \times 10^{-4}$
  • $n = 28768$; Rel error = $4.160178 \times 10^{-4}$
  • $n = 30768$; Rel error = $3.771680 \times 10^{-5}$
  • $n = 32768$; Rel error = $5.919487 \times 10^{-6}$
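For reference, this kind of convergence test can be reproduced with a few lines of numpy. The integration interval and normalization used for the tables above are not stated, so the numbers will not match exactly; the interval [0, 1], the exact value from the antiderivative of $e^{-a x} \sin(k x)$, and the function name are my assumptions:

import numpy as np

def gauss_legendre_rel_error(k, n, upper=1.0):
    # Relative error of n-point Gauss-Legendre for the integral of
    # exp(-sqrt(k)*x)*sin(k*x) over [0, upper] (interval is an assumption).
    a = np.sqrt(k)
    # Exact value from the antiderivative -exp(-a*x)*(a*sin(k*x)+k*cos(k*x))/(a**2+k**2).
    exact = (k - np.exp(-a * upper) * (a * np.sin(k * upper) + k * np.cos(k * upper))) / (a**2 + k**2)
    x, w = np.polynomial.legendre.leggauss(n)
    x = 0.5 * upper * (x + 1.0)            # map nodes from [-1, 1] to [0, upper]
    w = 0.5 * upper * w
    approx = np.sum(w * np.exp(-a * x) * np.sin(k * x))
    return abs(approx - exact) / abs(exact)

for n in (512, 700, 1024):
    print(n, gauss_legendre_rel_error(4000.0, n))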

Due to memory/space limitations, the number of points is selected from a discrete set (powers of 2, with exponents from 1 to 15), so that only those rules need to be stored and loaded into memory. As a result, if it selects 32,768 points, that only means 16,384 wouldn't have been enough; it doesn't check any of the possibilities in between. It could often get away with fewer points, but it can't determine that. A bit crude, but it helps. Do you have any suggestions on a better way to handle this?
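One possible refinement, if the stored rules stay as powers of two, would be to estimate the required n up front (for example from the k |q| d_max / 4 argument in point 1) and round it up to the next stored size, instead of discovering after the fact that the previous size was insufficient. A hypothetical sketch (the function name and estimate source are my own):

import math

def select_stored_rule_size(n_estimate, min_exp=1, max_exp=15):
    # Round an estimated point count up to the next stored rule size
    # (powers of two, exponents min_exp through max_exp).
    exponent = math.ceil(math.log2(max(n_estimate, 2)))
    return 2 ** min(max(exponent, min_exp), max_exp)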

I'll look into some of these issues:

  • Comparing with the Gauss-Kronrod rule
  • Finding a dynamic way of setting the tolerance
