quad-double support for Camera2D navigation #147

Closed
claudeha opened this issue Jun 4, 2020 · 14 comments

Comments

@claudeha
Contributor

claudeha commented Jun 4, 2020

Is your feature request related to a problem? Please describe.

The double-precision zoom limit for 2D frags is frustrating. Compensated arithmetic on unevaluated sums (double-double, triple-double, quad-double) can increase precision, though in GLSL some tricks are needed to stop unsafe math optimisations from breaking everything; it can be made to work. However, this is not much use if one can't navigate with the 2D camera controls (mouse zooming, panning, etc).
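For readers unfamiliar with unevaluated sums, here is a minimal CPU-side sketch of double-double addition in C++. The names `dd`, `two_sum` and `dd_add` are made up for this illustration and are not taken from FragM or libqd. It assumes strict IEEE double arithmetic: with `-ffast-math` or similar, the compiler may simplify the error term to zero, which is the kind of unsafe optimisation mentioned above.

```cpp
// A double-double value: the unevaluated sum hi + lo.
struct dd { double hi, lo; };

// Knuth's TwoSum: s = fl(a + b) and the rounding error e, so that
// s + e == a + b exactly. Needs strict IEEE arithmetic (no fast-math).
dd two_sum(double a, double b) {
    double s = a + b;
    double bb = s - a;
    double e = (a - (s - bb)) + (b - bb);
    return {s, e};
}

// Simple double-double addition: add the high parts with their exact
// error, fold in the low parts, then bring the result back to hi/lo form.
dd dd_add(dd a, dd b) {
    dd s = two_sum(a.hi, b.hi);
    double lo = s.lo + (a.lo + b.lo);
    return two_sum(s.hi, lo);
}
```

Calling `dd_add({-1.25841004516429256, 0.0}, {1e-20, 0.0})` keeps the tiny pan in the `lo` component, whereas the plain double sum `-1.25841004516429256 + 1e-20` rounds back to the original value and the pan is lost.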

Describe the solution you'd like

Support for quad-double precision for the Camera2D Center variable, gracefully downgrading to triple-double, double-double, double, or float, depending on which uniforms are defined. Something like uniform dvec2 Center[4]; where the array size can be from 1 to 4, or omitted for size 1, or vec2 for float (without array).

No intention of supporting double-float, triple-float, quad-float as double-float is less precise and the same speed or slower than double on most GPUs.

Widget in user interface would be multiple sliders (eg 8 double-sliders for quad-double 2D, arranged in order like x0 x1 x2 x3 y0 y1 y2 y3). Saved in presets like that too (unevaluated sum, not necessarily canonical form). Mouse navigation would ensure canonical form, but shader should be expected to canonicalize input.
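As an illustration of what canonicalizing slider input could look like, here is a simplified C++ sketch. It is not libqd's `renorm` routine: it just runs repeated bottom-up error-free passes, which preserve the exact sum and push magnitude towards the first component. The names `qd`, `two_sum` and `renormalize` are hypothetical, and the sketch assumes strict IEEE arithmetic (no fast-math).

```cpp
#include <array>

// Unevaluated sum x[0] + x[1] + x[2] + x[3], e.g. sliders x0 x1 x2 x3.
using qd = std::array<double, 4>;

// Error-free sum: afterwards s + e == a + b exactly.
void two_sum(double a, double b, double &s, double &e) {
    s = a + b;
    double bb = s - a;
    e = (a - (s - bb)) + (b - bb);
}

// Simplified renormalization (an assumption of this sketch, not libqd's
// algorithm): each pass sweeps from the last component upwards, leaving
// the rounded partial sum in the higher slot and the error in the lower.
qd renormalize(qd x) {
    for (int first = 0; first < 3; ++first) {
        for (int i = 3; i > first; --i) {
            two_sum(x[i - 1], x[i], x[i - 1], x[i]);
        }
    }
    return x;
}
```

For example, a preset storing `{1.0, 0.5, 0.25, 0.125}` comes out as `{1.875, 0, 0, 0}`, while an input that is already well separated, such as `{1.0, 1e-20, 1e-40, 1e-60}`, passes through unchanged.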

Describe alternatives you've considered

Using only multiple sliders for choosing location is far too awkward in practice.

Additional context

type / approximate zoom limit
float / 1e6
double / 1e15
double-double / 1e30
triple-double / 1e45
quad-double / 1e60
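A rough way to see where these figures come from is to count decimal digits. The sketch below assumes 24 significand bits for float and 53 per double component; the helper name `approx_decimal_digits` is invented for this example.

```cpp
#include <cmath>

// Rough number of decimal digits carried by `components` values with
// `mantissa_bits` significand bits each (an estimate, not a guarantee).
int approx_decimal_digits(int mantissa_bits, int components) {
    return static_cast<int>(
        std::floor(mantissa_bits * components * std::log10(2.0)));
}
```

This gives 7 digits for float and 15, 31, 47 and 63 digits for one to four doubles. The zoom limits in the table sit at or slightly below these counts, presumably because some digits are needed to tell neighbouring pixels apart.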

see also: #146

@3Dickulus
Owner

uniform dvec2 Center[4];

could that be better served as uniform dvec4 CenterX and dvec4 CenterY ?

it's a brilliant idea, a very ambitious undertaking, also a very specialized application

Support for quad-double precision for the Camera2D Center variable, gracefully downgrading to triple-double, double-double, double, or float, depending on which uniforms are defined.

somehow I think "gracefully" will be difficult at best, impossible at worst.

is this being done with bits of code from libqd transposed to GLSL ? or from gqd ?

ref: http://homepages.math.uic.edu/~jan/mcs572/quad_double_cuda.pdf
ref: https://github.com/lumianph/gpuprec

@claudeha
Contributor Author

claudeha commented Jun 6, 2020

could that be better served as uniform dvec4 CenterX and dvec4 CenterY ?

Probably the shaders would repack to that, yes, but the UI may be easier to do the graceful downgrade with arrays instead of having to handle all of double, dvec2, dvec3, dvec4.

it's a brilliant idea, a very ambitious undertaking, also a very specialized application

Yes, ambitious and specialized, but would be great to try it at least. Maybe it will turn out too slow to be practical, double-double is typically 10x slower than double on CPU iirc...

somehow I think "gracefully" will be difficult at best, impossible at worst.

If the quad-double is normalized, a normalized triple-double/double-double/double is just the first 3/2/1 values, but having it internally on the CPU in just quad-double at all times would simplify things.
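The downgrade described here amounts to keeping a prefix of the components. A minimal C++ sketch, assuming the input is already normalized and using the hypothetical names `qd` and `truncate_qd`:

```cpp
#include <array>
#include <cstddef>

// Normalized quad-double: x[0] + x[1] + x[2] + x[3].
using qd = std::array<double, 4>;

// Keep the first n components (n = 1..4) and zero the rest, giving the
// double / double-double / triple-double view of the same value.
qd truncate_qd(const qd &x, std::size_t n) {
    qd r = {0.0, 0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < n && i < r.size(); ++i) {
        r[i] = x[i];
    }
    return r;
}
```

With this, the CPU side can hold quad-double at all times and hand `truncate_qd(center, n)` to whichever uniform size the shader declares.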

is this being done with bits of code from libqd transposed to GLSL ? or from gqd ?

I was thinking copy/pasting the small parts of libqd that would be needed, with attribution, into a small header file for the C part, so that no extra deps are needed. Then think about porting more of qd to GLSL for the frag uses (starting with arithmetic and square root, maybe the transcendental stuff later).

Thanks for the links.

@3Dickulus
Owner

considering this from ref[1] ?

The implementation with an interval memory layout is reported to be three times faster over the sequential memory layout.

...or does that apply only to CUDA and is not available under GLSL semantics?

@claudeha
Contributor Author

claudeha commented Jun 6, 2020 via email

@3Dickulus
Owner

hmm... like in a texture buffer? fanciful speculation perhaps. probably best to try some simple stuff from libqd to get a feel for path to take with this, by simple stuff I mean based on qd using vec4 as real and then moving to complex types of dvec4[2] <- there's your 8 sliders

I don't think I can make an hiprec single slider :(

@3Dickulus
Owner

3Dickulus commented Jun 6, 2020

some random thoughts...

...internally, dvec4 varName values are addressed varName1, varName2... e.g. when applying an easing curve to vec4 varName.y in the gui, the resulting preset will use the label varName2 (edit: and vice versa)

can add setParameter (name, [T]vec[n]) functions... currently these functions only take form f(name, val,val,val,val)

already have glm::vec[n] getParameter[n]f ( QString name ) functions... could be templatized ?
[T] getParameter[n][T]( QString name ) ?

@claudeha
Contributor Author

claudeha commented Jun 6, 2020 via email

@3Dickulus
Owner

yes, my line of thought was about internal manipulations, simply reiterating what's already in place and speculating about finishing some of that stuff.

@claudeha
Contributor Author

claudeha commented Jun 7, 2020

I started implementing something, got a proof of concept working but I can't zoom beyond about 2e18 without the position getting very quantized, which makes it useless:

Zoom = 2.88871568071353344e+18 Logarithmic
CenterX = -1.25841004516429256,5.40000000000000023e-17,0,0
CenterY = 0.382432698177375352,2.99999999999999983e-18,0,0

My current guess is that the Float slider Number box uses %f when %g would be better for tiny values to avoid precision loss.

(It was easier to implement CenterX/CenterY as separate dvec4 widgets rather than tackle any array stuff at this time.)
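To make the `%f` versus `%g` guess concrete, here is a small C++ sketch using only the standard library (no Qt). The helper `format_double` is hypothetical and simply wraps `std::snprintf`.

```cpp
#include <cstdio>
#include <string>

// Format a double with a printf-style format string and return the text.
std::string format_double(const char *fmt, double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, fmt, v);
    return std::string(buf);
}
```

With the low component from the output above, `format_double("%f", 5.4e-17)` gives `0.000000`, and even `"%.18f"` keeps only two significant digits of it, while `"%.17g"` gives text that parses back to exactly the same double. Fixed-point formatting with a capped number of decimals therefore quantizes tiny components, which would match the behaviour seen beyond a zoom of about 2e18.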

@claudeha
Contributor Author

claudeha commented Jun 7, 2020

In a QDoubleSpinBox with decimals set to 2, calling setValue(2.555) will cause value() to return 2.56.

The displayed value of the QDoubleSpinBox is limited to 18 characters

I guess it's not designed at all for very small values. I think I can work around it, by rescaling the values before/after passing to/from display/presets/etc
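The rescaling idea can be sketched without Qt as follows. `quantize` is a stand-in that mimics a widget keeping a fixed number of decimals (it is not Qt's actual rounding code), and `through_widget_rescaled` together with the factor used below are assumptions made for illustration only.

```cpp
#include <cmath>

// Stand-in for a widget that keeps only `decimals` digits after the
// decimal point. Illustration only, not how QDoubleSpinBox is implemented.
double quantize(double v, int decimals) {
    double scale = std::pow(10.0, decimals);
    return std::round(v * scale) / scale;
}

// Hypothetical workaround: scale the value up before it goes into the
// widget and scale it back down when it is read out again.
double through_widget_rescaled(double v, double factor, int decimals) {
    return quantize(v * factor, decimals) / factor;
}
```

A low component of `1.23456789e-20` collapses to `0` under `quantize(v, 18)`, but survives to within rounding error via `through_widget_rescaled(v, 1e20, 18)`. Choosing a power of two as the factor would keep the scaling itself from adding rounding error.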

@3Dickulus
Owner

3Dickulus commented Jun 7, 2020

everywhere a double is used for read, output is determined by widget type

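// widget side: format the slider value as text
QString FloatWidget::toString()
{
    double f = comboSlider1->getValue();
    // 'g' format with DDEC or FDEC significant digits, depending on widget type
    return QString::number(f,'g',(isDouble() ? DDEC : FDEC));
}

// getValue(), as called above: the value comes straight from the spin box
double getValue()
{
    return spinner->value(); // returns double
}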

with precision determined by type: DDEC and FDEC ([Double|Float]DECimals) are set to 18 and 9 respectively

@claudeha
Contributor Author

claudeha commented Jun 8, 2020

work in progress at https://github.com/claudeha/FragM/tree/feature-camera2d-quad-double
example shaders at https://code.mathr.co.uk/de/blob/de71631f4b8edde46af060818946429e7b3e89e4:/glsl/include/Camera2D.frag and https://code.mathr.co.uk/de/blob/de71631f4b8edde46af060818946429e7b3e89e4:/glsl/examples/mET.frag

It works, but as expected it is quite slow (double-double is ~10x slower than double, which is ~4x slower than float on my GPU), plus the deeper zooms where double-double is necessary typically need more iterations. I timed-out my GPU (forced quit of X session) a couple of times...

@3Dickulus
Owner

As the next big step in our efforts to accelerate high performance computing, the NVIDIA Ampere architecture defines third-generation Tensor Cores that accelerate FP64 math by 2.5x compared to last-generation GPUs.

...from https://blogs.nvidia.com/blog/2020/05/14/double-precision-tensor-cores/

@3Dickulus
Owner

double-double is ~10x slower than double, which is ~4x slower than float


2 participants