Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: specialize on interpolations in the different Values structs #339

Closed
wants to merge 3 commits into from

Conversation

KristofferC
Copy link
Collaborator

Allows the compiler to know how many base functions there are when doing e.g. reinit!.

Example benchmark:

using Ferrite, BenchmarkTools

for dim in (2, 3)
    for shape in (RefTetrahedron, RefCube)
        for order in (1, 2)
            if dim == 3 && shape == RefCube
                continue
            end
            qr = QuadratureRule{dim,shape}(1)
            linear = Lagrange{dim,shape,1}()

            quadratic = Lagrange{dim,shape,2}()
            cv_vector = CellVectorValues(qr, quadratic, linear)
            cv_scalar = CellScalarValues(qr, quadratic, linear)

            x = (dim == 2 && shape == RefTetrahedron) ? Triangle :
                (dim == 2 && shape == RefCube) ? Quadrilateral :
                (dim == 3 && shape == RefTetrahedron) ? Tetrahedron :
                (dim == 3 && shape == RefTCube) ? Hexahedron : error()

            grid = generate_grid(x, dim == 2 ? (1,1) : (1,1,1))
            xe = getcoordinates(grid, 1)

            @btime reinit!($cv_scalar, $xe)
            @btime reinit!($cv_vector, $xe)
        end
    end
end

Results (PR in left, master in right):

10.409 ns vs 17.469 ns
19.393 ns vs 26.296 ns
10.000 ns vs 17.474 ns
20.199 ns vs 24.866 ns
13.961 ns vs 20.624 ns
25.148 ns vs 30.444 ns
14.174 ns vs 20.608 ns
25.492 ns vs 30.443 ns
31.065 ns vs 42.441 ns
84.960 ns vs 93.313 ns
30.860 ns vs 42.312 ns
83.530 ns vs 85.384 ns

@codecov-io
Copy link

Codecov Report

Merging #339 (760410d) into master (6a090bf) will decrease coverage by 0.05%.
The diff coverage is 96.55%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #339      +/-   ##
==========================================
- Coverage   87.92%   87.87%   -0.06%     
==========================================
  Files          20       20              
  Lines        2195     2194       -1     
==========================================
- Hits         1930     1928       -2     
- Misses        265      266       +1     
Impacted Files Coverage Δ
src/FEValues/common_values.jl 89.34% <85.71%> (-0.90%) ⬇️
src/FEValues/cell_values.jl 100.00% <100.00%> (ø)
src/FEValues/face_values.jl 100.00% <100.00%> (ø)
src/interpolations.jl 90.00% <100.00%> (+0.05%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a090bf...760410d. Read the comment docs.

@KristofferC
Copy link
Collaborator Author

@JimBrouzoulis wanted to know for isoparametric mapping:

using Ferrite, BenchmarkTools

for dim in (2, 3)
    for shape in (RefTetrahedron, RefCube)
        for order in (1, 2)
            if dim == 3 && shape == RefCube
                continue
            end
            qr = QuadratureRule{dim,shape}(1)

            quadratic = Lagrange{dim,shape,order}()
            cv_vector = CellVectorValues(qr, quadratic)
            cv_scalar = CellScalarValues(qr, quadratic)

            xe = Ferrite.reference_coordinates(quadratic)

            @btime reinit!($cv_scalar, $xe)
            @btime reinit!($cv_vector, $xe)
        end
    end
end
7.924  ns vs 12.783 ns 
13.946 ns vs 18.085 ns 
12.587 ns vs 22.067 ns 
22.923 ns vs 31.541 ns 
9.494  ns vs 17.298 ns 
17.128 ns vs 24.442 ns 
28.482 ns vs 31.705 ns 
40.953 ns vs 42.528 ns 
23.396 ns vs 29.716 ns 
47.527 ns vs 48.847 ns 
41.292 ns vs 52.257 ns 
90.091 ns vs 92.995 ns 

@fredrikekre
Copy link
Member

Perhaps just store the interpolations as fields too? I believe there are some code that currely tries to reverse-engineer that.

@KristofferC
Copy link
Collaborator Author

Perhaps just store the interpolations as fields too?

Could do that, I actually think that doesn't take any space.


julia> struct Bar end

julia> struct Foo end

julia> struct Baz
           b::Bar
           f::Foo
       end

julia> sizeof(Baz(Bar(), Foo()))
0

@KristofferC
Copy link
Collaborator Author

Added the field. Last commit is just some renameing, interp sounds as good as interpol imo and is shorter.

@lijas
Copy link
Collaborator

lijas commented Mar 30, 2021

Would it also be faster to allow the compiler to know how many quadrature points there are? So change QuadratureRule to

struct QuadratureRule{dim,shape,T,N}
    weights::NTuple{N,T}
    points::NTuple{N,Vec{dim,T}}
end

@KristofferC
Copy link
Collaborator Author

I don't think that will have a big effect. The work inside the quadrature loop should be big enough to offset it. This also applies to reinit! (since that is also only done once per element) but the perf improvements in reinit! isn't the main point of this PR, it is the different function_ function and also to be able to e.g. unroll the double loop when you assemble to local stiffness matrix.

I will try add a more representative benchmark.

@KristofferC
Copy link
Collaborator Author

The doassemble code in the heat equation examples goes from 0.22s -> 0.17s with this PR using a 1000x1000 element grid.

@kimauth
Copy link
Member

kimauth commented Apr 16, 2021

I've been trying this out for dispatching the reinit!(cv, x) function based on a custom geometry interpolation. Worked great! That could save me a bunch of awkward work-arounds, so I'd highly vote for merging this.
However, it was (obviously) breaking for a custom subtype of CellValues that I use.

Is there any drawback to also storing the QuadratureRule? That would be more complete than just storing the interpolations (and I could make use of it).

@koehlerson
Copy link
Member

koehlerson commented Aug 2, 2021

pinging this

I really like this PR, it can work very nicely for subtyped stuff and eases a lot of other things we faced when trying LOD and adapative prototyping with Ferrite. :)
Is there anything that speaks against merging it (except the small conflict in face_values.jl)?

@koehlerson
Copy link
Member

forgot to mention this in my previous post: I'd also vote to incorporate the qr in cellvalues, the completeness that Kim mentioned is really nice. In a lot of cases I had to propagate down cellvalues and quadrature rules, which is quite error prone - or at least more error prone than cv.qr or something. Those changes would lead to a higher abstraction level for cellvalues and we could make nice APIs out of that for more advanced features later :)

@fredrikekre
Copy link
Member

However, it was (obviously) breaking for a custom subtype of CellValues that I use.

The extra parameters on the abstract types could be skipped, then this wouldn't be a problem I think?

fredrikekre added a commit that referenced this pull request Feb 24, 2022
The interpolations are stored as abstract fields in the struct since
they are never accessed in performance critical parts of the code, but
it is convenient to have access too them after creating the FEValues.

Closes #339.
@fredrikekre fredrikekre deleted the kc/spec_vector branch February 24, 2022 14:28
koehlerson pushed a commit that referenced this pull request Apr 22, 2022
The interpolations are stored as abstract fields in the struct since
they are never accessed in performance critical parts of the code, but
it is convenient to have access too them after creating the FEValues.

Closes #339.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants