- Investigate warning C4910 on MSVC - some declspec conflict (dllexport vs. extern on an explicit instantiation)
- Look at fwdsolver_mw.h instantiation requirements, determine appropriate preprocessor guard (e.g. Clang?)
- MATLAB QM are always complex, will crash on a forward solve with real data (NB interleaved API motivation)
- MATLAB gmsh reader out of date
- MATLAB fluorescence example performance
- Interleaved API permits shallow copies of various inputs in the MATLAB interface, reducing round-trips; exploit this
- Review element types in libfe; some contain unfinished definitions of operators and constants, remove these
- Move semantics for mathlib vectors and matrices
- Improved initialisation for element entries
- Check propensity for structural nonzeros viz. direct solvers
- Python interface build assumes Release paths on Windows
- Default link list after make mesh appears arbitrary, resulting in enormous linklist/qmvec
- Bottlenecks
- Mesh sparsity calculation heapsort (single-threaded), called when computing the system matrix for fields
- Solvers
- Fast direct solvers require supernodal + BLAS implementation. Use of e.g. CHOLMOD for direct solve when computing forward and adjoint fields for an HD problem is optimum (c. N=200k, nQM = 60).
- MKL PARDISO less competitive than CHOLMOD.
- Simplicial solvers such as Eigen LLT, and the legacy Cholesky implementation, are not competitive with iterative solvers.
- Block Krylov methods don't appear to offer significant speedup and are reliant upon fast matrix solves thus indirectly require a decent BLAS.
- Iterative solvers (CG, BICGSTAB) are readily parallelised and within an order of magnitude of direct solvers; the hot path is sparse-dense Ax and Cholesky substitution. No memory issues. SpMV variants using different CSR structures have shown limited improvement.
- Jacobian computation is fast in basis, slow in mesh. The mesh path is dominated by IntFG cost, which in turn makes a virtual method call to the element IntFG in the hot loop. Experiments show a 50% speedup is possible by extracting this call and working over all RHS, and/or precomputing element integrals to avoid scaling and indexing cost.
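
The Jacobian item above can be illustrated with a minimal sketch. It is not the library's code: ToyElement, ToyTriangle, ToyMesh, localIntegral, intFGNaive, precomputeLocalIntegrals and intFGAll are all hypothetical names, and the per-element quantity (a linear-triangle mass matrix contracted with two nodal fields) is only an assumed stand-in for whatever the element IntFG actually computes. The point is the loop structure: the baseline makes a virtual call in the innermost loop for every field pair, while the alternative caches the element integrals once and then sweeps all RHS pairs per element with no virtual dispatch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an element class; not the library's real interface.
struct ToyElement {
    virtual ~ToyElement() = default;
    // Integral of the product of local shape functions i and j.
    virtual double localIntegral(std::size_t i, std::size_t j) const = 0;
};

// Linear triangle: mass matrix entries are area/6 (i == j) and area/12 (i != j).
struct ToyTriangle final : ToyElement {
    double area;
    explicit ToyTriangle(double a) : area(a) {}
    double localIntegral(std::size_t i, std::size_t j) const override {
        return i == j ? area / 6.0 : area / 12.0;
    }
};

constexpr std::size_t kNodes = 3;  // nodes per element in this toy mesh

struct ToyMesh {
    std::vector<std::unique_ptr<ToyElement>> elements;
    std::vector<std::array<std::size_t, kNodes>> nodes;  // global node indices
};

using Field = std::vector<double>;  // nodal values of one field (one RHS)

// Baseline: one virtual call per (i, j), per element, per field pair.
std::vector<double> intFGNaive(const ToyMesh& mesh, const Field& f, const Field& g) {
    assert(f.size() == g.size());
    std::vector<double> out(mesh.elements.size(), 0.0);
    for (std::size_t e = 0; e < mesh.elements.size(); ++e)
        for (std::size_t i = 0; i < kNodes; ++i)
            for (std::size_t j = 0; j < kNodes; ++j)
                out[e] += f[mesh.nodes[e][i]] * g[mesh.nodes[e][j]]
                          * mesh.elements[e]->localIntegral(i, j);
    return out;
}

// Evaluate every element's local integrals once and store them in a flat array.
std::vector<double> precomputeLocalIntegrals(const ToyMesh& mesh) {
    std::vector<double> cache(mesh.elements.size() * kNodes * kNodes);
    for (std::size_t e = 0; e < mesh.elements.size(); ++e)
        for (std::size_t i = 0; i < kNodes; ++i)
            for (std::size_t j = 0; j < kNodes; ++j)
                cache[(e * kNodes + i) * kNodes + j] =
                    mesh.elements[e]->localIntegral(i, j);
    return cache;
}

// All field pairs in one sweep: element data is fetched once per element and
// reused for every pair, with no virtual calls inside the loops.
// Layout of the result: out[(s * G.size() + d) * nEl + e].
std::vector<double> intFGAll(const ToyMesh& mesh, const std::vector<double>& cache,
                             const std::vector<Field>& F, const std::vector<Field>& G) {
    const std::size_t nEl = mesh.elements.size();
    assert(cache.size() == nEl * kNodes * kNodes);
    std::vector<double> out(F.size() * G.size() * nEl, 0.0);
    for (std::size_t e = 0; e < nEl; ++e) {
        const double* m = cache.data() + e * kNodes * kNodes;
        const auto& nd = mesh.nodes[e];
        for (std::size_t s = 0; s < F.size(); ++s)
            for (std::size_t d = 0; d < G.size(); ++d) {
                double sum = 0.0;
                for (std::size_t i = 0; i < kNodes; ++i)
                    for (std::size_t j = 0; j < kNodes; ++j)
                        sum += F[s][nd[i]] * G[d][nd[j]] * m[i * kNodes + j];
                out[(s * G.size() + d) * nEl + e] = sum;
            }
    }
    return out;
}
```

In this sketch the cache costs nine doubles per element, which is the memory traded for removing the dispatch. Whether the quoted 50% figure carries over depends on the real element types and field layout, so any change along these lines would need to be re-measured on the actual mesh path.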