Generic Matrix/Vector/N-Dimensional Type #5
Is it even necessary to define vectors/matrices, given that these are already arrays/seqs? What about defining Matrix as a 2-D array as in numpy, with matrix algebra applied to it, e.g. A = B.X, etc.?
Sorry it took a while to reply - I was on holidays :-)

@mihirparadkar There is a tentative implementation of more generic linear algebra routines in my other library, emmy. The types are compatible, so if you import both linalg and emmy you can have generic implementations for any ring. The implementation in emmy has limitations, though: I only made it for the dynamic-size case, and more importantly the algorithms used are the naive multiplications.

@freevryheid I am not sure I understand your comment. The types defined in linalg are just type aliases, and in fact matrices are little more than a two-dimensional array (with a little more information about the size and columnwise/rowwise layout).
@andreaferretti, I guess that's my point - there is very little to gain by defining new types when the existing ones should suffice, no? When interfacing with blas/lapack, arrays need to be column-aligned, so define them as such and leave it to the programmer to ensure this, or provide a transpose function to facilitate it.
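To illustrate the column-major requirement: Fortran BLAS/LAPACK routines expect elements of a column to be contiguous, so a row-major flat buffer has to be permuted before the call. A minimal sketch (`toColMajor` is a hypothetical helper, not part of linalg):

```nim
# Hypothetical helper: convert a row-major flat matrix (m rows, n cols)
# to column-major before handing it to a Fortran BLAS/LAPACK routine.
proc toColMajor[T](a: seq[T], m, n: int): seq[T] =
  result = newSeq[T](m * n)
  for i in 0 ..< m:
    for j in 0 ..< n:
      # row-major index i*n + j  ->  column-major index j*m + i
      result[j * m + i] = a[i * n + j]

let a = @[1.0, 2.0,
          3.0, 4.0]                      # 2x2, row-major
assert a.toColMajor(2, 2) == @[1.0, 3.0, 2.0, 4.0]
```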
Well, I still need the matrix types, since they hold a few more fields besides the arrays. And I needed to define custom types in the static-dimension case, to hold dimension info in the type. It just made sense to define the dynamic vector types by analogy. In any case, the types are not distinct, just aliases. If you prefer to spell them out, it is the same; they are perfectly compatible with Nim types.
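Concretely, because the dynamic vector types are plain aliases, ordinary seqs interoperate with them with no conversion at all (a small sketch using the alias as linalg spells it):

```nim
# The dynamic vector type is just an alias, as in linalg:
type DVector64 = seq[float64]

# Any seq[float64] is a DVector64 and vice versa - same type, no wrapping.
let v: DVector64 = @[1.0, 2.0, 3.0]
let s: seq[float64] = v
assert s == @[1.0, 2.0, 3.0]
```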
I've started working on this and finished porting. I didn't test with emmy or how Nim deals with specialized generics.
Here's another generic port that incorporates complex numbers as well. This was coded to support my comments to @andreaferretti above - sorry for the late response.

Basically, it comes down to defining a matrix[T] where T can be any number type, including complex. It is similar to the linalg matrix definition, but uses a seq[T] object (instead of an array) that stores row, column and order info via the toMatrix call. Calls to lapack and blas transform input matrices to column-major and re-transform to row-major order on output via a pointer to seq[0] - hence no need for arrays and static parameters, which IMHO overly complicate and restrict coding. I'm using the Fortran blas library for matrix multiplication.

Complex numbers are defined as array[2, T] instead of a tuple; I modified complex.nim accordingly. The example under lalite.nim calculates complex eigenvectors of a square matrix, following the Intel geev example online.

This was mainly proof of concept, and the code is still lacking and naive, without due consideration given to memory, speed, etc. I offer it as a "lite" version of the current linalg and welcome any feedback.
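The array[2, T] representation matters because it keeps the memory layout identical to Fortran's COMPLEX, so a seq of such values can be passed to LAPACK by pointer without conversion. A minimal sketch of the idea (the name `Cplx` is hypothetical, not lalite's actual code):

```nim
# A complex number as a flat two-element array: [0] = real, [1] = imaginary.
# Layout-compatible with Fortran COMPLEX, unlike a tuple with padding concerns.
type Cplx[T] = array[2, T]

proc `+`[T](a, b: Cplx[T]): Cplx[T] =
  [a[0] + b[0], a[1] + b[1]]

proc `*`[T](a, b: Cplx[T]): Cplx[T] =
  # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
  [a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0]]

let i: Cplx[float64] = [0.0, 1.0]
assert i * i == [-1.0, 0.0]   # i squared is -1
```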
@freevryheid Thank you for your suggestions! I would like very much to make the library more generic, but it is not so simple. Let me make a few comments on your approach:
I would like very much to join efforts and come to a common codebase if it makes sense, but I would like to preserve
Do you think we can come to some form of convergence?
Would love to discuss the Nim numerical ecosystem as well - maybe on gitter/irc? As @andreaferretti knows, I had tough challenges using linalg for my needs. In the end, I built a Tensor library from scratch, re-using nimblas as the backend. I need unified types for vectors, matrices, and extensibility up to 4D tensors at least (height, width, color, batch for basic image processing), 6D if possible (depth for 3D images, time for video). That also means that I can't use static typing. For now the library is still rough (tensor display might be cut, for example), but here it is: https://github.com/mratsim/Arraymancer.
I would be happy to discuss on gitter - I don't think that having 3 or more different approaches benefits anyone, and I would be glad to drop linalg in favour of more convenient alternatives. By the way, I happen to have a prototype neural network implementation that I have not had time to open source. I had to use methods for it to work nicely, and I did treat matrices and vectors separately. I am not sure whether we will be able to meet on gitter, though, because of timezones - it is 7PM here in Italy.
I am in France, so same timezone. However, this is a side project for me, so I'm only free during lunch time or in the evening.
I agree - to ensure longevity, the nim community should decide on a preferred approach. My tenets ($0.02):
The issue with not specifying an order is that transposing becomes a non-constant-time operation.
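To spell that out: if the type carries an explicit order field, a transpose can simply flip the flag and swap the dimensions, leaving the buffer alone. A stripped-down sketch (a simplified value object; linalg's actual DMatrix is a ref object):

```nim
# With an explicit order field, transpose is O(1): flip the flag,
# swap the dimensions, and leave the element buffer untouched.
type
  OrderType = enum rowMajor, colMajor
  DMatrix[T] = object
    order: OrderType
    M, N: int
    data: seq[T]

proc t[T](a: DMatrix[T]): DMatrix[T] =
  result.order = if a.order == rowMajor: colMajor else: rowMajor
  result.M = a.N
  result.N = a.M
  result.data = a.data   # same elements, reinterpreted under the new order

let a = DMatrix[float64](order: rowMajor, M: 2, N: 3,
                         data: @[1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
let b = a.t
assert b.M == 3 and b.N == 2 and b.order == colMajor
```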
I've updated PR #14. The base types in linalg can now be generic over floats, with the original DVector32 etc. being provided for compatibility/syntactic sugar:

```nim
type
  Vector*[N: static[int], T: SomeReal] = ref array[N, T]
  Matrix*[M, N: static[int], T: SomeReal] = object
    order: OrderType
    data: ref array[N * M, T]
  DVector*[T: SomeReal] = seq[T]
  DMatrix*[T: SomeReal] = ref object
    order: OrderType
    M, N: int
    data: seq[T]
  Vector32*[N: static[int]] = Vector[N, float32]
  Vector64*[N: static[int]] = Vector[N, float64]
  Matrix32*[M, N: static[int]] = Matrix[M, N, float32]
  Matrix64*[M, N: static[int]] = Matrix[M, N, float64]
  DVector32* = DVector[float32]
  DVector64* = DVector[float64]
  DMatrix32* = DMatrix[float32]
  DMatrix64* = DMatrix[float64]
```

Regarding the overall discussion, I see several needs:
On the common points (feel free to add/correct):
On the differences:
I don't really know the OpenGL domain, but I don't see any overlap between glm and linalg/lalite/arraymancer, so I will only talk about the last 3 and explain why I chose to do something different.

Why didn't I choose a simple 2D library:

The ideal deep learning library supports at least 4D arrays and up to 6D arrays. 6D is for 3D videos (time + depth are added).

Why no static parameter or array in the type:

Data structure chosen

Most of why I chose to do things in a specific way is explained here.

Unique differentiator

Thanks to Arraymancer's strided data structure, it can do arbitrary slicing/views without copying the underlying data. Also, Arraymancer can hold any homogeneous type, i.e. you can have integers, floats, bools, objects... You can't (and it's not a target) have heterogeneous types like float + int in the same ndarray.

```nim
import math, arraymancer, future

const
  x = @[1, 2, 3, 4, 5]
  y = @[1, 2, 3, 4, 5]

var
  vandermonde: seq[seq[int]]
  row: seq[int]

vandermonde = newSeq[seq[int]]()

for i, xx in x:
  row = newSeq[int]()
  vandermonde.add(row)
  for j, yy in y:
    vandermonde[i].add(xx^yy)

let foo = vandermonde.toTensor(Cpu)

echo foo
# Tensor of shape 5x5 of type "int" on backend "Cpu"
# |1 1 1 1 1|
# |2 4 8 16 32|
# |3 9 27 81 243|
# |4 16 64 256 1024|
# |5 25 125 625 3125|

echo foo.fmap(x => x.isPowerOfTwo) # map a function (=> needs import future)
# Tensor of shape 5x5 of type "bool" on backend "Cpu"
# |true true true true true|
# |true true true true true|
# |false false false false false|
# |true true true true true|
# |false false false false false|

echo foo[1..2, 3..4] # slice
# Tensor of shape 2x2 of type "int" on backend "Cpu"
# |16 32|
# |81 243|

echo foo[3.._, _] # span slice
# Tensor of shape 2x5 of type "int" on backend "Cpu"
# |4 16 64 256 1024|
# |5 25 125 625 3125|

echo foo[_..^3, _] # slice until (inclusive, consistent with Nim)
# Tensor of shape 3x5 of type "int" on backend "Cpu"
# |1 1 1 1 1|
# |2 4 8 16 32|
# |3 9 27 81 243|

echo foo[_.._|2, _] # step
# Tensor of shape 3x5 of type "int" on backend "Cpu"
# |1 1 1 1 1|
# |3 9 27 81 243|
# |5 25 125 625 3125|

echo foo[^1..0|-1, _] # reverse step
# Tensor of shape 5x5 of type "int" on backend "Cpu"
# |5 25 125 625 3125|
# |4 16 64 256 1024|
# |3 9 27 81 243|
# |2 4 8 16 32|
# |1 1 1 1 1|
```
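The zero-copy slicing shown above follows from the general strided-addressing scheme such libraries use: element (i, j) lives at `data[offset + i*strides[0] + j*strides[1]]`, so a view only needs new offset/shape/strides. A stripped-down sketch of the idea (not Arraymancer's actual implementation):

```nim
# Strided addressing: a slice is just a new (offset, shape, strides)
# triple over the same data seq - nothing is copied.
type Tensor2D[T] = object
  data: seq[T]
  offset: int
  shape: array[2, int]
  strides: array[2, int]

proc `[]`[T](t: Tensor2D[T], i, j: int): T =
  t.data[t.offset + i * t.strides[0] + j * t.strides[1]]

# A 3x3 stored contiguously, row-major: strides = [3, 1].
let a = Tensor2D[int](data: @[1, 2, 3, 4, 5, 6, 7, 8, 9],
                      offset: 0, shape: [3, 3], strides: [3, 1])

# View of the bottom-right 2x2 block: only the offset and shape change.
let v = Tensor2D[int](data: a.data, offset: 4,
                      shape: [2, 2], strides: [3, 1])
assert v[0, 0] == 5 and v[1, 1] == 9
```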
I should mention that all the parts of this library related to dynamic matrices have been ported to a new library, neo, and substantially expanded there. I think tensors should be easily compatible with neo, and one should be able to transform between tensors, matrices and vectors without copying data. I still prefer having matrices and vectors as the main data types, even though tensors incorporate them. I was thinking that maybe I could add a tensor datatype later, with slice operations and the possibility to convert 2-D tensors to matrices and 1-D tensors to vectors (and vice versa). This would allow higher-dimensional data types, but still make it easier to follow BLAS operations on the relevant types (sorry, I am always confused when trying to map BLAS to tensors).
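The copy-free conversion can work by sharing the underlying buffer through a ref. A sketch under assumed, simplified types (the names `Tensor`, `Matrix` and `toMatrix` here are hypothetical; real neo/Arraymancer interop would use their actual types):

```nim
# A rank-2 tensor and a matrix can share one buffer: converting between
# them only copies metadata (shape), never the elements.
type
  Matrix[T] = object
    M, N: int
    data: ref seq[T]
  Tensor[T] = object
    shape: seq[int]
    data: ref seq[T]

proc toMatrix[T](t: Tensor[T]): Matrix[T] =
  assert t.shape.len == 2, "only rank-2 tensors map to matrices"
  Matrix[T](M: t.shape[0], N: t.shape[1], data: t.data)  # shares storage

var d: ref seq[float64]
new(d)
d[] = @[1.0, 2.0, 3.0, 4.0]
let t = Tensor[float64](shape: @[2, 2], data: d)
let m = t.toMatrix
assert m.data == t.data   # same ref, i.e. the same buffer - no copy
```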
Why are the matrix and vector types separate depending on the value they contain (i.e. `Matrix32[M, N]` and `Matrix64[M, N]` instead of `Matrix[M, N, T]`)? Since the implementations of matrix and vector operations for 32-bit and 64-bit floats are effectively identical, wouldn't generic procs specialized on the element type allow a lot less code duplication? Wouldn't it also allow easier scaling to complex matrices/vectors, and boolean vectors for more concise and expressive indexing?
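For example, a single proc generic over the element type covers both float widths, with the compiler instantiating each specialization on demand (an illustrative sketch, not linalg's code):

```nim
# One generic implementation replaces duplicated float32/float64 variants;
# the compiler generates a specialization per element type actually used.
proc dot[T: SomeReal](a, b: seq[T]): T =
  assert a.len == b.len
  for i in 0 ..< a.len:
    result += a[i] * b[i]

assert dot(@[1.0'f32, 2.0'f32], @[3.0'f32, 4.0'f32]) == 11.0'f32
assert dot(@[1.0, 2.0], @[3.0, 4.0]) == 11.0
```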