Home
Welcome to the ffi-experimental wiki!
Experimental work on next-gen FFI bindings to the C++ libtorch library, in preparation for 0.0.2, which targets the 1.0 backend.
The PyTorch project provides prebuilt binaries of the C++ libtorch library on its official page, as well as a Debian package for Ubuntu. By using these reliable binaries, we can start running Haskell programs in various environments quickly. (Compiling C++ libtorch with CUDA support takes a long time.) Imagine running hasktorch on Colaboratory by typing a single command.
PyTorch development moves very fast and its API changes frequently, so it is difficult to maintain a Haskell API for it by hand.
There is a plan for Declarations.yaml to become the single, externally visible API; see this issue. We therefore use the generated Declarations.yaml spec, instead of header parsing, for code generation.
Declarations.yaml is located at ffi-experimental/deps/pytorch/build/aten/src/ATen/Declarations.yaml. The file is generated by building the libtorch binary or by running ffi-experimental/deps/get-deps.sh. It covers the Native, TH, and NN functions, but it does not cover the methods of C++ classes; the code for those methods is generated from ffi-experimental/spec/cppclass/*.yaml.
The dataflow is as follows:

```
spec/Declarations.yaml (pytorch) -> codegen (a program of this repo) -> ffi (FFI bindings of this repo)
spec/cppclass/*.yaml (this repo)-|
```
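To give a feel for what codegen consumes, here is a minimal base-only sketch (not the real codegen, which uses a proper YAML parser) that extracts the `name:` fields from a Declarations.yaml-style snippet; the sample entries below are illustrative, not copied from the real file:

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- A tiny excerpt in the shape of Declarations.yaml entries
-- (field values are made up for illustration).
sample :: String
sample = unlines
  [ "- name: add"
  , "  method_of: [Tensor]"
  , "- name: ones"
  , "  method_of: []"
  ]

-- Collect the value of every top-level 'name:' field.
declaredNames :: String -> [String]
declaredNames = mapMaybe (stripPrefix "- name: ") . lines

main :: IO ()
main = print (declaredNames sample)
```

The real codegen walks the full structure of each declaration (argument types, return types, and so on) to emit one binding per function.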
We use inline-c-cpp to bind the C++ API instead of the C API. inline-c-cpp generates C++ code and Haskell code at compile time, using Template Haskell. Technically, the symbols of the generated C++ code are wrapped in extern "C"; see How to mix C and C++. The generated Haskell code uses the FFI.
The original inline-c-cpp does not support C++ namespaces and templates. To support them, we use a modified inline-c-cpp; see this PR.
C++ objects live in one of two places: the heap or the stack. libtorch functions return objects by value, i.e. as stack objects. When a function holding an object in a local variable returns, the object on the stack is destroyed. In the example below, when test() returns, the tensor a on the stack is destroyed.
```cpp
void test() {
  at::Tensor a = at::ones({2, 2}, at::kInt);
  at::Tensor b = at::randn({2, 2});
  auto c = a + b.to(at::kInt);
}
```
So this FFI copies the object onto the heap with new, so that it is not destroyed:
```cpp
at::Tensor* ones_for_haskell() {
  at::Tensor a = at::ones({2, 2}, at::kInt);
  return new at::Tensor(a);
}
```
C data is passed to function arguments directly, while C++ objects are passed as object pointers. Likewise, at the end of a function call, C data is returned by value, while C++ objects are returned as object pointers allocated with new.
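On the Haskell side this convention means plain data is marshalled by value while objects travel through pointers. A minimal base-only sketch (not using libtorch; `mkObject` is a hypothetical stand-in for a wrapper like `ones_for_haskell`):

```haskell
import Foreign.Marshal.Alloc (malloc, free)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- A stand-in for a heap-allocated C++ object: an Int on the heap.
-- It plays the role of a wrapper that returns a pointer to a
-- freshly allocated value.
mkObject :: Int -> IO (Ptr Int)
mkObject x = do
  p <- malloc        -- like C++ 'new'
  poke p x
  return p

main :: IO ()
main = do
  p <- mkObject 42   -- "object" returned by pointer
  v <- peek p        -- read it back on the Haskell side
  print v            -- the plain Int itself is passed/returned by value
  free p             -- like C++ 'delete'; the managed layer automates this
```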
We use GHC's garbage collection. The generated FFI code has an unmanaged flavor (ffi-experimental/ffi/src/Aten/Unmanaged/*) and a managed flavor (ffi-experimental/ffi/src/Aten/Managed/*).
Unmanaged code uses the Ptr type, which corresponds to a raw C/C++ pointer. Managed code uses the ForeignPtr type, which is managed by GHC.
To convert unmanaged code into managed code, each C++ object type must be an instance of the CppObject type class, and managed code is wrapped by the cast functions of the Castable type class. You can see the details of casting in ffi-experimental/ffi/src/Aten/Cast.hs:
```haskell
class CppObject a where
  fromPtr :: Ptr a -> IO (ForeignPtr a)

class Castable a b where
  cast :: a -> (b -> IO r) -> IO r
  uncast :: b -> (a -> IO r) -> IO r

instance (CppObject a) => Castable (ForeignPtr a) (Ptr a) where
  cast x f = withForeignPtr x f
  uncast x f = fromPtr x >>= f

cast0 :: (Castable a ca) => IO ca -> IO a
cast0 f = f >>= \ca -> uncast ca return

cast1 :: (Castable a ca, Castable y cy)
      => (ca -> IO cy) -> a -> IO y
cast1 f a = cast a $ \ca -> f ca >>= \cy -> uncast cy return
...
```
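As a self-contained illustration of this pattern, here is a base-only sketch that reuses the type classes above but instantiates them for a plain CInt on the heap instead of a real libtorch object; attaching finalizerFree shows how the ForeignPtr hands deallocation over to GHC's garbage collector:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (malloc, finalizerFree)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- The type classes from Cast.hs:
class CppObject a where
  fromPtr :: Ptr a -> IO (ForeignPtr a)

class Castable a b where
  cast :: a -> (b -> IO r) -> IO r
  uncast :: b -> (a -> IO r) -> IO r

instance (CppObject a) => Castable (ForeignPtr a) (Ptr a) where
  cast x f = withForeignPtr x f
  uncast x f = fromPtr x >>= f

cast0 :: (Castable a ca) => IO ca -> IO a
cast0 f = f >>= \ca -> uncast ca return

-- Stand-in for a C++ object: the finalizer frees the memory
-- when the ForeignPtr becomes unreachable.
instance CppObject CInt where
  fromPtr = newForeignPtr finalizerFree

-- An "unmanaged" constructor returning a raw pointer, like the
-- generated unmanaged bindings.
newCInt :: CInt -> IO (Ptr CInt)
newCInt x = do { p <- malloc; poke p x; return p }

main :: IO ()
main = do
  fp <- cast0 (newCInt 7) :: IO (ForeignPtr CInt)  -- managed wrapper
  v  <- withForeignPtr fp peek
  print v
```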
A C++ tuple becomes an instance of CppTuple2, which provides access to each element of the tuple. For example:
```haskell
class CppTuple2 m where
  type A m
  type B m
  get0 :: m -> IO (A m)
  get1 :: m -> IO (B m)

instance CppTuple2 (Ptr (Tensor,Tensor)) where
  type A (Ptr (Tensor,Tensor)) = Ptr Tensor
  type B (Ptr (Tensor,Tensor)) = Ptr Tensor
  get0 v = [C.throwBlock| at::Tensor* { return new at::Tensor(std::get<0>(*$(std::tuple<at::Tensor,at::Tensor>* v)));}|]
  get1 v = [C.throwBlock| at::Tensor* { return new at::Tensor(std::get<1>(*$(std::tuple<at::Tensor,at::Tensor>* v)));}|]
```
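Here is a libtorch-free sketch of the same associated-type-family pattern, instantiated for an ordinary Haskell pair instead of a Ptr to a C++ tuple, just to show how get0/get1 resolve to different result types:

```haskell
{-# LANGUAGE TypeFamilies, FlexibleInstances #-}

-- The same class shape as in the bindings above.
class CppTuple2 m where
  type A m
  type B m
  get0 :: m -> IO (A m)
  get1 :: m -> IO (B m)

-- Illustrative instance: the "tuple" is a plain Haskell pair,
-- so the accessors are pure projections wrapped in IO.
instance CppTuple2 (Int, String) where
  type A (Int, String) = Int
  type B (Int, String) = String
  get0 (x, _) = return x
  get1 (_, y) = return y

main :: IO ()
main = do
  x <- get0 (3 :: Int, "tensor")
  y <- get1 (3 :: Int, "tensor")
  print (x, y)
```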
C++ operators are mapped to Haskell functions, following Python's naming convention. For example, operator+= is assigned to _iadd_. The details of the mapping are in this code.
When a libtorch C++ function fails, the binding throws an exception.
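On the Haskell side such failures surface as ordinary exceptions, so callers can handle them with Control.Exception. A minimal base-only sketch, using error/evaluate as a stand-in for an exception propagated from C++ (the real bindings translate C++ exceptions via throwBlock):

```haskell
import Control.Exception (SomeException, evaluate, try)

-- A stand-in for a binding whose underlying C++ call fails.
failingCall :: IO Int
failingCall = evaluate (error "libtorch: shape mismatch")

main :: IO ()
main = do
  r <- try failingCall :: IO (Either SomeException Int)
  case r of
    Left _  -> putStrLn "caught an exception from the binding"
    Right v -> print v
```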
- For now, use stack. (To use cabal v2, update shell.nix and cabal.project.)
- CircleCI
- Ubuntu 18.04
- stack
- Use a pinned libtorch binary
```shell
# Download the libtorch binary and generate 'Declarations.yaml'
> pushd deps
> ./get-deps.sh
> popd
# Generate the FFI code into the output directory
> stack exec codegen-exe
# Check the differences and copy the generated code
> diff -r output/Aten ffi/src/Aten
> cp -r output/Aten ffi/src/
# Build and test
> stack test ffi
```
See MemorySpec.hs.
See BasicTest.hs.
- Prebuilt libtorch uses the old gcc ABI to maintain backwards compatibility. Pass -D_GLIBCXX_USE_CXX11_ABI=0 to gcc.
- Integrate this FFI into hasktorch/hasktorch.
- (Resolved) Support libtorch's autograd in this FFI.
  - The at::Tensor class in ATen is not differentiable by default. To get the differentiability of tensors that the autograd API provides, we must use tensor factory functions from the torch:: namespace instead of the at:: namespace.
  - For now, this FFI only uses tensor factory functions from the at:: namespace.
  - We will add factory functions from the torch:: namespace.
- (Resolved) Make a script for uploading pinned libtorch binaries for all environments and conditions (Linux, macOS, and Windows).
- What does a generated function's suffix mean, e.g. the tts of add_tts?
  - C++ supports overloading; Haskell does not. We use the suffix to keep function names from conflicting in Haskell.
- Is torch::Tensor the same as at::Tensor?
  - Yes.
- Why not use fficxx?
  - fficxx does not support managed code using ForeignPtr.
- What are native_functions.yaml and nn.yaml?
  - These files are used to generate Declarations.yaml.
- What is the difference between the C API and the C++ API?
  - See https://pytorch.org/cppdocs/.
- https://github.com/fpco/inline-c
- https://github.com/wavewave/fficxx
Please feel free to update this document and add FAQ entries.