Mapping Recipes
With JavaCPP and the help of the native C++ compiler and its toolchain, we can easily call native functions and access data from C/C++ libraries. Normally, when creating bindings for a native library, we ideally need to write only one configuration class in Java, such as the following:
import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;
import org.bytedeco.javacpp.tools.*;
@Properties(
    value = @Platform(
        includepath = {"/path/to/include/"},
        preloadpath = {"/path/to/deps/"},
        linkpath = {"/path/to/lib/"},
        include = {"NativeLibrary.h"},
        preload = {"DependentLib"},
        link = {"NativeLibrary"}
    ),
    target = "NativeLibrary"
)
public class NativeLibraryConfig implements InfoMapper {
    public void map(InfoMap infoMap) {
    }
}
Along with the following commands to first parse the header files into the target NativeLibrary class, and then link everything together:
$ java -jar javacpp.jar NativeLibraryConfig.java
$ java -jar javacpp.jar NativeLibrary.java
Note: Any modifications to the target class NativeLibrary will get overwritten. If you would like to write additional code manually, consider using a helper class.
Also note that, under the directory where the target NativeLibrary.class is located, the last call outputs a shared library into a subdirectory named after the platform (linux-x86_64, macosx-x86_64, windows-x86_64, etc.) and also copies any dependent libraries, which need to be bundled as resources for the target class, either as files or inside a JAR file, it does not matter. The Loader automatically extracts them into its cache when necessary. For platforms such as Android that already feature a native loader that expects to find the libraries in another directory, that directory can be specified in the platform properties under the platform.library.path entry. For reference, we can consult all the default platform properties in the source code resources.
Additionally, it is possible to simplify these build steps further with Maven and the included Mojo plugin, as shown with the JavaCPP Presets, which were created based on the recipes detailed below, so they also serve as examples to follow.
However, until the parsing capabilities of JavaCPP improve, probably by relying on a full C++ compiler front end such as Clang (see issue #51), these simple instructions alone will typically fail unless we tweak the @Properties and provide more Info to the InfoMap. The kind of Info that we need to craft depends greatly on the content of the header files. This guide is structured in such a way that only the sections (the recipes) relevant to the tasks that we are interested in completing need to be consulted:
- Providing properties for each platform
- Including multiple header files
- Ignoring attributes and macros
- Defining macros and controlling their blocks
- Mapping macros to fields or methods
- Skipping lines from header files
- Specifying names to use in Java
- Mapping a declaration to custom code
- Redefining the code of a macro
- Writing additional code in a helper class
- Creating instances of C++ templates
- Defining wrappers for basic C++ containers
- Using adapters for C++ container types
- Dealing with abstract classes and virtual methods
InfoMap.java also comes with default entries that one should be aware of, as they can provide a good reference for some of the tasks explained below.
To provide a different set of @Platform properties for each platform, we can pass an array of them to the @Properties annotation. The @Platform(value = {"..."}, ...) values are matched against the platform name using String.startsWith() such that, for example, @Platform(value = "android-arm", ...) matches both android-arm and android-arm64, but not android-x86. Each matching @Platform further down the list overrides the settings of any previous ones, leading to configuration files that look like this:
@Properties(value = {
    @Platform(
        includepath = {"/path/to/generic/include/"},
        linkpath = {"/path/to/generic/lib/"},
        include = {"NativeLibrary.h"},
        link = {"NativeLibrary"}
    ),
    @Platform(
        value = {"android", "ios"},
        includepath = {"/path/to/include/for/mobile/"},
        linkpath = {"/path/to/lib/for/mobile/"}
    ),
    @Platform(
        value = "windows-x86",
        include = {"NativeLibrary.h", "HacksForWindows.h"},
        link = {"NativeLibraryForWindows"}
    )},
    // ...
)
It is not currently possible to aggregate the settings from multiple @Platform annotations. Matching platform names only with their prefixes does not always offer enough flexibility, so we might want to revisit this and allow more sophisticated ways to perform matching using, for example, regular expressions. Nevertheless, it is already possible to work around this with the BuildEnabled and LoadEnabled interfaces.
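For example, here is a minimal sketch of the LoadEnabled approach; the linux check and the extra link path are purely hypothetical, the point being that init(ClassProperties) runs at load time and can adjust the properties programmatically:
import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;
import org.bytedeco.javacpp.tools.*;
@Properties(
    // ... same value and target as before ...
)
public class NativeLibraryConfig implements InfoMapper, LoadEnabled {
    @Override public void init(ClassProperties properties) {
        // Called by the Loader, letting us adjust properties beyond prefix matching.
        String platform = properties.getProperty("platform");
        if (platform != null && platform.startsWith("linux")) {
            properties.get("platform.linkpath").add("/path/to/special/linux/lib/");
        }
    }
    public void map(InfoMap infoMap) {
    }
}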
The Parser class is responsible for parsing the header files found in the list of @Platform(include = {...}, ...) annotation values. Its preprocessor currently does not honor the #include directive, since it is generally unreliable to blacklist all the system header files we should not be mapping, instead opting for a whitelist approach. This also prevents issues with circular inclusions and lets users specify exactly in which order the content should appear in the target class, even though this is not something C/C++ developers usually have to think about. For example, given the following 2 header files:
- types.h
struct Data {
// ...
};
- functions.h
#include "types.h"
void function(Data data);
We would need to specify them both in this order: @Platform(include = {"types.h", "functions.h"}, ...). That needs to be done recursively for all header files, basically doing a topological sort manually (so there is obviously room for improvement here).
Note: If the header files are in C and not contained within an extern "C" { } block, we need to list those in the @Platform(cinclude = { ... }, ...) annotation values instead.
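For instance, if the two headers shown above were plain C, a sketch of the corresponding property (other values omitted) would be:
@Platform(cinclude = {"types.h", "functions.h"}, ...)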
One of the things the Parser tries to do is to translate #define macros into final variables in Java. It attempts to guess the type it should use, and it works well most of the time, but it fails often enough that one of the first things we need to do to fix parsing errors is to ignore the macros we are not interested in translating. It also often trips over compiler attributes, which can be used almost anywhere in declarations. This includes, but is not limited to, things like calling conventions, memory alignment preferences, library import/export directives, assertions, exception handling, preconditions, and postconditions. One common pattern involves using macros to abstract away attributes that have a similar meaning between compilers but different names, for example:
#ifdef _WIN32
#define EXPORTS __declspec(dllexport)
#define NOINLINE __declspec(noinline)
#elif defined(__GNUC__)
#define EXPORTS __attribute__((visibility ("default")))
#define NOINLINE __attribute__((noinline))
#else
#define EXPORTS
#define NOINLINE
#endif
In this case, we generally need to use this kind of Info to be able to parse the header files successfully:
infoMap.put(new Info("EXPORTS", "NOINLINE").cppTypes().annotations());
An empty but non-null Info.cppTypes list prevents the parser from trying to guess the type to assign to a variable, while an empty but non-null Info.annotations list instructs it to also consider the macro as an attribute, but one without any corresponding Java annotations, so its output is also empty.
There are two places where we can define a macro: in the @Platform(define = { ... }, ...) annotation values and with Info.define in the InfoMap. The first one is for the Generator, which simply outputs one #define line per string. The second one is used by the Parser to provide users with some generic control over which part of the file gets parsed. In this case, the conditional groups #if, #ifdef, and #ifndef do not get evaluated the usual way. The whole condition is matched as is with an Info to decide whether to parse the block or not. Further, if no Info matches, all blocks are parsed by default, regardless of the conditions. For example, a header file might already contain blocks like the following to prevent other tools like Doxygen or SWIG from tripping on some tricky piece of code:
#if !defined(DOXYGEN) && !defined(SWIG)
// ...
#endif
JavaCPP will most likely have issues with these blocks as well, so it would be wise to add the following:
infoMap.put(new Info("!defined(DOXYGEN) && !defined(SWIG)").define(false));
However, we do not wish to skip those blocks at compile time, so we do not add them to a @Platform annotation, but we might want to define other macros there, such as NDEBUG or USE_OPENMP, to enable inlining of functions, parallel processing, etc., for example: @Platform(define = {"NDEBUG 1", "USE_OPENMP 1"}, ...).
Another thing that we might want to do with macros is to have them available as variables or methods. By default, macros that look like constants that can be translated easily into correct Java syntax will result in a public static final variable, for example:
#define VERSION MAJOR "." MINOR
By default, this gets translated into:
public static final String VERSION = MAJOR + "." + MINOR;
But if MAJOR or MINOR are not actually defined, or if they are defined to some type other than String, we will get a Java compilation error. Using the following Info, we can instead consider this macro as a function returning a value of the given C++ type:
infoMap.put(new Info("VERSION").cppTypes("const char*").translate(false));
Function-like macros do not get mapped to Java by default. However, after providing the C++ types, we will get methods to call them, for example:
#define SQUARE(x) x * x
With this Info:
infoMap.put(new Info("SQUARE").cppTypes("double", "double"));
Gives us:
public static native double SQUARE(double x);
When macros cannot be (mis)used to skip over just the right portions of header files, we can match the lines themselves against regular expressions. All we might have to go on could be comments such as these, for example:
// START COMPLEX DECLARATIONS
// ...
// END COMPLEX DECLARATIONS
In this case, we could skip these lines with this Info, using the patterns that mark the start and the end of the sections, respectively:
infoMap.put(new Info("filename.h").linePatterns("// START COMPLEX DECLARATIONS", "// END COMPLEX DECLARATIONS").skip());
Note that the strings need to be regular expressions. Moreover, the remaining lines must not contain any syntax errors introduced by the skipped lines. Further, without Info.skip, this works in reverse, whitelisting the lines to parse instead.
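Conversely, a sketch of that whitelisting form, assuming hypothetical marker comments around the declarations we do want to parse:
infoMap.put(new Info("filename.h").linePatterns("// START SIMPLE DECLARATIONS", "// END SIMPLE DECLARATIONS"));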
Besides skipping line patterns, it is also possible to skip individual variable definitions:
infoMap.put(new Info("FFI_SYSV", "FFI_THISCALL", "FFI_FASTCALL", "FFI_STDCALL", "FFI_PASCAL", "FFI_REGISTER", "FFI_MS_CDECL").skip());
By default, the Parser tries to use the same name as the C/C++ identifiers for the fields and methods of the peer classes, but it is possible to change them. In general, for struct, class, or union we can use Info.pointerTypes, while for others such as member variables and functions we use Info.javaNames, like this:
infoMap.put(new Info("full::namespace::TypeNameInCPP").pointerTypes("ClassNameInJava"));
infoMap.put(new Info("full::namespace::FunctioNameInCPP").javaNames("MethodNameInJava"));
infoMap.put(new Info("full::namespace::operator +(ns1::TypeA*, ns2::TypeB&)").javaNames("AddNameInJava"));
Note: Names for operator functions need to include one whitespace (between operator and the symbol, as in the example above). Function parameters in general are optional, but if given they must not contain parameter names; one whitespace must follow each comma, but there must be no whitespace before or after *, &, (, or ). Moreover, the types should not be typedef aliases, but the real underlying type names. This is only a current limitation of the parser, not an inherent issue with how InfoMap can and should work.
Regarding typedef, since there is no equivalent in Java, the parser will always use the underlying type whenever possible, but it only works for simple cases. One common pattern for C libraries is to alias struct pointers to another name, for example:
struct DataStruct { /* ... */ };
typedef struct DataStruct* DataHandle;
Although the parser should probably handle these situations better by default, for now, we need to provide this kind of Info to have it mapped in the expected way:
infoMap.put(new Info("DataStruct").pointerTypes("DataHandle"));
infoMap.put(new Info("DataHandle").valueTypes("DataHandle").pointerTypes("@Cast(\"DataHandle*\") PointerPointer", "@ByPtrPtr DataHandle"));
It is also possible to change the parent class of a Pointer subclass using Info.base, as long as the type we provide extends Pointer, which can be Pointer itself to force it back in the case where we are not interested in the parent class, for example:
infoMap.put(new Info("ChildClass").base("Pointer"));
Sometimes the parser fails miserably, with no way to rectify the situation using additional Info. In this case, it is possible to provide custom Java code, which the parser will output as is, using Info.javaText. For example, setting a member variable in C++ may not always be possible, because of deleted functions and whatnot, which the parser is currently unable to understand. Although we could use Info.skip to ignore the field completely, we could also allow read-only access with an Info like this:
infoMap.put(new Info("DataStruct::aReadOnlyField").javaText("public native @MemberGetter @Const @ByRef FieldType aReadOnlyField();"));
In the case of macros, it is also possible to redefine their entire content before it actually gets processed. This might be useful, for example, when there is a function-like macro that appends a calling convention, an export directive, and other attributes that cause problems for the parser. In that case, we can nullify the macro with an Info.cppText like this:
infoMap.put(new Info("DECORATE").cppText("#define DECORATE(returnType) returnType"));
If the parser does not fail, but does not get it quite right, or if we want to provide additional functionality specific to Java, such as custom deallocators with Pointer.DeallocatorReference, we can place that code in a helper class. For a library named NativeLibrary, it might look like this:
import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;
public class NativeLibraryHelper extends NativeLibraryConfig {
    /** Registers a custom deallocator when the user calls our DataHandle.create(). */
    public static abstract class AbstractDataHandle extends Pointer {
        protected static class ReleaseDeallocator extends NativeLibrary.DataHandle implements Pointer.Deallocator {
            ReleaseDeallocator(NativeLibrary.DataHandle p) { super(p); }
            @Override public void deallocate() { NativeLibrary.releaseData(this); }
        }
        public AbstractDataHandle(Pointer p) { super(p); }
        public static NativeLibrary.DataHandle create() {
            NativeLibrary.DataHandle p = NativeLibrary.createData();
            if (p != null) {
                p.deallocator(new ReleaseDeallocator(p));
            }
            return p;
        }
    }
    public static void customDataMethod(NativeLibrary.DataHandle p) { /* ... */ }
}
And then the only other thing we need to specify is the fully qualified name of that class in the @Properties(..., helper = "...") annotation value:
@Properties(
    // ...
    target = "NativeLibrary",
    helper = "NativeLibraryHelper"
)
public class NativeLibraryConfig implements InfoMapper {
    public void map(InfoMap infoMap) {
        infoMap.put(new Info("DataStruct").pointerTypes("DataHandle").base("AbstractDataHandle"));
        // ...
    }
}
This allows the target class to inherit from the helper class, such that we can refer from the target class to any method or class defined in the helper class, as well as vice versa.
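For illustration, application code can then reach the helper's functionality through the generated classes, using the hypothetical names from the example above:
NativeLibrary.DataHandle handle = NativeLibrary.DataHandle.create(); // registers the custom deallocator
NativeLibraryHelper.customDataMethod(handle);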
With C++ templates, it is not usually obvious which types should be used to create instances, and further how to name them, so we need to specify them manually. Fortunately, it is typically quite straightforward, in a manner similar to Specifying names to use in Java, again using Info.pointerTypes for data structures and Info.javaNames for functions, for example:
infoMap.put(new Info("data::Blob<float>").pointerTypes("FloatBlob"));
infoMap.put(new Info("data::Blob<double>").pointerTypes("DoubleBlob"));
infoMap.put(new Info("processor::process<double,data::Blob<float> >").javaNames("processFloatBlob"));
infoMap.put(new Info("processor::process<double,data::Blob<double> >").javaNames("processDoubleBlob"));
Note: Because of the current state of the parser, we need a whitespace between each pair of >, but there should not be any whitespace after commas between template arguments. Again, this is only a limitation of the current implementation of the parser, not an inherent issue with how InfoMap can and should work.
While containers such as std::vector and std::map are just templates, their definitions are quite complex and vary depending on the C++ compiler, so they are not portable. The Parser instead provides a set of common features for those basic containers. As with normal templates, we need to create instances manually with an Info for each, but to create a peer class, we also need to set Info.define, for example:
infoMap.put(new Info("std::vector<data::Blob<float> >").pointerTypes("FloatBlobVector").define());
infoMap.put(new Info("std::map<std::string,data::Blob<float> >").pointerTypes("StringFloatBlobMap").define());
The list of supported basic containers includes by default the ones listed in InfoMap.java, but it is also possible to append to that list other similar templates this way:
infoMap.put(new Info("basic/containers").cppTypes("templates::MyMap", "templates::MyVector"));
For some standard C++ container types, it is sometimes preferable to use an adapter to map them to existing Java types. The Generator provides a few adapters by default for std::string, std::wstring, std::vector, std::shared_ptr, and std::unique_ptr. Therefore, by default, the Parser maps those types directly to both Pointer types and standard Java types (String, int[], etc.) using the corresponding annotations @StdString, @StdWString, @StdVector, @SharedPtr, and @UniquePtr, as given in the defaults found in InfoMap.java. For @SharedPtr and @UniquePtr, since the namespace may sometimes be boost or std, we need to specify it in the @Platform annotation like this:
@Platform(compiler = "cpp11", define = {"SHARED_PTR_NAMESPACE std", "UNIQUE_PTR_NAMESPACE std"}, ... )
This can result in output such as the following, but be aware that existing Java types have limitations; for example, Java arrays cannot be resized while std::vector can:
public static native void transform(@SharedPtr DataHandle arg0, @StdVector int[] parameters);
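For illustration, a call from Java might then look like this; makeDataHandle() is just a hypothetical factory standing in for whatever function of the library returns a DataHandle:
DataHandle data = makeDataHandle();
int[] parameters = {1, 2, 3};
transform(data, parameters); // the array is adapted to a std::vector<int> for the duration of the call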
Users can create more adapters by themselves, and use them with the @Adapter annotation, either directly or on newly created annotations. At a minimum, we basically need to define a C++ class template with:
- a constructor taking a const pointer (which can be an array) to the values, the size (which can always be 0 or 1 for some containers), and an owner pointer of the container itself (which may be null or equal to the value pointer),
- an assign() method with the same set of parameters, but not const,
- another constructor taking a reference to an existing container object, which can be an rvalue reference if required,
- a static void deallocate(void *owner) method to call the destructor,
- appropriate cast operators to return types needed by function calls, along with
- member variables named ptr, size and owner, which basically mirror the state of the container, but outside of the container.
Each adapter instance is short-lived, so we cannot rely on the fields for anything that should persist. For example, the adapter required for a smart pointer similar to std::shared_ptr may look like this:
template<class T> class SmartPtrAdapter {
public:
    // Constructor used when passing values from Java to native code: reuses the
    // existing owner if it already is a smart_ptr, otherwise wraps the raw pointer.
    SmartPtrAdapter(const T* ptr, int size, void *owner) :
            ptr((T*)ptr),
            size(size),
            owner(owner),
            smartPtr2(owner != NULL && owner != ptr ? *(smart_ptr<T>*)owner : smart_ptr<T>((T*)ptr)),
            smartPtr(smartPtr2) { }
    // Constructor taking a reference to an existing container object.
    SmartPtrAdapter(const smart_ptr<T>& smartPtr) :
            ptr(0),
            size(0),
            owner(0),
            smartPtr2(smartPtr),
            smartPtr(smartPtr2) { }
    void assign(T* ptr, int size, void* owner) {
        this->ptr = ptr;
        this->size = size;
        this->owner = owner;
        this->smartPtr = owner != NULL && owner != ptr ? *(smart_ptr<T>*)owner : smart_ptr<T>((T*)ptr);
    }
    // Static method the generated code uses to call the destructor on the owner.
    static void deallocate(void* owner) {
        delete (smart_ptr<T>*)owner;
    }
    // Cast operators returning the types needed by native function calls.
    operator T*() {
        ptr = smartPtr.get();
        if (owner == NULL || owner == ptr) {
            owner = new smart_ptr<T>(smartPtr);
        }
        return ptr;
    }
    operator smart_ptr<T>&() {
        return smartPtr;
    }
    operator smart_ptr<T>*() {
        return ptr ? &smartPtr : 0;
    }
    // Member variables mirroring the state of the container.
    T* ptr;
    int size;
    void* owner;
    smart_ptr<T> smartPtr2;
    smart_ptr<T>& smartPtr;
};
Along with the following annotation and Info.annotations:
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@Adapter("SmartPtrAdapter")
public @interface SmartPtr {
    /** template type */
    String value() default "";
}
// ...
infoMap.put(new Info("ns::smart_ptr").skip().annotations("@SmartPtr"));
For abstract classes, or other classes that cannot be instantiated because of deleted constructors or other constructs that the Parser may not understand, we can skip over the constructors with Info.purify. For classes containing virtual methods that we would like to override in Java, we can use Info.virtualize to have the parser annotate the methods with @Virtual, which lets the Generator output the necessary machinery to get this working using a hidden concrete implementation and JNI callbacks. For this reason, we should not activate both settings together for abstract classes with pure virtual functions that end users need to implement, for example:
class Logger {
protected:
virtual void log(const std::string& message) = 0;
virtual ~Logger() {}
};
With this Info:
infoMap.put(new Info("Logger").purify(false).virtualize());
Results in the following usable peer class:
public static class Logger extends Pointer {
    static { Loader.load(); }
    /** Default native constructor. */
    public Logger() { super((Pointer)null); allocate(); }
    /** Pointer cast constructor. Invokes {@link Pointer#Pointer(Pointer)}. */
    public Logger(Pointer p) { super(p); }
    private native void allocate();
    @Virtual(true) protected native void log(@Const @StdString @ByRef BytePointer message);
}
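End users can then implement the callback by subclassing the peer class in Java; here is a minimal sketch (getString() assumes the message is plain null-terminated text):
Logger logger = new Logger() {
    @Override protected void log(BytePointer message) {
        System.out.println("Java side: " + message.getString());
    }
};
// Native code holding this Logger and calling log() now ends up in the Java override.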