position, limit and capacity across Pointer casts #476

HGuillemet · 2021-05-08T11:09:38Z

This PR aims at changing the way position limit and capacity are interpreted when the Pointer is cast from a type to another.
The current behavior is counter-intuitive and may lead to difficult-to-diagnose errors for the user.

Issue

Example 1

Say you have 3D float data consisting in a series of images. You initialize the data somehow:

FloatPointer fp = new FloatPointer(n*w*h);
for (int i=0; i<n; i++) {
  fp.position(i*w*h);
  // initialize image i from, e.g. another FloatPointer:
  fp.put(imagePointer);
}

Then you'd like to apply some OpenCV processing on each image:

for (int i=0; i<n; i++) {
  fp.position(i*w*h);
  Mat m = new Mat(h, w, CV_32FC1, fp, Mat.AUTO_STEP);
  ...
}

However, since this Mat constructor accepts a Pointer as argument, and not a FloatPointer, its generated JNI code interprets the position (and limit and capacity) as a number of bytes and not of floats, and m will point the the wrong slice of the buffer.

To obtain the correct behavior, we must replace the code with:

for (int i=0; i<n; i++) {
  fp.position(i*w*h*4);
  Mat m = new Mat(h, w, CV_32FC1, fp, Mat.AUTO_STEP);
  ...
}

Note that the call to fp.put(imagePointer); in the first loop doesn't need this adjustment even if it calls Pointer.put(Pointer) and not some FloatPointer.put(FloatPointer). This is because put was hand-written and explicitly multiplies the positions by sizeof() before calling Pointer.memcpy. If you call memcpy directly you'll need the adjustments.

Example 2

You have a buffer of floats pointed by a FloatPointer of limit and capacity 100 and would like to create some view/layout on it:

   BytePointer bp = new BytePointer(fp);

You would expect bp to have a limit and capacity of 400, but it actually stays at 100.

Solutions

@saudet made a change in this commit that can address Example 1, if we replace the code with:

for (int i=0; i<n; i++) {
  FloatPointer fp2 = fp.getPointer(i*w*h);
  Mat m = new Mat(h, w, CV_32FC1, fp2, Mat.AUTO_STEP);
  ...
}

it fixes the wrong interpretation of the position, but not of the limit and capacity. It doesn't address example 2. Example 2 could be fix with taking sizeof into account in the Pointer constructors though, just like in Pointer.put.

This PR proposes to fix position, limit and capacity by storing their value in byte units. So JNI generated codes will always points to the right, "physical" position. Methods position(), limit() and capacity() still return element-wise values, so the change should be transparent for the user. position, limit and capacity fields are made private to enforce the use of the methods in subclasses, but they could also have been kept protected and renamed as positionBytes, limitBytes and capacityBytes.

Drawbacks

Existing codes that deal with the issue by explicitly adjusting positions with the size of elements will be broken
The presets that directly access the fields should be change to use the methods
calls to position(), limit() and capacity() are slowed down due to use of sizeof() that triggers a double map lookup in Loader.

Performance

To address this last point, I see three options:

Add a sizeof field to Pointer that could be initialized during the call to Pointer.init(). This would be the fastest but increases the size of Pointer instances from 40 to 44 bytes for storing a value that is truly a class property.
Store the sizeof in a single hash map Class => Integer instead of using the double hash map of offsets in Loader.
Add to the Parser the generation of an overriding of sizeof() (like in FloatPointer, etc...) . Either returning a constant value if the parser can somehow obtain the value of sizeof, or something like :

static int sizeof;
public int sizeof() { return sizeof };

With the static variable initialized by the Loader.

To be verified

Adaptation of Generator was a bit tricky. I probably made some mistakes. There are two portions of code marked with // HG which I'm particularly unsure about.

…bnd_upgrade

saudet · 2021-05-10T07:46:31Z

That looks OK, but the way things are right now works fine for like 99% of use cases, so unless we can figure out a way to preserve the performance without increasing the memory footprint or the workload for users not using the parser, we would need a way to disable all this alternative functionality. Could you try to work on that?

Store the sizeof in a single hash map Class => Integer instead of using the double hash map of offsets in Loader.

It might make it a bit faster, sure, but not as fast as no hash maps. We can modify that without changing the API though.

saudet · 2021-05-11T06:23:34Z

Instead of trying to change everything, what about adding new overloads like Pointer.getPointer(Class<? extends Pointer> c)?
I think that should pretty much cover everything, as long as users don't try to call position(long i) when they don't need it that is.

HGuillemet · 2021-05-11T08:50:18Z

You mean a way to create a bare pointer, with recalculated byte-wise position/limit/capacity, from any instance of Pointer ?

Would that be called automatically when using a generated method working on a Pointer ?

Would that break anything if the existing Pointer(Pointer) constructor did this ?

saudet · 2021-05-11T12:14:17Z

You mean a way to create a bare pointer, with recalculated byte-wise position/limit/capacity, from any instance of Pointer ?

Something like that, yeah.

Would that be called automatically when using a generated method working on a Pointer ?

No, Pointer.getPointer() is called by the user for casting and offsetting operations.

Would that break anything if the existing Pointer(Pointer) constructor did this ?

I keep telling you that even if we find a way to not break anything, it would also need to not add overhead!
If you find a way to do that, great, if not, please stop proposing changes like that.

… `capacity` with `sizeof()` (pull #476)

saudet · 2021-05-12T11:29:07Z

I've added the scaling to the existing Pointer.getPointer(Class<P>, long) method in commit 577239c and it seems to me like we can do everything correctly and easily with it, for example:

FloatPointer fp = someBytePointer.getPointer(FloatPointer.class);
for (int i=0; i<n; i++) {
  FloatPointer fp2 = fp.getPointer(i*w*h);
  fp2.put(imagePointer);
  Mat m = new Mat(h, w, CV_32FC1, fp2, Mat.AUTO_STEP);
  ...
}
...

Can you think of a scenario where Pointer.getPointer() would fail?

HGuillemet · 2021-05-12T21:56:21Z

I'll limit my propositions in the future for sure.

Can you think of a scenario where Pointer.getPointer() would fail?

In your example if the Mat constructor used limit or capacity it would.

saudet · 2021-05-13T08:51:44Z

I'll limit my propositions in the future for sure.

I'm sorry, I didn't mean it that way. I wish we could spend more time being productive, that's all, but I understand that communication is not easy, and it's not a big deal rewriting the same thing a couple of times, so please don't worry about it. I'm not angry or anything even though it may seem like that, but it's not the case.

Think of it that way, a lot of people just go right into modifying the address field manually, and don't even think about anything else, so it's unlikely that users are going to end up using awkward constructs like the cast constructor or position() by mistake instead of getPointer(). I could be wrong, but since getPointer() is easier to use, I don't really see how that would happen.

Can you think of a scenario where Pointer.getPointer() would fail?

In your example if the Mat constructor used limit or capacity it would.

But it doesn't, so everything is alright. If you find an example where this causes incorrect behavior, please let me know.

HGuillemet · 2021-05-13T12:17:19Z

I'm sorry, I didn't mean it that way. I wish we could spend more time being productive, that's all

My suggestions were typed a bit to fast I concede.

But it doesn't, so everything is alright.

Other method could, but I guess it's rare. To work around this, if ever the user is aware that position/limit/capacity will be interpreted byte-wise for the method he wants to call, he may also use getPointer this way:

FloatPointer fp = someBytePointer.getPointer(FloatPointer.class);
for (int i=0; i<n; i++) {
  fp.postion(i*w*h);
  fp2.put(imagePointer);
  Mat m = new Mat(h, w, CV_32FC1, fp.getPointer(Pointer.class), Mat.AUTO_STEP);
  ...
}

Anyway this PR was trying to solve definitively the problem of position consistency. Workarounds already exist, like modifying the address manually like you said.
But I understand now that there is no way to solve it without adding overhead, either in position (at least a multiplication or division) or when calling a native method (by somehow calling getPointer(typeInMethodSignature) like above but transparently). And the increased consistency in the API is not worth the overhead (and the required time of development).

So let's close this PR.

saudet · 2021-05-14T04:52:48Z

Other method could, but I guess it's rare.

I can't think of anything. The capacity is never used from JNI, and the only time limit is used is for adapters, and that doesn't work with Pointer, you can't have a std::vector<void> for example, and anything else results in a compile-time error. Can you think of something else?

But I understand now that there is no way to solve it without adding overhead, either in position (at least a multiplication or division) or when calling a native method (by somehow calling getPointer(typeInMethodSignature) like above but transparently). And the increased consistency in the API is not worth the overhead (and the required time of development).

I wouldn't consider any of this inconsistent, it's just how it works. Like I said, it's not meant to replace NIO, we have FMA for that: https://docs.oracle.com/en/java/javase/16/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/package-summary.html

So let's close this PR.

Can we instead try to morph this into support for FMA?

HGuillemet · 2021-05-14T19:34:20Z

Can you think of something else?

No, I cannot.

I wouldn't consider any of this inconsistent, it's just how it works.

I think you can admit that the fact that position being not interpreted the same way in the caller code using some subclass of Pointer and in any called method M taking a Pointer as argument is not what is expected by a Java developer. Position is stored in the state of the Pointer object, it's not a method argument. Even from a C/C++ developer point of view, calling M(p.position(2)) would be expected to mirror C code like M(p+2) or M(&p[2]). Furthermore methods like Pointer.put(Pointer) with similar prototype, interpret position "correctly". For you, it's not inconsistent because you know well how it's implemented, so the way it actually works makes perfect sense, but it's not the case of a typical user.
I know your priority is to make things work and work fast, not to provide the best Java developer experience. With this in mind the way it presently works might well be the best that can be done.

Can we instead try to morph this into support for FMA?

Ok, sounds a good idea.
Do we speak of some JavaCPP 2 where Pointer disappears in favor of MemorySegment and MemoryAddress and the Generator and the Parser are reworked or do so you have something else in mind ?

saudet · 2021-05-17T00:17:45Z

Can you think of something else?

No, I cannot.

👍

I wouldn't consider any of this inconsistent, it's just how it works.

I think you can admit that the fact that position being not interpreted the same way in the caller code using some subclass of Pointer and in any called method M taking a Pointer as argument is not what is expected by a Java developer. Position is stored in the state of the Pointer object, it's not a method argument. Even from a C/C++ developer point of view, calling M(p.position(2)) would be expected to mirror C code like M(p+2) or M(&p[2]). Furthermore methods like Pointer.put(Pointer) with similar prototype, interpret position "correctly". For you, it's not inconsistent because you know well how it's implemented, so the way it actually works makes perfect sense, but it's not the case of a typical user.
I know your priority is to make things work and work fast, not to provide the best Java developer experience. With this in mind the way it presently works might well be the best that can be done.

From Java's perspective, sure, we want to make everything the safest possible, and having slow but safe(r) get()/put() method pairs sounds good to me. Java developers are used to having methods like that, and it looks like we can make everything consistent, as long as we call something that starts with "get" like getPointer() or "put" like put(), which I think is good. If the average user can do everything they need consistently like that, then we can move everything else to a lower-level API, which is more consistent with C++ than Java, and yes even the position reflects somewhat how C++ works where, for example, guess what the output of this is:

#include <stdio.h>
struct Foo {
    int i = 42;
};

struct Bar : public Foo {
    int j = 37;
};

void fooBar(Foo *p) {
    printf("%d\n", p[1].i);
}

int main() {
    Bar* p = new Bar[445];
    printf("%d\n", p[1].i);
    fooBar(p);
}

I'm not saying we should try to make everything work exactly like C++, I'm just saying it won't be possible to make everything consistent anyway because C++ is inconsistent by design.

Can we instead try to morph this into support for FMA?

Ok, sounds a good idea.
Do we speak of some JavaCPP 2 where Pointer disappears in favor of MemorySegment and MemoryAddress and the Generator and the Parser are reworked or do so you have something else in mind ?

I'm sure there's a way to keep backward compatibility if it's just for Pointer and FMA, but for the Parser and Clang, we probably want to break backward compatibility anyway like I talked about a bit at bytedeco/javacpp-presets#475 (comment) and #453 (comment).

saudet · 2021-08-03T09:03:59Z

BTW, if you're willing to maintain an experimental branch that breaks compatibility with JavaCPP 1.5.x, we could merge this there. Does that sound like something you'd like to do?

HGuillemet · 2021-08-09T23:05:19Z

Let’s be sure that there is an interest in spending resources in such project first, and that the result has a chance to be merged into JavaCPP one day.

The main difference between Panama Foreign Memory API and the corresponding features in JavaCPP is the relative safety FMA provides: by default, only MemorySegment can be dereferenced and a MemorySegment is necessary bound. You cannot dereference an address outside the bounds. Also once a segment is deallocated, no way to dereference it either.
Also the FMA API provides some methods to keep this safety even when a memory segment is accessed from different thread, and without relying on any costly mutex.

There is a way to create an abitrary MemorySegment from a MemoryAddress, or even from a long, but this requires launching the JVM with command line argument such as -enable-native-access=org.bytedeco.opencv. With this argument, it’s also possible to directly dereference a MemoryAddress without defining a MemorySegment.
If a C/C++ library returns a pointer to a memory it as allocated itself, and if the Java code needs to dereference the pointer, the command line argument is needed, which is a bit cumbersome.

So what do we get in breaking compatibility by adopting FMA ? I think it’s the increased safety from the user perspective. And to benefit from the efforts of the Panama team to implement optimized access to memory while keeping this safety.

If we only want to support FMA API for people needing to access native memory using FMA, in order to interact with Vector API for instance, or to use the Memory Layout feature of FMA, we can simply add some method that creates a memory segment from a Pointer (or the inverse). No need indeed to break compatibility.

If we do want the increased safety, then it’s a big overhaul, with the drop of Pointer classes, rewriting of Parser and Generator, and probably using a much thinner JNI layer (until the Foreign Function API supports C++ libraries). We could indeed take this opportunity to adopt Clang.

But is this the direction you want JavaCPP to take ? Your priority has always been speed and giving to Java users all the options C++ developers have. It has not been safety.

Is there some other aspect I missed ?

saudet · 2021-08-13T01:38:45Z

FMA is fine, it's basically a redesigned and enhanced version of the Buffer interface. They may decide to keep it hard to use for native libraries, and I think that's what they will choose to do, so we probably won't be able to implement the equivalent of a fully functional Pointer class on top of it, but nothing is decided yet. It's probably going to take a few years before that gets settled, but in the end yeah I don't think it will be something we can use instead of Pointer.

As for the FFI part of things, they (finally) made it pretty clear that the JDK will never support C++. It's just not something they will do, ever. Whatever users might want to do there, it's going to be done by higher-level libraries like JavaCPP, or some enhanced version of jextract or something. But if they keep FMA hard to use with FFI, it's never going to be easier to use than JNI anyway, so I don't see a whole lot of people adopting this. Personally, I'm waiting to see if Google decides to make Android compatible with Panama, or if they just keep using JNI, in which case everyone is probably just going to keep using JNI. That's also going to take a few years to become clear though.

All that to say that I don't have a crystal ball, but I don't think OpenJDK is going to matter. Something else is going to happen at some point and my bet is on LLVM...

HGuillemet added 6 commits April 23, 2021 17:57

upgrade BND to 5.3.0 for Java 15+ compatibility

34f8bca

BND plugin upgrade

c482f14

Merge branch 'master' of https://github.com/bytedeco/javacpp into hg_…

9b91ff0

…bnd_upgrade

Store position,limit,capacity in Bytes

12fdf01

Remove final to position()/limit()/capacity()

51ba0bf

Fix bugs introduced in 12fdf01

7af0aac

saudet added a commit that referenced this pull request May 12, 2021

* Upon Pointer.getPointer(Class<P>) scale position, limit, and…

577239c

… `capacity` with `sizeof()` (pull #476)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

position, limit and capacity across Pointer casts #476

position, limit and capacity across Pointer casts #476

HGuillemet commented May 8, 2021 •

edited

Loading

saudet commented May 10, 2021

saudet commented May 11, 2021

HGuillemet commented May 11, 2021

saudet commented May 11, 2021

saudet commented May 12, 2021

HGuillemet commented May 12, 2021

saudet commented May 13, 2021

HGuillemet commented May 13, 2021

saudet commented May 14, 2021 •

edited

Loading

HGuillemet commented May 14, 2021

saudet commented May 17, 2021

saudet commented Aug 3, 2021

HGuillemet commented Aug 9, 2021

saudet commented Aug 13, 2021

position, limit and capacity across Pointer casts #476

Are you sure you want to change the base?

position, limit and capacity across Pointer casts #476

Conversation

HGuillemet commented May 8, 2021 • edited Loading

Issue

Example 1

Example 2

Solutions

Drawbacks

Performance

To be verified

saudet commented May 10, 2021

saudet commented May 11, 2021

HGuillemet commented May 11, 2021

saudet commented May 11, 2021

saudet commented May 12, 2021

HGuillemet commented May 12, 2021

saudet commented May 13, 2021

HGuillemet commented May 13, 2021

saudet commented May 14, 2021 • edited Loading

HGuillemet commented May 14, 2021

saudet commented May 17, 2021

saudet commented Aug 3, 2021

HGuillemet commented Aug 9, 2021

saudet commented Aug 13, 2021

HGuillemet commented May 8, 2021 •

edited

Loading

saudet commented May 14, 2021 •

edited

Loading