reduce memory copying by API change/addition #240
The functionality you are requesting is already there. So, the solution for you would be to use getImageBytes(), for example like this:

public static void main(String[] args) {
    Webcam w = Webcam.getDefault();
    w.open();
    do {
        ByteBuffer bb = w.getImageBytes();
        // process byte buffer
    } while (something);
}

In case you need more information about how the memory buffer is converted to a ByteBuffer, take a look at the driver source code. I hope this answers your question. |
Well, not really. What you can do with this method is retrieve a ByteBuffer object from the capture device. Please correct me if I am wrong, but this ByteBuffer object seems to be allocated every time a new image is received. That means additional overhead. What I would like to have is not a getter but a filler method, something like:

public static void main(String[] args) {
    Webcam w = Webcam.getDefault();
    w.open();
    ByteBuffer b = BufferUtils.allocate(w.getMaximumBufferSize());
    do {
        w.retrieveImageBytes(b);
        // process byte buffer
    } while (something);
}
|
Yes, this is true. On every call a new direct ByteBuffer reference is created within BridJ JNI:

jobject JNICALL Java_org_bridj_JNI_newDirectByteBuffer(JNIEnv *env, jobject jthis, jlong peer, jlong length) {
    jobject ret;
    ret = (*env)->NewDirectByteBuffer(env, JLONG_TO_PTR(peer), length);
    return ret;
}

And it points to memory allocated inside the OpenIMAJGrabber native code:

void process_image(VideoGrabber* grabber, void* buffer, size_t length) {
    if (grabber->rgb_buffer.length != length) {
        if (grabber->rgb_buffer.start != NULL) {
            free(grabber->rgb_buffer.start);
        }
        grabber->rgb_buffer.start = malloc(length);
        grabber->rgb_buffer.length = length;
    }
    memcpy(grabber->rgb_buffer.start, buffer, length);
}

There is a way to copy the memory into a target ByteBuffer (at least theoretically, I haven't tested it), so this would be possible:

public static void main(String[] args) {
    Webcam w = Webcam.getDefault();
    w.open();
    ByteBuffer bb = BufferUtils.allocate(w.getImageBufferSize());
    do {
        w.getImageBytes(bb);
        // process byte buffer
    } while (something);
}

But please note that this operation will copy the data from one buffer to the other, and from what I understood, you do not want any more copies. The existing functionality does not copy the allocated buffer into a ByteBuffer; instead it creates a new direct ByteBuffer object which points to the pre-allocated memory. That approach neither allocates memory for the image nor copies it from one place to the other, so the big question here is which is better:

1. Create a new direct ByteBuffer per frame which points at the pre-allocated native memory (no copy, but a small object allocation on every call), or
2. Copy the native memory into a ByteBuffer provided by the user (no per-frame allocation, but one additional copy).

I do not have the tools and knowledge to test which one is faster, but I guess it would be fairly simple to add a new method which does exactly what you want (point 2). |
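To make the trade-off concrete, here is a minimal, webcam-free sketch in plain Java (the "native frame" array stands in for the grabber's pre-allocated buffer A; all names and sizes are illustrative only): option 1 wraps existing memory in a fresh ByteBuffer object on every frame, while option 2 reuses one caller-owned buffer at the cost of a bulk copy.

```java
import java.nio.ByteBuffer;

public class CopyVsWrap {

    // Pretend this is the native frame buffer "A" owned by the grabber.
    static final byte[] nativeFrame = new byte[640 * 480 * 3];

    // Option 1: create a new ByteBuffer view per frame. No bulk copy,
    // but a small object allocation on every call, and the view shares
    // memory with the grabber (it can be overwritten under you).
    static ByteBuffer wrapFrame() {
        return ByteBuffer.wrap(nativeFrame);
    }

    // Option 2: copy the frame into a caller-supplied buffer "B".
    // No per-frame allocation, but a full bulk copy each time.
    static void copyFrame(ByteBuffer target) {
        target.clear();
        target.put(nativeFrame);
        target.flip();
    }

    public static void main(String[] args) {
        nativeFrame[0] = 42;

        ByteBuffer view = wrapFrame();          // option 1
        System.out.println(view.get(0));        // 42, same memory

        ByteBuffer own = ByteBuffer.allocate(nativeFrame.length);
        copyFrame(own);                         // option 2
        nativeFrame[0] = 7;                     // grabber overwrites later
        System.out.println(own.get(0));         // still 42, it is a copy
    }
}
```

The last two lines show the semantic difference: the wrapped view would now see the new value 7, while the copied buffer still holds the old frame.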
Hey sarxos, thanks for your fast answer. Best practice in image processing is always: allocate buffers once, avoid copies wherever possible, and reuse memory instead of reallocating it.

These principles lead to the best performance that can be achieved. If the grabber device actually delivers a pointer (ByteBuffer or something like that), we as application developers would use this pointer directly for displaying or processing. That will only work correctly if there is some kind of synchronization feature in the underlying grabber API. Otherwise you will mostly get torn or partially updated images while drawing. As far as I can see in the OpenIMAJGrabber native code, this is not synchronized, so we should leave the pointer as fast as we can, or is that handled by the process callback? Either way, we will have to copy it. In this case it would be best to copy the data directly into a pre-defined ByteBuffer supplied by the API user. Regarding your big question: I hope I understood you correctly.

Well, both have the same performance at the memcpy itself, since the cost scales with the number of copied bytes: VGA will only copy about 1 MB where HD will copy about 7.6 MB. As for drawbacks: I think this decision must be left to the user of the API, since it has either a memory or a performance impact, depending on the application. |
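One standard way to avoid the torn-image problem mentioned above is double buffering; this is only an illustration, not part of the Webcam Capture API (FrameExchange is a made-up name). The capture side fills a back buffer and swaps it with the front buffer under a short lock, so readers only ever observe complete frames:

```java
import java.nio.ByteBuffer;

// Minimal double-buffer sketch: the capture thread fills the back buffer,
// then swaps references under a short lock; readers only see full frames.
public class FrameExchange {

    private ByteBuffer front;
    private ByteBuffer back;

    public FrameExchange(int frameLength) {
        front = ByteBuffer.allocate(frameLength);
        back = ByteBuffer.allocate(frameLength);
    }

    // Called by the capture thread with a freshly completed frame.
    public synchronized void publish(byte[] frame) {
        back.clear();
        back.put(frame);
        back.flip();
        ByteBuffer tmp = front;   // swap references only, no bulk copy here
        front = back;
        back = tmp;
    }

    // Called by the rendering/processing thread.
    public synchronized ByteBuffer latest() {
        return front.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        FrameExchange fx = new FrameExchange(4);
        fx.publish(new byte[] { 1, 2, 3, 4 });
        System.out.println(fx.latest().get(0)); // prints 1
    }
}
```

Note the consumer should finish with the returned view before the next publish, since the underlying buffer is recycled; a full solution would rotate three buffers or copy on read.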
I think there is a gap in communication :) I am not a native English speaker, so I may have described these two points incorrectly. Let me describe them again.

The OpenIMAJGrabber always allocates a new memory buffer (let's call it A, size = width * height * 3) and copies data from the device's internal working memory to buffer A. Unfortunately this cannot be changed in a simple way, at least not in the Webcam Capture API project (OpenIMAJGrabber is part of OpenIMAJ), and I am not even sure it would not break anything. So the copy from device internals to A is something we have to live with. In case of option 1, there is no second copy. The Java ByteBuffer reference is created in-place from the pre-allocated memory A. Nothing is copied; the only memory allocated is a small portion of the Java heap used to hold the ByteBuffer reference (maybe around 32 bytes: 8 for the address, 8 for the length and 16 for the object header). In case of option 2, you create a ByteBuffer and allocate new memory (~1 MB on the heap, or direct, i.e. outside the heap, plus ~32 bytes for the Java reference); let's call this buffer B. In this case you will always copy from A to B. So to summarise: no additional copy and no additional large allocation.

This is what is done in option 1. At least this is my understanding after reading the source code of OpenIMAJGrabber and BridJ. |
Okay, I see. I am not a native speaker either (I am only 400 km away from you ;) So, back to the topic... I don't really like option 1, since it seems to suffer from the synchronization problems described above. You should not access the pointer unless it is guaranteed that no one else is writing to it at the same time; otherwise you will get torn images. Option 2 is probably the best we can get without changing the underlying APIs. It has one copy operation too many, but that would be okay. Maybe this is also a good place to put image transformation or other processing plugins (pipelining). |
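The pipelining idea could look roughly like the following hypothetical sketch (no such interface exists in the library; Stage and Pipeline are made-up names): each stage transforms the frame buffer in place, and the stages run in order on every frame.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical frame-processing pipeline, only an illustration of where
// transformation plugins could hook in between capture and display.
public class Pipeline {

    public interface Stage {
        void process(ByteBuffer frame); // transform the frame in place
    }

    private final List<Stage> stages = new ArrayList<>();

    public Pipeline add(Stage s) {
        stages.add(s);
        return this;
    }

    public void run(ByteBuffer frame) {
        for (Stage s : stages) {
            frame.rewind();       // each stage sees the frame from the start
            s.process(frame);
        }
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.wrap(new byte[] { 10, 20, 30 });
        new Pipeline()
            .add(f -> { // hypothetical "brighten" stage
                for (int i = 0; i < f.limit(); i++) {
                    f.put(i, (byte) (f.get(i) + 5));
                }
            })
            .run(frame);
        System.out.println(frame.get(0)); // prints 15
    }
}
```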
The low-level webcam API is internally synchronized by the processor which executes WebcamTask instances, so you can build your own synchronized operations on top of it. Example:

import java.awt.Dimension;
import java.nio.ByteBuffer;
import com.github.sarxos.webcam.Webcam;
import com.github.sarxos.webcam.WebcamDevice.BufferAccess;
import com.github.sarxos.webcam.WebcamResolution;
import com.github.sarxos.webcam.WebcamTask;
public class Test {
public static class MySynchronisedTask extends WebcamTask {
private volatile boolean done = false;
public MySynchronisedTask(Webcam w) {
super(Webcam.getDriver(), w.getDevice());
}
@Override
protected void handle() {
ByteBuffer buffer = ((BufferAccess) getDevice()).getImageBytes();
if (buffer != null) {
// use the buffer to do something which needs to be
// synchronized, webcam will not execute any other operation
// until this method is in progress, e.g. you can copy from
// buffer to graphic card memory
done = true;
}
}
public boolean doTheJob() {
done = false;
try {
process();
} catch (InterruptedException e) {
}
return done;
}
}
public static void main(String[] args) throws InterruptedException {
Dimension s = WebcamResolution.QQVGA.getSize();
final Webcam w = Webcam.getDefault();
w.setViewSize(s);
w.open();
Thread t1 = new Thread(new Runnable() {
@Override
public void run() {
do {
if (!new MySynchronisedTask(w).doTheJob()) {
return;
}
} while (true);
}
});
t1.setDaemon(true);
t1.start();
Thread t2 = new Thread(new Runnable() {
@Override
public void run() {
do {
if (!new MySynchronisedTask(w).doTheJob()) {
return;
}
} while (true);
}
});
t2.setDaemon(true);
t2.start();
Thread.sleep(2000);
w.close();
}
}

In the example above you can see that both threads, t1 and t2, access the device buffer through synchronized WebcamTask operations. |
Besides the functionality I described in my previous post, you can try the new method getImageBytes(ByteBuffer), which copies the image data into a ByteBuffer you provide:

import java.awt.Dimension;
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ComponentColorModel;
import java.awt.image.ComponentSampleModel;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import javax.imageio.ImageIO;
import com.github.sarxos.webcam.Webcam;
import com.github.sarxos.webcam.WebcamResolution;
public class Test2 {
public static void main(String[] args) throws InterruptedException, IOException {
Dimension size = WebcamResolution.QQVGA.getSize();
int length = size.width * size.height * 3;
ByteBuffer bb = ByteBuffer.allocate(length);
Webcam w = Webcam.getDefault();
w.setViewSize(size);
w.open();
w.getImageBytes(bb);
byte[] bytes = new byte[length];
bb.rewind();
bb.get(bytes);
int[] BAND_OFFSETS = new int[] { 0, 1, 2 };
int[] BITS = { 8, 8, 8 };
int[] OFFSET = new int[] { 0 };
int DATA_TYPE = DataBuffer.TYPE_BYTE;
ColorSpace COLOR_SPACE = ColorSpace.getInstance(ColorSpace.CS_sRGB);
ComponentSampleModel smodel = new ComponentSampleModel(DATA_TYPE, size.width, size.height, 3, size.width * 3, BAND_OFFSETS);
ComponentColorModel cmodel = new ComponentColorModel(COLOR_SPACE, BITS, false, false, Transparency.OPAQUE, DATA_TYPE);
byte[][] data = new byte[][] { bytes };
DataBufferByte dbuf = new DataBufferByte(data, bytes.length, OFFSET);
WritableRaster raster = Raster.createWritableRaster(smodel, dbuf, null);
BufferedImage bi = new BufferedImage(cmodel, raster, false, null);
bi.flush();
ImageIO.write(bi, "jpg", new File("test.jpg"));
}
} |
@slajar, please let me know if this satisfies your needs. |
I really like the webcam library; it is probably the most extensive and versatile lib for Java. Nevertheless, it has one major mistake (unfortunately common for Java APIs): it is designed around too many copy operations. Let me clarify a bit. You retrieve a BufferedImage by a getImage()-style method; that's great for using directly in Swing applications. Unfortunately this copies the image once. After that you have to draw it to a Graphics object, which copies it a second time.
In my case it's much worse. Over here we are in an OpenGL environment. That means we retrieve the image, transform it to a ByteBuffer and copy it to the graphics card. That means at least three copies. As far as I understand the code, there is one more copy involved below that. The bigger the image is, the more CPU is involved.
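As a side note on the BufferedImage path: one copy can sometimes be shaved off by reading the raster's backing byte array directly (standard java.awt.image API) instead of extracting pixels pixel by pixel; the array can then be wrapped in a ByteBuffer without copying. A small self-contained sketch:

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

public class RasterAccess {
    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 2, BufferedImage.TYPE_3BYTE_BGR);
        img.setRGB(0, 0, 0x00FF0000); // red pixel at (0,0)

        // A TYPE_3BYTE_BGR image is backed by a plain byte[] in B,G,R order;
        // grabbing it avoids an extra per-pixel copy before the GPU upload.
        byte[] pixels = ((DataBufferByte) img.getRaster().getDataBuffer()).getData();

        System.out.println(pixels.length);    // 2 * 2 * 3 = 12
        System.out.println(pixels[2] & 0xFF); // red component of pixel (0,0) = 255
    }
}
```

Caveat: this only works for byte-backed image types, and touching the data buffer un-manages the image for some rendering optimizations, so it fits upload-to-GPU paths better than Swing painting.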
Therefore I ask for a function that lets me specify a target ByteBuffer. With this change I would probably be able to remove at least two copy operations. I am sure there are more applications that need the same performance.