webgpu: support dataToGPU api #6329
Conversation
@qjia7 @axinging @haoyunfeix @gyagp Please take a look, thank you.
util.assert(
    options.customBufSize % 4 === 0,
    () => 'customBufSize should be a multiple of 4.');
}
We should make sure that options.customBufSize is always larger than or equal to bufferInfo.byteSize. Otherwise, the user will lose partial data.
Also see Na's reply in #5953 (review) about why customShape is needed in webgl; pasting it here:
Just curious, why support customTexShape? Is there any special requirement for this?
The main use case is that if the user sets the texShape to be the same as the canvas shape, they can directly render the result to the canvas. If they don't specify the texShape, we try to make a squarish texture.
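Taken together, this thread asks for two checks on the option: the size must be 4-byte aligned, and it must be at least the tensor's own byte size. A minimal sketch combining both, where `validateCustomBufSize` is a hypothetical helper name (not the actual tfjs code):

```typescript
// Hypothetical helper combining the two checks discussed in this thread:
// customBufSize must be a multiple of 4 (WebGPU buffer sizes are 4-byte
// aligned), and must not be smaller than the tensor's actual byte size,
// otherwise the copy would silently drop trailing data.
function validateCustomBufSize(customBufSize: number, byteSize: number): void {
  if (customBufSize % 4 !== 0) {
    throw new Error('customBufSize should be a multiple of 4.');
  }
  if (customBufSize < byteSize) {
    throw new Error(
        `customBufSize (${customBufSize}) must be >= the tensor's ` +
        `byte size (${byteSize}), otherwise partial data would be lost.`);
  }
}
```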
For webgpu, I think we keep it to conveniently copy the returned buffer to a texture and render to the canvas. If this point doesn't hold, we should remove options.customBufSize for webgpu.
Done
@@ -471,6 +472,83 @@ export class WebGPUBackend extends KernelBackend {
    return vals;
  }

  async downloadGPUBufferData(gpuData: GPUData):
The function body is very similar to getBufferData. Can you write a common helper function to reuse the code?
Done
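For illustration, the shared readback tail that such a common helper could factor out might look like this. This is a sketch against a minimal mapped-buffer interface; `MappableBuffer`, `readMappedBuffer`, and the inlined `MAP_READ` constant are illustrative stand-ins, not the actual tfjs helpers or the full GPUBuffer type:

```typescript
// Minimal stand-in for the subset of GPUBuffer used when reading back data.
interface MappableBuffer {
  mapAsync(mode: number): Promise<void>;
  getMappedRange(): ArrayBuffer;
  unmap(): void;
}

const MAP_READ = 0x0001;  // value of GPUMapMode.READ in the WebGPU spec

// Shared readback path: map the staging buffer for reading, copy its
// contents out, and unmap. Both getBufferData and downloadGPUBufferData
// could delegate their common tail to a helper of this shape.
async function readMappedBuffer(buffer: MappableBuffer): Promise<ArrayBuffer> {
  await buffer.mapAsync(MAP_READ);
  // slice(0) copies the bytes out so they remain valid after unmap().
  const data = buffer.getMappedRange().slice(0);
  buffer.unmap();
  return data;
}
```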
const res = b.dataToGPU();
expectArraysEqual(res.bufSize, size);
const resData = await webGPUBackend.downloadGPUBufferData(res);
expectArraysClose(resData, new Float32Array(data));
Why not directly use expectArraysClose(resData, data)?
Done
expectArraysClose(resData, new Float32Array(data));
});

it('uses user defined bufSize.', async () => {
For a shorter buffer, we should make it throw an error instead of making it valid.
I have changed bufSize from smaller to larger. This case tests that a custom buffer size is also valid for dataToGPU; throwing an error when the buffer size is smaller is covered by the last case.
}

if (bufferInfo.buffer == null) {
  if (values != null) {
It seems all test cases only exercise dataToGPU with CPU forward disabled, meaning the tensors are always on the GPU. Should we also handle tensors on the CPU, for example by adding a case with CPU forward enabled?
Hi Xing, sorry, I did not fully understand your meaning. But dataToGPU is only used when the tensor's data must be on the GPU.
tfjs-core/package.json (Outdated)
@@ -78,7 +78,8 @@
  "@types/webgl-ext": "0.0.30",
  "long": "4.0.0",
  "node-fetch": "~2.6.1",
- "seedrandom": "2.4.3"
+ "seedrandom": "2.4.3",
+ "@webgpu/types": "0.1.6"
We need to use "@webgpu/types" defined in tfjs/package.json instead.
tfjs/package.json had already added this package, so I removed it here.
FYI, there are issues about changing the shape when calling dataToGPU; I am discussing them with LiNa in #5953.
Please review it again. I sort of agree with adding the custom size option; some details are in #5953 (comment).
Adding @lina128 to take another look.
LGTM with one open question.
For webgpu, we are exposing the buffer instead of a texture, so I am not sure whether it's still necessary to let the user specify a custom buffer size.
}

const bufferSize = options.customBufSize != null ?
    Math.max(bufferInfo.byteSize, options.customBufSize) :
nit: Math.max(bufferInfo.byteSize, options.customBufSize) -> options.customBufSize?
Done
const bufferSize = options.customBufSize != null ?
    Math.max(bufferInfo.byteSize, options.customBufSize) :
    bufferInfo.byteSize;
const copySize = options.customBufSize != null ?
const copySize = bufferInfo.byteSize;? Or directly use bufferInfo.byteSize in copyBufferToBuffer.
Done
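The distinction settled here is that the destination buffer may be allocated at the custom size, while only the tensor's own bytes are copied. As a sketch, with the illustrative name `resolveCopySizes` (not an actual tfjs helper):

```typescript
// Illustrative helper separating the two sizes discussed above:
// - allocSize: how large the destination buffer is allocated
//   (customBufSize when provided, otherwise the tensor's byte size)
// - copySize: how many bytes of real tensor data are copied, which is
//   always the source's byteSize regardless of the allocation.
function resolveCopySizes(byteSize: number, customBufSize?: number):
    {allocSize: number, copySize: number} {
  const allocSize = customBufSize != null ? customBufSize : byteSize;
  return {allocSize, copySize: byteSize};
}
```

With these two values, a larger user-requested buffer never inflates the amount of data copied out of the source.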
Please also fix the bot failures. Thanks.
@@ -17,7 +17,7 @@

import './flags_webgpu';

- import {backend_util, buffer, DataStorage, DataType, DataValues, engine, env, KernelBackend, Rank, RecursiveArray, ShapeMap, TensorBuffer, TensorInfo, TimingInfo, TypedArray, util} from '@tensorflow/tfjs-core';
+ import {backend_util, buffer, DataStorage, DataToGPUWebGLOption, DataType, DataValues, engine, env, GPUData, KernelBackend, Rank, RecursiveArray, ShapeMap, TensorBuffer, TensorInfo, TimingInfo, TypedArray, util} from '@tensorflow/tfjs-core';
Should we use DataToGPUOptions instead of DataToGPUWebGLOption for the webgpu backend?
this.submitQueue();

const tensorInfo = this.makeTensorInfo(
    [bufferSize / webgpu_util.GPUBytesPerElement(dtype)], dtype);
To keep consistent with WebGL, should we use the src tensor's shape here? And maybe add a test to verify the tensorRef's shape and dtype?
I still think there is a potential issue if we use the buffer size to calculate the tensor shape. Currently it may be safe, since the buffer size is exactly equal to the shape's size. However, what if we use a larger buffer to store the data, for example, clamping all buffers to be aligned with 16 bytes for optimization in the future?
Can we directly use the source tensor's shape here? I know we currently don't keep the shape in TensorBufferInfo. Maybe we should add it?
Another example is webgl. WebGL uses RGBA 4-channel textures to store data, but the actual result tensor shape's size may not be divisible by 16 bytes. So the original tensor shape is useful.
@xhcao @lina128 What's your opinion on this?
There exists an issue: we cannot compute the element count from the source buffer size, because the buffer size may be larger than the elements' total size. For example, if the dtype is 2 bytes and there are 5 elements, the real buffer size is 12, because the buffer size must be a multiple of 4 when creating the buffer. If we use the buffer size to compute the element count, the result is 6, which is wrong. So we should add the shape for the webgpu backend.
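The failure mode described in that comment is a few lines of arithmetic, sketched here under the assumption (as stated above) that buffer sizes are rounded up to a multiple of 4 bytes; `alignedBufferSize` is an illustrative name, not a tfjs function:

```typescript
// WebGPU buffer sizes are rounded up to a multiple of 4 bytes at creation.
function alignedBufferSize(numElements: number, bytesPerElement: number):
    number {
  const raw = numElements * bytesPerElement;
  return Math.ceil(raw / 4) * 4;
}

// The example from the comment: 5 elements of a 2-byte dtype.
const bufSize = alignedBufferSize(5, 2);  // 10 bytes, rounded up to 12
const naiveCount = bufSize / 2;           // 12 / 2 = 6, not 5 -- wrong!
// The true element count (5) cannot be recovered from bufSize alone,
// which is why the shape must be stored alongside the buffer.
```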
Please review it again, thank you.
Great work, thank you!
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @axinging, @qjia7, @webgpu, and @xhcao)
tfjs-backend-webgpu/src/backend_webgpu.ts
line 517 at r1 (raw file):
Previously, xhcao wrote…
Hi, Xing, Sorry that I did not fully understand your meaning. But dataToGPU is used when the data of tensor must be in GPU.
Yeah, Jiajia asked me this question too for WebGL. This is the same behavior as WebGL implementation today. My thinking is that user should catch the error and read data from CPU, whether they want to upload or manipulate data directly in CPU. If data is on CPU, most likely it's not a lot of data, so the use case may actually be computation on CPU. But we don't know what the real use case is, so let's keep the design simple for now. If there's actual user request in the future, we can add the support. What do you think?
Reminder: Please review it again, thank you.
LGTM with two nits. Thanks.
info.bufferInfo.buffer = resBuffer;
// Explicitly update the buffer size so that the buffer can be
// released successfully later.
info.bufferInfo.byteSize = bufferSize;
Why do we need to explicitly change the buffer size, since the buffer is acquired with the same size?
They are not the same: one is computed from the shape, and the other is the real buffer size.
util.sizeFromShape(shape) * webgpu_util.GPUBytesPerElement(dtype);
if (options.customBufSize != null) {
  util.assert(
      options.customBufSize >= size,
Here I changed the comparison from the buffer size to the shape size, to allow copying a multiple of the original tensor content.
LGTM, thanks.
To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.