LayersModel#predict() results in all zeros when using WebGPU backend in Deno #6842
@haoyunfeix Please take a look, thanks.
@vicary Thanks for your effort on denoland/deno#15853 and for reporting this bug! Could you please show me the steps to build deno from source at commit 2929ec9f, if possible? I would like to use a local build of deno to reproduce this issue your way. @qjia7 Alternatively, I can reproduce this issue with a locally built webgpu backend by skipping the feature check, like:
Here is the code:

```ts
import * as tf from 'https://cdn.skypack.dev/@tensorflow/tfjs'
import '../dist/bin/tfjs-core/tfjs-core_pkg/dist/tf-core.es2017.js'
import '../dist/bin/tfjs-backend-webgpu/tfjs-backend-webgpu_pkg/dist/tf-backend-webgpu.es2017.js'

async function test(backend) {
  // initialize tensorflow
  if (await tf.setBackend(backend)) {
    await tf.ready()
    const model = tf.sequential();
    model.add(tf.layers.dense({ units: 1, inputShape: [1] }));
    const output = await model.predict(tf.tensor([1])).array();
    console.log(`The output of ${backend} is ${output}`); // prints [[0]]
  } else {
    console.log(`${backend} is not set successfully!`);
  }
}

await test('webgpu');
await test('cpu');
```

Command and output:

```
wp >>> ~/.deno/bin/deno run --allow-write --allow-read --allow-net --unstable mod3.ts
libEGL warning: pci id for fd 12: 102b:0522, driver (null)
WARNING: lavapipe is not a conformant vulkan implementation, testing use only.
The output of webgpu is 0
The output of cpu is -0.8412973284721375
```

deno version:

```
wp >>> ~/.deno/bin/deno --version
deno 1.25.0 (release, x86_64-unknown-linux-gnu)
v8 10.6.194.5
typescript 4.7.4
```
I followed the steps in their docs: https://deno.land/manual@v1.25.3/contributing/building_from_source
You may use
Gently pinging @haoyunfeix, is there anything I can do to help move this forward?
@vicary Sorry for the delay! Seems there are some shader validation differences between browser and deno.

```ts
import '../dist/bin/tfjs-core/tfjs-core_pkg/dist/tf-core.es2017.js'
import '../dist/bin/tfjs-backend-webgpu/tfjs-backend-webgpu_pkg/dist/tf-backend-webgpu.es2017.js'
import * as tf from 'https://cdn.skypack.dev/@tensorflow/tfjs'

async function test(backend) {
  // initialize tensorflow
  if (await tf.setBackend(backend)) {
    await tf.ready()
    const a = tf.tensor2d([1, 2, -3, -4], [2, 2]);
    const b = tf.tensor1d([1, 2]);
    tf.env().set('WEBGPU_CPU_FORWARD', false);
    let c = tf.add(a, b);
    console.log(await a.data());
    console.log(await b.data());
    console.log(await c.data());
    console.log(tf.getBackend());
  } else {
    console.log(`${backend} is not set successfully!`);
  }
}

await test('webgpu');
await test('cpu');
```

Errors happen at https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-webgpu/src/webgpu_program.ts#L59 @qjia7 WDYT? Is it an issue caused by a different version of WGSL?
@haoyunfeix @vicary I think deno is relying on an old WebGPU (WGSL) snapshot, while tfjs-webgpu follows the latest WebGPU/WGSL spec. So to resolve this issue, deno needs to upgrade its underlying WebGPU implementation on your side.
@vicary Please let us know whether this issue can be fixed after upgrading webgpu in deno.
@qjia7 Deno is using a Rust implementation of WebGPU.
The Shader Support section in the monorepo also mentions partial support of the draft.
Sorry for my lack of knowledge of the WebGPU standard; it would be very kind of you to point out the earliest compatible version or the specific features needed. I believe this would create a legit use case for them to work on those specific implementations.
@vicary From the error message provided by @haoyunfeix, it's due to
Thanks for the reply @qjia7, and let me also thank @haoyunfeix ahead of time for creating the issue upstream. From the first 2 errors I get that it's about the missing
I am not sure I understand the last error though; maybe the Rust implementation requires the
@qjia7 Since wgpu relies on naga for wgsl-in compilation, I submitted |
Sure. My temporary fix as below, inlining `main` into `_start`. Before:

```wgsl
fn _start(@builtin(local_invocation_id) LocalId : vec3<u32>,
          @builtin(global_invocation_id) GlobalId : vec3<u32>,
          @builtin(num_workgroups) NumWorkgroups : vec3<u32>) {
  localId = LocalId;
  globalId = GlobalId;
  numWorkgroups = NumWorkgroups;
  main(getGlobalIndex());
}

fn main(index : i32)
{
  // Fill in the shared memory buffer.
  let localIndex = i32(localId.x);
  if (localIndex < 2) {
    sharedBuf[localIndex] = f32(B[localIndex]);
  }
  workgroupBarrier();
  if (index < uniforms.size) {
    let coords = getCoordsFromIndex(index);
    let a = getAByOutputIndex(index);
    let b = sharedBuf[coords[1]];
    setOutputAtIndex(index, binaryOperation(a, b));
  }
}
```

After:

```wgsl
fn _start(@builtin(local_invocation_id) LocalId : vec3<u32>,
          @builtin(global_invocation_id) GlobalId : vec3<u32>,
          @builtin(num_workgroups) NumWorkgroups : vec3<u32>) {
  localId = LocalId;
  globalId = GlobalId;
  numWorkgroups = NumWorkgroups;
  var index = getGlobalIndex();
  let localIndex = i32(localId.x);
  if (localIndex < 2) {
    sharedBuf[localIndex] = f32(B[localIndex]);
  }
  workgroupBarrier();
  if (index < uniforms.size) {
    let coords = getCoordsFromIndex(index);
    let a = getAByOutputIndex(index);
    let b = sharedBuf[coords[1]];
    setOutputAtIndex(index, binaryOperation(a, b));
  }
}
```

You could see that the other
Oh, seems the function must be declared before the entry point AND before ...

```wgsl
var<workgroup> sharedBuf : array<f32, 2>;

fn main(index : i32) {
  // Fill in the shared memory buffer.
  let localIndex = i32(localId.x);
  if (localIndex < 2) {
    sharedBuf[localIndex] = f32(B[localIndex]);
  }
  workgroupBarrier();
  if (index < uniforms.size) {
    let coords = getCoordsFromIndex(index);
    let a = getAByOutputIndex(index);
    let b = sharedBuf[coords[1]];
    setOutputAtIndex(index, binaryOperation(a, b));
  }
}

@compute @workgroup_size(256, 1, 1)
//@compute @workgroup_size(workGroupSizeX, workGroupSizeY, workGroupSizeZ)
fn _start(@builtin(local_invocation_id) LocalId : vec3<u32>,
          @builtin(global_invocation_id) GlobalId : vec3<u32>,
          @builtin(num_workgroups) NumWorkgroups : vec3<u32>) {
  localId = LocalId;
  globalId = GlobalId;
  numWorkgroups = NumWorkgroups;
  main(getGlobalIndex());
}
```
Maybe this is a bug in Mozilla's Naga? We call `_start` (the entry point) after the declarations of `main` and `_start`, which should be fine; at least Tint also thinks it's OK. @haoyunfeix or @vicary, could you please also file a bug to them?
Issue created. I don't have time to dig into naga yet. If it turns out lexical scoping is the root cause, it may result in a breaking change on their side, and we may not see it happen as soon as
Interpreters with function scoping should be compatible with lexical scoping, moving
What do you think? @gyagp @haoyunfeix
[webgpu] Update shader to support non module-level scoping function (#6918)

Fixes tensorflow#6842. To support shader translation libraries which do not implement module scoping, like naga:
* Unify kernels to use main() to generate the user function and getStartHeaderString() to make the entry point function
* Use isFlatPatchLayout to determine the main header, and address comments
* Remove unnecessary scope checking
@vicary Done with #6918. For shader compiler issues 1 and 2 mentioned above, we intend not to fix them in TFJS but to track them on the naga project (gfx-rs/naga#2071 and gfx-rs/naga#2080). BTW, I did try to fix 1 and 2 in TFJS (https://github.com/tensorflow/tfjs/compare/master...haoyunfeix:tfjs:test_6842?expand=1) and am glad to see WebGPU on deno get the same result as the CPU backend. I posted all resources (updated webgpu build and test code) here in case you are interested.
System information
Describe the current behavior
I came from #6746 and made denoland/deno#15853 to make deno compatible with the WebGPU backend. Model predictions sometimes result in an all-zeros tensor; the problem does not exist with the CPU backend.
Describe the expected behavior
Output tensors should contain non-zero numbers.
Standalone code to reproduce the issue
Other info / logs
I am on an Apple M1 laptop.
Backstory and tracking issues: `deno_webgpu` from `gfx-rs/wgpu`, which in turn uses `gfx-rs/naga` for all shader related things.