Error Type: oneflow.ErrorProto.check_failed_error #527
I see the code has been updated. I tested again, but it failed again with the same error:
ERROR [2024-01-24 08:36:01] - Exception in __call__: Check failed: (45 == 46)
  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc
    physical_tensor_desc_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc
    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))
Error Type: oneflow.ErrorProto.check_failed_error
This problem hasn't been resolved yet. Stay tuned.
OK
@onefish51 Please give it a try.
Well done! I just tested it, and indeed the issue has been fixed! You guys are so efficient! You're truly amazing! 💯 👍
This bug seems to still exist when compiling the VAE decoder.
The non-standard-resolution test passes with the compiled VAE decoder; see the example at onediff/examples/text_to_image_sdxl.py, line 69, commit 4ecfe71.
Can you provide the test environment and scripts? |
My test passed. Here is the script I used:

# Compile unet with oneflow
if args.compile_unet:
    print("Compiling unet with oneflow.")
    base.unet = oneflow_compile(base.unet)
    base.vae.decoder = oneflow_compile(base.vae.decoder)

# @cost_cnt
def run_once(args, h, w, i, output_type):
    start = time.time()
    print(f"====> run {i}-th h {h} w {w}")
    image = base(
        prompt=args.prompt,
        height=h,
        width=w,
        num_inference_steps=args.n_steps,
        output_type=output_type,
    ).images
    # torch.cuda.empty_cache()
    end = time.time()
    print(f"infer one cost {end - start}")

# Normal SDXL run
torch.manual_seed(args.seed)

sizes = [1280, 1200, 1120, 960, 840, 800, 720]
# sizes = [1280, 960, 720]
# sizes = [1120, 896, 768]

# @cost_cnt
def test_49_graph():
    for h in sizes:
        for w in sizes:
            for i in range(2):
                run_once(args, h, w, i, output_type)
                # empty torch mem cache
                # torch.cuda.empty_cache()

test_49_graph()
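For reference, the nested loops above sweep every (h, w) combination; assuming each distinct static shape triggers its own graph compilation (a guess suggested by the function name test_49_graph, not confirmed in this thread), the count works out as:

```python
from itertools import product

# The sizes swept by the test script above.
sizes = [1280, 1200, 1120, 960, 840, 800, 720]

# One (h, w) pair per compiled graph under the static-shape assumption:
# 7 sizes x 7 sizes = 49 pairs.
pairs = list(product(sizes, repeat=2))
print(len(pairs))  # 49
```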
However, when testing the support for arbitrary sizes in DeepCache, I encountered the following issue:

====> run 1-th h 720 w 960
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:01<00:00, 25.94it/s]
infer one cost 1.4307401180267334
====> run 0-th h 720 w 720
0%| | 0/30 [00:00<?, ?it/s][ERROR](GRAPH:OneflowGraph_4:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> Check failed: (45 == 46)
File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob
RunNormalOp(launch_context, launch_op, inputs)
File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp
it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
[&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
Infer(*user_op_expr, infer_args)
File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc
physical_tensor_desc_infer_fn_(&infer_ctx)
File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc
CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))
Error Type: oneflow.ErrorProto.check_failed_error
ERROR [2024-01-30 10:30:46] - Exception in __call__: Check failed: (45 == 46) (same traceback as above)

Is it because DeepCache only supports sizes that are multiples of 32?
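One plausible explanation for the 45-vs-46 mismatch (an assumption on my part, not confirmed by the maintainers): 720 px becomes a 90-px latent after the VAE's 8x downsampling; a U-Net-style encoder halves 90 to 45 and then to 23 (strided convolution rounds up), and the decoder doubles 23 back to 46, which the skip-connection concat then compares against the encoder's 45. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
import math

def concat_mismatches(latent_size: int, n_levels: int):
    """Trace encoder sizes down n_levels (halving rounds up, as a
    stride-2 conv does), then compare each skip-connection size with
    the decoder's doubled size at the same level."""
    sizes = [latent_size]
    for _ in range(n_levels):
        sizes.append(math.ceil(sizes[-1] / 2))
    # Decoder output at level i is 2 * encoder size at level i + 1;
    # any inequality here is exactly what the concat op's
    # CHECK_EQ_OR_RETURN would reject.
    return [
        (sizes[i], 2 * sizes[i + 1])
        for i in range(n_levels)
        if sizes[i] != 2 * sizes[i + 1]
    ]

# 720 px -> 90 latent (VAE downsamples by 8): the 45-vs-46 pair appears.
print(concat_mismatches(90, 2))   # [(45, 46)]
# 768 px -> 96 latent: every level divides evenly, so no mismatch.
print(concat_mismatches(96, 2))   # []
```

Under this arithmetic, pixel sizes whose latent (size / 8) stays even through every halving avoid the mismatch, which would be consistent with a multiples-of-32 (or larger power-of-two multiple) requirement.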
We will also fix this issue in the extension of DeepCache later on. |
OK |
Sorry, my mistake.
When I retested the support for arbitrary sizes with the latest onediff (commit 9c7cda2f274629d1e76c42f010ef4be134fde516), I found that the latest code does not support it. Successfully installed onediff-0.12.1.dev1. Test log:

Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.31it/s]
Compiling with oneflow.
====> run 0-th h 896 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:51<00:00, 25.97s/it]
infer one cost 54.91251730918884
====> run 1-th h 896 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 8.85it/s]
infer one cost 0.5545377731323242
====> run 0-th h 896 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00, 5.80s/it]
infer one cost 12.783660411834717
====> run 1-th h 896 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 9.94it/s]
infer one cost 0.4895670413970947
====> run 0-th h 768 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00, 5.87s/it]
infer one cost 12.91884446144104
====> run 1-th h 768 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 10.02it/s]
infer one cost 0.4890902042388916
====> run 0-th h 768 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:10<00:00, 5.50s/it]
infer one cost 12.05246877670288
====> run 1-th h 768 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 11.22it/s]
infer one cost 0.4315967559814453
This has been fixed in the latest main; please update.
yes |
Describe the bug
when I run the script
https://github.com/siliconflow/onediff/blob/main/examples/text_to_image_sdxl.py
and change the input size from [896, 768] to [960, 720], an error occurs.
OneDiff git commit id
onediff-0.12.1.dev0
OneFlow version info
Run python -m oneflow --doctor and paste it here.
How To Reproduce
change the input size from [896, 768] to [960, 720]
run
python examples/text_to_image_sdxl.py
The complete error message