
Error Type: oneflow.ErrorProto.check_failed_error #527

Closed · onefish51 opened this issue Jan 17, 2024 · 15 comments
Labels: Request-bug (Something isn't working), sig-hfdiffusers

@onefish51 (Author)

Describe the bug

When I run the script https://github.com/siliconflow/onediff/blob/main/examples/text_to_image_sdxl.py and change the input size from [896, 768] to [960, 720], an error occurs.

OneDiff git commit id

onediff-0.12.1.dev0

OneFlow version info

Output of python -m oneflow --doctor:

libibverbs not available, ibv_fork_init skipped
Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
path: ['/opt/conda/lib/python3.10/site-packages/oneflow']
version: 0.9.1.dev20240115+cu121
git_commit: d6809f6
cmake_build_type: Release
rdma: True
mlir: True
enterprise: False

How To Reproduce

1. Change the input size from [896, 768] to [960, 720].
2. Run python examples/text_to_image_sdxl.py.
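
For illustration, the change amounts to something like the following inside the example script (a sketch, not the verbatim script; names like base, args.prompt, and args.n_steps follow the script's usage elsewhere in this thread, and the exact plumbing may differ):

# Sketch of the reproducing change (argument names assumed):
image = base(
    prompt=args.prompt,
    height=960,   # was 896
    width=720,    # was 768
    num_inference_steps=args.n_steps,
).images[0]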

The complete error message

python text_to_image_sdxl.py 
libibverbs not available, ibv_fork_init skipped
Ignoring PCI device with non-16bit domain.
Pass --enable-32bits-pci-domain to configure to support such devices
(warning: it would break the library ABI, don't enable unless really needed).
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00,  9.64it/s]
Compiling unet with oneflow.
Compiling vae with oneflow.
Warmup with running graphs...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:56<00:00,  1.87s/it]
W20240117 07:49:12.241235   973 cudnn_conv_util.cpp:105] Currently available alogrithm (algo=0, require memory=0, idx=1) meeting requirments (max_workspace_size=2147483648, determinism=0) is not fastest. Fastest algorithm (1) requires memory 2149842960
Warmup cost 59.32852792739868
Normal SDXL run...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:04<00:00,  7.45it/s]
Normal cost 4.5491204261779785
Warmup run with multiple resolutions...
====> run h 960 w 960
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:03<00:00,  7.88it/s]
infer one cost 4.2422120571136475
====> run h 960 w 720
  0%|                                                                                                                                                                          | 0/30 [00:00<?, ?it/s][ERROR](GRAPH:OneflowGraph_0:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> Check failed: (45 == 46) 
  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc
    physical_tensor_desc_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc
    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))
Error Type: oneflow.ErrorProto.check_failed_error
ERROR [2024-01-17 07:49:22] - Exception in __call__: e=Exception('Check failed: (45 == 46) \n  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob\n    RunNormalOp(launch_context, launch_op, inputs)\n  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp\n    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret\n    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()\n    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer\n    Infer(*user_op_expr, infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer\n    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })\n  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc\n    physical_tensor_desc_infer_fn_(&infer_ctx)\n  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc\n    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))\nError Type: oneflow.ErrorProto.check_failed_error')
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:43<00:00,  1.45s/it]
infer one cost 43.735071659088135

....
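For context, one plausible reading of the failed check, assuming the standard SDXL factors (the VAE downsamples by 8 and the UNet then downsamples twice by 2): a 720-pixel side gives a 90-pixel latent; the stride-2 levels produce 45 and then ceil(45/2) = 23; naively upsampling 23 by 2 yields 46, which cannot be concatenated with the 45-wide skip tensor, matching Check failed: (45 == 46) in concat_op.cpp. (diffusers' eager UNet sidesteps this by passing the skip tensor's size to its upsamplers; a compiled graph that baked in shapes from another resolution would not.)

# Back-of-the-envelope check of the 45 vs. 46 mismatch for a 720 px side
# (assumed factors: VAE /8, two stride-2 UNet downsamples):
import math

side = 720
latent = side // 8                   # 90
skip = math.ceil(latent / 2)         # 45 -- width of the skip tensor
deepest = math.ceil(skip / 2)        # 23
upsampled = deepest * 2              # 46 -- cannot concat with the 45-wide skip
print(latent, skip, deepest, upsampled)  # 90 45 23 46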
@onefish51 (Author)

I saw the code was updated! I tested again, but it failed again!

Error Type: oneflow.ErrorProto.check_failed_error
ERROR [2024-01-24 08:36:01] - Exception in __call__: e=Exception('Check failed: (45 == 46) \n  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob\n    RunNormalOp(launch_context, launch_op, inputs)\n  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp\n    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret\n    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()\n    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer\n    Infer(*user_op_expr, infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer\n    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })\n  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc\n    physical_tensor_desc_infer_fn_(&infer_ctx)\n  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc\n    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))\nError Type: oneflow.ErrorProto.check_failed_error')

@strint (Collaborator) commented Jan 24, 2024

> I saw the code was updated! I tested again, but it failed again!

This problem hasn't been resolved yet. Stay tuned.

@onefish51 (Author)

OK

@strint (Collaborator) commented Jan 28, 2024

@onefish51 Please give it a try.

clackhan added a commit that referenced this issue on Jan 29, 2024: "fix multi resolutions bug." (#527)
@onefish51 (Author)

Well done! I just tested it, and indeed the issue has been fixed! You guys are so efficient! You're truly amazing! 💯 👍

@zhangvia

This bug seems to still exist when compiling the VAE decoder.

@lixiang007666 (Contributor)

> This bug seems to still exist when compiling the VAE decoder.

We have passed the uncommon-resolution test with the compiled VAE decoder, for example:

base.vae.decoder = oneflow_compile(base.vae.decoder)

Can you provide the test environment and scripts?

@onefish51 (Author)

I tested it, and it works!

# Compile unet and the vae decoder with oneflow
if args.compile_unet:
    print("Compiling unet with oneflow.")
    base.unet = oneflow_compile(base.unet)
    base.vae.decoder = oneflow_compile(base.vae.decoder)

# @cost_cnt
def run_once(args, h, w, i, output_type):
    start = time.time()
    print(f"====> run {i}-th h {h} w {w}")
    image = base(
        prompt=args.prompt,
        height=h,
        width=w,
        num_inference_steps=args.n_steps,
        output_type=output_type,
    ).images
    # torch.cuda.empty_cache()
    end = time.time()
    print(f"infer one cost {end - start}")

# Normal SDXL run
torch.manual_seed(args.seed)
sizes = [1280, 1200, 1120, 960, 840, 800, 720]
# sizes = [1280, 960, 720]
# sizes = [1120, 896, 768]

# @cost_cnt
def test_49_graph():
    # 7 heights x 7 widths = 49 resolution combinations, each run twice
    for h in sizes:
        for w in sizes:
            for i in range(2):
                run_once(args, h, w, i, output_type)
    # empty torch mem cache
    # torch.cuda.empty_cache()

test_49_graph()

[screenshot of the test results]

@onefish51 (Author)

However, when testing the support for arbitrary sizes in DeepCache, I encountered the following issues:

====> run 1-th h 720 w 960
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:01<00:00, 25.94it/s]
infer one cost 1.4307401180267334
====> run 0-th h 720 w 720
  0%|                                                                                                                                                            | 0/30 [00:00<?, ?it/s][ERROR](GRAPH:OneflowGraph_4:OneflowGraph) run got error: <class 'oneflow._oneflow_internal.exception.Exception'> Check failed: (45 == 46) 
  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob
    RunNormalOp(launch_context, launch_op, inputs)
  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp
    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret
    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()
  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()
    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer
    Infer(*user_op_expr, infer_args)
  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer
    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })
  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc
    physical_tensor_desc_infer_fn_(&infer_ctx)
  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc
    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))
Error Type: oneflow.ErrorProto.check_failed_error
ERROR [2024-01-30 10:30:46] - Exception in __call__: e=Exception('Check failed: (45 == 46) \n  File "oneflow/core/job/job_interpreter.cpp", line 307, in InterpretJob\n    RunNormalOp(launch_context, launch_op, inputs)\n  File "oneflow/core/job/job_interpreter.cpp", line 219, in RunNormalOp\n    it.Apply(*op, inputs, &outputs, OpExprInterpContext(empty_attr_map, JUST(launch_op.device)))\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 84, in NaiveInterpret\n    [&]() -> Maybe<const LocalTensorInferResult> { LocalTensorMetaInferArgs ... mut_local_tensor_infer_cache()->GetOrInfer(infer_args)); }()\n  File "oneflow/core/framework/op_interpreter/eager_local_op_interpreter.cpp", line 87, in operator()\n    user_op_expr.mut_local_tensor_infer_cache()->GetOrInfer(infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 210, in GetOrInfer\n    Infer(*user_op_expr, infer_args)\n  File "oneflow/core/framework/local_tensor_infer_cache.cpp", line 178, in Infer\n    user_op_expr.InferPhysicalTensorDesc( infer_args.attrs ... ) -> TensorMeta* { return &output_mut_metas.at(i); })\n  File "oneflow/core/framework/op_expr.cpp", line 602, in InferPhysicalTensorDesc\n    physical_tensor_desc_infer_fn_(&infer_ctx)\n  File "oneflow/user/ops/concat_op.cpp", line 55, in InferLogicalTensorDesc\n    CHECK_EQ_OR_RETURN(in_desc.shape().At(i), out_dim_vec.at(i))\nError Type: oneflow.ErrorProto.check_failed_error')

Is it because DeepCache only supports sizes that are multiples of 32?
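
Under that reading, a side length divides exactly through both stride-2 levels only if it is divisible by 8 × 2² = 32, which would explain the question above; a quick illustrative check over the tested sizes (the actual behavior also depends on how compiled graphs are cached per shape):

# Illustrative only: which tested sizes are multiples of 32?
sizes = [1280, 1200, 1120, 960, 840, 800, 720]
print({s: s % 32 == 0 for s in sizes})
# {1280: True, 1200: False, 1120: True, 960: True, 840: False, 800: True, 720: False}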

@lixiang007666 (Contributor) commented Jan 30, 2024

> However, when testing the support for arbitrary sizes in DeepCache, I encountered the following issues: [error log quoted above]
>
> Is it because DeepCache only supports sizes that are multiples of 32?

We will fix this issue in the DeepCache extension later on.

@onefish51 (Author)

OK

@zhangvia

> We have passed the uncommon-resolution test with the compiled VAE decoder. Can you provide the test environment and scripts?

Sorry, my mistake.

@onefish51 (Author) commented Jan 31, 2024

When I retested support for arbitrary sizes with the latest onediff update (commit 9c7cda2f274629d1e76c42f010ef4be134fde516, Successfully installed onediff-0.12.1.dev1), I found that the latest code does not support it.

Test log:

Loading pipeline components...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00,  9.31it/s]
Compiling with oneflow.
====> run 0-th h 896 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:51<00:00, 25.97s/it]
infer one cost 54.91251730918884
====> run 1-th h 896 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  8.85it/s]
infer one cost 0.5545377731323242
====> run 0-th h 896 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00,  5.80s/it]
infer one cost 12.783660411834717
====> run 1-th h 896 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  9.94it/s]
infer one cost 0.4895670413970947
====> run 0-th h 768 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00,  5.87s/it]
infer one cost 12.91884446144104
====> run 1-th h 768 w 896
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 10.02it/s]
infer one cost 0.4890902042388916
====> run 0-th h 768 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:10<00:00,  5.50s/it]
infer one cost 12.05246877670288
====> run 1-th h 768 w 768
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 11.22it/s]
infer one cost 0.4315967559814453
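
(In the log above, the 0-th run at each new resolution includes graph compilation and warmup, roughly 12-55 s, while the repeated 1-th run shows steady-state speed of about 0.5 s.)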

@chengzeyi (Contributor)

> However, when testing the support for arbitrary sizes in DeepCache, I encountered the following issues: [error log quoted above]
>
> Is it because DeepCache only supports sizes that are multiples of 32?
>
> We will fix this issue in the DeepCache extension later on.

This has been fixed in the latest main; please update.
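
(For instance, a source checkout can be updated with git pull followed by pip install -e ., or the latest main installed directly with pip install git+https://github.com/siliconflow/onediff.git; both are illustrative, use whichever matches your setup.)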

@onefish51 (Author)

yes
