Commit
Fix an issue where the CUDA EP falls back too many nodes to CPU in some cases, causing huge data copies. If a node's inputs are all initializers, we shouldn't fall back the node to CPU. (#1727)

Fix an issue where the CUDA EP falls back too many nodes to CPU in some cases, causing huge data copies. #1675

Currently, if all of a node's inputs are initializers, the CUDA EP falls that node back to CPU, and it also falls back some of the nodes below it. This can cause huge data copies. In the case reported by a user, the model has several Slice nodes whose inputs come from initializers, and a Concat op that concatenates the Slice outputs. The data after the Concat is huge (16MB), which makes the copy from CPU to GPU quite costly because it is a synchronous copy.

Fix: If a node's inputs are all initializers, we shouldn't fall back the node to CPU.
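The effect described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the CUDA EP's actual code: the `Node` class, the `pick_cpu_fallback` function, the `keep_initializer_only_nodes_on_gpu` flag, and the rule "fall back a node when all of its inputs live on CPU" are hypothetical simplifications chosen to mirror the Slice/Concat case in the report.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical, minimal stand-in for a graph node."""
    name: str
    op_type: str
    inputs: list
    outputs: list


def pick_cpu_fallback(nodes, initializers, keep_initializer_only_nodes_on_gpu):
    """Return names of nodes that this toy heuristic would fall back to CPU.

    Assumptions (not taken from the real implementation):
    - `nodes` is in topological order.
    - Initializers count as CPU-resident values.
    - A node is fallen back when all of its inputs are CPU-resident,
      and its outputs then become CPU-resident as well.
    """
    cpu_values = set(initializers)
    fallback = []
    for node in nodes:
        has_inputs = bool(node.inputs)
        only_initializers = has_inputs and all(i in initializers for i in node.inputs)
        if keep_initializer_only_nodes_on_gpu and only_initializers:
            # Behaviour after the fix: leave the node on the GPU.
            continue
        if has_inputs and all(i in cpu_values for i in node.inputs):
            fallback.append(node.name)
            cpu_values.update(node.outputs)  # the fallback propagates downstream
    return fallback


# Toy version of the reported case: Slices fed by initializers, then a Concat.
initializers = {"w_a", "w_b"}
graph = [
    Node("slice_a", "Slice", ["w_a"], ["s_a"]),
    Node("slice_b", "Slice", ["w_b"], ["s_b"]),
    Node("concat", "Concat", ["s_a", "s_b"], ["y"]),
]

before_fix = pick_cpu_fallback(graph, initializers, False)
after_fix = pick_cpu_fallback(graph, initializers, True)
print(before_fix)  # ['slice_a', 'slice_b', 'concat']
print(after_fix)   # []
```

In this toy model, the old rule drags the Concat to CPU along with the Slices, so its large output would have to be copied back to the GPU. With the new rule, all three nodes stay on the GPU.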