
errors in installing cutlass #4

Closed
miracleagi opened this issue Jul 12, 2022 · 3 comments


@miracleagi

Step 3 of the cutlass guide does not resolve the errors below. Is there any way to install cutlass correctly?
FAILED: /home/xx/cutlass/examples/19_large_depthwise_conv2d_torch_extension/build/temp.linux-x86_64-3.8/backward_data_fp32.o
/usr/local/cuda-10.2/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/xx/cutlass/examples/19_large_depthwise_conv2d_torch_extension/build/temp.linux-x86_64-3.8/backward_data_fp32.o.d -I. -I/home/xx/cutlass/include -I/home/xx/cutlass/tools/library/include -I/home/xx/cutlass/tools/util/include -I/home/xx/cutlass/examples/common -I/root/miniconda3/lib/python3.8/site-packages/torch/include -I/root/miniconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/root/miniconda3/lib/python3.8/site-packages/torch/include/TH -I/root/miniconda3/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda-10.2/include -I/root/miniconda3/include/python3.8 -c -c /home/xx/cutlass/examples/19_large_depthwise_conv2d_torch_extension/backward_data_fp32.cu -o /home/xx/cutlass/examples/19_large_depthwise_conv2d_torch_extension/build/temp.linux-x86_64-3.8/backward_data_fp32.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_depthwise_conv2d_implicit_gemm_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 -std=c++14
/home/zhengchuanchuan/cutlass/examples/19_large_depthwise_conv2d_torch_extension/backward_data_fp32.cu(212): error: more than one instance of constructor "cutlass::Tensor4DCoord::Tensor4DCoord" matches the argument list:
function "cutlass::Tensor4DCoord::Tensor4DCoord(cutlass::Tensor4DCoord::Index, cutlass::Tensor4DCoord::Index, cutlass::Tensor4DCoord::Index, cutlass::Tensor4DCoord::Index)"
function "cutlass::Tensor4DCoord::Tensor4DCoord(cutlass::Tensor4DCoord::LongIndex, cutlass::Tensor4DCoord::LongIndex, cutlass::Tensor4DCoord::LongIndex, cutlass::Tensor4DCoord::LongIndex)"
argument types are: (int64_t, int64_t, int64_t, int)

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

@Shiweiliuiiiiiii (Collaborator)

Can you try this to see if it helps?

@miracleagi (Author)

> Can you try this to see if it helps?

I tried that, but there is still one error.
Error info:
backward_data_fp16.cu(215): error: no instance of constructor "cutlass::conv::kernel::ImplicitBatchedGemmTnDepthwiseConvolution<Mma_, Epilogue_, ThreadblockSwizzle_, ConvOperator, ConvProblemSize_>::Arguments::Arguments [with Mma_=cutlass::conv::threadblock::MmaTnPrecompPipelined<ThreadblockShape, cutlass::conv::threadblock::Dwconv2dTileIterator<cutlass::MatrixShape<64, 32>, ElementSrc, cutlass::layout::TensorNCHW, cutlass::transform::PitchLinearWarpRakedThreadMap<cutlass::layout::PitchLinearShape<32, 64>, 256, cutlass::layout::PitchLinearShape<4, 8>, 8>, 1, 0>, cutlass::transform::threadblock::RegularTileIterator<cutlass::MatrixShape<64, 32>, ElementSrc, cutlass::layout::RowMajorVoltaTensorOpMultiplicandCrosswise<16, 32>, 0, cutlass::transform::PitchLinearWarpRakedThreadMap<cutlass::layout::PitchLinearShape<32, 64>, 256, cutlass::layout::PitchLinearShape<4, 8>, 8>, 16>, cutlass::conv::threadblock::Dwconv2dTileFilterIteratorDgradPrecomp<cutlass::MatrixShape<32, 128>, ElementFilter, cutlass::layout::TensorNCHW, cutlass::transform::PitchLinearWarpRakedThreadMap<cutlass::layout::PitchLinearShape<128, 32>, 256, cutlass::layout::PitchLinearShape<8, 4>, 8>, 1>, cutlass::transform::threadblock::RegularTileIterator<cutlass::MatrixShape<32, 128>, ElementFilter, cutlass::layout::RowMajorVoltaTensorOpMultiplicandBCongruous<16>, 0, cutlass::transform::PitchLinearWarpRakedThreadMap<cutlass::layout::PitchLinearShape<128, 32>, 256, cutlass::layout::PitchLinearShape<8, 4>, 8>, 16>, ElementAccumulator, LayoutDst, cutlass::gemm::threadblock::MmaPolicy<cutlass::gemm::warp::MmaVoltaTensorOp<WarpShape, ElementSrc, cutlass::layout::RowMajorVoltaTensorOpMultiplicandCrosswise<16, 32>, ElementFilter, cutlass::layout::RowMajorVoltaTensorOpMultiplicandBCongruous<16>, ElementAccumulator, cutlass::layout::RowMajor, cutlass::gemm::warp::MmaTensorOpPolicy<cutlass::arch::Mma<cutlass::gemm::GemmShape<16, 16, 4>, 32, ElementSrc, cutlass::layout::RowMajor, ElementFilter, cutlass::layout::RowMajor, 
ElementAccumulator, cutlass::layout::RowMajor, cutlass::arch::OpMultiplyAdd>, cutlass::MatrixShape<1, 1>>, __nv_bool>, cutlass::MatrixShape<0, 0>, cutlass::MatrixShape<0, 0>, 1>, cutlass::NumericArrayConverter<ElementSrc, ElementSrc, 8, cutlass::FloatRoundStyle::round_to_nearest>, cutlass::NumericArrayConverter<ElementFilter, ElementFilter, 16, cutlass::FloatRoundStyle::round_to_nearest>, _nv_bool>, Epilogue=cutlass::epilogue::threadblock::ConvolutionEpilogue<ThreadblockShape, cutlass::layout::TensorNCHW, 1, cutlass::gemm::warp::MmaVoltaTensorOp<WarpShape, ElementSrc, cutlass::layout::RowMajorVoltaTensorOpMultiplicandCrosswise<16, 32>, ElementFilter, cutlass::layout::RowMajorVoltaTensorOpMultiplicandBCongruous<16>, ElementAccumulator, cutlass::layout::RowMajor, cutlass::gemm::warp::MmaTensorOpPolicy<cutlass::arch::Mma<cutlass::gemm::GemmShape<16, 16, 4>, 32, ElementSrc, cutlass::layout::RowMajor, ElementFilter, cutlass::layout::RowMajor, ElementAccumulator, cutlass::layout::RowMajor, cutlass::arch::OpMultiplyAdd>, cutlass::MatrixShape<1, 1>>, nv_bool>, cutlass::epilogue::threadblock::Dwconv2dPredicatedTileIterator<cutlass::epilogue::threadblock::OutputTileOptimalThreadMap<cutlass::epilogue::threadblock::OutputTileShape<128, 4, 4, 2, 1>, cutlass::epilogue::threadblock::OutputTileShape<1, 2, 1, 1, 2>, 256, 1, 16>, cutlass::layout::TensorNCHW, ElementDst>, cutlass::epilogue::warp::FragmentIteratorVoltaTensorOp<WarpShape, cutlass::gemm::GemmShape<32, 32, 4>, ElementAccumulator, cutlass::layout::RowMajor>, cutlass::epilogue::warp::TileIteratorVoltaTensorOp<WarpShape, cutlass::gemm::GemmShape<32, 32, 4>, ElementAccumulator, cutlass::layout::RowMajor>, cutlass::epilogue::threadblock::SharedLoadIterator<cutlass::epilogue::threadblock::OutputTileOptimalThreadMap<cutlass::epilogue::threadblock::OutputTileShape<128, 4, 4, 2, 1>, cutlass::epilogue::threadblock::OutputTileShape<1, 2, 1, 1, 2>, 256, 1, 16>::CompactedThreadMap, ElementAccumulator, 4>, 
cutlass::epilogue::threadblock::Dwconv2dBiasTileIterator<cutlass::layout::TensorNCHW, ElementDst, 1>, EpilogueOp, cutlass::MatrixShape<0, 2>, false>, ThreadblockSwizzle=SwizzleThreadBlock, ConvOperator=cutlass::conv::Operator::kDgrad, ConvProblemSize=cutlass::conv::Conv2dProblemSize]" matches the argument list
argument types are: ({...}, cutlass::TensorRef<ElementSrc, LayoutSrc>, cutlass::TensorRef<ElementSrc, LayoutSrc>, long, long, cutlass::TensorRef<ElementSrc, LayoutSrc>, {...})

python: 3.7.13

@Shiweiliuiiiiiii (Collaborator)

Hi, please see my reply here and check whether you can install it successfully.
