[CMake 3/3] Split source files with Python dependency to separate library #1660

Merged: 21 commits, Apr 27, 2022

Nayef211 commented Mar 17, 2022

Reference Issue #1644

Description

Build Process

Building the torchtext project using setup.py

python setup.py develop

Building the extension manually using CMake

# from root of repo
cd build
rm -rf * # clear all existing artifacts
cmake \
    -DCMAKE_PREFIX_PATH=`python -c 'import torch;print(torch.utils.cmake_prefix_path)'` \
    -DRE2_BUILD_TESTING:BOOL=OFF \
    -DBUILD_TESTING:BOOL=OFF \
    -DSPM_ENABLE_SHARED=OFF  \
    ..
cmake --build . --config Release
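
For scripting the manual build, the two commands above can be assembled as argument lists. This is only a sketch: `cmake_commands` is a hypothetical helper, and the prefix path in the usage lines is a placeholder. In practice the prefix path would come from `torch.utils.cmake_prefix_path`, as in the shell command above.

```python
import shlex


def cmake_commands(prefix_path, source_dir="..", config="Release"):
    """Return (configure, build) argument lists mirroring the manual CMake build."""
    configure = [
        "cmake",
        f"-DCMAKE_PREFIX_PATH={prefix_path}",
        "-DRE2_BUILD_TESTING:BOOL=OFF",
        "-DBUILD_TESTING:BOOL=OFF",
        "-DSPM_ENABLE_SHARED=OFF",
        source_dir,
    ]
    build = ["cmake", "--build", ".", "--config", config]
    return configure, build


# Placeholder path for illustration only; substitute torch.utils.cmake_prefix_path.
configure, build = cmake_commands("/path/to/torch/share/cmake")
for cmd in (configure, build):
    # Printed rather than executed; pass each list to
    # subprocess.run(cmd, cwd="build", check=True) to run the build.
    print(shlex.join(cmd))
```

Keeping the arguments as lists avoids shell quoting problems when the prefix path contains spaces, which is common on Windows.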

Nayef211 changed the title from "[WIP] Add CMake Build to torchtext" to "[CMake 3/3] Split source files with Python dependency to separate library" on Apr 1, 2022

@mthrok mthrok left a comment

The destination directory torchtext/lib is not created by setuptools.
One trick I used in torchaudio is to add .gitignore in torchaudio/lib.

https://github.com/pytorch/audio/blob/main/torchaudio/lib/.gitignore

Doing the same should resolve the current failing unittest_linux jobs.
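
A minimal sketch of the placeholder-file trick, assuming the goal is only to keep the otherwise empty directory in the repository. `ensure_tracked_dir` is a hypothetical helper, and the ignore patterns are an assumption; the contents of torchaudio's actual file are not shown in this thread.

```python
from pathlib import Path


def ensure_tracked_dir(directory):
    """Create `directory` and add a .gitignore so Git keeps the empty folder."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    gitignore = directory / ".gitignore"
    # Assumed patterns: ignore everything the build copies into this
    # directory, but keep the .gitignore file itself under version control.
    gitignore.write_text("*\n!.gitignore\n")
    return gitignore
```

Calling `ensure_tracked_dir("torchtext/lib")` from the repository root and committing the resulting file would make the directory exist in a fresh checkout before the build copies the libraries into it.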

@Nayef211

@mthrok adding the .gitignore file resolved the linux errors. Have you seen these windows errors before?

https://app.circleci.com/pipelines/github/pytorch/text/5294/workflows/5cc7b90a-7568-47f5-85b9-5e7e7646b2b9/jobs/178184/parallel-runs/0/steps/0-103

mthrok commented Apr 20, 2022

> @mthrok adding the .gitignore file resolved the linux errors. Have you seen these windows errors before?
>
> https://app.circleci.com/pipelines/github/pytorch/text/5294/workflows/5cc7b90a-7568-47f5-85b9-5e7e7646b2b9/jobs/178184/parallel-runs/0/steps/0-103

According to the log, the linker cannot find the implementations of torchtext's custom kernels.

The failing link command looks exactly the same as the one from torchaudio's successful compilation [src].

This suggests that something about libtorchtext.lib is preventing the linker from locating the symbols.

_torchtext link command, followed by the _torchaudio link command:
[97/98]
cmd.exe
/C
"
cd .

&&

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\cmake\data\bin\cmake.exe
-E
vs_link_dll
--intdir=torchtext\csrc\CMakeFiles\_torchtext.dir
--rc=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\rc.exe
--mt=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\mt.exe
--manifests

--

C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe

torchtext\csrc\CMakeFiles\_torchtext.dir\register_pybindings.cpp.obj

/out:torchtext\csrc\_torchtext.pyd
/implib:torchtext\csrc\_torchtext.lib
/pdb:torchtext\csrc\_torchtext.pdb
/dll
/version:0.0
/machine:x64
/INCREMENTAL:NO

torchtext\csrc\libtorchtext.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_python.lib

C:\tools\miniconda3\envs\env3.10\libs\python310.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_cpu.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\c10.lib

third_party\re2\re2.lib

third_party\double-conversion\double-conversion.lib

third_party\sentencepiece\src\sentencepiece_train.lib

third_party\sentencepiece\src\sentencepiece.lib

kernel32.lib
user32.lib
gdi32.lib
winspool.lib
shell32.lib
ole32.lib
oleaut32.lib
uuid.lib
comdlg32.lib
advapi32.lib

&&
cd .
"
[22/23]
cmd.exe
/C
"cd
.
&&
C:\tools\miniconda3\envs\env3.10\Lib\site-packages\cmake\data\bin\cmake.exe
-E
vs_link_dll
--intdir=torchaudio\csrc\CMakeFiles\_torchaudio.dir
--rc=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\rc.exe
--mt=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\mt.exe
--manifests

--
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe

torchaudio\csrc\CMakeFiles\_torchaudio.dir\pybind\pybind.cpp.obj

/out:torchaudio\csrc\_torchaudio.pyd
/implib:torchaudio\csrc\_torchaudio.lib
/pdb:torchaudio\csrc\_torchaudio.pdb
/dll
/version:0.0
/machine:x64
/INCREMENTAL:NO

torchaudio\csrc\libtorchaudio.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_python.lib

C:\tools\miniconda3\envs\env3.10\libs\python310.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_cpu.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\c10.lib

kernel32.lib
user32.lib
gdi32.lib
winspool.lib
shell32.lib
ole32.lib
oleaut32.lib
uuid.lib
comdlg32.lib
advapi32.lib

&&
cd
."

mthrok commented Apr 20, 2022

The link commands for libtorchXXX also look identical.

libtorchtext link command, followed by the libtorchaudio link command:
[95/98]
cmd.exe
/C
"cd
.
&&
C:\tools\miniconda3\envs\env3.10\Lib\site-packages\cmake\data\bin\cmake.exe
-E
vs_link_dll
--intdir=torchtext\csrc\CMakeFiles\libtorchtext.dir
--rc=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\rc.exe
--mt=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\mt.exe
--manifests

--
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe

torchtext\csrc\CMakeFiles\libtorchtext.dir\clip_tokenizer.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\common.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\gpt2_bpe_tokenizer.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\regex.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\regex_tokenizer.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\register_torchbindings.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\sentencepiece.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\vectors.cpp.obj
torchtext\csrc\CMakeFiles\libtorchtext.dir\vocab.cpp.obj

/out:torchtext\csrc\libtorchtext.pyd
/implib:torchtext\csrc\libtorchtext.lib
/pdb:torchtext\csrc\libtorchtext.pdb
/dll
/version:0.0
/machine:x64
/INCREMENTAL:NO

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch.lib

third_party\re2\re2.lib

third_party\double-conversion\double-conversion.lib

third_party\sentencepiece\src\sentencepiece.lib

third_party\sentencepiece\src\sentencepiece_train.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_cpu.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\c10.lib

third_party\sentencepiece\src\sentencepiece.lib

kernel32.lib
user32.lib
gdi32.lib
winspool.lib
shell32.lib
ole32.lib
oleaut32.lib
uuid.lib
comdlg32.lib
advapi32.lib

&&
cd
."
[19/23]
cmd.exe
/C
"cd
.
&&
C:\tools\miniconda3\envs\env3.10\Lib\site-packages\cmake\data\bin\cmake.exe
-E
vs_link_dll
--intdir=torchaudio\csrc\CMakeFiles\libtorchaudio.dir
--rc=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\rc.exe
--mt=C:\PROGRA~2\WI3CF2~1\10\bin\100220~1.0\x64\mt.exe
--manifests

--
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\link.exe

torchaudio\csrc\CMakeFiles\libtorchaudio.dir\lfilter.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\overdrive.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\utils.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\cpu\compute_alphas.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\cpu\compute_betas.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\cpu\compute.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\compute_alphas.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\compute_betas.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\compute.cpp.obj
torchaudio\csrc\CMakeFiles\libtorchaudio.dir\rnnt\autograd.cpp.obj

/out:torchaudio\csrc\libtorchaudio.pyd
/implib:torchaudio\csrc\libtorchaudio.lib
/pdb:torchaudio\csrc\libtorchaudio.pdb
/dll
/version:0.0
/machine:x64
/INCREMENTAL:NO

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\torch_cpu.lib

C:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\lib\c10.lib

kernel32.lib
user32.lib
gdi32.lib
winspool.lib
shell32.lib
ole32.lib
oleaut32.lib
uuid.lib
comdlg32.lib
advapi32.lib

&&
cd
."

mthrok commented Apr 20, 2022

The commands for compiling the object files for libtorchXXX differ: torchtext uses -MT, whereas torchaudio uses -MD.

torchtext clip_tokenizer.cpp compile command, followed by the torchaudio lfilter.cpp compile command:
[85/98]
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe


/TP
-DUSE_C10D_GLOO
-DUSE_DISTRIBUTED
-Dlibtorchtext_EXPORTS
-IC:\Users\circleci\project
-IC:\Users\circleci\project\third_party\sentencepiece\src
-IC:\Users\circleci\project\third_party\re2
-IC:\Users\circleci\project\third_party\double-conversion
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include\torch\csrc\api\include
-external:W0
/DWIN32
/D_WINDOWS
/GR
/EHsc
/wd4819
-Wall

/O2
/Ob2
/DNDEBUG
-MT
/EHsc
/DNOMINMAX
/wd4267
/wd4251
/wd4522
/wd4838
/wd4305
/wd4244
/wd4190
/wd4101
/wd4996
/wd4275
/bigobj
-std:c++14
/showIncludes
/Fotorchtext\csrc\CMakeFiles\libtorchtext.dir\clip_tokenizer.cpp.obj
/Fdtorchtext\csrc\CMakeFiles\libtorchtext.dir\
/FS
-c
C:\Users\circleci\project\torchtext\csrc\clip_tokenizer.cpp
[17/23]
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe


/TP
-DUSE_C10D_GLOO
-DUSE_DISTRIBUTED
-Dlibtorchaudio_EXPORTS
-IC:\Users\circleci\project
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include\torch\csrc\api\include
-external:W0
/DWIN32
/D_WINDOWS
/GR
/EHsc
/wd4819
-Wall

/O2
/Ob2
/DNDEBUG
-MD
/EHsc
/DNOMINMAX
/wd4267
/wd4251
/wd4522
/wd4838
/wd4305
/wd4244
/wd4190
/wd4101
/wd4996
/wd4275
/bigobj
-openmp
-std:c++14
/showIncludes
/Fotorchaudio\csrc\CMakeFiles\libtorchaudio.dir\lfilter.cpp.obj
/Fdtorchaudio\csrc\CMakeFiles\libtorchaudio.dir\
/FS
-c
C:\Users\circleci\project\torchaudio\csrc\lfilter.cpp
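
Spotting the one differing flag by eye is easy to get wrong, so the comparison can be automated. The sketch below uses an abridged subset of the flags from the two compile commands above (paths, defines, and warning suppressions are omitted); `flag_diff` is a hypothetical helper, not part of either project.

```python
def flag_diff(left, right):
    """Return (only_in_left, only_in_right) for two whitespace-separated flag lists."""
    left_flags = left.split()
    right_flags = right.split()
    only_left = [flag for flag in left_flags if flag not in right_flags]
    only_right = [flag for flag in right_flags if flag not in left_flags]
    return only_left, only_right


# Abridged flags copied from the two compile commands above.
torchtext_flags = "/TP /GR /EHsc /wd4819 -Wall /O2 /Ob2 /DNDEBUG -MT /bigobj -std:c++14"
torchaudio_flags = "/TP /GR /EHsc /wd4819 -Wall /O2 /Ob2 /DNDEBUG -MD /bigobj -openmp -std:c++14"

only_text, only_audio = flag_diff(torchtext_flags, torchaudio_flags)
print("only torchtext: ", only_text)   # ['-MT']
print("only torchaudio:", only_audio)  # ['-MD', '-openmp']
```

Besides the runtime library flag, the comparison also surfaces -openmp, which appears only in the torchaudio command.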

mthrok commented Apr 21, 2022

Similarly, the compile command for the PyBind registration is different.

torchtext register_pybindings.cpp compile command, followed by the torchaudio pybind.cpp compile command:
[96/98]
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe


/TP
-DUSE_C10D_GLOO
-DUSE_DISTRIBUTED
-D_torchtext_EXPORTS
-IC:\Users\circleci\project
-IC:\Users\circleci\project\third_party\sentencepiece\src
-IC:\Users\circleci\project\third_party\re2
-IC:\Users\circleci\project\third_party\double-conversion
-external:IC:\tools\miniconda3\envs\env3.10\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include\torch\csrc\api\include
-external:W0
/DWIN32
/D_WINDOWS
/GR
/EHsc
/wd4819
-Wall

/O2
/Ob2
/DNDEBUG
-MT
/EHsc/DNOMINMAX
/wd4267
/wd4251
/wd4522
/wd4838
/wd4305
/wd4244
/wd4190
/wd4101
/wd4996
/wd4275
/bigobj
-std:c++14
/showIncludes
/Fotorchtext\csrc\CMakeFiles\_torchtext.dir\register_pybindings.cpp.obj
/Fdtorchtext\csrc\CMakeFiles\_torchtext.dir\
/FS
-c
C:\Users\circleci\project\torchtext\csrc\register_pybindings.cpp
[21/23]
C:\PROGRA~2\MICROS~3\2019\COMMUN~1\VC\Tools\MSVC\1429~1.301\bin\Hostx64\x64\cl.exe


/TP
-DUSE_C10D_GLOO
-DUSE_DISTRIBUTED
-D_torchaudio_EXPORTS
-IC:\Users\circleci\project
-external:IC:\tools\miniconda3\envs\env3.10\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include
-external:IC:\tools\miniconda3\envs\env3.10\Lib\site-packages\torch\include\torch\csrc\api\include
-external:W0
/DWIN32
/D_WINDOWS
/GR
/EHsc
/wd4819
-Wall

/O2
/Ob2
/DNDEBUG
-MD
/EHsc/DNOMINMAX
/wd4267
/wd4251
/wd4522
/wd4838
/wd4305
/wd4244
/wd4190
/wd4101
/wd4996
/wd4275
/bigobj
-openmp
-std:c++14
/showIncludes
/Fotorchaudio\csrc\CMakeFiles\_torchaudio.dir\pybind\pybind.cpp.obj
/Fdtorchaudio\csrc\CMakeFiles\_torchaudio.dir\
/FS
-c
C:\Users\circleci\project\torchaudio\csrc\pybind\pybind.cpp

mthrok commented Apr 21, 2022

The -MT flag is derived from this line:

set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>")

Changing this to "MultiThreadedDLL$<$<CONFIG:Debug>:Debug>" might help. Update: it did not; see #1685.
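
To see how the setting turns into a compiler flag, the sketch below expands the generator expression by hand and looks up the result among the four values CMake documents for the MSVC runtime library. It is a simplified model, not CMake itself: `expand` and `runtime_flag` are hypothetical helpers that handle only this one expression.

```python
# Values documented for CMake's MSVC_RUNTIME_LIBRARY and the flags they select.
RUNTIME_FLAGS = {
    "MultiThreaded": "-MT",
    "MultiThreadedDLL": "-MD",
    "MultiThreadedDebug": "-MTd",
    "MultiThreadedDebugDLL": "-MDd",
}


def expand(template, config):
    """Expand $<$<CONFIG:Debug>:Debug> for the given build configuration."""
    suffix = "Debug" if config.lower() == "debug" else ""
    return template.replace("$<$<CONFIG:Debug>:Debug>", suffix)


def runtime_flag(template, config):
    """Return the selected flag, or None if the value is not a documented name."""
    return RUNTIME_FLAGS.get(expand(template, config))


current = "MultiThreaded$<$<CONFIG:Debug>:Debug>"
proposed = "MultiThreadedDLL$<$<CONFIG:Debug>:Debug>"
documented = "MultiThreaded$<$<CONFIG:Debug>:Debug>DLL"

print(runtime_flag(current, "Release"))   # -MT, as in the torchtext log
print(runtime_flag(proposed, "Release"))  # -MD, as in the torchaudio log
print(runtime_flag(proposed, "Debug"))    # None: "MultiThreadedDLLDebug"
print(runtime_flag(documented, "Debug"))  # -MDd
```

For the Release builds in the logs the proposed string selects -MD as intended. In a Debug build it would expand to MultiThreadedDLLDebug, which is not one of the documented names; the form shown in the CMake documentation puts DLL after the generator expression.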

Nayef211 marked this pull request as ready for review on April 27, 2022, 17:58
Nayef211 requested a review from mthrok on April 27, 2022, 18:25

@mthrok mthrok left a comment

Looks good!

Nayef211 merged commit 13fa5a5 into pytorch:main on Apr 27, 2022

4 participants