This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Building MxNet on Win10 with CPP package, CUDA9 and MKL 2018 #10049

Closed
EternalSaga opened this issue Mar 9, 2018 · 7 comments

Comments

@EternalSaga

EternalSaga commented Mar 9, 2018

Description

I'm trying to build MXNet on Windows 10 with the cpp package. After some effort, I built libmxnet.dll successfully, but I cannot generate the op.h file with the Python script OpWrapperGenerator.py.
Python reports that a module cannot be found when generating the op.h header file. Perhaps this is caused by the MKL path, but I have no idea how to fix it.

Environment info (Required)

Win10
VS2015
MKL2018
CUDA9
CUDNN7
python3.6

Diagnose Information
----------Python Info----------
Version : 3.6.4
Compiler : MSC v.1900 64 bit (AMD64)
Build : ('v3.6.4:d48eceb', 'Dec 19 2017 06:54:40')
Arch : ('64bit', 'WindowsPE')
------------Pip Info-----------
Version : 9.0.1
Directory : C:\Program Files\Python36\lib\site-packages\pip
----------MXNet Info-----------
No MXNet installed.
----------System Info----------
Platform : Windows-10-10.0.16299-SP0
system : Windows
node : DESKTOP-TE2BP5I
release : 10
version : 10.0.16299
----------Hardware Info----------
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
Name
Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz

----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0792 sec, LOAD: 1.6584 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.3294 sec, LOAD: 1.9067 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, The read operation timed out, DNS finished in 0.544719934463501 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.1333 sec, LOAD: 1.2828 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0939 sec, LOAD: 6.4636 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0620 sec, LOAD: 14.4082 sec.

Package used (Python/R/Scala/Julia):
(I'm using ...)
I want to use mxnet through C++.

Build info (Required if built from source)

mxnetbuilt

Compiler (gcc/clang/mingw/visual studio):
visual studio
MXNet commit hash:
(Paste the output of git rev-parse HEAD here.)

Build config:
(Paste the content of config.mk, or the build command.)

Error Message:

When I run this command, the following message is shown:
python OpWrapperGenerator.py D:/ProgrammingAndStudy/cpp library/mxnet/build/Release/libmxnet.dll
Traceback (most recent call last):
  File "OpWrapperGenerator.py", line 425, in <module>
    raise(e)
  File "OpWrapperGenerator.py", line 419, in <module>
    f.write(patternStr % ParseAllOps())
  File "OpWrapperGenerator.py", line 314, in ParseAllOps
    cdll.libmxnet = cdll.LoadLibrary(sys.argv[1])
  File "C:\Program Files\Python36\lib\ctypes\__init__.py", line 426, in LoadLibrary
    return self._dlltype(name)
  File "C:\Program Files\Python36\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.
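
WinError 126 is raised when Windows cannot find either the DLL itself or one of the DLLs it depends on, so it can help to rule out the simple case first. The sketch below is a hypothetical pre-flight check (the function name diagnose_library_arg is not part of OpWrapperGenerator.py) that inspects an argument list shaped like sys.argv. It flags a path that was split into several arguments, which happens when a path containing spaces such as "cpp library" is passed without quotes, and a path that does not point to an existing file. Whether either condition applies to the failure above is not established by this thread.

```python
import os


def diagnose_library_arg(argv):
    """Return a list of warnings about the library path argument.

    argv is expected to look like sys.argv: [script, path_to_dll, ...].
    An empty list means no obvious problem was detected.
    """
    if len(argv) < 2:
        return ["no library path was given"]

    warnings = []
    if len(argv) > 2:
        # An unquoted path containing spaces is split into several arguments,
        # so only the first fragment would reach LoadLibrary.
        warnings.append(
            "got %d arguments; if the path contains spaces, wrap it in quotes"
            % (len(argv) - 1)
        )
    if not os.path.isfile(argv[1]):
        warnings.append("%r is not an existing file" % argv[1])
    return warnings
```

Calling diagnose_library_arg(sys.argv) before cdll.LoadLibrary would separate "the file itself is missing" from "a dependency is missing". If it returns no warnings and the load still fails with WinError 126, a dependent DLL is the more likely culprit.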

Minimum reproducible example

(If you are using your own code, please provide a short script that reproduces the error. Otherwise, please provide link to the existing example.)

Steps to reproduce

After building libmxnet.dll with MKL, cd to the folder incubator-mxnet\cpp-package\scripts and run:
python OpWrapperGenerator.py path_to_the_libmxnet.dll

What have you tried to solve it?

I tried adding opencv/build/bin/x64 and mkl/lib/intel_64_win to the PATH environment variable.
Obviously it is useless because Intel has not provided any shared library for MKL. <--------- This is wrong.

Additional Efforts

After building libmxnet.dll, I tried installing the Python package. The installation succeeded, but importing the mxnet package fails with a similar error.

import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\yueji\Anaconda3\lib\site-packages\mxnet-1.2.0-py3.6.egg\mxnet\__init__.py", line 25, in <module>
    from . import engine
  File "C:\Users\yueji\Anaconda3\lib\site-packages\mxnet-1.2.0-py3.6.egg\mxnet\engine.py", line 23, in <module>
    from .base import _LIB, check_call
  File "C:\Users\yueji\Anaconda3\lib\site-packages\mxnet-1.2.0-py3.6.egg\mxnet\base.py", line 113, in <module>
    _LIB = _load_lib()
  File "C:\Users\yueji\Anaconda3\lib\site-packages\mxnet-1.2.0-py3.6.egg\mxnet\base.py", line 105, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "C:\Users\yueji\Anaconda3\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

@EternalSaga
Author

@sandeep-krishnamurthy Could you please tell me what "unclear error/doc" means? Do I need to add more information?

@EternalSaga
Author

OK, finally I found the solution.

Solution

First, use the dumpbin tool that ships with MSVC to check the DLL's dependencies.
dumpbin /dependents libmxnet.dll
mkl_rt.dll
libiomp5md.dll
opencv_world341.dll
cudnn64_7.dll
cublas64_90.dll
cufft64_90.dll
cusolver64_90.dll
curand64_90.dll
nvrtc64_90.dll
nvcuda.dll
KERNEL32.dll
Copy the three DLLs mkl_rt.dll, libiomp5md.dll and opencv_world341.dll into the same directory as python.exe, and the problem is solved. mkl_rt.dll and libiomp5md.dll should be located in your directory IntelSWTools\compilers_and_libraries_2018.1.156\windows\redist\intel64_win
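
To see which of the dependencies listed by dumpbin are actually reachable, a small script can look for each name in a list of directories. This is a minimal sketch, not part of MXNet: the helper names locate_dlls and default_search_dirs are hypothetical, and it only models a simplified search order (the directory of python.exe, then the PATH entries). The real Windows search also covers system directories, so system DLLs such as KERNEL32.dll or nvcuda.dll may be reported as missing here even though Windows would find them.

```python
import os


def locate_dlls(dll_names, search_dirs):
    """Map each DLL name to the first directory in search_dirs containing it.

    Names that are not found in any directory map to None.
    """
    found = {}
    for name in dll_names:
        found[name] = None
        for directory in search_dirs:
            if os.path.isfile(os.path.join(directory, name)):
                found[name] = directory
                break
    return found


def default_search_dirs(python_exe, path_value):
    """Directory of python.exe first, then every non-empty PATH entry."""
    dirs = [os.path.dirname(os.path.abspath(python_exe))]
    dirs.extend(p for p in path_value.split(os.pathsep) if p)
    return dirs
```

A typical call would be locate_dlls(["mkl_rt.dll", "libiomp5md.dll", "opencv_world341.dll"], default_search_dirs(sys.executable, os.environ["PATH"])). Any name mapped to None is a candidate for the copy step described above.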

@jinhuang415
Contributor

@EternalSaga Besides putting all the .dll files in the same directory, could you also check whether this issue is resolved if we add the actual paths of mkl_rt.dll, libiomp5md.dll and opencv_world341.dll to the Windows PATH variable? (See https://msdn.microsoft.com/en-us/library/7d83bc18.aspx for the Windows DLL search order.)
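
The PATH-based alternative can also be tried from inside Python before the library is loaded. The sketch below is an illustration under assumptions: prepend_to_path and expose_dll_dirs are hypothetical helper names, and the directories passed in are whatever folders hold the DLLs on your machine. Note that from Python 3.8 onward on Windows, PATH is no longer used to resolve the dependencies of DLLs loaded through ctypes, so the sketch also calls os.add_dll_directory when it is available.

```python
import os


def prepend_to_path(directories, path_value):
    """Return a PATH string with the given directories placed first.

    Directories already present in path_value are not duplicated.
    """
    existing = [p for p in path_value.split(os.pathsep) if p]
    new_entries = [d for d in directories if d not in existing]
    return os.pathsep.join(new_entries + existing)


def expose_dll_dirs(directories):
    """Make DLLs in the given directories discoverable by this process."""
    os.environ["PATH"] = prepend_to_path(directories, os.environ.get("PATH", ""))
    # os.add_dll_directory exists only on Windows with Python 3.8+.
    # It expects absolute paths to existing directories.
    if hasattr(os, "add_dll_directory"):
        for directory in directories:
            os.add_dll_directory(directory)
```

Calling expose_dll_dirs with the MKL redist folder and the OpenCV bin folder before ctypes.CDLL or import mxnet changes only the current process, which makes it easy to test the suggestion without editing the system-wide PATH.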

@sojiadeshina
Contributor

@sandeep-krishnamurthy could we close this issue? The user seems to have resolved it.

larroy added a commit to larroy/mxnet that referenced this issue Aug 1, 2018
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049
@EternalSaga
Author

Yes, this issue could be closed.

larroy added a commit to larroy/mxnet that referenced this issue Aug 2, 2018
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049
marcoabreu pushed a commit that referenced this issue Aug 3, 2018
* Windows scripted build
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    #8714
    #11100
    #10166
    #10049

* Fix bug

* Fix non-portable ut

* add xunit
aaronmarkham pushed a commit to aaronmarkham/incubator-mxnet that referenced this issue Aug 6, 2018
* Windows scripted build
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049

* Fix bug

* Fix non-portable ut

* add xunit
aaronmarkham added a commit to aaronmarkham/incubator-mxnet that referenced this issue Aug 7, 2018
[MXNET-750] fix nested call on CachedOp. (apache#11951)

* fix nested call on cachedop.

* fix.

[MXNET-769] Usability improvements to windows builds (apache#11947)

* Windows scripted build
Adjust Jenkins builds to use ci/build_windows.py

Issues:

    apache#8714
    apache#11100
    apache#10166
    apache#10049

* Fix bug

* Fix non-portable ut

* add xunit

@withmch

withmch commented Aug 22, 2018

@EternalSaga Hello! I have run into the same problem. After building MXNet from source with VS2015, op.h was not generated for mxnet-cpp. I found the DLLs that libmxnet.dll links against and put them in the same folder as python.exe, but the problem is still not solved: there is still no op.h. Could you please tell me how to solve this?

@YCAyca

YCAyca commented Nov 23, 2019

Hi! I ran into the same error and tried everything I read on blogs. Finally I downloaded all the missing DLL files (found by running dumpbin /dependents libmxnet.dll). The error message then changed, and now I'm getting this error: OSError: [WinError 193] %1 is not a valid Win32 application
Does anyone know how to solve this error? I'm using VS 2017 on Windows 10, by the way.
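
WinError 193 is commonly reported when a process tries to load a DLL built for a different architecture, for example a 64-bit Python loading a 32-bit DLL or the reverse. Whether that is the cause here is not confirmed in this thread, but it is cheap to check. The sketch below reads the Machine field from a PE file's header and reports the interpreter's pointer size; pe_machine and python_bits are hypothetical helper names, and only three common machine codes are named.

```python
import struct

# A few common values of the COFF Machine field.
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0xAA64: "ARM64",
}


def pe_machine(data):
    """Return the architecture named in a PE image's COFF header.

    data holds the bytes of a .dll or .exe file.
    """
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # Offset 0x3C of the DOS header stores the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 2-byte Machine field follows the 4-byte PE signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine)


def python_bits():
    """Pointer size of the running interpreter, in bits."""
    return struct.calcsize("P") * 8
```

Running pe_machine(open(path, "rb").read()) on libmxnet.dll and on each downloaded DLL, then comparing the results with python_bits(), shows whether any file does not match the interpreter. DLLs downloaded from third-party sites are a frequent source of such mismatches, so taking them from the vendor's redistributable folders is the safer route.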
