Caffe support by pranv #368
Conversation
Refactored data layers, added error handling, added more return values (inputs and outputs).
I've been thinking of adding tests, but since the model files are too large to be included in Keras, I think we have only two options:
Any suggestions?
+1 for tests, and +1 for fetch on demand, just as it is done with the datasets.
If you've uploaded your model files somewhere (e.g. S3), then it's just one line of code:

```python
from keras.datasets.data_utils import get_file

local_path = get_file('local_name.ext', origin="https://s3.amazonaws.com/some_path.ext")
```
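Putting the two together, a fetch-on-demand test could look roughly like the sketch below (the URLs are placeholders, and the `caffe_to_keras` call follows the interface that appears later in this thread):

```python
from keras.datasets.data_utils import get_file
from keras.caffe import convert

def test_caffe_import():
    # Download the model files once; get_file caches them locally so
    # subsequent test runs don't re-download.
    # Placeholder URLs: point these at the Model Zoo / researcher's page.
    prototxt = get_file('deploy.prototxt',
                        origin='https://example.com/path/to/deploy.prototxt')
    caffemodel = get_file('model.caffemodel',
                          origin='https://example.com/path/to/model.caffemodel')

    model = convert.caffe_to_keras(prototext=prototxt,
                                   caffemodel=caffemodel,
                                   phase='test')
    assert model is not None
```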
The model files are available on the page of the researcher who trained the model, or in the Caffe Model Zoo.
Has anyone tried it out on a few models yet?
```python
input_layer_names.append(layers[input_layer].name)

if layer_nb in ends:
    name = 'output_' + name # outputs nodes are marked with 'output_' prefix from which output is derived later in 'add_output'
```
To avoid very long lines, I would recommend putting comments before the line (possibly over several lines).
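For instance, the comment on the line quoted above could be moved onto its own lines before it:

```python
# Output nodes are marked with an 'output_' prefix, from which the
# output is derived later in 'add_output'.
name = 'output_' + name
```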
The protobuf issue is fixed in a clean way by adding the following to setup.py:

```python
import os
from six.moves.urllib.request import urlretrieve
# First, compile Caffe protobuf Python file
datadir = os.path.expanduser(os.path.join('~', '.keras', 'data'))
if not os.path.exists(datadir):
os.makedirs(datadir)
caffe_source = os.path.join(datadir, 'caffe.proto')
caffe_destination = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'keras', 'caffe')
urlretrieve('https://raw.githubusercontent.com/BVLC/caffe/master/src/caffe/proto/caffe.proto', caffe_source)
os.system('protoc --proto_path="' + datadir + '" --python_out="' + caffe_destination + '" "' + caffe_source + '"')
```

This should be Windows compatible as well. The only potential issue is that it requires protobuf to be installed before running setup.py; otherwise the Caffe import won't work (Keras can still be installed, though).
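On that last point, one possible refinement (just a sketch, reusing the variables from the snippet above, and not part of the PR) would be to skip the compilation step when protoc is not available, so setup.py still completes:

```python
import distutils.spawn

# Only run protoc when the compiler is actually on the PATH, so a plain
# `python setup.py install` still succeeds without protobuf installed.
if distutils.spawn.find_executable('protoc') is not None:
    os.system('protoc --proto_path="' + datadir +
              '" --python_out="' + caffe_destination +
              '" "' + caffe_source + '"')
else:
    print('protoc not found; skipping Caffe protobuf compilation. '
          'keras.caffe will be unavailable until caffe_pb2.py is generated.')
```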
We can now remove the pre-compiled protobuf file in keras/caffe as well, since we can generate it at install time. Please update the PR.
Thanks for the feedback! This idea will help remove the pre-compiled protobuf file. I will update the PR based on your suggestions ASAP.
Can we make Google protocol buffers an optional dependency, like h5py was before?
I don't have any S3 storage; please do it.
@fchollet have you tried out a few models? Any results, feedback, or bugs in that regard?
Using the code above, it is already de facto an optional dependency, because you can install and use Keras without it. You just won't be able to load Caffe models.
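In code, that kind of soft dependency usually looks something like the sketch below (illustrative only: `google.protobuf` is the real protobuf package, while the guard flag and the stub function body are assumptions, not the PR's actual code):

```python
# Defer the failure until the Caffe converter is actually used, rather than
# failing at `import keras` time.
try:
    import google.protobuf
    _PROTOBUF_AVAILABLE = True
except ImportError:
    _PROTOBUF_AVAILABLE = False

def caffe_to_keras(prototext, caffemodel, phase='test'):
    if not _PROTOBUF_AVAILABLE:
        raise ImportError('The Caffe converter requires protobuf. '
                          'Install it with `pip install protobuf`.')
    # ... actual conversion would happen here ...
```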
Sure. In that case just remove every model file; I'll set up S3 storage & fetching.
Not yet.
I believe that, in its current form, the CaffeToKeras class creates the network graph based on the caffemodel file instead of the prototxt. That is, if the user wants to plug a subset of weights from the caffemodel into a new model (as defined in the prototxt), they currently cannot. Since I assume many will try to do transfer learning using weights from Model Zoo models plugged into new models, this could be a big issue.
I think I know the problem. When a caffemodel is provided, my code constructs the Keras model entirely from it, disregarding the prototxt. Hence changing the prototxt will not change your model. This is a bad idea. My initial idea was to create a model from the prototxt and then copy the weights over; I reverted it to what it is now since I hadn't written that part yet. I think it will be fixed when I complete it, along with the other changes mentioned here, by tomorrow. Thanks for pointing it out!
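The prototxt-first approach essentially amounts to copying weights layer by layer, matched by name, so a new architecture can reuse only the layers it shares with the pretrained model. A minimal, self-contained sketch of that matching logic (plain numpy dicts stand in here for the parsed caffemodel and the prototxt-built model; they are illustrative, not the PR's data structures):

```python
import numpy as np

def copy_matching_weights(caffe_params, keras_weights):
    """Copy pretrained parameters into a new model's weights, matched by layer name.

    caffe_params:  dict of layer name -> list of numpy arrays (from the caffemodel)
    keras_weights: dict of layer name -> list of numpy arrays (from the prototxt-built model)
    Layers present in only one of the two, or with mismatched shapes, are
    skipped, which is what makes partial reuse (transfer learning) possible.
    """
    copied, skipped = [], []
    for name, target in keras_weights.items():
        source = caffe_params.get(name)
        if source is None or any(s.shape != t.shape for s, t in zip(source, target)):
            skipped.append(name)
            continue
        keras_weights[name] = [s.copy() for s in source]
        copied.append(name)
    return copied, skipped

# Example: 'fc8' exists only in the new model, so only 'conv1_1' is copied.
pretrained = {'conv1_1': [np.ones((64, 3, 3, 3)), np.ones(64)]}
new_model = {'conv1_1': [np.zeros((64, 3, 3, 3)), np.zeros(64)],
             'fc8': [np.zeros((10, 4096)), np.zeros(10)]}
print(copy_matching_weights(pretrained, new_model))
```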
@pranv Just wanted to see if you were able to make any progress. If you are swamped with work and already know what needs to change, let me know; I can try to work on some changes as well.
What's the status on this PR? We'd like to merge it ASAP. If you don't have time for it, do you want me to take over?
Hey,
```python
    layer_output_dim = layer_input_dim

else:
    raise RuntimeError("one or many layers used int this model is not currently supported")
```
Typo fix and clarification:
raise RuntimeError("One or more layers used in this model are not currently supported")
@fchollet I think I've made the changes.
Cool, thank you. I'll take it from here.
I could not find the Caffe converter in the official Keras repository. Where shall I look?
In the caffe branch: https://github.com/fchollet/keras/tree/caffe. It is still being tested and debugged.
I've attached my testing code if someone would like to try it out as well. I've loaded an example image the same way and just used Caffe; please see the code below. You can try any image ('exampleimg.jpg'), and I have just used the 16-layer caffemodel file. My 16-layer prototxt file is also shown below the code. The Caffe output claims it's from the fc7 layer, but given that I'm getting a lot of zeros, I'm pretty sure the ReLU is being applied. Either way, the results from the two aren't matching up. Please let me know if I made any egregious errors below.

```python
import sys
import numpy as np
from scipy.misc import imread, imresize
import pdb
import caffe
from keras.caffe import convert
# model files used
cnn_model_def = 'cnn_params/VGG_ILSVRC_16_layers_deploy_features.prototxt'
cnn_model_params = 'cnn_params/VGG_ILSVRC_16_layers.caffemodel'
C = 3
H = 224
W = 224
def format_img_for_input(image, H, W):
"""
Helper function to convert image read from imread to caffe input
Input:
image - numpy array describing the image
H - height in px
W - width in px
"""
if len(image.shape) == 2:
image = np.tile(image[:, :, np.newaxis], (1, 1, 3))
# RGB -> BGR
image = image[:, :, (2, 1, 0)]
# mean subtraction (get mean from model file?..hardcoded for now)
image = image - np.array([103.939, 116.779, 123.68])
# resize
image = imresize(image, (H, W))
# get channel in correct dimension
image = np.transpose(image, (2, 0, 1))
return image
# setup caffe cnn
print "Setting up caffe CNN..."
net = caffe.Net(cnn_model_def, cnn_model_params)
net.set_mode_gpu()
net.set_phase_test()
caffe_batch = np.zeros((10, C, H, W))
# setup keras
print "Setting up keras CNN..."
model = convert.caffe_to_keras(
prototext=cnn_model_def,
caffemodel=cnn_model_params,
phase='test')
graph = model
keras_batch = np.zeros((1, C, H, W))
# Load image and format for input
print "Loading example image..."
im = imread('exampleimg.jpg')
formatted_im = format_img_for_input(im, H, W)
keras_batch[0, :, :, :] = formatted_im
for i in range(10):
caffe_batch[i] = formatted_im
# extract features using caffe
print "Extracting features from caffe Net..."
out = net.forward(**{net.inputs[0]: caffe_batch})
caffe_features = out[net.outputs[0]].squeeze(axis=(2, 3))
caffe_features = caffe_features[0]
# extract features using keras
print "Extracting features from keras Graph..."
graph.compile('rmsprop', {graph.outputs.keys()[0]: 'mse'})
keras_features = graph.predict({'conv1_1':keras_batch}, batch_size=1, verbose=1)
# compare values - print True if equal
print "Compare values..."
print np.sum(caffe_features==keras_features) == 4096
pdb.set_trace()
```

And my prototxt here:
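One small note on the final check in the script above: exact floating-point equality across two frameworks will almost never hold even when the conversion is correct, so a tolerance-based comparison tends to be more informative. A minimal sketch (plain numpy, reusing `caffe_features` and `keras_features` from the script; the tolerances are arbitrary):

```python
import numpy as np

def features_match(a, b, rtol=1e-3, atol=1e-5):
    # Report the largest absolute difference so small numerical drift can be
    # told apart from a real conversion bug, then compare with a tolerance.
    print "max abs diff:", np.max(np.abs(a - b))
    return np.allclose(a, b, rtol=rtol, atol=atol)

print features_match(caffe_features, keras_features)
```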
Please post any further comments in the current PR thread: #442

@asampat3090: is the caffemodel hosted somewhere? I'd like to take a look.

One initial reason why you would see different results (independently of any potential bug in the PR) is that the networks are in different phases: the Keras net is in test mode and the Caffe net is in train mode (which is why Dropout is being applied). This changes the intermediate representations substantially, but should not significantly affect the last-layer probabilities (assuming the network has been trained until convergence).