
Importing TF and ProdClient in the same file #5

Open
FrancescoSaverioZuppichini opened this issue Apr 20, 2018 · 1 comment

Comments

@FrancescoSaverioZuppichini

Hello,

I need to import some functions from Keras in order to preprocess the input before sending it to the tf-serving server. I am doing sentiment analysis on tweets. My client is pretty basic:

# client.py
from predict_client.prod_client import ProdClient
from Config import Config
from data import get_tokenizer, make_inputs

client = ProdClient(Config.HOST, Config.MODEL_NAME, Config.MODEL_VERSION)


def predict(text):

    tokenizer = get_tokenizer()
    data = make_inputs(tokenizer, text)
    print(data)
    req_data = [{'in_tensor_name': 'inputs', 'in_tensor_dtype': 'DT_FLOAT', 'data': data}]

    prediction = client.predict(req_data, request_timeout=10)

    return prediction


text = 'Trump is bad'
print(predict(text))

From data.py I import two utility functions that use Keras:

# data.py
import numpy as np
import redis

from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras.preprocessing.sequence import pad_sequences

from Config import Config

def get_tokenizer():
    # TODO: a better approach would be to subclass Tokenizer
    conn = redis.Redis(Config.REDIS_HOST, port=Config.REDIS_PORT)

    tokenizer = Tokenizer(num_words=Config.N_WORDS + 1, oov_token="<OOV>")

    # NOTE: redis-py returns bytes keys/values by default, so this mapping
    # may need decoding before the tokenizer can use it
    tokenizer.word_index = conn.hgetall('word_index')

    return tokenizer

def make_inputs(tokenizer, data):
    sequences = tokenizer.texts_to_sequences(data)
    sequences_pad = pad_sequences(sequences, maxlen=Config.MAX_LEN)
    inputs = np.array(sequences_pad).reshape([-1, Config.MAX_LEN])
    return inputs
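A side observation (not raised in the original thread): redis-py's `hgetall` returns bytes keys and values by default, while `Tokenizer.word_index` expects a `str -> int` mapping, so the hash likely needs decoding. A minimal sketch, using a plain dict to stand in for the Redis reply:

```python
# Stand-in for conn.hgetall('word_index'): redis-py returns bytes by default
raw = {b'good': b'1', b'bad': b'2', b'trump': b'3'}

# Decode keys to str and values to int so Tokenizer.word_index can use them
word_index = {k.decode('utf-8'): int(v) for k, v in raw.items()}

print(word_index['good'])  # → 1
```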

I get this error:

TypeError: Couldn't build proto file into descriptor pool!
Invalid proto descriptor for file "tensorflow/core/framework/tensor_shape.proto":
  tensorflow.TensorShapeProto.dim: "tensorflow.TensorShapeProto.dim" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto.unknown_rank: "tensorflow.TensorShapeProto.unknown_rank" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto.Dim.size: "tensorflow.TensorShapeProto.Dim.size" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto.Dim.name: "tensorflow.TensorShapeProto.Dim.name" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto.Dim: "tensorflow.TensorShapeProto.Dim" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto: "tensorflow.TensorShapeProto" is already defined in file "tensor_shape.proto".
  tensorflow.TensorShapeProto.dim: "tensorflow.TensorShapeProto.Dim" seems to be defined in "tensor_shape.proto", which is not imported by "tensorflow/core/framework/tensor_shape.proto".  To use it here, please add the necessary import.

I am pretty sure this is because I import a file that imports Keras from TF. Any idea how to fix it?

Thank you for your time,

Cheers,

Francesco Saverio Zuppichini

@stianlp

stianlp commented May 18, 2018

Hi, and sorry for the super late reply!

TLDR; see the bottom of this comment.

At first I thought this had to do with naming (which it sort of does) and that renaming packages in the client would work, but that would then cause problems when sending requests to the server, because client and server must use the same message names for gRPC to work.

I thought I could "fool" Keras into believing that we are using the same .proto files that are located in tensorflow/core/framework/ and make it work. But it looks like packages and naming aren't the only issue; it also has to do with loading the same files multiple times (exactly what the error message says).

So, I found this:
protocolbuffers/protobuf#3002 (comment)

export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION='python'

It fixes the issue for me, but I am not sure why, or whether it's a good long-term solution.
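For reference, the same workaround can be applied from Python, as long as it happens before any protobuf-dependent import (a sketch, not tested against this exact setup):

```python
import os

# Must be set BEFORE importing tensorflow or predict_client, since protobuf
# reads this variable at import time to pick its implementation
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'

# from predict_client.prod_client import ProdClient
# import tensorflow as tf
```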

EDIT: I've only tried importing these two lines, and sending a request worked:

from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras.preprocessing.sequence import pad_sequences

But I haven't actually tried calling the tokenizer functions yet.
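Another way to sidestep the conflict entirely (a hypothetical sketch, not from the thread) is to replicate the tiny preprocessing step in plain NumPy, so the client never imports TensorFlow at all. `pad_sequences` defaults to 'pre' padding and 'pre' truncating with zeros, which is easy to reproduce:

```python
import numpy as np

def pad_pre(sequences, maxlen, value=0):
    """Minimal stand-in for Keras pad_sequences (default 'pre' padding and
    'pre' truncating), so the client need not import TensorFlow."""
    out = np.full((len(sequences), maxlen), value, dtype=np.int32)
    for i, seq in enumerate(sequences):
        trunc = seq[-maxlen:]                 # 'pre' truncating keeps the tail
        out[i, maxlen - len(trunc):] = trunc  # 'pre' padding: left-pad with zeros
    return out

print(pad_pre([[1, 2, 3], [4, 5, 6, 7, 8]], maxlen=4))
# → [[0 1 2 3]
#    [5 6 7 8]]
```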
