tensorflow-onnx will use the onnx version installed on your system and installs the latest onnx version if none is found.
By default we use opset 7 for the resulting onnx graph since most runtimes will support opset 7. Opset 7 was introduced in onnx-1.2.
With the release of onnx-1.3 there is now opset 8. To create an onnx graph for opset 8, pass --opset 8 on the command line.
We support many TensorFlow models. Support for Fully Connected and Convolutional networks is mature. Dynamic LSTM networks should work but the code for this is evolving. A list of models that we use for testing can be found here
If you don't have tensorflow installed already, install the desired tensorflow build, for example:
pip install tensorflow
or
pip install tensorflow-gpu
Install an onnx runtime of your choice if you want to run tests. For example:
onnxruntime (only available on linux):
pip install onnxruntime
For caffe2, follow the instructions here:
https://caffe2.ai/
We tested with caffe2 and onnxruntime and unit tests are passing for those.
We tested with tensorflow 1.5-1.11 and anaconda 3.5,3.6.
pip install -U tf2onnx
Once dependencies are installed, from the tensorflow-onnx folder call:
python setup.py install
or
python setup.py develop
tensorflow-onnx requires onnx-1.2.2 or later and will install/upgrade onnx if needed.
To create a distribution:
python setup.py bdist_wheel
To convert a TensorFlow model, tf2onnx expects a frozen TensorFlow graph. The user needs to specify the inputs and outputs of the graph by passing their names with --inputs INPUTS and --outputs OUTPUTS.
python -m tf2onnx.convert --input SOURCE_FROZEN_GRAPH_PB
    --inputs SOURCE_GRAPH_INPUTS
    --outputs SOURCE_GRAPH_OUTPUTS
    [--inputs-as-nchw inputs_provided_as_nchw]
    [--output TARGET_ONNX_GRAPH]
    [--target TARGET]
    [--continue_on_error]
    [--verbose]
    [--custom-ops list-of-custom-ops]
    [--opset OPSET]
    [--fold_const]
--input: the frozen TensorFlow graph, which can be created with the freeze graph tool.
--output: the target onnx file path.
--inputs, --outputs: the TensorFlow graph's input/output names, which can be found with the summarize graph tool. Those names typically end with :0, for example --inputs input0:0,input1:0
--inputs-as-nchw: By default we preserve the image format of inputs (nchw or nhwc) as given in the TensorFlow model. If your host's native format (for example on Windows) is nchw and the model is written for nhwc, pass --inputs-as-nchw and tensorflow-onnx will transpose the input. Doing so is convenient for the application, and in many cases the converter can optimize the transpose away. For example, --inputs input0:0,input1:0 --inputs-as-nchw input0:0 assumes that images are passed into input0:0 as nchw while the given TensorFlow model uses nhwc.
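The layout change behind --inputs-as-nchw comes down to a fixed axis permutation. The following is a minimal sketch in plain Python, assuming 4-D image tensors. The constant and helper names are illustrative and are not part of tf2onnx.

```python
# Illustrative only: these names are hypothetical, not tf2onnx APIs.
# Axis order that turns an NCHW tensor into NHWC:
# output axis i takes input axis NCHW_TO_NHWC[i].
NCHW_TO_NHWC = (0, 2, 3, 1)


def permute_shape(shape, perm):
    """Return the shape a tensor has after transposing with `perm`."""
    if len(shape) != len(perm):
        raise ValueError("shape and permutation must have the same rank")
    return tuple(shape[axis] for axis in perm)


def nchw_to_nhwc_shape(shape):
    """Shape seen by an NHWC model when the caller feeds NCHW data."""
    return permute_shape(shape, NCHW_TO_NHWC)


# A batch of one 3-channel 224x224 image supplied as NCHW.
nhwc_shape = nchw_to_nhwc_shape((1, 3, 224, 224))
print(nhwc_shape)
```

A transpose with this permutation is what has to sit between an nchw input and a model written for nhwc.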
--target: Some runtimes need workarounds, for example because they don't support all types given in the onnx spec. We work around this in some cases by generating a different graph. Those workarounds are activated with --target TARGET.
--opset: By default we use opset 7 to generate the graph. By specifying --opset the user can override the default to generate a graph with the desired opset. For example, --opset 5 would create an onnx graph that uses only ops available in opset 5. Because older opsets in most cases have fewer ops, some models might not convert on an older opset.
--custom-ops: The runtime may support custom ops that are not defined in onnx. A user can ask the converter to map to custom ops by listing them with the --custom-ops option. TensorFlow ops listed here will be mapped to a custom op with the same name as the TensorFlow op but in the onnx domain ai.onnx.converters.tensorflow. For example, --custom-ops Print will insert an op Print in the onnx domain ai.onnx.converters.tensorflow into the graph. We also support a python api for custom ops, documented later in this readme.
--fold_const: When set, the TensorFlow fold_constants transformation will be applied before conversion. This benefits features including Transpose optimization (e.g. Transpose operations introduced during tf-graph-to-onnx-graph conversion will be removed) and RNN unit conversion (for example LSTM). Older TensorFlow versions might run into issues with this option depending on the model.
Usage example (run following commands in tensorflow-onnx root directory):
python -m tf2onnx.convert \
    --input tests/models/fc-layers/frozen.pb \
    --inputs X:0 \
    --outputs output:0 \
    --output tests/models/fc-layers/model.onnx \
    --verbose
Some models specify placeholders with unknown ranks and dims, which cannot be mapped to onnx. In those cases one can add the shape in [] after the input name, for example --inputs X:0[1,28,28,3]
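To make the name[shape] syntax concrete, here is a small sketch of how such an --inputs value could be split into names and shape overrides. The helper split_inputs is hypothetical and written only for illustration; it is not the parser tf2onnx itself uses.

```python
import re

# Hypothetical helper, not the parser tf2onnx itself uses.
# Matches one entry such as "X:0" or "X:0[1,28,28,3]".
_INPUT_SPEC = re.compile(r"([^\[\],]+)(?:\[([^\]]*)\])?")


def split_inputs(spec):
    """Split an --inputs value into names and optional shape overrides."""
    names, shapes = [], {}
    pos = 0
    while pos < len(spec):
        match = _INPUT_SPEC.match(spec, pos)
        if match is None:
            raise ValueError("cannot parse input spec at: " + spec[pos:])
        name, dims = match.group(1).strip(), match.group(2)
        names.append(name)
        if dims is not None:
            shapes[name] = [int(d) for d in dims.split(",")]
        pos = match.end()
        # Skip the comma that separates two entries.
        if pos < len(spec):
            if spec[pos] != ",":
                raise ValueError("expected ',' at: " + spec[pos:])
            pos += 1
    return names, shapes


names, shapes = split_inputs("X:0[1,28,28,3],Y:0")
print(names)
print(shapes)
```

The shape dictionary has the same form as the shape_override argument of the python API, which maps input names to shapes.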
The model developer will usually know the inputs and outputs of the TensorFlow graph. If not, you can find them with TensorFlow's summarize_graph tool, for example:
summarize_graph --in_graph=tests/models/fc-layers/frozen.pb
The TensorFlow tool to freeze the graph is here.
For example:
python -m tensorflow.python.tools.freeze_graph \
--input_graph=my_checkpoint_dir/graphdef.pb \
--input_binary=true \
--output_node_names=output \
--input_checkpoint=my_checkpoint_dir \
--output_graph=tests/models/fc-layers/frozen.pb
There are 2 types of tests: unit tests, and tests that validate pre-trained TensorFlow models. The commands to run them are:
python setup.py test
python tests/run_pretrained_models.py
usage: run_pretrained_models.py [-h] [--cache CACHE] [--tests TESTS] [--backend BACKEND] [--verbose] [--debug] [--config yaml-config]
optional arguments:
-h, --help show this help message and exit
--cache CACHE pre-trained models cache dir
--tests TESTS tests to run
--backend BACKEND backend to use
--config yaml config file
--verbose verbose output
--opset OPSET target opset to use
--perf csv-file capture performance numbers for tensorflow and onnx runtime
--debug dump generated graph with shape info
--fold_const when set, TensorFlow fold_constants transformation will be applied before conversion. This will benefit features including Transpose optimization (e.g. Transpose operations introduced during tf-graph-to-onnx-graph conversion will be removed), and RNN unit conversion (for example LSTM).
run_pretrained_models.py runs the TensorFlow model, captures the TensorFlow output, and runs the same test against the specified ONNX backend after converting the model.
If the option --perf csv-file is specified, we capture the inference timing of tensorflow and the onnx runtime and write the result into the given csv file.
You call it for example with:
python tests/run_pretrained_models.py --backend onnxruntime --config tests/run_pretrained_models.yaml --perf perf.csv
In some cases it will be useful to convert the models from TensorFlow to ONNX from a python script. You can use the following API:
import tf2onnx
tf2onnx.tfonnx.process_tf_graph(tf_graph,
    continue_on_error=False, verbose=False, target=None,
    opset=None, custom_op_handlers=None,
    custom_rewriter=None, extra_opset=None,
    shape_override=None, inputs_as_nchw=None, output_names=None):
    """Convert tensorflow graph to onnx graph.
    Args:
        tf_graph: tensorflow graph
        continue_on_error: if an op can't be processed (aka there is no mapping), continue
        verbose: print summary stats
        target: list of workarounds applied to help certain platforms
        opset: the opset to be used (int, default is latest)
        custom_op_handlers: dictionary of custom ops handlers
        custom_rewriter: list of custom graph rewriters
        extra_opset: list of extra opsets, for example the opsets used by custom ops
        shape_override: dict with inputs that override the shapes given by tensorflow
        inputs_as_nchw: transpose inputs in list from nchw to nhwc
        output_names: name of output nodes in graph
    Return:
        onnx graph
    """
For example in examples/call_coverter_via_python.py:
import tensorflow as tf
import tf2onnx
with tf.Session() as sess:
    x = tf.placeholder(tf.float32, [2, 3], name="input")
    x_ = tf.add(x, x)
    _ = tf.identity(x_, name="output")
    onnx_graph = tf2onnx.tfonnx.process_tf_graph(sess.graph, output_names=["output:0"])
    model_proto = onnx_graph.make_model("test")
    with open("/tmp/model.onnx", "wb") as f:
        f.write(model_proto.SerializeToString())
For complex custom ops that require graph rewrites or input / attribute rewrites, using the python interface to insert a custom op will be the easiest way to accomplish the task. A dictionary of name->custom_op_handler can be passed to tf2onnx.tfonnx.process_tf_graph. If the op name is found in the graph, the handler will have access to all internal structures and can rewrite whatever is needed. For example, in examples/custom_op_via_python.py:
import tensorflow as tf
import tf2onnx
from onnx import helper
_TENSORFLOW_DOMAIN = "ai.onnx.converters.tensorflow"
def print_handler(ctx, node, name, args):
    # replace tf.Print() with Identity
    # T output = Print(T input, data, @list(type) U, @string message, @int first_n, @int summarize)
    # becomes:
    # T output = Identity(T Input)
    node.type = "Identity"
    node.domain = _TENSORFLOW_DOMAIN
    del node.input[1:]
    return node

with tf.Session() as sess:
    x = tf.placeholder(tf.float32, [2, 3], name="input")
    x_ = tf.add(x, x)
    x_ = tf.Print(x, [x], "hello")
    _ = tf.identity(x_, name="output")
    onnx_graph = tf2onnx.tfonnx.process_tf_graph(sess.graph,
                                                 custom_op_handlers={"Print": print_handler},
                                                 extra_opset=[helper.make_opsetid(_TENSORFLOW_DOMAIN, 1)],
                                                 output_names=["output:0"])
    model_proto = onnx_graph.make_model("test")
    with open("/tmp/model.onnx", "wb") as f:
        f.write(model_proto.SerializeToString())
The converter needs to take care of a few things:
- Convert the protobuf format. Since the format is similar, this step is straightforward.
- TensorFlow types need to be mapped to their ONNX equivalent.
- For many ops TensorFlow passes parameters like shapes as inputs where ONNX wants to see them as attributes. Since we use a frozen graph, the converter will fetch the input as a constant, convert it to an attribute and remove the original input.
- TensorFlow in many cases composes ops out of multiple simpler ops. The converter will need to identify the subgraph for such ops, slice the subgraph out and replace it with the ONNX equivalent. This can become fairly complex so we use a graph matching library for it. A good example of this is the tensorflow transpose op.
- TensorFlow's default data format is NHWC where ONNX requires NCHW. The converter will insert transpose ops to deal with this.
- There are some ops like relu6 that are not supported in ONNX but that the converter can compose out of other ONNX ops.
- ONNX backends are new and their implementations are not complete yet. For some ops the converter generates ops that work around issues in existing backends.
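The input-to-attribute step from the list above can be sketched on toy data structures. The node dictionaries and the helper input_to_attribute below are assumptions made for illustration; tf2onnx works on its own Graph and Node classes.

```python
# Toy data structures for illustration; tf2onnx uses its own Graph and Node classes.
def input_to_attribute(node, input_index, attr_name, constants):
    """Return a copy of `node` with one constant input moved into an attribute.

    node: dict with "op", "inputs" (list of tensor names) and "attrs" (dict)
    constants: dict mapping tensor names to their constant values
    """
    tensor_name = node["inputs"][input_index]
    if tensor_name not in constants:
        # Only possible when the value is known at conversion time,
        # which is what a frozen graph provides.
        raise ValueError(tensor_name + " is not a constant")
    return {
        "op": node["op"],
        "inputs": [n for i, n in enumerate(node["inputs"]) if i != input_index],
        "attrs": dict(node["attrs"], **{attr_name: constants[tensor_name]}),
    }


# A TensorFlow Transpose takes the permutation as a second input;
# the ONNX Transpose op carries it in the "perm" attribute.
tf_node = {"op": "Transpose", "inputs": ["x:0", "perm:0"], "attrs": {}}
onnx_node = input_to_attribute(tf_node, 1, "perm", {"perm:0": [0, 3, 1, 2]})
print(onnx_node)
```

If the permutation were computed at run time instead of being a constant, this rewrite would not be possible, which is one reason the converter starts from a frozen graph.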
tf2onnx starts with a frozen graph. This is because of item 3 above.
tf2onnx first does a simple conversion from the TensorFlow protobuf format to the ONNX protobuf format without looking at individual ops. We do this so we can use the ONNX graph as internal representation and write helper functions around it. The code that does the conversion is in tensorflow_to_onnx(). tensorflow_to_onnx() will return the ONNX graph and a dictionary with shape information from TensorFlow. The shape information is helpful in some cases when processing individual ops. The ONNX graph is wrapped in a Graph object and nodes in the graph are wrapped in a Node object to allow easier graph manipulation. All code that deals with nodes and graphs is in graph.py.
In the next step we apply graph matching code on the graph to re-write subgraphs for ops like transpose and lstm. For an example, look at rewrite_transpose().
In the fourth step we look at individual ops that need attention. The dictionary _OPS_MAPPING will map tensorflow op types to a method that is used to process the op. The simplest case is direct_op() where the op can be taken as is. Whenever possible we try to group ops into common processing, for example all ops that require dealing with broadcasting are mapped to broadcast_op(). For an op that composes the tensorflow op from multiple onnx ops, see relu6_op().
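The dispatch idea can be illustrated with a toy version of such a mapping. The handler signatures, the node dictionaries and the constant tensor names const_zero and const_six are assumptions for this sketch and differ from the real tf2onnx code.

```python
# Toy sketch: handler signatures and node dicts are assumptions made for
# illustration and differ from the real tf2onnx code.
def direct_op(node):
    """The op exists in ONNX under the same name: keep it as is."""
    return [node]


def relu6_op(node):
    """Relu6(x) = min(max(x, 0), 6), built here from Max followed by Min.

    "const_zero" and "const_six" stand for constant tensors that are
    assumed to exist in the graph.
    """
    x, = node["inputs"]
    y, = node["outputs"]
    tmp = y + "_max"
    return [
        {"op": "Max", "inputs": [x, "const_zero"], "outputs": [tmp]},
        {"op": "Min", "inputs": [tmp, "const_six"], "outputs": [y]},
    ]


# Maps an op type to the function that processes it.
_OPS_MAPPING = {
    "Identity": direct_op,
    "Relu": direct_op,
    "Relu6": relu6_op,
}


def convert_ops(nodes):
    """Run every node through the handler registered for its op type."""
    converted = []
    for node in nodes:
        handler = _OPS_MAPPING.get(node["op"])
        if handler is None:
            raise ValueError("no mapping for op type " + node["op"])
        converted.extend(handler(node))
    return converted


graph = [
    {"op": "Identity", "inputs": ["a"], "outputs": ["b"]},
    {"op": "Relu6", "inputs": ["b"], "outputs": ["c"]},
]
print([n["op"] for n in convert_ops(graph)])
```

Keeping the mapping in one dictionary means that an op which fits an existing handler only needs one new entry.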
Once all ops are converted, we need to do a topological sort since ONNX requires it. process_tf_graph() is the method that takes care of all above steps.
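A topological sort orders the nodes so that each one appears after the nodes that produce its inputs. The sketch below uses Kahn's algorithm on a plain dictionary. It is an illustration of the technique under that simplified graph representation, not the code in graph.py.

```python
from collections import deque


def topological_sort(nodes):
    """Order nodes so that every node comes after the nodes producing its inputs.

    nodes: dict mapping a node name to the list of node names it consumes.
    Names that are not keys of `nodes` are treated as graph inputs.
    """
    # Number of not yet emitted producers per node.
    pending = {name: sum(1 for dep in deps if dep in nodes)
               for name, deps in nodes.items()}
    consumers = {name: [] for name in nodes}
    for name, deps in nodes.items():
        for dep in deps:
            if dep in nodes:
                consumers[dep].append(name)

    ready = deque(name for name, count in pending.items() if count == 0)
    ordered = []
    while ready:
        name = ready.popleft()
        ordered.append(name)
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)

    if len(ordered) != len(nodes):
        raise ValueError("graph contains a cycle")
    return ordered


# Nodes listed out of order; "add" consumes "input" twice, as in tf.add(x, x).
order = topological_sort({"output": ["add"], "add": ["input", "input"], "input": []})
print(order)
```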
If you would like to contribute and add new conversions to tf2onnx, the process is something like:
- See if the op fits into one of the existing mappings. If so adding it to _OPS_MAPPING is all that is needed.
- If the new op needs extra processing, start a new mapping function.
- If the tensorflow op is composed of multiple ops, consider using a graph re-write. While this might be a little harder initially, it works better for complex patterns.
- Add a unit test in tests/test_backend.py. The unit tests mostly create the tensorflow graph, run it and capture the output, then convert to onnx, run against an onnx backend and compare tensorflow and onnx results.
- If there are pre-trained models that use the new op, consider adding those to tests/run_pretrained_models.py.
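The comparison at the end of such a unit test usually needs a tolerance, because two runtimes rarely agree bit for bit on floating point results. Here is a minimal sketch, assuming both results were flattened to lists of floats. The helper outputs_match and its default tolerances are hypothetical; the tests in this repository use their own helpers.

```python
import math


def outputs_match(expected, actual, rtol=1e-5, atol=1e-8):
    """Compare two flat lists of floats element by element within a tolerance."""
    if len(expected) != len(actual):
        return False
    return all(math.isclose(e, a, rel_tol=rtol, abs_tol=atol)
               for e, a in zip(expected, actual))


# Made-up values standing in for the captured tensorflow and onnx outputs.
tf_result = [2.0, 4.0, 6.0]
onnx_result = [2.0, 4.0000001, 6.0]
print(outputs_match(tf_result, onnx_result))
```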