
ONNX format for interchangeable AI models



The Open Neural Network Exchange (ONNX) format is a new standard for exchanging deep learning models. It makes deep learning models portable, which helps prevent vendor lock-in.

An open-source battle is being fought to dominate artificial intelligence. It is being fought by industry giants, universities and communities of machine-learning researchers worldwide.

The good news is that the battleground is free and open. None of the big players are pushing closed-source solutions. Keras and TensorFlow are backed by Google, MXNet is an Apache project endorsed by Amazon, and Caffe2 and PyTorch are supported by Facebook. All of these solutions are open-source software.

While these projects are open, they are not interoperable. Each framework constitutes a complete stack that, until recently, could not interface in any way with any other framework. A new industry-backed standard, the Open Neural Network Exchange (ONNX) format, aims to change that.

ONNX is being developed by three industry giants and is supported by several other industry leaders.

At a high level, ONNX is designed to allow framework interoperability. There are many excellent machine learning libraries in various languages, such as:

  • PyTorch
  • TensorFlow
  • MXNet
  • Caffe
    and several others

The idea is that you can train a model with one tool stack and then deploy it using another for inference and prediction.

To ensure this interoperability, you must export your model in the ONNX format (for example, as a model.onnx file), which is a serialized representation of the model in the protobuf file format. Currently there is native support in ONNX for PyTorch, CNTK, MXNet, and Caffe2, and there are also converters for TensorFlow and CoreML.

Challenge of building ONNX

Building a single file format that can express all of the capabilities of all the deep learning frameworks is no trivial task. How do you describe convolutions or recurrent networks with memory? What about embeddings and nearest neighbor algorithms found in fastText or StarSpace?

Like TensorFlow, ONNX declares that everything is a graph of tensor operations. That statement alone is not sufficient. Hundreds of operations must be supported, not all of which will be supported by all other tools and frameworks. Some frameworks may also implement an operation differently.
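The "graph of tensor operations" idea can be sketched with a toy interpreter in plain Python. This is only an illustration and not the ONNX API or file format: the `OPS` registry, the `GRAPH` layout, and `run_graph` are hypothetical names, and tensors are plain lists.

```python
# Toy illustration only: this is NOT the ONNX API or file format.
# Tensors are plain Python lists and every op works element-wise.

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def mul(a, b):
    return [x * y for x, y in zip(a, b)]

def relu(a):
    return [max(0.0, x) for x in a]

# Hypothetical operator registry mapping op names to implementations.
OPS = {"Add": add, "Mul": mul, "Relu": relu}

# y = Relu(x * w + b), written as three nodes listed in execution order.
GRAPH = [
    {"op": "Mul", "inputs": ["x", "w"], "output": "xw"},
    {"op": "Add", "inputs": ["xw", "b"], "output": "z"},
    {"op": "Relu", "inputs": ["z"], "output": "y"},
]

def run_graph(graph, feeds):
    """Evaluate the nodes in order, storing each intermediate tensor by name."""
    values = dict(feeds)
    for node in graph:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPS[node["op"]](*args)
    return values

feeds = {"x": [1.0, -2.0, 3.0], "w": [2.0, 2.0, 2.0], "b": [0.5, 0.5, 0.5]}
result = run_graph(GRAPH, feeds)
print(result["y"])  # [2.5, 0.0, 6.5]
```

A runtime that lacks an entry for one of the op names cannot execute the graph at all, which is why agreeing on the operator set matters as much as agreeing on the graph structure.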

There has been considerable debate in the ONNX community about what level tensor operations should be modeled at. Should ONNX be a mathematical toolbox that can support arbitrary equations with primitives such as sine and multiplication, or should it support higher-level constructs like integrated GRU units or Layer Normalization as single monolithic operations?
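The trade-off can be illustrated in plain Python by computing layer normalization two ways: as one monolithic function and as a chain of primitive operations. This is a simplified sketch, not ONNX code. It omits the learned scale and bias, and all function names here are hypothetical.

```python
import math

def layer_norm_fused(x, eps=1e-5):
    """Layer normalization as a single, monolithic operation."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

# The same computation expressed as a chain of small primitive operations.
def reduce_mean(x):
    return sum(x) / len(x)

def sub_scalar(x, s):
    return [v - s for v in x]

def mul(a, b):
    return [p * q for p, q in zip(a, b)]

def div_scalar(x, s):
    return [v / s for v in x]

def layer_norm_primitives(x, eps=1e-5):
    centered = sub_scalar(x, reduce_mean(x))      # x - mean
    variance = reduce_mean(mul(centered, centered))  # mean((x - mean)^2)
    return div_scalar(centered, math.sqrt(variance + eps))

x = [1.0, 2.0, 3.0, 4.0]
fused = layer_norm_fused(x)
composed = layer_norm_primitives(x)

# Both routes agree up to floating-point rounding.
print(all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(fused, composed)))
```

The primitive version needs only a handful of generic operations in the standard, but a runtime then has to recognize the pattern to optimize it. The monolithic version is easy to optimize but adds one more operation that every tool must implement.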

As it stands, ONNX currently defines about 100 operations. They range in complexity from arithmetic addition to a complete Long Short-Term Memory implementation. Not all tools support all operations, so just because you can generate an ONNX file of your model does not mean it will run anywhere.
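One practical consequence is that it helps to compare the operations a model uses against what the target tool supports before deploying. Below is a minimal sketch of that check in plain Python. The operator names are real ONNX operator names, but the model's operator list and the two runtimes (`runtime_a`, `runtime_b`) are invented for illustration.

```python
def unsupported_ops(model_ops, runtime_ops):
    """Return the sorted op names the model uses that the runtime lacks."""
    return sorted(set(model_ops) - set(runtime_ops))

# Hypothetical data: ops appearing in an exported model, and the ops
# that two made-up runtimes claim to implement.
model_ops = ["Conv", "Relu", "MaxPool", "LSTM", "Add"]
runtime_a = {"Conv", "Relu", "MaxPool", "Add", "LSTM", "Gemm"}
runtime_b = {"Conv", "Relu", "MaxPool", "Add"}

print(unsupported_ops(model_ops, runtime_a))  # []
print(unsupported_ops(model_ops, runtime_b))  # ['LSTM']
```

Here the same file would load in the first runtime but fail in the second, even though both read the ONNX format.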

Real life example: PyTorch to CoreML

Let us imagine that you want to train a model to predict whether a person is happy, so that you can ensure everyone is in a good mood. You decide to take a set of photos of people in various mental states and feed them to a convolutional neural network (CNN), training it to predict from an image whether the person is happy.

Once you have trained your model, you then want to deploy it to a new iOS mobile app so that anyone can use your pre-trained model to check the mental state of their friends. You initially trained your model using PyTorch for various reasons such as:

  • Your development team is familiar with PyTorch
  • You do not need high computational efficiency
  • Your app has a preference for dynamic computational graphs for future extension

The issue is that you decided to deploy it as an iOS app, which expects CoreML to be used inside the app.

You need not rewrite and retrain your model from scratch. Just use ONNX.

ONNX is an intermediary representation of your model that lets you move easily from one environment to another.

Using PyTorch, you would normally save your model with torch.save(the_model.state_dict(), PATH). Exporting to the ONNX interchange format takes just one more line:

torch.onnx.export(model, dummy_input, 'SplitModel.proto', verbose=True)

Using a tool like ONNX-CoreML, you can now turn your pre-trained model into a file that you can import into Xcode and integrate with your app.

Benefits of ONNX

ONNX offers several benefits:

  • You can use one framework to train your model and another framework for inference and deployment.

  • A single framework need not be complete and support everything from training to deployment challenges (network protocols, efficient allocation, parameter sharing). Each framework can handle a small part, and all of them can be glued together by ONNX.

  • Depending on the runtime environment or situation, you can use whichever framework works to your advantage.

  • Any tool that exports ONNX models can benefit from ONNX-compatible runtimes and libraries designed to maximize performance on some of the best hardware in the industry.

OpenGenus Tech Review Team


The official account of OpenGenus's Technical Review Team. This team reviews all technical articles and incorporates peer feedback. The team consists of experts in the leading domains of computing.
