Deploying deep neural networks is an essential endeavor in ML system design. OpenVINO™, developed by Intel, is a toolkit designed specifically for this purpose.
It stands out for its ability to optimize and accelerate neural network performance on a range of Intel hardware. This makes it an invaluable tool for developers looking to deploy AI models more effectively.

OpenVINO™ supports many well-known model formats and offers tools for model optimization, performance tuning, and easy deployment.

In this tutorial, we’ll explore a core component of this toolkit: the OpenVINO™ Converter, introduced in the 2023.1 OpenVINO™ release. It replaces the Model Optimizer, which is now considered legacy.

Let’s get started.

OpenVINO™ Converter

The OpenVINO™ converter is a tool that significantly enhances the performance of deep learning models. Unlike general-purpose optimizers, it explicitly targets Intel architectures (including ordinary desktop and server CPUs), ensuring that models are not just optimized in general but finely tuned to leverage the full capabilities of Intel processors.

💡
What are the results of such an optimization?

In my experiments, it gave an additional 22 to 47% improvement, depending on the CPU type. That is far from trivial and well worth checking.

It works on Intel hardware, including CPUs, integrated GPUs, and VPUs such as the Intel Movidius Neural Compute Stick. It is especially relevant for CPUs, since most cloud providers still use Intel processors for their dedicated server offerings. That opens the door to cheap optimization and inference for uncomplicated models.

How OpenVINO™ optimization works

The converter (formerly the optimizer) achieves this by converting and compressing AI models into an Intermediate Representation (IR) format. This process includes crucial steps like layer fusion and horizontal fusion, which streamline the neural network for more efficient computation. The result is reduced latency and higher throughput, which is particularly important for real-time applications.
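To build intuition for what layer fusion means, here is a toy NumPy sketch (my own illustration, not OpenVINO’s actual implementation) that folds an inference-time batch-normalization step into the preceding linear layer, collapsing two operations into one:


import numpy as np

rng = np.random.default_rng(0)

# A linear layer followed by batch normalization (inference mode)
W, b = rng.normal(size=(4, 8)), rng.normal(size=4)
gamma, beta = rng.normal(size=4), rng.normal(size=4)
mean, var, eps = rng.normal(size=4), rng.uniform(0.5, 2.0, size=4), 1e-5

x = rng.normal(size=8)
unfused = gamma * ((W @ x + b) - mean) / np.sqrt(var + eps) + beta

# Fused: fold the normalization into the weights and bias ahead of time
scale = gamma / np.sqrt(var + eps)
W_fused = W * scale[:, None]
b_fused = (b - mean) * scale + beta
fused = W_fused @ x + b_fused

assert np.allclose(unfused, fused)  # same result, one matmul instead of two steps

The fused network computes the same function with fewer passes over memory, which is where much of the latency win comes from.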

In the next section, we will download and install the toolkit.

How to install

Since the 2022.1 release, we can install OpenVINO™ via PyPI.


# Installing OpenVINO™
# After installation, make sure you have the newest version
pip install openvino

And that’s it. In the past, installation required running shell scripts; now it is a single pip install.
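To confirm the installation worked, you can print the installed runtime version (the exact build string will differ on your machine):


from openvino.runtime import get_version

# Print the installed OpenVINO™ runtime version, e.g. "2023.2.0-..."
print(get_version())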

Convert to Intermediate Representation (IR)

Here are the general steps for using the OpenVINO™ converter effectively:

  1. Model preparation: Ensure your model is in a supported format such as TensorFlow, ONNX, or PaddlePaddle. 
  2. Model conversion: Use the model conversion API to convert your model to OpenVINO’s Intermediate Representation (IR) format. This creates .xml and .bin files (the IR).
  3. Model optimization (optional): Fine-tune the conversion process with flags for batch size, precision, and other model-specific optimizations.
  4. Verification: Test the IR model using the OpenVINO™ Runtime (formerly the Inference Engine) to ensure it performs as expected (see the sketch after this list).
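Here is a minimal end-to-end sketch of steps 2 and 4, assuming a hypothetical model.onnx file with a static input shape (the file names and shapes are placeholders, not part of any specific model):


import numpy as np
import openvino as ov

# Step 2: convert the model to IR; this writes model.xml and model.bin
ir_model = ov.convert_model("model.onnx")
ov.save_model(ir_model, "model.xml")

# Step 4: verify the IR by compiling it and running a dummy inference
core = ov.Core()
compiled = core.compile_model("model.xml", device_name="CPU")
input_shape = compiled.input(0).shape  # assumes a static input shape
dummy = np.random.rand(*input_shape).astype(np.float32)
result = compiled(dummy)[compiled.output(0)]
print(result.shape)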

Converting a TensorFlow model

Here is an example of converting a MobileNet image classification model from TensorFlow Hub.

First, we need to install the dependencies:


# Install the dependencies
# Make sure to have a recent version of OpenVINO
pip install tensorflow_hub tensorflow pillow numpy matplotlib
pip install "openvino>=2023.2.0"

Then, perform the conversion:


from pathlib import Path
import os
from urllib.request import urlretrieve
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import tensorflow_hub as hub
import tensorflow as tf
import PIL
import numpy as np
import matplotlib.pyplot as plt

import openvino as ov

tf.get_logger().setLevel("ERROR")

IMAGE_SHAPE = (224, 224)
IMAGE_URL, IMAGE_PATH = "https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg", "data/grace_hopper.jpg"
MODEL_URL, MODEL_PATH = "https://www.kaggle.com/models/google/mobilenet-v1/frameworks/tensorFlow2/variations/100-224-classification/versions/2", "models/mobilenet_v2_100_224.xml"

model = hub.KerasLayer(MODEL_URL, input_shape=IMAGE_SHAPE + (3,))


# Perform the actual conversion to IR format
# Create the target directory first so ov.save_model can write there
Path(MODEL_PATH).parent.mkdir(parents=True, exist_ok=True)
if not Path(MODEL_PATH).exists():
    converted_model = ov.convert_model(model)
    ov.save_model(converted_model, MODEL_PATH)

And you should be able to see the .xml and .bin files locally.
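To verify the conversion (step 4 from the checklist above), we can compile the IR with the OpenVINO™ Runtime and run a quick sanity check on the sample image whose URL we defined earlier. A minimal sketch that continues from the variables above; the printed id indexes the classification head’s label set:


from PIL import Image

# Download the sample image if it is not present yet
Path(IMAGE_PATH).parent.mkdir(parents=True, exist_ok=True)
if not Path(IMAGE_PATH).exists():
    urlretrieve(IMAGE_URL, IMAGE_PATH)

# Preprocess: resize to the model's input size and scale pixels to [0, 1]
image = Image.open(IMAGE_PATH).resize(IMAGE_SHAPE)
input_tensor = np.expand_dims(np.array(image, dtype=np.float32) / 255.0, axis=0)

# Compile the IR for the CPU and run a single inference
core = ov.Core()
compiled_model = core.compile_model(MODEL_PATH, device_name="CPU")
result = compiled_model(input_tensor)[compiled_model.output(0)]
print("Predicted class id:", int(np.argmax(result)))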

What types of models can we convert

You can convert a significant number of architectures to the intermediate format, though some are more challenging than others. For example, converting a PyTorch model traditionally requires exporting it to ONNX first and then converting the ONNX file to IR (recent OpenVINO™ releases can also consume PyTorch models directly). This tutorial will help you do that, and a rough sketch follows below.
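As a hedged sketch of that ONNX route, here is roughly what it looks like for a torchvision ResNet-18 (torch and torchvision are extra dependencies, not part of this tutorial’s requirements):


import torch
import torchvision.models as models
import openvino as ov

# Step 1: export the PyTorch model to ONNX
model = models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # example input for tracing
torch.onnx.export(model, dummy_input, "resnet18.onnx")

# Step 2: convert the ONNX file to OpenVINO IR
ir_model = ov.convert_model("resnet18.onnx")
ov.save_model(ir_model, "resnet18.xml")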

In general, most Keras classification models can be converted easily, and you can experiment with other architectures from TensorFlow Hub as well.

Case studies

In the real world, OpenVINO has shown remarkable success in various use cases. 

One notable example I worked on is optimizing and serving a computer vision model for an e-commerce application. Converting the model with the OpenVINO™ converter increased its inference speed significantly, enabling real-time analysis of customer behavior in the online store.

The next step after converting to the intermediate format is deployment. It is very similar to deploying a model with TensorFlow Serving, which we covered in another tutorial. I plan to write a dedicated article on OpenVINO™ serving, so stay tuned.

In conclusion, OpenVINO™ is a robust solution for optimizing and deploying deep learning models, offering significant performance improvements and broad applicability in real-world scenarios.