Converting NVIDIA DeepStream Pipelines to Intel® Deep Learning Streamer (Intel® DL Streamer) Pipeline Framework#
This document describes the steps to convert a pipeline from NVIDIA DeepStream to Intel® DL Streamer Pipeline Framework. A running example is carried through the document and updated at each step to illustrate the modifications being described.
Note
The intermediate steps of the pipeline are not meant to run. They are simply there as a reference example of the changes being made in each section.
Preparing Your Model#
To use Intel® DL Streamer Pipeline Framework and OpenVINO™ Toolkit the model needs to be in Intermediate Representation (IR) format. To convert your model to this format, use the Model Optimizer tool from OpenVINO™ Toolkit.
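As a sketch, converting a model with the Model Optimizer command-line tool might look like the following. The model file name is a placeholder, and exact flags depend on your model and OpenVINO™ version (newer releases offer `ovc` as a successor to `mo`):

```sh
# Convert a model to OpenVINO IR format; produces model.xml and model.bin
# in the output directory. "model.onnx" is a placeholder for your own model.
mo --input_model model.onnx --output_dir ./ir_model
```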
Pipeline Framework’s inference plugins optionally can do some pre- and post-processing operations before/after running inference. These operations are specified in a model-proc file. Visit this page for more information on creating a model-proc file and examples with various models from Open Model Zoo.
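A model-proc file is a JSON document describing input pre-processing and output post-processing. A minimal sketch for a detection model could look like the following; the schema version, converter name, and labels are illustrative, not taken from the source:

```json
{
    "json_schema_version": "2.2.0",
    "input_preproc": [],
    "output_postproc": [
        {
            "converter": "detection_output",
            "labels": ["person", "vehicle"]
        }
    ]
}
```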
GStreamer Pipeline Adjustments#
In the following sections we will be converting the below pipeline that is using DeepStream elements to Pipeline Framework. It is taken from one of the examples here. It takes an input stream from file, decodes, runs inference, overlays the inferences on the video, re-encodes and outputs a new .mp4 file.
filesrc location=input_file.mp4 ! decodebin ! \
nvstreammux batch-size=1 width=1920 height=1080 ! queue ! \
nvinfer config-file-path=./config.txt ! \
nvvideoconvert ! "video/x-raw(memory:NVMM), format=RGBA" ! \
nvdsosd ! queue ! \
nvvideoconvert ! "video/x-raw, format=I420" ! videoconvert ! avenc_mpeg4 bitrate=8000000 ! qtmux ! filesink location=output_file.mp4
Mux and Demux Elements#
Remove nvstreammux and nvstreamdemux and all their properties.

- These elements are used in the case of multiple input streams to connect all inputs to the same inferencing element. In DL Streamer, inferencing elements share properties and instances if they have the same model-instance-id property.
- In our example we only have one source, so we will skip this for now. See the section Multiple Input Streams below for how to do this with Pipeline Framework.
At this stage we have removed nvstreammux and the queue that followed it. Notably, the batch-size property is also removed; it will be added back in the next section as a property of the Pipeline Framework inference elements.
filesrc location=input_file.mp4 ! decodebin ! \
nvinfer config-file-path=./config.txt ! \
nvvideoconvert ! "video/x-raw(memory:NVMM), format=RGBA" ! \
nvdsosd ! queue ! \
nvvideoconvert ! "video/x-raw, format=I420" ! videoconvert ! avenc_mpeg4 bitrate=8000000 ! qtmux ! filesink location=output_file.mp4
Inferencing Elements#
Remove nvinfer and replace it with gvainference, gvadetect, or gvaclassify depending on the following use cases:

- For detection on full frames, outputting regions of interest, use gvadetect. This replaces nvinfer when it is used in primary mode. Replace the config-file-path property with model and model-proc. gvadetect generates GstVideoRegionOfInterestMeta.
- For classification on previously detected objects, use gvaclassify. This replaces nvinfer when it is used in secondary mode. Replace the config-file-path property with model and model-proc. gvaclassify requires GstVideoRegionOfInterestMeta as input.
- For generic full-frame inference, use gvainference. This replaces nvinfer when used in primary mode. gvainference generates GstGVATensorMeta.
In this example we will use gvadetect to infer on the full frame and output regions of interest. batch-size was also added for consistency with what was removed above (the default value is 1, so it is not strictly needed). We replaced the config-file-path property with the model and model-proc properties. See the section Preparing Your Model above for converting the model to IR format and creating a model-proc file.
Note
The model-proc file is not always needed, depending on the model’s inputs and outputs.
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./model.xml model-proc=./model_proc.json batch-size=1 ! \
nvvideoconvert ! "video/x-raw(memory:NVMM), format=RGBA" ! \
nvdsosd ! queue ! \
nvvideoconvert ! "video/x-raw, format=I420" ! videoconvert ! avenc_mpeg4 bitrate=8000000 ! qtmux ! filesink location=output_file.mp4
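If the original DeepStream pipeline also ran a secondary-mode nvinfer for classification, the converted pipeline could chain gvaclassify after gvadetect. A hedged sketch with placeholder model paths, not part of the running example:

```sh
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./detect.xml model-proc=./detect.json ! queue ! \
gvaclassify model=./classify.xml model-proc=./classify.json ! \
fakesink
```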
Video Processing Elements#
Replace video processing elements with VA-API equivalents for GPU or native GStreamer elements for CPU.

- Replace nvvideoconvert with vaapipostproc or mfxvpp (GPU), or videoconvert (CPU). If nvvideoconvert is only being used to convert to/from memory:NVMM, it can simply be removed.
- nvv4l2decoder can be replaced with vaapi{CODEC}dec, for example vaapih264dec for decode only, or vaapidecodebin for decode plus vaapipostproc. Alternatively, the native GStreamer element decodebin can be used to automatically choose an available decoder.
- Some caps filters that follow an inferencing element may need to be adjusted or removed. Pipeline Framework inferencing elements do not support color space conversion in post-processing; you will need a vaapipostproc or videoconvert element to handle this.
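For example, when a downstream element requires a specific color format, an explicit conversion element with a caps filter can be placed after inference. A sketch, assuming NV12 output is needed; the format depends on your sink:

```sh
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./model.xml ! \
vaapipostproc ! "video/x-raw, format=NV12" ! fakesink
```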
Here we removed a few caps filters and instances of nvvideoconvert used for conversions from DeepStream’s NVMM memory, because Pipeline Framework uses standard GStreamer structures and memory types. We will leave the standard GStreamer element videoconvert to do color space conversion on the CPU; however, if available, we suggest using vaapipostproc to run on Intel Graphics. We will also use the standard GStreamer element decodebin to choose an appropriate demuxer and decoder depending on the input stream and on what is available on the system.
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./model.xml model-proc=./model_proc.json batch-size=1 ! \
nvdsosd ! queue ! \
videoconvert ! avenc_mpeg4 bitrate=8000000 ! qtmux ! filesink location=output_file.mp4
Metadata Elements#
Replace nvtracker with gvatrack.

- Remove the ll-lib-file property. Optionally replace it with tracking-type if you want to specify the algorithm used; by default the ‘short-term’ tracker is used.
- Remove all other properties.

Replace nvdsosd with gvawatermark.

- Remove all properties.

Replace nvmsgconv with gvametaconvert.

- gvametaconvert can be used to convert metadata from inferencing elements to JSON and to output metadata to the GST_DEBUG log.
- It has optional properties to configure what goes into the JSON object, including frame data for frames with no detections found, tensor data, the source the inferences came from, and tags, a user-defined JSON object attached to each output for additional custom data.

Replace nvmsgbroker with gvametapublish.

- gvametapublish can be used to output the JSON messages generated by gvametaconvert to stdout, a file, MQTT, or Kafka.
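For instance, detections could be serialized to JSON and published to a file. A sketch; the file path is a placeholder and the property values shown are the elements’ documented options:

```sh
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./model.xml model-proc=./model_proc.json ! \
gvametaconvert format=json ! \
gvametapublish method=file file-path=./detections.json ! \
fakesink
```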
The only metadata processing done in this pipeline is overlaying the inferences on the video, for which we use gvawatermark.
filesrc location=input_file.mp4 ! decodebin ! \
gvadetect model=./model.xml model-proc=./model_proc.json batch-size=1 ! \
gvawatermark ! queue ! \
videoconvert ! avenc_mpeg4 bitrate=8000000 ! qtmux ! filesink location=output_file.mp4
Multiple Input Streams#
Instead of using the nvstreammux element, Pipeline Framework shares all model and Inference Engine properties between elements that have the same model-instance-id property. This means you do not need to mux all sources together before inference, and you can remove any instances of nvstreammux and nvstreamdemux. Below is a pseudo example of a DeepStream pipeline with two streams.

nvstreammux name=mux ! nvinfer config-file-path=./config.txt ! nvstreamdemux name=demux \
filesrc ! decode ! mux.sink_0 \
filesrc ! decode ! mux.sink_1 \
demux.src_0 ! encode ! filesink \
demux.src_1 ! encode ! filesink
When using Pipeline Framework, the pipeline will look like this:
filesrc ! decode ! gvadetect model=./model.xml model-proc=./model_proc.json model-instance-id=model1 ! encode ! filesink \
filesrc ! decode ! gvadetect model-instance-id=model1 ! encode ! filesink
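As a concrete gst-launch-1.0 sketch of the two-stream case, reusing the encoding chain from the running example; input and output file names are placeholders:

```sh
gst-launch-1.0 \
filesrc location=input1.mp4 ! decodebin ! \
gvadetect model=./model.xml model-proc=./model_proc.json model-instance-id=model1 ! \
gvawatermark ! videoconvert ! avenc_mpeg4 ! qtmux ! filesink location=output1.mp4 \
filesrc location=input2.mp4 ! decodebin ! \
gvadetect model-instance-id=model1 ! \
gvawatermark ! videoconvert ! avenc_mpeg4 ! qtmux ! filesink location=output2.mp4
```

Because both gvadetect instances share model-instance-id=model1, the model is loaded once and both branches share the same inference instance and properties.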