.. index:: pair: page; Intel® Deep Learning Streamer (Intel® DL Streamer) API reference
.. _doxid-indexpage:

Intel® Deep Learning Streamer (Intel® DL Streamer) API reference
==================================================================

Overview
========

This documentation describes the Intel® DL Streamer metadata API (Application Programming Interface), which allows you to access and control inference results obtained from Intel® DL Streamer plugin elements such as **gvainference**, **gvadetect**, and **gvaclassify**.

In Intel® DL Streamer, all inference results (both raw and interpreted) travel through the GStreamer pipeline attached to GstBuffer objects as GStreamer metadata. For a better developer experience, we provide the Intel® DL Streamer API (in C++ and Python versions), which simplifies whatever you want to do with inference results: consuming and reusing them, running additional custom post-processing, and more. In most cases, the Intel® DL Streamer API also lets you abstract away from the internal representation of inference results as GStreamer metadata.

In the Intel® DL Streamer API, most of the work is done by these classes: :ref:`GVA::VideoFrame `, :ref:`GVA::RegionOfInterest ` and :ref:`GVA::Tensor ` (the C++ versions are referenced here, but Python versions are available as well). Take a look at the schematic picture below, which shows how these classes relate.

.. image:: GVA_API.png

The picture shows the three classes mentioned above. :ref:`GVA::VideoFrame ` is the highest-level object; it is constructed from a GstBuffer, so one :ref:`GVA::VideoFrame ` represents one image (one video frame).

One :ref:`GVA::VideoFrame ` can contain zero or more :ref:`GVA::RegionOfInterest ` objects (ROI in the picture above). A :ref:`GVA::RegionOfInterest ` describes a detected object, e.g. its bounding box coordinates and label. The number of :ref:`GVA::RegionOfInterest ` objects added to a :ref:`GVA::VideoFrame ` depends on whether detection has already run on this image (by a **gvadetect** element in the pipeline) and whether the image contains patterns the detection model recognizes (for example, an image with human faces processed by a detection model trained to detect faces).

Similarly, one :ref:`GVA::VideoFrame ` can contain zero or more :ref:`GVA::Tensor ` objects, which are products of running inference (by a **gvainference** element in the pipeline) on the full video frame. Such a :ref:`GVA::Tensor ` contains the raw inference result, which can be consumed and post-processed (interpreted) any way you like. For example, if you have a custom detection model that is not supported by **gvadetect**, you can run inference with a **gvainference** element in the pipeline and obtain the results as a :ref:`GVA::Tensor ` object. You can then use these results to derive regions of interest for this frame with your post-processing algorithm and add your own :ref:`GVA::RegionOfInterest ` to the :ref:`GVA::VideoFrame ` with :ref:`GVA::VideoFrame::add_region `. A tutorial devoted to this use case can be found in the ``Custom post-processing tutorial`` section.

A :ref:`GVA::RegionOfInterest ` can also contain multiple :ref:`GVA::Tensor ` objects, which are products of running classification (by a **gvaclassify** element in the pipeline) on the :ref:`GVA::RegionOfInterest ` they belong to. Such a :ref:`GVA::Tensor ` can contain a :ref:`GVA::Tensor::label `, a string with the interpreted classification result (examples are a person's age, a car's color, etc.).

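To make this hierarchy concrete, here is a minimal Python sketch that simply walks one frame using the calls described in this document (``regions()``, ``tensors()``, ``rect()``, ``name()``); it assumes a ``Gst.Buffer`` and its ``Gst.Caps`` are already available, for example inside a pad probe or appsink callback:

.. ref-code-block:: cpp

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst
    from gstgva import VideoFrame

    def summarize(buffer: Gst.Buffer, caps: Gst.Caps):
        # one VideoFrame wraps one GstBuffer (one image)
        frame = VideoFrame(buffer, caps=caps)

        # frame-level tensors, e.g. raw results attached by gvainference
        print("frame-level tensors:", len(list(frame.tensors())))

        # regions of interest, e.g. objects detected by gvadetect
        for roi in frame.regions():
            bbox = roi.rect()
            print("ROI at", bbox.x, bbox.y, bbox.w, bbox.h)
            # tensors attached to this ROI, e.g. classification results from gvaclassify
            for tensor in roi.tensors():
                print("  tensor:", tensor.name())
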
:ref:`GVA::Tensor ` also stores raw data (the model output blob), which can be obtained with :ref:`GVA::Tensor::data `. Normally, one :ref:`GVA::Tensor ` holds additional detection information, which can be obtained with :ref:`GVA::RegionOfInterest::detection `. You can check whether a :ref:`GVA::Tensor ` is a detection tensor with :ref:`GVA::Tensor::is_detection `. A detection :ref:`GVA::Tensor ` extends the detection information contained in the :ref:`GVA::RegionOfInterest `.

Any modification you make through the Intel® DL Streamer API affects the underlying metadata as though you had modified it directly. For example, when you add :ref:`GVA::RegionOfInterest ` or :ref:`GVA::Tensor ` objects to a :ref:`GVA::VideoFrame `, real GStreamer metadata instances are added to the GstBuffer of the current :ref:`GVA::VideoFrame `. This means the objects you added behave as if they were produced by the :ref:`GVA ` elements mentioned above. You can reach these objects further down the pipeline using both the internal metadata representation and the Intel® DL Streamer API, and any Intel® DL Streamer elements that rely on inference results will use the objects you added. For example, **gvawatermark** will render these objects on screen (it will draw a bounding box for an added :ref:`GVA::RegionOfInterest `).

Inference results flow
======================

A video analytics pipeline is a GStreamer pipeline with one or several Intel® DL Streamer elements for inference and, if needed, additional actions (publishing, rendering, etc.). Take a look at this pipeline:

.. ref-code-block:: cpp

    INPUT=video.mp4
    MODEL1=face-detection-adas-0001
    MODEL2=age-gender-recognition-retail-0013
    MODEL3=emotions-recognition-retail-0003
    gst-launch-1.0 \
       filesrc location=${INPUT} ! decodebin ! video/x-raw ! videoconvert ! \
       gvadetect model=$(MODEL_PATH $MODEL1) ! queue ! \
       gvaclassify model=$(MODEL_PATH $MODEL2) model-proc=$(PROC_PATH $MODEL2) ! queue ! \
       gvaclassify model=$(MODEL_PATH $MODEL3) model-proc=$(PROC_PATH $MODEL3) ! queue ! \
       gvawatermark ! videoconvert ! fpsdisplaysink video-sink=xvimagesink sync=false

Note: the following explanation is based on the C++ Intel® DL Streamer API, but the Python Intel® DL Streamer API can be used in the same way.

Here, **gvadetect** runs face detection on each frame (:ref:`GVA::VideoFrame `) of the provided video file. Let's say it detects 3 faces on a particular frame. This means that 3 instances of :ref:`GVA::RegionOfInterest ` are added to this frame. Each :ref:`GVA::RegionOfInterest ` also gets its own detection :ref:`GVA::Tensor ` attached, which contains additional detection information.

After detection, the frame goes into the first of the two **gvaclassify** elements, which iterates over the 3 instances of :ref:`GVA::RegionOfInterest ` added to the :ref:`GVA::VideoFrame `. For each :ref:`GVA::RegionOfInterest ` it takes the region itself from the full frame and runs age & gender classification on it. Thus, each detected person also gets age and gender classified. This is achieved by adding 2 :ref:`GVA::Tensor ` instances to the current :ref:`GVA::RegionOfInterest ` (one for age and one for gender).

After the first classification, the frame goes into the second **gvaclassify** element, which does the same job as the previous **gvaclassify** element, but classifies the person's emotion instead of age & gender. The resulting emotion is stored in its own :ref:`GVA::Tensor ` added to the current :ref:`GVA::RegionOfInterest `.

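As an illustration, here is a minimal, hedged Python sketch of reading the per-object results that a pipeline like the one above attaches; it uses only the API calls shown in the snippets later in this document, and the actual attribute names ("age", "gender", "emotion", etc.) depend on the models and model-proc files you use:

.. ref-code-block:: cpp

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst
    from gstgva import VideoFrame

    def collect_attributes(buffer: Gst.Buffer, caps: Gst.Caps):
        frame = VideoFrame(buffer, caps=caps)
        for roi in frame.regions():
            # map attribute name -> interpreted label, e.g. {"age": "25", "gender": "female", "emotion": "happy"}
            attributes = {}
            for tensor in roi.tensors():
                if tensor.is_detection():
                    continue  # the detection tensor carries no classification label
                attributes[tensor.name()] = tensor.label()
            bbox = roi.rect()
            print("ROI", bbox.x, bbox.y, bbox.w, bbox.h, attributes)
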
Thus, after the detection & classification part of the pipeline we have a :ref:`GVA::VideoFrame ` with 3 instances of :ref:`GVA::RegionOfInterest ` added, and 4 instances of :ref:`GVA::Tensor ` added to each of those 3 regions (one detection tensor plus age, gender and emotion tensors). At the image level, this means we have 3 persons detected, and the age, gender & emotion of each of them classified.

The last Intel® DL Streamer element in the pipeline is **gvawatermark**, whose job is to display inference results on top of the frame currently rendered on screen. To do this, it creates a :ref:`GVA::VideoFrame ` instance for the current frame and iterates over the attached :ref:`GVA::RegionOfInterest ` instances, and over the :ref:`GVA::Tensor ` instances attached to each :ref:`GVA::RegionOfInterest `. This way, you will see bounding boxes drawn around people's faces, and these boxes will also be labeled with age, gender & emotion.

The whole point of the Intel® DL Streamer API is to let you access any object described above at any point of the pipeline. You can read it, modify it, add your own tensors and regions of interest, and so on. Thus, you can add custom post-processing for any deep learning model in case the Intel® DL Streamer inference elements don't support it. There is a handy **gvainference** element, which stores the raw inference result (the output layer blob) in a :ref:`GVA::Tensor `, so you can do custom post-processing on it.

Print inference results example
===============================

Here we provide a few code snippets that print some of the inference results obtained from a video analytics pipeline. They are different ways to achieve the same thing. To get a thorough understanding of how to create an application that takes advantage of the Intel® DL Streamer API, make sure to check out the rest of this document.

1. Access inference results with low-level C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is not the recommended way of dealing with inference results obtained from a video analytics pipeline and should be avoided unless specific considerations require it.

.. ref-code-block:: cpp

    #include <stdio.h>
    #include <gst/video/gstvideometa.h> // contains GstVideoRegionOfInterestMeta

    void PrintMeta(GstBuffer *buffer) {
        gpointer state = NULL;
        GstMeta *meta = NULL;
        while ((meta = gst_buffer_iterate_meta(buffer, &state)) != NULL) { // iterate over meta attached to buffer
            if (meta->info->api != GST_VIDEO_REGION_OF_INTEREST_META_API_TYPE)
                continue; // we are only interested in GstVideoRegionOfInterestMeta instances, because they are produced by gvadetect

            GstVideoRegionOfInterestMeta *roi_meta = (GstVideoRegionOfInterestMeta *)meta;
            printf("Object bounding box %d,%d,%d,%d\n", roi_meta->x, roi_meta->y, roi_meta->w, roi_meta->h);

            for (GList *l = roi_meta->params; l; l = g_list_next(l)) { // iterate over tensors attached to this region of interest
                GstStructure *structure = (GstStructure *)l->data;
                if (gst_structure_has_name(structure, "detection"))
                    continue; // detection tensor doesn't contain classification result and hence doesn't contain label

                // print some tensor information
                printf("    Attribute %s\n", gst_structure_get_name(structure));
                if (gst_structure_has_field(structure, "label")) {
                    printf("        label=%s\n", gst_structure_get_string(structure, "label"));
                }
                if (gst_structure_has_field(structure, "confidence")) {
                    double confidence;
                    gst_structure_get_double(structure, "confidence", &confidence);
                    printf("        confidence=%.2f\n", confidence);
                }
            }
        }
    }

2. Access inference results with C++ API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the recommended way of dealing with inference results obtained from a video analytics pipeline in a C++ application.

.. ref-code-block:: cpp

    #include <iostream>
    #include <vector>
    #include "video_frame.h" // contains GVA::VideoFrame, GVA::RegionOfInterest and GVA::Tensor

    void PrintMeta(GstBuffer *buffer, GstCaps *caps) {
        // simple function to display some tensor information
        :ref:`GVA::VideoFrame ` video_frame(buffer, caps);
        std::vector<GVA::RegionOfInterest> regions = video_frame.regions();
        for (:ref:`GVA::RegionOfInterest ` &roi : regions) { // iterate over regions of interest attached to this video frame
            auto bbox = roi.rect(); // get bounding box information
            std::cout << "Object bounding box " << bbox.x << "," << bbox.y << "," << bbox.w << "," << bbox.h << std::endl;
            for (const :ref:`GVA::Tensor ` &tensor : roi.tensors()) { // iterate over tensors attached to this region of interest
                if (tensor.is_detection())
                    continue; // detection tensor doesn't contain classification result and hence doesn't contain label

                // print some tensor information
                std::cout << "    Attribute " << tensor.name() << std::endl;
                std::cout << "        label= " << tensor.label() << std::endl;
                std::cout << "        confidence= " << tensor.confidence() << std::endl;
            }
        }
    }

3. Access inference results with Python API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the recommended way of dealing with inference results obtained from a video analytics pipeline in Python code (via a Python application or via the ``gvapython`` element in a pipeline).

.. ref-code-block:: cpp

    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst # contains GStreamer entities
    from gstgva import VideoFrame

    def PrintMeta(buffer: Gst.Buffer, caps: Gst.Caps):
        # simple function to display some tensor information
        frame = VideoFrame(buffer, caps=caps)
        for roi in frame.regions(): # iterate over regions of interest attached to this video frame
            bbox = roi.rect() # get bounding box information
            print("Object bounding box {}, {}, {}, {}".format(bbox.x, bbox.y, bbox.w, bbox.h))
            for tensor in roi.tensors(): # iterate over tensors attached to this region of interest
                if tensor.is_detection():
                    continue # detection tensor doesn't contain classification result and hence doesn't contain label

                # print some tensor information
                print("    Attribute {}".format(tensor.name()))
                print("        label= {}".format(tensor.label()))
                print("        confidence= {}".format(tensor.confidence()))

Custom post-processing tutorial
===============================

There are several ways to access inference results provided by Intel® DL Streamer elements in a GStreamer pipeline using the Intel® DL Streamer API:

#. Create a **C++/Python** application that sets up a callback on buffers passing through any Intel® DL Streamer element and runs the video analytics pipeline. In the body of this callback the GstBuffer and, hence, the Intel® DL Streamer API are available.

#. Create a **C++/Python** application that runs the video analytics pipeline with a standard appsink element added as the sink. You can then register a callback on GstBuffers arriving at appsink (a minimal sketch of this approach is shown after this list).

#. Write your own GStreamer plugin that has access to the GstBuffers coming through, and insert it into the pipeline after the Intel® DL Streamer inference elements.

We have plenty of examples of ways 1 & 2 in our samples, and most existing Intel® DL Streamer elements that rely on existing inference results are essentially based on the 3rd way (e.g. **gvaclassify**, **gvatrack**, and more).

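For reference, here is a minimal, hedged Python sketch of the second approach; the input file name and model path are placeholders to replace with your own, and the gstgva calls are the same ones used in the snippets above:

.. ref-code-block:: cpp

    import sys
    import gi
    gi.require_version('Gst', '1.0')
    from gi.repository import Gst
    from gstgva import VideoFrame

    Gst.init(sys.argv)

    # placeholder pipeline: substitute your own input file and model path
    PIPELINE = ("filesrc location=video.mp4 ! decodebin ! videoconvert ! "
                "gvadetect model=/path/to/model.xml ! queue ! "
                "appsink name=sink emit-signals=true sync=false")

    def on_new_sample(sink):
        sample = sink.emit("pull-sample")  # take the next sample delivered to appsink
        frame = VideoFrame(sample.get_buffer(), caps=sample.get_caps())
        for roi in frame.regions():        # regions of interest added by gvadetect
            print(roi.rect())
        return Gst.FlowReturn.OK

    pipeline = Gst.parse_launch(PIPELINE)
    pipeline.get_by_name("sink").connect("new-sample", on_new_sample)
    pipeline.set_state(Gst.State.PLAYING)
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.ERROR | Gst.MessageType.EOS)
    pipeline.set_state(Gst.State.NULL)
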
Let's focus on one specific task: say you have a very specific deep learning model that requires custom post-processing (**gvadetect** is not able to correctly interpret the inference results of some models you can train or find on the Web). You know how the post-processing should be implemented, but you don't know how to get and make use of the inference results produced by the video analytics pipeline.

To solve this task, you need the **gvainference** element, which runs deep learning model inference on each passing video frame and stores the raw inference result in the form of a :ref:`GVA::Tensor ` attached to the :ref:`GVA::VideoFrame `. So we use **gvainference** to get tensors, but how do we access these produced tensors? Any of the 3 approaches above will suffice. For clarity of explanation, let's choose the 1st one and focus on it.

Also, for our tutorial we will add "custom" post-processing for SSD-like models. **gvadetect** already implements this type of post-processing, but here we will use **gvainference** and set up the post-processing as a callback. In your case, you only need to put your post-processing code in place of ours.

Below you can see the full snippet of **Python** code that is ready to solve this task. Take a look, and then we will walk through it in detail. Note that almost the same code can be written in **C++**.

.. ref-code-block:: cpp

    import sys
    from argparse import ArgumentParser

    import gi
    # get Python bindings for GLib-based libraries
    gi.require_version('GstVideo', '1.0')
    gi.require_version('Gst', '1.0')
    gi.require_version('GObject', '2.0')
    from gi.repository import Gst, GstVideo, GObject

    # Intel® DL Streamer API modules
    from gstgva import VideoFrame, util

    parser = ArgumentParser(add_help=False)
    _args = parser.add_argument_group('Options')
    _args.add_argument("-i", "--input", help="Required. Path to input video file", required=True, type=str)
    _args.add_argument("-d", "--detection_model", help="Required. Path to an .xml file with object detection model", required=True, type=str)

    # init GStreamer
    Gst.init(sys.argv)

    # post-processing code
    def process_frame(frame: VideoFrame, threshold: float = 0.5) -> bool:
        width = frame.video_info().width
        height = frame.video_info().height

        for tensor in frame.tensors():
            dims = tensor.dims()
            data = tensor.data()
            object_size = dims[-1]
            for i in range(dims[-2]):
                image_id = data[i * object_size + 0]
                confidence = data[i * object_size + 2]
                x_min = int(data[i * object_size + 3] * width + 0.5)
                y_min = int(data[i * object_size + 4] * height + 0.5)
                x_max = int(data[i * object_size + 5] * width + 0.5)
                y_max = int(data[i * object_size + 6] * height + 0.5)
                if image_id != 0:
                    break
                if confidence < threshold:
                    continue
                frame.add_region(x_min, y_min, x_max - x_min, y_max - y_min, "car", confidence)

        return True

    def detect_postproc_callback(pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            status = process_frame(frame)
            return Gst.PadProbeReturn.OK if status else Gst.PadProbeReturn.DROP

    def main():
        args = parser.parse_args()

        # build pipeline using parse_launch
        pipeline_str = "filesrc location={} ! decodebin ! videoconvert ! video/x-raw,format=BGRx ! " \
            "gvainference name=gvainference model={} ! queue ! " \
            "gvawatermark ! videoconvert ! fpsdisplaysink video-sink=xvimagesink sync=false".format(
                args.input, args.detection_model)
        pipeline = Gst.parse_launch(pipeline_str)

        # set callback
        gvainference = pipeline.get_by_name("gvainference")
        if gvainference:
            pad = gvainference.get_static_pad("src")
            pad.add_probe(Gst.PadProbeType.BUFFER, detect_postproc_callback)

        # start pipeline
        pipeline.set_state(Gst.State.PLAYING)

        # wait until EOS or error
        bus = pipeline.get_bus()
        msg = bus.timed_pop_filtered(
            Gst.CLOCK_TIME_NONE, Gst.MessageType.ERROR | Gst.MessageType.EOS)

        # free pipeline
        pipeline.set_state(Gst.State.NULL)

    if __name__ == '__main__':
        sys.exit(main() or 0)

Let's go through the most interesting pieces. First, we import the necessary Python modules:

.. ref-code-block:: cpp

    import sys
    from argparse import ArgumentParser

    import gi
    # get Python bindings for GLib-based libraries
    gi.require_version('GstVideo', '1.0')
    gi.require_version('Gst', '1.0')
    gi.require_version('GObject', '2.0')
    from gi.repository import Gst, GstVideo, GObject

    # Intel® DL Streamer API modules
    from gstgva import VideoFrame, util

Then we parse command-line arguments. When running this script, specify the input video with "-i" and your detection model with "-d":

.. ref-code-block:: cpp

    parser = ArgumentParser(add_help=False)
    _args = parser.add_argument_group('Options')
    _args.add_argument("-i", "--input", help="Required. Path to input video file", required=True, type=str)
    _args.add_argument("-d", "--detection_model", help="Required. Path to an .xml file with object detection model", required=True, type=str)

Next, the ``process_frame`` function defines the post-processing. As we said above, this code is for SSD-like models, so feel free to replace it with your own post-processing implementation that suits your custom model. Meanwhile, let's take a look at the usage of the Intel® DL Streamer API in this piece. Plenty of image information about the current video frame can be obtained with :ref:`gstgva.video_frame.VideoFrame.video_info `. You can get the image width, height, channel format and much more:

.. ref-code-block:: cpp

    width = frame.video_info().width
    height = frame.video_info().height

Next, we iterate over :ref:`gstgva.video_frame.VideoFrame.tensors `, which were added by **gvainference**. We can get inference result information such as :ref:`gstgva.tensor.Tensor.dims ` (the list of model output blob dimensions) and :ref:`gstgva.tensor.Tensor.data ` (the raw output blob to interpret with your post-processing code):

.. ref-code-block:: cpp

    for tensor in frame.tensors():
        dims = tensor.dims()
        data = tensor.data()

After we extract the bounding box parameters from the raw inference blob, we are ready to call :ref:`gstgva.video_frame.VideoFrame.add_region ` with the box coordinates, label and confidence as parameters:

.. ref-code-block:: cpp

    frame.add_region(x_min, y_min, x_max - x_min, y_max - y_min, "car", confidence)

Next, we define the callback that will run ``process_frame`` (our post-processing code) on each video frame passing through the pipeline:

.. ref-code-block:: cpp

    def detect_postproc_callback(pad, info):
        with util.GST_PAD_PROBE_INFO_BUFFER(info) as buffer:
            caps = pad.get_current_caps()
            frame = VideoFrame(buffer, caps=caps)
            status = process_frame(frame)
            return Gst.PadProbeReturn.OK if status else Gst.PadProbeReturn.DROP

In the ``main`` function we create a string template of the video analytics pipeline with **gvainference** to run inference and **gvawatermark** to display the bounding boxes and their labels (the ones we set to "car"):

.. ref-code-block:: cpp

    # build pipeline using parse_launch
    pipeline_str = "filesrc location={} ! decodebin ! videoconvert ! video/x-raw,format=BGRx ! " \
        "gvainference name=gvainference model={} ! queue ! " \
        "gvawatermark ! videoconvert ! fpsdisplaysink video-sink=xvimagesink sync=false".format(
            args.input, args.detection_model)

Finally, we register the callback on the **gvainference** source pad (the source pad is the one that produces GstBuffers):

.. ref-code-block:: cpp

    # set callback
    gvainference = pipeline.get_by_name("gvainference")
    if gvainference:
        pad = gvainference.get_static_pad("src")
        pad.add_probe(Gst.PadProbeType.BUFFER, detect_postproc_callback)

Thus, before the current frame leaves **gvainference**, ``detect_postproc_callback`` is called with access to the GstBuffer, and the custom post-processing code is executed there. At the end of the application, after all frames have passed through the pipeline, playback finishes.

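To try the tutorial, save the full snippet above to a file and run it with your own video and model paths; the script file name here is just an example:

.. ref-code-block:: cpp

    python3 custom_postprocessing.py -i /path/to/video.mp4 -d /path/to/detection_model.xml
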