# Porting anisotropic image segmentation on G-API {#tutorial_gapi_anisotropic_segmentation}
|
||||
|
||||
[TOC]
|
||||
|
||||
# Introduction {#gapi_anisotropic_intro}
|
||||
|
||||
In this tutorial you will learn:
|
||||
* How an existing algorithm can be transformed into a G-API
|
||||
computation (graph);
|
||||
* How to inspect and profile G-API graphs;
|
||||
* How to customize graph execution without changing its code.
|
||||
|
||||
This tutorial is based on @ref
|
||||
tutorial_anisotropic_image_segmentation_by_a_gst.
|
||||
|
||||
# Quick start: using OpenCV backend {#gapi_anisotropic_start}
|
||||
|
||||
Before we start, let's review the original algorithm implementation:
|
||||
|
||||
@include cpp/tutorial_code/ImgProc/anisotropic_image_segmentation/anisotropic_image_segmentation.cpp
|
||||
|
||||
## Examining calcGST() {#gapi_anisotropic_calcgst}
|
||||
|
||||
The function calcGST() is clearly an image processing pipeline:
* It is just a sequence of operations over a number of cv::Mat objects;
* No logic (conditionals) or loops are involved in the code;
* All functions operate on 2D images (like cv::Sobel, cv::multiply,
cv::boxFilter, cv::sqrt, etc).
|
||||
|
||||
Considering the above, calcGST() is a great candidate to start
|
||||
with. In the original code, its prototype is defined like this:
|
||||
|
||||
@snippet cpp/tutorial_code/ImgProc/anisotropic_image_segmentation/anisotropic_image_segmentation.cpp calcGST_proto
|
||||
|
||||
With G-API, we can define it as follows:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi.cpp calcGST_proto
|
||||
|
||||
It is important to understand that the new G-API based version of
|
||||
calcGST() will just produce a compute graph, in contrast to its
|
||||
original version, which actually calculates the values. This is a
|
||||
principal difference -- G-API based functions like this are used to
|
||||
construct graphs, not to process the actual data.
|
||||
|
||||
Let's start implementing calcGST() with the calculation of the \f$J\f$
matrix. This is what the original code looks like:
|
||||
|
||||
@snippet cpp/tutorial_code/ImgProc/anisotropic_image_segmentation/anisotropic_image_segmentation.cpp calcJ_header
|
||||
|
||||
Here we need to declare output objects for every new operation (see
|
||||
img as a result for cv::Mat::convertTo, imgDiffX and others as results for
|
||||
cv::Sobel and cv::multiply).
|
||||
|
||||
The G-API analogue is listed below:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi.cpp calcGST_header
|
||||
|
||||
This snippet demonstrates the following syntactic differences between
G-API and traditional OpenCV:
* All standard G-API functions are by default placed in the "cv::gapi"
namespace;
* G-API operations _return_ their results -- there's no need to pass
extra "output" parameters to the functions.
|
||||
|
||||
Note -- this code is also using `auto` -- types of intermediate objects
|
||||
like `img`, `imgDiffX`, and so on are inferred automatically by the
|
||||
C++ compiler. In this example, the types are determined by G-API
|
||||
operation return values which all are cv::GMat.
|
||||
|
||||
G-API standard kernels are trying to follow OpenCV API conventions
|
||||
whenever possible -- so cv::gapi::sobel takes the same arguments as
|
||||
cv::Sobel, cv::gapi::mul follows cv::multiply, and so on (except
|
||||
having a return value).
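
As a hedged illustration of this style (this is not the tutorial's actual
snippet; names are illustrative, and in the G-API headers the Sobel function
mirrors cv::Sobel's capitalization), the \f$J\f$ components could be built
like this:

```cpp
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>    // cv::gapi::convertTo, cv::gapi::mul, ...
#include <opencv2/gapi/imgproc.hpp> // cv::gapi::Sobel, cv::gapi::boxFilter, ...

// Build (not compute!) the J components from an input GMat.
static void calcJ(const cv::GMat& inputImg, cv::GMat& J11, cv::GMat& J22, cv::GMat& J12)
{
    auto img      = cv::gapi::convertTo(inputImg, CV_32F);
    auto imgDiffX = cv::gapi::Sobel(img, CV_32F, 1, 0, 3);
    auto imgDiffY = cv::gapi::Sobel(img, CV_32F, 0, 1, 3);
    J11 = cv::gapi::mul(imgDiffX, imgDiffX);  // Jxx
    J22 = cv::gapi::mul(imgDiffY, imgDiffY);  // Jyy
    J12 = cv::gapi::mul(imgDiffX, imgDiffY);  // Jxy
}
```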
|
||||
|
||||
The rest of the calcGST() function can be implemented trivially in the
same way. Below is its full source code:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi.cpp calcGST
|
||||
|
||||
## Running G-API graph {#gapi_anisotropic_running}
|
||||
|
||||
After calcGST() is defined in G-API terms, we can construct a graph
based on it and finally run it -- pass an input image and obtain the
result. Before we do that, let's have a look at what the original code
looked like:
|
||||
|
||||
@snippet cpp/tutorial_code/ImgProc/anisotropic_image_segmentation/anisotropic_image_segmentation.cpp main_extra
|
||||
|
||||
G-API-based functions like calcGST() can't be applied to input data
directly, since they are _construction_ code, not _processing_ code.
|
||||
In order to _run_ computations, a special object of class
|
||||
cv::GComputation needs to be created. This object wraps our G-API code
|
||||
(which is a composition of G-API data and operations) into a callable
|
||||
object, similar to C++11
|
||||
[std::function<>](https://en.cppreference.com/w/cpp/utility/functional/function).
|
||||
|
||||
The cv::GComputation class has a number of constructors which can be used
to define a graph. Generally, the user needs to pass the graph boundaries
-- the _input_ and _output_ objects on which a GComputation is
defined. Then G-API analyzes the call flow from _outputs_ to _inputs_
and reconstructs the graph with the operations in-between the specified
boundaries. This may sound complex, but in fact the code looks
like this:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi.cpp main
|
||||
|
||||
Note that this code differs slightly from the original one: forming up
the resulting image is also a part of the pipeline (done with
cv::gapi::addWeighted).
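
For orientation, a minimal self-contained sketch of defining and running a
cv::GComputation (with a toy one-operation graph, not this tutorial's
pipeline) might look like this:

```cpp
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>
#include <opencv2/gapi/imgproc.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    cv::GMat in;                                                     // empty object marking the graph input
    cv::GMat out = cv::gapi::boxFilter(in, CV_32F, cv::Size(3, 3));  // any expression over `in`
    cv::GComputation pipeline(cv::GIn(in), cv::GOut(out));           // graph boundaries

    cv::Mat input = cv::imread("input.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat result;
    pipeline.apply(cv::gin(input), cv::gout(result));                // actual processing happens here
    return 0;
}
```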
|
||||
|
||||
The result of this G-API pipeline matches the original one bit-exactly
(given the same input image):
|
||||
|
||||

|
||||
|
||||
## G-API initial version: full listing {#gapi_anisotropic_ocv}
|
||||
|
||||
Below is the full listing of the initial anisotropic image
|
||||
segmentation port on G-API:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi.cpp full_sample
|
||||
|
||||
# Inspecting the initial version {#gapi_anisotropic_inspect}
|
||||
|
||||
Now that we have an initial version of our algorithm working with
G-API, we can use it to inspect and learn how G-API works. This
chapter covers two aspects: understanding the graph structure, and
memory profiling.
|
||||
|
||||
## Understanding the graph structure {#gapi_anisotropic_inspect_graph}
|
||||
|
||||
G-API stands for "Graph API", but did you notice any graphs in the
above example? That was one of the initial design goals -- G-API was
designed with expressions in mind to make the adoption and porting process
more straightforward. People _usually_ don't think in terms of
_Nodes_ and _Edges_ when writing ordinary code, so G-API, while
being a Graph API, doesn't force its users to do that.
|
||||
|
||||
However, a graph is still built implicitly when a cv::GComputation
object is defined. It may be useful to inspect what the resulting graph
looks like, to check that it is generated correctly and that it really
represents our algorithm. It is also useful to learn the structure of
the graph to see if it has any redundancies.
|
||||
|
||||
G-API allows dumping generated graphs to `.dot` files, which can then
be visualized with [Graphviz](https://www.graphviz.org/), a
popular open-source graph visualization package.
|
||||
|
||||
<!-- TODO THIS VARIABLE NEEDS TO BE FIXED TO DUMP DIR ASAP! -->
|
||||
|
||||
In order to dump our graph to a `.dot` file, set `GRAPH_DUMP_PATH` to a
|
||||
file name before running the application, e.g.:
|
||||
|
||||
$ GRAPH_DUMP_PATH=segm.dot ./bin/example_tutorial_porting_anisotropic_image_segmentation_gapi
|
||||
|
||||
Now this file can be visualized with a `dot` command like this:
|
||||
|
||||
$ dot segm.dot -Tpng -o segm.png
|
||||
|
||||
or viewed interactively with `xdot` (please refer to your
|
||||
distribution/operating system documentation on how to install these
|
||||
packages).
|
||||
|
||||

|
||||
|
||||
The above diagram demonstrates a number of interesting aspects of
|
||||
G-API's internal algorithm representation:
|
||||
1. The underlying G-API graph is bipartite: it consists of
_Operation_ and _Data_ nodes such that a _Data_ node can only be
connected to an _Operation_ node, an _Operation_ node can only be
connected to a _Data_ node, and nodes of a single kind are never
connected directly.
2. The graph is directed -- every edge in the graph has a direction.
3. The graph "begins" and "ends" with _Data_ nodes.
4. A _Data_ node can have only a single writer and multiple readers.
5. An _Operation_ node may have multiple inputs, though every input
must have a unique _port number_ (among inputs).
6. An _Operation_ node may have multiple outputs, and every output
must have a unique _port number_ (among outputs).
|
||||
|
||||
## Measuring memory footprint {#gapi_anisotropic_memory_ocv}
|
||||
|
||||
Let's measure and compare the memory footprint of the algorithm in its two
versions: G-API-based and OpenCV-based. At the moment, the G-API version
is also OpenCV-based since it falls back to OpenCV functions inside.
|
||||
|
||||
On GNU/Linux, application memory footprint can be profiled with
|
||||
[Valgrind](http://valgrind.org/). On Debian/Ubuntu systems it can be
|
||||
installed like this (assuming you have administrator privileges):
|
||||
|
||||
$ sudo apt-get install valgrind massif-visualizer
|
||||
|
||||
Once installed, we can collect memory profiles easily for our two
|
||||
algorithm versions:
|
||||
|
||||
$ valgrind --tool=massif --massif-out-file=ocv.out ./bin/example_tutorial_anisotropic_image_segmentation
|
||||
==6101== Massif, a heap profiler
|
||||
==6101== Copyright (C) 2003-2015, and GNU GPL'd, by Nicholas Nethercote
|
||||
==6101== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
|
||||
==6101== Command: ./bin/example_tutorial_anisotropic_image_segmentation
|
||||
==6101==
|
||||
==6101==
|
||||
$ valgrind --tool=massif --massif-out-file=gapi.out ./bin/example_tutorial_porting_anisotropic_image_segmentation_gapi
|
||||
==6117== Massif, a heap profiler
|
||||
==6117== Copyright (C) 2003-2015, and GNU GPL'd, by Nicholas Nethercote
|
||||
==6117== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
|
||||
==6117== Command: ./bin/example_tutorial_porting_anisotropic_image_segmentation_gapi
|
||||
==6117==
|
||||
==6117==
|
||||
|
||||
Once done, we can inspect the collected profiles with
|
||||
[Massif Visualizer](https://github.com/KDE/massif-visualizer)
|
||||
(installed in the above step).
|
||||
|
||||
Below is the visualized memory profile of the original OpenCV version
|
||||
of the algorithm:
|
||||
|
||||

|
||||
|
||||
We see that memory is allocated as the application
|
||||
executes, reaching its peak in the calcGST() function; then the
|
||||
footprint drops as calcGST() completes its execution and all temporary
|
||||
buffers are freed. Massif reports a peak memory consumption of 7.6 MiB.
|
||||
|
||||
Now let's have a look on the profile of G-API version:
|
||||
|
||||

|
||||
|
||||
Once G-API computation is created and its execution starts, G-API
|
||||
allocates all required memory at once and then the memory profile
|
||||
remains flat until the program terminates. Massif reports a
peak memory consumption of 11.4 MiB.
|
||||
|
||||
A reader may ask a fair question at this point -- is G-API really that bad?
What is the point of using it, then?

Fortunately, it is not. The increased memory consumption we see here is
caused by the default, naive OpenCV-based backend being used to
execute this graph. This backend serves mostly for quick prototyping
and debugging of algorithms before offload/further optimization.
|
||||
|
||||
This backend doesn't utilize any complex memory management strategies yet,
since that is not its focus at the moment. In the following chapter,
we'll learn about the Fluid backend and see how the same G-API code can
run in a completely different model (with the footprint shrinking to a
few kilobytes).
|
||||
|
||||
# Backends and kernels {#gapi_anisotropic_backends}
|
||||
|
||||
This chapter covers how a G-API computation can be executed in a
|
||||
special way -- e.g. offloaded to another device, or scheduled with a
|
||||
special intelligence. G-API is designed to make its graphs portable --
|
||||
it means that once a graph is defined in G-API terms, no changes
|
||||
should be required in it if we want to run it on CPU or on GPU or on
|
||||
both devices at once. [G-API High-level overview](@ref gapi_hld) and
|
||||
[G-API Kernel API](@ref gapi_kernel_api) shed more light on technical
|
||||
details which make it possible. In this chapter, we will utilize G-API
|
||||
Fluid backend to make our graph cache-efficient on CPU.
|
||||
|
||||
G-API defines a _backend_ as the lower-level entity which knows how to
run kernels. Backends may have (and, in fact, do have) different
_Kernel APIs_ which are used to program and integrate kernels for those
backends. In this context, a _kernel_ is an implementation of an
_operation_, which is defined at the top API level (see the
G_TYPED_KERNEL() macro).
|
||||
|
||||
A backend is aware of device & platform specifics and executes its
kernels with those specifics in mind. For
example, there may be a [Halide](http://halide-lang.org/) backend which
allows writing (implementing) G-API operations in the Halide language and
then generates functional Halide code for the portions of a G-API graph which
map well there.
|
||||
|
||||
## Running a graph with a Fluid backend {#gapi_anisotropic_fluid}
|
||||
|
||||
OpenCV 4.0 is bundled with two G-API backends -- the default "OpenCV"
|
||||
which we just used, and a special "Fluid" backend.
|
||||
|
||||
Fluid backend reorganizes the execution to save memory and to achieve
|
||||
near-perfect cache locality, implementing so-called "streaming" model
|
||||
of execution.
|
||||
|
||||
In order to start using Fluid kernels, we first need to include the
appropriate header files (which are not included by default):
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi_fluid.cpp fluid_includes
|
||||
|
||||
Once these headers are included, we can form up a new _kernel package_
|
||||
and specify it to G-API:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi_fluid.cpp kernel_pkg
|
||||
|
||||
In G-API, kernels (or operation implementations) are objects. Kernels are
|
||||
organized into collections, or _kernel packages_, represented by class
|
||||
cv::gapi::GKernelPackage. The main purpose of a kernel package is to
|
||||
capture which kernels we would like to use in our graph, and pass it
|
||||
as a _graph compilation option_:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi_fluid.cpp kernel_pkg_use
|
||||
|
||||
Traditional OpenCV is logically divided into modules, with every
|
||||
module providing a set of functions. In G-API, there are also
|
||||
"modules" which are represented as kernel packages provided by a
|
||||
particular backend. In this example, we pass Fluid kernel packages to
|
||||
G-API to utilize appropriate Fluid functions in our graph.
|
||||
|
||||
Kernel packages are combinable -- in the above example, we take the "Core"
and "ImgProc" Fluid kernel packages and combine them into a single
one. See the documentation reference on cv::gapi::combine.
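
For reference, forming such a combined package and passing it to a graph run
might look roughly like this. This is a hedged sketch with a toy graph: on
OpenCV 4.0, `cv::gapi::combine` also takes a `cv::unite_policy` argument, and
exact helper names may differ slightly between versions.

```cpp
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>
#include <opencv2/gapi/fluid/core.hpp>
#include <opencv2/gapi/fluid/imgproc.hpp>
#include <opencv2/core.hpp>

int main()
{
    cv::GMat in;
    cv::GMat out = cv::gapi::sqrt(cv::gapi::mul(in, in));  // a toy graph
    cv::GComputation pipeline(cv::GIn(in), cv::GOut(out));

    // "Core" and "ImgProc" Fluid packages combined into one package
    auto fluid_kernels = cv::gapi::combine(cv::gapi::core::fluid::kernels(),
                                           cv::gapi::imgproc::fluid::kernels());

    cv::Mat input = cv::Mat::eye(64, 64, CV_32F), result;
    pipeline.apply(cv::gin(input), cv::gout(result),
                   cv::compile_args(fluid_kernels));        // kernels passed as a compile option
    return 0;
}
```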
|
||||
|
||||
If no kernel packages are specified in the options, G-API uses the
_default_ package, which consists of default OpenCV implementations --
thus G-API graphs are executed via OpenCV functions by default. The OpenCV
backend provides broader functional coverage than any other
backend. If a kernel package is specified, like in this example, then
it is combined with the _default_ one.
This means that user-specified implementations will replace default
implementations in case of conflict.
|
||||
|
||||
<!-- FIXME Document this process better as a part of regular -->
|
||||
<!-- documentation, not a tutorial kind of thing -->
|
||||
|
||||
## Troubleshooting and customization {#gapi_anisotropic_trouble}
|
||||
|
||||
After the above modifications, the app (in OpenCV 4.0) should crash
with a message like this:
|
||||
|
||||
```
|
||||
$ ./bin/example_tutorial_porting_anisotropic_image_segmentation_gapi_fluid
|
||||
terminate called after throwing an instance of 'std::logic_error'
|
||||
what(): .../modules/gapi/src/backends/fluid/gfluidimgproc.cpp:436: Assertion kernelSize.width == 3 && kernelSize.height == 3 in function run failed
|
||||
|
||||
Aborted (core dumped)
|
||||
```
|
||||
|
||||
The Fluid backend has a number of limitations in OpenCV 4.0 (see this
[wiki page](https://github.com/opencv/opencv/wiki/Graph-API) for a
more up-to-date status). In particular, the Box filter used in this
sample supports only a static 3x3 kernel size.
|
||||
|
||||
We can overcome this problem easily by keeping G-API from using the Fluid
version of the Box filter kernel in this sample. It can be done by
removing the appropriate kernel from the kernel package we've just
created:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi_fluid.cpp kernel_hotfix
|
||||
|
||||
Now this kernel package doesn't have _any_ implementation of the Box
filter kernel interface (specified as a template parameter). As
described above, G-API will now fall back to OpenCV to run this
kernel. The resulting code with this change looks like this:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/porting_anisotropic_image_segmentation/porting_anisotropic_image_segmentation_gapi_fluid.cpp kernel_pkg_proper
|
||||
|
||||
Let's examine the memory profile for this sample after we switched to
|
||||
Fluid backend. Now it looks like this:
|
||||
|
||||

|
||||
|
||||
Now the tool reports 4.7 MiB -- and we just changed a few lines in our
code, without modifying the graph itself! It is a ~2.4X improvement over
the previous G-API result, and a ~1.6X improvement over the original OpenCV
version.
|
||||
|
||||
Let's also examine what the internal representation of the graph looks
like now. Dumping the graph into `.dot` results in a
visualization like this:
|
||||
|
||||

|
||||
|
||||
This graph doesn't differ structurally from its previous version (in
|
||||
terms of operations and data objects), though a changed layout (on the
|
||||
left side of the dump) is easily noticeable.
|
||||
|
||||
The visualization reflects how G-API deals with mixed graphs, also
|
||||
called _heterogeneous_ graphs. The majority of operations in this
|
||||
graph are implemented with Fluid backend, but Box filters are executed
|
||||
by the OpenCV backend. One can easily see that the graph is partitioned
|
||||
(with rectangles). G-API groups connected operations based on their
|
||||
affinity, forming _subgraphs_ (or _islands_ in G-API terminology), and
|
||||
our top-level graph becomes a composition of multiple smaller
|
||||
subgraphs. Every backend determines how its subgraph (island) is
executed, so the Fluid backend optimizes out memory where possible, while the
six intermediate buffers accessed by the OpenCV Box filters are allocated
fully and can't be optimized out.
|
||||
|
||||
<!-- TODO: add a chapter on custom kernels -->
|
||||
<!-- TODO: make a full-fluid pipeline -->
|
||||
<!-- TODO: talk about parallelism when it is available -->
|
||||
|
||||
# Conclusion {#gapi_tutor_conclusion}
|
||||
|
||||
This tutorial demonstrates what G-API is and what its key design
|
||||
concepts are, how an algorithm can be ported to G-API, and
|
||||
how to utilize graph model benefits after that.
|
||||
|
||||
In OpenCV 4.0, G-API is still in its inception stage -- it is more a
|
||||
foundation for all future work, though ready for use even now.
|
||||
|
||||
Further, this tutorial will be extended with new chapters on custom
|
||||
kernels programming, parallelism, and more.
|
|
||||
# Implementing a face beautification algorithm with G-API {#tutorial_gapi_face_beautification}
|
||||
|
||||
[TOC]
|
||||
|
||||
# Introduction {#gapi_fb_intro}
|
||||
|
||||
In this tutorial you will learn:
|
||||
* Basics of a sample face beautification algorithm;
|
||||
* How to infer different networks inside a pipeline with G-API;
|
||||
* How to run a G-API pipeline on a video stream.
|
||||
|
||||
## Prerequisites {#gapi_fb_prerec}
|
||||
|
||||
This sample requires:
|
||||
- PC with GNU/Linux or Microsoft Windows (Apple macOS is supported but
|
||||
was not tested);
|
||||
- OpenCV 4.2 or later built with Intel® Distribution of [OpenVINO™
|
||||
Toolkit](https://docs.openvinotoolkit.org/) (building with [Intel®
|
||||
TBB](https://www.threadingbuildingblocks.org/intel-tbb-tutorial) is
|
||||
a plus);
|
||||
- The following topologies from OpenVINO™ Toolkit [Open Model
|
||||
Zoo](https://github.com/opencv/open_model_zoo):
|
||||
- `face-detection-adas-0001`;
|
||||
- `facial-landmarks-35-adas-0002`.
|
||||
|
||||
## Face beautification algorithm {#gapi_fb_algorithm}
|
||||
|
||||
We will implement a simple face beautification algorithm using a
|
||||
combination of modern Deep Learning techniques and traditional
|
||||
Computer Vision. The general idea behind the algorithm is to make the
face skin smoother while preserving face features like eye or mouth
contrast. The algorithm identifies parts of the face using DNN
inference, applies different filters to the parts found, and then
combines them into the final result using basic image arithmetic:
|
||||
|
||||
\dot
|
||||
strict digraph Pipeline {
|
||||
node [shape=record fontname=Helvetica fontsize=10 style=filled color="#4c7aa4" fillcolor="#5b9bd5" fontcolor="white"];
|
||||
edge [color="#62a8e7"];
|
||||
ordering="out";
|
||||
splines=ortho;
|
||||
rankdir=LR;
|
||||
|
||||
input [label="Input"];
|
||||
fd [label="Face\ndetector"];
|
||||
bgMask [label="Generate\nBG mask"];
|
||||
unshMask [label="Unsharp\nmask"];
|
||||
bilFil [label="Bilateral\nfilter"];
|
||||
shMask [label="Generate\nsharp mask"];
|
||||
blMask [label="Generate\nblur mask"];
|
||||
mul_1 [label="*" fontsize=24 shape=circle labelloc=b];
|
||||
mul_2 [label="*" fontsize=24 shape=circle labelloc=b];
|
||||
mul_3 [label="*" fontsize=24 shape=circle labelloc=b];
|
||||
|
||||
subgraph cluster_0 {
|
||||
style=dashed
|
||||
fontsize=10
|
||||
ld [label="Landmarks\ndetector"];
|
||||
label="for each face"
|
||||
}
|
||||
|
||||
sum_1 [label="+" fontsize=24 shape=circle];
|
||||
out [label="Output"];
|
||||
|
||||
temp_1 [style=invis shape=point width=0];
|
||||
temp_2 [style=invis shape=point width=0];
|
||||
temp_3 [style=invis shape=point width=0];
|
||||
temp_4 [style=invis shape=point width=0];
|
||||
temp_5 [style=invis shape=point width=0];
|
||||
temp_6 [style=invis shape=point width=0];
|
||||
temp_7 [style=invis shape=point width=0];
|
||||
temp_8 [style=invis shape=point width=0];
|
||||
temp_9 [style=invis shape=point width=0];
|
||||
|
||||
input -> temp_1 [arrowhead=none]
|
||||
temp_1 -> fd -> ld
|
||||
ld -> temp_4 [arrowhead=none]
|
||||
temp_4 -> bgMask
|
||||
bgMask -> mul_1 -> sum_1 -> out
|
||||
|
||||
temp_4 -> temp_5 -> temp_6 [arrowhead=none constraint=none]
|
||||
ld -> temp_2 -> temp_3 [style=invis constraint=none]
|
||||
|
||||
temp_1 -> {unshMask, bilFil}
|
||||
fd -> unshMask [style=invis constraint=none]
|
||||
unshMask -> bilFil [style=invis constraint=none]
|
||||
|
||||
bgMask -> shMask [style=invis constraint=none]
|
||||
shMask -> blMask [style=invis constraint=none]
|
||||
mul_1 -> mul_2 [style=invis constraint=none]
|
||||
temp_5 -> shMask -> mul_2
|
||||
temp_6 -> blMask -> mul_3
|
||||
|
||||
unshMask -> temp_2 -> temp_5 [style=invis]
|
||||
bilFil -> temp_3 -> temp_6 [style=invis]
|
||||
|
||||
mul_2 -> temp_7 [arrowhead=none]
|
||||
mul_3 -> temp_8 [arrowhead=none]
|
||||
|
||||
temp_8 -> temp_7 [arrowhead=none constraint=none]
|
||||
temp_7 -> sum_1 [constraint=none]
|
||||
|
||||
unshMask -> mul_2 [constraint=none]
|
||||
bilFil -> mul_3 [constraint=none]
|
||||
temp_1 -> mul_1 [constraint=none]
|
||||
}
|
||||
\enddot
|
||||
|
||||
Briefly the algorithm is described as follows:
|
||||
- Input image \f$I\f$ is passed to unsharp mask and bilateral filters
|
||||
(\f$U\f$ and \f$L\f$ respectively);
|
||||
- Input image \f$I\f$ is passed to an SSD-based face detector;
|
||||
- SSD result (a \f$[1 \times 1 \times 200 \times 7]\f$ blob) is parsed
|
||||
and converted to an array of faces;
|
||||
- Every face is passed to a landmarks detector;
|
||||
- Based on landmarks found for every face, three image masks are
|
||||
generated:
|
||||
- A background mask \f$b\f$ -- indicating which areas from the
|
||||
original image to keep as-is;
|
||||
- A face part mask \f$p\f$ -- identifying regions to preserve
|
||||
(sharpen).
|
||||
- A face skin mask \f$s\f$ -- identifying regions to blur;
|
||||
- The final result \f$O\f$ is a composition of the features above,
calculated as \f$O = b*I + p*U + s*L\f$ (a sketch of this composition
follows the list).
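
A hedged sketch of this final composition expressed with G-API arithmetic;
the names are illustrative, and the masks are assumed to be already converted
to the same type and number of channels as the images:

```cpp
#include <opencv2/gapi.hpp>
#include <opencv2/gapi/core.hpp>

// O = b*I + p*U + s*L, element-wise
static cv::GMat compose(const cv::GMat& I, const cv::GMat& U, const cv::GMat& L,
                        const cv::GMat& b, const cv::GMat& p, const cv::GMat& s)
{
    return cv::gapi::add(cv::gapi::add(cv::gapi::mul(b, I),
                                       cv::gapi::mul(p, U)),
                         cv::gapi::mul(s, L));
}
```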
|
||||
|
||||
Generating face element masks based on a limited set of features (just
|
||||
35 per face, including all its parts) is not very trivial and is
|
||||
described in the sections below.
|
||||
|
||||
# Constructing a G-API pipeline {#gapi_fb_pipeline}
|
||||
|
||||
## Declaring Deep Learning topologies {#gapi_fb_decl_nets}
|
||||
|
||||
This sample is using two DNN detectors. Every network takes one input
|
||||
and produces one output. In G-API, networks are defined with macro
|
||||
G_API_NET():
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp net_decl
|
||||
|
||||
To get more information, see
|
||||
[Declaring Deep Learning topologies](@ref gapi_ifd_declaring_nets)
|
||||
described in the "Face Analytics pipeline" tutorial.
|
||||
|
||||
## Describing the processing graph {#gapi_fb_ppline}
|
||||
|
||||
The code below generates a graph for the algorithm above:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ppl
|
||||
|
||||
The resulting graph is a mixture of G-API's standard operations,
|
||||
user-defined operations (namespace `custom::`), and DNN inference.
|
||||
The generic function `cv::gapi::infer<>()` allows triggering inference
within the pipeline; the networks to infer are specified as template
parameters. The sample code uses two versions of `cv::gapi::infer<>()`:
- A frame-oriented one is used to detect faces on the input frame.
- An ROI-list-oriented one is used to run landmarks inference on a
list of faces -- this version produces an array of landmarks for
every face.
|
||||
|
||||
More on this in "Face Analytics pipeline"
|
||||
([Building a GComputation](@ref gapi_ifd_gcomputation) section).
|
||||
|
||||
## Unsharp mask in G-API {#gapi_fb_unsh}
|
||||
|
||||
The unsharp mask \f$U\f$ for image \f$I\f$ is defined as:
|
||||
|
||||
\f[U = I - s * L(M(I)),\f]
|
||||
|
||||
where \f$M()\f$ is a median filter, \f$L()\f$ is the Laplace operator,
|
||||
and \f$s\f$ is a strength coefficient. While G-API doesn't provide
|
||||
this function out-of-the-box, it is expressed naturally with the
|
||||
existing G-API operations:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp unsh
|
||||
|
||||
Note that the code snippet above is a regular C++ function defined
with G-API types. Users can write functions like this to simplify
graph construction; when called, this function just adds the relevant
nodes to the pipeline it is used in.
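
A hedged sketch of such a helper is shown below; the parameters are
illustrative, and if your G-API version lacks a built-in Laplacian operation,
it can be provided as a custom kernel instead (the actual sample may do
exactly that):

```cpp
// U = I - s * L(M(I)): median blur, Laplacian, then weighted subtraction.
static cv::GMat unsharpMask(const cv::GMat& src, int sigma, double strength)
{
    cv::GMat blurred   = cv::gapi::medianBlur(src, sigma);
    cv::GMat laplacian = cv::gapi::Laplacian(blurred, CV_8U);     // or a custom kernel
    return cv::gapi::addWeighted(src, 1.0, laplacian, -strength, 0.0);
}
```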
|
||||
|
||||
# Custom operations {#gapi_fb_proc}
|
||||
|
||||
The face beautification graph uses custom operations
extensively. This chapter focuses on the most interesting kernels;
refer to [G-API Kernel API](@ref gapi_kernel_api) for general
information on defining operations and implementing kernels in G-API.
|
||||
|
||||
## Face detector post-processing {#gapi_fb_face_detect}
|
||||
|
||||
A face detector output is converted to an array of faces with the
|
||||
following kernel:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp vec_ROI
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp fd_pp
|
||||
|
||||
## Facial landmarks post-processing {#gapi_fb_landm_detect}
|
||||
|
||||
The algorithm infers locations of face elements (like the eyes, the mouth
|
||||
and the head contour itself) using a generic facial landmarks detector
|
||||
(<a href="https://github.com/opencv/open_model_zoo/blob/master/models/intel/facial-landmarks-35-adas-0002/description/facial-landmarks-35-adas-0002.md">details</a>)
|
||||
from OpenVINO™ Open Model Zoo. However, the detected landmarks as-is are not
|
||||
enough to generate masks --- this operation requires regions of interest on
|
||||
the face represented by closed contours, so some interpolation is applied to
|
||||
get them. This landmarks
|
||||
processing and interpolation is performed by the following kernel:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_cnts
|
||||
|
||||
The kernel takes two arrays of denormalized landmark coordinates and
returns an array of face elements' closed contours and an array of faces'
closed contours; in other words, the first output is an array of
contours of image areas to be sharpened and the second is an array of
contours of areas to be smoothed.
|
||||
|
||||
Here and below `Contour` is a vector of points.
|
||||
|
||||
### Getting an eye contour {#gapi_fb_ld_eye}
|
||||
|
||||
Eye contours are estimated with the following function:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_incl
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_eye
|
||||
|
||||
Briefly, this function restores the bottom side of an eye with a
half-ellipse based on two points in the left and right eye
corners. In fact, `cv::ellipse2Poly()` is used to approximate the eye region, and
the function only defines the ellipse parameters based on just two points:
- The ellipse center and the \f$X\f$ half-axis, calculated from the two eye points;
- The \f$Y\f$ half-axis, calculated according to the assumption that an average
eye width is \f$1/3\f$ of its length;
- The start and the end angles, which are 0 and 180 (refer to the
`cv::ellipse()` documentation);
- The angle delta: how many points to produce in the contour;
- The inclination angle of the axes (a rough sketch of these steps
follows this list).
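
A rough sketch of these steps as plain OpenCV code, as it would appear inside
the kernel; the function name and the angle delta are illustrative, not the
sample's exact values:

```cpp
#include <opencv2/imgproc.hpp>
#include <cmath>
#include <vector>

static std::vector<cv::Point> eyeHalfEllipse(const cv::Point& ptLeft, const cv::Point& ptRight)
{
    const cv::Point center((ptLeft.x + ptRight.x) / 2, (ptLeft.y + ptRight.y) / 2);
    const int axisX = static_cast<int>(std::hypot(double(ptRight.x - ptLeft.x),
                                                  double(ptRight.y - ptLeft.y)) / 2.0);
    const int axisY = axisX / 3;  // assume the eye height is ~1/3 of its width
    const double inclination = std::atan2(double(ptRight.y - ptLeft.y),
                                          double(ptRight.x - ptLeft.x)) * 180.0 / CV_PI;
    std::vector<cv::Point> contour;
    cv::ellipse2Poly(center, cv::Size(axisX, axisY), static_cast<int>(inclination),
                     0, 180, 10 /*angle delta, degrees per contour point*/, contour);
    return contour;
}
```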
|
||||
|
||||
The use of `atan2()` instead of just `atan()` in the function
`custom::getLineInclinationAngleDegrees()` is essential, as it allows
returning a negative value depending on the signs of `x` and `y`, so we
can get the right angle even in the case of an upside-down face arrangement
(if we put the points in the right order, of course).
|
||||
|
||||
### Getting a forehead contour {#gapi_fb_ld_fhd}
|
||||
|
||||
The function approximates the forehead contour:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp ld_pp_fhd
|
||||
|
||||
As we have only jaw points in our detected landmarks, we have to get a
|
||||
half-ellipse based on three points of a jaw: the leftmost, the
|
||||
rightmost and the lowest one. The jaw width is assumed to be equal to the
|
||||
forehead width and the latter is calculated using the left and the
|
||||
right points. Speaking of the \f$Y\f$ axis, we have no points to get
|
||||
it directly, and instead assume that the forehead height is about \f$2/3\f$
|
||||
of the jaw height, which can be figured out from the face center (the
|
||||
middle between the left and right points) and the lowest jaw point.
|
||||
|
||||
## Drawing masks {#gapi_fb_masks_drw}
|
||||
|
||||
When we have all the contours needed, we are able to draw masks:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp msk_ppline
|
||||
|
||||
The steps to get the masks are:
|
||||
* the "sharp" mask calculation:
|
||||
* fill the contours that should be sharpened;
|
||||
* blur that to get the "sharp" mask (`mskSharpG`);
|
||||
* the "bilateral" mask calculation:
|
||||
* fill all the face contours fully;
|
||||
* blur that;
|
||||
* subtract areas which intersect with the "sharp" mask --- and get the
|
||||
"bilateral" mask (`mskBlurFinal`);
|
||||
* the background mask calculation:
|
||||
* add two previous masks
|
||||
* set all non-zero pixels of the result as 255 (by `cv::gapi::threshold()`)
|
||||
* revert the output (by `cv::gapi::bitwise_not`) to get the background
mask (`mskNoFaces`) -- this last step is sketched in code below.
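
A sketch of that last (background) step with G-API core operations; the mask
names follow the list above, and the threshold values are illustrative:

```cpp
cv::GMat mskFaces   = cv::gapi::add(mskSharpG, mskBlurFinal);                   // union of the face masks
cv::GMat mskFacesB  = cv::gapi::threshold(mskFaces, 0, 255, cv::THRESH_BINARY); // any non-zero pixel -> 255
cv::GMat mskNoFaces = cv::gapi::bitwise_not(mskFacesB);                         // background = not-a-face
```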
|
||||
|
||||
# Configuring and running the pipeline {#gapi_fb_comp_args}
|
||||
|
||||
Once the graph is fully expressed, we can finally compile it and run
|
||||
on real data. G-API graph compilation is the stage where the G-API
|
||||
framework actually understands which kernels and networks to use. This
|
||||
configuration happens via G-API compilation arguments.
|
||||
|
||||
## DNN parameters {#gapi_fb_comp_args_net}
|
||||
|
||||
This sample is using OpenVINO™ Toolkit Inference Engine backend for DL
|
||||
inference, which is configured the following way:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp net_param
|
||||
|
||||
Every `cv::gapi::ie::Params<>` object is related to the network
specified in its template argument. We should pass there the network
type we defined with `G_API_NET()` at the very beginning of the
tutorial.
|
||||
|
||||
Network parameters are then wrapped into a network package
(`cv::gapi::GNetPackage`) with the `cv::gapi::networks()` helper:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp netw
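
A hedged sketch of this configuration; the network type names
(`custom::FaceDetector`, `custom::LandmDetector`) and the path variables are
illustrative, not necessarily the sample's exact identifiers:

```cpp
auto faceParams  = cv::gapi::ie::Params<custom::FaceDetector> {
    faceXmlPath,   // path to the topology IR (.xml)
    faceBinPath,   // path to the weights (.bin)
    "CPU"          // device to run on
};
auto landmParams = cv::gapi::ie::Params<custom::LandmDetector> {
    landmXmlPath, landmBinPath, "CPU"
};
auto networks = cv::gapi::networks(faceParams, landmParams);  // goes into cv::compile_args(...)
```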
|
||||
|
||||
More details in "Face Analytics Pipeline"
|
||||
([Configuring the pipeline](@ref gapi_ifd_configuration) section).
|
||||
|
||||
## Kernel packages {#gapi_fb_comp_args_kernels}
|
||||
|
||||
In this example we use a lot of custom kernels; in addition, we
use the Fluid backend to optimize memory consumption for G-API's standard
kernels where applicable. The resulting kernel package is formed like this:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp kern_pass_1
|
||||
|
||||
## Compiling the streaming pipeline {#gapi_fb_compiling}
|
||||
|
||||
G-API optimizes execution for video streams when compiled in the
|
||||
"Streaming" mode.
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_comp
|
||||
|
||||
More on this in "Face Analytics Pipeline"
|
||||
([Configuring the pipeline](@ref gapi_ifd_configuration) section).
|
||||
|
||||
## Running the streaming pipeline {#gapi_fb_running}
|
||||
|
||||
In order to run the G-API streaming pipeline, all we need is to
|
||||
specify the input video source, call
|
||||
`cv::GStreamingCompiled::start()`, and then fetch the pipeline
|
||||
processing results:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_src
|
||||
@snippet cpp/tutorial_code/gapi/face_beautification/face_beautification.cpp str_loop
|
||||
|
||||
Once results are ready and can be pulled from the pipeline, we display
them on the screen and handle GUI events.
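
Put together, a simplified version of the streaming part might look like the
sketch below; it omits the sample's visualization details, and `pipeline`,
`kernels`, `networks`, and `videoPath` stand for the objects built in the
previous sections:

```cpp
auto stream = pipeline.compileStreaming(cv::compile_args(kernels, networks));
stream.setSource(cv::gin(cv::gapi::wip::make_src<cv::gapi::wip::GCaptureSource>(videoPath)));
stream.start();

cv::Mat outFrame;
while (stream.pull(cv::gout(outFrame)))   // blocking pull; returns false when the stream ends
{
    cv::imshow("Face beautification", outFrame);
    if (cv::waitKey(1) >= 0) break;       // handle GUI events / allow early exit
}
```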
|
||||
|
||||
See [Running the pipeline](@ref gapi_ifd_running) section
|
||||
in the "Face Analytics Pipeline" tutorial for more details.
|
||||
|
||||
# Conclusion {#gapi_fb_cncl}
|
||||
|
||||
The tutorial has two goals: to show the use of brand-new features of
G-API introduced in OpenCV 4.2, and to give a basic understanding of a
sample face beautification algorithm.
|
||||
|
||||
The result of the algorithm application:
|
||||
|
||||

|
||||
|
||||
On the test machine (Intel® Core™ i7-8700) the G-API-optimized video
|
||||
pipeline outperforms its serial (non-pipelined) version by a factor of
|
||||
**2.7** -- meaning that for such a non-trivial graph, the proper
|
||||
pipelining can bring almost 3x increase in performance.
|
||||
|
||||
<!---
|
||||
The idea in general is to implement a real-time video stream processing that
|
||||
detects faces and applies some filters to make them look beautiful (more or
|
||||
less). The pipeline is the following:
|
||||
|
||||
Two topologies from OMZ have been used in this sample: the
|
||||
<a href="https://github.com/opencv/open_model_zoo/tree/master/models/intel
|
||||
/face-detection-adas-0001">face-detection-adas-0001</a>
|
||||
and the
|
||||
<a href="https://github.com/opencv/open_model_zoo/blob/master/models/intel
|
||||
/facial-landmarks-35-adas-0002/description/facial-landmarks-35-adas-0002.md">
|
||||
facial-landmarks-35-adas-0002</a>.
|
||||
|
||||
The face detector takes the input image and returns a blob with the shape
|
||||
[1,1,200,7] after the inference (200 is the maximum number of
|
||||
faces which can be detected).
|
||||
In order to process every face individually, we need to convert this output to a
|
||||
list of regions on the image.
|
||||
|
||||
The masks for different filters are built based on facial landmarks, which are
|
||||
inferred for every face. The result of the inference
|
||||
is a blob with 35 landmarks: the first 18 of them are facial elements
|
||||
(eyes, eyebrows, a nose, a mouth) and the last 17 --- a jaw contour. Landmarks
|
||||
are floating point values of coordinates normalized relatively to an input ROI
|
||||
(not the original frame). In addition, for the further goals we need contours of
|
||||
eyes, mouths, faces, etc., not the landmarks. So, post-processing of the Mat is
|
||||
also required here. The process is split into two parts --- landmarks'
|
||||
coordinates denormalization to the real pixel coordinates of the source frame
|
||||
and getting necessary closed contours based on these coordinates.
|
||||
|
||||
The last step of processing the inference data is drawing masks using the
|
||||
calculated contours. In this demo the contours don't need to be pixel accurate,
|
||||
since masks are blurred with Gaussian filter anyway. Another point that should
|
||||
be mentioned here is getting
|
||||
three masks (for areas to be smoothed, for ones to be sharpened and for the
|
||||
background) which have no intersections with each other; this approach allows to
|
||||
apply the calculated masks to the corresponding images prepared beforehand and
|
||||
then just to summarize them to get the output image without any other actions.
|
||||
|
||||
As we can see, this algorithm is appropriate to illustrate G-API usage
|
||||
convenience and efficiency in the context of solving a real CV/DL problem.
|
||||
|
||||
(On detector post-proc)
|
||||
Some points to be mentioned about this kernel implementation:
|
||||
|
||||
- It takes a `cv::Mat` from the detector and a `cv::Mat` from the input; it
|
||||
returns an array of ROI's where faces have been detected.
|
||||
|
||||
- `cv::Mat` data parsing by the pointer on a float is used here.
|
||||
|
||||
- By far the most important thing here is solving an issue that sometimes
|
||||
detector returns coordinates located outside of the image; if we pass such an
|
||||
ROI to be processed, errors in the landmarks detection will occur. The frame box
|
||||
`borders` is created and then intersected with the face rectangle
|
||||
(by `operator&()`) to handle such cases and save the ROI which is for sure
|
||||
inside the frame.
|
||||
|
||||
Data parsing after the facial landmarks detector happens according to the same
|
||||
scheme with inconsiderable adjustments.
|
||||
|
||||
|
||||
## Possible further improvements
|
||||
|
||||
There are some points in the algorithm to be improved.
|
||||
|
||||
### Correct ROI reshaping for meeting conditions required by the facial landmarks detector
|
||||
|
||||
The input of the facial landmarks detector is a square ROI, but the face
|
||||
detector gives non-square rectangles in general. If we let the backend within
|
||||
Inference-API compress the rectangle to a square by itself, the lack of
|
||||
inference accuracy can be noticed in some cases.
|
||||
There is a solution: we can give a describing square ROI instead of the
|
||||
rectangular one to the landmarks detector, so there will be no need to compress
|
||||
the ROI, which will lead to accuracy improvement.
|
||||
Unfortunately, another problem occurs if we do that:
|
||||
if the rectangular ROI is near the border, a describing square will probably go
|
||||
out of the frame --- that leads to errors of the landmarks detector.
|
||||
To avoid such a mistake, we have to implement an algorithm that, firstly,
|
||||
describes every rectangle by a square, then counts the farthest coordinates
|
||||
turned up to be outside of the frame and, finally, pads the source image by
|
||||
borders (e.g. single-colored) with the size counted. It will be safe to take
|
||||
square ROIs for the facial landmarks detector after that frame adjustment.
|
||||
|
||||
### Research for the best parameters (used in GaussianBlur() or unsharpMask(), etc.)
|
||||
|
||||
### Parameters autoscaling
|
||||
|
||||
-->
|
|
||||
# Face analytics pipeline with G-API {#tutorial_gapi_interactive_face_detection}
|
||||
|
||||
[TOC]
|
||||
|
||||
# Overview {#gapi_ifd_intro}
|
||||
|
||||
In this tutorial you will learn:
|
||||
* How to integrate Deep Learning inference in a G-API graph;
|
||||
* How to run a G-API graph on a video stream and obtain data from it.
|
||||
|
||||
# Prerequisites {#gapi_ifd_prereq}
|
||||
|
||||
This sample requires:
|
||||
- PC with GNU/Linux or Microsoft Windows (Apple macOS is supported but
|
||||
was not tested);
|
||||
- OpenCV 4.2 or later built with Intel® Distribution of [OpenVINO™
|
||||
Toolkit](https://docs.openvinotoolkit.org/) (building with [Intel®
|
||||
TBB](https://www.threadingbuildingblocks.org/intel-tbb-tutorial) is
|
||||
a plus);
|
||||
- The following topologies from OpenVINO™ Toolkit [Open Model
|
||||
Zoo](https://github.com/opencv/open_model_zoo):
|
||||
- `face-detection-adas-0001`;
|
||||
- `age-gender-recognition-retail-0013`;
|
||||
- `emotions-recognition-retail-0003`.
|
||||
|
||||
# Introduction: why G-API {#gapi_ifd_why}
|
||||
|
||||
Many computer vision algorithms run on a video stream rather than on
|
||||
individual images. Stream processing usually consists of multiple
|
||||
steps -- like decode, preprocessing, detection, tracking,
|
||||
classification (on detected objects), and visualization -- forming a
|
||||
*video processing pipeline*. Moreover, many of these steps of such a
pipeline can run in parallel -- modern platforms have different
|
||||
hardware blocks on the same chip like decoders and GPUs, and extra
|
||||
accelerators can be plugged in as extensions, like Intel® Movidius™
|
||||
Neural Compute Stick for deep learning offload.
|
||||
|
||||
Given this manifold of options and the variety of video analytics
algorithms, managing such pipelines effectively quickly becomes a
problem. Of course it can be done manually, but this approach doesn't
scale: if a change is required in the algorithm (e.g. a new pipeline
step is added), or if it is ported to a new platform with different
capabilities, the whole pipeline needs to be re-optimized.
|
||||
|
||||
Starting with version 4.2, OpenCV offers a solution to this
|
||||
problem. OpenCV G-API can now manage Deep Learning inference (a
cornerstone of any modern analytics pipeline) together with traditional
Computer Vision as well as video capturing/decoding, all in a single
|
||||
pipeline. G-API takes care of pipelining itself -- so if the algorithm
|
||||
or platform changes, the execution model adapts to it automatically.
|
||||
|
||||
# Pipeline overview {#gapi_ifd_overview}
|
||||
|
||||
Our sample application is based on ["Interactive Face Detection"] demo
|
||||
from OpenVINO™ Toolkit Open Model Zoo. A simplified pipeline consists
|
||||
of the following steps:
|
||||
1. Image acquisition and decode;
|
||||
2. Detection with preprocessing;
|
||||
3. Classification with preprocessing for every detected object with
|
||||
two networks;
|
||||
4. Visualization.
|
||||
|
||||
\dot
|
||||
digraph pipeline {
|
||||
node [shape=record fontname=Helvetica fontsize=10 style=filled color="#4c7aa4" fillcolor="#5b9bd5" fontcolor="white"];
|
||||
edge [color="#62a8e7"];
|
||||
splines=ortho;
|
||||
|
||||
rankdir = LR;
|
||||
subgraph cluster_0 {
|
||||
color=invis;
|
||||
capture [label="Capture\nDecode"];
|
||||
resize [label="Resize\nConvert"];
|
||||
detect [label="Detect faces"];
|
||||
capture -> resize -> detect
|
||||
}
|
||||
|
||||
subgraph cluster_1 {
|
||||
graph[style=dashed];
|
||||
|
||||
subgraph cluster_2 {
|
||||
color=invis;
|
||||
temp_4 [style=invis shape=point width=0];
|
||||
postproc_1 [label="Crop\nResize\nConvert"];
|
||||
age_gender [label="Classify\nAge/gender"];
|
||||
postproc_1 -> age_gender [constraint=true]
|
||||
temp_4 -> postproc_1 [constraint=none]
|
||||
}
|
||||
|
||||
subgraph cluster_3 {
|
||||
color=invis;
|
||||
postproc_2 [label="Crop\nResize\nConvert"];
|
||||
emo [label="Classify\nEmotions"];
|
||||
postproc_2 -> emo [constraint=true]
|
||||
}
|
||||
label="(for each face)";
|
||||
}
|
||||
|
||||
temp_1 [style=invis shape=point width=0];
|
||||
temp_2 [style=invis shape=point width=0];
|
||||
detect -> temp_1 [arrowhead=none]
|
||||
temp_1 -> postproc_1
|
||||
|
||||
capture -> {temp_4, temp_2} [arrowhead=none constraint=false]
|
||||
temp_2 -> postproc_2
|
||||
|
||||
temp_1 -> temp_2 [arrowhead=none constraint=false]
|
||||
|
||||
temp_3 [style=invis shape=point width=0];
|
||||
show [label="Visualize\nDisplay"];
|
||||
|
||||
{age_gender, emo} -> temp_3 [arrowhead=none]
|
||||
temp_3 -> show
|
||||
}
|
||||
\enddot
|
||||
|
||||
# Constructing a pipeline {#gapi_ifd_constructing}
|
||||
|
||||
Constructing a G-API graph for a video streaming case does not differ
|
||||
much from a [regular usage](@ref gapi_example) of G-API -- it is still
|
||||
about defining graph *data* (with cv::GMat, cv::GScalar, and
|
||||
cv::GArray) and *operations* over it. Inference also becomes an
|
||||
operation in the graph, but is defined in a little bit different way.
|
||||
|
||||
## Declaring Deep Learning topologies {#gapi_ifd_declaring_nets}
|
||||
|
||||
In contrast with traditional CV functions (see [core] and [imgproc]),
where G-API declares distinct operations for every function, inference
in G-API is a single generic operation, cv::gapi::infer<>. As usual, it
is just an interface and can be implemented in a number of ways under
the hood. In OpenCV 4.2, only an OpenVINO™ Inference Engine-based backend
is available; OpenCV's own DNN-module-based backend is yet to come.
|
||||
|
||||
cv::gapi::infer<> is _parametrized_ by the details of a topology we are
|
||||
going to execute. Like operations, topologies in G-API are strongly
|
||||
typed and are defined with a special macro G_API_NET():
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp G_API_NET
|
||||
|
||||
Similar to how operations are defined with G_API_OP(), network
|
||||
description requires three parameters:
|
||||
1. A type name. Every defined topology is declared as a distinct C++
|
||||
type which is used further in the program -- see below;
|
||||
2. A `std::function<>`-like API signature. G-API treats networks as
regular "functions" which take and return data. Here the network
`Faces` (a detector) takes a cv::GMat and returns a cv::GMat, while
the network `AgeGender` is known to provide two outputs (age and gender
blobs, respectively) -- so it has a `std::tuple<>` as its return
type.
|
||||
3. A topology name -- it can be any non-empty string; G-API uses
these names to distinguish networks internally. Names should be unique
in the scope of a single graph.
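
For illustration, such declarations could look like this (the tags and type
names are examples; the tutorial's real snippet is referenced above):

```cpp
#include <opencv2/gapi/infer.hpp>

G_API_NET(Faces,     <cv::GMat(cv::GMat)>,  "face-detector");

using AGInfo = std::tuple<cv::GMat, cv::GMat>;
G_API_NET(AgeGender, <AGInfo(cv::GMat)>,    "age-gender-recognition");

G_API_NET(Emotions,  <cv::GMat(cv::GMat)>,  "emotions-recognition");
```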
|
||||
|
||||
## Building a GComputation {#gapi_ifd_gcomputation}
|
||||
|
||||
Now the above pipeline is expressed in G-API like this:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp GComputation
|
||||
|
||||
Every pipeline starts with declaring empty data objects -- which act
|
||||
as inputs to the pipeline. Then we call a generic cv::gapi::infer<>
|
||||
specialized to `Faces` detection network. cv::gapi::infer<> inherits its
|
||||
signature from its template parameter -- and in this case it expects
|
||||
one input cv::GMat and produces one output cv::GMat.
|
||||
|
||||
In this sample we use a pre-trained SSD-based network and its output
|
||||
needs to be parsed to an array of detections (object regions of
|
||||
interest, ROIs). It is done by a custom operation `custom::PostProc`,
|
||||
which returns an array of rectangles (of type `cv::GArray<cv::Rect>`)
|
||||
back to the pipeline. This operation also filters out results by a
|
||||
confidence threshold -- and these details are hidden in the kernel
|
||||
itself. Still, at the moment of graph construction we operate with
|
||||
interfaces only and don't need actual kernels to express the pipeline
|
||||
-- so the implementation of this post-processing will be listed later.
|
||||
|
||||
After the detection output is parsed into an array of objects, we can run
classification on any of them. G-API doesn't support syntax for
in-graph loops like `for_each()` yet; instead, cv::gapi::infer<>
comes with a special list-oriented overload.
|
||||
|
||||
A user can call cv::gapi::infer<> with a cv::GArray as the first
argument; G-API then assumes it needs to run the associated network
on every rectangle from the given list on the given frame (the second
argument). The result of such an operation is also a list -- a cv::GArray of
cv::GMat.
|
||||
|
||||
Since the `AgeGender` network itself produces two outputs, its output
type for the list-based version of cv::gapi::infer is a tuple of
arrays. We use `std::tie()` to decompose this output into two distinct
objects.
|
||||
|
||||
The `Emotions` network produces a single output, so its list-based
inference return type is `cv::GArray<cv::GMat>`.
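
A sketch of these calls as they appear during graph construction; variable
names are illustrative, and `custom::PostProc` is the post-processing
operation described later in this tutorial:

```cpp
cv::GMat in;
cv::GMat detections           = cv::gapi::infer<Faces>(in);            // frame-oriented inference
cv::GArray<cv::Rect> faces    = custom::PostProc::on(detections, in);  // SSD output -> list of ROIs
cv::GArray<cv::GMat> ages, genders;
std::tie(ages, genders)       = cv::gapi::infer<AgeGender>(faces, in); // two outputs per face
cv::GArray<cv::GMat> emotions = cv::gapi::infer<Emotions>(faces, in);  // one output per face
```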
|
||||
|
||||
# Configuring the pipeline {#gapi_ifd_configuration}
|
||||
|
||||
G-API strictly separates construction from configuration -- with the
|
||||
idea to keep algorithm code itself platform-neutral. In the above
|
||||
listings we only declared our operations and expressed the overall
|
||||
data flow, but didn't even mention that we use OpenVINO™. We only
|
||||
described *what* we do, but not *how* we do it. Keeping these two
|
||||
aspects clearly separated is the design goal for G-API.
|
||||
|
||||
Platform-specific details arise when the pipeline is *compiled* --
|
||||
i.e. is turned from a declarative to an executable form. The way *how*
|
||||
to run stuff is specified via compilation arguments, and new
|
||||
inference/streaming features are no exception from this rule.
|
||||
|
||||
G-API is built on backends which implement interfaces (see
|
||||
[Architecture] and [Kernels] for details) -- thus cv::gapi::infer<> is
|
||||
a function which can be implemented by different backends. In OpenCV
|
||||
4.2, only OpenVINO™ Inference Engine backend for inference is
|
||||
available. Every inference backend in G-API has to provide a special
|
||||
parameterizable structure to express *backend-specific* neural network
|
||||
parameters -- and in this case, it is cv::gapi::ie::Params:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Param_Cfg
|
||||
|
||||
Here we define three parameter objects: `det_net`, `age_net`, and
`emo_net`. Every object is a cv::gapi::ie::Params structure
parametrizing one particular network we use. At the compilation
stage, G-API automatically matches network parameters with their
cv::gapi::infer<> calls in the graph using this information.
|
||||
|
||||
Regardless of the topology, every parameter structure is constructed
|
||||
with three string arguments -- specific to the OpenVINO™ Inference
|
||||
Engine:
|
||||
1. Path to the topology's intermediate representation (.xml file);
|
||||
2. Path to the topology's model weights (.bin file);
|
||||
3. Device where to run -- "CPU", "GPU", and others -- based on your
|
||||
OpenVINO™ Toolkit installation.
|
||||
These arguments are taken from the command-line parser.
|
||||
|
||||
Once networks are defined and custom kernels are implemented, the
|
||||
pipeline is compiled for streaming:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Compile
|
||||
|
||||
cv::GComputation::compileStreaming() triggers a special video-oriented
form of graph compilation where G-API tries to optimize
throughput. The result of this compilation is an object of the special type
cv::GStreamingCompiled -- in contrast to a traditional callable
cv::GCompiled, these objects are closer to media players in their
semantics.
|
||||
|
||||
@note There is no need to pass metadata arguments describing the
format of the input video stream to
cv::GComputation::compileStreaming() -- G-API automatically figures out
the formats of the input vector and adjusts the pipeline to
these formats on the fly. A user can still pass metadata there, as with the
regular cv::GComputation::compile(), in order to fix the pipeline to
a specific input format.
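
As a sketch, both forms below compile the same graph for streaming; the second
pins the pipeline to a particular input format up front. The metadata values
are illustrative, and depending on your OpenCV version the metadata may need
to be wrapped into cv::GMetaArgs instead of being passed directly:

```cpp
// Let G-API derive input metadata from the actual stream:
auto cc1 = graph.compileStreaming(cv::compile_args(kernels, networks));

// Or fix the expected input format explicitly:
auto cc2 = graph.compileStreaming(cv::GMatDesc{CV_8U, 3, cv::Size(1280, 720)},
                                  cv::compile_args(kernels, networks));
```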
|
||||
|
||||
# Running the pipeline {#gapi_ifd_running}
|
||||
|
||||
Pipelining optimization is based on processing multiple input video
|
||||
frames simultaneously, running different steps of the pipeline in
|
||||
parallel. This is why it works best when the framework takes full
|
||||
control over the video stream.
|
||||
|
||||
The idea behind streaming API is that user specifies an *input source*
|
||||
to the pipeline and then G-API manages its execution automatically
|
||||
until the source ends or user interrupts the execution. G-API pulls
|
||||
new image data from the source and passes it to the pipeline for
|
||||
processing.
|
||||
|
||||
Streaming sources are represented by the interface
|
||||
cv::gapi::wip::IStreamSource. Objects implementing this interface may
|
||||
be passed to `GStreamingCompiled` as regular inputs via `cv::gin()`
|
||||
helper function. In OpenCV 4.2, only one streaming source is allowed
|
||||
per pipeline -- this requirement will be relaxed in the future.
|
||||
|
||||
OpenCV comes with a great class cv::VideoCapture and by default G-API
|
||||
ships with a stream source class based on it --
|
||||
cv::gapi::wip::GCaptureSource. Users can implement their own
|
||||
streaming sources e.g. using [VAAPI] or other Media or Networking
|
||||
APIs.
|
||||
|
||||
Sample application specifies the input source as follows:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Source
|
||||
|
||||
Please note that a GComputation may still have multiple inputs like
|
||||
cv::GMat, cv::GScalar, or cv::GArray objects. User can pass their
|
||||
respective host-side types (cv::Mat, cv::Scalar, std::vector<>) in the
|
||||
input vector as well, but in Streaming mode these objects will create
|
||||
"endless" constant streams. Mixing a real video source stream and a
|
||||
const data stream is allowed.
|
||||
|
||||
Running a pipeline is easy -- just call
|
||||
cv::GStreamingCompiled::start() and fetch your data with blocking
|
||||
cv::GStreamingCompiled::pull() or non-blocking
|
||||
cv::GStreamingCompiled::try_pull(); repeat until the stream ends:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Run
|
||||
|
||||
The above code may look complex but in fact it handles two modes --
|
||||
with and without graphical user interface (GUI):
|
||||
- When a sample is running in a "headless" mode (`--pure` option is
|
||||
set), this code simply pulls data from the pipeline with the
|
||||
blocking `pull()` until it ends. This is the most performant mode of
|
||||
execution.
|
||||
- When results are also displayed on the screen, the Window System
|
||||
needs to take some time to refresh the window contents and handle
|
||||
GUI events. In this case, the demo pulls data with a non-blocking
|
||||
`try_pull()` until there is no more data available (which does not
mean the end of the stream -- just that new data is not ready yet), and
|
||||
only then displays the latest obtained result and refreshes the
|
||||
screen. Reducing the time spent in GUI with this trick increases the
|
||||
overall performance a little bit.
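
A condensed sketch of that GUI-mode logic, where `pipeline` is the
cv::GStreamingCompiled object (simplified relative to the sample's actual
loop):

```cpp
cv::Mat frame;
while (pipeline.running())
{
    bool gotAny = false;
    while (pipeline.try_pull(cv::gout(frame)))  // non-blocking: false when nothing is ready yet
    {
        gotAny = true;                          // keep draining; remember the latest result
    }
    if (gotAny)
    {
        cv::imshow("Out", frame);
    }
    cv::waitKey(1);                             // let the window system handle its events
}
```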
|
||||
|
||||
# Comparison with serial mode {#gapi_ifd_comparison}
|
||||
|
||||
The sample can also run in a serial mode for reference and
benchmarking purposes. In this case, a regular
cv::GComputation::compile() is used and a regular single-frame
cv::GCompiled object is produced; the pipelining optimization is not
applied within G-API; it is the user's responsibility to acquire image
frames from the cv::VideoCapture object and pass them to G-API.
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Run_Serial
|
||||
|
||||
On a test machine (Intel® Core™ i5-6600), with OpenCV built with
|
||||
[Intel® TBB]
|
||||
support, the detector network assigned to the CPU, and the classifiers
to the iGPU, the pipelined sample outperforms the serial one by a factor of
1.36x (thus adding +36% in overall throughput).
|
||||
|
||||
# Conclusion {#gapi_ifd_conclusion}
|
||||
|
||||
G-API introduces a technological way to build and optimize hybrid
|
||||
pipelines. Switching to a new execution model does not require changes
|
||||
in the algorithm code expressed with G-API -- only the way the graph
is triggered differs.
|
||||
|
||||
# Listing: post-processing kernel {#gapi_ifd_pp}
|
||||
|
||||
G-API gives an easy way to plug custom code into the pipeline even if
|
||||
it is running in a streaming mode and processing tensor
|
||||
data. Inference results are represented by multi-dimensional cv::Mat
|
||||
objects so accessing those is as easy as with a regular DNN module.
|
||||
|
||||
The OpenCV-based SSD post-processing kernel is defined and implemented in this
|
||||
sample as follows:
|
||||
|
||||
@snippet cpp/tutorial_code/gapi/age_gender_emotion_recognition/age_gender_emotion_recognition.cpp Postproc
|
||||
|
||||
["Interactive Face Detection"]: https://github.com/opencv/open_model_zoo/tree/master/demos/interactive_face_detection_demo
|
||||
[core]: @ref gapi_core
|
||||
[imgproc]: @ref gapi_imgproc
|
||||
[Architecture]: @ref gapi_hld
|
||||
[Kernels]: @ref gapi_kernel_api
|
||||
[VAAPI]: https://01.org/vaapi
|
|
||||
# Graph API (gapi module) {#tutorial_table_of_content_gapi}
|
||||
|
||||
In this section you will learn about graph-based image processing and
|
||||
how G-API module can be used for that.
|
||||
|
||||
- @subpage tutorial_gapi_interactive_face_detection
|
||||
|
||||
*Languages:* C++
|
||||
|
||||
*Compatibility:* \> OpenCV 4.2
|
||||
|
||||
*Author:* Dmitry Matveev
|
||||
|
||||
This tutorial illustrates how to build a hybrid video processing
|
||||
pipeline with G-API where Deep Learning and image processing are
|
||||
combined effectively to maximize the overall throughput. This
|
||||
sample requires Intel® distribution of OpenVINO™ Toolkit version
|
||||
2019R2 or later.
|
||||
|
||||
- @subpage tutorial_gapi_anisotropic_segmentation
|
||||
|
||||
*Languages:* C++
|
||||
|
||||
*Compatibility:* \> OpenCV 4.0
|
||||
|
||||
*Author:* Dmitry Matveev
|
||||
|
||||
This is an end-to-end tutorial where an existing sample algorithm
|
||||
is ported on G-API, covering the basic intuition behind this
|
||||
transition process, and examining benefits which a graph model
|
||||
brings there.
|
||||
|
||||
- @subpage tutorial_gapi_face_beautification
|
||||
|
||||
*Languages:* C++
|
||||
|
||||
*Compatibility:* \> OpenCV 4.2
|
||||
|
||||
*Author:* Orest Chura
|
||||
|
||||
In this tutorial we build a complex hybrid Computer Vision/Deep
|
||||
Learning video processing pipeline with G-API.
|