esp32-opencv/esp32/doc/optimization.md
Joachim 58d3b77970 Started to optimize OpenCV for the ESP32
- Using float instead of double for floating point matrix multiplications (in core/src/matmul.simd.hpp) reduces a lot the computation time
2020-05-20 14:12:33 +02:00


# Optimization
This document describes the optimizations made so that OpenCV runs faster on the ESP32.

## Activating optimization
To activate the ESP32 optimizations, set the CMake parameter `-DESP32_OPTIMIZATION=ON`. All of the optimizations described here are disabled when this parameter is OFF.
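
As an illustration of how such a switch can select between code paths, here is a minimal sketch. It assumes the CMake option is forwarded to the compiler as a preprocessor definition of the same name; the type alias `gemm_scalar_t` and the helper `scaled_sum()` are hypothetical names used only for this example and are not part of OpenCV.

```cpp
#include <cassert>

// Assumption: the build forwards the CMake option as a preprocessor
// definition of the same name (for example via add_definitions).
#ifdef ESP32_OPTIMIZATION
using gemm_scalar_t = float;   // single precision, handled by the FPU
#else
using gemm_scalar_t = double;  // default behaviour, left unchanged
#endif

// Hypothetical helper: sums n values and scales the result by alpha,
// doing all arithmetic in the type selected above.
inline gemm_scalar_t scaled_sum(const float* data, int n, gemm_scalar_t alpha) {
    assert(data != nullptr && n >= 0);
    gemm_scalar_t acc = 0;
    for (int i = 0; i < n; ++i) {
        acc += static_cast<gemm_scalar_t>(data[i]);
    }
    return alpha * acc;
}
```

With the definition absent, the code behaves exactly as before, which matches the rule that every optimization is disabled when the parameter is OFF.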
## Floating point support
The ESP32 has only a single-precision Floating Point Unit (FPU) and no double-precision hardware. Double-precision arithmetic is therefore emulated in software, so OpenCV functions that use `double` are very slow.
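
Double-precision arithmetic can also creep into code that only stores `float` values, because an unsuffixed literal such as `0.5` has type `double` in C++. The sketch below shows the difference; the function names are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <type_traits>

// 0.5 is a double literal, so the float operand is promoted and the
// multiplication is carried out in double precision.
inline float halve_promoted(float x) { return static_cast<float>(x * 0.5); }

// 0.5f keeps the whole expression in single precision.
inline float halve_single(float x) { return x * 0.5f; }

// The expression types confirm which precision each version uses.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double literal is evaluated as double");
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float literal stays float");

// Runtime sanity check: both versions agree for an exactly representable value.
inline void check_halve() {
    assert(halve_promoted(3.0f) == halve_single(3.0f));
}
```

GCC's `-Wdouble-promotion` warning can help locate such implicit promotions when reviewing code for a single-precision target.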
### Matrix multiplications
The changes are in the files `core/matmul.dispatch.cpp` and `core/matmul.simd.hpp`.

Results for multiplying a 100x6 matrix by a 6x100 matrix:

1. Initial test: 60 ms
2. Changing `alpha` and `beta` from double to float in the `GEMMsingleMult()` function: 12 ms
3. Changing `alpha` and `beta` from double to float in the `gemmImpl()` function: 4.6 ms

Results for multiplying a 150x100 matrix by a 100x150 matrix:

1. Initial test: 2757 ms
2. Changing double to float in the `GEMMStore()` function: 888 ms
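
To make the kind of change measured above concrete, here is a minimal sketch of a generalized matrix multiplication, `D = alpha * A * B + beta * C`, in which the type of `alpha`, `beta` and the accumulator is a template parameter. It is not OpenCV's implementation: `gemm_ref()` and `time_gemm_ms()` are hypothetical names, and timings taken on a desktop machine will not match the ESP32 figures.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Computes D = alpha * A * B + beta * C for row-major float matrices.
// A is m x k, B is k x n, C and D are m x n.
// ScaleT is the type used for alpha, beta and the accumulator:
// double mimics the original code, float mimics the optimized code.
template <typename ScaleT>
void gemm_ref(const std::vector<float>& A, const std::vector<float>& B,
              const std::vector<float>& C, std::vector<float>& D,
              int m, int k, int n, ScaleT alpha, ScaleT beta) {
    assert(A.size() == static_cast<std::size_t>(m) * k);
    assert(B.size() == static_cast<std::size_t>(k) * n);
    assert(C.size() == static_cast<std::size_t>(m) * n);
    D.resize(static_cast<std::size_t>(m) * n);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            ScaleT acc = 0;
            for (int p = 0; p < k; ++p) {
                acc += static_cast<ScaleT>(A[i * k + p]) *
                       static_cast<ScaleT>(B[p * n + j]);
            }
            D[i * n + j] = static_cast<float>(
                alpha * acc + beta * static_cast<ScaleT>(C[i * n + j]));
        }
    }
}

// Times a single multiplication of an m x k by a k x n matrix, in milliseconds.
template <typename ScaleT>
double time_gemm_ms(int m, int k, int n) {
    std::vector<float> A(static_cast<std::size_t>(m) * k, 1.0f);
    std::vector<float> B(static_cast<std::size_t>(k) * n, 2.0f);
    std::vector<float> C(static_cast<std::size_t>(m) * n, 0.5f);
    std::vector<float> D;
    const auto start = std::chrono::steady_clock::now();
    gemm_ref<ScaleT>(A, B, C, D, m, k, n, ScaleT(1), ScaleT(1));
    const auto stop = std::chrono::steady_clock::now();
    volatile float sink = D[0];  // keep the result observable
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_gemm_ms<double>(100, 6, 100)` and `time_gemm_ms<float>(100, 6, 100)` on the target gives a rough way to compare the two variants for the first matrix size listed above.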
## esp-dsp library
The ESP32 processor has the following hardware:
* 16/24-bit Instruction Set
* Support for FPU (Floating Point Unit)
* Support for DSP instructions
* 32-bit integer multiplier
* 32-bit integer divider
* 40-bit MAC (Multiply-Accumulate)
The *esp-dsp* library (https://github.com/espressif/esp-dsp) provides functions written in assembly to use this hardware.
This section describes which esp-dsp functions are used in OpenCV, and where, to improve performance.
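
As an example of what such a substitution could look like, the sketch below wraps a float matrix multiplication so that it calls esp-dsp's `dspm_mult_f32()` when built for the ESP32 and falls back to a plain loop elsewhere. This is an illustration under several assumptions, not the code used in this port: the wrapper name `matmul_f32()` is hypothetical, the header name `dspm_mult.h` and the `ESP_PLATFORM` definition are assumed to be provided by esp-dsp and ESP-IDF, and the argument order should be checked against the esp-dsp documentation.

```cpp
#include <cassert>

#if defined(ESP_PLATFORM)
#include "dspm_mult.h"  // assumed esp-dsp header declaring dspm_mult_f32()
#endif

// Computes C = A * B for row-major float matrices.
// A is m x n, B is n x k, C is m x k.
inline void matmul_f32(const float* A, const float* B, float* C,
                       int m, int n, int k) {
    assert(A != nullptr && B != nullptr && C != nullptr);
    assert(m > 0 && n > 0 && k > 0);
#if defined(ESP_PLATFORM)
    // Assumed signature: dspm_mult_f32(A, B, C, m, n, k), returning ESP_OK
    // on success.
    const bool ok = (dspm_mult_f32(A, B, C, m, n, k) == ESP_OK);
    assert(ok);
    (void)ok;
#else
    // Portable fallback so the same code also runs on a host machine.
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < k; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < n; ++p) {
                acc += A[i * n + p] * B[p * k + j];
            }
            C[i * k + j] = acc;
        }
    }
#endif
}
```

Keeping the fallback behind the same wrapper makes it possible to compare the results of the accelerated path against a reference implementation.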