# Optimization

This document describes the optimizations made to OpenCV so that it runs faster on the ESP32.

Activating optimization
----------------------------

To enable the ESP32 optimizations, pass the CMake parameter `-DESP32_OPTIMIZATION=ON`. When this parameter is `OFF`, all of the optimizations described here are disabled.

## Floating point support

The ESP32 has only a single-precision Floating Point Unit (no double-precision support in hardware). Double-precision arithmetic has to be emulated in software, so OpenCV functions that use `double` types are very slow.
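
To illustrate why the scalar type matters, here is a minimal C++ sketch that is not taken from OpenCV. The helper `dot` is hypothetical: it computes the same dot product with either a `float` or a `double` accumulator, so the two variants can be compared on the target.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): dot product of two float
// vectors, accumulated in type Acc.
template <typename Acc>
Acc dot(const std::vector<float>& a, const std::vector<float>& b)
{
    Acc sum = 0;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        // With Acc = float the multiply-add stays in single precision;
        // with Acc = double every operation is carried out in double.
        sum += static_cast<Acc>(a[i]) * static_cast<Acc>(b[i]);
    }
    return sum;
}
```

Calling `dot<float>(a, b)` keeps the whole loop in single precision, while `dot<double>(a, b)` forces double-precision operations. The `float` variant trades some accuracy for speed, which may or may not be acceptable depending on the algorithm.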

### Matrix multiplications

The changes are in the files `core/matmul.dispatch.cpp` and `core/matmul.simd.hpp`.

Results when multiplying 100x6 * 6x100 matrices:

1. Initial test: 60 ms
2. Changing `alpha` and `beta` from double to float in the `GEMMsingleMult()` function: 12 ms
3. Changing `alpha` and `beta` from double to float in the `gemmImpl()` function: 4.6 ms
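
The kind of change listed above can be sketched with a simplified reference GEMM. This is not OpenCV's implementation: `gemm_ref` and its parameters are hypothetical, and the template parameter `Scalar` only stands in for the type of `alpha`, `beta` and the accumulator, so that the `double` and `float` variants can be compared side by side.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference GEMM (not OpenCV's code):
// D = alpha * A * B + beta * C, with A (m x k), B (k x n), C and D (m x n),
// all stored row-major. Scalar is the type of alpha, beta and the accumulator.
template <typename Scalar>
std::vector<float> gemm_ref(const std::vector<float>& A,
                            const std::vector<float>& B,
                            const std::vector<float>& C,
                            std::size_t m, std::size_t k, std::size_t n,
                            Scalar alpha, Scalar beta)
{
    std::vector<float> D(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            Scalar acc = 0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += static_cast<Scalar>(A[i * k + p]) *
                       static_cast<Scalar>(B[p * n + j]);
            }
            // Scaling step: with Scalar = double this line alone already
            // forces double-precision multiplications and additions.
            D[i * n + j] = static_cast<float>(
                alpha * acc + beta * static_cast<Scalar>(C[i * n + j]));
        }
    }
    return D;
}
```

With `Scalar = float`, as in `gemm_ref<float>(A, B, C, m, k, n, 1.0f, 0.0f)`, no double-precision operation remains in the loop, which is the property the changes above aim for.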

Results when multiplying 150x100 * 100x150 matrices:

1. Initial test: 2757 ms
2. Changing `double` to `float` in the `GEMMStore()` function: 888 ms
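
The source does not say how these timings were measured. As a sketch of one possible approach, the hypothetical helper `mean_time_ms` below runs a piece of work several times and returns the mean wall-clock time per run, using only the C++ standard library.

```cpp
#include <chrono>

// Hypothetical helper: runs `work` `repeats` times and returns the mean
// wall-clock time per run in milliseconds (0.0 if repeats is not positive).
template <typename Work>
double mean_time_ms(Work work, int repeats)
{
    if (repeats <= 0) {
        return 0.0;
    }
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;
    return elapsed.count() / repeats;
}
```

For example, with two `CV_32F` matrices `A` (150x100) and `B` (100x150), a call such as `mean_time_ms([&] { cv::Mat D = A * B; }, 10)` would give a figure comparable to the ones listed above.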

## Esp-dsp library

The ESP32 processor provides the following hardware features:

* 16/24-bit Instruction Set
* Support for FPU (Floating Point Unit)
* Support for DSP instructions
* 32-bit integer multiplier
* 32-bit integer divider
* 40-bit MAC (Multiply-Accumulate)

The *esp-dsp* library (https://github.com/espressif/esp-dsp) provides functions written in assembly that make use of this hardware.

This section describes which of these functions are used in OpenCV, and where, to improve performance.
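
This section does not list the functions in question, so the following is only a sketch of how an esp-dsp routine could be put behind a small wrapper. The wrapper `mat_mult_f32` and the macro `USE_ESP_DSP` are hypothetical; the call to `dspm_mult_f32` assumes the esp-dsp matrix multiplication with arguments `(A, B, C, m, n, k)`, which should be checked against the esp-dsp documentation.

```cpp
#ifdef USE_ESP_DSP   // hypothetical macro, defined only when building against esp-dsp
#include "esp_dsp.h"
#endif

// Hypothetical wrapper: C (m x k) = A (m x n) * B (n x k), row-major floats.
void mat_mult_f32(const float* A, const float* B, float* C, int m, int n, int k)
{
#ifdef USE_ESP_DSP
    // Assumed esp-dsp call; the return code is ignored in this sketch.
    dspm_mult_f32(A, B, C, m, n, k);
#else
    // Portable fallback with the same semantics, used when esp-dsp is absent.
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < k; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < n; ++p) {
                acc += A[i * n + p] * B[p * k + j];
            }
            C[i * k + j] = acc;
        }
    }
#endif
}
```

Keeping a portable fallback makes it possible to compare the results of the optimized path against a plain C++ loop on any host.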