mirror of https://gitcode.com/gh_mirrors/es/esp32-opencv.git synced 2025-08-14 10:40:47 +08:00


Joachim 58d3b77970 Started to optimize OpenCV for the ESP32

- Using float instead of double for floating point matrix multiplications (in core/src/matmul.simd.hpp) greatly reduces the computation time

2020-05-20 14:12:33 +02:00


Optimization

This doc details some optimizations done for OpenCV to run faster on the ESP32.

Activating optimization

To activate the ESP32 optimizations, pass -DESP32_OPTIMIZATION=ON to CMake. When this parameter is OFF, every optimization described here is disabled.
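One common way for a build switch like this to reach the source code is a preprocessor definition that selects a type at compile time. The sketch below assumes a hypothetical macro `ESP32_OPTIMIZATION_SKETCH` and a hypothetical alias `gemm_scalar_t`; neither name comes from the OpenCV sources, and the port may wire the CMake option differently.

```cpp
#include <type_traits>

// Hypothetical macro: assumed to be defined by the build system when the
// optimization is ON. The definition actually used by the port may differ.
#ifdef ESP32_OPTIMIZATION_SKETCH
using gemm_scalar_t = float;   // single precision, handled by the ESP32 FPU
#else
using gemm_scalar_t = double;  // default, unmodified behaviour
#endif

// Illustrative helper: scale a value using the selected scalar type.
inline gemm_scalar_t scale(gemm_scalar_t alpha, gemm_scalar_t x)
{
    return alpha * x;
}

// True when the single-precision path was selected at compile time.
constexpr bool uses_single_precision()
{
    return std::is_same<gemm_scalar_t, float>::value;
}
```

With this pattern, code that calls `scale()` stays the same whether the option is ON or OFF; only the alias changes.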

Floating point support

The ESP32 only has a single-precision Floating Point Unit (no double precision). Double-precision operations are therefore emulated in software, so OpenCV functions that use double types are very slow.
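The cost difference can be checked with a small timing harness. The following is a minimal sketch, not code from the port: `mac_sum` and `time_mac_us` are hypothetical helper names, and it assumes `std::chrono::steady_clock` is available on the target toolchain. On a desktop CPU with a double-precision FPU the two timings will usually be close; the gap is expected only on hardware without one.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Multiply-accumulate over two vectors using scalar type T; returns the sum.
template <typename T>
T mac_sum(const std::vector<T>& a, const std::vector<T>& b)
{
    T acc = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        acc += a[i] * b[i];
    return acc;
}

// Time `repeats` runs of mac_sum<T> on vectors of length n and return the
// elapsed microseconds. The result is written to `sink` to discourage the
// compiler from removing the computation.
template <typename T>
long long time_mac_us(std::size_t n, int repeats, volatile double& sink)
{
    std::vector<T> a(n, T(1.5)), b(n, T(2));
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r)
        sink = static_cast<double>(mac_sum(a, b));
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling `time_mac_us<float>(n, repeats, sink)` and `time_mac_us<double>(n, repeats, sink)` with the same arguments gives two figures that can be compared directly.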

Matrix multiplications

The changes are in the files core/matmul.dispatch.cpp and core/matmul.simd.hpp.

Results for multiplying a 100x6 matrix by a 6x100 matrix:

Initial test: 60 ms
Changing alpha and beta from double to float in the GEMMSingleMul() function: 12 ms
Changing alpha and beta from double to float in the gemmImpl() function: 4.6 ms
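To illustrate what the scalar type of alpha and beta affects, here is a naive general matrix multiply, D = alpha * A * B + beta * C, templated on that type. This is a simplified sketch and not OpenCV's implementation: `gemm_naive` is a hypothetical name, and using the same type for the accumulator is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Naive GEMM: D = alpha * A * B + beta * C, row-major storage.
// A is m x n, B is n x k, C and D are m x k.
// Scalar is the type of alpha, beta and (by assumption) the accumulator.
template <typename Scalar>
std::vector<float> gemm_naive(const std::vector<float>& A,
                              const std::vector<float>& B,
                              const std::vector<float>& C,
                              std::size_t m, std::size_t n, std::size_t k,
                              Scalar alpha, Scalar beta)
{
    std::vector<float> D(m * k);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < k; ++j) {
            Scalar acc = 0;
            for (std::size_t p = 0; p < n; ++p)
                acc += static_cast<Scalar>(A[i * n + p]) *
                       static_cast<Scalar>(B[p * k + j]);
            // With Scalar = double, this line forces double arithmetic
            // even though all matrix elements are float.
            D[i * k + j] = static_cast<float>(
                alpha * acc + beta * static_cast<Scalar>(C[i * k + j]));
        }
    }
    return D;
}
```

Instantiating it as `gemm_naive<double>` mixes double arithmetic into every output element, while `gemm_naive<float>` keeps the whole computation in single precision, which is the type the ESP32 FPU can handle.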

Results for multiplying a 150x100 matrix by a 100x150 matrix:

Initial test: 2757 ms
Changing double to float in the GEMMStore() function: 888 ms

Esp-dsp library

The ESP32 processor has the following hardware:

16/24-bit Instruction Set
Support for FPU (Floating Point Unit)
Support for DSP instructions
- 32-bit integer multiplier
- 32-bit integer divider
- 40-bit MAC (Multiply-Accumulate)

The esp-dsp library (https://github.com/espressif/esp-dsp) provides functions written in assembly to use this hardware.
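As a point of comparison, the sketch below is a portable single-precision matrix product that can serve as a reference when checking results from an optimized routine. `ref_mult_f32` is a hypothetical name; its argument order (A, B, C, m, n, k) is chosen to match what esp-dsp's `dspm_mult_f32` is understood to use, which should be checked against the library headers. It does not show how the port actually calls esp-dsp.

```cpp
#include <cstddef>

// Portable reference for a single-precision matrix product C = A * B.
// A is m x n, B is n x k, C is m x k, all stored row-major.
// Hypothetical helper for validation; not part of esp-dsp or OpenCV.
inline void ref_mult_f32(const float* A, const float* B, float* C,
                         int m, int n, int k)
{
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < k; ++j) {
            float acc = 0.0f;  // accumulate in float only
            for (int p = 0; p < n; ++p)
                acc += A[i * n + p] * B[p * k + j];
            C[i * k + j] = acc;
        }
    }
}
```

Comparing the output of an assembly-optimized routine against a reference like this on a few small matrices is a cheap way to catch layout or dimension mix-ups.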

This part describes which esp-dsp functions are used in OpenCV, and where, to improve performance.
