[v5,0/5] software_isp: debayer_cpu: Add multi-threading support

Message ID 20260304075052.11599-1-johannes.goede@oss.qualcomm.com

Hans de Goede March 4, 2026, 7:50 a.m. UTC
Hi All,

The QCM2290 SoC used on the Arduino Uno-Q seems to have a weak GPU, so weak
that it is barely faster than a single CPU core (CPU without CCM).

This has made me code up the long-envisioned multi-threading support
for the CPU softISP.
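
As a rough illustration of the idea (not the code from this series), the
rows of a frame can be divided into contiguous ranges, with each range
handled by its own thread. The names RowRange, splitRows() and
processFrame() are hypothetical, plain std::thread stands in for
libcamera's Thread class, and aligning ranges to even rows so a 2x2 Bayer
pattern is never split is an assumption of this sketch:

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical description of the rows one worker handles: [yStart, yEnd).
struct RowRange {
	unsigned int yStart;
	unsigned int yEnd;
};

// Split `height` rows into at most `threadCount` contiguous ranges.
// Every range starts on an even row (assumption: 2x2 Bayer pattern).
std::vector<RowRange> splitRows(unsigned int height, unsigned int threadCount)
{
	std::vector<RowRange> ranges;
	if (height == 0 || threadCount == 0)
		return ranges;

	// Rows per thread, rounded up, then rounded up to an even number.
	unsigned int rows = (height + threadCount - 1) / threadCount;
	rows = (rows + 1) & ~1u;

	for (unsigned int y = 0; y < height; y += rows)
		ranges.push_back({ y, std::min(y + rows, height) });

	return ranges;
}

// Call processRow(y) once for every row, using one thread per range.
// processRow must be safe to call concurrently for different rows.
template<typename Func>
void processFrame(unsigned int height, unsigned int threadCount, Func processRow)
{
	std::vector<std::thread> workers;
	for (const RowRange &r : splitRows(height, threadCount)) {
		workers.emplace_back([r, &processRow]() {
			for (unsigned int y = r.yStart; y < r.yEnd; y++)
				processRow(y);
		});
	}

	// The frame is complete once every worker has finished its range.
	for (std::thread &t : workers)
		t.join();
}
```

This version starts fresh threads for every call, which is the simplest
form of the idea; keeping the threads alive between frames avoids that
per-frame start-up cost.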

Changes in v5:
- Extend software_isp.threads docs in runtime_configuration.rst
- New patch: "Documentation/runtime_configuration: Add missing
  software_isp.mode doc"

Changes in v4:
- Use const in for (const auto &s : stats_) {} in SwStatsCpu::finishFrame()
- Move kMaxLineBuffers constant to DebayerCpuThread class
- Document software_isp.threads option in runtime_configuration.rst
- Add and use constants for min/max/default number of threads
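
For illustration only, constants like these are typically used to
sanitize a configured thread count. The constant names, their values and
the threadCount() helper below are hypothetical; the actual limits used
by the patches are not stated in this cover letter:

```cpp
#include <algorithm>
#include <optional>

// Hypothetical limits, chosen only for this example.
constexpr unsigned int kMinThreads = 1;
constexpr unsigned int kMaxThreads = 8;
constexpr unsigned int kDefaultThreads = 2;

// Return the thread count to use: the default when nothing was
// configured, otherwise the requested value clamped to [min, max].
unsigned int threadCount(std::optional<unsigned int> requested)
{
	if (!requested)
		return kDefaultThreads;

	return std::clamp(*requested, kMinThreads, kMaxThreads);
}
```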

Changes in v3:
- Use std::unique_ptr for the DebayerCpuThread pointers
- Document new DebayerCpuThread class
- Make DebayerCpuThread inherit from both Thread and Object
- Use for (auto &thread : threads_)
- Use for (auto &s : stats_) {}
- Move input format logging from DebayerCpu::configure() to
  SoftwareIsp::configure()

Changes in v2:
- Quite a bit of refactoring based on v1 feedback, dropped patch 3/5 and 4/5
  from v2 since these now no longer make sense
- Move the allocation of the vector of SwIspStats objects to inside
  the SwStatsCpu class, controlled by a configure() argument instead
  of making the caller allocate the objects
- Replace the DebayerCpuThreadData struct from v1 with a DebayerCpuThread
  class, derived from Object to allow calling invokeMethod for thread re-use
  in followup patches
- As part of this also move a bunch of methods which primarily deal with
  per thread data: setupInputMemcpy(), shiftLinePointers(), memcpyNextLine(),
  process*() to the new DebayerCpuThread class
- Re-use threads instead of starting new threads every frame
- Add a new patch adding some extra DebayerCpu input format logging
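
The thread re-use mentioned above can be sketched as a worker that stays
alive across frames and is handed a new job for each one. This is only a
sketch under stated assumptions: FrameWorker is a hypothetical class built
on std::thread and std::condition_variable, whereas the series uses a
DebayerCpuThread derived from Thread and Object together with
invokeMethod:

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical worker which keeps a single thread alive across frames.
class FrameWorker
{
public:
	FrameWorker() : thread_([this]() { run(); }) {}

	~FrameWorker()
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			exit_ = true;
		}
		cv_.notify_all();
		thread_.join();
	}

	// Hand a job to the worker thread and return immediately.
	// Call wait() before starting the next job or destroying the worker.
	void start(std::function<void()> job)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			job_ = std::move(job);
			busy_ = true;
		}
		cv_.notify_all();
	}

	// Block until the job passed to start() has finished.
	void wait()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		cv_.wait(lock, [this]() { return !busy_; });
	}

private:
	void run()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		for (;;) {
			// Sleep until there is a job or we are asked to exit.
			cv_.wait(lock, [this]() { return busy_ || exit_; });
			if (exit_)
				return;

			std::function<void()> job = std::move(job_);
			lock.unlock();
			job(); // run the job without holding the lock
			lock.lock();

			busy_ = false;
			cv_.notify_all();
		}
	}

	std::mutex mutex_;
	std::condition_variable cv_;
	std::function<void()> job_;
	bool busy_ = false;
	bool exit_ = false;
	std::thread thread_; // declared last so the members above exist first
};
```

With one such worker per thread, processing a frame becomes a start()
call on every worker followed by a wait() on every worker, and no thread
is created or destroyed per frame.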

Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464
without CCM:

1 thread :  147ms / frame, ~6.5  fps
2 threads:   80ms / frame, ~12.5 fps
3 threads:   65ms / frame, ~15   fps
GPU:        130ms / frame, ~7.5  fps
GPU 0-copy: 110ms / frame, ~9.5  fps (requires pipeline + camss hacks)
GPU lite:    85ms / frame, ~12   fps (CCM, contrast and gamma disabled)
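
To put the CPU numbers in perspective, the scaling can be expressed as a
speedup over the single-threaded time and as a per-thread efficiency.
From the table, 3 threads give roughly a 2.3x speedup over 1 thread
(147ms / 65ms). The helper names below are hypothetical and only show
the arithmetic:

```cpp
// Speedup of a run taking `ms` per frame over a baseline of `baseMs`.
double speedup(double baseMs, double ms)
{
	return baseMs / ms;
}

// Fraction of ideal linear scaling achieved with `threads` threads.
double efficiency(double baseMs, double ms, unsigned int threads)
{
	return speedup(baseMs, ms) / threads;
}
```

For example, efficiency(147, 80, 2) is about 0.92 and
efficiency(147, 65, 3) is about 0.75, so the third thread helps less
than the second.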

Regards,

Hans


Hans de Goede (5):
  software_isp: swstats_cpu: Prepare for multi-threading support
  software_isp: debayer_cpu: Add DebayerCpuThread class
  software_isp: debayer_cpu: Add multi-threading support
  software_isp: Log input config from configure()
  Documentation/runtime_configuration: Add missing software_isp.mode doc

 Documentation/runtime_configuration.rst       |  20 ++
 .../internal/software_isp/swstats_cpu.h       |  25 +-
 src/libcamera/software_isp/debayer_cpu.cpp    | 286 ++++++++++++++----
 src/libcamera/software_isp/debayer_cpu.h      |  33 +-
 src/libcamera/software_isp/software_isp.cpp   |  12 +-
 src/libcamera/software_isp/swstats_cpu.cpp    |  54 ++--
 6 files changed, 321 insertions(+), 109 deletions(-)