[v4,0/4] software_isp: debayer_cpu: Add multi-threading support

Message ID 20260303111741.17417-1-johannes.goede@oss.qualcomm.com


Hans de Goede March 3, 2026, 11:17 a.m. UTC
Hi All,

The QCM2290 SoC used on the Arduino Uno-Q seems to have a weak GPU, so weak
that it is barely faster than a single CPU core (CPU without CCM).

This has made me code up the long-envisioned multi-threading support
for the CPU softISP.
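
As a rough illustration of the idea (not the code from this series), the rows
of a frame can be divided into contiguous bands, one per worker thread. The
names RowBand and splitRows() are hypothetical, and keeping band boundaries on
even rows so that every band starts on the same Bayer pattern row is an
assumption of this sketch:

```cpp
#include <cassert>
#include <vector>

/* Hypothetical helper type: a half-open range of rows [begin, end). */
struct RowBand {
	unsigned int begin;
	unsigned int end;
};

/*
 * Split height rows into at most nThreads contiguous bands. Band
 * boundaries are kept on even rows (assumption: a 2x2 Bayer pattern),
 * so every band starts on the same pattern row.
 */
std::vector<RowBand> splitRows(unsigned int height, unsigned int nThreads)
{
	assert(nThreads > 0);

	std::vector<RowBand> bands;
	unsigned int pairs = height / 2; /* number of complete 2-row pairs */
	unsigned int begin = 0;

	for (unsigned int i = 0; i < nThreads; i++) {
		/* Distribute the row pairs as evenly as possible. */
		unsigned int n = pairs / nThreads + (i < pairs % nThreads ? 1 : 0);
		unsigned int end = begin + 2 * n;

		/* The last band also takes a trailing odd row, if any. */
		if (i == nThreads - 1)
			end = height;

		/* Skip empty bands when there are more threads than row pairs. */
		if (end > begin)
			bands.push_back({ begin, end });

		begin = end;
	}

	return bands;
}
```

With the 2464-row frame from the benchmark and 3 threads, this sketch yields
the bands [0, 822), [822, 1644) and [1644, 2464).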

Changes in v4:
- Use const in for (const auto &s : stats_) {} in SwStatsCpu::finishFrame()
- Move kMaxLineBuffers constant to DebayerCpuThread class
- Document "software_isp", "threads" option in runtime_configuration.rst
- Add and use constants for min/max/default number of threads

Changes in v3:
- Use std::unique_ptr for the DebayerCpuThread pointers
- Document new DebayerCpuThread class
- Make DebayerCpuThread inherit from both Thread and Object
- Use for (auto &thread : threads_)
- Use for (auto &s : stats_) {}
- Move input format logging from DebayerCpu::configure() to
  SoftwareIsp::configure()

Changes in v2:
- Quite a bit of refactoring based on v1 feedback; dropped patches 3/5 and
  4/5 from v2 since they no longer make sense
- Move the allocation of the vector of SwIspStats objects to inside
  the SwStatsCpu class, controlled by a configure() argument instead
  of making the caller allocate the objects
- Replace the DebayerCpuThreadData struct from v1 with a DebayerCpuThread
  class, derived from Object to allow calling invokeMethod for thread re-use
  in followup patches
- As part of this also move a bunch of methods which primarily deal with
  per thread data: setupInputMemcpy(), shiftLinePointers(), memcpyNextLine(),
  process*() to the new DebayerCpuThread class
- Re-use threads instead of starting new threads every frame
- Add a new patch adding some extra DebayerCpu input format logging
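
The per-thread statistics objects mentioned above can be pictured as follows.
This is a simplified sketch, not the SwStatsCpu code: ThreadStats, its fields
and mergeStats() are hypothetical stand-ins, and summing the per-thread
results at the end of the frame is an assumption about how they could be
combined:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Hypothetical per-thread accumulator; the real SwIspStats layout differs. */
struct ThreadStats {
	uint64_t sumR = 0;
	uint64_t sumG = 0;
	uint64_t sumB = 0;
	std::array<uint32_t, 64> yHistogram{};
};

/*
 * Combine the accumulators once all threads have finished their part of
 * the frame. Each thread only writes to its own ThreadStats object while
 * processing, so no locking is needed until this point.
 */
ThreadStats mergeStats(const std::vector<ThreadStats> &stats)
{
	ThreadStats total;

	for (const auto &s : stats) {
		total.sumR += s.sumR;
		total.sumG += s.sumG;
		total.sumB += s.sumB;

		for (std::size_t i = 0; i < total.yHistogram.size(); i++)
			total.yHistogram[i] += s.yHistogram[i];
	}

	return total;
}
```

Giving each thread its own accumulator trades a small merge step per frame
for lock-free accumulation in the per-pixel loops.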

Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464
without CCM:

1 thread :  147ms / frame, ~6.5  fps
2 threads:   80ms / frame, ~12.5 fps
3 threads:   65ms / frame, ~15   fps
GPU:        130ms / frame, ~7.5  fps
GPU 0-copy: 110ms / frame, ~9.5  fps (requires pipeline + camss hacks)
GPU lite:    85ms / frame, ~12   fps (CCM, contrast and gamma disabled)

Regards,

Hans


Hans de Goede (4):
  software_isp: swstats_cpu: Prepare for multi-threading support
  software_isp: debayer_cpu: Add DebayerCpuThread class
  software_isp: debayer_cpu: Add multi-threading support
  software_isp: Log input config from configure()

 Documentation/runtime_configuration.rst       |   1 +
 .../internal/software_isp/swstats_cpu.h       |  25 +-
 src/libcamera/software_isp/debayer_cpu.cpp    | 286 ++++++++++++++----
 src/libcamera/software_isp/debayer_cpu.h      |  33 +-
 src/libcamera/software_isp/software_isp.cpp   |  12 +-
 src/libcamera/software_isp/swstats_cpu.cpp    |  54 ++--
 6 files changed, 302 insertions(+), 109 deletions(-)