| Message ID | 20260216190204.106922-1-johannes.goede@oss.qualcomm.com |
|---|---|
|
Hi Hans,

Quoting Hans de Goede (2026-02-16 19:01:59)
> Hi All,
>
> The QCM2290 SoC used on the Arduino Uno-Q seems to have a very weak GPU(1),
> so weak that it is barely faster than a single CPU core.
>
> This has made me code-up the long envisioned multi-threading support
> for the CPU softISP :)
>
> Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464:

I'm afraid I think there are CI failures in this branch:

https://gitlab.freedesktop.org/camera/libcamera/-/jobs/93408936#L538

>
> 1 thread : 147ms / frame, ~6.5 fps
> 2 threads: 81ms / frame, ~12 fps
> 3 threads: 66ms / frame, ~14.5 fps
> GPU: 130ms / frame, ~7.5 fps
> GPU 0-copy: 110ms / frame, ~9.5 fps (requires pipeline + camss hacks)
> GPU lite: 85ms / frame, ~12 fps (CCM, contrast and gamma disabled)
>
> Regards,
>
> Hans
>
> 1) If the GPU really is this weak needs to be investigated more
>
> Hans de Goede (5):
>   software_isp: swstats_cpu: Move accumulator storage out of the class
>   software_isp: debayer_cpu: Add per render thread data
>   software_isp: debayer_cpu: Group innerloop variables together
>   software_isp: debayer_cpu: Select process inner loop by function
>     pointer
>   software_isp: debayer_cpu: Add multi-threading support
>
>  .../internal/software_isp/swstats_cpu.h       |  29 ++--
>  src/libcamera/software_isp/debayer_cpu.cpp    | 131 ++++++++++++------
>  src/libcamera/software_isp/debayer_cpu.h      |  44 ++++--
>  src/libcamera/software_isp/swstats_cpu.cpp    |  65 ++++++---
>  4 files changed, 180 insertions(+), 89 deletions(-)
>
> --
> 2.52.0
>

Hans de Goede <johannes.goede@oss.qualcomm.com> writes:

> Hi All,
>
> The QCM2290 SoC used on the Arduino Uno-Q seems to have a very weak GPU(1),
> so weak that it is barely faster than a single CPU core.
>
> This has made me code-up the long envisioned multi-threading support
> for the CPU softISP :)

Reason to not drop CPU ISP in future?

> Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464:
>
> 1 thread : 147ms / frame, ~6.5 fps
> 2 threads: 81ms / frame, ~12 fps
> 3 threads: 66ms / frame, ~14.5 fps
> GPU: 130ms / frame, ~7.5 fps
> GPU 0-copy: 110ms / frame, ~9.5 fps (requires pipeline + camss hacks)
> GPU lite: 85ms / frame, ~12 fps (CCM, contrast and gamma disabled)

The CPU measurements are with or without CCM?

> Regards,
>
> Hans
>
> 1) If the GPU really is this weak needs to be investigated more
>
> Hans de Goede (5):
>   software_isp: swstats_cpu: Move accumulator storage out of the class
>   software_isp: debayer_cpu: Add per render thread data
>   software_isp: debayer_cpu: Group innerloop variables together
>   software_isp: debayer_cpu: Select process inner loop by function
>     pointer
>   software_isp: debayer_cpu: Add multi-threading support
>
>  .../internal/software_isp/swstats_cpu.h       |  29 ++--
>  src/libcamera/software_isp/debayer_cpu.cpp    | 131 ++++++++++++------
>  src/libcamera/software_isp/debayer_cpu.h      |  44 ++++--
>  src/libcamera/software_isp/swstats_cpu.cpp    |  65 ++++++---
>  4 files changed, 180 insertions(+), 89 deletions(-)

On Tue, Feb 17, 2026 at 11:00:06PM +0100, Milan Zamazal wrote:
> Hans de Goede writes:
>
> > Hi All,
> >
> > The QCM2290 SoC used on the Arduino Uno-Q seems to have a very weak GPU(1),
> > so weak that it is barely faster than a single CPU core.
> >
> > This has made me code-up the long envisioned multi-threading support
> > for the CPU softISP :)
>
> Reason to not drop CPU ISP in future?

Note that I still think the CPU implementation should evolve to perform
the same computation as the GPU implementation, which will include LSC
and other algorithms. It will therefore slow down.

The right solution to this problem is to support the hardware ISP
included in the QCM2290 :-)

> > Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464:
> >
> > 1 thread : 147ms / frame, ~6.5 fps
> > 2 threads: 81ms / frame, ~12 fps
> > 3 threads: 66ms / frame, ~14.5 fps
> > GPU: 130ms / frame, ~7.5 fps
> > GPU 0-copy: 110ms / frame, ~9.5 fps (requires pipeline + camss hacks)
> > GPU lite: 85ms / frame, ~12 fps (CCM, contrast and gamma disabled)
>
> The CPU measurements are with or without CCM?
>
> > Regards,
> >
> > Hans
> >
> > 1) If the GPU really is this weak needs to be investigated more
> >
> > Hans de Goede (5):
> >   software_isp: swstats_cpu: Move accumulator storage out of the class
> >   software_isp: debayer_cpu: Add per render thread data
> >   software_isp: debayer_cpu: Group innerloop variables together
> >   software_isp: debayer_cpu: Select process inner loop by function
> >     pointer
> >   software_isp: debayer_cpu: Add multi-threading support
> >
> >  .../internal/software_isp/swstats_cpu.h       |  29 ++--
> >  src/libcamera/software_isp/debayer_cpu.cpp    | 131 ++++++++++++------
> >  src/libcamera/software_isp/debayer_cpu.h      |  44 ++++--
> >  src/libcamera/software_isp/swstats_cpu.cpp    |  65 ++++++---
> >  4 files changed, 180 insertions(+), 89 deletions(-)

Hi,

On 17-Feb-26 11:00 PM, Milan Zamazal wrote:
> Hans de Goede <johannes.goede@oss.qualcomm.com> writes:
>
>> Hi All,
>>
>> The QCM2290 SoC used on the Arduino Uno-Q seems to have a very weak GPU(1),
>> so weak that it is barely faster than a single CPU core.
>>
>> This has made me code-up the long envisioned multi-threading support
>> for the CPU softISP :)
>
> Reason to not drop CPU ISP in future?

One reason, yes. I think it will be good to keep it around as a lowest
common denominator anyway, also for e.g. phones with older PowerVR
graphics which will never get FOSS GPU support, and for other cases
where we may not be able to use a GPU for one reason or another.

>> Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464:
>>
>> 1 thread : 147ms / frame, ~6.5 fps
>> 2 threads: 81ms / frame, ~12 fps
>> 3 threads: 66ms / frame, ~14.5 fps
>> GPU: 130ms / frame, ~7.5 fps
>> GPU 0-copy: 110ms / frame, ~9.5 fps (requires pipeline + camss hacks)
>> GPU lite: 85ms / frame, ~12 fps (CCM, contrast and gamma disabled)
>
> The CPU measurements are with or without CCM?

Without CCM.

Regards,

Hans

Hi All,

The QCM2290 SoC used on the Arduino Uno-Q seems to have a very weak GPU(1),
so weak that it is barely faster than a single CPU core.

This has made me code-up the long envisioned multi-threading support
for the CPU softISP :)

Benchmark results for the Uno-Q + IMX219 running at 3280x2464 -> 3272x2464:

1 thread : 147ms / frame, ~6.5 fps
2 threads: 81ms / frame, ~12 fps
3 threads: 66ms / frame, ~14.5 fps
GPU: 130ms / frame, ~7.5 fps
GPU 0-copy: 110ms / frame, ~9.5 fps (requires pipeline + camss hacks)
GPU lite: 85ms / frame, ~12 fps (CCM, contrast and gamma disabled)

Regards,

Hans

1) Whether the GPU really is this weak needs to be investigated more

Hans de Goede (5):
  software_isp: swstats_cpu: Move accumulator storage out of the class
  software_isp: debayer_cpu: Add per render thread data
  software_isp: debayer_cpu: Group innerloop variables together
  software_isp: debayer_cpu: Select process inner loop by function
    pointer
  software_isp: debayer_cpu: Add multi-threading support

 .../internal/software_isp/swstats_cpu.h       |  29 ++--
 src/libcamera/software_isp/debayer_cpu.cpp    | 131 ++++++++++++------
 src/libcamera/software_isp/debayer_cpu.h      |  44 ++++--
 src/libcamera/software_isp/swstats_cpu.cpp    |  65 ++++++---
 4 files changed, 180 insertions(+), 89 deletions(-)
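
For readers who have not looked at the patches, the general idea behind multi-threading a CPU debayer can be pictured as splitting the frame into horizontal stripes, one per thread, with each thread keeping its own statistics accumulator that is merged after all threads finish. The sketch below is only an illustration of that idea and is not taken from the series: `RowRange`, `StripeStats`, `splitRows` and `processFrame` are hypothetical names, the inner loop is a placeholder copy rather than real debayering, and aligning stripe boundaries to even rows (to keep the Bayer pattern phase) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-thread accumulator, so threads never share mutable state.
struct StripeStats {
	uint64_t sum = 0;
};

// Half-open row range [begin, end) processed by one thread.
struct RowRange {
	unsigned int begin;
	unsigned int end;
};

// Split `height` rows into `threads` stripes. Inner boundaries are rounded
// down to an even row (assumption: keeps each stripe on a 2x2 Bayer cell).
std::vector<RowRange> splitRows(unsigned int height, unsigned int threads)
{
	threads = std::max(1u, threads);
	std::vector<RowRange> ranges;
	unsigned int begin = 0;
	for (unsigned int i = 1; i <= threads; i++) {
		unsigned int end = height;
		if (i != threads)
			end = static_cast<unsigned int>(
				static_cast<uint64_t>(height) * i / threads) & ~1u;
		ranges.push_back({ begin, end });
		begin = end;
	}
	return ranges;
}

// Process a frame with `threads` workers and return the merged statistic.
// The per-pixel work is a stand-in (copy + sum), not actual debayering.
uint64_t processFrame(const std::vector<uint8_t> &in, std::vector<uint8_t> &out,
		      unsigned int width, unsigned int height, unsigned int threads)
{
	assert(in.size() >= static_cast<size_t>(width) * height);
	assert(out.size() >= static_cast<size_t>(width) * height);

	const std::vector<RowRange> ranges = splitRows(height, threads);
	std::vector<StripeStats> stats(ranges.size());

	auto work = [&](size_t idx) {
		for (unsigned int y = ranges[idx].begin; y < ranges[idx].end; y++) {
			for (unsigned int x = 0; x < width; x++) {
				const size_t pos = static_cast<size_t>(y) * width + x;
				out[pos] = in[pos];
				stats[idx].sum += in[pos];
			}
		}
	};

	// Stripes 1..N-1 run on extra threads, stripe 0 on the calling thread.
	std::vector<std::thread> workers;
	for (size_t i = 1; i < ranges.size(); i++)
		workers.emplace_back(work, i);
	work(0);
	for (std::thread &t : workers)
		t.join();

	// Merge the per-thread accumulators only after every thread has joined.
	uint64_t total = 0;
	for (const StripeStats &s : stats)
		total += s.sum;
	return total;
}
```

Under these assumptions, calling `splitRows(2464, 3)` for the 2464-row frame from the benchmark would give the stripes [0, 820), [820, 1642) and [1642, 2464). Because output rows and accumulators are disjoint per thread, no locking is needed in the inner loop; a real implementation would also have to deal with debayer kernels that read neighbouring rows across stripe boundaries, which this sketch does not attempt.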