[v3,0/8] libcamera: software_isp: gpu: Add go faster stripes

Message ID 20260626113325.3218045-1-bryan.odonoghue@linaro.org

Bryan O'Donoghue June 26, 2026, 11:33 a.m. UTC
v3:
- Fixes a missing switch default flagged by the libcamera CI loop - bod

v2:
- Reverts to debayer_egl specific caching mechanism - Laurent
- Pivots on the dmabuf handle as the key for the cache. The dmabuf handle
  must be unique between start() and stop() for eglCreateImageKHR to work - Barnabas
- Ensures the cache size doesn't exceed the number of expected buffers
  as discovered during configure() - Barnabas
- Drops GL_16F patch - Robert
- Drops the "input" name from createInputTexture2D() - Robert
- Adds RB as specified by Robert
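
As a rough illustration of the v2 cache points above (keyed on the dmabuf
handle, bounded by the buffer count), here is a minimal sketch. It is not
the code in this series: TextureCache, CachedTexture and textureId are
hypothetical names, and clearing the cache on stop() is an assumption drawn
from the handle only being unique between start() and stop().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for per-buffer GPU state (e.g. an eglImage plus
// its texture); not a libcamera type.
struct CachedTexture {
	uint32_t textureId;
};

// Hypothetical bounded cache keyed on the dmabuf file descriptor.
class TextureCache
{
public:
	explicit TextureCache(std::size_t maxEntries)
		: maxEntries_(maxEntries)
	{
		assert(maxEntries_ > 0);
	}

	// Return the cached entry for fd, or nullptr on a miss.
	const CachedTexture *find(int fd) const
	{
		auto it = entries_.find(fd);
		return it == entries_.end() ? nullptr : &it->second;
	}

	// Add or update an entry. A new key is refused once the cache holds
	// maxEntries_ items, so it never grows past the expected buffer count.
	bool insert(int fd, CachedTexture tex)
	{
		if (entries_.find(fd) == entries_.end() &&
		    entries_.size() >= maxEntries_)
			return false;
		entries_[fd] = tex;
		return true;
	}

	// Assumption: called on stop(), because the fd is only guaranteed
	// unique between start() and stop().
	void clear() { entries_.clear(); }

	std::size_t size() const { return entries_.size(); }

private:
	std::size_t maxEntries_;
	std::unordered_map<int, CachedTexture> entries_;
};
```

In this sketch a miss would trigger the expensive image and texture
creation once per buffer, and every later frame using the same fd would
reuse the cached entry.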

v1:
Following on from Robert Mader's ask to bring forward GPUISP multi-pass
cache operations to mainline first, I've done some work to enable that.

This series implements an input/output texture caching scheme which results
in an absolute improvement of roughly 4.5 milliseconds per processed frame.
This is a bit of an odd result, as I was expecting to shave a relative
amount off each frame, on the order of 20% or so. My best guess is that the
paths we are optimising here are around texture generation, and these are
"fixed costs" on the CPU side.

One very welcome outcome of this series is genuine zero-copy on the dma-buf
paths where we can support it, i.e. on paths where the CSI2 and GPU strides
agree.
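
One way to picture the stride condition is as a predicate that picks
between a direct import and a copy. This is only a sketch under a stated
assumption: that "strides agree" means the CSI2 row stride is a multiple of
the pitch alignment the GPU requires. InputPath, selectInputPath and
gpuPitchAlignment are hypothetical names, not part of libcamera or EGL.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; not libcamera API.
enum class InputPath {
	ZeroCopyImport, // import the dma-buf directly as a texture
	CopyUpload,     // fall back to copying pixel data into a texture
};

// Assumption: the GPU accepts an imported buffer only when its row stride
// is a multiple of gpuPitchAlignment bytes.
InputPath selectInputPath(uint32_t csi2StrideBytes, uint32_t gpuPitchAlignment)
{
	assert(gpuPitchAlignment > 0);
	return (csi2StrideBytes % gpuPitchAlignment == 0)
		? InputPath::ZeroCopyImport
		: InputPath::CopyUpload;
}
```

The real alignment requirement is hardware and driver specific, so the
value passed in here would have to come from the platform.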

The numbers cited in my example are for non dma-buf handle upload on the
input and dma-buf handle render-to-texture on the output, so in fact I'd
expect to see a much larger improvement on systems where a dma-buf handle is
used on both paths.

As it is, on my test reference systems I have a 50% improvement per frame
on one system and a 20% improvement on another, or we could view it as a
fixed 4.5 millisecond improvement on both.
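
The percentage view and the fixed view can be reconciled with a little
arithmetic, assuming the percentage means time saved divided by the
original frame time. Under that assumption, and it is only an assumption,
the per-frame baselines implied by a 4.5 millisecond saving would be about
9 ms for the 50% system and about 22.5 ms for the 20% system. These are
derived figures, not measurements, and impliedBaselineMs is a hypothetical
helper.

```cpp
#include <cassert>
#include <cmath>

// Baseline frame time implied by a fixed saving and a relative
// improvement, assuming improvement = savedMs / baselineMs.
double impliedBaselineMs(double savedMs, double improvement)
{
	assert(improvement > 0.0);
	return savedMs / improvement;
}
```

For example, impliedBaselineMs(4.5, 0.5) and impliedBaselineMs(4.5, 0.2)
give the two baselines quoted above.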

I implemented the cache around V4L2BufferCache as per Barnabas' suggestion,
including a fixed cache size.

Bryan O'Donoghue (8):
  libcamera: software_isp: debayer_egl: Pass eglImage as parameter to
    setShaderVariables
  libcamera: software_isp: debayer_egl: Flag dmabuf use once per session
    not for every frame
  libcamera: egl: Add new helper attachTextureToFBO
  libcamera: egl: Add createOutputTexture2D
  libcamera: egl: Add updateTexture2D
  libcamera: egl: Add activateBindTexture
  libcamera: egl: Drop dmabuf_import_failed_
  libcamera: software_isp: debayer_egl: Implement input/output frame
    caching mechanism

 include/libcamera/internal/egl.h           |   6 +-
 src/libcamera/egl.cpp                      | 115 +++++++++++++++++----
 src/libcamera/software_isp/debayer_egl.cpp |  98 +++++++++++++-----
 src/libcamera/software_isp/debayer_egl.h   |  14 ++-
 4 files changed, 185 insertions(+), 48 deletions(-)