Message ID | 20251015012251.17508-1-bryan.odonoghue@linaro.org |
---|---|
Hi Bryan,

Building this currently is very 'verbose'

Found ninja-1.13.1 at /usr/bin/ninja
Cleaning... 11 files.
[21/369] Generating include/libcamera/internal/gen-shader-headers with a custom command
+ '[' 7 -lt 4 ']'
+ src_dir=/home/kbingham/iob/libcamera-softisp
+ shift
+ build_dir=/home/kbingham/iob/libcamera-softisp/build-gcc
+ shift
+ build_path=/home/kbingham/iob/libcamera-softisp/build-gcc/include/libcamera/internal/glsl_shaders.h
+ shift
+ cat
+ cat
+ for file in "$@"
+ echo 'file is ../include/libcamera/internal/shaders/bayer_1x_packed.frag'
file is ../include/libcamera/internal/shaders/bayer_1x_packed.frag
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_1x_packed.frag
++ tr . _
+ name=bayer_1x_packed_frag
+ echo ' * unsigned char bayer_1x_packed_frag;'
+ for file in "$@"
+ echo 'file is ../include/libcamera/internal/shaders/bayer_unpacked.frag'
file is ../include/libcamera/internal/shaders/bayer_unpacked.frag
++ tr . _
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.frag
+ name=bayer_unpacked_frag
+ echo ' * unsigned char bayer_unpacked_frag;'
+ for file in "$@"
+ echo 'file is ../include/libcamera/internal/shaders/bayer_unpacked.vert'
file is ../include/libcamera/internal/shaders/bayer_unpacked.vert
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.vert
++ tr . _
+ name=bayer_unpacked_vert
+ echo ' * unsigned char bayer_unpacked_vert;'
+ for file in "$@"
+ echo 'file is ../include/libcamera/internal/shaders/identity.vert'
file is ../include/libcamera/internal/shaders/identity.vert
++ tr . _
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/identity.vert
+ name=identity_vert
+ echo ' * unsigned char identity_vert;'
+ echo '*/'
+ echo '/* Hex encoded shader data */'
+ for file in "$@"
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_1x_packed.frag
+ name=bayer_1x_packed.frag
+ /home/kbingham/iob/libcamera-softisp/utils/gen-shader-header.py bayer_1x_packed.frag /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_1x_packed.frag
+ echo
+ for file in "$@"
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.frag
+ name=bayer_unpacked.frag
+ /home/kbingham/iob/libcamera-softisp/utils/gen-shader-header.py bayer_unpacked.frag /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.frag
+ echo
+ for file in "$@"
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.vert
+ name=bayer_unpacked.vert
+ /home/kbingham/iob/libcamera-softisp/utils/gen-shader-header.py bayer_unpacked.vert /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/bayer_unpacked.vert
+ echo
+ for file in "$@"
++ basename /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/identity.vert
+ name=identity.vert
+ /home/kbingham/iob/libcamera-softisp/utils/gen-shader-header.py identity.vert /home/kbingham/iob/libcamera-softisp/build-gcc/../include/libcamera/internal/shaders/identity.vert
+ echo

Can we turn off the '-x' or such that must be in here somewhere?

Quoting Bryan O'Donoghue (2025-10-15 02:22:12)
> This version 3
>
> - Adds AWB to the debayer routine as calculated by the IPA thread
>
> - Implements ~ all of the feedback from Barnabas quicker to mention
>   what hasn't been done.
>   a) A comment about member initialisation in eGL.cpp
>      code I wrote to make constructor init common seemed to negate
>      that ask.
>   b) meson dependency checks for egl.
>      I remember struggling with this earlier on in development.
>      I will certainly try to do this for a v4 so it's more
>      pending a try as opposed to not intended to be done.
>
> - Incorporates various fixes from Robert Mader
>   When to sync removing tearing for Milan
>   Some error checking that although Robert didn't mention in his
>   feedback were in his patches so I stole that code. Thanks.
>
> - Also worth mentioning Robert identified a permissions fix
>   that pipewire would need for eGL to work in libcamera with pipewire
>   published that fix and got it merged too.
>
> Owe you a beer for that one.
>
> - Is rebased on tip-of-tree
>
> - Currently the documentation checks for the various classes
>   don't pass but that is easy enough to fix in a V4.
>
> - In line with our discussions gpuisp is now the default instead of cpuisp.
>
> - Since it's only the documentation checks that are pending I thought
>   rather than delay further it was time to publish the series without
>   and see if anything major gets snagged.
>
> v2:
>
> This version 2 is an incomplete update with-respect-to previous comment
> feedback, which ordinarily I would not publish however, given OSSEU is
> starting on Monday and we have a talk about this topic, in addition to some
> pretty good progress in the interregnum I thought a v2 would be
> appropriate.
>
> - V2 drops use of GBM surface in favour of generating a framebuffer from
>   the dma-buf handle, called render-to-texture.
>
> The conversion from GBM surface + memcpy() including the associated cache
> invalidate has a dramatic effect on GPUISP performance.
>
> Some rough stats for a Qualcomm sm8250 "kona" device with an imx517
> sensor @ 4048 x 3040 ABRG8888 - debug builds
>
> CPUISP + CCM:
> 2 FPS CPU usage > 100% single core pulls about 9 watts
>
> GPUISP v1 + CCM:
> 14 FPS - power not measured
>
> GPUISP v2 + CCM:
> 30 FPS - sensor linerate - CPU usage ~ 70 % pulling 8 Watts.
>
> Milan Zamazal has reported a TI AM69 + imx219 - unknown resolution
>
> CPUISP 4 FPS
> GPUISP v2 - 2 or 3 FPS
> GPUISP v2 - 15 FPS == sensor linerate
>
> In other words for these boards we can hit linerate with GPUISP + 3A +
> CCM.
>
> - Drop GBM surface rendering
> - Drop swapbuffers
> - Use eglCreateImageKHR to directly render into the output dma-buf buffer
>   eglCreateImageKHR lets you specify the FOURCC of the texture which means
>   we can create the texture in the uncompressed target output pixel format
>   we want.
> - Fix stride calculation to 256 bytes
>   Laurent and Maxime explained to me about GPU stride alignments being
>   tribal wisdom and that 256 bytes is a good cross-platform value.
>   This helped to get the render-to-texture command right.
> - A synchronous blocking wait is used to ensure GPU operations have
>   completed. Laurent wants this to be made async.
>   At the moment it's not clear to me the eglWaitSyncKHR is really required
>   and in any case doesn't seem to have any performance impact.
>   But this part is still TBD - I've included the sync wait for simplicity
>   and safety.
> - A Debayer::stop() method has been introduced to ensure we call
>   eglDestroySyncKHR when the eGL context is valid, as opposed to in the
>   callchain of destructors triggering eGL::~eGL();
> - stats move constructor call chain dropped - Barnabas
> - Incorporates Milan's area-of-interest constraint for Bayer stats
>   i.e. squashes his v3 update into debayer_egl.cpp directly
> - Moves ALIGN_TO into a common area to facilitate its reuse in
>   egl.cpp
> - Rebases on 0.5.2
>
> - There are a number of known checks failing on the CI loop right now
>
> Link to v1: https://lists.libcamera.org/pipermail/libcamera-devel/2025-June/050692.html
>
> v1:
> This series introduces a GLES 2.0 GPU ISP to libcamera.
>
> We have had extensive discussions, meetings and collaborative discussions
> about this topic over the last year or so.
>
> As an overview we want to start to move as much processing of software_isp
> into the GPU as possible. This is especially advantageous when we are
> talking about processing a framebuffer's worth of pixels as quickly as
> possible.
>
> The decision to use GLES 2.0 instead of say Vulkan stems from a desire to
> support as much in the way of older hardware as possible and the fact we
> already have upstream GLES 2.0 fragment shaders to do debayer.
>
> Generally the approach is
>
> - Move the fragment shaders out of qcam and into a common location
> - Update the existing SoftwareISP Debayer/DebayerCPU pair to facilitate
>   addition of a new class DebayerEGL.
> - Introduce that class
> - Then do progressive change of the shaders and DebayerEGL class to make
>   the modifications as transparent as possible in the git log.
> - Reuse as much of the SoftIPA data-structures and logic as possible.
> - Consume the data from SoftIPA in the Debayer Shaders so that CPUISP and
>   GPUISP give similar - hopefully the same results but with GPUISP going
>   faster.
>
> In order to get untiled and uncompressed pixel data out of the GPU
> framebuffer we need to tell the GPU how to store the data it is writing to
> that framebuffer. GPUs can store their framebuffer data in tiled or even
> compressed formats which is why the naive approach of running your fragment
> shader and then using glReadPixels(GL_RGBA); will be horrendously slow as
> glReadPixels must convert from the internal GPU format to the requested
> output format - an operation that for me takes ~ 10 milliseconds per frame.
>
> Instead we get the GPU to store its data as ARGB8888 swap buffers and
> memcpy() from the swapped buffer to our output frame. Right now this series
> supports 32 bit output formats only.
>
> The memcpy() also entails flushing the cache of the target buffer as per
> the terms of the dma-buf software contract.
>
> This leads us onto the main outstanding TODOs
>
> - 24 bit GBM buffer support leading
> - 24 bit output framebuffer support
> - Surfaceless GBM and eGL context with no swapbuffer
> - Render to texture
>   If we render directly to a buffer provided to the GPU the output
>   buffer we will not need to memcpy() to the output buffer
>   nor will we need to invalidate the output buffer cache.
> - eglCreateImageKHR for the texture upload.
>
> This list is of the colour "make it go faster" not "make it work" which is
> why we are moving to start to submit a v1 for discussion in the full
> realisation it will have to go through several cycles of review giving us
> the opportunity to fix:
>
> - Doxygen is missing for new classes and methods
> - Some of the pipelines don't complete in gitlab
> - 24 bit output seems doable before merge
> - Render to texture perhaps even too
>
> For me on my Qualcomm hardware GPUISP works very well I get 30fps in qcam
> with about 75% CPU usage versus > 100% - cam goes faster which to me
> implies a good bit of time is being consumed in qcam itself.
>
> The series starts out with fixes and updates from Hans and finishes it out
> with shader modifications from Milan both of whom along with Kieran,
> Laurent and Maxime I'd like to thank for being so helpful and patient.
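On the "Fix stride calculation to 256 bytes" item in the v2 notes above, a minimal sketch of the arithmetic may help. It is written in Python for brevity and is not the series' ALIGN_TO code: the names align_to and stride_bytes are hypothetical, and the only inputs taken from the cover letter are the 256 byte alignment, the 4048 pixel width and a 32 bit output format.

```python
def align_to(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding alignment - 1 and masking the low bits rounds up.
    return (value + alignment - 1) & ~(alignment - 1)


def stride_bytes(width_px: int, bytes_per_pixel: int, alignment: int = 256) -> int:
    """Bytes per row once the row is padded to the alignment (hypothetical helper)."""
    return align_to(width_px * bytes_per_pixel, alignment)


# 4048 pixels at 4 bytes per pixel is 16192 bytes, padded up to 16384.
print(stride_bytes(4048, 4))
```

With these inputs each row carries 192 bytes of padding, which is why a consumer of the buffer has to step by the stride rather than by width * bytes per pixel.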
>
>
> Bryan O'Donoghue (31):
>   libcamera: shaders: Move GL shader programs to
>     src/libcamera/assets/shader
>   utils: gen-shader-headers: Add a utility to generate headers from
>     shaders
>   meson: Automatically generate glsl_shaders.h from specified shader
>     programs
>   libcamera: software_isp: Move useful items from DebayerCpu to Debayer
>     base class
>   libcamera: software_isp: Move Bayer params init from DebayerCpu to
>     Debayer
>   libcamera: software_isp: Move param select code to Debayer base class
>   libcamera: software_isp: Move DMA Sync code to Debayer base class
>   libcamera: software_isp: Make output DMA sync contingent
>   libcamera: software_isp: Move isStandardBayerOrder to base class
>   libcamera: software_isp: Start the ISP thread in configure
>   libcamera: software_isp: Move configure to worker thread
>   libcamera: software_isp: debayer: Make the debayer_ object of type
>     class Debayer not DebayerCpu
>   libcamera: software_isp: debayer: Extend DebayerParams struct to hold
>     a copy of per-frame CCM values
>   libcamera: software_isp: debayer: Extend DebayerParams to hold a copy
>     of per-frame AWB values
>   libcamera: software_isp: awb Populate AWB gains to Debayer params
>     structure
>   libcamera: software_isp: ccm: Populate CCM table to Debayer params
>     structure
>   libcamera: software_isp: debayer: Introduce a stop() callback to the
>     debayer object
>   libcamera: software_isp: lut: Make gain corrected CCM in lut.cpp
>     available in debayer params
>   libcamera: software_isp: gbm: Add in a GBM helper class for GPU
>     surface access
>   libcamera: software_isp: Make isStandardBayerOrder static
>   libcamera: software_isp: egl: Introduce an eGL base helper class
>   libcamera: shaders: Use highp not mediump for float precision
>   libcamera: shaders: Extend debayer shaders to apply RGB gain values on
>     output
>   libcamera: shaders: Extend bayer shaders to support swapping R and B
>     on output
>   libcamera: shaders: Add support for Auto White Balance gains
>   libcamera: software_isp: debayer_egl: Add an eGL debayer class
>   libcamera: software_isp: debayer_egl: Make DebayerEGL an environment
>     option
>   libcamera: software_isp: debayer_egl: Make gpuisp default softisp mode
>   libcamera: software_isp: debayer_cpu: Make getInputConfig and
>     getOutputConfig static
>   libcamera: software_isp: Switch on uncalibrated CCM to validate
>     eGLDebayer
>   libcamera: software_isp: Add a gpuisp todo list
>
> Hans de Goede (5):
>   libcamera: swstats_cpu: Update statsProcessFn() / processLine0()
>     documentation
>   libcamera: swstats_cpu: Drop patternSize_ documentation
>   libcamera: swstats_cpu: Move header to libcamera/internal/software_isp
>   libcamera: software_isp: Move benchmark code to its own class
>   libcamera: swstats_cpu: Add processFrame() method
>
> Milan Zamazal (3):
>   libcamera: shaders: Rename bayer_8 to bayer_unpacked
>   libcamera: shaders: Fix neighbouring positions in 8-bit debayering
>   libcamera: software_isp: GPU support for unpacked 10/12-bit formats
>
>  include/libcamera/internal/egl.h | 162 +++++
>  include/libcamera/internal/gbm.h | 43 ++
>  include/libcamera/internal/meson.build | 11 +
>  .../libcamera/internal/shaders}/RGB.frag | 2 +-
>  .../internal/shaders}/YUV_2_planes.frag | 2 +-
>  .../internal/shaders}/YUV_3_planes.frag | 2 +-
>  .../internal/shaders}/YUV_packed.frag | 2 +-
>  .../internal/shaders}/bayer_1x_packed.frag | 68 +-
>  .../internal/shaders/bayer_unpacked.frag | 84 ++-
>  .../internal/shaders/bayer_unpacked.vert | 8 +-
>  .../libcamera/internal/shaders}/identity.vert | 0
>  .../libcamera/internal/shaders/meson.build | 10 +
>  .../internal/software_isp/benchmark.h | 39 ++
>  .../internal/software_isp/debayer_params.h | 13 +
>  .../internal/software_isp/meson.build | 2 +
>  .../internal/software_isp/software_isp.h | 5 +-
>  .../internal}/software_isp/swstats_cpu.h | 15 +-
>  src/apps/qcam/assets/shader/shaders.qrc | 16 +-
>  src/apps/qcam/meson.build | 3 +
>  src/apps/qcam/viewfinder_gl.cpp | 70 +-
>  src/ipa/simple/algorithms/awb.cpp | 4 +-
>  src/ipa/simple/algorithms/ccm.cpp | 4 +-
>  src/ipa/simple/algorithms/lut.cpp | 1 +
>  src/ipa/simple/data/uncalibrated.yaml | 12 +-
>  src/libcamera/egl.cpp | 435 ++++++++++++
>  src/libcamera/gbm.cpp | 61 ++
>  src/libcamera/meson.build | 34 +
>  src/libcamera/software_isp/benchmark.cpp | 92 +++
>  src/libcamera/software_isp/debayer.cpp | 63 ++
>  src/libcamera/software_isp/debayer.h | 53 +-
>  src/libcamera/software_isp/debayer_cpu.cpp | 88 +--
>  src/libcamera/software_isp/debayer_cpu.h | 44 +-
>  src/libcamera/software_isp/debayer_egl.cpp | 648 ++++++++++++++++++
>  src/libcamera/software_isp/debayer_egl.h | 174 +++++
>  src/libcamera/software_isp/gpuisp-todo.txt | 83 +++
>  src/libcamera/software_isp/meson.build | 9 +
>  src/libcamera/software_isp/software_isp.cpp | 49 +-
>  src/libcamera/software_isp/swstats_cpu.cpp | 79 ++-
>  utils/gen-shader-header.py | 38 +
>  utils/gen-shader-headers.sh | 44 ++
>  utils/meson.build | 2 +
>  41 files changed, 2370 insertions(+), 204 deletions(-)
>  create mode 100644 include/libcamera/internal/egl.h
>  create mode 100644 include/libcamera/internal/gbm.h
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/RGB.frag (93%)
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/YUV_2_planes.frag (97%)
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/YUV_3_planes.frag (96%)
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/YUV_packed.frag (99%)
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/bayer_1x_packed.frag (75%)
>  rename src/apps/qcam/assets/shader/bayer_8.frag => include/libcamera/internal/shaders/bayer_unpacked.frag (55%)
>  rename src/apps/qcam/assets/shader/bayer_8.vert => include/libcamera/internal/shaders/bayer_unpacked.vert (85%)
>  rename {src/apps/qcam/assets/shader => include/libcamera/internal/shaders}/identity.vert (100%)
>  create mode 100644 include/libcamera/internal/shaders/meson.build
>  create mode 100644 include/libcamera/internal/software_isp/benchmark.h
>  rename {src/libcamera => include/libcamera/internal}/software_isp/swstats_cpu.h (84%)
>  create mode 100644 src/libcamera/egl.cpp
>  create mode 100644 src/libcamera/gbm.cpp
>  create mode 100644 src/libcamera/software_isp/benchmark.cpp
>  create mode 100644 src/libcamera/software_isp/debayer_egl.cpp
>  create mode 100644 src/libcamera/software_isp/debayer_egl.h
>  create mode 100644 src/libcamera/software_isp/gpuisp-todo.txt
>  create mode 100755 utils/gen-shader-header.py
>  create mode 100755 utils/gen-shader-headers.sh
>
> --
> 2.51.0
>
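For reference on the verbose header generation reported at the top of this mail, here is a rough Python sketch of the two steps the trace shows: deriving a C symbol name from a shader file name (the `basename` plus `tr . _` step) and emitting hex encoded data. It is an illustration only and not the contents of utils/gen-shader-header.py or utils/gen-shader-headers.sh, neither of which is shown in this thread. The function names and the exact array layout are assumptions; the point is that the same work emits nothing but the generated text.

```python
import os


def symbol_name(path: str) -> str:
    """Mirror the `basename <path> | tr . _` step visible in the trace."""
    return os.path.basename(path).replace(".", "_")


def hex_array(name: str, data: bytes, per_line: int = 12) -> str:
    """Format data as a C unsigned char array.

    Assumption: this layout is illustrative; the real header format
    is not shown in this thread.
    """
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        rows.append("\t" + ", ".join(f"0x{byte:02x}" for byte in chunk) + ",")
    body = "\n".join(rows)
    return (
        f"unsigned char {name}[] = {{\n{body}\n}};\n"
        f"unsigned int {name}_len = {len(data)};\n"
    )


# Only the generated text is printed; there is no per-command trace.
print(hex_array(symbol_name("shaders/identity.vert"), b"void main() {}"))
```

Whether the noise in the build comes from a `set -x` in the shell wrapper is a guess based on the '+' prefixed lines; the sketch makes no claim about where the tracing is enabled.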