[v4,00/23] Add GLES 2.0 GPUISP to libcamera

Message ID	20251120233347.5046-1-bryan.odonoghue@linaro.org
Headers
  Return-Path: <libcamera-devel-bounces@lists.libcamera.org>
  From: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
  To: libcamera-devel@lists.libcamera.org
  Cc: pavel@ucw.cz, Bryan O'Donoghue <bryan.odonoghue@linaro.org>
  Subject: [PATCH v4 00/23] Add GLES 2.0 GPUISP to libcamera
  Date: Thu, 20 Nov 2025 23:33:24 +0000
  Message-ID: <20251120233347.5046-1-bryan.odonoghue@linaro.org>
  MIME-Version: 1.0
  Content-Transfer-Encoding: 8bit
  Precedence: list
  Errors-To: libcamera-devel-bounces@lists.libcamera.org
  Sender: "libcamera-devel" <libcamera-devel-bounces@lists.libcamera.org>
Series	Add GLES 2.0 GPUISP to libcamera
Related
  [v4,00/23] Add GLES 2.0 GPUISP to libcamera
  [v4,01/23] libcamera: software_isp: gbm: Add a GBM helper class for GPU surface access
  [v4,02/23] libcamera: software_isp: Make isStandardBayerOrder static
  [v4,03/23] libcamera: software_isp: egl: Add a eGL base helper class
  [v4,04/23] libcamera: shaders: Rename bayer_8 to bayer_unpacked
  [v4,05/23] libcamera: shaders: Use highp not mediump for float precision
  [v4,06/23] libcamera: shaders: Extend debayer shaders to apply RGB gain values on output
  [v4,07/23] libcamera: shaders: Extend bayer shaders to support swapping R and B on output
  [v4,08/23] libcamera: shaders: Add support for black level compenstation
  [v4,09/23] libcamera: shaders: Add support for Gamma
  [v4,10/23] libcamera: shaders: Add support for contrast
  [v4,11/23] libcamera: software_isp: debayer_egl: Add an eGL debayer class
  [v4,12/23] libcamera: software_isp: debayer_egl: Make DebayerEGL an environment option
  [v4,13/23] libcamera: software_isp: debayer_egl: Make gpuisp default softisp mode
  [v4,14/23] libcamera: software_isp: debayer_cpu: Make getInputConfig and getOutputConfig static
  [v4,15/23] libcamera: software_isp: GPU support for unpacked 10/12-bit formats
  [v4,16/23] libcamera: software_isp: Add a gpuisp todo list
  [v4,17/23] libcamera: software_isp: lut: Change default Gamma to 1.0/2.2
  [v4,18/23] ipa: Add a new Algorithm::init() to support self-initalising algorithms
  [v4,19/23] libcamera: software_isp: Implement a static init() routine
  [v4,20/23] ipa: simple: Add a flag to indicate gpuIspEnabled
  [v4,21/23] ipa: libipa: module: Add createSelfEnumeratingAlgorithm
  [v4,22/23] ipa: software_isp: Call createSelfEnumeratingAlgorithm() to statically instantiate CCM...
  [v4,23/23] libcamera: software_isp: lut: Skip calculation lookup tables if gpuIspEnabled is true.

Bryan O'Donoghue Nov. 20, 2025, 11:33 p.m. UTC

This version 4:

- Drops AWB since the CCM contains it already
- Includes Gamma
- Includes Contrast - testable via camshark
- Includes Saturation - testable via camshark
- Includes a scaler from Robert
- Includes synch changes from Robert
- Includes all feedback incorporated from Pavel
- Generates a default 65k CCM if none is supplied
- Various Doxygen torments fixed along the way
- Is the "top half" of the precursor series: the full GPUISP
  series has grown to 44 patches, which is an unreasonable number
  to merge in one go.

- Full testable branch
Link: https://gitlab.freedesktop.org/camera/libcamera-softisp/-/tree/v0.5.2-gpuisp-v4e?ref_type=heads

- The first part of the series is in the precursor here
Link: https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/v0.5.2-gpuisp-v4e-split

That precursor is just a tag about half way through the integrated series.

This version 3:

- Adds AWB to the debayer routine as calculated by the IPA thread

- Implements ~ all of the feedback from Barnabas. It is quicker to
  mention what hasn't been done:
  a) A comment about member initialisation in eGL.cpp.
     The code I wrote to make constructor init common seemed to negate
     that ask.
  b) meson dependency checks for egl.
     I remember struggling with this earlier on in development.
     I will certainly try to do this for a v4, so it's more
     pending a try as opposed to not intended to be done.

- Incorporates various fixes from Robert Mader:
  when to sync, which removes tearing for Milan, and
  some error checking that Robert didn't mention in his feedback
  but which was in his patches, so I stole that code. Thanks.

- Also worth mentioning: Robert identified a permissions fix
  that pipewire would need for eGL to work in libcamera with pipewire,
  published that fix and got it merged too.

  Owe you a beer for that one.

- Is rebased on tip-of-tree

- Currently the documentation checks for the various classes
  don't pass but that is easy enough to fix in a V4.

- In line with our discussions gpuisp is now the default instead of cpuisp.

- Since it's only the documentation checks that are pending I thought,
  rather than delay further, it was time to publish the series without
  them and see if anything major gets snagged.

v2:

This version 2 is an incomplete update with respect to previous review
feedback, which ordinarily I would not publish. However, given OSSEU is
starting on Monday and we have a talk about this topic, in addition to some
pretty good progress in the interregnum, I thought a v2 would be
appropriate.

- V2 drops use of GBM surface in favour of generating a framebuffer from
  the dma-buf handle, called render-to-texture.

  The conversion away from GBM surface + memcpy(), including the associated
  cache invalidate, has a dramatic effect on GPUISP performance.

  Some rough stats for a Qualcomm sm8250 "kona" device with an imx517
  sensor @ 4048 x 3040 ABGR8888 - debug builds

    CPUISP + CCM:
      2 FPS CPU usage > 100% single core pulls about 9 watts

    GPUISP v1 + CCM:
      14 FPS - power not measured

    GPUISP v2 + CCM:
      30 FPS - sensor linerate - CPU usage ~ 70 % pulling 8 Watts.

  Milan Zamazal has reported a TI AM69 + imx219 - unknown resolution

    CPUISP 4 FPS
    GPUISP v2 - 2 or 3 FPS
    GPUISP v2 - 15 FPS == sensor linerate

  In other words for these boards we can hit linerate with GPUISP + 3A +
  CCM.

- Drop GBM surface rendering
- Drop swapbuffers
- Use eglCreateImageKHR to directly render into the output dma-buf buffer
  eglCreateImageKHR lets you specify the FOURCC of the texture which means
  we can create the texture in the uncompressed target output pixel format
  we want.
- Fix stride calculation to 256 bytes
  Laurent and Maxime explained to me about GPU stride alignments being
  tribal wisdom and that 256 bytes is a good cross-platform value.
  This helped to get the render-to-texture command right.
- A synchronous blocking wait is used to ensure GPU operations have
  completed. Laurent wants this to be made async.
  At the moment it's not clear to me that the eglWaitSyncKHR is really
  required, and in any case it doesn't seem to have any performance impact.
  But this part is still TBD - I've included the sync wait for simplicity
  and safety.
- A Debayer::stop() method has been introduced to ensure we call
  eglDestroySyncKHR when the eGL context is valid, as opposed to in the
  callchain of destructors triggering eGL::~eGL();
- stats move constructor call chain dropped - Barnabas
- Incorporates Milan's area-of-interest constraint for Bayer stats
  i.e. squashes his v3 update into debayer_egl.cpp directly
- Moves ALIGN_TO into a common area to facilitate its reuse in
  egl.cpp
- Rebases on 0.5.2
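
The 256-byte stride rule from the list above can be sketched as follows. This
is only an illustration of the rounding, not the ALIGN_TO macro from the
series; the names alignUp() and alignedStride() are made up for this example,
and the 4 bytes per pixel assumes a 32 bit output format.

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `alignment`.
// Assumption: `alignment` is a power of two (256 here).
constexpr std::uint32_t alignUp(std::uint32_t value, std::uint32_t alignment)
{
	return (value + alignment - 1) & ~(alignment - 1);
}

// Bytes per row for a packed format, padded to a 256-byte boundary.
// Hypothetical helper, not taken from egl.cpp.
constexpr std::uint32_t alignedStride(std::uint32_t widthPixels,
				      std::uint32_t bytesPerPixel)
{
	return alignUp(widthPixels * bytesPerPixel, 256);
}
```

For a 4048 pixel wide 32 bit frame the unpadded row is 16192 bytes, which
rounds up to 16384. A 1628 pixel wide row goes from 6512 to 6656 bytes.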

- There are a number of known checks failing on the CI loop right now

Link to v1: https://lists.libcamera.org/pipermail/libcamera-devel/2025-June/050692.html

v1:
This series introduces a GLES 2.0 GPU ISP to libcamera.

We have had extensive discussions and meetings about this topic over
the last year or so.

As an overview we want to start to move as much processing of software_isp
into the GPU as possible. This is especially advantageous when we are
talking about processing a framebuffer's worth of pixels as quickly as
possible.

The decision to use GLES 2.0 instead of say Vulkan stems from a desire to
support as much in the way of older hardware as possible and the fact we
already have upstream GLES 2.0 fragment shaders to do debayer.

Generally the approach is

- Move the fragment shaders out of qcam and into a common location
- Update the existing SoftwareISP Debayer/DebayerCPU pair to facilitate
  addition of a new class DebayerEGL.
- Introduce that class
- Then do progressive change of the shaders and DebayerEGL class to make
  the modifications as transparent as possible in the git log.
- Reuse as much of the SoftIPA data-structures and logic as possible.
- Consume the data from SoftIPA in the Debayer Shaders so that CPUISP and
  GPUISP give similar - hopefully the same results but with GPUISP going
  faster.

In order to get untiled and uncompressed pixel data out of the GPU
framebuffer we need to tell the GPU how to store the data it is writing to
that framebuffer. GPUs can store their framebuffer data in tiled or even
compressed formats which is why the naive approach of running your fragment
shader and then using glReadPixels(GL_RGBA); will be horrendously slow as
glReadPixels must convert from the internal GPU format to the requested
output format - an operation that for me takes ~ 10 milliseconds per frame.

Instead we get the GPU to store its data as ARGB8888 swap buffers and
memcpy() from the swapped buffer to our output frame. Right now this series
supports 32 bit output formats only.

The memcpy() also entails flushing the cache of the target buffer as per
the terms of the dma-buf software contract.
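
To put a rough number on that copy, here is a back-of-envelope sketch. It
assumes the 4048 x 3040 32 bit frame quoted in the v2 stats and a 30 fps
target; the helper names are invented for this example and nothing here is
measured.

```cpp
#include <cstdint>

// Size of one packed frame in bytes.
constexpr std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
				   std::uint64_t bytesPerPixel)
{
	return width * height * bytesPerPixel;
}

// Bytes the memcpy() has to move every second at a given frame rate.
constexpr std::uint64_t copyBytesPerSecond(std::uint64_t bytesPerFrame,
					   std::uint64_t fps)
{
	return bytesPerFrame * fps;
}
```

Under those assumptions each frame is 49223680 bytes, so the copy alone moves
about 1.48 GB per second before any cache maintenance is counted.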

This leads us onto the main outstanding TODOs

- 24 bit GBM buffer support leading
- 24 bit output framebuffer support
- Surfaceless GBM and eGL context with no swapbuffer
- Render to texture
  If we render directly to the output buffer provided to the GPU
  we will not need to memcpy() to the output buffer
  nor will we need to invalidate the output buffer cache.
- eglCreateImageKHR for the texture upload.

This list is of the colour "make it go faster" not "make it work" which is
why we are moving to start to submit a v1 for discussion in the full
realisation it will have to go through several cycles of review giving us
the opportunity to fix:

- Doxygen is missing for new classes and methods
- Some of the pipelines don't complete in gitlab
- 24 bit output seems doable before merge
- Render to texture perhaps even too

For me on my Qualcomm hardware GPUISP works very well: I get 30fps in qcam
with about 75% CPU usage versus > 100%. cam goes faster, which to me
implies a good bit of time is being consumed in qcam itself.

The series starts out with fixes and updates from Hans and finishes
with shader modifications from Milan, both of whom, along with Kieran,
Laurent and Maxime, I'd like to thank for being so helpful and patient.

Bryan O'Donoghue (21):
  libcamera: software_isp: gbm: Add a GBM helper class for GPU surface
    access
  libcamera: software_isp: Make isStandardBayerOrder static
  libcamera: software_isp: egl: Add a eGL base helper class
  libcamera: shaders: Use highp not mediump for float precision
  libcamera: shaders: Extend debayer shaders to apply RGB gain values on
    output
  libcamera: shaders: Extend bayer shaders to support swapping R and B
    on output
  libcamera: shaders: Add support for black level compenstation
  libcamera: shaders: Add support for Gamma
  libcamera: shaders: Add support for contrast
  libcamera: software_isp: debayer_egl: Add an eGL debayer class
  libcamera: software_isp: debayer_egl: Make DebayerEGL an environment
    option
  libcamera: software_isp: debayer_egl: Make gpuisp default softisp mode
  libcamera: software_isp: debayer_cpu: Make getInputConfig and
    getOutputConfig static
  libcamera: software_isp: Add a gpuisp todo list
  libcamera: software_isp: lut: Change default Gamma to 1.0/2.2
  ipa: Add a new Algorithm::init() to support self-initalising
    algorithms
  libcamera: software_isp: Implement a static init() routine
  ipa: simple: Add a flag to indicate gpuIspEnabled
  ipa: libipa: module: Add createSelfEnumeratingAlgorithm
  ipa: software_isp: Call createSelfEnumeratingAlgorithm() to statically
    instantiate CCM algo
  libcamera: software_isp: lut:  Skip calculation lookup tables if
    gpuIspEnabled is true.

Milan Zamazal (2):
  libcamera: shaders: Rename bayer_8 to bayer_unpacked
  libcamera: software_isp: GPU support for unpacked 10/12-bit formats

 include/libcamera/internal/egl.h              | 412 +++++++++++
 include/libcamera/internal/gbm.h              |  84 +++
 include/libcamera/internal/meson.build        |   1 +
 include/libcamera/internal/shaders/RGB.frag   |   2 +-
 .../internal/shaders/YUV_2_planes.frag        |   2 +-
 .../internal/shaders/YUV_3_planes.frag        |   2 +-
 .../internal/shaders/YUV_packed.frag          |   2 +-
 .../internal/shaders/bayer_1x_packed.frag     |  89 ++-
 .../{bayer_8.frag => bayer_unpacked.frag}     | 105 ++-
 .../{bayer_8.vert => bayer_unpacked.vert}     |   0
 .../libcamera/internal/shaders/meson.build    |   4 +-
 include/libcamera/ipa/soft.mojom              |   2 +-
 src/apps/qcam/assets/shader/shaders.qrc       |   4 +-
 src/apps/qcam/viewfinder_gl.cpp               |  16 +-
 src/ipa/libipa/algorithm.cpp                  |  13 +-
 src/ipa/libipa/algorithm.h                    |   5 +
 src/ipa/libipa/module.h                       |  41 ++
 src/ipa/simple/algorithms/ccm.cpp             |  18 +
 src/ipa/simple/algorithms/ccm.h               |   1 +
 src/ipa/simple/algorithms/lut.cpp             |  72 +-
 src/ipa/simple/ipa_context.h                  |   1 +
 src/ipa/simple/soft_simple.cpp                |  13 +-
 src/libcamera/egl.cpp                         | 436 ++++++++++++
 src/libcamera/gbm.cpp                         |  61 ++
 src/libcamera/meson.build                     |  34 +
 src/libcamera/software_isp/debayer.h          |   2 +-
 src/libcamera/software_isp/debayer_cpu.h      |   4 +-
 src/libcamera/software_isp/debayer_egl.cpp    | 668 ++++++++++++++++++
 src/libcamera/software_isp/debayer_egl.h      | 177 +++++
 src/libcamera/software_isp/gpuisp-todo.txt    |  83 +++
 src/libcamera/software_isp/meson.build        |   8 +
 src/libcamera/software_isp/software_isp.cpp   |  35 +-
 32 files changed, 2334 insertions(+), 63 deletions(-)
 create mode 100644 include/libcamera/internal/egl.h
 create mode 100644 include/libcamera/internal/gbm.h
 rename include/libcamera/internal/shaders/{bayer_8.frag => bayer_unpacked.frag} (50%)
 rename include/libcamera/internal/shaders/{bayer_8.vert => bayer_unpacked.vert} (100%)
 create mode 100644 src/libcamera/egl.cpp
 create mode 100644 src/libcamera/gbm.cpp
 create mode 100644 src/libcamera/software_isp/debayer_egl.cpp
 create mode 100644 src/libcamera/software_isp/debayer_egl.h
 create mode 100644 src/libcamera/software_isp/gpuisp-todo.txt

Pavel Machek Nov. 21, 2025, 11:36 a.m. UTC | #1

Hi!

> 
> - Drops AWB since the CCM contains it already
> - Includes Gamma
> - Includes Contrast - testable via camshark
> - Includes Saturation - testable via camshark
> - Includes a scaler from Robert
> - Includes synch changes from Robert
> - Includes all feedback incorporated from Pavel
> - Generates a default 65k CCM if none is supplied
> - Various Doxygen torments fixed along the way
> - And is the "top half" of the precursor series as the GPUISP
>   series becomes 44 patches long this is an unreasonable number
>   to merge in one go.
> 
> - Full testable branch
> Link: https://gitlab.freedesktop.org/camera/libcamera-softisp/-/tree/v0.5.2-gpuisp-v4e?ref_type=heads
>

I did very quick testing on Librem 5 front camera:

Default configuration is: 1628x1224-ABGR8888/Unset
Available streams: 1
Stream 0 pixel format=ABGR8888 size=1628x1224
Validated configuration is: 1628x1224-ABGR8888/Unset
[13:39:51.752299467] [5577]  INFO Camera camera.cpp:1215 configuring streams: (0) 1628x1224-ABGR8888/Unset
[13:39:51.753785964] [5591]  WARN IPASoft soft_simple.cpp:255 IPASoft: Minimum gain is zero, that can't be linear
[13:39:51.754911728] [5591]  INFO IPASoft soft_simple.cpp:269 IPASoft: Exposure 6-2524, gain 100-240 (1)

=== Available Controls ===
Contrast (id=15)
  Type: Float

Allocated 4 buffers for stream
....
Request completed: Request(10:C:0/1:0)
	SensorBlackLevels = [ 3072, 3072, 3072, 3072 ]
	ColourGains = [ 0.000949, 0.000958 ]
	ExposureTime = 90250
	AnalogueGain = 100.000000
	SensorTimestamp = 49194147415000
 seq: 000068 timestamp: 49194147415000 bytesused: 7970688

And yes, it works :-), and gives me AE.

ABGR8888 is a really bad format for video encoding. Memory is uncached
and thus slow, so we spend time copying out that unused alpha
channel. BGR888 would be better. (YUY2 would be even better).

I notice AE works, but I don't see a way to turn it on/off or to
control exposure/gain manually. Both would be welcome for photo
taking.

Thanks for doing this.

Tested-by: Pavel Machek <pavel@ucw.cz>

Best regards,
							Pavel

Pavel Machek Nov. 21, 2025, 1:36 p.m. UTC | #2

Hi!

> This version 4:
> 
> - Drops AWB since the CCM contains it already
> - Includes Gamma
> - Includes Contrast - testable via camshark
> - Includes Saturation - testable via camshark
> - Includes a scaler from Robert

Does it? I tried requesting lower resolution, and it seemed like image
is cropped, not resized.

Best regards,
								Pavel

Bryan O'Donoghue Nov. 22, 2025, 4:13 p.m. UTC | #3

On 21/11/2025 11:36, Pavel Machek wrote:
> ABGR8888 is really bad format for video encoding. Memory is uncached
> and thus slow, so we spend time copying out that unused alpha
> channel. BGR888 would be better. (YUY2 would be even better).

We talked a lot about 32 bit formats and concluded most hardware probably
worked better with 32 bit.

A fragment shader's view of the world - how the pipeline works in fact 
is gl_FragColor = DWORD.

So it's not possible from a fragment shader to write any other type of
packing.

We can and do re-order the bytes to swap blue as an example.

To do differently packed formats - we need compute shaders and SSBOs, 
i.e. buffers which aren't constrained to a final 32 bit write.
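
As a CPU-side picture of that constraint: each fragment ends up as one
32 bit value, and the only freedom is which channel lands in which byte.
The sketch below swaps two bytes of a packed pixel; swapRedBlue() is a name
made up for this example and the byte positions are an assumption, the
shaders in the series do this with a swizzle rather than bit operations.

```cpp
#include <cstdint>

// Swap byte 0 and byte 2 of a packed 32 bit pixel, leaving bytes 1 and 3
// untouched. Assumption: red and blue occupy bytes 0 and 2.
constexpr std::uint32_t swapRedBlue(std::uint32_t pixel)
{
	return (pixel & 0xff00ff00u) |
	       ((pixel & 0x00ff0000u) >> 16) |
	       ((pixel & 0x000000ffu) << 16);
}
```

The output is still four bytes per pixel either way, which is why a
3 byte format such as BGR888 does not fall out of this approach.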

> I notice AE works, but I don't see a way to run it on/off or to
> control exposure/gain manually. Both would be welcome for photo
> taking.

I don't know how to do that either. You can certainly play with 
saturation and contrast in camshark though.

> 
> Thanks for doing this.
> 
> Tested-by: Pavel Machek<pavel@ucw.cz>

Bryan O'Donoghue Nov. 22, 2025, 4:14 p.m. UTC | #4

On 21/11/2025 13:36, Pavel Machek wrote:
>> This version 4:
>>
>> - Drops AWB since the CCM contains it already
>> - Includes Gamma
>> - Includes Contrast - testable via camshark
>> - Includes Saturation - testable via camshark
>> - Includes a scaler from Robert
> Does it? I tried requesting lower resolution, and it seemed like image
> is cropped, not resized.
> 
> Best regards,
> 								Pavel

Ah, you'll have to ask Robert what's going on with this, I haven't 
played with that bit myself.

---
bod

Pavel Machek Nov. 22, 2025, 10:38 p.m. UTC | #5

Hi!

> > ABGR8888 is really bad format for video encoding. Memory is uncached
> > and thus slow, so we spend time copying out that unused alpha
> > channel. BGR888 would be better. (YUY2 would be even better).
> 
> We talked alot about 32 bit formats and concluded most hardware probably
> worked better with 32 bit.
> 
> A fragment shader's view of the world - how the pipeline works in fact is
> gl_FragColor = DWORD.
> 
> So its not possible from a fragment shader to write any other type of
> packing.

It should be possible to write 16bits, too, according to ChatGPT.

What I've done in Clicks machine is write YUY2, so 2 pixels into
32 bits. That definitely works and has the additional advantage of being
much faster to compress into a movie file.

Best regards,
								Pavel

Bryan O'Donoghue Nov. 23, 2025, 10:55 a.m. UTC | #6

On 22/11/2025 22:38, Pavel Machek wrote:
> Hi!
> 
>>> ABGR8888 is really bad format for video encoding. Memory is uncached
>>> and thus slow, so we spend time copying out that unused alpha
>>> channel. BGR888 would be better. (YUY2 would be even better).
>>
>> We talked alot about 32 bit formats and concluded most hardware probably
>> worked better with 32 bit.
>>
>> A fragment shader's view of the world - how the pipeline works in fact is
>> gl_FragColor = DWORD.
>>
>> So its not possible from a fragment shader to write any other type of
>> packing.
> 
> It should be possible to write 16bits, too, according to ChatGPT.

It's an interesting question.

glTexture2D( format = GL_RG ); would write what to memory @ gl_FragColor 
= vec4();

I'd guess two bytes of data and two bytes of 0, or does it bytemask - or
more pertinently, is there a way to get it to bytemask if you specify the
storage format of the pixels in the right way?

That's one of the reasons we use eglCreateImageKHR because we can 
control the storage format

https://forums.raspberrypi.com/viewtopic.php?t=293312

It might be possible to specify the surface as YUY2 as above and do so
portably across platforms/archs, not sure.

We pretty much have the lowest common denominator with ARGB8888 at the 
moment.

We would also, I'm reasonably sure, have to create the GBM surface with
the right number of pixels too and then set up our mesa context

https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/gbm/main/gbm.c?ref_type=heads#L223

But yeah, perhaps it is possible.

> What I've done in Clicks machine is write YUY2, so 2 pixels into
> 32bits. That definitelly works and has additional advantage of being
> much faster to compress into movie file.

So yeah with GBM and then eglCreateImageKHR can we

1. Make a 16 bit surface
2. Get the fragment shader to byte mask..

The second one I'm not so sure about.

The other alternative is compute shaders and SSBOs

We might in phase 2 look at, say, moving common code into some kind of
include file so that core logic could be used in a fragment shader -
a-la qcam in original unmodified mode - but then included into a compute
shader for all of the more flexible stuff we want to do.

An SSBO would mean we could write 1, 2, 3, 4 bytes pretty much at will
on the output without having to worry about surfaces, storage formats or
cajoling the GPU to byte mask.

---
bod

Pavel Machek Nov. 23, 2025, 6:51 p.m. UTC | #7

Hi!

> > > > ABGR8888 is really bad format for video encoding. Memory is uncached
> > > > and thus slow, so we spend time copying out that unused alpha
> > > > channel. BGR888 would be better. (YUY2 would be even better).
> > > 
> > > We talked alot about 32 bit formats and concluded most hardware probably
> > > worked better with 32 bit.
> > > 
> > > A fragment shader's view of the world - how the pipeline works in fact is
> > > gl_FragColor = DWORD.
> > > 
> > > So its not possible from a fragment shader to write any other type of
> > > packing.
> > 
> > It should be possible to write 16bits, too, according to ChatGPT.
> 
> Its an interesting question.
> 
> glTexture2D( format = GL_RG ); would write what to memory @ gl_FragColor =
> vec4();
> 
> I'd guess two bytes of data and two bytes of 0 or does it bytemask - or more
> pertinently is there a way to get it to bytemask if you specify the storage
> format of the pixels int he right way.
> 
> That's one of the reasons we use eglCreateImageKHR because we can control
> the storage format
> 
> https://forums.raspberrypi.com/viewtopic.php?t=293312
> 
> It might be possible to specify the surface as YUV2 as above and do so
> portably across platfroms/archs not sure.
> 
> We pretty much have the lowest common denominator with ARGB8888 at the
> moment.

Yes. So what I've done in clicks machine is:

1) Tell hardware we are doing ARGB8888, half width.

2) Compute two pixels in the shader, extract YUV YUV from two
consecutive pixels.

3) Store Y U Y V as "R G B A" in the shader.

For Librem 5, it was great win:

a) half as many bytes transferred from GPU.

b) YUY2 is way faster to compress into video.
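
Steps 1) to 3) can be sketched on the CPU like this. It is an illustration
only: the real thing runs in a fragment shader, the integer BT.601-style
coefficients are an assumption (the conversion matrix is not stated above),
and the names Rgb, Yuy2 and packYuy2() are made up for the example.

```cpp
#include <cstdint>

struct Rgb { int r, g, b; };                  // one debayered pixel, 0..255
struct Yuy2 { std::uint8_t y0, u, y1, v; };   // what lands in "R G B A"

constexpr std::uint8_t clamp8(int v)
{
	return static_cast<std::uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Assumed full-range BT.601-style integer approximation, scaled by 256.
constexpr int luma(Rgb p) { return (77 * p.r + 150 * p.g + 29 * p.b + 128) / 256; }
constexpr int chromaU(Rgb p) { return (-43 * p.r - 85 * p.g + 128 * p.b) / 256 + 128; }
constexpr int chromaV(Rgb p) { return (128 * p.r - 107 * p.g - 21 * p.b) / 256 + 128; }

// Two consecutive pixels become one 32 bit output texel: each keeps its own
// luma, the chroma of the pair is averaged.
constexpr Yuy2 packYuy2(Rgb a, Rgb b)
{
	return { clamp8(luma(a)), clamp8((chromaU(a) + chromaU(b)) / 2),
		 clamp8(luma(b)), clamp8((chromaV(a) + chromaV(b)) / 2) };
}

// Output sizes: 4 bytes per pixel for ABGR8888, 2 bytes per pixel for YUY2.
constexpr std::uint64_t abgrBytes(std::uint64_t w, std::uint64_t h) { return w * h * 4; }
constexpr std::uint64_t yuy2Bytes(std::uint64_t w, std::uint64_t h) { return w * h * 2; }
```

For the 1628x1224 stream from the earlier test this is 3985344 bytes per
frame instead of 7970688, which is the "half as many bytes" in a).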

Best regards,
									Pavel

Bryan O'Donoghue Nov. 23, 2025, 11:45 p.m. UTC | #8

On 23/11/2025 18:51, Pavel Machek wrote:
> Hi!
> 
>>>>> ABGR8888 is really bad format for video encoding. Memory is uncached
>>>>> and thus slow, so we spend time copying out that unused alpha
>>>>> channel. BGR888 would be better. (YUY2 would be even better).
>>>>
>>>> We talked alot about 32 bit formats and concluded most hardware probably
>>>> worked better with 32 bit.
>>>>
>>>> A fragment shader's view of the world - how the pipeline works in fact is
>>>> gl_FragColor = DWORD.
>>>>
>>>> So its not possible from a fragment shader to write any other type of
>>>> packing.
>>>
>>> It should be possible to write 16bits, too, according to ChatGPT.
>>
>> Its an interesting question.
>>
>> glTexture2D( format = GL_RG ); would write what to memory @ gl_FragColor =
>> vec4();
>>
>> I'd guess two bytes of data and two bytes of 0 or does it bytemask - or more
>> pertinently is there a way to get it to bytemask if you specify the storage
>> format of the pixels int he right way.
>>
>> That's one of the reasons we use eglCreateImageKHR because we can control
>> the storage format
>>
>> https://forums.raspberrypi.com/viewtopic.php?t=293312
>>
>> It might be possible to specify the surface as YUV2 as above and do so
>> portably across platfroms/archs not sure.
>>
>> We pretty much have the lowest common denominator with ARGB8888 at the
>> moment.
> 
> Yes. So what I've done in clicks machine is:
> 
> 1) Tell hardware we are doing ARGB8888, half width.
> 
> 2) Compute two pixels in the shader, extract YUV YUV from two
> consecutive pixels.
> 
> 3) Store Y U Y V as "R G B A" in the shader.
> 
> For Librem 5, it was great win:
> 
> a) half as many bytes transferred from GPU.
> 
> b) YUY2 is way faster to compress into video.
> 
> Best regards,
> 									Pavel

Perhaps a discrete compute shader pass would be best/easiest for the
codebase we have.

---
bod

Pavel Machek Nov. 28, 2025, 10:36 p.m. UTC | #9

Hi!

> This version 4:
> 
> - Drops AWB since the CCM contains it already
> - Includes Gamma
> - Includes Contrast - testable via camshark
> - Includes Saturation - testable via camshark
> - Includes a scaler from Robert

Now I'm pretty sure it crops instead of resizes.

I'm also getting assertion failures when trying to use it with high
resolution on Librem 5 with the mcam test program. Not sure how to debug
that.

[Thread 0xffffe534ee00 (LWP 10180) exited]
mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.

Thread 23 "IPAProxySoft" received signal SIGABRT, Aborted.
[Switching to Thread 0xffffe5b5ee00 (LWP 10181)]
__pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6, 
    no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
warning: 44	./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0  __pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6, 
    no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1  0x0000fffff6f77e64 in __pthread_kill_internal (threadid=<optimized out>, signo=6)
    at ./nptl/pthread_kill.c:89
#2  0x0000fffff6f26980 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x0000fffff6f11ac4 in __GI_abort () at ./stdlib/abort.c:73
#4  0x0000fffff6f1f9bc in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>, 
    file=<optimized out>, line=189, function=<optimized out>) at ./assert/assert.c:118
#5  0x0000aaaaaaabddd4 in libcamera::ControlValue::get<int, decltype(nullptr)> (
    this=0xffffc8000cc0) at ../include/libcamera/controls.h:189
#6  0x0000ffffe5c0e1c8 in libcamera::ipa::soft::IPASoftSimple::processStats (this=0xffffd0035aa0, 
    frame=3, bufferId=0, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
#7  0x0000fffff7c1da4c in libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy::processStats (
    this=0xffffd0012ea8, frame=3, bufferId=0, sensorControls=...)
    at include/libcamera/ipa/soft_ipa_proxy.h:131
#8  0x0000fffff7c200c8 in libcamera::BoundMethodMember<libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy, void, unsigned int, unsigned int, libcamera::ControlList const&>::invoke (
    this=0xffffc8002170, args#0=3, args#1=0, args#2=...)
    at ../include/libcamera/base/bound_method.h:182
#9  0x0000fffff7c02d78 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack<0ul, 1ul, 2ul> (this=0xffffc8002170, pack=0xffffd0027e50)
    at ../include/libcamera/base/bound_method.h:102
#10 0x0000fffff7c021f4 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack (this=0xffffc8002170, pack=0xffffd0027e50)
    at ../include/libcamera/base/bound_method.h:111
#11 0x0000fffff76b3d68 in libcamera::InvokeMessage::invoke (this=0xffffc8001ac0)
    at ../src/libcamera/base/message.cpp:153
#12 0x0000fffff7699ef0 in libcamera::Object::message (this=0xffffd0012ea8, msg=0xffffc8001ac0)
    at ../src/libcamera/base/object.cpp:211
#13 0x0000fffff76b5650 in libcamera::Thread::dispatchMessages (this=0xffffd0012e58, 
    type=libcamera::Message::None, receiver=0x0) at ../src/libcamera/base/thread.cpp:662
#14 0x0000fffff76a3740 in libcamera::EventDispatcherPoll::processEvents (this=0xffffd4000ba0)
    at ../src/libcamera/base/event_dispatcher_poll.cpp:146
#15 0x0000fffff76b46ec in libcamera::Thread::exec (this=0xffffd0012e58)
    at ../src/libcamera/base/thread.cpp:319
#16 0x0000fffff76b4778 in libcamera::Thread::run (this=0xffffd0012e58)
    at ../src/libcamera/base/thread.cpp:346
#17 0x0000fffff76b4684 in libcamera::Thread::startThread (this=0xffffd0012e58)
    at ../src/libcamera/base/thread.cpp:297
#18 0x0000fffff76b9bc4 in std::__invoke_impl<void, void (libcamera::Thread::*)(), libcamera::Thread*> (
    __f=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>, __t=@0xffffd00055c8: 0xffffd0012e58)
    at /usr/include/c++/14/bits/invoke.h:74
--Type <RET> for more, q to quit, c to continue without paging--
#19 0x0000fffff76b9b10 in std::__invoke<void (libcamera::Thread::*)(), libcamera::Thread*> (
    __fn=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>) at /usr/include/c++/14/bits/invoke.h:96
#20 0x0000fffff76b9a74 in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::_M_invoke<0ul, 1ul> (this=0xffffd00055c8)
    at /usr/include/c++/14/bits/std_thread.h:301
#21 0x0000fffff76b9a2c in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::operator() (this=0xffffd00055c8) at /usr/include/c++/14/bits/std_thread.h:308
#22 0x0000fffff76b9a0c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> > >::_M_run (this=0xffffd00055c0)
    at /usr/include/c++/14/bits/std_thread.h:253
#23 0x0000fffff71cb4e0 in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
#24 0x0000fffff6f75f78 in start_thread (arg=0xffffe5b5ee00) at ./nptl/pthread_create.c:448
#25 0x0000fffff6fdde8c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone3.S:76
(gdb) 
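
The assertion text says the type stored in the value did not match the type
asked for by get<int>(). Which control is involved is not visible in the
backtrace. As a debugging aid, here is a stand-in sketch of that kind of tag
check. TaggedValue and its helpers are hypothetical and are not libcamera's
ControlValue; they only show checking the tag before reading.

```cpp
#include <cstdint>

// Hypothetical stand-in for a typed control value: a tag plus storage.
enum class ValueType { Integer32, Float };

struct TaggedValue {
	ValueType type;
	std::int32_t i;
	float f;
};

constexpr TaggedValue makeInt(std::int32_t v) { return { ValueType::Integer32, v, 0.0f }; }
constexpr TaggedValue makeFloat(float v) { return { ValueType::Float, 0, v }; }

// True only when the stored tag matches the type the caller wants to read.
constexpr bool holdsInt32(const TaggedValue &v)
{
	return v.type == ValueType::Integer32;
}

// Read as int32 when the tag matches, otherwise return a fallback instead
// of aborting the way an assert would.
constexpr std::int32_t int32Or(const TaggedValue &v, std::int32_t fallback)
{
	return holdsInt32(v) ? v.i : fallback;
}
```

Logging the tag at the failing call site (soft_simple.cpp:312 in frame #6)
would show which type is actually being stored there.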

Thanks and best regards,
								Pavel

Bryan O'Donoghue Nov. 29, 2025, 1:41 p.m. UTC | #10

On 28/11/2025 22:36, Pavel Machek wrote:
> Hi!
> 
>> This version 4:
>>
>> - Drops AWB since the CCM contains it already
>> - Includes Gamma
>> - Includes Contrast - testable via camshark
>> - Includes Saturation - testable via camshark
>> - Includes a scaler from Robert
> 
> Now I'm pretty sure it crops instead of resizes.
> 
> I'm also getting assertion failures when trying to it with high
> resolution on Librem 5 with mcam test program. Not sure how to debug
> that.
> 
> [Thread 0xffffe534ee00 (LWP 10180) exited]
> mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
> 
> Thread 23 "IPAProxySoft" received signal SIGABRT, Aborted.
> [Switching to Thread 0xffffe5b5ee00 (LWP 10181)]
> __pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6,
>      no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> warning: 44	./nptl/pthread_kill.c: No such file or directory
> (gdb) bt
> #0  __pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6,
>      no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> #1  0x0000fffff6f77e64 in __pthread_kill_internal (threadid=<optimized out>, signo=6)
>      at ./nptl/pthread_kill.c:89
> #2  0x0000fffff6f26980 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> #3  0x0000fffff6f11ac4 in __GI_abort () at ./stdlib/abort.c:73
> #4  0x0000fffff6f1f9bc in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>,
>      file=<optimized out>, line=189, function=<optimized out>) at ./assert/assert.c:118
> #5  0x0000aaaaaaabddd4 in libcamera::ControlValue::get<int, decltype(nullptr)> (
>      this=0xffffc8000cc0) at ../include/libcamera/controls.h:189
> #6  0x0000ffffe5c0e1c8 in libcamera::ipa::soft::IPASoftSimple::processStats (this=0xffffd0035aa0,
>      frame=3, bufferId=0, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
> #7  0x0000fffff7c1da4c in libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy::processStats (
>      this=0xffffd0012ea8, frame=3, bufferId=0, sensorControls=...)
>      at include/libcamera/ipa/soft_ipa_proxy.h:131
> #8  0x0000fffff7c200c8 in libcamera::BoundMethodMember<libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy, void, unsigned int, unsigned int, libcamera::ControlList const&>::invoke (
>      this=0xffffc8002170, args#0=3, args#1=0, args#2=...)
>      at ../include/libcamera/base/bound_method.h:182
> #9  0x0000fffff7c02d78 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack<0ul, 1ul, 2ul> (this=0xffffc8002170, pack=0xffffd0027e50)
>      at ../include/libcamera/base/bound_method.h:102
> #10 0x0000fffff7c021f4 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack (this=0xffffc8002170, pack=0xffffd0027e50)
>      at ../include/libcamera/base/bound_method.h:111
> #11 0x0000fffff76b3d68 in libcamera::InvokeMessage::invoke (this=0xffffc8001ac0)
>      at ../src/libcamera/base/message.cpp:153
> #12 0x0000fffff7699ef0 in libcamera::Object::message (this=0xffffd0012ea8, msg=0xffffc8001ac0)
>      at ../src/libcamera/base/object.cpp:211
> #13 0x0000fffff76b5650 in libcamera::Thread::dispatchMessages (this=0xffffd0012e58,
>      type=libcamera::Message::None, receiver=0x0) at ../src/libcamera/base/thread.cpp:662
> #14 0x0000fffff76a3740 in libcamera::EventDispatcherPoll::processEvents (this=0xffffd4000ba0)
>      at ../src/libcamera/base/event_dispatcher_poll.cpp:146
> #15 0x0000fffff76b46ec in libcamera::Thread::exec (this=0xffffd0012e58)
>      at ../src/libcamera/base/thread.cpp:319
> #16 0x0000fffff76b4778 in libcamera::Thread::run (this=0xffffd0012e58)
>      at ../src/libcamera/base/thread.cpp:346
> #17 0x0000fffff76b4684 in libcamera::Thread::startThread (this=0xffffd0012e58)
>      at ../src/libcamera/base/thread.cpp:297
> #18 0x0000fffff76b9bc4 in std::__invoke_impl<void, void (libcamera::Thread::*)(), libcamera::Thread*> (
>      __f=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>, __t=@0xffffd00055c8: 0xffffd0012e58)
>      at /usr/include/c++/14/bits/invoke.h:74
> #19 0x0000fffff76b9b10 in std::__invoke<void (libcamera::Thread::*)(), libcamera::Thread*> (
>      __fn=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>) at /usr/include/c++/14/bits/invoke.h:96
> #20 0x0000fffff76b9a74 in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::_M_invoke<0ul, 1ul> (this=0xffffd00055c8)
>      at /usr/include/c++/14/bits/std_thread.h:301
> #21 0x0000fffff76b9a2c in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::operator() (this=0xffffd00055c8) at /usr/include/c++/14/bits/std_thread.h:308
> #22 0x0000fffff76b9a0c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> > >::_M_run (this=0xffffd00055c0)
>      at /usr/include/c++/14/bits/std_thread.h:253
> #23 0x0000fffff71cb4e0 in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
> #24 0x0000fffff6f75f78 in start_thread (arg=0xffffe5b5ee00) at ./nptl/pthread_create.c:448
> #25 0x0000fffff6fdde8c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone3.S:76
> (gdb)
> 
> Thanks and best regards,
> 								Pavel

This doesn't look specific to the GPU patches to me

Could you try LIBCAMERA_LOG_LEVELS=*:DEBUG LIBCAMERA_SOFTISP_MODE=cpu 
/path/to/mcam <mcam_args> ?
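
A minimal sketch of the invocation above, for scripting repeated runs. The
two environment variables are the ones named in this message; debug_env and
run_mcam are hypothetical helper names, and the mcam path and arguments
remain placeholders to be filled in by the caller.

```python
import os
import subprocess


def debug_env(base=None):
    """Return a copy of the environment with the two variables from this thread set."""
    env = dict(os.environ if base is None else base)
    env["LIBCAMERA_LOG_LEVELS"] = "*:DEBUG"  # DEBUG logging for all categories
    env["LIBCAMERA_SOFTISP_MODE"] = "cpu"    # mode value suggested in this message
    return env


def run_mcam(mcam_path, mcam_args):
    """Run the test program with the debug environment and return its exit status."""
    # mcam_path and mcam_args are placeholders; nothing is assumed about mcam's options.
    return subprocess.run([mcam_path, *mcam_args], env=debug_env()).returncode
```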

1. Can you bisect ?

2. Where is mcam and how hard is it to build/run ?
    Perhaps I can figure this out if I can run your test code.

---
bod

Robert Mader Nov. 29, 2025, 2:01 p.m. UTC | #11

Hi - I unintentionally sent my reply only to Bryan, sorry 😅, so here it is again

On 22.11.25 17:14, Bryan O'Donoghue wrote:
> On 21/11/2025 13:36, Pavel Machek wrote:
>>> This version 4:
>>>
>>> - Drops AWB since the CCM contains it already
>>> - Includes Gamma
>>> - Includes Contrast - testable via camshark
>>> - Includes Saturation - testable via camshark
>>> - Includes a scaler from Robert
>> Does it? I tried requesting lower resolution, and it seemed like image
>> is cropped, not resized.
>>
>> Best regards,
>>                                 Pavel
>
> Ah, you'll have to ask Robert what's going on with this, I haven't 
> played with that bit myself.

For me things still work as expected on v4/v0.5.2-gpuisp-v5a.

Pavel: it should be easy to check if the scaler improves the situation 
over the SWISP - just choose a resolution that would need to get scaled, 
take pictures with and without GPUISP and check if the GPUISP one looks 
more "zoomed out".

Note: if the aspect ratio of the camera mode does not match the 
requested mode - i.e. one is 4:3 and the other 16:9 - then the image 
will get cropped in one direction. That should be in line with how most 
hardware ISPs work.
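
To illustrate the aspect-ratio note, here is a rough sketch of how a centered
crop region can be derived before scaling. This is not the logic in
debayer_egl.cpp; center_crop_for_aspect is a hypothetical helper, and the
centered placement is an assumption.

```python
def center_crop_for_aspect(src_w, src_h, dst_w, dst_h):
    """Return (x, y, w, h) of the largest centered source region with the dst aspect ratio."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by integer cross-multiplication to avoid float rounding.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the request: keep full height, trim left and right.
        w = src_h * dst_w // dst_h
        h = src_h
    else:
        # Source is taller than (or equal to) the request: keep full width, trim top and bottom.
        w = src_w
        h = src_w * dst_h // dst_w
    return ((src_w - w) // 2, (src_h - h) // 2, w, h)
```

With illustrative sizes, a 4:3 mode of 2592x1944 and a 16:9 request of
1920x1080 give a 2592x1458 region starting 243 rows from the top, so only the
vertical direction is cropped and the rest is scaled.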

If you think the scaling isn't correct yet, you can play around with the 
values for projMatrix in debayer_egl.cpp

Regards
>
> ---
> bod

Pavel Machek Nov. 29, 2025, 7:51 p.m. UTC | #12

Hi!

> > [Thread 0xffffe534ee00 (LWP 10180) exited]
> > mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
> > 
> > Thread 23 "IPAProxySoft" received signal SIGABRT, Aborted.
> > [Switching to Thread 0xffffe5b5ee00 (LWP 10181)]
> > __pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6,
> >      no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> > warning: 44	./nptl/pthread_kill.c: No such file or directory
> > (gdb) bt
> > #0  __pthread_kill_implementation (threadid=281474535648768, signo=signo@entry=6,
> >      no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> > #1  0x0000fffff6f77e64 in __pthread_kill_internal (threadid=<optimized out>, signo=6)
> >      at ./nptl/pthread_kill.c:89
> > #2  0x0000fffff6f26980 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> > #3  0x0000fffff6f11ac4 in __GI_abort () at ./stdlib/abort.c:73
> > #4  0x0000fffff6f1f9bc in __assert_fail_base (fmt=<optimized out>, assertion=<optimized out>,
> >      file=<optimized out>, line=189, function=<optimized out>) at ./assert/assert.c:118
> > #5  0x0000aaaaaaabddd4 in libcamera::ControlValue::get<int, decltype(nullptr)> (
> >      this=0xffffc8000cc0) at ../include/libcamera/controls.h:189
> > #6  0x0000ffffe5c0e1c8 in libcamera::ipa::soft::IPASoftSimple::processStats (this=0xffffd0035aa0,
> >      frame=3, bufferId=0, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
> > #7  0x0000fffff7c1da4c in libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy::processStats (
> >      this=0xffffd0012ea8, frame=3, bufferId=0, sensorControls=...)
> >      at include/libcamera/ipa/soft_ipa_proxy.h:131
> > #8  0x0000fffff7c200c8 in libcamera::BoundMethodMember<libcamera::ipa::soft::IPAProxySoftThreaded::ThreadProxy, void, unsigned int, unsigned int, libcamera::ControlList const&>::invoke (
> >      this=0xffffc8002170, args#0=3, args#1=0, args#2=...)
> >      at ../include/libcamera/base/bound_method.h:182
> > #9  0x0000fffff7c02d78 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack<0ul, 1ul, 2ul> (this=0xffffc8002170, pack=0xffffd0027e50)
> >      at ../include/libcamera/base/bound_method.h:102
> > #10 0x0000fffff7c021f4 in libcamera::BoundMethodArgs<void, unsigned int, unsigned int, libcamera::ControlList const&>::invokePack (this=0xffffc8002170, pack=0xffffd0027e50)
> >      at ../include/libcamera/base/bound_method.h:111
> > #11 0x0000fffff76b3d68 in libcamera::InvokeMessage::invoke (this=0xffffc8001ac0)
> >      at ../src/libcamera/base/message.cpp:153
> > #12 0x0000fffff7699ef0 in libcamera::Object::message (this=0xffffd0012ea8, msg=0xffffc8001ac0)
> >      at ../src/libcamera/base/object.cpp:211
> > #13 0x0000fffff76b5650 in libcamera::Thread::dispatchMessages (this=0xffffd0012e58,
> >      type=libcamera::Message::None, receiver=0x0) at ../src/libcamera/base/thread.cpp:662
> > #14 0x0000fffff76a3740 in libcamera::EventDispatcherPoll::processEvents (this=0xffffd4000ba0)
> >      at ../src/libcamera/base/event_dispatcher_poll.cpp:146
> > #15 0x0000fffff76b46ec in libcamera::Thread::exec (this=0xffffd0012e58)
> >      at ../src/libcamera/base/thread.cpp:319
> > #16 0x0000fffff76b4778 in libcamera::Thread::run (this=0xffffd0012e58)
> >      at ../src/libcamera/base/thread.cpp:346
> > #17 0x0000fffff76b4684 in libcamera::Thread::startThread (this=0xffffd0012e58)
> >      at ../src/libcamera/base/thread.cpp:297
> > #18 0x0000fffff76b9bc4 in std::__invoke_impl<void, void (libcamera::Thread::*)(), libcamera::Thread*> (
> >      __f=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>, __t=@0xffffd00055c8: 0xffffd0012e58)
> >      at /usr/include/c++/14/bits/invoke.h:74
> > #19 0x0000fffff76b9b10 in std::__invoke<void (libcamera::Thread::*)(), libcamera::Thread*> (
> >      __fn=@0xffffd00055d0: (void (libcamera::Thread::*)(class libcamera::Thread * const)) 0xfffff76b4510 <libcamera::Thread::startThread()>) at /usr/include/c++/14/bits/invoke.h:96
> > #20 0x0000fffff76b9a74 in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::_M_invoke<0ul, 1ul> (this=0xffffd00055c8)
> >      at /usr/include/c++/14/bits/std_thread.h:301
> > #21 0x0000fffff76b9a2c in std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> >::operator() (this=0xffffd00055c8) at /usr/include/c++/14/bits/std_thread.h:308
> > #22 0x0000fffff76b9a0c in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (libcamera::Thread::*)(), libcamera::Thread*> > >::_M_run (this=0xffffd00055c0)
> >      at /usr/include/c++/14/bits/std_thread.h:253
> > #23 0x0000fffff71cb4e0 in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
> > #24 0x0000fffff6f75f78 in start_thread (arg=0xffffe5b5ee00) at ./nptl/pthread_create.c:448
> > #25 0x0000fffff6fdde8c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone3.S:76
> > (gdb)
> > 
> > Thanks and best regards,
> > 								Pavel
> 
> This doesn't look specific to the GPU patches to me
> 
> Could you try LIBCAMERA_LOG_LEVELS=*:DEBUG LIBCAMERA_SOFTISP_MODE=cpu
> /path/to/mcam <mcam_args> ?

Let me try that. I don't get this crash when running w/o cpuisp/gpuisp
-- I do that by sudo chmod 600/666 /dev/dma_heap/system.

And yes, I get it with cpuisp only, too:

[34:53:41.933945338] [15549] DEBUG V4L2 v4l2_videodevice.cpp:1777 /dev/video0[19:cap]: Queueing buffer 1
[34:53:41.934053942] [15549] DEBUG DelayedControls delayed_controls.cpp:214 Reading Analogue Gain to 240 at index 0
[34:53:41.934110224] [15549] DEBUG DelayedControls delayed_controls.cpp:214 Reading Exposure to 629 at index 0
[34:53:41.934234668] [15549] DEBUG V4L2 v4l2_videodevice.cpp:1777 /dev/video0[19:cap]: Queueing buffer 2
[34:53:41.934316991] [15549] DEBUG DelayedControls delayed_controls.cpp:214 Reading Analogue Gain to <ValueType Error> at index 1
[34:53:41.934371233] [15549] DEBUG DelayedControls delayed_controls.cpp:214 Reading Exposure to <ValueType Error> at index 1
mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
[34:53:41.934487037] [15549] DEBUG V4L2 v4l2_videodevice.cpp:1777 /dev/video0[19:cap]: Queueing buffer 3
Aborted (core dumped)
mobian@mobian:~/g/libcamera/build$ 

Triggered by holding "Snap" button for a while, then releasing it.
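
For longer logs, a small sketch that picks out the DelayedControls reads
which logged <ValueType Error>. The pattern is written against the line
format shown above only; find_invalid_reads is a hypothetical helper name.

```python
import re

# Matches DelayedControls DEBUG lines of the form shown in the log above.
_READ_RE = re.compile(
    r"DelayedControls \S+:\d+ Reading (?P<name>.+?) to (?P<value>.+?) at index (?P<index>\d+)"
)


def find_invalid_reads(lines):
    """Return (control name, index) for every read that logged '<ValueType Error>'."""
    hits = []
    for line in lines:
        m = _READ_RE.search(line)
        if m and m.group("value") == "<ValueType Error>":
            hits.append((m.group("name"), int(m.group("index"))))
    return hits
```

Fed the four DelayedControls lines above, it reports Analogue Gain and
Exposure, both at index 1.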

> 1. Can you bisect ?

Not really.

> 2. Where is mcam and how hard is it to build/run ?
>    Perhaps I can figure this out if I can run your test code.

I just put code at

https://gitlab.com/tui/libcamera/-/merge_requests/new?merge_request%5Bsource_branch%5D=millicam_1

It crashed fairly quickly when holding the "Snap" button.

Queueing requests: Request(0:P:1/1:0)187651128454704
...loop
Too slow 7422 msec
...loop
Capture ran for 1 seconds and stopped with exit status: 0
mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
Aborted (core dumped)
mobian@mobian:~/g/libcamera/build$ git push
Enumerating objects: 29, done.
Counting objects: 100% (29/29), done.
Delta compression using up to 4 threads
Compressing objects: 100% (24/24), done.
Writing objects: 100% (24/24), 3.31 KiB | 261.00 KiB/s, done.
Total 24 (delta 16), reused 0 (delta 0), pack-reused 0 (from 0)
remote: 
remote: To create a merge request for millicam_1, visit:
remote:   https://gitlab.com/tui/libcamera/-/merge_requests/new?merge_request%5Bsource_branch%5D=millicam_1

So it looks like this happens with both cpuisp and gpuisp.
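
The assertion text compares a stored type tag against the type requested by
get<int>(). Below is a toy Python model of that kind of check, not libcamera
code: ControlType, TaggedValue and both accessors are hypothetical names. It
assumes the <ValueType Error> entries denote a value with no type set, which
this thread does not confirm.

```python
from enum import Enum


class ControlType(Enum):
    NONE = 0       # value was never set
    INTEGER32 = 1  # value holds an integer


class TaggedValue:
    """Toy stand-in for a type-tagged control value; not libcamera's class."""

    def __init__(self, value=None):
        self.type = ControlType.NONE if value is None else ControlType.INTEGER32
        self.value = value

    def get_int(self):
        # Same shape as the failing check: the stored tag must match the
        # requested type, otherwise the access is rejected.
        if self.type is not ControlType.INTEGER32:
            raise TypeError("stored type %s is not INTEGER32" % self.type.name)
        return self.value

    def get_int_or(self, default):
        # Defensive variant: tolerate an unset value instead of failing.
        return self.value if self.type is ControlType.INTEGER32 else default
```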

Best regards,
							Pavel

Barnabás Pőcze Dec. 1, 2025, 11:53 a.m. UTC | #13

Hi

On 2025. 11. 29. 20:51, Pavel Machek wrote:
> Hi!
> 
>>> [Thread 0xffffe534ee00 (LWP 10180) exited]
>>> mcam: ../include/libcamera/controls.h:189: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::basic_string_view<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.

I think you're running into this one: https://gitlab.freedesktop.org/camera/libcamera/-/issues/241


Regards,
Barnabás Pőcze


