[libcamera-devel,00/11] libcamera: introduce Software ISP and Software IPA
mbox series

Message ID 20231214121350.206015-1-hdegoede@redhat.com
Headers show
Series
  • libcamera: introduce Software ISP and Software IPA
Related show

Message

Hans de Goede Dec. 14, 2023, 12:13 p.m. UTC
Hi All,

Here is a new, now non RFC because believed to be mostly ready
for merging, version of the Software ISP work.

First of all apologies for sending this out so soon after RFC v2.
While Andrey was working on the general SoftwareIsp framework
I have been focussing on optimizing the code for gathering
statistics and doing the debayering. I ended up splitting out
the swstats and debayer code into separate classes and this
first non RFC posting integrates the best of Andrey's and my
work. And it seemed better to post a new version integrating
our work soon, so that review efforts can be focussed on
the new combined work.

Changes in v1 vs RFC-v2:
- Add and use SwStats[Cpu] and Debayer[Cpu] classes
- Rename linaro soft-IPA and soft-ISP implementations to simple following
  the simple pipeline hander
- Integrate Pavel's swstats and debayer improvements

Known open items:
- The AWB red/blue gain calculations needs to be moved to the IPA
  and IPA then needs to pass a DebayerParams struct back to the
  SwIsp
- Properly document all methods / attributed for doxygen

git branch for this v1 patch set:
https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v04

RFC-v2 cover-letter with small updates:

Here is an implementation of Software ISP and Software IPA
which provide debayering and implementation of image processing
algorithms for systems without a hardware ISP, or in cases when
there are no public drivers for the hardware ISP present in the
system.

The implementation of the Software ISP is a reference one.
A naive AWB alorithm is implemented as part of a function which
does debayering and statistics calculations - the algorithm part
is to be moved to the IPA in the next version of the patch set.
And for debayering itself there is already a more efficient
implementation by Hans de Goede. This patch set is currently using
the earlier debayering implementation as it is less lines of code
and is OK for the initial discussion.
Only RAW10P format from the sensor is currently supported, but
other debayering functions can be easily added (which of them to
call is decided based on the input format).

The Software IPA has only auto exposure and AGC. For the AGC
the analogue gain control of the camera sensor is used (if
available). The algorithm is very much simplified, and is
mostly included as a reference code.

The 6th patch renames some variables in the simple pipeline
handler for the Software ISP to use the same buffer handling
code as the Converter currently does. This lets one to
avoid adding extra code to the pipeline handler, but also
makes the Software ISP and the Converter mutually exclusive.

The Software ISP / IPA are intended to be used with the simple
pipeline handler. The pipeline handler doesn't interact with
the Software IPA directly - the Software IPA is hidden behind
the Software ISP interface. To use the Software ISP the build
must be configured with
  -Dpipelines=simple/simple -Dipas=simple/simple
and the Converter must not be used (the latter is hardcoded
in the simple pipeline handler).
If the build is configured with just
  -Dpipelines=simple
the Software ISP / IPA are not used, and the simple pipeline
handler works in the same way as without this patchset.

"simple" in the
  -Dpipelines=simple/simple -Dipas=simple/simple
is the name of the Software ISP / IPA implementation.
  -Dpipelines=simple/<something else> -Dipas=simple/<something else>
should work too if another implementation were added. What
implementation to use is the build time choice, only one
implementation is included into the build.

This patch set uses SharedMemObject class used by the RPi pipeline
handler - the second patch in the series moves the header file
to a common directory.

This patch set has been tested on Qualcomm RB5 board with
a mezzanine board equipped with RPi camera v2 (not the
standard RB5 camera mezzanine).

git branch for the RFC-v2 patch set:
https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v03

RFC-v1 of the patch set:
https://patchwork.libcamera.org/cover/19262/

Changes in v2 vs v1:
- patches are restructured and reordered
- the Software IPA is hidden behind the Software ISP interface. The
  pipeline handler doesn't interact with the Software IPA directly
- instantiation of the Software ISP / IPA is configurable (at build
  time)
- comment explaining the implementation limitations is added to
  SwIspLinaro::IspWorker::debayerRaw10P()

Regards,

Hans


Andrey Konovalov (7):
  libcamera: introduce SoftwareIsp class
  libcamera: internal: Move SharedMemObject class to a common directory
  libcamera: ipa: add Soft IPA common files
  libcamera: ipa: Soft IPA: add a Simple Soft IPA implementation
  libcamera: software_isp: add Simple SoftwareIsp implementation
  libcamera: pipeline: simple: rename converterBuffers_ and related vars
  libcamera: pipeline: simple: enable use of Soft ISP and Soft IPA

Hans de Goede (4):
  libcamera: software_isp: Add SwStats base class
  libcamera: software_isp: Add SwStatsCpu class
  libcamera: software_isp: Add Debayer base class
  libcamera: software_isp: Add DebayerCpu class

 include/libcamera/internal/meson.build        |   3 +
 .../libcamera/internal}/shared_mem_object.h   |   4 -
 include/libcamera/internal/software_isp.h     | 109 +++++
 .../libcamera/internal/software_isp/debayer.h |  90 ++++
 .../internal/software_isp/debayer_cpu.h       | 104 +++++
 .../internal/software_isp/debayer_params.h    |  24 +
 .../internal/software_isp/meson.build         |  11 +
 .../internal/software_isp/swisp_simple.h      |  67 +++
 .../internal/software_isp/swisp_stats.h       |  21 +
 .../libcamera/internal/software_isp/swstats.h | 150 ++++++
 .../internal/software_isp/swstats_cpu.h       |  46 ++
 include/libcamera/ipa/meson.build             |   1 +
 include/libcamera/ipa/soft.mojom              |  27 ++
 meson_options.txt                             |   3 +-
 src/ipa/simple/common/meson.build             |  17 +
 src/ipa/simple/common/soft_base.cpp           |  66 +++
 src/ipa/simple/common/soft_base.h             |  47 ++
 src/ipa/simple/meson.build                    |  12 +
 src/ipa/simple/simple/data/meson.build        |   9 +
 src/ipa/simple/simple/data/soft.conf          |   3 +
 src/ipa/simple/simple/meson.build             |  26 ++
 src/ipa/simple/simple/soft_simple.cpp         | 210 +++++++++
 src/libcamera/meson.build                     |   2 +
 src/libcamera/pipeline/simple/simple.cpp      | 164 +++++--
 src/libcamera/software_isp.cpp                |  62 +++
 src/libcamera/software_isp/debayer.cpp        |  22 +
 src/libcamera/software_isp/debayer_cpu.cpp    | 440 ++++++++++++++++++
 src/libcamera/software_isp/meson.build        |  27 ++
 src/libcamera/software_isp/swisp_simple.cpp   | 219 +++++++++
 src/libcamera/software_isp/swstats.cpp        |  65 +++
 src/libcamera/software_isp/swstats_cpu.cpp    | 169 +++++++
 31 files changed, 2164 insertions(+), 56 deletions(-)
 rename {src/libcamera/pipeline/rpi/common => include/libcamera/internal}/shared_mem_object.h (98%)
 create mode 100644 include/libcamera/internal/software_isp.h
 create mode 100644 include/libcamera/internal/software_isp/debayer.h
 create mode 100644 include/libcamera/internal/software_isp/debayer_cpu.h
 create mode 100644 include/libcamera/internal/software_isp/debayer_params.h
 create mode 100644 include/libcamera/internal/software_isp/meson.build
 create mode 100644 include/libcamera/internal/software_isp/swisp_simple.h
 create mode 100644 include/libcamera/internal/software_isp/swisp_stats.h
 create mode 100644 include/libcamera/internal/software_isp/swstats.h
 create mode 100644 include/libcamera/internal/software_isp/swstats_cpu.h
 create mode 100644 include/libcamera/ipa/soft.mojom
 create mode 100644 src/ipa/simple/common/meson.build
 create mode 100644 src/ipa/simple/common/soft_base.cpp
 create mode 100644 src/ipa/simple/common/soft_base.h
 create mode 100644 src/ipa/simple/meson.build
 create mode 100644 src/ipa/simple/simple/data/meson.build
 create mode 100644 src/ipa/simple/simple/data/soft.conf
 create mode 100644 src/ipa/simple/simple/meson.build
 create mode 100644 src/ipa/simple/simple/soft_simple.cpp
 create mode 100644 src/libcamera/software_isp.cpp
 create mode 100644 src/libcamera/software_isp/debayer.cpp
 create mode 100644 src/libcamera/software_isp/debayer_cpu.cpp
 create mode 100644 src/libcamera/software_isp/meson.build
 create mode 100644 src/libcamera/software_isp/swisp_simple.cpp
 create mode 100644 src/libcamera/software_isp/swstats.cpp
 create mode 100644 src/libcamera/software_isp/swstats_cpu.cpp

Comments

Bryan O'Donoghue Dec. 14, 2023, 2:40 p.m. UTC | #1
On 14/12/2023 12:13, Hans de Goede wrote:
> Hi All,
> 
> Here is a new, now non RFC because believed to be mostly ready
> for merging, version of the Software ISP work.


Thanks for sending this out.

One thing I'm noticing when compiling is a load of warnings about 
undocumented variables.

meson setup -Dprefix=/home/deckard/Development/FireFox/libcamera-install 
-Dpipelines=simple/simple -Dipas=simple/simple build.swisp

ninja -j `nproc` -C build.swisp
ninja: Entering directory `build.swisp'
[12/184] Generating src/ipa-priv-key with a custom command
........+...+...+.+........+..................+.......+...+..+.+......+...+......+.....+....+......+.........+......+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*......+....+..+....+.....+......+.+...............+........+....+......+..+...............+.............+......+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*..+........+...+...+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
....+......+.....+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*...+............+....................+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*...............+.......+............+...+..+...+..................+.+........+.............+.........+.....+.+...+..+.......+.....+.......+.....+.+...+............+...+..+.........+.........+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
[43/184] Generating src/libcamera/ipa_pub_key_cpp with a custom command
writing RSA key
[184/184] Generating Documentation/doxygen with a custom command
/home/deckard/Development/libcamera-downstream/include/libcamera/internal/software_isp/debayer.h:30: 
warning: Compound libcamera::Debayer is not documented.
/home/deckard/Development/libcamera-downstream/include/libcamera/internal/software_isp/debayer_cpu.h:25: 
warning: Compound libcamera::DebayerCpu is not documented.
/home/deckard/Development/libcamera-downstream/include/libcamera/internal/software_isp/debayer_params.h:15: 
warning: Compound libcamera::DebayerParams is not documented.
/home/deckard/Development/libcamera-downstream/include/libcamera/internal/shared_mem_object.h:23: 
warning: Compound libcamera::SharedMemObject is not documented.
/home/deckard/Development/libcamera-downstream/include/libcamera/internal/software_isp.h:34: 
warning: Compound libcamera::SoftwareIsp is not documented.

Not sure if you or Andrey see that yourselves but one to be aware of for V2.

---
bod
Hans de Goede Dec. 14, 2023, 2:47 p.m. UTC | #2
Hi Bryan,

On 12/14/23 15:40, Bryan O'Donoghue wrote:
> On 14/12/2023 12:13, Hans de Goede wrote:
>> Hi All,
>>
>> Here is a new, now non RFC because believed to be mostly ready
>> for merging, version of the Software ISP work.
> 
> 
> Thanks for sending this out.

You're welcome.

> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
<snip>
> Not sure if you or Andrey see that yourselves but one to be aware of for V2.

Quoting from the cover-letter:

"""
Known open items:
- The AWB red/blue gain calculations needs to be moved to the IPA
  and IPA then needs to pass a DebayerParams struct back to the
  SwIsp
- Properly document all methods / attributed for doxygen
"""

So this is a known issue.

BTW I'm currently working on adding support for 10bpp unpacked
bayer and while working on this I noticed a small bug in the swstats
code, you may want to squash in this fix:


--- a/src/libcamera/software_isp/swstats_cpu.cpp
+++ b/src/libcamera/software_isp/swstats_cpu.cpp
@@ -139,7 +139,7 @@ int SwStatsCpu::configure(const StreamConfiguration &inputCfg)
 		bpp_ = 10;
 		patternSize_.height = 2;
 		patternSize_.width = 4; /* 5 bytes per *4* pixels */
-		y_skip_mask_ = 0x0c; /* Skip every 3th and 4th line */
+		y_skip_mask_ = 0x02; /* Skip every 3th and 4th line */
 		x_shift_ = 0;
 
 		switch (bayerFormat.order) {


Regards,

Hans
Bryan O'Donoghue Dec. 14, 2023, 3:16 p.m. UTC | #3
On 14/12/2023 14:47, Hans de Goede wrote:
> Hi Bryan,
> 
> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>> On 14/12/2023 12:13, Hans de Goede wrote:
>>> Hi All,
>>>
>>> Here is a new, now non RFC because believed to be mostly ready
>>> for merging, version of the Software ISP work.
>>
>>
>> Thanks for sending this out.
> 
> You're welcome.
> 
>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
> <snip>
>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
> 
> Quoting from the cover-letter:
> 
> """
> Known open items:
> - The AWB red/blue gain calculations needs to be moved to the IPA
>    and IPA then needs to pass a DebayerParams struct back to the
>    SwIsp
> - Properly document all methods / attributed for doxygen
> """
> 
> So this is a known issue.
> 
> BTW I'm currently working on adding support for 10bpp unpacked
> bayer and while working on this I noticed a small bug in the swstats
> code, you may want to squash in this fix:

Applied that change.

I'm comparing to your earlier branch

commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> 
SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
Author: Hans de Goede <hdegoede@redhat.com>
Date:   Thu Nov 30 20:13:29 2023 +0100


On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm 
getting 18fps @ 100%.

Old branch:
https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg

New branch:
https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg

Seems to be eating alot more cycles and producing a more pinkish result 
on my hw.

---
bod
Hans de Goede Dec. 14, 2023, 4:02 p.m. UTC | #4
Hi Bryan,

On 12/14/23 16:16, Bryan O'Donoghue wrote:
> On 14/12/2023 14:47, Hans de Goede wrote:

<snip>

>> BTW I'm currently working on adding support for 10bpp unpacked
>> bayer and while working on this I noticed a small bug in the swstats
>> code, you may want to squash in this fix:
> 
> Applied that change.
> 
> I'm comparing to your earlier branch
> 
> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
> Author: Hans de Goede <hdegoede@redhat.com>
> Date:   Thu Nov 30 20:13:29 2023 +0100
> 
> 
> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
> Seems to be eating alot more cycles

Ok, that is no good.

Can you try:

https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3

And then start with checking out:

64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")

that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
so that should give you to same performance as before.

First please check this indeed restores performance ?

And then after that try newer commits from that branch in this order
(I'm skipping commits which should not have a performance impact here):

7136d4d59aafb2564a24a2d4be773ca220257fdc
e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
b3aa4e4f781e881746953177dacbce8c943cb5a3 (bugfix but one which is expected to have some performance impact)
ec75339a5bff8b8e9a0031b9408bd47020905a4f
f62ed5df54850a24406bf3a762271a5fa3c0303d

The reason why I'm asking you to test these instead of do a bisect
is since there might be a number of smaller performance regressions
all adding up ...

If you can let me know the results of this then I'll see if I can
restore the old performance for you.

Note that as mentioned above b3aa4e4f781e is a bug fix which
some potentially big performance implications but the old code
really was wrong there ...

> and producing a more pinkish result on my hw.

As part of my work I more or less rewrote white-balancing, but I thought I fixed
the pinkish thing. It seems the new white-balancing is somewhat sensitive to
over-exposure and your image does look a bit overexposed.

Regards,

Hans
Hans de Goede Dec. 14, 2023, 4:25 p.m. UTC | #5
Hi,

On 12/14/23 17:02, Hans de Goede wrote:
> Hi Bryan,
> 
> On 12/14/23 16:16, Bryan O'Donoghue wrote:
>> On 14/12/2023 14:47, Hans de Goede wrote:
> 
> <snip>
> 
>>> BTW I'm currently working on adding support for 10bpp unpacked
>>> bayer and while working on this I noticed a small bug in the swstats
>>> code, you may want to squash in this fix:
>>
>> Applied that change.
>>
>> I'm comparing to your earlier branch
>>
>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>> Author: Hans de Goede <hdegoede@redhat.com>
>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>
>>
>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>> Seems to be eating alot more cycles
> 
> Ok, that is no good.
> 
> Can you try:
> 
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3
> 
> And then start with checking out:
> 
> 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")
> 
> that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
> so that should give you to same performance as before.
> 
> First please check this indeed restores performance ?
> 
> And then after that try newer commits from that branch in this order
> (I'm skipping commits which should not have a performance impact here):
> 
> 7136d4d59aafb2564a24a2d4be773ca220257fdc
> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
> b3aa4e4f781e881746953177dacbce8c943cb5a3 (bugfix but one which is expected to have some performance impact)
> ec75339a5bff8b8e9a0031b9408bd47020905a4f
> f62ed5df54850a24406bf3a762271a5fa3c0303d
> 
> The reason why I'm asking you to test these instead of do a bisect
> is since there might be a number of smaller performance regressions
> all adding up ...
> 
> If you can let me know the results of this then I'll see if I can
> restore the old performance for you.
> 
> Note that as mentioned above b3aa4e4f781e is a bug fix which
> some potentially big performance implications but the old code
> really was wrong there ...
> 
>> and producing a more pinkish result on my hw.
> 
> As part of my work I more or less rewrote white-balancing, but I thought I fixed
> the pinkish thing. It seems the new white-balancing is somewhat sensitive to
> over-exposure and your image does look a bit overexposed.

p.s.

It would also be good to compare:

https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans2
https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3

(just the last commit of each)

hans2 -> hans3 is mostly prep work for merging things with SoftwareISP-v03,
but one significant change in hans3 is that I added Pavel's code changes
to avoid macro-use (in swstats) resp reduce code duplication (in debayer).

Pavel's changes rely on the compiler being smart enough to inline *all*
the helpers and then replace the ctxt struct with using registers for each
of the struct members. Maybe that is not working for you and we need
to undo Pavel's changes.

Regards,

Hans
Bryan O'Donoghue Dec. 15, 2023, 12:34 a.m. UTC | #6
On 14/12/2023 16:25, Hans de Goede wrote:
> Hi,
> 
> On 12/14/23 17:02, Hans de Goede wrote:
>> Hi Bryan,
>>
>> On 12/14/23 16:16, Bryan O'Donoghue wrote:
>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>
>> <snip>
>>
>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>> bayer and while working on this I noticed a small bug in the swstats
>>>> code, you may want to squash in this fix:
>>>
>>> Applied that change.
>>>
>>> I'm comparing to your earlier branch
>>>
>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>> Author: Hans de Goede <hdegoede@redhat.com>
>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>
>>>
>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>> Seems to be eating alot more cycles
>>
>> Ok, that is no good.
>>
>> Can you try:
>>
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3

So this branch has the same performance issue 18fps, pinked-out display 
and ~ 100% on one of the cores.

>>
>> And then start with checking out:
>>
>> 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")

I get 30fps here with 60-9x% cpu usage. Looks a bit dark but not pink.

Looks like a reasonable baseline.

>>
>> that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
>> so that should give you to same performance as before.
>>
>> First please check this indeed restores performance ?
>>
>> And then after that try newer commits from that branch in this order
>> (I'm skipping commits which should not have a performance impact here):
>>
>> 7136d4d59aafb2564a24a2d4be773ca220257fdc

brighter but pink, lower cpu occupancy 40%, 30fps

>> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)


Way brighter - too much IMO in comparsion to the previous, occupancy 
100% fps 21.

This looks like your wayward commit to me.

>> b3aa4e4f781e881746953177dacbce8c943cb5a3 (bugfix but one which is expected to have some performance impact)

Same result as previous 21 fps, 100% cpu.


>> ec75339a5bff8b8e9a0031b9408bd47020905a4f

28fps, 130% cpu

>> f62ed5df54850a24406bf3a762271a5fa3c0303d

28fps, 100% cpu

>>
>> The reason why I'm asking you to test these instead of do a bisect
>> is since there might be a number of smaller performance regressions
>> all adding up ...
>>
>> If you can let me know the results of this then I'll see if I can
>> restore the old performance for you.
>>
>> Note that as mentioned above b3aa4e4f781e is a bug fix which
>> some potentially big performance implications but the old code
>> really was wrong there ...
>>
>>> and producing a more pinkish result on my hw.
>>
>> As part of my work I more or less rewrote white-balancing, but I thought I fixed
>> the pinkish thing. It seems the new white-balancing is somewhat sensitive to
>> over-exposure and your image does look a bit overexposed.
> 
> p.s.
> 
> It would also be good to compare:
> 
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans2

27fps, 100% cpu occupancy, bright and pink

> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3

18fps, 100% cpu occupancy maybe less pink than previous but still 
noticeably too red

> 
> (just the last commit of each)
> 
> hans2 -> hans3 is mostly prep work for merging things with SoftwareISP-v03,
> but one significant change in hans3 is that I added Pavel's code changes
> to avoid macro-use (in swstats) resp reduce code duplication (in debayer).


> 
> Pavel's changes rely on the compiler being smart enough to inline *all*
> the helpers and then replace the ctxt struct with using registers for each
> of the struct members. Maybe that is not working for you and we need
> to undo Pavel's changes.

Nothing special there. Stock Debian aarch64 GCC.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-linux-gnu/13/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 13.2.0-6' 
--with-bugurl=file:///usr/share/doc/gcc-13/README.Bugs 
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr 
--with-gcc-major-version-only --program-suffix=-13 
--program-prefix=aarch64-linux-gnu- --enable-shared 
--enable-linker-build-id --libexecdir=/usr/libexec 
--without-included-gettext --enable-threads=posix --libdir=/usr/lib 
--enable-nls --enable-bootstrap --enable-clocale=gnu 
--enable-libstdcxx-debug --enable-libstdcxx-time=yes 
--with-default-libstdcxx-abi=new --enable-gnu-unique-object 
--disable-libquadmath --disable-libquadmath-support --enable-plugin 
--enable-default-pie --with-system-zlib 
--enable-libphobos-checking=release --with-target-system-zlib=auto 
--enable-objc-gc=auto --enable-multiarch --enable-fix-cortex-a53-843419 
--disable-werror --enable-checking=release --build=aarch64-linux-gnu 
--host=aarch64-linux-gnu --target=aarch64-linux-gnu 
--with-build-config=bootstrap-lto-lean --enable-link-serialization=4
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.2.0 (Debian 13.2.0-6)

---
bod
Bryan O'Donoghue Dec. 15, 2023, 12:48 a.m. UTC | #7
On 14/12/2023 16:25, Hans de Goede wrote:
> Hi,
> 
> On 12/14/23 17:02, Hans de Goede wrote:
>> Hi Bryan,
>>
>> On 12/14/23 16:16, Bryan O'Donoghue wrote:
>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>
>> <snip>
>>
>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>> bayer and while working on this I noticed a small bug in the swstats
>>>> code, you may want to squash in this fix:
>>>
>>> Applied that change.
>>>
>>> I'm comparing to your earlier branch
>>>
>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>> Author: Hans de Goede <hdegoede@redhat.com>
>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>
>>>
>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>> Seems to be eating alot more cycles
>>
>> Ok, that is no good.
>>
>> Can you try:
>>
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3
>>
>> And then start with checking out:
>>
>> 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")
>>
>> that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
>> so that should give you to same performance as before.
>>
>> First please check this indeed restores performance ?
>>
>> And then after that try newer commits from that branch in this order
>> (I'm skipping commits which should not have a performance impact here):
>>
>> 7136d4d59aafb2564a24a2d4be773ca220257fdc
>> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
>> b3aa4e4f781e881746953177dacbce8c943cb5a3 (bugfix but one which is expected to have some performance impact)
>> ec75339a5bff8b8e9a0031b9408bd47020905a4f
>> f62ed5df54850a24406bf3a762271a5fa3c0303d
>>
>> The reason why I'm asking you to test these instead of do a bisect
>> is since there might be a number of smaller performance regressions
>> all adding up ...
>>
>> If you can let me know the results of this then I'll see if I can
>> restore the old performance for you.
>>
>> Note that as mentioned above b3aa4e4f781e is a bug fix which
>> some potentially big performance implications but the old code
>> really was wrong there ...
>>
>>> and producing a more pinkish result on my hw.
>>
>> As part of my work I more or less rewrote white-balancing, but I thought I fixed
>> the pinkish thing. It seems the new white-balancing is somewhat sensitive to
>> over-exposure and your image does look a bit overexposed.
> 
> p.s.
> 
> It would also be good to compare:
> 
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans2
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3

Pavel pointed out libcamera is -Os not -O3

meson setup -Dprefix=/home/deckard/Development/FireFox/libcamera-install 
-Dpipelines=simple/simple -Dipas=simple/simple --buildtype=release 
build.swisp

Switching to release brings the cpu occupancy down to 60% on the topic 
branch for this series.

We should probably shouldn't be too concerned about performance of -Os 
code for this case.

Still looks washed out / red in comparsion to 
64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b

I think that warrants fixing.

---
bod
Hans de Goede Dec. 15, 2023, 2:27 p.m. UTC | #8
Hi,

On 12/15/23 01:48, Bryan O'Donoghue wrote:
> On 14/12/2023 16:25, Hans de Goede wrote:
>> Hi,
>>
>> On 12/14/23 17:02, Hans de Goede wrote:
>>> Hi Bryan,
>>>
>>> On 12/14/23 16:16, Bryan O'Donoghue wrote:
>>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>>
>>> <snip>
>>>
>>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>>> bayer and while working on this I noticed a small bug in the swstats
>>>>> code, you may want to squash in this fix:
>>>>
>>>> Applied that change.
>>>>
>>>> I'm comparing to your earlier branch
>>>>
>>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>>> Author: Hans de Goede <hdegoede@redhat.com>
>>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>>
>>>>
>>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>>> Seems to be eating alot more cycles
>>>
>>> Ok, that is no good.
>>>
>>> Can you try:
>>>
>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3
>>>
>>> And then start with checking out:
>>>
>>> 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")
>>>
>>> that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
>>> so that should give you to same performance as before.
>>>
>>> First please check this indeed restores performance ?
>>>
>>> And then after that try newer commits from that branch in this order
>>> (I'm skipping commits which should not have a performance impact here):
>>>
>>> 7136d4d59aafb2564a24a2d4be773ca220257fdc
>>> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
>>> b3aa4e4f781e881746953177dacbce8c943cb5a3 (bugfix but one which is expected to have some performance impact)
>>> ec75339a5bff8b8e9a0031b9408bd47020905a4f
>>> f62ed5df54850a24406bf3a762271a5fa3c0303d
>>>
>>> The reason why I'm asking you to test these instead of do a bisect
>>> is since there might be a number of smaller performance regressions
>>> all adding up ...
>>>
>>> If you can let me know the results of this then I'll see if I can
>>> restore the old performance for you.
>>>
>>> Note that as mentioned above b3aa4e4f781e is a bug fix which
>>> some potentially big performance implications but the old code
>>> really was wrong there ...
>>>
>>>> and producing a more pinkish result on my hw.
>>>
>>> As part of my work I more or less rewrote white-balancing, but I thought I fixed
>>> the pinkish thing. It seems the new white-balancing is somewhat sensitive to
>>> over-exposure and your image does look a bit overexposed.
>>
>> p.s.
>>
>> It would also be good to compare:
>>
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans2
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3
> 
> Pavel pointed out libcamera is -Os not -O3
> 
> meson setup -Dprefix=/home/deckard/Development/FireFox/libcamera-install -Dpipelines=simple/simple -Dipas=simple/simple --buildtype=release build.swisp
> 
> Switching to release brings the cpu occupancy down to 60% on the topic branch for this series.
> 
> We should probably shouldn't be too concerned about performance of -Os code for this case.
> 
> Still looks washed out / red in comparsion to 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b
> 
> I think that warrants fixing.

Ack. FWIW I'm not seeing that with my sensor. We need to investigate this.
Yesterday I finished about a 10 day libcamera sprint so for now I'm not
going to touch libcamera for a few days.

I plan to get back to libcamera again on Wednesday.

In the mean time can you let me know what the bayer order for your sensor is?
I think the swstats refactor may have messed up the stats for your sensor.

And please also do a test run with the attached patch applied and send me (part of)
the debug output please.

Regards,

Hans
Hans de Goede Dec. 15, 2023, 2:37 p.m. UTC | #9
Hi,

On 12/15/23 01:34, Bryan O'Donoghue wrote:
> On 14/12/2023 16:25, Hans de Goede wrote:
>> Hi,
>>
>> On 12/14/23 17:02, Hans de Goede wrote:
>>> Hi Bryan,
>>>
>>> On 12/14/23 16:16, Bryan O'Donoghue wrote:
>>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>>
>>> <snip>
>>>
>>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>>> bayer and while working on this I noticed a small bug in the swstats
>>>>> code, you may want to squash in this fix:
>>>>
>>>> Applied that change.
>>>>
>>>> I'm comparing to your earlier branch
>>>>
>>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>>> Author: Hans de Goede <hdegoede@redhat.com>
>>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>>
>>>>
>>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>>> Seems to be eating alot more cycles
>>>
>>> Ok, that is no good.
>>>
>>> Can you try:
>>>
>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v02-hans3
> 
> So this branch has the same performance issue 18fps, pinked-out display and ~ 100% on one of the cores.
> 
>>>
>>> And then start with checking out:
>>>
>>> 64c9bc0f55f459b9722fe78dfd35a07fb35d2a7b ("libcamera: software_isp: Add debayer-line callbacks for bayer patterns repeating every 4 lines")
> 
> I get 30fps here with 60-9x% cpu usage. Looks a bit dark but not pink.
> 
> Looks like a reasonable baseline.
> 
>>>
>>> that is basically softisp/SoftwareISP-v01-hans1 rebased on top of SoftwareISP-v02,
>>> so that should give you to same performance as before.
>>>
>>> First please check this indeed restores performance ?
>>>
>>> And then after that try newer commits from that branch in this order
>>> (I'm skipping commits which should not have a performance impact here):
>>>
>>> 7136d4d59aafb2564a24a2d4be773ca220257fdc
> 
> brighter but pink, lower cpu occupancy 40%, 30fps
> 
>>> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
> 
> 
> Way brighter - too much IMO in comparsion to the previous, occupancy 100% fps 21.

Ok, so the CPU change here is weird. I know building with -O3 helps, but it looks
like there is still more CPU load gain to have.

Can you try adding this change on top of SoftwareISP-v04 ? :

diff --git a/include/libcamera/internal/software_isp/debayer.h b/include/libcamera/internal/software_isp/debayer.h
index 206bc2ac..e2a63f24 100644
--- a/include/libcamera/internal/software_isp/debayer.h
+++ b/include/libcamera/internal/software_isp/debayer.h
@@ -77,10 +77,8 @@ public:
 			return {};
 		}
 
-		return SizeRange(Size(pattern_size.width, pattern_size.height),
-				 Size((inputSize.width - 2 * pattern_size.width) & ~(pattern_size.width - 1),
-				      (inputSize.height - 2 * pattern_size.height) & ~(pattern_size.height - 1)),
-				 pattern_size.width, pattern_size.height);
+		return SizeRange(Size((inputSize.width - 2 * pattern_size.width) & ~(pattern_size.width - 1),
+				      (inputSize.height - 2 * pattern_size.height) & ~(pattern_size.height - 1)));
 	}
 
 	Signal<FrameBuffer *> inputBufferReady;

That basically undoes e3c2a5931dd825c58f626da8c12429b70d20219b but then
the SoftwareISP-v04 equivalent of it.

I wonder if that will give you another CPU load drop :)

As for the way brighter that probably is the gamma correction which
I added which I've hardcoded to 0.5 which is probably way too much.
This change (also in the AWB debug patch I send) disables the gamma
correction:

diff --git a/src/libcamera/software_isp/swisp_simple.cpp b/src/libcamera/software_isp/swisp_simple.cpp
index ff05b6fe..0f9213cc 100644
--- a/src/libcamera/software_isp/swisp_simple.cpp
+++ b/src/libcamera/software_isp/swisp_simple.cpp
@@ -22,7 +22,7 @@
 namespace libcamera {
 
 SwIspSimple::SwIspSimple(PipelineHandler *pipe, const ControlInfoMap &sensorControls)
-	: SoftwareIsp(pipe, sensorControls), debayer_(nullptr), debayerParams_{256, 256, 0.5f}
+	: SoftwareIsp(pipe, sensorControls), debayer_(nullptr), debayerParams_{256, 256, 1.0f}
 {
 	std::unique_ptr<SwStatsCpu> stats;
 
Once we have the purple issue fixed I wouldn't mind getting some opinions on what is
a good gamma-correction default value. Note this is not entirely linear, 0.75
does much less (close to none) correction then 0.5, so you could e.g. try 0.6 0.65
but lets first fix the pink / purple haze issue.

Regards,

Hans
Andrey Konovalov Dec. 17, 2023, 8:23 p.m. UTC | #10
Hi All,

On 14.12.2023 18:16, Bryan O'Donoghue wrote:
> On 14/12/2023 14:47, Hans de Goede wrote:
>> Hi Bryan,
>>
>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>> Hi All,
>>>>
>>>> Here is a new, now non RFC because believed to be mostly ready
>>>> for merging, version of the Software ISP work.
>>>
>>>
>>> Thanks for sending this out.
>>
>> You're welcome.
>>
>>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
>> <snip>
>>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
>>
>> Quoting from the cover-letter:
>>
>> """
>> Known open items:
>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>    and IPA then needs to pass a DebayerParams struct back to the
>>    SwIsp
>> - Properly document all methods / attributed for doxygen
>> """
>>
>> So this is a known issue.
>>
>> BTW I'm currently working on adding support for 10bpp unpacked
>> bayer and while working on this I noticed a small bug in the swstats
>> code, you may want to squash in this fix:
> 
> Applied that change.
> 
> I'm comparing to your earlier branch
> 
> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
> Author: Hans de Goede <hdegoede@redhat.com>
> Date:   Thu Nov 30 20:13:29 2023 +0100
> 
> 
> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
> 
> Old branch:
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
> 
> New branch:
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
> 
> Seems to be eating alot more cycles and producing a more pinkish result on my hw.

RE the more pinkish result.

This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC (the exposure/gain could only increase, and
never go back down).

I've pushed a branch [1] with the three commits on top of SoftwareISP-v04-hans1 by Hans:

6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the fix to the "pinkish" image issue.
     Needs to be included into the patch set.
f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
     due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
     is relatively small - within 10% depending on the light and the objects in the particular image. This change
     increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.
f5906f15 ("[DNI] libcamera: software_isp: print colour gains to the log") prints the colour gains to the console
     so that one could see the effect of the previous patch - without it the green gain would always be 256, and the
     other two gains would increase in the same proportion. E.g. "gains R/G/B = 253/240/406" would become
     "gains R/G/B = 270/256/433" without commit f8efb3be.

Thanks,
Andrey

[1] https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v04-hans1--ynk1

> ---
> bod
>
Hans de Goede Dec. 18, 2023, 10:05 a.m. UTC | #11
Hi Andrey,

On 12/17/23 21:23, Andrey Konovalov wrote:
> Hi All,
> 
> On 14.12.2023 18:16, Bryan O'Donoghue wrote:
>> On 14/12/2023 14:47, Hans de Goede wrote:
>>> Hi Bryan,
>>>
>>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is a new, now non RFC because believed to be mostly ready
>>>>> for merging, version of the Software ISP work.
>>>>
>>>>
>>>> Thanks for sending this out.
>>>
>>> You're welcome.
>>>
>>>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
>>> <snip>
>>>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
>>>
>>> Quoting from the cover-letter:
>>>
>>> """
>>> Known open items:
>>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>>    and IPA then needs to pass a DebayerParams struct back to the
>>>    SwIsp
>>> - Properly document all methods / attributed for doxygen
>>> """
>>>
>>> So this is a known issue.
>>>
>>> BTW I'm currently working on adding support for 10bpp unpacked
>>> bayer and while working on this I noticed a small bug in the swstats
>>> code, you may want to squash in this fix:
>>
>> Applied that change.
>>
>> I'm comparing to your earlier branch
>>
>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>> Author: Hans de Goede <hdegoede@redhat.com>
>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>
>>
>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>
>> Old branch:
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
>>
>> New branch:
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
>>
>> Seems to be eating alot more cycles and producing a more pinkish result on my hw.
> 
> RE the more pinkish result.
> 
> This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC (the exposure/gain could only increase, and
> never go back down).
> 
> I've pushed a branch [1] with the three commits on top of SoftwareISP-v04-hans1 by Hans:
> 
> 6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the fix to the "pinkish" image issue.
>     Needs to be included into the patch set.

For those reading along, this is the fix:

--- a/src/libcamera/software_isp/swstats_cpu.cpp
+++ b/src/libcamera/software_isp/swstats_cpu.cpp
@@ -63,7 +63,7 @@ statsBayer10P(const int width, const uint8_t *src0, const uint8_t *src1, bool bg
 			r  = src1[x];
 			g2 = src1[x + 1];
 		}
-		g = g + g2 / 2;
+		g = ((unsigned int)g + g2) / 2;
 
 		sumR += r;
 		sumG += g;


Hmm, that smells like a compiler bug TBH. g and g2 are both uint8_t, so they have a range
of 0-255 so C/C++ integer promotion says these should both be promoted to an int
(0-255 fits in in an int) before doing the + operation:

https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_promotion

So "g + g2 / 2" should be done using all int values (so no overflow) and then
the resulting int which is in the 0-255 range again gets stored in uint8_t g, which just
takes the low 8 bits which should contain the right value.

I guess the compiler may be changing this to:

	g += g2;
	g /= 2;

but that is wrong and a compiler bug. With all that said I'm fine with adding the explicit
cast to (unsigned int), but unless I am missing something here this really does feel
like a compiler bug.

> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
>     due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
>     is relatively small - within 10% depending on the light and the objects in the particular image. This change
>     increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.

Interesting since we have r g b lookup tables already the extra CPU load is only
in generating the green table
once per frame, the load of that should be negligible.

And with the separate gamma lookup we can then also give the gamma-lookup table a bit
more precision giving it 1024 entries. The impact of this on performance should be
minimal.

> f5906f15 ("[DNI] libcamera: software_isp: print colour gains to the log") prints the colour gains to the console
>     so that one could see the effect of the previous patch - without it the green gain would always be 256, and the
>     other two gains would increase in the same proportion. E.g. "gains R/G/B = 253/240/406" would become
>     "gains R/G/B = 270/256/433" without commit f8efb3be.

I think we probably want something like this (but then in the IPA once the AWB calculations
have moved there) and then using a log-level of debug. Doing a debug log once per frame
should have no performance impact when debug logging is disabled.

Regards,

Hans
Andrey Konovalov Dec. 18, 2023, 10:34 a.m. UTC | #12
Hi Hans,

On 18.12.2023 13:05, Hans de Goede wrote:
> Hi Andrey,
> 
> On 12/17/23 21:23, Andrey Konovalov wrote:
>> Hi All,
>>
>> On 14.12.2023 18:16, Bryan O'Donoghue wrote:
>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>>> Hi Bryan,
>>>>
>>>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is a new, now non RFC because believed to be mostly ready
>>>>>> for merging, version of the Software ISP work.
>>>>>
>>>>>
>>>>> Thanks for sending this out.
>>>>
>>>> You're welcome.
>>>>
>>>>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
>>>> <snip>
>>>>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
>>>>
>>>> Quoting from the cover-letter:
>>>>
>>>> """
>>>> Known open items:
>>>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>>>     and IPA then needs to pass a DebayerParams struct back to the
>>>>     SwIsp
>>>> - Properly document all methods / attributed for doxygen
>>>> """
>>>>
>>>> So this is a known issue.
>>>>
>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>> bayer and while working on this I noticed a small bug in the swstats
>>>> code, you may want to squash in this fix:
>>>
>>> Applied that change.
>>>
>>> I'm comparing to your earlier branch
>>>
>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>> Author: Hans de Goede <hdegoede@redhat.com>
>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>
>>>
>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>>
>>> Old branch:
>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
>>>
>>> New branch:
>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
>>>
>>> Seems to be eating alot more cycles and producing a more pinkish result on my hw.
>>
>> RE the more pinkish result.
>>
>> This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC (the exposure/gain could only increase, and
>> never go back down).
>>
>> I've pushed a branch [1] with the three commits on top of SoftwareISP-v04-hans1 by Hans:
>>
>> 6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the fix to the "pinkish" image issue.
>>      Needs to be included into the patch set.
> 
> For those reading along, this is the fix:
> 
> --- a/src/libcamera/software_isp/swstats_cpu.cpp
> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
> @@ -63,7 +63,7 @@ statsBayer10P(const int width, const uint8_t *src0, const uint8_t *src1, bool bg
>   			r  = src1[x];
>   			g2 = src1[x + 1];
>   		}
> -		g = g + g2 / 2;
> +		g = ((unsigned int)g + g2) / 2;
>   
>   		sumR += r;
>   		sumG += g;
> 
> 
> Hmm, that smells like a compiler bug TBH. g and g2 are both uint8_t, so they have a range
> of 0-255 so C/C++ integer promotion says these should both be promoted to an int
> (0-255 fits in in an int) before doing the + operation:
> 
> https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_promotion

Ah OK, I've missed that integral promotion thing. Then the fix would be just adding the
missing brackets (not tested):

  -		g = g + g2 / 2;
  +		g = (g + g2) / 2;

> So "g + g2 / 2" should be done using all int values (so no overflow) and then
> the resulting int which is in the 0-255 range again gets stored in uint8_t g, which just
> takes the low 8 bits which should contain the right value.
> 
> I guess the compiler may be changing this to:
> 
> 	g += g2;
> 	g /= 2;
> 
> but that is wrong and a compiler bug. With all that said I'm fine with adding the explicit
> cast to (unsigned int), but unless I am missing something here this really does feel
> like a compiler bug.
> 
>> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
>>      due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
>>      is relatively small - within 10% depending on the light and the objects in the particular image. This change
>>      increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.
> 
> Interesting since we have r g b lookup tables already the extra CPU load is only
> in generating the green table
> once per frame, the load of that should be negligible.
> 
> And with the separate gamma lookup we can then also give the gamma-lookup table a bit
> more precision giving it 1024 entries. The impact of this on performance should be
> minimal.

OK. Then we could take this commit into the patch set. Having 1024 entries in the gamma-lookup table
would be a nice addition too.

>> f5906f15 ("[DNI] libcamera: software_isp: print colour gains to the log") prints the colour gains to the console
>>      so that one could see the effect of the previous patch - without it the green gain would always be 256, and the
>>      other two gains would increase in the same proportion. E.g. "gains R/G/B = 253/240/406" would become
>>      "gains R/G/B = 270/256/433" without commit f8efb3be.
> 
> I think we probably want something like this (but then in the IPA once the AWB calculations
> have moved there) and then using a log-level of debug. Doing a debug log once per frame
> should have no performance impact when debug logging is disabled.

OK

Thanks,
Andrey

> Regards,
> 
> Hans
>
Hans de Goede Dec. 18, 2023, 10:52 a.m. UTC | #13
Hi Andrey,

On 12/18/23 11:34, Andrey Konovalov wrote:
> Hi Hans,
> 
> On 18.12.2023 13:05, Hans de Goede wrote:
>> Hi Andrey,
>>
>> On 12/17/23 21:23, Andrey Konovalov wrote:
>>> Hi All,
>>>
>>> On 14.12.2023 18:16, Bryan O'Donoghue wrote:
>>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>>>> Hi Bryan,
>>>>>
>>>>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>>>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Here is a new, now non RFC because believed to be mostly ready
>>>>>>> for merging, version of the Software ISP work.
>>>>>>
>>>>>>
>>>>>> Thanks for sending this out.
>>>>>
>>>>> You're welcome.
>>>>>
>>>>>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
>>>>> <snip>
>>>>>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
>>>>>
>>>>> Quoting from the cover-letter:
>>>>>
>>>>> """
>>>>> Known open items:
>>>>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>>>>     and IPA then needs to pass a DebayerParams struct back to the
>>>>>     SwIsp
>>>>> - Properly document all methods / attributed for doxygen
>>>>> """
>>>>>
>>>>> So this is a known issue.
>>>>>
>>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>>> bayer and while working on this I noticed a small bug in the swstats
>>>>> code, you may want to squash in this fix:
>>>>
>>>> Applied that change.
>>>>
>>>> I'm comparing to your earlier branch
>>>>
>>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>>> Author: Hans de Goede <hdegoede@redhat.com>
>>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>>
>>>>
>>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>>>
>>>> Old branch:
>>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
>>>>
>>>> New branch:
>>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
>>>>
>>>> Seems to be eating alot more cycles and producing a more pinkish result on my hw.
>>>
>>> RE the more pinkish result.
>>>
>>> This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC (the exposure/gain could only increase, and
>>> never go back down).
>>>
>>> I've pushed a branch [1] with the three commits on top of SoftwareISP-v04-hans1 by Hans:
>>>
>>> 6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the fix to the "pinkish" image issue.
>>>      Needs to be included into the patch set.
>>
>> For those reading along, this is the fix:
>>
>> --- a/src/libcamera/software_isp/swstats_cpu.cpp
>> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
>> @@ -63,7 +63,7 @@ statsBayer10P(const int width, const uint8_t *src0, const uint8_t *src1, bool bg
>>               r  = src1[x];
>>               g2 = src1[x + 1];
>>           }
>> -        g = g + g2 / 2;
>> +        g = ((unsigned int)g + g2) / 2;
>>             sumR += r;
>>           sumG += g;
>>
>>
>> Hmm, that smells like a compiler bug TBH. g and g2 are both uint8_t, so they have a range
>> of 0-255 so C/C++ integer promotion says these should both be promoted to an int
>> (0-255 fits in in an int) before doing the + operation:
>>
>> https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_promotion
> 
> Ah OK, I've missed that integral promotion thing. Then the fix would be just adding the
> missing brackets (not tested):
> 
>  -        g = g + g2 / 2;
>  +        g = (g + g2) / 2;

Ah I missed that you also added the brackets / parenthesis. Yeah the missing parenthesis
are my bad, I'll squash a fix into the original commit for this.

Thank you for catching this.

>> So "g + g2 / 2" should be done using all int values (so no overflow) and then
>> the resulting int which is in the 0-255 range again gets stored in uint8_t g, which just
>> takes the low 8 bits which should contain the right value.
>>
>> I guess the compiler may be changing this to:
>>
>>     g += g2;
>>     g /= 2;
>>
>> but that is wrong and a compiler bug. With all that said I'm fine with adding the explicit
>> cast to (unsigned int), but unless I am missing something here this really does feel
>> like a compiler bug.
>>
>>> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
>>>      due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
>>>      is relatively small - within 10% depending on the light and the objects in the particular image. This change
>>>      increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.
>>
>> Interesting since we have r g b lookup tables already the extra CPU load is only
>> in generating the green table
>> once per frame, the load of that should be negligible.
>>
>> And with the separate gamma lookup we can then also give the gamma-lookup table a bit
>> more precision giving it 1024 entries. The impact of this on performance should be
>> minimal.
> 
> OK. Then we could take this commit into the patch set. Having 1024 entries in the gamma-lookup table
> would be a nice addition too.

Agreed. I think it would be best to squash this change (including making the gamma table 1024 entries)
into the original commit adding debayer_cpu.cpp is that ok with you ?

Regards,

Hans
Andrey Konovalov Dec. 18, 2023, 11:28 a.m. UTC | #14
Hi Hans,

On 18.12.2023 13:52, Hans de Goede wrote:
> Hi Andrey,
> 
> On 12/18/23 11:34, Andrey Konovalov wrote:
>> Hi Hans,
>>
>> On 18.12.2023 13:05, Hans de Goede wrote:
>>> Hi Andrey,
>>>
>>> On 12/17/23 21:23, Andrey Konovalov wrote:
>>>> Hi All,
>>>>
>>>> On 14.12.2023 18:16, Bryan O'Donoghue wrote:
>>>>> On 14/12/2023 14:47, Hans de Goede wrote:
>>>>>> Hi Bryan,
>>>>>>
>>>>>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>>>>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>>>>>> Hi All,
>>>>>>>>
>>>>>>>> Here is a new, now non RFC because believed to be mostly ready
>>>>>>>> for merging, version of the Software ISP work.
>>>>>>>
>>>>>>>
>>>>>>> Thanks for sending this out.
>>>>>>
>>>>>> You're welcome.
>>>>>>
>>>>>>> One thing I'm noticing when compiling is a load of warnings about undocumented variables.
>>>>>> <snip>
>>>>>>> Not sure if you or Andrey see that yourselves but one to be aware of for V2.
>>>>>>
>>>>>> Quoting from the cover-letter:
>>>>>>
>>>>>> """
>>>>>> Known open items:
>>>>>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>>>>>      and IPA then needs to pass a DebayerParams struct back to the
>>>>>>      SwIsp
>>>>>> - Properly document all methods / attributed for doxygen
>>>>>> """
>>>>>>
>>>>>> So this is a known issue.
>>>>>>
>>>>>> BTW I'm currently working on adding support for 10bpp unpacked
>>>>>> bayer and while working on this I noticed a small bug in the swstats
>>>>>> code, you may want to squash in this fix:
>>>>>
>>>>> Applied that change.
>>>>>
>>>>> I'm comparing to your earlier branch
>>>>>
>>>>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>>>>> Author: Hans de Goede <hdegoede@redhat.com>
>>>>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>>>>
>>>>>
>>>>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm getting 18fps @ 100%.
>>>>>
>>>>> Old branch:
>>>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
>>>>>
>>>>> New branch:
>>>>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
>>>>>
>>>>> Seems to be eating alot more cycles and producing a more pinkish result on my hw.
>>>>
>>>> RE the more pinkish result.
>>>>
>>>> This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC (the exposure/gain could only increase, and
>>>> never go back down).
>>>>
>>>> I've pushed a branch [1] with the three commits on top of SoftwareISP-v04-hans1 by Hans:
>>>>
>>>> 6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the fix to the "pinkish" image issue.
>>>>       Needs to be included into the patch set.
>>>
>>> For those reading along, this is the fix:
>>>
>>> --- a/src/libcamera/software_isp/swstats_cpu.cpp
>>> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
>>> @@ -63,7 +63,7 @@ statsBayer10P(const int width, const uint8_t *src0, const uint8_t *src1, bool bg
>>>                r  = src1[x];
>>>                g2 = src1[x + 1];
>>>            }
>>> -        g = g + g2 / 2;
>>> +        g = ((unsigned int)g + g2) / 2;
>>>              sumR += r;
>>>            sumG += g;
>>>
>>>
>>> Hmm, that smells like a compiler bug TBH. g and g2 are both uint8_t, so they have a range
>>> of 0-255 so C/C++ integer promotion says these should both be promoted to an int
>>> (0-255 fits in in an int) before doing the + operation:
>>>
>>> https://en.cppreference.com/w/cpp/language/implicit_conversion#Integral_promotion
>>
>> Ah OK, I've missed that integral promotion thing. Then the fix would be just adding the
>> missing brackets (not tested):
>>
>>   -        g = g + g2 / 2;
>>   +        g = (g + g2) / 2;
> 
> Ah I missed that you also added the brackets / parenthesis. Yeah the missing parenthesis
> are my bad, I'll squash a fix into the original commit for this.
> 
> Thank you for catching this.
> 
>>> So "g + g2 / 2" should be done using all int values (so no overflow) and then
>>> the resulting int which is in the 0-255 range again gets stored in uint8_t g, which just
>>> takes the low 8 bits which should contain the right value.
>>>
>>> I guess the compiler may be changing this to:
>>>
>>>      g += g2;
>>>      g /= 2;
>>>
>>> but that is wrong and a compiler bug. With all that said I'm fine with adding the explicit
>>> cast to (unsigned int), but unless I am missing something here this really does feel
>>> like a compiler bug.
>>>
>>>> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
>>>>       due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
>>>>       is relatively small - within 10% depending on the light and the objects in the particular image. This change
>>>>       increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.
>>>
>>> Interesting since we have r g b lookup tables already the extra CPU load is only
>>> in generating the green table
>>> once per frame, the load of that should be negligible.
>>>
>>> And with the separate gamma lookup we can then also give the gamma-lookup table a bit
>>> more precision giving it 1024 entries. The impact of this on performance should be
>>> minimal.
>>
>> OK. Then we could take this commit into the patch set. Having 1024 entries in the gamma-lookup table
>> would be a nice addition too.
> 
> Agreed. I think it would be best to squash this change (including making the gamma table 1024 entries)
> into the original commit adding debayer_cpu.cpp is that ok with you ?

Yes, I think squashing this change (including making the gamma table 1024 entries) and the missing parenthesis fix
into the original commit is the way to go.

Thanks,
Andrey

> Regards,
> 
> Hans
> 
> 
>
Bryan O'Donoghue Dec. 18, 2023, 12:20 p.m. UTC | #15
On 17/12/2023 20:23, Andrey Konovalov wrote:
> Hi All,
> 
> On 14.12.2023 18:16, Bryan O'Donoghue wrote:
>> On 14/12/2023 14:47, Hans de Goede wrote:
>>> Hi Bryan,
>>>
>>> On 12/14/23 15:40, Bryan O'Donoghue wrote:
>>>> On 14/12/2023 12:13, Hans de Goede wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is a new, now non RFC because believed to be mostly ready
>>>>> for merging, version of the Software ISP work.
>>>>
>>>>
>>>> Thanks for sending this out.
>>>
>>> You're welcome.
>>>
>>>> One thing I'm noticing when compiling is a load of warnings about 
>>>> undocumented variables.
>>> <snip>
>>>> Not sure if you or Andrey see that yourselves but one to be aware of 
>>>> for V2.
>>>
>>> Quoting from the cover-letter:
>>>
>>> """
>>> Known open items:
>>> - The AWB red/blue gain calculations needs to be moved to the IPA
>>>    and IPA then needs to pass a DebayerParams struct back to the
>>>    SwIsp
>>> - Properly document all methods / attributed for doxygen
>>> """
>>>
>>> So this is a known issue.
>>>
>>> BTW I'm currently working on adding support for 10bpp unpacked
>>> bayer and while working on this I noticed a small bug in the swstats
>>> code, you may want to squash in this fix:
>>
>> Applied that change.
>>
>> I'm comparing to your earlier branch
>>
>> commit ae92fa44991ded151c63d5d202efdacc9d640aff (HEAD -> 
>> SoftwareISP-v01-hans1, softisp/SoftwareISP-v01-hans1)
>> Author: Hans de Goede <hdegoede@redhat.com>
>> Date:   Thu Nov 30 20:13:29 2023 +0100
>>
>>
>> On the earlier branch I get 30fps @ 68% CPU usage. On this branch I'm 
>> getting 18fps @ 100%.
>>
>> Old branch:
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-image.jpg
>>
>> New branch:
>> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/blob/bod/SoftwareISP-v04-review/v1-posted-image.jpg
>>
>> Seems to be eating alot more cycles and producing a more pinkish 
>> result on my hw.
> 
> RE the more pinkish result.
> 
> This is a bug in swstats_cpu.cpp, which in fact broke both AWB and AGC 
> (the exposure/gain could only increase, and
> never go back down).
> 
> I've pushed a branch [1] with the three commits on top of 
> SoftwareISP-v04-hans1 by Hans:
> 
> 6baa50e7 ("libcamera: software_isp: fix statistics calculations") is the 
> fix to the "pinkish" image issue.
>      Needs to be included into the patch set.
> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase 
> luminance") makes the luminance not to change
>      due to the colour gains applied. This improves the image a bit 
> (makes it a bit less bright), but the effect
>      is relatively small - within 10% depending on the light and the 
> objects in the particular image. This change
>      increases the load on CPU, so I am not sure if this image 
> improvement is worth this higher CPU utilization.
> f5906f15 ("[DNI] libcamera: software_isp: print colour gains to the 
> log") prints the colour gains to the console
>      so that one could see the effect of the previous patch - without it 
> the green gain would always be 256, and the
>      other two gains would increase in the same proportion. E.g. "gains 
> R/G/B = 253/240/406" would become
>      "gains R/G/B = 270/256/433" without commit f8efb3be.
> 
> Thanks,
> Andrey
> 
> [1] 
> https://gitlab.freedesktop.org/camera/libcamera-softisp/-/commits/SoftwareISP-v04-hans1--ynk1


Yep, that branch fixes for me.

BTW just to answer the earlier q from Hans, Bayer order is GRBG - 
driver/media/i2c/ov5675.c

---
bod
Hans de Goede Dec. 18, 2023, 5:09 p.m. UTC | #16
Hi,

On 12/18/23 11:05, Hans de Goede wrote:

<snip>

>> f8efb3be ("libcamera: software_isp: fix AWB algorithm not to increase luminance") makes the luminance not to change
>>     due to the colour gains applied. This improves the image a bit (makes it a bit less bright), but the effect
>>     is relatively small - within 10% depending on the light and the objects in the particular image. This change
>>     increases the load on CPU, so I am not sure if this image improvement is worth this higher CPU utilization.
> 
> Interesting since we have r g b lookup tables already the extra CPU load is only
> in generating the green table
> once per frame, the load of that should be negligible.
> 
> And with the separate gamma lookup we can then also give the gamma-lookup table a bit
> more precision giving it 1024 entries. The impact of this on performance should be
> minimal.

Quick remark on this fix, I had to change the y256 type from unsigned int to unsigned long
because with a FHD sensor 256 * averagesum of ~700000 pixels does not fit in 32 bits.

I have a WIP branch with your 3 improvements (with 1024 gamma entries) squashed in
here:

https://github.com/jwrdegoede/libcamera/commits/SoftwareISP-v04

Note I regularly do forced pushes there as I keep refining things.

Regards,

Hans





> 
>> f5906f15 ("[DNI] libcamera: software_isp: print colour gains to the log") prints the colour gains to the console
>>     so that one could see the effect of the previous patch - without it the green gain would always be 256, and the
>>     other two gains would increase in the same proportion. E.g. "gains R/G/B = 253/240/406" would become
>>     "gains R/G/B = 270/256/433" without commit f8efb3be.
> 
> I think we probably want something like this (but then in the IPA once the AWB calculations
> have moved there) and then using a log-level of debug. Doing a debug log once per frame
> should have no performance impact when debug logging is disabled.
> 
> Regards,
> 
> Hans
>
Hans de Goede Dec. 21, 2023, 8:36 p.m. UTC | #17
Hi Bryan,

On 12/15/23 15:37, Hans de Goede wrote:

<snip>

>>>> e3c2a5931dd825c58f626da8c12429b70d20219b (not really expecting a performance impact from this one but maybe)
>>
>>
>> Way brighter - too much IMO in comparsion to the previous, occupancy 100% fps 21.
> 
> Ok, so the CPU change here is weird. I know building with -O3 helps, but it looks
> like there is still more CPU load gain to have.
> 
> Can you try adding this change on top of SoftwareISP-v04 ? :
> 
> diff --git a/include/libcamera/internal/software_isp/debayer.h b/include/libcamera/internal/software_isp/debayer.h
> index 206bc2ac..e2a63f24 100644
> --- a/include/libcamera/internal/software_isp/debayer.h
> +++ b/include/libcamera/internal/software_isp/debayer.h
> @@ -77,10 +77,8 @@ public:
>  			return {};
>  		}
>  
> -		return SizeRange(Size(pattern_size.width, pattern_size.height),
> -				 Size((inputSize.width - 2 * pattern_size.width) & ~(pattern_size.width - 1),
> -				      (inputSize.height - 2 * pattern_size.height) & ~(pattern_size.height - 1)),
> -				 pattern_size.width, pattern_size.height);
> +		return SizeRange(Size((inputSize.width - 2 * pattern_size.width) & ~(pattern_size.width - 1),
> +				      (inputSize.height - 2 * pattern_size.height) & ~(pattern_size.height - 1)));
>  	}
>  
>  	Signal<FrameBuffer *> inputBufferReady;
> 
> That basically undoes e3c2a5931dd825c58f626da8c12429b70d20219b but then
> the SoftwareISP-v04 equivalent of it.

This question (the CPU increase you saw due to e3c2a5931dd825c58f626da8c12429b70d20219b
is still relevant and the above diff to just return a single fixed size
should still apply to SoftwareISP-v04-hans2 .

I know -O3 helps, but I'm still puzzled why the sizes() changed caused
a performance regression at all. So I wonder of if you undo the change
(diff above) we get even better performance with -O3.

Regards,

Hans