[v2,0/9] Software ISP support for CCM
mbox series

Message ID 20241125180310.1254519-1-mzamazal@redhat.com
Headers show
Series
  • Software ISP support for CCM
Related show

Message

Milan Zamazal Nov. 25, 2024, 6:03 p.m. UTC
This series implements application of color correction matrices in
software ISP.

Support for color temperature is added to auto white balance algorithm,
this is needed to obtain the right CCM.  A new Ccm algorithm is added to
determine the matrix.  Both roughly follow what the hardware pipelines
do.

Lut algorithm is modified to incorporate the CCM and perform some
precomputations in the form of 10 lookup tables (one per each matrix
element and one for gamma).

A tricky part is to support both CCM and the original (no-CCM)
transformations.  CCM obviously adds a significant overhead; per-frame
time increases by ~45% on Debix Model A and ~85% on TI AM69 SK.  This
means CCM must be optional, which is determined by presence of Ccm
algorithm in the tuning file.

The performance critical debayering code must not include extra
conditionals and must work efficiently for both the cases.  The macros
in debayering code are refactored and restructured for this.  An added
bonus is that this also removes some annoying code duplication.  And
some more, partially already present, templating is used to select
between CCM and non-CCM processing.

Performance is much influenced by the data structures used in
debayering.  I tried to define something reasonable and performed only
basic performance testing.  Suggestions for possible improvements are
welcome.

See the commit messages for more details.

Changes in v2:
- Fix of the gamma table index upper boundary.
- Fix of red<->blue being temporarily swapped in an intermediate patch.
- Fix of CCM transformation when swapRedBlueGains_ is true.
- A related minor lookup data structures rearrangement, with an
  unintended but welcome effect of ~10% speedup with CCM in my
  environment. 

Milan Zamazal (9):
  libcamera: software_isp: Determine color temperature
  libcamera: software_isp: Store color temperature to metadata
  libcamera: software_isp: lut: Remove maybe_unused on a used argument
  libcamera: software_isp: Use common code to store debayered pixels
  libcamera: software_isp: Use a macro to assign debayering methods
  libcamera: software_isp: Add CCM algorithm
  libcamera: software_isp: Add an example CCM to uncalibrated.yaml
  libcamera: software_isp: Track whether CCM is enabled
  libcamera: software_isp: Apply CCM in debayering

 .../internal/software_isp/debayer_params.h    |  20 ++-
 include/libcamera/ipa/soft.mojom              |   2 +-
 src/ipa/simple/algorithms/awb.cpp             |  18 ++-
 src/ipa/simple/algorithms/ccm.cpp             |  86 ++++++++++++
 src/ipa/simple/algorithms/ccm.h               |  45 ++++++
 src/ipa/simple/algorithms/lut.cpp             |  62 +++++---
 src/ipa/simple/algorithms/lut.h               |   1 +
 src/ipa/simple/algorithms/meson.build         |   1 +
 src/ipa/simple/data/uncalibrated.yaml         |   7 +
 src/ipa/simple/ipa_context.h                  |  22 ++-
 src/ipa/simple/soft_simple.cpp                |   8 +-
 src/libcamera/software_isp/debayer.cpp        |  53 ++++++-
 src/libcamera/software_isp/debayer.h          |   3 +-
 src/libcamera/software_isp/debayer_cpu.cpp    | 132 ++++++++++--------
 src/libcamera/software_isp/debayer_cpu.h      |  31 ++--
 src/libcamera/software_isp/software_isp.cpp   |  11 +-
 16 files changed, 393 insertions(+), 109 deletions(-)
 create mode 100644 src/ipa/simple/algorithms/ccm.cpp
 create mode 100644 src/ipa/simple/algorithms/ccm.h

Comments

Hans de Goede Dec. 3, 2024, 5:19 p.m. UTC | #1
Hi Milan,

On 25-Nov-24 7:03 PM, Milan Zamazal wrote:
> This series implements application of color correction matrices in
> software ISP.
> 
> Support for color temperature is added to auto white balance algorithm,
> this is needed to obtain the right CCM.  A new Ccm algorithm is added to
> determine the matrix.  Both roughly follow what the hardware pipelines
> do.
> 
> Lut algorithm is modified to incorporate the CCM and perform some
> precomputations in the form of 10 lookup tables (one per each matrix
> element and one for gamma).
> 
> A tricky part is to support both CCM and the original (no-CCM)
> transformations.  CCM obviously adds a significant overhead; per-frame
> time increases by ~45% on Debix Model A and ~85% on TI AM69 SK.  This
> means CCM must be optional, which is determined by presence of Ccm
> algorithm in the tuning file.
> 
> The performance critical debayering code must not include extra
> conditionals and must work efficiently for both the cases.  The macros
> in debayering code are refactored and restructured for this.  An added
> bonus is that this also removes some annoying code duplication.  And
> some more, partially already present, templating is used to select
> between CCM and non-CCM processing.
> 
> Performance is much influenced by the data structures used in
> debayering.  I tried to define something reasonable and performed only
> basic performance testing.  Suggestions for possible improvements are
> welcome.

Thank you for writing this. I have tested this on a ThinkPad X1 Yoga
with a 13th Gen Intel(R) CoreT i7-1370P.

Pinning qcam and all its threads to only the 6 performance cores /
12 threads:

taskset 0xfff ./redhat-linux-build/src/apps/qcam/qcam

I consistently get 6.3 ms / frame processing time without the CCM
both with and without your patches and 9.5 ms / frame with the CCM
snippet in uncalibrated.yaml activated. So it seems that we should
be able to activate the CCM with the software ISP in the IPU6 case.

I did notice that the image becomes quite green with the default
1 - 1 - 1  CCM, so it seems that currently using the CCM ignores
the color-gains calculated by the simple IPA.

I'm pretty sure that real ISP-s have both color gains and a CCM,
so I think that we should apply the color gains to the CCM
when calculating the lookup tables.

E.g. multiply CcmColumn.r by red-gain before using it to fill
the lookup-table, etc.

And then it would be up to the IPA code to either do
whitebalance through the gains or directly in the ccm,
e.g. when using a ccm to enhance saturation then awb
could still be done through the gains.

Regards,

Hans



> 
> See the commit messages for more details.
> 
> Changes in v2:
> - Fix of the gamma table index upper boundary.
> - Fix of red<->blue being temporarily swapped in an intermediate patch.
> - Fix of CCM transformation when swapRedBlueGains_ is true.
> - A related minor lookup data structures rearrangement, with an
>   unintended but welcome effect of ~10% speedup with CCM in my
>   environment. 
> 
> Milan Zamazal (9):
>   libcamera: software_isp: Determine color temperature
>   libcamera: software_isp: Store color temperature to metadata
>   libcamera: software_isp: lut: Remove maybe_unused on a used argument
>   libcamera: software_isp: Use common code to store debayered pixels
>   libcamera: software_isp: Use a macro to assign debayering methods
>   libcamera: software_isp: Add CCM algorithm
>   libcamera: software_isp: Add an example CCM to uncalibrated.yaml
>   libcamera: software_isp: Track whether CCM is enabled
>   libcamera: software_isp: Apply CCM in debayering
> 
>  .../internal/software_isp/debayer_params.h    |  20 ++-
>  include/libcamera/ipa/soft.mojom              |   2 +-
>  src/ipa/simple/algorithms/awb.cpp             |  18 ++-
>  src/ipa/simple/algorithms/ccm.cpp             |  86 ++++++++++++
>  src/ipa/simple/algorithms/ccm.h               |  45 ++++++
>  src/ipa/simple/algorithms/lut.cpp             |  62 +++++---
>  src/ipa/simple/algorithms/lut.h               |   1 +
>  src/ipa/simple/algorithms/meson.build         |   1 +
>  src/ipa/simple/data/uncalibrated.yaml         |   7 +
>  src/ipa/simple/ipa_context.h                  |  22 ++-
>  src/ipa/simple/soft_simple.cpp                |   8 +-
>  src/libcamera/software_isp/debayer.cpp        |  53 ++++++-
>  src/libcamera/software_isp/debayer.h          |   3 +-
>  src/libcamera/software_isp/debayer_cpu.cpp    | 132 ++++++++++--------
>  src/libcamera/software_isp/debayer_cpu.h      |  31 ++--
>  src/libcamera/software_isp/software_isp.cpp   |  11 +-
>  16 files changed, 393 insertions(+), 109 deletions(-)
>  create mode 100644 src/ipa/simple/algorithms/ccm.cpp
>  create mode 100644 src/ipa/simple/algorithms/ccm.h
>
Laurent Pinchart Dec. 3, 2024, 5:23 p.m. UTC | #2
On Tue, Dec 03, 2024 at 06:19:41PM +0100, Hans de Goede wrote:
> Hi Milan,
> 
> On 25-Nov-24 7:03 PM, Milan Zamazal wrote:
> > This series implements application of color correction matrices in
> > software ISP.
> > 
> > Support for color temperature is added to auto white balance algorithm,
> > this is needed to obtain the right CCM.  A new Ccm algorithm is added to
> > determine the matrix.  Both roughly follow what the hardware pipelines
> > do.
> > 
> > Lut algorithm is modified to incorporate the CCM and perform some
> > precomputations in the form of 10 lookup tables (one per each matrix
> > element and one for gamma).
> > 
> > A tricky part is to support both CCM and the original (no-CCM)
> > transformations.  CCM obviously adds a significant overhead; per-frame
> > time increases by ~45% on Debix Model A and ~85% on TI AM69 SK.  This
> > means CCM must be optional, which is determined by presence of Ccm
> > algorithm in the tuning file.
> > 
> > The performance critical debayering code must not include extra
> > conditionals and must work efficiently for both the cases.  The macros
> > in debayering code are refactored and restructured for this.  An added
> > bonus is that this also removes some annoying code duplication.  And
> > some more, partially already present, templating is used to select
> > between CCM and non-CCM processing.
> > 
> > Performance is much influenced by the data structures used in
> > debayering.  I tried to define something reasonable and performed only
> > basic performance testing.  Suggestions for possible improvements are
> > welcome.
> 
> Thank you for writing this. I have tested this on a ThinkPad X1 Yoga
> with a 13th Gen Intel(R) CoreT i7-1370P.
> 
> Pinning qcam and all its threads to only the 6 performance cores /
> 12 threads:
> 
> taskset 0xfff ./redhat-linux-build/src/apps/qcam/qcam
> 
> I consistently get 6.3 ms / frame processing time without the CCM
> both with and without your patches and 9.5 ms / frame with the CCM
> snippet in uncalibrated.yaml activated. So it seems that we should
> be able to activate the CCM with the software ISP in the IPU6 case.
> 
> I did notice that the image becomes quite green with the default
> 1 - 1 - 1  CCM, so it seems that currently using the CCM ignores
> the color-gains calculated by the simple IPA.
> 
> I'm pretty sure that real ISP-s have both color gains and a CCM,
> so I think that we should apply the color gains to the CCM
> when calculating the lookup tables.

They do, but they typically apply colour gains before CFA interpolation,
not after.

> E.g. multiply CcmColumn.r by red-gain before using it to fill
> the lookup-table, etc.
> 
> And then it would be up to the IPA code to either do
> whitebalance through the gains or directly in the ccm,
> e.g. when using a ccm to enhance saturation then awb
> could still be done through the gains.
> 
> Regards,
> 
> Hans
> 
> > See the commit messages for more details.
> > 
> > Changes in v2:
> > - Fix of the gamma table index upper boundary.
> > - Fix of red<->blue being temporarily swapped in an intermediate patch.
> > - Fix of CCM transformation when swapRedBlueGains_ is true.
> > - A related minor lookup data structures rearrangement, with an
> >   unintended but welcome effect of ~10% speedup with CCM in my
> >   environment. 
> > 
> > Milan Zamazal (9):
> >   libcamera: software_isp: Determine color temperature
> >   libcamera: software_isp: Store color temperature to metadata
> >   libcamera: software_isp: lut: Remove maybe_unused on a used argument
> >   libcamera: software_isp: Use common code to store debayered pixels
> >   libcamera: software_isp: Use a macro to assign debayering methods
> >   libcamera: software_isp: Add CCM algorithm
> >   libcamera: software_isp: Add an example CCM to uncalibrated.yaml
> >   libcamera: software_isp: Track whether CCM is enabled
> >   libcamera: software_isp: Apply CCM in debayering
> > 
> >  .../internal/software_isp/debayer_params.h    |  20 ++-
> >  include/libcamera/ipa/soft.mojom              |   2 +-
> >  src/ipa/simple/algorithms/awb.cpp             |  18 ++-
> >  src/ipa/simple/algorithms/ccm.cpp             |  86 ++++++++++++
> >  src/ipa/simple/algorithms/ccm.h               |  45 ++++++
> >  src/ipa/simple/algorithms/lut.cpp             |  62 +++++---
> >  src/ipa/simple/algorithms/lut.h               |   1 +
> >  src/ipa/simple/algorithms/meson.build         |   1 +
> >  src/ipa/simple/data/uncalibrated.yaml         |   7 +
> >  src/ipa/simple/ipa_context.h                  |  22 ++-
> >  src/ipa/simple/soft_simple.cpp                |   8 +-
> >  src/libcamera/software_isp/debayer.cpp        |  53 ++++++-
> >  src/libcamera/software_isp/debayer.h          |   3 +-
> >  src/libcamera/software_isp/debayer_cpu.cpp    | 132 ++++++++++--------
> >  src/libcamera/software_isp/debayer_cpu.h      |  31 ++--
> >  src/libcamera/software_isp/software_isp.cpp   |  11 +-
> >  16 files changed, 393 insertions(+), 109 deletions(-)
> >  create mode 100644 src/ipa/simple/algorithms/ccm.cpp
> >  create mode 100644 src/ipa/simple/algorithms/ccm.h
Milan Zamazal Dec. 4, 2024, 9:38 a.m. UTC | #3
Hi Hans,

Hans de Goede <hdegoede@redhat.com> writes:

> Hi Milan,
>
> On 25-Nov-24 7:03 PM, Milan Zamazal wrote:
>> This series implements application of color correction matrices in
>> software ISP.
>> 
>> Support for color temperature is added to auto white balance algorithm,
>> this is needed to obtain the right CCM.  A new Ccm algorithm is added to
>> determine the matrix.  Both roughly follow what the hardware pipelines
>> do.
>> 
>> Lut algorithm is modified to incorporate the CCM and perform some
>> precomputations in the form of 10 lookup tables (one per each matrix
>> element and one for gamma).
>> 
>> A tricky part is to support both CCM and the original (no-CCM)
>> transformations.  CCM obviously adds a significant overhead; per-frame
>> time increases by ~45% on Debix Model A and ~85% on TI AM69 SK.  This
>> means CCM must be optional, which is determined by presence of Ccm
>> algorithm in the tuning file.
>> 
>> The performance critical debayering code must not include extra
>> conditionals and must work efficiently for both the cases.  The macros
>> in debayering code are refactored and restructured for this.  An added
>> bonus is that this also removes some annoying code duplication.  And
>> some more, partially already present, templating is used to select
>> between CCM and non-CCM processing.
>> 
>> Performance is much influenced by the data structures used in
>> debayering.  I tried to define something reasonable and performed only
>> basic performance testing.  Suggestions for possible improvements are
>> welcome.
>
> Thank you for writing this. I have tested this on a ThinkPad X1 Yoga
> with a 13th Gen Intel(R) CoreT i7-1370P.
>
> Pinning qcam and all its threads to only the 6 performance cores /
> 12 threads:
>
> taskset 0xfff ./redhat-linux-build/src/apps/qcam/qcam
>
> I consistently get 6.3 ms / frame processing time without the CCM
> both with and without your patches and 9.5 ms / frame with the CCM
> snippet in uncalibrated.yaml activated. So it seems that we should
> be able to activate the CCM with the software ISP in the IPU6 case.

Cool, thank you for testing.

> I did notice that the image becomes quite green with the default
> 1 - 1 - 1  CCM, so it seems that currently using the CCM ignores
> the color-gains calculated by the simple IPA.

Yes.

> I'm pretty sure that real ISP-s have both color gains and a CCM,
> so I think that we should apply the color gains to the CCM
> when calculating the lookup tables.

OK, since the CCM depends on temperature, I thought the gains might be
already included, but it indeed doesn't seem to be the case.

> E.g. multiply CcmColumn.r by red-gain before using it to fill
> the lookup-table, etc.

This would be easy to do.

> And then it would be up to the IPA code to either do
> whitebalance through the gains or directly in the ccm,
> e.g. when using a ccm to enhance saturation then awb
> could still be done through the gains.

Why should CCM + saturation need applying gains separately rather than
in the CCM?

> Regards,
>
> Hans
>
>
>
>> 
>> See the commit messages for more details.
>> 
>> Changes in v2:
>> - Fix of the gamma table index upper boundary.
>> - Fix of red<->blue being temporarily swapped in an intermediate patch.
>> - Fix of CCM transformation when swapRedBlueGains_ is true.
>> - A related minor lookup data structures rearrangement, with an
>>   unintended but welcome effect of ~10% speedup with CCM in my
>>   environment. 
>> 
>> Milan Zamazal (9):
>>   libcamera: software_isp: Determine color temperature
>>   libcamera: software_isp: Store color temperature to metadata
>>   libcamera: software_isp: lut: Remove maybe_unused on a used argument
>>   libcamera: software_isp: Use common code to store debayered pixels
>>   libcamera: software_isp: Use a macro to assign debayering methods
>>   libcamera: software_isp: Add CCM algorithm
>>   libcamera: software_isp: Add an example CCM to uncalibrated.yaml
>>   libcamera: software_isp: Track whether CCM is enabled
>>   libcamera: software_isp: Apply CCM in debayering
>> 
>>  .../internal/software_isp/debayer_params.h    |  20 ++-
>>  include/libcamera/ipa/soft.mojom              |   2 +-
>>  src/ipa/simple/algorithms/awb.cpp             |  18 ++-
>>  src/ipa/simple/algorithms/ccm.cpp             |  86 ++++++++++++
>>  src/ipa/simple/algorithms/ccm.h               |  45 ++++++
>>  src/ipa/simple/algorithms/lut.cpp             |  62 +++++---
>>  src/ipa/simple/algorithms/lut.h               |   1 +
>>  src/ipa/simple/algorithms/meson.build         |   1 +
>>  src/ipa/simple/data/uncalibrated.yaml         |   7 +
>>  src/ipa/simple/ipa_context.h                  |  22 ++-
>>  src/ipa/simple/soft_simple.cpp                |   8 +-
>>  src/libcamera/software_isp/debayer.cpp        |  53 ++++++-
>>  src/libcamera/software_isp/debayer.h          |   3 +-
>>  src/libcamera/software_isp/debayer_cpu.cpp    | 132 ++++++++++--------
>>  src/libcamera/software_isp/debayer_cpu.h      |  31 ++--
>>  src/libcamera/software_isp/software_isp.cpp   |  11 +-
>>  16 files changed, 393 insertions(+), 109 deletions(-)
>>  create mode 100644 src/ipa/simple/algorithms/ccm.cpp
>>  create mode 100644 src/ipa/simple/algorithms/ccm.h
>>
Kieran Bingham Dec. 8, 2024, 6:56 p.m. UTC | #4
Hi Milan,

I'm trying to work through and test this series, but unfortunately I
think the master branch moved fast on this one and there are a couple of
areas not compiling. It seems to be around the use of matrix which moved
to libcamera/internal/matrix.h and then I also get issues with the
activestate, when I tried to work through those manually.

Could you rebase this series when you get chance please?

--
Kieran

Quoting Milan Zamazal (2024-11-25 18:03:01)
> This series implements application of color correction matrices in
> software ISP.
> 
> Support for color temperature is added to auto white balance algorithm,
> this is needed to obtain the right CCM.  A new Ccm algorithm is added to
> determine the matrix.  Both roughly follow what the hardware pipelines
> do.
> 
> Lut algorithm is modified to incorporate the CCM and perform some
> precomputations in the form of 10 lookup tables (one per each matrix
> element and one for gamma).
> 
> A tricky part is to support both CCM and the original (no-CCM)
> transformations.  CCM obviously adds a significant overhead; per-frame
> time increases by ~45% on Debix Model A and ~85% on TI AM69 SK.  This
> means CCM must be optional, which is determined by presence of Ccm
> algorithm in the tuning file.
> 
> The performance critical debayering code must not include extra
> conditionals and must work efficiently for both the cases.  The macros
> in debayering code are refactored and restructured for this.  An added
> bonus is that this also removes some annoying code duplication.  And
> some more, partially already present, templating is used to select
> between CCM and non-CCM processing.
> 
> Performance is much influenced by the data structures used in
> debayering.  I tried to define something reasonable and performed only
> basic performance testing.  Suggestions for possible improvements are
> welcome.
> 
> See the commit messages for more details.
> 
> Changes in v2:
> - Fix of the gamma table index upper boundary.
> - Fix of red<->blue being temporarily swapped in an intermediate patch.
> - Fix of CCM transformation when swapRedBlueGains_ is true.
> - A related minor lookup data structures rearrangement, with an
>   unintended but welcome effect of ~10% speedup with CCM in my
>   environment. 
> 
> Milan Zamazal (9):
>   libcamera: software_isp: Determine color temperature
>   libcamera: software_isp: Store color temperature to metadata
>   libcamera: software_isp: lut: Remove maybe_unused on a used argument
>   libcamera: software_isp: Use common code to store debayered pixels
>   libcamera: software_isp: Use a macro to assign debayering methods
>   libcamera: software_isp: Add CCM algorithm
>   libcamera: software_isp: Add an example CCM to uncalibrated.yaml
>   libcamera: software_isp: Track whether CCM is enabled
>   libcamera: software_isp: Apply CCM in debayering
> 
>  .../internal/software_isp/debayer_params.h    |  20 ++-
>  include/libcamera/ipa/soft.mojom              |   2 +-
>  src/ipa/simple/algorithms/awb.cpp             |  18 ++-
>  src/ipa/simple/algorithms/ccm.cpp             |  86 ++++++++++++
>  src/ipa/simple/algorithms/ccm.h               |  45 ++++++
>  src/ipa/simple/algorithms/lut.cpp             |  62 +++++---
>  src/ipa/simple/algorithms/lut.h               |   1 +
>  src/ipa/simple/algorithms/meson.build         |   1 +
>  src/ipa/simple/data/uncalibrated.yaml         |   7 +
>  src/ipa/simple/ipa_context.h                  |  22 ++-
>  src/ipa/simple/soft_simple.cpp                |   8 +-
>  src/libcamera/software_isp/debayer.cpp        |  53 ++++++-
>  src/libcamera/software_isp/debayer.h          |   3 +-
>  src/libcamera/software_isp/debayer_cpu.cpp    | 132 ++++++++++--------
>  src/libcamera/software_isp/debayer_cpu.h      |  31 ++--
>  src/libcamera/software_isp/software_isp.cpp   |  11 +-
>  16 files changed, 393 insertions(+), 109 deletions(-)
>  create mode 100644 src/ipa/simple/algorithms/ccm.cpp
>  create mode 100644 src/ipa/simple/algorithms/ccm.h
> 
> -- 
> 2.44.2
>