[05/13] shaders: bayer: Use native matrix multiplication
diff mbox series

Message ID 20260407-kbingham-awb-split-v1-5-a39af3f4dc20@ideasonboard.com
State New
Headers show
Series
  • ipa: simple: Convert to libipa AWB implementation
Related show

Commit Message

Kieran Bingham April 7, 2026, 10:01 p.m. UTC
Remove the open coded matrix multiplication as glsl can perform this
directly. Adapt the uniform ccmUniformDataIn_ to ensure we correctly
upload the matrix in row major ordering.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
---
 src/libcamera/shaders/bayer_1x_packed.frag | 45 ++----------------------------
 src/libcamera/shaders/bayer_unpacked.frag  | 44 ++---------------------------
 src/libcamera/software_isp/debayer_egl.cpp | 12 ++------
 3 files changed, 7 insertions(+), 94 deletions(-)

Comments

Laurent Pinchart April 8, 2026, 12:17 a.m. UTC | #1
On Tue, Apr 07, 2026 at 11:01:08PM +0100, Kieran Bingham wrote:
> Remove the open coded matrix multiplication as glsl can perform this
> directly. Adapt the uniform ccmUniformDataIn_ to ensure we correctly
> upload the matrix in row major ordering.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> ---
>  src/libcamera/shaders/bayer_1x_packed.frag | 45 ++----------------------------
>  src/libcamera/shaders/bayer_unpacked.frag  | 44 ++---------------------------
>  src/libcamera/software_isp/debayer_egl.cpp | 12 ++------
>  3 files changed, 7 insertions(+), 94 deletions(-)
> 
> diff --git a/src/libcamera/shaders/bayer_1x_packed.frag b/src/libcamera/shaders/bayer_1x_packed.frag
> index 9a1992e219dd066945b3f46ec509f47a31590385..57f0be9b264bdbdf523291dc9b35ab5fd6c9ef6a 100644
> --- a/src/libcamera/shaders/bayer_1x_packed.frag
> +++ b/src/libcamera/shaders/bayer_1x_packed.frag
> @@ -231,49 +231,8 @@ void main(void)
>  	/* Apply AWB gains, and saturate each channel at sensor range */
>  	rgb = clamp(rgb * awb, vec3(0.0), vec3(1.0) - blacklevel);
>  
> -	/*
> -	 *   CCM is a 3x3 in the format
> -	 *
> -	 *   +--------------+----------------+---------------+
> -	 *   | RedRedGain   | RedGreenGain   | RedBlueGain   |
> -	 *   +--------------+----------------+---------------+
> -	 *   | GreenRedGain | GreenGreenGain | GreenBlueGain |
> -	 *   +--------------+----------------+---------------+
> -	 *   | BlueRedGain  |  BlueGreenGain | BlueBlueGain  |
> -	 *   +--------------+----------------+---------------+
> -	 *
> -	 *   Rout = RedRedGain * Rin + RedGreenGain * Gin + RedBlueGain * Bin
> -	 *   Gout = GreenRedGain * Rin + GreenGreenGain * Gin + GreenBlueGain * Bin
> -	 *   Bout = BlueRedGain * Rin + BlueGreenGain * Gin + BlueBlueGain * Bin
> -	 *
> -	 *   We upload to the GPU without transposition glUniformMatrix3f(.., .., GL_FALSE, ccm);
> -	 *
> -	 *   CPU
> -	 *   float ccm [] = {
> -	 *             RedRedGain,   RedGreenGain,   RedBlueGain,
> -	 *             GreenRedGain, GreenGreenGain, GreenBlueGain,
> -	 *             BlueRedGain,  BlueGreenGain,  BlueBlueGain,
> -	 *   };
> -	 *
> -	 *   GPU
> -	 *   ccm = {
> -	 *             RedRedGain,   GreenRedGain,   BlueRedGain,
> -	 *             RedGreenGain, GreenGreenGain, BlueGreenGain,
> -	 *             RedBlueGain,  GreenBlueGain,  BlueBlueGain,
> -	 *   }
> -	 *
> -	 *   However the indexing for the mat data-type is column major hence
> -	 *   ccm[0][0] = RedRedGain, ccm[0][1] = RedGreenGain, ccm[0][2] = RedBlueGain
> -	 *
> -	 */
> -	float rin, gin, bin;
> -	rin = rgb.r;
> -	gin = rgb.g;
> -	bin = rgb.b;
> -
> -	rgb.r = (rin * ccm[0][0]) + (gin * ccm[0][1]) + (bin * ccm[0][2]);
> -	rgb.g = (rin * ccm[1][0]) + (gin * ccm[1][1]) + (bin * ccm[1][2]);
> -	rgb.b = (rin * ccm[2][0]) + (gin * ccm[2][1]) + (bin * ccm[2][2]);
> +	/* Row major Colour Correction Matrix multiplication */

Row-major is related to how data is stored in memory. When writing

	ccm * rgb

that's not relevant any more. You can drop it from the comment. Same
below.

> +	rgb = ccm * rgb;

I wonder why that was open-coded. Any idea ? I suppose there's no
negative performance impact from this patch, as the compiler should
generate the same or better instructions ?

>  
>  	/*
>  	 * Contrast
> diff --git a/src/libcamera/shaders/bayer_unpacked.frag b/src/libcamera/shaders/bayer_unpacked.frag
> index 87e4e4915fe19679943fdcc3d213a0224b89065e..52de0a7901cfac33dbc7b306f1840ec0048cd8ea 100644
> --- a/src/libcamera/shaders/bayer_unpacked.frag
> +++ b/src/libcamera/shaders/bayer_unpacked.frag
> @@ -134,49 +134,9 @@ void main(void) {
>  	/* Apply AWB gains, and saturate each channel at sensor range */
>  	rgb = clamp(rgb * awb, vec3(0.0), vec3(1.0) - blacklevel);
>  
> -	/*
> -	 *   CCM is a 3x3 in the format
> -	 *
> -	 *   +--------------+----------------+---------------+
> -	 *   | RedRedGain   | RedGreenGain   | RedBlueGain   |
> -	 *   +--------------+----------------+---------------+
> -	 *   | GreenRedGain | GreenGreenGain | GreenBlueGain |
> -	 *   +--------------+----------------+---------------+
> -	 *   | BlueRedGain  |  BlueGreenGain | BlueBlueGain  |
> -	 *   +--------------+----------------+---------------+
> -	 *
> -	 *   Rout = RedRedGain * Rin + RedGreenGain * Gin + RedBlueGain * Bin
> -	 *   Gout = GreenRedGain * Rin + GreenGreenGain * Gin + GreenBlueGain * Bin
> -	 *   Bout = BlueRedGain * Rin + BlueGreenGain * Gin + BlueBlueGain * Bin
> -	 *
> -	 *   We upload to the GPU without transposition glUniformMatrix3f(.., .., GL_FALSE, ccm);
> -	 *
> -	 *   CPU
> -	 *   float ccm [] = {
> -	 *             RedRedGain,   RedGreenGain,   RedBlueGain,
> -	 *             GreenRedGain, GreenGreenGain, GreenBlueGain,
> -	 *             BlueRedGain,  BlueGreenGain,  BlueBlueGain,
> -	 *   };
> -	 *
> -	 *   GPU
> -	 *   ccm = {
> -	 *             RedRedGain,   GreenRedGain,   BlueRedGain,
> -	 *             RedGreenGain, GreenGreenGain, BlueGreenGain,
> -	 *             RedBlueGain,  GreenBlueGain,  BlueBlueGain,
> -	 *   }
> -	 *
> -	 *   However the indexing for the mat data-type is column major hence
> -	 *   ccm[0][0] = RedRedGain, ccm[0][1] = RedGreenGain, ccm[0][2] = RedBlueGain
> -	 *
> -	 */
> -	float rin, gin, bin;
> -	rin = rgb.r;
> -	gin = rgb.g;
> -	bin = rgb.b;
>  
> -	rgb.r = (rin * ccm[0][0]) + (gin * ccm[0][1]) + (bin * ccm[0][2]);
> -	rgb.g = (rin * ccm[1][0]) + (gin * ccm[1][1]) + (bin * ccm[1][2]);
> -	rgb.b = (rin * ccm[2][0]) + (gin * ccm[2][1]) + (bin * ccm[2][2]);
> +	/* Row major Colour Correction Matrix multiplication */
> +	rgb = ccm * rgb;
>  
>  	/*
>  	 * Contrast
> diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
> index 738036e649224f370bdf9a0cc59b399f8b1066de..1022f595cc71751cd0c9133b3e5755f591224dc4 100644
> --- a/src/libcamera/software_isp/debayer_egl.cpp
> +++ b/src/libcamera/software_isp/debayer_egl.cpp
> @@ -464,15 +464,9 @@ void DebayerEGL::setShaderVariableValues(const DebayerParams &params)
>  			    << " textureUniformProjMatrix_ " << textureUniformProjMatrix_;
>  
>  	GLfloat ccm[9] = {
> -		params.combinedMatrix[0][0],
> -		params.combinedMatrix[0][1],
> -		params.combinedMatrix[0][2],
> -		params.combinedMatrix[1][0],
> -		params.combinedMatrix[1][1],
> -		params.combinedMatrix[1][2],
> -		params.combinedMatrix[2][0],
> -		params.combinedMatrix[2][1],
> -		params.combinedMatrix[2][2],
> +		params.combinedMatrix[0][0], params.combinedMatrix[1][0], params.combinedMatrix[2][0],
> +		params.combinedMatrix[0][1], params.combinedMatrix[1][1], params.combinedMatrix[2][1],
> +		params.combinedMatrix[0][2], params.combinedMatrix[1][2], params.combinedMatrix[2][2],

You're now storing the data in column-major order, ...

>  	};
>  	glUniformMatrix3fv(ccmUniformDataIn_, 1, GL_FALSE, ccm);

... and the third argument matches that. As the libcamera Matrix class
stores elements in row-major order, I think it would be clearer to store
the data in row-major order in the ccm array as well, and set the third
argument to GL_TRUE accordingly.

We could then possibly avoid copying the data to a local array (here or
as a separate patch).

>  	LOG(Debayer, Debug) << " ccmUniformDataIn_ " << ccmUniformDataIn_ << " data " << params.combinedMatrix;
>

Patch
diff mbox series

diff --git a/src/libcamera/shaders/bayer_1x_packed.frag b/src/libcamera/shaders/bayer_1x_packed.frag
index 9a1992e219dd066945b3f46ec509f47a31590385..57f0be9b264bdbdf523291dc9b35ab5fd6c9ef6a 100644
--- a/src/libcamera/shaders/bayer_1x_packed.frag
+++ b/src/libcamera/shaders/bayer_1x_packed.frag
@@ -231,49 +231,8 @@  void main(void)
 	/* Apply AWB gains, and saturate each channel at sensor range */
 	rgb = clamp(rgb * awb, vec3(0.0), vec3(1.0) - blacklevel);
 
-	/*
-	 *   CCM is a 3x3 in the format
-	 *
-	 *   +--------------+----------------+---------------+
-	 *   | RedRedGain   | RedGreenGain   | RedBlueGain   |
-	 *   +--------------+----------------+---------------+
-	 *   | GreenRedGain | GreenGreenGain | GreenBlueGain |
-	 *   +--------------+----------------+---------------+
-	 *   | BlueRedGain  |  BlueGreenGain | BlueBlueGain  |
-	 *   +--------------+----------------+---------------+
-	 *
-	 *   Rout = RedRedGain * Rin + RedGreenGain * Gin + RedBlueGain * Bin
-	 *   Gout = GreenRedGain * Rin + GreenGreenGain * Gin + GreenBlueGain * Bin
-	 *   Bout = BlueRedGain * Rin + BlueGreenGain * Gin + BlueBlueGain * Bin
-	 *
-	 *   We upload to the GPU without transposition glUniformMatrix3f(.., .., GL_FALSE, ccm);
-	 *
-	 *   CPU
-	 *   float ccm [] = {
-	 *             RedRedGain,   RedGreenGain,   RedBlueGain,
-	 *             GreenRedGain, GreenGreenGain, GreenBlueGain,
-	 *             BlueRedGain,  BlueGreenGain,  BlueBlueGain,
-	 *   };
-	 *
-	 *   GPU
-	 *   ccm = {
-	 *             RedRedGain,   GreenRedGain,   BlueRedGain,
-	 *             RedGreenGain, GreenGreenGain, BlueGreenGain,
-	 *             RedBlueGain,  GreenBlueGain,  BlueBlueGain,
-	 *   }
-	 *
-	 *   However the indexing for the mat data-type is column major hence
-	 *   ccm[0][0] = RedRedGain, ccm[0][1] = RedGreenGain, ccm[0][2] = RedBlueGain
-	 *
-	 */
-	float rin, gin, bin;
-	rin = rgb.r;
-	gin = rgb.g;
-	bin = rgb.b;
-
-	rgb.r = (rin * ccm[0][0]) + (gin * ccm[0][1]) + (bin * ccm[0][2]);
-	rgb.g = (rin * ccm[1][0]) + (gin * ccm[1][1]) + (bin * ccm[1][2]);
-	rgb.b = (rin * ccm[2][0]) + (gin * ccm[2][1]) + (bin * ccm[2][2]);
+	/* Row major Colour Correction Matrix multiplication */
+	rgb = ccm * rgb;
 
 	/*
 	 * Contrast
diff --git a/src/libcamera/shaders/bayer_unpacked.frag b/src/libcamera/shaders/bayer_unpacked.frag
index 87e4e4915fe19679943fdcc3d213a0224b89065e..52de0a7901cfac33dbc7b306f1840ec0048cd8ea 100644
--- a/src/libcamera/shaders/bayer_unpacked.frag
+++ b/src/libcamera/shaders/bayer_unpacked.frag
@@ -134,49 +134,9 @@  void main(void) {
 	/* Apply AWB gains, and saturate each channel at sensor range */
 	rgb = clamp(rgb * awb, vec3(0.0), vec3(1.0) - blacklevel);
 
-	/*
-	 *   CCM is a 3x3 in the format
-	 *
-	 *   +--------------+----------------+---------------+
-	 *   | RedRedGain   | RedGreenGain   | RedBlueGain   |
-	 *   +--------------+----------------+---------------+
-	 *   | GreenRedGain | GreenGreenGain | GreenBlueGain |
-	 *   +--------------+----------------+---------------+
-	 *   | BlueRedGain  |  BlueGreenGain | BlueBlueGain  |
-	 *   +--------------+----------------+---------------+
-	 *
-	 *   Rout = RedRedGain * Rin + RedGreenGain * Gin + RedBlueGain * Bin
-	 *   Gout = GreenRedGain * Rin + GreenGreenGain * Gin + GreenBlueGain * Bin
-	 *   Bout = BlueRedGain * Rin + BlueGreenGain * Gin + BlueBlueGain * Bin
-	 *
-	 *   We upload to the GPU without transposition glUniformMatrix3f(.., .., GL_FALSE, ccm);
-	 *
-	 *   CPU
-	 *   float ccm [] = {
-	 *             RedRedGain,   RedGreenGain,   RedBlueGain,
-	 *             GreenRedGain, GreenGreenGain, GreenBlueGain,
-	 *             BlueRedGain,  BlueGreenGain,  BlueBlueGain,
-	 *   };
-	 *
-	 *   GPU
-	 *   ccm = {
-	 *             RedRedGain,   GreenRedGain,   BlueRedGain,
-	 *             RedGreenGain, GreenGreenGain, BlueGreenGain,
-	 *             RedBlueGain,  GreenBlueGain,  BlueBlueGain,
-	 *   }
-	 *
-	 *   However the indexing for the mat data-type is column major hence
-	 *   ccm[0][0] = RedRedGain, ccm[0][1] = RedGreenGain, ccm[0][2] = RedBlueGain
-	 *
-	 */
-	float rin, gin, bin;
-	rin = rgb.r;
-	gin = rgb.g;
-	bin = rgb.b;
 
-	rgb.r = (rin * ccm[0][0]) + (gin * ccm[0][1]) + (bin * ccm[0][2]);
-	rgb.g = (rin * ccm[1][0]) + (gin * ccm[1][1]) + (bin * ccm[1][2]);
-	rgb.b = (rin * ccm[2][0]) + (gin * ccm[2][1]) + (bin * ccm[2][2]);
+	/* Row major Colour Correction Matrix multiplication */
+	rgb = ccm * rgb;
 
 	/*
 	 * Contrast
diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
index 738036e649224f370bdf9a0cc59b399f8b1066de..1022f595cc71751cd0c9133b3e5755f591224dc4 100644
--- a/src/libcamera/software_isp/debayer_egl.cpp
+++ b/src/libcamera/software_isp/debayer_egl.cpp
@@ -464,15 +464,9 @@  void DebayerEGL::setShaderVariableValues(const DebayerParams &params)
 			    << " textureUniformProjMatrix_ " << textureUniformProjMatrix_;
 
 	GLfloat ccm[9] = {
-		params.combinedMatrix[0][0],
-		params.combinedMatrix[0][1],
-		params.combinedMatrix[0][2],
-		params.combinedMatrix[1][0],
-		params.combinedMatrix[1][1],
-		params.combinedMatrix[1][2],
-		params.combinedMatrix[2][0],
-		params.combinedMatrix[2][1],
-		params.combinedMatrix[2][2],
+		params.combinedMatrix[0][0], params.combinedMatrix[1][0], params.combinedMatrix[2][0],
+		params.combinedMatrix[0][1], params.combinedMatrix[1][1], params.combinedMatrix[2][1],
+		params.combinedMatrix[0][2], params.combinedMatrix[1][2], params.combinedMatrix[2][2],
 	};
 	glUniformMatrix3fv(ccmUniformDataIn_, 1, GL_FALSE, ccm);
 	LOG(Debayer, Debug) << " ccmUniformDataIn_ " << ccmUniformDataIn_ << " data " << params.combinedMatrix;