[v3,1/1] libcamera: software_isp: Reduce statistics image area
diff mbox series

Message ID 20250813170240.50173-2-mzamazal@redhat.com
State New
Headers show
Series
  • Reduce statistics image area for GPU
Related show

Commit Message

Milan Zamazal Aug. 13, 2025, 5:02 p.m. UTC
The statistics in software ISP is computed basically over the whole
image area, although only on part of the pixels.  It is not necessary to
cover the whole image area, it's sufficient to compute the statistics
let's say over the central area of 2/3 of the image width and height,
which should be both sufficient and faster.  The speedup is not that
important with the CPU implementation but it may save CPU work
more noticeably with GPU debayering implementation.

Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
---
 src/libcamera/software_isp/debayer_cpu.cpp | 9 +++++++--
 src/libcamera/software_isp/debayer_egl.cpp | 7 +++++--
 2 files changed, 12 insertions(+), 4 deletions(-)

Comments

Kieran Bingham Aug. 13, 2025, 5:36 p.m. UTC | #1
Quoting Milan Zamazal (2025-08-13 18:02:40)
> The statistics in software ISP is computed basically over the whole
> image area, although only on part of the pixels.  It is not necessary to
> cover the whole image area, it's sufficient to compute the statistics
> let's say over the central area of 2/3 of the image width and height,
> which should be both sufficient and faster.  The speedup is not that
> important with the CPU implementation but it may save CPU work
> more noticeably with GPU debayering implementation.
> 
> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>

Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

> ---
>  src/libcamera/software_isp/debayer_cpu.cpp | 9 +++++++--
>  src/libcamera/software_isp/debayer_egl.cpp | 7 +++++--
>  2 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
> index e56492848..a7d6e4fd8 100644
> --- a/src/libcamera/software_isp/debayer_cpu.cpp
> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
> @@ -526,8 +526,13 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
>         window_.width = outputCfg.size.width;
>         window_.height = outputCfg.size.height;
>  
> -       /* Don't pass x,y since process() already adjusts src before passing it */
> -       stats_->setWindow(Rectangle(window_.size()));
> +       /*
> +        * Don't pass x,y from window_ since process() already adjusts for it.
> +        * But crop the window to 2/3 of its width and height for speedup.
> +        * The speedup is more important with GPU than with CPU ISP; we want the
> +        * same implementation on both.
> +        */
> +       stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
>  
>         /* pad with patternSize.Width on both left and right side */
>         lineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;
> diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
> index 9ec966608..057ad1c38 100644
> --- a/src/libcamera/software_isp/debayer_egl.cpp
> +++ b/src/libcamera/software_isp/debayer_egl.cpp
> @@ -393,8 +393,11 @@ int DebayerEGL::configure(const StreamConfiguration &inputCfg,
>         window_.width = outputCfg.size.width;
>         window_.height = outputCfg.size.height;
>  
> -       /* Don't pass x,y since process() already adjusts src before passing it */
> -       stats_->setWindow(Rectangle(window_.size()));
> +       /*
> +        * Don't pass x,y from window_ since process() already adjusts for it.
> +        * But crop the window to 2/3 of its width and height for speedup.
> +        */
> +       stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
>  
>         LOG(Debayer, Debug) << "Input width " << inputCfg.size.width << " height " << inputCfg.size.height;
>         LOG(Debayer, Debug) << "Output width " << outputCfg.size.width << " height " << outputCfg.size.height;
> -- 
> 2.50.1
>
Kieran Bingham Aug. 14, 2025, 10:24 a.m. UTC | #2
Hans, Bryan,

Quoting Kieran Bingham (2025-08-13 18:36:21)
> Quoting Milan Zamazal (2025-08-13 18:02:40)
> > The statistics in software ISP is computed basically over the whole
> > image area, although only on part of the pixels.  It is not necessary to
> > cover the whole image area, it's sufficient to compute the statistics
> > let's say over the central area of 2/3 of the image width and height,
> > which should be both sufficient and faster.  The speedup is not that
> > important with the CPU implementation but it may save CPU work
> > more noticeably with GPU debayering implementation.
> > 
> > Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
> 

Could you test/review this please?

If everyone's happy in softisp I can merge this.

-
Kieran

> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> > ---
> >  src/libcamera/software_isp/debayer_cpu.cpp | 9 +++++++--
> >  src/libcamera/software_isp/debayer_egl.cpp | 7 +++++--
> >  2 files changed, 12 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
> > index e56492848..a7d6e4fd8 100644
> > --- a/src/libcamera/software_isp/debayer_cpu.cpp
> > +++ b/src/libcamera/software_isp/debayer_cpu.cpp
> > @@ -526,8 +526,13 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
> >         window_.width = outputCfg.size.width;
> >         window_.height = outputCfg.size.height;
> >  
> > -       /* Don't pass x,y since process() already adjusts src before passing it */
> > -       stats_->setWindow(Rectangle(window_.size()));
> > +       /*
> > +        * Don't pass x,y from window_ since process() already adjusts for it.
> > +        * But crop the window to 2/3 of its width and height for speedup.
> > +        * The speedup is more important with GPU than with CPU ISP; we want the
> > +        * same implementation on both.
> > +        */
> > +       stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
> >  
> >         /* pad with patternSize.Width on both left and right side */
> >         lineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;
> > diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
> > index 9ec966608..057ad1c38 100644
> > --- a/src/libcamera/software_isp/debayer_egl.cpp
> > +++ b/src/libcamera/software_isp/debayer_egl.cpp
> > @@ -393,8 +393,11 @@ int DebayerEGL::configure(const StreamConfiguration &inputCfg,
> >         window_.width = outputCfg.size.width;
> >         window_.height = outputCfg.size.height;
> >  
> > -       /* Don't pass x,y since process() already adjusts src before passing it */
> > -       stats_->setWindow(Rectangle(window_.size()));
> > +       /*
> > +        * Don't pass x,y from window_ since process() already adjusts for it.
> > +        * But crop the window to 2/3 of its width and height for speedup.
> > +        */
> > +       stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
> >  
> >         LOG(Debayer, Debug) << "Input width " << inputCfg.size.width << " height " << inputCfg.size.height;
> >         LOG(Debayer, Debug) << "Output width " << outputCfg.size.width << " height " << outputCfg.size.height;
> > -- 
> > 2.50.1
> >
Bryan O'Donoghue Aug. 14, 2025, 11:12 a.m. UTC | #3
On 14/08/2025 11:24, Kieran Bingham wrote:
> Hans, Bryan,
> 
> Quoting Kieran Bingham (2025-08-13 18:36:21)
>> Quoting Milan Zamazal (2025-08-13 18:02:40)
>>> The statistics in software ISP is computed basically over the whole
>>> image area, although only on part of the pixels.  It is not necessary to
>>> cover the whole image area, it's sufficient to compute the statistics
>>> let's say over the central area of 2/3 of the image width and height,
>>> which should be both sufficient and faster.  The speedup is not that
>>> important with the CPU implementation but it may save CPU work
>>> more noticeably with GPU debayering implementation.
>>>
>>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>>
> 
> Could you test/review this please?
> 
> If everyone's happy in softisp I can merge this.
You'll have to cut out the egl part as its not upstream yet but, 
applying to the GPUISP branch it seems to work well.

CPU fps is about the same, gpu FPS hits line rate and keep in mind this 
is on a debug build too.

Also seems to improve the flickering bug I've been seeing because the 
area of interest is the center of the image instead of the edges.

Before:
https://drive.google.com/file/d/19JGJSCjTI-lcu_zcs1qmu7K5Y9aktEoG/view?usp=drive_link

After:
https://drive.google.com/file/d/1A_niOE6cKC97N3IkvH8iw4JB9qyjGj5l/view?usp=sharing

Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Bryan O'Donoghue Aug. 14, 2025, 11:16 a.m. UTC | #4
On 14/08/2025 12:12, Bryan O'Donoghue wrote:
> On 14/08/2025 11:24, Kieran Bingham wrote:
>> Hans, Bryan,
>>
>> Quoting Kieran Bingham (2025-08-13 18:36:21)
>>> Quoting Milan Zamazal (2025-08-13 18:02:40)
>>>> The statistics in software ISP is computed basically over the whole
>>>> image area, although only on part of the pixels.  It is not 
>>>> necessary to
>>>> cover the whole image area, it's sufficient to compute the statistics
>>>> let's say over the central area of 2/3 of the image width and height,
>>>> which should be both sufficient and faster.  The speedup is not that
>>>> important with the CPU implementation but it may save CPU work
>>>> more noticeably with GPU debayering implementation.
>>>>
>>>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>>>
>>
>> Could you test/review this please?
>>
>> If everyone's happy in softisp I can merge this.
> You'll have to cut out the egl part as its not upstream yet but, 
> applying to the GPUISP branch it seems to work well.
> 
> CPU fps is about the same, gpu FPS hits line rate and keep in mind this 
> is on a debug build too.
> 
> Also seems to improve the flickering bug I've been seeing because the 
> area of interest is the center of the image instead of the edges.
> 
> Before:
> https://drive.google.com/file/d/19JGJSCjTI-lcu_zcs1qmu7K5Y9aktEoG/view? 
> usp=drive_link
> 
> After:
> https://drive.google.com/file/d/1A_niOE6cKC97N3IkvH8iw4JB9qyjGj5l/view? 
> usp=sharing
> 
> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>

https://gitlab.freedesktop.org/camera/libcamera-softisp/-/tree/origin-master+whole-frame-swtats-v2-gpuisp-v1n?ref_type=heads

---
bod

Patch
diff mbox series

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index e56492848..a7d6e4fd8 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -526,8 +526,13 @@  int DebayerCpu::configure(const StreamConfiguration &inputCfg,
 	window_.width = outputCfg.size.width;
 	window_.height = outputCfg.size.height;
 
-	/* Don't pass x,y since process() already adjusts src before passing it */
-	stats_->setWindow(Rectangle(window_.size()));
+	/*
+	 * Don't pass x,y from window_ since process() already adjusts for it.
+	 * But crop the window to 2/3 of its width and height for speedup.
+	 * The speedup is more important with GPU than with CPU ISP; we want the
+	 * same implementation on both.
+	 */
+	stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
 
 	/* pad with patternSize.Width on both left and right side */
 	lineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;
diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
index 9ec966608..057ad1c38 100644
--- a/src/libcamera/software_isp/debayer_egl.cpp
+++ b/src/libcamera/software_isp/debayer_egl.cpp
@@ -393,8 +393,11 @@  int DebayerEGL::configure(const StreamConfiguration &inputCfg,
 	window_.width = outputCfg.size.width;
 	window_.height = outputCfg.size.height;
 
-	/* Don't pass x,y since process() already adjusts src before passing it */
-	stats_->setWindow(Rectangle(window_.size()));
+	/*
+	 * Don't pass x,y from window_ since process() already adjusts for it.
+	 * But crop the window to 2/3 of its width and height for speedup.
+	 */
+	stats_->setWindow((window_.size() * 2 / 3).centeredTo(window_.center()));
 
 	LOG(Debayer, Debug) << "Input width " << inputCfg.size.width << " height " << inputCfg.size.height;
 	LOG(Debayer, Debug) << "Output width " << outputCfg.size.width << " height " << outputCfg.size.height;