[v2,6/6] libcamera: software_isp: Run sw-statistics once every 4th frame

Message ID 20250925221708.7471-7-hansg@kernel.org
State Superseded
Series
  • ipa: software_isp: AGC: Fix AGC oscillation bug

Commit Message

Hans de Goede Sept. 25, 2025, 10:17 p.m. UTC
Run sw-statistics once every 4th frame, instead of every frame. There are
2 reasons for this:

1. There really is no need to have statistics for every frame and only
doing this every 4th frame helps save some CPU time.

2. The generic nature of the simple pipeline-handler, so no information
about possible CSI receiver frame-delays. In combination with the software
ISP often being used with sensors without sensor info in the sensor-helper
code, so no reliable control-delay information means that the software ISP
is prone to AGC oscillation. Skipping statistics gathering also means
skipping running the AGC algorithm slowing it down, avoiding this
oscillation.

Note ideally the AGC oscillation problem would be fixed by adding sensor
metadata support all through the stack so that the exact gain and exposure
used for a specific frame are reliably provided by the sensor metadata.

Signed-off-by: Hans de Goede <hansg@kernel.org>
---
 src/libcamera/software_isp/debayer_cpu.cpp | 25 +++++++++++++---------
 src/libcamera/software_isp/debayer_cpu.h   |  4 ++--
 src/libcamera/software_isp/swstats_cpu.cpp |  5 +++++
 src/libcamera/software_isp/swstats_cpu.h   |  3 +++
 4 files changed, 25 insertions(+), 12 deletions(-)

Comments

Milan Zamazal Sept. 26, 2025, 1:33 p.m. UTC | #1
Hi Hans,

thank you for the patch.

Hans de Goede <hansg@kernel.org> writes:

> Run sw-statistics once every 4th frame, 

I still miss any explanation why exactly every 4th frame.

> instead of every frame. There are 2 reasons for this:
>
> 1. There really is no need to have statistics for every frame 

Well, depends on fps, I'd say.  Arguably, with slow fps rates,
everything is and can be slow, but still.

> and only doing this every 4th frame helps save some CPU time.
>
> 2. The generic nature of the simple pipeline-handler, so no information
> about possible CSI receiver frame-delays. In combination with the software
> ISP often being used with sensors without sensor info in the sensor-helper
> code, so no reliable control-delay information means that the software ISP
> is prone to AGC oscillation. Skipping statistics gathering also means
> skipping running the AGC algorithm slowing it down, avoiding this
> oscillation.

The last sentence is not very clear to me.
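
One possible reading of it, as a toy simulation rather than libcamera code: the
brightness measured on a frame still reflects a gain set some frames earlier,
but the AGC step corrects as if it reflected the latest gain. Everything below
(`simulateAgc`, `sceneLuma`, `target`, a delay of 2 frames) is made up for
illustration; the real delays are exactly what the commit message says is not
known.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/*
 * Toy model, not libcamera code: the brightness measured on frame t reflects
 * the gain that was in effect `delay` frames earlier, but the controller
 * assumes it reflects the gain it set most recently.
 *
 * Returns the gain in effect at the start of each frame.
 */
std::vector<double> simulateAgc(unsigned int interval, unsigned int delay,
				unsigned int frames)
{
	assert(interval > 0);

	const double sceneLuma = 50.0;	/* hypothetical brightness at gain 1.0 */
	const double target = 100.0;	/* hypothetical AGC target, ideal gain is 2.0 */

	double gain = 1.0;
	std::vector<double> applied;

	for (unsigned int t = 0; t < frames; t++) {
		applied.push_back(gain);

		/* Skipped frame: no statistics, so no AGC step either. */
		if (t % interval != 0)
			continue;

		/* The image still shows an older gain value. */
		double effective = applied[t >= delay ? t - delay : 0];
		double measured = sceneLuma * effective;

		/* Naive correction that ignores the delay. */
		gain *= target / measured;
	}

	return applied;
}
```

In this model, `simulateAgc(1, 2, 24)` yields gains 1, 2, 4, 8, 8, 4, 1, 0.25,
... and never settles, while `simulateAgc(4, 2, 24)` reaches 2.0 after the
first step and stays there, because every AGC step then sees a frame taken
with the gain it last set. That only holds while the interval is larger than
the real delay, which is presumably where the question about "why 4" comes in.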

> Note ideally the AGC oscillation problem would be fixed by adding sensor
> metadata support all through the stack so that the exact gain and exposure
> used for a specific frame are reliably provided by the sensor metadata.
>
> Signed-off-by: Hans de Goede <hansg@kernel.org>
> ---
>  src/libcamera/software_isp/debayer_cpu.cpp | 25 +++++++++++++---------
>  src/libcamera/software_isp/debayer_cpu.h   |  4 ++--
>  src/libcamera/software_isp/swstats_cpu.cpp |  5 +++++
>  src/libcamera/software_isp/swstats_cpu.h   |  3 +++
>  4 files changed, 25 insertions(+), 12 deletions(-)
>
> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
> index bfa60888..9010333e 100644
> --- a/src/libcamera/software_isp/debayer_cpu.cpp
> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
> @@ -655,7 +655,7 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
>  	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
>  }
>  
> -void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
> +void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
>  {
>  	unsigned int yEnd = window_.y + window_.height;
>  	/* Holds [0] previous- [1] current- [2] next-line */
> @@ -681,7 +681,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>  	for (unsigned int y = window_.y; y < yEnd; y += 2) {
>  		shiftLinePointers(linePointers, src);
>  		memcpyNextLine(linePointers);
> -		stats_->processLine0(y, linePointers);
> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
> +			stats_->processLine0(y, linePointers);
>  		(this->*debayer0_)(dst, linePointers);
>  		src += inputConfig_.stride;
>  		dst += outputConfig_.stride;
> @@ -696,7 +697,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>  	if (window_.y == 0) {
>  		shiftLinePointers(linePointers, src);
>  		memcpyNextLine(linePointers);
> -		stats_->processLine0(yEnd, linePointers);
> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
> +			stats_->processLine0(yEnd, linePointers);
>  		(this->*debayer0_)(dst, linePointers);
>  		src += inputConfig_.stride;
>  		dst += outputConfig_.stride;
> @@ -710,7 +712,7 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>  	}
>  }
>  
> -void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
> +void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
>  {
>  	const unsigned int yEnd = window_.y + window_.height;
>  	/*
> @@ -733,7 +735,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>  	for (unsigned int y = window_.y; y < yEnd; y += 4) {
>  		shiftLinePointers(linePointers, src);
>  		memcpyNextLine(linePointers);
> -		stats_->processLine0(y, linePointers);
> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
> +			stats_->processLine0(y, linePointers);
>  		(this->*debayer0_)(dst, linePointers);
>  		src += inputConfig_.stride;
>  		dst += outputConfig_.stride;
> @@ -746,7 +749,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>  
>  		shiftLinePointers(linePointers, src);
>  		memcpyNextLine(linePointers);
> -		stats_->processLine2(y, linePointers);
> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
> +			stats_->processLine2(y, linePointers);
>  		(this->*debayer2_)(dst, linePointers);
>  		src += inputConfig_.stride;
>  		dst += outputConfig_.stride;
> @@ -821,12 +825,13 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>  		return;
>  	}
>  
> -	stats_->startFrame();
> +	if (frame % SwStatsCpu::kStatPerNumFrames == 0)
> +		stats_->startFrame();
>  
>  	if (inputConfig_.patternSize.height == 2)
> -		process2(in.planes()[0].data(), out.planes()[0].data());
> +		process2(frame, in.planes()[0].data(), out.planes()[0].data());
>  	else
> -		process4(in.planes()[0].data(), out.planes()[0].data());
> +		process4(frame, in.planes()[0].data(), out.planes()[0].data());
>  
>  	metadata.planes()[0].bytesused = out.planes()[0].size();
>  
> @@ -851,7 +856,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>  	 *
>  	 * \todo Pass real bufferId once stats buffer passing is changed.
>  	 */
> -	stats_->finishFrame(frame, 0, true);
> +	stats_->finishFrame(frame, 0, frame % SwStatsCpu::kStatPerNumFrames == 0);

This leads to a crash in my environment, after 4 frames:

cam: ../include/libcamera/controls.h:188: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::__cxx11::basic_string<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.

Thread 5 "cam" received signal SIGABRT, Aborted.
[...]
#5  0x0000fffff4b99324 in libcamera::ControlValue::get<int, decltype(nullptr)> (this=<optimized out>) at ../include/libcamera/controls.h:188
#6  libcamera::ipa::soft::IPASoftSimple::processStats (this=0xfffff0009f40, frame=4, bufferId=<optimized out>, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
[...]

This omits IPASoftSimple::processStats, which I don't think is optional,
it does (somewhat misleadingly) more than just stats processing.

>  	outputBufferReady.emit(output);
>  	inputBufferReady.emit(input);
>  }
> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
> index 9d343e46..03e0d784 100644
> --- a/src/libcamera/software_isp/debayer_cpu.h
> +++ b/src/libcamera/software_isp/debayer_cpu.h
> @@ -133,8 +133,8 @@ private:
>  	void setupInputMemcpy(const uint8_t *linePointers[]);
>  	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
>  	void memcpyNextLine(const uint8_t *linePointers[]);
> -	void process2(const uint8_t *src, uint8_t *dst);
> -	void process4(const uint8_t *src, uint8_t *dst);
> +	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
> +	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
>  
>  	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
>  	static constexpr unsigned int kMaxLineBuffers = 5;
> diff --git a/src/libcamera/software_isp/swstats_cpu.cpp b/src/libcamera/software_isp/swstats_cpu.cpp
> index da91f912..35ba0a46 100644
> --- a/src/libcamera/software_isp/swstats_cpu.cpp
> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
> @@ -89,6 +89,11 @@ namespace libcamera {
>   * \brief Signals that the statistics are ready
>   */
>  
> +/**
> + * \var SwStatsCpu::kStatPerNumFrames
> + * \brief Run stats once every kStatPerNumFrames frames
> + */
> +
>  /**
>   * \typedef SwStatsCpu::statsProcessFn
>   * \brief Called when there is data to get statistics from
> diff --git a/src/libcamera/software_isp/swstats_cpu.h b/src/libcamera/software_isp/swstats_cpu.h
> index 6ac3c4de..ea0e6d5a 100644
> --- a/src/libcamera/software_isp/swstats_cpu.h
> +++ b/src/libcamera/software_isp/swstats_cpu.h
> @@ -32,6 +32,9 @@ public:
>  	SwStatsCpu();
>  	~SwStatsCpu() = default;
>  
> +	/* Run stats once every 4 frames */
> +	static constexpr uint32_t kStatPerNumFrames = 4;
> +
>  	bool isValid() const { return sharedStats_.fd().isValid(); }
>  
>  	const SharedFD &getStatsFD() { return sharedStats_.fd(); }
Barnabás Pőcze Sept. 26, 2025, 2:21 p.m. UTC | #2
Hi

On 2025. 09. 26. at 15:33, Milan Zamazal wrote:
> Hi Hans,
> 
> thank you for the patch.
> 
> Hans de Goede <hansg@kernel.org> writes:
> 
>> Run sw-statistics once every 4th frame,
> 
> I still miss any explanation why exactly every 4th frame.
> 
>> instead of every frame. There are 2 reasons for this:
>>
>> 1. There really is no need to have statistics for every frame
> 
> Well, depends on fps, I'd say.  Arguably, with slow fps rates,
> everything is and can be slow, but still.
> 
>> and only doing this every 4th frame helps save some CPU time.
>>
>> 2. The generic nature of the simple pipeline-handler, so no information
>> about possible CSI receiver frame-delays. In combination with the software
>> ISP often being used with sensors without sensor info in the sensor-helper
>> code, so no reliable control-delay information means that the software ISP
>> is prone to AGC oscillation. Skipping statistics gathering also means
>> skipping running the AGC algorithm slowing it down, avoiding this
>> oscillation.
> 
> The last sentence is not very clear to me.
> 
>> Note ideally the AGC oscillation problem would be fixed by adding sensor
>> metadata support all through the stack so that the exact gain and exposure
>> used for a specific frame are reliably provided by the sensor metadata.
>>
>> Signed-off-by: Hans de Goede <hansg@kernel.org>
>> ---
>>   src/libcamera/software_isp/debayer_cpu.cpp | 25 +++++++++++++---------
>>   src/libcamera/software_isp/debayer_cpu.h   |  4 ++--
>>   src/libcamera/software_isp/swstats_cpu.cpp |  5 +++++
>>   src/libcamera/software_isp/swstats_cpu.h   |  3 +++
>>   4 files changed, 25 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
>> index bfa60888..9010333e 100644
>> --- a/src/libcamera/software_isp/debayer_cpu.cpp
>> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
>> @@ -655,7 +655,7 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
>>   	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
>>   }
>>   
>> -void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>> +void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
>>   {
>>   	unsigned int yEnd = window_.y + window_.height;
>>   	/* Holds [0] previous- [1] current- [2] next-line */
>> @@ -681,7 +681,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>   	for (unsigned int y = window_.y; y < yEnd; y += 2) {
>>   		shiftLinePointers(linePointers, src);
>>   		memcpyNextLine(linePointers);
>> -		stats_->processLine0(y, linePointers);
>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>> +			stats_->processLine0(y, linePointers);
>>   		(this->*debayer0_)(dst, linePointers);
>>   		src += inputConfig_.stride;
>>   		dst += outputConfig_.stride;
>> @@ -696,7 +697,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>   	if (window_.y == 0) {
>>   		shiftLinePointers(linePointers, src);
>>   		memcpyNextLine(linePointers);
>> -		stats_->processLine0(yEnd, linePointers);
>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>> +			stats_->processLine0(yEnd, linePointers);
>>   		(this->*debayer0_)(dst, linePointers);
>>   		src += inputConfig_.stride;
>>   		dst += outputConfig_.stride;
>> @@ -710,7 +712,7 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>   	}
>>   }
>>   
>> -void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>> +void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
>>   {
>>   	const unsigned int yEnd = window_.y + window_.height;
>>   	/*
>> @@ -733,7 +735,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>   	for (unsigned int y = window_.y; y < yEnd; y += 4) {
>>   		shiftLinePointers(linePointers, src);
>>   		memcpyNextLine(linePointers);
>> -		stats_->processLine0(y, linePointers);
>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>> +			stats_->processLine0(y, linePointers);
>>   		(this->*debayer0_)(dst, linePointers);
>>   		src += inputConfig_.stride;
>>   		dst += outputConfig_.stride;
>> @@ -746,7 +749,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>   
>>   		shiftLinePointers(linePointers, src);
>>   		memcpyNextLine(linePointers);
>> -		stats_->processLine2(y, linePointers);
>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>> +			stats_->processLine2(y, linePointers);
>>   		(this->*debayer2_)(dst, linePointers);
>>   		src += inputConfig_.stride;
>>   		dst += outputConfig_.stride;
>> @@ -821,12 +825,13 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>>   		return;
>>   	}
>>   
>> -	stats_->startFrame();
>> +	if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>> +		stats_->startFrame();
>>   
>>   	if (inputConfig_.patternSize.height == 2)
>> -		process2(in.planes()[0].data(), out.planes()[0].data());
>> +		process2(frame, in.planes()[0].data(), out.planes()[0].data());
>>   	else
>> -		process4(in.planes()[0].data(), out.planes()[0].data());
>> +		process4(frame, in.planes()[0].data(), out.planes()[0].data());
>>   
>>   	metadata.planes()[0].bytesused = out.planes()[0].size();
>>   
>> @@ -851,7 +856,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>>   	 *
>>   	 * \todo Pass real bufferId once stats buffer passing is changed.
>>   	 */
>> -	stats_->finishFrame(frame, 0, true);
>> +	stats_->finishFrame(frame, 0, frame % SwStatsCpu::kStatPerNumFrames == 0);
> 
> This leads to a crash in my environment, after 4 frames:
> 
> cam: ../include/libcamera/controls.h:188: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::__cxx11::basic_string<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
> 
> Thread 5 "cam" received signal SIGABRT, Aborted.
> [...]
> #5  0x0000fffff4b99324 in libcamera::ControlValue::get<int, decltype(nullptr)> (this=<optimized out>) at ../include/libcamera/controls.h:188
> #6  libcamera::ipa::soft::IPASoftSimple::processStats (this=0xfffff0009f40, frame=4, bufferId=<optimized out>, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
> [...]
> 
> This omits IPASoftSimple::processStats, which I don't think is optional,
> it does (somewhat misleadingly) more than just stats processing.

Is this assertion failure reliably reproducible with these changes?
I cannot see it with an ipu6 laptop (running `cam -c1 -C32` repeatedly).

As far as I can tell `SwStatsCpu::finishFrame()` will emit the same signal
regardless, so I am not sure if the issue is here. I think it's that
`IPASoftSimple::processStats()` expects `sensorControls` to have
`V4L2_CID_{EXPOSURE,ANALOGUE_GAIN}` present, but at least one of them is not present.

There is even a "sanity check" in the function but it's too late...
that should be fixed at least.
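
A minimal sketch of what "check first, read afterwards" could look like. This
does not use the libcamera classes: `FakeControls`, `kFakeExposure`,
`kFakeAnalogueGain` and `readSensorControls` are hypothetical stand-ins, with an
empty `std::optional` modelling a control whose value was never set.

```cpp
#include <cstdint>
#include <map>
#include <optional>

/*
 * Hypothetical stand-in for a control list: id -> value. An empty optional
 * models a control that is present but whose value was never set.
 */
using FakeControls = std::map<uint32_t, std::optional<int32_t>>;

/* Hypothetical ids for illustration; the real code uses the V4L2 control ids. */
constexpr uint32_t kFakeExposure = 1;
constexpr uint32_t kFakeAnalogueGain = 2;

struct ExposureAndGain {
	int32_t exposure;
	int32_t gain;
};

/* Validate both controls before reading either one. */
std::optional<ExposureAndGain> readSensorControls(const FakeControls &ctrls)
{
	auto exposure = ctrls.find(kFakeExposure);
	auto gain = ctrls.find(kFakeAnalogueGain);

	/* Key missing altogether. */
	if (exposure == ctrls.end() || gain == ctrls.end())
		return std::nullopt;

	/* Key present, but no value behind it. */
	if (!exposure->second || !gain->second)
		return std::nullopt;

	return ExposureAndGain{ *exposure->second, *gain->second };
}
```

The caller can then log an error and return early when nothing comes back,
instead of tripping the assertion inside the typed getter.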


Regards,
Barnabás Pőcze


> 
>>   	outputBufferReady.emit(output);
>>   	inputBufferReady.emit(input);
>>   }
>> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
>> index 9d343e46..03e0d784 100644
>> --- a/src/libcamera/software_isp/debayer_cpu.h
>> +++ b/src/libcamera/software_isp/debayer_cpu.h
>> @@ -133,8 +133,8 @@ private:
>>   	void setupInputMemcpy(const uint8_t *linePointers[]);
>>   	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
>>   	void memcpyNextLine(const uint8_t *linePointers[]);
>> -	void process2(const uint8_t *src, uint8_t *dst);
>> -	void process4(const uint8_t *src, uint8_t *dst);
>> +	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
>> +	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
>>   
>>   	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
>>   	static constexpr unsigned int kMaxLineBuffers = 5;
>> diff --git a/src/libcamera/software_isp/swstats_cpu.cpp b/src/libcamera/software_isp/swstats_cpu.cpp
>> index da91f912..35ba0a46 100644
>> --- a/src/libcamera/software_isp/swstats_cpu.cpp
>> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
>> @@ -89,6 +89,11 @@ namespace libcamera {
>>    * \brief Signals that the statistics are ready
>>    */
>>   
>> +/**
>> + * \var SwStatsCpu::kStatPerNumFrames
>> + * \brief Run stats once every kStatPerNumFrames frames
>> + */
>> +
>>   /**
>>    * \typedef SwStatsCpu::statsProcessFn
>>    * \brief Called when there is data to get statistics from
>> diff --git a/src/libcamera/software_isp/swstats_cpu.h b/src/libcamera/software_isp/swstats_cpu.h
>> index 6ac3c4de..ea0e6d5a 100644
>> --- a/src/libcamera/software_isp/swstats_cpu.h
>> +++ b/src/libcamera/software_isp/swstats_cpu.h
>> @@ -32,6 +32,9 @@ public:
>>   	SwStatsCpu();
>>   	~SwStatsCpu() = default;
>>   
>> +	/* Run stats once every 4 frames */
>> +	static constexpr uint32_t kStatPerNumFrames = 4;
>> +
>>   	bool isValid() const { return sharedStats_.fd().isValid(); }
>>   
>>   	const SharedFD &getStatsFD() { return sharedStats_.fd(); }
>
Milan Zamazal Sept. 26, 2025, 7:02 p.m. UTC | #3
Barnabás Pőcze <barnabas.pocze@ideasonboard.com> writes:

> Hi
>
> On 2025. 09. 26. at 15:33, Milan Zamazal wrote:
>> Hi Hans,
>> thank you for the patch.
>> Hans de Goede <hansg@kernel.org> writes:
>> 
>>> Run sw-statistics once every 4th frame,
>> I still miss any explanation why exactly every 4th frame.
>> 
>>> instead of every frame. There are 2 reasons for this:
>>>
>>> 1. There really is no need to have statistics for every frame
>> Well, depends on fps, I'd say.  Arguably, with slow fps rates,
>> everything is and can be slow, but still.
>> 
>>> and only doing this every 4th frame helps save some CPU time.
>>>
>>> 2. The generic nature of the simple pipeline-handler, so no information
>>> about possible CSI receiver frame-delays. In combination with the software
>>> ISP often being used with sensors without sensor info in the sensor-helper
>>> code, so no reliable control-delay information means that the software ISP
>>> is prone to AGC oscillation. Skipping statistics gathering also means
>>> skipping running the AGC algorithm slowing it down, avoiding this
>>> oscillation.
>> The last sentence is not very clear to me.
>> 
>>> Note ideally the AGC oscillation problem would be fixed by adding sensor
>>> metadata support all through the stack so that the exact gain and exposure
>>> used for a specific frame are reliably provided by the sensor metadata.
>>>
>>> Signed-off-by: Hans de Goede <hansg@kernel.org>
>>> ---
>>>   src/libcamera/software_isp/debayer_cpu.cpp | 25 +++++++++++++---------
>>>   src/libcamera/software_isp/debayer_cpu.h   |  4 ++--
>>>   src/libcamera/software_isp/swstats_cpu.cpp |  5 +++++
>>>   src/libcamera/software_isp/swstats_cpu.h   |  3 +++
>>>   4 files changed, 25 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
>>> index bfa60888..9010333e 100644
>>> --- a/src/libcamera/software_isp/debayer_cpu.cpp
>>> +++ b/src/libcamera/software_isp/debayer_cpu.cpp
>>> @@ -655,7 +655,7 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
>>>   	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
>>>   }
>>>   
>>> -void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>> +void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
>>>   {
>>>   	unsigned int yEnd = window_.y + window_.height;
>>>   	/* Holds [0] previous- [1] current- [2] next-line */
>>> @@ -681,7 +681,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>>   	for (unsigned int y = window_.y; y < yEnd; y += 2) {
>>>   		shiftLinePointers(linePointers, src);
>>>   		memcpyNextLine(linePointers);
>>> -		stats_->processLine0(y, linePointers);
>>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>>> +			stats_->processLine0(y, linePointers);
>>>   		(this->*debayer0_)(dst, linePointers);
>>>   		src += inputConfig_.stride;
>>>   		dst += outputConfig_.stride;
>>> @@ -696,7 +697,8 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>>   	if (window_.y == 0) {
>>>   		shiftLinePointers(linePointers, src);
>>>   		memcpyNextLine(linePointers);
>>> -		stats_->processLine0(yEnd, linePointers);
>>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>>> +			stats_->processLine0(yEnd, linePointers);
>>>   		(this->*debayer0_)(dst, linePointers);
>>>   		src += inputConfig_.stride;
>>>   		dst += outputConfig_.stride;
>>> @@ -710,7 +712,7 @@ void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
>>>   	}
>>>   }
>>>   
>>> -void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>> +void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
>>>   {
>>>   	const unsigned int yEnd = window_.y + window_.height;
>>>   	/*
>>> @@ -733,7 +735,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>>   	for (unsigned int y = window_.y; y < yEnd; y += 4) {
>>>   		shiftLinePointers(linePointers, src);
>>>   		memcpyNextLine(linePointers);
>>> -		stats_->processLine0(y, linePointers);
>>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>>> +			stats_->processLine0(y, linePointers);
>>>   		(this->*debayer0_)(dst, linePointers);
>>>   		src += inputConfig_.stride;
>>>   		dst += outputConfig_.stride;
>>> @@ -746,7 +749,8 @@ void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
>>>   
>>>   		shiftLinePointers(linePointers, src);
>>>   		memcpyNextLine(linePointers);
>>> -		stats_->processLine2(y, linePointers);
>>> +		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>>> +			stats_->processLine2(y, linePointers);
>>>   		(this->*debayer2_)(dst, linePointers);
>>>   		src += inputConfig_.stride;
>>>   		dst += outputConfig_.stride;
>>> @@ -821,12 +825,13 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>>>   		return;
>>>   	}
>>>   
>>> -	stats_->startFrame();
>>> +	if (frame % SwStatsCpu::kStatPerNumFrames == 0)
>>> +		stats_->startFrame();
>>>   
>>>   	if (inputConfig_.patternSize.height == 2)
>>> -		process2(in.planes()[0].data(), out.planes()[0].data());
>>> +		process2(frame, in.planes()[0].data(), out.planes()[0].data());
>>>   	else
>>> -		process4(in.planes()[0].data(), out.planes()[0].data());
>>> +		process4(frame, in.planes()[0].data(), out.planes()[0].data());
>>>   
>>>   	metadata.planes()[0].bytesused = out.planes()[0].size();
>>>   
>>> @@ -851,7 +856,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>>>   	 *
>>>   	 * \todo Pass real bufferId once stats buffer passing is changed.
>>>   	 */
>>> -	stats_->finishFrame(frame, 0, true);
>>> +	stats_->finishFrame(frame, 0, frame % SwStatsCpu::kStatPerNumFrames == 0);
>> This leads to a crash in my environment, after 4 frames:
>> 
>> cam: ../include/libcamera/controls.h:188: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::__cxx11::basic_string<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
>> 
>> Thread 5 "cam" received signal SIGABRT, Aborted.
>> [...]
>> #5  0x0000fffff4b99324 in libcamera::ControlValue::get<int, decltype(nullptr)> (this=<optimized out>) at ../include/libcamera/controls.h:188
>> #6  libcamera::ipa::soft::IPASoftSimple::processStats (this=0xfffff0009f40, frame=4, bufferId=<optimized out>, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
>> [...]
>> 
>> This omits IPASoftSimple::processStats, which I don't think is optional,
>> it does (somewhat misleadingly) more than just stats processing.
>
> Is this assertion failure reliably reproducible with these changes?

Yes.

> I cannot see it with an ipu6 laptop (running `cam -c1 -C32` repeatedly).

I think it depends on kStatPerNumFrames, the sensor delay (imx219 in my
case) and whatever happens in the DelayedControls ring.  The problem is
apparently that delayedCtrls_ is not updated on each frame, in which
case libcamera::DelayedControls::get() may hit an uninitialized value
and crash on the value type check due to the type being ControlTypeNone.
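
If that explanation is right, the effect can be modelled with a toy ring. This
is only a sketch and not the DelayedControls implementation: `ToyDelayRing`,
`kSize` and `countEmptyReads` are invented names, and an empty `std::optional`
stands in for a value whose type is still ControlTypeNone.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

/*
 * Toy model, not the libcamera DelayedControls class: a ring with one slot
 * per frame. A slot that push() never wrote stays empty.
 */
class ToyDelayRing
{
public:
	static constexpr unsigned int kSize = 8;

	void push(uint32_t frame, int32_t value)
	{
		slots_[frame % kSize] = value;
	}

	/* Value recorded for the frame, or nothing if the slot was never written. */
	std::optional<int32_t> get(uint32_t frame) const
	{
		return slots_[frame % kSize];
	}

private:
	std::array<std::optional<int32_t>, kSize> slots_{};
};

/*
 * Count the frames whose lookup hits an empty slot when only every
 * `interval`-th frame writes to the ring.
 */
unsigned int countEmptyReads(unsigned int interval, unsigned int frames)
{
	assert(interval > 0);

	ToyDelayRing ring;
	unsigned int empty = 0;

	for (uint32_t frame = 0; frame < frames; frame++) {
		if (frame % interval == 0)
			ring.push(frame, static_cast<int32_t>(100 + frame));

		if (!ring.get(frame))
			empty++;
	}

	return empty;
}
```

In this model `countEmptyReads(1, 8)` returns 0 and `countEmptyReads(4, 8)`
returns 6. Once the frame counter wraps past `kSize`, skipped frames would read
stale values rather than empty ones, which may be part of why the failure shows
up on some setups and not on others.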

> As far as I can tell `SwStatsCpu::finishFrame()` will emit the same signal
> regardless, so I am not sure if the issue is here. I think it's that
> `IPASoftSimple::processStats()` expects `sensorControls` to have
> `V4L2_CID_{EXPOSURE,ANALOGUE_GAIN}` present, but at least one of them is not present.

At least in my case, the keys are present but not the values.

> There is even a "sanity check" in the function but it's too late...
> that should be fixed at least.

Yes (although it wouldn't help here).

But regarding this series: Do we still need delayedCtrls_ with these
patches?

> Regards,
> Barnabás Pőcze
>
>
>> 
>>>   	outputBufferReady.emit(output);
>>>   	inputBufferReady.emit(input);
>>>   }
>>> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
>>> index 9d343e46..03e0d784 100644
>>> --- a/src/libcamera/software_isp/debayer_cpu.h
>>> +++ b/src/libcamera/software_isp/debayer_cpu.h
>>> @@ -133,8 +133,8 @@ private:
>>>   	void setupInputMemcpy(const uint8_t *linePointers[]);
>>>   	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
>>>   	void memcpyNextLine(const uint8_t *linePointers[]);
>>> -	void process2(const uint8_t *src, uint8_t *dst);
>>> -	void process4(const uint8_t *src, uint8_t *dst);
>>> +	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
>>> +	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
>>>   
>>>   	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
>>>   	static constexpr unsigned int kMaxLineBuffers = 5;
>>> diff --git a/src/libcamera/software_isp/swstats_cpu.cpp b/src/libcamera/software_isp/swstats_cpu.cpp
>>> index da91f912..35ba0a46 100644
>>> --- a/src/libcamera/software_isp/swstats_cpu.cpp
>>> +++ b/src/libcamera/software_isp/swstats_cpu.cpp
>>> @@ -89,6 +89,11 @@ namespace libcamera {
>>>    * \brief Signals that the statistics are ready
>>>    */
>>>   
>>> +/**
>>> + * \var SwStatsCpu::kStatPerNumFrames
>>> + * \brief Run stats once every kStatPerNumFrames frames
>>> + */
>>> +
>>>   /**
>>>    * \typedef SwStatsCpu::statsProcessFn
>>>    * \brief Called when there is data to get statistics from
>>> diff --git a/src/libcamera/software_isp/swstats_cpu.h b/src/libcamera/software_isp/swstats_cpu.h
>>> index 6ac3c4de..ea0e6d5a 100644
>>> --- a/src/libcamera/software_isp/swstats_cpu.h
>>> +++ b/src/libcamera/software_isp/swstats_cpu.h
>>> @@ -32,6 +32,9 @@ public:
>>>   	SwStatsCpu();
>>>   	~SwStatsCpu() = default;
>>>   
>>> +	/* Run stats once every 4 frames */
>>> +	static constexpr uint32_t kStatPerNumFrames = 4;
>>> +
>>>   	bool isValid() const { return sharedStats_.fd().isValid(); }
>>>   
>>>   	const SharedFD &getStatsFD() { return sharedStats_.fd(); }
>>
Hans de Goede Sept. 27, 2025, 10:44 a.m. UTC | #4
Hi,

On 26-Sep-25 9:02 PM, Milan Zamazal wrote:
> Barnabás Pőcze <barnabas.pocze@ideasonboard.com> writes:
> 
>> Hi
>>
>> On 2025. 09. 26. at 15:33, Milan Zamazal wrote:
>>> Hi Hans,
>>> thank you for the patch.

...

>>>>   @@ -851,7 +856,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
>>>>   	 *
>>>>   	 * \todo Pass real bufferId once stats buffer passing is changed.
>>>>   	 */
>>>> -	stats_->finishFrame(frame, 0, true);
>>>> +	stats_->finishFrame(frame, 0, frame % SwStatsCpu::kStatPerNumFrames == 0);
>>> This leads to a crash in my environment, after 4 frames:
>>> 
>>> cam: ../include/libcamera/controls.h:188: T libcamera::ControlValue::get() const [with T = int; typename std::enable_if<((! libcamera::details::is_span<U>::value) && (! std::is_same<std::__cxx11::basic_string<char>, typename std::remove_cv< <template-parameter-1-1> >::type>::value)), std::nullptr_t>::type <anonymous> = nullptr]: Assertion `type_ == details::control_type<std::remove_cv_t<T>>::value' failed.
>>> 
>>> Thread 5 "cam" received signal SIGABRT, Aborted.
>>> [...]
>>> #5  0x0000fffff4b99324 in libcamera::ControlValue::get<int, decltype(nullptr)> (this=<optimized out>) at ../include/libcamera/controls.h:188
>>> #6  libcamera::ipa::soft::IPASoftSimple::processStats (this=0xfffff0009f40, frame=4, bufferId=<optimized out>, sensorControls=...) at ../src/ipa/simple/soft_simple.cpp:312
>>> [...]
>>> 
>>> This omits IPASoftSimple::processStats, which I don't think is optional,
>>> it does (somewhat misleadingly) more than just stats processing.
>>
>> Is this assertion failure reliably reproducible with these changes?
> 
> Yes.
> 
>> I cannot see it with an ipu6 laptop (running `cam -c1 -C32` repeatedly).
> 
> I think it depends on kStatPerNumFrames, the sensor delay (imx219 in my
> case) and whatever happens in the DelayedControls ring.  The problem is
> apparently that delayedCtrls_ is not updated on each frame, in which
> case libcamera::DelayedControls::get() may hit an uninitialized value
> and crash on the value type check due to the type being ControlTypeNone.
> 
>> As far as I can tell `SwStatsCpu::finishFrame()` will emit the same signal
>> regardless, so I am not sure if the issue is here. I think it's that
>> `IPASoftSimple::processStats()` expects `sensorControls` to have
>> `V4L2_CID_{EXPOSURE,ANALOGUE_GAIN}` present, but at least one of them is not present.
> 
> At least in my case, the keys are present but not the values.
> 
>> There is even a "sanity check" in the function but it's too late...
>> that should be fixed at least.
> 
> Yes (although it wouldn't help here).
> 
> But regarding this series: Do we still need delayedCtrls_ with these
> patches?

We want to keep using delayedCtrls_ so that the software ISP IPA works
the same way as the other IPAs. This will also matter for Kieran's
upcoming work to share common AGC code between IPAs, which will likely
expect delayed controls as well.
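
For illustration only, the failure mode discussed above (a typed read of
a slot that was never written) can be modelled with a small standalone
sketch. Nothing here is libcamera API: `Value`, `ValueRing` and
`fillGated` are hypothetical names, and `std::monostate` merely stands in
for a control value whose type is still ControlTypeNone. The sketch
assumes that values are only pushed on frames where statistics run.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <variant>

/* Hypothetical control value: std::monostate means "no type set yet". */
using Value = std::variant<std::monostate, int32_t>;

/* Hypothetical per-frame ring; NOT the real DelayedControls class. */
class ValueRing
{
public:
	static constexpr std::size_t kSize = 16;

	void push(uint32_t frame, int32_t value)
	{
		slots_[frame % kSize] = value;
	}

	/*
	 * Checked read: report an unwritten slot as std::nullopt instead of
	 * asserting on the stored type.
	 */
	std::optional<int32_t> get(uint32_t frame) const
	{
		const Value &v = slots_[frame % kSize];
		if (const int32_t *p = std::get_if<int32_t>(&v))
			return *p;
		return std::nullopt;
	}

private:
	/* Value-initialised, so every slot starts out as std::monostate. */
	std::array<Value, kSize> slots_{};
};

/* Write a value only on every period-th frame, leaving the rest unset. */
inline ValueRing fillGated(uint32_t period, uint32_t frames)
{
	assert(period > 0);

	ValueRing ring;
	for (uint32_t frame = 0; frame < frames; ++frame) {
		if (frame % period == 0)
			ring.push(frame, static_cast<int32_t>(frame));
	}
	return ring;
}
```

With a period of 4, frames 0 and 4 hold a value while frames 1 to 3 read
back as empty. A caller that assumes an int is always present would trip
a type check on those frames, which is why a presence check has to come
before the typed read rather than after it.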

Regards,

Hans

Patch

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index bfa60888..9010333e 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -655,7 +655,7 @@  void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
 	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
 }
 
-void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
+void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
 	unsigned int yEnd = window_.y + window_.height;
 	/* Holds [0] previous- [1] current- [2] next-line */
@@ -681,7 +681,8 @@  void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
 	for (unsigned int y = window_.y; y < yEnd; y += 2) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(y, linePointers);
+		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
+			stats_->processLine0(y, linePointers);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -696,7 +697,8 @@  void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
 	if (window_.y == 0) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(yEnd, linePointers);
+		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
+			stats_->processLine0(yEnd, linePointers);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -710,7 +712,7 @@  void DebayerCpu::process2(const uint8_t *src, uint8_t *dst)
 	}
 }
 
-void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
+void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
 	const unsigned int yEnd = window_.y + window_.height;
 	/*
@@ -733,7 +735,8 @@  void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
 	for (unsigned int y = window_.y; y < yEnd; y += 4) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(y, linePointers);
+		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
+			stats_->processLine0(y, linePointers);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -746,7 +749,8 @@  void DebayerCpu::process4(const uint8_t *src, uint8_t *dst)
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine2(y, linePointers);
+		if (frame % SwStatsCpu::kStatPerNumFrames == 0)
+			stats_->processLine2(y, linePointers);
 		(this->*debayer2_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -821,12 +825,13 @@  void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 		return;
 	}
 
-	stats_->startFrame();
+	if (frame % SwStatsCpu::kStatPerNumFrames == 0)
+		stats_->startFrame();
 
 	if (inputConfig_.patternSize.height == 2)
-		process2(in.planes()[0].data(), out.planes()[0].data());
+		process2(frame, in.planes()[0].data(), out.planes()[0].data());
 	else
-		process4(in.planes()[0].data(), out.planes()[0].data());
+		process4(frame, in.planes()[0].data(), out.planes()[0].data());
 
 	metadata.planes()[0].bytesused = out.planes()[0].size();
 
@@ -851,7 +856,7 @@  void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 	 *
 	 * \todo Pass real bufferId once stats buffer passing is changed.
 	 */
-	stats_->finishFrame(frame, 0, true);
+	stats_->finishFrame(frame, 0, frame % SwStatsCpu::kStatPerNumFrames == 0);
 	outputBufferReady.emit(output);
 	inputBufferReady.emit(input);
 }
diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
index 9d343e46..03e0d784 100644
--- a/src/libcamera/software_isp/debayer_cpu.h
+++ b/src/libcamera/software_isp/debayer_cpu.h
@@ -133,8 +133,8 @@  private:
 	void setupInputMemcpy(const uint8_t *linePointers[]);
 	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
 	void memcpyNextLine(const uint8_t *linePointers[]);
-	void process2(const uint8_t *src, uint8_t *dst);
-	void process4(const uint8_t *src, uint8_t *dst);
+	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
+	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
 
 	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
 	static constexpr unsigned int kMaxLineBuffers = 5;
diff --git a/src/libcamera/software_isp/swstats_cpu.cpp b/src/libcamera/software_isp/swstats_cpu.cpp
index da91f912..35ba0a46 100644
--- a/src/libcamera/software_isp/swstats_cpu.cpp
+++ b/src/libcamera/software_isp/swstats_cpu.cpp
@@ -89,6 +89,11 @@  namespace libcamera {
  * \brief Signals that the statistics are ready
  */
 
+/**
+ * \var SwStatsCpu::kStatPerNumFrames
+ * \brief Run stats once every kStatPerNumFrames frames
+ */
+
 /**
  * \typedef SwStatsCpu::statsProcessFn
  * \brief Called when there is data to get statistics from
diff --git a/src/libcamera/software_isp/swstats_cpu.h b/src/libcamera/software_isp/swstats_cpu.h
index 6ac3c4de..ea0e6d5a 100644
--- a/src/libcamera/software_isp/swstats_cpu.h
+++ b/src/libcamera/software_isp/swstats_cpu.h
@@ -32,6 +32,9 @@  public:
 	SwStatsCpu();
 	~SwStatsCpu() = default;
 
+	/* Run stats once every 4 frames */
+	static constexpr uint32_t kStatPerNumFrames = 4;
+
 	bool isValid() const { return sharedStats_.fd().isValid(); }
 
 	const SharedFD &getStatsFD() { return sharedStats_.fd(); }