[10/10] libcamera: software_isp: debayer_egl: Implement input/output frame caching mechanism

Message ID 20260624085849.873784-11-bryan.odonoghue@linaro.org
State New
Series
  • libcamera: software_isp: gpu: Add go faster stripes

Commit Message

Bryan O'Donoghue June 24, 2026, 8:58 a.m. UTC
Implement a texture caching mechanism for both input and output frames and
for both types of input frame.

The before/after on a Qualcomm x1e is:

9.737ms per frame
5.691ms per frame

The before/after on a Qualcomm sm8250 is:

21.710ms per frame
17.336ms per frame

for i in {1..20}; do
cam -c /base/soc@0/cci@ac16000/i2c-bus@1/camera@10 -s width=1920,height=1080 --capture=60
done

Interestingly, the improvement appears to be an absolute ~4.x ms per frame
rather than the proportional one that intuition might suggest.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
---
 src/libcamera/software_isp/debayer_egl.cpp | 108 +++++++++++++++++----
 src/libcamera/software_isp/debayer_egl.h   |  12 ++-
 2 files changed, 100 insertions(+), 20 deletions(-)
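
As a quick check of the "absolute rather than proportional" observation, the sketch below recomputes the per-frame saving from the four timings quoted in the commit message, expressed in whole microseconds. The constants and the helpers `savedUs` and `savedPercent` are illustrative names only and are not part of libcamera.

```cpp
// Timings from the commit message, in microseconds per frame.
constexpr int kX1eBeforeUs = 9737;
constexpr int kX1eAfterUs = 5691;
constexpr int kSm8250BeforeUs = 21710;
constexpr int kSm8250AfterUs = 17336;

// Absolute time saved per frame.
constexpr int savedUs(int beforeUs, int afterUs)
{
	return beforeUs - afterUs;
}

// Saving as a whole percentage of the "before" time (truncated).
constexpr int savedPercent(int beforeUs, int afterUs)
{
	return (beforeUs - afterUs) * 100 / beforeUs;
}

static_assert(savedUs(kX1eBeforeUs, kX1eAfterUs) == 4046, "x1e saves about 4.0 ms");
static_assert(savedUs(kSm8250BeforeUs, kSm8250AfterUs) == 4374, "sm8250 saves about 4.4 ms");
static_assert(savedPercent(kX1eBeforeUs, kX1eAfterUs) == 41, "x1e saves about 41%");
static_assert(savedPercent(kSm8250BeforeUs, kSm8250AfterUs) == 20, "sm8250 saves about 20%");
```

The absolute savings are close (about 4.0 ms and 4.4 ms) while the relative savings differ by roughly a factor of two (41% versus 20%), which is consistent with the observation above.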

Comments

Robert Mader June 24, 2026, 11:10 a.m. UTC | #1
Thanks a lot!

I tested this on a FP5 and a Pixel 3a and can confirm not only the
improved timings but also lower CPU usage. In particular, in combination
with dmabuf import the CPU usage seems to essentially collapse when
eyeballing htop; see below.

So for the whole series I can already give:

Tested-by: Robert Mader <robert.mader@collabora.com>

Will now go over the individual commits.

---

FairPhone 5

cam -c 1 -s width=1920,height=1080 --capture=60

INFO SoftwareIsp software_isp.cpp:299 Input 2048x1536-RGGB-10-CSI2P stride 2560
dmabuf import succeeds

before:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 365691us, 12189 us/frame
~39% CPU time in Wireplumber/Pipewire when running Snapshot

after:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 204924us, 6830 us/frame
~13% CPU time in Wireplumber/Pipewire when running Snapshot

-

cam -c 2 -s width=1920,height=1080 --capture=60

INFO SoftwareIsp software_isp.cpp:299 Input 4080x3072-GRBG-10-CSI2P stride 5104
dmabuf import fails

before:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 687332us, 22911 us/frame
~63% CPU time in Wireplumber/Pipewire when running Snapshot

after:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 579945us, 19331 us/frame
~53% CPU time in Wireplumber/Pipewire when running Snapshot

---

Pixel 3a

cam -c 1 -s width=1920,height=1080 --capture=60

INFO SoftwareIsp software_isp.cpp:299 Input 1936x1096-RGGB-10-CSI2P stride 2432
dmabuf import succeeds

before:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 489368us, 16312 us/frame
~59% CPU time in Wireplumber/Pipewire when running Snapshot

after:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 353912us, 11797 us/frame
~17% CPU time in Wireplumber/Pipewire when running Snapshot

-

cam -c 2 -s width=1920,height=1080 --capture=60

INFO SoftwareIsp software_isp.cpp:299 Input 4032x3024-RGGB-10-CSI2P stride 5040
dmabuf import fails

before:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 916702us, 30556 us/frame
~57% CPU time in Wireplumber/Pipewire when running Snapshot

after:
INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 771620us, 25720 us/frame
~47% CPU time in Wireplumber/Pipewire when running Snapshot
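
The benchmark lines above report a total time for 30 frames plus a per-frame figure. The logged per-frame values are consistent with a truncating integer division of the total, so the per-frame saving for each test case can be recomputed as in the sketch below. The helper names `perFrameUs` and `perFrameSavingUs` are hypothetical and only serve this illustration; they are not taken from benchmark.cpp.

```cpp
// Per-frame time in microseconds, truncated the way the logged figures appear to be.
constexpr long perFrameUs(long totalUs, long frames)
{
	return totalUs / frames;
}

// Per-frame saving between two runs that cover the same number of frames.
constexpr long perFrameSavingUs(long beforeTotalUs, long afterTotalUs, long frames)
{
	return perFrameUs(beforeTotalUs, frames) - perFrameUs(afterTotalUs, frames);
}

// Totals copied from the logs; each run covers 30 frames.
static_assert(perFrameUs(365691, 30) == 12189, "FairPhone 5, camera 1, before");
static_assert(perFrameSavingUs(365691, 204924, 30) == 5359, "FairPhone 5, camera 1");
static_assert(perFrameSavingUs(687332, 579945, 30) == 3580, "FairPhone 5, camera 2");
static_assert(perFrameSavingUs(489368, 353912, 30) == 4515, "Pixel 3a, camera 1");
static_assert(perFrameSavingUs(916702, 771620, 30) == 4836, "Pixel 3a, camera 2");
```

All four savings land between roughly 3.6 ms and 5.4 ms per frame, regardless of whether the dmabuf import succeeded; the large difference between the two cases shows up in the CPU usage figures rather than in the frame times.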

Robert Mader June 24, 2026, 12:35 p.m. UTC | #2
On 24.06.26 10:58, Bryan O'Donoghue wrote:
> Implement a texture caching mechanism for both input and output frames and
> for both types of input frame.
>
> The before/after on a Qualcomm x1e is:
>
> 9.737ms per frame
> 5.691ms per frame
>
> The before/after on a Qualcomm sm8250 is:
>
> 21.710ms per frame
> 17.336ms per frame
>
> for i in {1..20} do
> cam -c /base/soc@0/cci@ac16000/i2c-bus@1/camera@10 -s width=1920,height=1080 --capture=60
>
> Interestingly there appears to be an absolute ~ 4.x ms per frame uplift as
> opposed to what intuition might suggest a proportional.
>
> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> ---
>   src/libcamera/software_isp/debayer_egl.cpp | 108 +++++++++++++++++----
>   src/libcamera/software_isp/debayer_egl.h   |  12 ++-
>   2 files changed, 100 insertions(+), 20 deletions(-)
>
> diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
> index 0568c413b..8ac5cb76f 100644
> --- a/src/libcamera/software_isp/debayer_egl.cpp
> +++ b/src/libcamera/software_isp/debayer_egl.cpp
> @@ -355,6 +355,12 @@ int DebayerEGL::configure(const StreamConfiguration &inputCfg,
>   	 */
>   	stats_->setWindow(Rectangle(window_.size()));
>   
> +	inputBufferCache_ = std::make_unique<V4L2BufferCache>(inputCfg.bufferCount);
> +	outputBufferCache_ = std::make_unique<V4L2BufferCache>(outputCfg.bufferCount);
> +
> +	eglImageBayerIn_.resize(inputCfg.bufferCount);
> +	eglImageBayerOut_.resize(outputCfg.bufferCount);
> +
>   	return 0;
>   }
>   
> @@ -514,34 +520,106 @@ void DebayerEGL::setShaderVariableValues(eGLImage &eglImageIn, const DebayerPara
>   	return;
>   }
>   
> -int DebayerEGL::debayerGPU(FrameBuffer *input, FrameBuffer *output, const DebayerParams &params, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
> +int DebayerEGL::getBufferCache(V4L2BufferCache &cache, FrameBuffer *framebuffer, bool &cache_hit)

Maybe something like lookupFramebufferFromBufferCache()?
getBufferCache() sounds like it would return a V4L2BufferCache.

>   {
> -	/* eGL context switch */
> -	egl_.makeCurrent();
> +	int cache_idx;
> +
> +	cache_idx = cache.get(*framebuffer, cache_hit);
> +	if (cache_idx < 0) {
> +		LOG(Debayer, Error) << "buffer exceeds configured cache size";
> +		return -ENODEV;
> +	}
> +	cache.put(cache_idx);
> +
> +	return cache_idx;
> +}
> +
> +eGLImage *DebayerEGL::getCachedInputFrameBuffer(FrameBuffer *input, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
> +{
> +	eGLImage *eglImageIn;
> +	bool cache_hit;
> +	int cache_idx;
> +
> +	cache_idx = getBufferCache(*inputBufferCache_, input, cache_hit);
> +	if (cache_idx < 0)
> +		return nullptr;
> +
> +	if (!cache_hit) {
> +		eglImageBayerIn_[cache_idx] = std::make_unique<eGLImage>(glFormat_, inputConfig_.stride / bytesPerPixel_,
> +									 height_, inputConfig_.stride, GL_TEXTURE0, 0);
> +	}
> +
> +	eglImageIn = eglImageBayerIn_[cache_idx].get();
>   
>   	/* Try to create texture for input buffer via dmabuf import */
> -	if (use_dmabuf_) {
> -		if (egl_.createInputDMABufTexture2D(*eglImageBayerIn_, input->planes()[0].fd.get()) != 0) {
> +	if (use_dmabuf_ && !cache_hit) {
> +		if (egl_.createInputDMABufTexture2D(*eglImageIn, input->planes()[0].fd.get()) != 0) {
>   			use_dmabuf_ = false;
>   			LOG(Debayer, Info) << "Importing input buffer with DMABuf import failed, falling back to upload";
>   		}
>   	}
>   
> +	/* Cache hit using dmabuf activate and bind */
> +	if (use_dmabuf_ && cache_hit) {
> +		egl_.activateBindTexture(*eglImageIn);
> +	}
> +
>   	/* Otherwise create texture for input buffer via upload from CPU */
>   	if (!use_dmabuf_) {
>   		inDmaSyncer->emplace(input->planes()[0].fd, DmaSyncer::SyncType::Read);
>   		inMapped->emplace(input, MappedFrameBuffer::MapFlag::Read);
>   		if (!inMapped->value().isValid()) {
>   			LOG(Debayer, Error) << "mmap-ing buffer(s) failed";
> -			return -ENODEV;
> +			return nullptr;
>   		}
> -		egl_.createInputTexture2D(*eglImageBayerIn_, inMapped->value().planes()[0].data());
> +		if (cache_hit)
> +			egl_.updateInputTexture2D(*eglImageIn, inMapped->value().planes()[0].data());
> +		else
> +			egl_.createInputTexture2D(*eglImageIn, inMapped->value().planes()[0].data());
>   	}
>   
> -	/* Generate the output render framebuffer as render to texture */
> -	egl_.createOutputDMABufTexture2D(*eglImageBayerOut_, output->planes()[0].fd.get());
> +	return eglImageIn;
> +}
> +
> +eGLImage *DebayerEGL::getCachedOutputFrameBuffer(FrameBuffer *output)
> +{
> +	eGLImage *eglImageOut;
> +	bool cache_hit;
> +	int cache_idx;
> +
> +	cache_idx = getBufferCache(*outputBufferCache_, output, cache_hit);
> +	if (cache_idx < 0)
> +		return nullptr;
> +
> +	if (!cache_hit) {
> +		eglImageBayerOut_[cache_idx] = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width,
> +									  outputSize_.height, outputConfig_.stride, GL_TEXTURE1, 1);
> +		egl_.createOutputDMABufTexture2D(*eglImageBayerOut_[cache_idx], output->planes()[0].fd.get());
> +	}

Don't we need to call

egl_.activateBindTexture(*eglImageOut);

in the else-case here? It's called in createDMABufTexture2D().

It seems to work for me either way, but I'm confused why. Is it somehow 
implicit via the attachTextureToFBO() below?

> +	eglImageOut = eglImageBayerOut_[cache_idx].get();
> +
> +	return eglImageOut;
> +}
> +
> +int DebayerEGL::debayerGPU(FrameBuffer *input, FrameBuffer *output, const DebayerParams &params, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
> +{
> +	eGLImage *eglImageIn;
> +	eGLImage *eglImageOut;
> +
> +	/* eGL context switch */
> +	egl_.makeCurrent();
> +
> +	eglImageIn = getCachedInputFrameBuffer(input, inMapped, inDmaSyncer);
> +	if (!eglImageIn)
> +		return -ENOMEM;
> +
> +	eglImageOut = getCachedOutputFrameBuffer(output);
> +	if (!eglImageOut)
> +		return -ENOMEM;
> +
> +	egl_.attachTextureToFBO(*eglImageOut);
> +	setShaderVariableValues(*eglImageIn, params);
>   
> -	setShaderVariableValues(*eglImageBayerIn_, params);
>   	glViewport(0, 0, width_, height_);
>   	glClear(GL_COLOR_BUFFER_BIT);
>   	glDrawArrays(GL_TRIANGLE_FAN, 0, DEBAYER_OPENGL_COORDS);
> @@ -623,19 +701,13 @@ int DebayerEGL::start()
>   	if (initBayerShaders(inputPixelFormat_, outputPixelFormat_))
>   		return -EINVAL;
>   
> -	/* Raw bayer input as texture */
> -	eglImageBayerIn_ = std::make_unique<eGLImage>(glFormat_, inputConfig_.stride / bytesPerPixel_, height_, inputConfig_.stride, GL_TEXTURE0, 0);
> -
> -	/* Texture we will render to */
> -	eglImageBayerOut_ = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width, outputSize_.height, outputConfig_.stride, GL_TEXTURE1, 1);
> -
>   	return 0;
>   }
>   
>   void DebayerEGL::stop()
>   {
> -	eglImageBayerOut_.reset();
> -	eglImageBayerIn_.reset();
> +	eglImageBayerOut_.clear();
> +	eglImageBayerIn_.clear();
>   
>   	if (programId_)
>   		glDeleteProgram(programId_);
> diff --git a/src/libcamera/software_isp/debayer_egl.h b/src/libcamera/software_isp/debayer_egl.h
> index d8509e9f2..238fe7345 100644
> --- a/src/libcamera/software_isp/debayer_egl.h
> +++ b/src/libcamera/software_isp/debayer_egl.h
> @@ -22,6 +22,7 @@
>   #include "libcamera/internal/mapped_framebuffer.h"
>   #include "libcamera/internal/software_isp/benchmark.h"
>   #include "libcamera/internal/software_isp/swstats_cpu.h"
> +#include "libcamera/internal/v4l2_videodevice.h"
>   
>   #include <EGL/egl.h>
>   #include <EGL/eglext.h>
> @@ -70,14 +71,21 @@ private:
>   
>   	bool use_dmabuf_;
>   
> +	int getBufferCache(V4L2BufferCache &buffercache, FrameBuffer *framebuffer, bool &hit);
> +	eGLImage *getCachedInputFrameBuffer(FrameBuffer *input, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer);
> +	eGLImage *getCachedOutputFrameBuffer(FrameBuffer *output);
> +
> +	std::unique_ptr<V4L2BufferCache> inputBufferCache_;
> +	std::unique_ptr<V4L2BufferCache> outputBufferCache_;
> +
>   	/* Shader program identifiers */
>   	GLuint vertexShaderId_ = 0;
>   	GLuint fragmentShaderId_ = 0;
>   	GLuint programId_ = 0;
>   
>   	/* Pointer to object representing input texture */
> -	std::unique_ptr<eGLImage> eglImageBayerIn_;
> -	std::unique_ptr<eGLImage> eglImageBayerOut_;
> +	std::vector<std::unique_ptr<eGLImage>> eglImageBayerIn_;
> +	std::vector<std::unique_ptr<eGLImage>> eglImageBayerOut_;
>   
>   	/* Shader parameters */
>   	float firstRed_x_;

Two small comments, otherwise LGTM - and works great!

Reviewed-by: Robert Mader <robert.mader@collabora.com>
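
For readers following the review, the core idea of the patch (look up a slot index for each buffer, create the texture object only on a miss, reuse it on a hit) can be modelled in isolation. The sketch below is a toy model under stated assumptions: `Texture`, `SlotCache` and `TextureCache` are hypothetical names, buffers are identified by a plain integer fd, and a full cache simply reports failure. It does not reproduce the behaviour of libcamera's `V4L2BufferCache`, whose lookup and replacement policy may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GPU texture wrapper such as eGLImage.
struct Texture {
	int sourceFd;
};

// Toy fixed-size cache that maps a buffer fd to a slot index.
class SlotCache
{
public:
	explicit SlotCache(std::size_t slots) : fds_(slots, -1) {}

	// Return the slot for fd and set hit, or return -1 if no slot is free.
	int lookup(int fd, bool &hit)
	{
		int freeSlot = -1;
		for (std::size_t i = 0; i < fds_.size(); i++) {
			if (fds_[i] == fd) {
				hit = true;
				return static_cast<int>(i);
			}
			if (fds_[i] == -1 && freeSlot < 0)
				freeSlot = static_cast<int>(i);
		}

		hit = false;
		if (freeSlot >= 0)
			fds_[freeSlot] = fd;
		return freeSlot;
	}

private:
	std::vector<int> fds_;
};

// One texture object per cache slot, created lazily on the first miss.
class TextureCache
{
public:
	explicit TextureCache(std::size_t slots)
		: cache_(slots), textures_(slots)
	{
	}

	Texture *get(int fd)
	{
		bool hit = false;
		int idx = cache_.lookup(fd, hit);
		if (idx < 0)
			return nullptr;

		assert(static_cast<std::size_t>(idx) < textures_.size());
		if (!hit) {
			/* The expensive import or upload setup would happen here. */
			textures_[idx] = std::make_unique<Texture>(Texture{ fd });
			created_++;
		}
		return textures_[idx].get();
	}

	unsigned int created() const { return created_; }

private:
	SlotCache cache_;
	std::vector<std::unique_ptr<Texture>> textures_;
	unsigned int created_ = 0;
};
```

With two slots, calling `get(10)`, `get(11)` and then `get(10)` again creates only two `Texture` objects, and the third call returns the same pointer as the first. That reuse is the per-frame work the patch avoids once every buffer in the pool has been seen.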

Patch

diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
index 0568c413b..8ac5cb76f 100644
--- a/src/libcamera/software_isp/debayer_egl.cpp
+++ b/src/libcamera/software_isp/debayer_egl.cpp
@@ -355,6 +355,12 @@  int DebayerEGL::configure(const StreamConfiguration &inputCfg,
 	 */
 	stats_->setWindow(Rectangle(window_.size()));
 
+	inputBufferCache_ = std::make_unique<V4L2BufferCache>(inputCfg.bufferCount);
+	outputBufferCache_ = std::make_unique<V4L2BufferCache>(outputCfg.bufferCount);
+
+	eglImageBayerIn_.resize(inputCfg.bufferCount);
+	eglImageBayerOut_.resize(outputCfg.bufferCount);
+
 	return 0;
 }
 
@@ -514,34 +520,106 @@  void DebayerEGL::setShaderVariableValues(eGLImage &eglImageIn, const DebayerPara
 	return;
 }
 
-int DebayerEGL::debayerGPU(FrameBuffer *input, FrameBuffer *output, const DebayerParams &params, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
+int DebayerEGL::getBufferCache(V4L2BufferCache &cache, FrameBuffer *framebuffer, bool &cache_hit)
 {
-	/* eGL context switch */
-	egl_.makeCurrent();
+	int cache_idx;
+
+	cache_idx = cache.get(*framebuffer, cache_hit);
+	if (cache_idx < 0) {
+		LOG(Debayer, Error) << "buffer exceeds configured cache size";
+		return -ENODEV;
+	}
+	cache.put(cache_idx);
+
+	return cache_idx;
+}
+
+eGLImage *DebayerEGL::getCachedInputFrameBuffer(FrameBuffer *input, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
+{
+	eGLImage *eglImageIn;
+	bool cache_hit;
+	int cache_idx;
+
+	cache_idx = getBufferCache(*inputBufferCache_, input, cache_hit);
+	if (cache_idx < 0)
+		return nullptr;
+
+	if (!cache_hit) {
+		eglImageBayerIn_[cache_idx] = std::make_unique<eGLImage>(glFormat_, inputConfig_.stride / bytesPerPixel_,
+									 height_, inputConfig_.stride, GL_TEXTURE0, 0);
+	}
+
+	eglImageIn = eglImageBayerIn_[cache_idx].get();
 
 	/* Try to create texture for input buffer via dmabuf import */
-	if (use_dmabuf_) {
-		if (egl_.createInputDMABufTexture2D(*eglImageBayerIn_, input->planes()[0].fd.get()) != 0) {
+	if (use_dmabuf_ && !cache_hit) {
+		if (egl_.createInputDMABufTexture2D(*eglImageIn, input->planes()[0].fd.get()) != 0) {
 			use_dmabuf_ = false;
 			LOG(Debayer, Info) << "Importing input buffer with DMABuf import failed, falling back to upload";
 		}
 	}
 
+	/* Cache hit using dmabuf activate and bind */
+	if (use_dmabuf_ && cache_hit) {
+		egl_.activateBindTexture(*eglImageIn);
+	}
+
 	/* Otherwise create texture for input buffer via upload from CPU */
 	if (!use_dmabuf_) {
 		inDmaSyncer->emplace(input->planes()[0].fd, DmaSyncer::SyncType::Read);
 		inMapped->emplace(input, MappedFrameBuffer::MapFlag::Read);
 		if (!inMapped->value().isValid()) {
 			LOG(Debayer, Error) << "mmap-ing buffer(s) failed";
-			return -ENODEV;
+			return nullptr;
 		}
-		egl_.createInputTexture2D(*eglImageBayerIn_, inMapped->value().planes()[0].data());
+		if (cache_hit)
+			egl_.updateInputTexture2D(*eglImageIn, inMapped->value().planes()[0].data());
+		else
+			egl_.createInputTexture2D(*eglImageIn, inMapped->value().planes()[0].data());
 	}
 
-	/* Generate the output render framebuffer as render to texture */
-	egl_.createOutputDMABufTexture2D(*eglImageBayerOut_, output->planes()[0].fd.get());
+	return eglImageIn;
+}
+
+eGLImage *DebayerEGL::getCachedOutputFrameBuffer(FrameBuffer *output)
+{
+	eGLImage *eglImageOut;
+	bool cache_hit;
+	int cache_idx;
+
+	cache_idx = getBufferCache(*outputBufferCache_, output, cache_hit);
+	if (cache_idx < 0)
+		return nullptr;
+
+	if (!cache_hit) {
+		eglImageBayerOut_[cache_idx] = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width,
+									  outputSize_.height, outputConfig_.stride, GL_TEXTURE1, 1);
+		egl_.createOutputDMABufTexture2D(*eglImageBayerOut_[cache_idx], output->planes()[0].fd.get());
+	}
+	eglImageOut = eglImageBayerOut_[cache_idx].get();
+
+	return eglImageOut;
+}
+
+int DebayerEGL::debayerGPU(FrameBuffer *input, FrameBuffer *output, const DebayerParams &params, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer)
+{
+	eGLImage *eglImageIn;
+	eGLImage *eglImageOut;
+
+	/* eGL context switch */
+	egl_.makeCurrent();
+
+	eglImageIn = getCachedInputFrameBuffer(input, inMapped, inDmaSyncer);
+	if (!eglImageIn)
+		return -ENOMEM;
+
+	eglImageOut = getCachedOutputFrameBuffer(output);
+	if (!eglImageOut)
+		return -ENOMEM;
+
+	egl_.attachTextureToFBO(*eglImageOut);
+	setShaderVariableValues(*eglImageIn, params);
 
-	setShaderVariableValues(*eglImageBayerIn_, params);
 	glViewport(0, 0, width_, height_);
 	glClear(GL_COLOR_BUFFER_BIT);
 	glDrawArrays(GL_TRIANGLE_FAN, 0, DEBAYER_OPENGL_COORDS);
@@ -623,19 +701,13 @@  int DebayerEGL::start()
 	if (initBayerShaders(inputPixelFormat_, outputPixelFormat_))
 		return -EINVAL;
 
-	/* Raw bayer input as texture */
-	eglImageBayerIn_ = std::make_unique<eGLImage>(glFormat_, inputConfig_.stride / bytesPerPixel_, height_, inputConfig_.stride, GL_TEXTURE0, 0);
-
-	/* Texture we will render to */
-	eglImageBayerOut_ = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width, outputSize_.height, outputConfig_.stride, GL_TEXTURE1, 1);
-
 	return 0;
 }
 
 void DebayerEGL::stop()
 {
-	eglImageBayerOut_.reset();
-	eglImageBayerIn_.reset();
+	eglImageBayerOut_.clear();
+	eglImageBayerIn_.clear();
 
 	if (programId_)
 		glDeleteProgram(programId_);
diff --git a/src/libcamera/software_isp/debayer_egl.h b/src/libcamera/software_isp/debayer_egl.h
index d8509e9f2..238fe7345 100644
--- a/src/libcamera/software_isp/debayer_egl.h
+++ b/src/libcamera/software_isp/debayer_egl.h
@@ -22,6 +22,7 @@ 
 #include "libcamera/internal/mapped_framebuffer.h"
 #include "libcamera/internal/software_isp/benchmark.h"
 #include "libcamera/internal/software_isp/swstats_cpu.h"
+#include "libcamera/internal/v4l2_videodevice.h"
 
 #include <EGL/egl.h>
 #include <EGL/eglext.h>
@@ -70,14 +71,21 @@  private:
 
 	bool use_dmabuf_;
 
+	int getBufferCache(V4L2BufferCache &buffercache, FrameBuffer *framebuffer, bool &hit);
+	eGLImage *getCachedInputFrameBuffer(FrameBuffer *input, std::optional<MappedFrameBuffer> *inMapped, std::optional<DmaSyncer> *inDmaSyncer);
+	eGLImage *getCachedOutputFrameBuffer(FrameBuffer *output);
+
+	std::unique_ptr<V4L2BufferCache> inputBufferCache_;
+	std::unique_ptr<V4L2BufferCache> outputBufferCache_;
+
 	/* Shader program identifiers */
 	GLuint vertexShaderId_ = 0;
 	GLuint fragmentShaderId_ = 0;
 	GLuint programId_ = 0;
 
 	/* Pointer to object representing input texture */
-	std::unique_ptr<eGLImage> eglImageBayerIn_;
-	std::unique_ptr<eGLImage> eglImageBayerOut_;
+	std::vector<std::unique_ptr<eGLImage>> eglImageBayerIn_;
+	std::vector<std::unique_ptr<eGLImage>> eglImageBayerOut_;
 
 	/* Shader parameters */
 	float firstRed_x_;
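
The input path in getCachedInputFrameBuffer() combines two conditions, whether dmabuf import is in use and whether the buffer was already cached, into four possible actions. The sketch below restates that decision as a pure function for a single call. `InputAction` and `chooseInputAction` are hypothetical names introduced for illustration, and the `importOk` parameter stands in for the result of the dmabuf import attempt. In the patch a failed import also clears `use_dmabuf_` for later frames, which this single-call model does not capture.

```cpp
// Which texture operation the input path performs for one frame.
enum class InputAction {
	ImportDmabuf, /* cache miss, dmabuf import succeeds */
	BindCached,   /* cache hit with dmabuf: activate and bind only */
	CreateUpload, /* cache miss without dmabuf, or import failed */
	UpdateUpload, /* cache hit without dmabuf: refresh texture contents */
};

constexpr InputAction chooseInputAction(bool useDmabuf, bool cacheHit, bool importOk)
{
	return useDmabuf
		? (cacheHit ? InputAction::BindCached
			    : (importOk ? InputAction::ImportDmabuf
					: InputAction::CreateUpload))
		: (cacheHit ? InputAction::UpdateUpload
			    : InputAction::CreateUpload);
}

static_assert(chooseInputAction(true, true, true) == InputAction::BindCached, "");
static_assert(chooseInputAction(true, false, true) == InputAction::ImportDmabuf, "");
static_assert(chooseInputAction(true, false, false) == InputAction::CreateUpload, "");
static_assert(chooseInputAction(false, true, true) == InputAction::UpdateUpload, "");
```

Only the two upload actions touch the frame data with the CPU on every frame, which may help explain why the CPU usage drop reported in the comments is much larger when the dmabuf import succeeds.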