| Message ID | 20260618122245.946138-30-bryan.odonoghue@linaro.org |
|---|---|
| State | New |
|
Hi

On 2026-06-18 at 14:22, Bryan O'Donoghue wrote:
> Once a texture has been created using dma-buf handle, we can switch texture
> units and ids with our glsl program without re-creating textures.
>
> Since we are mapping pages, instead of copying the GPU simply takes the
> maps it needs and operates on those.
>
> Much faster.
>
> ➜ libcamera git:(0.7.0-multipass-v4) ✗ grep Bench before.log
> [15:07:08.009062165] [1195303] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 729270us, 24309 us/frame
> [15:07:11.686143411] [1195334] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 733995us, 24466 us/frame
> [15:07:14.980640685] [1195363] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 740157us, 24671 us/frame
> [15:07:18.163299379] [1195393] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 720094us, 24003 us/frame
> [15:07:21.366461990] [1195422] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 719166us, 23972 us/frame
> [15:07:24.718877325] [1195451] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 725425us, 24180 us/frame
> [15:07:28.924768220] [1195481] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 753400us, 25113 us/frame
> [15:07:32.336224289] [1195513] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 727160us, 24238 us/frame
> [15:07:35.638928194] [1195542] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 762408us, 25413 us/frame
> [15:07:38.868084716] [1195579] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 728991us, 24299 us/frame
>
> ➜ libcamera git:(0.7.0-multipass-v4) ✗ grep Bench after.log
> [16:26:07.109426223] [1202010] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 650120us, 21670 us/frame
> [16:26:18.925748074] [1202048] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 611062us, 20368 us/frame
> [16:26:22.712614967] [1202077] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 609333us, 20311 us/frame
> [16:26:26.551615514] [1202107] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 609791us, 20326 us/frame
> [16:26:30.085663553] [1202136] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 612838us, 20427 us/frame
> [16:26:34.945255617] [1202165] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 681918us, 22730 us/frame
> [16:26:39.031353171] [1202194] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 595551us, 19851 us/frame
> [16:26:42.610503048] [1202227] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 663929us, 22130 us/frame
> [16:26:46.100211690] [1202256] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 580685us, 19356 us/frame
> [16:26:49.394640903] [1202286] INFO Benchmark benchmark.cpp:89 Debayer processed 30 frames in 595072us, 19835 us/frame
>
> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> ---
>  .../software_isp/software_isp_pipeline_gpu.cpp  | 17 ++++++++++-------
>  .../software_isp/software_isp_pipeline_gpu.h    |  2 +-
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/src/libcamera/software_isp/software_isp_pipeline_gpu.cpp b/src/libcamera/software_isp/software_isp_pipeline_gpu.cpp
> index 2e5c0e40e..bc5d59575 100644
> --- a/src/libcamera/software_isp/software_isp_pipeline_gpu.cpp
> +++ b/src/libcamera/software_isp/software_isp_pipeline_gpu.cpp
> @@ -263,8 +263,14 @@ int SoftwareIspPipelineGpu::processGPU(FrameBuffer *input, FrameBuffer *output,
>  		egl_.updateInputTexture2D(*eglImageBayerIn_, inMapped->value().planes()[0].data());
>  	}
>
> -	/* Generate the output render framebuffer as render to texture */
> -	egl_.createOutputDMABufTexture2D(*eglImageRGBAOut_, output->planes()[0].fd.get());
> +	/* Find an existing eglImage in the cache */
> +	auto [output_cache, output_miss] = eglImageRGBAOut_.try_emplace(output);
> +	if (output_miss) {
> +		/* Generate the output render framebuffer as render to texture */
> +		output_cache->second = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width, outputSize_.height, outputConfig_.stride, GL_TEXTURE3, 3);
> +		egl_.createOutputDMABufTexture2D(*output_cache->second, output->planes()[0].fd.get());
> +	}
> +	eGLImage &eglImageRGBAOut = *output_cache->second;
>
>  	pipelineResult = gpuIspShaderPassBlcNormalise_.process(*eglImageBayerIn_, *eglImagePingPong_[0], width_, height_, params);
>  	if (pipelineResult) {
> @@ -272,7 +278,7 @@ int SoftwareIspPipelineGpu::processGPU(FrameBuffer *input, FrameBuffer *output,
>  		return pipelineResult;
>  	}
>
> -	pipelineResult = gpuIspShaderPassDemosiac_.process(*eglImagePingPong_[0], *eglImageRGBAOut_, width_, height_, params);
> +	pipelineResult = gpuIspShaderPassDemosiac_.process(*eglImagePingPong_[0], eglImageRGBAOut, width_, height_, params);
>  	if (pipelineResult) {
>  		LOG(Debayer, Error) << "Demosiac fail";
>  		return pipelineResult;
> @@ -371,9 +377,6 @@ int SoftwareIspPipelineGpu::start()
>  	eglImagePingPong_[0] = std::make_unique<eGLImage>(gpuIspShaderPassDemosiac_.glFormat_, width_, height_, outputConfig_.stride, GL_TEXTURE1, 1);
>  	eglImagePingPong_[1] = std::make_unique<eGLImage>(gpuIspShaderPassDemosiac_.glFormat_, width_, height_, outputConfig_.stride, GL_TEXTURE2, 2);
>
> -	/* Texture we will render to */
> -	eglImageRGBAOut_ = std::make_unique<eGLImage>(GL_RGBA, outputSize_.width, outputSize_.height, outputConfig_.stride, GL_TEXTURE3, 3);
> -
>  	egl_.createInputTexture2D(*eglImageBayerIn_, NULL);
>  	egl_.createOutputTexture2D(*eglImagePingPong_[0]);
>  	egl_.createOutputTexture2D(*eglImagePingPong_[1]);
> @@ -383,7 +386,7 @@ int SoftwareIspPipelineGpu::start()
>
>  void SoftwareIspPipelineGpu::stop()
>  {
> -	eglImageRGBAOut_.reset();
> +	eglImageRGBAOut_.clear();
>  	eglImagePingPong_[1].reset();
>  	eglImagePingPong_[0].reset();
>  	eglImageBayerIn_.reset();
> diff --git a/src/libcamera/software_isp/software_isp_pipeline_gpu.h b/src/libcamera/software_isp/software_isp_pipeline_gpu.h
> index b32d4cad3..995e84295 100644
> --- a/src/libcamera/software_isp/software_isp_pipeline_gpu.h
> +++ b/src/libcamera/software_isp/software_isp_pipeline_gpu.h
> @@ -69,7 +69,7 @@ private:
>  	/* Pointer to object representing input texture */
>  	std::unique_ptr<eGLImage> eglImageBayerIn_;
>  	std::unique_ptr<eGLImage> eglImagePingPong_[2];
> -	std::unique_ptr<eGLImage> eglImageRGBAOut_;
> +	std::unordered_map<FrameBuffer *, std::unique_ptr<eGLImage>> eglImageRGBAOut_;

This has to have a hard limit on the number of entries, because technically
each request may use a different set of buffers. It must also correctly handle
the situation where a `FrameBuffer` is destroyed after request completion and
a new one, created for a later request, happens to have the same address.
I think something like `V4L2BufferCache` is needed if you want to do caching.

Regards,
Barnabás Pőcze

>
>  	std::unique_ptr<SwStatsCpu> stats_;
>  	eGL egl_;
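
The two concerns raised in the review (a hard bound on the number of cached entries, and keys that cannot be confused when a `FrameBuffer` address is reused) could be addressed along the following lines. This is only a sketch under stated assumptions, not libcamera code and not the `V4L2BufferCache` implementation: `BoundedLruCache`, `BufferKey`, `BufferKeyHash` and `Texture` are hypothetical names, and keying on the plane's dma-buf fd, offset and length instead of the `FrameBuffer *` is an assumption about what identifies a buffer. A file descriptor number can itself be reused after close, so a real implementation would probably also need to drop entries when buffers are released.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <memory>
#include <unordered_map>
#include <utility>

/* Hypothetical cache key: identifies a buffer by its first plane, not by pointer. */
struct BufferKey {
	int fd;
	unsigned int offset;
	unsigned int length;

	bool operator==(const BufferKey &other) const
	{
		return fd == other.fd && offset == other.offset && length == other.length;
	}
};

struct BufferKeyHash {
	std::size_t operator()(const BufferKey &k) const
	{
		std::size_t h = std::hash<int>{}(k.fd);
		h = h * 31 + std::hash<unsigned int>{}(k.offset);
		h = h * 31 + std::hash<unsigned int>{}(k.length);
		return h;
	}
};

/* Stand-in for the cached GPU object (an eGLImage in the patch). */
struct Texture {
	int id;
};

/* LRU cache that never holds more than `capacity` entries. */
template<typename Key, typename Value, typename Hash = std::hash<Key>>
class BoundedLruCache
{
public:
	explicit BoundedLruCache(std::size_t capacity)
		: capacity_(capacity)
	{
		assert(capacity > 0);
	}

	/*
	 * Return the value cached for key. On a miss, evict the least
	 * recently used entry if the cache is full, then call make().
	 */
	template<typename Factory>
	Value &getOrCreate(const Key &key, Factory make)
	{
		auto it = index_.find(key);
		if (it != index_.end()) {
			/* Hit: mark as most recently used. */
			entries_.splice(entries_.begin(), entries_, it->second);
			return it->second->second;
		}

		if (entries_.size() >= capacity_) {
			index_.erase(entries_.back().first);
			entries_.pop_back();
		}

		entries_.emplace_front(key, make());
		index_[key] = entries_.begin();
		return entries_.front().second;
	}

	bool contains(const Key &key) const { return index_.count(key) != 0; }
	std::size_t size() const { return entries_.size(); }

	void clear()
	{
		index_.clear();
		entries_.clear();
	}

private:
	using Entry = std::pair<Key, Value>;

	std::size_t capacity_;
	std::list<Entry> entries_; /* front = most recently used */
	std::unordered_map<Key, typename std::list<Entry>::iterator, Hash> index_;
};
```

With a capacity of 2, looking up buffers A, B, A and then C creates three textures and evicts B, the least recently used entry, so the cache size never exceeds the bound.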
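
The "Much faster" claim in the commit message can be quantified from the ten before and ten after runs in the benchmark logs. A minimal sketch, with the per-run us/frame values copied from those logs; the names `kBefore`, `kAfter`, `mean` and `reduction` are illustrative only:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

/* us/frame per run, copied from the before.log and after.log excerpts. */
const std::vector<double> kBefore = { 24309, 24466, 24671, 24003, 23972,
				      24180, 25113, 24238, 25413, 24299 };
const std::vector<double> kAfter = { 21670, 20368, 20311, 20326, 20427,
				     22730, 19851, 22130, 19356, 19835 };

/* Arithmetic mean of the samples. */
double mean(const std::vector<double> &samples)
{
	assert(!samples.empty());
	return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

/* Relative reduction in per-frame time, as a fraction of the baseline. */
double reduction(double before, double after)
{
	return (before - after) / before;
}
```

The means come out at roughly 24466 us/frame before and 20700 us/frame after, a reduction of about 15% in per-frame time.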