From patchwork Tue May 26 08:06:39 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robert Mader
X-Patchwork-Id: 26801
From: Robert Mader
To: libcamera-devel@lists.libcamera.org
Cc: Robert Mader
Subject: [PATCH v2 2/2] debayer_egl: Sync output buffers after processing stats
Date: Tue, 26 May 2026 10:06:39 +0200
Message-ID: <20260526080639.70173-3-robert.mader@collabora.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260526080639.70173-1-robert.mader@collabora.com>
References: <20260526080639.70173-1-robert.mader@collabora.com>

Instead of waiting for the GPU to finish output buffers *before*
computing stats, do so afterwards. This allows work to happen in
parallel on the GPU and CPU, improving throughput and reducing latency
on various devices as shown below.

In order for this to work well we need to flush all GL commands to the
GPU before doing heavy CPU work - thus add a corresponding function.

Below are some benchmark results. All were done using postmarketOS edge
with updates from 21st May 2026 (Mesa 26.1.1). The mentioned pipelines
were run five times each, with the mean value included here, which
should be quite representative as the variance was rather small. All
devices were using the powersave governor.

Notes:
1.
   We only expect changes for frames where stats get computed,
   currently every fourth frame (see kStatPerNumFrames) - and the
   improvements indeed have been observed to increase when computing
   stats more often.
2. At least the qcom/freedreno devices have been found to be affected
   by a performance issue *without* this patch. This issue can be
   worked around by calling `glFlush()` directly before `glFinish()` -
   as done in v1 of this series - or by running with
   `GALLIUM_THREAD=0`. The benchmarks below show the performance gains
   *with* those workarounds applied in order to not inflate the impact
   of this patch. See
   https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15516 for
   more context.

cam -c /base/soc@0/cci@ac4a000/i2c-bus@0/camera@1a -s width=1920,height=1080 --capture=60
Before: 33596 us/frame
After:  30179 us/frame

cam -c /base/soc@0/cci@ac4a000/i2c-bus@1/camera@1a -s width=1920,height=1080 --capture=60
Before: 14922 us/frame
After:  14304 us/frame

cam -c /base/soc@0/cci@ac4b000/i2c-bus@1/camera@10 -s width=1920,height=1080 --capture=60
Before: 26106 us/frame
After:  23312 us/frame

cam -c /base/soc@0/cci@ac4a000/i2c-bus@1/camera@29 -s width=1920,height=1080 --capture=60
Before: 15897 us/frame
After:  14791 us/frame

cam -c /base/soc@0/cci@ac4a000/i2c-bus@1/camera@10 -s width=1920,height=1080 --capture=60
Before: 25721 us/frame
After:  23625 us/frame

cam -c /base/soc@0/cci@ac4a000/i2c-bus@0/camera@10 -s width=1920,height=1080 --capture=60
Before: 34124 us/frame
After:  29471 us/frame

cam -c /base/soc@0/cci@ac4a000/i2c-bus@0/camera@1a -s width=1920,height=1080 --capture=60
Before: 23707 us/frame
After:  21890 us/frame

cam -c /base/soc@0/bus@30800000/i2c@30a40000/camera@20 -s width=1280,height=720 --capture=60
Before: 91649 us/frame
After:  83233 us/frame

cam -c /base/soc@0/bus@30800000/i2c@30a50000/camera@2d -s width=1280,height=720 --capture=60
Before: 76956 us/frame
After:  69569 us/frame

cam -c /base/i2c-csi/front-camera@3c -s width=1280,height=720 --capture=60
Before: 188500 us/frame
After:  173764 us/frame

cam -c /base/i2c-csi/rear-camera@4c -s width=1280,height=720 --capture=60
Before: 190222 us/frame
After:  177251 us/frame

Signed-off-by: Robert Mader
Reviewed-by: Bryan O'Donoghue
---
 include/libcamera/internal/egl.h           |  1 +
 src/libcamera/egl.cpp                      | 13 +++++++++++++
 src/libcamera/software_isp/debayer_egl.cpp |  3 ++-
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/libcamera/internal/egl.h b/include/libcamera/internal/egl.h
index 0ad2320b1..0ec8ea6ec 100644
--- a/include/libcamera/internal/egl.h
+++ b/include/libcamera/internal/egl.h
@@ -119,6 +119,7 @@ public:
 	void useProgram(GLuint programId);
 	void deleteProgram(GLuint programId);
 	void syncOutput();
+	void flushOutput();
 
 private:
 	LIBCAMERA_DISABLE_COPY_AND_MOVE(eGL)
diff --git a/src/libcamera/egl.cpp b/src/libcamera/egl.cpp
index 357918711..19ae92305 100644
--- a/src/libcamera/egl.cpp
+++ b/src/libcamera/egl.cpp
@@ -97,6 +97,19 @@ void eGL::syncOutput()
 	glFinish();
 }
 
+/**
+ * \brief Flush the rendering pipeline
+ *
+ * Calls glFlush().
+ *
+ */
+void eGL::flushOutput()
+{
+	ASSERT(tid_ == Thread::currentId());
+
+	glFlush();
+}
+
 /**
  * \brief Create a DMA-BUF backed 2D texture
  * \param[in,out] eglImage EGL image to associate with the DMA-BUF
diff --git a/src/libcamera/software_isp/debayer_egl.cpp b/src/libcamera/software_isp/debayer_egl.cpp
index ed9a68013..7b9e02d90 100644
--- a/src/libcamera/software_isp/debayer_egl.cpp
+++ b/src/libcamera/software_isp/debayer_egl.cpp
@@ -521,7 +521,7 @@ int DebayerEGL::debayerGPU(MappedFrameBuffer &in, int out_fd, const DebayerParam
 		LOG(eGL, Error) << "Drawing scene fail " << err;
 		return -ENODEV;
 	} else {
-		egl_.syncOutput();
+		egl_.flushOutput();
 	}
 
 	return 0;
@@ -558,6 +558,7 @@ void DebayerEGL::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 	stats_->processFrame(frame, 0, input);
 	dmaSyncers.clear();
 
+	egl_.syncOutput();
 	bench_.finishFrame();
 
 	outputBufferReady.emit(output);
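
For readers who want to reason about the numbers above, here is a rough,
idealised timing model of what the reordering does on a frame where stats
are computed. It is only a sketch: the helper names are hypothetical and
not part of libcamera, and it assumes the GPU and CPU work overlap
perfectly once the GL commands have been flushed, which real drivers will
only approximate. The relativeGain() helper simply expresses a
Before/After pair as a fractional reduction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in libcamera.

// Old ordering: glFinish() right after drawing, then stats on the CPU.
// The CPU idles while the GPU renders, so the two costs add up.
constexpr std::int64_t serialFrameTimeUs(std::int64_t gpuUs, std::int64_t statsUs)
{
	return gpuUs + statsUs;
}

// New ordering: glFlush(), compute stats on the CPU, then glFinish().
// Assumption: perfect overlap, so the slower of the two dominates.
constexpr std::int64_t overlappedFrameTimeUs(std::int64_t gpuUs, std::int64_t statsUs)
{
	return std::max(gpuUs, statsUs);
}

// Fractional reduction in per-frame time, e.g. 0.10 for a 10% improvement.
constexpr double relativeGain(double beforeUs, double afterUs)
{
	return (beforeUs - afterUs) / beforeUs;
}

// Small sanity check of the model with made-up GPU and stats costs.
inline void checkModel()
{
	assert(serialFrameTimeUs(20000, 8000) == 28000);
	assert(overlappedFrameTimeUs(20000, 8000) == 20000);
}
```

Applied to the first benchmark pair (33596 us/frame before, 30179 us/frame
after), relativeGain() gives roughly 0.10, i.e. about a 10% reduction in
the mean frame time. The measured gains are smaller than the model's best
case would suggest because stats are only computed on every fourth frame.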