From patchwork Tue Mar 10 12:01:04 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Hans de Goede
X-Patchwork-Id: 26272
From: Hans de Goede
To: libcamera-devel@lists.libcamera.org
Cc: Milan Zamazal, Hans de Goede, Barnabás Pőcze
Subject: [PATCH v7 3/5] software_isp: debayer_cpu: Add multi-threading support
Date: Tue, 10 Mar 2026 13:01:04 +0100
Message-ID: <20260310120106.79922-4-johannes.goede@oss.qualcomm.com>
In-Reply-To: <20260310120106.79922-1-johannes.goede@oss.qualcomm.com>
References: <20260310120106.79922-1-johannes.goede@oss.qualcomm.com>

Add CPU soft ISP multi-threading support.

Benchmark results for the Arduino Uno-Q, whose weak CPU makes it a good
target for performance testing. All numbers are with an IMX219 running at
3280x2464 -> 3272x2464:

1 thread : 147ms / frame, ~6.5 fps
2 threads:  80ms / frame, ~12.5 fps
3 threads:  65ms / frame, ~15 fps

Adding a 4th thread does not improve performance.
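For reference, the frame times above can be converted into frame rates and
speedups with the arithmetic below. This is only a sketch: the helper names
framesPerSecond() and speedup() are made up for illustration and are not part
of libcamera.

```cpp
#include <cassert>

// Hypothetical helper: convert a per-frame processing time in milliseconds
// into frames per second.
double framesPerSecond(double msPerFrame)
{
	assert(msPerFrame > 0.0);
	return 1000.0 / msPerFrame;
}

// Hypothetical helper: speedup of a multi-threaded run relative to the
// single-threaded baseline, both given as milliseconds per frame.
double speedup(double baselineMs, double threadedMs)
{
	assert(baselineMs > 0.0 && threadedMs > 0.0);
	return baselineMs / threadedMs;
}
```

With the measured times, speedup(147, 80) is roughly 1.84 and
speedup(147, 65) is roughly 2.26, so the second thread gives most of the gain
and the third gives noticeably less.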
Tested-by: Barnabás Pőcze # ThinkPad X1 Yoga Gen 7 + ov2740
Reviewed-by: Milan Zamazal
Signed-off-by: Hans de Goede
---
Changes in v7:
- Add Debug message logging thread count

Changes in v5:
- Extend software_isp.threads docs in runtime_configuration.rst

Changes in v4:
- Document software_isp.threads option in runtime_configuration.rst
- Add and use constants for min/max/default number of threads

Changes in v3:
- Adjust for DebayerCpuThread now inheriting from Thread
- Use for (auto &thread : threads_)

Changes in v2:
- Adjust to use the new DebayerCpuThread class introduced in the v2
  patch-series
- Re-use threads instead of starting new threads every frame
---
 Documentation/runtime_configuration.rst    |  8 ++++
 src/libcamera/software_isp/debayer_cpu.cpp | 47 ++++++++++++++++++++--
 src/libcamera/software_isp/debayer_cpu.h   | 10 +++++
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/Documentation/runtime_configuration.rst b/Documentation/runtime_configuration.rst
index e99ef2fb9..651929a4d 100644
--- a/Documentation/runtime_configuration.rst
+++ b/Documentation/runtime_configuration.rst
@@ -51,6 +51,7 @@ file structure:
        measure:
          skip: # non-negative integer, frames to skip initially
          number: # non-negative integer, frames to measure
+       threads: # integer >= 1, number of render threads to use, default 2
 
 Configuration file example
 --------------------------
@@ -84,6 +85,7 @@ Configuration file example
        measure:
          skip: 50
          number: 30
+       threads: 2
 
 List of variables and configuration options
 -------------------------------------------
@@ -167,6 +169,12 @@ software_isp.measure.skip, software_isp.measure.number
 
    Example `number` value: ``30``
 
+software_isp.threads
+   Number of render threads the software ISP uses when using the CPU.
+   This must be between 1 and 8 and the default is 2.
+
+   Example value: ``2``
+
 Further details
 ---------------
 
diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index fc3305d2e..1de70b3b7 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -76,6 +76,7 @@ DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex
 	  debayer_(debayer), threadIndex_(threadIndex),
 	  enableInputMemcpy_(enableInputMemcpy)
 {
+	moveToThread(this);
 }
 
 /**
@@ -107,11 +108,15 @@ DebayerCpu::DebayerCpu(std::unique_ptr stats, const GlobalConfigurat
 	bool enableInputMemcpy =
 		configuration.option<bool>({ "software_isp", "copy_input_buffer" }).value_or(true);
 
-	/* Just one thread object for now, which will be called inline rather than async */
-	threads_.resize(1);
+	unsigned int threadCount =
+		configuration.option<unsigned int>({ "software_isp", "threads" }).value_or(kDefaultThreads);
+	threadCount = std::clamp(threadCount, kMinThreads, kMaxThreads);
+	threads_.resize(threadCount);
 
 	for (unsigned int i = 0; i < threads_.size(); i++)
 		threads_[i] = std::make_unique<DebayerCpuThread>(this, i, enableInputMemcpy);
+
+	LOG(Debayer, Debug) << "Thread count " << threadCount;
 }
 
 DebayerCpu::~DebayerCpu() = default;
@@ -746,6 +751,11 @@ void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)
 		process2(frame, src, dst);
 	else
 		process4(frame, src, dst);
+
+	debayer_->workPendingMutex_.lock();
+	debayer_->workPending_ &= ~(1 << threadIndex_);
+	debayer_->workPendingMutex_.unlock();
+	debayer_->workPendingCv_.notify_one();
 }
 
 void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
@@ -985,7 +995,21 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 
 	stats_->startFrame(frame);
 
-	threads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());
+	workPendingMutex_.lock();
+	workPending_ = (1 << threads_.size()) - 1;
+	workPendingMutex_.unlock();
+
+	for (auto &thread : threads_)
+		thread->invokeMethod(&DebayerCpuThread::process,
+				     ConnectionTypeQueued, frame,
+				     in.planes()[0].data(), out.planes()[0].data());
+
+	{
+		MutexLocker locker(workPendingMutex_);
+		workPendingCv_.wait(locker, [&]() LIBCAMERA_TSA_REQUIRES(workPendingMutex_) {
+			return workPending_ == 0;
+		});
+	}
 
 	metadata.planes()[0].bytesused = out.planes()[0].size();
 
@@ -1004,6 +1028,23 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 	inputBufferReady.emit(input);
 }
 
+int DebayerCpu::start()
+{
+	for (auto &thread : threads_)
+		thread->start();
+
+	return 0;
+}
+
+void DebayerCpu::stop()
+{
+	for (auto &thread : threads_)
+		thread->exit();
+
+	for (auto &thread : threads_)
+		thread->wait();
+}
+
 SizeRange DebayerCpu::sizes(PixelFormat inputFormat, const Size &inputSize)
 {
 	Size patternSize = this->patternSize(inputFormat);
diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
index 8e57c273b..a96998e92 100644
--- a/src/libcamera/software_isp/debayer_cpu.h
+++ b/src/libcamera/software_isp/debayer_cpu.h
@@ -16,6 +16,7 @@
 #include
 
 #include
+#include
 
 #include "libcamera/internal/bayer_format.h"
 #include "libcamera/internal/global_configuration.h"
@@ -41,6 +42,8 @@ public:
 	std::tuple
 	strideAndFrameSize(const PixelFormat &outputFormat, const Size &size);
 	void process(uint32_t frame, FrameBuffer *input, FrameBuffer *output, const DebayerParams &params);
+	int start();
+	void stop();
 	SizeRange sizes(PixelFormat inputFormat, const Size &inputSize);
 
 	const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
@@ -144,6 +147,13 @@ private:
 	std::unique_ptr stats_;
 	unsigned int xShift_; /* Offset of 0/1 applied to window_.x */
 
+	static constexpr unsigned int kMinThreads = 1;
+	static constexpr unsigned int kMaxThreads = 8;
+	static constexpr unsigned int kDefaultThreads = 2;
+
+	unsigned int workPending_ LIBCAMERA_TSA_GUARDED_BY(workPendingMutex_);
+	Mutex workPendingMutex_;
+	ConditionVariable workPendingCv_;
 	std::vector<std::unique_ptr<DebayerCpuThread>> threads_;
 };
 
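
For readers unfamiliar with the synchronisation used above, the pending-work
bitmask plus condition variable can be sketched in isolation as follows. This
is an illustration under assumptions, not the libcamera code: WorkTracker and
runFrame() are hypothetical names, plain std::mutex / std::condition_variable
stand in for libcamera's Mutex / ConditionVariable, and the workers are spawned
per call for brevity, whereas the patch starts its threads once and re-uses
them for every frame.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the workPending_ bookkeeping in DebayerCpu.
class WorkTracker
{
public:
	// Set one pending bit per worker, bits 0..count-1.
	void arm(unsigned int count)
	{
		assert(count > 0 && count < 32);
		std::lock_guard<std::mutex> lock(mutex_);
		pending_ = (1u << count) - 1;
	}

	// Called by worker `index` once its share of the frame is finished.
	void done(unsigned int index)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			pending_ &= ~(1u << index);
		}
		cv_.notify_one();
	}

	// Block the caller until every armed worker has called done().
	void waitAll()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		cv_.wait(lock, [this] { return pending_ == 0; });
	}

private:
	std::mutex mutex_;
	std::condition_variable cv_;
	unsigned int pending_ = 0;
};

// Simulate one frame: each worker writes its own slot, then signals done.
std::vector<int> runFrame(unsigned int threadCount)
{
	WorkTracker tracker;
	std::vector<int> results(threadCount, 0);
	std::vector<std::thread> workers;

	tracker.arm(threadCount);
	for (unsigned int i = 0; i < threadCount; i++)
		workers.emplace_back([&tracker, &results, i] {
			results[i] = static_cast<int>(i) + 1;
			tracker.done(i);
		});

	/* Returns only after all pending bits have been cleared. */
	tracker.waitAll();

	for (auto &worker : workers)
		worker.join();

	return results;
}
```

Calling runFrame(3) returns once all three workers have cleared their bit. The
predicate passed to wait() is what makes spurious wakeups and early
notifications harmless, since the waiter re-checks the mask under the mutex.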