From patchwork Mon Feb 16 19:02:00 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 26164 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id 7CFD8C0DA4 for ; Mon, 16 Feb 2026 19:02:14 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id 16F68621F6; Mon, 16 Feb 2026 20:02:12 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (2048-bit key; unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.b="S9+VVNS7"; dkim=pass (2048-bit key; unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.b="Jhn03iVy"; dkim-atps=neutral Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 7584562084 for ; Mon, 16 Feb 2026 20:02:10 +0100 (CET) Received: from pps.filterd (m0279863.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 61GBCMhu2522870 for ; Mon, 16 Feb 2026 19:02:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=qcppdkim1; bh=ZPJPpQWAv3f wV2B3qD8jcpXwRAqAWVW9lYTGrYN+zBY=; b=S9+VVNS7WFJNfxZMOBLZuxQjSSY c+QssJXfMo/9Uq+Mb1YIm6c2xibe26TQmr9qVcarO9kDp9FctTeaaM+4owaGkEr1 1Q8JssshbLvLFy6Q2abEglwymaEBHP6kQT1CYQnRXJwCxeJaln6/DNZ0waaBFoxi YwnvXJfjoXKZxmOUw2usG3znCihsvpDnNPgXHoU8eYTBJ2HWDL0eDmMsEESZwd81 Glps/M36dVjimNjbYVs1ACRf2ll3hNv9hUyF8qBdLHZBfRpzOuqb86G8EUSLuqR/ cmbNvzD5VgrKrrSd+yk2nBdgWPMyQfZH+jEveqdkf/llfFmJzDIDwXPI89A== Received: from mail-vk1-f198.google.com (mail-vk1-f198.google.com [209.85.221.198]) by 
mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cb6bukt7g-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for ; Mon, 16 Feb 2026 19:02:08 +0000 (GMT) Received: by mail-vk1-f198.google.com with SMTP id 71dfb90a1353d-5662a8e87a0so6082139e0c.1 for ; Mon, 16 Feb 2026 11:02:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1771268527; x=1771873327; darn=lists.libcamera.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZPJPpQWAv3fwV2B3qD8jcpXwRAqAWVW9lYTGrYN+zBY=; b=Jhn03iVySC94d16byLjk7kIoWiv4/ClQw8BK/2gaPpyb32yRTR9VMUOOcYiSV4e8W9 CbSRQs2Qz+PT2sgIsnuyXiwbRajYXJSe98e5wtVM6dcsn93puPkXszpOaeZsQGIrRQg7 AwjMOsOrvvvn15YWfslnGSPho7CLGKst4DUiAesGZ9iBly93l7xQEGnpltpaiV9R8Zke lPzTXKPR0DI9i53oA5pxuzOqt6YvU51nyxZGxejV4Bg/SOkXmlkoV0NxuMTxJ6zI2v6s UpV4YlbEVDoqqndxuBONgJZwZA8NcNLigX3qCQnE/ddIsgEl+8ItShucu+gKPWKRC/kW JJhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1771268527; x=1771873327; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=ZPJPpQWAv3fwV2B3qD8jcpXwRAqAWVW9lYTGrYN+zBY=; b=XNBi4bYhAvP6NNw6AEl70TOms/EaS0+TbmJnUiismemtyINO5r0ICrbkyLDZaDkRcu dR7owhq+mp0Zb0JQp3Ue+fIVXAyRGEfx2DkXPc9knGNyTSL2Ho6mjzhtsLvLw5ruFPau UjKma71vGIILs9KTRBfgKWP0MZBO7/9T7AF3t7ppv9J+0TLFKZqRQFnDnrfk43WB/NgW sQGa+nQQhP37jyEfASaOdaD1KEQe72V6rv8r4JUIQTy4oYXUJrxXA1abdP0+ja5/WM4r HhV2irvl/iqhVvyAe9WvdonNiAms5fNHnKVDUOQrXYli9+CY9Ve6swy730mkAkQMz5hf qqOw== X-Gm-Message-State: AOJu0YwqXTIWjLMTkKkfVwXdTkLsaCI9Oevd5L0nlS496TrnQzCJdonR /UUnvwoCSrtSZ2wUW624gZth/kTj6rdI0n21+65U6GXxUorkhtBGDxkZPdg/USXRj4pv8fnYyVw hoyxa9iydz7kge2RUGjMPbHLTwmHuS5TW1dprEEVjRwqdYRiqneIFcsNW2njDf6TRThg6vTpTqn vzysj+RrME X-Gm-Gg: AZuq6aKjSkAJa3560N/GmHV7aAk6LOvM7EgbCe7OUdBZzHbce9kJvrwtKUSsrB+5/8i 
7Qx6cWiymkquCzxs6jmcopCPzAjQEH22wmyJ7sH8EpCVGkwAHga++2bTeV07tdrYAHe8GwVbVsx xXgkY45lAi6rzgrjE5ZLjYCtbKIxrqGA8tT7sPYMD6QLu79y7dQYn1pXC+y2pbwRg0La7b6bdQh jSz1PgzT/4gjeXHfAMYE4MFVmC29Ono2TE0LUmh5y6rNpLyvuqh2tighO3CH6q/aQTSlQsdk11L Q7/UW3wL2mDXzSJhX2EUOJh5BdypoPH9UQxn2JjOaBT+XEwAuSSo5RiQ2cb/e/ZDe/yfMo/bCDf b6I93n5D1ZQpmnlJH9u+Lkd0p2BDOZs0K+qVbQfXvK9a3Yo/oy38mxNT52bWCQ65bFkFpNc84A/ 6WEKDqG4/QfzlUc/vSQPc5z981d0W7SAOwrMVb X-Received: by 2002:a05:6122:a01:b0:563:73ff:19be with SMTP id 71dfb90a1353d-56889b6bd02mr2103340e0c.8.1771268527254; Mon, 16 Feb 2026 11:02:07 -0800 (PST) X-Received: by 2002:a05:6122:a01:b0:563:73ff:19be with SMTP id 71dfb90a1353d-56889b6bd02mr2103283e0c.8.1771268526657; Mon, 16 Feb 2026 11:02:06 -0800 (PST) Received: from shalem (2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl. [2001:1c00:c32:7800:5bfa:a036:83f0:f9ec]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8fc735e587sm276698966b.2.2026.02.16.11.02.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Feb 2026 11:02:06 -0800 (PST) From: Hans de Goede To: libcamera-devel@lists.libcamera.org, Milan Zamazal Cc: Hans de Goede Subject: [PATCH 1/5] software_isp: swstats_cpu: Move accumulator storage out of the class Date: Mon, 16 Feb 2026 20:02:00 +0100 Message-ID: <20260216190204.106922-2-johannes.goede@oss.qualcomm.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> References: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> MIME-Version: 1.0 X-Proofpoint-GUID: j7gR0LYcap0kG8jzxCFVBwRbf8C_6K8R X-Proofpoint-ORIG-GUID: j7gR0LYcap0kG8jzxCFVBwRbf8C_6K8R X-Authority-Analysis: v=2.4 cv=M8dA6iws c=1 sm=1 tr=0 ts=699369b0 cx=c_pps a=1Os3MKEOqt8YzSjcPV0cFA==:117 a=xqWC_Br6kY4A:10 a=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=Mpw57Om8IfrbqaoTuvik:22 a=GgsMoib0sEa3-_RKJdDe:22 a=EUspDBNiAAAA:8 a=3n2xhyToohh5fnZkaeQA:9 a=hhpmQAJR8DioWGSBphRh:22 X-Proofpoint-Spam-Details-Enc: 
AW1haW4tMjYwMjE2MDE2MyBTYWx0ZWRfX6LJNkA33AM6L 7CzMv7vI1yC8IXt0EjsZwOoYMtoGufYJE6j16rjM349g5IrkQEpOGyj/wkRgGwV7rW4vMUVqQ25 oJYBkw8BO8gZ+f6qDVcqAJ8lTr1OfIi7D2v92PmQzbBYvEXrcJB6lUlebneMkMpjfuKrXb9xzqI HB9LgW852xrBuDOoCRdGjnvwCh1PS9MxQT1SJMwYjmIVndumxDJZ80vIWTt/IDpnRm4YQst9g+T cPhoFWMkCCyuzksjjfKd0YEbKl1w/neKWEl8ZU7zBX62/nssz0tC2zRNa5V5vepvVjojJWRn/N2 qwUfG0loJCqPLZKe/6ScxMWqNm0U6UFT0M3ZerHhZsSd5NgkqA2dWgqgkkxUimL+vuYc7PnFtpG bDXoe1fmAqjJyU6NBj0J5FXU7LTL71qgXn+24PELY9tAVmfiO+gJRG/AWIRu2a2Po9EyudeQXO0 LlpjNEk+8bOTcjE+Zxg== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51, FMLib:17.12.100.49 definitions=2026-02-16_06,2026-02-16_04,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 adultscore=0 suspectscore=0 bulkscore=0 impostorscore=0 clxscore=1015 phishscore=0 lowpriorityscore=0 priorityscore=1501 malwarescore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2601150000 definitions=main-2602160163 X-BeenThere: libcamera-devel@lists.libcamera.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libcamera-devel-bounces@lists.libcamera.org Sender: "libcamera-devel" Move the storage used to accumulate the RGB sums and the Y histogram out of the SwStatsCpu class and into the callers. The idea is to allow a single SwStatsCpu object to be shared between multiple threads each processing part of the image, with finishFrame() accumulating the per thread data into the final stats for the entire frame. This is a preparation patch for making DebayerCpu support multi-threading and this could also be used to make processFrame() multi-threaded. 
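As a rough illustration of the idea described above (not code from this series), the sketch below shows per-thread accumulators being summed into one frame-wide result. ExampleStats, accumulateRows() and mergeStats() are hypothetical stand-ins for SwIspStats, the per-line stats functions and the merge step in finishFrame(); the single sum field and the 64-bin histogram are assumptions made for brevity. Each accumulateRows() call stands in for the work one thread would do on its own slice of rows.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

/* Hypothetical stand-in for SwIspStats; fields are illustrative only. */
struct ExampleStats {
	static constexpr std::size_t kHistogramSize = 64;
	uint64_t sum = 0;
	std::array<uint32_t, kHistogramSize> yHistogram{};
};

/*
 * Accumulate rows [yStart, yEnd) of an 8-bit luma image into *stats.
 * Each worker writes only to its own buffer, so no locking is needed.
 */
void accumulateRows(const std::vector<uint8_t> &luma, std::size_t width,
		    std::size_t yStart, std::size_t yEnd, ExampleStats *stats)
{
	assert(yStart <= yEnd && yEnd * width <= luma.size());

	for (std::size_t y = yStart; y < yEnd; y++) {
		for (std::size_t x = 0; x < width; x++) {
			uint8_t v = luma[y * width + x];
			stats->sum += v;
			stats->yHistogram[v * ExampleStats::kHistogramSize / 256]++;
		}
	}
}

/* Sum the per-thread buffers into the stats for the entire frame. */
ExampleStats mergeStats(const std::vector<ExampleStats> &perThread)
{
	ExampleStats total;

	for (const ExampleStats &part : perThread) {
		total.sum += part.sum;
		for (std::size_t j = 0; j < ExampleStats::kHistogramSize; j++)
			total.yHistogram[j] += part.yHistogram[j];
	}

	return total;
}
```

Because both the sum and the histogram are plain additions, merging the per-slice buffers gives the same result as accumulating over the whole frame in one pass, which is why the merge can be deferred to the end of the frame.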
Benchmarking with the GPU-ISP which does separate swstats benchmarking, on the Uno-Q which has a weak CPU which is good for performance testing, shows 20-21ms to generate stats for a 3272x2464 frame both before and after this change. Signed-off-by: Hans de Goede --- .../internal/software_isp/swstats_cpu.h | 29 ++++----- src/libcamera/software_isp/debayer_cpu.cpp | 12 ++-- src/libcamera/software_isp/debayer_cpu.h | 1 + src/libcamera/software_isp/swstats_cpu.cpp | 65 +++++++++++++------ 4 files changed, 65 insertions(+), 42 deletions(-) diff --git a/include/libcamera/internal/software_isp/swstats_cpu.h b/include/libcamera/internal/software_isp/swstats_cpu.h index 64b3e23f..a157afe8 100644 --- a/include/libcamera/internal/software_isp/swstats_cpu.h +++ b/include/libcamera/internal/software_isp/swstats_cpu.h @@ -53,11 +53,11 @@ public: int configure(const StreamConfiguration &inputCfg); void setWindow(const Rectangle &window); - void startFrame(uint32_t frame); - void finishFrame(uint32_t frame, uint32_t bufferId); + void startFrame(uint32_t frame, struct SwIspStats statsBuffer[], unsigned int statsBufferCount); + void finishFrame(uint32_t frame, uint32_t bufferId, struct SwIspStats statsBuffer[], unsigned int statsBufferCount); void processFrame(uint32_t frame, uint32_t bufferId, FrameBuffer *input); - void processLine0(uint32_t frame, unsigned int y, const uint8_t *src[]) + void processLine0(uint32_t frame, unsigned int y, const uint8_t *src[], SwIspStats *stats) { if (frame % kStatPerNumFrames) return; @@ -66,10 +66,10 @@ public: y >= (window_.y + window_.height)) return; - (this->*stats0_)(src); + (this->*stats0_)(src, stats); } - void processLine2(uint32_t frame, unsigned int y, const uint8_t *src[]) + void processLine2(uint32_t frame, unsigned int y, const uint8_t *src[], SwIspStats *stats) { if (frame % kStatPerNumFrames) return; @@ -78,27 +78,27 @@ public: y >= (window_.y + window_.height)) return; - (this->*stats2_)(src); + (this->*stats2_)(src, stats); } 
Signal statsReady; private: - using statsProcessFn = void (SwStatsCpu::*)(const uint8_t *src[]); - using processFrameFn = void (SwStatsCpu::*)(MappedFrameBuffer &in); + using statsProcessFn = void (SwStatsCpu::*)(const uint8_t *src[], SwIspStats *stats); + using processFrameFn = void (SwStatsCpu::*)(MappedFrameBuffer &in, SwIspStats *stats); int setupStandardBayerOrder(BayerFormat::Order order); /* Bayer 8 bpp unpacked */ - void statsBGGR8Line0(const uint8_t *src[]); + void statsBGGR8Line0(const uint8_t *src[], SwIspStats *stats); /* Bayer 10 bpp unpacked */ - void statsBGGR10Line0(const uint8_t *src[]); + void statsBGGR10Line0(const uint8_t *src[], SwIspStats *stats); /* Bayer 12 bpp unpacked */ - void statsBGGR12Line0(const uint8_t *src[]); + void statsBGGR12Line0(const uint8_t *src[], SwIspStats *stats); /* Bayer 10 bpp packed */ - void statsBGGR10PLine0(const uint8_t *src[]); - void statsGBRG10PLine0(const uint8_t *src[]); + void statsBGGR10PLine0(const uint8_t *src[], SwIspStats *stats); + void statsGBRG10PLine0(const uint8_t *src[], SwIspStats *stats); - void processBayerFrame2(MappedFrameBuffer &in); + void processBayerFrame2(MappedFrameBuffer &in, SwIspStats *stats); processFrameFn processFrame_; @@ -117,7 +117,6 @@ private: unsigned int stride_; SharedMemObject sharedStats_; - SwIspStats stats_; Benchmark bench_; }; diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp index d0988357..97c1959a 100644 --- a/src/libcamera/software_isp/debayer_cpu.cpp +++ b/src/libcamera/software_isp/debayer_cpu.cpp @@ -673,7 +673,7 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst) for (unsigned int y = 0; y < yEnd; y += 2) { shiftLinePointers(linePointers, src); memcpyNextLine(linePointers); - stats_->processLine0(frame, y, linePointers); + stats_->processLine0(frame, y, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ 
-688,7 +688,7 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst) if (window_.y == 0) { shiftLinePointers(linePointers, src); memcpyNextLine(linePointers); - stats_->processLine0(frame, yEnd, linePointers); + stats_->processLine0(frame, yEnd, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ -724,7 +724,7 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst) for (unsigned int y = 0; y < window_.height; y += 4) { shiftLinePointers(linePointers, src); memcpyNextLine(linePointers); - stats_->processLine0(frame, y, linePointers); + stats_->processLine0(frame, y, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ -737,7 +737,7 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst) shiftLinePointers(linePointers, src); memcpyNextLine(linePointers); - stats_->processLine2(frame, y, linePointers); + stats_->processLine2(frame, y, linePointers, &statsBuffer_); (this->*debayer2_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ -866,7 +866,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output return; } - stats_->startFrame(frame); + stats_->startFrame(frame, &statsBuffer_, 1); if (inputConfig_.patternSize.height == 2) process2(frame, in.planes()[0].data(), out.planes()[0].data()); @@ -885,7 +885,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output * * \todo Pass real bufferId once stats buffer passing is changed. 
*/ - stats_->finishFrame(frame, 0); + stats_->finishFrame(frame, 0, &statsBuffer_, 1); outputBufferReady.emit(output); inputBufferReady.emit(input); } diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h index 7a651746..8abf5168 100644 --- a/src/libcamera/software_isp/debayer_cpu.h +++ b/src/libcamera/software_isp/debayer_cpu.h @@ -135,6 +135,7 @@ private: LookupTable gammaLut_; bool ccmEnabled_; DebayerParams params_; + SwIspStats statsBuffer_; debayerFn debayer0_; debayerFn debayer1_; diff --git a/src/libcamera/software_isp/swstats_cpu.cpp b/src/libcamera/software_isp/swstats_cpu.cpp index 5c3011a7..23842f6c 100644 --- a/src/libcamera/software_isp/swstats_cpu.cpp +++ b/src/libcamera/software_isp/swstats_cpu.cpp @@ -182,14 +182,14 @@ static constexpr unsigned int kBlueYMul = 29; /* 0.114 * 256 */ yVal = r * kRedYMul; \ yVal += g * kGreenYMul; \ yVal += b * kBlueYMul; \ - stats_.yHistogram[yVal * SwIspStats::kYHistogramSize / (256 * 256 * (div))]++; + stats->yHistogram[yVal * SwIspStats::kYHistogramSize / (256 * 256 * (div))]++; #define SWSTATS_FINISH_LINE_STATS() \ - stats_.sum_.r() += sumR; \ - stats_.sum_.g() += sumG; \ - stats_.sum_.b() += sumB; + stats->sum_.r() += sumR; \ + stats->sum_.g() += sumG; \ + stats->sum_.b() += sumB; -void SwStatsCpu::statsBGGR8Line0(const uint8_t *src[]) +void SwStatsCpu::statsBGGR8Line0(const uint8_t *src[], SwIspStats *stats) { const uint8_t *src0 = src[1] + window_.x; const uint8_t *src1 = src[2] + window_.x; @@ -214,7 +214,7 @@ void SwStatsCpu::statsBGGR8Line0(const uint8_t *src[]) SWSTATS_FINISH_LINE_STATS() } -void SwStatsCpu::statsBGGR10Line0(const uint8_t *src[]) +void SwStatsCpu::statsBGGR10Line0(const uint8_t *src[], SwIspStats *stats) { const uint16_t *src0 = (const uint16_t *)src[1] + window_.x; const uint16_t *src1 = (const uint16_t *)src[2] + window_.x; @@ -240,7 +240,7 @@ void SwStatsCpu::statsBGGR10Line0(const uint8_t *src[]) SWSTATS_FINISH_LINE_STATS() } -void 
SwStatsCpu::statsBGGR12Line0(const uint8_t *src[]) +void SwStatsCpu::statsBGGR12Line0(const uint8_t *src[], SwIspStats *stats) { const uint16_t *src0 = (const uint16_t *)src[1] + window_.x; const uint16_t *src1 = (const uint16_t *)src[2] + window_.x; @@ -266,7 +266,7 @@ void SwStatsCpu::statsBGGR12Line0(const uint8_t *src[]) SWSTATS_FINISH_LINE_STATS() } -void SwStatsCpu::statsBGGR10PLine0(const uint8_t *src[]) +void SwStatsCpu::statsBGGR10PLine0(const uint8_t *src[], SwIspStats *stats) { const uint8_t *src0 = src[1] + window_.x * 5 / 4; const uint8_t *src1 = src[2] + window_.x * 5 / 4; @@ -292,7 +292,7 @@ void SwStatsCpu::statsBGGR10PLine0(const uint8_t *src[]) SWSTATS_FINISH_LINE_STATS() } -void SwStatsCpu::statsGBRG10PLine0(const uint8_t *src[]) +void SwStatsCpu::statsGBRG10PLine0(const uint8_t *src[], SwIspStats *stats) { const uint8_t *src0 = src[1] + window_.x * 5 / 4; const uint8_t *src1 = src[2] + window_.x * 5 / 4; @@ -321,10 +321,13 @@ void SwStatsCpu::statsGBRG10PLine0(const uint8_t *src[]) /** * \brief Reset state to start statistics gathering for a new frame * \param[in] frame The frame number + * \param[in] statsBuffer Array of buffers storing stats + * \param[in] statsBufferCount number of buffers in the statsBuffer array * * This may only be called after a successful setWindow() call. 
*/ -void SwStatsCpu::startFrame(uint32_t frame) +void SwStatsCpu::startFrame(uint32_t frame, + struct SwIspStats statsBuffer[], unsigned int statsBufferCount) { if (frame % kStatPerNumFrames) return; @@ -332,21 +335,39 @@ void SwStatsCpu::startFrame(uint32_t frame) if (window_.width == 0) LOG(SwStatsCpu, Error) << "Calling startFrame() without setWindow()"; - stats_.sum_ = RGB({ 0, 0, 0 }); - stats_.yHistogram.fill(0); + for (unsigned int i = 0; i < statsBufferCount; i++) { + statsBuffer[i].sum_ = RGB({ 0, 0, 0 }); + statsBuffer[i].yHistogram.fill(0); + } } /** * \brief Finish statistics calculation for the current frame * \param[in] frame The frame number * \param[in] bufferId ID of the statistics buffer + * \param[in] statsBuffer Array of buffers storing stats + * \param[in] statsBufferCount number of buffers in the statsBuffer array * * This may only be called after a successful setWindow() call. */ -void SwStatsCpu::finishFrame(uint32_t frame, uint32_t bufferId) +void SwStatsCpu::finishFrame(uint32_t frame, uint32_t bufferId, + struct SwIspStats statsBuffer[], unsigned int statsBufferCount) { - stats_.valid = frame % kStatPerNumFrames == 0; - *sharedStats_ = stats_; + if (frame % kStatPerNumFrames) { + sharedStats_->valid = false; + statsReady.emit(frame, bufferId); + return; + } + + sharedStats_->sum_ = RGB({ 0, 0, 0 }); + sharedStats_->yHistogram.fill(0); + for (unsigned int i = 0; i < statsBufferCount; i++) { + sharedStats_->sum_ += statsBuffer[i].sum_; + for (unsigned int j = 0; j < SwIspStats::kYHistogramSize; j++) + sharedStats_->yHistogram[j] += statsBuffer[i].yHistogram[j]; + } + + sharedStats_->valid = true; statsReady.emit(frame, bufferId); } @@ -487,7 +508,7 @@ void SwStatsCpu::setWindow(const Rectangle &window) window_.height &= ~(patternSize_.height - 1); } -void SwStatsCpu::processBayerFrame2(MappedFrameBuffer &in) +void SwStatsCpu::processBayerFrame2(MappedFrameBuffer &in, SwIspStats *stats) { const uint8_t *src = in.planes()[0].data(); const 
uint8_t *linePointers[3]; @@ -504,7 +525,7 @@ void SwStatsCpu::processBayerFrame2(MappedFrameBuffer &in) /* linePointers[0] is not used by any stats0_ functions */ linePointers[1] = src; linePointers[2] = src + stride_; - (this->*stats0_)(linePointers); + (this->*stats0_)(linePointers, stats); src += stride_ * 2; } } @@ -520,12 +541,14 @@ void SwStatsCpu::processBayerFrame2(MappedFrameBuffer &in) void SwStatsCpu::processFrame(uint32_t frame, uint32_t bufferId, FrameBuffer *input) { if (frame % kStatPerNumFrames) { - finishFrame(frame, bufferId); + finishFrame(frame, bufferId, NULL, 0); return; } + SwIspStats stats; + bench_.startFrame(); - startFrame(frame); + startFrame(frame, &stats, 1); MappedFrameBuffer in(input, MappedFrameBuffer::MapFlag::Read); if (!in.isValid()) { @@ -533,8 +556,8 @@ void SwStatsCpu::processFrame(uint32_t frame, uint32_t bufferId, FrameBuffer *in return; } - (this->*processFrame_)(in); - finishFrame(frame, bufferId); + (this->*processFrame_)(in, &stats); + finishFrame(frame, bufferId, &stats, 1); bench_.finishFrame(); } From patchwork Mon Feb 16 19:02:01 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 26168 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id 6932BC3240 for ; Mon, 16 Feb 2026 19:02:19 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id 1BB9F6220D; Mon, 16 Feb 2026 20:02:19 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (2048-bit key; unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.b="pifdwMZV"; dkim=pass (2048-bit key; unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.b="GcySVJCA"; 
dkim-atps=neutral Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 45D0962217 for ; Mon, 16 Feb 2026 20:02:17 +0100 (CET) Received: from pps.filterd (m0279865.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 61GC5oZc2987647 for ; Mon, 16 Feb 2026 19:02:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=qcppdkim1; bh=KGJwpNxvV2U wvsIFRaK7R3GhgzvleU0SLTfhWhMvPDM=; b=pifdwMZVHisoO8+2+m81C716Q1F eMDo1KAwQw4u2KZozc3yt4QJ+3t9hTbrxg0qfWgnKOViPmSkn55bJSRZvfu8AMKc 0PJwj97ZjoiuVidhnjszPnsQZ6rOhPihjiBQHsFsbymokT1B5RZ0WUznCFTGZnam DwwVLbXy6lwx19sON0CpRr313WrzEPR/KBIS7fgBrNySEZZOXNT7jgd8QbeGrX+D J/iJIJIz6josPy9VZ6mpvRuMpmw8IkaW6frCO3JMMsLNbxN8KBrRlyB7tkingA58 qpQq75Z++aTW0BDNnyPauauhkpALiXm4EcZvcvtm7u/+AR2f2Uy3wz5U0zw== Received: from mail-vk1-f197.google.com (mail-vk1-f197.google.com [209.85.221.197]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cbfuw340s-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for ; Mon, 16 Feb 2026 19:02:09 +0000 (GMT) Received: by mail-vk1-f197.google.com with SMTP id 71dfb90a1353d-5662a8e87a0so6082184e0c.1 for ; Mon, 16 Feb 2026 11:02:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1771268528; x=1771873328; darn=lists.libcamera.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KGJwpNxvV2UwvsIFRaK7R3GhgzvleU0SLTfhWhMvPDM=; b=GcySVJCA3eKDNP1Fsb1yQoj9JGpgGIhRlMp9mYT/A/OVu5/bQX/GOdEkzCaWYf/HFE +3ECL5VJY4niOSTa3kiDi+bTuo20P3WQ7E4IsSM3+3FMUx17VdjG3dwt6mOHTWZb4F2U 0Muqr2ACeFEaZp9zhGbKNVnsRy4u99TuBXx88dFyR39h7b5AJytPZvSnHqkGs/XvNO5r y5XrVcvpTTTr1mu9+8lqm3drnYKD5njjrkjpoPVuuvpo5vBcJVc5+8ny45FN7RzALGc/ 
T/6LddlnO+6pMJXr43knwblHlAng8FVzMbvRwWVDPsPdfQpbicCnl2sFv1Bf+QvxaV0R sgXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1771268528; x=1771873328; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=KGJwpNxvV2UwvsIFRaK7R3GhgzvleU0SLTfhWhMvPDM=; b=B+1gGs/S8Ik7wTf10EHAf8bLeiFFYQLNx7/M0OsziwS7MJKFEtKtg3biZvM9F2n3rc cNx2T26P6UGf5yYyBc69qlH6FJC9BTtOSSMXFoIwSGLeD+OrQBIMviWNxBqwIpKyre6p Ij184Fw0upco9IdZkGkHRFmW1iAnNJxE9ZYESyO6Uxbb+neS+paHAbfNyvub3i2rBukm a0qpWD/DpP7SU5kOK043+wopx2YqoiHTKARBJK81iZJ596Kk35yjO8fLnOXPwHP+B5NI PhjpCtnzw1x2jdri6Q6ob7dKHQTLjscOCL2mtDOPn3HaryHx5KxHJjNig+cQ2O6vBgh2 3Y1Q== X-Gm-Message-State: AOJu0Yw7bfQg4Oc72Drf2Wiq68kDAmHefElHT7nbavu27gTeVb3MWEf1 +R9D6NzTXwoMLZzcigM37aoZXj0TvaDGaoebKjbDKM9X3TCDLR1Fm3/Wk1bHeZIuY5yM/9Yha7w vGN2qQX0LCZgZgaXMGvLspvxwuGL69wPtVRM4ZKrwsoZD/KuoDmr8bd++D5lKS59bFIihQtU9eA GafSD0MXw0 X-Gm-Gg: AZuq6aKSWziFe/jszY3vPzFxIJcY1nCumBVstlizzo5IGRRs3zEkqV4/2Li6OuTyKWh wtLzEpQFf8LutBT0lMrQywm/aja7lfa3lHzqL8TEbi02bbneg0TJN0nJJlo0zT2otn77eA14QWI wIUFJtaeY7MiHFktWvulcRu8eeno/BwU85jWqOfH+avBTLxP9xBd4hhN7oCFOCNyLTvi2cO1j5C YP/Y6Hsivsp6ieorkqCdHl/Ke3i8JnmYi9zmETbJAwZEoI4Cn0Kc7ImaVs9Rcph8CbCYNcgBTiH BWJ7A77mfYsfIMimkpcI4re3hi+1DWX6nl/ImK9UXIdLZiuHp1TDG7sC46OvyKnQOrQWR3UNwLY C3QuDsGdKRYZuhyY1GUfuhegYDVmJSiw17jiuopHiNhzQUH/JYNrijxMlrW/FX5GJLgoWHaVldW 1OeHEq9YQoZVa37spi8D6LAiiLyMPvUxOlahi6 X-Received: by 2002:a05:6122:2a4c:b0:567:d87:e18c with SMTP id 71dfb90a1353d-56889b8e750mr2305622e0c.9.1771268528179; Mon, 16 Feb 2026 11:02:08 -0800 (PST) X-Received: by 2002:a05:6122:2a4c:b0:567:d87:e18c with SMTP id 71dfb90a1353d-56889b8e750mr2305572e0c.9.1771268527605; Mon, 16 Feb 2026 11:02:07 -0800 (PST) Received: from shalem (2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl. 
[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8fc735e587sm276698966b.2.2026.02.16.11.02.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Feb 2026 11:02:07 -0800 (PST) From: Hans de Goede To: libcamera-devel@lists.libcamera.org, Milan Zamazal Cc: Hans de Goede Subject: [PATCH 2/5] software_isp: debayer_cpu: Add per render thread data Date: Mon, 16 Feb 2026 20:02:01 +0100 Message-ID: <20260216190204.106922-3-johannes.goede@oss.qualcomm.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> References: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> MIME-Version: 1.0 X-Authority-Analysis: v=2.4 cv=Jb+xbEKV c=1 sm=1 tr=0 ts=699369b1 cx=c_pps a=JIY1xp/sjQ9K5JH4t62bdg==:117 a=xqWC_Br6kY4A:10 a=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=Mpw57Om8IfrbqaoTuvik:22 a=GgsMoib0sEa3-_RKJdDe:22 a=EUspDBNiAAAA:8 a=LZmg44-WHxdkjKT3r1sA:9 a=tNoRWFLymzeba-QzToBc:22 X-Proofpoint-ORIG-GUID: Tnf95s2-LoQSF_CWipS46Gc7eyAHIkwq X-Proofpoint-GUID: Tnf95s2-LoQSF_CWipS46Gc7eyAHIkwq X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMjE2MDE2MyBTYWx0ZWRfX9veWaIY+hMaM G50kJPbPUGs3IKJ7yu/3E4op1GiFqSQYG4b1QIJnTsY/H/10rfdkJnKsJKVccah6wp97Y1en0e7 iobp5HywP+eZ/sfDLhJym1RiROoER4iNhAlcBRArcRX8ZDd563YAZDqrV4fuNXCy6ntTJBnHC1A e6hZH/l0H+gGUKjah1lXDMI2S5tIL5rrY3pOI4BqSqG4zetMg7cVIr0iSAvfsDKlprQ6j5XgiD6 aRWDQyVP3JzU6teCQmT+UFQJxJKX7Ok7IBWzOwgxgdbZmWWL5jCEtxzKH9K8qA0QDpriSJKLiFN JfwjWXkiPAfwoqZZy3n+bL2iFjIE9UogkehHUkxiFuJc4RXxJOxzxkc9ePTVYaGeqiycyNUq/Yh 5NOrh5d3V1INAkhw/DAmGk58eNZ7PljJPMz/b63yw453oQY0XdLAmgMq2eYQZoZb47jCFmzUZqf FByQqatzTEPd8CWm2Lg== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51, FMLib:17.12.100.49 definitions=2026-02-16_06,2026-02-16_04,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 impostorscore=0 adultscore=0 suspectscore=0 spamscore=0 
lowpriorityscore=0 priorityscore=1501 phishscore=0 clxscore=1015 malwarescore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2601150000 definitions=main-2602160163 X-BeenThere: libcamera-devel@lists.libcamera.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libcamera-devel-bounces@lists.libcamera.org Sender: "libcamera-devel" Add a DebayerCpuThreadData data struct and use this in the inner render loop. This contains data which needs to be separate per thread. This is a preparation patch for making DebayerCpu support multi-threading. Note this passes the DebayerCpuThreadData by pointer rather than by reference, because std::thread() copies the arguments for the thread function, so a plain reference argument is not supported (without wrapping it in std::ref()). Benchmarking on the Uno-Q, whose weak CPU makes it a good target for performance testing, shows 146-147ms per 3272x2464 frame both before and after this change, with things maybe being 0.5 ms slower after this change. Signed-off-by: Hans de Goede --- src/libcamera/software_isp/debayer_cpu.cpp | 90 ++++++++++++++-------- src/libcamera/software_isp/debayer_cpu.h | 30 +++++--- 2 files changed, 77 insertions(+), 43 deletions(-) diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp index 97c1959a..e1d3c164 100644 --- a/src/libcamera/software_isp/debayer_cpu.cpp +++ b/src/libcamera/software_isp/debayer_cpu.cpp @@ -41,7 +41,7 @@ namespace libcamera { * \param[in] configuration The global configuration */ DebayerCpu::DebayerCpu(std::unique_ptr stats, const GlobalConfiguration &configuration) - : Debayer(configuration), stats_(std::move(stats)) + : Debayer(configuration), stats_(std::move(stats)), threadCount_(1) { /* * Reading from uncached buffers may be very slow. 
@@ -555,8 +555,9 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg, 2 * lineBufferPadding_; if (enableInputMemcpy_) { - for (unsigned int i = 0; i <= inputConfig_.patternSize.height; i++) - lineBuffers_[i].resize(lineBufferLength_); + for (unsigned int i = 0; i < threadCount_; i++) + for (unsigned int j = 0; j <= inputConfig_.patternSize.height; j++) + threadData_[i].lineBuffers[j].resize(lineBufferLength_); } return 0; @@ -600,7 +601,8 @@ DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, const Size &size return std::make_tuple(stride, stride * size.height); } -void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[]) +void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[], + DebayerCpuThreadData *threadData) { const unsigned int patternHeight = inputConfig_.patternSize.height; @@ -608,14 +610,14 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[]) return; for (unsigned int i = 0; i < patternHeight; i++) { - memcpy(lineBuffers_[i].data(), + memcpy(threadData->lineBuffers[i].data(), linePointers[i + 1] - lineBufferPadding_, lineBufferLength_); - linePointers[i + 1] = lineBuffers_[i].data() + lineBufferPadding_; + linePointers[i + 1] = threadData->lineBuffers[i].data() + lineBufferPadding_; } /* Point lineBufferIndex_ to first unused lineBuffer */ - lineBufferIndex_ = patternHeight; + threadData->lineBufferIndex = patternHeight; } void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src) @@ -629,66 +631,78 @@ void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t (patternHeight / 2) * (int)inputConfig_.stride; } -void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[]) +void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[], + DebayerCpuThreadData *threadData) { const unsigned int patternHeight = inputConfig_.patternSize.height; if (!enableInputMemcpy_) return; - memcpy(lineBuffers_[lineBufferIndex_].data(), + 
memcpy(threadData->lineBuffers[threadData->lineBufferIndex].data(), linePointers[patternHeight] - lineBufferPadding_, lineBufferLength_); - linePointers[patternHeight] = lineBuffers_[lineBufferIndex_].data() + lineBufferPadding_; + linePointers[patternHeight] = threadData->lineBuffers[threadData->lineBufferIndex].data() + lineBufferPadding_; - lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1); + threadData->lineBufferIndex = (threadData->lineBufferIndex + 1) % (patternHeight + 1); } -void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst) +void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst, + DebayerCpuThreadData *threadData) { - unsigned int yEnd = window_.height; /* Holds [0] previous- [1] current- [2] next-line */ const uint8_t *linePointers[3]; /* Adjust src to top left corner of the window */ - src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; + src += (window_.y + threadData->yStart) * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; /* [x] becomes [x - 1] after initial shiftLinePointers() call */ - if (window_.y) { + if (window_.y + threadData->yStart) { linePointers[1] = src - inputConfig_.stride; /* previous-line */ linePointers[2] = src; } else { - /* window_.y == 0, use the next line as prev line */ + /* Top line, use the next line as prev line */ linePointers[1] = src + inputConfig_.stride; linePointers[2] = src; + } + + if (window_.y == 0 && threadData->yEnd == window_.height) { /* * Last 2 lines also need special handling. * (And configure() ensures that yEnd >= 2.) 
*/ - yEnd -= 2; + threadData->yEnd -= 2; + threadData->processLastLinesSeperately = true; + } else { + threadData->processLastLinesSeperately = false; } - setupInputMemcpy(linePointers); + setupInputMemcpy(linePointers, threadData); - for (unsigned int y = 0; y < yEnd; y += 2) { + /* + * Note y is the line-number *inside* the window, since stats_' window + * is the stats window inside/relative to the debayer window. IOW for + * single thread rendering y goes from 0 - window_.height. + */ + for (unsigned int y = threadData->yStart; y < threadData->yEnd; y += 2) { shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); stats_->processLine0(frame, y, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); (this->*debayer1_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; } - if (window_.y == 0) { + if (threadData->processLastLinesSeperately) { shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); - stats_->processLine0(frame, yEnd, linePointers, &statsBuffer_); + memcpyNextLine(linePointers, threadData); + stats_->processLine0(frame, threadData->yEnd, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ -702,7 +716,8 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst) } } -void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst) +void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst, + DebayerCpuThreadData *threadData) { /* * This holds pointers to [0] 2-lines-up [1] 1-line-up [2] current-line @@ -711,7 +726,7 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst) const uint8_t *linePointers[5]; /* Adjust src to top left 
corner of the window */ - src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; + src += (window_.y + threadData->yStart) * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; /* [x] becomes [x - 1] after initial shiftLinePointers() call */ linePointers[1] = src - 2 * inputConfig_.stride; @@ -719,31 +734,36 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst) linePointers[3] = src; linePointers[4] = src + inputConfig_.stride; - setupInputMemcpy(linePointers); + setupInputMemcpy(linePointers, threadData); - for (unsigned int y = 0; y < window_.height; y += 4) { + /* + * Note y is the line-number *inside* the window, since stats_' window + * is the stats window inside/relative to the debayer window. IOW for + * single thread rendering y goes from 0 - window_.height. + */ + for (unsigned int y = threadData->yStart; y < threadData->yEnd; y += 4) { shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); stats_->processLine0(frame, y, linePointers, &statsBuffer_); (this->*debayer0_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); (this->*debayer1_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); stats_->processLine2(frame, y, linePointers, &statsBuffer_); (this->*debayer2_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; shiftLinePointers(linePointers, src); - memcpyNextLine(linePointers); + memcpyNextLine(linePointers, threadData); (this->*debayer3_)(dst, linePointers); src += inputConfig_.stride; dst += outputConfig_.stride; @@ -868,10 +888,12 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output stats_->startFrame(frame, 
&statsBuffer_, 1); + threadData_[0].yStart = 0; + threadData_[0].yEnd = window_.height; if (inputConfig_.patternSize.height == 2) - process2(frame, in.planes()[0].data(), out.planes()[0].data()); + process2(frame, in.planes()[0].data(), out.planes()[0].data(), &threadData_[0]); else - process4(frame, in.planes()[0].data(), out.planes()[0].data()); + process4(frame, in.planes()[0].data(), out.planes()[0].data(), &threadData_[0]); metadata.planes()[0].bytesused = out.planes()[0].size(); diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h index 8abf5168..800b018c 100644 --- a/src/libcamera/software_isp/debayer_cpu.h +++ b/src/libcamera/software_isp/debayer_cpu.h @@ -74,6 +74,19 @@ private: */ using debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]); + /* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */ + static constexpr unsigned int kMaxLineBuffers = 5; + + /* Per render thread data */ + struct DebayerCpuThreadData { + unsigned int yStart; + unsigned int yEnd; + std::vector lineBuffers[kMaxLineBuffers]; + unsigned int lineBufferIndex; + /* Stored here to avoid causing register pressure in inner loop */ + bool processLastLinesSeperately; + }; + /* 8-bit raw bayer format */ template void debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]); @@ -105,17 +118,14 @@ private: int setDebayerFunctions(PixelFormat inputFormat, PixelFormat outputFormat, bool ccmEnabled); - void setupInputMemcpy(const uint8_t *linePointers[]); + void setupInputMemcpy(const uint8_t *linePointers[], DebayerCpuThreadData *threadData); void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src); - void memcpyNextLine(const uint8_t *linePointers[]); - void process2(uint32_t frame, const uint8_t *src, uint8_t *dst); - void process4(uint32_t frame, const uint8_t *src, uint8_t *dst); + void memcpyNextLine(const uint8_t *linePointers[], DebayerCpuThreadData *threadData); + void process2(uint32_t 
frame, const uint8_t *src, uint8_t *dst, DebayerCpuThreadData *threadData); + void process4(uint32_t frame, const uint8_t *src, uint8_t *dst, DebayerCpuThreadData *threadData); void updateGammaTable(const DebayerParams &params); void updateLookupTables(const DebayerParams &params); - /* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */ - static constexpr unsigned int kMaxLineBuffers = 5; - static constexpr unsigned int kRGBLookupSize = 256; static constexpr unsigned int kGammaLookupSize = 1024; struct CcmColumn { @@ -143,12 +153,14 @@ private: debayerFn debayer3_; Rectangle window_; std::unique_ptr stats_; - std::vector lineBuffers_[kMaxLineBuffers]; unsigned int lineBufferLength_; unsigned int lineBufferPadding_; - unsigned int lineBufferIndex_; unsigned int xShift_; /* Offset of 0/1 applied to window_.x */ bool enableInputMemcpy_; + + static constexpr unsigned int kMaxThreads = 4; + struct DebayerCpuThreadData threadData_[kMaxThreads]; + unsigned int threadCount_; }; } /* namespace libcamera */ From patchwork Mon Feb 16 19:02:02 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 26165 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id 24671C0DA4 for ; Mon, 16 Feb 2026 19:02:16 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id D47CB62212; Mon, 16 Feb 2026 20:02:15 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (2048-bit key; unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.b="MmENt5Ft"; dkim=pass (2048-bit key; unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.b="bQrk8eSH"; dkim-atps=neutral Received: from
mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 63CFC62201 for ; Mon, 16 Feb 2026 20:02:12 +0100 (CET) Received: from pps.filterd (m0279864.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 61GH4XCA1545248 for ; Mon, 16 Feb 2026 19:02:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=qcppdkim1; bh=UlDgFMps2ge rX/b1RmWRBztHRaK0o9MdqNKO/MQzYBY=; b=MmENt5FtKEzPhihm3/Z7nhmgxQL 8kqq+8/vDu3Wk8yEGW7eX07nIAgIrqHoCkr6wF2LM4qyUl3u5uuVGLLNSdb1uYSM 6pGvS3h5zrlU/tpKy5dsLcdH/XcEgUke8ex2qJ2hDs4RCwZJ0gh0nsrK+xAAnJut ghffuw6FGzM6wWi9ma6r4vYZLvfxxOhDlOx2kpyO1CDSSANrqxYKstPtFLnlz9p5 iBhxxwEcWC92s7w/pD33cCgsmvx8gSA0tvrBBMDoCsApG5G41bWe4CiBbvV5Kmng EE61bGxOgCsOM0aRQ7p7iSCxCQYxHPiBusS7viepzwEc9WsmxmOtzqk6dTg== Received: from mail-vk1-f199.google.com (mail-vk1-f199.google.com [209.85.221.199]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cc7ajr8nt-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for ; Mon, 16 Feb 2026 19:02:10 +0000 (GMT) Received: by mail-vk1-f199.google.com with SMTP id 71dfb90a1353d-567503c3dbdso5295664e0c.2 for ; Mon, 16 Feb 2026 11:02:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1771268529; x=1771873329; darn=lists.libcamera.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UlDgFMps2gerX/b1RmWRBztHRaK0o9MdqNKO/MQzYBY=; b=bQrk8eSH0JFa0TTGbdKE0KDISiiqCRNxqoxNbz0wKgxvvzrGuthJ74VRtoa/jGjCYx xbAQIysD8Y+ydsR6BDFrTF4KCZ+c6UZODSfF/r6mZEoFqkRKTkcvEVZgh6Hh6abWAFAd decMIqi8AQDFYcP3a4xhnWjtNwxVrJZp1pa/oul1JIax+Q67G6PgPcHBR3FAeMRclGOT aeOeN79mcCEmCs/Bnf5/rVLLyoEs2eDEjAd6S4g5cM2CUA+tdnYv8HrDyaxx6gkrPHlG 
EpztY/KKc/NhHEzTLr85PtCqa8g1x3Se28wr85o1zzSloLe5WYO0TOZNl+d85fk+ul8p FURw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1771268529; x=1771873329; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=UlDgFMps2gerX/b1RmWRBztHRaK0o9MdqNKO/MQzYBY=; b=J48zI2tMRjsKH6L4Tuk+AkY2sNhRL9OvJTXgP4ELiTHHbCAUzkX5Lidur6iDOaD1yS vUYw0hKraLoEd6xmXxozsuMcWGFDkQC6qoe3cIlplsINnKJjjMEoZ0UvDS1WUXkeQdCY aS6rE1Juhxy7D0HiOxqA5uzPzpvmvwuSOFRFYX7wKdIBrJ8wwTaiA1NorsLwDmej4dKR v1FEAYuuSMq3kdJlcQ2lXXVYRjRhBV2/lT/ZNweoOXR0ot8IeVb+pMkAOEIrASRBZK6k ONIvM06i3Kv0bO6sjPGyVJLoAieGvmf57I/hwn9Xprl9ZWWSV4BRtYD9IBQ0cPSt6mjx ZhqA== X-Gm-Message-State: AOJu0YxgGY5g+IH1ljweYSzJM0pePmlt7LKqorFg5PW2hPlrZNcwvN5K g8z/pXjV9USGokgF+HawWE6WqOH7bCkZCOH+YkSOus4P8qaOaKqrL/ODCbL15bqiswavUloY7h1 9o1M/61vyXTDG7mUISKejHAU1XkNF72csBtc9+v+BAMF5KOt0FzQGY+yAUsjcF3his48VavH2az Q5z/5C5Chb X-Gm-Gg: AZuq6aIMA4Oy791r+IrWJHQH2KIa9vYBSlZeP629VfI8sG8GSnG0BbqkoPKSqopbctR VnylWhdv+y9RsOOEWVHgV+ZTC6XPYW6J87kokb76RyuXaO69RGIpMq2GG/T7T7maeoKmJcM2M/g QQYkzJgBvhFGtTjO8nkpIUiktEPiJAB8hDbA+SzgC93XZnzQu7tskt9wcm9FRFRSfc1ppSP5/bB 2uyeajAnXP1XDnY3jr3JS5iec0o8kr36Vk/n01fgK+7jGiTaLPSZIkg5H9x+f8Wf+JzcvmLCnGH od99fZhPptwQWwbwxHuS9ppAgxCwlaVEWizQTZIYZ2oiiOi4mPJgqs7zGYgkOIHQelWtuZnh6c3 gGOAgRzbWGAgOLTT03fLDAZpxtYp+4e5oeyxQXMvm4902ivjlfpl1BkuASHxoiRNmBVegkl6bPm og932i/ymPKg4Gji0kW3WEGdk5FnGMRgLQy507 X-Received: by 2002:a05:6122:8284:b0:55f:c318:1afa with SMTP id 71dfb90a1353d-56889b68e4fmr2774292e0c.6.1771268529138; Mon, 16 Feb 2026 11:02:09 -0800 (PST) X-Received: by 2002:a05:6122:8284:b0:55f:c318:1afa with SMTP id 71dfb90a1353d-56889b68e4fmr2774248e0c.6.1771268528624; Mon, 16 Feb 2026 11:02:08 -0800 (PST) Received: from shalem (2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl. 
[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8fc735e587sm276698966b.2.2026.02.16.11.02.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Feb 2026 11:02:08 -0800 (PST) From: Hans de Goede To: libcamera-devel@lists.libcamera.org, Milan Zamazal Cc: Hans de Goede Subject: [PATCH 3/5] software_isp: debayer_cpu: Group innerloop variables together Date: Mon, 16 Feb 2026 20:02:02 +0100 Message-ID: <20260216190204.106922-4-johannes.goede@oss.qualcomm.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> References: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> MIME-Version: 1.0 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMjE2MDE2MyBTYWx0ZWRfX8C6/Lmys7wL0 Vu5LM/v1x79o5EwKXm2A+LqLS8772oWv39j0Db9BdF5W3vRhw8Uj8ANSXhQ5jGQjrgpnhJxBLdy CFeN/QOjrH+QI+4CSbhCpMeB93mOC3J9jPMRpZL4xDfO2XbtY1sOhE/UsQawCjPE3VlEpOZXueA YrTrhiq7O18e5vFKeTuIbvLzxG7BEuVaf/xU3aM5IaT5C+idH5s8WuLrf/nWjpkTNNAf3EYBvGa GGwio2XsYGtkPo8ov/C1lR1PYxKXiykLLBGupgc+eFGKP33lPDkiM4TAtewBncCoRDwAJdHUbdt XgNR30Gm9lcmssfGKru73MF04pCAlh828Gn0yi3c0LOQtFy8x334o73Pf4rtjBLJcSZ9BMYKzII 1ZUulAd0nBrtQpuB/Y2ahZ0NXoIMR//lrO4rJbUlZq6rWOzErppqVJq+pCsB1BHt1IJWT3Vng6D lO7MyvIl4THlF9AnFTw== X-Proofpoint-ORIG-GUID: wmZiOMpz7I56ixSz9mrl8Msn3w_uIIaf X-Authority-Analysis: v=2.4 cv=BryQAIX5 c=1 sm=1 tr=0 ts=699369b2 cx=c_pps a=+D9SDfe9YZWTjADjLiQY5g==:117 a=xqWC_Br6kY4A:10 a=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=Mpw57Om8IfrbqaoTuvik:22 a=GgsMoib0sEa3-_RKJdDe:22 a=EUspDBNiAAAA:8 a=8m3Se83TVroFe3lHWL4A:9 a=vmgOmaN-Xu0dpDh8OwbV:22 X-Proofpoint-GUID: wmZiOMpz7I56ixSz9mrl8Msn3w_uIIaf X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51, FMLib:17.12.100.49 definitions=2026-02-16_06,2026-02-16_04,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 spamscore=0 lowpriorityscore=0 suspectscore=0 
phishscore=0 clxscore=1015 adultscore=0 bulkscore=0 priorityscore=1501 malwarescore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2601150000 definitions=main-2602160163 X-BeenThere: libcamera-devel@lists.libcamera.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libcamera-devel-bounces@lists.libcamera.org Sender: "libcamera-devel"

Group variables used every pixel together, followed by variables used every line, and lastly variables used only every frame.

The idea here is to have all the data used every pixel fit in as few cachelines as possible. Benchmarking does not show any difference before and after, possibly because most of the per-pixel lookup tables were already grouped together. Despite that, this still seems like a good idea.

Signed-off-by: Hans de Goede
---
 src/libcamera/software_isp/debayer_cpu.h | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h index 800b018c..a54418dc 100644 --- a/src/libcamera/software_isp/debayer_cpu.h +++ b/src/libcamera/software_isp/debayer_cpu.h @@ -135,6 +135,7 @@ private: }; using LookupTable = std::array; using CcmLookupTable = std::array; + /* Variables used every pixel */ LookupTable red_; LookupTable green_; LookupTable blue_; @@ -143,24 +144,26 @@ private: CcmLookupTable blueCcm_; std::array gammaTable_; LookupTable gammaLut_; - bool ccmEnabled_; - DebayerParams params_; - SwIspStats statsBuffer_; + Rectangle window_; + /* Variables used every line */ + SwIspStats statsBuffer_; debayerFn debayer0_; debayerFn debayer1_; debayerFn debayer2_; debayerFn debayer3_; - Rectangle window_; std::unique_ptr stats_; unsigned int lineBufferLength_; unsigned int lineBufferPadding_; unsigned int xShift_; /* Offset of 0/1 applied to window_.x */ bool
enableInputMemcpy_; - static constexpr unsigned int kMaxThreads = 4; struct DebayerCpuThreadData threadData_[kMaxThreads]; + + /* variables used every frame */ unsigned int threadCount_; + bool ccmEnabled_; + DebayerParams params_; }; } /* namespace libcamera */ From patchwork Mon Feb 16 19:02:03 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 26166 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id F37DDC3240 for ; Mon, 16 Feb 2026 19:02:16 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id 774DE62209; Mon, 16 Feb 2026 20:02:16 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (2048-bit key; unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.b="DVMA2GQN"; dkim=pass (2048-bit key; unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.b="bDPt1sWl"; dkim-atps=neutral Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 462E5621FD for ; Mon, 16 Feb 2026 20:02:13 +0100 (CET) Received: from pps.filterd (m0279862.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 61GH4VBZ985305 for ; Mon, 16 Feb 2026 19:02:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=qcppdkim1; bh=6rYZ4DZgBGf tLzZQ7ANPf59s8EG5sl99CSqwra87u9Y=; b=DVMA2GQNY4y+d2kruLmyfWQWZvZ 2YuaOVe5siLrv8u/mze/s5gdPZK9p0InIrosLSpc6BRM5QTbIlh/3FSlyQtSlJRZ Oe6q/bdtFMKavnun7v1vzD0ACi5Ski+ITDoYjNGJQsWquUsGh39ltaMgNH6VKDTJ 
WeIrUhSeFC79O36eA7FBL9rVQHv3Ie/ahWnLR/P/70FVRt6qaKcvTyvmC9FeXWf7 xno/Ub3YY8B+/MsZVVdmKXIUXdZA/nabP7dfJYPhsqHEoqn55T4WqYcRgPfSKEEo BYow9XQQKQqvzl91CLF7olPnDGsj1/3cKsLX7W2RW5xdNrPA1fOFxFfNSXQ== Received: from mail-vk1-f198.google.com (mail-vk1-f198.google.com [209.85.221.198]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cc7ap08mj-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for ; Mon, 16 Feb 2026 19:02:11 +0000 (GMT) Received: by mail-vk1-f198.google.com with SMTP id 71dfb90a1353d-567503c3dbdso5295695e0c.2 for ; Mon, 16 Feb 2026 11:02:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1771268530; x=1771873330; darn=lists.libcamera.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6rYZ4DZgBGftLzZQ7ANPf59s8EG5sl99CSqwra87u9Y=; b=bDPt1sWlbLKu9eB6G85v3Q6xX9Aqr6vS4fko9MTm/rXxRirVRz9n0ggfRyLhl8xOhJ gEr9uz4xo8+OA9yrz3lYuynIFlYVEINSwXK2zFBMqhnZ4pnn0EQdCHocp9bD9js2sPCB gYUAa/8Ib5OheygC1HjcalLO4vt610i6lg6bMk105EZnkeFNvdc8vaeYOEpgCfaHer6T rJwERWuWdc6at4jE6YhRbUJBSWn67bej4t6EwxUIg/2+dHp4OwRy0oGbQU2qYDkfgXL5 VaNtqxbON+KjMquTCGJQ2f7p7gZFsPRXZLP2N/T9w92+IZfHXPcSonG3kU/NfBgP1mlQ 39Ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1771268530; x=1771873330; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=6rYZ4DZgBGftLzZQ7ANPf59s8EG5sl99CSqwra87u9Y=; b=HsrkD6STa6AqzFFUDgOBOYqDRvTklt7C4W+pNf99qqS+n2+fj3f1VxE2c9Kp5EaA9i /Nvo2ppEQcfJOBTNjWCyE6Q9q64RoO5vm0qIAIAv7fIINQaWaysi/5e3a3XWrDv0JHk4 4RDLUwGWQWHoCWwkdoEp5k1MkAL/ZEp9jM9++3C9p1UfH3PicTIKm34nHii+liHoUnsQ xBQivBadr9I53JzxVQfxM9QP418c2mfR8bfBTN4cneeiknEjwklVE5V/jilyoY5BWchM ChwLwPdenrDtXg+4HCks7vNRrznMnsNen0KMDeFy1BUKWLL0e6kZwhTrm83wuwwjvOTI oQRQ== X-Gm-Message-State: 
AOJu0YyMZG9qS/6PAT1RP3JJtfvnnVb0BOrD2UCGkj3h90CxV/s/jd62 anhe/4t8G9PX25+fc3GZ9Gi+0pbnF++zEpywwY2f4/pufcpHVzS0p6ZTltkKt8U1FGlWbX0gxvR lGwihDNCo9DpgK1wrg/68w4WnxXCcm9LUflCsUhXL+mYQ1Zuc1Zy8e4+2VowimrxVa29kmqt6n+ Q6sGkYxBMv X-Gm-Gg: AZuq6aI/3GJYYohKjpjcbjcUpoSzTgMjENrHFzSaGp0fT+zp53uMXvSammTnkF6725l kfKGjsq3GXQ9JKC2tX+iP3Bj/YYJOaMV0q0QvNPOaDYDkzAVmNNYpCBcuvwti1zX8t1gjou5Fcb B1/4IRRiBKyH+WCfIniqXaNfZynzHpOzx92ppABI+YjEUVsmgPLYcydXvlLngjwXwx2lFJdeH1c p8/zSgZLQiINgegFco4q3tqGIPKUwx8yP8HhQhv6nbl33h87GAvFEVCHGQIPYF1XeVlEL6iaS1F r7DtVLxTfbO6wUwton8KxIphq9l8jXFLviXVgIqgNkyfG37vRxxTEzRvoVK/59Gmcmen85OKtKB OjqwGgc3eIauM2SMOzX+Sk13poHf7Vl6Y2pZnuOvTGlsNnuX3CV1ocOi4kZgBZlI2WTjF9qWuXn /pFaQABmdtm76RmD6pT1tH6E7yuO9efkzYtjt/ X-Received: by 2002:a05:6122:32cb:b0:559:5ef5:b196 with SMTP id 71dfb90a1353d-56889bf9467mr2270160e0c.13.1771268530162; Mon, 16 Feb 2026 11:02:10 -0800 (PST) X-Received: by 2002:a05:6122:32cb:b0:559:5ef5:b196 with SMTP id 71dfb90a1353d-56889bf9467mr2270129e0c.13.1771268529692; Mon, 16 Feb 2026 11:02:09 -0800 (PST) Received: from shalem (2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl. 
[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8fc735e587sm276698966b.2.2026.02.16.11.02.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Feb 2026 11:02:09 -0800 (PST) From: Hans de Goede To: libcamera-devel@lists.libcamera.org, Milan Zamazal Cc: Hans de Goede Subject: [PATCH 4/5] software_isp: debayer_cpu: Select process inner loop by function pointer Date: Mon, 16 Feb 2026 20:02:03 +0100 Message-ID: <20260216190204.106922-5-johannes.goede@oss.qualcomm.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> References: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> MIME-Version: 1.0 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMjE2MDE2MyBTYWx0ZWRfXwm/dHNZI1da2 N+Z33R+MagnMruuiioHe805RGODeT+92LJuEXQe6mT1+/fIUdg1tVzQ1LAArvPtfC2oCYlFb7TX u2dm2PBIhWUSwklWFwW/lPEcbto7XOt4NndBIvAMdpb1bCLoDdreKVrFvrSzKVXCaCZCzOY59j4 RpnleqIlYb0wBWX9IulTtxgYN7x12ZGvPeWllfhOM0xlwVFEHuYYZN+uUV55htu1pT7n5QR+rzv dgFD02EgGyL61AOs+QEc4t60YwHLDzwPy/cDJ3BKU9LrpdAmuar5ksllk+dOBt0hPedDhNsV3/n liK6lp3VoDqmklAQ2VAubYEKfXig6nTsxTJfz5bbJBAceJ3lvMVbDZJYTGiBE6voKrLceoZnTLR s3lkytz3momfIUI86yLEA2h5OSpAgz4tmZAbzZAkxRR7DzMeEKgMdDqyrWRqeXI9IDEaegED1E8 JsnVFa9qKykaDeDbEAw== X-Proofpoint-ORIG-GUID: wkF-Jh02Ag3z1cqHGhWzi3BRCQWy0yKF X-Proofpoint-GUID: wkF-Jh02Ag3z1cqHGhWzi3BRCQWy0yKF X-Authority-Analysis: v=2.4 cv=Rfydyltv c=1 sm=1 tr=0 ts=699369b3 cx=c_pps a=1Os3MKEOqt8YzSjcPV0cFA==:117 a=xqWC_Br6kY4A:10 a=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=Mpw57Om8IfrbqaoTuvik:22 a=GgsMoib0sEa3-_RKJdDe:22 a=EUspDBNiAAAA:8 a=invZXm7wCyodYBJXmhMA:9 a=hhpmQAJR8DioWGSBphRh:22 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51, FMLib:17.12.100.49 definitions=2026-02-16_06,2026-02-16_04,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 spamscore=0 lowpriorityscore=0 impostorscore=0 
suspectscore=0 clxscore=1015 phishscore=0 priorityscore=1501 adultscore=0 bulkscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.22.0-2601150000 definitions=main-2602160163 X-BeenThere: libcamera-devel@lists.libcamera.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libcamera-devel-bounces@lists.libcamera.org Sender: "libcamera-devel"

Add a processInner_ function pointer and set this to process2() / process4() in configure instead of making the choice inline in process().

This is a preparation patch for making DebayerCpu support multi-threading.

Signed-off-by: Hans de Goede
---
 src/libcamera/software_isp/debayer_cpu.cpp | 10 ++++++----
 src/libcamera/software_isp/debayer_cpu.h | 4 ++++
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp index e1d3c164..5e168554 100644 --- a/src/libcamera/software_isp/debayer_cpu.cpp +++ b/src/libcamera/software_isp/debayer_cpu.cpp @@ -437,6 +437,11 @@ int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat, return invalidFmt(); } + if (inputConfig_.patternSize.height == 2) + processInner_ = &DebayerCpu::process2; + else + processInner_ = &DebayerCpu::process4; + if ((bayerFormat.bitDepth == 8 || bayerFormat.bitDepth == 10 || bayerFormat.bitDepth == 12) && bayerFormat.packing == BayerFormat::Packing::None && isStandardBayerOrder(bayerFormat.order)) { @@ -890,10 +895,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output threadData_[0].yStart = 0; threadData_[0].yEnd = window_.height; - if (inputConfig_.patternSize.height == 2) - process2(frame, in.planes()[0].data(), out.planes()[0].data(), &threadData_[0]); - else - process4(frame, in.planes()[0].data(), out.planes()[0].data(), &threadData_[0]); + (this->*processInner_)(frame, in.planes()[0].data(),
out.planes()[0].data(), &threadData_[0]); metadata.planes()[0].bytesused = out.planes()[0].size(); diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h index a54418dc..b85dd11c 100644 --- a/src/libcamera/software_isp/debayer_cpu.h +++ b/src/libcamera/software_isp/debayer_cpu.h @@ -87,6 +87,9 @@ private: bool processLastLinesSeperately; }; + using processFn = void (DebayerCpu::*)(uint32_t frame, const uint8_t *src, uint8_t *dst, + DebayerCpuThreadData *threadData); + /* 8-bit raw bayer format */ template void debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]); @@ -164,6 +167,7 @@ private: unsigned int threadCount_; bool ccmEnabled_; DebayerParams params_; + processFn processInner_; }; } /* namespace libcamera */ From patchwork Mon Feb 16 19:02:04 2026 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 26167 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id B0570C0DA4 for ; Mon, 16 Feb 2026 19:02:17 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id 2971F62214; Mon, 16 Feb 2026 20:02:17 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (2048-bit key; unprotected) header.d=qualcomm.com header.i=@qualcomm.com header.b="MJd7ThD+"; dkim=pass (2048-bit key; unprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com header.b="R1XjJN0Z"; dkim-atps=neutral Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 5B94362201 for ; Mon, 16 Feb 2026 20:02:14 +0100 (CET) Received: from pps.filterd (m0279867.ppops.net [127.0.0.1]) by 
mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id 61GG1jeQ048861 for ; Mon, 16 Feb 2026 19:02:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h= cc:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=qcppdkim1; bh=UVPPPmSuPjO m3wDBERqYZg3W4pI5aWj8jXPV/ADSsQE=; b=MJd7ThD+M9VpwE7kHnetYMCgv/1 Ojobp/IBnaSrZK4Cdlbo9lBGuPS+qLpw1fxQeEvGtwh9k79Bhvp9xanK+vNlLPiX jojKW4C+W7bvE+bPMCvx7WZ2W5hjywECzyGS1d9EP8v624QEpaDehjLEZh84XR/9 B24s2EC58oJ7uRbrDvH1OAB+ezaxPBlt3TQW6jD8GW22icIvTfRl0WWX2bCYoeW4 CuI4f+CaI2E51xFBs45gkc5typ7e04YD93w2RocqNHTyQocq4YOBgOvUz0AGWKkR FVyu6UrSvhV0ZkNjDqrWubZoZT636f6MlXxeWSTOLrG3s0meFNzwUmgluLg== Received: from mail-vk1-f200.google.com (mail-vk1-f200.google.com [209.85.221.200]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cc6d80dxw-1 (version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT) for ; Mon, 16 Feb 2026 19:02:12 +0000 (GMT) Received: by mail-vk1-f200.google.com with SMTP id 71dfb90a1353d-5674e566967so5946604e0c.0 for ; Mon, 16 Feb 2026 11:02:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oss.qualcomm.com; s=google; t=1771268531; x=1771873331; darn=lists.libcamera.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UVPPPmSuPjOm3wDBERqYZg3W4pI5aWj8jXPV/ADSsQE=; b=R1XjJN0ZQMPcI3k3qlBS8aiGLXCji6p9yE0gH6qCJDwwQKxBsCLDzR+fTwCTJKScaz hAwLOSAFby1ae0mYrky62JFz0rSao23XX0/D2DmeMH/l7Pht2oMyaOEPnpAwTjiEYYIf r862M0zLbR0vkpXtcLFngWO3QF9KA/tIlqFcGSKu0ya2iGKlO6DLDytO3IIc2557iGdK xPeb7Ae2hcNW5alTOha2whC9zjA6VOVRNyUKT1eeSX3KC0iiX7+MSXyospwtXKYU7nDE 588RZed2wXEVDI3g5yzN0OzWJuEFJNuaFJfpsd/M/EKUuZ5ysyOtoQGZ0L3jprF4G3th kCfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1771268531; x=1771873331; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=UVPPPmSuPjOm3wDBERqYZg3W4pI5aWj8jXPV/ADSsQE=; b=bGV6hpdQvthkqcZHBUNkxmzunCKXWclYR1sKOqWsYt+Iy519GUzX4h6glUMxO2quPN 7hZwZb87pWFXY9XD3IfyirzUShhQ8/AzBcsdAmFLoDx3mFOrgmDGrjZQPPdNH9WZ1k4s 7ZXhadX6dPlahsWLj73vuXFl1L1cLoAv6DkRAeaWek+KRzUbetA81D96UbJFiQlPINrz C5k2lnYPZFIgjtNvL8NuDgABLbH4aZiuU8NqKfIQX/3Lc5HDJZDqCxzh0G9ODEEqC2ku hpqFT4ieKDut00d67SBptttVi3YP0InjKVVrNVu17or4UW7EY+vyO+JUgHEDhoVg1MF0 XmHA== X-Gm-Message-State: AOJu0YzyWJxIVkAVDGZyn9QvVBnvhbeqnRb+z8ZNHGshPHiYVfQanu5t iBxSSBkaroOuAEwL74LxNZwLuKChddmApDVAjrwveXZyJA4s2wEiCRQlD/O+NWoSt+UWVpMYgGD wnQPaDqk8K5Z4/qXHwinNgEl/ZuOd+ct4Ml6K4qarW2qJT7KZdex1WYuRoyQa2kTHw8LVqeXPgF m7Jqm5KN/R X-Gm-Gg: AZuq6aJ9McwzB9ZdZ0YfdIkB3vS74dhGsFY1hH8QNJ/qYjXA8G0E3BKBMva8zp4amGg r9U1iWaxadhK3WY77u2z+XwXnkOBQnksgW884UKZgWki4/dPnu5/P1ZFO7o52XOtQdlWZs3xzg4 FijQaYNpZkxAN14Ap0jQO2iwqXtkBA+RR+UR+gincubfDD52zUh72cVV5ue/37vbM3FnNHAQVbx yd+3FQWwPGyYYbG1zssmIAJ3kvvhnXkDjra92IZmJn3qThFm3WI7LQFE5FEMbSNj1Q8PquyzuCN pCX2rPeqJyD47/vuELLVR+mQNzSRljPKy+whv/K9LiceIFblghDEobCpciDdcHc0IiD7MbexZLp +EKSFrQP9ZjxzmgR3hvKZaMYEuTAUMOKhpI6b6of45NynhgyMsH7xNynVsce3PYVi6PFlMKHPzO lvenEj7CyJ9Pa04aiUl8Nn2S15nUbQJebH9DJ+ X-Received: by 2002:a05:6122:8cb:b0:559:6663:8b1a with SMTP id 71dfb90a1353d-56768179c4bmr3391296e0c.4.1771268531259; Mon, 16 Feb 2026 11:02:11 -0800 (PST) X-Received: by 2002:a05:6122:8cb:b0:559:6663:8b1a with SMTP id 71dfb90a1353d-56768179c4bmr3391251e0c.4.1771268530732; Mon, 16 Feb 2026 11:02:10 -0800 (PST) Received: from shalem (2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl. 
[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-b8fc735e587sm276698966b.2.2026.02.16.11.02.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Feb 2026 11:02:10 -0800 (PST) From: Hans de Goede To: libcamera-devel@lists.libcamera.org, Milan Zamazal Cc: Hans de Goede Subject: [PATCH 5/5] software_isp: debayer_cpu: Add multi-threading support Date: Mon, 16 Feb 2026 20:02:04 +0100 Message-ID: <20260216190204.106922-6-johannes.goede@oss.qualcomm.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> References: <20260216190204.106922-1-johannes.goede@oss.qualcomm.com> MIME-Version: 1.0 X-Authority-Analysis: v=2.4 cv=bqVBxUai c=1 sm=1 tr=0 ts=699369b4 cx=c_pps a=wuOIiItHwq1biOnFUQQHKA==:117 a=xqWC_Br6kY4A:10 a=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=Mpw57Om8IfrbqaoTuvik:22 a=GgsMoib0sEa3-_RKJdDe:22 a=EUspDBNiAAAA:8 a=j_KrAe4nr5UCvrO7opAA:9 a=XD7yVLdPMpWraOa8Un9W:22 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMjE2MDE2MyBTYWx0ZWRfX7y7/Aj986YaI bnp9X6JZwSiTkToyPBurh9iEQUqCGGN67os+D90pSlt405/4jnLcU5EzUi3vFKdGM5gTbebyADD TiJ4Oto/ZeyfRQM2SDm476IwseMaZAizJztp7kFyYD7oYo0tt+XgR/xmIBdi9UmxlwULOryrb6o E7grrGUcM3G4fCnCsjaluaaLbeMe1e8BLvXVkEhwHRFZTQbicEz10vLuQYb7h4H7X1qiOST7Z1V x2cmoLELlMj3OprfpEiDBgBU27dtbdHUAvDuHWm8LfKfSxn/5K3/sw+2LWoTGa41MjRTAJBjw2d h9O/xHrbtPAsVO8+ZGojIfkMebRrEzO/3JC9jCeomUrXAKh5hGSR1pV0n74zSUmbOCyb4/NvRb+ qdwmTU/4QbPYfP8+Lgn1anhnYqaBcVv4s27HCH3QO5jaeT98q/AtWyuQds09nzIaJAKxoGAX3K7 7kYtEXqX4YqG9qzy5NA== X-Proofpoint-GUID: _8f0PnRls08v-KVJwkZ5cn_Zp9e3vuY3 X-Proofpoint-ORIG-GUID: _8f0PnRls08v-KVJwkZ5cn_Zp9e3vuY3 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51, FMLib:17.12.100.49 definitions=2026-02-16_06,2026-02-16_04,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 impostorscore=0 lowpriorityscore=0 bulkscore=0 priorityscore=1501 
	spamscore=0 adultscore=0 phishscore=0 clxscore=1015 suspectscore=0
	classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0
	reason=mlx scancount=1 engine=8.22.0-2601150000
	definitions=main-2602160163
X-BeenThere: libcamera-devel@lists.libcamera.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: libcamera-devel-bounces@lists.libcamera.org
Sender: "libcamera-devel"

Add CPU soft ISP multi-threading support.

Benchmark results for the Uno-Q, whose weak CPU makes it a good target
for performance testing. All numbers are with an IMX219 running at
3280x2464 -> 3272x2464:

1 thread : 147ms / frame, ~6.5 fps
2 threads:  81ms / frame, ~12 fps
3 threads:  66ms / frame, ~14.5 fps

Adding a 4th thread does not improve performance.

Signed-off-by: Hans de Goede
---
 src/libcamera/software_isp/debayer_cpu.cpp | 49 +++++++++++++++++-----
 src/libcamera/software_isp/debayer_cpu.h   |  2 +-
 2 files changed, 40 insertions(+), 11 deletions(-)

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index 5e168554..c4b6c5b8 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -41,7 +42,7 @@ namespace libcamera {
  * \param[in] configuration The global configuration
  */
 DebayerCpu::DebayerCpu(std::unique_ptr stats, const GlobalConfiguration &configuration)
-	: Debayer(configuration), stats_(std::move(stats)), threadCount_(1)
+	: Debayer(configuration), stats_(std::move(stats))
 {
 	/*
 	 * Reading from uncached buffers may be very slow.
@@ -56,6 +57,9 @@ DebayerCpu::DebayerCpu(std::unique_ptr stats, const GlobalConfigurat
 	 */
 	enableInputMemcpy_ =
 		configuration.option({ "software_isp", "copy_input_buffer" }).value_or(true);
+	threadCount_ =
+		configuration.option({ "software_isp", "threads" }).value_or(3);
+	threadCount_ = std::clamp(threadCount_, 1u, kMaxThreads);
 }
 
 DebayerCpu::~DebayerCpu() = default;
@@ -692,7 +696,7 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst,
 	for (unsigned int y = threadData->yStart; y < threadData->yEnd; y += 2) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers, threadData);
-		stats_->processLine0(frame, y, linePointers, &statsBuffer_);
+		stats_->processLine0(frame, y, linePointers, threadData->statsBuffer);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -707,7 +711,8 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst,
 	if (threadData->processLastLinesSeperately) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers, threadData);
-		stats_->processLine0(frame, threadData->yEnd, linePointers, &statsBuffer_);
+		stats_->processLine0(frame, threadData->yEnd, linePointers,
+				     threadData->statsBuffer);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -749,7 +754,7 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst,
 	for (unsigned int y = threadData->yStart; y < threadData->yEnd; y += 4) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers, threadData);
-		stats_->processLine0(frame, y, linePointers, &statsBuffer_);
+		stats_->processLine0(frame, y, linePointers, threadData->statsBuffer);
 		(this->*debayer0_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -762,7 +767,7 @@ void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst,
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers, threadData);
-		stats_->processLine2(frame, y, linePointers, &statsBuffer_);
+		stats_->processLine2(frame, y, linePointers, threadData->statsBuffer);
 		(this->*debayer2_)(dst, linePointers);
 		src += inputConfig_.stride;
 		dst += outputConfig_.stride;
@@ -869,6 +874,10 @@ void DebayerCpu::updateLookupTables(const DebayerParams &params)
 
 void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output, const DebayerParams &params)
 {
+	std::unique_ptr threads[threadCount_ - 1];
+	SwIspStats statsBuffer[threadCount_];
+	unsigned int i;
+
 	bench_.startFrame();
 
 	std::vector dmaSyncers;
@@ -891,11 +900,31 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 		return;
 	}
 
-	stats_->startFrame(frame, &statsBuffer_, 1);
+	stats_->startFrame(frame, statsBuffer, threadCount_);
 
-	threadData_[0].yStart = 0;
-	threadData_[0].yEnd = window_.height;
-	(this->*processInner_)(frame, in.planes()[0].data(), out.planes()[0].data(), &threadData_[0]);
+	unsigned int yStart = 0;
+	unsigned int linesPerThread = (window_.height / threadCount_) &
+				      ~(inputConfig_.patternSize.width - 1);
+	for (i = 0; i < (threadCount_ - 1); i++) {
+		threadData_[i].yStart = yStart;
+		threadData_[i].yEnd = yStart + linesPerThread;
+		threadData_[i].statsBuffer = &statsBuffer[i];
+		threads[i] = std::make_unique(
+			processInner_, this, frame,
+			in.planes()[0].data(),
+			out.planes()[0].data() + yStart * outputConfig_.stride,
+			&threadData_[i]);
+		yStart += linesPerThread;
+	}
+	threadData_[i].yStart = yStart;
+	threadData_[i].yEnd = window_.height;
+	threadData_[i].statsBuffer = &statsBuffer[i];
+	(this->*processInner_)(frame, in.planes()[0].data(),
+			       out.planes()[0].data() + yStart * outputConfig_.stride,
+			       &threadData_[i]);
+
+	for (i = 0; i < (threadCount_ - 1); i++)
+		threads[i]->join();
 
 	metadata.planes()[0].bytesused = out.planes()[0].size();
 
@@ -909,7 +938,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 	 *
 	 * \todo Pass real bufferId once stats buffer passing is changed.
 	 */
-	stats_->finishFrame(frame, 0, &statsBuffer_, 1);
+	stats_->finishFrame(frame, 0, statsBuffer, threadCount_);
 	outputBufferReady.emit(output);
 	inputBufferReady.emit(input);
 }
diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
index b85dd11c..63fa7710 100644
--- a/src/libcamera/software_isp/debayer_cpu.h
+++ b/src/libcamera/software_isp/debayer_cpu.h
@@ -85,6 +85,7 @@ private:
 		unsigned int lineBufferIndex;
 		/* Stored here to avoid causing register pressure in inner loop */
 		bool processLastLinesSeperately;
+		SwIspStats *statsBuffer;
 	};
 
 	using processFn = void (DebayerCpu::*)(uint32_t frame, const uint8_t *src, uint8_t *dst,
@@ -150,7 +151,6 @@ private:
 	Rectangle window_;
 
 	/* Variables used every line */
-	SwIspStats statsBuffer_;
 	debayerFn debayer0_;
 	debayerFn debayer1_;
 	debayerFn debayer2_;
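
For readers who want to try the row-splitting scheme from process()
outside of libcamera, here is a minimal standalone sketch. It is not
part of the patch. RowSlice, splitRows() and sumRowsThreaded() are
hypothetical names made up for this illustration, and the per-row work
is a plain byte sum standing in for debayering and statistics. As in the
patch, the per-thread line count is masked down to a power-of-two
alignment (the patch masks with inputConfig_.patternSize.width - 1, the
sketch takes the alignment as a parameter), the last slice absorbs the
remainder, the calling thread processes the last slice itself, and each
worker writes only to its own result slot so no locking is needed.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

/* Hypothetical half-open row range [yStart, yEnd) handled by one worker. */
struct RowSlice {
	unsigned int yStart;
	unsigned int yEnd;
};

/*
 * Split height rows into threadCount slices. Every slice except the last
 * gets (height / threadCount) rows rounded down to a multiple of align,
 * which is assumed to be a power of two. The last slice takes what is left.
 */
std::vector<RowSlice> splitRows(unsigned int height, unsigned int threadCount,
				unsigned int align)
{
	if (threadCount == 0)
		threadCount = 1;

	unsigned int linesPerThread = (height / threadCount) & ~(align - 1);
	std::vector<RowSlice> slices;
	unsigned int yStart = 0;

	for (unsigned int i = 0; i + 1 < threadCount; i++) {
		slices.push_back({ yStart, yStart + linesPerThread });
		yStart += linesPerThread;
	}
	slices.push_back({ yStart, height });

	return slices;
}

/*
 * Stand-in for the per-frame work: sum all bytes of a stride x height
 * image using threadCount threads, one private accumulator per slice.
 */
uint64_t sumRowsThreaded(const std::vector<uint8_t> &image, unsigned int stride,
			 unsigned int height, unsigned int threadCount)
{
	std::vector<RowSlice> slices = splitRows(height, threadCount, 2);
	std::vector<uint64_t> partial(slices.size(), 0);

	auto worker = [&](std::size_t idx) {
		uint64_t sum = 0;
		for (unsigned int y = slices[idx].yStart; y < slices[idx].yEnd; y++)
			for (unsigned int x = 0; x < stride; x++)
				sum += image[static_cast<std::size_t>(y) * stride + x];
		/* Each worker writes only its own slot. */
		partial[idx] = sum;
	};

	/* Spawn threads for all slices but the last one. */
	std::vector<std::thread> threads;
	for (std::size_t idx = 0; idx + 1 < slices.size(); idx++)
		threads.emplace_back(worker, idx);

	/* The calling thread handles the last slice, then waits for the rest. */
	worker(slices.size() - 1);
	for (std::thread &t : threads)
		t.join();

	/* Merge the per-slice results after all workers have finished. */
	uint64_t total = 0;
	for (uint64_t p : partial)
		total += p;

	return total;
}
```

With the benchmark height of 2464 lines, 3 threads and an alignment of
2, splitRows() yields the slices [0, 820), [820, 1640) and [1640, 2464).
Running the last slice on the calling thread means N-way parallelism
needs only N - 1 extra threads, which matches the size of the threads[]
array in the patch.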