From patchwork Tue Feb 24 19:37:43 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hans de Goede
X-Patchwork-Id: 26234
From: Hans de Goede
To: libcamera-devel@lists.libcamera.org, Milan Zamazal
Cc: Hans de Goede
Subject: [PATCH v3 2/4] software_isp: debayer_cpu: Add DebayerCpuThread class
Date: Tue, 24 Feb 2026 20:37:43 +0100
Message-ID: <20260224193745.106186-3-johannes.goede@oss.qualcomm.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260224193745.106186-1-johannes.goede@oss.qualcomm.com>
References: <20260224193745.106186-1-johannes.goede@oss.qualcomm.com>

Add a DebayerCpuThread class and use it in the inner render loop. The
class holds the data which needs to be separate per thread.

This is a preparation patch for making DebayerCpu support
multi-threading.

Benchmarking on the Arduino Uno-Q, whose weak CPU makes it a good
target for performance testing, shows 146-147 ms per 3272x2464 frame
both before and after this change, with things maybe being 0.5 ms
slower after this change.

Signed-off-by: Hans de Goede
---
Changes in v3:
- Use std::unique_ptr for the DebayerCpuThread pointers
- Document new DebayerCpuThread class
- Make DebayerCpuThread inherit from both Thread and Object

Changes in v2:
- Replace the DebayerCpuThreadData struct from v1 with a DebayerCpuThread
  class, derived from Object to allow calling invokeMethod for thread
  re-use in followup patches
- As part of this also move a bunch of methods which primarily deal with
  per thread data: setupInputMemcpy(), shiftLinePointers(),
  memcpyNextLine(), process*() to the new DebayerCpuThread class
---
 src/libcamera/software_isp/debayer_cpu.cpp | 244 +++++++++++++++------
 src/libcamera/software_isp/debayer_cpu.h   |  20 +-
 2 files changed, 188 insertions(+), 76 deletions(-)

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index e7b01210..36b7881b 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -18,6 +18,8 @@
 
 #include
 
+#include
+
 #include
 
 #include "libcamera/internal/bayer_format.h"
@@ -27,6 +29,52 @@
 
 namespace libcamera {
 
+/**
+ * \brief Class representing one CPU debayering thread
+ *
+ * Implementation for CPU based debayering threads.
+ */
+class DebayerCpuThread : public Thread, public Object
+{
+public:
+	DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,
+			 bool enableInputMemcpy);
+
+	void configure(unsigned int yStart, unsigned int yEnd);
+	void process(uint32_t frame, const uint8_t *src, uint8_t *dst);
+
+private:
+	void setupInputMemcpy(const uint8_t *linePointers[]);
+	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
+	void memcpyNextLine(const uint8_t *linePointers[]);
+	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
+	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
+
+	DebayerCpu *debayer_;
+	unsigned int threadIndex_;
+	unsigned int yStart_;
+	unsigned int yEnd_;
+	unsigned int lineBufferLength_;
+	unsigned int lineBufferPadding_;
+	unsigned int lineBufferIndex_;
+	std::vector<uint8_t> lineBuffers_[DebayerCpu::kMaxLineBuffers];
+	bool enableInputMemcpy_;
+};
+
+/**
+ * \brief Construct a DebayerCpuThread object
+ * \param[in] debayer pointer back to the DebayerCpu object this thread belongs to
+ * \param[in] threadIndex 0 .. n thread-index value for the thread
+ * \param[in] enableInputMemcpy when set copy input data to a heap buffer before use
+ */
+DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,
+				   bool enableInputMemcpy)
+	: Thread("DebayerCpu:" + std::to_string(threadIndex)),
+	  debayer_(debayer), threadIndex_(threadIndex),
+	  enableInputMemcpy_(enableInputMemcpy)
+{
+}
+
 /**
  * \class DebayerCpu
  * \brief Class for debayering on the CPU
@@ -53,8 +101,14 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwIspStats> stats, const GlobalConfigurat
 	 * \todo Make memcpy automatic based on runtime detection of platform
 	 * capabilities.
 	 */
-	enableInputMemcpy_ =
+	bool enableInputMemcpy =
 		configuration.option<bool>({ "software_isp", "copy_input_buffer" }).value_or(true);
+
+	/* Just one thread object for now, which will be called inline rather than async */
+	threads_.resize(1);
+
+	for (unsigned int i = 0; i < threads_.size(); i++)
+		threads_[i] = std::make_unique<DebayerCpuThread>(this, i, enableInputMemcpy);
 }
 
 DebayerCpu::~DebayerCpu() = default;
@@ -484,7 +538,7 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
 	if (getInputConfig(inputCfg.pixelFormat, inputConfig_) != 0)
 		return -EINVAL;
 
-	if (stats_->configure(inputCfg) != 0)
+	if (stats_->configure(inputCfg, threads_.size()) != 0)
 		return -EINVAL;
 
 	const Size &statsPatternSize = stats_->patternSize();
@@ -548,17 +602,43 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
 	 */
 	stats_->setWindow(Rectangle(window_.size()));
 
+	unsigned int yStart = 0;
+	unsigned int linesPerThread = (window_.height / threads_.size()) &
+				      ~(inputConfig_.patternSize.height - 1);
+	unsigned int i;
+
+	for (i = 0; i < (threads_.size() - 1); i++) {
+		threads_[i]->configure(yStart, yStart + linesPerThread);
+		yStart += linesPerThread;
+	}
+	threads_[i]->configure(yStart, window_.height);
+
+	return 0;
+}
+
+/**
+ * \brief Configure thread to process a specific part of the image
+ * \param[in] yStart y coordinate of first line to process
+ * \param[in] yEnd y coordinate of the line at which to stop processing
+ *
+ * Configure the thread to process lines yStart - (yEnd - 1).
+ */
+void DebayerCpuThread::configure(unsigned int yStart, unsigned int yEnd)
+{
+	Debayer::DebayerInputConfig &inputConfig = debayer_->inputConfig_;
+
+	yStart_ = yStart;
+	yEnd_ = yEnd;
+
 	/* pad with patternSize.Width on both left and right side */
-	lineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;
-	lineBufferLength_ = window_.width * inputConfig_.bpp / 8 +
+	lineBufferPadding_ = inputConfig.patternSize.width * inputConfig.bpp / 8;
+	lineBufferLength_ = debayer_->window_.width * inputConfig.bpp / 8 +
 			    2 * lineBufferPadding_;
 
 	if (enableInputMemcpy_) {
-		for (unsigned int i = 0; i <= inputConfig_.patternSize.height; i++)
+		for (unsigned int i = 0; i <= inputConfig.patternSize.height; i++)
 			lineBuffers_[i].resize(lineBufferLength_);
 	}
-
-	return 0;
 }
 
 /*
@@ -599,9 +679,9 @@ DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, const Size &size
 	return std::make_tuple(stride, stride * size.height);
 }
 
-void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])
+void DebayerCpuThread::setupInputMemcpy(const uint8_t *linePointers[])
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	if (!enableInputMemcpy_)
 		return;
@@ -617,20 +697,20 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])
 	lineBufferIndex_ = patternHeight;
 }
 
-void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)
+void DebayerCpuThread::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	for (unsigned int i = 0; i < patternHeight; i++)
 		linePointers[i] = linePointers[i + 1];
 
-	linePointers[patternHeight] = src +
-				      (patternHeight / 2) * (int)inputConfig_.stride;
+	linePointers[patternHeight] =
+		src + (patternHeight / 2) * (int)debayer_->inputConfig_.stride;
 }
 
-void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
+void DebayerCpuThread::memcpyNextLine(const uint8_t *linePointers[])
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	if (!enableInputMemcpy_)
 		return;
@@ -643,23 +723,48 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
 	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
 }
 
-void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
+/**
+ * \brief Process part of the image assigned to this debayer thread
+ * \param[in] frame The frame number
+ * \param[in] src The source buffer
+ * \param[in] dst The destination buffer
+ */
+void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
-	unsigned int yEnd = window_.height;
+	Rectangle &window = debayer_->window_;
+
+	/* Adjust src to top left corner of the window */
+	src += (window.y + yStart_) * debayer_->inputConfig_.stride +
+	       window.x * debayer_->inputConfig_.bpp / 8;
+	/* Adjust dst for yStart_ */
+	dst += yStart_ * debayer_->outputConfig_.stride;
+
+	if (debayer_->inputConfig_.patternSize.height == 2)
+		process2(frame, src, dst);
+	else
+		process4(frame, src, dst);
+}
+
+void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
+{
+	unsigned int outputStride = debayer_->outputConfig_.stride;
+	unsigned int inputStride = debayer_->inputConfig_.stride;
+	Rectangle &window = debayer_->window_;
+	unsigned int yEnd = yEnd_;
 	/* Holds [0] previous- [1] current- [2] next-line */
 	const uint8_t *linePointers[3];
 
-	/* Adjust src to top left corner of the window */
-	src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;
-
 	/* [x] becomes [x - 1] after initial shiftLinePointers() call */
-	if (window_.y) {
-		linePointers[1] = src - inputConfig_.stride; /* previous-line */
+	if (window.y + yStart_) {
+		linePointers[1] = src - inputStride; /* previous-line */
 		linePointers[2] = src;
 	} else {
-		/* window_.y == 0, use the next line as prev line */
-		linePointers[1] = src + inputConfig_.stride;
+		/* Top line, use the next line as prev line */
+		linePointers[1] = src + inputStride;
 		linePointers[2] = src;
+	}
+
+	if (window.y == 0 && yEnd_ == window.height) {
 		/*
 		 * Last 2 lines also need special handling.
 		 * (And configure() ensures that yEnd >= 2.)
@@ -669,83 +774,93 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
 
 	setupInputMemcpy(linePointers);
 
-	for (unsigned int y = 0; y < yEnd; y += 2) {
+	/*
+	 * Note y is the line-number *inside* the window, since stats_' window
+	 * is the stats window inside/relative to the debayer window. IOW for
+	 * single thread rendering y goes from 0 to window.height.
+	 */
+	for (unsigned int y = yStart_; y < yEnd; y += 2) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, y, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 
-	if (window_.y == 0) {
+	if (window.y == 0 && yEnd_ == window.height) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, yEnd, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, yEnd, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		/* next line may point outside of src, use prev. */
 		linePointers[2] = linePointers[0];
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 }
 
-void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
+void DebayerCpuThread::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
+	unsigned int outputStride = debayer_->outputConfig_.stride;
+	unsigned int inputStride = debayer_->inputConfig_.stride;
+
 	/*
 	 * This holds pointers to [0] 2-lines-up [1] 1-line-up [2] current-line
 	 * [3] 1-line-down [4] 2-lines-down.
 	 */
 	const uint8_t *linePointers[5];
 
-	/* Adjust src to top left corner of the window */
-	src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;
-
 	/* [x] becomes [x - 1] after initial shiftLinePointers() call */
-	linePointers[1] = src - 2 * inputConfig_.stride;
-	linePointers[2] = src - inputConfig_.stride;
+	linePointers[1] = src - 2 * inputStride;
+	linePointers[2] = src - inputStride;
 	linePointers[3] = src;
-	linePointers[4] = src + inputConfig_.stride;
+	linePointers[4] = src + inputStride;
 
 	setupInputMemcpy(linePointers);
 
-	for (unsigned int y = 0; y < window_.height; y += 4) {
+	/*
+	 * Note y is the line-number *inside* the window, since stats_' window
+	 * is the stats window inside/relative to the debayer window. IOW for
+	 * single thread rendering y goes from 0 to window.height.
+	 */
+	for (unsigned int y = yStart_; y < yEnd_; y += 4) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, y, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine2(frame, y, linePointers);
-		(this->*debayer2_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine2(frame, y, linePointers, threadIndex_);
+		debayer_->debayer2(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer3_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer3(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 }
 
@@ -867,10 +982,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 
 	stats_->startFrame(frame);
 
-	if (inputConfig_.patternSize.height == 2)
-		process2(frame, in.planes()[0].data(), out.planes()[0].data());
-	else
-		process4(frame, in.planes()[0].data(), out.planes()[0].data());
+	threads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());
 
 	metadata.planes()[0].bytesused = out.planes()[0].size();
 
diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
index 7a651746..1074bc9c 100644
--- a/src/libcamera/software_isp/debayer_cpu.h
+++ b/src/libcamera/software_isp/debayer_cpu.h
@@ -26,6 +26,7 @@
 
 namespace libcamera {
 
+class DebayerCpuThread;
 class DebayerCpu : public Debayer
 {
 public:
@@ -44,6 +45,8 @@ public:
 	const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
 
 private:
+	friend class DebayerCpuThread;
+
 	/**
 	 * \brief Called to debayer 1 line of Bayer input data to output format
 	 * \param[out] dst Pointer to the start of the output line to write
@@ -74,6 +77,11 @@ private:
 	 */
 	using debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]);
 
+	void debayer0(uint8_t *dst, const uint8_t *src[]) { (this->*debayer0_)(dst, src); }
+	void debayer1(uint8_t *dst, const uint8_t *src[]) { (this->*debayer1_)(dst, src); }
+	void debayer2(uint8_t *dst, const uint8_t *src[]) { (this->*debayer2_)(dst, src); }
+	void debayer3(uint8_t *dst, const uint8_t *src[]) { (this->*debayer3_)(dst, src); }
+
 	/* 8-bit raw bayer format */
 	template
 	void debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);
@@ -105,11 +113,6 @@ private:
 	int setDebayerFunctions(PixelFormat inputFormat,
 				PixelFormat outputFormat,
 				bool ccmEnabled);
-	void setupInputMemcpy(const uint8_t *linePointers[]);
-	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
-	void memcpyNextLine(const uint8_t *linePointers[]);
-	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
-	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
 	void updateGammaTable(const DebayerParams &params);
 	void updateLookupTables(const DebayerParams &params);
 
@@ -142,12 +145,9 @@ private:
 	debayerFn debayer3_;
 	Rectangle window_;
 	std::unique_ptr<SwIspStats> stats_;
-	std::vector<uint8_t> lineBuffers_[kMaxLineBuffers];
-	unsigned int lineBufferLength_;
-	unsigned int lineBufferPadding_;
-	unsigned int lineBufferIndex_;
 	unsigned int xShift_; /* Offset of 0/1 applied to window_.x */
-	bool enableInputMemcpy_;
+
+	std::vector<std::unique_ptr<DebayerCpuThread>> threads_;
 };
 
 } /* namespace libcamera */
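
For reference, the row split done in DebayerCpu::configure() can be tried
outside libcamera. The following is only a minimal standalone sketch of that
arithmetic: splitRows() and its parameter names are hypothetical and not part
of the patch, and it assumes nThreads is at least 1 and patternHeight is a
power of two (2 or 4), which the masking with ~(patternHeight - 1) relies on.

```cpp
#include <utility>
#include <vector>

/*
 * Hypothetical helper, not part of the patch: split the rows [0, height)
 * into nThreads ranges of [yStart, yEnd). Every range except the last is
 * rounded down to a multiple of patternHeight, assumed to be a power of two.
 */
std::vector<std::pair<unsigned int, unsigned int>>
splitRows(unsigned int height, unsigned int nThreads, unsigned int patternHeight)
{
	std::vector<std::pair<unsigned int, unsigned int>> ranges;
	unsigned int linesPerThread = (height / nThreads) & ~(patternHeight - 1);
	unsigned int yStart = 0;

	for (unsigned int i = 0; i + 1 < nThreads; i++) {
		ranges.emplace_back(yStart, yStart + linesPerThread);
		yStart += linesPerThread;
	}

	/* The last range absorbs whatever the rounding left over. */
	ranges.emplace_back(yStart, height);

	return ranges;
}
```

With a height of 1082, 4 threads and a 4-line pattern this gives three ranges
of 268 lines and a last range of 278 lines, so every thread starts on a
pattern boundary. With a single thread, as in this patch, the only range is
the whole window.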