From patchwork Wed Mar 4 07:50:49 2026
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hans de Goede
X-Patchwork-Id: 26250
From: Hans de Goede
To: libcamera-devel@lists.libcamera.org, Milan Zamazal
Cc: Hans de Goede
Subject: [PATCH v5 2/5] software_isp: debayer_cpu: Add DebayerCpuThread class
Date: Wed, 4 Mar 2026 08:50:49 +0100
Message-ID: <20260304075052.11599-3-johannes.goede@oss.qualcomm.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260304075052.11599-1-johannes.goede@oss.qualcomm.com>
References: <20260304075052.11599-1-johannes.goede@oss.qualcomm.com>

Add a DebayerCpuThread class and use it in the inner render loop. The
class holds the data which needs to be separate per thread.

This is a preparation patch for making DebayerCpu support multi-threading.

Benchmarking on the Arduino Uno-Q, whose weak CPU makes it a good target
for performance testing, shows 146-147 ms per 3272x2464 frame both before
and after this change, with things maybe being 0.5 ms slower after this
change.

Reviewed-by: Milan Zamazal
Signed-off-by: Hans de Goede
---
Changes in v4:
- Move kMaxLineBuffers constant to DebayerCpuThread class
- Add Milan's Reviewed-by

Changes in v3:
- Use std::unique_ptr for the DebayerCpuThread pointers
- Document new DebayerCpuThread class
- Make DebayerCpuThread inherit from both Thread and Object

Changes in v2:
- Replace the DebayerCpuThreadData struct from v1 with a DebayerCpuThread
  class, derived from Object to allow calling invokeMethod for thread
  re-use in followup patches
- As part of this also move a bunch of methods which primarily deal with
  per thread data: setupInputMemcpy(), shiftLinePointers(),
  memcpyNextLine(), process*() to the new DebayerCpuThread class
---
 src/libcamera/software_isp/debayer_cpu.cpp | 247 +++++++++++++++------
 src/libcamera/software_isp/debayer_cpu.h   |  23 +-
 2 files changed, 191 insertions(+), 79 deletions(-)

diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp
index e7b012105..d57d640df 100644
--- a/src/libcamera/software_isp/debayer_cpu.cpp
+++ b/src/libcamera/software_isp/debayer_cpu.cpp
@@ -18,6 +18,8 @@
 
 #include
 
+#include
+
 #include
 
 #include "libcamera/internal/bayer_format.h"
@@ -27,6 +29,55 @@
 
 namespace libcamera {
 
+/**
+ * \brief Class representing one CPU debayering thread
+ *
+ * Implementation for CPU based debayering threads.
+ */
+class DebayerCpuThread : public Thread, public Object
+{
+public:
+	DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,
+			 bool enableInputMemcpy);
+
+	void configure(unsigned int yStart, unsigned int yEnd);
+	void process(uint32_t frame, const uint8_t *src, uint8_t *dst);
+
+private:
+	void setupInputMemcpy(const uint8_t *linePointers[]);
+	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
+	void memcpyNextLine(const uint8_t *linePointers[]);
+	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
+	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
+
+	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
+	static constexpr unsigned int kMaxLineBuffers = 5;
+
+	DebayerCpu *debayer_;
+	unsigned int threadIndex_;
+	unsigned int yStart_;
+	unsigned int yEnd_;
+	unsigned int lineBufferLength_;
+	unsigned int lineBufferPadding_;
+	unsigned int lineBufferIndex_;
+	std::vector<uint8_t> lineBuffers_[kMaxLineBuffers];
+	bool enableInputMemcpy_;
+};
+
+/**
+ * \brief Construct a DebayerCpuThread object
+ * \param[in] debayer pointer back to the DebayerCpuObject this thread belongs to
+ * \param[in] threadIndex 0 .. n thread-index value for the thread
+ * \param[in] enableInputMemcpy when set copy input data to a heap buffer before use
+ */
+DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,
+				   bool enableInputMemcpy)
+	: Thread("DebayerCpu:" + std::to_string(threadIndex)),
+	  debayer_(debayer), threadIndex_(threadIndex),
+	  enableInputMemcpy_(enableInputMemcpy)
+{
+}
+
 /**
  * \class DebayerCpu
  * \brief Class for debayering on the CPU
@@ -53,8 +104,14 @@ DebayerCpu::DebayerCpu(std::unique_ptr stats, const GlobalConfigurat
 	 * \todo Make memcpy automatic based on runtime detection of platform
 	 * capabilities.
 	 */
-	enableInputMemcpy_ =
+	bool enableInputMemcpy =
 		configuration.option({ "software_isp", "copy_input_buffer" }).value_or(true);
+
+	/* Just one thread object for now, which will be called inline rather than async */
+	threads_.resize(1);
+
+	for (unsigned int i = 0; i < threads_.size(); i++)
+		threads_[i] = std::make_unique<DebayerCpuThread>(this, i, enableInputMemcpy);
 }
 
 DebayerCpu::~DebayerCpu() = default;
@@ -484,7 +541,7 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
 	if (getInputConfig(inputCfg.pixelFormat, inputConfig_) != 0)
 		return -EINVAL;
 
-	if (stats_->configure(inputCfg) != 0)
+	if (stats_->configure(inputCfg, threads_.size()) != 0)
 		return -EINVAL;
 
 	const Size &statsPatternSize = stats_->patternSize();
@@ -548,17 +605,43 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,
 	 */
 	stats_->setWindow(Rectangle(window_.size()));
 
+	unsigned int yStart = 0;
+	unsigned int linesPerThread = (window_.height / threads_.size()) &
+				      ~(inputConfig_.patternSize.height - 1);
+	unsigned int i;
+
+	for (i = 0; i < (threads_.size() - 1); i++) {
+		threads_[i]->configure(yStart, yStart + linesPerThread);
+		yStart += linesPerThread;
+	}
+	threads_[i]->configure(yStart, window_.height);
+
+	return 0;
+}
+
+/**
+ * \brief Configure thread to process a specific part of the image
+ * \param[in] yStart y coordinate of first line to process
+ * \param[in] yEnd y coordinate of the line at which to stop processing
+ *
+ * Configure the thread to process lines yStart - (yEnd - 1).
+ */
+void DebayerCpuThread::configure(unsigned int yStart, unsigned int yEnd)
+{
+	Debayer::DebayerInputConfig &inputConfig = debayer_->inputConfig_;
+
+	yStart_ = yStart;
+	yEnd_ = yEnd;
+
 	/* pad with patternSize.Width on both left and right side */
-	lineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;
-	lineBufferLength_ = window_.width * inputConfig_.bpp / 8 +
+	lineBufferPadding_ = inputConfig.patternSize.width * inputConfig.bpp / 8;
+	lineBufferLength_ = debayer_->window_.width * inputConfig.bpp / 8 +
 			    2 * lineBufferPadding_;
 
 	if (enableInputMemcpy_) {
-		for (unsigned int i = 0; i <= inputConfig_.patternSize.height; i++)
+		for (unsigned int i = 0; i <= inputConfig.patternSize.height; i++)
 			lineBuffers_[i].resize(lineBufferLength_);
 	}
-
-	return 0;
 }
 
 /*
@@ -599,9 +682,9 @@ DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, const Size &size
 	return std::make_tuple(stride, stride * size.height);
 }
 
-void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])
+void DebayerCpuThread::setupInputMemcpy(const uint8_t *linePointers[])
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	if (!enableInputMemcpy_)
 		return;
@@ -617,20 +700,20 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])
 	lineBufferIndex_ = patternHeight;
 }
 
-void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)
+void DebayerCpuThread::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	for (unsigned int i = 0; i < patternHeight; i++)
 		linePointers[i] = linePointers[i + 1];
 
-	linePointers[patternHeight] = src +
-				      (patternHeight / 2) * (int)inputConfig_.stride;
+	linePointers[patternHeight] =
+		src + (patternHeight / 2) * (int)debayer_->inputConfig_.stride;
 }
 
-void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
+void DebayerCpuThread::memcpyNextLine(const uint8_t *linePointers[])
 {
-	const unsigned int patternHeight = inputConfig_.patternSize.height;
+	const unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;
 
 	if (!enableInputMemcpy_)
 		return;
@@ -643,23 +726,48 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])
 	lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);
 }
 
-void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
+/**
+ * \brief Process part of the image assigned to this debayer thread
+ * \param[in] frame The frame number
+ * \param[in] src The source buffer
+ * \param[in] dst The destination buffer
+ */
+void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
-	unsigned int yEnd = window_.height;
+	Rectangle &window = debayer_->window_;
+
+	/* Adjust src to top left corner of the window */
+	src += (window.y + yStart_) * debayer_->inputConfig_.stride +
+	       window.x * debayer_->inputConfig_.bpp / 8;
+	/* Adjust dst for yStart_ */
+	dst += yStart_ * debayer_->outputConfig_.stride;
+
+	if (debayer_->inputConfig_.patternSize.height == 2)
+		process2(frame, src, dst);
+	else
+		process4(frame, src, dst);
+}
+
+void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
+{
+	unsigned int outputStride = debayer_->outputConfig_.stride;
+	unsigned int inputStride = debayer_->inputConfig_.stride;
+	Rectangle &window = debayer_->window_;
+	unsigned int yEnd = yEnd_;
 	/* Holds [0] previous- [1] current- [2] next-line */
 	const uint8_t *linePointers[3];
 
-	/* Adjust src to top left corner of the window */
-	src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;
-
 	/* [x] becomes [x - 1] after initial shiftLinePointers() call */
-	if (window_.y) {
-		linePointers[1] = src - inputConfig_.stride; /* previous-line */
+	if (window.y + yStart_) {
+		linePointers[1] = src - inputStride; /* previous-line */
 		linePointers[2] = src;
 	} else {
-		/* window_.y == 0, use the next line as prev line */
-		linePointers[1] = src + inputConfig_.stride;
+		/* Top line, use the next line as prev line */
+		linePointers[1] = src + inputStride;
 		linePointers[2] = src;
+	}
+
+	if (window.y == 0 && yEnd_ == window.height) {
 		/*
 		 * Last 2 lines also need special handling.
 		 * (And configure() ensures that yEnd >= 2.)
@@ -669,83 +777,93 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)
 
 	setupInputMemcpy(linePointers);
 
-	for (unsigned int y = 0; y < yEnd; y += 2) {
+	/*
+	 * Note y is the line-number *inside* the window, since stats_' window
+	 * is the stats window inside/relative to the debayer window. IOW for
+	 * single thread rendering y goes from 0 to window.height.
+	 */
+	for (unsigned int y = yStart_; y < yEnd; y += 2) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, y, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 
-	if (window_.y == 0) {
+	if (window.y == 0 && yEnd_ == window.height) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, yEnd, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, yEnd, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		/* next line may point outside of src, use prev. */
 		linePointers[2] = linePointers[0];
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 }
 
-void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
+void DebayerCpuThread::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)
 {
+	unsigned int outputStride = debayer_->outputConfig_.stride;
+	unsigned int inputStride = debayer_->inputConfig_.stride;
+
 	/*
 	 * This holds pointers to [0] 2-lines-up [1] 1-line-up [2] current-line
 	 * [3] 1-line-down [4] 2-lines-down.
 	 */
 	const uint8_t *linePointers[5];
 
-	/* Adjust src to top left corner of the window */
-	src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;
-
 	/* [x] becomes [x - 1] after initial shiftLinePointers() call */
-	linePointers[1] = src - 2 * inputConfig_.stride;
-	linePointers[2] = src - inputConfig_.stride;
+	linePointers[1] = src - 2 * inputStride;
+	linePointers[2] = src - inputStride;
 	linePointers[3] = src;
-	linePointers[4] = src + inputConfig_.stride;
+	linePointers[4] = src + inputStride;
 
 	setupInputMemcpy(linePointers);
 
-	for (unsigned int y = 0; y < window_.height; y += 4) {
+	/*
+	 * Note y is the line-number *inside* the window, since stats_' window
+	 * is the stats window inside/relative to the debayer window. IOW for
+	 * single thread rendering y goes from 0 to window.height.
+	 */
+	for (unsigned int y = yStart_; y < yEnd_; y += 4) {
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine0(frame, y, linePointers);
-		(this->*debayer0_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);
+		debayer_->debayer0(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer1_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer1(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		stats_->processLine2(frame, y, linePointers);
-		(this->*debayer2_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->stats_->processLine2(frame, y, linePointers, threadIndex_);
+		debayer_->debayer2(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 
 		shiftLinePointers(linePointers, src);
 		memcpyNextLine(linePointers);
-		(this->*debayer3_)(dst, linePointers);
-		src += inputConfig_.stride;
-		dst += outputConfig_.stride;
+		debayer_->debayer3(dst, linePointers);
+		src += inputStride;
+		dst += outputStride;
 	}
 }
 
@@ -867,10 +985,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output
 
 	stats_->startFrame(frame);
 
-	if (inputConfig_.patternSize.height == 2)
-		process2(frame, in.planes()[0].data(), out.planes()[0].data());
-	else
-		process4(frame, in.planes()[0].data(), out.planes()[0].data());
+	threads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());
 
 	metadata.planes()[0].bytesused = out.planes()[0].size();
 
diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h
index 7a6517462..780576090 100644
--- a/src/libcamera/software_isp/debayer_cpu.h
+++ b/src/libcamera/software_isp/debayer_cpu.h
@@ -26,6 +26,7 @@
 
 namespace libcamera {
 
+class DebayerCpuThread;
 class DebayerCpu : public Debayer
 {
 public:
@@ -44,6 +45,8 @@ public:
 	const SharedFD &getStatsFD() { return stats_->getStatsFD(); }
 
 private:
+	friend class DebayerCpuThread;
+
 	/**
 	 * \brief Called to debayer 1 line of Bayer input data to output format
 	 * \param[out] dst Pointer to the start of the output line to write
@@ -74,6 +77,11 @@ private:
 	 */
 	using debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]);
 
+	void debayer0(uint8_t *dst, const uint8_t *src[]) { (this->*debayer0_)(dst, src); }
+	void debayer1(uint8_t *dst, const uint8_t *src[]) { (this->*debayer1_)(dst, src); }
+	void debayer2(uint8_t *dst, const uint8_t *src[]) { (this->*debayer2_)(dst, src); }
+	void debayer3(uint8_t *dst, const uint8_t *src[]) { (this->*debayer3_)(dst, src); }
+
 	/* 8-bit raw bayer format */
 	template
 	void debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);
@@ -105,17 +113,9 @@ private:
 	int setDebayerFunctions(PixelFormat inputFormat,
 				PixelFormat outputFormat,
 				bool ccmEnabled);
-	void setupInputMemcpy(const uint8_t *linePointers[]);
-	void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);
-	void memcpyNextLine(const uint8_t *linePointers[]);
-	void process2(uint32_t frame, const uint8_t *src, uint8_t *dst);
-	void process4(uint32_t frame, const uint8_t *src, uint8_t *dst);
 	void updateGammaTable(const DebayerParams &params);
 	void updateLookupTables(const DebayerParams &params);
 
-	/* Max. supported Bayer pattern height is 4, debayering this requires 5 lines */
-	static constexpr unsigned int kMaxLineBuffers = 5;
-
 	static constexpr unsigned int kRGBLookupSize = 256;
 	static constexpr unsigned int kGammaLookupSize = 1024;
 	struct CcmColumn {
@@ -142,12 +142,9 @@ private:
 	debayerFn debayer3_;
 	Rectangle window_;
 	std::unique_ptr stats_;
-	std::vector<uint8_t> lineBuffers_[kMaxLineBuffers];
-	unsigned int lineBufferLength_;
-	unsigned int lineBufferPadding_;
-	unsigned int lineBufferIndex_;
 	unsigned int xShift_; /* Offset of 0/1 applied to window_.x */
-	bool enableInputMemcpy_;
+
+	std::vector<std::unique_ptr<DebayerCpuThread>>threads_;
 };
 
 } /* namespace libcamera */
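
The line partitioning added to DebayerCpu::configure() can be tried out in isolation. What follows is a minimal standalone sketch, not part of the patch: splitLines() and LineRange are hypothetical names introduced here for illustration, and the sketch assumes the Bayer pattern height is a power of two (2 or 4), which the bit mask relies on.

```cpp
#include <cassert>
#include <utility>
#include <vector>

/* Hypothetical type, not part of the patch: [first, second) line range of one thread. */
using LineRange = std::pair<unsigned int, unsigned int>;

/* Hypothetical helper mirroring the yStart / linesPerThread arithmetic of the patch. */
std::vector<LineRange> splitLines(unsigned int height, unsigned int nThreads,
				  unsigned int patternHeight)
{
	assert(nThreads > 0);

	std::vector<LineRange> ranges;
	/* Round down to a multiple of patternHeight (assumed to be a power of two). */
	unsigned int linesPerThread = (height / nThreads) & ~(patternHeight - 1);
	unsigned int yStart = 0;

	for (unsigned int i = 0; i + 1 < nThreads; i++) {
		ranges.emplace_back(yStart, yStart + linesPerThread);
		yStart += linesPerThread;
	}
	/* The last thread also takes whatever the rounding left over. */
	ranges.emplace_back(yStart, height);

	return ranges;
}
```

For a 2464 line window split over 3 threads with a pattern height of 4, this gives the ranges [0, 820), [820, 1640) and [1640, 2464), so the last thread picks up the 4 lines lost to rounding. With a single thread, as in this patch, the only range is the whole window.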