{"id":26250,"url":"https://patchwork.libcamera.org/api/patches/26250/?format=json","web_url":"https://patchwork.libcamera.org/patch/26250/","project":{"id":1,"url":"https://patchwork.libcamera.org/api/projects/1/?format=json","name":"libcamera","link_name":"libcamera","list_id":"libcamera_core","list_email":"libcamera-devel@lists.libcamera.org","web_url":"","scm_url":"","webscm_url":""},"msgid":"<20260304075052.11599-3-johannes.goede@oss.qualcomm.com>","date":"2026-03-04T07:50:49","name":"[v5,2/5] software_isp: debayer_cpu: Add DebayerCpuThread class","commit_ref":null,"pull_url":null,"state":"superseded","archived":false,"hash":"9a81af642e02b542c1fbb6ba3fda2fae52440b15","submitter":{"id":242,"url":"https://patchwork.libcamera.org/api/people/242/?format=json","name":"Hans de Goede","email":"johannes.goede@oss.qualcomm.com"},"delegate":null,"mbox":"https://patchwork.libcamera.org/patch/26250/mbox/","series":[{"id":5817,"url":"https://patchwork.libcamera.org/api/series/5817/?format=json","web_url":"https://patchwork.libcamera.org/project/libcamera/list/?series=5817","date":"2026-03-04T07:50:47","name":"software_isp: debayer_cpu: Add multi-threading support","version":5,"mbox":"https://patchwork.libcamera.org/series/5817/mbox/"}],"comments":"https://patchwork.libcamera.org/api/patches/26250/comments/","check":"pending","checks":"https://patchwork.libcamera.org/api/patches/26250/checks/","tags":{},"headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id B0AD3BE086\n\tfor <parsemail@patchwork.libcamera.org>;\n\tWed,  4 Mar 2026 07:51:03 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 640B66239C;\n\tWed,  4 Mar 2026 08:51:03 +0100 
(CET)","from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com\n\t[205.220.168.131])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 09BAC62396\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tWed,  4 Mar 2026 08:51:00 +0100 (CET)","from pps.filterd (m0279865.ppops.net [127.0.0.1])\n\tby mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id\n\t6245SedC1678073 for <libcamera-devel@lists.libcamera.org>;\n\tWed, 4 Mar 2026 07:50:59 GMT","from mail-qk1-f197.google.com (mail-qk1-f197.google.com\n\t[209.85.222.197])\n\tby mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4cp73h9vh9-1\n\t(version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT)\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tWed, 04 Mar 2026 07:50:58 +0000 (GMT)","by mail-qk1-f197.google.com with SMTP id\n\taf79cd13be357-8c70ab7f67fso6882167885a.3\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 03 Mar 2026 23:50:58 -0800 (PST)","from shalem\n\t(2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl.\n\t[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec])\n\tby smtp.gmail.com with ESMTPSA id\n\ta640c23a62f3a-b935ac73a5dsm693263366b.25.2026.03.03.23.50.55\n\t(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n\tTue, 03 Mar 2026 23:50:56 -0800 (PST)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (2048-bit key;\n\tunprotected) header.d=qualcomm.com header.i=@qualcomm.com\n\theader.b=\"KnGYgS2I\"; dkim=pass (2048-bit key;\n\tunprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com\n\theader.b=\"Y2ozPG9U\"; dkim-atps=neutral","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h=\n\tcc:content-transfer-encoding:date:from:in-reply-to:message-id\n\t:mime-version:references:subject:to; s=qcppdkim1; bh=lj7q66a95nk\n\tSRrYrRSIi8jiFgvLhiP3E04FCMHJ3vrU=; 
b=KnGYgS2IFDYpIpMhDU7qbEJ5bR0\n\tVqpDP2sK/PQXMBoRlbMYt5A6QCnXjIX8RlkD9/QsfpKEtPfei8bAb/2PIER2nJQY\n\tUSNpkd2U8+QipMj+zbv8kWNBpZDTMScPf7XHBHkJWMEFWZZzylRfmq+nkFa23n1W\n\tKm6uu1RxxQuo9L5eathbiPpWl1mDZC7s/oXRqwgqCclCD8hz0JRTux4WztnsKcaz\n\tobyINn0dAcWHzX7FKNRoEQm7z6JenEQcpvoe5uZ3FkKg8TbIeMnMatmoN2KZrzoE\n\tG8WTY9cySiGQoa1/Gaaq4ytyemjHJnBFbD/FkjnAngZhmF15YJkBqmFHdnw==","v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=oss.qualcomm.com; s=google; t=1772610658; x=1773215458;\n\tdarn=lists.libcamera.org; \n\th=content-transfer-encoding:mime-version:references:in-reply-to\n\t:message-id:date:subject:cc:to:from:from:to:cc:subject:date\n\t:message-id:reply-to;\n\tbh=lj7q66a95nkSRrYrRSIi8jiFgvLhiP3E04FCMHJ3vrU=;\n\tb=Y2ozPG9UcjnbK0IXvCr2siYqchJvFBDTjFYW7NC4sjpc827b7SvyScECtb1T21gMZs\n\tOrsOlCmyBWfMOto7OXVYn2TDV7TtjuNYTwYOh4B4ROz1HvpqoY6l7durI7KgYayW37LC\n\tRJ4/FxnJwSYPq1RDepNatUSlK3bY+nAimFYjh2Sv6mwHB2FED534czpsVKGePvS3yDl/\n\tyf6ituVkN4lJEv/ahQ4xFQDcpcqFhNrdSD6hzHTcqFf7oZSaYdU+D4KVV5ISuzOi9SgQ\n\tYsd0Ag+IqLsAyg6idgNXDzkw8vENBNClsnffrbTzUrGszf7LKS6owxhR1n8fXDgtBHdA\n\tcdvA=="],"X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20230601; t=1772610658; 
x=1773215458;\n\th=content-transfer-encoding:mime-version:references:in-reply-to\n\t:message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from\n\t:to:cc:subject:date:message-id:reply-to;\n\tbh=lj7q66a95nkSRrYrRSIi8jiFgvLhiP3E04FCMHJ3vrU=;\n\tb=lgIKXXnhnsTarpYOTG7AedpAz1/P5U7QxrwS9AH5XV0CGc/qtUJ4x8GXWgpa6UYFCE\n\t1C33uEozNT9RxN1jm6apYr9nVl6tT3JgvtQGduONwr6LvzUz3MAzMp5Bfky/jF4JmR7p\n\tXM/tNxeJGbvCi/77Hilfh92KKjT9b8Bmsgl2mUY3ExmssOP5T3CaFIo+nf9R8oJ9TQeB\n\tpMuM1hM5Sc0pwgXKdZ5pHIKmycQ9fASBRrr3D2jdAi0mdZS12+NjkVQk7tknT9H7bRtO\n\tqqPOca9QpJnKg3eREa+Oi6ankq5pswoTCrxZYHL0UbduDqb3qxS7je0tppzKuppau08S\n\tnuAg==","X-Gm-Message-State":"AOJu0YzakoA0IrslLgmpRSau0/r5Rk9XXEPO8kuSq/eGwX2yi5F9wv6x\n\tI9xds9tl7vQIjFWRhXl2+uBnK7TeztzreCV9mnY1c/ed/gI9W3KNpCy/GvMafsRl/5DVasTBAG6\n\t83L5jCfXVvCfIGujz2vS+MrIcUtMbML26Sq1mUJjpJGA5+GLTbySB2jY6O5+Xt0R9PDK+CSK2/J\n\tmCtK9DLCq/","X-Gm-Gg":"ATEYQzw6XAmPytD/3wAEoSnXbYP8HYgc84ibnBnwfPb9EW1haeWGj7vK02d1wGxDHCE\n\tvp9gbH91PLGmZS+SlPBEeIZm+vvhfXsvPgCYJo7gMIBzHjfhFw+SA1jQe95z0a7ZoyNdUcCwtB0\n\tvvp2mh7K/fh//ZlmJ9dB93KUJdIczuWgA8Y08DP4/XBhSBvoJWb7JvotqKLQeswgPP0P2dAt636\n\tstSB+OOMOHVU/VjxmIsBNnrqVKQdo07KeEqqMEXdM8ZJjol9U/xfQQFRGUE/6hGa4Z9HGYddVSK\n\tXFlqs72wJU0r/LTGKoGKDu/gGh1O85Ywc3h6xMzWlgD6/o2EHToayPsaJ9Wygv2OKO9DkGnP9Up\n\tV8i6tT2pUIFZPcaMGVJhUyg1cOA6OM7HTh/4GqxtWmJl7MBfrUo+2kfa2gdKvniqq2mb1E9EWiw\n\ttdkGwvLEBLqe4Sth2MXZba2N9i8wQoGVOlKGQk","X-Received":["by 2002:a05:620a:4495:b0:8b2:f0dd:2a97 with SMTP id\n\taf79cd13be357-8cd5af760a1mr126113685a.37.1772610657406; \n\tTue, 03 Mar 2026 23:50:57 -0800 (PST)","by 2002:a05:620a:4495:b0:8b2:f0dd:2a97 with SMTP id\n\taf79cd13be357-8cd5af760a1mr126112485a.37.1772610656864; \n\tTue, 03 Mar 2026 23:50:56 -0800 (PST)"],"From":"Hans de Goede <johannes.goede@oss.qualcomm.com>","To":"libcamera-devel@lists.libcamera.org, Milan Zamazal <mzamazal@redhat.com>","Cc":"Hans de Goede <johannes.goede@oss.qualcomm.com>","Subject":"[PATCH v5 2/5] software_isp: debayer_cpu: Add DebayerCpuThread class","Date":"Wed,  
4 Mar 2026 08:50:49 +0100","Message-ID":"<20260304075052.11599-3-johannes.goede@oss.qualcomm.com>","X-Mailer":"git-send-email 2.52.0","In-Reply-To":"<20260304075052.11599-1-johannes.goede@oss.qualcomm.com>","References":"<20260304075052.11599-1-johannes.goede@oss.qualcomm.com>","MIME-Version":"1.0","Content-Transfer-Encoding":"8bit","X-Proofpoint-GUID":"VvcgkqAg5YENpbDi_NV9NZkg2rSqatE6","X-Proofpoint-Spam-Details-Enc":"AW1haW4tMjYwMzA0MDA2MiBTYWx0ZWRfX3bbrPlPL8ax1\n\tsSCrJ+wKTvlfYcND9nKRx2OKaUuqlapgWREJtDWVzp9p20SxALHEIgl4YOuhJSnwqKPAdEbEMjt\n\tqPaKl45kDcJxnQvUWsgtQQo7CLLfSKg+kLvcc8LT2qbf72OrdwZLcxl3cQfyp7SvTSYy0eS7KkD\n\tGz7avUloyEnHyd36t/9p9OEHhC1+XL/LtoVlsYO5dCyApfd8R3kOMWY8r5GnbBq4rf7ffARvLmb\n\tNpgnZsgOiSfRpa90xHsMk9dHBH3s6ydLXc4eAP3gmgWNQ6GSk9KBQtwOHVoQvX5vukUz4DzuA3B\n\tf7wgWrxrKCFrmK06OO7OkYJQ2xcEEh4IUh6j7b8Bnjy3H+MhpxGYf5LDmNG6VIAu5ZTaiFIZPiz\n\tv4ZdxH/YkxrorUYBP6at73IlSPMvQhIyouJym6rymJ1RP2RUMtpxCRDPrwAlpPqsz3lEIhKpdBf\n\t4Lcs7+JtwPFbVyFc63g==","X-Proofpoint-ORIG-GUID":"VvcgkqAg5YENpbDi_NV9NZkg2rSqatE6","X-Authority-Analysis":"v=2.4 cv=BpWQAIX5 c=1 sm=1 tr=0 ts=69a7e462 cx=c_pps\n\ta=50t2pK5VMbmlHzFWWp8p/g==:117 a=xqWC_Br6kY4A:10 a=Yq5XynenixoA:10\n\ta=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22 a=u7WPNUs3qKkmUXheDGA7:22\n\ta=Um2Pa8k9VHT-vaBCBUpS:22 a=20KFwNOVAAAA:8 a=EUspDBNiAAAA:8\n\ta=Ba8D0n3WaO7QZ3tp1_0A:9 a=IoWCM6iH3mJn3m4BftBB:22","X-Proofpoint-Virus-Version":"vendor=baseguard\n\tengine=ICAP:2.0.293, Aquarius:18.0.1121, Hydra:6.1.51,\n\tFMLib:17.12.100.49\n\tdefinitions=2026-03-04_02,2026-03-03_01,2025-10-01_01","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tclxscore=1015 suspectscore=0 bulkscore=0 adultscore=0 malwarescore=0\n\tlowpriorityscore=0 impostorscore=0 priorityscore=1501 phishscore=0\n\tspamscore=0 classifier=typeunknown authscore=0 authtc= authcc=\n\troute=outbound\n\tadjust=0 reason=mlx scancount=1 
engine=8.22.0-2602130000\n\tdefinitions=main-2603040062","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"},"content":"Add a DebayerCpuThread class and use this in the inner render loop.\nThis contains data which needs to be separate per thread.\n\nThis is a preparation patch for making DebayerCpu support multi-threading.\n\nBenchmarking on the Arduino Uno-Q with a weak CPU which is good for\nperformance testing, shows 146-147ms per 3272x2464 frame both before and\nafter this change, with things maybe being 0.5 ms slower after this change.\n\nReviewed-by: Milan Zamazal <mzamazal@redhat.com>\nSigned-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n---\nChanges in v4:\n- Move kMaxLineBuffers constant to DebayerCpuThread class\n- Add Milan's Reviewed-by\n\nChanges in v3:\n- Use std::unique_ptr for the DebayerCpuThread pointers\n- Document new DebayerCpuThread class\n- Make DebayerCpuThread inherit from both Thread and Object\n\nChanges in v2:\n- Replace the DebayerCpuThreadData struct from v1 with a DebayerCpuThread\n  class, derived from Object to allow calling invokeMethod for thread re-use\n  in followup patches\n- As part of this also move a bunch of methods which primarily deal with\n  per thread data: 
setupInputMemcpy(), shiftLinePointers(), memcpyNextLine(),\n  process*() to the new DebayerCpuThread class\n---\n src/libcamera/software_isp/debayer_cpu.cpp | 247 +++++++++++++++------\n src/libcamera/software_isp/debayer_cpu.h   |  23 +-\n 2 files changed, 191 insertions(+), 79 deletions(-)","diff":"diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\nindex e7b012105..d57d640df 100644\n--- a/src/libcamera/software_isp/debayer_cpu.cpp\n+++ b/src/libcamera/software_isp/debayer_cpu.cpp\n@@ -18,6 +18,8 @@\n \n #include <linux/dma-buf.h>\n \n+#include <libcamera/base/thread.h>\n+\n #include <libcamera/formats.h>\n \n #include \"libcamera/internal/bayer_format.h\"\n@@ -27,6 +29,55 @@\n \n namespace libcamera {\n \n+/**\n+ * \\brief Class representing one CPU debayering thread\n+ *\n+ * Implementation for CPU based debayering threads.\n+ */\n+class DebayerCpuThread : public Thread, public Object\n+{\n+public:\n+\tDebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,\n+\t\t\t bool enableInputMemcpy);\n+\n+\tvoid configure(unsigned int yStart, unsigned int yEnd);\n+\tvoid process(uint32_t frame, const uint8_t *src, uint8_t *dst);\n+\n+private:\n+\tvoid setupInputMemcpy(const uint8_t *linePointers[]);\n+\tvoid shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);\n+\tvoid memcpyNextLine(const uint8_t *linePointers[]);\n+\tvoid process2(uint32_t frame, const uint8_t *src, uint8_t *dst);\n+\tvoid process4(uint32_t frame, const uint8_t *src, uint8_t *dst);\n+\n+\t/* Max. 
supported Bayer pattern height is 4, debayering this requires 5 lines */\n+\tstatic constexpr unsigned int kMaxLineBuffers = 5;\n+\n+\tDebayerCpu *debayer_;\n+\tunsigned int threadIndex_;\n+\tunsigned int yStart_;\n+\tunsigned int yEnd_;\n+\tunsigned int lineBufferLength_;\n+\tunsigned int lineBufferPadding_;\n+\tunsigned int lineBufferIndex_;\n+\tstd::vector<uint8_t> lineBuffers_[kMaxLineBuffers];\n+\tbool enableInputMemcpy_;\n+};\n+\n+/**\n+ * \\brief Construct a DebayerCpuThread object\n+ * \\param[in] debayer pointer back to the DebayerCpuObject this thread belongs to\n+ * \\param[in] threadIndex 0 .. n thread-index value for the thread\n+ * \\param[in] enableInputMemcpy when set copy input data to a heap buffer before use\n+ */\n+DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,\n+\t\t\t\t   bool enableInputMemcpy)\n+\t: Thread(\"DebayerCpu:\" + std::to_string(threadIndex)),\n+\t  debayer_(debayer), threadIndex_(threadIndex),\n+\t  enableInputMemcpy_(enableInputMemcpy)\n+{\n+}\n+\n /**\n  * \\class DebayerCpu\n  * \\brief Class for debayering on the CPU\n@@ -53,8 +104,14 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats, const GlobalConfigurat\n \t * \\todo Make memcpy automatic based on runtime detection of platform\n \t * capabilities.\n \t */\n-\tenableInputMemcpy_ =\n+\tbool enableInputMemcpy =\n \t\tconfiguration.option<bool>({ \"software_isp\", \"copy_input_buffer\" }).value_or(true);\n+\n+\t/* Just one thread object for now, which will be called inline rather than async */\n+\tthreads_.resize(1);\n+\n+\tfor (unsigned int i = 0; i < threads_.size(); i++)\n+\t\tthreads_[i] = std::make_unique<DebayerCpuThread>(this, i, enableInputMemcpy);\n }\n \n DebayerCpu::~DebayerCpu() = default;\n@@ -484,7 +541,7 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,\n \tif (getInputConfig(inputCfg.pixelFormat, inputConfig_) != 0)\n \t\treturn -EINVAL;\n \n-\tif (stats_->configure(inputCfg) != 0)\n+\tif 
(stats_->configure(inputCfg, threads_.size()) != 0)\n \t\treturn -EINVAL;\n \n \tconst Size &statsPatternSize = stats_->patternSize();\n@@ -548,17 +605,43 @@ int DebayerCpu::configure(const StreamConfiguration &inputCfg,\n \t */\n \tstats_->setWindow(Rectangle(window_.size()));\n \n+\tunsigned int yStart = 0;\n+\tunsigned int linesPerThread = (window_.height / threads_.size()) &\n+\t\t\t\t      ~(inputConfig_.patternSize.height - 1);\n+\tunsigned int i;\n+\n+\tfor (i = 0; i < (threads_.size() - 1); i++) {\n+\t\tthreads_[i]->configure(yStart, yStart + linesPerThread);\n+\t\tyStart += linesPerThread;\n+\t}\n+\tthreads_[i]->configure(yStart, window_.height);\n+\n+\treturn 0;\n+}\n+\n+/**\n+ * \\brief Configure thread to process a specific part of the image\n+ * \\param[in] yStart y coordinate of first line to process\n+ * \\param[in] yEnd y coordinate of the line at which to stop processing\n+ *\n+ * Configure the thread to process lines yStart - (yEnd - 1).\n+ */\n+void DebayerCpuThread::configure(unsigned int yStart, unsigned int yEnd)\n+{\n+\tDebayer::DebayerInputConfig &inputConfig = debayer_->inputConfig_;\n+\n+\tyStart_ = yStart;\n+\tyEnd_ = yEnd;\n+\n \t/* pad with patternSize.Width on both left and right side */\n-\tlineBufferPadding_ = inputConfig_.patternSize.width * inputConfig_.bpp / 8;\n-\tlineBufferLength_ = window_.width * inputConfig_.bpp / 8 +\n+\tlineBufferPadding_ = inputConfig.patternSize.width * inputConfig.bpp / 8;\n+\tlineBufferLength_ = debayer_->window_.width * inputConfig.bpp / 8 +\n \t\t\t    2 * lineBufferPadding_;\n \n \tif (enableInputMemcpy_) {\n-\t\tfor (unsigned int i = 0; i <= inputConfig_.patternSize.height; i++)\n+\t\tfor (unsigned int i = 0; i <= inputConfig.patternSize.height; i++)\n \t\t\tlineBuffers_[i].resize(lineBufferLength_);\n \t}\n-\n-\treturn 0;\n }\n \n /*\n@@ -599,9 +682,9 @@ DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, const Size &size\n \treturn std::make_tuple(stride, stride * size.height);\n }\n 
\n-void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])\n+void DebayerCpuThread::setupInputMemcpy(const uint8_t *linePointers[])\n {\n-\tconst unsigned int patternHeight = inputConfig_.patternSize.height;\n+\tconst unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;\n \n \tif (!enableInputMemcpy_)\n \t\treturn;\n@@ -617,20 +700,20 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])\n \tlineBufferIndex_ = patternHeight;\n }\n \n-void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)\n+void DebayerCpuThread::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src)\n {\n-\tconst unsigned int patternHeight = inputConfig_.patternSize.height;\n+\tconst unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;\n \n \tfor (unsigned int i = 0; i < patternHeight; i++)\n \t\tlinePointers[i] = linePointers[i + 1];\n \n-\tlinePointers[patternHeight] = src +\n-\t\t\t\t      (patternHeight / 2) * (int)inputConfig_.stride;\n+\tlinePointers[patternHeight] =\n+\t\tsrc + (patternHeight / 2) * (int)debayer_->inputConfig_.stride;\n }\n \n-void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])\n+void DebayerCpuThread::memcpyNextLine(const uint8_t *linePointers[])\n {\n-\tconst unsigned int patternHeight = inputConfig_.patternSize.height;\n+\tconst unsigned int patternHeight = debayer_->inputConfig_.patternSize.height;\n \n \tif (!enableInputMemcpy_)\n \t\treturn;\n@@ -643,23 +726,48 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])\n \tlineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);\n }\n \n-void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n+/**\n+ * \\brief Process part of the image assigned to this debayer thread\n+ * \\param[in] frame The frame number\n+ * \\param[in] src The source buffer\n+ * \\param[in] dst The destination buffer\n+ */\n+void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t 
*dst)\n {\n-\tunsigned int yEnd = window_.height;\n+\tRectangle &window = debayer_->window_;\n+\n+\t/* Adjust src to top left corner of the window */\n+\tsrc += (window.y + yStart_) * debayer_->inputConfig_.stride +\n+\t       window.x * debayer_->inputConfig_.bpp / 8;\n+\t/* Adjust dst for yStart_ */\n+\tdst += yStart_ * debayer_->outputConfig_.stride;\n+\n+\tif (debayer_->inputConfig_.patternSize.height == 2)\n+\t\tprocess2(frame, src, dst);\n+\telse\n+\t\tprocess4(frame, src, dst);\n+}\n+\n+void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n+{\n+\tunsigned int outputStride = debayer_->outputConfig_.stride;\n+\tunsigned int inputStride = debayer_->inputConfig_.stride;\n+\tRectangle &window = debayer_->window_;\n+\tunsigned int yEnd = yEnd_;\n \t/* Holds [0] previous- [1] current- [2] next-line */\n \tconst uint8_t *linePointers[3];\n \n-\t/* Adjust src to top left corner of the window */\n-\tsrc += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;\n-\n \t/* [x] becomes [x - 1] after initial shiftLinePointers() call */\n-\tif (window_.y) {\n-\t\tlinePointers[1] = src - inputConfig_.stride; /* previous-line */\n+\tif (window.y + yStart_) {\n+\t\tlinePointers[1] = src - inputStride; /* previous-line */\n \t\tlinePointers[2] = src;\n \t} else {\n-\t\t/* window_.y == 0, use the next line as prev line */\n-\t\tlinePointers[1] = src + inputConfig_.stride;\n+\t\t/* Top line, use the next line as prev line */\n+\t\tlinePointers[1] = src + inputStride;\n \t\tlinePointers[2] = src;\n+\t}\n+\n+\tif (window.y == 0 && yEnd_ == window.height) {\n \t\t/*\n \t\t * Last 2 lines also need special handling.\n \t\t * (And configure() ensures that yEnd >= 2.)\n@@ -669,83 +777,93 @@ void DebayerCpu::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n \n \tsetupInputMemcpy(linePointers);\n \n-\tfor (unsigned int y = 0; y < yEnd; y += 2) {\n+\t/*\n+\t * Note y is the line-number *inside* the window, since stats_' window\n+\t * is 
the stats window inside/relative to the debayer window. IOW for\n+\t * single thread rendering y goes from 0 to window.height.\n+\t */\n+\tfor (unsigned int y = yStart_; y < yEnd; y += 2) {\n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\tstats_->processLine0(frame, y, linePointers);\n-\t\t(this->*debayer0_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);\n+\t\tdebayer_->debayer0(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\t(this->*debayer1_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->debayer1(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \t}\n \n-\tif (window_.y == 0) {\n+\tif (window.y == 0 && yEnd_ == window.height) {\n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\tstats_->processLine0(frame, yEnd, linePointers);\n-\t\t(this->*debayer0_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->stats_->processLine0(frame, yEnd, linePointers, threadIndex_);\n+\t\tdebayer_->debayer0(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \n \t\tshiftLinePointers(linePointers, src);\n \t\t/* next line may point outside of src, use prev. 
*/\n \t\tlinePointers[2] = linePointers[0];\n-\t\t(this->*debayer1_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->debayer1(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \t}\n }\n \n-void DebayerCpu::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)\n+void DebayerCpuThread::process4(uint32_t frame, const uint8_t *src, uint8_t *dst)\n {\n+\tunsigned int outputStride = debayer_->outputConfig_.stride;\n+\tunsigned int inputStride = debayer_->inputConfig_.stride;\n+\n \t/*\n \t * This holds pointers to [0] 2-lines-up [1] 1-line-up [2] current-line\n \t * [3] 1-line-down [4] 2-lines-down.\n \t */\n \tconst uint8_t *linePointers[5];\n \n-\t/* Adjust src to top left corner of the window */\n-\tsrc += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8;\n-\n \t/* [x] becomes [x - 1] after initial shiftLinePointers() call */\n-\tlinePointers[1] = src - 2 * inputConfig_.stride;\n-\tlinePointers[2] = src - inputConfig_.stride;\n+\tlinePointers[1] = src - 2 * inputStride;\n+\tlinePointers[2] = src - inputStride;\n \tlinePointers[3] = src;\n-\tlinePointers[4] = src + inputConfig_.stride;\n+\tlinePointers[4] = src + inputStride;\n \n \tsetupInputMemcpy(linePointers);\n \n-\tfor (unsigned int y = 0; y < window_.height; y += 4) {\n+\t/*\n+\t * Note y is the line-number *inside* the window, since stats_' window\n+\t * is the stats window inside/relative to the debayer window. 
IOW for\n+\t * single thread rendering y goes from 0 to window.height.\n+\t */\n+\tfor (unsigned int y = yStart_; y < yEnd_; y += 4) {\n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\tstats_->processLine0(frame, y, linePointers);\n-\t\t(this->*debayer0_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->stats_->processLine0(frame, y, linePointers, threadIndex_);\n+\t\tdebayer_->debayer0(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\t(this->*debayer1_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->debayer1(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\tstats_->processLine2(frame, y, linePointers);\n-\t\t(this->*debayer2_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->stats_->processLine2(frame, y, linePointers, threadIndex_);\n+\t\tdebayer_->debayer2(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \n \t\tshiftLinePointers(linePointers, src);\n \t\tmemcpyNextLine(linePointers);\n-\t\t(this->*debayer3_)(dst, linePointers);\n-\t\tsrc += inputConfig_.stride;\n-\t\tdst += outputConfig_.stride;\n+\t\tdebayer_->debayer3(dst, linePointers);\n+\t\tsrc += inputStride;\n+\t\tdst += outputStride;\n \t}\n }\n \n@@ -867,10 +985,7 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n \n \tstats_->startFrame(frame);\n \n-\tif (inputConfig_.patternSize.height == 2)\n-\t\tprocess2(frame, in.planes()[0].data(), out.planes()[0].data());\n-\telse\n-\t\tprocess4(frame, in.planes()[0].data(), out.planes()[0].data());\n+\tthreads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());\n 
\n \tmetadata.planes()[0].bytesused = out.planes()[0].size();\n \ndiff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h\nindex 7a6517462..780576090 100644\n--- a/src/libcamera/software_isp/debayer_cpu.h\n+++ b/src/libcamera/software_isp/debayer_cpu.h\n@@ -26,6 +26,7 @@\n \n namespace libcamera {\n \n+class DebayerCpuThread;\n class DebayerCpu : public Debayer\n {\n public:\n@@ -44,6 +45,8 @@ public:\n \tconst SharedFD &getStatsFD() { return stats_->getStatsFD(); }\n \n private:\n+\tfriend class DebayerCpuThread;\n+\n \t/**\n \t * \\brief Called to debayer 1 line of Bayer input data to output format\n \t * \\param[out] dst Pointer to the start of the output line to write\n@@ -74,6 +77,11 @@ private:\n \t */\n \tusing debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]);\n \n+\tvoid debayer0(uint8_t *dst, const uint8_t *src[]) { (this->*debayer0_)(dst, src); }\n+\tvoid debayer1(uint8_t *dst, const uint8_t *src[]) { (this->*debayer1_)(dst, src); }\n+\tvoid debayer2(uint8_t *dst, const uint8_t *src[]) { (this->*debayer2_)(dst, src); }\n+\tvoid debayer3(uint8_t *dst, const uint8_t *src[]) { (this->*debayer3_)(dst, src); }\n+\n \t/* 8-bit raw bayer format */\n \ttemplate<bool addAlphaByte, bool ccmEnabled>\n \tvoid debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n@@ -105,17 +113,9 @@ private:\n \tint setDebayerFunctions(PixelFormat inputFormat,\n \t\t\t\tPixelFormat outputFormat,\n \t\t\t\tbool ccmEnabled);\n-\tvoid setupInputMemcpy(const uint8_t *linePointers[]);\n-\tvoid shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src);\n-\tvoid memcpyNextLine(const uint8_t *linePointers[]);\n-\tvoid process2(uint32_t frame, const uint8_t *src, uint8_t *dst);\n-\tvoid process4(uint32_t frame, const uint8_t *src, uint8_t *dst);\n \tvoid updateGammaTable(const DebayerParams &params);\n \tvoid updateLookupTables(const DebayerParams &params);\n \n-\t/* Max. 
supported Bayer pattern height is 4, debayering this requires 5 lines */\n-\tstatic constexpr unsigned int kMaxLineBuffers = 5;\n-\n \tstatic constexpr unsigned int kRGBLookupSize = 256;\n \tstatic constexpr unsigned int kGammaLookupSize = 1024;\n \tstruct CcmColumn {\n@@ -142,12 +142,9 @@ private:\n \tdebayerFn debayer3_;\n \tRectangle window_;\n \tstd::unique_ptr<SwStatsCpu> stats_;\n-\tstd::vector<uint8_t> lineBuffers_[kMaxLineBuffers];\n-\tunsigned int lineBufferLength_;\n-\tunsigned int lineBufferPadding_;\n-\tunsigned int lineBufferIndex_;\n \tunsigned int xShift_; /* Offset of 0/1 applied to window_.x */\n-\tbool enableInputMemcpy_;\n+\n+\tstd::vector<std::unique_ptr<DebayerCpuThread>>threads_;\n };\n \n } /* namespace libcamera */\n","prefixes":["v5","2/5"]}