[{"id":38278,"web_url":"https://patchwork.libcamera.org/comment/38278/","msgid":"<df124ae8-ec83-4d87-afc8-7c3b5c651e8f@ideasonboard.com>","date":"2026-02-23T16:33:52","subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","submitter":{"id":216,"url":"https://patchwork.libcamera.org/api/people/216/","name":"Barnabás Pőcze","email":"barnabas.pocze@ideasonboard.com"},"content":"Hi\n\n2026. 02. 23. 17:09 keltezéssel, Hans de Goede írta:\n> Add CPU soft ISP multi-threading support.\n> \n> Benchmark results for the Arduino Uno-Q with a weak CPU which is good for\n> performance testing, all numbers with an IMX219 running at\n> 3280x2464 -> 3272x2464:\n> \n> 1 thread : 147ms / frame, ~6.5 fps\n> 2 threads:  80ms / frame, ~12.5 fps\n> 3 threads:  65ms / frame, ~15 fps\n> \n> Adding a 4th thread does not improve performance.\n> \n> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n> ---\n> Changes in v2:\n> - Adjust to use the new DebayerCpuThread class introduced in the v2 patch-series\n> - Re-use threads instead of starting new threads every frame\n> ---\n>   src/libcamera/software_isp/debayer_cpu.cpp | 53 ++++++++++++++++++++--\n>   src/libcamera/software_isp/debayer_cpu.h   |  6 +++\n>   2 files changed, 55 insertions(+), 4 deletions(-)\n> \n> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\n> index 122bfbb05..ea1b17c1f 100644\n> --- a/src/libcamera/software_isp/debayer_cpu.cpp\n> +++ b/src/libcamera/software_isp/debayer_cpu.cpp\n> @@ -18,6 +18,8 @@\n>   \n>   #include <linux/dma-buf.h>\n>   \n> +#include <libcamera/base/thread.h>\n> +\n>   #include <libcamera/formats.h>\n>   \n>   #include \"libcamera/internal/bayer_format.h\"\n> @@ -50,13 +52,15 @@ public:\n>   \tunsigned int lineBufferIndex_;\n>   \tstd::vector<uint8_t> lineBuffers_[DebayerCpu::kMaxLineBuffers];\n>   \tbool enableInputMemcpy_;\n> +\tThread worker_;\n\nAlternatively you can inherit it: `class 
DebayerCpuThread : public Thread, public Object { ...`\n(Given that type has \"Thread\" in its name.) (See e.g. the type `VirtualCameraData`.)\n\n\n>   };\n>   \n>   DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,\n>   \t\t\t\t   bool enableInputMemcpy)\n>   \t: debayer_(debayer), threadIndex_(threadIndex),\n> -\t  enableInputMemcpy_(enableInputMemcpy)\n> +\t  enableInputMemcpy_(enableInputMemcpy), worker_(\"DebayerWorker\")\n\nCould you add the index to the name, e.g. `\"DebayerCpu:\" + std::to_string(threadIndex)` or similar?\n\n\n>   {\n> +\tthis->moveToThread(&worker_);\n>   }\n>   \n>   /**\n> @@ -88,8 +92,10 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats, const GlobalConfigurat\n>   \tbool enableInputMemcpy =\n>   \t\tconfiguration.option<bool>({ \"software_isp\", \"copy_input_buffer\" }).value_or(true);\n>   \n> -\t/* Just one thread object for now, which will be called inline rather than async */\n> -\tthreads_.resize(1);\n> +\tunsigned int threadCount =\n> +\t\tconfiguration.option<unsigned int>({ \"software_isp\", \"threads\" }).value_or(2);\n> +\tthreadCount = std::clamp(threadCount, 1u, 8u);\n> +\tthreads_.resize(threadCount);\n>   \n>   \tfor (unsigned int i = 0; i < threads_.size(); i++)\n>   \t\tthreads_[i] = new DebayerCpuThread(this, i, enableInputMemcpy);\n> @@ -714,6 +720,11 @@ void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)\n>   \t\tprocess2(frame, src, dst);\n>   \telse\n>   \t\tprocess4(frame, src, dst);\n> +\n> +\tdebayer_->workPendingMutex_.lock();\n> +\tdebayer_->workPending_ &= ~(1 << threadIndex_);\n> +\tdebayer_->workPendingMutex_.unlock();\n> +\tdebayer_->workPendingCv_.notify_one();\n>   }\n>   \n>   void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n> @@ -953,7 +964,24 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n>   \n>   \tstats_->startFrame(frame);\n>   \n> 
-\tthreads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());\n> +\tworkPendingMutex_.lock();\n\nIs the above locking needed?\n\n\n> +\tworkPending_ = (1 << threads_.size()) - 1;\n\nWhy not just have it be `threads_.size()`? And then subtract one in the worker when done?\n\n\n> +\tworkPendingMutex_.unlock();\n> +\n> +\tfor (unsigned int i = 0; i < threads_.size(); i++)\n> +\t\tthreads_[i]->invokeMethod(&DebayerCpuThread::process,\n> +\t\t\t\t\t  ConnectionTypeQueued, frame,\n> +\t\t\t\t\t  in.planes()[0].data(), out.planes()[0].data());\n> +\n> +\t{\n> +\t\tMutexLocker locker(workPendingMutex_);\n> +\n> +\t\tauto workPending = ([&]() LIBCAMERA_TSA_REQUIRES(workPendingMutex_) {\n                                    ^\n\nI think you can drop the extra `()`.\n\n\n> +\t\t\treturn workPending_ == 0;\n> +\t\t});\n> +\n> +\t\tworkPendingCv_.wait(locker, workPending);\n\nActually, I would probably inline the lambda here.\n\n\n> +\t}\n>   \n>   \tmetadata.planes()[0].bytesused = out.planes()[0].size();\n>   \n> @@ -972,6 +1000,23 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n>   \tinputBufferReady.emit(input);\n>   }\n>   \n> +int DebayerCpu::start()\n> +{\n> +\tfor (unsigned int i = 0; i < threads_.size(); i++)\n> +\t\tthreads_[i]->worker_.start();\n> +\n> +\treturn 0;\n> +}\n> +\n> +void DebayerCpu::stop()\n> +{\n> +\tfor (unsigned int i = 0; i < threads_.size(); i++)\n> +\t\tthreads_[i]->worker_.exit();\n> +\n> +\tfor (unsigned int i = 0; i < threads_.size(); i++)\n\nPlease\n\n   for (auto &thr : threads_)\n\nhere and above.\n\n\n> +\t\tthreads_[i]->worker_.wait();\n> +}\n> +\n>   SizeRange DebayerCpu::sizes(PixelFormat inputFormat, const Size &inputSize)\n>   {\n>   \tSize patternSize = this->patternSize(inputFormat);\n> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h\n> index 7196dcdd0..2c84f8e40 100644\n> --- a/src/libcamera/software_isp/debayer_cpu.h\n> +++ 
b/src/libcamera/software_isp/debayer_cpu.h\n> @@ -16,6 +16,7 @@\n>   #include <vector>\n>   \n>   #include <libcamera/base/object.h>\n> +#include <libcamera/base/mutex.h>\n>   \n>   #include \"libcamera/internal/bayer_format.h\"\n>   #include \"libcamera/internal/global_configuration.h\"\n> @@ -41,6 +42,8 @@ public:\n>   \tstd::tuple<unsigned int, unsigned int>\n>   \tstrideAndFrameSize(const PixelFormat &outputFormat, const Size &size);\n>   \tvoid process(uint32_t frame, FrameBuffer *input, FrameBuffer *output, const DebayerParams &params);\n> +\tint start();\n> +\tvoid stop();\n>   \tSizeRange sizes(PixelFormat inputFormat, const Size &inputSize);\n>   \tconst SharedFD &getStatsFD() { return stats_->getStatsFD(); }\n>   \n> @@ -147,6 +150,9 @@ private:\n>   \tstd::unique_ptr<SwStatsCpu> stats_;\n>   \tunsigned int xShift_; /* Offset of 0/1 applied to window_.x */\n>   \n> +\tunsigned int workPending_ LIBCAMERA_TSA_GUARDED_BY(workPendingMutex_);\n> +\tMutex workPendingMutex_;\n> +\tConditionVariable workPendingCv_;\n>   \tstd::vector<DebayerCpuThread *>threads_;\n>   };\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 51681C0DA4\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon, 23 Feb 2026 16:33:57 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id EE98F6229C;\n\tMon, 23 Feb 2026 17:33:56 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 2544D6227B\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon, 23 Feb 2026 17:33:55 +0100 (CET)","from [192.168.33.88] (185.221.141.206.nat.pool.zt.hu\n\t[185.221.141.206])\n\tby 
perceval.ideasonboard.com (Postfix) with ESMTPSA id 04E134F1;\n\tMon, 23 Feb 2026 17:32:58 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"enyLWdfv\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1771864379;\n\tbh=kAf+clriFNWfrHasLp57YATi9RLzNhL4P3a7y/y3sAM=;\n\th=Date:Subject:To:Cc:References:From:In-Reply-To:From;\n\tb=enyLWdfvq0vvmfVqbCp5BTuA3J9d/6Vm3Tmf1F20ymeGWAhr2AP/j0p9uPNe+wzxM\n\tVeB7837Bxu1q5sYVcnTbjru973V1yhsRfZrGRkVL8KgXi0+USzqgR2c/JzhzW/w+su\n\tsbHlm/Uy/ASOLfNTzaci06Wk7UpqHXnW7tbBaMrk=","Message-ID":"<df124ae8-ec83-4d87-afc8-7c3b5c651e8f@ideasonboard.com>","Date":"Mon, 23 Feb 2026 17:33:52 +0100","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","To":"Hans de Goede <johannes.goede@oss.qualcomm.com>,\n\tlibcamera-devel@lists.libcamera.org","Cc":"Milan Zamazal <mzamazal@redhat.com>","References":"<20260223160930.27913-1-johannes.goede@oss.qualcomm.com>\n\t<20260223160930.27913-4-johannes.goede@oss.qualcomm.com>","From":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <barnabas.pocze@ideasonboard.com>","Content-Language":"en-US, hu-HU","In-Reply-To":"<20260223160930.27913-4-johannes.goede@oss.qualcomm.com>","Content-Type":"text/plain; charset=UTF-8; 
format=flowed","Content-Transfer-Encoding":"8bit","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":38285,"web_url":"https://patchwork.libcamera.org/comment/38285/","msgid":"<6f5a393e-c6f5-4aa7-928f-4d1728adcb24@oss.qualcomm.com>","date":"2026-02-24T12:44:39","subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","submitter":{"id":242,"url":"https://patchwork.libcamera.org/api/people/242/","name":"Hans de Goede","email":"johannes.goede@oss.qualcomm.com"},"content":"Hi,\n\nOn 23-Feb-26 17:33, Barnabás Pőcze wrote:\n> Hi\n> \n> 2026. 02. 23. 
17:09 keltezéssel, Hans de Goede írta:\n>> Add CPU soft ISP multi-threading support.\n>>\n>> Benchmark results for the Arduino Uno-Q with a weak CPU which is good for\n>> performance testing, all numbers with an IMX219 running at\n>> 3280x2464 -> 3272x2464:\n>>\n>> 1 thread : 147ms / frame, ~6.5 fps\n>> 2 threads:  80ms / frame, ~12.5 fps\n>> 3 threads:  65ms / frame, ~15 fps\n>>\n>> Adding a 4th thread does not improve performance.\n>>\n>> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n>> ---\n>> Changes in v2:\n>> - Adjust to use the new DebayerCpuThread class introduced in the v2 patch-series\n>> - Re-use threads instead of starting new threads every frame\n>> ---\n>>   src/libcamera/software_isp/debayer_cpu.cpp | 53 ++++++++++++++++++++--\n>>   src/libcamera/software_isp/debayer_cpu.h   |  6 +++\n>>   2 files changed, 55 insertions(+), 4 deletions(-)\n>>\n>> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\n>> index 122bfbb05..ea1b17c1f 100644\n>> --- a/src/libcamera/software_isp/debayer_cpu.cpp\n>> +++ b/src/libcamera/software_isp/debayer_cpu.cpp\n>> @@ -18,6 +18,8 @@\n>>     #include <linux/dma-buf.h>\n>>   +#include <libcamera/base/thread.h>\n>> +\n>>   #include <libcamera/formats.h>\n>>     #include \"libcamera/internal/bayer_format.h\"\n>> @@ -50,13 +52,15 @@ public:\n>>       unsigned int lineBufferIndex_;\n>>       std::vector<uint8_t> lineBuffers_[DebayerCpu::kMaxLineBuffers];\n>>       bool enableInputMemcpy_;\n>> +    Thread worker_;\n> \n> Alternatively you can inherit it: `class DebayerCpuThread : public Thread, public Object { ...`\n> (Given that type has \"Thread\" in its name.) (See e.g. 
the type `VirtualCameraData`.)\n\nGood idea, will do for v3.\n\n> \n> \n>>   };\n>>     DebayerCpuThread::DebayerCpuThread(DebayerCpu *debayer, unsigned int threadIndex,\n>>                      bool enableInputMemcpy)\n>>       : debayer_(debayer), threadIndex_(threadIndex),\n>> -      enableInputMemcpy_(enableInputMemcpy)\n>> +      enableInputMemcpy_(enableInputMemcpy), worker_(\"DebayerWorker\")\n> \n> Could you add the index to the name, e.g. `\"DebayerCpu:\" + std::to_string(threadIndex)` or similar?\n\nAck, will do for v3.\n\n>>   {\n>> +    this->moveToThread(&worker_);\n>>   }\n>>     /**\n>> @@ -88,8 +92,10 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats, const GlobalConfigurat\n>>       bool enableInputMemcpy =\n>>           configuration.option<bool>({ \"software_isp\", \"copy_input_buffer\" }).value_or(true);\n>>   -    /* Just one thread object for now, which will be called inline rather than async */\n>> -    threads_.resize(1);\n>> +    unsigned int threadCount =\n>> +        configuration.option<unsigned int>({ \"software_isp\", \"threads\" }).value_or(2);\n>> +    threadCount = std::clamp(threadCount, 1u, 8u);\n>> +    threads_.resize(threadCount);\n>>         for (unsigned int i = 0; i < threads_.size(); i++)\n>>           threads_[i] = new DebayerCpuThread(this, i, enableInputMemcpy);\n>> @@ -714,6 +720,11 @@ void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)\n>>           process2(frame, src, dst);\n>>       else\n>>           process4(frame, src, dst);\n>> +\n>> +    debayer_->workPendingMutex_.lock();\n>> +    debayer_->workPending_ &= ~(1 << threadIndex_);\n>> +    debayer_->workPendingMutex_.unlock();\n>> +    debayer_->workPendingCv_.notify_one();\n>>   }\n>>     void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n>> @@ -953,7 +964,24 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n>>         stats_->startFrame(frame);\n>>   -    
threads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());\n>> +    workPendingMutex_.lock();\n> \n> Is the above locking needed?\n\nIt is the correct thing to do and necessary to not get warnings because of\nworkPending_ being marked as LIBCAMERA_TSA_GUARDED_BY(workPendingMutex_)\n\n\n> \n> \n>> +    workPending_ = (1 << threads_.size()) - 1;\n> \n> Why not just have it be `threads_.size()`? And then subtract one in the worker when done?\n\nIt feels more correct to me to have one pending bit per part of\nthe image being processed and clear those.\n\n>> +    workPendingMutex_.unlock();\n>> +\n>> +    for (unsigned int i = 0; i < threads_.size(); i++)\n>> +        threads_[i]->invokeMethod(&DebayerCpuThread::process,\n>> +                      ConnectionTypeQueued, frame,\n>> +                      in.planes()[0].data(), out.planes()[0].data());\n>> +\n>> +    {\n>> +        MutexLocker locker(workPendingMutex_);\n>> +\n>> +        auto workPending = ([&]() LIBCAMERA_TSA_REQUIRES(workPendingMutex_) {\n>                                    ^\n> \n> I think you can drop the extra `()`.\n> \n> \n>> +            return workPending_ == 0;\n>> +        });\n>> +\n>> +        workPendingCv_.wait(locker, workPending);\n> \n> Actually, I would probably inline the lambda here.\n\nAck.\n\n> \n> \n>> +    }\n>>         metadata.planes()[0].bytesused = out.planes()[0].size();\n>>   @@ -972,6 +1000,23 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n>>       inputBufferReady.emit(input);\n>>   }\n>>   +int DebayerCpu::start()\n>> +{\n>> +    for (unsigned int i = 0; i < threads_.size(); i++)\n>> +        threads_[i]->worker_.start();\n>> +\n>> +    return 0;\n>> +}\n>> +\n>> +void DebayerCpu::stop()\n>> +{\n>> +    for (unsigned int i = 0; i < threads_.size(); i++)\n>> +        threads_[i]->worker_.exit();\n>> +\n>> +    for (unsigned int i = 0; i < threads_.size(); i++)\n> \n> Please\n> \n>   for (auto &thr : threads_)\n> \n> 
here and above.\n\nAck.\n\nRegards,\n\nHans","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 1B7C5C0DA4\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 24 Feb 2026 12:44:48 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id D9431622A0;\n\tTue, 24 Feb 2026 13:44:46 +0100 (CET)","from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com\n\t[205.220.168.131])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 80D7562080\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 24 Feb 2026 13:44:44 +0100 (CET)","from pps.filterd (m0279867.ppops.net [127.0.0.1])\n\tby mx0a-0031df01.pphosted.com (8.18.1.11/8.18.1.11) with ESMTP id\n\t61OAFWcp2220901 for <libcamera-devel@lists.libcamera.org>;\n\tTue, 24 Feb 2026 12:44:43 GMT","from mail-qk1-f197.google.com (mail-qk1-f197.google.com\n\t[209.85.222.197])\n\tby mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 4ch4e39f8q-1\n\t(version=TLSv1.3 cipher=TLS_AES_128_GCM_SHA256 bits=128 verify=NOT)\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 24 Feb 2026 12:44:42 +0000 (GMT)","by mail-qk1-f197.google.com with SMTP id\n\taf79cd13be357-8cb3ff05c73so4776534485a.0\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 24 Feb 2026 04:44:42 -0800 (PST)","from ?IPV6:2001:1c00:c32:7800:5bfa:a036:83f0:f9ec?\n\t(2001-1c00-0c32-7800-5bfa-a036-83f0-f9ec.cable.dynamic.v6.ziggo.nl.\n\t[2001:1c00:c32:7800:5bfa:a036:83f0:f9ec])\n\tby smtp.gmail.com with ESMTPSA id\n\ta640c23a62f3a-b9084cad862sm421607266b.26.2026.02.24.04.44.39\n\t(version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);\n\tTue, 24 Feb 2026 04:44:40 -0800 (PST)"],"Authentication-Results":"lancelot.ideasonboard.com; 
dkim=pass (2048-bit key;\n\tunprotected) header.d=qualcomm.com header.i=@qualcomm.com\n\theader.b=\"V8u4QcEG\"; dkim=pass (2048-bit key;\n\tunprotected) header.d=oss.qualcomm.com header.i=@oss.qualcomm.com\n\theader.b=\"kSViVVgm\"; dkim-atps=neutral","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=qualcomm.com; h=\n\tcc:content-transfer-encoding:content-type:date:from:in-reply-to\n\t:message-id:mime-version:references:subject:to; s=qcppdkim1; bh=\n\tXdRqpJTYAMk0uj17soqLQJAJShW318dDfk0NKxxREco=; b=V8u4QcEGGShklGdL\n\tVC64ilH5uSoeEpCpBije9kRHhDuLRRFRD4yTqh0A5HfiDxGC74943znX+wJ1wRqK\n\tOZCd5LNWik5htnKym23vZL5ygrP5+Zvv9xvrO6pD2BmNH3HYg/Ai9a06vPAIA9Vb\n\tz1MrrHRsnSMTg5ntFARhAOtJHhR7b74M4gEkqvk/KEpe1hmz9wf9uBv171VeIDO2\n\tFr3wFEI+lDZjGoR12+q+68jmZbllh+I1hshDVJMpGyBkLEx20OCcI/WcZLno1UO1\n\tMC/fspqAQTzi+jWsNAafNd/tnYJ3HqNVVJGg19Ye1lK0TdFV43prqAdoQity1zo0\n\tlzU3AA==","v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=oss.qualcomm.com; s=google; t=1771937081; x=1772541881;\n\tdarn=lists.libcamera.org; \n\th=content-transfer-encoding:in-reply-to:content-language:references\n\t:cc:to:subject:from:user-agent:mime-version:date:message-id:from:to\n\t:cc:subject:date:message-id:reply-to;\n\tbh=XdRqpJTYAMk0uj17soqLQJAJShW318dDfk0NKxxREco=;\n\tb=kSViVVgmIFoJCyA6jmbFY8AOo4SC29wX4qacGn5xMbR91j/Mr3wyZ2JZMqebgvdF3U\n\tdcEZjfm4R2PrC1dR+fDrzvtNSMT7PSHLGYK+wZsFDHHzVxy12jJ8/HfCTr8Dqk4Yllvx\n\tnqs7ibLDE1QBhKh9Q7J7ZJ36/hI6Bv74kSGazT+7C9kGqxIiBO6XcpIwkCF123ZBMJaS\n\tWFh8TQuyW6KusQZl2jhx5FRtTS7NjAP1VPgIE8/unYcgd+yob64Apj41sdxMdBkDrfFG\n\ttLel+qk+vYH6UzPkHlv80sT3anNNFNTW/2EmJMS4IY9RrfMSGltWKjcE5Xtp+z3q8i0Z\n\t4TKg=="],"X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20230601; t=1771937081; 
x=1772541881;\n\th=content-transfer-encoding:in-reply-to:content-language:references\n\t:cc:to:subject:from:user-agent:mime-version:date:message-id:x-gm-gg\n\t:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;\n\tbh=XdRqpJTYAMk0uj17soqLQJAJShW318dDfk0NKxxREco=;\n\tb=IeJ48I+9jad0DwsNPtsGudHfVTqzHplbIeEZKnB7J/Sz5PcCu5fPyciwQk4QahbPqd\n\tkC0F846S7Vy7BjrD4Yn1LXAgzEwz9ZDtgN/G+80wU0fCkqj8yYHu5AzJVa5SxAPep+Mn\n\twwWl8BgrP+H7nsgsHoBZHX0IqKOKxpAYHhdUBFXWn1ceeMSg4GAMLwc5UTkVNzjzkvKK\n\tkCq04K30Zlmn2nIW1HIOvnHb/L1WrD4+mxTfszO/GQygW2lpWmsHrcZplihF0nRvIJ7S\n\timJt7pXB3/GNKf1aXozJzxxHdXEE/nxhl0bHdm+w/MWNqLegQSvRVKRXlFwwoiv9SBRG\n\tmsPQ==","X-Forwarded-Encrypted":"i=1;\n\tAJvYcCXd79Gvuw7xpGg8oJ2ev04RakTkXUBTGquDG51nixwsI2tHtFPx5UhN75JEJ3R/kMkqWngSyi3uWTjhKWLfD6A=@lists.libcamera.org","X-Gm-Message-State":"AOJu0Yxt1LAvoBC1+x8J/unz8uigu8KMEg19HvzHhVgN9wSlttkVG4SW\n\t2phFNRxyAm/SS8UUvOpObpGqxUfRatepxa7G2dLGIaVHO0lHQWXPZQpk+Mzs/R5qw8sBAZZWkM5\n\tB5tnHQ78ySKVTlMRjF43OxmQ/ZtYjn0sMUnT+VBkxuE5tEoLcJzcddIm2JtyviW7QJPgoOpaVRj\n\tWV","X-Gm-Gg":"AZuq6aLCj9MW+zS830px9s1GV2iktAS4ZNADfQAGvPaRjObp0b7pXXZOOXcw/lSCjxU\n\tL2T/VJ5aEX4lVrUGO6JGtVuJGHk1ws3p2tSa/SyzJwNO49pjn5EAOFpyknUTQWdNJjrMN8iv14j\n\t43QLe/oeXvPHZCmVdjkj6leQ2J5f0b+Xfd2bgzv8xA6sSyrykZdWB3uJWt9B1gdf+YsrP9ziRWQ\n\tK72bZ8ls4k6yM2hYmWPoa5j56GMSZX7lHIWUkScdiYgPOzC+0CYoAh3TJrHSj+IKltksk9Cr80A\n\tcS1qsmqanFvdmJDh4YDDN3XLctvOwk44raW1kWLtuy+EGnH/HfFdkIczC9ADOSzBmzKOvKphZ3m\n\tn7MhEOE2KvSNYNLYZ6t8FyY8GQ3E8DHsd50xmeO17cumcIhb2ky1HoA4oaqalhD019ewTB448jQ\n\t0CArJDc93rxynjNM5W+k4B2aByUGZaXFslEigfDnPuo0tnBdyNLpVIrRl2G9lf0Gk40hq7sqxpC\n\tAiyA2dHxQCgRl8V","X-Received":["by 2002:a05:620a:40c6:b0:8c6:d343:79a4 with SMTP id\n\taf79cd13be357-8cb8ca67566mr1632214485a.40.1771937081201; \n\tTue, 24 Feb 2026 04:44:41 -0800 (PST)","by 2002:a05:620a:40c6:b0:8c6:d343:79a4 with SMTP id\n\taf79cd13be357-8cb8ca67566mr1632211285a.40.1771937080735; \n\tTue, 24 Feb 2026 04:44:40 -0800 
(PST)"],"Message-ID":"<6f5a393e-c6f5-4aa7-928f-4d1728adcb24@oss.qualcomm.com>","Date":"Tue, 24 Feb 2026 13:44:39 +0100","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","From":"Hans de Goede <johannes.goede@oss.qualcomm.com>","Subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","To":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <barnabas.pocze@ideasonboard.com>,\n\tlibcamera-devel@lists.libcamera.org","Cc":"Milan Zamazal <mzamazal@redhat.com>","References":"<20260223160930.27913-1-johannes.goede@oss.qualcomm.com>\n\t<20260223160930.27913-4-johannes.goede@oss.qualcomm.com>\n\t<df124ae8-ec83-4d87-afc8-7c3b5c651e8f@ideasonboard.com>","Content-Language":"en-US, nl","In-Reply-To":"<df124ae8-ec83-4d87-afc8-7c3b5c651e8f@ideasonboard.com>","Content-Type":"text/plain; charset=UTF-8","Content-Transfer-Encoding":"8bit","X-Authority-Analysis":"v=2.4 cv=DfIaa/tW c=1 sm=1 tr=0 ts=699d9d3a cx=c_pps\n\ta=50t2pK5VMbmlHzFWWp8p/g==:117 a=xqWC_Br6kY4A:10 a=IkcTkHD0fZMA:10\n\ta=HzLeVaNsDn8A:10 a=s4-Qcg_JpJYA:10 a=VkNPw1HP01LnGYTKEx00:22\n\ta=u7WPNUs3qKkmUXheDGA7:22 a=eoimf2acIAo5FJnRuUoq:22 a=EUspDBNiAAAA:8\n\ta=7k3QyJuu53KY6fNyNgEA:9 a=3ZKOabzyN94A:10 a=QEXdDO2ut3YA:10\n\ta=IoWCM6iH3mJn3m4BftBB:22","X-Proofpoint-GUID":"5e-STXzxYzwkiT-pDSoX3Td3w7kg84qt","X-Proofpoint-ORIG-GUID":"5e-STXzxYzwkiT-pDSoX3Td3w7kg84qt","X-Proofpoint-Spam-Details-Enc":"AW1haW4tMjYwMjI0MDEwNCBTYWx0ZWRfX7djWeIyKw6US\n\tWchcar0/Qpusm9TG8Jq6+tO0DCkBxEQJDwIOge9/ChzMKlz3LrieSeoL1D2IDHDB+YehPLqi13u\n\t2JYhRXNJmwFl2QMVMhD0+YDGc0IoPsIaLFuLf0SVF06F8tRGJjccS/b5pGs9bFjtXtj+AqUQvjP\n\tPDRzsnlKXH0ggNkp/EL8O+HNrXuX/4wI1h3H2HQowBtZV0PQZW3xuE3WbH3XWPVaWeyKKWkm/yi\n\t0XHk77ibI62/B1290Txdptdo54EpkWc4Aoog5v7sdfiyYD3cm7Le07/UMqWtwXUvNqNR5Z6gIfn\n\tWRAqvieFMy74oMqDIwxZ7jQ68qSj3M9QDCvxKjOrUfj/X7uI/Ye+jiUl/o9AF/JFk7ni2f001AD\n\tBTvNHJzvP7qJxNClDZbmdppbez0lGgCed8KQNcsiJQS4Y4bKXNnUmEDTgNHaXy8eZHlttxz762t\n\tkwI1fh2cAmvSeq85msw==","X-Proofpoint-Virus-Version":"vendor=baseguard\n\tengine=ICAP:2.0.293, 
Aquarius:18.0.1121, Hydra:6.1.51,\n\tFMLib:17.12.100.49\n\tdefinitions=2026-02-24_01,2026-02-23_03,2025-10-01_01","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tpriorityscore=1501 spamscore=0 phishscore=0 lowpriorityscore=0\n\tsuspectscore=0\n\tadultscore=0 impostorscore=0 bulkscore=0 clxscore=1015 malwarescore=0\n\tclassifier=typeunknown authscore=0 authtc= authcc= route=outbound\n\tadjust=0\n\treason=mlx scancount=1 engine=8.22.0-2602130000\n\tdefinitions=main-2602240104","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":38286,"web_url":"https://patchwork.libcamera.org/comment/38286/","msgid":"<2d323f42-b4b3-4998-b8ae-8c2e38f63705@ideasonboard.com>","date":"2026-02-24T12:51:46","subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","submitter":{"id":216,"url":"https://patchwork.libcamera.org/api/people/216/","name":"Barnabás Pőcze","email":"barnabas.pocze@ideasonboard.com"},"content":"2026. 02. 24. 13:44 keltezéssel, Hans de Goede írta:\n> Hi,\n> \n> On 23-Feb-26 17:33, Barnabás Pőcze wrote:\n>> Hi\n>>\n>> 2026. 02. 23. 
17:09 keltezéssel, Hans de Goede írta:\n>>> Add CPU soft ISP multi-threading support.\n>>>\n>>> Benchmark results for the Arduino Uno-Q with a weak CPU which is good for\n>>> performance testing, all numbers with an IMX219 running at\n>>> 3280x2464 -> 3272x2464:\n>>>\n>>> 1 thread : 147ms / frame, ~6.5 fps\n>>> 2 threads:  80ms / frame, ~12.5 fps\n>>> 3 threads:  65ms / frame, ~15 fps\n>>>\n>>> Adding a 4th thread does not improve performance.\n>>>\n>>> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n>>> ---\n>>> Changes in v2:\n>>> - Adjust to use the new DebayerCpuThread class introduced in the v2 patch-series\n>>> - Re-use threads instead of starting new threads every frame\n>>> ---\n>>>    src/libcamera/software_isp/debayer_cpu.cpp | 53 ++++++++++++++++++++--\n>>>    src/libcamera/software_isp/debayer_cpu.h   |  6 +++\n>>>    2 files changed, 55 insertions(+), 4 deletions(-)\n>>>\n>>> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\n>>> index 122bfbb05..ea1b17c1f 100644\n>>> --- a/src/libcamera/software_isp/debayer_cpu.cpp\n>>> +++ b/src/libcamera/software_isp/debayer_cpu.cpp\n> [...]\n>>> @@ -714,6 +720,11 @@ void DebayerCpuThread::process(uint32_t frame, const uint8_t *src, uint8_t *dst)\n>>>            process2(frame, src, dst);\n>>>        else\n>>>            process4(frame, src, dst);\n>>> +\n>>> +    debayer_->workPendingMutex_.lock();\n>>> +    debayer_->workPending_ &= ~(1 << threadIndex_);\n>>> +    debayer_->workPendingMutex_.unlock();\n>>> +    debayer_->workPendingCv_.notify_one();\n>>>    }\n>>>      void DebayerCpuThread::process2(uint32_t frame, const uint8_t *src, uint8_t *dst)\n>>> @@ -953,7 +964,24 @@ void DebayerCpu::process(uint32_t frame, FrameBuffer *input, FrameBuffer *output\n>>>          stats_->startFrame(frame);\n>>>    -    threads_[0]->process(frame, in.planes()[0].data(), out.planes()[0].data());\n>>> +    workPendingMutex_.lock();\n>>\n>> Is the above locking 
needed?\n> \n> It is the correct thing to do and necessary to not get warnings because of\n> workPending_ being marked as LIBCAMERA_TSA_GUARDED_BY(workPendingMutex_)\n> \n\nThe way I see it, no thread may be executing `DebayerCpuThread::process` when this runs, so\nit should not be necessary. But I have indeed missed the TSA annotation.\n\n\n> \n>>\n>>\n>>> +    workPending_ = (1 << threads_.size()) - 1;\n>>\n>> Why not just have it be `threads_.size()`? And then subtract one in the worker when done?\n> \n> It feels more correct to me to have one pending bit per part of\n> the image being processed and clear those.\n\nInteresting, to me a simple counter looks best; but I suppose either works.\n\n\n> [...]","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id D9C42BE175\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 24 Feb 2026 12:51:52 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id BEB56622A1;\n\tTue, 24 Feb 2026 13:51:51 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 357B262080\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 24 Feb 2026 13:51:50 +0100 (CET)","from [192.168.33.90] (185.221.141.206.nat.pool.zt.hu\n\t[185.221.141.206])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 3EC11B3;\n\tTue, 24 Feb 2026 13:50:53 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"mMo1kZBg\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; 
d=ideasonboard.com;\n\ts=mail; t=1771937453;\n\tbh=JrEi9Onpj45FPXwSOPtmTTTnkcyu7B29Qhj+XQIsApg=;\n\th=Date:Subject:To:Cc:References:From:In-Reply-To:From;\n\tb=mMo1kZBgdU804wFNYREfLYPa0HZBDmFFwbibzmR97/KVhZg3Wc/fIxjlbSB6daRbn\n\tLvLwhJrKxs0CNFvvTzqlyLtFierphyKy80bHr8M+JwUF97pJPVU7QTgsOoHkl7pC1Y\n\tnSnH/9IKd8L5wa8oaukvfaI+CAt+6bsYDrOnXQOc=","Message-ID":"<2d323f42-b4b3-4998-b8ae-8c2e38f63705@ideasonboard.com>","Date":"Tue, 24 Feb 2026 13:51:46 +0100","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH v2 3/4] software_isp: debayer_cpu: Add multi-threading\n\tsupport","To":"Hans de Goede <johannes.goede@oss.qualcomm.com>,\n\tlibcamera-devel@lists.libcamera.org","Cc":"Milan Zamazal <mzamazal@redhat.com>","References":"<20260223160930.27913-1-johannes.goede@oss.qualcomm.com>\n\t<20260223160930.27913-4-johannes.goede@oss.qualcomm.com>\n\t<df124ae8-ec83-4d87-afc8-7c3b5c651e8f@ideasonboard.com>\n\t<6f5a393e-c6f5-4aa7-928f-4d1728adcb24@oss.qualcomm.com>","From":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <barnabas.pocze@ideasonboard.com>","Content-Language":"en-US, hu-HU","In-Reply-To":"<6f5a393e-c6f5-4aa7-928f-4d1728adcb24@oss.qualcomm.com>","Content-Type":"text/plain; charset=UTF-8; 
format=flowed","Content-Transfer-Encoding":"8bit","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}}]