[{"id":36788,"web_url":"https://patchwork.libcamera.org/comment/36788/","msgid":"<855xbe2lnq.fsf@mzamazal-thinkpadp1gen7.tpbc.csb>","date":"2025-11-13T11:54:01","subject":"Re: [RFC 1/1] libcamera: debayer_cpu: Mask out unused bits from >\n\t8bpp non packed src data","submitter":{"id":177,"url":"https://patchwork.libcamera.org/api/people/177/","name":"Milan Zamazal","email":"mzamazal@redhat.com"},"content":"Hi Hans,\n\nthank you for the fix.\n\nHans de Goede <johannes.goede@oss.qualcomm.com> writes:\n\n> Users have been reporting invalid array access assert errors in\n> statsBGGR10Line0 inside the SWSTATS_ACCUMULATE_LINE_STATS() macro caused\n> by out of bounds accesses to the yHistogram array.\n>\n> Another case of the same problem is out of bounds accesses to the various\n> lookup arrays used in the DebayerCpu code.\n>\n> Both of these are caused by 10 bpp sparse (stored in 16 bit words) input\n> frames containing pixels values > 1023 leading to out of bounds array\n> accesses. IOW bits 15-10 of the 16 bit word are not 0 as they should be.\n>\n> This can only happen if we somehow get a corrupt frame from the kernel.\n> The reported crashes show that this can actually happen so we will need\n> to harden the software ISP code to deal with this.\n>\n> The cheapest (in CPU time) way to fix this is to simply mask out the high\n> unused bits of the words for sparse input formats directly after memcpy-ing\n> a line to one of the lineBuffers.\n\nOK.  But how about GPU ISP?  We still run stats on a CPU.  And a\nmechanism other than memcpy-ing should be used with GPU ISP.\n\n> This also means that the software ISP must always use the memcpy path for\n> sparse 10/12 bpp input data (it should never write to the input buffer).\n> Add a new forceInputMemcpy_ for the setting coming from the config file and\n> set enableInputMemcpy_ to true if either forceInputMemcpy_ is set or\n> the input format is sparse 10/12 bpp.\n\nDo we really need a new variable for the purpose?  Why not to use\nenableInputMemcpy_ directly?\n\nI'm also thinking about honouring forceInputMemcpy_ unconditionally if\nset.  The masking would be disabled in case forceInputMemcpy_ is\nexplicitly set to false.  A warning in the documentation should avoid\ngetting much more bug reports about the crashes.\n\n> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2402746#c20\n> Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n> ---\n>  src/libcamera/software_isp/debayer_cpu.cpp | 41 ++++++++++++++++++++--\n>  src/libcamera/software_isp/debayer_cpu.h   |  6 ++++\n>  2 files changed, 45 insertions(+), 2 deletions(-)\n>\n> diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\n> index 00738c56..616e4f13 100644\n> --- a/src/libcamera/software_isp/debayer_cpu.cpp\n> +++ b/src/libcamera/software_isp/debayer_cpu.cpp\n> @@ -47,14 +47,14 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats, const GlobalConfigurat\n>  \t * Reading from uncached buffers may be very slow.\n>  \t * In such a case, it's better to copy input buffer data to normal memory.\n>  \t * But in case of cached buffers, copying the data is unnecessary overhead.\n> -\t * enable_input_memcpy_ makes this behavior configurable.  At the moment, we\n> +\t * forceInputMemcpy_ makes this behavior configurable.  At the moment, we\n>  \t * always set it to true as the safer choice but this should be changed in\n>  \t * future.\n>  \t *\n>  \t * \\todo Make memcpy automatic based on runtime detection of platform\n>  \t * capabilities.\n>  \t */\n> -\tenableInputMemcpy_ =\n> +\tforceInputMemcpy_ =\n>  \t\tconfiguration.option<bool>({ \"software_isp\", \"copy_input_buffer\" }).value_or(true);\n>  }\n>  \n> @@ -288,6 +288,30 @@ void DebayerCpu::debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t *src[])\n>  \t}\n>  }\n>  \n> +/*\n> + * Functions to mask out high unused bits from input buffers. These bits should\n> + * already by 0, but sometimes with corrupt input frames these are not 0 causing\n> + * out-of-bounds accesses to various lookup tables. These functions explicitly\n> + * set the unused high bits to 0 to avoid corrupt frames causing crashes.\n> + */\n> +void DebayerCpu::inputMask10(uint8_t *data, unsigned int len)\n> +{\n> +\t/* Everything is aligned to 2 16 bit pixels, mask 2 pixels at a time */\n> +\tuint32_t *input, \n\nDeclare in the `for' loop?\n\n> *end = (uint32_t *)(data + len);\n\nstatic_cast<uint32_t *>(data + len)\n\n(Similarly elsewhere.)\n\n> +\n> +\tfor (input = (uint32_t *)data; input < end; input++)\n> +\t\t*input &= 0x03ff03ff;\n> +}\n> +\n> +void DebayerCpu::inputMask12(uint8_t *data, unsigned int len)\n> +{\n> +\t/* Everything is aligned to 2 16 bit pixels, mask 2 pixels at a time */\n> +\tuint32_t *input, *end = (uint32_t *)(data + len);\n> +\n> +\tfor (input = (uint32_t *)data; input < end; input++)\n> +\t\t*input &= 0x0fff0fff;\n> +}\n\nThere is a measurable performance penalty with this, ~4 % in my\nenvironment.  Would perhaps changing the methods to perform the memcpy\ndirectly, rather than fixing the input as an additional step, help?\n\n> +\n>  /*\n>   * Setup the Debayer object according to the passed in parameters.\n>   * Return 0 on success, a negative errno value on failure\n> @@ -395,6 +419,8 @@ int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat,\n>  \n>  \txShift_ = 0;\n>  \tswapRedBlueGains_ = false;\n> +\tinputMask_ = NULL;\n\nnullptr\n\n> +\tenableInputMemcpy_ = forceInputMemcpy_;\n>  \n>  \tauto invalidFmt = []() -> int {\n>  \t\tLOG(Debayer, Error) << \"Unsupported input output format combination\";\n> @@ -446,12 +472,17 @@ int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat,\n>  \t\t\tbreak;\n>  \t\tcase 10:\n>  \t\t\tSET_DEBAYER_METHODS(debayer10_BGBG_BGR888, debayer10_GRGR_BGR888)\n> +\t\t\tinputMask_ = &DebayerCpu::inputMask10;\n>  \t\t\tbreak;\n>  \t\tcase 12:\n>  \t\t\tSET_DEBAYER_METHODS(debayer12_BGBG_BGR888, debayer12_GRGR_BGR888)\n> +\t\t\tinputMask_ = &DebayerCpu::inputMask12;\n>  \t\t\tbreak;\n>  \t\t}\n>  \t\tsetupStandardBayerOrder(bayerFormat.order);\n> +\t\tif (inputMask_)\n> +\t\t\tenableInputMemcpy_ = true;\n> +\n>  \t\treturn 0;\n>  \t}\n>  \n> @@ -601,6 +632,8 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])\n>  \t\tmemcpy(lineBuffers_[i].data(),\n>  \t\t       linePointers[i + 1] - lineBufferPadding_,\n>  \t\t       lineBufferLength_);\n> +\t\tif (inputMask_)\n> +\t\t\t(this->*inputMask_)(lineBuffers_[i].data(), lineBufferLength_);\n>  \t\tlinePointers[i + 1] = lineBuffers_[i].data() + lineBufferPadding_;\n>  \t}\n>  \n> @@ -629,6 +662,10 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])\n>  \tmemcpy(lineBuffers_[lineBufferIndex_].data(),\n>  \t       linePointers[patternHeight] - lineBufferPadding_,\n>  \t       lineBufferLength_);\n> +\tif (inputMask_)\n> +\t\t(this->*inputMask_)(lineBuffers_[lineBufferIndex_].data(),\n> +\t\t\t\t    lineBufferLength_);\n> +\n>  \tlinePointers[patternHeight] = lineBuffers_[lineBufferIndex_].data() + lineBufferPadding_;\n>  \n>  \tlineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);\n> diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h\n> index 3bf34ac3..a5feefa2 100644\n> --- a/src/libcamera/software_isp/debayer_cpu.h\n> +++ b/src/libcamera/software_isp/debayer_cpu.h\n> @@ -79,6 +79,8 @@ private:\n>  \t */\n>  \tusing debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]);\n>  \n> +\tusing inputMaskFn = void (DebayerCpu::*)(uint8_t *data, unsigned int len);\n> +\n>  \t/* 8-bit raw bayer format */\n>  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n>  \tvoid debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> @@ -89,11 +91,13 @@ private:\n>  \tvoid debayer10_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n>  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n>  \tvoid debayer10_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]);\n> +\tvoid inputMask10(uint8_t *data, unsigned int len);\n>  \t/* unpacked 12-bit raw bayer format */\n>  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n>  \tvoid debayer12_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n>  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n>  \tvoid debayer12_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]);\n> +\tvoid inputMask12(uint8_t *data, unsigned int len);\n>  \t/* CSI-2 packed 10-bit raw bayer format (all the 4 orders) */\n>  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n>  \tvoid debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> @@ -123,6 +127,7 @@ private:\n>  \tdebayerFn debayer1_;\n>  \tdebayerFn debayer2_;\n>  \tdebayerFn debayer3_;\n> +\tinputMaskFn inputMask_;\n>  \tRectangle window_;\n>  \tstd::unique_ptr<SwStatsCpu> stats_;\n>  \tstd::vector<uint8_t> lineBuffers_[kMaxLineBuffers];\n> @@ -130,6 +135,7 @@ private:\n>  \tunsigned int lineBufferPadding_;\n>  \tunsigned int lineBufferIndex_;\n>  \tunsigned int xShift_; /* Offset of 0/1 applied to window_.x */\n> +\tbool forceInputMemcpy_;\n>  \tbool enableInputMemcpy_;\n>  };","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id EFE20C3241\n\tfor <parsemail@patchwork.libcamera.org>;\n\tThu, 13 Nov 2025 11:54:09 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 2983960805;\n\tThu, 13 Nov 2025 12:54:09 +0100 (CET)","from us-smtp-delivery-124.mimecast.com\n\t(us-smtp-delivery-124.mimecast.com [170.10.133.124])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id B092260805\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 13 Nov 2025 12:54:07 +0100 (CET)","from mail-wm1-f71.google.com (mail-wm1-f71.google.com\n\t[209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS\n\t(version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id\n\tus-mta-74-acNJhpWeODWDlpP-jIOtCw-1; Thu, 13 Nov 2025 06:54:05 -0500","by mail-wm1-f71.google.com with SMTP id\n\t5b1f17b1804b1-47778daa4d2so6499495e9.2\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 13 Nov 2025 03:54:05 -0800 (PST)","from mzamazal-thinkpadp1gen7.tpbc.csb\n\t(ip-77-48-47-2.net.vodafone.cz. [77.48.47.2])\n\tby smtp.gmail.com with ESMTPSA id\n\t5b1f17b1804b1-47787e8e6a1sm84348725e9.11.2025.11.13.03.54.01\n\t(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);\n\tThu, 13 Nov 2025 03:54:02 -0800 (PST)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=redhat.com header.i=@redhat.com\n\theader.b=\"D44ApDKX\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n\ts=mimecast20190719; t=1763034846;\n\th=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n\tto:to:cc:cc:mime-version:mime-version:content-type:content-type:\n\tin-reply-to:in-reply-to:references:references;\n\tbh=eQuaG+PLKd3qw+/4okKbyK+pLSx3Q2NzW5lY0T0ygqA=;\n\tb=D44ApDKXmm0lli4VyP32jL4gfVC9wBH4vMy9abzBM1e+0SQYPh77fNky2HBVA5gigdjOU9\n\tUWNs5N7qbsp2Ad95eWDfyI7pMLHUPCNbqd+KLsTww1qqW2lvP1oLSp3AeYqShw+cIzgIZY\n\t6Xr6/2bZcHgSEcJjo5M2Nrl4EaMYjus=","X-MC-Unique":"acNJhpWeODWDlpP-jIOtCw-1","X-Mimecast-MFC-AGG-ID":"acNJhpWeODWDlpP-jIOtCw_1763034844","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20230601; t=1763034844; x=1763639644;\n\th=mime-version:user-agent:message-id:date:references:in-reply-to\n\t:subject:cc:to:from:x-gm-gg:x-gm-message-state:from:to:cc:subject\n\t:date:message-id:reply-to;\n\tbh=eQuaG+PLKd3qw+/4okKbyK+pLSx3Q2NzW5lY0T0ygqA=;\n\tb=cpxh62RmaMVMHDlfmpVh5jZfEtxd5xzY9nOhIwcKcR5234xKzn17+P7WyEkl39Nd2q\n\tn878zkbzY2zY8qU2+EDcGHGlF6doOw54F8QGat/LwlXOJBCXakEJNwiJFDXupLQcYxXA\n\tzVrW50JtA1qD57arfjmGr2rP2GZKHEnfG3ED83crA2dqBBWDNdFa3IRM4zE9bkePJBh3\n\tjSbwd0lQ9xpLuxZYxTfFtltZtQEtVuLxUd6PqZt/0nfIWz8q49bFeRtwt167xXiGmNy4\n\t6RzWYDZaf5XpWe1zkFpJhY+XNVwjG/3s96SZ4Vo1ClHEFx7CYoRt1jDw3nujP4VS5v32\n\t2NyA==","X-Gm-Message-State":"AOJu0Yy0Jr5HOSXJzXecVfGYeJg+5mrAcRXcoeM92Pi1zOUqORLFA0G0\n\tMkymjZuZtYSm7Tl/DQEulDJiMmlfixv9p4uSlKLQDUBhaHNMOOTViLhoQuSebLUTzmQ6KvCyzaE\n\tRphYDRPwaavjmVqYy8SFQ104UJrfgjAbj6MOIrMpknZvLDasnbfN82d5LWflkouzAirlBTluTKk\n\tb8UASXxYkuoLw44LFyOMMwDtcp5bkBSUSiA49hby0hM72OV0/lrnUosqwcAd4=","X-Gm-Gg":"ASbGncscwM9oy5XukEdmqvKJAxYJtZI+GbXBX1l2K2zHn4LQUdc5D2BB/ki7Rq2/u19\n\t2k4Nzqcsow65jYVujEIRjQkfE866T7EHehxR9D31a+GhU2APaelNEDAcpUt0r00cUltEIFqJIzz\n\tDleBtvY6WFMCB7D7IM/UzFuSmMeBgahUOW+VxkqGe8kmQ0rZq4eDiwZDfV7e5Js9xkbuHUIaLW9\n\tQwkvGzibdS+PxkGmk0UN4uMYFuBzBs/LOVOLyo7xbwNMI8KCnc44ROeQlfZf3d2fLKHBOrf3HZN\n\tJXaRrrXjlMuFvTBKnaI+xhO2SEjitvlGYnIaiKPqZrasOA1nRE9NEeNTP8ZjjS+dogK+UiWQyhe\n\t6Ia38AIjIILy2a2S6KtR8ayw/zYZJ+aWV7ok5Ye91OzpTH3A2Sb35","X-Received":["by 2002:a05:600c:138b:b0:46d:ba6d:65bb with SMTP id\n\t5b1f17b1804b1-477870c31c4mr61120115e9.31.1763034843560; \n\tThu, 13 Nov 2025 03:54:03 -0800 (PST)","by 2002:a05:600c:138b:b0:46d:ba6d:65bb with SMTP id\n\t5b1f17b1804b1-477870c31c4mr61119755e9.31.1763034842943; \n\tThu, 13 Nov 2025 03:54:02 -0800 (PST)"],"X-Google-Smtp-Source":"AGHT+IEGZcKcf7o9LHHbKBxp7LNL4VlJJoFWZMuHYYsF/VOKHSS1QVqX0tXFya2oBHkSI0vTQnhojw==","From":"Milan Zamazal <mzamazal@redhat.com>","To":"Hans de Goede <johannes.goede@oss.qualcomm.com>","Cc":"libcamera-devel@lists.libcamera.org","Subject":"Re: [RFC 1/1] libcamera: debayer_cpu: Mask out unused bits from >\n\t8bpp non packed src data","In-Reply-To":"<20251112090924.46295-2-johannes.goede@oss.qualcomm.com> (Hans\n\tde Goede's message of \"Wed, 12 Nov 2025 10:09:24 +0100\")","References":"<20251112090924.46295-1-johannes.goede@oss.qualcomm.com>\n\t<20251112090924.46295-2-johannes.goede@oss.qualcomm.com>","Date":"Thu, 13 Nov 2025 12:54:01 +0100","Message-ID":"<855xbe2lnq.fsf@mzamazal-thinkpadp1gen7.tpbc.csb>","User-Agent":"Gnus/5.13 (Gnus v5.13)","MIME-Version":"1.0","X-Mimecast-Spam-Score":"0","X-Mimecast-MFC-PROC-ID":"efN-Px7xWgp9WMPD13mjLcEXI5diOMyBz9BD0K7g0c0_1763034844","X-Mimecast-Originator":"redhat.com","Content-Type":"text/plain","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":36794,"web_url":"https://patchwork.libcamera.org/comment/36794/","msgid":"<20251113131244.GD30434@pendragon.ideasonboard.com>","date":"2025-11-13T13:12:44","subject":"Re: [RFC 1/1] libcamera: debayer_cpu: Mask out unused bits from >\n\t8bpp non packed src data","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"On Thu, Nov 13, 2025 at 12:54:01PM +0100, Milan Zamazal wrote:\n> Hi Hans,\n> \n> thank you for the fix.\n> \n> Hans de Goede <johannes.goede@oss.qualcomm.com> writes:\n> \n> > Users have been reporting invalid array access assert errors in\n> > statsBGGR10Line0 inside the SWSTATS_ACCUMULATE_LINE_STATS() macro caused\n> > by out of bounds accesses to the yHistogram array.\n> >\n> > Another case of the same problem is out of bounds accesses to the various\n> > lookup arrays used in the DebayerCpu code.\n> >\n> > Both of these are caused by 10 bpp sparse (stored in 16 bit words) input\n> > frames containing pixels values > 1023 leading to out of bounds array\n> > accesses. IOW bits 15-10 of the 16 bit word are not 0 as they should be.\n> >\n> > This can only happen if we somehow get a corrupt frame from the kernel.\n> > The reported crashes show that this can actually happen so we will need\n> > to harden the software ISP code to deal with this.\n> >\n> > The cheapest (in CPU time) way to fix this is to simply mask out the high\n> > unused bits of the words for sparse input formats directly after memcpy-ing\n> > a line to one of the lineBuffers.\n> \n> OK.  But how about GPU ISP?  We still run stats on a CPU.  And a\n> mechanism other than memcpy-ing should be used with GPU ISP.\n> \n> > This also means that the software ISP must always use the memcpy path for\n> > sparse 10/12 bpp input data (it should never write to the input buffer).\n> > Add a new forceInputMemcpy_ for the setting coming from the config file and\n> > set enableInputMemcpy_ to true if either forceInputMemcpy_ is set or\n> > the input format is sparse 10/12 bpp.\n> \n> Do we really need a new variable for the purpose?  Why not to use\n> enableInputMemcpy_ directly?\n> \n> I'm also thinking about honouring forceInputMemcpy_ unconditionally if\n> set.  The masking would be disabled in case forceInputMemcpy_ is\n> explicitly set to false.  A warning in the documentation should avoid\n> getting much more bug reports about the crashes.\n> \n> > Link: https://bugzilla.redhat.com/show_bug.cgi?id=2402746#c20\n> > Signed-off-by: Hans de Goede <johannes.goede@oss.qualcomm.com>\n> > ---\n> >  src/libcamera/software_isp/debayer_cpu.cpp | 41 ++++++++++++++++++++--\n> >  src/libcamera/software_isp/debayer_cpu.h   |  6 ++++\n> >  2 files changed, 45 insertions(+), 2 deletions(-)\n> >\n> > diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp\n> > index 00738c56..616e4f13 100644\n> > --- a/src/libcamera/software_isp/debayer_cpu.cpp\n> > +++ b/src/libcamera/software_isp/debayer_cpu.cpp\n> > @@ -47,14 +47,14 @@ DebayerCpu::DebayerCpu(std::unique_ptr<SwStatsCpu> stats, const GlobalConfigurat\n> >  \t * Reading from uncached buffers may be very slow.\n> >  \t * In such a case, it's better to copy input buffer data to normal memory.\n> >  \t * But in case of cached buffers, copying the data is unnecessary overhead.\n> > -\t * enable_input_memcpy_ makes this behavior configurable.  At the moment, we\n> > +\t * forceInputMemcpy_ makes this behavior configurable.  At the moment, we\n> >  \t * always set it to true as the safer choice but this should be changed in\n> >  \t * future.\n> >  \t *\n> >  \t * \\todo Make memcpy automatic based on runtime detection of platform\n> >  \t * capabilities.\n> >  \t */\n> > -\tenableInputMemcpy_ =\n> > +\tforceInputMemcpy_ =\n> >  \t\tconfiguration.option<bool>({ \"software_isp\", \"copy_input_buffer\" }).value_or(true);\n> >  }\n> >  \n> > @@ -288,6 +288,30 @@ void DebayerCpu::debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t *src[])\n> >  \t}\n> >  }\n> >  \n> > +/*\n> > + * Functions to mask out high unused bits from input buffers. These bits should\n> > + * already by 0, but sometimes with corrupt input frames these are not 0 causing\n> > + * out-of-bounds accesses to various lookup tables. These functions explicitly\n> > + * set the unused high bits to 0 to avoid corrupt frames causing crashes.\n> > + */\n> > +void DebayerCpu::inputMask10(uint8_t *data, unsigned int len)\n> > +{\n> > +\t/* Everything is aligned to 2 16 bit pixels, mask 2 pixels at a time */\n> > +\tuint32_t *input, \n> \n> Declare in the `for' loop?\n> \n> > *end = (uint32_t *)(data + len);\n> \n> static_cast<uint32_t *>(data + len)\n> \n> (Similarly elsewhere.)\n> \n> > +\n> > +\tfor (input = (uint32_t *)data; input < end; input++)\n> > +\t\t*input &= 0x03ff03ff;\n> > +}\n> > +\n> > +void DebayerCpu::inputMask12(uint8_t *data, unsigned int len)\n> > +{\n> > +\t/* Everything is aligned to 2 16 bit pixels, mask 2 pixels at a time */\n> > +\tuint32_t *input, *end = (uint32_t *)(data + len);\n> > +\n> > +\tfor (input = (uint32_t *)data; input < end; input++)\n> > +\t\t*input &= 0x0fff0fff;\n> > +}\n> \n> There is a measurable performance penalty with this, ~4 % in my\n> environment.  Would perhaps changing the methods to perform the memcpy\n> directly, rather than fixing the input as an additional step, help?\n\nI ws about to suggest that, or even masking when reading the pixels to\nsupport cases where memcpy is disabled.\n\n> > +\n> >  /*\n> >   * Setup the Debayer object according to the passed in parameters.\n> >   * Return 0 on success, a negative errno value on failure\n> > @@ -395,6 +419,8 @@ int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat,\n> >  \n> >  \txShift_ = 0;\n> >  \tswapRedBlueGains_ = false;\n> > +\tinputMask_ = NULL;\n> \n> nullptr\n> \n> > +\tenableInputMemcpy_ = forceInputMemcpy_;\n> >  \n> >  \tauto invalidFmt = []() -> int {\n> >  \t\tLOG(Debayer, Error) << \"Unsupported input output format combination\";\n> > @@ -446,12 +472,17 @@ int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat,\n> >  \t\t\tbreak;\n> >  \t\tcase 10:\n> >  \t\t\tSET_DEBAYER_METHODS(debayer10_BGBG_BGR888, debayer10_GRGR_BGR888)\n> > +\t\t\tinputMask_ = &DebayerCpu::inputMask10;\n> >  \t\t\tbreak;\n> >  \t\tcase 12:\n> >  \t\t\tSET_DEBAYER_METHODS(debayer12_BGBG_BGR888, debayer12_GRGR_BGR888)\n> > +\t\t\tinputMask_ = &DebayerCpu::inputMask12;\n> >  \t\t\tbreak;\n> >  \t\t}\n> >  \t\tsetupStandardBayerOrder(bayerFormat.order);\n> > +\t\tif (inputMask_)\n> > +\t\t\tenableInputMemcpy_ = true;\n> > +\n> >  \t\treturn 0;\n> >  \t}\n> >  \n> > @@ -601,6 +632,8 @@ void DebayerCpu::setupInputMemcpy(const uint8_t *linePointers[])\n> >  \t\tmemcpy(lineBuffers_[i].data(),\n> >  \t\t       linePointers[i + 1] - lineBufferPadding_,\n> >  \t\t       lineBufferLength_);\n> > +\t\tif (inputMask_)\n> > +\t\t\t(this->*inputMask_)(lineBuffers_[i].data(), lineBufferLength_);\n> >  \t\tlinePointers[i + 1] = lineBuffers_[i].data() + lineBufferPadding_;\n> >  \t}\n> >  \n> > @@ -629,6 +662,10 @@ void DebayerCpu::memcpyNextLine(const uint8_t *linePointers[])\n> >  \tmemcpy(lineBuffers_[lineBufferIndex_].data(),\n> >  \t       linePointers[patternHeight] - lineBufferPadding_,\n> >  \t       lineBufferLength_);\n> > +\tif (inputMask_)\n> > +\t\t(this->*inputMask_)(lineBuffers_[lineBufferIndex_].data(),\n> > +\t\t\t\t    lineBufferLength_);\n> > +\n> >  \tlinePointers[patternHeight] = lineBuffers_[lineBufferIndex_].data() + lineBufferPadding_;\n> >  \n> >  \tlineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1);\n> > diff --git a/src/libcamera/software_isp/debayer_cpu.h b/src/libcamera/software_isp/debayer_cpu.h\n> > index 3bf34ac3..a5feefa2 100644\n> > --- a/src/libcamera/software_isp/debayer_cpu.h\n> > +++ b/src/libcamera/software_isp/debayer_cpu.h\n> > @@ -79,6 +79,8 @@ private:\n> >  \t */\n> >  \tusing debayerFn = void (DebayerCpu::*)(uint8_t *dst, const uint8_t *src[]);\n> >  \n> > +\tusing inputMaskFn = void (DebayerCpu::*)(uint8_t *data, unsigned int len);\n> > +\n> >  \t/* 8-bit raw bayer format */\n> >  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n> >  \tvoid debayer8_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> > @@ -89,11 +91,13 @@ private:\n> >  \tvoid debayer10_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> >  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n> >  \tvoid debayer10_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]);\n> > +\tvoid inputMask10(uint8_t *data, unsigned int len);\n> >  \t/* unpacked 12-bit raw bayer format */\n> >  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n> >  \tvoid debayer12_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> >  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n> >  \tvoid debayer12_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]);\n> > +\tvoid inputMask12(uint8_t *data, unsigned int len);\n> >  \t/* CSI-2 packed 10-bit raw bayer format (all the 4 orders) */\n> >  \ttemplate<bool addAlphaByte, bool ccmEnabled>\n> >  \tvoid debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]);\n> > @@ -123,6 +127,7 @@ private:\n> >  \tdebayerFn debayer1_;\n> >  \tdebayerFn debayer2_;\n> >  \tdebayerFn debayer3_;\n> > +\tinputMaskFn inputMask_;\n> >  \tRectangle window_;\n> >  \tstd::unique_ptr<SwStatsCpu> stats_;\n> >  \tstd::vector<uint8_t> lineBuffers_[kMaxLineBuffers];\n> > @@ -130,6 +135,7 @@ private:\n> >  \tunsigned int lineBufferPadding_;\n> >  \tunsigned int lineBufferIndex_;\n> >  \tunsigned int xShift_; /* Offset of 0/1 applied to window_.x */\n> > +\tbool forceInputMemcpy_;\n> >  \tbool enableInputMemcpy_;\n> >  };\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 54900C3263\n\tfor <parsemail@patchwork.libcamera.org>;\n\tThu, 13 Nov 2025 13:12:59 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 4DA3360A8B;\n\tThu, 13 Nov 2025 14:12:58 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id CBC2E60805\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 13 Nov 2025 14:12:56 +0100 (CET)","from pendragon.ideasonboard.com (unknown [213.216.211.176])\n\tby perceval.ideasonboard.com (Postfix) with UTF8SMTPSA id 6B12016A;\n\tThu, 13 Nov 2025 14:10:56 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"RzOsb+zp\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1763039456;\n\tbh=RA2WwThW4pIoMHmaUwM8Sov7lHqRDzCAejC5iYCCWx8=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=RzOsb+zp+ZPxljLm8S4qvvBgFPDoLe5MvfDbx1jPQU4NkshYOEWDH6TWR8UXX8Ae6\n\tRb4NMc23Xh7NsPGnWlnqYf5mFlqlvJgP9OlmA8K5JEA6JNPHm/dfeyFFsaXbMLiqbZ\n\tfLmmRLlLz5lYHRanPmejXm11P7dmhbULVuydJq0g=","Date":"Thu, 13 Nov 2025 15:12:44 +0200","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"Milan Zamazal <mzamazal@redhat.com>","Cc":"Hans de Goede <johannes.goede@oss.qualcomm.com>,\n\tlibcamera-devel@lists.libcamera.org","Subject":"Re: [RFC 1/1] libcamera: debayer_cpu: Mask out unused bits from >\n\t8bpp non packed src data","Message-ID":"<20251113131244.GD30434@pendragon.ideasonboard.com>","References":"<20251112090924.46295-1-johannes.goede@oss.qualcomm.com>\n\t<20251112090924.46295-2-johannes.goede@oss.qualcomm.com>\n\t<855xbe2lnq.fsf@mzamazal-thinkpadp1gen7.tpbc.csb>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","In-Reply-To":"<855xbe2lnq.fsf@mzamazal-thinkpadp1gen7.tpbc.csb>","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}}]