[{"id":32825,"web_url":"https://patchwork.libcamera.org/comment/32825/","msgid":"<173445238001.32744.4669942287932373725@ping.linuxembedded.co.uk>","date":"2024-12-17T16:19:40","subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","submitter":{"id":4,"url":"https://patchwork.libcamera.org/api/people/4/","name":"Kieran Bingham","email":"kieran.bingham@ideasonboard.com"},"content":"Quoting Barnabás Pőcze (2024-12-11 15:25:56)\n> After the initial generation, when a frame is requested,\n> the test pattern generator rotates the image left by 1 column.\n> \n> The current approach has two shortcomings:\n>   (1) it allocates a temporary buffer to hold one column;\n>   (2) it swaps two columns at a time.\n> \n> The test patterns are simple ARGB images, in row-major order,\n> so doing (2) works against memory prefetching. This can be\n> addressed by doing the rotation one row at a time as that way\n> the image is addressed in a purely linear fashion. Doing so also\n> eliminates the need for a dynamically allocated temporary buffer,\n> as the required buffer now only needs to hold one sample,\n> which is 4 bytes in this case.\n> \n> In an optimized build, this results in about a 2x increase in the\n> number of frames per second as reported by `cam`. In an unoptimized,\n> ASAN and UBSAN intrumented build, the difference is even bigger,\n> which is useful for running lc-compliance in CI in a reasonable time.\n> \n> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>\n\nHrm, rabbit holes a plenty trying to test this.\n\nFirst qcam isn't working on my main development PC as Qt6 does not\nsupply a pkg-config file in the version available on Ubuntu 22.04 LTS.\n\nNext up, I'm running cam with the SDL output which works:\n\n./build/gcc/src/apps/cam/cam -c2 -S -C\n\n136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n\n\nbut the virtual pipeline handler is clearly not handling sequence\nnumbers, timetamps, bytesused - or presumably any of the required\nmetadata correctly.\n\nBut none of that is a fault of your patch which is working so:\n\n\nTested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>\nReviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>\n\n> ---\n>  .../virtual/test_pattern_generator.cpp        | 68 +++++++++----------\n>  .../pipeline/virtual/test_pattern_generator.h |  4 --\n>  2 files changed, 34 insertions(+), 38 deletions(-)\n> \n> diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> index 47d341919..eeadc1646 100644\n> --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> @@ -5,6 +5,8 @@\n>   * Derived class of FrameGenerator for generating test patterns\n>   */\n>  \n> +#include <string.h>\n> +\n>  #include \"test_pattern_generator.h\"\n>  \n>  #include <libcamera/base/log.h>\n> @@ -13,6 +15,37 @@\n>  \n>  #include <libyuv/convert_from_argb.h>\n>  \n> +namespace {\n> +\n> +template<size_t SampleSize>\n> +void rotateLeft1Sample(uint8_t *samples, size_t width)\n> +{\n> +       if (width <= 0)\n> +               return;\n> +\n> +       const size_t stride = width * SampleSize;\n> +       uint8_t first[SampleSize];\n> +\n> +       memcpy(first, &samples[0], SampleSize);\n> +       for (size_t i = 0; i < stride - SampleSize; i += SampleSize)\n> +               memcpy(&samples[i], &samples[i + SampleSize], SampleSize);\n> +       memcpy(&samples[stride - SampleSize], first, SampleSize);\n> +}\n> +\n> +template<size_t SampleSize>\n> +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image)\n> +{\n> +       if (size.width < 2)\n> +               return;\n> +\n> +       const size_t stride = size.width * SampleSize;\n> +\n> +       for (size_t i = 0; i < size.height; i++, image += stride)\n> +               rotateLeft1Sample<SampleSize>(image, size.width);\n> +}\n> +\n> +}\n> +\n>  namespace libcamera {\n>  \n>  LOG_DECLARE_CATEGORY(Virtual)\n> @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size,\n>  \n>         const auto &planes = mappedFrameBuffer.planes();\n>  \n> -       shiftLeft(size);\n> +       rotateLeft1Column<kARGBSize>(size, template_.get());\n>  \n>         /* Convert the template_ to the frame buffer */\n>         int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize,\n> @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size,\n>         return ret;\n>  }\n>  \n> -void TestPatternGenerator::shiftLeft(const Size &size)\n> -{\n> -       /* Store the first column temporarily */\n> -       auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize);\n> -       for (size_t h = 0; h < size.height; h++) {\n> -               unsigned int index = h * size.width * kARGBSize;\n> -               unsigned int index1 = h * kARGBSize;\n> -               firstColumn[index1] = template_[index];\n> -               firstColumn[index1 + 1] = template_[index + 1];\n> -               firstColumn[index1 + 2] = template_[index + 2];\n> -               firstColumn[index1 + 3] = 0x00;\n> -       }\n> -\n> -       /* Overwrite template_ */\n> -       uint8_t *buf = template_.get();\n> -       for (size_t h = 0; h < size.height; h++) {\n> -               for (size_t w = 0; w < size.width - 1; w++) {\n> -                       /* Overwrite with the pixel on the right */\n> -                       unsigned int index = (h * size.width + w + 1) * kARGBSize;\n> -                       *buf++ = template_[index]; /* B */\n> -                       *buf++ = template_[index + 1]; /* G */\n> -                       *buf++ = template_[index + 2]; /* R */\n> -                       *buf++ = 0x00; /* A */\n> -               }\n> -               /* Overwrite the new last column with the original first column */\n> -               unsigned int index1 = h * kARGBSize;\n> -               *buf++ = firstColumn[index1]; /* B */\n> -               *buf++ = firstColumn[index1 + 1]; /* G */\n> -               *buf++ = firstColumn[index1 + 2]; /* R */\n> -               *buf++ = 0x00; /* A */\n> -       }\n> -}\n> -\n>  void ColorBarsGenerator::configure(const Size &size)\n>  {\n>         constexpr uint8_t kColorBar[8][3] = {\n> diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> index 05f4ab7a7..2a51bd31a 100644\n> --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> @@ -29,10 +29,6 @@ public:\n>  protected:\n>         /* Buffer of test pattern template */\n>         std::unique_ptr<uint8_t[]> template_;\n> -\n> -private:\n> -       /* Shift the buffer by 1 pixel left each frame */\n> -       void shiftLeft(const Size &size);\n>  };\n>  \n>  class ColorBarsGenerator : public TestPatternGenerator\n> -- \n> 2.47.1\n> \n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 69A0CBD1F1\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Dec 2024 16:19:45 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id A49CB67FE0;\n\tTue, 17 Dec 2024 17:19:44 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 5B5DC67FD3\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Dec 2024 17:19:43 +0100 (CET)","from pendragon.ideasonboard.com\n\t(cpc89244-aztw30-2-0-cust6594.18-1.cable.virginm.net [86.31.185.195])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id F41604C7;\n\tTue, 17 Dec 2024 17:19:05 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"GR2bg5YK\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1734452346;\n\tbh=9+4QBxC/0YFtNcwQkdlzameUrwzSsyuP3pQulshDMyk=;\n\th=In-Reply-To:References:Subject:From:To:Date:From;\n\tb=GR2bg5YKnlb2WuVid6Vz1SILyO4J/aoKDGCKY3yc0M56EGi5VSM3s6gifGZ66t0dM\n\t4fURtpZYlCKhk+i84sGX5eYc3eoUpZYvt6COZ26KmP1GS099unZtoLY/D67/YjoEKX\n\tYnONgU6tHTuO7oDwXzqqbgcAeN0+U9eN4KgRYzLg=","Content-Type":"text/plain; charset=\"utf-8\"","MIME-Version":"1.0","Content-Transfer-Encoding":"quoted-printable","In-Reply-To":"<20241211152542.1095857-3-pobrn@protonmail.com>","References":"<20241211152542.1095857-1-pobrn@protonmail.com>\n\t<20241211152542.1095857-3-pobrn@protonmail.com>","Subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","From":"Kieran Bingham <kieran.bingham@ideasonboard.com>","To":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <pobrn@protonmail.com>,\n\tlibcamera-devel@lists.libcamera.org","Date":"Tue, 17 Dec 2024 16:19:40 +0000","Message-ID":"<173445238001.32744.4669942287932373725@ping.linuxembedded.co.uk>","User-Agent":"alot/0.10","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":32827,"web_url":"https://patchwork.libcamera.org/comment/32827/","msgid":"<NINlug_ACinWYzt4ZhdSBWiw_dxXOQUrZYnJBid5nf7l6UsA7R92yNVuqqAuDZjaJwTF9qhC8gyUzMr1g7FRKvXpTb_BMhKJGl4sRxgT8gQ=@protonmail.com>","date":"2024-12-17T16:22:42","subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","submitter":{"id":133,"url":"https://patchwork.libcamera.org/api/people/133/","name":"Pőcze Barnabás","email":"pobrn@protonmail.com"},"content":"Regards,\nBarnabás Pőcze\n\n\n2024. december 17., kedd 17:19 keltezéssel, Kieran Bingham <kieran.bingham@ideasonboard.com> írta:\n\n> Quoting Barnabás Pőcze (2024-12-11 15:25:56)\n> \n> > After the initial generation, when a frame is requested,\n> > the test pattern generator rotates the image left by 1 column.\n> > \n> > The current approach has two shortcomings:\n> > (1) it allocates a temporary buffer to hold one column;\n> > (2) it swaps two columns at a time.\n> > \n> > The test patterns are simple ARGB images, in row-major order,\n> > so doing (2) works against memory prefetching. This can be\n> > addressed by doing the rotation one row at a time as that way\n> > the image is addressed in a purely linear fashion. Doing so also\n> > eliminates the need for a dynamically allocated temporary buffer,\n> > as the required buffer now only needs to hold one sample,\n> > which is 4 bytes in this case.\n> > \n> > In an optimized build, this results in about a 2x increase in the\n> > number of frames per second as reported by `cam`. In an unoptimized,\n> > ASAN and UBSAN intrumented build, the difference is even bigger,\n> > which is useful for running lc-compliance in CI in a reasonable time.\n> > \n> > Signed-off-by: Barnabás Pőcze pobrn@protonmail.com\n> \n> \n> Hrm, rabbit holes a plenty trying to test this.\n> \n> First qcam isn't working on my main development PC as Qt6 does not\n> supply a pkg-config file in the version available on Ubuntu 22.04 LTS.\n> \n> Next up, I'm running cam with the SDL output which works:\n> \n> ./build/gcc/src/apps/cam/cam -c2 -S -C\n> \n> 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> \n> \n> but the virtual pipeline handler is clearly not handling sequence\n> numbers, timetamps, bytesused - or presumably any of the required\n> metadata correctly.\n> \n> But none of that is a fault of your patch which is working so:\n\nI forgot to mention that I used this to test:\nhttps://gitlab.freedesktop.org/pobrn/libcamera/-/commit/cd537fa855bb68c1faeab8fb5b94c4d423899b04\n\nThere is also a corresponding bug:\nhttps://bugs.libcamera.org/show_bug.cgi?id=245\n\n\nRegards,\nBarnabás Pőcze\n\n> \n> \n> Tested-by: Kieran Bingham kieran.bingham@ideasonboard.com\n> \n> Reviewed-by: Kieran Bingham kieran.bingham@ideasonboard.com\n> \n> > ---\n> > .../virtual/test_pattern_generator.cpp | 68 +++++++++----------\n> > .../pipeline/virtual/test_pattern_generator.h | 4 --\n> > 2 files changed, 34 insertions(+), 38 deletions(-)\n> > \n> > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > index 47d341919..eeadc1646 100644\n> > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > @@ -5,6 +5,8 @@\n> > * Derived class of FrameGenerator for generating test patterns\n> > */\n> > \n> > +#include <string.h>\n> > +\n> > #include \"test_pattern_generator.h\"\n> > \n> > #include <libcamera/base/log.h>\n> > @@ -13,6 +15,37 @@\n> > \n> > #include <libyuv/convert_from_argb.h>\n> > \n> > +namespace {\n> > +\n> > +template<size_t SampleSize>\n> > +void rotateLeft1Sample(uint8_t *samples, size_t width)\n> > +{\n> > + if (width <= 0)\n> > + return;\n> > +\n> > + const size_t stride = width * SampleSize;\n> > + uint8_t first[SampleSize];\n> > +\n> > + memcpy(first, &samples[0], SampleSize);\n> > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize)\n> > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize);\n> > + memcpy(&samples[stride - SampleSize], first, SampleSize);\n> > +}\n> > +\n> > +template<size_t SampleSize>\n> > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image)\n> > +{\n> > + if (size.width < 2)\n> > + return;\n> > +\n> > + const size_t stride = size.width * SampleSize;\n> > +\n> > + for (size_t i = 0; i < size.height; i++, image += stride)\n> > + rotateLeft1Sample<SampleSize>(image, size.width);\n> > +}\n> > +\n> > +}\n> > +\n> > namespace libcamera {\n> > \n> > LOG_DECLARE_CATEGORY(Virtual)\n> > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> > \n> > const auto &planes = mappedFrameBuffer.planes();\n> > \n> > - shiftLeft(size);\n> > + rotateLeft1Column<kARGBSize>(size, template_.get());\n> > \n> > /* Convert the template_ to the frame buffer */\n> > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize,\n> > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> > return ret;\n> > }\n> > \n> > -void TestPatternGenerator::shiftLeft(const Size &size)\n> > -{\n> > - /* Store the first column temporarily /\n> > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize);\n> > - for (size_t h = 0; h < size.height; h++) {\n> > - unsigned int index = h * size.width * kARGBSize;\n> > - unsigned int index1 = h * kARGBSize;\n> > - firstColumn[index1] = template_[index];\n> > - firstColumn[index1 + 1] = template_[index + 1];\n> > - firstColumn[index1 + 2] = template_[index + 2];\n> > - firstColumn[index1 + 3] = 0x00;\n> > - }\n> > -\n> > - / Overwrite template_ */\n> > - uint8_t buf = template_.get();\n> > - for (size_t h = 0; h < size.height; h++) {\n> > - for (size_t w = 0; w < size.width - 1; w++) {\n> > - / Overwrite with the pixel on the right */\n> > - unsigned int index = (h * size.width + w + 1) * kARGBSize;\n> > - buf++ = template_[index]; / B */\n> > - buf++ = template_[index + 1]; / G */\n> > - buf++ = template_[index + 2]; / R */\n> > - buf++ = 0x00; / A /\n> > - }\n> > - / Overwrite the new last column with the original first column */\n> > - unsigned int index1 = h * kARGBSize;\n> > - buf++ = firstColumn[index1]; / B */\n> > - buf++ = firstColumn[index1 + 1]; / G */\n> > - buf++ = firstColumn[index1 + 2]; / R */\n> > - buf++ = 0x00; / A /\n> > - }\n> > -}\n> > -\n> > void ColorBarsGenerator::configure(const Size &size)\n> > {\n> > constexpr uint8_t kColorBar[8][3] = {\n> > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > index 05f4ab7a7..2a51bd31a 100644\n> > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > @@ -29,10 +29,6 @@ public:\n> > protected:\n> > / Buffer of test pattern template /\n> > std::unique_ptr<uint8_t[]> template_;\n> > -\n> > -private:\n> > - / Shift the buffer by 1 pixel left each frame */\n> > - void shiftLeft(const Size &size);\n> > };\n> > \n> > class ColorBarsGenerator : public TestPatternGenerator\n> > --\n> > 2.47.1","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 86E68C32F6\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Dec 2024 16:22:50 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 3E13967FE6;\n\tTue, 17 Dec 2024 17:22:50 +0100 (CET)","from mail-10630.protonmail.ch (mail-10630.protonmail.ch\n\t[79.135.106.30])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 1C94567FDF\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Dec 2024 17:22:49 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=protonmail.com header.i=@protonmail.com\n\theader.b=\"OIOb7WSv\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;\n\ts=protonmail3; t=1734452567; x=1734711767;\n\tbh=YKQDg9fqV2IP+1nt/ld/LhSCVZ/8GSuYUE90HGQr6Hc=;\n\th=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References:\n\tFeedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID:\n\tMessage-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post;\n\tb=OIOb7WSvlX41t7IxkJeH1iAwmdB4kCbATFTWv6q3eo9gMmI9WyRjXqd5FZGc2aw1t\n\tZ61ppn0OMJ/kmDwPO35bF1bCveBcLchssFCSGFbiUM+xV8P1oNuYNUP59DKlB9jUZ3\n\tpoRfHs7RSJ1yWg6TYmoWqCRY7YylvDQBF+qB+xUJ1SgaYLRHfG5lNOGM/xSkJ+K1Et\n\t5RmfimMJaWPFoEGFYSddqiZTzaQhjPkApVp1+bgMinn2dn/7RLzbZiHLng2sq5PNih\n\tt8X8YM2mZNDL7eUHUsyDBePb1RjueZ54FFcu4ISHKk4t+DxfiLmnuqnwT9B8Uhcz5n\n\t+wPRruhUYSQ0w==","Date":"Tue, 17 Dec 2024 16:22:42 +0000","To":"Kieran Bingham <kieran.bingham@ideasonboard.com>","From":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <pobrn@protonmail.com>","Cc":"libcamera-devel@lists.libcamera.org","Subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","Message-ID":"<NINlug_ACinWYzt4ZhdSBWiw_dxXOQUrZYnJBid5nf7l6UsA7R92yNVuqqAuDZjaJwTF9qhC8gyUzMr1g7FRKvXpTb_BMhKJGl4sRxgT8gQ=@protonmail.com>","In-Reply-To":"<173445238001.32744.4669942287932373725@ping.linuxembedded.co.uk>","References":"<20241211152542.1095857-1-pobrn@protonmail.com>\n\t<20241211152542.1095857-3-pobrn@protonmail.com>\n\t<173445238001.32744.4669942287932373725@ping.linuxembedded.co.uk>","Feedback-ID":"20568564:user:proton","X-Pm-Message-ID":"af7769daa99af54a82829a1c347483d76a1716c2","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"quoted-printable","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":32831,"web_url":"https://patchwork.libcamera.org/comment/32831/","msgid":"<20241217163926.GB11809@pendragon.ideasonboard.com>","date":"2024-12-17T16:39:26","subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"Hi Barnabás,\n\nThank you for the patch.\n\nOn Wed, Dec 11, 2024 at 03:25:56PM +0000, Barnabás Pőcze wrote:\n> After the initial generation, when a frame is requested,\n> the test pattern generator rotates the image left by 1 column.\n> \n> The current approach has two shortcomings:\n>   (1) it allocates a temporary buffer to hold one column;\n>   (2) it swaps two columns at a time.\n> \n> The test patterns are simple ARGB images, in row-major order,\n> so doing (2) works against memory prefetching. This can be\n> addressed by doing the rotation one row at a time as that way\n> the image is addressed in a purely linear fashion. Doing so also\n> eliminates the need for a dynamically allocated temporary buffer,\n> as the required buffer now only needs to hold one sample,\n> which is 4 bytes in this case.\n> \n> In an optimized build, this results in about a 2x increase in the\n> number of frames per second as reported by `cam`. In an unoptimized,\n> ASAN and UBSAN intrumented build, the difference is even bigger,\n> which is useful for running lc-compliance in CI in a reasonable time.\n> \n> Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>\n> ---\n>  .../virtual/test_pattern_generator.cpp        | 68 +++++++++----------\n>  .../pipeline/virtual/test_pattern_generator.h |  4 --\n>  2 files changed, 34 insertions(+), 38 deletions(-)\n> \n> diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> index 47d341919..eeadc1646 100644\n> --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> @@ -5,6 +5,8 @@\n>   * Derived class of FrameGenerator for generating test patterns\n>   */\n>  \n> +#include <string.h>\n> +\n>  #include \"test_pattern_generator.h\"\n>  \n>  #include <libcamera/base/log.h>\n> @@ -13,6 +15,37 @@\n>  \n>  #include <libyuv/convert_from_argb.h>\n>  \n> +namespace {\n> +\n> +template<size_t SampleSize>\n> +void rotateLeft1Sample(uint8_t *samples, size_t width)\n> +{\n> +\tif (width <= 0)\n\nwidth is unsigned, so it will never be smaller than 0. As the caller\nalready ensures that the width is >= 2, I'd drop this check.\n\n> +\t\treturn;\n> +\n> +\tconst size_t stride = width * SampleSize;\n\nYou could pass the stride value to this function, instead of recomputing\nit for each line.\n\n> +\tuint8_t first[SampleSize];\n> +\n> +\tmemcpy(first, &samples[0], SampleSize);\n> +\tfor (size_t i = 0; i < stride - SampleSize; i += SampleSize)\n> +\t\tmemcpy(&samples[i], &samples[i + SampleSize], SampleSize);\n\nHave you considered replacing this with\n\n\tmemmove(&samples[0], &samples[SampleSize], stride - SampleSize);\n\n?\n\nThis patch is a good improvement, so\n\nReviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>\n\nwith the above comments taken into account (regardless of whether the\nsuggestions are accepted or rejected).\n\nI think the test pattern generator could do with a whole rewrite. I\nwould compute the test pattern image at configure time, and store it in\nNV12 and in the output resolution, and handle the rotation when copying\nit to the frame buffer. That should significantly speed up operation.\n\n> +\tmemcpy(&samples[stride - SampleSize], first, SampleSize);\n> +}\n> +\n> +template<size_t SampleSize>\n> +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image)\n> +{\n> +\tif (size.width < 2)\n> +\t\treturn;\n> +\n> +\tconst size_t stride = size.width * SampleSize;\n> +\n> +\tfor (size_t i = 0; i < size.height; i++, image += stride)\n> +\t\trotateLeft1Sample<SampleSize>(image, size.width);\n> +}\n> +\n> +}\n> +\n>  namespace libcamera {\n>  \n>  LOG_DECLARE_CATEGORY(Virtual)\n> @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size,\n>  \n>  \tconst auto &planes = mappedFrameBuffer.planes();\n>  \n> -\tshiftLeft(size);\n> +\trotateLeft1Column<kARGBSize>(size, template_.get());\n>  \n>  \t/* Convert the template_ to the frame buffer */\n>  \tint ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize,\n> @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size,\n>  \treturn ret;\n>  }\n>  \n> -void TestPatternGenerator::shiftLeft(const Size &size)\n> -{\n> -\t/* Store the first column temporarily */\n> -\tauto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize);\n> -\tfor (size_t h = 0; h < size.height; h++) {\n> -\t\tunsigned int index = h * size.width * kARGBSize;\n> -\t\tunsigned int index1 = h * kARGBSize;\n> -\t\tfirstColumn[index1] = template_[index];\n> -\t\tfirstColumn[index1 + 1] = template_[index + 1];\n> -\t\tfirstColumn[index1 + 2] = template_[index + 2];\n> -\t\tfirstColumn[index1 + 3] = 0x00;\n> -\t}\n> -\n> -\t/* Overwrite template_ */\n> -\tuint8_t *buf = template_.get();\n> -\tfor (size_t h = 0; h < size.height; h++) {\n> -\t\tfor (size_t w = 0; w < size.width - 1; w++) {\n> -\t\t\t/* Overwrite with the pixel on the right */\n> -\t\t\tunsigned int index = (h * size.width + w + 1) * kARGBSize;\n> -\t\t\t*buf++ = template_[index]; /* B */\n> -\t\t\t*buf++ = template_[index + 1]; /* G */\n> -\t\t\t*buf++ = template_[index + 2]; /* R */\n> -\t\t\t*buf++ = 0x00; /* A */\n> -\t\t}\n> -\t\t/* Overwrite the new last column with the original first column */\n> -\t\tunsigned int index1 = h * kARGBSize;\n> -\t\t*buf++ = firstColumn[index1]; /* B */\n> -\t\t*buf++ = firstColumn[index1 + 1]; /* G */\n> -\t\t*buf++ = firstColumn[index1 + 2]; /* R */\n> -\t\t*buf++ = 0x00; /* A */\n> -\t}\n> -}\n> -\n>  void ColorBarsGenerator::configure(const Size &size)\n>  {\n>  \tconstexpr uint8_t kColorBar[8][3] = {\n> diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> index 05f4ab7a7..2a51bd31a 100644\n> --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> @@ -29,10 +29,6 @@ public:\n>  protected:\n>  \t/* Buffer of test pattern template */\n>  \tstd::unique_ptr<uint8_t[]> template_;\n> -\n> -private:\n> -\t/* Shift the buffer by 1 pixel left each frame */\n> -\tvoid shiftLeft(const Size &size);\n>  };\n>  \n>  class ColorBarsGenerator : public TestPatternGenerator","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 34B61C32F6\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Dec 2024 16:39:31 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 6808A67FE6;\n\tTue, 17 Dec 2024 17:39:30 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 21B1D67FD3\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Dec 2024 17:39:29 +0100 (CET)","from pendragon.ideasonboard.com (81-175-209-231.bb.dnainternet.fi\n\t[81.175.209.231])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 8DEA54C7;\n\tTue, 17 Dec 2024 17:38:51 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"oDZ0IXd1\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1734453531;\n\tbh=nehlE+KZzrsARj9GGvJ9dZDsAhl1q3b7lKraUnmtzHk=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=oDZ0IXd1gaPz6d0FxTMWjq+qhfnDxWGwS2kqQlTadLmPEZAoPVV9czfA4BB1nuMxK\n\tk06fmKcTyYKXRNTpRIFgptWpDOfkASXWag/xH51TcoKWUrPkjvwrxg9dKpajXOxYYU\n\twQyihqv3ttk4jMaQbAD3FEa5VWTGQAIi79rxOZ1M=","Date":"Tue, 17 Dec 2024 18:39:26 +0200","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <pobrn@protonmail.com>","Cc":"libcamera-devel@lists.libcamera.org","Subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","Message-ID":"<20241217163926.GB11809@pendragon.ideasonboard.com>","References":"<20241211152542.1095857-1-pobrn@protonmail.com>\n\t<20241211152542.1095857-3-pobrn@protonmail.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20241211152542.1095857-3-pobrn@protonmail.com>","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":32840,"web_url":"https://patchwork.libcamera.org/comment/32840/","msgid":"<20241217165746.GC11809@pendragon.ideasonboard.com>","date":"2024-12-17T16:57:46","subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"CC'ing Harvey.\n\nOn Tue, Dec 17, 2024 at 04:22:42PM +0000, Barnabás Pőcze wrote:\n> 2024. december 17., kedd 17:19 keltezéssel, Kieran Bingham írta:\n> > Quoting Barnabás Pőcze (2024-12-11 15:25:56)\n> > \n> > > After the initial generation, when a frame is requested,\n> > > the test pattern generator rotates the image left by 1 column.\n> > > \n> > > The current approach has two shortcomings:\n> > > (1) it allocates a temporary buffer to hold one column;\n> > > (2) it swaps two columns at a time.\n> > > \n> > > The test patterns are simple ARGB images, in row-major order,\n> > > so doing (2) works against memory prefetching. This can be\n> > > addressed by doing the rotation one row at a time as that way\n> > > the image is addressed in a purely linear fashion. Doing so also\n> > > eliminates the need for a dynamically allocated temporary buffer,\n> > > as the required buffer now only needs to hold one sample,\n> > > which is 4 bytes in this case.\n> > > \n> > > In an optimized build, this results in about a 2x increase in the\n> > > number of frames per second as reported by `cam`. In an unoptimized,\n> > > ASAN and UBSAN intrumented build, the difference is even bigger,\n> > > which is useful for running lc-compliance in CI in a reasonable time.\n> > > \n> > > Signed-off-by: Barnabás Pőcze pobrn@protonmail.com\n> > \n> > \n> > Hrm, rabbit holes a plenty trying to test this.\n> > \n> > First qcam isn't working on my main development PC as Qt6 does not\n> > supply a pkg-config file in the version available on Ubuntu 22.04 LTS.\n> > \n> > Next up, I'm running cam with the SDL output which works:\n> > \n> > ./build/gcc/src/apps/cam/cam -c2 -S -C\n> > \n> > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0\n> > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0\n> > \n> > \n> > but the virtual pipeline handler is clearly not handling sequence\n> > numbers, timetamps, bytesused - or presumably any of the required\n> > metadata correctly.\n> > \n> > But none of that is a fault of your patch which is working so:\n> \n> I forgot to mention that I used this to test:\n> https://gitlab.freedesktop.org/pobrn/libcamera/-/commit/cd537fa855bb68c1faeab8fb5b94c4d423899b04\n> \n> There is also a corresponding bug:\n> https://bugs.libcamera.org/show_bug.cgi?id=245\n\nThat's not great, it should be fixed sooner than later as I expect it\nwill impact CI. Harvey, when do you think you could have a look at that\n? Could you also create an account on bugs.libcamera.org, so that I can\nassign bugs for the virtual pipeline handler to you in the future ?\n\n> > Tested-by: Kieran Bingham kieran.bingham@ideasonboard.com\n> > \n> > Reviewed-by: Kieran Bingham kieran.bingham@ideasonboard.com\n> > \n> > > ---\n> > > .../virtual/test_pattern_generator.cpp | 68 +++++++++----------\n> > > .../pipeline/virtual/test_pattern_generator.h | 4 --\n> > > 2 files changed, 34 insertions(+), 38 deletions(-)\n> > > \n> > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > > index 47d341919..eeadc1646 100644\n> > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > > @@ -5,6 +5,8 @@\n> > > * Derived class of FrameGenerator for generating test patterns\n> > > */\n> > > \n> > > +#include <string.h>\n> > > +\n> > > #include \"test_pattern_generator.h\"\n> > > \n> > > #include <libcamera/base/log.h>\n> > > @@ -13,6 +15,37 @@\n> > > \n> > > #include <libyuv/convert_from_argb.h>\n> > > \n> > > +namespace {\n> > > +\n> > > +template<size_t SampleSize>\n> > > +void rotateLeft1Sample(uint8_t *samples, size_t width)\n> > > +{\n> > > + if (width <= 0)\n> > > + return;\n> > > +\n> > > + const size_t stride = width * SampleSize;\n> > > + uint8_t first[SampleSize];\n> > > +\n> > > + memcpy(first, &samples[0], SampleSize);\n> > > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize)\n> > > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize);\n> > > + memcpy(&samples[stride - SampleSize], first, SampleSize);\n> > > +}\n> > > +\n> > > +template<size_t SampleSize>\n> > > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image)\n> > > +{\n> > > + if (size.width < 2)\n> > > + return;\n> > > +\n> > > + const size_t stride = size.width * SampleSize;\n> > > +\n> > > + for (size_t i = 0; i < size.height; i++, image += stride)\n> > > + rotateLeft1Sample<SampleSize>(image, size.width);\n> > > +}\n> > > +\n> > > +}\n> > > +\n> > > namespace libcamera {\n> > > \n> > > LOG_DECLARE_CATEGORY(Virtual)\n> > > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> > > \n> > > const auto &planes = mappedFrameBuffer.planes();\n> > > \n> > > - shiftLeft(size);\n> > > + rotateLeft1Column<kARGBSize>(size, template_.get());\n> > > \n> > > /* Convert the template_ to the frame buffer */\n> > > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize,\n> > > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> > > return ret;\n> > > }\n> > > \n> > > -void TestPatternGenerator::shiftLeft(const Size &size)\n> > > -{\n> > > - /* Store the first column temporarily /\n> > > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize);\n> > > - for (size_t h = 0; h < size.height; h++) {\n> > > - unsigned int index = h * size.width * kARGBSize;\n> > > - unsigned int index1 = h * kARGBSize;\n> > > - firstColumn[index1] = template_[index];\n> > > - firstColumn[index1 + 1] = template_[index + 1];\n> > > - firstColumn[index1 + 2] = template_[index + 2];\n> > > - firstColumn[index1 + 3] = 0x00;\n> > > - }\n> > > -\n> > > - / Overwrite template_ */\n> > > - uint8_t buf = template_.get();\n> > > - for (size_t h = 0; h < size.height; h++) {\n> > > - for (size_t w = 0; w < size.width - 1; w++) {\n> > > - / Overwrite with the pixel on the right */\n> > > - unsigned int index = (h * size.width + w + 1) * kARGBSize;\n> > > - buf++ = template_[index]; / B */\n> > > - buf++ = template_[index + 1]; / G */\n> > > - buf++ = template_[index + 2]; / R */\n> > > - buf++ = 0x00; / A /\n> > > - }\n> > > - / Overwrite the new last column with the original first column */\n> > > - unsigned int index1 = h * kARGBSize;\n> > > - buf++ = firstColumn[index1]; / B */\n> > > - buf++ = firstColumn[index1 + 1]; / G */\n> > > - buf++ = firstColumn[index1 + 2]; / R */\n> > > - buf++ = 0x00; / A /\n> > > - }\n> > > -}\n> > > -\n> > > void ColorBarsGenerator::configure(const Size &size)\n> > > {\n> > > constexpr uint8_t kColorBar[8][3] = {\n> > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > > index 05f4ab7a7..2a51bd31a 100644\n> > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > > @@ -29,10 +29,6 @@ public:\n> > > protected:\n> > > / Buffer of test pattern template /\n> > > std::unique_ptr<uint8_t[]> template_;\n> > > -\n> > > -private:\n> > > - / Shift the buffer by 1 pixel left each frame */\n> > > - void shiftLeft(const Size &size);\n> > > };\n> > > \n> > > class ColorBarsGenerator : public TestPatternGenerator","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 4193FBD1F1\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Dec 2024 16:57:51 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 821D267FFA;\n\tTue, 17 Dec 2024 17:57:50 +0100 (CET)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 4450E67FE2\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Dec 2024 17:57:48 +0100 (CET)","from pendragon.ideasonboard.com (81-175-209-231.bb.dnainternet.fi\n\t[81.175.209.231])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id D0BE74C7;\n\tTue, 17 Dec 2024 17:57:10 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"T5GuEqpt\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1734454631;\n\tbh=8RQv7ymJDzGVMe0hbfHa7dnyw9ZpaIVC93zj9I1ER8Q=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=T5GuEqptIBdHshFbLQfDtpdrZK48r2ECebiydN0Xi4kyosZr6+D/JG1Wkn7bl2OnT\n\tzAl779IY/jn82Pv835fj9qx375aoknWaDH3HlFy0I4T0/F3bKTiooU+wpUXkeYAw+n\n\trJsVnBI8FbYTbzkE4UwQXvIo3n9Kru3wSynToy2E=","Date":"Tue, 17 Dec 2024 18:57:46 +0200","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <pobrn@protonmail.com>","Cc":"Kieran Bingham <kieran.bingham@ideasonboard.com>,\n\tlibcamera-devel@lists.libcamera.org,\n\tHarvey Yang <chenghaoyang@chromium.org>","Subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","Message-ID":"<20241217165746.GC11809@pendragon.ideasonboard.com>","References":"<20241211152542.1095857-1-pobrn@protonmail.com>\n\t<20241211152542.1095857-3-pobrn@protonmail.com>\n\t<173445238001.32744.4669942287932373725@ping.linuxembedded.co.uk>\n\t<NINlug_ACinWYzt4ZhdSBWiw_dxXOQUrZYnJBid5nf7l6UsA7R92yNVuqqAuDZjaJwTF9qhC8gyUzMr1g7FRKvXpTb_BMhKJGl4sRxgT8gQ=@protonmail.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<NINlug_ACinWYzt4ZhdSBWiw_dxXOQUrZYnJBid5nf7l6UsA7R92yNVuqqAuDZjaJwTF9qhC8gyUzMr1g7FRKvXpTb_BMhKJGl4sRxgT8gQ=@protonmail.com>","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":32846,"web_url":"https://patchwork.libcamera.org/comment/32846/","msgid":"<_R8XAeD8iWuE6Xr2VnMVflhHThGVk0SgSe3wqg-I00R0xmMX-iYhSKFa5BeX92vjZzKBsVM1tEt5R2umAAxyah2CRuWnHO4q4qPIFcZsHfg=@protonmail.com>","date":"2024-12-17T17:38:51","subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","submitter":{"id":133,"url":"https://patchwork.libcamera.org/api/people/133/","name":"Pőcze Barnabás","email":"pobrn@protonmail.com"},"content":"2024. december 17., kedd 17:39 keltezéssel, Laurent Pinchart <laurent.pinchart@ideasonboard.com> írta:\n\n> Hi Barnabás,\n> \n> Thank you for the patch.\n> \n> On Wed, Dec 11, 2024 at 03:25:56PM +0000, Barnabás Pőcze wrote:\n> > After the initial generation, when a frame is requested,\n> > the test pattern generator rotates the image left by 1 column.\n> >\n> > The current approach has two shortcomings:\n> >   (1) it allocates a temporary buffer to hold one column;\n> >   (2) it swaps two columns at a time.\n> >\n> > The test patterns are simple ARGB images, in row-major order,\n> > so doing (2) works against memory prefetching. This can be\n> > addressed by doing the rotation one row at a time as that way\n> > the image is addressed in a purely linear fashion. Doing so also\n> > eliminates the need for a dynamically allocated temporary buffer,\n> > as the required buffer now only needs to hold one sample,\n> > which is 4 bytes in this case.\n> >\n> > In an optimized build, this results in about a 2x increase in the\n> > number of frames per second as reported by `cam`. In an unoptimized,\n> > ASAN and UBSAN intrumented build, the difference is even bigger,\n> > which is useful for running lc-compliance in CI in a reasonable time.\n> >\n> > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com>\n> > ---\n> >  .../virtual/test_pattern_generator.cpp        | 68 +++++++++----------\n> >  .../pipeline/virtual/test_pattern_generator.h |  4 --\n> >  2 files changed, 34 insertions(+), 38 deletions(-)\n> >\n> > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > index 47d341919..eeadc1646 100644\n> > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp\n> > @@ -5,6 +5,8 @@\n> >   * Derived class of FrameGenerator for generating test patterns\n> >   */\n> >\n> > +#include <string.h>\n> > +\n> >  #include \"test_pattern_generator.h\"\n> >\n> >  #include <libcamera/base/log.h>\n> > @@ -13,6 +15,37 @@\n> >\n> >  #include <libyuv/convert_from_argb.h>\n> >\n> > +namespace {\n> > +\n> > +template<size_t SampleSize>\n> > +void rotateLeft1Sample(uint8_t *samples, size_t width)\n> > +{\n> > +\tif (width <= 0)\n> \n> width is unsigned, so it will never be smaller than 0. As the caller\n> already ensures that the width is >= 2, I'd drop this check.\n> \n> > +\t\treturn;\n> > +\n> > +\tconst size_t stride = width * SampleSize;\n> \n> You could pass the stride value to this function, instead of recomputing\n> it for each line.\n> \n> > +\tuint8_t first[SampleSize];\n> > +\n> > +\tmemcpy(first, &samples[0], SampleSize);\n> > +\tfor (size_t i = 0; i < stride - SampleSize; i += SampleSize)\n> > +\t\tmemcpy(&samples[i], &samples[i + SampleSize], SampleSize);\n> \n> Have you considered replacing this with\n> \n> \tmemmove(&samples[0], &samples[SampleSize], stride - SampleSize);\n> \n> ?\n\nNo... but that's a good idea. I'll apply this in the next version.\n\n\nRegards,\nBarnabás Pőcze\n\n> \n> This patch is a good improvement, so\n> \n> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>\n> \n> with the above comments taken into account (regardless of whether the\n> suggestions are accepted or rejected).\n> \n> I think the test pattern generator could do with a whole rewrite. I\n> would compute the test pattern image at configure time, and store it in\n> NV12 and in the output resolution, and handle the rotation when copying\n> it to the frame buffer. That should significantly speed up operation.\n> \n> > +\tmemcpy(&samples[stride - SampleSize], first, SampleSize);\n> > +}\n> > +\n> > +template<size_t SampleSize>\n> > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image)\n> > +{\n> > +\tif (size.width < 2)\n> > +\t\treturn;\n> > +\n> > +\tconst size_t stride = size.width * SampleSize;\n> > +\n> > +\tfor (size_t i = 0; i < size.height; i++, image += stride)\n> > +\t\trotateLeft1Sample<SampleSize>(image, size.width);\n> > +}\n> > +\n> > +}\n> > +\n> >  namespace libcamera {\n> >\n> >  LOG_DECLARE_CATEGORY(Virtual)\n> > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> >\n> >  \tconst auto &planes = mappedFrameBuffer.planes();\n> >\n> > -\tshiftLeft(size);\n> > +\trotateLeft1Column<kARGBSize>(size, template_.get());\n> >\n> >  \t/* Convert the template_ to the frame buffer */\n> >  \tint ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize,\n> > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size,\n> >  \treturn ret;\n> >  }\n> >\n> > -void TestPatternGenerator::shiftLeft(const Size &size)\n> > -{\n> > -\t/* Store the first column temporarily */\n> > -\tauto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize);\n> > -\tfor (size_t h = 0; h < size.height; h++) {\n> > -\t\tunsigned int index = h * size.width * kARGBSize;\n> > -\t\tunsigned int index1 = h * kARGBSize;\n> > -\t\tfirstColumn[index1] = template_[index];\n> > -\t\tfirstColumn[index1 + 1] = template_[index + 1];\n> > -\t\tfirstColumn[index1 + 2] = template_[index + 2];\n> > -\t\tfirstColumn[index1 + 3] = 0x00;\n> > -\t}\n> > -\n> > -\t/* Overwrite template_ */\n> > -\tuint8_t *buf = template_.get();\n> > -\tfor (size_t h = 0; h < size.height; h++) {\n> > -\t\tfor (size_t w = 0; w < size.width - 1; w++) {\n> > -\t\t\t/* Overwrite with the pixel on the right */\n> > -\t\t\tunsigned int index = (h * size.width + w + 1) * kARGBSize;\n> > -\t\t\t*buf++ = template_[index]; /* B */\n> > -\t\t\t*buf++ = template_[index + 1]; /* G */\n> > -\t\t\t*buf++ = template_[index + 2]; /* R */\n> > -\t\t\t*buf++ = 0x00; /* A */\n> > -\t\t}\n> > -\t\t/* Overwrite the new last column with the original first column */\n> > -\t\tunsigned int index1 = h * kARGBSize;\n> > -\t\t*buf++ = firstColumn[index1]; /* B */\n> > -\t\t*buf++ = firstColumn[index1 + 1]; /* G */\n> > -\t\t*buf++ = firstColumn[index1 + 2]; /* R */\n> > -\t\t*buf++ = 0x00; /* A */\n> > -\t}\n> > -}\n> > -\n> >  void ColorBarsGenerator::configure(const Size &size)\n> >  {\n> >  \tconstexpr uint8_t kColorBar[8][3] = {\n> > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > index 05f4ab7a7..2a51bd31a 100644\n> > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h\n> > @@ -29,10 +29,6 @@ public:\n> >  protected:\n> >  \t/* Buffer of test pattern template */\n> >  \tstd::unique_ptr<uint8_t[]> template_;\n> > -\n> > -private:\n> > -\t/* Shift the buffer by 1 pixel left each frame */\n> > -\tvoid shiftLeft(const Size &size);\n> >  };\n> >\n> >  class ColorBarsGenerator : public TestPatternGenerator\n> \n> --\n> Regards,\n> \n> Laurent Pinchart\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id DF345BD1F1\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Dec 2024 17:38:57 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 9465667FF8;\n\tTue, 17 Dec 2024 18:38:57 +0100 (CET)","from mail-4322.protonmail.ch (mail-4322.protonmail.ch\n\t[185.70.43.22])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id BB5CF67FE6\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Dec 2024 18:38:55 +0100 (CET)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=protonmail.com header.i=@protonmail.com\n\theader.b=\"l80BE61c\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=protonmail.com;\n\ts=protonmail3; t=1734457134; x=1734716334;\n\tbh=nyMM/cn5WEjXCEXrA5o9tv8VqYRQ5m7i3+2b+2sLSe0=;\n\th=Date:To:From:Cc:Subject:Message-ID:In-Reply-To:References:\n\tFeedback-ID:From:To:Cc:Date:Subject:Reply-To:Feedback-ID:\n\tMessage-ID:BIMI-Selector:List-Unsubscribe:List-Unsubscribe-Post;\n\tb=l80BE61cKJyNUcjtFnRWkx6om+YOC8WKXZbfzKjukwCjDTFG+FBSdnagenTq35qPx\n\tTWSH6UOKLfz/fF7CFXU8E9+Y3+W/+tb3OfRqDXjszH9w/DErFP2vWl/purixhtc6GC\n\tVRCdZ4mWlnuxPFqsxOhfRgLxfN0Cq8rw78bHiO2gG4D7GnviJM/IfyYusHfE95LzGA\n\t8J1LejcIoajzN7ckoJf4CsRiOywQ/zVDEByZpc16M2fYpGPNg9qE7qsl+x9uBiIqKE\n\tY6gj3jQ32Rle7fB2j+G5pTrfE/OWVP3szKxoBfetQOzCXJ2dZCrOXa61ZmwYGxDH0i\n\tRYeh2vBXJSoqA==","Date":"Tue, 17 Dec 2024 17:38:51 +0000","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","From":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <pobrn@protonmail.com>","Cc":"libcamera-devel@lists.libcamera.org","Subject":"Re: [PATCH v1 3/3] libcamera: virtual: Speed up test pattern\n\tanimation","Message-ID":"<_R8XAeD8iWuE6Xr2VnMVflhHThGVk0SgSe3wqg-I00R0xmMX-iYhSKFa5BeX92vjZzKBsVM1tEt5R2umAAxyah2CRuWnHO4q4qPIFcZsHfg=@protonmail.com>","In-Reply-To":"<20241217163926.GB11809@pendragon.ideasonboard.com>","References":"<20241211152542.1095857-1-pobrn@protonmail.com>\n\t<20241211152542.1095857-3-pobrn@protonmail.com>\n\t<20241217163926.GB11809@pendragon.ideasonboard.com>","Feedback-ID":"20568564:user:proton","X-Pm-Message-ID":"55f3bddf0324c24e4a354e321784eb10df793b77","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"quoted-printable","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}}]