Message ID | 20241211152542.1095857-3-pobrn@protonmail.com |
---|---|
State | Superseded |
Headers | show |
Series |
|
Related | show |
Quoting Barnabás Pőcze (2024-12-11 15:25:56) > After the initial generation, when a frame is requested, > the test pattern generator rotates the image left by 1 column. > > The current approach has two shortcomings: > (1) it allocates a temporary buffer to hold one column; > (2) it swaps two columns at a time. > > The test patterns are simple ARGB images, in row-major order, > so doing (2) works against memory prefetching. This can be > addressed by doing the rotation one row at a time as that way > the image is addressed in a purely linear fashion. Doing so also > eliminates the need for a dynamically allocated temporary buffer, > as the required buffer now only needs to hold one sample, > which is 4 bytes in this case. > > In an optimized build, this results in about a 2x increase in the > number of frames per second as reported by `cam`. In an unoptimized, > ASAN and UBSAN intrumented build, the difference is even bigger, > which is useful for running lc-compliance in CI in a reasonable time. > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Hrm, rabbit holes a plenty trying to test this. First qcam isn't working on my main development PC as Qt6 does not supply a pkg-config file in the version available on Ubuntu 22.04 LTS. Next up, I'm running cam with the SDL output which works: ./build/gcc/src/apps/cam/cam -c2 -S -C 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 but the virtual pipeline handler is clearly not handling sequence numbers, timetamps, bytesused - or presumably any of the required metadata correctly. But none of that is a fault of your patch which is working so: Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > --- > .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- > .../pipeline/virtual/test_pattern_generator.h | 4 -- > 2 files changed, 34 insertions(+), 38 deletions(-) > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > index 47d341919..eeadc1646 100644 > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > @@ -5,6 +5,8 @@ > * Derived class of FrameGenerator for generating test patterns > */ > > +#include <string.h> > + > #include "test_pattern_generator.h" > > #include <libcamera/base/log.h> > @@ -13,6 +15,37 @@ > > #include <libyuv/convert_from_argb.h> > > +namespace { > + > +template<size_t SampleSize> > +void rotateLeft1Sample(uint8_t *samples, size_t width) > +{ > + if (width <= 0) > + return; > + > + const size_t stride = width * SampleSize; > + uint8_t first[SampleSize]; > + > + memcpy(first, &samples[0], SampleSize); > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); > + memcpy(&samples[stride - SampleSize], first, SampleSize); > +} > + > +template<size_t SampleSize> > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) > +{ > + if (size.width < 2) > + return; > + > + const size_t stride = size.width * SampleSize; > + > + for (size_t i = 0; i < size.height; i++, image += stride) > + rotateLeft1Sample<SampleSize>(image, size.width); > +} > + > +} > + > namespace libcamera { > > LOG_DECLARE_CATEGORY(Virtual) > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, > > const auto &planes = mappedFrameBuffer.planes(); > > - shiftLeft(size); > + rotateLeft1Column<kARGBSize>(size, template_.get()); > > /* Convert the template_ to the frame buffer */ > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, > return ret; > } > > -void TestPatternGenerator::shiftLeft(const Size &size) > -{ > - /* Store the first column temporarily */ > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); > - for (size_t h = 0; h < size.height; h++) { > - unsigned int index = h * size.width * kARGBSize; > - unsigned int index1 = h * kARGBSize; > - firstColumn[index1] = template_[index]; > - firstColumn[index1 + 1] = template_[index + 1]; > - firstColumn[index1 + 2] = template_[index + 2]; > - firstColumn[index1 + 3] = 0x00; > - } > - > - /* Overwrite template_ */ > - uint8_t *buf = template_.get(); > - for (size_t h = 0; h < size.height; h++) { > - for (size_t w = 0; w < size.width - 1; w++) { > - /* Overwrite with the pixel on the right */ > - unsigned int index = (h * size.width + w + 1) * kARGBSize; > - *buf++ = template_[index]; /* B */ > - *buf++ = template_[index + 1]; /* G */ > - *buf++ = template_[index + 2]; /* R */ > - *buf++ = 0x00; /* A */ > - } > - /* Overwrite the new last column with the original first column */ > - unsigned int index1 = h * kARGBSize; > - *buf++ = firstColumn[index1]; /* B */ > - *buf++ = firstColumn[index1 + 1]; /* G */ > - *buf++ = firstColumn[index1 + 2]; /* R */ > - *buf++ = 0x00; /* A */ > - } > -} > - > void ColorBarsGenerator::configure(const Size &size) > { > constexpr uint8_t kColorBar[8][3] = { > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h > index 05f4ab7a7..2a51bd31a 100644 > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h > @@ -29,10 +29,6 @@ public: > protected: > /* Buffer of test pattern template */ > std::unique_ptr<uint8_t[]> template_; > - > -private: > - /* Shift the buffer by 1 pixel left each frame */ > - void shiftLeft(const Size &size); > }; > > class ColorBarsGenerator : public TestPatternGenerator > -- > 2.47.1 > >
Regards, Barnabás Pőcze 2024. december 17., kedd 17:19 keltezéssel, Kieran Bingham <kieran.bingham@ideasonboard.com> írta: > Quoting Barnabás Pőcze (2024-12-11 15:25:56) > > > After the initial generation, when a frame is requested, > > the test pattern generator rotates the image left by 1 column. > > > > The current approach has two shortcomings: > > (1) it allocates a temporary buffer to hold one column; > > (2) it swaps two columns at a time. > > > > The test patterns are simple ARGB images, in row-major order, > > so doing (2) works against memory prefetching. This can be > > addressed by doing the rotation one row at a time as that way > > the image is addressed in a purely linear fashion. Doing so also > > eliminates the need for a dynamically allocated temporary buffer, > > as the required buffer now only needs to hold one sample, > > which is 4 bytes in this case. > > > > In an optimized build, this results in about a 2x increase in the > > number of frames per second as reported by `cam`. In an unoptimized, > > ASAN and UBSAN intrumented build, the difference is even bigger, > > which is useful for running lc-compliance in CI in a reasonable time. > > > > Signed-off-by: Barnabás Pőcze pobrn@protonmail.com > > > Hrm, rabbit holes a plenty trying to test this. > > First qcam isn't working on my main development PC as Qt6 does not > supply a pkg-config file in the version available on Ubuntu 22.04 LTS. > > Next up, I'm running cam with the SDL output which works: > > ./build/gcc/src/apps/cam/cam -c2 -S -C > > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > > but the virtual pipeline handler is clearly not handling sequence > numbers, timetamps, bytesused - or presumably any of the required > metadata correctly. > > But none of that is a fault of your patch which is working so: I forgot to mention that I used this to test: https://gitlab.freedesktop.org/pobrn/libcamera/-/commit/cd537fa855bb68c1faeab8fb5b94c4d423899b04 There is also a corresponding bug: https://bugs.libcamera.org/show_bug.cgi?id=245 Regards, Barnabás Pőcze > > > Tested-by: Kieran Bingham kieran.bingham@ideasonboard.com > > Reviewed-by: Kieran Bingham kieran.bingham@ideasonboard.com > > > --- > > .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- > > .../pipeline/virtual/test_pattern_generator.h | 4 -- > > 2 files changed, 34 insertions(+), 38 deletions(-) > > > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > index 47d341919..eeadc1646 100644 > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > @@ -5,6 +5,8 @@ > > * Derived class of FrameGenerator for generating test patterns > > */ > > > > +#include <string.h> > > + > > #include "test_pattern_generator.h" > > > > #include <libcamera/base/log.h> > > @@ -13,6 +15,37 @@ > > > > #include <libyuv/convert_from_argb.h> > > > > +namespace { > > + > > +template<size_t SampleSize> > > +void rotateLeft1Sample(uint8_t *samples, size_t width) > > +{ > > + if (width <= 0) > > + return; > > + > > + const size_t stride = width * SampleSize; > > + uint8_t first[SampleSize]; > > + > > + memcpy(first, &samples[0], SampleSize); > > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) > > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); > > + memcpy(&samples[stride - SampleSize], first, SampleSize); > > +} > > + > > +template<size_t SampleSize> > > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) > > +{ > > + if (size.width < 2) > > + return; > > + > > + const size_t stride = size.width * SampleSize; > > + > > + for (size_t i = 0; i < size.height; i++, image += stride) > > + rotateLeft1Sample<SampleSize>(image, size.width); > > +} > > + > > +} > > + > > namespace libcamera { > > > > LOG_DECLARE_CATEGORY(Virtual) > > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, > > > > const auto &planes = mappedFrameBuffer.planes(); > > > > - shiftLeft(size); > > + rotateLeft1Column<kARGBSize>(size, template_.get()); > > > > /* Convert the template_ to the frame buffer */ > > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, > > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, > > return ret; > > } > > > > -void TestPatternGenerator::shiftLeft(const Size &size) > > -{ > > - /* Store the first column temporarily / > > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); > > - for (size_t h = 0; h < size.height; h++) { > > - unsigned int index = h * size.width * kARGBSize; > > - unsigned int index1 = h * kARGBSize; > > - firstColumn[index1] = template_[index]; > > - firstColumn[index1 + 1] = template_[index + 1]; > > - firstColumn[index1 + 2] = template_[index + 2]; > > - firstColumn[index1 + 3] = 0x00; > > - } > > - > > - / Overwrite template_ */ > > - uint8_t buf = template_.get(); > > - for (size_t h = 0; h < size.height; h++) { > > - for (size_t w = 0; w < size.width - 1; w++) { > > - / Overwrite with the pixel on the right */ > > - unsigned int index = (h * size.width + w + 1) * kARGBSize; > > - buf++ = template_[index]; / B */ > > - buf++ = template_[index + 1]; / G */ > > - buf++ = template_[index + 2]; / R */ > > - buf++ = 0x00; / A / > > - } > > - / Overwrite the new last column with the original first column */ > > - unsigned int index1 = h * kARGBSize; > > - buf++ = firstColumn[index1]; / B */ > > - buf++ = firstColumn[index1 + 1]; / G */ > > - buf++ = firstColumn[index1 + 2]; / R */ > > - buf++ = 0x00; / A / > > - } > > -} > > - > > void ColorBarsGenerator::configure(const Size &size) > > { > > constexpr uint8_t kColorBar[8][3] = { > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > index 05f4ab7a7..2a51bd31a 100644 > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > @@ -29,10 +29,6 @@ public: > > protected: > > / Buffer of test pattern template / > > std::unique_ptr<uint8_t[]> template_; > > - > > -private: > > - / Shift the buffer by 1 pixel left each frame */ > > - void shiftLeft(const Size &size); > > }; > > > > class ColorBarsGenerator : public TestPatternGenerator > > -- > > 2.47.1
Hi Barnabás, Thank you for the patch. On Wed, Dec 11, 2024 at 03:25:56PM +0000, Barnabás Pőcze wrote: > After the initial generation, when a frame is requested, > the test pattern generator rotates the image left by 1 column. > > The current approach has two shortcomings: > (1) it allocates a temporary buffer to hold one column; > (2) it swaps two columns at a time. > > The test patterns are simple ARGB images, in row-major order, > so doing (2) works against memory prefetching. This can be > addressed by doing the rotation one row at a time as that way > the image is addressed in a purely linear fashion. Doing so also > eliminates the need for a dynamically allocated temporary buffer, > as the required buffer now only needs to hold one sample, > which is 4 bytes in this case. > > In an optimized build, this results in about a 2x increase in the > number of frames per second as reported by `cam`. In an unoptimized, > ASAN and UBSAN intrumented build, the difference is even bigger, > which is useful for running lc-compliance in CI in a reasonable time. > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> > --- > .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- > .../pipeline/virtual/test_pattern_generator.h | 4 -- > 2 files changed, 34 insertions(+), 38 deletions(-) > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > index 47d341919..eeadc1646 100644 > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > @@ -5,6 +5,8 @@ > * Derived class of FrameGenerator for generating test patterns > */ > > +#include <string.h> > + > #include "test_pattern_generator.h" > > #include <libcamera/base/log.h> > @@ -13,6 +15,37 @@ > > #include <libyuv/convert_from_argb.h> > > +namespace { > + > +template<size_t SampleSize> > +void rotateLeft1Sample(uint8_t *samples, size_t width) > +{ > + if (width <= 0) width is unsigned, so it will never be smaller than 0. As the caller already ensures that the width is >= 2, I'd drop this check. > + return; > + > + const size_t stride = width * SampleSize; You could pass the stride value to this function, instead of recomputing it for each line. > + uint8_t first[SampleSize]; > + > + memcpy(first, &samples[0], SampleSize); > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); Have you considered replacing this with memmove(&samples[0], &samples[SampleSize], stride - SampleSize); ? This patch is a good improvement, so Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> with the above comments taken into account (regardless of whether the suggestions are accepted or rejected). I think the test pattern generator could do with a whole rewrite. I would compute the test pattern image at configure time, and store it in NV12 and in the output resolution, and handle the rotation when copying it to the frame buffer. That should significantly speed up operation. > + memcpy(&samples[stride - SampleSize], first, SampleSize); > +} > + > +template<size_t SampleSize> > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) > +{ > + if (size.width < 2) > + return; > + > + const size_t stride = size.width * SampleSize; > + > + for (size_t i = 0; i < size.height; i++, image += stride) > + rotateLeft1Sample<SampleSize>(image, size.width); > +} > + > +} > + > namespace libcamera { > > LOG_DECLARE_CATEGORY(Virtual) > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, > > const auto &planes = mappedFrameBuffer.planes(); > > - shiftLeft(size); > + rotateLeft1Column<kARGBSize>(size, template_.get()); > > /* Convert the template_ to the frame buffer */ > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, > return ret; > } > > -void TestPatternGenerator::shiftLeft(const Size &size) > -{ > - /* Store the first column temporarily */ > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); > - for (size_t h = 0; h < size.height; h++) { > - unsigned int index = h * size.width * kARGBSize; > - unsigned int index1 = h * kARGBSize; > - firstColumn[index1] = template_[index]; > - firstColumn[index1 + 1] = template_[index + 1]; > - firstColumn[index1 + 2] = template_[index + 2]; > - firstColumn[index1 + 3] = 0x00; > - } > - > - /* Overwrite template_ */ > - uint8_t *buf = template_.get(); > - for (size_t h = 0; h < size.height; h++) { > - for (size_t w = 0; w < size.width - 1; w++) { > - /* Overwrite with the pixel on the right */ > - unsigned int index = (h * size.width + w + 1) * kARGBSize; > - *buf++ = template_[index]; /* B */ > - *buf++ = template_[index + 1]; /* G */ > - *buf++ = template_[index + 2]; /* R */ > - *buf++ = 0x00; /* A */ > - } > - /* Overwrite the new last column with the original first column */ > - unsigned int index1 = h * kARGBSize; > - *buf++ = firstColumn[index1]; /* B */ > - *buf++ = firstColumn[index1 + 1]; /* G */ > - *buf++ = firstColumn[index1 + 2]; /* R */ > - *buf++ = 0x00; /* A */ > - } > -} > - > void ColorBarsGenerator::configure(const Size &size) > { > constexpr uint8_t kColorBar[8][3] = { > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h > index 05f4ab7a7..2a51bd31a 100644 > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h > @@ -29,10 +29,6 @@ public: > protected: > /* Buffer of test pattern template */ > std::unique_ptr<uint8_t[]> template_; > - > -private: > - /* Shift the buffer by 1 pixel left each frame */ > - void shiftLeft(const Size &size); > }; > > class ColorBarsGenerator : public TestPatternGenerator
CC'ing Harvey. On Tue, Dec 17, 2024 at 04:22:42PM +0000, Barnabás Pőcze wrote: > 2024. december 17., kedd 17:19 keltezéssel, Kieran Bingham írta: > > Quoting Barnabás Pőcze (2024-12-11 15:25:56) > > > > > After the initial generation, when a frame is requested, > > > the test pattern generator rotates the image left by 1 column. > > > > > > The current approach has two shortcomings: > > > (1) it allocates a temporary buffer to hold one column; > > > (2) it swaps two columns at a time. > > > > > > The test patterns are simple ARGB images, in row-major order, > > > so doing (2) works against memory prefetching. This can be > > > addressed by doing the rotation one row at a time as that way > > > the image is addressed in a purely linear fashion. Doing so also > > > eliminates the need for a dynamically allocated temporary buffer, > > > as the required buffer now only needs to hold one sample, > > > which is 4 bytes in this case. > > > > > > In an optimized build, this results in about a 2x increase in the > > > number of frames per second as reported by `cam`. In an unoptimized, > > > ASAN and UBSAN intrumented build, the difference is even bigger, > > > which is useful for running lc-compliance in CI in a reasonable time. > > > > > > Signed-off-by: Barnabás Pőcze pobrn@protonmail.com > > > > > > Hrm, rabbit holes a plenty trying to test this. > > > > First qcam isn't working on my main development PC as Qt6 does not > > supply a pkg-config file in the version available on Ubuntu 22.04 LTS. > > > > Next up, I'm running cam with the SDL output which works: > > > > ./build/gcc/src/apps/cam/cam -c2 -S -C > > > > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 0.000000 (0.00 fps) cam0-stream0 seq: 000000 bytesused: 0/0 > > 136838.194958 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > 136838.194951 (0.00 fps) cam0-stream0 seq: 031860 bytesused: 0/0 > > > > > > but the virtual pipeline handler is clearly not handling sequence > > numbers, timetamps, bytesused - or presumably any of the required > > metadata correctly. > > > > But none of that is a fault of your patch which is working so: > > I forgot to mention that I used this to test: > https://gitlab.freedesktop.org/pobrn/libcamera/-/commit/cd537fa855bb68c1faeab8fb5b94c4d423899b04 > > There is also a corresponding bug: > https://bugs.libcamera.org/show_bug.cgi?id=245 That's not great, it should be fixed sooner than later as I expect it will impact CI. Harvey, when do you think you could have a look at that ? Could you also create an account on bugs.libcamera.org, so that I can assign bugs for the virtual pipeline handler to you in the future ? > > Tested-by: Kieran Bingham kieran.bingham@ideasonboard.com > > > > Reviewed-by: Kieran Bingham kieran.bingham@ideasonboard.com > > > > > --- > > > .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- > > > .../pipeline/virtual/test_pattern_generator.h | 4 -- > > > 2 files changed, 34 insertions(+), 38 deletions(-) > > > > > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > > index 47d341919..eeadc1646 100644 > > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > > @@ -5,6 +5,8 @@ > > > * Derived class of FrameGenerator for generating test patterns > > > */ > > > > > > +#include <string.h> > > > + > > > #include "test_pattern_generator.h" > > > > > > #include <libcamera/base/log.h> > > > @@ -13,6 +15,37 @@ > > > > > > #include <libyuv/convert_from_argb.h> > > > > > > +namespace { > > > + > > > +template<size_t SampleSize> > > > +void rotateLeft1Sample(uint8_t *samples, size_t width) > > > +{ > > > + if (width <= 0) > > > + return; > > > + > > > + const size_t stride = width * SampleSize; > > > + uint8_t first[SampleSize]; > > > + > > > + memcpy(first, &samples[0], SampleSize); > > > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) > > > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); > > > + memcpy(&samples[stride - SampleSize], first, SampleSize); > > > +} > > > + > > > +template<size_t SampleSize> > > > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) > > > +{ > > > + if (size.width < 2) > > > + return; > > > + > > > + const size_t stride = size.width * SampleSize; > > > + > > > + for (size_t i = 0; i < size.height; i++, image += stride) > > > + rotateLeft1Sample<SampleSize>(image, size.width); > > > +} > > > + > > > +} > > > + > > > namespace libcamera { > > > > > > LOG_DECLARE_CATEGORY(Virtual) > > > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, > > > > > > const auto &planes = mappedFrameBuffer.planes(); > > > > > > - shiftLeft(size); > > > + rotateLeft1Column<kARGBSize>(size, template_.get()); > > > > > > /* Convert the template_ to the frame buffer */ > > > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, > > > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, > > > return ret; > > > } > > > > > > -void TestPatternGenerator::shiftLeft(const Size &size) > > > -{ > > > - /* Store the first column temporarily / > > > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); > > > - for (size_t h = 0; h < size.height; h++) { > > > - unsigned int index = h * size.width * kARGBSize; > > > - unsigned int index1 = h * kARGBSize; > > > - firstColumn[index1] = template_[index]; > > > - firstColumn[index1 + 1] = template_[index + 1]; > > > - firstColumn[index1 + 2] = template_[index + 2]; > > > - firstColumn[index1 + 3] = 0x00; > > > - } > > > - > > > - / Overwrite template_ */ > > > - uint8_t buf = template_.get(); > > > - for (size_t h = 0; h < size.height; h++) { > > > - for (size_t w = 0; w < size.width - 1; w++) { > > > - / Overwrite with the pixel on the right */ > > > - unsigned int index = (h * size.width + w + 1) * kARGBSize; > > > - buf++ = template_[index]; / B */ > > > - buf++ = template_[index + 1]; / G */ > > > - buf++ = template_[index + 2]; / R */ > > > - buf++ = 0x00; / A / > > > - } > > > - / Overwrite the new last column with the original first column */ > > > - unsigned int index1 = h * kARGBSize; > > > - buf++ = firstColumn[index1]; / B */ > > > - buf++ = firstColumn[index1 + 1]; / G */ > > > - buf++ = firstColumn[index1 + 2]; / R */ > > > - buf++ = 0x00; / A / > > > - } > > > -} > > > - > > > void ColorBarsGenerator::configure(const Size &size) > > > { > > > constexpr uint8_t kColorBar[8][3] = { > > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > > index 05f4ab7a7..2a51bd31a 100644 > > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h > > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > > @@ -29,10 +29,6 @@ public: > > > protected: > > > / Buffer of test pattern template / > > > std::unique_ptr<uint8_t[]> template_; > > > - > > > -private: > > > - / Shift the buffer by 1 pixel left each frame */ > > > - void shiftLeft(const Size &size); > > > }; > > > > > > class ColorBarsGenerator : public TestPatternGenerator
2024. december 17., kedd 17:39 keltezéssel, Laurent Pinchart <laurent.pinchart@ideasonboard.com> írta: > Hi Barnabás, > > Thank you for the patch. > > On Wed, Dec 11, 2024 at 03:25:56PM +0000, Barnabás Pőcze wrote: > > After the initial generation, when a frame is requested, > > the test pattern generator rotates the image left by 1 column. > > > > The current approach has two shortcomings: > > (1) it allocates a temporary buffer to hold one column; > > (2) it swaps two columns at a time. > > > > The test patterns are simple ARGB images, in row-major order, > > so doing (2) works against memory prefetching. This can be > > addressed by doing the rotation one row at a time as that way > > the image is addressed in a purely linear fashion. Doing so also > > eliminates the need for a dynamically allocated temporary buffer, > > as the required buffer now only needs to hold one sample, > > which is 4 bytes in this case. > > > > In an optimized build, this results in about a 2x increase in the > > number of frames per second as reported by `cam`. In an unoptimized, > > ASAN and UBSAN intrumented build, the difference is even bigger, > > which is useful for running lc-compliance in CI in a reasonable time. > > > > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> > > --- > > .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- > > .../pipeline/virtual/test_pattern_generator.h | 4 -- > > 2 files changed, 34 insertions(+), 38 deletions(-) > > > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > index 47d341919..eeadc1646 100644 > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp > > @@ -5,6 +5,8 @@ > > * Derived class of FrameGenerator for generating test patterns > > */ > > > > +#include <string.h> > > + > > #include "test_pattern_generator.h" > > > > #include <libcamera/base/log.h> > > @@ -13,6 +15,37 @@ > > > > #include <libyuv/convert_from_argb.h> > > > > +namespace { > > + > > +template<size_t SampleSize> > > +void rotateLeft1Sample(uint8_t *samples, size_t width) > > +{ > > + if (width <= 0) > > width is unsigned, so it will never be smaller than 0. As the caller > already ensures that the width is >= 2, I'd drop this check. > > > + return; > > + > > + const size_t stride = width * SampleSize; > > You could pass the stride value to this function, instead of recomputing > it for each line. > > > + uint8_t first[SampleSize]; > > + > > + memcpy(first, &samples[0], SampleSize); > > + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) > > + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); > > Have you considered replacing this with > > memmove(&samples[0], &samples[SampleSize], stride - SampleSize); > > ? No... but that's a good idea. I'll apply this in the next version. Regards, Barnabás Pőcze > > This patch is a good improvement, so > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > with the above comments taken into account (regardless of whether the > suggestions are accepted or rejected). > > I think the test pattern generator could do with a whole rewrite. I > would compute the test pattern image at configure time, and store it in > NV12 and in the output resolution, and handle the rotation when copying > it to the frame buffer. That should significantly speed up operation. > > > + memcpy(&samples[stride - SampleSize], first, SampleSize); > > +} > > + > > +template<size_t SampleSize> > > +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) > > +{ > > + if (size.width < 2) > > + return; > > + > > + const size_t stride = size.width * SampleSize; > > + > > + for (size_t i = 0; i < size.height; i++, image += stride) > > + rotateLeft1Sample<SampleSize>(image, size.width); > > +} > > + > > +} > > + > > namespace libcamera { > > > > LOG_DECLARE_CATEGORY(Virtual) > > @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, > > > > const auto &planes = mappedFrameBuffer.planes(); > > > > - shiftLeft(size); > > + rotateLeft1Column<kARGBSize>(size, template_.get()); > > > > /* Convert the template_ to the frame buffer */ > > int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, > > @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, > > return ret; > > } > > > > -void TestPatternGenerator::shiftLeft(const Size &size) > > -{ > > - /* Store the first column temporarily */ > > - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); > > - for (size_t h = 0; h < size.height; h++) { > > - unsigned int index = h * size.width * kARGBSize; > > - unsigned int index1 = h * kARGBSize; > > - firstColumn[index1] = template_[index]; > > - firstColumn[index1 + 1] = template_[index + 1]; > > - firstColumn[index1 + 2] = template_[index + 2]; > > - firstColumn[index1 + 3] = 0x00; > > - } > > - > > - /* Overwrite template_ */ > > - uint8_t *buf = template_.get(); > > - for (size_t h = 0; h < size.height; h++) { > > - for (size_t w = 0; w < size.width - 1; w++) { > > - /* Overwrite with the pixel on the right */ > > - unsigned int index = (h * size.width + w + 1) * kARGBSize; > > - *buf++ = template_[index]; /* B */ > > - *buf++ = template_[index + 1]; /* G */ > > - *buf++ = template_[index + 2]; /* R */ > > - *buf++ = 0x00; /* A */ > > - } > > - /* Overwrite the new last column with the original first column */ > > - unsigned int index1 = h * kARGBSize; > > - *buf++ = firstColumn[index1]; /* B */ > > - *buf++ = firstColumn[index1 + 1]; /* G */ > > - *buf++ = firstColumn[index1 + 2]; /* R */ > > - *buf++ = 0x00; /* A */ > > - } > > -} > > - > > void ColorBarsGenerator::configure(const Size &size) > > { > > constexpr uint8_t kColorBar[8][3] = { > > diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > index 05f4ab7a7..2a51bd31a 100644 > > --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h > > +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h > > @@ -29,10 +29,6 @@ public: > > protected: > > /* Buffer of test pattern template */ > > std::unique_ptr<uint8_t[]> template_; > > - > > -private: > > - /* Shift the buffer by 1 pixel left each frame */ > > - void shiftLeft(const Size &size); > > }; > > > > class ColorBarsGenerator : public TestPatternGenerator > > -- > Regards, > > Laurent Pinchart >
diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp index 47d341919..eeadc1646 100644 --- a/src/libcamera/pipeline/virtual/test_pattern_generator.cpp +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.cpp @@ -5,6 +5,8 @@ * Derived class of FrameGenerator for generating test patterns */ +#include <string.h> + #include "test_pattern_generator.h" #include <libcamera/base/log.h> @@ -13,6 +15,37 @@ #include <libyuv/convert_from_argb.h> +namespace { + +template<size_t SampleSize> +void rotateLeft1Sample(uint8_t *samples, size_t width) +{ + if (width <= 0) + return; + + const size_t stride = width * SampleSize; + uint8_t first[SampleSize]; + + memcpy(first, &samples[0], SampleSize); + for (size_t i = 0; i < stride - SampleSize; i += SampleSize) + memcpy(&samples[i], &samples[i + SampleSize], SampleSize); + memcpy(&samples[stride - SampleSize], first, SampleSize); +} + +template<size_t SampleSize> +void rotateLeft1Column(const libcamera::Size &size, uint8_t *image) +{ + if (size.width < 2) + return; + + const size_t stride = size.width * SampleSize; + + for (size_t i = 0; i < size.height; i++, image += stride) + rotateLeft1Sample<SampleSize>(image, size.width); +} + +} + namespace libcamera { LOG_DECLARE_CATEGORY(Virtual) @@ -27,7 +60,7 @@ int TestPatternGenerator::generateFrame(const Size &size, const auto &planes = mappedFrameBuffer.planes(); - shiftLeft(size); + rotateLeft1Column<kARGBSize>(size, template_.get()); /* Convert the template_ to the frame buffer */ int ret = libyuv::ARGBToNV12(template_.get(), size.width * kARGBSize, @@ -40,39 +73,6 @@ int TestPatternGenerator::generateFrame(const Size &size, return ret; } -void TestPatternGenerator::shiftLeft(const Size &size) -{ - /* Store the first column temporarily */ - auto firstColumn = std::make_unique<uint8_t[]>(size.height * kARGBSize); - for (size_t h = 0; h < size.height; h++) { - unsigned int index = h * size.width * kARGBSize; - unsigned int index1 = h * kARGBSize; - firstColumn[index1] = template_[index]; - firstColumn[index1 + 1] = template_[index + 1]; - firstColumn[index1 + 2] = template_[index + 2]; - firstColumn[index1 + 3] = 0x00; - } - - /* Overwrite template_ */ - uint8_t *buf = template_.get(); - for (size_t h = 0; h < size.height; h++) { - for (size_t w = 0; w < size.width - 1; w++) { - /* Overwrite with the pixel on the right */ - unsigned int index = (h * size.width + w + 1) * kARGBSize; - *buf++ = template_[index]; /* B */ - *buf++ = template_[index + 1]; /* G */ - *buf++ = template_[index + 2]; /* R */ - *buf++ = 0x00; /* A */ - } - /* Overwrite the new last column with the original first column */ - unsigned int index1 = h * kARGBSize; - *buf++ = firstColumn[index1]; /* B */ - *buf++ = firstColumn[index1 + 1]; /* G */ - *buf++ = firstColumn[index1 + 2]; /* R */ - *buf++ = 0x00; /* A */ - } -} - void ColorBarsGenerator::configure(const Size &size) { constexpr uint8_t kColorBar[8][3] = { diff --git a/src/libcamera/pipeline/virtual/test_pattern_generator.h b/src/libcamera/pipeline/virtual/test_pattern_generator.h index 05f4ab7a7..2a51bd31a 100644 --- a/src/libcamera/pipeline/virtual/test_pattern_generator.h +++ b/src/libcamera/pipeline/virtual/test_pattern_generator.h @@ -29,10 +29,6 @@ public: protected: /* Buffer of test pattern template */ std::unique_ptr<uint8_t[]> template_; - -private: - /* Shift the buffer by 1 pixel left each frame */ - void shiftLeft(const Size &size); }; class ColorBarsGenerator : public TestPatternGenerator
After the initial generation, when a frame is requested, the test pattern generator rotates the image left by 1 column. The current approach has two shortcomings: (1) it allocates a temporary buffer to hold one column; (2) it swaps two columns at a time. The test patterns are simple ARGB images, in row-major order, so doing (2) works against memory prefetching. This can be addressed by doing the rotation one row at a time as that way the image is addressed in a purely linear fashion. Doing so also eliminates the need for a dynamically allocated temporary buffer, as the required buffer now only needs to hold one sample, which is 4 bytes in this case. In an optimized build, this results in about a 2x increase in the number of frames per second as reported by `cam`. In an unoptimized, ASAN and UBSAN intrumented build, the difference is even bigger, which is useful for running lc-compliance in CI in a reasonable time. Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> --- .../virtual/test_pattern_generator.cpp | 68 +++++++++---------- .../pipeline/virtual/test_pattern_generator.h | 4 -- 2 files changed, 34 insertions(+), 38 deletions(-)