Message ID | 20220201092738.804028-2-naush@raspberrypi.com |
---|---|
State | Accepted |
Headers | show |
Series |
|
Related | show |
Hi Naush Thanks for this patch. Weird how we never noticed this problem before! On Tue, 1 Feb 2022 at 09:27, Naushir Patuck <naush@raspberrypi.com> wrote: > > The ISP input stream currently only allocates a single slot in the > V4L2VideoDevice cache as it follows the number of buffers allocated for use. > However, this is wrong as the ISP input stream imports buffers from Unicam > image stream. As a consequence of this, only one cache slot was used during > runtime for the ISP input stream, and if multiple buffers were to be queued > simultaneously, the queue operation would return a failure. > > Fix this by passing the same number of RAW buffers available from the Unicam > image stream. Additionally, double this count in the cases where buffers could > be allocated externally from the application. I agree this sounds like it wants a better solution, but is definitely beyond the scope of this (relatively small but important) fix. So: Reviewed-by: David Plowman <david.plowman@raspberrypi.com> Thanks! David > > Bug: https://github.com/raspberrypi/libcamera-apps/issues/236 > https://github.com/raspberrypi/libcamera-apps/issues/238 > https://github.com/raspberrypi/libcamera-apps/issues/240 > Signed-off-by: Naushir Patuck <naush@raspberrypi.com> > --- > src/libcamera/pipeline/raspberrypi/raspberrypi.cpp | 8 ++++++++ > src/libcamera/pipeline/raspberrypi/rpi_stream.cpp | 3 ++- > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > index 6a46e6bc89fa..49af56edc1f9 100644 > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > @@ -1382,6 +1382,14 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera) > * minimise frame drops. > */ > numBuffers = std::max<int>(2, minBuffers - numRawBuffers); > + } else if (stream == &data->isp_[Isp::Input]) { > + /* > + * ISP input buffers are imported from Unicam, so follow > + * similar logic as above to count all the RAW buffers > + * available. > + */ > + numBuffers = numRawBuffers + std::max<int>(2, minBuffers - numRawBuffers); > + > } else if (stream == &data->unicam_[Unicam::Embedded]) { > /* > * Embedded data buffers are (currently) for internal use, > diff --git a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > index a4159e20b068..a421ad09ba50 100644 > --- a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > +++ b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > @@ -111,11 +111,12 @@ int Stream::prepareBuffers(unsigned int count) > * If this is an external stream, we must allocate slots for buffers that > * might be externally allocated. We have no indication of how many buffers > * may be used, so this might overallocate slots in the buffer cache. > + * Similarly, if this stream is only importing buffers, we do the same. > * > * \todo Find a better heuristic, or, even better, an exact solution to > * this issue. > */ > - if (isExternal()) > + if (isExternal() || importOnly_) > count = count * 2; > > return dev_->importBuffers(count); > -- > 2.25.1 >
Quoting David Plowman (2022-02-01 09:47:09) > Hi Naush > > Thanks for this patch. Weird how we never noticed this problem before! > > On Tue, 1 Feb 2022 at 09:27, Naushir Patuck <naush@raspberrypi.com> wrote: > > > > The ISP input stream currently only allocates a single slot in the > > V4L2VideoDevice cache as it follows the number of buffers allocated for use. > > However, this is wrong as the ISP input stream imports buffers from Unicam > > image stream. As a consequence of this, only one cache slot was used during > > runtime for the ISP input stream, and if multiple buffers were to be queued > > simultaneously, the queue operation would return a failure. > > > > Fix this by passing the same number of RAW buffers available from the Unicam > > image stream. Additionally, double this count in the cases where buffers could > > be allocated externally from the application. > > I agree this sounds like it wants a better solution, but is definitely > beyond the scope of this (relatively small but important) fix. So: Did you ever dig further into what the maximum number of V4L2 buffers can be allocated are? Should we change the V4L2Device to always just allocate the maximum there instead? These are just 'structures' to be able to pass the buffer information right ? They're not the actual expensive memory backing? That would also then give V4L2BufferCache the best chance of supporting more external buffers? -- Kieran > Reviewed-by: David Plowman <david.plowman@raspberrypi.com> > > Thanks! > David > > > > > Bug: https://github.com/raspberrypi/libcamera-apps/issues/236 > > https://github.com/raspberrypi/libcamera-apps/issues/238 > > https://github.com/raspberrypi/libcamera-apps/issues/240 > > Signed-off-by: Naushir Patuck <naush@raspberrypi.com> > > --- > > src/libcamera/pipeline/raspberrypi/raspberrypi.cpp | 8 ++++++++ > > src/libcamera/pipeline/raspberrypi/rpi_stream.cpp | 3 ++- > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > index 6a46e6bc89fa..49af56edc1f9 100644 > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > @@ -1382,6 +1382,14 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera) > > * minimise frame drops. > > */ > > numBuffers = std::max<int>(2, minBuffers - numRawBuffers); > > + } else if (stream == &data->isp_[Isp::Input]) { > > + /* > > + * ISP input buffers are imported from Unicam, so follow > > + * similar logic as above to count all the RAW buffers > > + * available. > > + */ > > + numBuffers = numRawBuffers + std::max<int>(2, minBuffers - numRawBuffers); > > + > > } else if (stream == &data->unicam_[Unicam::Embedded]) { > > /* > > * Embedded data buffers are (currently) for internal use, > > diff --git a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > index a4159e20b068..a421ad09ba50 100644 > > --- a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > +++ b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > @@ -111,11 +111,12 @@ int Stream::prepareBuffers(unsigned int count) > > * If this is an external stream, we must allocate slots for buffers that > > * might be externally allocated. We have no indication of how many buffers > > * may be used, so this might overallocate slots in the buffer cache. > > + * Similarly, if this stream is only importing buffers, we do the same. > > * > > * \todo Find a better heuristic, or, even better, an exact solution to > > * this issue. > > */ > > - if (isExternal()) > > + if (isExternal() || importOnly_) > > count = count * 2; > > > > return dev_->importBuffers(count); > > -- > > 2.25.1 > >
Hello, On Tue, Feb 01, 2022 at 10:18:23AM +0000, Kieran Bingham wrote: > Quoting David Plowman (2022-02-01 09:47:09) > > Hi Naush > > > > Thanks for this patch. Weird how we never noticed this problem before! > > > > On Tue, 1 Feb 2022 at 09:27, Naushir Patuck wrote: > > > > > > The ISP input stream currently only allocates a single slot in the > > > V4L2VideoDevice cache as it follows the number of buffers allocated for use. > > > However, this is wrong as the ISP input stream imports buffers from Unicam > > > image stream. As a consequence of this, only one cache slot was used during > > > runtime for the ISP input stream, and if multiple buffers were to be queued > > > simultaneously, the queue operation would return a failure. > > > > > > Fix this by passing the same number of RAW buffers available from the Unicam > > > image stream. Additionally, double this count in the cases where buffers could > > > be allocated externally from the application. > > > > I agree this sounds like it wants a better solution, but is definitely > > beyond the scope of this (relatively small but important) fix. So: > > Did you ever dig further into what the maximum number of V4L2 buffers > can be allocated are? > > Should we change the V4L2Device to always just allocate the maximum > there instead? > > These are just 'structures' to be able to pass the buffer information > right ? They're not the actual expensive memory backing? When importing, that's correct, but note that V4L2 doesn't provide an "unprepare" buffer operation. It means that once you queue a dmabuf fd to a given V4L2 buffer, the dmabuf will be mapped to the device and the CPU until a new dmabuf comes to replace it for the same V4L2 buffer, or until the V4L2 buffer is freed (which can only be done with VIDIOC_REQBUFS(0) at the moment). It also means that a reference to the dmabuf will be kept, which will prevent the memory from being released. If an application queues different buffers from time to time, with a large number of V4L2 buffer, it may take a long time before stale cache entries are being pushed out (if ever). I'd thus be quite cautious about allocating too many V4L2 buffers. This seems a difficult to solve problem, and it stems from the fact that we don't have explicit release calls in V4L2. We may want to address this in the libcamera public API nonetheless at some point, with hints that applications can pass, for instance to tell if a buffer will be submitted again or not. V4L2 extensions will then likely be required. > That would also then give V4L2BufferCache the best chance of supporting > more external buffers? > > > Reviewed-by: David Plowman <david.plowman@raspberrypi.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > > > > > Bug: https://github.com/raspberrypi/libcamera-apps/issues/236 > > > https://github.com/raspberrypi/libcamera-apps/issues/238 > > > https://github.com/raspberrypi/libcamera-apps/issues/240 > > > Signed-off-by: Naushir Patuck <naush@raspberrypi.com> > > > --- > > > src/libcamera/pipeline/raspberrypi/raspberrypi.cpp | 8 ++++++++ > > > src/libcamera/pipeline/raspberrypi/rpi_stream.cpp | 3 ++- > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > index 6a46e6bc89fa..49af56edc1f9 100644 > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > @@ -1382,6 +1382,14 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera) > > > * minimise frame drops. > > > */ > > > numBuffers = std::max<int>(2, minBuffers - numRawBuffers); > > > + } else if (stream == &data->isp_[Isp::Input]) { > > > + /* > > > + * ISP input buffers are imported from Unicam, so follow > > > + * similar logic as above to count all the RAW buffers > > > + * available. > > > + */ > > > + numBuffers = numRawBuffers + std::max<int>(2, minBuffers - numRawBuffers); > > > + > > > } else if (stream == &data->unicam_[Unicam::Embedded]) { > > > /* > > > * Embedded data buffers are (currently) for internal use, > > > diff --git a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > index a4159e20b068..a421ad09ba50 100644 > > > --- a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > +++ b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > @@ -111,11 +111,12 @@ int Stream::prepareBuffers(unsigned int count) > > > * If this is an external stream, we must allocate slots for buffers that > > > * might be externally allocated. We have no indication of how many buffers > > > * may be used, so this might overallocate slots in the buffer cache. > > > + * Similarly, if this stream is only importing buffers, we do the same. > > > * > > > * \todo Find a better heuristic, or, even better, an exact solution to > > > * this issue. > > > */ > > > - if (isExternal()) > > > + if (isExternal() || importOnly_) > > > count = count * 2; > > > > > > return dev_->importBuffers(count);
Hi Kieran and Laurent, On Tue, 1 Feb 2022 at 10:35, Laurent Pinchart < laurent.pinchart@ideasonboard.com> wrote: > Hello, > > On Tue, Feb 01, 2022 at 10:18:23AM +0000, Kieran Bingham wrote: > > Quoting David Plowman (2022-02-01 09:47:09) > > > Hi Naush > > > > > > Thanks for this patch. Weird how we never noticed this problem before! > > > > > > On Tue, 1 Feb 2022 at 09:27, Naushir Patuck wrote: > > > > > > > > The ISP input stream currently only allocates a single slot in the > > > > V4L2VideoDevice cache as it follows the number of buffers allocated > for use. > > > > However, this is wrong as the ISP input stream imports buffers from > Unicam > > > > image stream. As a consequence of this, only one cache slot was > used during > > > > runtime for the ISP input stream, and if multiple buffers were to be > queued > > > > simultaneously, the queue operation would return a failure. > > > > > > > > Fix this by passing the same number of RAW buffers available from > the Unicam > > > > image stream. Additionally, double this count in the cases where > buffers could > > > > be allocated externally from the application. > > > > > > I agree this sounds like it wants a better solution, but is definitely > > > beyond the scope of this (relatively small but important) fix. So: > > > > Did you ever dig further into what the maximum number of V4L2 buffers > > can be allocated are? > > > > Should we change the V4L2Device to always just allocate the maximum > > there instead? > > > > These are just 'structures' to be able to pass the buffer information > > right ? They're not the actual expensive memory backing? > > When importing, that's correct, but note that V4L2 doesn't provide an > "unprepare" buffer operation. It means that once you queue a dmabuf fd > to a given V4L2 buffer, the dmabuf will be mapped to the device and the > CPU until a new dmabuf comes to replace it for the same V4L2 buffer, or > until the V4L2 buffer is freed (which can only be done with > VIDIOC_REQBUFS(0) at the moment). It also means that a reference to the > dmabuf will be kept, which will prevent the memory from being released. > If an application queues different buffers from time to time, with a > large number of V4L2 buffer, it may take a long time before stale cache > entries are being pushed out (if ever). I'd thus be quite cautious about > allocating too many V4L2 buffers. > > This seems a difficult to solve problem, and it stems from the fact that > we don't have explicit release calls in V4L2. We may want to address > this in the libcamera public API nonetheless at some point, with hints > that applications can pass, for instance to tell if a buffer will be > submitted again or not. V4L2 extensions will then likely be required. > I was going to suggest that we setup the cache to use VB2_MAX_FRAME (32) entries, but given the above explanation this may not be the right thing to do to fix this. Naush > > > That would also then give V4L2BufferCache the best chance of supporting > > more external buffers? > > > > > Reviewed-by: David Plowman <david.plowman@raspberrypi.com> > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > > > > > > > > Bug: https://github.com/raspberrypi/libcamera-apps/issues/236 > > > > https://github.com/raspberrypi/libcamera-apps/issues/238 > > > > https://github.com/raspberrypi/libcamera-apps/issues/240 > > > > Signed-off-by: Naushir Patuck <naush@raspberrypi.com> > > > > --- > > > > src/libcamera/pipeline/raspberrypi/raspberrypi.cpp | 8 ++++++++ > > > > src/libcamera/pipeline/raspberrypi/rpi_stream.cpp | 3 ++- > > > > 2 files changed, 10 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > > index 6a46e6bc89fa..49af56edc1f9 100644 > > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp > > > > @@ -1382,6 +1382,14 @@ int PipelineHandlerRPi::prepareBuffers(Camera > *camera) > > > > * minimise frame drops. > > > > */ > > > > numBuffers = std::max<int>(2, minBuffers - > numRawBuffers); > > > > + } else if (stream == &data->isp_[Isp::Input]) { > > > > + /* > > > > + * ISP input buffers are imported from > Unicam, so follow > > > > + * similar logic as above to count all the > RAW buffers > > > > + * available. > > > > + */ > > > > + numBuffers = numRawBuffers + > std::max<int>(2, minBuffers - numRawBuffers); > > > > + > > > > } else if (stream == > &data->unicam_[Unicam::Embedded]) { > > > > /* > > > > * Embedded data buffers are (currently) for > internal use, > > > > diff --git a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > > index a4159e20b068..a421ad09ba50 100644 > > > > --- a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > > +++ b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp > > > > @@ -111,11 +111,12 @@ int Stream::prepareBuffers(unsigned int count) > > > > * If this is an external stream, we must allocate slots for > buffers that > > > > * might be externally allocated. We have no indication of > how many buffers > > > > * may be used, so this might overallocate slots in the > buffer cache. > > > > + * Similarly, if this stream is only importing buffers, we > do the same. > > > > * > > > > * \todo Find a better heuristic, or, even better, an exact > solution to > > > > * this issue. > > > > */ > > > > - if (isExternal()) > > > > + if (isExternal() || importOnly_) > > > > count = count * 2; > > > > > > > > return dev_->importBuffers(count); > > -- > Regards, > > Laurent Pinchart >
diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp index 6a46e6bc89fa..49af56edc1f9 100644 --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp @@ -1382,6 +1382,14 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera) * minimise frame drops. */ numBuffers = std::max<int>(2, minBuffers - numRawBuffers); + } else if (stream == &data->isp_[Isp::Input]) { + /* + * ISP input buffers are imported from Unicam, so follow + * similar logic as above to count all the RAW buffers + * available. + */ + numBuffers = numRawBuffers + std::max<int>(2, minBuffers - numRawBuffers); + } else if (stream == &data->unicam_[Unicam::Embedded]) { /* * Embedded data buffers are (currently) for internal use, diff --git a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp index a4159e20b068..a421ad09ba50 100644 --- a/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp +++ b/src/libcamera/pipeline/raspberrypi/rpi_stream.cpp @@ -111,11 +111,12 @@ int Stream::prepareBuffers(unsigned int count) * If this is an external stream, we must allocate slots for buffers that * might be externally allocated. We have no indication of how many buffers * may be used, so this might overallocate slots in the buffer cache. + * Similarly, if this stream is only importing buffers, we do the same. * * \todo Find a better heuristic, or, even better, an exact solution to * this issue. */ - if (isExternal()) + if (isExternal() || importOnly_) count = count * 2; return dev_->importBuffers(count);
The ISP input stream currently only allocates a single slot in the V4L2VideoDevice cache as it follows the number of buffers allocated for use. However, this is wrong as the ISP input stream imports buffers from Unicam image stream. As a consequence of this, only one cache slot was used during runtime for the ISP input stream, and if multiple buffers were to be queued simultaneously, the queue operation would return a failure. Fix this by passing the same number of RAW buffers available from the Unicam image stream. Additionally, double this count in the cases where buffers could be allocated externally from the application. Bug: https://github.com/raspberrypi/libcamera-apps/issues/236 https://github.com/raspberrypi/libcamera-apps/issues/238 https://github.com/raspberrypi/libcamera-apps/issues/240 Signed-off-by: Naushir Patuck <naush@raspberrypi.com> --- src/libcamera/pipeline/raspberrypi/raspberrypi.cpp | 8 ++++++++ src/libcamera/pipeline/raspberrypi/rpi_stream.cpp | 3 ++- 2 files changed, 10 insertions(+), 1 deletion(-)