[libcamera-devel,v2,1/4] pipeline: rpi: Increase buffer import count to 32
diff mbox series

Message ID 20230725085540.24863-2-naush@raspberrypi.com
State Accepted
Headers show
Series
  • Raspberry Pi: External buffer handling
Related show

Commit Message

Naushir Patuck July 25, 2023, 8:55 a.m. UTC
Hardcode the maximum number of buffers imported to the V4L2 video device
to 32. This only has a minor disadvantage of over-allocating cache slots
and V4L2 buffer indexes, but does allow more headroom for using dma
buffers allocated from outside of libcamera.

Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
---
 .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
 1 file changed, 8 insertions(+), 25 deletions(-)

Comments

Jacopo Mondi July 27, 2023, 9:31 a.m. UTC | #1
Hi Naush

On Tue, Jul 25, 2023 at 09:55:37AM +0100, Naushir Patuck via libcamera-devel wrote:
> Hardcode the maximum number of buffers imported to the V4L2 video device
> to 32. This only has a minor disadvantage of over-allocating cache slots
> and V4L2 buffer indexes, but does allow more headroom for using dma
> buffers allocated from outside of libcamera.
>
> Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
> ---
>  .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
>
> diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> index c158843cb0ed..e9ad1e6f0848 100644
> --- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> +++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> @@ -95,34 +95,17 @@ int Stream::prepareBuffers(unsigned int count)
>  	int ret;
>
>  	if (!(flags_ & StreamFlag::ImportOnly)) {
> -		if (count) {
> -			/* Export some frame buffers for internal use. */
> -			ret = dev_->exportBuffers(count, &internalBuffers_);
> -			if (ret < 0)
> -				return ret;
> -
> -			/* Add these exported buffers to the internal/external buffer list. */
> -			setExportedBuffers(&internalBuffers_);
> -			resetBuffers();
> -		}
> +		/* Export some frame buffers for internal use. */
> +		ret = dev_->exportBuffers(count, &internalBuffers_);
> +		if (ret < 0)
> +			return ret;
>
> -		/* We must import all internal/external exported buffers. */
> -		count = bufferMap_.size();
> +		/* Add these exported buffers to the internal/external buffer list. */
> +		setExportedBuffers(&internalBuffers_);
> +		resetBuffers();
>  	}
>
> -	/*
> -	 * If this is an external stream, we must allocate slots for buffers that
> -	 * might be externally allocated. We have no indication of how many buffers
> -	 * may be used, so this might overallocate slots in the buffer cache.
> -	 * Similarly, if this stream is only importing buffers, we do the same.
> -	 *
> -	 * \todo Find a better heuristic, or, even better, an exact solution to
> -	 * this issue.
> -	 */
> -	if ((flags_ & StreamFlag::External) || (flags_ & StreamFlag::ImportOnly))
> -		count = count * 2;
> -
> -	return dev_->importBuffers(count);
> +	return dev_->importBuffers(32);

Might be good to define 32 somewhere as constexpr to avoid magic
numbers, but apart from this

Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>

Thanks
  j

>  }
>
>  int Stream::queueBuffer(FrameBuffer *buffer)
> --
> 2.34.1
>
Kieran Bingham Sept. 4, 2023, 8:44 a.m. UTC | #2
Quoting Jacopo Mondi via libcamera-devel (2023-07-27 10:31:28)
> Hi Naush
> 
> On Tue, Jul 25, 2023 at 09:55:37AM +0100, Naushir Patuck via libcamera-devel wrote:
> > Hardcode the maximum number of buffers imported to the V4L2 video device
> > to 32. This only has a minor disadvantage of over-allocating cache slots
> > and V4L2 buffer indexes, but does allow more headroom for using dma
> > buffers allocated from outside of libcamera.
> >
> > Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
> > ---
> >  .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
> >  1 file changed, 8 insertions(+), 25 deletions(-)
> >
> > diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > index c158843cb0ed..e9ad1e6f0848 100644
> > --- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > +++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > @@ -95,34 +95,17 @@ int Stream::prepareBuffers(unsigned int count)
> >       int ret;
> >
> >       if (!(flags_ & StreamFlag::ImportOnly)) {
> > -             if (count) {
> > -                     /* Export some frame buffers for internal use. */
> > -                     ret = dev_->exportBuffers(count, &internalBuffers_);
> > -                     if (ret < 0)
> > -                             return ret;
> > -
> > -                     /* Add these exported buffers to the internal/external buffer list. */
> > -                     setExportedBuffers(&internalBuffers_);
> > -                     resetBuffers();
> > -             }
> > +             /* Export some frame buffers for internal use. */
> > +             ret = dev_->exportBuffers(count, &internalBuffers_);
> > +             if (ret < 0)
> > +                     return ret;
> >
> > -             /* We must import all internal/external exported buffers. */
> > -             count = bufferMap_.size();
> > +             /* Add these exported buffers to the internal/external buffer list. */
> > +             setExportedBuffers(&internalBuffers_);
> > +             resetBuffers();
> >       }
> >
> > -     /*
> > -      * If this is an external stream, we must allocate slots for buffers that
> > -      * might be externally allocated. We have no indication of how many buffers
> > -      * may be used, so this might overallocate slots in the buffer cache.
> > -      * Similarly, if this stream is only importing buffers, we do the same.
> > -      *
> > -      * \todo Find a better heuristic, or, even better, an exact solution to
> > -      * this issue.
> > -      */
> > -     if ((flags_ & StreamFlag::External) || (flags_ & StreamFlag::ImportOnly))
> > -             count = count * 2;
> > -
> > -     return dev_->importBuffers(count);
> > +     return dev_->importBuffers(32);
> 
> Might be good to define 32 somewhere as constexpr to avoid magic
> numbers, but apart from this

Agreed here. Something like

/** The maximum buffer allocation possible with V4L2 */
constexpr maximumV4L2Buffers = 32;

to express that this is the maximum possible with V4L2....

But otherwise

Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

> 
> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> 
> Thanks
>   j
> 
> >  }
> >
> >  int Stream::queueBuffer(FrameBuffer *buffer)
> > --
> > 2.34.1
> >
Naushir Patuck Sept. 4, 2023, 9:04 a.m. UTC | #3
Hi Kieran,

Thanks for the review.

On Mon, 4 Sept 2023 at 09:44, Kieran Bingham <
kieran.bingham@ideasonboard.com> wrote:

> Quoting Jacopo Mondi via libcamera-devel (2023-07-27 10:31:28)
> > Hi Naush
> >
> > On Tue, Jul 25, 2023 at 09:55:37AM +0100, Naushir Patuck via
> libcamera-devel wrote:
> > > Hardcode the maximum number of buffers imported to the V4L2 video
> device
> > > to 32. This only has a minor disadvantage of over-allocating cache
> slots
> > > and V4L2 buffer indexes, but does allow more headroom for using dma
> > > buffers allocated from outside of libcamera.
> > >
> > > Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
> > > ---
> > >  .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
> > >  1 file changed, 8 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > index c158843cb0ed..e9ad1e6f0848 100644
> > > --- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > +++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > @@ -95,34 +95,17 @@ int Stream::prepareBuffers(unsigned int count)
> > >       int ret;
> > >
> > >       if (!(flags_ & StreamFlag::ImportOnly)) {
> > > -             if (count) {
> > > -                     /* Export some frame buffers for internal use. */
> > > -                     ret = dev_->exportBuffers(count,
> &internalBuffers_);
> > > -                     if (ret < 0)
> > > -                             return ret;
> > > -
> > > -                     /* Add these exported buffers to the
> internal/external buffer list. */
> > > -                     setExportedBuffers(&internalBuffers_);
> > > -                     resetBuffers();
> > > -             }
> > > +             /* Export some frame buffers for internal use. */
> > > +             ret = dev_->exportBuffers(count, &internalBuffers_);
> > > +             if (ret < 0)
> > > +                     return ret;
> > >
> > > -             /* We must import all internal/external exported
> buffers. */
> > > -             count = bufferMap_.size();
> > > +             /* Add these exported buffers to the internal/external
> buffer list. */
> > > +             setExportedBuffers(&internalBuffers_);
> > > +             resetBuffers();
> > >       }
> > >
> > > -     /*
> > > -      * If this is an external stream, we must allocate slots for
> buffers that
> > > -      * might be externally allocated. We have no indication of how
> many buffers
> > > -      * may be used, so this might overallocate slots in the buffer
> cache.
> > > -      * Similarly, if this stream is only importing buffers, we do
> the same.
> > > -      *
> > > -      * \todo Find a better heuristic, or, even better, an exact
> solution to
> > > -      * this issue.
> > > -      */
> > > -     if ((flags_ & StreamFlag::External) || (flags_ &
> StreamFlag::ImportOnly))
> > > -             count = count * 2;
> > > -
> > > -     return dev_->importBuffers(count);
> > > +     return dev_->importBuffers(32);
> >
> > Might be good to define 32 somewhere as constexpr to avoid magic
> > numbers, but apart from this
>
> Agreed here. Something like
>
> /** The maximum buffer allocation possible with V4L2 */
> constexpr maximumV4L2Buffers = 32;
>

Ack, I'll make this change for v3.

Naush


>
> to express that this is the maximum possible with V4L2....
>
> But otherwise
>
> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>
> >
> > Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> >
> > Thanks
> >   j
> >
> > >  }
> > >
> > >  int Stream::queueBuffer(FrameBuffer *buffer)
> > > --
> > > 2.34.1
> > >
>
Kieran Bingham Sept. 4, 2023, 11:23 a.m. UTC | #4
Quoting Naushir Patuck (2023-09-04 10:04:41)
> Hi Kieran,
> 
> Thanks for the review.
> 
> On Mon, 4 Sept 2023 at 09:44, Kieran Bingham <
> kieran.bingham@ideasonboard.com> wrote:
> 
> > Quoting Jacopo Mondi via libcamera-devel (2023-07-27 10:31:28)
> > > Hi Naush
> > >
> > > On Tue, Jul 25, 2023 at 09:55:37AM +0100, Naushir Patuck via
> > libcamera-devel wrote:
> > > > Hardcode the maximum number of buffers imported to the V4L2 video
> > device
> > > > to 32. This only has a minor disadvantage of over-allocating cache
> > slots
> > > > and V4L2 buffer indexes, but does allow more headroom for using dma
> > > > buffers allocated from outside of libcamera.
> > > >
> > > > Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
> > > > ---
> > > >  .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
> > > >  1 file changed, 8 insertions(+), 25 deletions(-)
> > > >
> > > > diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > > index c158843cb0ed..e9ad1e6f0848 100644
> > > > --- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > > +++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> > > > @@ -95,34 +95,17 @@ int Stream::prepareBuffers(unsigned int count)
> > > >       int ret;
> > > >
> > > >       if (!(flags_ & StreamFlag::ImportOnly)) {
> > > > -             if (count) {
> > > > -                     /* Export some frame buffers for internal use. */
> > > > -                     ret = dev_->exportBuffers(count,
> > &internalBuffers_);
> > > > -                     if (ret < 0)
> > > > -                             return ret;
> > > > -
> > > > -                     /* Add these exported buffers to the
> > internal/external buffer list. */
> > > > -                     setExportedBuffers(&internalBuffers_);
> > > > -                     resetBuffers();
> > > > -             }
> > > > +             /* Export some frame buffers for internal use. */
> > > > +             ret = dev_->exportBuffers(count, &internalBuffers_);
> > > > +             if (ret < 0)
> > > > +                     return ret;
> > > >
> > > > -             /* We must import all internal/external exported
> > buffers. */
> > > > -             count = bufferMap_.size();
> > > > +             /* Add these exported buffers to the internal/external
> > buffer list. */
> > > > +             setExportedBuffers(&internalBuffers_);
> > > > +             resetBuffers();
> > > >       }
> > > >
> > > > -     /*
> > > > -      * If this is an external stream, we must allocate slots for
> > buffers that
> > > > -      * might be externally allocated. We have no indication of how
> > many buffers
> > > > -      * may be used, so this might overallocate slots in the buffer
> > cache.
> > > > -      * Similarly, if this stream is only importing buffers, we do
> > the same.
> > > > -      *
> > > > -      * \todo Find a better heuristic, or, even better, an exact
> > solution to
> > > > -      * this issue.
> > > > -      */
> > > > -     if ((flags_ & StreamFlag::External) || (flags_ &
> > StreamFlag::ImportOnly))
> > > > -             count = count * 2;
> > > > -
> > > > -     return dev_->importBuffers(count);
> > > > +     return dev_->importBuffers(32);
> > >
> > > Might be good to define 32 somewhere as constexpr to avoid magic
> > > numbers, but apart from this
> >
> > Agreed here. Something like
> >
> > /** The maximum buffer allocation possible with V4L2 */
> > constexpr maximumV4L2Buffers = 32;
> >
> 
> Ack, I'll make this change for v3.

No point sending a v3 just for this change I don't think.

You could go direct to a PR with this fixed.

--
Kieran


> 
> Naush
> 
> 
> >
> > to express that this is the maximum possible with V4L2....
> >
> > But otherwise
> >
> > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> >
> > >
> > > Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> > >
> > > Thanks
> > >   j
> > >
> > > >  }
> > > >
> > > >  int Stream::queueBuffer(FrameBuffer *buffer)
> > > > --
> > > > 2.34.1
> > > >
> >
Laurent Pinchart Sept. 7, 2023, 3:29 p.m. UTC | #5
Hi Naush,

Thank you for the patch.

On Tue, Jul 25, 2023 at 09:55:37AM +0100, Naushir Patuck via libcamera-devel wrote:
> Hardcode the maximum number of buffers imported to the V4L2 video device
> to 32. This only has a minor disadvantage of over-allocating cache slots
> and V4L2 buffer indexes, but does allow more headroom for using dma
> buffers allocated from outside of libcamera.

I fear there's an additional disadvantage :-S

Dmabuf instances are imported implicitly in V4L2, when preparing a
buffer (either with VIDIOC_PREPARE_BUF or VIDIOC_QBUF). At that point,
the reference count to the buffer is increased. The reference is dropped
either when the V4L2 buffer is deleted (currently only possible with
VIDIOC_REQBUFS(0)), or when the dmabuf instance is evicted by importing
a differ dmabuf (again with VIDIOC_PREPARE_BUF or VIDIOC_QBUF) for the
same buffer slot.

The eviction mechanism makes it very expensive to cycle through a pool
of dmabuf buffers if you queue them in different buffer slots all the
time, so the V4L2VideoDevice class tries hard to reuse a buffer slot
that is already associated with the dmabuf instances when queuing a
FrameBuffer. The downside, however, is that a large number of buffer
slots means that dmabuf instances will be held on for longer before they
get evicted.

If an application preallocates a limited set of buffers and cycles
through them, it's likely all fine. But if it destroys existing buffers
and allocates new ones (not necessarily all the time, it could do so for
still images only for instance), then the buffers that are freed by the
application will still have references in the kernel, and their memory
won't actually be freed. You may then end run out of memory.

I think this is an issue with the V4L2 API and its implicit dmabuf
import, there's likely nothing we can do about it in userspace to really
fix the problem, but increasing the number of buffer slots
unconditionally can make things much worse.

> Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
> ---
>  .../pipeline/rpi/common/rpi_stream.cpp        | 33 +++++--------------
>  1 file changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> index c158843cb0ed..e9ad1e6f0848 100644
> --- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> +++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
> @@ -95,34 +95,17 @@ int Stream::prepareBuffers(unsigned int count)
>  	int ret;
>  
>  	if (!(flags_ & StreamFlag::ImportOnly)) {
> -		if (count) {
> -			/* Export some frame buffers for internal use. */
> -			ret = dev_->exportBuffers(count, &internalBuffers_);
> -			if (ret < 0)
> -				return ret;
> -
> -			/* Add these exported buffers to the internal/external buffer list. */
> -			setExportedBuffers(&internalBuffers_);
> -			resetBuffers();
> -		}
> +		/* Export some frame buffers for internal use. */
> +		ret = dev_->exportBuffers(count, &internalBuffers_);
> +		if (ret < 0)
> +			return ret;
>  
> -		/* We must import all internal/external exported buffers. */
> -		count = bufferMap_.size();
> +		/* Add these exported buffers to the internal/external buffer list. */
> +		setExportedBuffers(&internalBuffers_);
> +		resetBuffers();
>  	}
>  
> -	/*
> -	 * If this is an external stream, we must allocate slots for buffers that
> -	 * might be externally allocated. We have no indication of how many buffers
> -	 * may be used, so this might overallocate slots in the buffer cache.
> -	 * Similarly, if this stream is only importing buffers, we do the same.
> -	 *
> -	 * \todo Find a better heuristic, or, even better, an exact solution to
> -	 * this issue.
> -	 */
> -	if ((flags_ & StreamFlag::External) || (flags_ & StreamFlag::ImportOnly))
> -		count = count * 2;
> -
> -	return dev_->importBuffers(count);
> +	return dev_->importBuffers(32);
>  }
>  
>  int Stream::queueBuffer(FrameBuffer *buffer)

Patch
diff mbox series

diff --git a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
index c158843cb0ed..e9ad1e6f0848 100644
--- a/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
+++ b/src/libcamera/pipeline/rpi/common/rpi_stream.cpp
@@ -95,34 +95,17 @@  int Stream::prepareBuffers(unsigned int count)
 	int ret;
 
 	if (!(flags_ & StreamFlag::ImportOnly)) {
-		if (count) {
-			/* Export some frame buffers for internal use. */
-			ret = dev_->exportBuffers(count, &internalBuffers_);
-			if (ret < 0)
-				return ret;
-
-			/* Add these exported buffers to the internal/external buffer list. */
-			setExportedBuffers(&internalBuffers_);
-			resetBuffers();
-		}
+		/* Export some frame buffers for internal use. */
+		ret = dev_->exportBuffers(count, &internalBuffers_);
+		if (ret < 0)
+			return ret;
 
-		/* We must import all internal/external exported buffers. */
-		count = bufferMap_.size();
+		/* Add these exported buffers to the internal/external buffer list. */
+		setExportedBuffers(&internalBuffers_);
+		resetBuffers();
 	}
 
-	/*
-	 * If this is an external stream, we must allocate slots for buffers that
-	 * might be externally allocated. We have no indication of how many buffers
-	 * may be used, so this might overallocate slots in the buffer cache.
-	 * Similarly, if this stream is only importing buffers, we do the same.
-	 *
-	 * \todo Find a better heuristic, or, even better, an exact solution to
-	 * this issue.
-	 */
-	if ((flags_ & StreamFlag::External) || (flags_ & StreamFlag::ImportOnly))
-		count = count * 2;
-
-	return dev_->importBuffers(count);
+	return dev_->importBuffers(32);
 }
 
 int Stream::queueBuffer(FrameBuffer *buffer)