[libcamera-devel,18/30] libcamera: v4l2_videodevice: Add V4L2BufferCache to deal with index mapping

Message ID 20191126233620.1695316-19-niklas.soderlund@ragnatech.se
State Superseded
Headers show
Series
  • libcamera: Rework buffer API
Related show

Commit Message

Niklas Söderlund Nov. 26, 2019, 11:36 p.m. UTC
In preparation for the FrameBuffer interface add a class which will deal
with keeping the cache between dmafds and V4L2 video device buffer
indexes.

This initial implement ensures that no hot association is lost while its
eviction strategy could be improved in the future.

Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
---
 src/libcamera/include/v4l2_videodevice.h |  20 ++++-
 src/libcamera/v4l2_videodevice.cpp       | 105 ++++++++++++++++++++++-
 2 files changed, 123 insertions(+), 2 deletions(-)

Comments

Jacopo Mondi Dec. 2, 2019, 10:45 a.m. UTC | #1
Hi Niklas,

On Wed, Nov 27, 2019 at 12:36:08AM +0100, Niklas Söderlund wrote:
> In preparation for the FrameBuffer interface add a class which will deal
s/FrameBuffer interface/switch to use the FrameBuffer interface/ ?

> with keeping the cache between dmafds and V4L2 video device buffer
> indexes.
>
> This initial implement ensures that no hot association is lost while its
> eviction strategy could be improved in the future.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> ---
>  src/libcamera/include/v4l2_videodevice.h |  20 ++++-
>  src/libcamera/v4l2_videodevice.cpp       | 105 ++++++++++++++++++++++-
>  2 files changed, 123 insertions(+), 2 deletions(-)
>
> diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
> index 34bbff41760753bd..254f8797af42dd8a 100644
> --- a/src/libcamera/include/v4l2_videodevice.h
> +++ b/src/libcamera/include/v4l2_videodevice.h
> @@ -12,6 +12,7 @@
>
>  #include <linux/videodev2.h>
>
> +#include <libcamera/buffer.h>
>  #include <libcamera/geometry.h>
>  #include <libcamera/pixelformats.h>
>  #include <libcamera/signal.h>
> @@ -22,7 +23,6 @@
>
>  namespace libcamera {
>
> -class Buffer;

Is this and the removal of the buffer.h inclusion in
v4l2_videodevice.cpp related with the introduction of this class ?

>  class BufferMemory;
>  class BufferPool;
>  class EventNotifier;
> @@ -105,6 +105,24 @@ struct V4L2Capability final : v4l2_capability {
>  	}
>  };
>
> +class V4L2BufferCache
> +{
> +public:
> +	V4L2BufferCache(unsigned int size);
> +	V4L2BufferCache(const std::vector<FrameBuffer *> buffers);
> +
> +	int fetch(const FrameBuffer *buffer);

Not sure what's best, as I've not seen this in use, but what about a
const FrameBuffer & ?

> +	void put(unsigned int index);
> +
> +private:
> +	struct CacheInfo {

CacheEntry ?

> +		bool free;
> +		std::vector<FrameBuffer::Plane> last;
> +	};
> +
> +	std::vector<CacheInfo> cache_;
> +};
> +
>  class V4L2DeviceFormat
>  {
>  public:
> diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
> index a05dd6a1f7d86eaa..c82f2829601bd14c 100644
> --- a/src/libcamera/v4l2_videodevice.cpp
> +++ b/src/libcamera/v4l2_videodevice.cpp
> @@ -19,7 +19,6 @@
>
>  #include <linux/drm_fourcc.h>
>
> -#include <libcamera/buffer.h>
>  #include <libcamera/event_notifier.h>
>
>  #include "log.h"
> @@ -134,6 +133,110 @@ LOG_DECLARE_CATEGORY(V4L2)
>   * \return True if the video device provides Streaming I/O IOCTLs
>   */
>
> +/**
> + * \class V4L2BufferCache
> + * \brief Hot cache of associations between V4L2 index and FrameBuffer

V4L2 buffer indexes ?

> + *
> + * There is performance to be gained if the same V4L2 buffer index can be
> + * reused for the same FrameBuffer object as the kernel don't have to redo

the kernel -doesn't-

> + * the mapping. The V4L2BufferCache tries to keep a hot-cache of mappings
> + * between the two.
> + *
> + * If there is a cache miss is not critical, everything still works as expected.
> + */
> +
> +/**
> + * \brief Create a empty cache of a given size

an empty
of a give \a size

> + * \param[in] size Size of cache to create

Number of entries to reserve in the cache ?

> + *
> + * Create a cold cache with \a size entries. The cache will be populated as
> + * it's being used.

not sure, but this sounds better to me as
"as it is used"

> + */
> +V4L2BufferCache::V4L2BufferCache(unsigned int size)
> +{
> +	cache_.resize(size, { .free = true, .last = {} });
> +}
> +
> +/**
> + * \brief Create a pre-populated cache
> + * \param[in] buffers Array of buffers to pre-populated with
> + *
> + * Create a warm cache from \a buffers.
> + */
> +V4L2BufferCache::V4L2BufferCache(const std::vector<FrameBuffer *> buffers)
> +{
> +	for (const FrameBuffer *buffer : buffers)
> +		cache_.push_back({ .free = true, .last = buffer->planes() });
> +}

Here and in all over this patch, are you copying vectors around ?

> +
> +/**
> + * \brief Fetch a index from the cache

an index

I would:
"Retrieve the V4L2 buffer index associated with \a buffer"

> + * \param[in] buffer FrameBuffer to match

The FrameBuffer

> + *
> + * Try to find \a buffer in cache and if it's free reuse the last used index
> + * for this buffer. If the buffer have never been seen or if have been evinced

the buffer -has-

> + * from the cache the first free index is pieced instead. Likewise if the last

not sure about what pieced means.

> + * used index is in use a new free index is picked.

I would just
"If no index is associated with \a buffer in the cache, or the index
has been re-assigned, use the first available one, if any"

> + *
> + * When an index is picked it is marked as in-use and returned to the caller.
> + * The association is also recorded so it if possible can reused the next time
> + * the FrameBuffer is seen.

That's the purpose of a cache :)

> + *
> + * \return V4L2 buffer index

or -ENOENT

> + */
> +int V4L2BufferCache::fetch(const FrameBuffer *buffer)
> +{
> +	int use = -1;
> +
> +	for (unsigned int index = 0; index < cache_.size(); index++) {
> +		if (!cache_[index].free)
> +			continue;
> +
> +		if (use < 0)
> +			use = index;

I wonder if we shouldn't keep the V4L2 buffer index in the CacheInfo
instead of using the CacheInfo entry position in the vector.

> +
> +		/* Try to find a cache hit by comparing the planes. */
> +		std::vector<FrameBuffer::Plane> planes = buffer->planes();

Are you copying the vector ?

> +		if (cache_[index].last.size() != planes.size())
> +			continue;
> +
> +		bool match = true;
> +		for (unsigned int i = 0; i < planes.size(); i++) {
> +			if (cache_[index].last[i].fd != planes[i].fd ||
> +			    cache_[index].last[i].length != planes[i].length) {
> +				match = false;
> +				break;
> +			}

I would here just use the address of the planes() vector as the key
instead of comparing it's content. Or is there a risk or relocation?

> +		}
> +
> +		if (!match)
> +			continue;
> +
> +		use = index;
> +		break;
> +	}
> +
> +	if (use < 0)
> +		return -ENOENT;
> +
> +	cache_[use].free = false;
> +	cache_[use].last = buffer->planes();

This seems to be a copy as well

> +
> +	return use;
> +}
> +
> +/**
> + * \brief Pit a V4L2 index back in the cache

s/Pit/Put

I would:
"Make the V4L2 buffer \a idex available again in the cache"


> + * \param[in] index V4L2 index

The V4L2 buffer index

> + *
> + * Mark the \a index as free in the cache so it can be reused.

the v4l2 buffer \a index

> + */
> +void V4L2BufferCache::put(unsigned int index)

put would suggest a get() counterpart ? Or fetch/put is a good enough
operation names pair ?

> +{
> +	ASSERT(index < cache_.size());
> +	cache_[index].free = true;

If you keep the v4l2 index in the CacheInfo class as I've suggested
you would here pay the price of iterating through the vector. How
often is a v4l2 buffer index given back to the cache ?

Thanks
   j

> +}
> +
>  /**
>   * \class V4L2DeviceFormat
>   * \brief The V4L2 video device image format and sizes
> --
> 2.24.0
>
> _______________________________________________
> libcamera-devel mailing list
> libcamera-devel@lists.libcamera.org
> https://lists.libcamera.org/listinfo/libcamera-devel
Laurent Pinchart Dec. 15, 2019, 1:16 a.m. UTC | #2
Hi Niklas,

Thank you for the patch.

On Mon, Dec 02, 2019 at 11:45:29AM +0100, Jacopo Mondi wrote:
> On Wed, Nov 27, 2019 at 12:36:08AM +0100, Niklas Söderlund wrote:
> > In preparation for the FrameBuffer interface add a class which will deal
> s/FrameBuffer interface/switch to use the FrameBuffer interface/ ?
> 
> > with keeping the cache between dmafds and V4L2 video device buffer
> > indexes.
> >
> > This initial implement ensures that no hot association is lost while its
> > eviction strategy could be improved in the future.
> >
> > Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> > ---
> >  src/libcamera/include/v4l2_videodevice.h |  20 ++++-
> >  src/libcamera/v4l2_videodevice.cpp       | 105 ++++++++++++++++++++++-
> >  2 files changed, 123 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
> > index 34bbff41760753bd..254f8797af42dd8a 100644
> > --- a/src/libcamera/include/v4l2_videodevice.h
> > +++ b/src/libcamera/include/v4l2_videodevice.h
> > @@ -12,6 +12,7 @@
> >
> >  #include <linux/videodev2.h>
> >
> > +#include <libcamera/buffer.h>
> >  #include <libcamera/geometry.h>
> >  #include <libcamera/pixelformats.h>
> >  #include <libcamera/signal.h>
> > @@ -22,7 +23,6 @@
> >
> >  namespace libcamera {
> >
> > -class Buffer;
> 
> Is this and the removal of the buffer.h inclusion in
> v4l2_videodevice.cpp related with the introduction of this class ?

The removal of the Buffer forward declaration is related to the
inclusion of buffer.h, which is required for the
std::vector<FrameBuffer::Plane> below.

> >  class BufferMemory;
> >  class BufferPool;

The BufferMemory and BufferPool forward declarations should be removed
too.

> >  class EventNotifier;
> > @@ -105,6 +105,24 @@ struct V4L2Capability final : v4l2_capability {
> >  	}
> >  };
> >
> > +class V4L2BufferCache
> > +{
> > +public:
> > +	V4L2BufferCache(unsigned int size);
> > +	V4L2BufferCache(const std::vector<FrameBuffer *> buffers);
> > +
> > +	int fetch(const FrameBuffer *buffer);
> 
> Not sure what's best, as I've not seen this in use, but what about a
> const FrameBuffer & ?

As buffer can't be NULL, I would indeed use a const reference.
> 
> > +	void put(unsigned int index);

I think you also need a reset() method to be called at stream stop time
for imported buffers. It should set the free flag for all entries.

> > +
> > +private:
> > +	struct CacheInfo {
> 
> CacheEntry ?

Sounds better to me, or just Entry.

> > +		bool free;
> > +		std::vector<FrameBuffer::Plane> last;
> > +	};
> > +
> > +	std::vector<CacheInfo> cache_;
> > +};
> > +
> >  class V4L2DeviceFormat
> >  {
> >  public:
> > diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
> > index a05dd6a1f7d86eaa..c82f2829601bd14c 100644
> > --- a/src/libcamera/v4l2_videodevice.cpp
> > +++ b/src/libcamera/v4l2_videodevice.cpp
> > @@ -19,7 +19,6 @@
> >
> >  #include <linux/drm_fourcc.h>
> >
> > -#include <libcamera/buffer.h>
> >  #include <libcamera/event_notifier.h>
> >
> >  #include "log.h"
> > @@ -134,6 +133,110 @@ LOG_DECLARE_CATEGORY(V4L2)
> >   * \return True if the video device provides Streaming I/O IOCTLs
> >   */
> >
> > +/**
> > + * \class V4L2BufferCache
> > + * \brief Hot cache of associations between V4L2 index and FrameBuffer
> 
> V4L2 buffer indexes ?
> 
> > + *
> > + * There is performance to be gained if the same V4L2 buffer index can be
> > + * reused for the same FrameBuffer object as the kernel don't have to redo
> 
> the kernel -doesn't-
> 
> > + * the mapping. The V4L2BufferCache tries to keep a hot-cache of mappings
> > + * between the two.
> > + *
> > + * If there is a cache miss is not critical, everything still works as expected.

"A performance to be gained" is a bit of an understatement, as trashing
this cash would just make the camera unusable in many cases :-) And I
think we should explain that this related to dmabuf mappings, it's not
apparent from the text. How about the following ?

 * When importing buffers, V4L2 performs lazy mapping of dmabuf instances at
 * VIDIOC_QBUF (or VIDIOC_PREPARE_BUF) time and keeps the mapping associated
 * with the V4L2 buffer, as identified by its index. If the same V4L2 buffer is
 * then reused and queued with different dmabufs, the old dmabufs will be
 * unmapped and the new ones mapped. To keep this process efficient, it is
 * crucial to consistently use the same V4L2 buffer for given dmabufs through
 * the whole duration of a capture cycle.
 *
 * The V4L2BufferCache class keeps a map of previous dmabufs to V4L2 buffer
 * index associations to help selecting V4L2 buffers. It tracks, for every
 * entry, if the V4L2 buffer is in use, and offers lookup of the best free V4L2
 * buffer for a set of dmabufs.

And now that I've written this, it occurs to me that this isn't really a
cache in the traditional sense of storing data so that future requests
for that data will be served faster :-) I can't really think of a better
name for the class though, as V4L2BufferMap would likely be a bit more
accurate, but less descriptive.

> > + */
> > +
> > +/**
> > + * \brief Create a empty cache of a given size
> 
> an empty
> of a give \a size

s/given/give/ or just "with \a size entries"

> 
> > + * \param[in] size Size of cache to create
> 
> Number of entries to reserve in the cache ?

Seems better to me. The parameter could be named numEntries to make the
API more explicit.

> > + *
> > + * Create a cold cache with \a size entries. The cache will be populated as
> > + * it's being used.
> 
> not sure, but this sounds better to me as
> "as it is used"

Agreed. I would also say "empty" instead of "cold". Or possibly better,

 * Create a cache with \a size entries all marked as unused. The entries will be
 * populated as the cache is used. This is typically used to implement buffer
 * import, with buffers added to the cache as they are queued.

> > + */
> > +V4L2BufferCache::V4L2BufferCache(unsigned int size)
> > +{
> > +	cache_.resize(size, { .free = true, .last = {} });

If you replaced .free with .used (or .inUse or a similar name), this
could become

	cache_.resize(size);

and the data will be zero-initialized:

https://en.cppreference.com/w/cpp/container/vector/resize

"If the current size is less than count,
1) additional default-inserted elements are appended"

https://en.cppreference.com/w/cpp/named_req/DefaultInsertable

"By default, this will call placement-new, as by ::new((void*)p) T()
(that is, value-initialize the object pointed to by p)."

https://en.cppreference.com/w/cpp/language/value_initialization

"2) if T is a class type with a default constructor that is neither
user-provided nor deleted (that is, it may be a class with an
implicitly-defined or defaulted default constructor), the object is
zero-initialized and then it is default-initialized if it has a
non-trivial default constructor;"

> > +}
> > +
> > +/**
> > + * \brief Create a pre-populated cache
> > + * \param[in] buffers Array of buffers to pre-populated with
> > + *
> > + * Create a warm cache from \a buffers.

And to align with the proposal above,

 * Create a cache pre-populated with \a buffers. This is typically used to
 * implement buffer export, with all buffers added to the cache when they are
 * allocated. 

> > + */
> > +V4L2BufferCache::V4L2BufferCache(const std::vector<FrameBuffer *> buffers)
> > +{
> > +	for (const FrameBuffer *buffer : buffers)
> > +		cache_.push_back({ .free = true, .last = buffer->planes() });

How about

	.emplace_back(true, buffer->planes());

with a constructor for CacheInfo

		CacheInfo() = default;
		CacheInfo(bool free, const std::vector<FrameBuffer::Plane> &last)
			: free(free), last(last)
		{
		}

The default constructor is required to make cache_.resize(size) valid
above, as the user-defined constructor causes the compiler to not
generate the default constructor otherwise.

You could even possibly skip the free parameter as it's always set to
true.

> > +}
> 
> Here and in all over this patch, are you copying vectors around ?

Do you mean for .last ? It seems to be the case, yes, but I don't think
we can really avoid that. We can't store a reference (or pointer)
instead as that would have implications on the lifetime of the
FrameBuffer vector that would be annoying for applications. For exported
buffers we may be fine as the FrameBuffer vector is allocated by the
BufferAllocator, but for imported buffers we would need to ensure that
applications never delete a FrameBuffer before stopping the stream. We
could document this requirement but it would be cumbersome to enforce
programmatically, and thus error prone.

> > +
> > +/**
> > + * \brief Fetch a index from the cache
> 
> an index
> 
> I would:
> "Retrieve the V4L2 buffer index associated with \a buffer"

"Find the best V4L2 buffer for a FrameBuffer" ?

> > + * \param[in] buffer FrameBuffer to match
> 
> The FrameBuffer

And I would either drop "to match" or expand the line to "The
FrameBuffer to get a V4L2 buffer for", with a preference for the shorter
option.

> > + *
> > + * Try to find \a buffer in cache and if it's free reuse the last used index
> > + * for this buffer. If the buffer have never been seen or if have been evinced
> 
> the buffer -has-
> 
> > + * from the cache the first free index is pieced instead. Likewise if the last
> 
> not sure about what pieced means.

picked ?

> > + * used index is in use a new free index is picked.
> 
> I would just
> "If no index is associated with \a buffer in the cache, or the index
> has been re-assigned, use the first available one, if any"
>
> > + *
> > + * When an index is picked it is marked as in-use and returned to the caller.
> > + * The association is also recorded so it if possible can reused the next time
> > + * the FrameBuffer is seen.
> 
> That's the purpose of a cache :)

 * Find the best V4L2 buffer index to be used for the FrameBuffer \a buffer
 * based on previous mappings of frame buffers to V4L2 buffers. If a free V4L2
 * buffer previously used with the same dmabufs as \a buffer is found in the
 * cache, return its index. Otherwise return the index of the first free V4L2
 * buffer and record its association with the dmabufs of \a buffer.

> 
> > + *
> > + * \return V4L2 buffer index
> 
> or -ENOENT

"The index of the best V4L2 buffer, or -ENOENT if no free V4L2 buffer is
available"

The only caller doesn't perform any error check by the way. In theory
this would indicate a serious bug anyway, but I don't think we prevent
applications from creating more FrameBuffer instances and queuing more
requests than the number of allocated buffers for the V4L2 devices. This
is an issue that should be handled in the pipeline handlers (or possibly
in the base PipelineHandler class), on top of this series. You could add
an error check in the caller in the meantime.

> > + */
> > +int V4L2BufferCache::fetch(const FrameBuffer *buffer)
> > +{
> > +	int use = -1;
> > +
> > +	for (unsigned int index = 0; index < cache_.size(); index++) {

		const CacheInfo &entry = cache_[index];

and use entry everywhere below.

> > +		if (!cache_[index].free)
> > +			continue;
> > +
> > +		if (use < 0)
> > +			use = index;

You will use the first free entry as a replacement policy instead of the
oldest entry as done today. What are the implications on performances ?

> I wonder if we shouldn't keep the V4L2 buffer index in the CacheInfo
> instead of using the CacheInfo entry position in the vector.
> 
> > +
> > +		/* Try to find a cache hit by comparing the planes. */
> > +		std::vector<FrameBuffer::Plane> planes = buffer->planes();
> 
> Are you copying the vector ?

This should be a const reference.

> > +		if (cache_[index].last.size() != planes.size())
> > +			continue;
> > +
> > +		bool match = true;
> > +		for (unsigned int i = 0; i < planes.size(); i++) {
> > +			if (cache_[index].last[i].fd != planes[i].fd ||
> > +			    cache_[index].last[i].length != planes[i].length) {

Maybe

			if (entry.last != buffer->planes()) {

and drop the size check above (and the local planes variable) ?

> > +				match = false;
> > +				break;
> > +			}
> 
> I would here just use the address of the planes() vector as the key
> instead of comparing it's content. Or is there a risk or relocation?

Please see above.

> > +		}
> > +
> > +		if (!match)
> > +			continue;
> > +
> > +		use = index;
> > +		break;
> > +	}
> > +
> > +	if (use < 0)
> > +		return -ENOENT;
> > +
> > +	cache_[use].free = false;
> > +	cache_[use].last = buffer->planes();
> 
> This seems to be a copy as well
> 
> > +
> > +	return use;
> > +}
> > +
> > +/**
> > + * \brief Pit a V4L2 index back in the cache
> 
> s/Pit/Put
> 
> I would:
> "Make the V4L2 buffer \a idex available again in the cache"

This ends up duplicating the longer description below. I would just
remove it and use

 * \brief Mark buffer \a index as free in the cache

> > + * \param[in] index V4L2 index
> 
> The V4L2 buffer index
> 
> > + *
> > + * Mark the \a index as free in the cache so it can be reused.
> 
> the v4l2 buffer \a index

s/v4l2/V4L2/

But I would drop this line as it just duplicates the \brief.

> > + */
> > +void V4L2BufferCache::put(unsigned int index)
> 
> put would suggest a get() counterpart ? Or fetch/put is a good enough
> operation names pair ?

The names are possibly not the best, but the name of the class is also
not quite accurate as explained above. This is an internal API and its
usage is limited to V4L2VideoDevice so I would go for get() and put().

> > +{
> > +	ASSERT(index < cache_.size());
> > +	cache_[index].free = true;
> 
> If you keep the v4l2 index in the CacheInfo class as I've suggested
> you would here pay the price of iterating through the vector. How
> often is a v4l2 buffer index given back to the cache ?

Every time the buffer is dequeued, so once per frame. The lookup is also
performed once per frame.

> > +}
> > +
> >  /**
> >   * \class V4L2DeviceFormat
> >   * \brief The V4L2 video device image format and sizes

Patch

diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
index 34bbff41760753bd..254f8797af42dd8a 100644
--- a/src/libcamera/include/v4l2_videodevice.h
+++ b/src/libcamera/include/v4l2_videodevice.h
@@ -12,6 +12,7 @@ 
 
 #include <linux/videodev2.h>
 
+#include <libcamera/buffer.h>
 #include <libcamera/geometry.h>
 #include <libcamera/pixelformats.h>
 #include <libcamera/signal.h>
@@ -22,7 +23,6 @@ 
 
 namespace libcamera {
 
-class Buffer;
 class BufferMemory;
 class BufferPool;
 class EventNotifier;
@@ -105,6 +105,24 @@  struct V4L2Capability final : v4l2_capability {
 	}
 };
 
+class V4L2BufferCache
+{
+public:
+	V4L2BufferCache(unsigned int size);
+	V4L2BufferCache(const std::vector<FrameBuffer *> buffers);
+
+	int fetch(const FrameBuffer *buffer);
+	void put(unsigned int index);
+
+private:
+	struct CacheInfo {
+		bool free;
+		std::vector<FrameBuffer::Plane> last;
+	};
+
+	std::vector<CacheInfo> cache_;
+};
+
 class V4L2DeviceFormat
 {
 public:
diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
index a05dd6a1f7d86eaa..c82f2829601bd14c 100644
--- a/src/libcamera/v4l2_videodevice.cpp
+++ b/src/libcamera/v4l2_videodevice.cpp
@@ -19,7 +19,6 @@ 
 
 #include <linux/drm_fourcc.h>
 
-#include <libcamera/buffer.h>
 #include <libcamera/event_notifier.h>
 
 #include "log.h"
@@ -134,6 +133,110 @@  LOG_DECLARE_CATEGORY(V4L2)
  * \return True if the video device provides Streaming I/O IOCTLs
  */
 
+/**
+ * \class V4L2BufferCache
+ * \brief Hot cache of associations between V4L2 index and FrameBuffer
+ *
+ * There is performance to be gained if the same V4L2 buffer index can be
+ * reused for the same FrameBuffer object as the kernel don't have to redo
+ * the mapping. The V4L2BufferCache tries to keep a hot-cache of mappings
+ * between the two.
+ *
+ * If there is a cache miss is not critical, everything still works as expected.
+ */
+
+/**
+ * \brief Create a empty cache of a given size
+ * \param[in] size Size of cache to create
+ *
+ * Create a cold cache with \a size entries. The cache will be populated as
+ * it's being used.
+ */
+V4L2BufferCache::V4L2BufferCache(unsigned int size)
+{
+	cache_.resize(size, { .free = true, .last = {} });
+}
+
+/**
+ * \brief Create a pre-populated cache
+ * \param[in] buffers Array of buffers to pre-populated with
+ *
+ * Create a warm cache from \a buffers.
+ */
+V4L2BufferCache::V4L2BufferCache(const std::vector<FrameBuffer *> buffers)
+{
+	for (const FrameBuffer *buffer : buffers)
+		cache_.push_back({ .free = true, .last = buffer->planes() });
+}
+
+/**
+ * \brief Fetch a index from the cache
+ * \param[in] buffer FrameBuffer to match
+ *
+ * Try to find \a buffer in cache and if it's free reuse the last used index
+ * for this buffer. If the buffer have never been seen or if have been evinced
+ * from the cache the first free index is pieced instead. Likewise if the last
+ * used index is in use a new free index is picked.
+ *
+ * When an index is picked it is marked as in-use and returned to the caller.
+ * The association is also recorded so it if possible can reused the next time
+ * the FrameBuffer is seen.
+ *
+ * \return V4L2 buffer index
+ */
+int V4L2BufferCache::fetch(const FrameBuffer *buffer)
+{
+	int use = -1;
+
+	for (unsigned int index = 0; index < cache_.size(); index++) {
+		if (!cache_[index].free)
+			continue;
+
+		if (use < 0)
+			use = index;
+
+		/* Try to find a cache hit by comparing the planes. */
+		std::vector<FrameBuffer::Plane> planes = buffer->planes();
+		if (cache_[index].last.size() != planes.size())
+			continue;
+
+		bool match = true;
+		for (unsigned int i = 0; i < planes.size(); i++) {
+			if (cache_[index].last[i].fd != planes[i].fd ||
+			    cache_[index].last[i].length != planes[i].length) {
+				match = false;
+				break;
+			}
+		}
+
+		if (!match)
+			continue;
+
+		use = index;
+		break;
+	}
+
+	if (use < 0)
+		return -ENOENT;
+
+	cache_[use].free = false;
+	cache_[use].last = buffer->planes();
+
+	return use;
+}
+
+/**
+ * \brief Pit a V4L2 index back in the cache
+ * \param[in] index V4L2 index
+ *
+ * Mark the \a index as free in the cache so it can be reused.
+ */
+void V4L2BufferCache::put(unsigned int index)
+{
+	ASSERT(index < cache_.size());
+	cache_[index].free = true;
+}
+
 /**
  * \class V4L2DeviceFormat
  * \brief The V4L2 video device image format and sizes