[libcamera-devel,v2,4/4] libcamera: V4L2BufferCache: Improve cache eviction strategy

Message ID 20200224193601.1040770-5-niklas.soderlund@ragnatech.se
State Superseded
Headers show
Series
  • libcamera: V4L2BufferCache: Improve cache eviction strategy
Related show

Commit Message

Niklas Söderlund Feb. 24, 2020, 7:36 p.m. UTC
The strategy used to find a free cache entry in the first implementation
was not the smartest, it picked the first free entry. This lead to
unwanted performance issues as they cache was not used as good as it
could for imported buffers.

Improve this by adding a last usage timestamp to the cache entries and
change the eviction strategy to use the oldest free entry instead of the
first one it finds.

Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
---
 src/libcamera/include/v4l2_videodevice.h | 1 +
 src/libcamera/v4l2_videodevice.cpp       | 9 ++++++---
 2 files changed, 7 insertions(+), 3 deletions(-)

Comments

Jacopo Mondi Feb. 26, 2020, 2:44 p.m. UTC | #1
Hi Niklas,

On Mon, Feb 24, 2020 at 08:36:01PM +0100, Niklas Söderlund wrote:
> The strategy used to find a free cache entry in the first implementation
> was not the smartest, it picked the first free entry. This lead to
> unwanted performance issues as they cache was not used as good as it
> could for imported buffers.
>
> Improve this by adding a last usage timestamp to the cache entries and
> change the eviction strategy to use the oldest free entry instead of the
> first one it finds.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
> ---
>  src/libcamera/include/v4l2_videodevice.h | 1 +
>  src/libcamera/v4l2_videodevice.cpp       | 9 ++++++---
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
> index fcf072641420dacf..f04feed87b24f28f 100644
> --- a/src/libcamera/include/v4l2_videodevice.h
> +++ b/src/libcamera/include/v4l2_videodevice.h
> @@ -125,6 +125,7 @@ private:
>  		bool operator==(const FrameBuffer &buffer);
>
>  		bool free;
> +		utils::time_point lastUsed;
>
>  	private:
>  		struct Plane {
> diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
> index 99470ce11421c77c..9a4d2479b20f5e27 100644
> --- a/src/libcamera/v4l2_videodevice.cpp
> +++ b/src/libcamera/v4l2_videodevice.cpp
> @@ -205,6 +205,7 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
>  {
>  	bool hit = false;
>  	int use = -1;
> +	utils::time_point oldest = utils::clock::now();
>
>  	for (unsigned int index = 0; index < cache_.size(); index++) {
>  		const Entry &entry = cache_[index];
> @@ -212,8 +213,10 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
>  		if (!entry.free)
>  			continue;
>
> -		if (use < 0)
> +		if (cache_[index].lastUsed < oldest) {

entry.lastUsed

>  			use = index;
> +			oldest = cache_[index].lastUsed;
> +		}

Could you move this -after- having verified we didn't hit an hot entry ?

It seems more natural to

        for (entry: cache) {
                if (!entry.free)
                        continue;

                if (entry == buffer) {
                        hit = true;
                        break;
                }

                if (entry.lastUsed < oldest)
                        oldest = entry;
        }
>
>  		/* Try to find a cache hit by comparing the planes. */
>  		if (cache_[index] == buffer) {

You could actually use entry here as well.

Minors apart
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>

Thanks
   j

> @@ -245,12 +248,12 @@ void V4L2BufferCache::put(unsigned int index)
>  }
>
>  V4L2BufferCache::Entry::Entry()
> -	: free(true)
> +	: free(true), lastUsed(utils::clock::now())
>  {
>  }
>
>  V4L2BufferCache::Entry::Entry(bool free, const FrameBuffer &buffer)
> -	: free(free)
> +	: free(free), lastUsed(utils::clock::now())
>  {
>  	for (const FrameBuffer::Plane &plane : buffer.planes())
>  		planes_.emplace_back(plane);
> --
> 2.25.0
>
> _______________________________________________
> libcamera-devel mailing list
> libcamera-devel@lists.libcamera.org
> https://lists.libcamera.org/listinfo/libcamera-devel
Laurent Pinchart Feb. 29, 2020, 4:25 p.m. UTC | #2
Hi Niklas,

Thank you for the patch.

On Wed, Feb 26, 2020 at 03:44:22PM +0100, Jacopo Mondi wrote:
> On Mon, Feb 24, 2020 at 08:36:01PM +0100, Niklas Söderlund wrote:
> > The strategy used to find a free cache entry in the first implementation
> > was not the smartest, it picked the first free entry. This lead to
> > unwanted performance issues as they cache was not used as good as it

s/they cache/the cache/

> > could for imported buffers.
> >
> > Improve this by adding a last usage timestamp to the cache entries and
> > change the eviction strategy to use the oldest free entry instead of the
> > first one it finds.
> >
> > Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> > Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
> > ---
> >  src/libcamera/include/v4l2_videodevice.h | 1 +
> >  src/libcamera/v4l2_videodevice.cpp       | 9 ++++++---
> >  2 files changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
> > index fcf072641420dacf..f04feed87b24f28f 100644
> > --- a/src/libcamera/include/v4l2_videodevice.h
> > +++ b/src/libcamera/include/v4l2_videodevice.h
> > @@ -125,6 +125,7 @@ private:
> >  		bool operator==(const FrameBuffer &buffer);
> >
> >  		bool free;
> > +		utils::time_point lastUsed;

Would it be more efficient to use a sequence number, to avoid getting
the system clock all the time ? This would require adding a private
std::atomic_uint32_t (or a 64-bit type ?) in the V4L2BufferCache class,
calling .fetch_add() on it in V4L2BufferCache::get(), and passing the
value to the Entry() constructor. Entry would have an uint32_t lastUsed
field, with the default Entry() constructor setting it to 0.

With that I'm OK with the implementation.

> >
> >  	private:
> >  		struct Plane {
> > diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
> > index 99470ce11421c77c..9a4d2479b20f5e27 100644
> > --- a/src/libcamera/v4l2_videodevice.cpp
> > +++ b/src/libcamera/v4l2_videodevice.cpp
> > @@ -205,6 +205,7 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
> >  {
> >  	bool hit = false;
> >  	int use = -1;
> > +	utils::time_point oldest = utils::clock::now();

This would become

	uint32_t oldest = UINT32_MAX;

> >
> >  	for (unsigned int index = 0; index < cache_.size(); index++) {
> >  		const Entry &entry = cache_[index];
> > @@ -212,8 +213,10 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
> >  		if (!entry.free)
> >  			continue;
> >
> > -		if (use < 0)
> > +		if (cache_[index].lastUsed < oldest) {
> 
> entry.lastUsed
> 
> >  			use = index;
> > +			oldest = cache_[index].lastUsed;
> > +		}
> 
> Could you move this -after- having verified we didn't hit an hot entry ?
> 
> It seems more natural to
> 
>         for (entry: cache) {
>                 if (!entry.free)
>                         continue;
> 
>                 if (entry == buffer) {
>                         hit = true;
>                         break;
>                 }
> 
>                 if (entry.lastUsed < oldest)
>                         oldest = entry;

Missing

			use = index;

I think

>         }
> >
> >  		/* Try to find a cache hit by comparing the planes. */
> >  		if (cache_[index] == buffer) {
> 
> You could actually use entry here as well.
> 
> Minors apart
> Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
> 
> > @@ -245,12 +248,12 @@ void V4L2BufferCache::put(unsigned int index)
> >  }
> >
> >  V4L2BufferCache::Entry::Entry()
> > -	: free(true)
> > +	: free(true), lastUsed(utils::clock::now())
> >  {
> >  }
> >
> >  V4L2BufferCache::Entry::Entry(bool free, const FrameBuffer &buffer)
> > -	: free(free)
> > +	: free(free), lastUsed(utils::clock::now())
> >  {
> >  	for (const FrameBuffer::Plane &plane : buffer.planes())
> >  		planes_.emplace_back(plane);

Patch

diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
index fcf072641420dacf..f04feed87b24f28f 100644
--- a/src/libcamera/include/v4l2_videodevice.h
+++ b/src/libcamera/include/v4l2_videodevice.h
@@ -125,6 +125,7 @@  private:
 		bool operator==(const FrameBuffer &buffer);
 
 		bool free;
+		utils::time_point lastUsed;
 
 	private:
 		struct Plane {
diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
index 99470ce11421c77c..9a4d2479b20f5e27 100644
--- a/src/libcamera/v4l2_videodevice.cpp
+++ b/src/libcamera/v4l2_videodevice.cpp
@@ -205,6 +205,7 @@  int V4L2BufferCache::get(const FrameBuffer &buffer)
 {
 	bool hit = false;
 	int use = -1;
+	utils::time_point oldest = utils::clock::now();
 
 	for (unsigned int index = 0; index < cache_.size(); index++) {
 		const Entry &entry = cache_[index];
@@ -212,8 +213,10 @@  int V4L2BufferCache::get(const FrameBuffer &buffer)
 		if (!entry.free)
 			continue;
 
-		if (use < 0)
+		if (cache_[index].lastUsed < oldest) {
 			use = index;
+			oldest = cache_[index].lastUsed;
+		}
 
 		/* Try to find a cache hit by comparing the planes. */
 		if (cache_[index] == buffer) {
@@ -245,12 +248,12 @@  void V4L2BufferCache::put(unsigned int index)
 }
 
 V4L2BufferCache::Entry::Entry()
-	: free(true)
+	: free(true), lastUsed(utils::clock::now())
 {
 }
 
 V4L2BufferCache::Entry::Entry(bool free, const FrameBuffer &buffer)
-	: free(free)
+	: free(free), lastUsed(utils::clock::now())
 {
 	for (const FrameBuffer::Plane &plane : buffer.planes())
 		planes_.emplace_back(plane);