[libcamera-devel,v3,7/7] libcamera: V4L2BufferCache: Improve cache eviction strategy

Message ID 20200304232246.325384-8-niklas.soderlund@ragnatech.se
State Accepted
Headers show
Series
  • libcamera: V4L2BufferCache: Improve cache
Related show

Commit Message

Niklas Söderlund March 4, 2020, 11:22 p.m. UTC
The strategy used to find a free cache entry in the first implementation
was not the smartest, it picked the first free entry. This lead to
unwanted performance issues as the cache was not used as good as it
could for imported buffers.

Improve this by adding a last usage sequence numer to the cache entries
and change the eviction strategy to use the oldest free entry instead of
the first one it finds.

Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
---
* Changes since v2
- Use a sequence number instead of timestamp to judge age of a cached
  entry.
---
 src/libcamera/include/v4l2_videodevice.h |  5 ++++-
 src/libcamera/v4l2_videodevice.cpp       | 20 ++++++++++++--------
 2 files changed, 16 insertions(+), 9 deletions(-)

Comments

Laurent Pinchart March 4, 2020, 11:38 p.m. UTC | #1
Hi Niklas,

Thank you for the patch.

On Thu, Mar 05, 2020 at 12:22:46AM +0100, Niklas Söderlund wrote:
> The strategy used to find a free cache entry in the first implementation
> was not the smartest, it picked the first free entry. This lead to
> unwanted performance issues as the cache was not used as good as it
> could for imported buffers.
> 
> Improve this by adding a last usage sequence numer to the cache entries

s/numer/number/

> and change the eviction strategy to use the oldest free entry instead of
> the first one it finds.
> 
> Signed-off-by: Niklas Söderlund <niklas.soderlund@ragnatech.se>
> Reviewed-by: Naushir Patuck <naush@raspberrypi.com>
> ---
> * Changes since v2
> - Use a sequence number instead of timestamp to judge age of a cached
>   entry.
> ---
>  src/libcamera/include/v4l2_videodevice.h |  5 ++++-
>  src/libcamera/v4l2_videodevice.cpp       | 20 ++++++++++++--------
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
> index 359b366454e4e782..993ebb3069c595bf 100644
> --- a/src/libcamera/include/v4l2_videodevice.h
> +++ b/src/libcamera/include/v4l2_videodevice.h
> @@ -10,6 +10,7 @@
>  #include <string>
>  #include <vector>
>  
> +#include <atomic>

This should go with the C++ headers above. So should the memory header.
You can fix this "while at it", no need for a separate patch ;-)

>  #include <linux/videodev2.h>
>  #include <memory>
>  
> @@ -120,11 +121,12 @@ private:
>  	{
>  	public:
>  		Entry();
> -		Entry(bool free, const FrameBuffer &buffer);
> +		Entry(bool free, uint64_t lastUsed, const FrameBuffer &buffer);
>  
>  		bool operator==(const FrameBuffer &buffer) const;
>  
>  		bool free;
> +		uint64_t lastUsed;
>  
>  	private:
>  		struct Plane {
> @@ -140,6 +142,7 @@ private:
>  		std::vector<Plane> planes_;
>  	};
>  
> +	std::atomic_uint64_t lastUsedCounter_;
>  	std::vector<Entry> cache_;
>  	/* \todo Expose the miss counter through an instrumentation API. */
>  	unsigned int missCounter_;
> diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
> index c495de85f26efe14..3043aebfe0588290 100644
> --- a/src/libcamera/v4l2_videodevice.cpp
> +++ b/src/libcamera/v4l2_videodevice.cpp
> @@ -162,7 +162,7 @@ LOG_DECLARE_CATEGORY(V4L2)
>   * buffer import, with buffers added to the cache as they are queued.
>   */
>  V4L2BufferCache::V4L2BufferCache(unsigned int numEntries)
> -	: missCounter_(0)
> +	: lastUsedCounter_(1), missCounter_(0)
>  {
>  	cache_.resize(numEntries);
>  }
> @@ -176,10 +176,11 @@ V4L2BufferCache::V4L2BufferCache(unsigned int numEntries)
>   * allocated.
>   */
>  V4L2BufferCache::V4L2BufferCache(const std::vector<std::unique_ptr<FrameBuffer>> &buffers)
> -	: missCounter_(0)
> +	: lastUsedCounter_(1), missCounter_(0)
>  {
>  	for (const std::unique_ptr<FrameBuffer> &buffer : buffers)
> -		cache_.emplace_back(true, buffer->planes());
> +		cache_.emplace_back(true, lastUsedCounter_.fetch_add(1),

I think the std::memory_order_acq_rel memory order should be enough for
this usage (same below). Should I be mean and ask you to confirm ? :-)

> +				    buffer->planes());
>  }
>  
>  V4L2BufferCache::~V4L2BufferCache()
> @@ -205,6 +206,7 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
>  {
>  	bool hit = false;
>  	int use = -1;
> +	uint64_t oldest = UINT32_MAX;

UINT64_MAX ?

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

>  
>  	for (unsigned int index = 0; index < cache_.size(); index++) {
>  		const Entry &entry = cache_[index];
> @@ -219,8 +221,10 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
>  			break;
>  		}
>  
> -		if (use < 0)
> +		if (entry.lastUsed < oldest) {
>  			use = index;
> +			oldest = entry.lastUsed;
> +		}
>  	}
>  
>  	if (!hit)
> @@ -229,7 +233,7 @@ int V4L2BufferCache::get(const FrameBuffer &buffer)
>  	if (use < 0)
>  		return -ENOENT;
>  
> -	cache_[use] = Entry(false, buffer);
> +	cache_[use] = Entry(false, lastUsedCounter_.fetch_add(1), buffer);
>  
>  	return use;
>  }
> @@ -245,12 +249,12 @@ void V4L2BufferCache::put(unsigned int index)
>  }
>  
>  V4L2BufferCache::Entry::Entry()
> -	: free(true)
> +	: free(true), lastUsed(0)
>  {
>  }
>  
> -V4L2BufferCache::Entry::Entry(bool free, const FrameBuffer &buffer)
> -	: free(free)
> +V4L2BufferCache::Entry::Entry(bool free, uint64_t lastUsed, const FrameBuffer &buffer)
> +	: free(free), lastUsed(lastUsed)
>  {
>  	for (const FrameBuffer::Plane &plane : buffer.planes())
>  		planes_.emplace_back(plane);

Patch

diff --git a/src/libcamera/include/v4l2_videodevice.h b/src/libcamera/include/v4l2_videodevice.h
index 359b366454e4e782..993ebb3069c595bf 100644
--- a/src/libcamera/include/v4l2_videodevice.h
+++ b/src/libcamera/include/v4l2_videodevice.h
@@ -10,6 +10,7 @@ 
 #include <string>
 #include <vector>
 
+#include <atomic>
 #include <linux/videodev2.h>
 #include <memory>
 
@@ -120,11 +121,12 @@  private:
 	{
 	public:
 		Entry();
-		Entry(bool free, const FrameBuffer &buffer);
+		Entry(bool free, uint64_t lastUsed, const FrameBuffer &buffer);
 
 		bool operator==(const FrameBuffer &buffer) const;
 
 		bool free;
+		uint64_t lastUsed;
 
 	private:
 		struct Plane {
@@ -140,6 +142,7 @@  private:
 		std::vector<Plane> planes_;
 	};
 
+	std::atomic_uint64_t lastUsedCounter_;
 	std::vector<Entry> cache_;
 	/* \todo Expose the miss counter through an instrumentation API. */
 	unsigned int missCounter_;
diff --git a/src/libcamera/v4l2_videodevice.cpp b/src/libcamera/v4l2_videodevice.cpp
index c495de85f26efe14..3043aebfe0588290 100644
--- a/src/libcamera/v4l2_videodevice.cpp
+++ b/src/libcamera/v4l2_videodevice.cpp
@@ -162,7 +162,7 @@  LOG_DECLARE_CATEGORY(V4L2)
  * buffer import, with buffers added to the cache as they are queued.
  */
 V4L2BufferCache::V4L2BufferCache(unsigned int numEntries)
-	: missCounter_(0)
+	: lastUsedCounter_(1), missCounter_(0)
 {
 	cache_.resize(numEntries);
 }
@@ -176,10 +176,11 @@  V4L2BufferCache::V4L2BufferCache(unsigned int numEntries)
  * allocated.
  */
 V4L2BufferCache::V4L2BufferCache(const std::vector<std::unique_ptr<FrameBuffer>> &buffers)
-	: missCounter_(0)
+	: lastUsedCounter_(1), missCounter_(0)
 {
 	for (const std::unique_ptr<FrameBuffer> &buffer : buffers)
-		cache_.emplace_back(true, buffer->planes());
+		cache_.emplace_back(true, lastUsedCounter_.fetch_add(1),
+				    buffer->planes());
 }
 
 V4L2BufferCache::~V4L2BufferCache()
@@ -205,6 +206,7 @@  int V4L2BufferCache::get(const FrameBuffer &buffer)
 {
 	bool hit = false;
 	int use = -1;
+	uint64_t oldest = UINT32_MAX;
 
 	for (unsigned int index = 0; index < cache_.size(); index++) {
 		const Entry &entry = cache_[index];
@@ -219,8 +221,10 @@  int V4L2BufferCache::get(const FrameBuffer &buffer)
 			break;
 		}
 
-		if (use < 0)
+		if (entry.lastUsed < oldest) {
 			use = index;
+			oldest = entry.lastUsed;
+		}
 	}
 
 	if (!hit)
@@ -229,7 +233,7 @@  int V4L2BufferCache::get(const FrameBuffer &buffer)
 	if (use < 0)
 		return -ENOENT;
 
-	cache_[use] = Entry(false, buffer);
+	cache_[use] = Entry(false, lastUsedCounter_.fetch_add(1), buffer);
 
 	return use;
 }
@@ -245,12 +249,12 @@  void V4L2BufferCache::put(unsigned int index)
 }
 
 V4L2BufferCache::Entry::Entry()
-	: free(true)
+	: free(true), lastUsed(0)
 {
 }
 
-V4L2BufferCache::Entry::Entry(bool free, const FrameBuffer &buffer)
-	: free(free)
+V4L2BufferCache::Entry::Entry(bool free, uint64_t lastUsed, const FrameBuffer &buffer)
+	: free(free), lastUsed(lastUsed)
 {
 	for (const FrameBuffer::Plane &plane : buffer.planes())
 		planes_.emplace_back(plane);