[v2,3/6] test: fence: Fix race condition
diff mbox series

Message ID 20240529154341.10426-4-laurent.pinchart@ideasonboard.com
State Accepted
Commit 13dcc7fc5f662a8c00eff5dce689c81231b5216e
Headers show
Series
  • vimc scaling improvements
Related show

Commit Message

Laurent Pinchart May 29, 2024, 3:43 p.m. UTC
The fence test is racy, as it relies on the main loop being executed
between completion of request signalledRequestId_ and
signalledRequestId_ + 1. This usually happens, but is not guaranteed.

To fix the race condition, change the request identification logic by
replacing usage of the cookie value, which is zero-based and wraps
around at nbuffers_ - 1, with a completed request counter that is
one-based and doesn't wrap. The completedRequestId_, expiredRequestId_
and signalledRequestId_ variables now track the identifier of the last
request that has completed, the request whose fence will time out, and
the request whose fence will be signalled.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v1:

- Add and update comments
- Fix typo in commit message
---
 test/fence.cpp | 57 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 36 insertions(+), 21 deletions(-)

Comments

Dan Scally June 1, 2024, 10:13 p.m. UTC | #1
Hi Laurent - thanks for the patch

On 29/05/2024 16:43, Laurent Pinchart wrote:
> The fence test is racy, as it relies on the main loop being executed
> between completion of request signalledRequestId_ and
> signalledRequestId_ + 1. This usually happens, but is not guaranteed.
>
> To fix the race condition, change the request identification logic by
> replacing usage of the cookie value, which is zero-based and wraps
> around at nbuffers_ - 1, with a completed request counter that is
> one-based and doesn't wrap. The completedRequestId_, expiredRequestId_
> and signalledRequestId_ variables now track the identifier of the last
> request that has completed, the request whose fence will time out, and
> the request whose fence will be signalled.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>


Took me a while to wrap my head around it, but I think it's fine:


Daniel Scally <dan.scally@ideasonboard.com>

> ---
> Changes since v1:
>
> - Add and update comments
> - Fix typo in commit message
> ---
>   test/fence.cpp | 57 +++++++++++++++++++++++++++++++-------------------
>   1 file changed, 36 insertions(+), 21 deletions(-)
>
> diff --git a/test/fence.cpp b/test/fence.cpp
> index c282886287aa..a8fba7284d82 100644
> --- a/test/fence.cpp
> +++ b/test/fence.cpp
> @@ -57,8 +57,11 @@ private:
>   	bool expectedCompletionResult_ = true;
>   	bool setFence_ = true;
>   
> -	unsigned int completedRequest_;
> -
> +	/*
> +	 * Request IDs track the number of requests that have completed. They
> +	 * are one-based, and don't wrap.
> +	 */
> +	unsigned int completedRequestId_;
>   	unsigned int signalledRequestId_;
>   	unsigned int expiredRequestId_;
>   	unsigned int nbuffers_;
> @@ -126,8 +129,19 @@ int FenceTest::init()
>   		return TestFail;
>   	}
>   
> -	signalledRequestId_ = nbuffers_ - 2;
> -	expiredRequestId_ = nbuffers_ - 1;
> +	completedRequestId_ = 0;
> +
> +	/*
> +	 * All but two requests are queued without a fence. Request
> +	 * expiredRequestId_ will be queued with a fence that we won't signal
> +	 * (which is then expected to expire), and request signalledRequestId_
> +	 * will be queued with a fence that gets signalled. Select nbuffers_
> +	 * and nbuffers_ * 2 for those two requests, to space them by a few
> +	 * frames while still not requiring a long time for the test to
> +	 * complete.
> +	 */
> +	expiredRequestId_ = nbuffers_;
> +	signalledRequestId_ = nbuffers_ * 2;
>   
>   	return TestPass;
>   }
> @@ -189,16 +203,16 @@ void FenceTest::requestRequeue(Request *request)
>   	const Request::BufferMap &buffers = request->buffers();
>   	const Stream *stream = buffers.begin()->first;
>   	FrameBuffer *buffer = buffers.begin()->second;
> -	uint64_t cookie = request->cookie();
>   
>   	request->reuse();
>   
> -	if (cookie == signalledRequestId_ && setFence_) {
> +	if (completedRequestId_ == signalledRequestId_ - nbuffers_ && setFence_) {
>   		/*
> -		 * The second time this request is queued add a fence to it.
> -		 *
> -		 * The main loop signals it by using a timer to write to the
> -		 * efd2_ file descriptor before the fence expires.
> +		 * This is the request that will be used to test fence
> +		 * signalling when it completes next time. Add a fence to it,
> +		 * using efd2_. The main loop will signal the fence by using a
> +		 * timer to write to the efd2_ file descriptor before the fence
> +		 * expires.
>   		 */
>   		std::unique_ptr<Fence> fence =
>   			std::make_unique<Fence>(std::move(eventFd2_));
> @@ -213,16 +227,15 @@ void FenceTest::requestRequeue(Request *request)
>   
>   void FenceTest::requestComplete(Request *request)
>   {
> -	uint64_t cookie = request->cookie();
> -	completedRequest_ = cookie;
> +	completedRequestId_++;
>   
>   	/*
> -	 * The last request is expected to fail as its fence has not been
> -	 * signaled.
> +	 * Request expiredRequestId_ is expected to fail as its fence has not
> +	 * been signalled.
>   	 *
>   	 * Validate the fence status but do not re-queue it.
>   	 */
> -	if (cookie == expiredRequestId_) {
> +	if (completedRequestId_ == expiredRequestId_) {
>   		if (validateExpiredRequest(request) != TestPass)
>   			expectedCompletionResult_ = false;
>   
> @@ -230,7 +243,7 @@ void FenceTest::requestComplete(Request *request)
>   		return;
>   	}
>   
> -	/* Validate all requests but the last. */
> +	/* Validate all other requests. */
>   	if (validateRequest(request) != TestPass) {
>   		expectedCompletionResult_ = false;
>   
> @@ -271,7 +284,7 @@ int FenceTest::run()
>   		}
>   
>   		int ret;
> -		if (i == expiredRequestId_) {
> +		if (i == expiredRequestId_ - 1) {
>   			/* This request will have a fence, and it will expire. */
>   			std::unique_ptr<Fence> fence =
>   				std::make_unique<Fence>(std::move(eventFd_));
> @@ -318,11 +331,13 @@ int FenceTest::run()
>   	Timer timer;
>   	timer.start(1000ms);
>   	while (timer.isRunning() && expectedCompletionResult_) {
> -		if (completedRequest_ == signalledRequestId_ && setFence_)
> +		if (completedRequestId_ == signalledRequestId_ - 1 && setFence_)
>   			/*
> -			 * signalledRequestId_ has just completed and it has
> -			 * been re-queued with a fence. Start the timer to
> -			 * signal the fence in 10 msec.
> +			 * The request just before signalledRequestId_ has just
> +			 * completed. Request signalledRequestId_ has been
> +			 * queued with a fence, and libcamera is likely already
> +			 * waiting on the fence, or will soon. Start the timer
> +			 * to signal the fence in 10 msec.
>   			 */
>   			fenceTimer.start(10ms);
>
Kieran Bingham June 3, 2024, 8:48 a.m. UTC | #2
Quoting Laurent Pinchart (2024-05-29 16:43:38)
> The fence test is racy, as it relies on the main loop being executed
> between completion of request signalledRequestId_ and
> signalledRequestId_ + 1. This usually happens, but is not guaranteed.
> 
> To fix the race condition, change the request identification logic by
> replacing usage of the cookie value, which is zero-based and wraps
> around at nbuffers_ - 1, with a completed request counter that is
> one-based and doesn't wrap. The completedRequestId_, expiredRequestId_
> and signalledRequestId_ variables now track the identifier of the last
> request that has completed, the request whose fence will time out, and
> the request whose fence will be signalled.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> Changes since v1:
> 
> - Add and update comments
> - Fix typo in commit message
> ---
>  test/fence.cpp | 57 +++++++++++++++++++++++++++++++-------------------
>  1 file changed, 36 insertions(+), 21 deletions(-)
> 
> diff --git a/test/fence.cpp b/test/fence.cpp
> index c282886287aa..a8fba7284d82 100644
> --- a/test/fence.cpp
> +++ b/test/fence.cpp
> @@ -57,8 +57,11 @@ private:
>         bool expectedCompletionResult_ = true;
>         bool setFence_ = true;
>  
> -       unsigned int completedRequest_;
> -
> +       /*
> +        * Request IDs track the number of requests that have completed. They
> +        * are one-based, and don't wrap.
> +        */
> +       unsigned int completedRequestId_;
>         unsigned int signalledRequestId_;
>         unsigned int expiredRequestId_;
>         unsigned int nbuffers_;
> @@ -126,8 +129,19 @@ int FenceTest::init()
>                 return TestFail;
>         }
>  
> -       signalledRequestId_ = nbuffers_ - 2;
> -       expiredRequestId_ = nbuffers_ - 1;
> +       completedRequestId_ = 0;
> +
> +       /*
> +        * All but two requests are queued without a fence. Request
> +        * expiredRequestId_ will be queued with a fence that we won't signal
> +        * (which is then expected to expire), and request signalledRequestId_
> +        * will be queued with a fence that gets signalled. Select nbuffers_
> +        * and nbuffers_ * 2 for those two requests, to space them by a few
> +        * frames while still not requiring a long time for the test to
> +        * complete.
> +        */
> +       expiredRequestId_ = nbuffers_;
> +       signalledRequestId_ = nbuffers_ * 2;

Thank you - that's /far/ clearer as to what's going on now and why
there's a * 2.

Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

>  
>         return TestPass;
>  }
> @@ -189,16 +203,16 @@ void FenceTest::requestRequeue(Request *request)
>         const Request::BufferMap &buffers = request->buffers();
>         const Stream *stream = buffers.begin()->first;
>         FrameBuffer *buffer = buffers.begin()->second;
> -       uint64_t cookie = request->cookie();
>  
>         request->reuse();
>  
> -       if (cookie == signalledRequestId_ && setFence_) {
> +       if (completedRequestId_ == signalledRequestId_ - nbuffers_ && setFence_) {
>                 /*
> -                * The second time this request is queued add a fence to it.
> -                *
> -                * The main loop signals it by using a timer to write to the
> -                * efd2_ file descriptor before the fence expires.
> +                * This is the request that will be used to test fence
> +                * signalling when it completes next time. Add a fence to it,
> +                * using efd2_. The main loop will signal the fence by using a
> +                * timer to write to the efd2_ file descriptor before the fence
> +                * expires.
>                  */
>                 std::unique_ptr<Fence> fence =
>                         std::make_unique<Fence>(std::move(eventFd2_));
> @@ -213,16 +227,15 @@ void FenceTest::requestRequeue(Request *request)
>  
>  void FenceTest::requestComplete(Request *request)
>  {
> -       uint64_t cookie = request->cookie();
> -       completedRequest_ = cookie;
> +       completedRequestId_++;
>  
>         /*
> -        * The last request is expected to fail as its fence has not been
> -        * signaled.
> +        * Request expiredRequestId_ is expected to fail as its fence has not
> +        * been signalled.
>          *
>          * Validate the fence status but do not re-queue it.
>          */
> -       if (cookie == expiredRequestId_) {
> +       if (completedRequestId_ == expiredRequestId_) {
>                 if (validateExpiredRequest(request) != TestPass)
>                         expectedCompletionResult_ = false;
>  
> @@ -230,7 +243,7 @@ void FenceTest::requestComplete(Request *request)
>                 return;
>         }
>  
> -       /* Validate all requests but the last. */
> +       /* Validate all other requests. */
>         if (validateRequest(request) != TestPass) {
>                 expectedCompletionResult_ = false;
>  
> @@ -271,7 +284,7 @@ int FenceTest::run()
>                 }
>  
>                 int ret;
> -               if (i == expiredRequestId_) {
> +               if (i == expiredRequestId_ - 1) {
>                         /* This request will have a fence, and it will expire. */
>                         std::unique_ptr<Fence> fence =
>                                 std::make_unique<Fence>(std::move(eventFd_));
> @@ -318,11 +331,13 @@ int FenceTest::run()
>         Timer timer;
>         timer.start(1000ms);
>         while (timer.isRunning() && expectedCompletionResult_) {
> -               if (completedRequest_ == signalledRequestId_ && setFence_)
> +               if (completedRequestId_ == signalledRequestId_ - 1 && setFence_)
>                         /*
> -                        * signalledRequestId_ has just completed and it has
> -                        * been re-queued with a fence. Start the timer to
> -                        * signal the fence in 10 msec.
> +                        * The request just before signalledRequestId_ has just
> +                        * completed. Request signalledRequestId_ has been
> +                        * queued with a fence, and libcamera is likely already
> +                        * waiting on the fence, or will soon. Start the timer
> +                        * to signal the fence in 10 msec.
>                          */
>                         fenceTimer.start(10ms);
>  
> -- 
> Regards,
> 
> Laurent Pinchart
>

Patch
diff mbox series

diff --git a/test/fence.cpp b/test/fence.cpp
index c282886287aa..a8fba7284d82 100644
--- a/test/fence.cpp
+++ b/test/fence.cpp
@@ -57,8 +57,11 @@  private:
 	bool expectedCompletionResult_ = true;
 	bool setFence_ = true;
 
-	unsigned int completedRequest_;
-
+	/*
+	 * Request IDs track the number of requests that have completed. They
+	 * are one-based, and don't wrap.
+	 */
+	unsigned int completedRequestId_;
 	unsigned int signalledRequestId_;
 	unsigned int expiredRequestId_;
 	unsigned int nbuffers_;
@@ -126,8 +129,19 @@  int FenceTest::init()
 		return TestFail;
 	}
 
-	signalledRequestId_ = nbuffers_ - 2;
-	expiredRequestId_ = nbuffers_ - 1;
+	completedRequestId_ = 0;
+
+	/*
+	 * All but two requests are queued without a fence. Request
+	 * expiredRequestId_ will be queued with a fence that we won't signal
+	 * (which is then expected to expire), and request signalledRequestId_
+	 * will be queued with a fence that gets signalled. Select nbuffers_
+	 * and nbuffers_ * 2 for those two requests, to space them by a few
+	 * frames while still not requiring a long time for the test to
+	 * complete.
+	 */
+	expiredRequestId_ = nbuffers_;
+	signalledRequestId_ = nbuffers_ * 2;
 
 	return TestPass;
 }
@@ -189,16 +203,16 @@  void FenceTest::requestRequeue(Request *request)
 	const Request::BufferMap &buffers = request->buffers();
 	const Stream *stream = buffers.begin()->first;
 	FrameBuffer *buffer = buffers.begin()->second;
-	uint64_t cookie = request->cookie();
 
 	request->reuse();
 
-	if (cookie == signalledRequestId_ && setFence_) {
+	if (completedRequestId_ == signalledRequestId_ - nbuffers_ && setFence_) {
 		/*
-		 * The second time this request is queued add a fence to it.
-		 *
-		 * The main loop signals it by using a timer to write to the
-		 * efd2_ file descriptor before the fence expires.
+		 * This is the request that will be used to test fence
+		 * signalling when it completes next time. Add a fence to it,
+		 * using efd2_. The main loop will signal the fence by using a
+		 * timer to write to the efd2_ file descriptor before the fence
+		 * expires.
 		 */
 		std::unique_ptr<Fence> fence =
 			std::make_unique<Fence>(std::move(eventFd2_));
@@ -213,16 +227,15 @@  void FenceTest::requestRequeue(Request *request)
 
 void FenceTest::requestComplete(Request *request)
 {
-	uint64_t cookie = request->cookie();
-	completedRequest_ = cookie;
+	completedRequestId_++;
 
 	/*
-	 * The last request is expected to fail as its fence has not been
-	 * signaled.
+	 * Request expiredRequestId_ is expected to fail as its fence has not
+	 * been signalled.
 	 *
 	 * Validate the fence status but do not re-queue it.
 	 */
-	if (cookie == expiredRequestId_) {
+	if (completedRequestId_ == expiredRequestId_) {
 		if (validateExpiredRequest(request) != TestPass)
 			expectedCompletionResult_ = false;
 
@@ -230,7 +243,7 @@  void FenceTest::requestComplete(Request *request)
 		return;
 	}
 
-	/* Validate all requests but the last. */
+	/* Validate all other requests. */
 	if (validateRequest(request) != TestPass) {
 		expectedCompletionResult_ = false;
 
@@ -271,7 +284,7 @@  int FenceTest::run()
 		}
 
 		int ret;
-		if (i == expiredRequestId_) {
+		if (i == expiredRequestId_ - 1) {
 			/* This request will have a fence, and it will expire. */
 			std::unique_ptr<Fence> fence =
 				std::make_unique<Fence>(std::move(eventFd_));
@@ -318,11 +331,13 @@  int FenceTest::run()
 	Timer timer;
 	timer.start(1000ms);
 	while (timer.isRunning() && expectedCompletionResult_) {
-		if (completedRequest_ == signalledRequestId_ && setFence_)
+		if (completedRequestId_ == signalledRequestId_ - 1 && setFence_)
 			/*
-			 * signalledRequestId_ has just completed and it has
-			 * been re-queued with a fence. Start the timer to
-			 * signal the fence in 10 msec.
+			 * The request just before signalledRequestId_ has just
+			 * completed. Request signalledRequestId_ has been
+			 * queued with a fence, and libcamera is likely already
+			 * waiting on the fence, or will soon. Start the timer
+			 * to signal the fence in 10 msec.
 			 */
 			fenceTimer.start(10ms);