[v1] libcamera: base: semaphore: Do not unlock prematurely
diff mbox series

Message ID 20250814093138.2075098-1-barnabas.pocze@ideasonboard.com
State Accepted
Headers show
Series
  • [v1] libcamera: base: semaphore: Do not unlock prematurely
Related show

Commit Message

Barnabás Pőcze Aug. 14, 2025, 9:31 a.m. UTC
In `Semaphore::release()`, unlocking the mutex before signalling the condition
variable can be problematic, especially with "temporary" objects such as the
ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.

Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
has unlocked it, but before it had the chance to notify the condition variable.
In that case `Semaphore::acquire()` can succeed, and execution may proceed to
destroy the `Semaphore` object while the other thread is in the process of
running `std::condition_variable::notify_all()`.

See [0] for essentially the same issue in libcxx and its fix at [1].

[0]: https://github.com/llvm/llvm-project/issues/23667
[1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902

Bug: https://bugs.libcamera.org/show_bug.cgi?id=225
Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
---
 src/libcamera/base/semaphore.cpp | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Laurent Pinchart Aug. 14, 2025, 10:36 p.m. UTC | #1
Hi Barnabás,

Thank you for the patch.

On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote:
> In `Semaphore::release()`, unlocking the mutex before signalling the condition
> variable can be problematic, especially with "temporary" objects such as the
> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.
> 
> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
> has unlocked it, but before it had the chance to notify the condition variable.
> In that case `Semaphore::acquire()` can succeed, and execution may proceed to
> destroy the `Semaphore` object while the other thread is in the process of
> running `std::condition_variable::notify_all()`.
> 
> See [0] for essentially the same issue in libcxx and its fix at [1].
> 
> [0]: https://github.com/llvm/llvm-project/issues/23667
> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902
> 
> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225
> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
> ---
>  src/libcamera/base/semaphore.cpp | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
> index 862f3b313..6aec8b92d 100644
> --- a/src/libcamera/base/semaphore.cpp
> +++ b/src/libcamera/base/semaphore.cpp
> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n)
>   */
>  void Semaphore::release(unsigned int n)
>  {
> -	{
> -		MutexLocker locker(mutex_);
> -		available_ += n;
> -	}
> +	MutexLocker locker(mutex_);
>  
> +	available_ += n;
>  	cv_.notify_all();

It's a bit of a shame, as this degrades performance. The notify_all()
call can cause the thread being woken up to be scheduled on the same CPU
before the mutex is unlocked, and it will then immediately go back to
sleep when trying to acquire the mutex :-/

This fixes a real bug though, so

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

but I wonder if there could be a way to improve this.

>  }
>
Barnabás Pőcze Aug. 15, 2025, 7:41 a.m. UTC | #2
2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta:
> Hi Barnabás,
> 
> Thank you for the patch.
> 
> On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote:
>> In `Semaphore::release()`, unlocking the mutex before signalling the condition
>> variable can be problematic, especially with "temporary" objects such as the
>> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.
>>
>> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
>> has unlocked it, but before it had the chance to notify the condition variable.
>> In that case `Semaphore::acquire()` can succeed, and execution may proceed to
>> destroy the `Semaphore` object while the other thread is in the process of
>> running `std::condition_variable::notify_all()`.
>>
>> See [0] for essentially the same issue in libcxx and its fix at [1].
>>
>> [0]: https://github.com/llvm/llvm-project/issues/23667
>> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902
>>
>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225
>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
>> ---
>>   src/libcamera/base/semaphore.cpp | 6 ++----
>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
>> index 862f3b313..6aec8b92d 100644
>> --- a/src/libcamera/base/semaphore.cpp
>> +++ b/src/libcamera/base/semaphore.cpp
>> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n)
>>    */
>>   void Semaphore::release(unsigned int n)
>>   {
>> -	{
>> -		MutexLocker locker(mutex_);
>> -		available_ += n;
>> -	}
>> +	MutexLocker locker(mutex_);
>>   
>> +	available_ += n;
>>   	cv_.notify_all();
> 
> It's a bit of a shame, as this degrades performance. The notify_all()
> call can cause the thread being woken up to be scheduled on the same CPU
> before the mutex is unlocked, and it will then immediately go back to
> sleep when trying to acquire the mutex :-/
> 
> This fixes a real bug though, so
> 
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> 
> but I wonder if there could be a way to improve this.

For example, in the case of message passing, in C++20 one possible solution would
be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or
there is std::latch or std::binary_semaphore, but both are more complex than the
simple "event" mechanism that message passing requires.


Regards,
Barnabás Pőcze


> 
>>   }
>>   
>
Kieran Bingham Aug. 18, 2025, 11:40 a.m. UTC | #3
Quoting Barnabás Pőcze (2025-08-15 08:41:43)
> 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta:
> > Hi Barnabás,
> > 
> > Thank you for the patch.
> > 
> > On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote:
> >> In `Semaphore::release()`, unlocking the mutex before signalling the condition
> >> variable can be problematic, especially with "temporary" objects such as the
> >> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.
> >>
> >> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
> >> has unlocked it, but before it had the chance to notify the condition variable.
> >> In that case `Semaphore::acquire()` can succeed, and execution may proceed to
> >> destroy the `Semaphore` object while the other thread is in the process of
> >> running `std::condition_variable::notify_all()`.
> >>
> >> See [0] for essentially the same issue in libcxx and its fix at [1].
> >>
> >> [0]: https://github.com/llvm/llvm-project/issues/23667
> >> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902
> >>
> >> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225

If this fixes a bug in the code - is there value in a Fixes: tag ?

> >> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
> >> ---
> >>   src/libcamera/base/semaphore.cpp | 6 ++----
> >>   1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
> >> index 862f3b313..6aec8b92d 100644
> >> --- a/src/libcamera/base/semaphore.cpp
> >> +++ b/src/libcamera/base/semaphore.cpp
> >> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n)
> >>    */
> >>   void Semaphore::release(unsigned int n)
> >>   {
> >> -    {
> >> -            MutexLocker locker(mutex_);
> >> -            available_ += n;
> >> -    }
> >> +    MutexLocker locker(mutex_);
> >>   
> >> +    available_ += n;
> >>      cv_.notify_all();
> > 
> > It's a bit of a shame, as this degrades performance. The notify_all()
> > call can cause the thread being woken up to be scheduled on the same CPU
> > before the mutex is unlocked, and it will then immediately go back to
> > sleep when trying to acquire the mutex :-/

That sounds annoying - but how often will it happen or what would the
cost actually be ? I guess that's hard to measure.

> > 
> > This fixes a real bug though, so
> > 
> > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

> > 
> > but I wonder if there could be a way to improve this.
> 
> For example, in the case of message passing, in C++20 one possible solution would
> be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or
> there is std::latch or std::binary_semaphore, but both are more complex than the
> simple "event" mechanism that message passing requires.
> 
> 
> Regards,
> Barnabás Pőcze
> 
> 
> > 
> >>   }
> >>   
> > 
>
Barnabás Pőcze Aug. 18, 2025, 3:03 p.m. UTC | #4
2025. 08. 18. 13:40 keltezéssel, Kieran Bingham írta:
> Quoting Barnabás Pőcze (2025-08-15 08:41:43)
>> 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta:
>>> Hi Barnabás,
>>>
>>> Thank you for the patch.
>>>
>>> On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote:
>>>> In `Semaphore::release()`, unlocking the mutex before signalling the condition
>>>> variable can be problematic, especially with "temporary" objects such as the
>>>> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.
>>>>
>>>> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
>>>> has unlocked it, but before it had the chance to notify the condition variable.
>>>> In that case `Semaphore::acquire()` can succeed, and execution may proceed to
>>>> destroy the `Semaphore` object while the other thread is in the process of
>>>> running `std::condition_variable::notify_all()`.
>>>>
>>>> See [0] for essentially the same issue in libcxx and its fix at [1].
>>>>
>>>> [0]: https://github.com/llvm/llvm-project/issues/23667
>>>> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902
>>>>
>>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225
> 
> If this fixes a bug in the code - is there value in a Fixes: tag ?

It's been like this since this class had been added:
66e7c5b774e288faa3a9b413861d6a77723db3ad ("libcamera: Add Semaphore class")

I'll add it now.


> 
>>>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
>>>> ---
>>>>    src/libcamera/base/semaphore.cpp | 6 ++----
>>>>    1 file changed, 2 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
>>>> index 862f3b313..6aec8b92d 100644
>>>> --- a/src/libcamera/base/semaphore.cpp
>>>> +++ b/src/libcamera/base/semaphore.cpp
>>>> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n)
>>>>     */
>>>>    void Semaphore::release(unsigned int n)
>>>>    {
>>>> -    {
>>>> -            MutexLocker locker(mutex_);
>>>> -            available_ += n;
>>>> -    }
>>>> +    MutexLocker locker(mutex_);
>>>>    
>>>> +    available_ += n;
>>>>       cv_.notify_all();
>>>
>>> It's a bit of a shame, as this degrades performance. The notify_all()
>>> call can cause the thread being woken up to be scheduled on the same CPU
>>> before the mutex is unlocked, and it will then immediately go back to
>>> sleep when trying to acquire the mutex :-/
> 
> That sounds annoying - but how often will it happen or what would the
> cost actually be ? I guess that's hard to measure.
> 
>>>
>>> This fixes a real bug though, so
>>>
>>> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> 
> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
>>>
>>> but I wonder if there could be a way to improve this.
>>
>> For example, in the case of message passing, in C++20 one possible solution would
>> be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or
>> there is std::latch or std::binary_semaphore, but both are more complex than the
>> simple "event" mechanism that message passing requires.
>>
>>
>> Regards,
>> Barnabás Pőcze
>>
>>
>>>
>>>>    }
>>>>    
>>>
>>
Kieran Bingham Aug. 18, 2025, 10:50 p.m. UTC | #5
Quoting Barnabás Pőcze (2025-08-18 16:03:48)
> 2025. 08. 18. 13:40 keltezéssel, Kieran Bingham írta:
> > Quoting Barnabás Pőcze (2025-08-15 08:41:43)
> >> 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta:
> >>> Hi Barnabás,
> >>>
> >>> Thank you for the patch.
> >>>
> >>> On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote:
> >>>> In `Semaphore::release()`, unlocking the mutex before signalling the condition
> >>>> variable can be problematic, especially with "temporary" objects such as the
> >>>> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`.
> >>>>
> >>>> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()`
> >>>> has unlocked it, but before it had the chance to notify the condition variable.
> >>>> In that case `Semaphore::acquire()` can succeed, and execution may proceed to
> >>>> destroy the `Semaphore` object while the other thread is in the process of
> >>>> running `std::condition_variable::notify_all()`.
> >>>>
> >>>> See [0] for essentially the same issue in libcxx and its fix at [1].
> >>>>
> >>>> [0]: https://github.com/llvm/llvm-project/issues/23667
> >>>> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902
> >>>>
> >>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225
> > 
> > If this fixes a bug in the code - is there value in a Fixes: tag ?
> 
> It's been like this since this class had been added:
> 66e7c5b774e288faa3a9b413861d6a77723db3ad ("libcamera: Add Semaphore class")
> 
> I'll add it now.


Thanks - that will help the code archeologists :D (but it also helps my
release notes!)


--
Kieran
> 
> 
> > 
> >>>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
> >>>> ---
> >>>>    src/libcamera/base/semaphore.cpp | 6 ++----
> >>>>    1 file changed, 2 insertions(+), 4 deletions(-)
> >>>>
> >>>> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
> >>>> index 862f3b313..6aec8b92d 100644
> >>>> --- a/src/libcamera/base/semaphore.cpp
> >>>> +++ b/src/libcamera/base/semaphore.cpp
> >>>> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n)
> >>>>     */
> >>>>    void Semaphore::release(unsigned int n)
> >>>>    {
> >>>> -    {
> >>>> -            MutexLocker locker(mutex_);
> >>>> -            available_ += n;
> >>>> -    }
> >>>> +    MutexLocker locker(mutex_);
> >>>>    
> >>>> +    available_ += n;
> >>>>       cv_.notify_all();
> >>>
> >>> It's a bit of a shame, as this degrades performance. The notify_all()
> >>> call can cause the thread being woken up to be scheduled on the same CPU
> >>> before the mutex is unlocked, and it will then immediately go back to
> >>> sleep when trying to acquire the mutex :-/
> > 
> > That sounds annoying - but how often will it happen or what would the
> > cost actually be ? I guess that's hard to measure.
> > 
> >>>
> >>> This fixes a real bug though, so
> >>>
> >>> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > 
> > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > 
> >>>
> >>> but I wonder if there could be a way to improve this.
> >>
> >> For example, in the case of message passing, in C++20 one possible solution would
> >> be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or
> >> there is std::latch or std::binary_semaphore, but both are more complex than the
> >> simple "event" mechanism that message passing requires.
> >>
> >>
> >> Regards,
> >> Barnabás Pőcze
> >>
> >>
> >>>
> >>>>    }
> >>>>    
> >>>
> >>
>

Patch
diff mbox series

diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp
index 862f3b313..6aec8b92d 100644
--- a/src/libcamera/base/semaphore.cpp
+++ b/src/libcamera/base/semaphore.cpp
@@ -93,11 +93,9 @@  bool Semaphore::tryAcquire(unsigned int n)
  */
 void Semaphore::release(unsigned int n)
 {
-	{
-		MutexLocker locker(mutex_);
-		available_ += n;
-	}
+	MutexLocker locker(mutex_);
 
+	available_ += n;
 	cv_.notify_all();
 }