Message ID | 20250814093138.2075098-1-barnabas.pocze@ideasonboard.com |
---|---|
State | Accepted |
Headers | show |
Series |
|
Related | show |
Hi Barnabás, Thank you for the patch. On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote: > In `Semaphore::release()`, unlocking the mutex before signalling the condition > variable can be problematic, especially with "temporary" objects such as the > ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. > > Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` > has unlocked it, but before it had the chance to notify the condition variable. > In that case `Semaphore::acquire()` can succeed, and execution may proceed to > destroy the `Semaphore` object while the other thread is in the process of > running `std::condition_variable::notify_all()`. > > See [0] for essentially the same issue in libcxx and its fix at [1]. > > [0]: https://github.com/llvm/llvm-project/issues/23667 > [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 > Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> > --- > src/libcamera/base/semaphore.cpp | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp > index 862f3b313..6aec8b92d 100644 > --- a/src/libcamera/base/semaphore.cpp > +++ b/src/libcamera/base/semaphore.cpp > @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) > */ > void Semaphore::release(unsigned int n) > { > - { > - MutexLocker locker(mutex_); > - available_ += n; > - } > + MutexLocker locker(mutex_); > > + available_ += n; > cv_.notify_all(); It's a bit of a shame, as this degrades performance. The notify_all() call can cause the thread being woken up to be scheduled on the same CPU before the mutex is unlocked, and it will then immediately go back to sleep when trying to acquire the mutex :-/ This fixes a real bug though, so Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> but I wonder if there could be a way to improve this. > } >
2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta: > Hi Barnabás, > > Thank you for the patch. > > On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote: >> In `Semaphore::release()`, unlocking the mutex before signalling the condition >> variable can be problematic, especially with "temporary" objects such as the >> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. >> >> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` >> has unlocked it, but before it had the chance to notify the condition variable. >> In that case `Semaphore::acquire()` can succeed, and execution may proceed to >> destroy the `Semaphore` object while the other thread is in the process of >> running `std::condition_variable::notify_all()`. >> >> See [0] for essentially the same issue in libcxx and its fix at [1]. >> >> [0]: https://github.com/llvm/llvm-project/issues/23667 >> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 >> >> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 >> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> >> --- >> src/libcamera/base/semaphore.cpp | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp >> index 862f3b313..6aec8b92d 100644 >> --- a/src/libcamera/base/semaphore.cpp >> +++ b/src/libcamera/base/semaphore.cpp >> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) >> */ >> void Semaphore::release(unsigned int n) >> { >> - { >> - MutexLocker locker(mutex_); >> - available_ += n; >> - } >> + MutexLocker locker(mutex_); >> >> + available_ += n; >> cv_.notify_all(); > > It's a bit of a shame, as this degrades performance. The notify_all() > call can cause the thread being woken up to be scheduled on the same CPU > before the mutex is unlocked, and it will then immediately go back to > sleep when trying to acquire the mutex :-/ > > This fixes a real bug though, so > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > but I wonder if there could be a way to improve this. For example, in the case of message passing, in C++20 one possible solution would be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or there is std::latch or std::binary_semaphore, but both are more complex than the simple "event" mechanism that message passing requires. Regards, Barnabás Pőcze > >> } >> >
Quoting Barnabás Pőcze (2025-08-15 08:41:43) > 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta: > > Hi Barnabás, > > > > Thank you for the patch. > > > > On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote: > >> In `Semaphore::release()`, unlocking the mutex before signalling the condition > >> variable can be problematic, especially with "temporary" objects such as the > >> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. > >> > >> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` > >> has unlocked it, but before it had the chance to notify the condition variable. > >> In that case `Semaphore::acquire()` can succeed, and execution may proceed to > >> destroy the `Semaphore` object while the other thread is in the process of > >> running `std::condition_variable::notify_all()`. > >> > >> See [0] for essentially the same issue in libcxx and its fix at [1]. > >> > >> [0]: https://github.com/llvm/llvm-project/issues/23667 > >> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 > >> > >> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 If this fixes a bug in the code - is there value in a Fixes: tag ? > >> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> > >> --- > >> src/libcamera/base/semaphore.cpp | 6 ++---- > >> 1 file changed, 2 insertions(+), 4 deletions(-) > >> > >> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp > >> index 862f3b313..6aec8b92d 100644 > >> --- a/src/libcamera/base/semaphore.cpp > >> +++ b/src/libcamera/base/semaphore.cpp > >> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) > >> */ > >> void Semaphore::release(unsigned int n) > >> { > >> - { > >> - MutexLocker locker(mutex_); > >> - available_ += n; > >> - } > >> + MutexLocker locker(mutex_); > >> > >> + available_ += n; > >> cv_.notify_all(); > > > > It's a bit of a shame, as this degrades performance. The notify_all() > > call can cause the thread being woken up to be scheduled on the same CPU > > before the mutex is unlocked, and it will then immediately go back to > > sleep when trying to acquire the mutex :-/ That sounds annoying - but how often will it happen or what would the cost actually be ? I guess that's hard to measure. > > > > This fixes a real bug though, so > > > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > > but I wonder if there could be a way to improve this. > > For example, in the case of message passing, in C++20 one possible solution would > be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or > there is std::latch or std::binary_semaphore, but both are more complex than the > simple "event" mechanism that message passing requires. > > > Regards, > Barnabás Pőcze > > > > > >> } > >> > > >
2025. 08. 18. 13:40 keltezéssel, Kieran Bingham írta: > Quoting Barnabás Pőcze (2025-08-15 08:41:43) >> 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta: >>> Hi Barnabás, >>> >>> Thank you for the patch. >>> >>> On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote: >>>> In `Semaphore::release()`, unlocking the mutex before signalling the condition >>>> variable can be problematic, especially with "temporary" objects such as the >>>> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. >>>> >>>> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` >>>> has unlocked it, but before it had the chance to notify the condition variable. >>>> In that case `Semaphore::acquire()` can succeed, and execution may proceed to >>>> destroy the `Semaphore` object while the other thread is in the process of >>>> running `std::condition_variable::notify_all()`. >>>> >>>> See [0] for essentially the same issue in libcxx and its fix at [1]. >>>> >>>> [0]: https://github.com/llvm/llvm-project/issues/23667 >>>> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 >>>> >>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 > > If this fixes a bug in the code - is there value in a Fixes: tag ? It's been like this since this class had been added: 66e7c5b774e288faa3a9b413861d6a77723db3ad ("libcamera: Add Semaphore class") I'll add it now. > >>>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> >>>> --- >>>> src/libcamera/base/semaphore.cpp | 6 ++---- >>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp >>>> index 862f3b313..6aec8b92d 100644 >>>> --- a/src/libcamera/base/semaphore.cpp >>>> +++ b/src/libcamera/base/semaphore.cpp >>>> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) >>>> */ >>>> void Semaphore::release(unsigned int n) >>>> { >>>> - { >>>> - MutexLocker locker(mutex_); >>>> - available_ += n; >>>> - } >>>> + MutexLocker locker(mutex_); >>>> >>>> + available_ += n; >>>> cv_.notify_all(); >>> >>> It's a bit of a shame, as this degrades performance. The notify_all() >>> call can cause the thread being woken up to be scheduled on the same CPU >>> before the mutex is unlocked, and it will then immediately go back to >>> sleep when trying to acquire the mutex :-/ > > That sounds annoying - but how often will it happen or what would the > cost actually be ? I guess that's hard to measure. > >>> >>> This fixes a real bug though, so >>> >>> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > >>> >>> but I wonder if there could be a way to improve this. >> >> For example, in the case of message passing, in C++20 one possible solution would >> be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or >> there is std::latch or std::binary_semaphore, but both are more complex than the >> simple "event" mechanism that message passing requires. >> >> >> Regards, >> Barnabás Pőcze >> >> >>> >>>> } >>>> >>> >>
Quoting Barnabás Pőcze (2025-08-18 16:03:48) > 2025. 08. 18. 13:40 keltezéssel, Kieran Bingham írta: > > Quoting Barnabás Pőcze (2025-08-15 08:41:43) > >> 2025. 08. 15. 0:36 keltezéssel, Laurent Pinchart írta: > >>> Hi Barnabás, > >>> > >>> Thank you for the patch. > >>> > >>> On Thu, Aug 14, 2025 at 11:31:38AM +0200, Barnabás Pőcze wrote: > >>>> In `Semaphore::release()`, unlocking the mutex before signalling the condition > >>>> variable can be problematic, especially with "temporary" objects such as the > >>>> ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. > >>>> > >>>> Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` > >>>> has unlocked it, but before it had the chance to notify the condition variable. > >>>> In that case `Semaphore::acquire()` can succeed, and execution may proceed to > >>>> destroy the `Semaphore` object while the other thread is in the process of > >>>> running `std::condition_variable::notify_all()`. > >>>> > >>>> See [0] for essentially the same issue in libcxx and its fix at [1]. > >>>> > >>>> [0]: https://github.com/llvm/llvm-project/issues/23667 > >>>> [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 > >>>> > >>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 > > > > If this fixes a bug in the code - is there value in a Fixes: tag ? > > It's been like this since this class had been added: > 66e7c5b774e288faa3a9b413861d6a77723db3ad ("libcamera: Add Semaphore class") > > I'll add it now. Thanks - that will help the code archeologists :D (but it also helps my release notes!) -- Kieran > > > > > >>>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> > >>>> --- > >>>> src/libcamera/base/semaphore.cpp | 6 ++---- > >>>> 1 file changed, 2 insertions(+), 4 deletions(-) > >>>> > >>>> diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp > >>>> index 862f3b313..6aec8b92d 100644 > >>>> --- a/src/libcamera/base/semaphore.cpp > >>>> +++ b/src/libcamera/base/semaphore.cpp > >>>> @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) > >>>> */ > >>>> void Semaphore::release(unsigned int n) > >>>> { > >>>> - { > >>>> - MutexLocker locker(mutex_); > >>>> - available_ += n; > >>>> - } > >>>> + MutexLocker locker(mutex_); > >>>> > >>>> + available_ += n; > >>>> cv_.notify_all(); > >>> > >>> It's a bit of a shame, as this degrades performance. The notify_all() > >>> call can cause the thread being woken up to be scheduled on the same CPU > >>> before the mutex is unlocked, and it will then immediately go back to > >>> sleep when trying to acquire the mutex :-/ > > > > That sounds annoying - but how often will it happen or what would the > > cost actually be ? I guess that's hard to measure. > > > >>> > >>> This fixes a real bug though, so > >>> > >>> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > > > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > >>> > >>> but I wonder if there could be a way to improve this. > >> > >> For example, in the case of message passing, in C++20 one possible solution would > >> be to use one `std::atomic<bool>` and its `wait()` and `notify_one()` methods. Or > >> there is std::latch or std::binary_semaphore, but both are more complex than the > >> simple "event" mechanism that message passing requires. > >> > >> > >> Regards, > >> Barnabás Pőcze > >> > >> > >>> > >>>> } > >>>> > >>> > >> >
diff --git a/src/libcamera/base/semaphore.cpp b/src/libcamera/base/semaphore.cpp index 862f3b313..6aec8b92d 100644 --- a/src/libcamera/base/semaphore.cpp +++ b/src/libcamera/base/semaphore.cpp @@ -93,11 +93,9 @@ bool Semaphore::tryAcquire(unsigned int n) */ void Semaphore::release(unsigned int n) { - { - MutexLocker locker(mutex_); - available_ += n; - } + MutexLocker locker(mutex_); + available_ += n; cv_.notify_all(); }
In `Semaphore::release()`, unlocking the mutex before signalling the condition variable can be problematic, especially with "temporary" objects such as the ones `BoundMethodBase::activatePack()` uses to handle `ConnectionTypeBlocking`. Specifically, `Semaphore::acquire()` might lock the mutex after `Semaphore::release()` has unlocked it, but before it had the chance to notify the condition variable. In that case `Semaphore::acquire()` can succeed, and execution may proceed to destroy the `Semaphore` object while the other thread is in the process of running `std::condition_variable::notify_all()`. See [0] for essentially the same issue in libcxx and its fix at [1]. [0]: https://github.com/llvm/llvm-project/issues/23667 [1]: https://github.com/llvm/llvm-project/commit/b3ec43d78aa2599e1592392844a7ec638fab7902 Bug: https://bugs.libcamera.org/show_bug.cgi?id=225 Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com> --- src/libcamera/base/semaphore.cpp | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)