[libcamera-devel] libcamera: base: thread: Prevent messages to thread without an event loop
diff mbox series

Message ID 20220627164612.12049-1-laurent.pinchart@ideasonboard.com
State Changes Requested
Headers show
Series
  • [libcamera-devel] libcamera: base: thread: Prevent messages to thread without an event loop
Related show

Commit Message

Laurent Pinchart June 27, 2022, 4:46 p.m. UTC
A message posted to an object bound to a thread without an event loop
will never get delivered. This indicates a bug in the caller, and
materializes in ways that can be hard to debug, such as a signal never
being delivered. Add an assertion in Thread::postMessage() to catch this
at the source.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 src/libcamera/base/thread.cpp | 1 +
 1 file changed, 1 insertion(+)


base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5

Comments

Tomi Valkeinen June 28, 2022, 8:09 a.m. UTC | #1
On 27/06/2022 19:46, Laurent Pinchart wrote:
> A message posted to an object bound to a thread without an event loop
> will never get delivered. This indicates a bug in the caller, and
> materializes in ways that can be hard to debug, such as a signal never
> being delivered. Add an assertion in Thread::postMessage() to catch this
> at the source.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>   src/libcamera/base/thread.cpp | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> index 6bda9d1462f5..6ead1d7c12d0 100644
> --- a/src/libcamera/base/thread.cpp
> +++ b/src/libcamera/base/thread.cpp
> @@ -530,6 +530,7 @@ void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
>   	msg->receiver_ = receiver;
>   
>   	ASSERT(data_ == receiver->thread()->data_);
> +	ASSERT(data_ != mainThread.data_);
>   
>   	MutexLocker locker(data_->messages_.mutex_);
>   	data_->messages_.list_.push_back(std::move(msg));
> 
> base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5

So what was the issue with the code I had? The problem is that 
CameraManager is a libcamera Object, and was associated with the main 
thread...? And a PyCameraManager which doesn't inherit CameraManager is 
not an Object and not associated with any thread, so it works?

  Tomi
Laurent Pinchart June 28, 2022, 9:46 a.m. UTC | #2
On Tue, Jun 28, 2022 at 11:09:21AM +0300, Tomi Valkeinen wrote:
> On 27/06/2022 19:46, Laurent Pinchart wrote:
> > A message posted to an object bound to a thread without an event loop
> > will never get delivered. This indicates a bug in the caller, and
> > materializes in ways that can be hard to debug, such as a signal never
> > being delivered. Add an assertion in Thread::postMessage() to catch this
> > at the source.
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >   src/libcamera/base/thread.cpp | 1 +
> >   1 file changed, 1 insertion(+)
> > 
> > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > index 6bda9d1462f5..6ead1d7c12d0 100644
> > --- a/src/libcamera/base/thread.cpp
> > +++ b/src/libcamera/base/thread.cpp
> > @@ -530,6 +530,7 @@ void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
> >   	msg->receiver_ = receiver;
> >   
> >   	ASSERT(data_ == receiver->thread()->data_);
> > +	ASSERT(data_ != mainThread.data_);
> >   
> >   	MutexLocker locker(data_->messages_.mutex_);
> >   	data_->messages_.list_.push_back(std::move(msg));
> > 
> > base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5
> 
> So what was the issue with the code I had? The problem is that 
> CameraManager is a libcamera Object, and was associated with the main 
> thread...? And a PyCameraManager which doesn't inherit CameraManager is 
> not an Object and not associated with any thread, so it works?

That's right. Because PyCameraManager inherited from Object, it
automatically got support for queued signal delivery, where a signal
emitted in one thread would be processed in a different thread. This
mechanism requires the receiving object to be bound to a thread that has
an event loop managed by libcamera. By default an Object is bound to the
thread it's created in.

The CameraManager is created in the Python main thread, and the signal
is emitted from an internal libcamera thread. The signal will thus be
queued (the default connection type is Auto, which means that signals
are delivered directly if the receiver object is bound to the thread
that the signal is emitted from, or queued otherwise), which queues a
message for the receiver's thread event loop. As the receiver thread is
the Python main thread (the application thread from a libcamera point of
view), which doesn't have a libcamera event loop, the message is never
processed.

If the receiver doesn't inherit from the Object class, then the signal
will be delivered directly, in the emitter's thread. You can achieve the
same result with an Object receiver by specifying the direct connection
type when connecting the signal.
Laurent Pinchart Oct. 6, 2022, 11:15 p.m. UTC | #3
Ping for review.

On Tue, Jun 28, 2022 at 12:46:27PM +0300, Laurent Pinchart via libcamera-devel wrote:
> On Tue, Jun 28, 2022 at 11:09:21AM +0300, Tomi Valkeinen wrote:
> > On 27/06/2022 19:46, Laurent Pinchart wrote:
> > > A message posted to an object bound to a thread without an event loop
> > > will never get delivered. This indicates a bug in the caller, and
> > > materializes in ways that can be hard to debug, such as a signal never
> > > being delivered. Add an assertion in Thread::postMessage() to catch this
> > > at the source.
> > > 
> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > ---
> > >   src/libcamera/base/thread.cpp | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > > index 6bda9d1462f5..6ead1d7c12d0 100644
> > > --- a/src/libcamera/base/thread.cpp
> > > +++ b/src/libcamera/base/thread.cpp
> > > @@ -530,6 +530,7 @@ void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
> > >   	msg->receiver_ = receiver;
> > >   
> > >   	ASSERT(data_ == receiver->thread()->data_);
> > > +	ASSERT(data_ != mainThread.data_);
> > >   
> > >   	MutexLocker locker(data_->messages_.mutex_);
> > >   	data_->messages_.list_.push_back(std::move(msg));
> > > 
> > > base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5
> > 
> > So what was the issue with the code I had? The problem is that 
> > CameraManager is a libcamera Object, and was associated with the main 
> > thread...? And a PyCameraManager which doesn't inherit CameraManager is 
> > not an Object and not associated with any thread, so it works?
> 
> That's right. Because PyCameraManager inherited from Object, it
> automatically got support for queued signal delivery, where a signal
> emitted in one thread would be processed in a different thread. This
> mechanism requires the receiving object to be bound to a thread that has
> an event loop managed by libcamera. By default an Object is bound to the
> thread it's created in.
> 
> The CameraManager is created in the Python main thread, and the signal
> is emitted from an internal libcamera thread. The signal will thus be
> queued (the default connection type is Auto, which means that signals
> are delivered directly if the receiver object is bound to the thread
> that the signal is emitted from, or queued otherwise), which queues a
> message for the receiver's thread event loop. As the receiver thread is
> the Python main thread (the application thread from a libcamera point of
> view), which doesn't have a libcamera event loop, the message is never
> processed.
> 
> If the receiver doesn't inherit from the Object class, then the signal
> will be delivered directly, in the emitter's thread. You can achieve the
> same result with an Object receiver by specifying the direct connection
> type when connecting the signal.
Umang Jain Oct. 7, 2022, 9:42 a.m. UTC | #4
Hi Laurent,

Thank you for the explanation.

Need few clarifications below

On 6/28/22 3:16 PM, Laurent Pinchart via libcamera-devel wrote:
> On Tue, Jun 28, 2022 at 11:09:21AM +0300, Tomi Valkeinen wrote:
>> On 27/06/2022 19:46, Laurent Pinchart wrote:
>>> A message posted to an object bound to a thread without an event loop
>>> will never get delivered. This indicates a bug in the caller, and
>>> materializes in ways that can be hard to debug, such as a signal never
>>> being delivered. Add an assertion in Thread::postMessage() to catch this
>>> at the source.
>>>
>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>> ---
>>>    src/libcamera/base/thread.cpp | 1 +
>>>    1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
>>> index 6bda9d1462f5..6ead1d7c12d0 100644
>>> --- a/src/libcamera/base/thread.cpp
>>> +++ b/src/libcamera/base/thread.cpp
>>> @@ -530,6 +530,7 @@ void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
>>>    	msg->receiver_ = receiver;
>>>    
>>>    	ASSERT(data_ == receiver->thread()->data_);
>>> +	ASSERT(data_ != mainThread.data_);

Simply reading the commit message and inferring, the first change I 
expected was on the lines;
         ASSERT (data_->dispatcher_ != nullptr)

or for readability,

   	ASSERT(receiver->thread()->data_->dispatcher_ != nullptr);


that is, checking the receiver's thread event loop exists or not.
>>>    
>>>    	MutexLocker locker(data_->messages_.mutex_);
>>>    	data_->messages_.list_.push_back(std::move(msg));
>>>
>>> base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5
>> So what was the issue with the code I had? The problem is that
>> CameraManager is a libcamera Object, and was associated with the main
>> thread...? And a PyCameraManager which doesn't inherit CameraManager is
>> not an Object and not associated with any thread, so it works?
> That's right. Because PyCameraManager inherited from Object, it
> automatically got support for queued signal delivery, where a signal
> emitted in one thread would be processed in a different thread. This
> mechanism requires the receiving object to be bound to a thread that has
> an event loop managed by libcamera. By default an Object is bound to the
> thread it's created in.
>
> The CameraManager is created in the Python main thread, and the signal
> is emitted from an internal libcamera thread. The signal will thus be
> queued (the default connection type is Auto, which means that signals
> are delivered directly if the receiver object is bound to the thread
> that the signal is emitted from, or queued otherwise), which queues a
> message for the receiver's thread event loop. As the receiver thread is
> the Python main thread (the application thread from a libcamera point of
> view), which doesn't have a libcamera event loop, the message is never
> processed.

I understand the explanation above.

However I'm a bit lost with mainThread static variable. Am I right to 
infer that the mainThread is the Python's main thread in this case? If 
yes, the patch is correct, refraining the messages to be posted to a 
object whose threads has no event loop.

However, how about if the mainThread inherits from libcamera (hence 
Object, Thread, Signal delivery is in place) but tries to send a message 
to another Object bound to itself (i.e. mainThread)
Won't the assertion in this patch fail ?

> If the receiver doesn't inherit from the Object class, then the signal
> will be delivered directly, in the emitter's thread. You can achieve the
> same result with an Object receiver by specifying the direct connection
> type when connecting the signal.
>
Umang Jain Oct. 7, 2022, 11:54 a.m. UTC | #5
Hi,

On 10/7/22 3:12 PM, Umang Jain via libcamera-devel wrote:
> Hi Laurent,
>
> Thank you for the explanation.
>
> Need few clarifications below
>
> On 6/28/22 3:16 PM, Laurent Pinchart via libcamera-devel wrote:
>> On Tue, Jun 28, 2022 at 11:09:21AM +0300, Tomi Valkeinen wrote:
>>> On 27/06/2022 19:46, Laurent Pinchart wrote:
>>>> A message posted to an object bound to a thread without an event loop
>>>> will never get delivered. This indicates a bug in the caller, and
>>>> materializes in ways that can be hard to debug, such as a signal never
>>>> being delivered. Add an assertion in Thread::postMessage() to catch 
>>>> this
>>>> at the source.
>>>>
>>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>>> ---
>>>>    src/libcamera/base/thread.cpp | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/src/libcamera/base/thread.cpp 
>>>> b/src/libcamera/base/thread.cpp
>>>> index 6bda9d1462f5..6ead1d7c12d0 100644
>>>> --- a/src/libcamera/base/thread.cpp
>>>> +++ b/src/libcamera/base/thread.cpp
>>>> @@ -530,6 +530,7 @@ void 
>>>> Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
>>>>        msg->receiver_ = receiver;
>>>>           ASSERT(data_ == receiver->thread()->data_);
>>>> +    ASSERT(data_ != mainThread.data_);
>
> Simply reading the commit message and inferring, the first change I 
> expected was on the lines;
>         ASSERT (data_->dispatcher_ != nullptr)
>
> or for readability,
>
>       ASSERT(receiver->thread()->data_->dispatcher_ != nullptr);
>
>
> that is, checking the receiver's thread event loop exists or not.
>>>>           MutexLocker locker(data_->messages_.mutex_);
>>>>        data_->messages_.list_.push_back(std::move(msg));
>>>>
>>>> base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5
>>> So what was the issue with the code I had? The problem is that
>>> CameraManager is a libcamera Object, and was associated with the main
>>> thread...? And a PyCameraManager which doesn't inherit CameraManager is
>>> not an Object and not associated with any thread, so it works?
>> That's right. Because PyCameraManager inherited from Object, it
>> automatically got support for queued signal delivery, where a signal
>> emitted in one thread would be processed in a different thread. This
>> mechanism requires the receiving object to be bound to a thread that has
>> an event loop managed by libcamera. By default an Object is bound to the
>> thread it's created in.
>>
>> The CameraManager is created in the Python main thread, and the signal
>> is emitted from an internal libcamera thread. The signal will thus be
>> queued (the default connection type is Auto, which means that signals
>> are delivered directly if the receiver object is bound to the thread
>> that the signal is emitted from, or queued otherwise), which queues a
>> message for the receiver's thread event loop. As the receiver thread is
>> the Python main thread (the application thread from a libcamera point of
>> view), which doesn't have a libcamera event loop, the message is never
>> processed.
>
> I understand the explanation above.
>
> However I'm a bit lost with mainThread static variable. Am I right to 
> infer that the mainThread is the Python's main thread in this case? If 
> yes, the patch is correct, refraining the messages to be posted to a 
> object whose threads has no event loop.
>
> However, how about if the mainThread inherits from libcamera (hence 
> Object, Thread, Signal delivery is in place) but tries to send a 
> message to another Object bound to itself (i.e. mainThread)
> Won't the assertion in this patch fail ?

Ran the test suite with this patch and saw 3 failures

66/71 libcamera / timer-thread FAIL           15.52s   killed by signal 
6 SIGABRT
67/71 libcamera / event-thread FAIL           23.13s   killed by signal 
6 SIGABRT
68/71 libcamera / object-invoke FAIL           23.99s   killed by signal 
6 SIGABRT
69/71 libcamera / object-delete TIMEOUT        30.02s   killed by signal 
15 SIGTERM

All triggering the assertion.
>
>> If the receiver doesn't inherit from the Object class, then the signal
>> will be delivered directly, in the emitter's thread. You can achieve the
>> same result with an Object receiver by specifying the direct connection
>> type when connecting the signal.
>>
>
Laurent Pinchart Aug. 1, 2023, 2:34 p.m. UTC | #6
On Fri, Oct 07, 2022 at 05:24:48PM +0530, Umang Jain wrote:
> On 10/7/22 3:12 PM, Umang Jain via libcamera-devel wrote:
> > Hi Laurent,
> >
> > Thank you for the explanation.
> >
> > Need few clarifications below
> >
> > On 6/28/22 3:16 PM, Laurent Pinchart via libcamera-devel wrote:
> >> On Tue, Jun 28, 2022 at 11:09:21AM +0300, Tomi Valkeinen wrote:
> >>> On 27/06/2022 19:46, Laurent Pinchart wrote:
> >>>> A message posted to an object bound to a thread without an event loop
> >>>> will never get delivered. This indicates a bug in the caller, and
> >>>> materializes in ways that can be hard to debug, such as a signal never
> >>>> being delivered. Add an assertion in Thread::postMessage() to catch 
> >>>> this
> >>>> at the source.
> >>>>
> >>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> >>>> ---
> >>>>    src/libcamera/base/thread.cpp | 1 +
> >>>>    1 file changed, 1 insertion(+)
> >>>>
> >>>> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> >>>> index 6bda9d1462f5..6ead1d7c12d0 100644
> >>>> --- a/src/libcamera/base/thread.cpp
> >>>> +++ b/src/libcamera/base/thread.cpp
> >>>> @@ -530,6 +530,7 @@ void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
> >>>>  	msg->receiver_ = receiver;
> >>>>
> >>>>  	ASSERT(data_ == receiver->thread()->data_);
> >>>> +	ASSERT(data_ != mainThread.data_);
> >
> > Simply reading the commit message and inferring, the first change I 
> > expected was on the lines;
> >
> >         ASSERT (data_->dispatcher_ != nullptr)
> >
> > or for readability,
> >
> >       ASSERT(receiver->thread()->data_->dispatcher_ != nullptr);
> >
> > that is, checking the receiver's thread event loop exists or not.

That should work too, but it's longer, and would require accessing the
dispatcher with the right atomic operation.

> >>>>  	 MutexLocker locker(data_->messages_.mutex_);
> >>>>  	data_->messages_.list_.push_back(std::move(msg));
> >>>>
> >>>> base-commit: 27cc0a6b58bcca32071cb6ab96e5ee79c75031e5
> >>>
> >>> So what was the issue with the code I had? The problem is that
> >>> CameraManager is a libcamera Object, and was associated with the main
> >>> thread...? And a PyCameraManager which doesn't inherit CameraManager is
> >>> not an Object and not associated with any thread, so it works?
> >>
> >> That's right. Because PyCameraManager inherited from Object, it
> >> automatically got support for queued signal delivery, where a signal
> >> emitted in one thread would be processed in a different thread. This
> >> mechanism requires the receiving object to be bound to a thread that has
> >> an event loop managed by libcamera. By default an Object is bound to the
> >> thread it's created in.
> >>
> >> The CameraManager is created in the Python main thread, and the signal
> >> is emitted from an internal libcamera thread. The signal will thus be
> >> queued (the default connection type is Auto, which means that signals
> >> are delivered directly if the receiver object is bound to the thread
> >> that the signal is emitted from, or queued otherwise), which queues a
> >> message for the receiver's thread event loop. As the receiver thread is
> >> the Python main thread (the application thread from a libcamera point of
> >> view), which doesn't have a libcamera event loop, the message is never
> >> processed.
> >
> > I understand the explanation above.
> >
> > However I'm a bit lost with mainThread static variable. Am I right to 
> > infer that the mainThread is the Python's main thread in this case? If 
> > yes, the patch is correct, refraining the messages to be posted to a 
> > object whose threads has no event loop.

mainThread was initially designed as the Thread instance for the
application's main thread. It is in practice used to represent any of
the threads that are not under libcamera's control. This should
eventually be improved, as all those threads will be identified by the
same thread ID in log messages (see how ThreadData::current() assigns
mainThread.data_ to currentThreadData, and sets the tid_ there).

> > However, how about if the mainThread inherits from libcamera (hence 
> > Object, Thread, Signal delivery is in place) but tries to send a 
> > message to another Object bound to itself (i.e. mainThread)
> > Won't the assertion in this patch fail ?

The "main" thread, from a C++ point of view, is the one that runs
main(), and by definition this can't inherit from the Thread class.

> Ran the test suite with this patch and saw 3 failures
> 
> 66/71 libcamera / timer-thread FAIL           15.52s   killed by signal 6 SIGABRT
> 67/71 libcamera / event-thread FAIL           23.13s   killed by signal 6 SIGABRT
> 68/71 libcamera / object-invoke FAIL          23.99s   killed by signal 6 SIGABRT
> 69/71 libcamera / object-delete TIMEOUT       30.02s   killed by signal 15 SIGTERM
> 
> All triggering the assertion.

:-( Those tests really abuse the API. I'll see if I can fix them.

> >> If the receiver doesn't inherit from the Object class, then the signal
> >> will be delivered directly, in the emitter's thread. You can achieve the
> >> same result with an Object receiver by specifying the direct connection
> >> type when connecting the signal.

Patch
diff mbox series

diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
index 6bda9d1462f5..6ead1d7c12d0 100644
--- a/src/libcamera/base/thread.cpp
+++ b/src/libcamera/base/thread.cpp
@@ -530,6 +530,7 @@  void Thread::postMessage(std::unique_ptr<Message> msg, Object *receiver)
 	msg->receiver_ = receiver;
 
 	ASSERT(data_ == receiver->thread()->data_);
+	ASSERT(data_ != mainThread.data_);
 
 	MutexLocker locker(data_->messages_.mutex_);
 	data_->messages_.list_.push_back(std::move(msg));