[libcamera-devel,3/3] base: thread: Fix recursive calls to dispatchMessages()
diff mbox series

Message ID 20210701230741.14320-4-laurent.pinchart@ideasonboard.com
State Accepted
Delegated to: Laurent Pinchart
Headers show
Series
  • base: Fix crash with recursive messages dispatch
Related show

Commit Message

Laurent Pinchart July 1, 2021, 11:07 p.m. UTC
There are use cases for calling the dispatchMessages() function
recursively, from within a message handler. This can be used, for
instance, to force delivery of messages posted to a thread concurrently
to stopping the thread. This currently causes access, in the outer
dispatchMessages() call, to iterators that have been invalidated by
erasing list elements in the recursive call, leading to undefined
behaviour (most likely double-free or other crashes).

Fix it by only erasing messages from the list at the end of the outer
call, identified using a recursion counter. We need to keep track of the
first non-null entry in the posted messages list to restrict the
deletion to the

Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 10 deletions(-)

Comments

Jacopo Mondi July 5, 2021, 10:48 a.m. UTC | #1
Hi Laurent,

On Fri, Jul 02, 2021 at 02:07:41AM +0300, Laurent Pinchart wrote:
> There are use cases for calling the dispatchMessages() function
> recursively, from within a message handler. This can be used, for
> instance, to force delivery of messages posted to a thread concurrently
> to stopping the thread. This currently causes access, in the outer
> dispatchMessages() call, to iterators that have been invalidated by
> erasing list elements in the recursive call, leading to undefined
> behaviour (most likely double-free or other crashes).
>
> Fix it by only erasing messages from the list at the end of the outer
> call, identified using a recursion counter. We need to keep track of the
> first non-null entry in the posted messages list to restrict the
> deletion to the
>
> Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> index 7f79115222e8..9eb488d46db8 100644
> --- a/src/libcamera/base/thread.cpp
> +++ b/src/libcamera/base/thread.cpp
> @@ -126,6 +126,11 @@ public:
>  	 * \brief Protects the \ref list_
>  	 */
>  	Mutex mutex_;
> +	/**
> +	 * \brief The recursion level for recursive Thread::dispatchMessages()
> +	 * calls
> +	 */
> +	unsigned int recursion_ = 0;
>  };
>
>  /**
> @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
>   * Messages shall only be dispatched from the current thread, typically within
>   * the thread from the run() function. Calling this function outside of the
>   * thread results in undefined behaviour.
> + *
> + * This function is not thread-safe, but it may be called recursively in the
> + * same thread from an object's message handler. It guarantees in all cases
> + * delivery of messages in the order they have been posted.
>   */
>  void Thread::dispatchMessages(Message::Type type)
>  {
>  	ASSERT(data_ == ThreadData::current());
>
> +	++data_->messages_.recursion_;
> +
>  	MutexLocker locker(data_->messages_.mutex_);
>
>  	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
>
> -	for (auto iter = messages.begin(); iter != messages.end(); ) {
> -		std::unique_ptr<Message> &msg = *iter;
> -
> -		if (!msg) {
> -			iter = data_->messages_.list_.erase(iter);
> +	for (std::unique_ptr<Message> &msg : messages) {
> +		if (!msg)
>  			continue;
> -		}
>
> -		if (type != Message::Type::None && msg->type() != type) {
> -			++iter;
> +		if (type != Message::Type::None && msg->type() != type)
>  			continue;
> -		}
>
> +		/*
> +		 * Move the message, setting the entry in the list to null. It
> +		 * will cause recursive calls to ignore the entry, and the erase
> +		 * loop at the end of the function to delete it from the list.
> +		 */
>  		std::unique_ptr<Message> message = std::move(msg);
> -		iter = data_->messages_.list_.erase(iter);
>
>  		Object *receiver = message->receiver_;
>  		ASSERT(data_ == receiver->thread()->data_);
> @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
>  		message.reset();
>  		locker.lock();
>  	}
> +
> +	/*
> +	 * If the recursion level is 0, erase all null messages in the list. We
> +	 * can't do so during recursion, as it would invalidate the iterator of
> +	 * the outer calls.
> +	 */

is it paranoid to think accesses to recursions_ should be serialized ?

> +	if (!--data_->messages_.recursion_) {
> +		for (auto iter = messages.begin(); iter != messages.end(); ) {
> +			if (!*iter)
> +				iter = messages.erase(iter);
> +			else
> +				++iter;
> +		}
> +	}

Can null messages end up in the list ? can't you just clear() the
whole list ?

Thanks
  j

>  }
>
>  /**
> --
> Regards,
>
> Laurent Pinchart
>
Laurent Pinchart July 5, 2021, 12:12 p.m. UTC | #2
Hi Jacopo,

On Mon, Jul 05, 2021 at 12:48:11PM +0200, Jacopo Mondi wrote:
> On Fri, Jul 02, 2021 at 02:07:41AM +0300, Laurent Pinchart wrote:
> > There are use cases for calling the dispatchMessages() function
> > recursively, from within a message handler. This can be used, for
> > instance, to force delivery of messages posted to a thread concurrently
> > to stopping the thread. This currently causes access, in the outer
> > dispatchMessages() call, to iterators that have been invalidated by
> > erasing list elements in the recursive call, leading to undefined
> > behaviour (most likely double-free or other crashes).
> >
> > Fix it by only erasing messages from the list at the end of the outer
> > call, identified using a recursion counter. We need to keep track of the
> > first non-null entry in the posted messages list to restrict the
> > deletion to the
> >
> > Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
> >  1 file changed, 33 insertions(+), 10 deletions(-)
> >
> > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > index 7f79115222e8..9eb488d46db8 100644
> > --- a/src/libcamera/base/thread.cpp
> > +++ b/src/libcamera/base/thread.cpp
> > @@ -126,6 +126,11 @@ public:
> >  	 * \brief Protects the \ref list_
> >  	 */
> >  	Mutex mutex_;
> > +	/**
> > +	 * \brief The recursion level for recursive Thread::dispatchMessages()
> > +	 * calls
> > +	 */
> > +	unsigned int recursion_ = 0;
> >  };
> >
> >  /**
> > @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
> >   * Messages shall only be dispatched from the current thread, typically within
> >   * the thread from the run() function. Calling this function outside of the
> >   * thread results in undefined behaviour.
> > + *
> > + * This function is not thread-safe, but it may be called recursively in the
> > + * same thread from an object's message handler. It guarantees in all cases
> > + * delivery of messages in the order they have been posted.
> >   */
> >  void Thread::dispatchMessages(Message::Type type)
> >  {
> >  	ASSERT(data_ == ThreadData::current());
> >
> > +	++data_->messages_.recursion_;
> > +
> >  	MutexLocker locker(data_->messages_.mutex_);
> >
> >  	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
> >
> > -	for (auto iter = messages.begin(); iter != messages.end(); ) {
> > -		std::unique_ptr<Message> &msg = *iter;
> > -
> > -		if (!msg) {
> > -			iter = data_->messages_.list_.erase(iter);
> > +	for (std::unique_ptr<Message> &msg : messages) {
> > +		if (!msg)
> >  			continue;
> > -		}
> >
> > -		if (type != Message::Type::None && msg->type() != type) {
> > -			++iter;
> > +		if (type != Message::Type::None && msg->type() != type)
> >  			continue;
> > -		}
> >
> > +		/*
> > +		 * Move the message, setting the entry in the list to null. It
> > +		 * will cause recursive calls to ignore the entry, and the erase
> > +		 * loop at the end of the function to delete it from the list.
> > +		 */
> >  		std::unique_ptr<Message> message = std::move(msg);
> > -		iter = data_->messages_.list_.erase(iter);
> >
> >  		Object *receiver = message->receiver_;
> >  		ASSERT(data_ == receiver->thread()->data_);
> > @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
> >  		message.reset();
> >  		locker.lock();
> >  	}
> > +
> > +	/*
> > +	 * If the recursion level is 0, erase all null messages in the list. We
> > +	 * can't do so during recursion, as it would invalidate the iterator of
> > +	 * the outer calls.
> > +	 */
> 
> is it paranoid to think accesses to recursions_ should be serialized ?

The function is documented as not being thread-safe. This is because it
must be called from the thread created by the Thread class (see the
assert at the beginning that enforces that). While it can be called
recursively, there will never be two concurrent calls, so no race
condition when accessing the recursion_ variable. This is unlike the
messages list that needs to be protected by a lock because other threads
can post messages to the list concurrently to
Thread::dispatchMessages().

> > +	if (!--data_->messages_.recursion_) {
> > +		for (auto iter = messages.begin(); iter != messages.end(); ) {
> > +			if (!*iter)
> > +				iter = messages.erase(iter);
> > +			else
> > +				++iter;
> > +		}
> > +	}
> 
> Can null messages end up in the list ? can't you just clear() the
> whole list ?

The function takes a message type parameter to restrict it to
dispatching messages of a given type. The list may thus not be empty, it
may still contain messages of other types.

> >  }
> >
> >  /**
Jacopo Mondi July 5, 2021, 12:33 p.m. UTC | #3
Hi Laurent,

On Mon, Jul 05, 2021 at 03:12:45PM +0300, Laurent Pinchart wrote:
> Hi Jacopo,
>
> On Mon, Jul 05, 2021 at 12:48:11PM +0200, Jacopo Mondi wrote:
> > On Fri, Jul 02, 2021 at 02:07:41AM +0300, Laurent Pinchart wrote:
> > > There are use cases for calling the dispatchMessages() function
> > > recursively, from within a message handler. This can be used, for
> > > instance, to force delivery of messages posted to a thread concurrently
> > > to stopping the thread. This currently causes access, in the outer
> > > dispatchMessages() call, to iterators that have been invalidated by
> > > erasing list elements in the recursive call, leading to undefined
> > > behaviour (most likely double-free or other crashes).
> > >
> > > Fix it by only erasing messages from the list at the end of the outer
> > > call, identified using a recursion counter. We need to keep track of the
> > > first non-null entry in the posted messages list to restrict the
> > > deletion to the
> > >
> > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > ---
> > >  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
> > >  1 file changed, 33 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > > index 7f79115222e8..9eb488d46db8 100644
> > > --- a/src/libcamera/base/thread.cpp
> > > +++ b/src/libcamera/base/thread.cpp
> > > @@ -126,6 +126,11 @@ public:
> > >  	 * \brief Protects the \ref list_
> > >  	 */
> > >  	Mutex mutex_;
> > > +	/**
> > > +	 * \brief The recursion level for recursive Thread::dispatchMessages()
> > > +	 * calls
> > > +	 */
> > > +	unsigned int recursion_ = 0;
> > >  };
> > >
> > >  /**
> > > @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
> > >   * Messages shall only be dispatched from the current thread, typically within
> > >   * the thread from the run() function. Calling this function outside of the
> > >   * thread results in undefined behaviour.
> > > + *
> > > + * This function is not thread-safe, but it may be called recursively in the
> > > + * same thread from an object's message handler. It guarantees in all cases
> > > + * delivery of messages in the order they have been posted.
> > >   */
> > >  void Thread::dispatchMessages(Message::Type type)
> > >  {
> > >  	ASSERT(data_ == ThreadData::current());
> > >
> > > +	++data_->messages_.recursion_;
> > > +
> > >  	MutexLocker locker(data_->messages_.mutex_);
> > >
> > >  	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
> > >
> > > -	for (auto iter = messages.begin(); iter != messages.end(); ) {
> > > -		std::unique_ptr<Message> &msg = *iter;
> > > -
> > > -		if (!msg) {
> > > -			iter = data_->messages_.list_.erase(iter);
> > > +	for (std::unique_ptr<Message> &msg : messages) {
> > > +		if (!msg)
> > >  			continue;
> > > -		}
> > >
> > > -		if (type != Message::Type::None && msg->type() != type) {
> > > -			++iter;
> > > +		if (type != Message::Type::None && msg->type() != type)
> > >  			continue;
> > > -		}
> > >
> > > +		/*
> > > +		 * Move the message, setting the entry in the list to null. It
> > > +		 * will cause recursive calls to ignore the entry, and the erase
> > > +		 * loop at the end of the function to delete it from the list.
> > > +		 */
> > >  		std::unique_ptr<Message> message = std::move(msg);
> > > -		iter = data_->messages_.list_.erase(iter);
> > >
> > >  		Object *receiver = message->receiver_;
> > >  		ASSERT(data_ == receiver->thread()->data_);
> > > @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
> > >  		message.reset();
> > >  		locker.lock();
> > >  	}
> > > +
> > > +	/*
> > > +	 * If the recursion level is 0, erase all null messages in the list. We
> > > +	 * can't do so during recursion, as it would invalidate the iterator of
> > > +	 * the outer calls.
> > > +	 */
> >
> > is it paranoid to think accesses to recursions_ should be serialized ?
>
> The function is documented as not being thread-safe. This is because it
> must be called from the thread created by the Thread class (see the
> assert at the beginning that enforces that). While it can be called
> recursively, there will never be two concurrent calls, so no race
> condition when accessing the recursion_ variable. This is unlike the
> messages list that needs to be protected by a lock because other threads
> can post messages to the list concurrently to
> Thread::dispatchMessages().
>
> > > +	if (!--data_->messages_.recursion_) {
> > > +		for (auto iter = messages.begin(); iter != messages.end(); ) {
> > > +			if (!*iter)
> > > +				iter = messages.erase(iter);
> > > +			else
> > > +				++iter;
> > > +		}
> > > +	}
> >
> > Can null messages end up in the list ? can't you just clear() the
> > whole list ?
>
> The function takes a message type parameter to restrict it to
> dispatching messages of a given type. The list may thus not be empty, it
> may still contain messages of other types.

So it is the other way around actually, if the message has been
dispatched, we erase it. Sorry, I read the condition wrong, thanks for
explaining.

A little note, you could:

        if (--data_->messages_.recursion_)
                return;

And save one indentation level.

Apart from this:
Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>

Thanks
   j

>
> > >  }
> > >
> > >  /**
>
> --
> Regards,
>
> Laurent Pinchart
David Plowman July 6, 2021, 10:07 a.m. UTC | #4
Hi Laurent

Thank you very much for this patch.

On Fri, 2 Jul 2021 at 00:08, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> There are use cases for calling the dispatchMessages() function
> recursively, from within a message handler. This can be used, for
> instance, to force delivery of messages posted to a thread concurrently
> to stopping the thread. This currently causes access, in the outer
> dispatchMessages() call, to iterators that have been invalidated by
> erasing list elements in the recursive call, leading to undefined
> behaviour (most likely double-free or other crashes).
>
> Fix it by only erasing messages from the list at the end of the outer
> call, identified using a recursion counter. We need to keep track of the
> first non-null entry in the posted messages list to restrict the
> deletion to the
>
> Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Notwithstanding the subsequent discussion, I've been running with
these changes for a while now and haven't seen the problem return, so:

Tested-by: David Plowman <david.plowman@raspberrypi.com>

Thank you!
David

> ---
>  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> index 7f79115222e8..9eb488d46db8 100644
> --- a/src/libcamera/base/thread.cpp
> +++ b/src/libcamera/base/thread.cpp
> @@ -126,6 +126,11 @@ public:
>          * \brief Protects the \ref list_
>          */
>         Mutex mutex_;
> +       /**
> +        * \brief The recursion level for recursive Thread::dispatchMessages()
> +        * calls
> +        */
> +       unsigned int recursion_ = 0;
>  };
>
>  /**
> @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
>   * Messages shall only be dispatched from the current thread, typically within
>   * the thread from the run() function. Calling this function outside of the
>   * thread results in undefined behaviour.
> + *
> + * This function is not thread-safe, but it may be called recursively in the
> + * same thread from an object's message handler. It guarantees in all cases
> + * delivery of messages in the order they have been posted.
>   */
>  void Thread::dispatchMessages(Message::Type type)
>  {
>         ASSERT(data_ == ThreadData::current());
>
> +       ++data_->messages_.recursion_;
> +
>         MutexLocker locker(data_->messages_.mutex_);
>
>         std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
>
> -       for (auto iter = messages.begin(); iter != messages.end(); ) {
> -               std::unique_ptr<Message> &msg = *iter;
> -
> -               if (!msg) {
> -                       iter = data_->messages_.list_.erase(iter);
> +       for (std::unique_ptr<Message> &msg : messages) {
> +               if (!msg)
>                         continue;
> -               }
>
> -               if (type != Message::Type::None && msg->type() != type) {
> -                       ++iter;
> +               if (type != Message::Type::None && msg->type() != type)
>                         continue;
> -               }
>
> +               /*
> +                * Move the message, setting the entry in the list to null. It
> +                * will cause recursive calls to ignore the entry, and the erase
> +                * loop at the end of the function to delete it from the list.
> +                */
>                 std::unique_ptr<Message> message = std::move(msg);
> -               iter = data_->messages_.list_.erase(iter);
>
>                 Object *receiver = message->receiver_;
>                 ASSERT(data_ == receiver->thread()->data_);
> @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
>                 message.reset();
>                 locker.lock();
>         }
> +
> +       /*
> +        * If the recursion level is 0, erase all null messages in the list. We
> +        * can't do so during recursion, as it would invalidate the iterator of
> +        * the outer calls.
> +        */
> +       if (!--data_->messages_.recursion_) {
> +               for (auto iter = messages.begin(); iter != messages.end(); ) {
> +                       if (!*iter)
> +                               iter = messages.erase(iter);
> +                       else
> +                               ++iter;
> +               }
> +       }
>  }
>
>  /**
> --
> Regards,
>
> Laurent Pinchart
>
Kieran Bingham July 9, 2021, 12:36 p.m. UTC | #5
Hi Laurent,


On 05/07/2021 13:33, Jacopo Mondi wrote:
> Hi Laurent,
> 
> On Mon, Jul 05, 2021 at 03:12:45PM +0300, Laurent Pinchart wrote:
>> Hi Jacopo,
>>
>> On Mon, Jul 05, 2021 at 12:48:11PM +0200, Jacopo Mondi wrote:
>>> On Fri, Jul 02, 2021 at 02:07:41AM +0300, Laurent Pinchart wrote:
>>>> There are use cases for calling the dispatchMessages() function
>>>> recursively, from within a message handler. This can be used, for
>>>> instance, to force delivery of messages posted to a thread concurrently
>>>> to stopping the thread. This currently causes access, in the outer
>>>> dispatchMessages() call, to iterators that have been invalidated by
>>>> erasing list elements in the recursive call, leading to undefined
>>>> behaviour (most likely double-free or other crashes).
>>>>
>>>> Fix it by only erasing messages from the list at the end of the outer
>>>> call, identified using a recursion counter. We need to keep track of the
>>>> first non-null entry in the posted messages list to restrict the
>>>> deletion to the


to the .... ?


>>>>
>>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
>>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>>> ---
>>>>  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
>>>>  1 file changed, 33 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
>>>> index 7f79115222e8..9eb488d46db8 100644
>>>> --- a/src/libcamera/base/thread.cpp
>>>> +++ b/src/libcamera/base/thread.cpp
>>>> @@ -126,6 +126,11 @@ public:
>>>>  	 * \brief Protects the \ref list_
>>>>  	 */
>>>>  	Mutex mutex_;
>>>> +	/**
>>>> +	 * \brief The recursion level for recursive Thread::dispatchMessages()
>>>> +	 * calls
>>>> +	 */
>>>> +	unsigned int recursion_ = 0;
>>>>  };
>>>>
>>>>  /**
>>>> @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
>>>>   * Messages shall only be dispatched from the current thread, typically within
>>>>   * the thread from the run() function. Calling this function outside of the
>>>>   * thread results in undefined behaviour.
>>>> + *
>>>> + * This function is not thread-safe, but it may be called recursively in the
>>>> + * same thread from an object's message handler. It guarantees in all cases
>>>> + * delivery of messages in the order they have been posted.

"in all cases delivery of" sounds odd.

Perhaps

"It guarantees delivery of messages in the order they have been posted
in all cases".

Or

"In all cases, it guarantees delivery of messages in the order they have
been posted".



>>>>   */
>>>>  void Thread::dispatchMessages(Message::Type type)
>>>>  {
>>>>  	ASSERT(data_ == ThreadData::current());
>>>>
>>>> +	++data_->messages_.recursion_;
>>>> +
>>>>  	MutexLocker locker(data_->messages_.mutex_);
>>>>
>>>>  	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
>>>>
>>>> -	for (auto iter = messages.begin(); iter != messages.end(); ) {
>>>> -		std::unique_ptr<Message> &msg = *iter;
>>>> -
>>>> -		if (!msg) {
>>>> -			iter = data_->messages_.list_.erase(iter);
>>>> +	for (std::unique_ptr<Message> &msg : messages) {
>>>> +		if (!msg)
>>>>  			continue;
>>>> -		}
>>>>
>>>> -		if (type != Message::Type::None && msg->type() != type) {
>>>> -			++iter;
>>>> +		if (type != Message::Type::None && msg->type() != type)
>>>>  			continue;
>>>> -		}
>>>>
>>>> +		/*
>>>> +		 * Move the message, setting the entry in the list to null. It
>>>> +		 * will cause recursive calls to ignore the entry, and the erase
>>>> +		 * loop at the end of the function to delete it from the list.
>>>> +		 */
>>>>  		std::unique_ptr<Message> message = std::move(msg);
>>>> -		iter = data_->messages_.list_.erase(iter);
>>>>
>>>>  		Object *receiver = message->receiver_;
>>>>  		ASSERT(data_ == receiver->thread()->data_);
>>>> @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
>>>>  		message.reset();
>>>>  		locker.lock();
>>>>  	}
>>>> +
>>>> +	/*
>>>> +	 * If the recursion level is 0, erase all null messages in the list. We
>>>> +	 * can't do so during recursion, as it would invalidate the iterator of
>>>> +	 * the outer calls.
>>>> +	 */
>>>
>>> is it paranoid to think accesses to recursions_ should be serialized ?
>>
>> The function is documented as not being thread-safe. This is because it
>> must be called from the thread created by the Thread class (see the
>> assert at the beginning that enforces that). While it can be called
>> recursively, there will never be two concurrent calls, so no race
>> condition when accessing the recursion_ variable. This is unlike the
>> messages list that needs to be protected by a lock because other threads
>> can post messages to the list concurrently to
>> Thread::dispatchMessages().
>>

Sounds good, and Thread looks well asserted in most cases to guarantee
rules are adhered to.


>>>> +	if (!--data_->messages_.recursion_) {
>>>> +		for (auto iter = messages.begin(); iter != messages.end(); ) {
>>>> +			if (!*iter)
>>>> +				iter = messages.erase(iter);
>>>> +			else
>>>> +				++iter;
>>>> +		}
>>>> +	}
>>>
>>> Can null messages end up in the list ? can't you just clear() the
>>> whole list ?
>>
>> The function takes a message type parameter to restrict it to
>> dispatching messages of a given type. The list may thus not be empty, it
>> may still contain messages of other types.
> 
> So it is the other way around actually, if the message has been
> dispatched, we erase it. Sorry, I read the condition wrong, thanks for
> explaining.
> 
> A little note, you could:
> 
>         if (--data_->messages_.recursion_)
>                 return;
> 
> And save one indentation level.
> 
> Apart from this:
> Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>


With comments addressed if and as required:

Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>


> 
> Thanks
>    j
> 
>>
>>>>  }
>>>>
>>>>  /**
>>
>> --
>> Regards,
>>
>> Laurent Pinchart
Laurent Pinchart July 11, 2021, 2:27 p.m. UTC | #6
Hi Kieran,

On Fri, Jul 09, 2021 at 01:36:13PM +0100, Kieran Bingham wrote:
> On 05/07/2021 13:33, Jacopo Mondi wrote:
> > On Mon, Jul 05, 2021 at 03:12:45PM +0300, Laurent Pinchart wrote:
> >> On Mon, Jul 05, 2021 at 12:48:11PM +0200, Jacopo Mondi wrote:
> >>> On Fri, Jul 02, 2021 at 02:07:41AM +0300, Laurent Pinchart wrote:
> >>>> There are use cases for calling the dispatchMessages() function
> >>>> recursively, from within a message handler. This can be used, for
> >>>> instance, to force delivery of messages posted to a thread concurrently
> >>>> to stopping the thread. This currently causes access, in the outer
> >>>> dispatchMessages() call, to iterators that have been invalidated by
> >>>> erasing list elements in the recursive call, leading to undefined
> >>>> behaviour (most likely double-free or other crashes).
> >>>>
> >>>> Fix it by only erasing messages from the list at the end of the outer
> >>>> call, identified using a recursion counter. We need to keep track of the
> >>>> first non-null entry in the posted messages list to restrict the
> >>>> deletion to the
> 
> to the .... ?

:-)

I'll drop the whole sentence, it's a leftover of a previous more complex
implementation, the current version doesn't keep track of the first
non-null entry.

> >>>>
> >>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=26
> >>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> >>>> ---
> >>>>  src/libcamera/base/thread.cpp | 43 +++++++++++++++++++++++++++--------
> >>>>  1 file changed, 33 insertions(+), 10 deletions(-)
> >>>>
> >>>> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> >>>> index 7f79115222e8..9eb488d46db8 100644
> >>>> --- a/src/libcamera/base/thread.cpp
> >>>> +++ b/src/libcamera/base/thread.cpp
> >>>> @@ -126,6 +126,11 @@ public:
> >>>>  	 * \brief Protects the \ref list_
> >>>>  	 */
> >>>>  	Mutex mutex_;
> >>>> +	/**
> >>>> +	 * \brief The recursion level for recursive Thread::dispatchMessages()
> >>>> +	 * calls
> >>>> +	 */
> >>>> +	unsigned int recursion_ = 0;
> >>>>  };
> >>>>
> >>>>  /**
> >>>> @@ -595,30 +600,34 @@ void Thread::removeMessages(Object *receiver)
> >>>>   * Messages shall only be dispatched from the current thread, typically within
> >>>>   * the thread from the run() function. Calling this function outside of the
> >>>>   * thread results in undefined behaviour.
> >>>> + *
> >>>> + * This function is not thread-safe, but it may be called recursively in the
> >>>> + * same thread from an object's message handler. It guarantees in all cases
> >>>> + * delivery of messages in the order they have been posted.
> 
> "in all cases delivery of" sounds odd.
> 
> Perhaps
> 
> "It guarantees delivery of messages in the order they have been posted
> in all cases".

I'll use this wording.

> Or
> 
> "In all cases, it guarantees delivery of messages in the order they have
> been posted".
> 
> >>>>   */
> >>>>  void Thread::dispatchMessages(Message::Type type)
> >>>>  {
> >>>>  	ASSERT(data_ == ThreadData::current());
> >>>>
> >>>> +	++data_->messages_.recursion_;
> >>>> +
> >>>>  	MutexLocker locker(data_->messages_.mutex_);
> >>>>
> >>>>  	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
> >>>>
> >>>> -	for (auto iter = messages.begin(); iter != messages.end(); ) {
> >>>> -		std::unique_ptr<Message> &msg = *iter;
> >>>> -
> >>>> -		if (!msg) {
> >>>> -			iter = data_->messages_.list_.erase(iter);
> >>>> +	for (std::unique_ptr<Message> &msg : messages) {
> >>>> +		if (!msg)
> >>>>  			continue;
> >>>> -		}
> >>>>
> >>>> -		if (type != Message::Type::None && msg->type() != type) {
> >>>> -			++iter;
> >>>> +		if (type != Message::Type::None && msg->type() != type)
> >>>>  			continue;
> >>>> -		}
> >>>>
> >>>> +		/*
> >>>> +		 * Move the message, setting the entry in the list to null. It
> >>>> +		 * will cause recursive calls to ignore the entry, and the erase
> >>>> +		 * loop at the end of the function to delete it from the list.
> >>>> +		 */
> >>>>  		std::unique_ptr<Message> message = std::move(msg);
> >>>> -		iter = data_->messages_.list_.erase(iter);
> >>>>
> >>>>  		Object *receiver = message->receiver_;
> >>>>  		ASSERT(data_ == receiver->thread()->data_);
> >>>> @@ -629,6 +638,20 @@ void Thread::dispatchMessages(Message::Type type)
> >>>>  		message.reset();
> >>>>  		locker.lock();
> >>>>  	}
> >>>> +
> >>>> +	/*
> >>>> +	 * If the recursion level is 0, erase all null messages in the list. We
> >>>> +	 * can't do so during recursion, as it would invalidate the iterator of
> >>>> +	 * the outer calls.
> >>>> +	 */
> >>>
> >>> is it paranoid to think accesses to recursions_ should be serialized ?
> >>
> >> The function is documented as not being thread-safe. This is because it
> >> must be called from the thread created by the Thread class (see the
> >> assert at the beginning that enforces that). While it can be called
> >> recursively, there will never be two concurrent calls, so no race
> >> condition when accessing the recursion_ variable. This is unlike the
> >> messages list that needs to be protected by a lock because other threads
> >> can post messages to the list concurrently to
> >> Thread::dispatchMessages().
> 
> Sounds good, and Thread looks well asserted in most cases to guarantee
> rules are adhered to.
> 
> >>>> +	if (!--data_->messages_.recursion_) {
> >>>> +		for (auto iter = messages.begin(); iter != messages.end(); ) {
> >>>> +			if (!*iter)
> >>>> +				iter = messages.erase(iter);
> >>>> +			else
> >>>> +				++iter;
> >>>> +		}
> >>>> +	}
> >>>
> >>> Can null messages end up in the list ? can't you just clear() the
> >>> whole list ?
> >>
> >> The function takes a message type parameter to restrict it to
> >> dispatching messages of a given type. The list may thus not be empty, it
> >> may still contain messages of other types.
> > 
> > So it is the other way around actually, if the message has been
> > dispatched, we erase it. Sorry, I read the condition wrong, thanks for
> > explaining.
> > 
> > A little note, you could:
> > 
> >         if (--data_->messages_.recursion_)
> >                 return;
> > 
> > And save one indentation level.
> > 
> > Apart from this:
> > Reviewed-by: Jacopo Mondi <jacopo@jmondi.org>
> 
> With comments addressed if and as required:
> 
> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> >>>>  }
> >>>>
> >>>>  /**

Patch
diff mbox series

diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
index 7f79115222e8..9eb488d46db8 100644
--- a/src/libcamera/base/thread.cpp
+++ b/src/libcamera/base/thread.cpp
@@ -126,6 +126,11 @@  public:
 	 * \brief Protects the \ref list_
 	 */
 	Mutex mutex_;
+	/**
+	 * \brief The recursion level for recursive Thread::dispatchMessages()
+	 * calls
+	 */
+	unsigned int recursion_ = 0;
 };
 
 /**
@@ -595,30 +600,34 @@  void Thread::removeMessages(Object *receiver)
  * Messages shall only be dispatched from the current thread, typically within
  * the thread from the run() function. Calling this function outside of the
  * thread results in undefined behaviour.
+ *
+ * This function is not thread-safe, but it may be called recursively in the
+ * same thread from an object's message handler. It guarantees in all cases
+ * delivery of messages in the order they have been posted.
  */
 void Thread::dispatchMessages(Message::Type type)
 {
 	ASSERT(data_ == ThreadData::current());
 
+	++data_->messages_.recursion_;
+
 	MutexLocker locker(data_->messages_.mutex_);
 
 	std::list<std::unique_ptr<Message>> &messages = data_->messages_.list_;
 
-	for (auto iter = messages.begin(); iter != messages.end(); ) {
-		std::unique_ptr<Message> &msg = *iter;
-
-		if (!msg) {
-			iter = data_->messages_.list_.erase(iter);
+	for (std::unique_ptr<Message> &msg : messages) {
+		if (!msg)
 			continue;
-		}
 
-		if (type != Message::Type::None && msg->type() != type) {
-			++iter;
+		if (type != Message::Type::None && msg->type() != type)
 			continue;
-		}
 
+		/*
+		 * Move the message, setting the entry in the list to null. It
+		 * will cause recursive calls to ignore the entry, and the erase
+		 * loop at the end of the function to delete it from the list.
+		 */
 		std::unique_ptr<Message> message = std::move(msg);
-		iter = data_->messages_.list_.erase(iter);
 
 		Object *receiver = message->receiver_;
 		ASSERT(data_ == receiver->thread()->data_);
@@ -629,6 +638,20 @@  void Thread::dispatchMessages(Message::Type type)
 		message.reset();
 		locker.lock();
 	}
+
+	/*
+	 * If the recursion level is 0, erase all null messages in the list. We
+	 * can't do so during recursion, as it would invalidate the iterator of
+	 * the outer calls.
+	 */
+	if (!--data_->messages_.recursion_) {
+		for (auto iter = messages.begin(); iter != messages.end(); ) {
+			if (!*iter)
+				iter = messages.erase(iter);
+			else
+				++iter;
+		}
+	}
 }
 
 /**