Thread: Fix setThreadAffinity issue
diff mbox series

Message ID 20241202090457.584020-1-chenghaoyang@chromium.org
State Accepted
Commit d49a84a4f3aa63efc900564ff32558e4f5d85b04
Headers show
Series
  • Thread: Fix setThreadAffinity issue
Related show

Commit Message

Cheng-Hao Yang Dec. 2, 2024, 9:04 a.m. UTC
This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.

Previously we call Thread::setThreadAffinityInternal in
Thread::startThread, which seems to lead to a SEGV in
pthread_setaffinity_np.

This patch moves the call after the construction of std::thread.

Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
---
 src/libcamera/base/thread.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Barnabás Pőcze Dec. 3, 2024, 12:35 p.m. UTC | #1
Hi


2024. december 2., hétfő 10:04 keltezéssel, Harvey Yang <chenghaoyang@chromium.org> írta:

> This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.
> 
> Previously we call Thread::setThreadAffinityInternal in
> Thread::startThread, which seems to lead to a SEGV in
> pthread_setaffinity_np.

I believe the patch should describe what the cause is. As far as I can
see it's a data race on `Thread::thread_` where it is concurrently set
and read from `Thread::start()` and `Thread::startThread()`, respectively.


> 
> This patch moves the call after the construction of std::thread.
> 

Should probably have:

  Fixes: 4d9db06d66 ("libcamera: add method to set thread affinity")


> Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
> ---
>  src/libcamera/base/thread.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> index f6322fe31..319bfda9a 100644
> --- a/src/libcamera/base/thread.cpp
> +++ b/src/libcamera/base/thread.cpp
> @@ -257,6 +257,8 @@ void Thread::start()
>  	data_->exit_.store(false, std::memory_order_relaxed);
> 
>  	thread_ = std::thread(&Thread::startThread, this);
> +
> +	setThreadAffinityInternal();
>  }
> 
>  void Thread::startThread()
> @@ -284,8 +286,6 @@ void Thread::startThread()
>  	data_->tid_ = syscall(SYS_gettid);
>  	currentThreadData = data_;
> 
> -	setThreadAffinityInternal();
> -
>  	run();
>  }
> 
> --
> 2.47.0.338.g60cca15819-goog
> 


Regards,
Barnabás Pőcze
Laurent Pinchart Dec. 3, 2024, 12:47 p.m. UTC | #2
On Tue, Dec 03, 2024 at 12:35:38PM +0000, Barnabás Pőcze wrote:
> 2024. december 2., hétfő 10:04 keltezéssel, Harvey Yang írta:
> 
> > This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.
> > 
> > Previously we call Thread::setThreadAffinityInternal in
> > Thread::startThread, which seems to lead to a SEGV in
> > pthread_setaffinity_np.
> 
> I believe the patch should describe what the cause is. As far as I can

Yes, all patches should do so. Saying there seems to be a problem and it
seems to be fixed by a change without further explanation creates very
little confidence, and it shouldn't be up to the reviewer to try and
understand the fix.

> see it's a data race on `Thread::thread_` where it is concurrently set
> and read from `Thread::start()` and `Thread::startThread()`, respectively.
> 
> > This patch moves the call after the construction of std::thread.

If I recall correctly, setThreadAffinityInternal() was moved to
Thread::startThread() based on review discussions, to pin the thread to
CPUs before starting its main function. This patch drops this. The
commit message needs to explain why this is considered fine.

> Should probably have:
> 
>   Fixes: 4d9db06d66 ("libcamera: add method to set thread affinity")
> 
> > Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
> > ---
> >  src/libcamera/base/thread.cpp | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > index f6322fe31..319bfda9a 100644
> > --- a/src/libcamera/base/thread.cpp
> > +++ b/src/libcamera/base/thread.cpp
> > @@ -257,6 +257,8 @@ void Thread::start()
> >  	data_->exit_.store(false, std::memory_order_relaxed);
> > 
> >  	thread_ = std::thread(&Thread::startThread, this);
> > +
> > +	setThreadAffinityInternal();
> >  }
> > 
> >  void Thread::startThread()
> > @@ -284,8 +286,6 @@ void Thread::startThread()
> >  	data_->tid_ = syscall(SYS_gettid);
> >  	currentThreadData = data_;
> > 
> > -	setThreadAffinityInternal();
> > -
> >  	run();
> >  }
> >
Laurent Pinchart Jan. 7, 2025, 9:40 a.m. UTC | #3
Hi Harvey,

On Tue, Dec 03, 2024 at 02:47:06PM +0200, Laurent Pinchart wrote:
> On Tue, Dec 03, 2024 at 12:35:38PM +0000, Barnabás Pőcze wrote:
> > 2024. december 2., hétfő 10:04 keltezéssel, Harvey Yang írta:
> > 
> > > This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.
> > > 
> > > Previously we call Thread::setThreadAffinityInternal in
> > > Thread::startThread, which seems to lead to a SEGV in
> > > pthread_setaffinity_np.
> > 
> > I believe the patch should describe what the cause is. As far as I can
> 
> Yes, all patches should do so. Saying there seems to be a problem and it
> seems to be fixed by a change without further explanation creates very
> little confidence, and it shouldn't be up to the reviewer to try and
> understand the fix.

Could you please send a new version with an improved commit message ?

> > see it's a data race on `Thread::thread_` where it is concurrently set
> > and read from `Thread::start()` and `Thread::startThread()`, respectively.
> > 
> > > This patch moves the call after the construction of std::thread.
> 
> If I recall correctly, setThreadAffinityInternal() was moved to
> Thread::startThread() based on review discussions, to pin the thread to
> CPUs before starting its main function. This patch drops this. The
> commit message needs to explain why this is considered fine.
> 
> > Should probably have:
> > 
> >   Fixes: 4d9db06d66 ("libcamera: add method to set thread affinity")
> > 
> > > Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
> > > ---
> > >  src/libcamera/base/thread.cpp | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > > index f6322fe31..319bfda9a 100644
> > > --- a/src/libcamera/base/thread.cpp
> > > +++ b/src/libcamera/base/thread.cpp
> > > @@ -257,6 +257,8 @@ void Thread::start()
> > >  	data_->exit_.store(false, std::memory_order_relaxed);
> > > 
> > >  	thread_ = std::thread(&Thread::startThread, this);
> > > +
> > > +	setThreadAffinityInternal();
> > >  }
> > > 
> > >  void Thread::startThread()
> > > @@ -284,8 +286,6 @@ void Thread::startThread()
> > >  	data_->tid_ = syscall(SYS_gettid);
> > >  	currentThreadData = data_;
> > > 
> > > -	setThreadAffinityInternal();
> > > -
> > >  	run();
> > >  }
> > >
Kieran Bingham Jan. 7, 2025, 11:23 a.m. UTC | #4
Quoting Laurent Pinchart (2025-01-07 09:40:54)
> Hi Harvey,
> 
> On Tue, Dec 03, 2024 at 02:47:06PM +0200, Laurent Pinchart wrote:
> > On Tue, Dec 03, 2024 at 12:35:38PM +0000, Barnabás Pőcze wrote:
> > > 2024. december 2., hétfő 10:04 keltezéssel, Harvey Yang írta:
> > > 
> > > > This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.
> > > > 
> > > > Previously we call Thread::setThreadAffinityInternal in
> > > > Thread::startThread, which seems to lead to a SEGV in
> > > > pthread_setaffinity_np.
> > > 
> > > I believe the patch should describe what the cause is. As far as I can
> > 
> > Yes, all patches should do so. Saying there seems to be a problem and it
> > seems to be fixed by a change without further explanation creates very
> > little confidence, and it shouldn't be up to the reviewer to try and
> > understand the fix.
> 
> Could you please send a new version with an improved commit message ?

I saw this issue in CI today again too:

https://gitlab.freedesktop.org/camera/libcamera/-/jobs/68964947

So I'd be keen to merge this once the commit message is cleaned up.

--
Kieran


> 
> > > see it's a data race on `Thread::thread_` where it is concurrently set
> > > and read from `Thread::start()` and `Thread::startThread()`, respectively.
> > > 
> > > > This patch moves the call after the construction of std::thread.
> > 
> > If I recall correctly, setThreadAffinityInternal() was moved to
> > Thread::startThread() based on review discussions, to pin the thread to
> > CPUs before starting its main function. This patch drops this. The
> > commit message needs to explain why this is considered fine.
> > 
> > > Should probably have:
> > > 
> > >   Fixes: 4d9db06d66 ("libcamera: add method to set thread affinity")
> > > 
> > > > Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
> > > > ---
> > > >  src/libcamera/base/thread.cpp | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > > > index f6322fe31..319bfda9a 100644
> > > > --- a/src/libcamera/base/thread.cpp
> > > > +++ b/src/libcamera/base/thread.cpp
> > > > @@ -257,6 +257,8 @@ void Thread::start()
> > > >   data_->exit_.store(false, std::memory_order_relaxed);
> > > > 
> > > >   thread_ = std::thread(&Thread::startThread, this);
> > > > +
> > > > + setThreadAffinityInternal();
> > > >  }
> > > > 
> > > >  void Thread::startThread()
> > > > @@ -284,8 +286,6 @@ void Thread::startThread()
> > > >   data_->tid_ = syscall(SYS_gettid);
> > > >   currentThreadData = data_;
> > > > 
> > > > - setThreadAffinityInternal();
> > > > -
> > > >   run();
> > > >  }
> > > > 
> 
> -- 
> Regards,
> 
> Laurent Pinchart
Cheng-Hao Yang Jan. 7, 2025, 2:43 p.m. UTC | #5
Hi all,

Uploaded v2.

As it was Barnabás who found the root cause, I wouldn't mind if
Barnabás takes the credit and uploads new versions of the fix. JFYI.

BR,
Harvey

On Tue, Jan 7, 2025 at 12:23 PM Kieran Bingham
<kieran.bingham@ideasonboard.com> wrote:
>
> Quoting Laurent Pinchart (2025-01-07 09:40:54)
> > Hi Harvey,
> >
> > On Tue, Dec 03, 2024 at 02:47:06PM +0200, Laurent Pinchart wrote:
> > > On Tue, Dec 03, 2024 at 12:35:38PM +0000, Barnabás Pőcze wrote:
> > > > 2024. december 2., hétfő 10:04 keltezéssel, Harvey Yang írta:
> > > >
> > > > > This fixes commit 4d9db06d669044c0c461a2aed79c85c7fe32a502.
> > > > >
> > > > > Previously we call Thread::setThreadAffinityInternal in
> > > > > Thread::startThread, which seems to lead to a SEGV in
> > > > > pthread_setaffinity_np.
> > > >
> > > > I believe the patch should describe what the cause is. As far as I can
> > >
> > > Yes, all patches should do so. Saying there seems to be a problem and it
> > > seems to be fixed by a change without further explanation creates very
> > > little confidence, and it shouldn't be up to the reviewer to try and
> > > understand the fix.
> >
> > Could you please send a new version with an improved commit message ?
>
> I saw this issue in CI today again too:
>
> https://gitlab.freedesktop.org/camera/libcamera/-/jobs/68964947
>
> So I'd be keen to merge this once the commit message is cleaned up.
>
> --
> Kieran
>
>
> >
> > > > see it's a data race on `Thread::thread_` where it is concurrently set
> > > > and read from `Thread::start()` and `Thread::startThread()`, respectively.
> > > >
> > > > > This patch moves the call after the construction of std::thread.
> > >
> > > If I recall correctly, setThreadAffinityInternal() was moved to
> > > Thread::startThread() based on review discussions, to pin the thread to
> > > CPUs before starting its main function. This patch drops this. The
> > > commit message needs to explain why this is considered fine.
> > >
> > > > Should probably have:
> > > >
> > > >   Fixes: 4d9db06d66 ("libcamera: add method to set thread affinity")
> > > >
> > > > > Signed-off-by: Harvey Yang <chenghaoyang@chromium.org>
> > > > > ---
> > > > >  src/libcamera/base/thread.cpp | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
> > > > > index f6322fe31..319bfda9a 100644
> > > > > --- a/src/libcamera/base/thread.cpp
> > > > > +++ b/src/libcamera/base/thread.cpp
> > > > > @@ -257,6 +257,8 @@ void Thread::start()
> > > > >   data_->exit_.store(false, std::memory_order_relaxed);
> > > > >
> > > > >   thread_ = std::thread(&Thread::startThread, this);
> > > > > +
> > > > > + setThreadAffinityInternal();
> > > > >  }
> > > > >
> > > > >  void Thread::startThread()
> > > > > @@ -284,8 +286,6 @@ void Thread::startThread()
> > > > >   data_->tid_ = syscall(SYS_gettid);
> > > > >   currentThreadData = data_;
> > > > >
> > > > > - setThreadAffinityInternal();
> > > > > -
> > > > >   run();
> > > > >  }
> > > > >
> >
> > --
> > Regards,
> >
> > Laurent Pinchart

Patch
diff mbox series

diff --git a/src/libcamera/base/thread.cpp b/src/libcamera/base/thread.cpp
index f6322fe31..319bfda9a 100644
--- a/src/libcamera/base/thread.cpp
+++ b/src/libcamera/base/thread.cpp
@@ -257,6 +257,8 @@  void Thread::start()
 	data_->exit_.store(false, std::memory_order_relaxed);
 
 	thread_ = std::thread(&Thread::startThread, this);
+
+	setThreadAffinityInternal();
 }
 
 void Thread::startThread()
@@ -284,8 +286,6 @@  void Thread::startThread()
 	data_->tid_ = syscall(SYS_gettid);
 	currentThreadData = data_;
 
-	setThreadAffinityInternal();
-
 	run();
 }