Message ID | 20210423040427.1226704-1-hiroh@chromium.org |
---|---|
State | Superseded |
Headers | show |
Series |
|
Related | show |
Gentle ping for review. On Fri, Apr 23, 2021 at 1:04 PM Hirokazu Honda <hiroh@chromium.org> wrote: > CameraWorker::stop() executes Thread::exit() and Thread::wait(). > However, intuitively, the two functions doesn't clean up pending > messages in the message queue of the thread. They are rather > cleaned in the destruction of CameraWorker, but CameraWorker's > life time is the same as CameraDevice and thus CameraManager, > which is never destroyed. On the other hand, > Camera3RequestDescriptors are destroyed in CameraDevice::stop(). > > This causes CameraWorker::Worker::queueRequest() starts to be > executed after CameraWorker::start(), which is after > CameraDevice::stop(). This obviously causes a segmentation fault, > because CameraWorker::Worker::queueRequest() touches > CaptureRequest which is owned by Camera3RequestDescriptor. > > This fixes the issue by cancel pending message in thread queue in > CameraWorker::stop() with Thread::dispatchMessages(). > > Signed-off-by: Hirokazu Honda <hiroh@chromium.org> > --- > src/android/camera_worker.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp > index 300ddde0..b68ce9d5 100644 > --- a/src/android/camera_worker.cpp > +++ b/src/android/camera_worker.cpp > @@ -62,6 +62,7 @@ void CameraWorker::start() > > void CameraWorker::stop() > { > + thread_.dispatchMessages(Message::Type::InvokeMessage); > thread_.exit(); > thread_.wait(); > } > -- > 2.31.1.498.g6c1eba8ee3d-goog > >
Gentle ping for review. On Mon, May 10, 2021 at 3:03 PM Hirokazu Honda <hiroh@chromium.org> wrote: > Gentle ping for review. > > On Fri, Apr 23, 2021 at 1:04 PM Hirokazu Honda <hiroh@chromium.org> wrote: > >> CameraWorker::stop() executes Thread::exit() and Thread::wait(). >> However, intuitively, the two functions doesn't clean up pending >> messages in the message queue of the thread. They are rather >> cleaned in the destruction of CameraWorker, but CameraWorker's >> life time is the same as CameraDevice and thus CameraManager, >> which is never destroyed. On the other hand, >> Camera3RequestDescriptors are destroyed in CameraDevice::stop(). >> >> This causes CameraWorker::Worker::queueRequest() starts to be >> executed after CameraWorker::start(), which is after >> CameraDevice::stop(). This obviously causes a segmentation fault, >> because CameraWorker::Worker::queueRequest() touches >> CaptureRequest which is owned by Camera3RequestDescriptor. >> >> This fixes the issue by cancel pending message in thread queue in >> CameraWorker::stop() with Thread::dispatchMessages(). >> >> Signed-off-by: Hirokazu Honda <hiroh@chromium.org> >> --- >> src/android/camera_worker.cpp | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp >> index 300ddde0..b68ce9d5 100644 >> --- a/src/android/camera_worker.cpp >> +++ b/src/android/camera_worker.cpp >> @@ -62,6 +62,7 @@ void CameraWorker::start() >> >> void CameraWorker::stop() >> { >> + thread_.dispatchMessages(Message::Type::InvokeMessage); >> thread_.exit(); >> thread_.wait(); >> } >> -- >> 2.31.1.498.g6c1eba8ee3d-goog >> >>
On Fri, Apr 23, 2021 at 01:04:27PM +0900, Hirokazu Honda wrote: > CameraWorker::stop() executes Thread::exit() and Thread::wait(). > However, intuitively, the two functions doesn't clean up pending > messages in the message queue of the thread. They are rather > cleaned in the destruction of CameraWorker, but CameraWorker's > life time is the same as CameraDevice and thus CameraManager, > which is never destroyed. On the other hand, > Camera3RequestDescriptors are destroyed in CameraDevice::stop(). > > This causes CameraWorker::Worker::queueRequest() starts to be > executed after CameraWorker::start(), which is after I'm not sure I'm following right the sequence of Worker::queueRequest(), CameraWorker::start() and CameraDevice::stop() Should this be CameraWorker::stop() ? Did you ever triggered this failure ? > CameraDevice::stop(). This obviously causes a segmentation fault, > because CameraWorker::Worker::queueRequest() touches > CaptureRequest which is owned by Camera3RequestDescriptor. > > This fixes the issue by cancel pending message in thread queue in > CameraWorker::stop() with Thread::dispatchMessages(). For my limited understanding of the Object and Thread classes, I think it's ok to dispatch all pending method calls. However what concerns me is also Request that are pending in void CameraWorker::Worker::processRequest() when the Worker gets stopped. Should we wait for the queue of pending requests to have finised waiting their fences (or instrument an interruption mechanism) before returning from stop() ? > > Signed-off-by: Hirokazu Honda <hiroh@chromium.org> > --- > src/android/camera_worker.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp > index 300ddde0..b68ce9d5 100644 > --- a/src/android/camera_worker.cpp > +++ b/src/android/camera_worker.cpp > @@ -62,6 +62,7 @@ void CameraWorker::start() > > void CameraWorker::stop() > { > + thread_.dispatchMessages(Message::Type::InvokeMessage); > thread_.exit(); > thread_.wait(); > } > -- > 2.31.1.498.g6c1eba8ee3d-goog >
Hi Jacopo, thank you for reviewing. On Thu, May 20, 2021 at 8:12 PM Jacopo Mondi <jacopo@jmondi.org> wrote: > On Fri, Apr 23, 2021 at 01:04:27PM +0900, Hirokazu Honda wrote: > > CameraWorker::stop() executes Thread::exit() and Thread::wait(). > > However, intuitively, the two functions doesn't clean up pending > s/intuitively/surprisingly/ > messages in the message queue of the thread. They are rather > > cleaned in the destruction of CameraWorker, but CameraWorker's > > life time is the same as CameraDevice and thus CameraManager, > > which is never destroyed. On the other hand, > > Camera3RequestDescriptors are destroyed in CameraDevice::stop(). > > > > This causes CameraWorker::Worker::queueRequest() starts to be > > executed after CameraWorker::start(), which is after > > I'm not sure I'm following right the sequence of Worker::queueRequest(), > CameraWorker::start() and CameraDevice::stop() > > Should this be CameraWorker::stop() ? > > Did you ever triggered this failure ? > > I have encountered this issue when I tested my flush(). The HAL API flow is CameraDevice::close() -> (CameraDevice::configureStreams() ->) CameraDevice::processCaptureRequest(). The internal flow is * in CameraDevice::close() - CameraWorker::stop() - Thread::exit() - Thread::wait() - Camera::stop() * in CameraDevice::processCaptureResult() - CameraWorker::start() As I said, after CameraWorker::stop(), pending requests remain in the queue. CameraWorker::start() starts with the remained requests and because the object of the requests are destroyed by clearing descriptors_, the SIGSEGV happens. > > CameraDevice::stop(). This obviously causes a segmentation fault, > > because CameraWorker::Worker::queueRequest() touches > > CaptureRequest which is owned by Camera3RequestDescriptor. > > > > This fixes the issue by cancel pending message in thread queue in > > CameraWorker::stop() with Thread::dispatchMessages(). > > For my limited understanding of the Object and Thread classes, I think > it's ok to dispatch all pending method calls. > > However what concerns me is also Request that are pending in > void CameraWorker::Worker::processRequest() when the Worker gets > stopped. Should we wait for the queue of pending requests to have > finised waiting their fences (or instrument an interruption mechanism) > before returning from stop() ? > > If I understand correctly, Thread::dispatchMessages() executes CameraWorker::Worker::processRequest(). Thread::dispatchMessages() calls Object::message(), and it executes InvokeMessage::invoke(). I am honestly unsure. Laurent should know as Laurent is the original author of Thread::dispatchMessages(). Thanks, -Hiro > > > > Signed-off-by: Hirokazu Honda <hiroh@chromium.org> > > --- > > src/android/camera_worker.cpp | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/src/android/camera_worker.cpp > b/src/android/camera_worker.cpp > > index 300ddde0..b68ce9d5 100644 > > --- a/src/android/camera_worker.cpp > > +++ b/src/android/camera_worker.cpp > > @@ -62,6 +62,7 @@ void CameraWorker::start() > > > > void CameraWorker::stop() > > { > > + thread_.dispatchMessages(Message::Type::InvokeMessage); > > thread_.exit(); > > thread_.wait(); > > } > > -- > > 2.31.1.498.g6c1eba8ee3d-goog > > >
Hi Hiro On Fri, May 21, 2021 at 01:44:44PM +0900, Hirokazu Honda wrote: > Hi Jacopo, thank you for reviewing. CC Laurent for his opinion Thanks j > > On Thu, May 20, 2021 at 8:12 PM Jacopo Mondi <jacopo@jmondi.org> wrote: > > > On Fri, Apr 23, 2021 at 01:04:27PM +0900, Hirokazu Honda wrote: > > > CameraWorker::stop() executes Thread::exit() and Thread::wait(). > > > However, intuitively, the two functions doesn't clean up pending > > > > s/intuitively/surprisingly/ > > > messages in the message queue of the thread. They are rather > > > cleaned in the destruction of CameraWorker, but CameraWorker's > > > life time is the same as CameraDevice and thus CameraManager, > > > which is never destroyed. On the other hand, > > > Camera3RequestDescriptors are destroyed in CameraDevice::stop(). > > > > > > This causes CameraWorker::Worker::queueRequest() starts to be > > > executed after CameraWorker::start(), which is after > > > > I'm not sure I'm following right the sequence of Worker::queueRequest(), > > CameraWorker::start() and CameraDevice::stop() > > > > Should this be CameraWorker::stop() ? > > > > Did you ever triggered this failure ? > > > > > I have encountered this issue when I tested my flush(). > The HAL API flow is CameraDevice::close() -> > (CameraDevice::configureStreams() ->) CameraDevice::processCaptureRequest(). > The internal flow is > * in CameraDevice::close() > - CameraWorker::stop() > - Thread::exit() > - Thread::wait() > - Camera::stop() > * in CameraDevice::processCaptureResult() > - CameraWorker::start() > > As I said, after CameraWorker::stop(), pending requests remain in the queue. > CameraWorker::start() starts with the remained requests and because the > object of the requests are destroyed by clearing descriptors_, the SIGSEGV > happens. > > > > > CameraDevice::stop(). This obviously causes a segmentation fault, > > > because CameraWorker::Worker::queueRequest() touches > > > CaptureRequest which is owned by Camera3RequestDescriptor. > > > > > > This fixes the issue by cancel pending message in thread queue in > > > CameraWorker::stop() with Thread::dispatchMessages(). > > > > For my limited understanding of the Object and Thread classes, I think > > it's ok to dispatch all pending method calls. > > > > However what concerns me is also Request that are pending in > > void CameraWorker::Worker::processRequest() when the Worker gets > > stopped. Should we wait for the queue of pending requests to have > > finised waiting their fences (or instrument an interruption mechanism) > > before returning from stop() ? > > > > > If I understand correctly, Thread::dispatchMessages() executes > CameraWorker::Worker::processRequest(). > Thread::dispatchMessages() calls Object::message(), and it executes > InvokeMessage::invoke(). > I am honestly unsure. > Laurent should know as Laurent is the original author > of Thread::dispatchMessages(). > > Thanks, > -Hiro > > > > > > > Signed-off-by: Hirokazu Honda <hiroh@chromium.org> > > > --- > > > src/android/camera_worker.cpp | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/src/android/camera_worker.cpp > > b/src/android/camera_worker.cpp > > > index 300ddde0..b68ce9d5 100644 > > > --- a/src/android/camera_worker.cpp > > > +++ b/src/android/camera_worker.cpp > > > @@ -62,6 +62,7 @@ void CameraWorker::start() > > > > > > void CameraWorker::stop() > > > { > > > + thread_.dispatchMessages(Message::Type::InvokeMessage); > > > thread_.exit(); > > > thread_.wait(); > > > } > > > -- > > > 2.31.1.498.g6c1eba8ee3d-goog > > > > >
Hello, On Fri, May 21, 2021 at 01:44:44PM +0900, Hirokazu Honda wrote: > On Thu, May 20, 2021 at 8:12 PM Jacopo Mondi wrote: > > On Fri, Apr 23, 2021 at 01:04:27PM +0900, Hirokazu Honda wrote: > > > CameraWorker::stop() executes Thread::exit() and Thread::wait(). > > > However, intuitively, the two functions doesn't clean up pending > > s/intuitively/surprisingly/ > > > messages in the message queue of the thread. They are rather > > > cleaned in the destruction of CameraWorker, but CameraWorker's > > > life time is the same as CameraDevice and thus CameraManager, > > > which is never destroyed. On the other hand, > > > Camera3RequestDescriptors are destroyed in CameraDevice::stop(). > > > > > > This causes CameraWorker::Worker::queueRequest() starts to be > > > executed after CameraWorker::start(), which is after > > > > I'm not sure I'm following right the sequence of Worker::queueRequest(), > > CameraWorker::start() and CameraDevice::stop() > > > > Should this be CameraWorker::stop() ? > > > > Did you ever triggered this failure ? > > I have encountered this issue when I tested my flush(). > The HAL API flow is CameraDevice::close() -> > (CameraDevice::configureStreams() ->) CameraDevice::processCaptureRequest(). > The internal flow is > * in CameraDevice::close() > - CameraWorker::stop() > - Thread::exit() > - Thread::wait() > - Camera::stop() > * in CameraDevice::processCaptureResult() > - CameraWorker::start() > > As I said, after CameraWorker::stop(), pending requests remain in the queue. > CameraWorker::start() starts with the remained requests and because the > object of the requests are destroyed by clearing descriptors_, the SIGSEGV > happens. You're right. We can have messages in the queue when the thread is stopped, and they're not removed. I won't say it's by design :-), but thinking about it, forcing automatic dispatching of all messages when the thread is stopped would prevent fast stop of threads, so I think it shouldn't be done by default (or at least not without a way to prevent it). > > > CameraDevice::stop(). This obviously causes a segmentation fault, > > > because CameraWorker::Worker::queueRequest() touches > > > CaptureRequest which is owned by Camera3RequestDescriptor. > > > > > > This fixes the issue by cancel pending message in thread queue in > > > CameraWorker::stop() with Thread::dispatchMessages(). > > > > For my limited understanding of the Object and Thread classes, I think > > it's ok to dispatch all pending method calls. > > > > However what concerns me is also Request that are pending in > > void CameraWorker::Worker::processRequest() when the Worker gets > > stopped. Should we wait for the queue of pending requests to have > > finised waiting their fences (or instrument an interruption mechanism) > > before returning from stop() ? > > If I understand correctly, Thread::dispatchMessages() executes > CameraWorker::Worker::processRequest(). > Thread::dispatchMessages() calls Object::message(), and it executes > InvokeMessage::invoke(). > I am honestly unsure. > Laurent should know as Laurent is the original author > of Thread::dispatchMessages(). > > > > Signed-off-by: Hirokazu Honda <hiroh@chromium.org> > > > --- > > > src/android/camera_worker.cpp | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp > > > index 300ddde0..b68ce9d5 100644 > > > --- a/src/android/camera_worker.cpp > > > +++ b/src/android/camera_worker.cpp > > > @@ -62,6 +62,7 @@ void CameraWorker::start() > > > > > > void CameraWorker::stop() > > > { > > > + thread_.dispatchMessages(Message::Type::InvokeMessage); Thread::dispatchMessages() is meant to be called from within the thread only. This isn't documented, I'll send a patch for that. One option here to call dispatchMessage() from the thread would be the following (untested): diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp index 300ddde03645..9f727826e23f 100644 --- a/src/android/camera_worker.cpp +++ b/src/android/camera_worker.cpp @@ -52,18 +52,24 @@ void CaptureRequest::queue() */ CameraWorker::CameraWorker() { - worker_.moveToThread(&thread_); + worker_.moveToThread(this); } void CameraWorker::start() { - thread_.start(); + Thread::start(); } void CameraWorker::stop() { - thread_.exit(); - thread_.wait(); + exit(); + wait(); +} + +void CameraWorker::run() +{ + exec(); + dispatchMessages(Message::Type::InvokeMessage); } void CameraWorker::queueRequest(CaptureRequest *request) diff --git a/src/android/camera_worker.h b/src/android/camera_worker.h index 64b1658b61b4..e289ef9b8655 100644 --- a/src/android/camera_worker.h +++ b/src/android/camera_worker.h @@ -42,7 +42,7 @@ private: std::unique_ptr<libcamera::Request> request_; }; -class CameraWorker +class CameraWorker : private libcamera::Thread { public: CameraWorker(); @@ -52,6 +52,9 @@ public: void queueRequest(CaptureRequest *request); +protected: + void run() override; + private: class Worker : public libcamera::Object { @@ -63,7 +66,6 @@ private: }; Worker worker_; - libcamera::Thread thread_; }; #endif /* __ANDROID_CAMERA_WORKER_H__ */ Could you test this to see if it fixes your issue ? I'm also considering deleting all posted messages when the thread is stopped, as I really can't see a use case for keeping messages to be processed when the thread is restarted. Does anyone have an opinion ? I'll likely post a patch as an RFC. > > > thread_.exit(); > > > thread_.wait(); > > > }
diff --git a/src/android/camera_worker.cpp b/src/android/camera_worker.cpp index 300ddde0..b68ce9d5 100644 --- a/src/android/camera_worker.cpp +++ b/src/android/camera_worker.cpp @@ -62,6 +62,7 @@ void CameraWorker::start() void CameraWorker::stop() { + thread_.dispatchMessages(Message::Type::InvokeMessage); thread_.exit(); thread_.wait(); }
CameraWorker::stop() executes Thread::exit() and Thread::wait(). However, intuitively, the two functions doesn't clean up pending messages in the message queue of the thread. They are rather cleaned in the destruction of CameraWorker, but CameraWorker's life time is the same as CameraDevice and thus CameraManager, which is never destroyed. On the other hand, Camera3RequestDescriptors are destroyed in CameraDevice::stop(). This causes CameraWorker::Worker::queueRequest() starts to be executed after CameraWorker::start(), which is after CameraDevice::stop(). This obviously causes a segmentation fault, because CameraWorker::Worker::queueRequest() touches CaptureRequest which is owned by Camera3RequestDescriptor. This fixes the issue by cancel pending message in thread queue in CameraWorker::stop() with Thread::dispatchMessages(). Signed-off-by: Hirokazu Honda <hiroh@chromium.org> --- src/android/camera_worker.cpp | 1 + 1 file changed, 1 insertion(+)