[libcamera-devel,8/8] android: Implement flush() camera operation
diff mbox series

Message ID 20210510105235.28319-9-jacopo@jmondi.org
State Superseded
Delegated to: Jacopo Mondi
Headers show
Series
  • android: Implement flush() camera operation
Related show

Commit Message

Jacopo Mondi May 10, 2021, 10:52 a.m. UTC
Implement the flush() camera operation in the CameraDevice class
and make it available to the camera framework by implementing the
operation wrapper in camera_ops.cpp.

The flush() implementation stops the Camera and the worker thread and
waits for all in-flight requests to be returned. Stopping the Camera
forces all Requests already queued to be returned immediately in error
state. As flush() has to wait until all of them have been returned, make it
wait on a newly introduced condition variable which is notified by the
request completion handler when the queue of pending requests has been
exhausted.

As flush() can race with processCaptureRequest() protect the requests
queueing by introducing a new CameraState::CameraFlushing state that
processCaptureRequest() inspects before queuing the Request to the
Camera. If flush() has been called while processCaptureRequest() was
executing, return the current Request immediately in error state.

Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
---
 src/android/camera_device.cpp | 63 +++++++++++++++++++++++++++++++++++
 src/android/camera_device.h   |  6 ++++
 src/android/camera_ops.cpp    |  8 ++++-
 3 files changed, 76 insertions(+), 1 deletion(-)

Comments

Niklas Söderlund May 10, 2021, 8:39 p.m. UTC | #1
Hi Jacopo,

Thanks for your work.

On 2021-05-10 12:52:35 +0200, Jacopo Mondi wrote:
> Implement the flush() camera operation in the CameraDevice class
> and make it available to the camera framework by implementing the
> operation wrapper in camera_ops.cpp.
> 
> The flush() implementation stops the Camera and the worker thread and
> waits for all in-flight requests to be returned. Stopping the Camera
> forces all Requests already queued to be returned immediately in error
> state. As flush() has to wait until all of them have been returned, make it
> wait on a newly introduced condition variable which is notified by the
> request completion handler when the queue of pending requests has been
> exhausted.
> 
> As flush() can race with processCaptureRequest() protect the requests
> queueing by introducing a new CameraState::CameraFlushing state that
> processCaptureRequest() inspects before queuing the Request to the
> Camera. If flush() has been called while processCaptureRequest() was
> executing, return the current Request immediately in error state.
> 
> Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
> ---
>  src/android/camera_device.cpp | 63 +++++++++++++++++++++++++++++++++++
>  src/android/camera_device.h   |  6 ++++
>  src/android/camera_ops.cpp    |  8 ++++-
>  3 files changed, 76 insertions(+), 1 deletion(-)
> 
> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> index fa12ce5b0133..01b3acd93c4b 100644
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -756,6 +756,42 @@ void CameraDevice::close()
>  	camera_->release();
>  }
>  
> +/*
> + * Flush is similar to stop() but sets the camera state to 'flushing' and wait
> + * until all the in-flight requests have been returned.
> + */
> +void CameraDevice::flush()
> +{
> +	{
> +		MutexLocker cameraLock(cameraMutex_);
> +
> +		if (state_ != CameraRunning)
> +			return;
> +
> +		state_ = CameraFlushing;
> +
> +		/*
> +		 * Stop the camera and set the state to flushing to prevent any
> +		 * new request to be queued from this point on.
> +		 */
> +		worker_.stop();
> +		camera_->stop();
> +	}

Releasing cameraMutex_ while waiting for the flushMutex_ signal seems 
like a race waiting to happen as the operation is acting in 
synchronization with the state_ statemachine.

I understand you release it as you want to lock it in 
CameraDevice::stop(), is there any other potential race site? If not 
maybe CameraDevice::stop() can be reworked? It only needs to read the 
state so is the mutex really needed, could it be reworked to an atomic?  
Or maybe there is something in the HAL itself that would prevent other 
callbacks to change the state from CameraFlushing -> CameraRunning while 
flush() is executing?

> +
> +	/*
> +	 * Now wait for all the in-flight requests to be completed before
> +	 * returning. Stopping the Camera guarantees that all in-flight requests
> +	 * are completed in error state.
> +	 */
> +	{
> +		MutexLocker flushLock(flushMutex_);
> +		flushing_.wait(flushLock, [&] { return descriptors_.empty(); });
> +	}
> +
> +	MutexLocker cameraLock(cameraMutex_);
> +	state_ = CameraStopped;
> +}
> +
>  void CameraDevice::stop()
>  {
>  	MutexLocker cameraLock(cameraMutex_);
> @@ -2019,6 +2055,26 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>  	if (ret)
>  		return ret;
>  
> +
> +	/*
> +	 * Just before queuing the request, make sure flush() has not
> +	 * been called after this function has been executed. In that
> +	 * case, immediately return the request with errors.
> +	 */
> +	MutexLocker cameraLock(cameraMutex_);
> +	if (state_ == CameraFlushing || state_ == CameraStopped) {
> +		for (camera3_stream_buffer_t &buffer : descriptor.buffers_) {
> +			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> +			buffer.release_fence = buffer.acquire_fence;
> +		}
> +
> +		notifyError(descriptor.frameNumber_,
> +			    descriptor.buffers_[0].stream,
> +			    CAMERA3_MSG_ERROR_REQUEST);
> +
> +		return 0;
> +	}
> +
>  	worker_.queueRequest(descriptor.request_.get());
>  
>  	{
> @@ -2052,6 +2108,13 @@ void CameraDevice::requestComplete(Request *request)
>  	}
>  	Camera3RequestDescriptor &descriptor = node.mapped();
>  
> +	/* Release flush if all the pending requests have been completed. */
> +	{
> +		MutexLocker flushLock(flushMutex_);
> +		if (descriptors_.empty())
> +			flushing_.notify_one();
> +	}
> +
>  	/*
>  	 * Prepare the capture result for the Android camera stack.
>  	 *
> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> index ed992ae56d5d..4a9ef495b776 100644
> --- a/src/android/camera_device.h
> +++ b/src/android/camera_device.h
> @@ -7,6 +7,7 @@
>  #ifndef __ANDROID_CAMERA_DEVICE_H__
>  #define __ANDROID_CAMERA_DEVICE_H__
>  
> +#include <condition_variable>
>  #include <map>
>  #include <memory>
>  #include <mutex>
> @@ -42,6 +43,7 @@ public:
>  
>  	int open(const hw_module_t *hardwareModule);
>  	void close();
> +	void flush();
>  
>  	unsigned int id() const { return id_; }
>  	camera3_device_t *camera3Device() { return &camera3Device_; }
> @@ -92,6 +94,7 @@ private:
>  	enum State {
>  		CameraStopped,
>  		CameraRunning,
> +		CameraFlushing,
>  	};
>  
>  	void stop();
> @@ -124,6 +127,9 @@ private:
>  	libcamera::Mutex cameraMutex_; /* Protects access to the camera state. */
>  	State state_;
>  
> +	libcamera::Mutex flushMutex_; /* Protects the flushing condition variable. */
> +	std::condition_variable flushing_;
> +
>  	std::shared_ptr<libcamera::Camera> camera_;
>  	std::unique_ptr<libcamera::CameraConfiguration> config_;
>  
> diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
> index 696e80436821..8a3cfa175ff5 100644
> --- a/src/android/camera_ops.cpp
> +++ b/src/android/camera_ops.cpp
> @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const struct camera3_device *dev,
>  {
>  }
>  
> -static int hal_dev_flush([[maybe_unused]] const struct camera3_device *dev)
> +static int hal_dev_flush(const struct camera3_device *dev)
>  {
> +	if (!dev)
> +		return -EINVAL;
> +
> +	CameraDevice *camera = reinterpret_cast<CameraDevice *>(dev->priv);
> +	camera->flush();
> +
>  	return 0;
>  }
>  
> -- 
> 2.31.1
> 
> _______________________________________________
> libcamera-devel mailing list
> libcamera-devel@lists.libcamera.org
> https://lists.libcamera.org/listinfo/libcamera-devel
Hirokazu Honda May 11, 2021, 6:15 a.m. UTC | #2
Hi Jacopo, thank you for the work.

On Tue, May 11, 2021 at 5:39 AM Niklas Söderlund <
niklas.soderlund@ragnatech.se> wrote:

> Hi Jacopo,
>
> Thanks for your work.
>
> On 2021-05-10 12:52:35 +0200, Jacopo Mondi wrote:
> > Implement the flush() camera operation in the CameraDevice class
> > and make it available to the camera framework by implementing the
> > operation wrapper in camera_ops.cpp.
> >
> > The flush() implementation stops the Camera and the worker thread and
> > waits for all in-flight requests to be returned. Stopping the Camera
> > forces all Requests already queued to be returned immediately in error
> > state. As flush() has to wait until all of them have been returned, make
> it
> > wait on a newly introduced condition variable which is notified by the
> > request completion handler when the queue of pending requests has been
> > exhausted.
> >
> > As flush() can race with processCaptureRequest() protect the requests
> > queueing by introducing a new CameraState::CameraFlushing state that
> > processCaptureRequest() inspects before queuing the Request to the
> > Camera. If flush() has been called while processCaptureRequest() was
> > executing, return the current Request immediately in error state.
> >
> > Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
> > ---
> >  src/android/camera_device.cpp | 63 +++++++++++++++++++++++++++++++++++
> >  src/android/camera_device.h   |  6 ++++
> >  src/android/camera_ops.cpp    |  8 ++++-
> >  3 files changed, 76 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/android/camera_device.cpp
> b/src/android/camera_device.cpp
> > index fa12ce5b0133..01b3acd93c4b 100644
> > --- a/src/android/camera_device.cpp
> > +++ b/src/android/camera_device.cpp
> > @@ -756,6 +756,42 @@ void CameraDevice::close()
> >       camera_->release();
> >  }
> >
> > +/*
> > + * Flush is similar to stop() but sets the camera state to 'flushing'
> and wait
> > + * until all the in-flight requests have been returned.
> > + */
> > +void CameraDevice::flush()
> > +{
> > +     {
> > +             MutexLocker cameraLock(cameraMutex_);
> > +
> > +             if (state_ != CameraRunning)
> > +                     return;
> > +
> > +             state_ = CameraFlushing;
> > +
> > +             /*
> > +              * Stop the camera and set the state to flushing to
> prevent any
> > +              * new request to be queued from this point on.
> > +              */
> > +             worker_.stop();
> > +             camera_->stop();
> > +     }
>
> Releasing cameraMutex_ while waiting for the flushMutex_ signal seems
> like a race waiting to happen as the operation is acting in
> synchronization with the state_ statemachine.
>
> I understand you release it as you want to lock it in
> CameraDevice::stop(), is there any other potential race site? If not
> maybe CameraDevice::stop() can be reworked? It only needs to read the
> state so is the mutex really needed, could it be reworked to an atomic?
> Or maybe there is something in the HAL itself that would prevent other
> callbacks to change the state from CameraFlushing -> CameraRunning while
> flush() is executing?
>
> > +
> > +     /*
> > +      * Now wait for all the in-flight requests to be completed before
> > +      * returning. Stopping the Camera guarantees that all in-flight
> requests
> > +      * are completed in error state.
> > +      */
> > +     {
> > +             MutexLocker flushLock(flushMutex_);
> > +             flushing_.wait(flushLock, [&] { return
> descriptors_.empty(); });
> > +     }
> > +
> > +     MutexLocker cameraLock(cameraMutex_);
> > +     state_ = CameraStopped;
> > +}
> > +
> >  void CameraDevice::stop()
> >  {
> >       MutexLocker cameraLock(cameraMutex_);
> > @@ -2019,6 +2055,26 @@ int
> CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >       if (ret)
> >               return ret;
> >
> > +
> > +     /*
> > +      * Just before queuing the request, make sure flush() has not
> > +      * been called after this function has been executed. In that
> > +      * case, immediately return the request with errors.
> > +      */
> > +     MutexLocker cameraLock(cameraMutex_);
> > +     if (state_ == CameraFlushing || state_ == CameraStopped) {
> > +             for (camera3_stream_buffer_t &buffer :
> descriptor.buffers_) {
> > +                     buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> > +                     buffer.release_fence = buffer.acquire_fence;
> > +             }
> > +
> > +             notifyError(descriptor.frameNumber_,
> > +                         descriptor.buffers_[0].stream,
> > +                         CAMERA3_MSG_ERROR_REQUEST);
> > +
> > +             return 0;
> > +     }
> > +
> >       worker_.queueRequest(descriptor.request_.get());
> >
> >       {
> > @@ -2052,6 +2108,13 @@ void CameraDevice::requestComplete(Request
> *request)
> >       }
> >       Camera3RequestDescriptor &descriptor = node.mapped();
> >
> > +     /* Release flush if all the pending requests have been completed.
> */
> > +     {
> > +             MutexLocker flushLock(flushMutex_);
> > +             if (descriptors_.empty())
> > +                     flushing_.notify_one();
> > +     }
> > +
>

Is flushMutex_ necessary? I think it should be replaced with requestMutex_.
It is because descriptors_ is accessed in the condition of the wait
function and here, before calling notify_one().


>       /*
> >        * Prepare the capture result for the Android camera stack.
> >        *
> > diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> > index ed992ae56d5d..4a9ef495b776 100644
> > --- a/src/android/camera_device.h
> > +++ b/src/android/camera_device.h
> > @@ -7,6 +7,7 @@
> >  #ifndef __ANDROID_CAMERA_DEVICE_H__
> >  #define __ANDROID_CAMERA_DEVICE_H__
> >
> > +#include <condition_variable>
> >  #include <map>
> >  #include <memory>
> >  #include <mutex>
> > @@ -42,6 +43,7 @@ public:
> >
> >       int open(const hw_module_t *hardwareModule);
> >       void close();
> > +     void flush();
> >
> >       unsigned int id() const { return id_; }
> >       camera3_device_t *camera3Device() { return &camera3Device_; }
> > @@ -92,6 +94,7 @@ private:
> >       enum State {
> >               CameraStopped,
> >               CameraRunning,
> > +             CameraFlushing,
> >       };
> >
> >       void stop();
> > @@ -124,6 +127,9 @@ private:
> >       libcamera::Mutex cameraMutex_; /* Protects access to the camera
> state. */
> >       State state_;
> >
> > +     libcamera::Mutex flushMutex_; /* Protects the flushing condition
> variable. */
> > +     std::condition_variable flushing_;
> > +
> >       std::shared_ptr<libcamera::Camera> camera_;
> >       std::unique_ptr<libcamera::CameraConfiguration> config_;
> >
> > diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
> > index 696e80436821..8a3cfa175ff5 100644
> > --- a/src/android/camera_ops.cpp
> > +++ b/src/android/camera_ops.cpp
> > @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const
> struct camera3_device *dev,
> >  {
> >  }
> >
> > -static int hal_dev_flush([[maybe_unused]] const struct camera3_device
> *dev)
> > +static int hal_dev_flush(const struct camera3_device *dev)
> >  {
> > +     if (!dev)
> > +             return -EINVAL;
> > +
> > +     CameraDevice *camera = reinterpret_cast<CameraDevice *>(dev->priv);
> > +     camera->flush();
> > +
> >       return 0;
> >  }
> >


This implementation is good if close() and flush() are not called at any
time.
But, looks like it doesn't protect a concurrent call of close() and flush()
with configureStreams() and each other?

-Hiro


> > --
> > 2.31.1
> >
> > _______________________________________________
> > libcamera-devel mailing list
> > libcamera-devel@lists.libcamera.org
> > https://lists.libcamera.org/listinfo/libcamera-devel
>
> --
> Regards,
> Niklas Söderlund
> _______________________________________________
> libcamera-devel mailing list
> libcamera-devel@lists.libcamera.org
> https://lists.libcamera.org/listinfo/libcamera-devel
>
Jacopo Mondi May 11, 2021, 7:44 a.m. UTC | #3
Hi Niklas,

On Mon, May 10, 2021 at 10:39:09PM +0200, Niklas Söderlund wrote:
> Hi Jacopo,
>
> Thanks for your work.
>
> On 2021-05-10 12:52:35 +0200, Jacopo Mondi wrote:
> > Implement the flush() camera operation in the CameraDevice class
> > and make it available to the camera framework by implementing the
> > operation wrapper in camera_ops.cpp.
> >
> > The flush() implementation stops the Camera and the worker thread and
> > waits for all in-flight requests to be returned. Stopping the Camera
> > forces all Requests already queued to be returned immediately in error
> > state. As flush() has to wait until all of them have been returned, make it
> > wait on a newly introduced condition variable which is notified by the
> > request completion handler when the queue of pending requests has been
> > exhausted.
> >
> > As flush() can race with processCaptureRequest() protect the requests
> > queueing by introducing a new CameraState::CameraFlushing state that
> > processCaptureRequest() inspects before queuing the Request to the
> > Camera. If flush() has been called while processCaptureRequest() was
> > executing, return the current Request immediately in error state.
> >
> > Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
> > ---
> >  src/android/camera_device.cpp | 63 +++++++++++++++++++++++++++++++++++
> >  src/android/camera_device.h   |  6 ++++
> >  src/android/camera_ops.cpp    |  8 ++++-
> >  3 files changed, 76 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> > index fa12ce5b0133..01b3acd93c4b 100644
> > --- a/src/android/camera_device.cpp
> > +++ b/src/android/camera_device.cpp
> > @@ -756,6 +756,42 @@ void CameraDevice::close()
> >  	camera_->release();
> >  }
> >
> > +/*
> > + * Flush is similar to stop() but sets the camera state to 'flushing' and wait
> > + * until all the in-flight requests have been returned.
> > + */
> > +void CameraDevice::flush()
> > +{
> > +	{
> > +		MutexLocker cameraLock(cameraMutex_);
> > +
> > +		if (state_ != CameraRunning)
> > +			return;
> > +
> > +		state_ = CameraFlushing;
> > +
> > +		/*
> > +		 * Stop the camera and set the state to flushing to prevent any
> > +		 * new request to be queued from this point on.
> > +		 */
> > +		worker_.stop();
> > +		camera_->stop();
> > +	}
>
> Releasing cameraMutex_ while waiting for the flushMutex_ signal seems
> like a race waiting to happen as the operation is acting in
> synchronization with the state_ statemachine.

The two operations happens one after the other, not at the same time,
am I mistaken ?

>
> I understand you release it as you want to lock it in
> CameraDevice::stop(), is there any other potential race site? If not

Nope, that's not CameraDevice::stop() but raher
libcamera::Camera::stop()

cameraMutex serves to protect against any racing call to
processCaptureRequest() that wants to inspect the camera state to
decide if the request has to be queued or returned with error.


> maybe CameraDevice::stop() can be reworked? It only needs to read the
> state so is the mutex really needed, could it be reworked to an atomic?
> Or maybe there is something in the HAL itself that would prevent other
> callbacks to change the state from CameraFlushing -> CameraRunning while
> flush() is executing?
>
> > +
> > +	/*
> > +	 * Now wait for all the in-flight requests to be completed before
> > +	 * returning. Stopping the Camera guarantees that all in-flight requests
> > +	 * are completed in error state.
> > +	 */
> > +	{
> > +		MutexLocker flushLock(flushMutex_);
> > +		flushing_.wait(flushLock, [&] { return descriptors_.empty(); });
> > +	}
> > +
> > +	MutexLocker cameraLock(cameraMutex_);
> > +	state_ = CameraStopped;
> > +}
> > +
> >  void CameraDevice::stop()
> >  {
> >  	MutexLocker cameraLock(cameraMutex_);
> > @@ -2019,6 +2055,26 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >  	if (ret)
> >  		return ret;
> >
> > +
> > +	/*
> > +	 * Just before queuing the request, make sure flush() has not
> > +	 * been called after this function has been executed. In that
> > +	 * case, immediately return the request with errors.
> > +	 */
> > +	MutexLocker cameraLock(cameraMutex_);
> > +	if (state_ == CameraFlushing || state_ == CameraStopped) {
> > +		for (camera3_stream_buffer_t &buffer : descriptor.buffers_) {
> > +			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> > +			buffer.release_fence = buffer.acquire_fence;
> > +		}
> > +
> > +		notifyError(descriptor.frameNumber_,
> > +			    descriptor.buffers_[0].stream,
> > +			    CAMERA3_MSG_ERROR_REQUEST);
> > +
> > +		return 0;
> > +	}
> > +
> >  	worker_.queueRequest(descriptor.request_.get());
> >
> >  	{
> > @@ -2052,6 +2108,13 @@ void CameraDevice::requestComplete(Request *request)
> >  	}
> >  	Camera3RequestDescriptor &descriptor = node.mapped();
> >
> > +	/* Release flush if all the pending requests have been completed. */
> > +	{
> > +		MutexLocker flushLock(flushMutex_);
> > +		if (descriptors_.empty())
> > +			flushing_.notify_one();
> > +	}
> > +
> >  	/*
> >  	 * Prepare the capture result for the Android camera stack.
> >  	 *
> > diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> > index ed992ae56d5d..4a9ef495b776 100644
> > --- a/src/android/camera_device.h
> > +++ b/src/android/camera_device.h
> > @@ -7,6 +7,7 @@
> >  #ifndef __ANDROID_CAMERA_DEVICE_H__
> >  #define __ANDROID_CAMERA_DEVICE_H__
> >
> > +#include <condition_variable>
> >  #include <map>
> >  #include <memory>
> >  #include <mutex>
> > @@ -42,6 +43,7 @@ public:
> >
> >  	int open(const hw_module_t *hardwareModule);
> >  	void close();
> > +	void flush();
> >
> >  	unsigned int id() const { return id_; }
> >  	camera3_device_t *camera3Device() { return &camera3Device_; }
> > @@ -92,6 +94,7 @@ private:
> >  	enum State {
> >  		CameraStopped,
> >  		CameraRunning,
> > +		CameraFlushing,
> >  	};
> >
> >  	void stop();
> > @@ -124,6 +127,9 @@ private:
> >  	libcamera::Mutex cameraMutex_; /* Protects access to the camera state. */
> >  	State state_;
> >
> > +	libcamera::Mutex flushMutex_; /* Protects the flushing condition variable. */
> > +	std::condition_variable flushing_;
> > +
> >  	std::shared_ptr<libcamera::Camera> camera_;
> >  	std::unique_ptr<libcamera::CameraConfiguration> config_;
> >
> > diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
> > index 696e80436821..8a3cfa175ff5 100644
> > --- a/src/android/camera_ops.cpp
> > +++ b/src/android/camera_ops.cpp
> > @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const struct camera3_device *dev,
> >  {
> >  }
> >
> > -static int hal_dev_flush([[maybe_unused]] const struct camera3_device *dev)
> > +static int hal_dev_flush(const struct camera3_device *dev)
> >  {
> > +	if (!dev)
> > +		return -EINVAL;
> > +
> > +	CameraDevice *camera = reinterpret_cast<CameraDevice *>(dev->priv);
> > +	camera->flush();
> > +
> >  	return 0;
> >  }
> >
> > --
> > 2.31.1
> >
> > _______________________________________________
> > libcamera-devel mailing list
> > libcamera-devel@lists.libcamera.org
> > https://lists.libcamera.org/listinfo/libcamera-devel
>
> --
> Regards,
> Niklas Söderlund
Jacopo Mondi May 11, 2021, 7:50 a.m. UTC | #4
Hi Hiro,

On Tue, May 11, 2021 at 03:15:17PM +0900, Hirokazu Honda wrote:
> Hi Jacopo, thank you for the work.
>
> On Tue, May 11, 2021 at 5:39 AM Niklas Söderlund <
> niklas.soderlund@ragnatech.se> wrote:
>
> > Hi Jacopo,
> >
> > Thanks for your work.
> >
> > On 2021-05-10 12:52:35 +0200, Jacopo Mondi wrote:
> > > Implement the flush() camera operation in the CameraDevice class
> > > and make it available to the camera framework by implementing the
> > > operation wrapper in camera_ops.cpp.
> > >
> > > The flush() implementation stops the Camera and the worker thread and
> > > waits for all in-flight requests to be returned. Stopping the Camera
> > > forces all Requests already queued to be returned immediately in error
> > > state. As flush() has to wait until all of them have been returned, make
> > it
> > > wait on a newly introduced condition variable which is notified by the
> > > request completion handler when the queue of pending requests has been
> > > exhausted.
> > >
> > > As flush() can race with processCaptureRequest() protect the requests
> > > queueing by introducing a new CameraState::CameraFlushing state that
> > > processCaptureRequest() inspects before queuing the Request to the
> > > Camera. If flush() has been called while processCaptureRequest() was
> > > executing, return the current Request immediately in error state.
> > >
> > > Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
> > > ---
> > >  src/android/camera_device.cpp | 63 +++++++++++++++++++++++++++++++++++
> > >  src/android/camera_device.h   |  6 ++++
> > >  src/android/camera_ops.cpp    |  8 ++++-
> > >  3 files changed, 76 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/android/camera_device.cpp
> > b/src/android/camera_device.cpp
> > > index fa12ce5b0133..01b3acd93c4b 100644
> > > --- a/src/android/camera_device.cpp
> > > +++ b/src/android/camera_device.cpp
> > > @@ -756,6 +756,42 @@ void CameraDevice::close()
> > >       camera_->release();
> > >  }
> > >
> > > +/*
> > > + * Flush is similar to stop() but sets the camera state to 'flushing'
> > and wait
> > > + * until all the in-flight requests have been returned.
> > > + */
> > > +void CameraDevice::flush()
> > > +{
> > > +     {
> > > +             MutexLocker cameraLock(cameraMutex_);
> > > +
> > > +             if (state_ != CameraRunning)
> > > +                     return;
> > > +
> > > +             state_ = CameraFlushing;
> > > +
> > > +             /*
> > > +              * Stop the camera and set the state to flushing to
> > prevent any
> > > +              * new request to be queued from this point on.
> > > +              */
> > > +             worker_.stop();
> > > +             camera_->stop();
> > > +     }
> >
> > Releasing cameraMutex_ while waiting for the flushMutex_ signal seems
> > like a race waiting to happen as the operation is acting in
> > synchronization with the state_ statemachine.
> >
> > I understand you release it as you want to lock it in
> > CameraDevice::stop(), is there any other potential race site? If not
> > maybe CameraDevice::stop() can be reworked? It only needs to read the
> > state so is the mutex really needed, could it be reworked to an atomic?
> > Or maybe there is something in the HAL itself that would prevent other
> > callbacks to change the state from CameraFlushing -> CameraRunning while
> > flush() is executing?
> >
> > > +
> > > +     /*
> > > +      * Now wait for all the in-flight requests to be completed before
> > > +      * returning. Stopping the Camera guarantees that all in-flight
> > requests
> > > +      * are completed in error state.
> > > +      */
> > > +     {
> > > +             MutexLocker flushLock(flushMutex_);
> > > +             flushing_.wait(flushLock, [&] { return
> > descriptors_.empty(); });
> > > +     }
> > > +
> > > +     MutexLocker cameraLock(cameraMutex_);
> > > +     state_ = CameraStopped;
> > > +}
> > > +
> > >  void CameraDevice::stop()
> > >  {
> > >       MutexLocker cameraLock(cameraMutex_);
> > > @@ -2019,6 +2055,26 @@ int
> > CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> > >       if (ret)
> > >               return ret;
> > >
> > > +
> > > +     /*
> > > +      * Just before queuing the request, make sure flush() has not
> > > +      * been called after this function has been executed. In that
> > > +      * case, immediately return the request with errors.
> > > +      */
> > > +     MutexLocker cameraLock(cameraMutex_);
> > > +     if (state_ == CameraFlushing || state_ == CameraStopped) {
> > > +             for (camera3_stream_buffer_t &buffer :
> > descriptor.buffers_) {
> > > +                     buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> > > +                     buffer.release_fence = buffer.acquire_fence;
> > > +             }
> > > +
> > > +             notifyError(descriptor.frameNumber_,
> > > +                         descriptor.buffers_[0].stream,
> > > +                         CAMERA3_MSG_ERROR_REQUEST);
> > > +
> > > +             return 0;
> > > +     }
> > > +
> > >       worker_.queueRequest(descriptor.request_.get());
> > >
> > >       {
> > > @@ -2052,6 +2108,13 @@ void CameraDevice::requestComplete(Request
> > *request)
> > >       }
> > >       Camera3RequestDescriptor &descriptor = node.mapped();
> > >
> > > +     /* Release flush if all the pending requests have been completed.
> > */
> > > +     {
> > > +             MutexLocker flushLock(flushMutex_);
> > > +             if (descriptors_.empty())
> > > +                     flushing_.notify_one();
> > > +     }
> > > +
> >
>
> Is flushMutex_ necessary? I think it should be replaced with requestMutex_.
> It is because descriptors_ is accessed in the condition of the wait
> function and here, before calling notify_one().

You're probably right. I could replace flushMutex_ with requestMutex_
in the flush() implementation. That means I can move this block in the
previous one that extracts the descriptor.

>
>
> >       /*
> > >        * Prepare the capture result for the Android camera stack.
> > >        *
> > > diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> > > index ed992ae56d5d..4a9ef495b776 100644
> > > --- a/src/android/camera_device.h
> > > +++ b/src/android/camera_device.h
> > > @@ -7,6 +7,7 @@
> > >  #ifndef __ANDROID_CAMERA_DEVICE_H__
> > >  #define __ANDROID_CAMERA_DEVICE_H__
> > >
> > > +#include <condition_variable>
> > >  #include <map>
> > >  #include <memory>
> > >  #include <mutex>
> > > @@ -42,6 +43,7 @@ public:
> > >
> > >       int open(const hw_module_t *hardwareModule);
> > >       void close();
> > > +     void flush();
> > >
> > >       unsigned int id() const { return id_; }
> > >       camera3_device_t *camera3Device() { return &camera3Device_; }
> > > @@ -92,6 +94,7 @@ private:
> > >       enum State {
> > >               CameraStopped,
> > >               CameraRunning,
> > > +             CameraFlushing,
> > >       };
> > >
> > >       void stop();
> > > @@ -124,6 +127,9 @@ private:
> > >       libcamera::Mutex cameraMutex_; /* Protects access to the camera
> > state. */
> > >       State state_;
> > >
> > > +     libcamera::Mutex flushMutex_; /* Protects the flushing condition
> > variable. */
> > > +     std::condition_variable flushing_;
> > > +
> > >       std::shared_ptr<libcamera::Camera> camera_;
> > >       std::unique_ptr<libcamera::CameraConfiguration> config_;
> > >
> > > diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
> > > index 696e80436821..8a3cfa175ff5 100644
> > > --- a/src/android/camera_ops.cpp
> > > +++ b/src/android/camera_ops.cpp
> > > @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const
> > struct camera3_device *dev,
> > >  {
> > >  }
> > >
> > > -static int hal_dev_flush([[maybe_unused]] const struct camera3_device
> > *dev)
> > > +static int hal_dev_flush(const struct camera3_device *dev)
> > >  {
> > > +     if (!dev)
> > > +             return -EINVAL;
> > > +
> > > +     CameraDevice *camera = reinterpret_cast<CameraDevice *>(dev->priv);
> > > +     camera->flush();
> > > +
> > >       return 0;
> > >  }
> > >
>
>
> This implementation is good if close() and flush() are not called at any
> time.
> But, looks like it doesn't protect a concurrent call of close() and flush()
> with configureStreams() and each other?

I found no mention of close() potentially racing with flush() in the
camera3.h documentation, but I now recall your team considered that a
potential race.

close() can be instrumented to inspect the camera state, possibily by
adding a new CameraClosed state and taking care of it in
processCaptureRequest() and flush(). It shouldn't be hard.

Thanks
  j

>
> -Hiro
>
>
> > > --
> > > 2.31.1
> > >
> > > _______________________________________________
> > > libcamera-devel mailing list
> > > libcamera-devel@lists.libcamera.org
> > > https://lists.libcamera.org/listinfo/libcamera-devel
> >
> > --
> > Regards,
> > Niklas Söderlund
> > _______________________________________________
> > libcamera-devel mailing list
> > libcamera-devel@lists.libcamera.org
> > https://lists.libcamera.org/listinfo/libcamera-devel
> >
Hirokazu Honda May 11, 2021, 9:23 a.m. UTC | #5
Hi Jacopo,

On Tue, May 11, 2021 at 4:49 PM Jacopo Mondi <jacopo@jmondi.org> wrote:

> Hi Hiro,
>
> On Tue, May 11, 2021 at 03:15:17PM +0900, Hirokazu Honda wrote:
> > Hi Jacopo, thank you for the work.
> >
> > On Tue, May 11, 2021 at 5:39 AM Niklas Söderlund <
> > niklas.soderlund@ragnatech.se> wrote:
> >
> > > Hi Jacopo,
> > >
> > > Thanks for your work.
> > >
> > > On 2021-05-10 12:52:35 +0200, Jacopo Mondi wrote:
> > > > Implement the flush() camera operation in the CameraDevice class
> > > > and make it available to the camera framework by implementing the
> > > > operation wrapper in camera_ops.cpp.
> > > >
> > > > The flush() implementation stops the Camera and the worker thread and
> > > > waits for all in-flight requests to be returned. Stopping the Camera
> > > > forces all Requests already queued to be returned immediately in
> error
> > > > state. As flush() has to wait until all of them have been returned,
> make
> > > it
> > > > wait on a newly introduced condition variable which is notified by
> the
> > > > request completion handler when the queue of pending requests has
> been
> > > > exhausted.
> > > >
> > > > As flush() can race with processCaptureRequest() protect the requests
> > > > queueing by introducing a new CameraState::CameraFlushing state that
> > > > processCaptureRequest() inspects before queuing the Request to the
> > > > Camera. If flush() has been called while processCaptureRequest() was
> > > > executing, return the current Request immediately in error state.
> > > >
> > > > Signed-off-by: Jacopo Mondi <jacopo@jmondi.org>
> > > > ---
> > > >  src/android/camera_device.cpp | 63
> +++++++++++++++++++++++++++++++++++
> > > >  src/android/camera_device.h   |  6 ++++
> > > >  src/android/camera_ops.cpp    |  8 ++++-
> > > >  3 files changed, 76 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/src/android/camera_device.cpp
> > > b/src/android/camera_device.cpp
> > > > index fa12ce5b0133..01b3acd93c4b 100644
> > > > --- a/src/android/camera_device.cpp
> > > > +++ b/src/android/camera_device.cpp
> > > > @@ -756,6 +756,42 @@ void CameraDevice::close()
> > > >       camera_->release();
> > > >  }
> > > >
> > > > +/*
> > > > + * Flush is similar to stop() but sets the camera state to
> 'flushing'
> > > and wait
> > > > + * until all the in-flight requests have been returned.
> > > > + */
> > > > +void CameraDevice::flush()
> > > > +{
> > > > +     {
> > > > +             MutexLocker cameraLock(cameraMutex_);
> > > > +
> > > > +             if (state_ != CameraRunning)
> > > > +                     return;
> > > > +
> > > > +             state_ = CameraFlushing;
> > > > +
> > > > +             /*
> > > > +              * Stop the camera and set the state to flushing to
> > > prevent any
> > > > +              * new request to be queued from this point on.
> > > > +              */
> > > > +             worker_.stop();
> > > > +             camera_->stop();
> > > > +     }
> > >
> > > Releasing cameraMutex_ while waiting for the flushMutex_ signal seems
> > > like a race waiting to happen as the operation is acting in
> > > synchronization with the state_ statemachine.
> > >
> > > I understand you release it as you want to lock it in
> > > CameraDevice::stop(), is there any other potential race site? If not
> > > maybe CameraDevice::stop() can be reworked? It only needs to read the
> > > state so is the mutex really needed, could it be reworked to an atomic?
> > > Or maybe there is something in the HAL itself that would prevent other
> > > callbacks to change the state from CameraFlushing -> CameraRunning
> while
> > > flush() is executing?
> > >
> > > > +
> > > > +     /*
> > > > +      * Now wait for all the in-flight requests to be completed
> before
> > > > +      * returning. Stopping the Camera guarantees that all in-flight
> > > requests
> > > > +      * are completed in error state.
> > > > +      */
> > > > +     {
> > > > +             MutexLocker flushLock(flushMutex_);
> > > > +             flushing_.wait(flushLock, [&] { return
> > > descriptors_.empty(); });
> > > > +     }
> > > > +
> > > > +     MutexLocker cameraLock(cameraMutex_);
> > > > +     state_ = CameraStopped;
> > > > +}
> > > > +
> > > >  void CameraDevice::stop()
> > > >  {
> > > >       MutexLocker cameraLock(cameraMutex_);
> > > > @@ -2019,6 +2055,26 @@ int
> > > CameraDevice::processCaptureRequest(camera3_capture_request_t
> *camera3Reques
> > > >       if (ret)
> > > >               return ret;
> > > >
> > > > +
> > > > +     /*
> > > > +      * Just before queuing the request, make sure flush() has not
> > > > +      * been called after this function has been executed. In that
> > > > +      * case, immediately return the request with errors.
> > > > +      */
> > > > +     MutexLocker cameraLock(cameraMutex_);
> > > > +     if (state_ == CameraFlushing || state_ == CameraStopped) {
> > > > +             for (camera3_stream_buffer_t &buffer :
> > > descriptor.buffers_) {
> > > > +                     buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> > > > +                     buffer.release_fence = buffer.acquire_fence;
> > > > +             }
> > > > +
> > > > +             notifyError(descriptor.frameNumber_,
> > > > +                         descriptor.buffers_[0].stream,
> > > > +                         CAMERA3_MSG_ERROR_REQUEST);
> > > > +
> > > > +             return 0;
> > > > +     }
> > > > +
> > > >       worker_.queueRequest(descriptor.request_.get());
> > > >
> > > >       {
> > > > @@ -2052,6 +2108,13 @@ void CameraDevice::requestComplete(Request
> > > *request)
> > > >       }
> > > >       Camera3RequestDescriptor &descriptor = node.mapped();
> > > >
> > > > +     /* Release flush if all the pending requests have been
> completed.
> > > */
> > > > +     {
> > > > +             MutexLocker flushLock(flushMutex_);
> > > > +             if (descriptors_.empty())
> > > > +                     flushing_.notify_one();
> > > > +     }
> > > > +
> > >
> >
> > Is flushMutex_ necessary? I think it should be replaced with
> requestMutex_.
> > It is because descriptors_ is accessed in the condition of the wait
> > function and here, before calling notify_one().
>
> You're probably right. I could replace flushMutex_ with requestMutex_
> in the flush() implementation. That means I can move this block in the
> previous one that extracts the descriptor.
>
> >
> >
> > >       /*
> > > >        * Prepare the capture result for the Android camera stack.
> > > >        *
> > > > diff --git a/src/android/camera_device.h
> b/src/android/camera_device.h
> > > > index ed992ae56d5d..4a9ef495b776 100644
> > > > --- a/src/android/camera_device.h
> > > > +++ b/src/android/camera_device.h
> > > > @@ -7,6 +7,7 @@
> > > >  #ifndef __ANDROID_CAMERA_DEVICE_H__
> > > >  #define __ANDROID_CAMERA_DEVICE_H__
> > > >
> > > > +#include <condition_variable>
> > > >  #include <map>
> > > >  #include <memory>
> > > >  #include <mutex>
> > > > @@ -42,6 +43,7 @@ public:
> > > >
> > > >       int open(const hw_module_t *hardwareModule);
> > > >       void close();
> > > > +     void flush();
> > > >
> > > >       unsigned int id() const { return id_; }
> > > >       camera3_device_t *camera3Device() { return &camera3Device_; }
> > > > @@ -92,6 +94,7 @@ private:
> > > >       enum State {
> > > >               CameraStopped,
> > > >               CameraRunning,
> > > > +             CameraFlushing,
> > > >       };
> > > >
> > > >       void stop();
> > > > @@ -124,6 +127,9 @@ private:
> > > >       libcamera::Mutex cameraMutex_; /* Protects access to the camera
> > > state. */
> > > >       State state_;
> > > >
> > > > +     libcamera::Mutex flushMutex_; /* Protects the flushing
> condition
> > > variable. */
> > > > +     std::condition_variable flushing_;
> > > > +
> > > >       std::shared_ptr<libcamera::Camera> camera_;
> > > >       std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > >
> > > > diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
> > > > index 696e80436821..8a3cfa175ff5 100644
> > > > --- a/src/android/camera_ops.cpp
> > > > +++ b/src/android/camera_ops.cpp
> > > > @@ -66,8 +66,14 @@ static void hal_dev_dump([[maybe_unused]] const
> > > struct camera3_device *dev,
> > > >  {
> > > >  }
> > > >
> > > > -static int hal_dev_flush([[maybe_unused]] const struct
> camera3_device
> > > *dev)
> > > > +static int hal_dev_flush(const struct camera3_device *dev)
> > > >  {
> > > > +     if (!dev)
> > > > +             return -EINVAL;
> > > > +
> > > > +     CameraDevice *camera = reinterpret_cast<CameraDevice
> *>(dev->priv);
> > > > +     camera->flush();
> > > > +
> > > >       return 0;
> > > >  }
> > > >
> >
> >
> > This implementation is good if close() and flush() are not called at any
> > time.
> > But, looks like it doesn't protect a concurrent call of close() and
> flush()
> > with configureStreams() and each other?
>
> I found no mention of close() potentially racing with flush() in the
> camera3.h documentation, but I now recall your team considered that a
> potential race.
>
> close() can be instrumented to inspect the camera state, possibily by
> adding a new CameraClosed state and taking care of it in
> processCaptureRequest() and flush(). It shouldn't be hard.
>
>
It sounds good. Again I think you need to guard configureStreams() as well
as processCaptureRequest().
I look forward to your next version.

-Hiro

> Thanks
>   j
>
> >
> > -Hiro
> >
> >
> > > > --
> > > > 2.31.1
> > > >
> > > > _______________________________________________
> > > > libcamera-devel mailing list
> > > > libcamera-devel@lists.libcamera.org
> > > > https://lists.libcamera.org/listinfo/libcamera-devel
> > >
> > > --
> > > Regards,
> > > Niklas Söderlund
> > > _______________________________________________
> > > libcamera-devel mailing list
> > > libcamera-devel@lists.libcamera.org
> > > https://lists.libcamera.org/listinfo/libcamera-devel
> > >
>

Patch
diff mbox series

diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
index fa12ce5b0133..01b3acd93c4b 100644
--- a/src/android/camera_device.cpp
+++ b/src/android/camera_device.cpp
@@ -756,6 +756,42 @@  void CameraDevice::close()
 	camera_->release();
 }
 
+/*
+ * Flush is similar to stop() but sets the camera state to 'flushing' and wait
+ * until all the in-flight requests have been returned.
+ */
+void CameraDevice::flush()
+{
+	{
+		MutexLocker cameraLock(cameraMutex_);
+
+		if (state_ != CameraRunning)
+			return;
+
+		state_ = CameraFlushing;
+
+		/*
+		 * Stop the camera and set the state to flushing to prevent any
+		 * new request to be queued from this point on.
+		 */
+		worker_.stop();
+		camera_->stop();
+	}
+
+	/*
+	 * Now wait for all the in-flight requests to be completed before
+	 * returning. Stopping the Camera guarantees that all in-flight requests
+	 * are completed in error state.
+	 */
+	{
+		MutexLocker flushLock(flushMutex_);
+		flushing_.wait(flushLock, [&] { return descriptors_.empty(); });
+	}
+
+	MutexLocker cameraLock(cameraMutex_);
+	state_ = CameraStopped;
+}
+
 void CameraDevice::stop()
 {
 	MutexLocker cameraLock(cameraMutex_);
@@ -2019,6 +2055,26 @@  int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 	if (ret)
 		return ret;
 
+
+	/*
+	 * Just before queuing the request, make sure flush() has not
+	 * been called after this function has been executed. In that
+	 * case, immediately return the request with errors.
+	 */
+	MutexLocker cameraLock(cameraMutex_);
+	if (state_ == CameraFlushing || state_ == CameraStopped) {
+		for (camera3_stream_buffer_t &buffer : descriptor.buffers_) {
+			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
+			buffer.release_fence = buffer.acquire_fence;
+		}
+
+		notifyError(descriptor.frameNumber_,
+			    descriptor.buffers_[0].stream,
+			    CAMERA3_MSG_ERROR_REQUEST);
+
+		return 0;
+	}
+
 	worker_.queueRequest(descriptor.request_.get());
 
 	{
@@ -2052,6 +2108,13 @@  void CameraDevice::requestComplete(Request *request)
 	}
 	Camera3RequestDescriptor &descriptor = node.mapped();
 
+	/* Release flush if all the pending requests have been completed. */
+	{
+		MutexLocker flushLock(flushMutex_);
+		if (descriptors_.empty())
+			flushing_.notify_one();
+	}
+
 	/*
 	 * Prepare the capture result for the Android camera stack.
 	 *
diff --git a/src/android/camera_device.h b/src/android/camera_device.h
index ed992ae56d5d..4a9ef495b776 100644
--- a/src/android/camera_device.h
+++ b/src/android/camera_device.h
@@ -7,6 +7,7 @@ 
 #ifndef __ANDROID_CAMERA_DEVICE_H__
 #define __ANDROID_CAMERA_DEVICE_H__
 
+#include <condition_variable>
 #include <map>
 #include <memory>
 #include <mutex>
@@ -42,6 +43,7 @@  public:
 
 	int open(const hw_module_t *hardwareModule);
 	void close();
+	void flush();
 
 	unsigned int id() const { return id_; }
 	camera3_device_t *camera3Device() { return &camera3Device_; }
@@ -92,6 +94,7 @@  private:
 	enum State {
 		CameraStopped,
 		CameraRunning,
+		CameraFlushing,
 	};
 
 	void stop();
@@ -124,6 +127,9 @@  private:
 	libcamera::Mutex cameraMutex_; /* Protects access to the camera state. */
 	State state_;
 
+	libcamera::Mutex flushMutex_; /* Protects the flushing condition variable. */
+	std::condition_variable flushing_;
+
 	std::shared_ptr<libcamera::Camera> camera_;
 	std::unique_ptr<libcamera::CameraConfiguration> config_;
 
diff --git a/src/android/camera_ops.cpp b/src/android/camera_ops.cpp
index 696e80436821..8a3cfa175ff5 100644
--- a/src/android/camera_ops.cpp
+++ b/src/android/camera_ops.cpp
@@ -66,8 +66,14 @@  static void hal_dev_dump([[maybe_unused]] const struct camera3_device *dev,
 {
 }
 
-static int hal_dev_flush([[maybe_unused]] const struct camera3_device *dev)
+static int hal_dev_flush(const struct camera3_device *dev)
 {
+	if (!dev)
+		return -EINVAL;
+
+	CameraDevice *camera = reinterpret_cast<CameraDevice *>(dev->priv);
+	camera->flush();
+
 	return 0;
 }