[libcamera-devel,v1,3/3] android: camera_device: Send capture results by inspecting the queue

Message ID 20210927111149.692004-4-umang.jain@ideasonboard.com
State Superseded
Delegated to: Umang Jain
Series
  • Camera3RequestDescriptors std::map => deque

Commit Message

Umang Jain Sept. 27, 2021, 11:11 a.m. UTC
There is a possibility that an out-of-order completion of capture
request happens by calling process_capture_result() directly on error
paths. The framework expects that errors should be notified as soon as
possible, but the request completion order should remain intact.
An existing instance of this is abortRequest(), which sends the capture
results on flushing state, without considering order-of-completion.

Since, we have a queue of Camera3RequestDescriptor tracking each
capture request placed by framework to libcamera HAL, we should be only
sending back capture results from a single location, by inspecting
the queue. As per the patch, this now happens in
CameraDevice::sendCaptureResults().

Each descriptor is now equipped with its own status to denote whether
the capture request is complete and ready to send back to the framework
or need to be waited upon. This ensures that the order of completion is
respected for the requests.

Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
---
 src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------
 src/android/camera_device.h   | 15 +++++++++++-
 2 files changed, 49 insertions(+), 12 deletions(-)
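
The mechanism described in the commit message can be sketched in isolation.
This is a minimal, self-contained sketch and not the libcamera code:
Descriptor, ResultQueue, add() and complete() are hypothetical names, and a
std::function callback stands in for process_capture_result().

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

/* Hypothetical stand-in for Camera3RequestDescriptor. */
struct Descriptor {
	enum class Status { Pending, Success, Error };

	bool isPending() const { return status == Status::Pending; }

	unsigned int frameNumber = 0;
	Status status = Status::Pending;
};

class ResultQueue
{
public:
	using Callback = std::function<void(const Descriptor &)>;

	explicit ResultQueue(Callback cb) : callback_(std::move(cb)) {}

	/* Track a new request; the returned pointer stays valid until sent. */
	Descriptor *add(unsigned int frameNumber)
	{
		auto descriptor = std::make_unique<Descriptor>();
		descriptor->frameNumber = frameNumber;

		std::lock_guard<std::mutex> lock(mutex_);
		descriptors_.push_back(std::move(descriptor));
		return descriptors_.back().get();
	}

	/* Record the final status, then send whatever is ready at the front. */
	void complete(Descriptor *descriptor, Descriptor::Status status)
	{
		assert(descriptor && status != Descriptor::Status::Pending);
		{
			std::lock_guard<std::mutex> lock(mutex_);
			descriptor->status = status;
		}
		sendResults();
	}

private:
	/* The only place results are sent from, always in queue order. */
	void sendResults()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
			std::unique_ptr<Descriptor> descriptor =
				std::move(descriptors_.front());
			descriptors_.pop_front();

			/* Do not hold the lock while calling out. */
			lock.unlock();
			callback_(*descriptor);
			lock.lock();
		}
	}

	Callback callback_;
	std::mutex mutex_;
	std::deque<std::unique_ptr<Descriptor>> descriptors_;
};
```

In this sketch, if frames 1 and 2 are added and frame 2 completes first (for
example with an error), nothing is sent until frame 1 completes; both are then
delivered in queue order.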

Comments

Jacopo Mondi Sept. 27, 2021, 9:08 p.m. UTC | #1
Hi Umang,

On Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:
> There is a possibility that an out-of-order completion of capture
> request happens by calling process_capture_result() directly on error
> paths. The framework expects that errors should be notified as soon as
> possible, but the request completion order should remain intact.
> An existing instance of this is abortRequest(), which sends the capture
> results on flushing state, without considering order-of-completion.
>
> Since, we have a queue of Camera3RequestDescriptor tracking each
> capture request placed by framework to libcamera HAL, we should be only
> sending back capture results from a single location, by inspecting
> the queue. As per the patch, this now happens in
> CameraDevice::sendCaptureResults().
>
> Each descriptor is now equipped with its own status to denote whether
> the capture request is complete and ready to send back to the framework
> or need to be waited upon. This ensures that the order of completion is
> respected for the requests.
>
> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
> ---
>  src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------
>  src/android/camera_device.h   | 15 +++++++++++-
>  2 files changed, 49 insertions(+), 12 deletions(-)
>
> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> index b0b7f4fd..8e2d22c5 100644
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(
>  	/* Clone the controls associated with the camera3 request. */
>  	settings_ = CameraMetadata(camera3Request->settings);
>
> +	status_ = Status::Pending;
> +
>  	/*
>  	 * Create the CaptureRequest, stored as a unique_ptr<> to tie its
>  	 * lifetime to the descriptor.
> @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)
>  	return 0;
>  }
>
> -void CameraDevice::abortRequest(camera3_capture_request_t *request)
> +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,
> +				camera3_capture_request_t *request)
>  {
>  	notifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);
>
> -	camera3_capture_result_t result = {};
> +	camera3_capture_result_t &result = descriptor->captureResult_;
>  	result.num_output_buffers = request->num_output_buffers;
>  	result.frame_number = request->frame_number;
>  	result.partial_result = 0;
> @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)
>  	}
>  	result.output_buffers = resultBuffers.data();
>
> -	callbacks_->process_capture_result(callbacks_, &result);
> +	descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> +	sendCaptureResults();
>  }
>
>  bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const
> @@ -1045,13 +1049,19 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>  		return ret;
>
>  	/*
> -	 * If flush is in progress abort the request. If the camera has been
> -	 * stopped we have to re-start it to be able to process the request.
> +	 * If flush is in progress push the descriptor in the queue and abort
> +	 * the request. If the camera has been stopped we have to re-start it to
> +	 * be able to process the request.
>  	 */
>  	MutexLocker stateLock(stateMutex_);
>
>  	if (state_ == State::Flushing) {
> -		abortRequest(camera3Request);
> +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> +		{
> +			MutexLocker descriptorsLock(descriptorsMutex_);
> +			descriptors_.push_back(std::move(descriptor));
> +		}
> +		abortRequest(descriptors_.back().get(), camera3Request);

another possibility is to move adding the descriptor to the queue a
little up in processCaptureRequest().

However, with a deque there is a possible issue: requests queued to
the worker for which waiting on the fence or queueing to
libcamera::Camera fails. We don't track those failures, and since
the descriptors are on the queue but not queued to the Camera, their
state won't ever be changed (unless we instrument the worker to do
so). The issue is already here, and could cause a request to be lost,
something for which CTS would complain but might potentially not
compromise the capture session. With this new setup a forgotten
request will starve the queue, which sounds worse.

One way out is to pass the whole descriptor to the worker and let it
set the state appropriately, as we have the descriptor on the queue
already.

There might be better ways out, let's think about them a bit (bonus
points, rework Camera3RequestDescriptor interface to hide
CaptureRequest, but maybe later..)


>  		return 0;
>  	}
>
> @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)
>  		return;
>  	}
>
> -	camera3_capture_result_t captureResult = {};
> +	camera3_capture_result_t &captureResult = descriptor->captureResult_;
>  	captureResult.frame_number = descriptor->frameNumber_;
>  	captureResult.num_output_buffers = descriptor->buffers_.size();
>  	captureResult.output_buffers = descriptor->buffers_.data();
> @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)
>  			buffer.acquire_fence = -1;
>  			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
>  		}
> -		callbacks_->process_capture_result(callbacks_, &captureResult);
>
> -		descriptors_.pop_front();
> +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> +		sendCaptureResults();
>  		return;
>  	}
>
> @@ -1217,9 +1227,23 @@ void CameraDevice::requestComplete(Request *request)
>  	captureResult.partial_result = 1;
>
>  	captureResult.result = resultMetadata->get();
> -	callbacks_->process_capture_result(callbacks_, &captureResult);
> +	descriptor->status_ = Camera3RequestDescriptor::Status::Success;
> +	sendCaptureResults();
> +}
>
> -	descriptors_.pop_front();
> +void CameraDevice::sendCaptureResults()
> +{
> +	MutexLocker lock(descriptorsMutex_);
> +	while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
> +		std::unique_ptr<Camera3RequestDescriptor> descriptor =
> +			std::move(descriptors_.front());
> +		descriptors_.pop_front();
> +
> +		lock.unlock();
> +		callbacks_->process_capture_result(callbacks_,
> +						   &(descriptor->captureResult_));
> +		lock.lock();
> +	}
>  }
>
>  std::string CameraDevice::logPrefix() const
> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> index 5889a0e7..545cb9b4 100644
> --- a/src/android/camera_device.h
> +++ b/src/android/camera_device.h
> @@ -74,17 +74,28 @@ private:
>  	CameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);
>
>  	struct Camera3RequestDescriptor {
> +		enum class Status {
> +			Pending,
> +			Success,
> +			Error,
> +		};
> +
>  		Camera3RequestDescriptor() = default;
>  		~Camera3RequestDescriptor() = default;
>  		Camera3RequestDescriptor(libcamera::Camera *camera,
>  					 const camera3_capture_request_t *camera3Request);
>  		Camera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;
> +		bool isPending() const { return status_ == Status::Pending; }
>
>  		uint32_t frameNumber_ = 0;
>  		std::vector<camera3_stream_buffer_t> buffers_;
>  		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
>  		CameraMetadata settings_;
>  		std::unique_ptr<CaptureRequest> request_;
> +
> +		camera3_capture_result_t captureResult_ = {};
> +		libcamera::FrameBuffer *internalBuffer_;

possibly unrelated

> +		Status status_;
>  	};
>
>  	enum class State {
> @@ -99,12 +110,14 @@ private:
>  	createFrameBuffer(const buffer_handle_t camera3buffer,
>  			  libcamera::PixelFormat pixelFormat,
>  			  const libcamera::Size &size);
> -	void abortRequest(camera3_capture_request_t *request);
> +	void abortRequest(Camera3RequestDescriptor *descriptor,
> +			  camera3_capture_request_t *request);
>  	bool isValidRequest(camera3_capture_request_t *request) const;
>  	void notifyShutter(uint32_t frameNumber, uint64_t timestamp);
>  	void notifyError(uint32_t frameNumber, camera3_stream_t *stream,
>  			 camera3_error_msg_code code);
>  	int processControls(Camera3RequestDescriptor *descriptor);
> +	void sendCaptureResults();
>  	std::unique_ptr<CameraMetadata> getResultMetadata(
>  		const Camera3RequestDescriptor *descriptor) const;
>
> --
> 2.31.1
>
Laurent Pinchart Sept. 27, 2021, 11:52 p.m. UTC | #2
Hi Umang,

Thank you for the patch.

On Mon, Sep 27, 2021 at 11:08:47PM +0200, Jacopo Mondi wrote:
> On Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:
> > There is a possibility that an out-of-order completion of capture
> > request happens by calling process_capture_result() directly on error
> > paths. The framework expects that errors should be notified as soon as
> > possible, but the request completion order should remain intact.
> > An existing instance of this is abortRequest(), which sends the capture
> > results on flushing state, without considering order-of-completion.
> >
> > Since, we have a queue of Camera3RequestDescriptor tracking each

s/Since,/Since/

> > capture request placed by framework to libcamera HAL, we should be only
> > sending back capture results from a single location, by inspecting
> > the queue. As per the patch, this now happens in
> > CameraDevice::sendCaptureResults().
> >
> > Each descriptor is now equipped with its own status to denote whether
> > the capture request is complete and ready to send back to the framework
> > or need to be waited upon. This ensures that the order of completion is

s/need/needs/

> > respected for the requests.
> >
> > Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
> > ---
> >  src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------
> >  src/android/camera_device.h   | 15 +++++++++++-
> >  2 files changed, 49 insertions(+), 12 deletions(-)
> >
> > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> > index b0b7f4fd..8e2d22c5 100644
> > --- a/src/android/camera_device.cpp
> > +++ b/src/android/camera_device.cpp
> > @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(
> >  	/* Clone the controls associated with the camera3 request. */
> >  	settings_ = CameraMetadata(camera3Request->settings);
> >
> > +	status_ = Status::Pending;
> > +

How about initializing this as part of the initializers list of the
constructor ? Or maybe in the definition of the structure, with

		Status status_ = Status::Pending;

(I haven't made up my mind on whether or not we should globally switch
to that, the pros and cons are not totally clear to me yet.)

> >  	/*
> >  	 * Create the CaptureRequest, stored as a unique_ptr<> to tie its
> >  	 * lifetime to the descriptor.
> > @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)
> >  	return 0;
> >  }
> >
> > -void CameraDevice::abortRequest(camera3_capture_request_t *request)
> > +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,
> > +				camera3_capture_request_t *request)

Could this function take a Camera3RequestDescriptor pointer only ? It
should contain all the needed data. This can be done as a patch before
this one if desired.

> >  {
> >  	notifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);
> >
> > -	camera3_capture_result_t result = {};
> > +	camera3_capture_result_t &result = descriptor->captureResult_;
> >  	result.num_output_buffers = request->num_output_buffers;
> >  	result.frame_number = request->frame_number;
> >  	result.partial_result = 0;
> > @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)
> >  	}
> >  	result.output_buffers = resultBuffers.data();
> >
> > -	callbacks_->process_capture_result(callbacks_, &result);
> > +	descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> > +	sendCaptureResults();
> >  }
> >
> >  bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const
> > @@ -1045,13 +1049,19 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> >  		return ret;
> >
> >  	/*
> > -	 * If flush is in progress abort the request. If the camera has been
> > -	 * stopped we have to re-start it to be able to process the request.
> > +	 * If flush is in progress push the descriptor in the queue and abort
> > +	 * the request. If the camera has been stopped we have to re-start it to
> > +	 * be able to process the request.
> >  	 */
> >  	MutexLocker stateLock(stateMutex_);
> >
> >  	if (state_ == State::Flushing) {
> > -		abortRequest(camera3Request);
> > +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> > +		{
> > +			MutexLocker descriptorsLock(descriptorsMutex_);
> > +			descriptors_.push_back(std::move(descriptor));
> > +		}
> > +		abortRequest(descriptors_.back().get(), camera3Request);
> 
> another possibility is to move adding the descriptor to the queue a
> little up in processCaptureRequest().

We have an issue here indeed, there's a race condition. As soon as the
request is added to the queue with a status set to !pending, it could be
completed by a call to sendCaptureResults() from requestComplete()
before abortRequest() gets a chance to run.

As abortRequest is called here only, I would call abortRequest() before
adding the descriptor to the queue, and move the sendCaptureResults()
call from abortRequest() to here after adding the descriptor to the
queue. You can drop setting descriptor->status_ to Error from this
function as it's done in abortRequest().
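
The reordering suggested above can be sketched as follows. This is a
simplified, hypothetical model and not the actual CameraDevice code:
Descriptor, Device, abortAndQueue() and the resultFilled flag are illustrative
names, and a std::function callback stands in for process_capture_result().

```cpp
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

/* Hypothetical stand-in for Camera3RequestDescriptor. */
struct Descriptor {
	enum class Status { Pending, Success, Error };

	unsigned int frameNumber = 0;
	Status status = Status::Pending;
	bool resultFilled = false; /* hypothetical: capture result populated */
};

class Device
{
public:
	using Callback = std::function<void(const Descriptor &)>;

	explicit Device(Callback cb) : callback_(std::move(cb)) {}

	/* Flush path: finalize the descriptor while no other thread can see it. */
	void abortAndQueue(std::unique_ptr<Descriptor> descriptor)
	{
		abortRequest(descriptor.get());
		{
			std::lock_guard<std::mutex> lock(mutex_);
			descriptors_.push_back(std::move(descriptor));
		}
		sendCaptureResults();
	}

private:
	/* Fill in the error result and mark the descriptor as completed. */
	void abortRequest(Descriptor *descriptor)
	{
		descriptor->resultFilled = true;
		descriptor->status = Descriptor::Status::Error;
	}

	/* Send every completed descriptor at the front of the queue. */
	void sendCaptureResults()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		while (!descriptors_.empty() &&
		       descriptors_.front()->status != Descriptor::Status::Pending) {
			std::unique_ptr<Descriptor> descriptor =
				std::move(descriptors_.front());
			descriptors_.pop_front();

			lock.unlock();
			callback_(*descriptor);
			lock.lock();
		}
	}

	Callback callback_;
	std::mutex mutex_;
	std::deque<std::unique_ptr<Descriptor>> descriptors_;
};
```

Because the descriptor only becomes visible in the queue after abortRequest()
has populated it, a concurrent call to sendCaptureResults() can never send a
half-filled result in this model.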

> However, with a deque there is a possible issue: requests queued to
> the worker for which waiting on the fence or queueing to
> libcamera::Camera fails. We don't track those failures, and since
> the descriptors are on the queue but not queued to the Camera, their
> state won't ever be changed (unless we instrument the worker to do
> so). The issue is already here, and could cause a request to be lost,
> something for which CTS would complain but might potentially not
> compromise the capture session. With this new setup a forgotten
> request will starve the queue, which sounds worse.
> 
> One way out is to pass the whole descriptor to the worker and let it
> set the state appropriately, as we have the descriptor on the queue
> already.
> 
> There might be better ways out, let's think about them a bit (bonus
> points, rework Camera3RequestDescriptor interface to hide
> CaptureRequest, but maybe later..)
> 
> 
> >  		return 0;
> >  	}
> >
> > @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)
> >  		return;
> >  	}
> >
> > -	camera3_capture_result_t captureResult = {};
> > +	camera3_capture_result_t &captureResult = descriptor->captureResult_;
> >  	captureResult.frame_number = descriptor->frameNumber_;
> >  	captureResult.num_output_buffers = descriptor->buffers_.size();
> >  	captureResult.output_buffers = descriptor->buffers_.data();
> > @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)
> >  			buffer.acquire_fence = -1;
> >  			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> >  		}
> > -		callbacks_->process_capture_result(callbacks_, &captureResult);
> >
> > -		descriptors_.pop_front();
> > +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
> > +		sendCaptureResults();
> >  		return;
> >  	}
> >
> > @@ -1217,9 +1227,23 @@ void CameraDevice::requestComplete(Request *request)
> >  	captureResult.partial_result = 1;
> >
> >  	captureResult.result = resultMetadata->get();
> > -	callbacks_->process_capture_result(callbacks_, &captureResult);
> > +	descriptor->status_ = Camera3RequestDescriptor::Status::Success;
> > +	sendCaptureResults();
> > +}
> >
> > -	descriptors_.pop_front();
> > +void CameraDevice::sendCaptureResults()
> > +{
> > +	MutexLocker lock(descriptorsMutex_);
> > +	while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
> > +		std::unique_ptr<Camera3RequestDescriptor> descriptor =
> > +			std::move(descriptors_.front());
> > +		descriptors_.pop_front();
> > +
> > +		lock.unlock();
> > +		callbacks_->process_capture_result(callbacks_,
> > +						   &(descriptor->captureResult_));

Do you need the parentheses ?

> > +		lock.lock();
> > +	}
> >  }
> >
> >  std::string CameraDevice::logPrefix() const
> > diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> > index 5889a0e7..545cb9b4 100644
> > --- a/src/android/camera_device.h
> > +++ b/src/android/camera_device.h
> > @@ -74,17 +74,28 @@ private:
> >  	CameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);
> >
> >  	struct Camera3RequestDescriptor {
> > +		enum class Status {
> > +			Pending,
> > +			Success,
> > +			Error,
> > +		};
> > +
> >  		Camera3RequestDescriptor() = default;
> >  		~Camera3RequestDescriptor() = default;
> >  		Camera3RequestDescriptor(libcamera::Camera *camera,
> >  					 const camera3_capture_request_t *camera3Request);
> >  		Camera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;

Blank line here.

> > +		bool isPending() const { return status_ == Status::Pending; }
> >
> >  		uint32_t frameNumber_ = 0;
> >  		std::vector<camera3_stream_buffer_t> buffers_;
> >  		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
> >  		CameraMetadata settings_;
> >  		std::unique_ptr<CaptureRequest> request_;
> > +
> > +		camera3_capture_result_t captureResult_ = {};
> > +		libcamera::FrameBuffer *internalBuffer_;
> 
> possibly unrelated
> 
> > +		Status status_;
> >  	};
> >
> >  	enum class State {
> > @@ -99,12 +110,14 @@ private:
> >  	createFrameBuffer(const buffer_handle_t camera3buffer,
> >  			  libcamera::PixelFormat pixelFormat,
> >  			  const libcamera::Size &size);
> > -	void abortRequest(camera3_capture_request_t *request);
> > +	void abortRequest(Camera3RequestDescriptor *descriptor,
> > +			  camera3_capture_request_t *request);
> >  	bool isValidRequest(camera3_capture_request_t *request) const;
> >  	void notifyShutter(uint32_t frameNumber, uint64_t timestamp);
> >  	void notifyError(uint32_t frameNumber, camera3_stream_t *stream,
> >  			 camera3_error_msg_code code);
> >  	int processControls(Camera3RequestDescriptor *descriptor);
> > +	void sendCaptureResults();
> >  	std::unique_ptr<CameraMetadata> getResultMetadata(
> >  		const Camera3RequestDescriptor *descriptor) const;
> >
Umang Jain Sept. 28, 2021, 12:08 p.m. UTC | #3
Hi,

On 9/28/21 5:22 AM, Laurent Pinchart wrote:
> Hi Umang,
>
> Thank you for the patch.
>
> On Mon, Sep 27, 2021 at 11:08:47PM +0200, Jacopo Mondi wrote:
>> On Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:
>>> There is a possibility that an out-of-order completion of capture
>>> request happens by calling process_capture_result() directly on error
>>> paths. The framework expects that errors should be notified as soon as
>>> possible, but the request completion order should remain intact.
>>> An existing instance of this is abortRequest(), which sends the capture
>>> results on flushing state, without considering order-of-completion.
>>>
>>> Since, we have a queue of Camera3RequestDescriptor tracking each
> s/Since,/Since/
>
>>> capture request placed by framework to libcamera HAL, we should be only
>>> sending back capture results from a single location, by inspecting
>>> the queue. As per the patch, this now happens in
>>> CameraDevice::sendCaptureResults().
>>>
>>> Each descriptor is now equipped with its own status to denote whether
>>> the capture request is complete and ready to send back to the framework
>>> or need to be waited upon. This ensures that the order of completion is
> s/need/needs/
>
>>> respected for the requests.
>>>
>>> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
>>> ---
>>>   src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------
>>>   src/android/camera_device.h   | 15 +++++++++++-
>>>   2 files changed, 49 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
>>> index b0b7f4fd..8e2d22c5 100644
>>> --- a/src/android/camera_device.cpp
>>> +++ b/src/android/camera_device.cpp
>>> @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(
>>>   	/* Clone the controls associated with the camera3 request. */
>>>   	settings_ = CameraMetadata(camera3Request->settings);
>>>
>>> +	status_ = Status::Pending;
>>> +
> How about initializing this as part of the initializers list of the
> constructor ? Or maybe in the definition of the structure, with
>
> 		Status status_ = Status::Pending;
>
> (I haven't made up my mind on whether or not we should globally switch
> to that, the pros and cons are not totally clear to me yet.)


I'll opt to set it in the definition of the structure.

>
>>>   	/*
>>>   	 * Create the CaptureRequest, stored as a unique_ptr<> to tie its
>>>   	 * lifetime to the descriptor.
>>> @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)
>>>   	return 0;
>>>   }
>>>
>>> -void CameraDevice::abortRequest(camera3_capture_request_t *request)
>>> +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,
>>> +				camera3_capture_request_t *request)
> Could this function take a Camera3RequestDescriptor pointer only ? It
> should contain all the needed data. This can be done as a patch before
> this one if desired.


Yeah, probably. I will check whether it can be split into a separate
patch that goes in before this one.

>
>>>   {
>>>   	notifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);
>>>
>>> -	camera3_capture_result_t result = {};
>>> +	camera3_capture_result_t &result = descriptor->captureResult_;
>>>   	result.num_output_buffers = request->num_output_buffers;
>>>   	result.frame_number = request->frame_number;
>>>   	result.partial_result = 0;
>>> @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)
>>>   	}
>>>   	result.output_buffers = resultBuffers.data();
>>>
>>> -	callbacks_->process_capture_result(callbacks_, &result);
>>> +	descriptor->status_ = Camera3RequestDescriptor::Status::Error;
>>> +	sendCaptureResults();
>>>   }
>>>
>>>   bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const
>>> @@ -1045,13 +1049,19 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>>>   		return ret;
>>>
>>>   	/*
>>> -	 * If flush is in progress abort the request. If the camera has been
>>> -	 * stopped we have to re-start it to be able to process the request.
>>> +	 * If flush is in progress push the descriptor in the queue and abort
>>> +	 * the request. If the camera has been stopped we have to re-start it to
>>> +	 * be able to process the request.
>>>   	 */
>>>   	MutexLocker stateLock(stateMutex_);
>>>
>>>   	if (state_ == State::Flushing) {
>>> -		abortRequest(camera3Request);
>>> +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
>>> +		{
>>> +			MutexLocker descriptorsLock(descriptorsMutex_);
>>> +			descriptors_.push_back(std::move(descriptor));
>>> +		}
>>> +		abortRequest(descriptors_.back().get(), camera3Request);
>> another possibility is to move adding the descriptor to the queue a
>> little up in processCaptureRequest().
> We have an issue here indeed, there's a race condition. As soon as the
> request is added to the queue with a status set to !pending, it could be
> completed by a call to sendCaptureResults() from requestComplete()
> before abortRequest() gets a chance to run.


Nice catch, didn't think of it :)

>
> As abortRequest is called here only, I would call abortRequest() before
> adding the descriptor to the queue, and move the sendCaptureResults()
> call from abortRequest() to here after adding the descriptor to the
> queue. You can drop setting descriptor->status_ to Error from this
> function as it's done in abortRequest().


Ack.

>
>> However, with a deque there is a possible issue: requests queued to
>> the worker for which waiting on the fence or queueing to
>> libcamera::Camera fails. We don't track those failures, and since
>> the descriptors are on the queue but not queued to the Camera, their
>> state won't ever be changed (unless we instrument the worker to do
>> so). The issue is already here, and could cause a request to be lost,
>> something for which CTS would complain but might potentially not
>> compromise the capture session. With this new setup a forgotten
>> request will starve the queue, which sounds worse.


If we have an error on queueRequest(), I would expect the descriptor
status to be set accordingly, to make sure the descriptor doesn't starve
the queue endlessly. This is similar to what the patch currently does in
abortRequest(), which sets the status to Status::Error.


I think the issue here is that we are missing an error handling
block/logic for queueRequest() in the first place.
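
A minimal sketch of that idea, under stated assumptions: Descriptor,
queueOrMarkError() and drainCompleted() are hypothetical names, the
queueRequest callable stands in for whatever queues the request to the
camera, and a negative return value is assumed to mean failure. Locking is
left out to keep the sketch short.

```cpp
#include <deque>
#include <functional>
#include <vector>

/* Hypothetical stand-in for Camera3RequestDescriptor. */
struct Descriptor {
	enum class Status { Pending, Success, Error };

	unsigned int frameNumber = 0;
	Status status = Status::Pending;
};

/*
 * Try to queue the request. On failure, mark the descriptor as Error so
 * that it cannot block the descriptors queued behind it.
 */
bool queueOrMarkError(Descriptor &descriptor,
		      const std::function<int(const Descriptor &)> &queueRequest)
{
	if (queueRequest(descriptor) < 0) {
		descriptor.status = Descriptor::Status::Error;
		return false;
	}

	return true;
}

/* Pop completed descriptors from the front, returning their frame numbers. */
std::vector<unsigned int> drainCompleted(std::deque<Descriptor> &descriptors)
{
	std::vector<unsigned int> sent;

	while (!descriptors.empty() &&
	       descriptors.front().status != Descriptor::Status::Pending) {
		sent.push_back(descriptors.front().frameNumber);
		descriptors.pop_front();
	}

	return sent;
}
```

If the descriptor for which queueing failed were left as Pending, the loop in
drainCompleted() would stop at it forever, which is the starvation described
above.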

>> One way out is to pass the whole descriptor to the worker and let it


Ah okay this was something you were referring to in the wait-fence CTS 
fix series... I was not getting the context there :-)

>> set the state appropriately, as we have the descriptor on the queue
>> already.


Sounds fine to me. Potentially as a separate series on top? I can add a
\todo here if you want me to.

>>
>> There might be better ways out, let's think about them a bit (bonus
>> points, rework Camera3RequestDescriptor interface to hide
>> CaptureRequest, but maybe later..)
>>
>>
>>>   		return 0;
>>>   	}
>>>
>>> @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)
>>>   		return;
>>>   	}
>>>
>>> -	camera3_capture_result_t captureResult = {};
>>> +	camera3_capture_result_t &captureResult = descriptor->captureResult_;
>>>   	captureResult.frame_number = descriptor->frameNumber_;
>>>   	captureResult.num_output_buffers = descriptor->buffers_.size();
>>>   	captureResult.output_buffers = descriptor->buffers_.data();
>>> @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)
>>>   			buffer.acquire_fence = -1;
>>>   			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
>>>   		}
>>> -		callbacks_->process_capture_result(callbacks_, &captureResult);
>>>
>>> -		descriptors_.pop_front();
>>> +		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
>>> +		sendCaptureResults();
>>>   		return;
>>>   	}
>>>
>>> @@ -1217,9 +1227,23 @@ void CameraDevice::requestComplete(Request *request)
>>>   	captureResult.partial_result = 1;
>>>
>>>   	captureResult.result = resultMetadata->get();
>>> -	callbacks_->process_capture_result(callbacks_, &captureResult);
>>> +	descriptor->status_ = Camera3RequestDescriptor::Status::Success;
>>> +	sendCaptureResults();
>>> +}
>>>
>>> -	descriptors_.pop_front();
>>> +void CameraDevice::sendCaptureResults()
>>> +{
>>> +	MutexLocker lock(descriptorsMutex_);
>>> +	while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
>>> +		std::unique_ptr<Camera3RequestDescriptor> descriptor =
>>> +			std::move(descriptors_.front());
>>> +		descriptors_.pop_front();
>>> +
>>> +		lock.unlock();
>>> +		callbacks_->process_capture_result(callbacks_,
>>> +						   &(descriptor->captureResult_));
> Do you need the parentheses ?


might not

>
>>> +		lock.lock();
>>> +	}
>>>   }
>>>
>>>   std::string CameraDevice::logPrefix() const
>>> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
>>> index 5889a0e7..545cb9b4 100644
>>> --- a/src/android/camera_device.h
>>> +++ b/src/android/camera_device.h
>>> @@ -74,17 +74,28 @@ private:
>>>   	CameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);
>>>
>>>   	struct Camera3RequestDescriptor {
>>> +		enum class Status {
>>> +			Pending,
>>> +			Success,
>>> +			Error,
>>> +		};
>>> +
>>>   		Camera3RequestDescriptor() = default;
>>>   		~Camera3RequestDescriptor() = default;
>>>   		Camera3RequestDescriptor(libcamera::Camera *camera,
>>>   					 const camera3_capture_request_t *camera3Request);
>>>   		Camera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;
> Blank line here.
>
>>> +		bool isPending() const { return status_ == Status::Pending; }
>>>
>>>   		uint32_t frameNumber_ = 0;
>>>   		std::vector<camera3_stream_buffer_t> buffers_;
>>>   		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
>>>   		CameraMetadata settings_;
>>>   		std::unique_ptr<CaptureRequest> request_;
>>> +
>>> +		camera3_capture_result_t captureResult_ = {};
>>> +		libcamera::FrameBuffer *internalBuffer_;
>> possibly unrelated


Oops, yeah :S

>>> +		Status status_;
>>>   	};
>>>
>>>   	enum class State {
>>> @@ -99,12 +110,14 @@ private:
>>>   	createFrameBuffer(const buffer_handle_t camera3buffer,
>>>   			  libcamera::PixelFormat pixelFormat,
>>>   			  const libcamera::Size &size);
>>> -	void abortRequest(camera3_capture_request_t *request);
>>> +	void abortRequest(Camera3RequestDescriptor *descriptor,
>>> +			  camera3_capture_request_t *request);
>>>   	bool isValidRequest(camera3_capture_request_t *request) const;
>>>   	void notifyShutter(uint32_t frameNumber, uint64_t timestamp);
>>>   	void notifyError(uint32_t frameNumber, camera3_stream_t *stream,
>>>   			 camera3_error_msg_code code);
>>>   	int processControls(Camera3RequestDescriptor *descriptor);
>>> +	void sendCaptureResults();
>>>   	std::unique_ptr<CameraMetadata> getResultMetadata(
>>>   		const Camera3RequestDescriptor *descriptor) const;
>>>

Patch

diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
index b0b7f4fd..8e2d22c5 100644
--- a/src/android/camera_device.cpp
+++ b/src/android/camera_device.cpp
@@ -240,6 +240,8 @@  CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(
 	/* Clone the controls associated with the camera3 request. */
 	settings_ = CameraMetadata(camera3Request->settings);
 
+	status_ = Status::Pending;
+
 	/*
 	 * Create the CaptureRequest, stored as a unique_ptr<> to tie its
 	 * lifetime to the descriptor.
@@ -859,11 +861,12 @@  int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)
 	return 0;
 }
 
-void CameraDevice::abortRequest(camera3_capture_request_t *request)
+void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,
+				camera3_capture_request_t *request)
 {
 	notifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);
 
-	camera3_capture_result_t result = {};
+	camera3_capture_result_t &result = descriptor->captureResult_;
 	result.num_output_buffers = request->num_output_buffers;
 	result.frame_number = request->frame_number;
 	result.partial_result = 0;
@@ -877,7 +880,8 @@  void CameraDevice::abortRequest(camera3_capture_request_t *request)
 	}
 	result.output_buffers = resultBuffers.data();
 
-	callbacks_->process_capture_result(callbacks_, &result);
+	descriptor->status_ = Camera3RequestDescriptor::Status::Error;
+	sendCaptureResults();
 }
 
 bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const
@@ -1045,13 +1049,19 @@  int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 		return ret;
 
 	/*
-	 * If flush is in progress abort the request. If the camera has been
-	 * stopped we have to re-start it to be able to process the request.
+	 * If flush is in progress push the descriptor in the queue and abort
+	 * the request. If the camera has been stopped we have to re-start it to
+	 * be able to process the request.
 	 */
 	MutexLocker stateLock(stateMutex_);
 
 	if (state_ == State::Flushing) {
-		abortRequest(camera3Request);
+		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
+		{
+			MutexLocker descriptorsLock(descriptorsMutex_);
+			descriptors_.push_back(std::move(descriptor));
+		}
+		abortRequest(descriptors_.back().get(), camera3Request);
 		return 0;
 	}
 
@@ -1099,7 +1109,7 @@  void CameraDevice::requestComplete(Request *request)
 		return;
 	}
 
-	camera3_capture_result_t captureResult = {};
+	camera3_capture_result_t &captureResult = descriptor->captureResult_;
 	captureResult.frame_number = descriptor->frameNumber_;
 	captureResult.num_output_buffers = descriptor->buffers_.size();
 	captureResult.output_buffers = descriptor->buffers_.data();
@@ -1138,9 +1148,9 @@  void CameraDevice::requestComplete(Request *request)
 			buffer.acquire_fence = -1;
 			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
 		}
-		callbacks_->process_capture_result(callbacks_, &captureResult);
 
-		descriptors_.pop_front();
+		descriptor->status_ = Camera3RequestDescriptor::Status::Error;
+		sendCaptureResults();
 		return;
 	}
 
@@ -1217,9 +1227,23 @@  void CameraDevice::requestComplete(Request *request)
 	captureResult.partial_result = 1;
 
 	captureResult.result = resultMetadata->get();
-	callbacks_->process_capture_result(callbacks_, &captureResult);
+	descriptor->status_ = Camera3RequestDescriptor::Status::Success;
+	sendCaptureResults();
+}
 
-	descriptors_.pop_front();
+void CameraDevice::sendCaptureResults()
+{
+	MutexLocker lock(descriptorsMutex_);
+	while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
+		std::unique_ptr<Camera3RequestDescriptor> descriptor =
+			std::move(descriptors_.front());
+		descriptors_.pop_front();
+
+		lock.unlock();
+		callbacks_->process_capture_result(callbacks_,
+						   &(descriptor->captureResult_));
+		lock.lock();
+	}
 }
 
 std::string CameraDevice::logPrefix() const
diff --git a/src/android/camera_device.h b/src/android/camera_device.h
index 5889a0e7..545cb9b4 100644
--- a/src/android/camera_device.h
+++ b/src/android/camera_device.h
@@ -74,17 +74,28 @@  private:
 	CameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);
 
 	struct Camera3RequestDescriptor {
+		enum class Status {
+			Pending,
+			Success,
+			Error,
+		};
+
 		Camera3RequestDescriptor() = default;
 		~Camera3RequestDescriptor() = default;
 		Camera3RequestDescriptor(libcamera::Camera *camera,
 					 const camera3_capture_request_t *camera3Request);
 		Camera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;
+		bool isPending() const { return status_ == Status::Pending; }
 
 		uint32_t frameNumber_ = 0;
 		std::vector<camera3_stream_buffer_t> buffers_;
 		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
 		CameraMetadata settings_;
 		std::unique_ptr<CaptureRequest> request_;
+
+		camera3_capture_result_t captureResult_ = {};
+		libcamera::FrameBuffer *internalBuffer_;
+		Status status_;
 	};
 
 	enum class State {
@@ -99,12 +110,14 @@  private:
 	createFrameBuffer(const buffer_handle_t camera3buffer,
 			  libcamera::PixelFormat pixelFormat,
 			  const libcamera::Size &size);
-	void abortRequest(camera3_capture_request_t *request);
+	void abortRequest(Camera3RequestDescriptor *descriptor,
+			  camera3_capture_request_t *request);
 	bool isValidRequest(camera3_capture_request_t *request) const;
 	void notifyShutter(uint32_t frameNumber, uint64_t timestamp);
 	void notifyError(uint32_t frameNumber, camera3_stream_t *stream,
 			 camera3_error_msg_code code);
 	int processControls(Camera3RequestDescriptor *descriptor);
+	void sendCaptureResults();
 	std::unique_ptr<CameraMetadata> getResultMetadata(
 		const Camera3RequestDescriptor *descriptor) const;
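
For readers who want to try the ordering logic outside the HAL, here is a minimal standalone sketch of the pattern that sendCaptureResults() follows in the patch above: each queued descriptor carries a status, and results are only handed to the callback from the front of the queue once the front entry is no longer pending. The names Descriptor, ResultQueue, queue(), complete() and demoOutOfOrderCompletion() are hypothetical stand-ins invented for this illustration; they are not part of libcamera or of the Android camera3 API, and the callback merely plays the role of process_capture_result().

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

/* Hypothetical stand-in for Camera3RequestDescriptor (illustration only). */
struct Descriptor {
	enum class Status { Pending, Success, Error };

	explicit Descriptor(uint32_t frameNumber) : frameNumber_(frameNumber) {}
	bool isPending() const { return status_ == Status::Pending; }

	uint32_t frameNumber_;
	Status status_ = Status::Pending;
};

/* Hypothetical queue that only reports results in submission order. */
class ResultQueue
{
public:
	using Callback = std::function<void(const Descriptor &)>;

	explicit ResultQueue(Callback cb) : callback_(std::move(cb)) {}

	/* Track a new request; the queue keeps ownership of the descriptor. */
	Descriptor *queue(uint32_t frameNumber)
	{
		std::lock_guard<std::mutex> lock(mutex_);
		descriptors_.push_back(std::make_unique<Descriptor>(frameNumber));
		return descriptors_.back().get();
	}

	/* Mark a request as finished, then send whatever is ready at the front. */
	void complete(Descriptor *descriptor, Descriptor::Status status)
	{
		assert(status != Descriptor::Status::Pending);
		{
			std::lock_guard<std::mutex> lock(mutex_);
			descriptor->status_ = status;
		}
		sendResults();
	}

private:
	void sendResults()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		while (!descriptors_.empty() && !descriptors_.front()->isPending()) {
			std::unique_ptr<Descriptor> descriptor =
				std::move(descriptors_.front());
			descriptors_.pop_front();

			/* Drop the lock while calling out, as the patch does. */
			lock.unlock();
			callback_(*descriptor);
			lock.lock();
		}
	}

	std::mutex mutex_;
	std::deque<std::unique_ptr<Descriptor>> descriptors_;
	Callback callback_;
};

/* Complete request 2 before request 1 and record the order of the callbacks. */
std::vector<uint32_t> demoOutOfOrderCompletion()
{
	std::vector<uint32_t> sent;
	ResultQueue queue([&sent](const Descriptor &d) {
		sent.push_back(d.frameNumber_);
	});

	Descriptor *first = queue.queue(1);
	Descriptor *second = queue.queue(2);

	/* Request 2 fails early but is held back behind pending request 1. */
	queue.complete(second, Descriptor::Status::Error);
	/* Request 1 completes; both results are now sent, 1 before 2. */
	queue.complete(first, Descriptor::Status::Success);

	return sent;
}
```

Calling demoOutOfOrderCompletion() completes the second request before the first, yet the callback sees frame 1 and then frame 2. In this sketch the status is written under the same mutex that guards the queue, which is an assumption of the illustration rather than something the patch itself does.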