[{"id":19897,"web_url":"https://patchwork.libcamera.org/comment/19897/","msgid":"<20210927210847.wdx5xo62inqibts5@uno.localdomain>","date":"2021-09-27T21:08:47","subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the queue","submitter":{"id":3,"url":"https://patchwork.libcamera.org/api/people/3/","name":"Jacopo Mondi","email":"jacopo@jmondi.org"},"content":"Hi Umang,\n\nOn Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:\n> There is a possibility that an out-of-order completion of capture\n> request happens by calling process_capture_result() directly on error\n> paths. The framework expects that errors should be notified as soon as\n> possible, but the request completion order should remain intact.\n> An existing instance of this is abortRequest(), which sends the capture\n> results on flushing state, without considering order-of-completion.\n>\n> Since, we have a queue of Camera3RequestDescriptor tracking each\n> capture request placed by framework to libcamera HAL, we should be only\n> sending back capture results from a single location, by inspecting\n> the queue. As per the patch, this now happens in\n> CameraDevice::sendCaptureResults().\n>\n> Each descriptor is now equipped with its own status to denote whether\n> the capture request is complete and ready to send back to the framework\n> or need to be waited upon. 
This ensures that the order of completion is\n> respected for the requests.\n>\n> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>\n> ---\n>  src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------\n>  src/android/camera_device.h   | 15 +++++++++++-\n>  2 files changed, 49 insertions(+), 12 deletions(-)\n>\n> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp\n> index b0b7f4fd..8e2d22c5 100644\n> --- a/src/android/camera_device.cpp\n> +++ b/src/android/camera_device.cpp\n> @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(\n>  \t/* Clone the controls associated with the camera3 request. */\n>  \tsettings_ = CameraMetadata(camera3Request->settings);\n>\n> +\tstatus_ = Status::Pending;\n> +\n>  \t/*\n>  \t * Create the CaptureRequest, stored as a unique_ptr<> to tie its\n>  \t * lifetime to the descriptor.\n> @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)\n>  \treturn 0;\n>  }\n>\n> -void CameraDevice::abortRequest(camera3_capture_request_t *request)\n> +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,\n> +\t\t\t\tcamera3_capture_request_t *request)\n>  {\n>  \tnotifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);\n>\n> -\tcamera3_capture_result_t result = {};\n> +\tcamera3_capture_result_t &result = descriptor->captureResult_;\n>  \tresult.num_output_buffers = request->num_output_buffers;\n>  \tresult.frame_number = request->frame_number;\n>  \tresult.partial_result = 0;\n> @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)\n>  \t}\n>  \tresult.output_buffers = resultBuffers.data();\n>\n> -\tcallbacks_->process_capture_result(callbacks_, &result);\n> +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> +\tsendCaptureResults();\n>  }\n>\n>  bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const\n> @@ -1045,13 +1049,19 @@ 
int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques\n>  \t\treturn ret;\n>\n>  \t/*\n> -\t * If flush is in progress abort the request. If the camera has been\n> -\t * stopped we have to re-start it to be able to process the request.\n> +\t * If flush is in progress push the descriptor in the queue and abort\n> +\t * the request. If the camera has been stopped we have to re-start it to\n> +\t * be able to process the request.\n>  \t */\n>  \tMutexLocker stateLock(stateMutex_);\n>\n>  \tif (state_ == State::Flushing) {\n> -\t\tabortRequest(camera3Request);\n> +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> +\t\t{\n> +\t\t\tMutexLocker descriptorsLock(descriptorsMutex_);\n> +\t\t\tdescriptors_.push_back(std::move(descriptor));\n> +\t\t}\n> +\t\tabortRequest(descriptors_.back().get(), camera3Request);\n\nAnother possibility is to move adding the descriptor to the queue a\nlittle up in processCaptureRequest().\n\nHowever, with a deque there is a possible issue: requests queued to\nthe worker for which waiting on the fence or queueing to\nlibcamera::Camera fails. We don't track those failures, and since\nthe descriptors are on the queue but not queued to the Camera, their\nstate won't ever be changed (unless we instrument the worker to do\nso). The issue is already here, and could cause a request to be lost,\nsomething for which CTS would complain but might potentially not\ncompromise the capture session. 
With this new setup a forgotten\nrequest will starve the queue, which sounds worse.\n\nOne way out is to pass the whole descriptor to the worker and let it\nset the state appropriately, as we have the descriptor on the queue\nalready.\n\nThere might be better ways out, let's think about them a bit (bonus\npoints, rework Camera3RequestDescriptor interface to hide\nCaptureRequest, but maybe later..)\n\n\n>  \t\treturn 0;\n>  \t}\n>\n> @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)\n>  \t\treturn;\n>  \t}\n>\n> -\tcamera3_capture_result_t captureResult = {};\n> +\tcamera3_capture_result_t &captureResult = descriptor->captureResult_;\n>  \tcaptureResult.frame_number = descriptor->frameNumber_;\n>  \tcaptureResult.num_output_buffers = descriptor->buffers_.size();\n>  \tcaptureResult.output_buffers = descriptor->buffers_.data();\n> @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)\n>  \t\t\tbuffer.acquire_fence = -1;\n>  \t\t\tbuffer.status = CAMERA3_BUFFER_STATUS_ERROR;\n>  \t\t}\n> -\t\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n>\n> -\t\tdescriptors_.pop_front();\n> +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> +\t\tsendCaptureResults();\n>  \t\treturn;\n>  \t}\n>\n> @@ -1217,9 +1227,23 @@ void CameraDevice::requestComplete(Request *request)\n>  \tcaptureResult.partial_result = 1;\n>\n>  \tcaptureResult.result = resultMetadata->get();\n> -\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n> +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Success;\n> +\tsendCaptureResults();\n> +}\n>\n> -\tdescriptors_.pop_front();\n> +void CameraDevice::sendCaptureResults()\n> +{\n> +\tMutexLocker lock(descriptorsMutex_);\n> +\twhile (!descriptors_.empty() && !descriptors_.front()->isPending()) {\n> +\t\tstd::unique_ptr<Camera3RequestDescriptor> descriptor =\n> +\t\t\tstd::move(descriptors_.front());\n> +\t\tdescriptors_.pop_front();\n> +\n> +\t\tlock.unlock();\n> 
+\t\tcallbacks_->process_capture_result(callbacks_,\n> +\t\t\t\t\t\t   &(descriptor->captureResult_));\n> +\t\tlock.lock();\n> +\t}\n>  }\n>\n>  std::string CameraDevice::logPrefix() const\n> diff --git a/src/android/camera_device.h b/src/android/camera_device.h\n> index 5889a0e7..545cb9b4 100644\n> --- a/src/android/camera_device.h\n> +++ b/src/android/camera_device.h\n> @@ -74,17 +74,28 @@ private:\n>  \tCameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);\n>\n>  \tstruct Camera3RequestDescriptor {\n> +\t\tenum class Status {\n> +\t\t\tPending,\n> +\t\t\tSuccess,\n> +\t\t\tError,\n> +\t\t};\n> +\n>  \t\tCamera3RequestDescriptor() = default;\n>  \t\t~Camera3RequestDescriptor() = default;\n>  \t\tCamera3RequestDescriptor(libcamera::Camera *camera,\n>  \t\t\t\t\t const camera3_capture_request_t *camera3Request);\n>  \t\tCamera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;\n> +\t\tbool isPending() const { return status_ == Status::Pending; }\n>\n>  \t\tuint32_t frameNumber_ = 0;\n>  \t\tstd::vector<camera3_stream_buffer_t> buffers_;\n>  \t\tstd::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;\n>  \t\tCameraMetadata settings_;\n>  \t\tstd::unique_ptr<CaptureRequest> request_;\n> +\n> +\t\tcamera3_capture_result_t captureResult_ = {};\n> +\t\tlibcamera::FrameBuffer *internalBuffer_;\n\npossibly unrelated\n\n> +\t\tStatus status_;\n>  \t};\n>\n>  \tenum class State {\n> @@ -99,12 +110,14 @@ private:\n>  \tcreateFrameBuffer(const buffer_handle_t camera3buffer,\n>  \t\t\t  libcamera::PixelFormat pixelFormat,\n>  \t\t\t  const libcamera::Size &size);\n> -\tvoid abortRequest(camera3_capture_request_t *request);\n> +\tvoid abortRequest(Camera3RequestDescriptor *descriptor,\n> +\t\t\t  camera3_capture_request_t *request);\n>  \tbool isValidRequest(camera3_capture_request_t *request) const;\n>  \tvoid notifyShutter(uint32_t frameNumber, uint64_t timestamp);\n>  \tvoid notifyError(uint32_t frameNumber, camera3_stream_t 
*stream,\n>  \t\t\t camera3_error_msg_code code);\n>  \tint processControls(Camera3RequestDescriptor *descriptor);\n> +\tvoid sendCaptureResults();\n>  \tstd::unique_ptr<CameraMetadata> getResultMetadata(\n>  \t\tconst Camera3RequestDescriptor *descriptor) const;\n>\n> --\n> 2.31.1\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 3B60CBDC71\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon, 27 Sep 2021 21:08:02 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 5C2926918B;\n\tMon, 27 Sep 2021 23:08:01 +0200 (CEST)","from relay2-d.mail.gandi.net (relay2-d.mail.gandi.net\n\t[217.70.183.194])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id C02616012C\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon, 27 Sep 2021 23:07:59 +0200 (CEST)","(Authenticated sender: jacopo@jmondi.org)\n\tby relay2-d.mail.gandi.net (Postfix) with ESMTPSA id 4501140002;\n\tMon, 27 Sep 2021 21:07:59 +0000 (UTC)"],"Date":"Mon, 27 Sep 2021 23:08:47 +0200","From":"Jacopo Mondi <jacopo@jmondi.org>","To":"Umang Jain <umang.jain@ideasonboard.com>","Message-ID":"<20210927210847.wdx5xo62inqibts5@uno.localdomain>","References":"<20210927111149.692004-1-umang.jain@ideasonboard.com>\n\t<20210927111149.692004-4-umang.jain@ideasonboard.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","In-Reply-To":"<20210927111149.692004-4-umang.jain@ideasonboard.com>","Subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the 
queue","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":19901,"web_url":"https://patchwork.libcamera.org/comment/19901/","msgid":"<YVJZTi1FA/vAL96I@pendragon.ideasonboard.com>","date":"2021-09-27T23:52:46","subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the queue","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"Hi Umang,\n\nThank you for the patch.\n\nOn Mon, Sep 27, 2021 at 11:08:47PM +0200, Jacopo Mondi wrote:\n> On Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:\n> > There is a possibility that an out-of-order completion of capture\n> > request happens by calling process_capture_result() directly on error\n> > paths. 
The framework expects that errors should be notified as soon as\n> > possible, but the request completion order should remain intact.\n> > An existing instance of this is abortRequest(), which sends the capture\n> > results on flushing state, without considering order-of-completion.\n> >\n> > Since, we have a queue of Camera3RequestDescriptor tracking each\n\ns/Since,/Since/\n\n> > capture request placed by framework to libcamera HAL, we should be only\n> > sending back capture results from a single location, by inspecting\n> > the queue. As per the patch, this now happens in\n> > CameraDevice::sendCaptureResults().\n> >\n> > Each descriptor is now equipped with its own status to denote whether\n> > the capture request is complete and ready to send back to the framework\n> > or need to be waited upon. This ensures that the order of completion is\n\ns/need/needs/\n\n> > respected for the requests.\n> >\n> > Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>\n> > ---\n> >  src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------\n> >  src/android/camera_device.h   | 15 +++++++++++-\n> >  2 files changed, 49 insertions(+), 12 deletions(-)\n> >\n> > diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp\n> > index b0b7f4fd..8e2d22c5 100644\n> > --- a/src/android/camera_device.cpp\n> > +++ b/src/android/camera_device.cpp\n> > @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(\n> >  \t/* Clone the controls associated with the camera3 request. */\n> >  \tsettings_ = CameraMetadata(camera3Request->settings);\n> >\n> > +\tstatus_ = Status::Pending;\n> > +\n\nHow about initializing this as part of the initializers list of the\nconstructor ? 
Or maybe in the definition of the structure, with\n\n\t\tStatus status_ = Status::Pending;\n\n(I haven't made up my mind on whether or not we should globally switch\nto that, the pros and cons are not totally clear to me yet.)\n\n> >  \t/*\n> >  \t * Create the CaptureRequest, stored as a unique_ptr<> to tie its\n> >  \t * lifetime to the descriptor.\n> > @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)\n> >  \treturn 0;\n> >  }\n> >\n> > -void CameraDevice::abortRequest(camera3_capture_request_t *request)\n> > +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,\n> > +\t\t\t\tcamera3_capture_request_t *request)\n\nCould this function take a Camera3RequestDescriptor pointer only ? It\nshould contain all the needed data. This can be done as a patch before\nthis one if desired.\n\n> >  {\n> >  \tnotifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);\n> >\n> > -\tcamera3_capture_result_t result = {};\n> > +\tcamera3_capture_result_t &result = descriptor->captureResult_;\n> >  \tresult.num_output_buffers = request->num_output_buffers;\n> >  \tresult.frame_number = request->frame_number;\n> >  \tresult.partial_result = 0;\n> > @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)\n> >  \t}\n> >  \tresult.output_buffers = resultBuffers.data();\n> >\n> > -\tcallbacks_->process_capture_result(callbacks_, &result);\n> > +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> > +\tsendCaptureResults();\n> >  }\n> >\n> >  bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const\n> > @@ -1045,13 +1049,19 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques\n> >  \t\treturn ret;\n> >\n> >  \t/*\n> > -\t * If flush is in progress abort the request. 
If the camera has been\n> > -\t * stopped we have to re-start it to be able to process the request.\n> > +\t * If flush is in progress push the descriptor in the queue and abort\n> > +\t * the request. If the camera has been stopped we have to re-start it to\n> > +\t * be able to process the request.\n> >  \t */\n> >  \tMutexLocker stateLock(stateMutex_);\n> >\n> >  \tif (state_ == State::Flushing) {\n> > -\t\tabortRequest(camera3Request);\n> > +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> > +\t\t{\n> > +\t\t\tMutexLocker descriptorsLock(descriptorsMutex_);\n> > +\t\t\tdescriptors_.push_back(std::move(descriptor));\n> > +\t\t}\n> > +\t\tabortRequest(descriptors_.back().get(), camera3Request);\n> \n> another possibility is to move adding the descriptor to the queue a\n> little up in processCaptureRequest().\n\nWe have an issue here indeed, there's a race condition. As soon as the\nrequest is added to the queue with a status set to !pending, it could be\ncompleted by a call to sendCaptureResults() from requestComplete()\nbefore abortRequest() gets a chance to run.\n\nAs abortRequest is called here only, I would call abortRequest() before\nadding the descriptor to the queue, and move the sendCaptureResults()\ncall from abortRequest() to here after adding the descriptor to the\nqueue. You can drop setting descriptor->status_ to Error from this\nfunction as it's done in abortRequest().\n\n> However with a dequeu there is a possible issue: requests queued to\n> the worker for which waiting on the fence or queueing to\n> libcamera::Camera fails. We don't track those failure, and since\n> the descriptors are on the queue but not queued to the Camera, their\n> state won't ever be changed (unless we instrument the worker to do\n> so). The issue is already here, and could cause a request to be lost,\n> something for which CTS would complain but might potentially not\n> compromise the capture session. 
With this new setup a forgotten\n> request will starve the queue which sound worst.\n> \n> One way out is to pass the whole descriptor to the worker and let it\n> set the state opportunely, as we have the descriptor on the queue\n> already.\n> \n> There might be better ways out, let's think about them a bit (bonus\n> points, rework Camera3RequestDescriptor interface to hide\n> CaptureRequest, but maybe later..)\n> \n> \n> >  \t\treturn 0;\n> >  \t}\n> >\n> > @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)\n> >  \t\treturn;\n> >  \t}\n> >\n> > -\tcamera3_capture_result_t captureResult = {};\n> > +\tcamera3_capture_result_t &captureResult = descriptor->captureResult_;\n> >  \tcaptureResult.frame_number = descriptor->frameNumber_;\n> >  \tcaptureResult.num_output_buffers = descriptor->buffers_.size();\n> >  \tcaptureResult.output_buffers = descriptor->buffers_.data();\n> > @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)\n> >  \t\t\tbuffer.acquire_fence = -1;\n> >  \t\t\tbuffer.status = CAMERA3_BUFFER_STATUS_ERROR;\n> >  \t\t}\n> > -\t\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n> >\n> > -\t\tdescriptors_.pop_front();\n> > +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n> > +\t\tsendCaptureResults();\n> >  \t\treturn;\n> >  \t}\n> >\n> > @@ -1217,9 +1227,23 @@ void CameraDevice::requestComplete(Request *request)\n> >  \tcaptureResult.partial_result = 1;\n> >\n> >  \tcaptureResult.result = resultMetadata->get();\n> > -\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n> > +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Success;\n> > +\tsendCaptureResults();\n> > +}\n> >\n> > -\tdescriptors_.pop_front();\n> > +void CameraDevice::sendCaptureResults()\n> > +{\n> > +\tMutexLocker lock(descriptorsMutex_);\n> > +\twhile (!descriptors_.empty() && !descriptors_.front()->isPending()) {\n> > +\t\tstd::unique_ptr<Camera3RequestDescriptor> descriptor =\n> > 
+\t\t\tstd::move(descriptors_.front());\n> > +\t\tdescriptors_.pop_front();\n> > +\n> > +\t\tlock.unlock();\n> > +\t\tcallbacks_->process_capture_result(callbacks_,\n> > +\t\t\t\t\t\t   &(descriptor->captureResult_));\n\nDo you need the parentheses ?\n\n> > +\t\tlock.lock();\n> > +\t}\n> >  }\n> >\n> >  std::string CameraDevice::logPrefix() const\n> > diff --git a/src/android/camera_device.h b/src/android/camera_device.h\n> > index 5889a0e7..545cb9b4 100644\n> > --- a/src/android/camera_device.h\n> > +++ b/src/android/camera_device.h\n> > @@ -74,17 +74,28 @@ private:\n> >  \tCameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);\n> >\n> >  \tstruct Camera3RequestDescriptor {\n> > +\t\tenum class Status {\n> > +\t\t\tPending,\n> > +\t\t\tSuccess,\n> > +\t\t\tError,\n> > +\t\t};\n> > +\n> >  \t\tCamera3RequestDescriptor() = default;\n> >  \t\t~Camera3RequestDescriptor() = default;\n> >  \t\tCamera3RequestDescriptor(libcamera::Camera *camera,\n> >  \t\t\t\t\t const camera3_capture_request_t *camera3Request);\n> >  \t\tCamera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;\n\nBlank line here.\n\n> > +\t\tbool isPending() const { return status_ == Status::Pending; }\n> >\n> >  \t\tuint32_t frameNumber_ = 0;\n> >  \t\tstd::vector<camera3_stream_buffer_t> buffers_;\n> >  \t\tstd::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;\n> >  \t\tCameraMetadata settings_;\n> >  \t\tstd::unique_ptr<CaptureRequest> request_;\n> > +\n> > +\t\tcamera3_capture_result_t captureResult_ = {};\n> > +\t\tlibcamera::FrameBuffer *internalBuffer_;\n> \n> possibily unrelated\n> \n> > +\t\tStatus status_;\n> >  \t};\n> >\n> >  \tenum class State {\n> > @@ -99,12 +110,14 @@ private:\n> >  \tcreateFrameBuffer(const buffer_handle_t camera3buffer,\n> >  \t\t\t  libcamera::PixelFormat pixelFormat,\n> >  \t\t\t  const libcamera::Size &size);\n> > -\tvoid abortRequest(camera3_capture_request_t *request);\n> > +\tvoid 
abortRequest(Camera3RequestDescriptor *descriptor,\n> > +\t\t\t  camera3_capture_request_t *request);\n> >  \tbool isValidRequest(camera3_capture_request_t *request) const;\n> >  \tvoid notifyShutter(uint32_t frameNumber, uint64_t timestamp);\n> >  \tvoid notifyError(uint32_t frameNumber, camera3_stream_t *stream,\n> >  \t\t\t camera3_error_msg_code code);\n> >  \tint processControls(Camera3RequestDescriptor *descriptor);\n> > +\tvoid sendCaptureResults();\n> >  \tstd::unique_ptr<CameraMetadata> getResultMetadata(\n> >  \t\tconst Camera3RequestDescriptor *descriptor) const;\n> >","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id A6F98BDC71\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon, 27 Sep 2021 23:52:55 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id D5EBB6918B;\n\tTue, 28 Sep 2021 01:52:54 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id A09066012D\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 28 Sep 2021 01:52:53 +0200 (CEST)","from pendragon.ideasonboard.com (62-78-145-57.bb.dnainternet.fi\n\t[62.78.145.57])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 0B0ED3F1;\n\tTue, 28 Sep 2021 01:52:52 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"Fdm7GDDL\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; 
t=1632786773;\n\tbh=kaCcpuVzA4YU7AzKhJJsmVJNkz5LLnHXOWB7+/yt3Q8=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=Fdm7GDDL0wClrLbTS4ugItLC9Q/5KppjVNMK4rF7Q5GPx73h1Qn+jx2GbvYXqH2pT\n\tIMVGvhf9LnG++f/vJxqxsszQOxI2HQ2tK+vzbunccDcLYClwzw53GUDOn89P6tdW28\n\trFgJeXdHvLnUa5vpVPL5QogOyef/F3Vks3YcdMIA=","Date":"Tue, 28 Sep 2021 02:52:46 +0300","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"Umang Jain <umang.jain@ideasonboard.com>","Message-ID":"<YVJZTi1FA/vAL96I@pendragon.ideasonboard.com>","References":"<20210927111149.692004-1-umang.jain@ideasonboard.com>\n\t<20210927111149.692004-4-umang.jain@ideasonboard.com>\n\t<20210927210847.wdx5xo62inqibts5@uno.localdomain>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","In-Reply-To":"<20210927210847.wdx5xo62inqibts5@uno.localdomain>","Subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the queue","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" 
<libcamera-devel-bounces@lists.libcamera.org>"}},{"id":19925,"web_url":"https://patchwork.libcamera.org/comment/19925/","msgid":"<44a7397c-770f-b20a-5440-be1c82e28d40@ideasonboard.com>","date":"2021-09-28T12:08:30","subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the queue","submitter":{"id":86,"url":"https://patchwork.libcamera.org/api/people/86/","name":"Umang Jain","email":"umang.jain@ideasonboard.com"},"content":"Hi,\n\nOn 9/28/21 5:22 AM, Laurent Pinchart wrote:\n> Hi Umang,\n>\n> Thank you for the patch.\n>\n> On Mon, Sep 27, 2021 at 11:08:47PM +0200, Jacopo Mondi wrote:\n>> On Mon, Sep 27, 2021 at 04:41:49PM +0530, Umang Jain wrote:\n>>> There is a possibility that an out-of-order completion of capture\n>>> request happens by calling process_capture_result() directly on error\n>>> paths. The framework expects that errors should be notified as soon as\n>>> possible, but the request completion order should remain intact.\n>>> An existing instance of this is abortRequest(), which sends the capture\n>>> results on flushing state, without considering order-of-completion.\n>>>\n>>> Since, we have a queue of Camera3RequestDescriptor tracking each\n> s/Since,/Since/\n>\n>>> capture request placed by framework to libcamera HAL, we should be only\n>>> sending back capture results from a single location, by inspecting\n>>> the queue. As per the patch, this now happens in\n>>> CameraDevice::sendCaptureResults().\n>>>\n>>> Each descriptor is now equipped with its own status to denote whether\n>>> the capture request is complete and ready to send back to the framework\n>>> or need to be waited upon. 
This ensures that the order of completion is\n> s/need/needs/\n>\n>>> respected for the requests.\n>>>\n>>> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>\n>>> ---\n>>>   src/android/camera_device.cpp | 46 ++++++++++++++++++++++++++---------\n>>>   src/android/camera_device.h   | 15 +++++++++++-\n>>>   2 files changed, 49 insertions(+), 12 deletions(-)\n>>>\n>>> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp\n>>> index b0b7f4fd..8e2d22c5 100644\n>>> --- a/src/android/camera_device.cpp\n>>> +++ b/src/android/camera_device.cpp\n>>> @@ -240,6 +240,8 @@ CameraDevice::Camera3RequestDescriptor::Camera3RequestDescriptor(\n>>>   \t/* Clone the controls associated with the camera3 request. */\n>>>   \tsettings_ = CameraMetadata(camera3Request->settings);\n>>>\n>>> +\tstatus_ = Status::Pending;\n>>> +\n> How about initializing this as part of the initializers list of the\n> constructor ? Or maybe in the definition of the structure, with\n>\n> \t\tStatus status_ = Status::Pending;\n>\n> (I haven't made up my mind on whether or not we should globally switch\n> to that, the pros and cons are not totally clear to me yet.)\n\n\nI'll opt to set in definition of the structure\n\n>\n>>>   \t/*\n>>>   \t * Create the CaptureRequest, stored as a unique_ptr<> to tie its\n>>>   \t * lifetime to the descriptor.\n>>> @@ -859,11 +861,12 @@ int CameraDevice::processControls(Camera3RequestDescriptor *descriptor)\n>>>   \treturn 0;\n>>>   }\n>>>\n>>> -void CameraDevice::abortRequest(camera3_capture_request_t *request)\n>>> +void CameraDevice::abortRequest(Camera3RequestDescriptor *descriptor,\n>>> +\t\t\t\tcamera3_capture_request_t *request)\n> Could this function take a Camera3RequestDescriptor pointer only ? It\n> should contain all the needed data. 
This can be done as a patch before\n> this one if desired.\n\n\nYeah probably, will check if it can be split into a different patch \nearlier than this\n\n>\n>>>   {\n>>>   \tnotifyError(request->frame_number, nullptr, CAMERA3_MSG_ERROR_REQUEST);\n>>>\n>>> -\tcamera3_capture_result_t result = {};\n>>> +\tcamera3_capture_result_t &result = descriptor->captureResult_;\n>>>   \tresult.num_output_buffers = request->num_output_buffers;\n>>>   \tresult.frame_number = request->frame_number;\n>>>   \tresult.partial_result = 0;\n>>> @@ -877,7 +880,8 @@ void CameraDevice::abortRequest(camera3_capture_request_t *request)\n>>>   \t}\n>>>   \tresult.output_buffers = resultBuffers.data();\n>>>\n>>> -\tcallbacks_->process_capture_result(callbacks_, &result);\n>>> +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n>>> +\tsendCaptureResults();\n>>>   }\n>>>\n>>>   bool CameraDevice::isValidRequest(camera3_capture_request_t *camera3Request) const\n>>> @@ -1045,13 +1049,19 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques\n>>>   \t\treturn ret;\n>>>\n>>>   \t/*\n>>> -\t * If flush is in progress abort the request. If the camera has been\n>>> -\t * stopped we have to re-start it to be able to process the request.\n>>> +\t * If flush is in progress push the descriptor in the queue and abort\n>>> +\t * the request. 
If the camera has been stopped we have to re-start it to\n>>> +\t * be able to process the request.\n>>>   \t */\n>>>   \tMutexLocker stateLock(stateMutex_);\n>>>\n>>>   \tif (state_ == State::Flushing) {\n>>> -\t\tabortRequest(camera3Request);\n>>> +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n>>> +\t\t{\n>>> +\t\t\tMutexLocker descriptorsLock(descriptorsMutex_);\n>>> +\t\t\tdescriptors_.push_back(std::move(descriptor));\n>>> +\t\t}\n>>> +\t\tabortRequest(descriptors_.back().get(), camera3Request);\n>> another possibility is to move adding the descriptor to the queue a\n>> little up in processCaptureRequest().\n> We have an issue here indeed, there's a race condition. As soon as the\n> request is added to the queue with a status set to !pending, it could be\n> completed by a call to sendCaptureResults() from requestComplete()\n> before abortRequest() gets a chance to run.\n\n\nNice catch, didn't think of it :)\n\n>\n> As abortRequest is called here only, I would call abortRequest() before\n> adding the descriptor to the queue, and move the sendCaptureResults()\n> call from abortRequest() to here after adding the descriptor to the\n> queue. You can drop setting descriptor->status_ to Error from this\n> function as it's done in abortRequest().\n\n\nAck.\n\n>\n>> However with a dequeu there is a possible issue: requests queued to\n>> the worker for which waiting on the fence or queueing to\n>> libcamera::Camera fails. We don't track those failure, and since\n>> the descriptors are on the queue but not queued to the Camera, their\n>> state won't ever be changed (unless we instrument the worker to do\n>> so). The issue is already here, and could cause a request to be lost,\n>> something for which CTS would complain but might potentially not\n>> compromise the capture session. 
With this new setup a forgotten\n>> request will starve the queue which sounds worse.\n\n\nIf we have an error on queueRequest(), I would expect the descriptor \nstatus to be set accordingly to make sure the descriptor doesn't starve \nthe queue endlessly, similar to what the patch currently does in \nabortRequest(), which sets the status to ::Error.\n\n\nI think the issue here is that we are missing an error handling block/logic for \nqueueRequest() in the first place.\n\n>> One way out is to pass the whole descriptor to the worker and let it\n\n\nAh okay, this was something you were referring to in the wait-fence CTS \nfix series... I was not getting the context there :-)\n\n>> set the state opportunely, as we have the descriptor on the queue\n>> already.\n\n\nSounds fine to me. Potentially as a separate series on top? I can add a \n\\todo here if you want me to.\n\n>>\n>> There might be better ways out, let's think about them a bit (bonus\n>> points, rework Camera3RequestDescriptor interface to hide\n>> CaptureRequest, but maybe later..)\n>>\n>>\n>>>   \t\treturn 0;\n>>>   \t}\n>>>\n>>> @@ -1099,7 +1109,7 @@ void CameraDevice::requestComplete(Request *request)\n>>>   \t\treturn;\n>>>   \t}\n>>>\n>>> -\tcamera3_capture_result_t captureResult = {};\n>>> +\tcamera3_capture_result_t &captureResult = descriptor->captureResult_;\n>>>   \tcaptureResult.frame_number = descriptor->frameNumber_;\n>>>   \tcaptureResult.num_output_buffers = descriptor->buffers_.size();\n>>>   \tcaptureResult.output_buffers = descriptor->buffers_.data();\n>>> @@ -1138,9 +1148,9 @@ void CameraDevice::requestComplete(Request *request)\n>>>   \t\t\tbuffer.acquire_fence = -1;\n>>>   \t\t\tbuffer.status = CAMERA3_BUFFER_STATUS_ERROR;\n>>>   \t\t}\n>>> -\t\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n>>>\n>>> -\t\tdescriptors_.pop_front();\n>>> +\t\tdescriptor->status_ = Camera3RequestDescriptor::Status::Error;\n>>> +\t\tsendCaptureResults();\n>>>   \t\treturn;\n>>>   \t}\n>>>\n>>> @@ -1217,9
+1227,23 @@ void CameraDevice::requestComplete(Request *request)\n>>>   \tcaptureResult.partial_result = 1;\n>>>\n>>>   \tcaptureResult.result = resultMetadata->get();\n>>> -\tcallbacks_->process_capture_result(callbacks_, &captureResult);\n>>> +\tdescriptor->status_ = Camera3RequestDescriptor::Status::Success;\n>>> +\tsendCaptureResults();\n>>> +}\n>>>\n>>> -\tdescriptors_.pop_front();\n>>> +void CameraDevice::sendCaptureResults()\n>>> +{\n>>> +\tMutexLocker lock(descriptorsMutex_);\n>>> +\twhile (!descriptors_.empty() && !descriptors_.front()->isPending()) {\n>>> +\t\tstd::unique_ptr<Camera3RequestDescriptor> descriptor =\n>>> +\t\t\tstd::move(descriptors_.front());\n>>> +\t\tdescriptors_.pop_front();\n>>> +\n>>> +\t\tlock.unlock();\n>>> +\t\tcallbacks_->process_capture_result(callbacks_,\n>>> +\t\t\t\t\t\t   &(descriptor->captureResult_));\n> Do you need the parentheses ?\n\n\nmight not\n\n>\n>>> +\t\tlock.lock();\n>>> +\t}\n>>>   }\n>>>\n>>>   std::string CameraDevice::logPrefix() const\n>>> diff --git a/src/android/camera_device.h b/src/android/camera_device.h\n>>> index 5889a0e7..545cb9b4 100644\n>>> --- a/src/android/camera_device.h\n>>> +++ b/src/android/camera_device.h\n>>> @@ -74,17 +74,28 @@ private:\n>>>   \tCameraDevice(unsigned int id, std::shared_ptr<libcamera::Camera> camera);\n>>>\n>>>   \tstruct Camera3RequestDescriptor {\n>>> +\t\tenum class Status {\n>>> +\t\t\tPending,\n>>> +\t\t\tSuccess,\n>>> +\t\t\tError,\n>>> +\t\t};\n>>> +\n>>>   \t\tCamera3RequestDescriptor() = default;\n>>>   \t\t~Camera3RequestDescriptor() = default;\n>>>   \t\tCamera3RequestDescriptor(libcamera::Camera *camera,\n>>>   \t\t\t\t\t const camera3_capture_request_t *camera3Request);\n>>>   \t\tCamera3RequestDescriptor &operator=(Camera3RequestDescriptor &&) = default;\n> Blank line here.\n>\n>>> +\t\tbool isPending() const { return status_ == Status::Pending; }\n>>>\n>>>   \t\tuint32_t frameNumber_ = 0;\n>>>   \t\tstd::vector<camera3_stream_buffer_t> buffers_;\n>>>   
\t\tstd::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;\n>>>   \t\tCameraMetadata settings_;\n>>>   \t\tstd::unique_ptr<CaptureRequest> request_;\n>>> +\n>>> +\t\tcamera3_capture_result_t captureResult_ = {};\n>>> +\t\tlibcamera::FrameBuffer *internalBuffer_;\n>> possibly unrelated\n\n\noops, yeah :S\n\n>>> +\t\tStatus status_;\n>>>   \t};\n>>>\n>>>   \tenum class State {\n>>> @@ -99,12 +110,14 @@ private:\n>>>   \tcreateFrameBuffer(const buffer_handle_t camera3buffer,\n>>>   \t\t\t  libcamera::PixelFormat pixelFormat,\n>>>   \t\t\t  const libcamera::Size &size);\n>>> -\tvoid abortRequest(camera3_capture_request_t *request);\n>>> +\tvoid abortRequest(Camera3RequestDescriptor *descriptor,\n>>> +\t\t\t  camera3_capture_request_t *request);\n>>>   \tbool isValidRequest(camera3_capture_request_t *request) const;\n>>>   \tvoid notifyShutter(uint32_t frameNumber, uint64_t timestamp);\n>>>   \tvoid notifyError(uint32_t frameNumber, camera3_stream_t *stream,\n>>>   \t\t\t camera3_error_msg_code code);\n>>>   \tint processControls(Camera3RequestDescriptor *descriptor);\n>>> +\tvoid sendCaptureResults();\n>>>   \tstd::unique_ptr<CameraMetadata> getResultMetadata(\n>>>   \t\tconst Camera3RequestDescriptor *descriptor) const;\n>>>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id DE863BDC71\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 28 Sep 2021 12:08:38 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 3A59C6918E;\n\tTue, 28 Sep 2021 14:08:38 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id AB9A469185\n\tfor
<libcamera-devel@lists.libcamera.org>;\n\tTue, 28 Sep 2021 14:08:36 +0200 (CEST)","from [192.168.1.104] (unknown [103.251.226.4])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 734CA3F1;\n\tTue, 28 Sep 2021 14:08:35 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"dA/qE0N9\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1632830916;\n\tbh=FJNrnX1S9I/aTIAvcoFlAALKTDfWApOY/UKqunw/PNg=;\n\th=Subject:To:Cc:References:From:Date:In-Reply-To:From;\n\tb=dA/qE0N9MrH00+Ose6VKU7nOC+ZrVV/JuJKA3uJr4n83WdA838ijQ0hajC3pj8CC6\n\txWEcqkc5gLFNLbJ4ucv0HIyZoQD8THrb57HeBexF3cTFfI/hmzzd4lCCCmJ5/ogiUM\n\twAGiqStuZ9jBHuBH6lj4gXBdub5P0eNKcd38InDc=","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","References":"<20210927111149.692004-1-umang.jain@ideasonboard.com>\n\t<20210927111149.692004-4-umang.jain@ideasonboard.com>\n\t<20210927210847.wdx5xo62inqibts5@uno.localdomain>\n\t<YVJZTi1FA/vAL96I@pendragon.ideasonboard.com>","From":"Umang Jain <umang.jain@ideasonboard.com>","Message-ID":"<44a7397c-770f-b20a-5440-be1c82e28d40@ideasonboard.com>","Date":"Tue, 28 Sep 2021 17:38:30 +0530","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101\n\tThunderbird/78.10.2","MIME-Version":"1.0","In-Reply-To":"<YVJZTi1FA/vAL96I@pendragon.ideasonboard.com>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Transfer-Encoding":"7bit","Content-Language":"en-US","Subject":"Re: [libcamera-devel] [PATCH v1 3/3] android: camera_device: Send\n\tcapture results by inspecting the 
queue","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}}]