Message ID: 20210826213016.1829504-6-umang.jain@ideasonboard.com
State: Superseded
Delegated to: Umang Jain
Hi Umang, Thank you for the patch. I'll start with a few high-level comments. On Fri, Aug 27, 2021 at 03:00:16AM +0530, Umang Jain wrote: > In CameraStream, introduce a new private PostProcessorWorker > class derived from Thread. The post-processor shall run in > this derived thread instance asynchronously. > > Running PostProcessor asynchronously should entail that all the > data needed by the PostProcessor should remain valid for the entire > duration of its run. In this instance, we currently need to ensure: > > - 'source' FrameBuffer for the post processor > - Camera result metadata > > Should be valid and saved with preserving the entire context. The > 'source' framebuffer is copied internally for streams other than > internal (since internal ones are allocated by CameraStream class > itself) and the copy is passed along to post-processor. > > Coming to preserving the context, a quick reference on how captured > results are sent to android framework. Completed captured results > should be sent using process_capture_result() callback. The order > of sending them should match the order the capture request > (camera3_capture_request_t), that was submitted to the HAL in the first > place. > > In order to enforce the ordering, we need to maintain a queue. When a > stream requires post-processing, we save the associated capture results > (via Camera3RequestDescriptor) and add it to the queue. All transient > completed captured results are queued as well. When the post-processing > results are available, we can simply start de-queuing all the queued > completed captured results and invoke process_capture_result() callback > on each of them. > > The context saving part is done by Camera3RequestDescriptor. It is > further extended to include data-structs to fulfill the demands of > async post-processing. 
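The queuing scheme the commit message describes (save completed capture results, then send them strictly in submission order once post-processing finishes) can be sketched as follows. This is a simplified, hypothetical model: Descriptor and CompletionQueue stand in for Camera3RequestDescriptor and the queuedDescriptor_ deque, and "sending" is reduced to recording frame numbers.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <vector>

/*
 * Simplified model of the in-order completion queue described above.
 * Names are illustrative, not the actual Android HAL types.
 */
enum class Status { NotReady, Ready };

struct Descriptor {
	unsigned int frameNumber;
	Status status = Status::NotReady;
};

class CompletionQueue
{
public:
	/* Queue a descriptor in submission order; processing completes later. */
	Descriptor *queue(unsigned int frameNumber)
	{
		queued_.push_back(std::make_unique<Descriptor>(Descriptor{ frameNumber }));
		return queued_.back().get();
	}

	/*
	 * Send (here: record) every descriptor at the front of the queue
	 * whose processing has completed, preserving submission order.
	 */
	void drain(std::vector<unsigned int> *sent)
	{
		while (!queued_.empty() &&
		       queued_.front()->status == Status::Ready) {
			sent->push_back(queued_.front()->frameNumber);
			queued_.pop_front();
		}
	}

private:
	std::deque<std::unique_ptr<Descriptor>> queued_;
};
```

Note that a frame completing out of order stays parked in the queue until every earlier frame is ready, which is exactly the ordering process_capture_result() requires.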
To ease review, do you think it would be possible to split this patch in two, with a first part that starts making use of the signals but doesn't move the post-processor to a separate thread ?

> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
> ---
> src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++----
> src/android/camera_device.h | 21 +++++++-
> src/android/camera_stream.cpp | 35 ++++++++++---
> src/android/camera_stream.h | 40 +++++++++++++++
> 4 files changed, 170 insertions(+), 18 deletions(-)
>
> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> index fea59ce6..08237187 100644
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -960,6 +960,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> * and add them to the Request if required.
> */
> FrameBuffer *buffer = nullptr;
> + descriptor.srcInternalBuffer_ = nullptr;
> switch (cameraStream->type()) {
> case CameraStream::Type::Mapped:
> /*
> @@ -990,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
> * once it has been processed.
> */
> buffer = cameraStream->getBuffer();
> + descriptor.srcInternalBuffer_ = buffer;
> LOG(HAL, Debug) << ss.str() << " (internal)";
> break;
> }
> @@ -1149,25 +1151,95 @@ void CameraDevice::requestComplete(Request *request)
> continue;
> }
>
> + /*
> + * Save the current context of capture result and queue the
> + * descriptor. cameraStream will process asynchronously, so
> + * we can only send the results back after the processing has
> + * been completed.
> + */
> + descriptor.status_ = Camera3RequestDescriptor::NOT_READY;
> + descriptor.resultMetadata_ = std::move(resultMetadata);
> + descriptor.captureResult_ = captureResult;
> +
> + cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete);

While signals can be connected and disconnected on demand, it often indicates a design issue.
Is there a reason why you can't connect the signal when the cameraStream instance is created, and keep it connected for the whole lifetime of the stream ? > int ret = cameraStream->process(src, *buffer.buffer, > descriptor.settings_, > - resultMetadata.get()); > + descriptor.resultMetadata_.get()); You have a race condition here, if cameraStream->process() runs in a separate thread, it could complete and emit the processComplete signal before the code below gets executed. > + std::unique_ptr<Camera3RequestDescriptor> reqDescriptor = > + std::make_unique<Camera3RequestDescriptor>(); > + *reqDescriptor = std::move(descriptor); > + queuedDescriptor_.push_back(std::move(reqDescriptor)); > + > + return; > + } > + > + if (queuedDescriptor_.empty()) { > + captureResult.result = resultMetadata->get(); > + callbacks_->process_capture_result(callbacks_, &captureResult); > + } else { > + /* > + * Previous capture results waiting to be sent to framework > + * hence, queue the current capture results as well. After that, > + * check if any results are ready to be sent back to the > + * framework. 
> + */ > + descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS; > + descriptor.resultMetadata_ = std::move(resultMetadata); > + descriptor.captureResult_ = captureResult; > + std::unique_ptr<Camera3RequestDescriptor> reqDescriptor = > + std::make_unique<Camera3RequestDescriptor>(); > + *reqDescriptor = std::move(descriptor); > + queuedDescriptor_.push_back(std::move(reqDescriptor)); > + > + while (!queuedDescriptor_.empty()) { > + std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front(); > + if (d->status_ != Camera3RequestDescriptor::NOT_READY) { > + d->captureResult_.result = d->resultMetadata_->get(); > + callbacks_->process_capture_result(callbacks_, &(d->captureResult_)); > + queuedDescriptor_.pop_front(); > + } else { > + break; > + } > + } > + } > +} > + > +void CameraDevice::streamProcessingComplete(CameraStream *cameraStream, > + CameraStream::ProcessStatus status, > + CameraMetadata *resultMetadata) > +{ > + cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete); > + > + for (auto &d : queuedDescriptor_) { > + if (d->resultMetadata_.get() != resultMetadata) > + continue; > + > /* > * Return the FrameBuffer to the CameraStream now that we're > * done processing it. > */ > if (cameraStream->type() == CameraStream::Type::Internal) > - cameraStream->putBuffer(src); > - > - if (ret) { > - buffer.status = CAMERA3_BUFFER_STATUS_ERROR; > - notifyError(descriptor.frameNumber_, buffer.stream, > - CAMERA3_MSG_ERROR_BUFFER); > + cameraStream->putBuffer(d->srcInternalBuffer_); > + > + if (status == CameraStream::ProcessStatus::Success) { > + d->status_ = Camera3RequestDescriptor::READY_SUCCESS; > + } else { > + /* \todo this is clumsy error handling, be better. 
*/ > + d->status_ = Camera3RequestDescriptor::READY_FAILED; > + for (camera3_stream_buffer_t &buffer : d->buffers_) { > + CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv); > + > + if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB) > + continue; > + > + buffer.status = CAMERA3_BUFFER_STATUS_ERROR; > + notifyError(d->frameNumber_, buffer.stream, > + CAMERA3_MSG_ERROR_BUFFER); > + } > } > - } > > - captureResult.result = resultMetadata->get(); > - callbacks_->process_capture_result(callbacks_, &captureResult); > + return; > + } > } > > std::string CameraDevice::logPrefix() const > diff --git a/src/android/camera_device.h b/src/android/camera_device.h > index dd9aebba..4e4bb970 100644 > --- a/src/android/camera_device.h > +++ b/src/android/camera_device.h > @@ -7,6 +7,7 @@ > #ifndef __ANDROID_CAMERA_DEVICE_H__ > #define __ANDROID_CAMERA_DEVICE_H__ > > +#include <deque> > #include <map> > #include <memory> > #include <mutex> > @@ -81,6 +82,20 @@ private: > std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_; > CameraMetadata settings_; > std::unique_ptr<CaptureRequest> request_; > + > + /* > + * Only set when a capture result needs to be queued before > + * completion via process_capture_result() callback. 
> + */ > + enum processingStatus { > + NOT_READY, > + READY_SUCCESS, > + READY_FAILED, > + }; > + std::unique_ptr<CameraMetadata> resultMetadata_; > + camera3_capture_result_t captureResult_; > + libcamera::FrameBuffer *srcInternalBuffer_; > + processingStatus status_; > }; > > enum class State { > @@ -100,7 +115,9 @@ private: > int processControls(Camera3RequestDescriptor *descriptor); > std::unique_ptr<CameraMetadata> getResultMetadata( > const Camera3RequestDescriptor &descriptor) const; > - > + void streamProcessingComplete(CameraStream *cameraStream, > + CameraStream::ProcessStatus status, > + CameraMetadata *resultMetadata); > unsigned int id_; > camera3_device_t camera3Device_; > > @@ -128,6 +145,8 @@ private: > int orientation_; > > CameraMetadata lastSettings_; > + > + std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_; > }; > > #endif /* __ANDROID_CAMERA_DEVICE_H__ */ > diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp > index bdcc7cf9..d5981c0e 100644 > --- a/src/android/camera_stream.cpp > +++ b/src/android/camera_stream.cpp > @@ -55,6 +55,7 @@ CameraStream::CameraStream(CameraDevice *const cameraDevice, > * is what we instantiate here. > */ > postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_); > + ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get()); > } > > if (type == Type::Internal) { > @@ -108,22 +109,41 @@ int CameraStream::process(const libcamera::FrameBuffer *source, > if (!postProcessor_) > return 0; > > - /* > - * \todo Buffer mapping and processing should be moved to a > - * separate thread. > - */ > - CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE); > - if (!dest.isValid()) { > + /* \todo Should buffer mapping be moved to post-processor's thread? 
*/ Good point :-) > + dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE); > + if (!dest_->isValid()) { > LOG(HAL, Error) << "Failed to map android blob buffer"; > return -EINVAL; > } > > - return postProcessor_->process(source, &dest, requestMetadata, resultMetadata); > + if (type() != CameraStream::Type::Internal) { > + /* > + * The source buffer is owned by Request object which can be > + * reused for future capture request. Since post-processor will > + * run asynchrnously, we need to copy the source buffer and use > + * it as source data for post-processor. > + */ > + srcPlanes_.clear(); > + for (const auto &plane : source->planes()) > + srcPlanes_.push_back(plane); > + > + srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_); > + source = srcFramebuffer_.get(); > + } This will break if the next frame needs to be post-processed before the current frame completes. You need a queue of frames to post-process, not just one. One option would be to store the data in Camera3RequestDescriptor instead, as that the full context for a request, including all its frames. Some refactoring would be required, as the structure is current private to CameraDevice. > + > + ppWorker_->start(); > + > + return postProcessor_->invokeMethod(&PostProcessor::process, > + ConnectionTypeQueued, > + source, dest_.get(), > + requestMetadata, resultMetadata); Is the CameraStream actually the right place to spawn a thread and dispatch processing jobs ? Brainstorming a bit, I wonder if this shouldn't be moved to the PostProcessor class instead, as the need for a thread depends on the type of post-processor. The current software-based post-processors definitely need a thread, but a post-processor that would use a hardware-accelerated JPEG encoder would be asynchronous by nature (posting jobs to the device and being notified of their completion through callbacks) and may not need to be wrapped in a thread. 
It would actually also depend on how the hardware JPEG encoder would be interfaced. If the post-processor were to deal with the kernel device directly, it would likely need an event loop, which we're missing in the HAL implementation. A thread would provide that event loop, but it could be best to use a single thread for all event-driven post-processors instead of one per post-processor. This is largely theoretical though, we don't have hardware-backed post-processors today.

> }
>
> void CameraStream::handlePostProcessing(PostProcessor::Status status,
> CameraMetadata *resultMetadata)
> {
> + ppWorker_->exit();

Threads won't consume many resources when they're idle, but starting and stopping a thread is a costly operation. It would be better to start the thread when starting the camera capture session.
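The suggestion above, starting the worker thread once per capture session and feeding it jobs per frame instead of starting and stopping a thread for every post-processing run, could look roughly like this. A hypothetical sketch using std::thread and a plain job queue rather than the libcamera::Thread API used by the patch:

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

/*
 * Illustrative sketch (not libcamera API): a worker thread started once
 * per capture session, with one job posted per frame.
 */
class SessionWorker
{
public:
	/* Called when the capture session starts. */
	void start()
	{
		running_ = true;
		thread_ = std::thread([this] { run(); });
	}

	/* Called per frame; cheap compared to spawning a thread. */
	void post(std::function<void()> job)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			jobs_.push(std::move(job));
		}
		cv_.notify_one();
	}

	/* Called when the session stops; drains remaining jobs. */
	void stop()
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			running_ = false;
		}
		cv_.notify_one();
		thread_.join();
	}

private:
	void run()
	{
		std::unique_lock<std::mutex> lock(mutex_);
		while (true) {
			cv_.wait(lock, [this] { return !jobs_.empty() || !running_; });
			if (!running_ && jobs_.empty())
				break;
			auto job = std::move(jobs_.front());
			jobs_.pop();
			lock.unlock();
			job(); /* Run post-processing outside the lock. */
			lock.lock();
		}
	}

	std::thread thread_;
	std::mutex mutex_;
	std::condition_variable cv_;
	std::queue<std::function<void()>> jobs_;
	bool running_ = false;
};
```

The key point is the lifecycle: start()/stop() bracket the session, while post() is the only per-frame cost.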
postProcessor->moveToThread(this); > + } Assuming we want to keep the thread inside the CameraStream, I think this needs to be reworked a little bit. The post-processor, in this case, isn't thread-aware, it gets run in a thread, but doesn't really know much about that fact. However, in patch 1/5, you make PostProcessor inherit from Object to enabled the invokeMethod() call. There's a design conflict between these two facts. I would propose instead keeping the PostProcessor unaware of threads, without inheriting from Object, and making PostProcessorWorker the class that inherits from Object and is moved to the thread. The thread could stay in PostProcessorWorker by inheriting from both Object and Thread, but that's a bit confusing, so I'd recommend adding a libcamera::Thread thread_ member variable to CameraStream for the thread. This is the design used by the IPA proxies, you can look at the generated code to see how it's implemented (it's more readable than the templates). The worker would have a process() function, called from CameraStream::process() using invokeMethod(), and that worker process() functions would simply call PostProcessor::process() directly. > + > + void start() > + { > + /* > + * \todo [BLOCKER] !!! > + * On second shutter capture, this fails to start the > + * thread(again). No errors for first shutter capture. > + */ > + Thread::start(); > + } Why do you reimplement start(), if you only call the same function in the parent class ? > + > + void stop() > + { > + exit(); > + wait(); > + } This is never called. 
> + > + protected: > + void run() override > + { > + exec(); > + dispatchMessages(); > + } > + }; > + > void handlePostProcessing(PostProcessor::Status status, > CameraMetadata *resultMetadata); > > @@ -152,6 +187,11 @@ private: > */ > std::unique_ptr<std::mutex> mutex_; > std::unique_ptr<PostProcessor> postProcessor_; > + std::unique_ptr<PostProcessorWorker> ppWorker_; > + > + std::unique_ptr<CameraBuffer> dest_; > + std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_; > + std::vector<libcamera::FrameBuffer::Plane> srcPlanes_; > }; > > #endif /* __ANDROID_CAMERA_STREAM__ */
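The race flagged earlier in the review (processComplete possibly firing before the descriptor is queued) is avoided by queuing the descriptor before dispatching the asynchronous job. A hypothetical model, with the completion callback invoked immediately inside dispatch() to simulate the worst-case timing:

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>

/*
 * Illustrative model (hypothetical names): the completion handler looks
 * the descriptor up in the queue, so the descriptor must be queued
 * *before* the processing job is dispatched.
 */
struct Descriptor {
	bool completed = false;
};

class Device
{
public:
	void submit(std::unique_ptr<Descriptor> desc)
	{
		Descriptor *key = desc.get();

		/*
		 * Queue first: once dispatched, the completion handler may
		 * run at any time, even before dispatch() returns.
		 */
		queued_.push_back(std::move(desc));
		dispatch([this, key] { onComplete(key); });
	}

	bool frontCompleted() const { return queued_.front()->completed; }

private:
	/* Simulates the worst case: completion fires immediately. */
	static void dispatch(std::function<void()> job) { job(); }

	void onComplete(Descriptor *key)
	{
		for (auto &d : queued_) {
			if (d.get() == key) {
				d->completed = true;
				return;
			}
		}
		/* Reaching here would be the race: completed before queued. */
		assert(false);
	}

	std::deque<std::unique_ptr<Descriptor>> queued_;
};
```

Had submit() dispatched before queuing, the immediate callback would miss the descriptor entirely, which is the failure mode the review describes.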
Hi Laurent On 8/27/21 8:34 AM, Laurent Pinchart wrote: > Hi Umang, > > Thank you for the patch. > > I'll start with a few high-level comments. Great, I was expecting exactly this. > > On Fri, Aug 27, 2021 at 03:00:16AM +0530, Umang Jain wrote: >> In CameraStream, introduce a new private PostProcessorWorker >> class derived from Thread. The post-processor shall run in >> this derived thread instance asynchronously. >> >> Running PostProcessor asynchronously should entail that all the >> data needed by the PostProcessor should remain valid for the entire >> duration of its run. In this instance, we currently need to ensure: >> >> - 'source' FrameBuffer for the post processor >> - Camera result metadata >> >> Should be valid and saved with preserving the entire context. The >> 'source' framebuffer is copied internally for streams other than >> internal (since internal ones are allocated by CameraStream class >> itself) and the copy is passed along to post-processor. >> >> Coming to preserving the context, a quick reference on how captured >> results are sent to android framework. Completed captured results >> should be sent using process_capture_result() callback. The order >> of sending them should match the order the capture request >> (camera3_capture_request_t), that was submitted to the HAL in the first >> place. >> >> In order to enforce the ordering, we need to maintain a queue. When a >> stream requires post-processing, we save the associated capture results >> (via Camera3RequestDescriptor) and add it to the queue. All transient >> completed captured results are queued as well. When the post-processing >> results are available, we can simply start de-queuing all the queued >> completed captured results and invoke process_capture_result() callback >> on each of them. >> >> The context saving part is done by Camera3RequestDescriptor. It is >> further extended to include data-structs to fulfill the demands of >> async post-processing. 
> To ease review, do you think it would be possible to split this patch in > two, with a first part that starts making use of the signals but doesn't > move the post-processor to a separate threads Ah, indeed, we can break it with use of signal emitting in the signals in same thread and then introducing the threading mechanism in a separate patch (Why didn't I think of this) > >> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com> >> --- >> src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++---- >> src/android/camera_device.h | 21 +++++++- >> src/android/camera_stream.cpp | 35 ++++++++++--- >> src/android/camera_stream.h | 40 +++++++++++++++ >> 4 files changed, 170 insertions(+), 18 deletions(-) >> >> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp >> index fea59ce6..08237187 100644 >> --- a/src/android/camera_device.cpp >> +++ b/src/android/camera_device.cpp >> @@ -960,6 +960,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques >> * and add them to the Request if required. >> */ >> FrameBuffer *buffer = nullptr; >> + descriptor.srcInternalBuffer_ = nullptr; >> switch (cameraStream->type()) { >> case CameraStream::Type::Mapped: >> /* >> @@ -990,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques >> * once it has been processed. >> */ >> buffer = cameraStream->getBuffer(); >> + descriptor.srcInternalBuffer_ = buffer; >> LOG(HAL, Debug) << ss.str() << " (internal)"; >> break; >> } >> @@ -1149,25 +1151,95 @@ void CameraDevice::requestComplete(Request *request) >> continue; >> } >> >> + /* >> + * Save the current context of capture result and queue the >> + * descriptor. cameraStream will process asynchronously, so >> + * we can only send the results back after the processing has >> + * been completed. 
>> + */ >> + descriptor.status_ = Camera3RequestDescriptor::NOT_READY; >> + descriptor.resultMetadata_ = std::move(resultMetadata); >> + descriptor.captureResult_ = captureResult; >> + >> + cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete); > While signals can be connected and disconnected on demand, it often > indicates a design issue. Is there a reason why you can't connect the > signal when the cameraStream instance is created, and keep it connected > for the whole lifetime of the stream ? yes, I can keep it connected for the while lifetime of the stream too... > >> int ret = cameraStream->process(src, *buffer.buffer, >> descriptor.settings_, >> - resultMetadata.get()); >> + descriptor.resultMetadata_.get()); > You have a race condition here, if cameraStream->process() runs in a > separate thread, it could complete and emit the processComplete signal > before the code below gets executed. ouch, that's not right order. You are correct. I should queue the descriptor first before process().. Good spot! > >> + std::unique_ptr<Camera3RequestDescriptor> reqDescriptor = >> + std::make_unique<Camera3RequestDescriptor>(); >> + *reqDescriptor = std::move(descriptor); >> + queuedDescriptor_.push_back(std::move(reqDescriptor)); >> + >> + return; >> + } >> + >> + if (queuedDescriptor_.empty()) { >> + captureResult.result = resultMetadata->get(); >> + callbacks_->process_capture_result(callbacks_, &captureResult); >> + } else { >> + /* >> + * Previous capture results waiting to be sent to framework >> + * hence, queue the current capture results as well. After that, >> + * check if any results are ready to be sent back to the >> + * framework. 
>> + */ >> + descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS; >> + descriptor.resultMetadata_ = std::move(resultMetadata); >> + descriptor.captureResult_ = captureResult; >> + std::unique_ptr<Camera3RequestDescriptor> reqDescriptor = >> + std::make_unique<Camera3RequestDescriptor>(); >> + *reqDescriptor = std::move(descriptor); >> + queuedDescriptor_.push_back(std::move(reqDescriptor)); >> + >> + while (!queuedDescriptor_.empty()) { >> + std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front(); >> + if (d->status_ != Camera3RequestDescriptor::NOT_READY) { >> + d->captureResult_.result = d->resultMetadata_->get(); >> + callbacks_->process_capture_result(callbacks_, &(d->captureResult_)); >> + queuedDescriptor_.pop_front(); >> + } else { >> + break; >> + } >> + } >> + } >> +} >> + >> +void CameraDevice::streamProcessingComplete(CameraStream *cameraStream, >> + CameraStream::ProcessStatus status, >> + CameraMetadata *resultMetadata) >> +{ >> + cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete); >> + >> + for (auto &d : queuedDescriptor_) { >> + if (d->resultMetadata_.get() != resultMetadata) >> + continue; >> + >> /* >> * Return the FrameBuffer to the CameraStream now that we're >> * done processing it. >> */ >> if (cameraStream->type() == CameraStream::Type::Internal) >> - cameraStream->putBuffer(src); >> - >> - if (ret) { >> - buffer.status = CAMERA3_BUFFER_STATUS_ERROR; >> - notifyError(descriptor.frameNumber_, buffer.stream, >> - CAMERA3_MSG_ERROR_BUFFER); >> + cameraStream->putBuffer(d->srcInternalBuffer_); >> + >> + if (status == CameraStream::ProcessStatus::Success) { >> + d->status_ = Camera3RequestDescriptor::READY_SUCCESS; >> + } else { >> + /* \todo this is clumsy error handling, be better. 
*/ >> + d->status_ = Camera3RequestDescriptor::READY_FAILED; >> + for (camera3_stream_buffer_t &buffer : d->buffers_) { >> + CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv); >> + >> + if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB) >> + continue; >> + >> + buffer.status = CAMERA3_BUFFER_STATUS_ERROR; >> + notifyError(d->frameNumber_, buffer.stream, >> + CAMERA3_MSG_ERROR_BUFFER); >> + } >> } >> - } >> >> - captureResult.result = resultMetadata->get(); >> - callbacks_->process_capture_result(callbacks_, &captureResult); >> + return; >> + } >> } >> >> std::string CameraDevice::logPrefix() const >> diff --git a/src/android/camera_device.h b/src/android/camera_device.h >> index dd9aebba..4e4bb970 100644 >> --- a/src/android/camera_device.h >> +++ b/src/android/camera_device.h >> @@ -7,6 +7,7 @@ >> #ifndef __ANDROID_CAMERA_DEVICE_H__ >> #define __ANDROID_CAMERA_DEVICE_H__ >> >> +#include <deque> >> #include <map> >> #include <memory> >> #include <mutex> >> @@ -81,6 +82,20 @@ private: >> std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_; >> CameraMetadata settings_; >> std::unique_ptr<CaptureRequest> request_; >> + >> + /* >> + * Only set when a capture result needs to be queued before >> + * completion via process_capture_result() callback. 
>> + */ >> + enum processingStatus { >> + NOT_READY, >> + READY_SUCCESS, >> + READY_FAILED, >> + }; >> + std::unique_ptr<CameraMetadata> resultMetadata_; >> + camera3_capture_result_t captureResult_; >> + libcamera::FrameBuffer *srcInternalBuffer_; >> + processingStatus status_; >> }; >> >> enum class State { >> @@ -100,7 +115,9 @@ private: >> int processControls(Camera3RequestDescriptor *descriptor); >> std::unique_ptr<CameraMetadata> getResultMetadata( >> const Camera3RequestDescriptor &descriptor) const; >> - >> + void streamProcessingComplete(CameraStream *cameraStream, >> + CameraStream::ProcessStatus status, >> + CameraMetadata *resultMetadata); >> unsigned int id_; >> camera3_device_t camera3Device_; >> >> @@ -128,6 +145,8 @@ private: >> int orientation_; >> >> CameraMetadata lastSettings_; >> + >> + std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_; >> }; >> >> #endif /* __ANDROID_CAMERA_DEVICE_H__ */ >> diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp >> index bdcc7cf9..d5981c0e 100644 >> --- a/src/android/camera_stream.cpp >> +++ b/src/android/camera_stream.cpp >> @@ -55,6 +55,7 @@ CameraStream::CameraStream(CameraDevice *const cameraDevice, >> * is what we instantiate here. >> */ >> postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_); >> + ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get()); >> } >> >> if (type == Type::Internal) { >> @@ -108,22 +109,41 @@ int CameraStream::process(const libcamera::FrameBuffer *source, >> if (!postProcessor_) >> return 0; >> >> - /* >> - * \todo Buffer mapping and processing should be moved to a >> - * separate thread. >> - */ >> - CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE); >> - if (!dest.isValid()) { >> + /* \todo Should buffer mapping be moved to post-processor's thread? */ > Good point :-) umm, what does this indicate? I prefer to leave the mapping as it is, for now, with \todo '?' converted to a statement. 
We can decide on it later and can be done on top? > >> + dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE); >> + if (!dest_->isValid()) { >> LOG(HAL, Error) << "Failed to map android blob buffer"; >> return -EINVAL; >> } >> >> - return postProcessor_->process(source, &dest, requestMetadata, resultMetadata); >> + if (type() != CameraStream::Type::Internal) { >> + /* >> + * The source buffer is owned by Request object which can be >> + * reused for future capture request. Since post-processor will >> + * run asynchrnously, we need to copy the source buffer and use >> + * it as source data for post-processor. >> + */ >> + srcPlanes_.clear(); >> + for (const auto &plane : source->planes()) >> + srcPlanes_.push_back(plane); >> + >> + srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_); >> + source = srcFramebuffer_.get(); >> + } > This will break if the next frame needs to be post-processed before the > current frame completes. You need a queue of frames to post-process, not > just one. One option would be to store the data in > Camera3RequestDescriptor instead, as that the full context for a Ah, yes, this is thin ice. But exposing Camera3RequestDescriptor data, I am not sure... maybe we can just expose the frame data /only/. Something I'll meditate upon... but I agree with the data storage location in Camera3RequestDescriptor Did you notice, I am /copying/ the frame buffer 'source' ? :-) Does the associated comment makes sense to you? Asking because I and Kieran had a quite a bit of discussion on this right now. ... In my head right now, I see that 'source' comes from libcamera::Request in CameraDevice::requestComplete. So after signal handler, the libcamera::Request is disposed or reused (I don't know, I haven't seen the 'higher' context of what's happening with request). 
In both cases, the 'source' framebuffer (used for post-processing) might get invalidated/segfault, hence the reason to copy the framebuffer for post-processing purposes. This is the context in my head right now, I might be wrong. Does it make sense? I'll try to look around for the 'higher' context of the request, whether or not it's getting reused or something else

> request, including all its frames. Some refactoring would be required,
> as the structure is current private to CameraDevice.
>
>> +
>> + ppWorker_->start();
>> +
>> + return postProcessor_->invokeMethod(&PostProcessor::process,
>> + ConnectionTypeQueued,
>> + source, dest_.get(),
>> + requestMetadata, resultMetadata);

> Is the CameraStream actually the right place to spawn a thread and
> dispatch processing jobs ? Brainstorming a bit, I wonder if this
> shouldn't be moved to the PostProcessor class instead, as the need for a
> thread depends on the type of post-processor. The current software-based
> post-processors definitely need a thread, but a post-processor that
> would use a hardware-accelerated JPEG encoder would be asynchronous by
> nature (posting jobs to the device and being notified of their
> completion through callbacks) and may not need to be wrapped in a
> thread. It would actually also depend on how the hardware JPEG encoder

Yes, this bit needs discussion, but certainly you have more forward-looking ideas than me. The way I envisioned it is that we have 'n' post-processors in the future, and CameraStream is the one that creates the post-processor instances. E.g., currently we have in the CameraStream constructor:

postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);

In a future of multiple post-processors, I see that CameraStream decides which postProcessor_ it needs, depending on the context. Speaking from a high level, it feels CameraStream is the one that creates and manages the post-processor objects.
Hence, it should be the one to decide whether to run the post-processor asynchronously or synchronously, and the post-processor implementation should just care about the post-processing-specific bits. So the hardware-accelerated JPEG encoder you mentioned kind of fits into this design, i.e. letting CameraStream decide whether a post-processor needs to run in a separate thread or not. As the h/w accelerated encoder is async by nature, the CameraStream shouldn't use a thread for this post-processor. There can also be multiple post-processors, right, for a single CameraStream instance? One chaining into another... would need to think about that too, I guess. Anyway, this is my vision while writing the series. As you can already see, it's limited and I am open to suggestions / experimenting with new design ideas.

> would be interfaced. If the post-processor were to deal with the kernel
> device directly, it would likely need an event loop, which we're missing
> in the HAL implementation. A thread would provide that event loop, but
> it could be best to use a single thread for all event-driven
> post-processors instead of one per post-processor. This is largely
> theoretical though, we don't have hardware-backed post-processors today.
>
>> }
>>
>> void CameraStream::handlePostProcessing(PostProcessor::Status status,
>> CameraMetadata *resultMetadata)
>> {
>> + ppWorker_->exit();

> Threads won't consume many resources when they're idle, but starting and
> stopping a thread is a costly operation. It would be better to start the
> thread when starting the camera capture session.

Ok, noted. Also I am sure I had stop() here, but it seems a bad rebase/cherry-pick fiasco let the exit() creep in (from earlier implementations).
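The earlier point about moving the copied source buffer out of CameraStream and into per-request storage (so overlapping frames each keep their own context, instead of a single srcFramebuffer_ being clobbered) can be sketched as follows. Hypothetical types: PlaneHandle and PostProcessContext stand in for libcamera::FrameBuffer::Plane and the data the review proposes keeping in Camera3RequestDescriptor.

```cpp
#include <cassert>
#include <memory>
#include <vector>

/*
 * Hypothetical sketch: each in-flight request carries its own copy of
 * the post-processing source description, so reusing the Request (and
 * its buffer) for a later frame cannot invalidate an earlier frame's
 * in-progress post-processing.
 */
struct PlaneHandle {
	int fd;
	unsigned int length;
};

struct PostProcessContext {
	std::vector<PlaneHandle> srcPlanes; /* copied from the Request's buffer */
};

struct RequestDescriptor {
	unsigned int frameNumber;
	std::unique_ptr<PostProcessContext> ppContext;
};

RequestDescriptor makeDescriptor(unsigned int frameNumber,
				 const std::vector<PlaneHandle> &planes)
{
	RequestDescriptor d;
	d.frameNumber = frameNumber;
	d.ppContext = std::make_unique<PostProcessContext>();
	d.ppContext->srcPlanes = planes; /* copy: stays valid past Request reuse */
	return d;
}
```

With one context per descriptor, nothing needs to be reset at completion time; the context is simply destroyed with its descriptor.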
>
>> +
>> 	switch (status) {
>> 	case PostProcessor::Status::Success:
>> 		processComplete.emit(this, ProcessStatus::Success, resultMetadata);
>> @@ -134,6 +154,7 @@ void CameraStream::handlePostProcessing(PostProcessor::Status status,
>> 	default:
>> 		LOG(HAL, Error) << "PostProcessor status invalid";
>> 	}
>> +	srcFramebuffer_.reset();
>> }
>>
>> FrameBuffer *CameraStream::getBuffer()
>> diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
>> index d54d3f58..567d896f 100644
>> --- a/src/android/camera_stream.h
>> +++ b/src/android/camera_stream.h
>> @@ -13,7 +13,9 @@
>>
>> #include <hardware/camera3.h>
>>
>> +#include <libcamera/base/object.h>
>> #include <libcamera/base/signal.h>
>> +#include <libcamera/base/thread.h>
>>
>> #include <libcamera/camera.h>
>> #include <libcamera/framebuffer.h>
>> @@ -23,6 +25,7 @@
>>
>> #include "post_processor.h"
>>
>> +class CameraBuffer;
>> class CameraDevice;
>> class CameraMetadata;
>>
>> @@ -135,6 +138,38 @@ public:
>> 	libcamera::Signal<CameraStream *, ProcessStatus, CameraMetadata *> processComplete;
>>
>> private:
>> +	class PostProcessorWorker : public libcamera::Thread
>> +	{
>> +	public:
>> +		PostProcessorWorker(PostProcessor *postProcessor)
>> +		{
>> +			postProcessor->moveToThread(this);
>> +		}

> Assuming we want to keep the thread inside the CameraStream, I think
> this needs to be reworked a little bit. The post-processor, in this
> case, isn't thread-aware, it gets run in a thread, but doesn't really
> know much about that fact. However, in patch 1/5, you make PostProcessor
> inherit from Object to enable the invokeMethod() call. There's a design
> conflict between these two facts.
>
> I would propose instead keeping the PostProcessor unaware of threads,
> without inheriting from Object, and making PostProcessorWorker the class
> that inherits from Object and is moved to the thread.
> The thread could stay in PostProcessorWorker by inheriting from both
> Object and Thread, but that's a bit confusing, so I'd recommend adding
> a libcamera::Thread thread_ member variable to CameraStream for the
> thread. This is the design used by the IPA proxies, you can look at
> the generated code to

Ok, I have noted this down but am not commenting right now. I think I need
to read the code bits again and visualize the flow first.

> see how it's implemented (it's more readable than the templates). The
> worker would have a process() function, called from
> CameraStream::process() using invokeMethod(), and that worker
> process() function would simply call PostProcessor::process() directly.
>
>> +
>> +		void start()
>> +		{
>> +			/*
>> +			 * \todo [BLOCKER] !!!
>> +			 * On second shutter capture, this fails to start the
>> +			 * thread (again). No errors for first shutter capture.
>> +			 */
>> +			Thread::start();
>> +		}

> Why do you reimplement start(), if you only call the same function in
> the parent class ?

Oh, hmm :-S I think I had moveToThread() here for the post-processor, but it
changed over iterations and I failed to clean up the ruins.

>
>> +
>> +		void stop()
>> +		{
>> +			exit();
>> +			wait();
>> +		}

> This is never called.

>
>> +
>> +	protected:
>> +		void run() override
>> +		{
>> +			exec();
>> +			dispatchMessages();
>> +		}
>> +	};
>> +
>> 	void handlePostProcessing(PostProcessor::Status status,
>> 				  CameraMetadata *resultMetadata);
>>
>> @@ -152,6 +187,11 @@ private:
>> 	 */
>> 	std::unique_ptr<std::mutex> mutex_;
>> 	std::unique_ptr<PostProcessor> postProcessor_;
>> +	std::unique_ptr<PostProcessorWorker> ppWorker_;
>> +
>> +	std::unique_ptr<CameraBuffer> dest_;
>> +	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;
>> +	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
>> };
>>
>> #endif /* __ANDROID_CAMERA_STREAM__ */
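A rough, stand-alone analogue of the proposed split (thread-unaware PostProcessor, a worker that owns the queue and the thread, with jobs dispatched in the spirit of invokeMethod(..., ConnectionTypeQueued)) can be sketched with std::thread instead of libcamera's Object/Thread primitives. All names and the queueing mechanics below are illustrative only:

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

/* Thread-unaware post-processor: runs synchronously when called. */
class PostProcessor
{
public:
	int process(int jobId) { return jobId * 2; }
};

/* Owns the worker thread and the job queue; the processor knows nothing. */
class PostProcessorWorker
{
public:
	PostProcessorWorker(PostProcessor *pp)
		: postProcessor_(pp), exit_(false),
		  thread_([this] { run(); }) {}

	~PostProcessorWorker()
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			exit_ = true;
		}
		cond_.notify_one();
		thread_.join();
	}

	/* Queue a job; it executes later on the worker thread. */
	void process(int jobId, std::function<void(int)> done)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			jobs_.push_back([this, jobId, done] {
				done(postProcessor_->process(jobId));
			});
		}
		cond_.notify_one();
	}

private:
	void run()
	{
		for (;;) {
			std::function<void()> job;
			{
				std::unique_lock<std::mutex> lock(mutex_);
				cond_.wait(lock, [this] {
					return exit_ || !jobs_.empty();
				});
				if (jobs_.empty())
					return; /* exit requested, queue drained */
				job = std::move(jobs_.front());
				jobs_.pop_front();
			}
			job();
		}
	}

	PostProcessor *postProcessor_;
	std::deque<std::function<void()>> jobs_;
	std::mutex mutex_;
	std::condition_variable cond_;
	bool exit_;
	std::thread thread_;
};
```

Here worker.process() returns immediately, like a queued invokeMethod(), and PostProcessor::process() runs later on the worker thread; the destructor drains the queue and joins, guaranteeing completion.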
diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
index fea59ce6..08237187 100644
--- a/src/android/camera_device.cpp
+++ b/src/android/camera_device.cpp
@@ -960,6 +960,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 		 * and add them to the Request if required.
 		 */
 		FrameBuffer *buffer = nullptr;
+		descriptor.srcInternalBuffer_ = nullptr;
 		switch (cameraStream->type()) {
 		case CameraStream::Type::Mapped:
 			/*
@@ -990,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 			 * once it has been processed.
 			 */
 			buffer = cameraStream->getBuffer();
+			descriptor.srcInternalBuffer_ = buffer;
 			LOG(HAL, Debug) << ss.str() << " (internal)";
 			break;
 		}
@@ -1149,25 +1151,95 @@ void CameraDevice::requestComplete(Request *request)
 			continue;
 		}
 
+		/*
+		 * Save the current context of the capture result and queue
+		 * the descriptor. cameraStream will process asynchronously,
+		 * so we can only send the results back after the processing
+		 * has been completed.
+		 */
+		descriptor.status_ = Camera3RequestDescriptor::NOT_READY;
+		descriptor.resultMetadata_ = std::move(resultMetadata);
+		descriptor.captureResult_ = captureResult;
+
+		cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete);
 		int ret = cameraStream->process(src, *buffer.buffer,
 						descriptor.settings_,
-						resultMetadata.get());
+						descriptor.resultMetadata_.get());
+		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
+			std::make_unique<Camera3RequestDescriptor>();
+		*reqDescriptor = std::move(descriptor);
+		queuedDescriptor_.push_back(std::move(reqDescriptor));
+
+		return;
+	}
+
+	if (queuedDescriptor_.empty()) {
+		captureResult.result = resultMetadata->get();
+		callbacks_->process_capture_result(callbacks_, &captureResult);
+	} else {
+		/*
+		 * Previous capture results are waiting to be sent to the
+		 * framework, hence queue the current capture results as
+		 * well. After that, check if any results are ready to be
+		 * sent back to the framework.
+		 */
+		descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS;
+		descriptor.resultMetadata_ = std::move(resultMetadata);
+		descriptor.captureResult_ = captureResult;
+		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
+			std::make_unique<Camera3RequestDescriptor>();
+		*reqDescriptor = std::move(descriptor);
+		queuedDescriptor_.push_back(std::move(reqDescriptor));
+
+		while (!queuedDescriptor_.empty()) {
+			std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front();
+			if (d->status_ != Camera3RequestDescriptor::NOT_READY) {
+				d->captureResult_.result = d->resultMetadata_->get();
+				callbacks_->process_capture_result(callbacks_, &(d->captureResult_));
+				queuedDescriptor_.pop_front();
+			} else {
+				break;
+			}
+		}
+	}
+}
+
+void CameraDevice::streamProcessingComplete(CameraStream *cameraStream,
+					    CameraStream::ProcessStatus status,
+					    CameraMetadata *resultMetadata)
+{
+	cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete);
+
+	for (auto &d : queuedDescriptor_) {
+		if (d->resultMetadata_.get() != resultMetadata)
+			continue;
+
 		/*
 		 * Return the FrameBuffer to the CameraStream now that we're
 		 * done processing it.
 		 */
 		if (cameraStream->type() == CameraStream::Type::Internal)
-			cameraStream->putBuffer(src);
-
-		if (ret) {
-			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
-			notifyError(descriptor.frameNumber_, buffer.stream,
-				    CAMERA3_MSG_ERROR_BUFFER);
+			cameraStream->putBuffer(d->srcInternalBuffer_);
+
+		if (status == CameraStream::ProcessStatus::Success) {
+			d->status_ = Camera3RequestDescriptor::READY_SUCCESS;
+		} else {
+			/* \todo This is clumsy error handling, be better. */
+			d->status_ = Camera3RequestDescriptor::READY_FAILED;
+			for (camera3_stream_buffer_t &buffer : d->buffers_) {
+				CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv);
+
+				if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB)
+					continue;
+
+				buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
+				notifyError(d->frameNumber_, buffer.stream,
+					    CAMERA3_MSG_ERROR_BUFFER);
+			}
 		}
-	}
 
-	captureResult.result = resultMetadata->get();
-	callbacks_->process_capture_result(callbacks_, &captureResult);
+		return;
+	}
 }
 
 std::string CameraDevice::logPrefix() const
diff --git a/src/android/camera_device.h b/src/android/camera_device.h
index dd9aebba..4e4bb970 100644
--- a/src/android/camera_device.h
+++ b/src/android/camera_device.h
@@ -7,6 +7,7 @@
 #ifndef __ANDROID_CAMERA_DEVICE_H__
 #define __ANDROID_CAMERA_DEVICE_H__
 
+#include <deque>
 #include <map>
 #include <memory>
 #include <mutex>
@@ -81,6 +82,20 @@ private:
 		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
 		CameraMetadata settings_;
 		std::unique_ptr<CaptureRequest> request_;
+
+		/*
+		 * Only set when a capture result needs to be queued before
+		 * completion via the process_capture_result() callback.
+		 */
+		enum processingStatus {
+			NOT_READY,
+			READY_SUCCESS,
+			READY_FAILED,
+		};
+		std::unique_ptr<CameraMetadata> resultMetadata_;
+		camera3_capture_result_t captureResult_;
+		libcamera::FrameBuffer *srcInternalBuffer_;
+		processingStatus status_;
 	};
 
 	enum class State {
@@ -100,7 +115,9 @@ private:
 	int processControls(Camera3RequestDescriptor *descriptor);
 	std::unique_ptr<CameraMetadata> getResultMetadata(
 		const Camera3RequestDescriptor &descriptor) const;
-
+	void streamProcessingComplete(CameraStream *cameraStream,
+				      CameraStream::ProcessStatus status,
+				      CameraMetadata *resultMetadata);
 	unsigned int id_;
 	camera3_device_t camera3Device_;
@@ -128,6 +145,8 @@ private:
 	int orientation_;
 
 	CameraMetadata lastSettings_;
+
+	std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_;
 };
 
 #endif /* __ANDROID_CAMERA_DEVICE_H__ */
diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
index bdcc7cf9..d5981c0e 100644
--- a/src/android/camera_stream.cpp
+++ b/src/android/camera_stream.cpp
@@ -55,6 +55,7 @@ CameraStream::CameraStream(CameraDevice *const cameraDevice,
 		 * is what we instantiate here.
 		 */
 		postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);
+		ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get());
 	}
 
 	if (type == Type::Internal) {
@@ -108,22 +109,41 @@ int CameraStream::process(const libcamera::FrameBuffer *source,
 	if (!postProcessor_)
 		return 0;
 
-	/*
-	 * \todo Buffer mapping and processing should be moved to a
-	 * separate thread.
-	 */
-	CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE);
-	if (!dest.isValid()) {
+	/* \todo Should buffer mapping be moved to the post-processor's thread? */
+	dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE);
+	if (!dest_->isValid()) {
 		LOG(HAL, Error) << "Failed to map android blob buffer";
 		return -EINVAL;
 	}
 
-	return postProcessor_->process(source, &dest, requestMetadata, resultMetadata);
+	if (type() != CameraStream::Type::Internal) {
+		/*
+		 * The source buffer is owned by the Request object, which
+		 * can be reused for a future capture request. Since the
+		 * post-processor will run asynchronously, we need to copy
+		 * the source buffer and use it as source data for the
+		 * post-processor.
+		 */
+		srcPlanes_.clear();
+		for (const auto &plane : source->planes())
+			srcPlanes_.push_back(plane);
+
+		srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_);
+		source = srcFramebuffer_.get();
+	}
+
+	ppWorker_->start();
+
+	return postProcessor_->invokeMethod(&PostProcessor::process,
+					    ConnectionTypeQueued,
+					    source, dest_.get(),
+					    requestMetadata, resultMetadata);
 }
 
 void CameraStream::handlePostProcessing(PostProcessor::Status status,
 					CameraMetadata *resultMetadata)
 {
+	ppWorker_->exit();
+
 	switch (status) {
 	case PostProcessor::Status::Success:
 		processComplete.emit(this, ProcessStatus::Success, resultMetadata);
@@ -134,6 +154,7 @@ void CameraStream::handlePostProcessing(PostProcessor::Status status,
 	default:
 		LOG(HAL, Error) << "PostProcessor status invalid";
 	}
+	srcFramebuffer_.reset();
 }
 
 FrameBuffer *CameraStream::getBuffer()
diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
index d54d3f58..567d896f 100644
--- a/src/android/camera_stream.h
+++ b/src/android/camera_stream.h
@@ -13,7 +13,9 @@
 
 #include <hardware/camera3.h>
 
+#include <libcamera/base/object.h>
 #include <libcamera/base/signal.h>
+#include <libcamera/base/thread.h>
 
 #include <libcamera/camera.h>
 #include <libcamera/framebuffer.h>
@@ -23,6 +25,7 @@
 
 #include "post_processor.h"
 
+class CameraBuffer;
 class CameraDevice;
 class CameraMetadata;
 
@@ -135,6 +138,38 @@ public:
 	libcamera::Signal<CameraStream *, ProcessStatus, CameraMetadata *> processComplete;
 
 private:
+	class PostProcessorWorker : public libcamera::Thread
+	{
+	public:
+		PostProcessorWorker(PostProcessor *postProcessor)
+		{
+			postProcessor->moveToThread(this);
+		}
+
+		void start()
+		{
+			/*
+			 * \todo [BLOCKER] !!!
+			 * On second shutter capture, this fails to start the
+			 * thread (again). No errors for first shutter capture.
+			 */
+			Thread::start();
+		}
+
+		void stop()
+		{
+			exit();
+			wait();
+		}
+
+	protected:
+		void run() override
+		{
+			exec();
+			dispatchMessages();
+		}
+	};
+
 	void handlePostProcessing(PostProcessor::Status status,
 				  CameraMetadata *resultMetadata);
 
@@ -152,6 +187,11 @@ private:
 	 */
 	std::unique_ptr<std::mutex> mutex_;
 	std::unique_ptr<PostProcessor> postProcessor_;
+	std::unique_ptr<PostProcessorWorker> ppWorker_;
+
+	std::unique_ptr<CameraBuffer> dest_;
+	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;
+	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
 };
 
 #endif /* __ANDROID_CAMERA_STREAM__ */
In CameraStream, introduce a new private PostProcessorWorker class derived
from Thread. The post-processor runs asynchronously in this derived thread
instance.

Running the PostProcessor asynchronously entails that all the data it needs
remains valid for the entire duration of its run. In this instance, we
currently need to ensure that:

- the 'source' FrameBuffer for the post-processor
- the camera result metadata

remain valid, with the entire context preserved. The 'source' framebuffer is
copied internally for streams other than internal ones (since internal ones
are allocated by the CameraStream class itself), and the copy is passed along
to the post-processor.

Coming to preserving the context, a quick reference on how captured results
are sent to the Android framework: completed capture results should be sent
using the process_capture_result() callback, in the same order as the capture
requests (camera3_capture_request_t) that were submitted to the HAL in the
first place.

In order to enforce this ordering, we need to maintain a queue. When a stream
requires post-processing, we save the associated capture results (via
Camera3RequestDescriptor) and add them to the queue. All transient completed
capture results are queued as well. When the post-processing results are
available, we can simply start de-queuing the queued completed capture
results and invoke the process_capture_result() callback on each of them.

The context saving is done by Camera3RequestDescriptor, which is further
extended with the data structures needed for asynchronous post-processing.

Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
---
 src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++----
 src/android/camera_device.h   | 21 +++++++-
 src/android/camera_stream.cpp | 35 ++++++++++---
 src/android/camera_stream.h   | 40 +++++++++++++++
 4 files changed, 170 insertions(+), 18 deletions(-)
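The in-order completion described above can be illustrated with a minimal, self-contained sketch. Descriptor and ResultQueue below are plain C++ stand-ins for Camera3RequestDescriptor and the process_capture_result() path, not the actual HAL types: results are queued in submission order, and the queue is drained from the front only while the head descriptor is ready.

```cpp
#include <deque>
#include <memory>
#include <vector>

/* Minimal stand-in for Camera3RequestDescriptor. */
struct Descriptor {
	enum Status { NOT_READY, READY_SUCCESS, READY_FAILED };
	unsigned int frameNumber;
	Status status = NOT_READY;
};

class ResultQueue
{
public:
	/* Queue a result in submission order. */
	void queue(unsigned int frame, Descriptor::Status status)
	{
		auto d = std::make_unique<Descriptor>();
		d->frameNumber = frame;
		d->status = status;
		queued_.push_back(std::move(d));
	}

	/* Called when post-processing of 'frame' finishes. */
	void markReady(unsigned int frame, Descriptor::Status status)
	{
		for (auto &d : queued_)
			if (d->frameNumber == frame)
				d->status = status;
	}

	/*
	 * Drain in submission order, stopping at the first descriptor
	 * whose post-processing hasn't completed yet; this mirrors the
	 * while-loop in requestComplete(). Returns the frame numbers
	 * "sent" to the framework.
	 */
	std::vector<unsigned int> sendReady()
	{
		std::vector<unsigned int> sent;
		while (!queued_.empty() &&
		       queued_.front()->status != Descriptor::NOT_READY) {
			sent.push_back(queued_.front()->frameNumber);
			queued_.pop_front();
		}
		return sent;
	}

private:
	std::deque<std::unique_ptr<Descriptor>> queued_;
};
```

A later result that is already complete still waits behind an earlier result that needs post-processing, which is exactly the ordering guarantee camera3 requires.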