[libcamera-devel,RFC,5/5] android: Run post processing in a separate thread

Message ID 20210826213016.1829504-6-umang.jain@ideasonboard.com
State Superseded
Delegated to: Umang Jain
Series
  • android: Async post-processing

Commit Message

Umang Jain Aug. 26, 2021, 9:30 p.m. UTC
In CameraStream, introduce a new private PostProcessorWorker
class derived from Thread. The post-processor shall run
asynchronously in this derived thread instance.

Running the PostProcessor asynchronously entails that all the
data needed by the PostProcessor remains valid for the entire
duration of its run. In this instance, we currently need to
ensure that:

- the 'source' FrameBuffer for the post-processor
- the camera result metadata

remain valid, with the entire context preserved. The 'source'
framebuffer is copied internally for streams other than internal
ones (since internal ones are allocated by the CameraStream class
itself), and the copy is passed along to the post-processor.

Coming to preserving the context, a quick reference on how capture
results are sent to the Android framework: completed capture
results should be sent using the process_capture_result() callback,
and the order of sending them must match the order of the capture
requests (camera3_capture_request_t) that were submitted to the HAL
in the first place.

In order to enforce the ordering, we need to maintain a queue. When
a stream requires post-processing, we save the associated capture
results (via Camera3RequestDescriptor) and add them to the queue.
All transient completed capture results are queued as well. When
the post-processing results are available, we can simply start
de-queuing the queued completed capture results and invoke the
process_capture_result() callback on each of them.

The context-saving part is done by Camera3RequestDescriptor, which
is further extended with data structures to fulfill the demands of
async post-processing.

Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
---
 src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++----
 src/android/camera_device.h   | 21 +++++++-
 src/android/camera_stream.cpp | 35 ++++++++++---
 src/android/camera_stream.h   | 40 +++++++++++++++
 4 files changed, 170 insertions(+), 18 deletions(-)

Comments

Laurent Pinchart Aug. 27, 2021, 3:04 a.m. UTC | #1
Hi Umang,

Thank you for the patch.

I'll start with a few high-level comments.

On Fri, Aug 27, 2021 at 03:00:16AM +0530, Umang Jain wrote:
> In CameraStream, introduce a new private PostProcessorWorker
> class derived from Thread. The post-processor shall run
> asynchronously in this derived thread instance.
>
> Running the PostProcessor asynchronously entails that all the
> data needed by the PostProcessor remains valid for the entire
> duration of its run. In this instance, we currently need to
> ensure that:
>
> - the 'source' FrameBuffer for the post-processor
> - the camera result metadata
>
> remain valid, with the entire context preserved. The 'source'
> framebuffer is copied internally for streams other than internal
> ones (since internal ones are allocated by the CameraStream class
> itself), and the copy is passed along to the post-processor.
>
> Coming to preserving the context, a quick reference on how capture
> results are sent to the Android framework: completed capture
> results should be sent using the process_capture_result() callback,
> and the order of sending them must match the order of the capture
> requests (camera3_capture_request_t) that were submitted to the HAL
> in the first place.
>
> In order to enforce the ordering, we need to maintain a queue. When
> a stream requires post-processing, we save the associated capture
> results (via Camera3RequestDescriptor) and add them to the queue.
> All transient completed capture results are queued as well. When
> the post-processing results are available, we can simply start
> de-queuing the queued completed capture results and invoke the
> process_capture_result() callback on each of them.
>
> The context-saving part is done by Camera3RequestDescriptor, which
> is further extended with data structures to fulfill the demands of
> async post-processing.

To ease review, do you think it would be possible to split this patch
into two, with a first part that starts making use of the signals but
doesn't move the post-processor to a separate thread?

> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
> ---
>  src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++----
>  src/android/camera_device.h   | 21 +++++++-
>  src/android/camera_stream.cpp | 35 ++++++++++---
>  src/android/camera_stream.h   | 40 +++++++++++++++
>  4 files changed, 170 insertions(+), 18 deletions(-)
> 
> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
> index fea59ce6..08237187 100644
> --- a/src/android/camera_device.cpp
> +++ b/src/android/camera_device.cpp
> @@ -960,6 +960,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>  		 * and add them to the Request if required.
>  		 */
>  		FrameBuffer *buffer = nullptr;
> +		descriptor.srcInternalBuffer_ = nullptr;
>  		switch (cameraStream->type()) {
>  		case CameraStream::Type::Mapped:
>  			/*
> @@ -990,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>  			 * once it has been processed.
>  			 */
>  			buffer = cameraStream->getBuffer();
> +			descriptor.srcInternalBuffer_ = buffer;
>  			LOG(HAL, Debug) << ss.str() << " (internal)";
>  			break;
>  		}
> @@ -1149,25 +1151,95 @@ void CameraDevice::requestComplete(Request *request)
>  			continue;
>  		}
>  
> +		/*
> +		 * Save the current context of the capture result and queue the
> +		 * descriptor. The cameraStream will process asynchronously, so
> +		 * we can only send the results back after the processing has
> +		 * been completed.
> +		 */
> +		descriptor.status_ = Camera3RequestDescriptor::NOT_READY;
> +		descriptor.resultMetadata_ = std::move(resultMetadata);
> +		descriptor.captureResult_ = captureResult;
> +
> +		cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete);

While signals can be connected and disconnected on demand, it often
indicates a design issue. Is there a reason why you can't connect the
signal when the cameraStream instance is created, and keep it connected
for the whole lifetime of the stream ?
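
For illustration, a minimal sketch of the one-time connection (the
placement in CameraDevice::configureStreams() is an assumption; it just
needs to happen once per stream):

	/* Hypothetical: connect once, after the CameraStream instances
	 * have been created in CameraDevice::configureStreams(). */
	for (CameraStream &cameraStream : streams_)
		cameraStream.processComplete.connect(
			this, &CameraDevice::streamProcessingComplete);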

>  		int ret = cameraStream->process(src, *buffer.buffer,
>  						descriptor.settings_,
> -						resultMetadata.get());
> +						descriptor.resultMetadata_.get());

You have a race condition here: if cameraStream->process() runs in a
separate thread, it could complete and emit the processComplete signal
before the code below gets executed.
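
A sketch of the reordered flow, with the descriptor queued before the
job is dispatched (names as in this patch):

	/*
	 * Queue the descriptor first, so that a post-processor that
	 * completes quickly can't race with the queueing below.
	 */
	std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
		std::make_unique<Camera3RequestDescriptor>();
	*reqDescriptor = std::move(descriptor);
	queuedDescriptor_.push_back(std::move(reqDescriptor));

	Camera3RequestDescriptor *d = queuedDescriptor_.back().get();
	int ret = cameraStream->process(src, *buffer.buffer,
					d->settings_,
					d->resultMetadata_.get());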

> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
> +			std::make_unique<Camera3RequestDescriptor>();
> +		*reqDescriptor = std::move(descriptor);
> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
> +
> +		return;
> +	}
> +
> +	if (queuedDescriptor_.empty()) {
> +		captureResult.result = resultMetadata->get();
> +		callbacks_->process_capture_result(callbacks_, &captureResult);
> +	} else {
> +		/*
> +		 * Previous capture results are waiting to be sent to the framework;
> +		 * hence, queue the current capture results as well. After that,
> +		 * check if any results are ready to be sent back to the
> +		 * framework.
> +		 */
> +		descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS;
> +		descriptor.resultMetadata_ = std::move(resultMetadata);
> +		descriptor.captureResult_ = captureResult;
> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
> +			std::make_unique<Camera3RequestDescriptor>();
> +		*reqDescriptor = std::move(descriptor);
> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
> +
> +		while (!queuedDescriptor_.empty()) {
> +			std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front();
> +			if (d->status_ != Camera3RequestDescriptor::NOT_READY) {
> +				d->captureResult_.result = d->resultMetadata_->get();
> +				callbacks_->process_capture_result(callbacks_, &(d->captureResult_));
> +				queuedDescriptor_.pop_front();
> +			} else {
> +				break;
> +			}
> +		}
> +	}
> +}
> +
> +void CameraDevice::streamProcessingComplete(CameraStream *cameraStream,
> +					    CameraStream::ProcessStatus status,
> +					    CameraMetadata *resultMetadata)
> +{
> +	cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete);
> +
> +	for (auto &d : queuedDescriptor_) {
> +		if (d->resultMetadata_.get() != resultMetadata)
> +			continue;
> +
>  		/*
>  		 * Return the FrameBuffer to the CameraStream now that we're
>  		 * done processing it.
>  		 */
>  		if (cameraStream->type() == CameraStream::Type::Internal)
> -			cameraStream->putBuffer(src);
> -
> -		if (ret) {
> -			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> -			notifyError(descriptor.frameNumber_, buffer.stream,
> -				    CAMERA3_MSG_ERROR_BUFFER);
> +			cameraStream->putBuffer(d->srcInternalBuffer_);
> +
> +		if (status == CameraStream::ProcessStatus::Success) {
> +			d->status_ = Camera3RequestDescriptor::READY_SUCCESS;
> +		} else {
> +			/* \todo this is clumsy error handling, be better. */
> +			d->status_ = Camera3RequestDescriptor::READY_FAILED;
> +			for (camera3_stream_buffer_t &buffer : d->buffers_) {
> +				CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv);
> +
> +				if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB)
> +					continue;
> +
> +				buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
> +				notifyError(d->frameNumber_, buffer.stream,
> +					    CAMERA3_MSG_ERROR_BUFFER);
> +			}
>  		}
> -	}
>  
> -	captureResult.result = resultMetadata->get();
> -	callbacks_->process_capture_result(callbacks_, &captureResult);
> +		return;
> +	}
>  }
>  
>  std::string CameraDevice::logPrefix() const
> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
> index dd9aebba..4e4bb970 100644
> --- a/src/android/camera_device.h
> +++ b/src/android/camera_device.h
> @@ -7,6 +7,7 @@
>  #ifndef __ANDROID_CAMERA_DEVICE_H__
>  #define __ANDROID_CAMERA_DEVICE_H__
>  
> +#include <deque>
>  #include <map>
>  #include <memory>
>  #include <mutex>
> @@ -81,6 +82,20 @@ private:
>  		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
>  		CameraMetadata settings_;
>  		std::unique_ptr<CaptureRequest> request_;
> +
> +		/*
> +		 * Only set when a capture result needs to be queued before
> +		 * completion via process_capture_result() callback.
> +		 */
> +		enum processingStatus {
> +			NOT_READY,
> +			READY_SUCCESS,
> +			READY_FAILED,
> +		};
> +		std::unique_ptr<CameraMetadata> resultMetadata_;
> +		camera3_capture_result_t captureResult_;
> +		libcamera::FrameBuffer *srcInternalBuffer_;
> +		processingStatus status_;
>  	};
>  
>  	enum class State {
> @@ -100,7 +115,9 @@ private:
>  	int processControls(Camera3RequestDescriptor *descriptor);
>  	std::unique_ptr<CameraMetadata> getResultMetadata(
>  		const Camera3RequestDescriptor &descriptor) const;
> -
> +	void streamProcessingComplete(CameraStream *cameraStream,
> +				      CameraStream::ProcessStatus status,
> +				      CameraMetadata *resultMetadata);
>  	unsigned int id_;
>  	camera3_device_t camera3Device_;
>  
> @@ -128,6 +145,8 @@ private:
>  	int orientation_;
>  
>  	CameraMetadata lastSettings_;
> +
> +	std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_;
>  };
>  
>  #endif /* __ANDROID_CAMERA_DEVICE_H__ */
> diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
> index bdcc7cf9..d5981c0e 100644
> --- a/src/android/camera_stream.cpp
> +++ b/src/android/camera_stream.cpp
> @@ -55,6 +55,7 @@ CameraStream::CameraStream(CameraDevice *const cameraDevice,
>  		 * is what we instantiate here.
>  		 */
>  		postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);
> +		ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get());
>  	}
>  
>  	if (type == Type::Internal) {
> @@ -108,22 +109,41 @@ int CameraStream::process(const libcamera::FrameBuffer *source,
>  	if (!postProcessor_)
>  		return 0;
>  
> -	/*
> -	 * \todo Buffer mapping and processing should be moved to a
> -	 * separate thread.
> -	 */
> -	CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE);
> -	if (!dest.isValid()) {
> +	/* \todo Should buffer mapping be moved to post-processor's thread? */

Good point :-)

> +	dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE);
> +	if (!dest_->isValid()) {
>  		LOG(HAL, Error) << "Failed to map android blob buffer";
>  		return -EINVAL;
>  	}
>  
> -	return postProcessor_->process(source, &dest, requestMetadata, resultMetadata);
> +	if (type() != CameraStream::Type::Internal) {
> +		/*
> +		 * The source buffer is owned by the Request object which can be
> +		 * reused for future capture requests. Since the post-processor
> +		 * will run asynchronously, we need to copy the source buffer and
> +		 * use it as source data for the post-processor.
> +		 */
> +		srcPlanes_.clear();
> +		for (const auto &plane : source->planes())
> +			srcPlanes_.push_back(plane);
> +
> +		srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_);
> +		source = srcFramebuffer_.get();
> +	}

This will break if the next frame needs to be post-processed before the
current frame completes. You need a queue of frames to post-process, not
just one. One option would be to store the data in
Camera3RequestDescriptor instead, as that is the full context for a
request, including all its frames. Some refactoring would be required,
as the structure is currently private to CameraDevice.
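
A rough sketch of that per-request storage (the member names are
hypothetical, and Camera3RequestDescriptor would have to become visible
to CameraStream):

	/* Hypothetical members in Camera3RequestDescriptor: */
	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;

	/* In CameraStream::process(), one copy per in-flight request: */
	descriptor->srcPlanes_ = source->planes();
	descriptor->srcFramebuffer_ =
		std::make_unique<libcamera::FrameBuffer>(descriptor->srcPlanes_);
	source = descriptor->srcFramebuffer_.get();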

> +
> +	ppWorker_->start();
> +
> +	return postProcessor_->invokeMethod(&PostProcessor::process,
> +					    ConnectionTypeQueued,
> +					    source, dest_.get(),
> +					    requestMetadata, resultMetadata);

Is the CameraStream actually the right place to spawn a thread and
dispatch processing jobs ? Brainstorming a bit, I wonder if this
shouldn't be moved to the PostProcessor class instead, as the need for a
thread depends on the type of post-processor. The current software-based
post-processors definitely need a thread, but a post-processor that
would use a hardware-accelerated JPEG encoder would be asynchronous by
nature (posting jobs to the device and being notified of their
completion through callbacks) and may not need to be wrapped in a
thread. It would actually also depend on how the hardware JPEG encoder
would be interfaced. If the post-processor were to deal with the kernel
device directly, it would likely need an event loop, which we're missing
in the HAL implementation. A thread would provide that event loop, but
it could be best to use a single thread for all event-driven
post-processors instead of one per post-processor. This is largely
theoretical though, we don't have hardware-backed post-processors today.
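
Purely as a sketch of that last idea (every name here is hypothetical),
the HAL could own a single event-loop thread and move all event-driven
workers to it:

	/* One shared event loop for all event-driven post-processors,
	 * e.g. owned by the camera HAL manager. */
	libcamera::Thread eventThread_;

	/* Each event-driven worker (a libcamera::Object) moves to it. */
	workerA->moveToThread(&eventThread_);
	workerB->moveToThread(&eventThread_);
	eventThread_.start();	/* the default run() calls exec() */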

>  }
>  
>  void CameraStream::handlePostProcessing(PostProcessor::Status status,
>  					CameraMetadata *resultMetadata)
>  {
> +	ppWorker_->exit();

Threads won't consume many resources when they're idle, but starting and
stopping a thread is a costly operation. It would be better to start the
thread when starting the camera capture session.
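
Sketching that lifecycle (the hook points are assumptions; the HAL side
would pick the exact functions):

	/* Hypothetical: start the worker with the capture session... */
	int CameraStream::configure()
	{
		/* ... existing stream configuration ... */
		if (ppWorker_)
			ppWorker_->start();
		return 0;
	}

	/* ... and stop it when the session ends. */
	void CameraStream::stop()
	{
		if (ppWorker_)
			ppWorker_->stop();	/* exit() + wait() */
	}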

> +
>  	switch (status) {
>  	case PostProcessor::Status::Success:
>  		processComplete.emit(this, ProcessStatus::Success, resultMetadata);
> @@ -134,6 +154,7 @@ void CameraStream::handlePostProcessing(PostProcessor::Status status,
>  	default:
>  		LOG(HAL, Error) << "PostProcessor status invalid";
>  	}
> +	srcFramebuffer_.reset();
>  }
>  
>  FrameBuffer *CameraStream::getBuffer()
> diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
> index d54d3f58..567d896f 100644
> --- a/src/android/camera_stream.h
> +++ b/src/android/camera_stream.h
> @@ -13,7 +13,9 @@
>  
>  #include <hardware/camera3.h>
>  
> +#include <libcamera/base/object.h>
>  #include <libcamera/base/signal.h>
> +#include <libcamera/base/thread.h>
>  
>  #include <libcamera/camera.h>
>  #include <libcamera/framebuffer.h>
> @@ -23,6 +25,7 @@
>  
>  #include "post_processor.h"
>  
> +class CameraBuffer;
>  class CameraDevice;
>  class CameraMetadata;
>  
> @@ -135,6 +138,38 @@ public:
>  	libcamera::Signal<CameraStream *, ProcessStatus, CameraMetadata *> processComplete;
>  
>  private:
> +	class PostProcessorWorker : public libcamera::Thread
> +	{
> +	public:
> +		PostProcessorWorker(PostProcessor *postProcessor)
> +		{
> +			postProcessor->moveToThread(this);
> +		}

Assuming we want to keep the thread inside the CameraStream, I think
this needs to be reworked a little bit. The post-processor, in this
case, isn't thread-aware: it gets run in a thread but doesn't really
know much about that fact. However, in patch 1/5, you make PostProcessor
inherit from Object to enable the invokeMethod() call. There's a design
conflict between these two facts.

I would propose instead keeping the PostProcessor unaware of threads,
without inheriting from Object, and making PostProcessorWorker the class
that inherits from Object and is moved to the thread. The thread could
stay in PostProcessorWorker by inheriting from both Object and Thread,
but that's a bit confusing, so I'd recommend adding a libcamera::Thread
thread_ member variable to CameraStream for the thread. This is the
design used by the IPA proxies; you can look at the generated code to
see how it's implemented (it's more readable than the templates). The
worker would have a process() function, called from
CameraStream::process() using invokeMethod(), and that worker's
process() function would simply call PostProcessor::process() directly.
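
Something along these lines, as a rough sketch (the signatures are
taken from this patch; the rest is an assumption):

	class PostProcessorWorker : public libcamera::Object
	{
	public:
		PostProcessorWorker(PostProcessor *postProcessor)
			: postProcessor_(postProcessor)
		{
		}

		void process(const libcamera::FrameBuffer *source,
			     CameraBuffer *dest,
			     const CameraMetadata &requestMetadata,
			     CameraMetadata *resultMetadata)
		{
			postProcessor_->process(source, dest, requestMetadata,
						resultMetadata);
		}

	private:
		PostProcessor *postProcessor_;
	};

	/* In CameraStream, a plain Thread member plus the worker: */
	libcamera::Thread thread_;
	PostProcessorWorker worker_;	/* constructed with postProcessor_.get() */

	/* At construction/start time: */
	worker_.moveToThread(&thread_);
	thread_.start();

	/* In CameraStream::process(): */
	worker_.invokeMethod(&PostProcessorWorker::process,
			     ConnectionTypeQueued, source, dest_.get(),
			     requestMetadata, resultMetadata);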

> +
> +		void start()
> +		{
> +			/*
> +			 * \todo [BLOCKER] !!!
> +			 * On second shutter capture, this fails to start the
> +			 * thread (again). No errors for the first shutter capture.
> +			 */
> +			Thread::start();
> +		}

Why do you reimplement start(), if you only call the same function in
the parent class ?

> +
> +		void stop()
> +		{
> +			exit();
> +			wait();
> +		}

This is never called.

> +
> +	protected:
> +		void run() override
> +		{
> +			exec();
> +			dispatchMessages();
> +		}
> +	};
> +
>  	void handlePostProcessing(PostProcessor::Status status,
>  				  CameraMetadata *resultMetadata);
>  
> @@ -152,6 +187,11 @@ private:
>  	 */
>  	std::unique_ptr<std::mutex> mutex_;
>  	std::unique_ptr<PostProcessor> postProcessor_;
> +	std::unique_ptr<PostProcessorWorker> ppWorker_;
> +
> +	std::unique_ptr<CameraBuffer> dest_;
> +	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;
> +	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
>  };
>  
>  #endif /* __ANDROID_CAMERA_STREAM__ */
Umang Jain Aug. 27, 2021, 2 p.m. UTC | #2
Hi Laurent

On 8/27/21 8:34 AM, Laurent Pinchart wrote:
> Hi Umang,
>
> Thank you for the patch.
>
> I'll start with a few high-level comments.
Great, I was expecting exactly this.
>
> On Fri, Aug 27, 2021 at 03:00:16AM +0530, Umang Jain wrote:
>> In CameraStream, introduce a new private PostProcessorWorker
>> class derived from Thread. The post-processor shall run
>> asynchronously in this derived thread instance.
>>
>> Running the PostProcessor asynchronously entails that all the
>> data needed by the PostProcessor remains valid for the entire
>> duration of its run. In this instance, we currently need to
>> ensure that:
>>
>> - the 'source' FrameBuffer for the post-processor
>> - the camera result metadata
>>
>> remain valid, with the entire context preserved. The 'source'
>> framebuffer is copied internally for streams other than internal
>> ones (since internal ones are allocated by the CameraStream class
>> itself), and the copy is passed along to the post-processor.
>>
>> Coming to preserving the context, a quick reference on how capture
>> results are sent to the Android framework: completed capture
>> results should be sent using the process_capture_result() callback,
>> and the order of sending them must match the order of the capture
>> requests (camera3_capture_request_t) that were submitted to the HAL
>> in the first place.
>>
>> In order to enforce the ordering, we need to maintain a queue. When
>> a stream requires post-processing, we save the associated capture
>> results (via Camera3RequestDescriptor) and add them to the queue.
>> All transient completed capture results are queued as well. When
>> the post-processing results are available, we can simply start
>> de-queuing the queued completed capture results and invoke the
>> process_capture_result() callback on each of them.
>>
>> The context-saving part is done by Camera3RequestDescriptor, which
>> is further extended with data structures to fulfill the demands of
>> async post-processing.
> To ease review, do you think it would be possible to split this patch
> into two, with a first part that starts making use of the signals but
> doesn't move the post-processor to a separate thread?


Ah, indeed, we can break it up by first emitting the signals in the
same thread, and then introduce the threading mechanism in a separate
patch. (Why didn't I think of this?)

>
>> Signed-off-by: Umang Jain <umang.jain@ideasonboard.com>
>> ---
>>   src/android/camera_device.cpp | 92 +++++++++++++++++++++++++++++++----
>>   src/android/camera_device.h   | 21 +++++++-
>>   src/android/camera_stream.cpp | 35 ++++++++++---
>>   src/android/camera_stream.h   | 40 +++++++++++++++
>>   4 files changed, 170 insertions(+), 18 deletions(-)
>>
>> diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
>> index fea59ce6..08237187 100644
>> --- a/src/android/camera_device.cpp
>> +++ b/src/android/camera_device.cpp
>> @@ -960,6 +960,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>>   		 * and add them to the Request if required.
>>   		 */
>>   		FrameBuffer *buffer = nullptr;
>> +		descriptor.srcInternalBuffer_ = nullptr;
>>   		switch (cameraStream->type()) {
>>   		case CameraStream::Type::Mapped:
>>   			/*
>> @@ -990,6 +991,7 @@ int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
>>   			 * once it has been processed.
>>   			 */
>>   			buffer = cameraStream->getBuffer();
>> +			descriptor.srcInternalBuffer_ = buffer;
>>   			LOG(HAL, Debug) << ss.str() << " (internal)";
>>   			break;
>>   		}
>> @@ -1149,25 +1151,95 @@ void CameraDevice::requestComplete(Request *request)
>>   			continue;
>>   		}
>>   
>> +		/*
>> +		 * Save the current context of the capture result and queue the
>> +		 * descriptor. The cameraStream will process asynchronously, so
>> +		 * we can only send the results back after the processing has
>> +		 * been completed.
>> +		 */
>> +		descriptor.status_ = Camera3RequestDescriptor::NOT_READY;
>> +		descriptor.resultMetadata_ = std::move(resultMetadata);
>> +		descriptor.captureResult_ = captureResult;
>> +
>> +		cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete);
> While signals can be connected and disconnected on demand, it often
> indicates a design issue. Is there a reason why you can't connect the
> signal when the cameraStream instance is created, and keep it connected
> for the whole lifetime of the stream ?


Yes, I can keep it connected for the whole lifetime of the stream too...

>
>>   		int ret = cameraStream->process(src, *buffer.buffer,
>>   						descriptor.settings_,
>> -						resultMetadata.get());
>> +						descriptor.resultMetadata_.get());
> You have a race condition here: if cameraStream->process() runs in a
> separate thread, it could complete and emit the processComplete signal
> before the code below gets executed.


Ouch, that's not the right order, you are correct. I should queue the
descriptor before process(). Good spot!

>
>> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
>> +			std::make_unique<Camera3RequestDescriptor>();
>> +		*reqDescriptor = std::move(descriptor);
>> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
>> +
>> +		return;
>> +	}
>> +
>> +	if (queuedDescriptor_.empty()) {
>> +		captureResult.result = resultMetadata->get();
>> +		callbacks_->process_capture_result(callbacks_, &captureResult);
>> +	} else {
>> +		/*
>> +		 * Previous capture results are waiting to be sent to the framework;
>> +		 * hence, queue the current capture results as well. After that,
>> +		 * check if any results are ready to be sent back to the
>> +		 * framework.
>> +		 */
>> +		descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS;
>> +		descriptor.resultMetadata_ = std::move(resultMetadata);
>> +		descriptor.captureResult_ = captureResult;
>> +		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
>> +			std::make_unique<Camera3RequestDescriptor>();
>> +		*reqDescriptor = std::move(descriptor);
>> +		queuedDescriptor_.push_back(std::move(reqDescriptor));
>> +
>> +		while (!queuedDescriptor_.empty()) {
>> +			std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front();
>> +			if (d->status_ != Camera3RequestDescriptor::NOT_READY) {
>> +				d->captureResult_.result = d->resultMetadata_->get();
>> +				callbacks_->process_capture_result(callbacks_, &(d->captureResult_));
>> +				queuedDescriptor_.pop_front();
>> +			} else {
>> +				break;
>> +			}
>> +		}
>> +	}
>> +}
>> +
>> +void CameraDevice::streamProcessingComplete(CameraStream *cameraStream,
>> +					    CameraStream::ProcessStatus status,
>> +					    CameraMetadata *resultMetadata)
>> +{
>> +	cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete);
>> +
>> +	for (auto &d : queuedDescriptor_) {
>> +		if (d->resultMetadata_.get() != resultMetadata)
>> +			continue;
>> +
>>   		/*
>>   		 * Return the FrameBuffer to the CameraStream now that we're
>>   		 * done processing it.
>>   		 */
>>   		if (cameraStream->type() == CameraStream::Type::Internal)
>> -			cameraStream->putBuffer(src);
>> -
>> -		if (ret) {
>> -			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
>> -			notifyError(descriptor.frameNumber_, buffer.stream,
>> -				    CAMERA3_MSG_ERROR_BUFFER);
>> +			cameraStream->putBuffer(d->srcInternalBuffer_);
>> +
>> +		if (status == CameraStream::ProcessStatus::Success) {
>> +			d->status_ = Camera3RequestDescriptor::READY_SUCCESS;
>> +		} else {
>> +			/* \todo this is clumsy error handling, be better. */
>> +			d->status_ = Camera3RequestDescriptor::READY_FAILED;
>> +			for (camera3_stream_buffer_t &buffer : d->buffers_) {
>> +				CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv);
>> +
>> +				if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB)
>> +					continue;
>> +
>> +				buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
>> +				notifyError(d->frameNumber_, buffer.stream,
>> +					    CAMERA3_MSG_ERROR_BUFFER);
>> +			}
>>   		}
>> -	}
>>   
>> -	captureResult.result = resultMetadata->get();
>> -	callbacks_->process_capture_result(callbacks_, &captureResult);
>> +		return;
>> +	}
>>   }
>>   
>>   std::string CameraDevice::logPrefix() const
>> diff --git a/src/android/camera_device.h b/src/android/camera_device.h
>> index dd9aebba..4e4bb970 100644
>> --- a/src/android/camera_device.h
>> +++ b/src/android/camera_device.h
>> @@ -7,6 +7,7 @@
>>   #ifndef __ANDROID_CAMERA_DEVICE_H__
>>   #define __ANDROID_CAMERA_DEVICE_H__
>>   
>> +#include <deque>
>>   #include <map>
>>   #include <memory>
>>   #include <mutex>
>> @@ -81,6 +82,20 @@ private:
>>   		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
>>   		CameraMetadata settings_;
>>   		std::unique_ptr<CaptureRequest> request_;
>> +
>> +		/*
>> +		 * Only set when a capture result needs to be queued before
>> +		 * completion via process_capture_result() callback.
>> +		 */
>> +		enum processingStatus {
>> +			NOT_READY,
>> +			READY_SUCCESS,
>> +			READY_FAILED,
>> +		};
>> +		std::unique_ptr<CameraMetadata> resultMetadata_;
>> +		camera3_capture_result_t captureResult_;
>> +		libcamera::FrameBuffer *srcInternalBuffer_;
>> +		processingStatus status_;
>>   	};
>>   
>>   	enum class State {
>> @@ -100,7 +115,9 @@ private:
>>   	int processControls(Camera3RequestDescriptor *descriptor);
>>   	std::unique_ptr<CameraMetadata> getResultMetadata(
>>   		const Camera3RequestDescriptor &descriptor) const;
>> -
>> +	void streamProcessingComplete(CameraStream *cameraStream,
>> +				      CameraStream::ProcessStatus status,
>> +				      CameraMetadata *resultMetadata);
>>   	unsigned int id_;
>>   	camera3_device_t camera3Device_;
>>   
>> @@ -128,6 +145,8 @@ private:
>>   	int orientation_;
>>   
>>   	CameraMetadata lastSettings_;
>> +
>> +	std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_;
>>   };
>>   
>>   #endif /* __ANDROID_CAMERA_DEVICE_H__ */
>> diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
>> index bdcc7cf9..d5981c0e 100644
>> --- a/src/android/camera_stream.cpp
>> +++ b/src/android/camera_stream.cpp
>> @@ -55,6 +55,7 @@ CameraStream::CameraStream(CameraDevice *const cameraDevice,
>>   		 * is what we instantiate here.
>>   		 */
>>   		postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);
>> +		ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get());
>>   	}
>>   
>>   	if (type == Type::Internal) {
>> @@ -108,22 +109,41 @@ int CameraStream::process(const libcamera::FrameBuffer *source,
>>   	if (!postProcessor_)
>>   		return 0;
>>   
>> -	/*
>> -	 * \todo Buffer mapping and processing should be moved to a
>> -	 * separate thread.
>> -	 */
>> -	CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE);
>> -	if (!dest.isValid()) {
>> +	/* \todo Should buffer mapping be moved to post-processor's thread? */
> Good point :-)

Umm, what does this indicate?

I prefer to leave the mapping as it is for now, with the \todo '?'
converted to a statement. We can decide on it later, and it can be done
on top?


>
>> +	dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE);
>> +	if (!dest_->isValid()) {
>>   		LOG(HAL, Error) << "Failed to map android blob buffer";
>>   		return -EINVAL;
>>   	}
>>   
>> -	return postProcessor_->process(source, &dest, requestMetadata, resultMetadata);
>> +	if (type() != CameraStream::Type::Internal) {
>> +		/*
>> +		 * The source buffer is owned by the Request object which can be
>> +		 * reused for future capture requests. Since the post-processor
>> +		 * will run asynchronously, we need to copy the source buffer and
>> +		 * use it as source data for the post-processor.
>> +		 */
>> +		srcPlanes_.clear();
>> +		for (const auto &plane : source->planes())
>> +			srcPlanes_.push_back(plane);
>> +
>> +		srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_);
>> +		source = srcFramebuffer_.get();
>> +	}
> This will break if the next frame needs to be post-processed before the
> current frame completes. You need a queue of frames to post-process, not
> just one. One option would be to store the data in
> Camera3RequestDescriptor instead, as that is the full context for a


Ah, yes, this is thin ice.

But exposing Camera3RequestDescriptor data, I am not sure... maybe we
can just expose the frame data /only/. Something I'll meditate upon...
but I agree with storing the data in Camera3RequestDescriptor.

Did you notice that I am /copying/ the frame buffer 'source'? :-) Does
the associated comment make sense to you? Asking because Kieran and I
just had quite a bit of discussion on this.

... In my head right now, I see that 'source' comes from
libcamera::Request in CameraDevice::requestComplete. So after the
signal handler returns, the libcamera::Request is disposed of or reused
(I don't know, I haven't seen the 'higher' context of what happens with
the request). In both cases, the 'source' framebuffer (used for
post-processing) might get invalidated and cause a segfault, hence the
reason to copy the framebuffer for post-processing purposes.

This is the context in my head right now, and I might be wrong. Does it
make sense? I'll try to look around for the 'higher' context of the
request, to see whether it's getting reused or something else.

> request, including all its frames. Some refactoring would be required,
> as the structure is currently private to CameraDevice.
>
>> +
>> +	ppWorker_->start();
>> +
>> +	return postProcessor_->invokeMethod(&PostProcessor::process,
>> +					    ConnectionTypeQueued,
>> +					    source, dest_.get(),
>> +					    requestMetadata, resultMetadata);
> Is the CameraStream actually the right place to spawn a thread and
> dispatch processing jobs ? Brainstorming a bit, I wonder if this
> shouldn't be moved to the PostProcessor class instead, as the need for a
> thread depends on the type of post-processor. The current software-based
> post-processors definitely need a thread, but a post-processor that
> would use a hardware-accelerated JPEG encoder would be asynchronous by
> nature (posting jobs to the device and being notified of their
> completion through callbacks) and may not need to be wrapped in a
> thread. It would actually also depend on how the hardware JPEG encoder


Yes, this bit needs discussion, but you certainly have more
forward-looking ideas than me. The way I envisioned it, we have 'n'
post-processors in the future, and CameraStream is the one that creates
the post-processor instances.

For example, currently we have in the CameraStream constructor:

	postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);

In a future with multiple post-processors, I see that CameraStream
decides which postProcessor_ it needs, depending on the context.
Speaking from a high level, it feels like CameraStream is the one that
creates and manages the post-processor objects. Hence, it should be the
one to decide whether to run the post-processor asynchronously or
synchronously, and the post-processor implementation should just care
about the post-processing-specific bits.
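
Roughly, as a hypothetical sketch of that selection in the CameraStream
constructor:

	/*
	 * Hypothetical: CameraStream picks the post-processor and
	 * decides whether it needs its own worker thread.
	 */
	bool needsThread = false;

	if (camera3Stream().format == HAL_PIXEL_FORMAT_BLOB) {
		postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);
		needsThread = true;	/* software encoder */
	}

	if (postProcessor_ && needsThread)
		ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get());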

So the hardware-accelerated JPEG encoder you mentioned kind of fits
into this design, i.e. letting CameraStream decide if a post-processor
needs to run in a separate thread or not. As the h/w accelerated
encoder is async by nature, the CameraStream shouldn't use a thread for
this post-processor.

There can also be multiple post-processors for a single CameraStream
instance, right? One chaining into another... I would need to think
about that too, I guess.

Anyway, this is my vision while writing the series. As you can already
see, it's limited, and I am open to suggestions / experimenting with
new design ideas.


> would be interfaced. If the post-processor were to deal with the kernel
> device directly, it would likely need an event loop, which we're missing
> in the HAL implementation. A thread would provide that event loop, but
> it could be best to use a single thread for all event-driven
> post-processors instead of one per post-processor. This is largely
> theoretical though, we don't have hardware-backed post-processors today.
>
>>   }
>>   
>>   void CameraStream::handlePostProcessing(PostProcessor::Status status,
>>   					CameraMetadata *resultMetadata)
>>   {
>> +	ppWorker_->exit();
> Threads won't consume many resources when they're idle, but starting and
> stopping a thread is a costly operation. It would be better to start the
> thread when starting the camera capture session.

Ok, noted.

Also, I am sure I had stop() here, but it seems a bad rebase/cherry-pick
fiasco let the exit() creep in (from earlier implementations).

>
>> +
>>   	switch (status) {
>>   	case PostProcessor::Status::Success:
>>   		processComplete.emit(this, ProcessStatus::Success, resultMetadata);
>> @@ -134,6 +154,7 @@ void CameraStream::handlePostProcessing(PostProcessor::Status status,
>>   	default:
>>   		LOG(HAL, Error) << "PostProcessor status invalid";
>>   	}
>> +	srcFramebuffer_.reset();
>>   }
>>   
>>   FrameBuffer *CameraStream::getBuffer()
>> diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
>> index d54d3f58..567d896f 100644
>> --- a/src/android/camera_stream.h
>> +++ b/src/android/camera_stream.h
>> @@ -13,7 +13,9 @@
>>   
>>   #include <hardware/camera3.h>
>>   
>> +#include <libcamera/base/object.h>
>>   #include <libcamera/base/signal.h>
>> +#include <libcamera/base/thread.h>
>>   
>>   #include <libcamera/camera.h>
>>   #include <libcamera/framebuffer.h>
>> @@ -23,6 +25,7 @@
>>   
>>   #include "post_processor.h"
>>   
>> +class CameraBuffer;
>>   class CameraDevice;
>>   class CameraMetadata;
>>   
>> @@ -135,6 +138,38 @@ public:
>>   	libcamera::Signal<CameraStream *, ProcessStatus, CameraMetadata *> processComplete;
>>   
>>   private:
>> +	class PostProcessorWorker : public libcamera::Thread
>> +	{
>> +	public:
>> +		PostProcessorWorker(PostProcessor *postProcessor)
>> +		{
>> +			postProcessor->moveToThread(this);
>> +		}
> Assuming we want to keep the thread inside the CameraStream, I think
> this needs to be reworked a little bit. The post-processor, in this
> case, isn't thread-aware: it gets run in a thread but doesn't really
> know much about that fact. However, in patch 1/5, you make PostProcessor
> inherit from Object to enable the invokeMethod() call. There's a design
> conflict between these two facts.
>
> I would propose instead keeping the PostProcessor unaware of threads,
> without inheriting from Object, and making PostProcessorWorker the class
> that inherits from Object and is moved to the thread. The thread could
> stay in PostProcessorWorker by inheriting from both Object and Thread,
> but that's a bit confusing, so I'd recommend adding a libcamera::Thread
> thread_ member variable to CameraStream for the thread. This is the
> design used by the IPA proxies; you can look at the generated code to


Ok, I have noted this down but won't comment right now. I think I need
to read the code bits again and visualize the flow first.

> see how it's implemented (it's more readable than the templates). The
> worker would have a process() function, called from
> CameraStream::process() using invokeMethod(), and that worker's
> process() function would simply call PostProcessor::process() directly.
>
>> +
>> +		void start()
>> +		{
>> +			/*
>> +			 * \todo [BLOCKER] !!!
>> +			 * On second shutter capture, this fails to start the
>> +			 * thread (again). No errors for the first shutter capture.
>> +			 */
>> +			Thread::start();
>> +		}
> Why do you reimplement start(), if you only call the same function in
> the parent class ?

Oh, hmm :-S

I think I had moveToThread() here for the post-processor, but it changed
over iterations and I failed to clean up the ruins.

>
>> +
>> +		void stop()
>> +		{
>> +			exit();
>> +			wait();
>> +		}
> This is never called.
>
>> +
>> +	protected:
>> +		void run() override
>> +		{
>> +			exec();
>> +			dispatchMessages();
>> +		}
>> +	};
>> +
>>   	void handlePostProcessing(PostProcessor::Status status,
>>   				  CameraMetadata *resultMetadata);
>>   
>> @@ -152,6 +187,11 @@ private:
>>   	 */
>>   	std::unique_ptr<std::mutex> mutex_;
>>   	std::unique_ptr<PostProcessor> postProcessor_;
>> +	std::unique_ptr<PostProcessorWorker> ppWorker_;
>> +
>> +	std::unique_ptr<CameraBuffer> dest_;
>> +	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;
>> +	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
>>   };
>>   
>>   #endif /* __ANDROID_CAMERA_STREAM__ */

Patch

diff --git a/src/android/camera_device.cpp b/src/android/camera_device.cpp
index fea59ce6..08237187 100644
--- a/src/android/camera_device.cpp
+++ b/src/android/camera_device.cpp
@@ -960,6 +960,7 @@  int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 		 * and add them to the Request if required.
 		 */
 		FrameBuffer *buffer = nullptr;
+		descriptor.srcInternalBuffer_ = nullptr;
 		switch (cameraStream->type()) {
 		case CameraStream::Type::Mapped:
 			/*
@@ -990,6 +991,7 @@  int CameraDevice::processCaptureRequest(camera3_capture_request_t *camera3Reques
 			 * once it has been processed.
 			 */
 			buffer = cameraStream->getBuffer();
+			descriptor.srcInternalBuffer_ = buffer;
 			LOG(HAL, Debug) << ss.str() << " (internal)";
 			break;
 		}
@@ -1149,25 +1151,95 @@  void CameraDevice::requestComplete(Request *request)
 			continue;
 		}
 
+		/*
+		 * Save the current context of the capture result and queue the
+		 * descriptor. The cameraStream will process asynchronously, so
+		 * we can only send the results back after the processing has
+		 * been completed.
+		 */
+		descriptor.status_ = Camera3RequestDescriptor::NOT_READY;
+		descriptor.resultMetadata_ = std::move(resultMetadata);
+		descriptor.captureResult_ = captureResult;
+
+		cameraStream->processComplete.connect(this, &CameraDevice::streamProcessingComplete);
 		int ret = cameraStream->process(src, *buffer.buffer,
 						descriptor.settings_,
-						resultMetadata.get());
+						descriptor.resultMetadata_.get());
+		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
+			std::make_unique<Camera3RequestDescriptor>();
+		*reqDescriptor = std::move(descriptor);
+		queuedDescriptor_.push_back(std::move(reqDescriptor));
+
+		return;
+	}
+
+	if (queuedDescriptor_.empty()) {
+		captureResult.result = resultMetadata->get();
+		callbacks_->process_capture_result(callbacks_, &captureResult);
+	} else {
+		/*
+		 * Previous capture results are waiting to be sent to the framework;
+		 * hence, queue the current capture results as well. After that,
+		 * check if any results are ready to be sent back to the
+		 * framework.
+		 */
+		descriptor.status_ = Camera3RequestDescriptor::READY_SUCCESS;
+		descriptor.resultMetadata_ = std::move(resultMetadata);
+		descriptor.captureResult_ = captureResult;
+		std::unique_ptr<Camera3RequestDescriptor> reqDescriptor =
+			std::make_unique<Camera3RequestDescriptor>();
+		*reqDescriptor = std::move(descriptor);
+		queuedDescriptor_.push_back(std::move(reqDescriptor));
+
+		while (!queuedDescriptor_.empty()) {
+			std::unique_ptr<Camera3RequestDescriptor> &d = queuedDescriptor_.front();
+			if (d->status_ != Camera3RequestDescriptor::NOT_READY) {
+				d->captureResult_.result = d->resultMetadata_->get();
+				callbacks_->process_capture_result(callbacks_, &(d->captureResult_));
+				queuedDescriptor_.pop_front();
+			} else {
+				break;
+			}
+		}
+	}
+}
+
+void CameraDevice::streamProcessingComplete(CameraStream *cameraStream,
+					    CameraStream::ProcessStatus status,
+					    CameraMetadata *resultMetadata)
+{
+	cameraStream->processComplete.disconnect(this, &CameraDevice::streamProcessingComplete);
+
+	for (auto &d : queuedDescriptor_) {
+		if (d->resultMetadata_.get() != resultMetadata)
+			continue;
+
 		/*
 		 * Return the FrameBuffer to the CameraStream now that we're
 		 * done processing it.
 		 */
 		if (cameraStream->type() == CameraStream::Type::Internal)
-			cameraStream->putBuffer(src);
-
-		if (ret) {
-			buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
-			notifyError(descriptor.frameNumber_, buffer.stream,
-				    CAMERA3_MSG_ERROR_BUFFER);
+			cameraStream->putBuffer(d->srcInternalBuffer_);
+
+		if (status == CameraStream::ProcessStatus::Success) {
+			d->status_ = Camera3RequestDescriptor::READY_SUCCESS;
+		} else {
+			/* \todo this is clumsy error handling, be better. */
+			d->status_ = Camera3RequestDescriptor::READY_FAILED;
+			for (camera3_stream_buffer_t &buffer : d->buffers_) {
+				CameraStream *cs = static_cast<CameraStream *>(buffer.stream->priv);
+
+				if (cs->camera3Stream().format != HAL_PIXEL_FORMAT_BLOB)
+					continue;
+
+				buffer.status = CAMERA3_BUFFER_STATUS_ERROR;
+				notifyError(d->frameNumber_, buffer.stream,
+					    CAMERA3_MSG_ERROR_BUFFER);
+			}
 		}
-	}
 
-	captureResult.result = resultMetadata->get();
-	callbacks_->process_capture_result(callbacks_, &captureResult);
+		return;
+	}
 }
 
 std::string CameraDevice::logPrefix() const
diff --git a/src/android/camera_device.h b/src/android/camera_device.h
index dd9aebba..4e4bb970 100644
--- a/src/android/camera_device.h
+++ b/src/android/camera_device.h
@@ -7,6 +7,7 @@ 
 #ifndef __ANDROID_CAMERA_DEVICE_H__
 #define __ANDROID_CAMERA_DEVICE_H__
 
+#include <deque>
 #include <map>
 #include <memory>
 #include <mutex>
@@ -81,6 +82,20 @@  private:
 		std::vector<std::unique_ptr<libcamera::FrameBuffer>> frameBuffers_;
 		CameraMetadata settings_;
 		std::unique_ptr<CaptureRequest> request_;
+
+		/*
+		 * Only set when a capture result needs to be queued before
+		 * completion via process_capture_result() callback.
+		 */
+		enum processingStatus {
+			NOT_READY,
+			READY_SUCCESS,
+			READY_FAILED,
+		};
+		std::unique_ptr<CameraMetadata> resultMetadata_;
+		camera3_capture_result_t captureResult_;
+		libcamera::FrameBuffer *srcInternalBuffer_;
+		processingStatus status_;
 	};
 
 	enum class State {
@@ -100,7 +115,9 @@  private:
 	int processControls(Camera3RequestDescriptor *descriptor);
 	std::unique_ptr<CameraMetadata> getResultMetadata(
 		const Camera3RequestDescriptor &descriptor) const;
-
+	void streamProcessingComplete(CameraStream *cameraStream,
+				      CameraStream::ProcessStatus status,
+				      CameraMetadata *resultMetadata);
 	unsigned int id_;
 	camera3_device_t camera3Device_;
 
@@ -128,6 +145,8 @@  private:
 	int orientation_;
 
 	CameraMetadata lastSettings_;
+
+	std::deque<std::unique_ptr<Camera3RequestDescriptor>> queuedDescriptor_;
 };
 
 #endif /* __ANDROID_CAMERA_DEVICE_H__ */
diff --git a/src/android/camera_stream.cpp b/src/android/camera_stream.cpp
index bdcc7cf9..d5981c0e 100644
--- a/src/android/camera_stream.cpp
+++ b/src/android/camera_stream.cpp
@@ -55,6 +55,7 @@  CameraStream::CameraStream(CameraDevice *const cameraDevice,
 		 * is what we instantiate here.
 		 */
 		postProcessor_ = std::make_unique<PostProcessorJpeg>(cameraDevice_);
+		ppWorker_ = std::make_unique<PostProcessorWorker>(postProcessor_.get());
 	}
 
 	if (type == Type::Internal) {
@@ -108,22 +109,41 @@  int CameraStream::process(const libcamera::FrameBuffer *source,
 	if (!postProcessor_)
 		return 0;
 
-	/*
-	 * \todo Buffer mapping and processing should be moved to a
-	 * separate thread.
-	 */
-	CameraBuffer dest(camera3Dest, PROT_READ | PROT_WRITE);
-	if (!dest.isValid()) {
+	/* \todo Should buffer mapping be moved to post-processor's thread? */
+	dest_ = std::make_unique<CameraBuffer>(camera3Dest, PROT_READ | PROT_WRITE);
+	if (!dest_->isValid()) {
 		LOG(HAL, Error) << "Failed to map android blob buffer";
 		return -EINVAL;
 	}
 
-	return postProcessor_->process(source, &dest, requestMetadata, resultMetadata);
+	if (type() != CameraStream::Type::Internal) {
+		/*
+		 * The source buffer is owned by the Request object which can be
+		 * reused for future capture requests. Since the post-processor
+		 * will run asynchronously, we need to copy the source buffer and
+		 * use it as source data for the post-processor.
+		 */
+		srcPlanes_.clear();
+		for (const auto &plane : source->planes())
+			srcPlanes_.push_back(plane);
+
+		srcFramebuffer_ = std::make_unique<libcamera::FrameBuffer>(srcPlanes_);
+		source = srcFramebuffer_.get();
+	}
+
+	ppWorker_->start();
+
+	return postProcessor_->invokeMethod(&PostProcessor::process,
+					    ConnectionTypeQueued,
+					    source, dest_.get(),
+					    requestMetadata, resultMetadata);
 }
 
 void CameraStream::handlePostProcessing(PostProcessor::Status status,
 					CameraMetadata *resultMetadata)
 {
+	ppWorker_->exit();
+
 	switch (status) {
 	case PostProcessor::Status::Success:
 		processComplete.emit(this, ProcessStatus::Success, resultMetadata);
@@ -134,6 +154,7 @@  void CameraStream::handlePostProcessing(PostProcessor::Status status,
 	default:
 		LOG(HAL, Error) << "PostProcessor status invalid";
 	}
+	srcFramebuffer_.reset();
 }
 
 FrameBuffer *CameraStream::getBuffer()
diff --git a/src/android/camera_stream.h b/src/android/camera_stream.h
index d54d3f58..567d896f 100644
--- a/src/android/camera_stream.h
+++ b/src/android/camera_stream.h
@@ -13,7 +13,9 @@ 
 
 #include <hardware/camera3.h>
 
+#include <libcamera/base/object.h>
 #include <libcamera/base/signal.h>
+#include <libcamera/base/thread.h>
 
 #include <libcamera/camera.h>
 #include <libcamera/framebuffer.h>
@@ -23,6 +25,7 @@ 
 
 #include "post_processor.h"
 
+class CameraBuffer;
 class CameraDevice;
 class CameraMetadata;
 
@@ -135,6 +138,38 @@  public:
 	libcamera::Signal<CameraStream *, ProcessStatus, CameraMetadata *> processComplete;
 
 private:
+	class PostProcessorWorker : public libcamera::Thread
+	{
+	public:
+		PostProcessorWorker(PostProcessor *postProcessor)
+		{
+			postProcessor->moveToThread(this);
+		}
+
+		void start()
+		{
+			/*
+			 * \todo [BLOCKER] !!!
+			 * On second shutter capture, this fails to start the
+			 * thread (again). No errors for the first shutter capture.
+			 */
+			Thread::start();
+		}
+
+		void stop()
+		{
+			exit();
+			wait();
+		}
+
+	protected:
+		void run() override
+		{
+			exec();
+			dispatchMessages();
+		}
+	};
+
 	void handlePostProcessing(PostProcessor::Status status,
 				  CameraMetadata *resultMetadata);
 
@@ -152,6 +187,11 @@  private:
 	 */
 	std::unique_ptr<std::mutex> mutex_;
 	std::unique_ptr<PostProcessor> postProcessor_;
+	std::unique_ptr<PostProcessorWorker> ppWorker_;
+
+	std::unique_ptr<CameraBuffer> dest_;
+	std::unique_ptr<libcamera::FrameBuffer> srcFramebuffer_;
+	std::vector<libcamera::FrameBuffer::Plane> srcPlanes_;
 };
 
 #endif /* __ANDROID_CAMERA_STREAM__ */