[libcamera-devel,0/1] Proposal of mapping between camera configurations and requested configurations

Message ID 20200806061706.1025788-1-hiroh@chromium.org

Message

Hirokazu Honda Aug. 6, 2020, 6:17 a.m. UTC
This is a proposal about how to map camera configurations to requested
configurations in the Android Camera HAL adaptation layer. Please also see
the sample code in the following patch.

# Software Stream Processing in libcamera

_hiroh@chromium.org / Draft: 2020-08-06_


# Objective

Perform frame processing in libcamera to achieve requested stream configurations that are not supported natively by the camera hardware, but required by the Android Camera HAL interface.


# Background


### Libcamera

In addition to its native API, libcamera[^1] provides a number of camera APIs, for example, the V4L2 Webcam API and Android Camera HAL3. The platform-specific implementations are wrapped inside the libcamera core, so a caller of libcamera doesn't have to take care of platform details.


### Android Camera HAL

The Chrome OS camera stack uses the Android Camera HAL[^2] interface. Libcamera provides an adaptation layer[^3] between the libcamera core and the Android HAL, which is called the Android HAL adaptation layer in this document.

To present a uniform set of capabilities to API users, the Android Camera HAL API[^4] allows callers to request stream configurations that are beyond the device capabilities. For example, even when a camera device can only produce a single stream, a HAL caller may request three streams with possibly different resolutions (PRIV, YUV, JPEG). However, the libcamera core implementation only produces streams the camera is natively capable of. Therefore, we have to create the three streams from the single stream produced by libcamera.

Requests beyond the device capabilities are supported only by the Android HAL at the moment. This document describes a design in which the stream processing is performed in the Android HAL adaptation layer.


# Overview


## Current implementation

The requested stream configuration is given by _camera3_device_t->ops->configure_streams()_ in the Android Camera HAL. This delegates to CameraDevice::configureStreams()[^5] in libcamera. The current implementation attempts all the given configurations and succeeds if and only if the camera device can produce them without any adjustments.


### libcamera::CameraConfiguration

It is CameraConfiguration[^6] that judges whether adjustments are required, or whether the requested configurations are infeasible altogether.

The configuration procedure is as follows: CameraDevice



1. Adds every requested configuration with CameraConfiguration::addConfiguration().
2. Validates, and possibly adjusts, the added configurations with CameraConfiguration::validate().

CameraConfiguration, and validate() in particular, is implemented per pipeline. For instance, the CameraConfiguration implementation for IPU3 is IPU3CameraConfiguration[^7].

validate() returns one of the following:



*   Valid
    *   The camera can produce streams with the requested configurations.
*   Adjusted
    *   The camera cannot produce streams with the requested configurations as-is, but can produce streams with different pixel formats or resolutions.
*   Invalid
    *   The camera cannot produce streams with the requested configurations, even with adjusted pixel formats or resolutions. For instance, this is returned when a resolution larger than the maximum supported one is requested.

What we need to resolve is, when Adjusted is returned, how to map the adjusted camera streams to the requested streams and determine the required processing.
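For reference, below is a minimal sketch of how the adaptation layer could drive this API. It is illustrative only, not the actual CameraDevice::configureStreams() implementation: error handling and the Android stream bookkeeping are omitted, and the stream roles passed to generateConfiguration() are left empty for brevity.

```cpp
#include <libcamera/camera.h>
#include <libcamera/stream.h>

#include <cerrno>
#include <memory>
#include <vector>

using namespace libcamera;

/* Illustrative sketch only; not the actual HAL implementation. */
int configureCamera(Camera *camera,
                    const std::vector<StreamConfiguration> &requested)
{
    /* Start from an empty configuration (the real HAL passes stream roles). */
    std::unique_ptr<CameraConfiguration> config =
        camera->generateConfiguration({});

    /* 1. Add every requested configuration. */
    for (const StreamConfiguration &cfg : requested)
        config->addConfiguration(cfg);

    /* 2. Let the pipeline-specific validate() adjust or reject them. */
    switch (config->validate()) {
    case CameraConfiguration::Valid:
        break;
    case CameraConfiguration::Adjusted:
        /* This is where the mapping described in this document is needed. */
        break;
    case CameraConfiguration::Invalid:
        return -EINVAL;
    }

    return camera->configure(config.get());
}
```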


## Stream processing

The processing operations to consider are the following.



*   Down-scaling
    *   We don't perform up-scaling because it degrades stream quality.
    *   Down-scaling is only performed within the same aspect ratio, to avoid producing distorted frames. For instance, scaling from 1280x720 (16:9) to 480x360 (4:3) is not allowed.
*   Cropping
    *   Cropping is executed only to change the aspect ratio, so it must be done after down-scaling when both are required. For example, to convert 1280x720 to 480x360, first down-scale to 640x360 and then crop to 480x360 (see the sketch after this list).
*   Format conversion
    *   Pixel format conversion
    *   JPEG encoding
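To make the scale-then-crop order concrete, here is a small, self-contained sketch of the geometry. The Size struct and helper are my own, for illustration only; they are not libcamera code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>

struct Size {
    unsigned int width;
    unsigned int height;
};

/*
 * Compute the intermediate size of a scale-then-crop conversion:
 * down-scale within the source aspect ratio until both target
 * dimensions are covered; the final crop then removes the excess.
 */
Size scaleThenCrop(const Size &src, const Size &dst)
{
    double factor = std::max(static_cast<double>(dst.width) / src.width,
                             static_cast<double>(dst.height) / src.height);
    assert(factor <= 1.0); /* Up-scaling is not allowed. */

    return { static_cast<unsigned int>(src.width * factor),
             static_cast<unsigned int>(src.height * factor) };
}

int main()
{
    /* 1280x720 (16:9) -> down-scale to 640x360 -> crop to 480x360 (4:3). */
    Size scaled = scaleThenCrop({ 1280, 720 }, { 480, 360 });
    std::printf("%ux%u\n", scaled.width, scaled.height); /* prints 640x360 */
    return 0;
}
```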


# Proposal

Basically, we only need to consider a mapping algorithm that runs after validate(). However, to reduce processing and obtain better stream quality, we should also reorder the given configurations within validate().

I first describe how to map after validate(), and then how to reorder within validate().


## How to map after validate()

For each requested stream, we try to find the best-fit camera stream as follows (a rough code sketch follows the list).



1. Filter out camera streams whose resolution is smaller than the requested one.
2. Prioritize smaller resolutions to reduce the number of processed pixels.
3. If there is a stream with the same aspect ratio and the same format as the requested one, select it.
4. Otherwise, select one of the streams with the same aspect ratio. If there is none, select one of the streams with the same format.
5. If there is neither a same-ratio nor a same-format stream, select any convertible stream.
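A rough sketch of this selection, written against libcamera's StreamConfiguration for illustration. The helper name and the scoring are my assumptions, not existing API.

```cpp
#include <libcamera/stream.h>

#include <cstdint>
#include <optional>
#include <vector>

using namespace libcamera;

/* Hypothetical helper: true if the two sizes share the same aspect ratio. */
static bool sameRatio(const Size &a, const Size &b)
{
    return static_cast<uint64_t>(a.width) * b.height ==
           static_cast<uint64_t>(b.width) * a.height;
}

/*
 * Illustrative best-fit search following steps 1-5 above. cameraCfgs are
 * the configurations accepted by validate(); request is one Android stream.
 */
std::optional<StreamConfiguration>
findBestFit(const std::vector<StreamConfiguration> &cameraCfgs,
            const StreamConfiguration &request)
{
    std::optional<StreamConfiguration> best;
    int bestScore = -1;

    for (const StreamConfiguration &cfg : cameraCfgs) {
        /* 1. Skip sources smaller than the request: no up-scaling. */
        if (cfg.size.width < request.size.width ||
            cfg.size.height < request.size.height)
            continue;

        /* 3.-5. Same ratio and format > same ratio > same format > other. */
        int score = 0;
        if (sameRatio(cfg.size, request.size))
            score += 2;
        if (cfg.pixelFormat == request.pixelFormat)
            score += 1;

        /* 2. Among equally good candidates, prefer fewer pixels. */
        bool smaller = best && cfg.size.width * cfg.size.height <
                               best->size.width * best->size.height;
        if (score > bestScore || (score == bestScore && smaller)) {
            best = cfg;
            bestScore = score;
        }
    }

    return best;
}
```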

The required processing operations are the following (a small sketch deriving them follows the list).


*   No-op [Same resolution and same format]
*   Scale [Same ratio and same format, but different resolution]
*   Pixel format conversion [Same resolution but different format]
*   Scale and Pixel format conversion [Same ratio but different format and resolution]
*   Scale and Crop [Same format, but different ratio]
*   Scale, Crop and Pixel format conversion [Different ratio and format]
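Purely as an illustration, the processing set could be derived mechanically from a selected mapping. Field names follow libcamera's StreamConfiguration; the function itself is hypothetical.

```cpp
#include <libcamera/stream.h>

#include <cstdint>
#include <string>

using namespace libcamera;

/* Hypothetical helper deriving the processing chain for one mapping. */
std::string requiredProcessing(const StreamConfiguration &camera,
                               const StreamConfiguration &request)
{
    std::string ops;

    if (camera.size.width != request.size.width ||
        camera.size.height != request.size.height)
        ops += "scale ";
    if (static_cast<uint64_t>(camera.size.width) * request.size.height !=
        static_cast<uint64_t>(request.size.width) * camera.size.height)
        ops += "crop ";
    if (camera.pixelFormat != request.pixelFormat)
        ops += "convert ";

    return ops.empty() ? "no-op" : ops;
}
```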


## How to sort within validate()

Since up-scaling is not allowed in the proposed mapping, it is important that the larger resolutions are configured; otherwise, configureStreams() fails. Besides, producing as many different aspect ratios as possible is preferable, so that we don't have to discard captured content by cropping.

This could be done in the Android HAL adaptation layer; however, I would do it within validate(), because the above sorting strategy is general and each configuration class has more information with which to prioritize the requested configurations.
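As a sketch of what such a pass inside validate() might look like (illustrative only; a real implementation would also need to track the original ordering so the HAL can map the results back to the requested streams):

```cpp
#include <libcamera/camera.h>
#include <libcamera/stream.h>

#include <algorithm>

using namespace libcamera;

/*
 * Illustrative only: order the stream configurations by decreasing
 * resolution, so that streams which cannot be derived by down-scaling
 * another stream are assigned to native hardware paths first.
 */
void sortByResolution(CameraConfiguration &config)
{
    std::sort(config.begin(), config.end(),
              [](const StreamConfiguration &a, const StreamConfiguration &b) {
                  return a.size.width * a.size.height >
                         b.size.width * b.size.height;
              });
}
```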

[^1]:
     https://libcamera.org/index.html

[^2]:
     https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/camera3.h

[^3]:
     https://git.linuxtv.org/libcamera.git/tree/src/android

[^4]:
     https://developer.android.com/reference/android/hardware/camera2/CameraDevice#regular-capture

[^5]:
     https://git.linuxtv.org/libcamera.git/tree/src/android/camera_device.cpp#n934

[^6]:
     https://git.linuxtv.org/libcamera.git/tree/include/libcamera/camera.h#n28

[^7]:
     https://git.linuxtv.org/libcamera.git/tree/src/libcamera/pipeline/ipu3/ipu3.cpp#n62

--
2.28.0.236.gb10cc79966-goog

Comments

Hirokazu Honda Aug. 13, 2020, 12:51 p.m. UTC | #1
gentle ping


On Thu, Aug 6, 2020 at 3:17 PM Hirokazu Honda <hiroh@chromium.org> wrote:
> [original proposal quoted in full; snipped]
Tomasz Figa Aug. 13, 2020, 12:55 p.m. UTC | #2
+Laurent Pinchart +Kieran Bingham +jacopo mondi +Niklas Söderlund as
the owners of the components affected.

On Thu, Aug 13, 2020 at 2:52 PM Hirokazu Honda <hiroh@chromium.org> wrote:
> [quoted text snipped]
Laurent Pinchart Aug. 13, 2020, 12:58 p.m. UTC | #3
On Thu, Aug 13, 2020 at 02:55:38PM +0200, Tomasz Figa wrote:
> +Laurent Pinchart +Kieran Bingham +jacopo mondi +Niklas Söderlund as
> the owners of the components affected.

It's on my todo list, I'll review this early next week. Sorry for the
delay.

> [quoted text snipped]
Jacopo Mondi Sept. 1, 2020, 4:09 p.m. UTC | #4
Hi Hiro,
   first of all, I'm very sorry for the unacceptable delay in giving
you a reply.

If that's of any consolation, we have not ignored your email, but it
has gone through several internal discussions, as it came at the
time when the JPEG support was being merged and the two things
collided a bit. Add a small delay due to leaves, and here you have a
month of delay. Again, we're really sorry for this.

> On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> This is a proposal about how to map camera configurations and
> requested configurations in Android Camera HAL adaptation layer.
> Please also see the sample code in the following patch.
>
> # Software Stream Processing in libcamera
>
> _hiroh@chromium.org / Draft: 2020-08-06_
>
>

As an initial and unrelated note looking at the patch, I can see you
are following the ChromeOS coding style. Please note that libcamera
has its own coding style, which you can find documented at

- https://www.libcamera.org/coding-style.html#coding-style-guidelines

And we have a style checker, which can assist with this. The best way to
use the style checker is to install it as a git-hook.

I understand that this is an RFC, but we will need this style to be
followed to be able to integrate any future patches.

>
> # Objective
>
> Perform frame processing in libcamera to achieve requested stream
> configurations that are not supported natively by the camera
> hardware, but required by the Android Camera HAL interface.
>

As you can see in the camera_device.cpp file, we have tried to list the
resolutions and image formats that the Android Camera3 specification
lists as mandatory or suggested.

Do you have a list of additional requirements to add?
Are there ChromeOS-specific requirements?
Or is this meant to fulfill the above-stated requirements on
platforms that cannot satisfy them?

>
> # Background
>
>
> ### Libcamera
>
> In addition to its native API, libcamera[^1] provides a number of
> camera APIs, for example, V4L2 Webcam API and Android Camera HAL3.
> The platform specific implementations are wrapped in libcamera core
> and a caller of libcamera doesn’t have to take care the platform.
>
>
> ### Android Camera HAL
>
> Chrome OS camera stack uses Android Camera HAL[^2] interface.
> Libcamera provides Android Camera HAL with an adaptation layer[^3]
> between libcamera core part and Android HAL, which is called
> Android HAL adaptation layer in this document.
>
> To present a uniform set of capabilities to the API users, Android
> Camera HAL API[^4] allows caller to request stream configurations
> that are beyond the device capabilities. For example, while a
> camera device is able to produce a single stream, a HAL caller
> requests three possibly different resolution streams (PRIV, YUV,
> JPEG). However, libcamera core implementation produces
> camera-capable streams. Therefore, we have to create three streams
> from the single stream produced by libcamera.
>
> Requests beyond the device capability is supported only in Android
> HAL at this moment. I describe the design in this document that the
> stream processing is performed in Android HAL adaptation layer.
>
>
> # Overview
>
>
> ## Current implementation
>
> The requested stream configuration is given by
> _camera3_device_t->ops->configure_streams()_ in Android Camera HAL.
> This delegates CameraDevice::configureStreams()[^5] in libcamera.
> The current implementation attempts all the given configurations
> and succeeds if and only if the camera device can produces them
> without any adjustments.
>
>
> ### libcamera::CameraConfiguration
>
> It is CameraConfiguration[^6] that judges whether adjustments are
> required, or even requested configurations are infeasible.
>
> The procedure of configuration is that CameraDevice
>
>
>
> 1. Adds every configuration by
> CameraConfiguration::addConfiguration(). 2. Assorts the added
> configurations by CameraConfiguration::validate().
>
> CameraConfiguration, especially for validate(), is implemented per
> pipeline. For instance, the CameraConfiguration implementation for
> IPU3 is IPU3CameraConfiguration[^7].
>
> validate() returns one of the below,
>
>
>
> *   Valid *    A camera can produce streams with requested
> configurations. *   Adjusted *   A camera cannot produce streams
> with requested configurations as-is, but can produce streams with
> different pixel formats or resolutions. *   Invalid *   A camera
> cannot produce streams with either requested configurations or
> different pixel formats and resolutions. For instance, this is
> returned when the larger resolution is requested than the maximum
> supported one?
>
> What we need to resolve is, when Adjusted is returned, to map
> adjusted camera streams to requested camera streams and required
> processing.
>
>
> ## Stream processing
>
> The processing to be thought of are followings.
>
>
>
> *   Down-scaling *   We don’t perform up-scaling because it affects
> stream qualities *   Down-scaling is allowed for the same ratio to
> avoid producing distorted frames. For instance, scaling from
> 1280x720 (16:9) to 480x360 (4:3) is not allowed. *   Cropping *
> Cropping is executed only to change the frame ratio. Thus it must
> be done after down-scaling if required. For example, to convert
> 1280x720 to 480x360, first down-scale to 640x360 and then crop to
> 480x360.
>
> *   Format conversion *   Pixel format conversion *   JPEG
> encoding
>
>
> # Proposal
>
> Basically we only need to consider a mapping algorithm after
> validate(). However, to obtain less processing and better stream
> qualities, we should reorder given configurations within
> validate().

>

The way the HAL layer works (and I agree something has changed since
the recent merge of the JPEG support) is slightly more complex, and
boils down to the following steps.

1) Build the list of supported configurations

When a CameraDevice is initialized, a list of supported stream
configurations is built, in order to be able to report to Android
what it can ask for. See CameraDevice::initializeStreamConfigurations().

We currently report the libcamera::Camera supported formats and
sizes, plus additional JPEG streams which are produced in the HAL.
This creates the first distinction between HAL-only-streams and
libcamera-streams, which you correctly identified in your summary.

Here, as we do (naively at the moment) for JPEG, you should inspect
the libcamera-streams and pass them through your code that infers
what kind of HAL-only-streams can be produced from the available
libcamera ones. If I'm not mistaken Android only asks for stream
combinations reported through the
ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
if you do not augment that list at initialization time, you won't
ever be asked for non-native streams later.

2) Camera configuration

That's the part you focused on, and a good part of what you wrote
could indeed be used to move forward.

The problem here can be summarized as: 'for each stream Android
requested, the ones that cannot be natively produced by the
libcamera::Camera shall be mapped onto the closest possible native
stream' (and here we could apply your implementation that identifies
the 'best matching' stream).

Unfortunately the problem breaks down into several others:

1) How to identify if a stream is a native or a HAL-only one?
Currently we get away with a trivial "if (!JPEG)" as all the non-JPEG
streams are native ones. This should be made smarter.

2) How to best map HAL-streams to libcamera-streams. Assume we
receive a request for two YUV streams in 1080p and 720p resolutions.
The libcamera::Camera claims to be able to support both, so we can
simply go and ask for those two streams. Then we receive a request
for the same streams plus a full-size JPEG one. What we have to do is
ask for the full-size YUV stream and use it to produce JPEG, and one
1080p YUV to produce both the YUV streams in 1080p and 720p
resolutions. In that case we'll then have to crop one YUV stream, and
dedicate a full-size YUV one to JPEG. Alternatively we can produce
1080p from the same full-size YUV used to produce JPEG, and ask for a
720p stream from the camera.

Now, Android specifies some format/size requirements in the Camera3
specification, and I assume ChromeOS may have others. As we tried to
record the Camera3 requirements and satisfy them in the code, I
think the additional streams that are required should somehow be
listed first, in order to be able to create only the -required-
additional streams.

For an example, have a look at CameraDevice::camera3Resolutions and
CameraDevice::camera3FormatsMap, these encode the Camera3
specification requirements.

Once the additional requirements have been encoded, I would then
proceed to divide them into 3 categories (there might very well be
others):

  1) Format conversions: Convert from one pixel format to another. What
     happens today with JPEG, more or less. We have an Encode interface for
     that purpose and I guess a format converter should be implemented
     according to it, but that has to be discussed.

  2) Down-scale/crop: Assuming it happens in the HAL using maybe some
     external components, down-scaling/cropping produce additional
     resolutions from the list of natively supported ones. Given a
     powerful enough implementation we could produce ANY format <= a given
     native format, but that's not what we want I guess. We shall
     establish a list of additional resolutions we want to report to the
     framework layer, and find out how to produce them from the native
     streams.

   3) Image transformations: A bit of a lateral issue, but I assume some
      'transformations' can be performed by HAL-only components. This
      mostly depends on handling streams with some specific metadata
      associated, which needs to be handled in the HAL. The most trivial
      example is rotation. If the libcamera::Camera is for whatever reason
      unable to rotate the images, they have to be software-rotated in the
      HAL. This won't require any stream mapping, but rather inspecting
      metadata and passing the native streams through an additional
      processing layer.

3) Buffer allocation/handling:
   When performing any conversions between a HAL stream and a libcamera
   stream we may need to allocate an intermediate buffer to provide storage
   for processing the frame in libcamera, with the conversion entity
   reading from the libcamera buffer and writing into the android buffer.
   This is likely possible with the existing FrameBufferAllocator classes,
   but may have extra requirements.

My take is still that we should try to solve one problem at a time:

1) formalize additional requirements that are not expressed by our
   CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap
2) if no other requirements are necessary, identify a use case that
   cannot be satisfied by the current pipeline implementations we
   have. For example, a UVC camera that cannot produce NV12 and needs
   conversion might be a good start
3) Address the buffer allocation issues which I understand are still
   to be addressed.

Sorry for the wall of text. Hope it helps.

Thanks
  j
Tomasz Figa Sept. 3, 2020, 12:36 a.m. UTC | #5
Hi Jacopo,

On Tue, Sep 1, 2020 at 6:05 PM Jacopo Mondi <jacopo@jmondi.org> wrote:
>
> Hi Hiro,
>    first of all I'm very sorry for the un-aceptable delay in giving
> you a reply.
>
> If that's of any consolation we have not ignored your email, but it
> has gone through several internal discussion, as it come at the
> time where the JPEG support was being merged and the two things
> collided a bit. Add a small delay due to leaves, and here you have a
> month of delay. Again, we're really sorry for this.
>
> > On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> > This is a proposal about how to map camera configurations and
> > requested configurations in Android Camera HAL adaptation layer.
> > Please also see the sample code in the following patch.
> >
> > # Software Stream Processing in libcamera
> >
> > _hiroh@chromium.org / Draft: 2020-08-06_
> >
> >
>
> As an initial and un-related note looking at the patch, I can see you
> are following the ChromeOS coding style. Please note that libcamera
> has it's own code style, which you can find documented at
>
> - https://www.libcamera.org/coding-style.html#coding-style-guidelines
>
> And we have a style checker, which can assist with this. The best way to
> use the style checker is to install it as a git-hook.
>
> I understand that this is an RFC, but we will need this style to be
> followed to be able to integrate any future patches.
>
> >
> > # Objective
> >
> > Perform frame processing in libcamera to achieve requested stream
> > configurations that are not supported natively by the camera
> > hardware, but required by the Android Camera HAL interface.
> >
>
> As you can see in the camera_device.cpp file we have tried to list the
> resolution and image formats that the Android Camera3 specification
> lists as mandatory or suggested.
>
> Do you have a list of additional requirements to add ?
> Are there ChromeOS specific requirements ?
> Or is this meant to full-fill the above stated requirements on
> platforms that cannot satisfy them ?
>

There can be per-device resolutions that should be supported due to
product requirements. Our current HAL implementations use
configuration files which define the required configurations.

That said, I think it's an independent problem, which we can likely
ignore for now, and I believe what Hiro had in mind was the latter -
platforms that cannot satisfy them. This also includes the cases you
mentioned below, when a number of streams greater than the number of
native hardware streams is requested.

As usual, the Android Camera2 API documentation is the authoritative
source of information here:
https://developer.android.com/reference/android/hardware/camera2/CameraDevice.html#createCaptureSession(android.hardware.camera2.params.SessionConfiguration)

The tables lower on the page include required stream combinations for
various capability levels.

> > [proposal text snipped]
>
> >
>
> The way the HAL layer works, and I agree something has changed since
> the recent merge of the JPEG support, is slightly more complex, and
> boils down to the following steps
>
> 1) Build the list of supported configuration
>
> When a CameraDevice is initialized, a list of supported stream
> configuration is built, in order to be able to report to Android
> what it could ask. See CameraDevice::initializeStreamConfigurations().
>
> We currently report the libcamera::Camera supported formats and
> size, plus additional JPEG streams which are produced in the HAL.
> This creates the first distinction between HAL-only-streams and
> libcamera-streams, that you correctly identified in your summary.
>
> Here, as we do (naively at the moment) for JPEG, you should inspect
> the libcamera-streams and pass them through your code that infer
> what kind of HAL-only-streams can be produced from the available
> libcamera ones. If I'm not mistaken Android only asks for stream
> combinations reported through the
> ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
> if you do not augment that list at initialization time, you won't
> ever be asked for non-native streams later.

I'm not entirely sure about this, because there are mandatory stream
configurations defined for the Camera2 API. If something is mandatory,
I suspect there is no need to query for the availability of it.

That said, I'd assume that CTS verifies whether all the required
configurations are both reported and supported, so perhaps there isn't
much to worry about here.

>
> 2) Camera configuration
>
> That's the part you focused on, and a good part of what you wrote
> could indeed be used to move forward.
>
> The problem here can be summarized as: 'for each stream android
> requested, the ones that cannot be natively produced by the
> libcamera::Camera shall be mapped on the closest possible native
> stream' (and here we could apply your implementation that identifies
> the 'best matching' stream)
>
> Unfortunately the problem breaks down into several others:
>
> 1) How to identify if a stream is a native or an HAL only one ?
> Currently we get away with a trivial "if (!JPEG)" as all the non-JPEG
> streams are native ones. This should be made smarter.
>
> 2) How to best map HAL-streams to libcamera-streams. Assume to
> receive a request for two YUV streams in 1080p and 720p resolutions.
> The libcamera::Camera claims to be able to support both, so we can
> simply go and ask for those two streams. Then we receive a request
> for the same streams plus a full-size JPEG one. What we have to do is
> ask for the full-size YUV stream and use it to produce JPEG, and one
> 1080p YUV to produce both the YUV streams in 1080p and 720p
> resolutions. In the case we'll then have to crop one YUV stream, and
> dedicate a full-size YUV one to JPEG. Alternatively we can produce
> 1080p from the same full-size YUV used to produce JPEG, and ask for a
> 720p stream to the camera.
>

Right, there are multiple possible choices. I've discussed this and
concluded that there might be some help needed from the pipeline
handler to tell the client which configuration is better from the
hardware point of view.

> Now, Android specifies some format/size requirements in the Camera3
> specification, I assume ChromeOS has maybe others. As we tried to
> record the Camera3 requirements and satisfy them in the code, I
> think the additional streams that are required should be someone
> listed first, in order to be able to create only the -required-
> additional streams.
>
> For an example, have a look at CameraDevice::camera3Resolutions and
> CameraDevice::camera3FormatsMap, these encode the Camera3
> specification requirements.
>
> Once the additional requirments have been encoded, I would then
> proceed to divide them in 3 categories (there might very well be
> others):

I believe we don't have any additional requirements for now.

>
>   1) Format conversions: Convert to one pixel format to the other. What
>      happens today with JPEG more or less. We have an Encode interface for
>      that purpose and I guess format converter should be implemented
>      according to it, but that has to be discussed.
>

One thing that is also missing today is MJPEG decoding. This is also
required to fulfill the stream configuration requirements, since it's
assumed that the formats are displayable and explicit YUV streams are
included as well.

>   2) Down-scale/crop: Assuming it happens in the HAL using maybe some
>      external components, down-scaling/cropping produce additional
>      resolutions from the list of natively supported ones. Given a
>      powerful enough implementation we could produce ANY format <= a given
>      native format, but that's not what we want I guess. We shall
>      establish a list of additional resolutions we want to report to the
>      framework layer, and find out how to produce them from the native
>      streams.

Given the above, we should be able to stick to the resolutions we have
already supported in the adaptation layer.

>
>    3) Image transformations A bit a lateral issue, but I assume some
>       'transformations' can be performed by HAL only components. This
>       mostly depends on handling streams with some specific metadata
>       associated, which needs to be handled in the HAL. The most trivial
>       example is rotation. If the libcamera::Camera is for whatever reason
>       unable to rotate the images, they have to be software rotated in the
>       HAL. This won't require any stream mapping, but rather inspecting
>       metadata and pass the native streams through an additional processing
>       layer.

Right. I honestly hope we won't need to do software rotation on any
reasonable hardware platform, but AFAIK we still have some in Chrome
OS for which we do, in some specific cases, like a tablet with the
camera in landscape orientation, but the device in portrait
orientation.

>
> 3) Buffer allocation/handling:
>    When performing any conversions between a HAL stream and a libcamera
>    stream we may need to allocate an intermediate buffer to provide storage
>    for processing the frame in libcamera, with the conversion entity
>    reading from the libcamera buffer and writing into the android buffer.
>    This is likely possible with the existing FrameBufferAllocator classes,
>    but may have extra requirements.

I suppose we could have 3 types of buffers here:
1) buffers written by hardware driven by libcamera
 - without any software processing these would be directly provided by
Android and imported to libcamera,
 - with processing, I assume libcamera would have to allocate its own,
but I guess that would just end up being V4L2 MMAP buffers?
2) buffers between processing steps - if software only, an arbitrary
malloc buffer could be used.
3) buffers for the end results - always provided by Android:
 - without processing they are the same thing as 1),
 - with processing they need to be mapped in the adaptation layer and
wouldn't reach libcamera itself.

For consistency, one might be tempted to use some external allocator,
like gralloc, and import the hardware buffers to libcamera in both
cases, but there are limitations - gralloc only knows how to allocate
the end result buffers, so it wouldn't give us arbitrary buffers
without some ugly hacks and DMA-buf heaps still need some time to gain
adoption. Therefore I think we might need to live with special cases
like this until the world improves.

>
> My take is still that we should try to solve one problem at the time:
>
> 1) formalize additional requirements that are not expressed by our
>    CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap

This hopefully shouldn't be needed, although we might want to double
check if those fully cover the Android requirements.

> 2) if not other requirements are necessary, indentify a use case that
>    cannot be satisfied by the current pipeline implementations we
>    have. In example, a UVC camera that cannot produce NV12 and need
>    conversion might be a good start

The use cases we encountered in practice:
a) a UVC camera which outputs only MJPEG for higher resolutions and
needs decoding (and possibly one more extra conversion) to output YUV
4:2:0.
b) a UVC camera which only outputs 1 stream, while Android requires up
to 2 YUV streams + JPEG for the LIMITED capability level.
c) IPU3/RKISP1 which can output up to 2 streams, but there is a stream
configuration that requires 3 streams which could have different
resolutions - 2 YUV up to, but not necessarily equal to, the max PREVIEW size
and 1 JPEG with MAXIMUM resolution.

I don't remember if we in the end had to deal with it, but I recall also:
d) hardware platform that doesn't support one of the smaller required
resolutions due to max scaling factor constraints.

> 3) Address the buffer allocation issues which I understand is still
>    to be addressed.

Agreed.

>
> Sorry for the wall of text. Hope it helps.

Yep, thanks for starting the discussion.

Best regards,
Tomasz
Jacopo Mondi Sept. 4, 2020, 12:57 p.m. UTC | #6
Hi Tomasz,

On Thu, Sep 03, 2020 at 02:36:47AM +0200, Tomasz Figa wrote:
> Hi Jacopo,
>
> On Tue, Sep 1, 2020 at 6:05 PM Jacopo Mondi <jacopo@jmondi.org> wrote:
> >
> > Hi Hiro,
> >    first of all I'm very sorry for the un-aceptable delay in giving
> > you a reply.
> >
> > If that's of any consolation we have not ignored your email, but it
> > has gone through several internal discussion, as it come at the
> > time where the JPEG support was being merged and the two things
> > collided a bit. Add a small delay due to leaves, and here you have a
> > month of delay. Again, we're really sorry for this.
> >
> > > On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> > > This is a proposal about how to map camera configurations and
> > > requested configurations in Android Camera HAL adaptation layer.
> > > Please also see the sample code in the following patch.
> > >
> > > # Software Stream Processing in libcamera
> > >
> > > _hiroh@chromium.org / Draft: 2020-08-06_
> > >
> > >
> >
> > As an initial and un-related note looking at the patch, I can see you
> > are following the ChromeOS coding style. Please note that libcamera
> > has it's own code style, which you can find documented at
> >
> > - https://www.libcamera.org/coding-style.html#coding-style-guidelines
> >
> > And we have a style checker, which can assist with this. The best way to
> > use the style checker is to install it as a git-hook.
> >
> > I understand that this is an RFC, but we will need this style to be
> > followed to be able to integrate any future patches.
> >
> > >
> > > # Objective
> > >
> > > Perform frame processing in libcamera to achieve requested stream
> > > configurations that are not supported natively by the camera
> > > hardware, but required by the Android Camera HAL interface.
> > >
> >
> > As you can see in the camera_device.cpp file we have tried to list the
> > resolution and image formats that the Android Camera3 specification
> > lists as mandatory or suggested.
> >
> > Do you have a list of additional requirements to add ?
> > Are there ChromeOS specific requirements ?
> > Or is this meant to full-fill the above stated requirements on
> > platforms that cannot satisfy them ?
> >
>
> There can be per-device resolutions that should be supported due to
> product requirements. Our current HAL implementations use
> configuration files which define the required configurations.
>
> That said, I think it's an independent problem, which we can likely
> ignore for now, and I believe what Hiro had in mind was the latter -
> platforms that cannot satisfy them. This also includes the cases you
> mentioned below, when a number of streams greater than the number of
> native hardware streams is requested.
>
> As usual, the Android Camera2 API documentation is the authoritative
> source of information here:
> https://developer.android.com/reference/android/hardware/camera2/CameraDevice.html#createCaptureSession(android.hardware.camera2.params.SessionConfiguration)
>
> The tables lower on the page include required stream combinations for
> various capability levels.
>

Those are the requirements I think should be encoded.
So far, as a reference for the supported formats and resolutions, I
have used the documentation of the scaler.availableStreamConfigurations
metadata tag.

> > >
> > > # Background
> > >
> > >
> > > ### Libcamera
> > >
> > > In addition to its native API, libcamera[^1] provides a number of
> > > camera APIs, for example, V4L2 Webcam API and Android Camera HAL3.
> > > The platform specific implementations are wrapped in libcamera core
> > > and a caller of libcamera doesn’t have to take care the platform.
> > >
> > >
> > > ### Android Camera HAL
> > >
> > > Chrome OS camera stack uses Android Camera HAL[^2] interface.
> > > Libcamera provides Android Camera HAL with an adaptation layer[^3]
> > > between libcamera core part and Android HAL, which is called
> > > Android HAL adaptation layer in this document.
> > >
> > > To present a uniform set of capabilities to the API users, Android
> > > Camera HAL API[^4] allows caller to request stream configurations
> > > that are beyond the device capabilities. For example, while a
> > > camera device is able to produce a single stream, a HAL caller
> > > requests three possibly different resolution streams (PRIV, YUV,
> > > JPEG). However, libcamera core implementation produces
> > > camera-capable streams. Therefore, we have to create three streams
> > > from the single stream produced by libcamera.
> > >
> > > Requests beyond the device capability is supported only in Android
> > > HAL at this moment. I describe the design in this document that the
> > > stream processing is performed in Android HAL adaptation layer.
> > >
> > >
> > > # Overview
> > >
> > >
> > > ## Current implementation
> > >
> > > The requested stream configuration is given by
> > > _camera3_device_t->ops->configure_streams()_ in Android Camera HAL.
> > > This delegates CameraDevice::configureStreams()[^5] in libcamera.
> > > The current implementation attempts all the given configurations
> > > and succeeds if and only if the camera device can produces them
> > > without any adjustments.
> > >
> > >
> > > ### libcamera::CameraConfiguration
> > >
> > > It is CameraConfiguration[^6] that judges whether adjustments are
> > > required, or even requested configurations are infeasible.
> > >
> > > The procedure of configuration is that CameraDevice
> > >
> > >
> > >
> > > 1. Adds every configuration by
> > > CameraConfiguration::addConfiguration(). 2. Assorts the added
> > > configurations by CameraConfiguration::validate().
> > >
> > > CameraConfiguration, especially for validate(), is implemented per
> > > pipeline. For instance, the CameraConfiguration implementation for
> > > IPU3 is IPU3CameraConfiguration[^7].
> > >
> > > validate() returns one of the below,
> > >
> > >
> > >
> > > *   Valid *    A camera can produce streams with requested
> > > configurations. *   Adjusted *   A camera cannot produce streams
> > > with requested configurations as-is, but can produce streams with
> > > different pixel formats or resolutions. *   Invalid *   A camera
> > > cannot produce streams with either requested configurations or
> > > different pixel formats and resolutions. For instance, this is
> > > returned when the larger resolution is requested than the maximum
> > > supported one?
> > >
> > > What we need to resolve is, when Adjusted is returned, to map
> > > adjusted camera streams to requested camera streams and required
> > > processing.
> > >
> > >
> > > ## Stream processing
> > >
> > > The processing to be thought of are followings.
> > >
> > >
> > >
> > > *   Down-scaling *   We don’t perform up-scaling because it affects
> > > stream qualities *   Down-scaling is allowed for the same ratio to
> > > avoid producing distorted frames. For instance, scaling from
> > > 1280x720 (16:9) to 480x360 (4:3) is not allowed. *   Cropping *
> > > Cropping is executed only to change the frame ratio. Thus it must
> > > be done after down-scaling if required. For example, to convert
> > > 1280x720 to 480x360, first down-scale to 640x360 and then crop to
> > > 480x360.
> > >
> > > *   Format conversion *   Pixel format conversion *   JPEG
> > > encoding
> > >
> > >
> > > # Proposal
> > >
> > > Basically we only need to consider a mapping algorithm after
> > > validate(). However, to obtain less processing and better stream
> > > qualities, we should reorder given configurations within
> > > validate().
> >
> > >
> >
> > The way the HAL layer works, and I agree something has changed since
> > the recent merge of the JPEG support, is slightly more complex, and
> > boils down to the following steps
> >
> > 1) Build the list of supported configuration
> >
> > When a CameraDevice is initialized, a list of supported stream
> > configuration is built, in order to be able to report to Android
> > what it could ask. See CameraDevice::initializeStreamConfigurations().
> >
> > We currently report the libcamera::Camera supported formats and
> > size, plus additional JPEG streams which are produced in the HAL.
> > This creates the first distinction between HAL-only-streams and
> > libcamera-streams, that you correctly identified in your summary.
> >
> > Here, as we do (naively at the moment) for JPEG, you should inspect
> > the libcamera-streams and pass them through your code that infer
> > what kind of HAL-only-streams can be produced from the available
> > libcamera ones. If I'm not mistaken Android only asks for stream
> > combinations reported through the
> > ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
> > if you do not augment that list at initialization time, you won't
> > ever be asked for non-native streams later.
>
> I'm not entirely sure about this, because there are mandatory stream
> configurations defined for the Camera2 API. If something is mandatory,
> I suspect there is no need to query for the availability of it.

Well, we need to query the libcamera::Camera to know which of the
required streams can be natively produced and which ones instead have
to be produced in the HAL layer.
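
For example, a first-pass probe could look roughly like the sketch
below. isNativelySupported() is a hypothetical helper, not existing
code; error handling and the choice of stream role are simplified.

#include <memory>

#include <libcamera/camera.h>

using namespace libcamera;

bool isNativelySupported(Camera *camera, const PixelFormat &format,
			 const Size &size)
{
	std::unique_ptr<CameraConfiguration> config =
		camera->generateConfiguration({ StreamRole::Viewfinder });
	if (!config || config->empty())
		return false;

	StreamConfiguration &cfg = config->at(0);
	cfg.pixelFormat = format;
	cfg.size = size;

	/*
	 * Valid: the camera can produce the stream as requested.
	 * Adjusted: the HAL would have to derive it from a different
	 * native stream. Invalid: not producible at all.
	 */
	return config->validate() == CameraConfiguration::Valid;
}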

>
> That said, I'd assume that CTS verifies whether all the required
> configurations are both reported and supported, so perhaps there isn't
> much to worry about here.
>
> >
> > 2) Camera configuration
> >
> > That's the part you focused on, and a good part of what you wrote
> > could indeed be used to move forward.
> >
> > The problem here can be summarized as: 'for each stream android
> > requested, the ones that cannot be natively produced by the
> > libcamera::Camera shall be mapped on the closest possible native
> > stream' (and here we could apply your implementation that identifies
> > the 'best matching' stream)
> >
> > Unfortunately the problem breaks down into several others:
> >
> > 1) How to identify if a stream is a native or an HAL only one ?
> > Currently we get away with a trivial "if (!JPEG)" as all the non-JPEG
> > streams are native ones. This should be made smarter.
> >
> > 2) How to best map HAL-streams to libcamera-streams. Assume to
> > receive a request for two YUV streams in 1080p and 720p resolutions.
> > The libcamera::Camera claims to be able to support both, so we can
> > simply go and ask for those two streams. Then we receive a request
> > for the same streams plus a full-size JPEG one. What we have to do is
> > ask for the full-size YUV stream and use it to produce JPEG, and one
> > 1080p YUV to produce both the YUV streams in 1080p and 720p
> > resolutions. In the case we'll then have to crop one YUV stream, and
> > dedicate a full-size YUV one to JPEG. Alternatively we can produce
> > 1080p from the same full-size YUV used to produce JPEG, and ask for a
> > 720p stream to the camera.
> >
>
> Right, there are multiple possible choices. I've discussed this and
> concluded that there might be some help needed from the pipeline
> handler to tell the client which configuration is better from the
> hardware point of view.
>

I don't think the HAL could do it all by itself, I agree. The number of
combinations to test would be large and there's currently no way to get
a sense of which combination would be better for the HW.

Have you already thought about how this can be improved ?


> > Now, Android specifies some format/size requirements in the Camera3
> > specification, I assume ChromeOS has maybe others. As we tried to
> > record the Camera3 requirements and satisfy them in the code, I
> > think the additional streams that are required should be someone
> > listed first, in order to be able to create only the -required-
> > additional streams.
> >
> > For an example, have a look at CameraDevice::camera3Resolutions and
> > CameraDevice::camera3FormatsMap, these encode the Camera3
> > specification requirements.
> >
> > Once the additional requirments have been encoded, I would then
> > proceed to divide them in 3 categories (there might very well be
> > others):
>
> I believe we don't have any additional requirements for now.
>
> >
> >   1) Format conversions: Convert to one pixel format to the other. What
> >      happens today with JPEG more or less. We have an Encode interface for
> >      that purpose and I guess format converter should be implemented
> >      according to it, but that has to be discussed.
> >
>
> One thing that is also missing today is MJPEG decoding. This is also
> required to fulfill the stream configuration requirements, since it's
> assumed that the formats are displayable and explicit YUV streams are
> included as well.
>

s/Encoder/Transcoder ?

Decoding and encoding fall into the same category to me, but I agree
this represents a nice use case to start implementing something. I
assume the USB HAL already has some of that in place, right ?

> >   2) Down-scale/crop: Assuming it happens in the HAL using maybe some
> >      external components, down-scaling/cropping produce additional
> >      resolutions from the list of natively supported ones. Given a
> >      powerful enough implementation we could produce ANY format <= a given
> >      native format, but that's not what we want I guess. We shall
> >      establish a list of additional resolutions we want to report to the
> >      framework layer, and find out how to produce them from the native
> >      streams.
>
> Given the above, we should be able to stick to the resolutions we have
> already supported in the adaptation layer.
>

This then boils down again to improving how we identify what can be
natively produced by the camera and what has to be produced in the HAL
using one of the native streams.

I think Hiro's proposal addresses the second part (stream
identification), but the first one has to be taken into account too,
probably in the first place or at least in parallel ?

> >
> >    3) Image transformations A bit a lateral issue, but I assume some
> >       'transformations' can be performed by HAL only components. This
> >       mostly depends on handling streams with some specific metadata
> >       associated, which needs to be handled in the HAL. The most trivial
> >       example is rotation. If the libcamera::Camera is for whatever reason
> >       unable to rotate the images, they have to be software rotated in the
> >       HAL. This won't require any stream mapping, but rather inspecting
> >       metadata and pass the native streams through an additional processing
> >       layer.
>
> Right. I honestly hope we won't need to do software rotation on any
> reasonable hardware platform, but AFAIK we still have some in Chrome
> OS for which we do, in some specific cases, like a tablet with the
> camera in landscape orientation, but the device in portrait
> orientation.
>

I hope it's a corner case as well, but who knows, Android runs on a
great variety of platforms nowadays and some of them might not be that
'reasonable' ? I agree this is a corner case at the moment, though.

> >
> > 3) Buffer allocation/handling:
> >    When performing any conversions between a HAL stream and a libcamera
> >    stream we may need to allocate an intermediate buffer to provide storage
> >    for processing the frame in libcamera, with the conversion entity
> >    reading from the libcamera buffer and writing into the android buffer.
> >    This is likely possible with the existing FrameBufferAllocator classes,
> >    but may have extra requirements.
>
> I suppose we could have 3 types of buffers here:
> 1) buffers written by hardware driven by libcamera
>  - without any software processing these would be directly provided by
> Android and imported to libcamera,
>  - with processing, I assume libcamera would have to allocate its own,
> but I guess that would just end up being V4L2 MMAP buffers?

These days we're looking at this part with the idea of exploiting the
FrameBufferAllocator abstraction we also provide to applications.

This would boil down to:
1) Allocating buffers in the video devices (the pipeline handler
decides which one)
2) Exporting them as dmabuf file descriptors
3) Re-importing them into the video devices at Request processing time.
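
For illustration, roughly along these lines, given a
FrameBufferAllocator already constructed for the Camera. This is only
a sketch: error handling is omitted, and header/accessor names (e.g.
libcamera/buffer.h, FileDescriptor::fd()) follow the API at the time
of writing and may differ in other versions.

#include <memory>
#include <vector>

#include <libcamera/buffer.h>
#include <libcamera/camera.h>
#include <libcamera/framebuffer_allocator.h>

using namespace libcamera;

void exportStreamBuffers(FrameBufferAllocator &allocator, Stream *stream)
{
	/*
	 * 1) Allocate buffers in the video device the pipeline handler
	 *    selected for this stream.
	 */
	allocator.allocate(stream);

	std::vector<int> dmabufFds;
	for (const std::unique_ptr<FrameBuffer> &buffer :
	     allocator.buffers(stream)) {
		/*
		 * 2) Every plane is backed by a dmabuf file descriptor
		 *    that can be handed to other components.
		 */
		for (const FrameBuffer::Plane &plane : buffer->planes())
			dmabufFds.push_back(plane.fd.fd());

		/*
		 * 3) At Request processing time, the buffer is re-imported
		 *    by attaching it to a Request for this stream, e.g.
		 *    request->addBuffer(stream, buffer.get());
		 */
	}
}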

> 2) buffers between processing steps - if software only, an arbitrary
> malloc buffer could be used.

I don't see this as distinct from the previous point. Whenever we need
an intermediate buffer that has to be processed (and the result of the
processing written to the Android-provided buffer), it needs to be
allocated in libcamera.

Or do you mean buffers for additional processing steps, for example a
scratch buffer used during an encoding procedure ? In that case I
think the transcoder implementation would deal with that as it
prefers; for example, a HW-accelerated component might need to allocate
buffers accessible by both the CPU and the accelerator, and in this
case it will provide buffers for libcamera to import, as well as
allocate any intermediate buffers it requires.

> 3) buffers for the end results - always provided by Android and
>  - without processing they are the same thing as 1),
>  - with processing they need to be mapped in the adaptation layer and
> wouldn't reach libcamera itself.

Again this sounds the same as your point 1.2. I feel I missed something
:)

>
> For consistency, one might be tempted to use some external allocator,
> like gralloc, and import the hardware buffers to libcamera in both
> cases, but there are limitations - gralloc only knows how to allocate
> the end result buffers, so it wouldn't give us arbitrary buffers
> without some ugly hacks and DMA-buf heaps still need some time to gain
> adoption. Therefore I think we might need to live with special cases
> like this until the world improves.
>

Well, for sure we cannot depend on gralloc :)

For the time being, we have to address 1) first, as we have a few test
cases that require an intermediate buffer to be allocated by the HAL
and provided to libcamera. This part does not concern me too much.

> >
> > My take is still that we should try to solve one problem at the time:
> >
> > 1) formalize additional requirements that are not expressed by our
> >    CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap
>
> This hopefully shouldn't be needed, although we might want to double
> check if those fully cover the Android requirements.
>

I know they don't fully do that at the moment (we still don't enforce
mandatory resolutions to be supported, for example). And they need to
be versioned depending on the reported HW level. There's indeed space
for development there.

> > 2) if not other requirements are necessary, indentify a use case that
> >    cannot be satisfied by the current pipeline implementations we
> >    have. In example, a UVC camera that cannot produce NV12 and need
> >    conversion might be a good start
>
> The use cases we encountered in practice:
> a) a UVC camera which outputs only MJPEG for higher resolutions and
> needs decoding (and possibly one more extra conversion) to output YUV
> 4:2:0.
> b) a UVC camera which only outputs 1 stream, while Android requires up
> to 2 YUV streams + JPEG for the LIMITED capability level.

These seem like interesting use cases to start applying some of the
proposed implementation to, don't they ?

> c) IPU3/RKISP1 which can output up to 2 streams, but there is a stream
> configuration that requires 3 streams which could have different
> resolutions - 2 YUV up to but not necessarily equal max PREVIEW size
> and 1 JPEG with MAXIMUM resolution.

Interesting. This will require software down-scaling of the YUV stream
at max resolution used to produce JPEG, unless we encode JPEG from
Bayer RAW (I'm not even sure that's possible :)

>
> I don't remember if we in the end had to deal with it, but I recall also:
> d) hardware platform that doesn't support one of the smaller required
> resolutions due to max scaling factor constraints.
>

Seems like a use case for a software downscaler too.

> > 3) Address the buffer allocation issues which I understand is still
> >    to be addressed.
>
> Agreed.
>
> >
> > Sorry for the wall of text. Hope it helps.
>
> Yep, thanks for starting the discussion.

Thank you for the useful feedback.

Looking forward to new developments in this area; as you've seen,
there are quite a few patches in flight for the HAL, and I know it's
complicated to start new developments on such a fast-moving base...

Thanks
  j

>
> Best regards,
> Tomasz
Tomasz Figa Sept. 7, 2020, 3:33 p.m. UTC | #7
On Fri, Sep 4, 2020 at 2:53 PM Jacopo Mondi <jacopo@jmondi.org> wrote:
>
> Hi Tomasz,
>
> On Thu, Sep 03, 2020 at 02:36:47AM +0200, Tomasz Figa wrote:
> > Hi Jacopo,
> >
> > On Tue, Sep 1, 2020 at 6:05 PM Jacopo Mondi <jacopo@jmondi.org> wrote:
> > >
> > > Hi Hiro,
> > >    first of all I'm very sorry for the un-aceptable delay in giving
> > > you a reply.
> > >
> > > If that's of any consolation we have not ignored your email, but it
> > > has gone through several internal discussion, as it come at the
> > > time where the JPEG support was being merged and the two things
> > > collided a bit. Add a small delay due to leaves, and here you have a
> > > month of delay. Again, we're really sorry for this.
> > >
> > > > On Thu, Aug 06, 2020 at 03:17:05PM +0900, Hirokazu Honda wrote:
> > > > This is a proposal about how to map camera configurations and
> > > > requested configurations in Android Camera HAL adaptation layer.
> > > > Please also see the sample code in the following patch.
> > > >
> > > > # Software Stream Processing in libcamera
> > > >
> > > > _hiroh@chromium.org / Draft: 2020-08-06_
> > > >
> > > >
> > >
> > > As an initial and un-related note looking at the patch, I can see you
> > > are following the ChromeOS coding style. Please note that libcamera
> > > has it's own code style, which you can find documented at
> > >
> > > - https://www.libcamera.org/coding-style.html#coding-style-guidelines
> > >
> > > And we have a style checker, which can assist with this. The best way to
> > > use the style checker is to install it as a git-hook.
> > >
> > > I understand that this is an RFC, but we will need this style to be
> > > followed to be able to integrate any future patches.
> > >
> > > >
> > > > # Objective
> > > >
> > > > Perform frame processing in libcamera to achieve requested stream
> > > > configurations that are not supported natively by the camera
> > > > hardware, but required by the Android Camera HAL interface.
> > > >
> > >
> > > As you can see in the camera_device.cpp file we have tried to list the
> > > resolution and image formats that the Android Camera3 specification
> > > lists as mandatory or suggested.
> > >
> > > Do you have a list of additional requirements to add ?
> > > Are there ChromeOS specific requirements ?
> > > Or is this meant to full-fill the above stated requirements on
> > > platforms that cannot satisfy them ?
> > >
> >
> > There can be per-device resolutions that should be supported due to
> > product requirements. Our current HAL implementations use
> > configuration files which define the required configurations.
> >
> > That said, I think it's an independent problem, which we can likely
> > ignore for now, and I believe what Hiro had in mind was the latter -
> > platforms that cannot satisfy them. This also includes the cases you
> > mentioned below, when a number of streams greater than the number of
> > native hardware streams is requested.
> >
> > As usual, the Android Camera2 API documentation is the authoritative
> > source of information here:
> > https://developer.android.com/reference/android/hardware/camera2/CameraDevice.html#createCaptureSession(android.hardware.camera2.params.SessionConfiguration)
> >
> > The tables lower on the page include required stream combinations for
> > various capability levels.
> >
>
> Those are the requirements I think should be encoded.
> So far, as a reference for the supported formats and resolutions I
> used as reference the documentation of the scaler.availableStreamConfigurations
> metadata tag
>

Yeah, the various pieces of the documentation are scattered across
many places sadly. Some bits are quite difficult to discover and often
show up only when trying to get CTS to pass...

> > > >
> > > > # Background
> > > >
> > > >
> > > > ### Libcamera
> > > >
> > > > In addition to its native API, libcamera[^1] provides a number of
> > > > camera APIs, for example, V4L2 Webcam API and Android Camera HAL3.
> > > > The platform specific implementations are wrapped in libcamera core
> > > > and a caller of libcamera doesn’t have to take care the platform.
> > > >
> > > >
> > > > ### Android Camera HAL
> > > >
> > > > Chrome OS camera stack uses Android Camera HAL[^2] interface.
> > > > Libcamera provides Android Camera HAL with an adaptation layer[^3]
> > > > between libcamera core part and Android HAL, which is called
> > > > Android HAL adaptation layer in this document.
> > > >
> > > > To present a uniform set of capabilities to the API users, Android
> > > > Camera HAL API[^4] allows caller to request stream configurations
> > > > that are beyond the device capabilities. For example, while a
> > > > camera device is able to produce a single stream, a HAL caller
> > > > requests three possibly different resolution streams (PRIV, YUV,
> > > > JPEG). However, libcamera core implementation produces
> > > > camera-capable streams. Therefore, we have to create three streams
> > > > from the single stream produced by libcamera.
> > > >
> > > > Requests beyond the device capability is supported only in Android
> > > > HAL at this moment. I describe the design in this document that the
> > > > stream processing is performed in Android HAL adaptation layer.
> > > >
> > > >
> > > > # Overview
> > > >
> > > >
> > > > ## Current implementation
> > > >
> > > > The requested stream configuration is given by
> > > > _camera3_device_t->ops->configure_streams()_ in Android Camera HAL.
> > > > This delegates CameraDevice::configureStreams()[^5] in libcamera.
> > > > The current implementation attempts all the given configurations
> > > > and succeeds if and only if the camera device can produces them
> > > > without any adjustments.
> > > >
> > > >
> > > > ### libcamera::CameraConfiguration
> > > >
> > > > It is CameraConfiguration[^6] that judges whether adjustments are
> > > > required, or even requested configurations are infeasible.
> > > >
> > > > The procedure of configuration is that CameraDevice
> > > >
> > > >
> > > >
> > > > 1. Adds every configuration by
> > > > CameraConfiguration::addConfiguration(). 2. Assorts the added
> > > > configurations by CameraConfiguration::validate().
> > > >
> > > > CameraConfiguration, especially for validate(), is implemented per
> > > > pipeline. For instance, the CameraConfiguration implementation for
> > > > IPU3 is IPU3CameraConfiguration[^7].
> > > >
> > > > validate() returns one of the below,
> > > >
> > > >
> > > >
> > > > *   Valid *    A camera can produce streams with requested
> > > > configurations. *   Adjusted *   A camera cannot produce streams
> > > > with requested configurations as-is, but can produce streams with
> > > > different pixel formats or resolutions. *   Invalid *   A camera
> > > > cannot produce streams with either requested configurations or
> > > > different pixel formats and resolutions. For instance, this is
> > > > returned when the larger resolution is requested than the maximum
> > > > supported one?
> > > >
> > > > What we need to resolve is, when Adjusted is returned, to map
> > > > adjusted camera streams to requested camera streams and required
> > > > processing.
> > > >
> > > >
> > > > ## Stream processing
> > > >
> > > > The processing to be thought of are followings.
> > > >
> > > >
> > > >
> > > > *   Down-scaling *   We don’t perform up-scaling because it affects
> > > > stream qualities *   Down-scaling is allowed for the same ratio to
> > > > avoid producing distorted frames. For instance, scaling from
> > > > 1280x720 (16:9) to 480x360 (4:3) is not allowed. *   Cropping *
> > > > Cropping is executed only to change the frame ratio. Thus it must
> > > > be done after down-scaling if required. For example, to convert
> > > > 1280x720 to 480x360, first down-scale to 640x360 and then crop to
> > > > 480x360.
> > > >
> > > > *   Format conversion *   Pixel format conversion *   JPEG
> > > > encoding
> > > >
> > > >
> > > > # Proposal
> > > >
> > > > Basically we only need to consider a mapping algorithm after
> > > > validate(). However, to obtain less processing and better stream
> > > > qualities, we should reorder given configurations within
> > > > validate().
> > >
> > > >
> > >
> > > The way the HAL layer works, and I agree something has changed since
> > > the recent merge of the JPEG support, is slightly more complex, and
> > > boils down to the following steps
> > >
> > > 1) Build the list of supported configuration
> > >
> > > When a CameraDevice is initialized, a list of supported stream
> > > configuration is built, in order to be able to report to Android
> > > what it could ask. See CameraDevice::initializeStreamConfigurations().
> > >
> > > We currently report the libcamera::Camera supported formats and
> > > size, plus additional JPEG streams which are produced in the HAL.
> > > This creates the first distinction between HAL-only-streams and
> > > libcamera-streams, that you correctly identified in your summary.
> > >
> > > Here, as we do (naively at the moment) for JPEG, you should inspect
> > > the libcamera-streams and pass them through your code that infer
> > > what kind of HAL-only-streams can be produced from the available
> > > libcamera ones. If I'm not mistaken Android only asks for stream
> > > combinations reported through the
> > > ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT metadata, and
> > > if you do not augment that list at initialization time, you won't
> > > ever be asked for non-native streams later.
> >
> > I'm not entirely sure about this, because there are mandatory stream
> > configurations defined for the Camera2 API. If something is mandatory,
> > I suspect there is no need to query for the availability of it.
>
> Well, we need to query the libcamera::Camera to know which of the
> required streams could be natively produced and which ones instead has
> to be produced in the HAL layer.
>

I was referring to the Android side, i.e. even if something is not
reported in ANDROID_SCALER_AVAILABLE_STREAM_CONFIGURATIONS_OUTPUT, the
client could still possibly request it if the spec defines it as
mandatory.

> >
> > That said, I'd assume that CTS verifies whether all the required
> > configurations are both reported and supported, so perhaps there isn't
> > much to worry about here.
> >
> > >
> > > 2) Camera configuration
> > >
> > > That's the part you focused on, and a good part of what you wrote
> > > could indeed be used to move forward.
> > >
> > > The problem here can be summarized as: 'for each stream android
> > > requested, the ones that cannot be natively produced by the
> > > libcamera::Camera shall be mapped on the closest possible native
> > > stream' (and here we could apply your implementation that identifies
> > > the 'best matching' stream)
> > >
> > > Unfortunately the problem breaks down into several others:
> > >
> > > 1) How to identify if a stream is a native or an HAL only one ?
> > > Currently we get away with a trivial "if (!JPEG)" as all the non-JPEG
> > > streams are native ones. This should be made smarter.
> > >
> > > 2) How to best map HAL-streams to libcamera-streams. Assume to
> > > receive a request for two YUV streams in 1080p and 720p resolutions.
> > > The libcamera::Camera claims to be able to support both, so we can
> > > simply go and ask for those two streams. Then we receive a request
> > > for the same streams plus a full-size JPEG one. What we have to do is
> > > ask for the full-size YUV stream and use it to produce JPEG, and one
> > > 1080p YUV to produce both the YUV streams in 1080p and 720p
> > > resolutions. In the case we'll then have to crop one YUV stream, and
> > > dedicate a full-size YUV one to JPEG. Alternatively we can produce
> > > 1080p from the same full-size YUV used to produce JPEG, and ask for a
> > > 720p stream to the camera.
> > >
> >
> > Right, there are multiple possible choices. I've discussed this and
> > concluded that there might be some help needed from the pipeline
> > handler to tell the client which configuration is better from the
> > hardware point of view.
> >
>
> I don't think the HAL could do all by itself, I agree. The number of
> combination to test would be large and there's currently no way to get
> a taste of what would be the better combination for the HW.
>
> Have you already thought how this can be improved ?
>

While it would be really nice to have some smart heuristics for this,
I think it might be difficult to have something automatic that works
for everyone, because there could also be business decisions involved
in the process. For example, scaling at component X could result in a
sharper but possibly noisier image, while at component Y it could be
less noisy, but also not as sharp. The decision is certainly a matter
of someone's preferences.

I know we're trying to run away from configuration files as much as
possible, but I think this might be one of the places we need some way
to express integration-specific preferences.

>
> > > Now, Android specifies some format/size requirements in the Camera3
> > > specification, I assume ChromeOS has maybe others. As we tried to
> > > record the Camera3 requirements and satisfy them in the code, I
> > > think the additional streams that are required should be someone
> > > listed first, in order to be able to create only the -required-
> > > additional streams.
> > >
> > > For an example, have a look at CameraDevice::camera3Resolutions and
> > > CameraDevice::camera3FormatsMap, these encode the Camera3
> > > specification requirements.
> > >
> > > Once the additional requirments have been encoded, I would then
> > > proceed to divide them in 3 categories (there might very well be
> > > others):
> >
> > I believe we don't have any additional requirements for now.
> >
> > >
> > >   1) Format conversions: Convert to one pixel format to the other. What
> > >      happens today with JPEG more or less. We have an Encode interface for
> > >      that purpose and I guess format converter should be implemented
> > >      according to it, but that has to be discussed.
> > >
> >
> > One thing that is also missing today is MJPEG decoding. This is also
> > required to fulfill the stream configuration requirements, since it's
> > assumed that the formats are displayable and explicit YUV streams are
> > included as well.
> >
>
> s/Encoder/Transcoder ?
>
> Decoding and encoding fall in the same category to me, but I agree
> this represents a nice use case to start implementing something. I
> assume the USB HAL has already some of that in place, right ?
>

Right, the handling from the HAL3 API point of view is implemented
there, but the decoding itself is implemented in Chromium and there is
an IPC-based API exposed for the HALs to use. The layer in Chromium
supports multiple backends (software, V4L2, VAAPI) and also has
clients other than the camera.

> > >   2) Down-scale/crop: Assuming it happens in the HAL using maybe some
> > >      external components, down-scaling/cropping produce additional
> > >      resolutions from the list of natively supported ones. Given a
> > >      powerful enough implementation we could produce ANY format <= a given
> > >      native format, but that's not what we want I guess. We shall
> > >      establish a list of additional resolutions we want to report to the
> > >      framework layer, and find out how to produce them from the native
> > >      streams.
> >
> > Given the above, we should be able to stick to the resolutions we have
> > already supported in the adaptation layer.
> >
>
> This then boils down again to improve how we identify what could be
> natively produced by the camera and has to be produced in the HAL
> using one of the native streams.
>
> I think Hiro's proposal addresses the second part (streams
> identification) but the first one has to be taken into account,
> probably in first place or at least in parallel ?
>

I understood his proposal to be that we ask the pipeline handler to
adjust the requested configuration to the best supported by the
hardware, and then the HAL decides what to do further with them.

The proposal mentions that the pipeline handler specifically has to
sort the requested streams by aspect ratios and resolutions, to
provide as many of the requested aspect ratios as possible and the
highest requested resolutions to avoid upscaling. However, I wonder if
it's not oversimplified. Let's consider the example below, on RKISP1.

1) PRIV 1280x640 (preview)
2) YUV 1920x1080 (record)
3) JPEG 1920x1440 (still, full sensor resolution)

Also note the Android cropping requirements:

"
* In all cases, the stream crop must be centered within the full crop region,
* and each stream is only either cropped horizontally or vertical relative to
* the full crop region, never both.
"

(https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/camera3.h#988)

The ISP can produce two streams. If we apply the sorting by resolution
and ratio, we get:

1) 1920x1440
2) 1920x1080
3) 1280x640

If we select the first two, we don't end up producing the most
efficient setup, because we need to scale 1920x1080->1280x720 in
software, which wouldn't be necessary if we selected 1920x1440
(croppable to 1920x1080) and 1280x640.

How about something like this:

1) Sort by the horizontal resolution.

1920x1080
1920x1440
1280x640

2) Sort by the vertical resolution.

1920x1440
1920x1080
1280x640

3) Discard entries with the same horizontal resolution, but smaller
vertical resolutions, until the number of streams is small enough to
be supported by the hardware.

1920x1440
1280x640

For RKISP1 we would be fine here, but if we have even more constrained
hardware, like a UVC camera, we would have to go even further.

4) If we still have more streams than we can support, sort them by
their aspect ratio and resolution and eliminate all except the highest
resolution of each aspect ratio.

1920x1440
1280x640

5) Sort by resolutions again and fold the lowest resolution streams
into higher resolution streams by expanding their FoV to cover the
sensor area required by both streams. Repeat until the number of
streams is low enough.

1920x1440

Of course the above prefers scaling higher resolution images in the
hardware, which could already be a business decision rather than a
universal choice. The selection may also depend on the availability of
additional hardware, like a V4L2 mem2mem image processor to do the
scaling.

The above also lacks handling of any platform-specific constraints,
such as min/max scaling ratio, resolution limits of hardware streams,
etc., which is where it needs to rely on the pipeline handler.
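
To make this a bit more concrete, here is a minimal sketch of steps
1-4 above, assuming software-only processing and ignoring
platform-specific constraints. reduceToNativeStreams() is a
hypothetical helper; libcamera::Size is used only for convenience.

#include <algorithm>
#include <vector>

#include <libcamera/geometry.h>

using libcamera::Size;

std::vector<Size> reduceToNativeStreams(std::vector<Size> requested,
					unsigned int maxNativeStreams)
{
	/* Steps 1-2: order by width, then by height, largest first. */
	std::sort(requested.begin(), requested.end(),
		  [](const Size &a, const Size &b) {
			  return a.width != b.width ? a.width > b.width
						    : a.height > b.height;
		  });

	/*
	 * Step 3: drop entries that share a width with a larger entry, as
	 * they can be derived from it by scaling/cropping.
	 */
	std::vector<Size> candidates;
	for (const Size &size : requested) {
		if (!candidates.empty() &&
		    candidates.back().width == size.width)
			continue;
		candidates.push_back(size);
	}

	/*
	 * Step 4: if still too many, keep only the largest entry of each
	 * aspect ratio (compared via cross-multiplication).
	 */
	if (candidates.size() > maxNativeStreams) {
		std::vector<Size> perRatio;
		for (const Size &size : candidates) {
			bool sameRatioKept = std::any_of(
				perRatio.begin(), perRatio.end(),
				[&](const Size &kept) {
					return kept.width * size.height ==
					       size.width * kept.height;
				});
			if (!sameRatioKept)
				perRatio.push_back(size);
		}
		candidates = perRatio;
	}

	/*
	 * Step 5 (not shown): fold the smallest streams into larger ones by
	 * expanding their FoV until candidates.size() <= maxNativeStreams.
	 */
	return candidates;
}

With the RKISP1 example above, { 1920x1440, 1920x1080, 1280x640 }
reduces to { 1920x1440, 1280x640 } after step 3, matching the
selection I argued for.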

> > >
> > >    3) Image transformations A bit a lateral issue, but I assume some
> > >       'transformations' can be performed by HAL only components. This
> > >       mostly depends on handling streams with some specific metadata
> > >       associated, which needs to be handled in the HAL. The most trivial
> > >       example is rotation. If the libcamera::Camera is for whatever reason
> > >       unable to rotate the images, they have to be software rotated in the
> > >       HAL. This won't require any stream mapping, but rather inspecting
> > >       metadata and pass the native streams through an additional processing
> > >       layer.
> >
> > Right. I honestly hope we won't need to do software rotation on any
> > reasonable hardware platform, but AFAIK we still have some in Chrome
> > OS for which we do, in some specific cases, like a tablet with the
> > camera in landscape orientation, but the device in portrait
> > orientation.
> >
>
> I hope it's a corner case as well, but who knows, Android runs on a
> great variety of platforms nowadays, some of them might not be that
> 'reasonable' ? I agree this is a corner case at the moment though
>

Sadly, it's quite the opposite and we need to support it as well as we
can. Ideally GLES or a simple V4L2 mem2mem device could be used to
perform the rotation.

> > >
> > > 3) Buffer allocation/handling:
> > >    When performing any conversions between a HAL stream and a libcamera
> > >    stream we may need to allocate an intermediate buffer to provide storage
> > >    for processing the frame in libcamera, with the conversion entity
> > >    reading from the libcamera buffer and writing into the android buffer.
> > >    This is likely possible with the existing FrameBufferAllocator classes,
> > >    but may have extra requirements.
> >
> > I suppose we could have 3 types of buffers here:
> > 1) buffers written by hardware driven by libcamera
> >  - without any software processing these would be directly provided by
> > Android and imported to libcamera,
> >  - with processing, I assume libcamera would have to allocate its own,
> > but I guess that would just end up being V4L2 MMAP buffers?
>
> These days we're looking at this part with the idea of exploiting the
> FrameBufferAllocator abstraction we also provide to applications.
>
> This would end up in
> 1) Allocating buffers in the video devices (the pipeline handler
> decides which one)
> 2) Exporting them as dmabuf file descriptor
> 3) Re-importing them in video devices at Request processing time.
>
> > 2) buffers between processing steps - if software only, an arbitrary
> > malloc buffer could be used.
>
> I don't see this distinct from the previous point. Whenever we need an
> intermediate buffer that has to be processed (and the result of the
> processing written to the Android provided buffer) it need to be
> allocated in libcamera.
>
> Or do you mean buffers for additional processing steps, in example a
> scratch buffer used during an encoding procedure ? In that case I
> think the transcoder implementation would deal with that as they
> prefer, in example an HW accelerated component might need to allocate
> buffers accessible by the CPU and the accelerator, and in this case it
> will provide buffers to libcamera to import as well it will allocate
> intermediate buffers if it requires any.

Let's say we do the following:

camera output --1--> software scale --2--> JPEG encode --3--> client

The buffer at 1) would be libcamera-allocated DMA-able memory. At 3),
it would be an Android-allocated DMA-buf. At 2), it needs to be
libcamera-allocated, but with no expectations about DMA-bility, and it
likely shouldn't be a DMA-buf, because on ARM that would currently
mean an uncached mapping, significantly affecting the performance of
the software processing.
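
Purely as an illustration of the three roles (the type and field names
below are hypothetical, not libcamera or Android HAL API):

#include <cstddef>
#include <cstdint>
#include <vector>

struct ProcessingChainBuffers {
	/*
	 * 1) Capture buffer written by the hardware: libcamera-allocated,
	 *    DMA-able memory (e.g. V4L2 MMAP buffers exported as dmabuf).
	 */
	int captureDmabufFd;

	/*
	 * 2) Intermediate buffer for the software scaler: plain, CPU-cached
	 *    heap memory. A dmabuf here would typically be mapped uncached
	 *    on ARM and slow the software processing down.
	 */
	std::vector<uint8_t> scaledYuv;

	/*
	 * 3) Final JPEG destination: the Android-provided buffer, mapped
	 *    into the adaptation layer only for the encoder to write into.
	 */
	void *jpegOutputMapping;
	size_t jpegOutputLength;
};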

>
> > 3) buffers for the end results - always provided by Android and
> >  - without processing they are the same thing as 1),
> >  - with processing they need to be mapped in the adaptation layer and
> > wouldn't reach libcamera itself.
>
> Again this sounds the same as you 1.2 point. I feel I missed something
> :)
>

The point is that for some processing steps, the buffer might be
either imported or allocated. Moreover, for the allocation, there
might be different requirements, like I described above.

> >
> > For consistency, one might be tempted to use some external allocator,
> > like gralloc, and import the hardware buffers to libcamera in both
> > cases, but there are limitations - gralloc only knows how to allocate
> > the end result buffers, so it wouldn't give us arbitrary buffers
> > without some ugly hacks and DMA-buf heaps still need some time to gain
> > adoption. Therefore I think we might need to live with special cases
> > like this until the world improves.
> >
>
> Well, for sure we cannot depend on gralloc :)
>
> For the time being, we have to address 1) first, as we have a few test
> cases that requires an intermediate buffer to be allocated by the HAL
> and provided to libcamera. This part does not concern me too much.
>

Do we have any ideas on how to allocate those?

> > >
> > > My take is still that we should try to solve one problem at the time:
> > >
> > > 1) formalize additional requirements that are not expressed by our
> > >    CameraDevice::camera3Resolutions and CameraDevice::camera3FormatsMap
> >
> > This hopefully shouldn't be needed, although we might want to double
> > check if those fully cover the Android requirements.
> >
>
> I know they don't fully do that at the moment (we still don't enforce
> mandatory resolutions to be supported in example). And they need to be
> versioned depending on the reported HW level. There's indeed space for
> development there.
>
> > > 2) if not other requirements are necessary, indentify a use case that
> > >    cannot be satisfied by the current pipeline implementations we
> > >    have. In example, a UVC camera that cannot produce NV12 and need
> > >    conversion might be a good start
> >
> > The use cases we encountered in practice:
> > a) a UVC camera which outputs only MJPEG for higher resolutions and
> > needs decoding (and possibly one more extra conversion) to output YUV
> > 4:2:0.
> > b) a UVC camera which only outputs 1 stream, while Android requires up
> > to 2 YUV streams + JPEG for the LIMITED capability level.
>
> This seems interesting use cases to start applying some of the
> proposed implementation to an actual use case, aren't they ?
>

Indeed.

> > c) IPU3/RKISP1 which can output up to 2 streams, but there is a stream
> > configuration that requires 3 streams which could have different
> > resolutions - 2 YUV up to but not necessarily equal max PREVIEW size
> > and 1 JPEG with MAXIMUM resolution.
>
> Interesting. This will require software down-scaling of the YUV stream
> at max resolution used to produce JPEG. Unless we encode JPEG from
> Bayer RAW (I'm not even sure it's possible :)
>

Not necessarily. One stream could be made full resolution, while the
other could be the maximum of the two remaining streams, with that
then downscaled and/or cropped as needed. I think the heuristic I
described above should work at least for software-only processing.

Actually, with the dual pipe mode, IPU3 should be able to handle this
setup natively, but some work would probably be needed in the pipeline
handler to support it. RKISP1 is still limited to 2 streams, though.

Best regards,
Tomasz