[v2,2/3] libcamera: Add face detection controls
diff mbox series

Message ID 20240823143205.2668765-3-chenghaoyang@google.com
State New
Headers show
Series
  • Add Face Detection Controls
Related show

Commit Message

Cheng-Hao Yang Aug. 23, 2024, 2:29 p.m. UTC
From: Yudhistira Erlandinata <yerlandinata@chromium.org>

Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores,
and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting
FaceDetectFaceLandmark.

BUG=b:308713855
TEST=emerge-geralt libcamera-mtkisp7

Signed-off-by: Yudhistira Erlandinata <yerlandinata@chromium.org>
Co-developed-by: becker hsieh <beckerh@chromium.org>
Co-developed-by: Harvey Yang <chenghaoyang@chromium.org>
---
 include/libcamera/controls.h        |  6 +++++
 src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++
 src/libcamera/controls.cpp          |  6 +++++
 3 files changed, 54 insertions(+)

Comments

Jacopo Mondi Aug. 28, 2024, 12:38 p.m. UTC | #1
Hello

On Fri, Aug 23, 2024 at 02:29:09PM GMT, Harvey Yang wrote:
> From: Yudhistira Erlandinata <yerlandinata@chromium.org>
>
> Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores,
> and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting
> FaceDetectFaceLandmark.
>
> BUG=b:308713855
> TEST=emerge-geralt libcamera-mtkisp7

Please drop these

>
> Signed-off-by: Yudhistira Erlandinata <yerlandinata@chromium.org>
> Co-developed-by: becker hsieh <beckerh@chromium.org>
> Co-developed-by: Harvey Yang <chenghaoyang@chromium.org>
> ---
>  include/libcamera/controls.h        |  6 +++++
>  src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++
>  src/libcamera/controls.cpp          |  6 +++++
>  3 files changed, 54 insertions(+)
>
> diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h
> index 7c2bb2872..bf1b8609c 100644
> --- a/include/libcamera/controls.h
> +++ b/include/libcamera/controls.h
> @@ -34,6 +34,7 @@ enum ControlType {
>  	ControlTypeString,
>  	ControlTypeRectangle,
>  	ControlTypeSize,
> +	ControlTypePoint,
>  };
>
>  namespace details {
> @@ -87,6 +88,11 @@ struct control_type<Size> {
>  	static constexpr ControlType value = ControlTypeSize;
>  };
>
> +template<>
> +struct control_type<Point> {
> +	static constexpr ControlType value = ControlTypePoint;
> +};
> +
>  template<typename T, std::size_t N>
>  struct control_type<Span<T, N>> : public control_type<std::remove_cv_t<T>> {
>  };
> diff --git a/src/libcamera/control_ids_core.yaml b/src/libcamera/control_ids_core.yaml
> index 6381970b4..5ff46c71e 100644
> --- a/src/libcamera/control_ids_core.yaml
> +++ b/src/libcamera/control_ids_core.yaml

I wonder if we can already made these core controls or as long as we
don't have more users they should in control_ids_draft

> @@ -860,6 +860,48 @@ controls:
>              No further state changes or lens movements will occur until the
>              AfPauseResume control is sent.
>
> +  - FaceDetectMode:
> +      type: uint8_t
> +      description: |
> +        Reporting mode of face detection.

I presume this is a metadata only control (iow can only be reported by
the library to the application). If that's the case this should be
made explicit.

We don't do so consistently in our controls definition unfortunately,
but some controls already specify that, like, in example,
ColorTemperature:

        The ColourTemperature control can only be returned in metadata.

I would do the same here if that's the case.

However, reading your next patch, this control doesn't seem to be for
reporting only, but can also be set by applications to control how the
pipeline behaves. That's not what the description says.


> +
> +      enum:
> +        - name: FaceDetectModeOff
> +          value: 0
> +          description: |
> +            Pipeline should not report face detection result.
> +        - name: FaceDetectModeSimple
> +          value: 1
> +          description: |
> +            Pipeline should at least report faces boundary rectangles and

Why "at least" ?

> +            confidence score for each of them.

How ? Please describe here how these information are reported,
mentioning explicitly

                \sa FaceDetectFaceRectangles
                \sa FaceDetectFaceScores

I presume it is expected from the pipline handlers to report two lists
of the same sizes of rectangles and scores.

Please keep in mind this documentation should be usable for other
pipeline handler implementers to implement face detection and from
application to understand how to interpret the results.

> +
> +  - FaceDetectFaceRectangles:
> +      type: Rectangle
> +      description: |
> +        Boundary rectangle of the detected faces.

This is reported only when FaceDetectMode is set to
FaceDetectModeSimple I presume. It has to be specified. If it's a
metadata only control the above suggestion of saying that explicitly
applies.

> +
> +      size: [n]
> +
> +  - FaceDetectFaceScores:
> +      type: uint8_t
> +      description: |
> +        Confidence score of each of the detected faces by face detector.
> +        The range of score is [0, 100], but usually those with low confidence
> +        scores will not be included.

I think you should eitehr clarify what determines what "low confidence" is
or drop that part, as it's a pipeline handler specific decision.

I presume a score needs a rectangle assigned, please mention that.

> +        Currently identical to ANDROID_STATISTICS_FACE_SCORES.

This makes me think all of this should go in draft controls, where we
have collected controls whose definition comes from the Android ones.

> +
> +      size: [n]
> +
> +  - FaceDetectFaceLandmark:
> +      type: Point
> +      description: |
> +        Array of human face landmark coordinates in format:
> +        [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...],
> +        with i = index of face.

This is quite vague I'm afraid.

Which are the face landmarks ? eyes and mouth ? Why not the nose ? A more
sophisticated mechanism is required to support this and have it usable
for multiple platforms I'm afraid. If this is only useful for Android and
its description comes from an Android control, move it to draf where
we can be slighly more liberal in defining things, as draf controls
are by definition not expected to be stable.

> +
> +      size: [n]
> +
>    - HdrMode:
>        type: int32_t
>        description: |
> diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp
> index 11d35321c..ce73ae9d7 100644
> --- a/src/libcamera/controls.cpp
> +++ b/src/libcamera/controls.cpp
> @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = {
>  	[ControlTypeString]		= sizeof(char),
>  	[ControlTypeRectangle]		= sizeof(Rectangle),
>  	[ControlTypeSize]		= sizeof(Size),
> +	[ControlTypePoint]		= sizeof(Point),
>  };
>
>  } /* namespace */
> @@ -255,6 +256,11 @@ std::string ControlValue::toString() const
>  			str += value->toString();
>  			break;
>  		}
> +		case ControlTypePoint: {
> +			const Point *value = reinterpret_cast<const Point *>(data);
> +			str += value->toString();
> +			break;
> +		}
>  		case ControlTypeNone:
>  		case ControlTypeString:
>  			break;
> --
> 2.46.0.295.g3b9ea8a38a-goog
>
Cheng-Hao Yang Aug. 30, 2024, 9:03 p.m. UTC | #2
Thanks Jacopo,

On Wed, Aug 28, 2024 at 2:39 PM Jacopo Mondi <jacopo.mondi@ideasonboard.com>
wrote:

> Hello
>
> On Fri, Aug 23, 2024 at 02:29:09PM GMT, Harvey Yang wrote:
> > From: Yudhistira Erlandinata <yerlandinata@chromium.org>
> >
> > Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores,
> > and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting
> > FaceDetectFaceLandmark.
> >
> > BUG=b:308713855
> > TEST=emerge-geralt libcamera-mtkisp7
>
> Please drop these
>
Removed.


>
> >
> > Signed-off-by: Yudhistira Erlandinata <yerlandinata@chromium.org>
> > Co-developed-by: becker hsieh <beckerh@chromium.org>
> > Co-developed-by: Harvey Yang <chenghaoyang@chromium.org>
> > ---
> >  include/libcamera/controls.h        |  6 +++++
> >  src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++
> >  src/libcamera/controls.cpp          |  6 +++++
> >  3 files changed, 54 insertions(+)
> >
> > diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h
> > index 7c2bb2872..bf1b8609c 100644
> > --- a/include/libcamera/controls.h
> > +++ b/include/libcamera/controls.h
> > @@ -34,6 +34,7 @@ enum ControlType {
> >       ControlTypeString,
> >       ControlTypeRectangle,
> >       ControlTypeSize,
> > +     ControlTypePoint,
> >  };
> >
> >  namespace details {
> > @@ -87,6 +88,11 @@ struct control_type<Size> {
> >       static constexpr ControlType value = ControlTypeSize;
> >  };
> >
> > +template<>
> > +struct control_type<Point> {
> > +     static constexpr ControlType value = ControlTypePoint;
> > +};
> > +
> >  template<typename T, std::size_t N>
> >  struct control_type<Span<T, N>> : public
> control_type<std::remove_cv_t<T>> {
> >  };
> > diff --git a/src/libcamera/control_ids_core.yaml
> b/src/libcamera/control_ids_core.yaml
> > index 6381970b4..5ff46c71e 100644
> > --- a/src/libcamera/control_ids_core.yaml
> > +++ b/src/libcamera/control_ids_core.yaml
>
> I wonder if we can already made these core controls or as long as we
> don't have more users they should in control_ids_draft
>
> > @@ -860,6 +860,48 @@ controls:
> >              No further state changes or lens movements will occur until
> the
> >              AfPauseResume control is sent.
> >
> > +  - FaceDetectMode:
> > +      type: uint8_t
> > +      description: |
> > +        Reporting mode of face detection.
>
> I presume this is a metadata only control (iow can only be reported by
> the library to the application). If that's the case this should be
> made explicit.
>
> We don't do so consistently in our controls definition unfortunately,
> but some controls already specify that, like, in example,
> ColorTemperature:
>
>         The ColourTemperature control can only be returned in metadata.
>
> I would do the same here if that's the case.
>
> However, reading your next patch, this control doesn't seem to be for
> reporting only, but can also be set by applications to control how the
> pipeline behaves. That's not what the description says.
>

Right, in Android adapter, HAL returns a list of available
FaceDetectMode in static metadata. Application then sets a mode it
desires per-framely in controls. HAL returns the final mode that it
supports per-framely in the result metadata.

Therefore, I'll keep it this way :)


>
>
> > +
> > +      enum:
> > +        - name: FaceDetectModeOff
> > +          value: 0
> > +          description: |
> > +            Pipeline should not report face detection result.
> > +        - name: FaceDetectModeSimple
> > +          value: 1
> > +          description: |
> > +            Pipeline should at least report faces boundary rectangles
> and
>
> Why "at least" ?
>

In Android, there's another Full mode [1] that includes face landmarks and
face IDs. I think Yudhis didn't list Full mode here because mtkisp7 doesn't
support face IDs in their face detector lib/hardware...

Android CTS doesn't seem to care if more metadata is returned, when not
asked for though.

We can also list FULL mode here, and let mtkisp7 pipeline handler returns
Off & Simple modes in the static metadata. WDYT?

[1]:
https://developer.android.com/reference/android/hardware/camera2/CameraMetadata#STATISTICS_FACE_DETECT_MODE_FULL


>
> > +            confidence score for each of them.
>
> How ? Please describe here how these information are reported,
> mentioning explicitly
>
>                 \sa FaceDetectFaceRectangles
>                 \sa FaceDetectFaceScores
>
> I presume it is expected from the pipline handlers to report two lists
> of the same sizes of rectangles and scores.
>

Exactly. Thanks! Updated, please check again.


>
> Please keep in mind this documentation should be usable for other
> pipeline handler implementers to implement face detection and from
> application to understand how to interpret the results.
>

Right, I'll keep that in mind.


>
> > +
> > +  - FaceDetectFaceRectangles:
> > +      type: Rectangle
> > +      description: |
> > +        Boundary rectangle of the detected faces.
>
> This is reported only when FaceDetectMode is set to
> FaceDetectModeSimple I presume. It has to be specified. If it's a
> metadata only control the above suggestion of saying that explicitly
> applies.
>

Actually, as I mentioned above, Android doesn't seem to complain
about extra metadata. Therefore, it's `when` instead of `only when`.
Updated.


>
> > +
> > +      size: [n]
> > +
> > +  - FaceDetectFaceScores:
> > +      type: uint8_t
> > +      description: |
> > +        Confidence score of each of the detected faces by face detector.
> > +        The range of score is [0, 100], but usually those with low
> confidence
> > +        scores will not be included.
>
> I think you should eitehr clarify what determines what "low confidence" is
> or drop that part, as it's a pipeline handler specific decision.
>
Right, dropped.


>
> I presume a score needs a rectangle assigned, please mention that.
>
Do we need to mention that the size of faces in each metadata should
align in every control id involved? How about just mentioning once in
FaceDetectModeSimple?


>
> > +        Currently identical to ANDROID_STATISTICS_FACE_SCORES.
>
> This makes me think all of this should go in draft controls, where we
> have collected controls whose definition comes from the Android ones.
>

Right, moved to the draft ids.


>
> > +
> > +      size: [n]
> > +
> > +  - FaceDetectFaceLandmark:
> > +      type: Point
> > +      description: |
> > +        Array of human face landmark coordinates in format:
> > +        [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...],
> > +        with i = index of face.
>
> This is quite vague I'm afraid.
>
> Which are the face landmarks ? eyes and mouth ? Why not the nose ? A more
> sophisticated mechanism is required to support this and have it usable
> for multiple platforms I'm afraid. If this is only useful for Android and
> its description comes from an Android control, move it to draf where
> we can be slighly more liberal in defining things, as draf controls
> are by definition not expected to be stable.
>

Yes, it's to be aligned with Android control [2]. Moved to the draft ids.

[2]:
https://android.googlesource.com/platform/system/media/+/refs/heads/main/camera/docs/docs.html#32906


>
> > +
> > +      size: [n]
> > +
> >    - HdrMode:
> >        type: int32_t
> >        description: |
> > diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp
> > index 11d35321c..ce73ae9d7 100644
> > --- a/src/libcamera/controls.cpp
> > +++ b/src/libcamera/controls.cpp
> > @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = {
> >       [ControlTypeString]             = sizeof(char),
> >       [ControlTypeRectangle]          = sizeof(Rectangle),
> >       [ControlTypeSize]               = sizeof(Size),
> > +     [ControlTypePoint]              = sizeof(Point),
> >  };
> >
> >  } /* namespace */
> > @@ -255,6 +256,11 @@ std::string ControlValue::toString() const
> >                       str += value->toString();
> >                       break;
> >               }
> > +             case ControlTypePoint: {
> > +                     const Point *value = reinterpret_cast<const Point
> *>(data);
> > +                     str += value->toString();
> > +                     break;
> > +             }
> >               case ControlTypeNone:
> >               case ControlTypeString:
> >                       break;
> > --
> > 2.46.0.295.g3b9ea8a38a-goog
> >
>

Patch
diff mbox series

diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h
index 7c2bb2872..bf1b8609c 100644
--- a/include/libcamera/controls.h
+++ b/include/libcamera/controls.h
@@ -34,6 +34,7 @@  enum ControlType {
 	ControlTypeString,
 	ControlTypeRectangle,
 	ControlTypeSize,
+	ControlTypePoint,
 };
 
 namespace details {
@@ -87,6 +88,11 @@  struct control_type<Size> {
 	static constexpr ControlType value = ControlTypeSize;
 };
 
+template<>
+struct control_type<Point> {
+	static constexpr ControlType value = ControlTypePoint;
+};
+
 template<typename T, std::size_t N>
 struct control_type<Span<T, N>> : public control_type<std::remove_cv_t<T>> {
 };
diff --git a/src/libcamera/control_ids_core.yaml b/src/libcamera/control_ids_core.yaml
index 6381970b4..5ff46c71e 100644
--- a/src/libcamera/control_ids_core.yaml
+++ b/src/libcamera/control_ids_core.yaml
@@ -860,6 +860,48 @@  controls:
             No further state changes or lens movements will occur until the
             AfPauseResume control is sent.
 
+  - FaceDetectMode:
+      type: uint8_t
+      description: |
+        Reporting mode of face detection.
+
+      enum:
+        - name: FaceDetectModeOff
+          value: 0
+          description: |
+            Pipeline should not report face detection result.
+        - name: FaceDetectModeSimple
+          value: 1
+          description: |
+            Pipeline should at least report faces boundary rectangles and
+            confidence score for each of them.
+
+  - FaceDetectFaceRectangles:
+      type: Rectangle
+      description: |
+        Boundary rectangle of the detected faces.
+
+      size: [n]
+
+  - FaceDetectFaceScores:
+      type: uint8_t
+      description: |
+        Confidence score of each of the detected faces by face detector.
+        The range of score is [0, 100], but usually those with low confidence
+        scores will not be included.
+        Currently identical to ANDROID_STATISTICS_FACE_SCORES.
+
+      size: [n]
+
+  - FaceDetectFaceLandmark:
+      type: Point
+      description: |
+        Array of human face landmark coordinates in format:
+        [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...],
+        with i = index of face.
+
+      size: [n]
+
   - HdrMode:
       type: int32_t
       description: |
diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp
index 11d35321c..ce73ae9d7 100644
--- a/src/libcamera/controls.cpp
+++ b/src/libcamera/controls.cpp
@@ -61,6 +61,7 @@  static constexpr size_t ControlValueSize[] = {
 	[ControlTypeString]		= sizeof(char),
 	[ControlTypeRectangle]		= sizeof(Rectangle),
 	[ControlTypeSize]		= sizeof(Size),
+	[ControlTypePoint]		= sizeof(Point),
 };
 
 } /* namespace */
@@ -255,6 +256,11 @@  std::string ControlValue::toString() const
 			str += value->toString();
 			break;
 		}
+		case ControlTypePoint: {
+			const Point *value = reinterpret_cast<const Point *>(data);
+			str += value->toString();
+			break;
+		}
 		case ControlTypeNone:
 		case ControlTypeString:
 			break;