Message ID | 20240823143205.2668765-3-chenghaoyang@google.com |
---|---|
State | Superseded |
Headers | show |
Series |
|
Related | show |
Hello On Fri, Aug 23, 2024 at 02:29:09PM GMT, Harvey Yang wrote: > From: Yudhistira Erlandinata <yerlandinata@chromium.org> > > Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores, > and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting > FaceDetectFaceLandmark. > > BUG=b:308713855 > TEST=emerge-geralt libcamera-mtkisp7 Please drop these > > Signed-off-by: Yudhistira Erlandinata <yerlandinata@chromium.org> > Co-developed-by: becker hsieh <beckerh@chromium.org> > Co-developed-by: Harvey Yang <chenghaoyang@chromium.org> > --- > include/libcamera/controls.h | 6 +++++ > src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++ > src/libcamera/controls.cpp | 6 +++++ > 3 files changed, 54 insertions(+) > > diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h > index 7c2bb2872..bf1b8609c 100644 > --- a/include/libcamera/controls.h > +++ b/include/libcamera/controls.h > @@ -34,6 +34,7 @@ enum ControlType { > ControlTypeString, > ControlTypeRectangle, > ControlTypeSize, > + ControlTypePoint, > }; > > namespace details { > @@ -87,6 +88,11 @@ struct control_type<Size> { > static constexpr ControlType value = ControlTypeSize; > }; > > +template<> > +struct control_type<Point> { > + static constexpr ControlType value = ControlTypePoint; > +}; > + > template<typename T, std::size_t N> > struct control_type<Span<T, N>> : public control_type<std::remove_cv_t<T>> { > }; > diff --git a/src/libcamera/control_ids_core.yaml b/src/libcamera/control_ids_core.yaml > index 6381970b4..5ff46c71e 100644 > --- a/src/libcamera/control_ids_core.yaml > +++ b/src/libcamera/control_ids_core.yaml I wonder if we can already made these core controls or as long as we don't have more users they should in control_ids_draft > @@ -860,6 +860,48 @@ controls: > No further state changes or lens movements will occur until the > AfPauseResume control is sent. > > + - FaceDetectMode: > + type: uint8_t > + description: | > + Reporting mode of face detection. I presume this is a metadata only control (iow can only be reported by the library to the application). If that's the case this should be made explicit. We don't do so consistently in our controls definition unfortunately, but some controls already specify that, like, in example, ColorTemperature: The ColourTemperature control can only be returned in metadata. I would do the same here if that's the case. However, reading your next patch, this control doesn't seem to be for reporting only, but can also be set by applications to control how the pipeline behaves. That's not what the description says. > + > + enum: > + - name: FaceDetectModeOff > + value: 0 > + description: | > + Pipeline should not report face detection result. > + - name: FaceDetectModeSimple > + value: 1 > + description: | > + Pipeline should at least report faces boundary rectangles and Why "at least" ? > + confidence score for each of them. How ? Please describe here how these information are reported, mentioning explicitly \sa FaceDetectFaceRectangles \sa FaceDetectFaceScores I presume it is expected from the pipline handlers to report two lists of the same sizes of rectangles and scores. Please keep in mind this documentation should be usable for other pipeline handler implementers to implement face detection and from application to understand how to interpret the results. > + > + - FaceDetectFaceRectangles: > + type: Rectangle > + description: | > + Boundary rectangle of the detected faces. This is reported only when FaceDetectMode is set to FaceDetectModeSimple I presume. It has to be specified. If it's a metadata only control the above suggestion of saying that explicitly applies. > + > + size: [n] > + > + - FaceDetectFaceScores: > + type: uint8_t > + description: | > + Confidence score of each of the detected faces by face detector. > + The range of score is [0, 100], but usually those with low confidence > + scores will not be included. I think you should eitehr clarify what determines what "low confidence" is or drop that part, as it's a pipeline handler specific decision. I presume a score needs a rectangle assigned, please mention that. > + Currently identical to ANDROID_STATISTICS_FACE_SCORES. This makes me think all of this should go in draft controls, where we have collected controls whose definition comes from the Android ones. > + > + size: [n] > + > + - FaceDetectFaceLandmark: > + type: Point > + description: | > + Array of human face landmark coordinates in format: > + [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...], > + with i = index of face. This is quite vague I'm afraid. Which are the face landmarks ? eyes and mouth ? Why not the nose ? A more sophisticated mechanism is required to support this and have it usable for multiple platforms I'm afraid. If this is only useful for Android and its description comes from an Android control, move it to draf where we can be slighly more liberal in defining things, as draf controls are by definition not expected to be stable. > + > + size: [n] > + > - HdrMode: > type: int32_t > description: | > diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp > index 11d35321c..ce73ae9d7 100644 > --- a/src/libcamera/controls.cpp > +++ b/src/libcamera/controls.cpp > @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = { > [ControlTypeString] = sizeof(char), > [ControlTypeRectangle] = sizeof(Rectangle), > [ControlTypeSize] = sizeof(Size), > + [ControlTypePoint] = sizeof(Point), > }; > > } /* namespace */ > @@ -255,6 +256,11 @@ std::string ControlValue::toString() const > str += value->toString(); > break; > } > + case ControlTypePoint: { > + const Point *value = reinterpret_cast<const Point *>(data); > + str += value->toString(); > + break; > + } > case ControlTypeNone: > case ControlTypeString: > break; > -- > 2.46.0.295.g3b9ea8a38a-goog >
Thanks Jacopo, On Wed, Aug 28, 2024 at 2:39 PM Jacopo Mondi <jacopo.mondi@ideasonboard.com> wrote: > Hello > > On Fri, Aug 23, 2024 at 02:29:09PM GMT, Harvey Yang wrote: > > From: Yudhistira Erlandinata <yerlandinata@chromium.org> > > > > Add FaceDetectMode, FaceDetectFaceRectangles, FaceDetectFaceScores, > > and FaceDetectFaceLandmark. Also add ControlTypePoint for supporting > > FaceDetectFaceLandmark. > > > > BUG=b:308713855 > > TEST=emerge-geralt libcamera-mtkisp7 > > Please drop these > Removed. > > > > > Signed-off-by: Yudhistira Erlandinata <yerlandinata@chromium.org> > > Co-developed-by: becker hsieh <beckerh@chromium.org> > > Co-developed-by: Harvey Yang <chenghaoyang@chromium.org> > > --- > > include/libcamera/controls.h | 6 +++++ > > src/libcamera/control_ids_core.yaml | 42 +++++++++++++++++++++++++++++ > > src/libcamera/controls.cpp | 6 +++++ > > 3 files changed, 54 insertions(+) > > > > diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h > > index 7c2bb2872..bf1b8609c 100644 > > --- a/include/libcamera/controls.h > > +++ b/include/libcamera/controls.h > > @@ -34,6 +34,7 @@ enum ControlType { > > ControlTypeString, > > ControlTypeRectangle, > > ControlTypeSize, > > + ControlTypePoint, > > }; > > > > namespace details { > > @@ -87,6 +88,11 @@ struct control_type<Size> { > > static constexpr ControlType value = ControlTypeSize; > > }; > > > > +template<> > > +struct control_type<Point> { > > + static constexpr ControlType value = ControlTypePoint; > > +}; > > + > > template<typename T, std::size_t N> > > struct control_type<Span<T, N>> : public > control_type<std::remove_cv_t<T>> { > > }; > > diff --git a/src/libcamera/control_ids_core.yaml > b/src/libcamera/control_ids_core.yaml > > index 6381970b4..5ff46c71e 100644 > > --- a/src/libcamera/control_ids_core.yaml > > +++ b/src/libcamera/control_ids_core.yaml > > I wonder if we can already made these core controls or as long as we > don't have more users they should in control_ids_draft > > > @@ -860,6 +860,48 @@ controls: > > No further state changes or lens movements will occur until > the > > AfPauseResume control is sent. > > > > + - FaceDetectMode: > > + type: uint8_t > > + description: | > > + Reporting mode of face detection. > > I presume this is a metadata only control (iow can only be reported by > the library to the application). If that's the case this should be > made explicit. > > We don't do so consistently in our controls definition unfortunately, > but some controls already specify that, like, in example, > ColorTemperature: > > The ColourTemperature control can only be returned in metadata. > > I would do the same here if that's the case. > > However, reading your next patch, this control doesn't seem to be for > reporting only, but can also be set by applications to control how the > pipeline behaves. That's not what the description says. > Right, in Android adapter, HAL returns a list of available FaceDetectMode in static metadata. Application then sets a mode it desires per-framely in controls. HAL returns the final mode that it supports per-framely in the result metadata. Therefore, I'll keep it this way :) > > > > + > > + enum: > > + - name: FaceDetectModeOff > > + value: 0 > > + description: | > > + Pipeline should not report face detection result. > > + - name: FaceDetectModeSimple > > + value: 1 > > + description: | > > + Pipeline should at least report faces boundary rectangles > and > > Why "at least" ? > In Android, there's another Full mode [1] that includes face landmarks and face IDs. I think Yudhis didn't list Full mode here because mtkisp7 doesn't support face IDs in their face detector lib/hardware... Android CTS doesn't seem to care if more metadata is returned, when not asked for though. We can also list FULL mode here, and let mtkisp7 pipeline handler returns Off & Simple modes in the static metadata. WDYT? [1]: https://developer.android.com/reference/android/hardware/camera2/CameraMetadata#STATISTICS_FACE_DETECT_MODE_FULL > > > + confidence score for each of them. > > How ? Please describe here how these information are reported, > mentioning explicitly > > \sa FaceDetectFaceRectangles > \sa FaceDetectFaceScores > > I presume it is expected from the pipline handlers to report two lists > of the same sizes of rectangles and scores. > Exactly. Thanks! Updated, please check again. > > Please keep in mind this documentation should be usable for other > pipeline handler implementers to implement face detection and from > application to understand how to interpret the results. > Right, I'll keep that in mind. > > > + > > + - FaceDetectFaceRectangles: > > + type: Rectangle > > + description: | > > + Boundary rectangle of the detected faces. > > This is reported only when FaceDetectMode is set to > FaceDetectModeSimple I presume. It has to be specified. If it's a > metadata only control the above suggestion of saying that explicitly > applies. > Actually, as I mentioned above, Android doesn't seem to complain about extra metadata. Therefore, it's `when` instead of `only when`. Updated. > > > + > > + size: [n] > > + > > + - FaceDetectFaceScores: > > + type: uint8_t > > + description: | > > + Confidence score of each of the detected faces by face detector. > > + The range of score is [0, 100], but usually those with low > confidence > > + scores will not be included. > > I think you should eitehr clarify what determines what "low confidence" is > or drop that part, as it's a pipeline handler specific decision. > Right, dropped. > > I presume a score needs a rectangle assigned, please mention that. > Do we need to mention that the size of faces in each metadata should align in every control id involved? How about just mentioning once in FaceDetectModeSimple? > > > + Currently identical to ANDROID_STATISTICS_FACE_SCORES. > > This makes me think all of this should go in draft controls, where we > have collected controls whose definition comes from the Android ones. > Right, moved to the draft ids. > > > + > > + size: [n] > > + > > + - FaceDetectFaceLandmark: > > + type: Point > > + description: | > > + Array of human face landmark coordinates in format: > > + [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...], > > + with i = index of face. > > This is quite vague I'm afraid. > > Which are the face landmarks ? eyes and mouth ? Why not the nose ? A more > sophisticated mechanism is required to support this and have it usable > for multiple platforms I'm afraid. If this is only useful for Android and > its description comes from an Android control, move it to draf where > we can be slighly more liberal in defining things, as draf controls > are by definition not expected to be stable. > Yes, it's to be aligned with Android control [2]. Moved to the draft ids. [2]: https://android.googlesource.com/platform/system/media/+/refs/heads/main/camera/docs/docs.html#32906 > > > + > > + size: [n] > > + > > - HdrMode: > > type: int32_t > > description: | > > diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp > > index 11d35321c..ce73ae9d7 100644 > > --- a/src/libcamera/controls.cpp > > +++ b/src/libcamera/controls.cpp > > @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = { > > [ControlTypeString] = sizeof(char), > > [ControlTypeRectangle] = sizeof(Rectangle), > > [ControlTypeSize] = sizeof(Size), > > + [ControlTypePoint] = sizeof(Point), > > }; > > > > } /* namespace */ > > @@ -255,6 +256,11 @@ std::string ControlValue::toString() const > > str += value->toString(); > > break; > > } > > + case ControlTypePoint: { > > + const Point *value = reinterpret_cast<const Point > *>(data); > > + str += value->toString(); > > + break; > > + } > > case ControlTypeNone: > > case ControlTypeString: > > break; > > -- > > 2.46.0.295.g3b9ea8a38a-goog > > >
diff --git a/include/libcamera/controls.h b/include/libcamera/controls.h index 7c2bb2872..bf1b8609c 100644 --- a/include/libcamera/controls.h +++ b/include/libcamera/controls.h @@ -34,6 +34,7 @@ enum ControlType { ControlTypeString, ControlTypeRectangle, ControlTypeSize, + ControlTypePoint, }; namespace details { @@ -87,6 +88,11 @@ struct control_type<Size> { static constexpr ControlType value = ControlTypeSize; }; +template<> +struct control_type<Point> { + static constexpr ControlType value = ControlTypePoint; +}; + template<typename T, std::size_t N> struct control_type<Span<T, N>> : public control_type<std::remove_cv_t<T>> { }; diff --git a/src/libcamera/control_ids_core.yaml b/src/libcamera/control_ids_core.yaml index 6381970b4..5ff46c71e 100644 --- a/src/libcamera/control_ids_core.yaml +++ b/src/libcamera/control_ids_core.yaml @@ -860,6 +860,48 @@ controls: No further state changes or lens movements will occur until the AfPauseResume control is sent. + - FaceDetectMode: + type: uint8_t + description: | + Reporting mode of face detection. + + enum: + - name: FaceDetectModeOff + value: 0 + description: | + Pipeline should not report face detection result. + - name: FaceDetectModeSimple + value: 1 + description: | + Pipeline should at least report faces boundary rectangles and + confidence score for each of them. + + - FaceDetectFaceRectangles: + type: Rectangle + description: | + Boundary rectangle of the detected faces. + + size: [n] + + - FaceDetectFaceScores: + type: uint8_t + description: | + Confidence score of each of the detected faces by face detector. + The range of score is [0, 100], but usually those with low confidence + scores will not be included. + Currently identical to ANDROID_STATISTICS_FACE_SCORES. + + size: [n] + + - FaceDetectFaceLandmark: + type: Point + description: | + Array of human face landmark coordinates in format: + [..., left_eye_i, right_eye_i, mouth_i, left_eye_i+1, ...], + with i = index of face. + + size: [n] + - HdrMode: type: int32_t description: | diff --git a/src/libcamera/controls.cpp b/src/libcamera/controls.cpp index 11d35321c..ce73ae9d7 100644 --- a/src/libcamera/controls.cpp +++ b/src/libcamera/controls.cpp @@ -61,6 +61,7 @@ static constexpr size_t ControlValueSize[] = { [ControlTypeString] = sizeof(char), [ControlTypeRectangle] = sizeof(Rectangle), [ControlTypeSize] = sizeof(Size), + [ControlTypePoint] = sizeof(Point), }; } /* namespace */ @@ -255,6 +256,11 @@ std::string ControlValue::toString() const str += value->toString(); break; } + case ControlTypePoint: { + const Point *value = reinterpret_cast<const Point *>(data); + str += value->toString(); + break; + } case ControlTypeNone: case ControlTypeString: break;