| Message ID | 20260121173737.376113-5-kieran.bingham@ideasonboard.com |
|---|---|
| State | New |
|
Hi Kieran, Quoting Kieran Bingham (2026-01-21 18:37:23) > Extend the new Quantized type infrastructure by providing a > FixedPointQTraits template. > > This allows construction of fixed point types with a Quantized storage > that allows easy reading of both the underlying quantized type value and > a floating point representation of that same value. > > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com> > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > --- > v4: > - Assert that the given type has enough bits for the usage > - Use unsigned types for calculating qmin/qmax > - Reorder toFloat/fromFloat and min/max for future inlining > - Make toFloat and fromFloat constexpr > > v5: > - Make UT, Bits and Bitmask private (and remove doxygen) > - Remove constexpr from fromFloat which uses std::round (only constexpr > in C++23) > - static_assert that min<max when converted > - Provide new Q and UQ automatic width types (Thanks Barnabás) > - Convert types to shortened Q/UQ automatic widths > - Use automatic width Q/UQ for 12,4 > - change qmin->qMin qmax->qMax Bits->bits BitMask->bitMask > - Remove typedefs for Q1_7 etc > > v6: > - Use 'quantized' over 'quantised' > - Document sign is based on T and number of bits includes sign bit > > - Document that fromFloat also clamps between [min, max] > > - Remove 64 bit support. 
We have 32 bit assumptions on fromFloat
>
> - Restrict to 24 bits, to stay compatible with float types
>
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> ---
>  src/ipa/libipa/fixedpoint.cpp | 93 +++++++++++++++++++++++++++++++++++
>  src/ipa/libipa/fixedpoint.h | 74 ++++++++++++++++++++++++++++
>  2 files changed, 167 insertions(+)
>
> diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp
> index 6b698fc5d680..c9d04e31e4df 100644
> --- a/src/ipa/libipa/fixedpoint.cpp
> +++ b/src/ipa/libipa/fixedpoint.cpp
> @@ -37,6 +37,99 @@ namespace ipa {
>   * \return The converted value
>   */
>
> +/**
> + * \struct libcamera::ipa::FixedPointQTraits
> + * \brief Traits type implementing fixed-point quantisation conversions

Nit: I saw in the log that quantized is preferred over quantised.

> + *
> + * The FixedPointQTraits structure defines a policy for mapping floating-point
> + * values to and from fixed-point integer representations. It is parameterised
> + * by the number of integer bits \a I, fractional bits \a F, and the integral
> + * storage type \a T. The traits are used with Quantized<Traits> to create a
> + * quantized type that stores both the fixed-point representation and the
> + * corresponding floating-point value.
> + *
> + * The signedness of the type is determined by the signedness of \a T. For
> + * signed types, the number of integer bits in \a I includes the sign bit.

So it took me a while to accept that it is a great idea to represent a
signed fixed point as signed int. In all the hardware/register
interfaces I remember, the registers were always represented as unsigned
types. So I tried to break it...

uint16_t myReg;
using T=Q<6,10>;
T q(T::TraitsType::min);
myReg = q.quantized();

... and failed miserably :-)

So all in all I think this is working great.

Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com>

Cheers,
Stefan

> + *
> + * The trait exposes compile-time constants describing the bit layout, limits,
> + * and scaling factors used in the fixed-point representation.
> + *
> + * \tparam I Number of integer bits
> + * \tparam F Number of fractional bits
> + * \tparam T Integral type used to store the quantized value
> + */
> +
> +/**
> + * \typedef FixedPointQTraits::QuantizedType
> + * \brief The integral storage type used for the fixed-point representation
> + */
> +
> +/**
> + * \var FixedPointQTraits::qMin
> + * \brief Minimum representable quantized integer value
> + *
> + * This corresponds to the most negative value for signed formats or zero for
> + * unsigned formats.
> + */
> +
> +/**
> + * \var FixedPointQTraits::qMax
> + * \brief Maximum representable quantized integer value
> + */
> +
> +/**
> + * \var FixedPointQTraits::min
> + * \brief Minimum representable floating-point value corresponding to qMin
> + */
> +
> +/**
> + * \var FixedPointQTraits::max
> + * \brief Maximum representable floating-point value corresponding to qMax
> + */
> +
> +/**
> + * \fn FixedPointQTraits::fromFloat(float v)
> + * \brief Convert a floating-point value to a fixed-point integer
> + * \param[in] v The floating-point value to be converted
> + * \return The quantized fixed-point integer representation
> + *
> + * The conversion first clamps the floating-point input \a v to the range [min,
> + * max] and then rounds it to the nearest integer according to the scaling
> + * factor defined by the number of fractional bits F.
> + */ > + > +/** > + * \fn FixedPointQTraits::toFloat(QuantizedType q) > + * \brief Convert a fixed-point integer to a floating-point value > + * \param[in] q The fixed-point integer value to be converted > + * \return The corresponding floating-point value > + * > + * The conversion sign-extends the integer value if required and divides by the > + * scaling factor defined by the number of fractional bits F. > + */ > + > +/** > + * \typedef Q > + * \brief Define a signed fixed-point quantized type with automatic storage width > + * \tparam I The number of integer bits > + * \tparam F The number of fractional bits > + * > + * This alias defines a signed fixed-point quantized type using the > + * \ref FixedPointQTraits trait and a suitable signed integer storage type > + * automatically selected based on the total number of bits \a (I + F). > + */ > + > +/** > + * \typedef UQ > + * \brief Define an unsigned fixed-point quantized type with automatic storage width > + * \tparam I The number of integer bits > + * \tparam F The number of fractional bits > + * > + * This alias defines an unsigned fixed-point quantized type using the > + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type > + * automatically selected based on the total number of bits \a (I + F). 
> + */ > + > } /* namespace ipa */ > > } /* namespace libcamera */ > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h > index 48a9757f9554..33d1f4af4792 100644 > --- a/src/ipa/libipa/fixedpoint.h > +++ b/src/ipa/libipa/fixedpoint.h > @@ -10,6 +10,8 @@ > #include <cmath> > #include <type_traits> > > +#include "quantized.h" > + > namespace libcamera { > > namespace ipa { > @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) > return static_cast<R>(t) / static_cast<R>(1 << F); > } > > +template<unsigned int I, unsigned int F, typename T> > +struct FixedPointQTraits { > +private: > + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); > + using UT = std::make_unsigned_t<T>; > + > + static constexpr unsigned int bits = I + F; > + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); > + > + /* > + * If fixed point storage is required with more than 24 bits, consider > + * updating this implementation to use double-precision floating point. > + */ > + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); > + > + static constexpr T bitMask = (bits < sizeof(T) * 8) > + ? static_cast<T>((UT{1} << bits) - 1) > + : static_cast<T>(~UT{0}); > + > +public: > + using QuantizedType = T; > + > + static constexpr T qMin = std::is_signed_v<T> > + ? static_cast<T>(-(UT{1} << (bits - 1))) > + : static_cast<T>(0); > + > + static constexpr T qMax = std::is_signed_v<T> > + ? 
static_cast<T>((UT{1} << (bits - 1)) - 1) > + : bitMask; > + > + static constexpr float toFloat(QuantizedType q) > + { > + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); > + } > + > + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); > + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); > + > + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); > + > + /* Conversion functions required by Quantized<Traits> */ > + static QuantizedType fromFloat(float v) > + { > + v = std::clamp(v, min, max); > + return floatingToFixedPoint<I, F, QuantizedType, float>(v); > + } > +}; > + > +namespace details { > + > +template<unsigned int Bits> > +constexpr auto qtype() > +{ > + static_assert(Bits <= 32, > + "Unsupported number of bits for quantized type"); > + > + if constexpr (Bits <= 8) > + return int8_t(); > + else if constexpr (Bits <= 16) > + return int16_t(); > + else if constexpr (Bits <= 32) > + return int32_t(); > +} > + > +} /* namespace details */ > + > +template<unsigned int I, unsigned int F> > +using Q = Quantized<FixedPointQTraits<I, F, decltype(details::qtype<I + F>())>>; > + > +template<unsigned int I, unsigned int F> > +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; > + > } /* namespace ipa */ > > } /* namespace libcamera */ > -- > 2.52.0 >
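For readers following the thread, the limits described in the quoted documentation can be worked through with a small stand-alone sketch. QLimits, Q4_4, UQ4_4, Q6_10 and checkQ44() below are hypothetical names used only for illustration; this is not the code from the patch, and it computes the limits with 64-bit signed arithmetic rather than the unsigned-type approach the series uses. The assumption carried over from the patch documentation is that I includes the sign bit for signed storage types.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

/* Hypothetical stand-in for the limits exposed by a Q<I, F> style trait. */
template<unsigned int I, unsigned int F, typename T>
struct QLimits {
	static_assert(std::is_integral_v<T>, "T must be integral");
	static constexpr unsigned int bits = I + F;
	static_assert(bits >= 1 && bits <= sizeof(T) * 8, "too many bits for T");
	static_assert(bits <= 24, "float cannot represent more than 24 bits exactly");

	static constexpr std::int64_t one = 1;

	/* Signed: -2^(bits-1). Unsigned: 0. */
	static constexpr T qMin = std::is_signed_v<T>
				? static_cast<T>(-(one << (bits - 1)))
				: static_cast<T>(0);

	/* Signed: 2^(bits-1) - 1. Unsigned: 2^bits - 1. */
	static constexpr T qMax = std::is_signed_v<T>
				? static_cast<T>((one << (bits - 1)) - 1)
				: static_cast<T>((one << bits) - 1);

	/* Floating-point limits: divide by the scaling factor 2^F. */
	static constexpr float min = static_cast<float>(qMin) / static_cast<float>(one << F);
	static constexpr float max = static_cast<float>(qMax) / static_cast<float>(one << F);
};

using Q4_4 = QLimits<4, 4, std::int8_t>;
using UQ4_4 = QLimits<4, 4, std::uint8_t>;
using Q6_10 = QLimits<6, 10, std::int16_t>;

/* Runtime spot check of the signed 4.4 format. */
inline void checkQ44()
{
	assert(Q4_4::qMin == -128);
	assert(Q4_4::qMax == 127);
	assert(Q4_4::min == -8.0f);
	assert(Q4_4::max == 7.9375f);
}
```

Under these assumptions a signed 4.4 format covers [-8, 7.9375] in steps of 1/16, while the unsigned 4.4 format covers [0, 15.9375].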
On Fri, Jan 23, 2026 at 11:09:22AM +0100, Stefan Klug wrote: > Quoting Kieran Bingham (2026-01-21 18:37:23) > > Extend the new Quantized type infrastructure by providing a > > FixedPointQTraits template. > > > > This allows construction of fixed point types with a Quantized storage > > that allows easy reading of both the underlying quantized type value and > > a floating point representation of that same value. > > > > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com> > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > > --- > > v4: > > - Assert that the given type has enough bits for the usage > > - Use unsigned types for calculating qmin/qmax > > - Reorder toFloat/fromFloat and min/max for future inlining > > - Make toFloat and fromFloat constexpr > > > > v5: > > - Make UT, Bits and Bitmask private (and remove doxygen) > > - Remove constexpr from fromFloat which uses std::round (only constexpr > > in C++23) > > - static_assert that min<max when converted > > - Provide new Q and UQ automatic width types (Thanks Barnabás) > > - Convert types to shortened Q/UQ automatic widths > > - Use automatic width Q/UQ for 12,4 > > - change qmin->qMin qmax->qMax Bits->bits BitMask->bitMask > > - Remove typedefs for Q1_7 etc > > > > v6: > > - Use 'quantized' over 'quantised' > > - Document sign is based on T and number of bits includes sign bit > > > > - Document that fromFloat also clamps between [min, max] > > > > - Remove 64 bit support. 
We have 32 bit assumptions on fromFloat > > > > - Restrict to 24 bits, to stay compatible with float types > > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > --- > > src/ipa/libipa/fixedpoint.cpp | 93 +++++++++++++++++++++++++++++++++++ > > src/ipa/libipa/fixedpoint.h | 74 ++++++++++++++++++++++++++++ > > 2 files changed, 167 insertions(+) > > > > diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp > > index 6b698fc5d680..c9d04e31e4df 100644 > > --- a/src/ipa/libipa/fixedpoint.cpp > > +++ b/src/ipa/libipa/fixedpoint.cpp > > @@ -37,6 +37,99 @@ namespace ipa { > > * \return The converted value > > */ > > > > +/** > > + * \struct libcamera::ipa::FixedPointQTraits > > + * \brief Traits type implementing fixed-point quantisation conversions > > Nit: I saw in the log that quantized is preferred over quantised. > > > + * > > + * The FixedPointQTraits structure defines a policy for mapping floating-point > > + * values to and from fixed-point integer representations. It is parameterised > > + * by the number of integer bits \a I, fractional bits \a F, and the integral > > + * storage type \a T. The traits are used with Quantized<Traits> to create a > > + * quantized type that stores both the fixed-point representation and the > > + * corresponding floating-point value. > > + * > > + * The signedness of the type is determined by the signedness of \a T. For > > + * signed types, the number of integer bits in \a I includes the sign bit. > > So it took me a while to accept that it is a great idea to represent a > signed fixed point as signed int. In all the hardware/register > interfaces I remember, the registers were always represented as unsigned > types. So I tried to break it... > > uint16_t myReg; > using T=Q<6,10>; > T q(T::TraitsType::min); > myReg = q.quantized(); > > ... and failed miserably :-) > > So all in all I think this is working great. 

uint32_t myReg;
using T=Q<4,4>;
T q(T::TraitsType::min);
myReg = q.quantized();

std::cout << "q: " << q << ", myReg: " << utils::hex(myReg) << std::endl;

produces

q: [0x80:-8], myReg: 0xffffff80

I would expect myReg to be 0x80.

> Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com>
>
> Cheers,
> Stefan
>
>
> > + *
> > + * The trait exposes compile-time constants describing the bit layout, limits,
> > + * and scaling factors used in the fixed-point representation.
> > + *
> > + * \tparam I Number of integer bits
> > + * \tparam F Number of fractional bits
> > + * \tparam T Integral type used to store the quantized value
> > + */
> > +
> > +/**
> > + * \typedef FixedPointQTraits::QuantizedType
> > + * \brief The integral storage type used for the fixed-point representation
> > + */
> > +
> > +/**
> > + * \var FixedPointQTraits::qMin
> > + * \brief Minimum representable quantized integer value
> > + *
> > + * This corresponds to the most negative value for signed formats or zero for
> > + * unsigned formats.
> > + */
> > +
> > +/**
> > + * \var FixedPointQTraits::qMax
> > + * \brief Maximum representable quantized integer value
> > + */
> > +
> > +/**
> > + * \var FixedPointQTraits::min
> > + * \brief Minimum representable floating-point value corresponding to qMin
> > + */
> > +
> > +/**
> > + * \var FixedPointQTraits::max
> > + * \brief Maximum representable floating-point value corresponding to qMax
> > + */
> > +
> > +/**
> > + * \fn FixedPointQTraits::fromFloat(float v)
> > + * \brief Convert a floating-point value to a fixed-point integer
> > + * \param[in] v The floating-point value to be converted
> > + * \return The quantized fixed-point integer representation
> > + *
> > + * The conversion first clamps the floating-point input \a v to the range [min,
> > + * max] and then rounds it to the nearest integer according to the scaling
> > + * factor defined by the number of fractional bits F.
> > + */ > > + > > +/** > > + * \fn FixedPointQTraits::toFloat(QuantizedType q) > > + * \brief Convert a fixed-point integer to a floating-point value > > + * \param[in] q The fixed-point integer value to be converted > > + * \return The corresponding floating-point value > > + * > > + * The conversion sign-extends the integer value if required and divides by the > > + * scaling factor defined by the number of fractional bits F. > > + */ > > + > > +/** > > + * \typedef Q > > + * \brief Define a signed fixed-point quantized type with automatic storage width > > + * \tparam I The number of integer bits > > + * \tparam F The number of fractional bits > > + * > > + * This alias defines a signed fixed-point quantized type using the > > + * \ref FixedPointQTraits trait and a suitable signed integer storage type > > + * automatically selected based on the total number of bits \a (I + F). > > + */ > > + > > +/** > > + * \typedef UQ > > + * \brief Define an unsigned fixed-point quantized type with automatic storage width > > + * \tparam I The number of integer bits > > + * \tparam F The number of fractional bits > > + * > > + * This alias defines an unsigned fixed-point quantized type using the > > + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type > > + * automatically selected based on the total number of bits \a (I + F). 
> > + */ > > + > > } /* namespace ipa */ > > > > } /* namespace libcamera */ > > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h > > index 48a9757f9554..33d1f4af4792 100644 > > --- a/src/ipa/libipa/fixedpoint.h > > +++ b/src/ipa/libipa/fixedpoint.h > > @@ -10,6 +10,8 @@ > > #include <cmath> > > #include <type_traits> > > > > +#include "quantized.h" > > + > > namespace libcamera { > > > > namespace ipa { > > @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) > > return static_cast<R>(t) / static_cast<R>(1 << F); > > } > > > > +template<unsigned int I, unsigned int F, typename T> > > +struct FixedPointQTraits { > > +private: > > + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); > > + using UT = std::make_unsigned_t<T>; > > + > > + static constexpr unsigned int bits = I + F; > > + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); > > + > > + /* > > + * If fixed point storage is required with more than 24 bits, consider > > + * updating this implementation to use double-precision floating point. > > + */ > > + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); > > + > > + static constexpr T bitMask = (bits < sizeof(T) * 8) > > + ? static_cast<T>((UT{1} << bits) - 1) > > + : static_cast<T>(~UT{0}); > > + > > +public: > > + using QuantizedType = T; > > + > > + static constexpr T qMin = std::is_signed_v<T> > > + ? static_cast<T>(-(UT{1} << (bits - 1))) > > + : static_cast<T>(0); > > + > > + static constexpr T qMax = std::is_signed_v<T> > > + ? 
static_cast<T>((UT{1} << (bits - 1)) - 1) > > + : bitMask; > > + > > + static constexpr float toFloat(QuantizedType q) > > + { > > + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); > > + } > > + > > + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); > > + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); > > + > > + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); > > + > > + /* Conversion functions required by Quantized<Traits> */ > > + static QuantizedType fromFloat(float v) > > + { > > + v = std::clamp(v, min, max); > > + return floatingToFixedPoint<I, F, QuantizedType, float>(v); > > + } > > +}; > > + > > +namespace details { > > + > > +template<unsigned int Bits> > > +constexpr auto qtype() > > +{ > > + static_assert(Bits <= 32, > > + "Unsupported number of bits for quantized type"); > > + > > + if constexpr (Bits <= 8) > > + return int8_t(); > > + else if constexpr (Bits <= 16) > > + return int16_t(); > > + else if constexpr (Bits <= 32) > > + return int32_t(); > > +} > > + > > +} /* namespace details */ > > + > > +template<unsigned int I, unsigned int F> > > +using Q = Quantized<FixedPointQTraits<I, F, decltype(details::qtype<I + F>())>>; > > + > > +template<unsigned int I, unsigned int F> > > +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; > > + > > } /* namespace ipa */ > > > > } /* namespace libcamera */
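The difference between the two experiments in this message comes from standard C++ integer conversions: a negative value stored in a signed type is sign-extended when it is converted to a wider unsigned type, whereas converting to an unsigned type of the same width keeps the bit pattern. A minimal sketch of that behaviour follows, assuming Q<4,4> stores its value in int8_t and Q<6,10> in int16_t as the qtype() helper in the patch suggests. toRegister() is a hypothetical helper, not part of the series, showing one possible way to keep only the low bits.

```cpp
#include <cassert>
#include <cstdint>

/*
 * Hypothetical helper: return only the low `bits` bits of a stored
 * quantized value, suitable for writing into a wider register.
 */
template<unsigned int bits, typename T>
constexpr std::uint32_t toRegister(T q)
{
	static_assert(bits >= 1 && bits <= 32, "unsupported width");
	constexpr std::uint32_t mask = bits < 32
				     ? (std::uint32_t{1} << bits) - 1
				     : ~std::uint32_t{0};
	return static_cast<std::uint32_t>(q) & mask;
}

/* Assumed storage of the Q<4,4> minimum: -128 in an int8_t. */
constexpr std::int8_t qMin44 = -128;

/* Widening to 32 bits sign-extends: this is the 0xffffff80 seen above. */
static_assert(static_cast<std::uint32_t>(qMin44) == 0xffffff80u);

/* Masking to the 8 bits of the format recovers the expected 0x80. */
static_assert(toRegister<8>(qMin44) == 0x80u);

/* Same-width conversion (int16_t to uint16_t) keeps the bit pattern. */
static_assert(static_cast<std::uint16_t>(std::int16_t{-32768}) == 0x8000u);
```

This is why the uint16_t test with Q<6,10> appeared to work while the uint32_t test with Q<4,4> did not.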
Quoting Laurent Pinchart (2026-01-24 02:42:40) > On Fri, Jan 23, 2026 at 11:09:22AM +0100, Stefan Klug wrote: > > Quoting Kieran Bingham (2026-01-21 18:37:23) > > > Extend the new Quantized type infrastructure by providing a > > > FixedPointQTraits template. > > > > > > This allows construction of fixed point types with a Quantized storage > > > that allows easy reading of both the underlying quantized type value and > > > a floating point representation of that same value. > > > > > > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com> > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > > > > --- > > > v4: > > > - Assert that the given type has enough bits for the usage > > > - Use unsigned types for calculating qmin/qmax > > > - Reorder toFloat/fromFloat and min/max for future inlining > > > - Make toFloat and fromFloat constexpr > > > > > > v5: > > > - Make UT, Bits and Bitmask private (and remove doxygen) > > > - Remove constexpr from fromFloat which uses std::round (only constexpr > > > in C++23) > > > - static_assert that min<max when converted > > > - Provide new Q and UQ automatic width types (Thanks Barnabás) > > > - Convert types to shortened Q/UQ automatic widths > > > - Use automatic width Q/UQ for 12,4 > > > - change qmin->qMin qmax->qMax Bits->bits BitMask->bitMask > > > - Remove typedefs for Q1_7 etc > > > > > > v6: > > > - Use 'quantized' over 'quantised' > > > - Document sign is based on T and number of bits includes sign bit > > > > > > - Document that fromFloat also clamps between [min, max] > > > > > > - Remove 64 bit support. 
We have 32 bit assumptions on fromFloat > > > > > > - Restrict to 24 bits, to stay compatible with float types > > > > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > --- > > > src/ipa/libipa/fixedpoint.cpp | 93 +++++++++++++++++++++++++++++++++++ > > > src/ipa/libipa/fixedpoint.h | 74 ++++++++++++++++++++++++++++ > > > 2 files changed, 167 insertions(+) > > > > > > diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp > > > index 6b698fc5d680..c9d04e31e4df 100644 > > > --- a/src/ipa/libipa/fixedpoint.cpp > > > +++ b/src/ipa/libipa/fixedpoint.cpp > > > @@ -37,6 +37,99 @@ namespace ipa { > > > * \return The converted value > > > */ > > > > > > +/** > > > + * \struct libcamera::ipa::FixedPointQTraits > > > + * \brief Traits type implementing fixed-point quantisation conversions > > > > Nit: I saw in the log that quantized is preferred over quantised. > > > > > + * > > > + * The FixedPointQTraits structure defines a policy for mapping floating-point > > > + * values to and from fixed-point integer representations. It is parameterised > > > + * by the number of integer bits \a I, fractional bits \a F, and the integral > > > + * storage type \a T. The traits are used with Quantized<Traits> to create a > > > + * quantized type that stores both the fixed-point representation and the > > > + * corresponding floating-point value. > > > + * > > > + * The signedness of the type is determined by the signedness of \a T. For > > > + * signed types, the number of integer bits in \a I includes the sign bit. > > > > So it took me a while to accept that it is a great idea to represent a > > signed fixed point as signed int. In all the hardware/register > > interfaces I remember, the registers were always represented as unsigned > > types. So I tried to break it... > > > > uint16_t myReg; > > using T=Q<6,10>; > > T q(T::TraitsType::min); > > myReg = q.quantized(); > > > > ... 
and failed miserably :-)
> >
> > So all in all I think this is working great.
>
> uint32_t myReg;
> using T=Q<4,4>;
> T q(T::TraitsType::min);
> myReg = q.quantized();
>
> std::cout << "q: " << q << ", myReg: " << utils::hex(myReg) << std::endl;
>
> produces
>
> q: [0x80:-8], myReg: 0xffffff80
>
> I would expect myReg to be 0x80.

Eeek why didn't I see that case. Yes, that needs to be fixed.

For me that raises the question again if quantized() should always
return an unsigned value as there is imho no real use case for a signed
representation of the quantized type but a high risk for confusion.

Best regards,
Stefan

>
> > Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com>
> >
> > Cheers,
> > Stefan
> >
> >
> > > + *
> > > + * The trait exposes compile-time constants describing the bit layout, limits,
> > > + * and scaling factors used in the fixed-point representation.
> > > + *
> > > + * \tparam I Number of integer bits
> > > + * \tparam F Number of fractional bits
> > > + * \tparam T Integral type used to store the quantized value
> > > + */
> > > +
> > > +/**
> > > + * \typedef FixedPointQTraits::QuantizedType
> > > + * \brief The integral storage type used for the fixed-point representation
> > > + */
> > > +
> > > +/**
> > > + * \var FixedPointQTraits::qMin
> > > + * \brief Minimum representable quantized integer value
> > > + *
> > > + * This corresponds to the most negative value for signed formats or zero for
> > > + * unsigned formats.
> > > + */ > > > + > > > +/** > > > + * \var FixedPointQTraits::qMax > > > + * \brief Maximum representable quantized integer value > > > + */ > > > + > > > +/** > > > + * \var FixedPointQTraits::min > > > + * \brief Minimum representable floating-point value corresponding to qMin > > > + */ > > > + > > > +/** > > > + * \var FixedPointQTraits::max > > > + * \brief Maximum representable floating-point value corresponding to qMax > > > + */ > > > + > > > +/** > > > + * \fn FixedPointQTraits::fromFloat(float v) > > > + * \brief Convert a floating-point value to a fixed-point integer > > > + * \param[in] v The floating-point value to be converted > > > + * \return The quantized fixed-point integer representation > > > + * > > > + * The conversion first clamps the floating-point input \a v to the range [min, > > > + * max] and then rounds it to the nearest integer according to the scaling > > > + * factor defined by the number of fractional bits F. > > > + */ > > > + > > > +/** > > > + * \fn FixedPointQTraits::toFloat(QuantizedType q) > > > + * \brief Convert a fixed-point integer to a floating-point value > > > + * \param[in] q The fixed-point integer value to be converted > > > + * \return The corresponding floating-point value > > > + * > > > + * The conversion sign-extends the integer value if required and divides by the > > > + * scaling factor defined by the number of fractional bits F. > > > + */ > > > + > > > +/** > > > + * \typedef Q > > > + * \brief Define a signed fixed-point quantized type with automatic storage width > > > + * \tparam I The number of integer bits > > > + * \tparam F The number of fractional bits > > > + * > > > + * This alias defines a signed fixed-point quantized type using the > > > + * \ref FixedPointQTraits trait and a suitable signed integer storage type > > > + * automatically selected based on the total number of bits \a (I + F). 
> > > + */ > > > + > > > +/** > > > + * \typedef UQ > > > + * \brief Define an unsigned fixed-point quantized type with automatic storage width > > > + * \tparam I The number of integer bits > > > + * \tparam F The number of fractional bits > > > + * > > > + * This alias defines an unsigned fixed-point quantized type using the > > > + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type > > > + * automatically selected based on the total number of bits \a (I + F). > > > + */ > > > + > > > } /* namespace ipa */ > > > > > > } /* namespace libcamera */ > > > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h > > > index 48a9757f9554..33d1f4af4792 100644 > > > --- a/src/ipa/libipa/fixedpoint.h > > > +++ b/src/ipa/libipa/fixedpoint.h > > > @@ -10,6 +10,8 @@ > > > #include <cmath> > > > #include <type_traits> > > > > > > +#include "quantized.h" > > > + > > > namespace libcamera { > > > > > > namespace ipa { > > > @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) > > > return static_cast<R>(t) / static_cast<R>(1 << F); > > > } > > > > > > +template<unsigned int I, unsigned int F, typename T> > > > +struct FixedPointQTraits { > > > +private: > > > + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); > > > + using UT = std::make_unsigned_t<T>; > > > + > > > + static constexpr unsigned int bits = I + F; > > > + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); > > > + > > > + /* > > > + * If fixed point storage is required with more than 24 bits, consider > > > + * updating this implementation to use double-precision floating point. > > > + */ > > > + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); > > > + > > > + static constexpr T bitMask = (bits < sizeof(T) * 8) > > > + ? 
static_cast<T>((UT{1} << bits) - 1) > > > + : static_cast<T>(~UT{0}); > > > + > > > +public: > > > + using QuantizedType = T; > > > + > > > + static constexpr T qMin = std::is_signed_v<T> > > > + ? static_cast<T>(-(UT{1} << (bits - 1))) > > > + : static_cast<T>(0); > > > + > > > + static constexpr T qMax = std::is_signed_v<T> > > > + ? static_cast<T>((UT{1} << (bits - 1)) - 1) > > > + : bitMask; > > > + > > > + static constexpr float toFloat(QuantizedType q) > > > + { > > > + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); > > > + } > > > + > > > + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); > > > + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); > > > + > > > + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); > > > + > > > + /* Conversion functions required by Quantized<Traits> */ > > > + static QuantizedType fromFloat(float v) > > > + { > > > + v = std::clamp(v, min, max); > > > + return floatingToFixedPoint<I, F, QuantizedType, float>(v); > > > + } > > > +}; > > > + > > > +namespace details { > > > + > > > +template<unsigned int Bits> > > > +constexpr auto qtype() > > > +{ > > > + static_assert(Bits <= 32, > > > + "Unsupported number of bits for quantized type"); > > > + > > > + if constexpr (Bits <= 8) > > > + return int8_t(); > > > + else if constexpr (Bits <= 16) > > > + return int16_t(); > > > + else if constexpr (Bits <= 32) > > > + return int32_t(); > > > +} > > > + > > > +} /* namespace details */ > > > + > > > +template<unsigned int I, unsigned int F> > > > +using Q = Quantized<FixedPointQTraits<I, F, decltype(details::qtype<I + F>())>>; > > > + > > > +template<unsigned int I, unsigned int F> > > > +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; > > > + > > > } /* namespace ipa */ > > > > > > } /* namespace libcamera */ > > -- > Regards, > > Laurent Pinchart
On Mon, Jan 26, 2026 at 09:37:47AM +0100, Stefan Klug wrote: > Quoting Laurent Pinchart (2026-01-24 02:42:40) > > On Fri, Jan 23, 2026 at 11:09:22AM +0100, Stefan Klug wrote: > > > Quoting Kieran Bingham (2026-01-21 18:37:23) > > > > Extend the new Quantized type infrastructure by providing a > > > > FixedPointQTraits template. > > > > > > > > This allows construction of fixed point types with a Quantized storage > > > > that allows easy reading of both the underlying quantized type value and > > > > a floating point representation of that same value. > > > > > > > > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com> > > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > > > > > > --- > > > > v4: > > > > - Assert that the given type has enough bits for the usage > > > > - Use unsigned types for calculating qmin/qmax > > > > - Reorder toFloat/fromFloat and min/max for future inlining > > > > - Make toFloat and fromFloat constexpr > > > > > > > > v5: > > > > - Make UT, Bits and Bitmask private (and remove doxygen) > > > > - Remove constexpr from fromFloat which uses std::round (only constexpr > > > > in C++23) > > > > - static_assert that min<max when converted > > > > - Provide new Q and UQ automatic width types (Thanks Barnabás) > > > > - Convert types to shortened Q/UQ automatic widths > > > > - Use automatic width Q/UQ for 12,4 > > > > - change qmin->qMin qmax->qMax Bits->bits BitMask->bitMask > > > > - Remove typedefs for Q1_7 etc > > > > > > > > v6: > > > > - Use 'quantized' over 'quantised' > > > > - Document sign is based on T and number of bits includes sign bit > > > > > > > > - Document that fromFloat also clamps between [min, max] > > > > > > > > - Remove 64 bit support. 
We have 32 bit assumptions on fromFloat > > > > > > > > - Restrict to 24 bits, to stay compatible with float types > > > > > > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > > --- > > > > src/ipa/libipa/fixedpoint.cpp | 93 +++++++++++++++++++++++++++++++++++ > > > > src/ipa/libipa/fixedpoint.h | 74 ++++++++++++++++++++++++++++ > > > > 2 files changed, 167 insertions(+) > > > > > > > > diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp > > > > index 6b698fc5d680..c9d04e31e4df 100644 > > > > --- a/src/ipa/libipa/fixedpoint.cpp > > > > +++ b/src/ipa/libipa/fixedpoint.cpp > > > > @@ -37,6 +37,99 @@ namespace ipa { > > > > * \return The converted value > > > > */ > > > > > > > > +/** > > > > + * \struct libcamera::ipa::FixedPointQTraits > > > > + * \brief Traits type implementing fixed-point quantisation conversions > > > > > > Nit: I saw in the log that quantized is preferred over quantised. > > > > > > > + * > > > > + * The FixedPointQTraits structure defines a policy for mapping floating-point > > > > + * values to and from fixed-point integer representations. It is parameterised > > > > + * by the number of integer bits \a I, fractional bits \a F, and the integral > > > > + * storage type \a T. The traits are used with Quantized<Traits> to create a > > > > + * quantized type that stores both the fixed-point representation and the > > > > + * corresponding floating-point value. > > > > + * > > > > + * The signedness of the type is determined by the signedness of \a T. For > > > > + * signed types, the number of integer bits in \a I includes the sign bit. > > > > > > So it took me a while to accept that it is a great idea to represent a > > > signed fixed point as signed int. In all the hardware/register > > > interfaces I remember, the registers were always represented as unsigned > > > types. So I tried to break it... 
> > > > > > uint16_t myReg; > > > using T=Q<6,10>; > > > T q(T::TraitsType::min); > > > myReg = q.quantized(); > > > > > > ... and failed miserably :-) > > > > > > So all in all I think this is working great. > > > > uint32_t myReg; > > using T=Q<4,4>; > > T q(T::TraitsType::min); > > myReg = q.quantized(); > > > > std::cout << "q: " << q << ", myReg: " << utils::hex(myReg) << std::endl; > > > > produces > > > > q: [0x80:-8], myReg: 0xffffff80 > > > > I would expect myReg to be 0x80. > > Eeek why didn't I see that case. Yes, that needs to be fixed. > > For me that raises the question again if quantized() should always > return an unsigned value as there is imho no real use case for a signed > representation of the quantized type but a high risk for confusion. I would make it unsigned unconditionally. It happens that Q handles quantized values represented as 2's complement, directly usable for arithmetics on all the CPU we support, but we know that there are other hardware representations (Jacopo mentioned that the C55 uses sign/magnitude for some registers). The quantized values should be considered as a bitfield, not as a numerical value. > > > Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com> > > > > > > > + * > > > > + * The trait exposes compile-time constants describing the bit layout, limits, > > > > + * and scaling factors used in the fixed-point representation. 
> > > > + * > > > > + * \tparam I Number of integer bits > > > > + * \tparam F Number of fractional bits > > > > + * \tparam T Integral type used to store the quantized value > > > > + */ > > > > + > > > > +/** > > > > + * \typedef FixedPointQTraits::QuantizedType > > > > + * \brief The integral storage type used for the fixed-point representation > > > > + */ > > > > + > > > > +/** > > > > + * \var FixedPointQTraits::qMin > > > > + * \brief Minimum representable quantized integer value > > > > + * > > > > + * This corresponds to the most negative value for signed formats or zero for > > > > + * unsigned formats. > > > > + */ > > > > + > > > > +/** > > > > + * \var FixedPointQTraits::qMax > > > > + * \brief Maximum representable quantized integer value > > > > + */ > > > > + > > > > +/** > > > > + * \var FixedPointQTraits::min > > > > + * \brief Minimum representable floating-point value corresponding to qMin > > > > + */ > > > > + > > > > +/** > > > > + * \var FixedPointQTraits::max > > > > + * \brief Maximum representable floating-point value corresponding to qMax > > > > + */ > > > > + > > > > +/** > > > > + * \fn FixedPointQTraits::fromFloat(float v) > > > > + * \brief Convert a floating-point value to a fixed-point integer > > > > + * \param[in] v The floating-point value to be converted > > > > + * \return The quantized fixed-point integer representation > > > > + * > > > > + * The conversion first clamps the floating-point input \a v to the range [min, > > > > + * max] and then rounds it to the nearest integer according to the scaling > > > > + * factor defined by the number of fractional bits F. 
> > > > + */ > > > > + > > > > +/** > > > > + * \fn FixedPointQTraits::toFloat(QuantizedType q) > > > > + * \brief Convert a fixed-point integer to a floating-point value > > > > + * \param[in] q The fixed-point integer value to be converted > > > > + * \return The corresponding floating-point value > > > > + * > > > > + * The conversion sign-extends the integer value if required and divides by the > > > > + * scaling factor defined by the number of fractional bits F. > > > > + */ > > > > + > > > > +/** > > > > + * \typedef Q > > > > + * \brief Define a signed fixed-point quantized type with automatic storage width > > > > + * \tparam I The number of integer bits > > > > + * \tparam F The number of fractional bits > > > > + * > > > > + * This alias defines a signed fixed-point quantized type using the > > > > + * \ref FixedPointQTraits trait and a suitable signed integer storage type > > > > + * automatically selected based on the total number of bits \a (I + F). > > > > + */ > > > > + > > > > +/** > > > > + * \typedef UQ > > > > + * \brief Define an unsigned fixed-point quantized type with automatic storage width > > > > + * \tparam I The number of integer bits > > > > + * \tparam F The number of fractional bits > > > > + * > > > > + * This alias defines an unsigned fixed-point quantized type using the > > > > + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type > > > > + * automatically selected based on the total number of bits \a (I + F). 
> > > > + */ > > > > + > > > > } /* namespace ipa */ > > > > > > > > } /* namespace libcamera */ > > > > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h > > > > index 48a9757f9554..33d1f4af4792 100644 > > > > --- a/src/ipa/libipa/fixedpoint.h > > > > +++ b/src/ipa/libipa/fixedpoint.h > > > > @@ -10,6 +10,8 @@ > > > > #include <cmath> > > > > #include <type_traits> > > > > > > > > +#include "quantized.h" > > > > + > > > > namespace libcamera { > > > > > > > > namespace ipa { > > > > @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) > > > > return static_cast<R>(t) / static_cast<R>(1 << F); > > > > } > > > > > > > > +template<unsigned int I, unsigned int F, typename T> > > > > +struct FixedPointQTraits { > > > > +private: > > > > + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); > > > > + using UT = std::make_unsigned_t<T>; > > > > + > > > > + static constexpr unsigned int bits = I + F; > > > > + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); > > > > + > > > > + /* > > > > + * If fixed point storage is required with more than 24 bits, consider > > > > + * updating this implementation to use double-precision floating point. > > > > + */ > > > > + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); > > > > + > > > > + static constexpr T bitMask = (bits < sizeof(T) * 8) > > > > + ? static_cast<T>((UT{1} << bits) - 1) > > > > + : static_cast<T>(~UT{0}); > > > > + > > > > +public: > > > > + using QuantizedType = T; > > > > + > > > > + static constexpr T qMin = std::is_signed_v<T> > > > > + ? static_cast<T>(-(UT{1} << (bits - 1))) > > > > + : static_cast<T>(0); > > > > + > > > > + static constexpr T qMax = std::is_signed_v<T> > > > > + ? 
static_cast<T>((UT{1} << (bits - 1)) - 1) > > > > + : bitMask; > > > > + > > > > + static constexpr float toFloat(QuantizedType q) > > > > + { > > > > + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); > > > > + } > > > > + > > > > + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); > > > > + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); > > > > + > > > > + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); > > > > + > > > > + /* Conversion functions required by Quantized<Traits> */ > > > > + static QuantizedType fromFloat(float v) > > > > + { > > > > + v = std::clamp(v, min, max); > > > > + return floatingToFixedPoint<I, F, QuantizedType, float>(v); > > > > + } > > > > +}; > > > > + > > > > +namespace details { > > > > + > > > > +template<unsigned int Bits> > > > > +constexpr auto qtype() > > > > +{ > > > > + static_assert(Bits <= 32, > > > > + "Unsupported number of bits for quantized type"); > > > > + > > > > + if constexpr (Bits <= 8) > > > > + return int8_t(); > > > > + else if constexpr (Bits <= 16) > > > > + return int16_t(); > > > > + else if constexpr (Bits <= 32) > > > > + return int32_t(); > > > > +} > > > > + > > > > +} /* namespace details */ > > > > + > > > > +template<unsigned int I, unsigned int F> > > > > +using Q = Quantized<FixedPointQTraits<I, F, decltype(details::qtype<I + F>())>>; > > > > + > > > > +template<unsigned int I, unsigned int F> > > > > +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; > > > > + > > > > } /* namespace ipa */ > > > > > > > > } /* namespace libcamera */
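The sign extension reported above can be reproduced without libcamera. The following is a minimal standalone sketch, assuming the quantized value is held in an int8_t as Q<4,4> does. toBitfield() is a hypothetical helper, not something this patch provides; it only illustrates one way to treat the quantized value as a bitfield before it reaches a wider register variable.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

/*
 * Hypothetical helper (not part of the patch): return the low Bits bits of a
 * quantized value as an unsigned bit pattern of the same width as T.
 */
template<unsigned int Bits, typename T>
constexpr std::make_unsigned_t<T> toBitfield(T q)
{
	using UT = std::make_unsigned_t<T>;
	static_assert(Bits >= 1 && Bits <= sizeof(T) * 8, "Bits must fit in T");

	/* All-ones shifted right, so no shift ever reaches the type width. */
	constexpr UT mask = static_cast<UT>(static_cast<UT>(~UT{0}) >> (sizeof(T) * 8 - Bits));

	return static_cast<UT>(static_cast<UT>(q) & mask);
}

/*
 * Storing the signed value directly: the conversion to uint32_t is defined
 * modulo 2^32, so a negative int8_t fills the upper 24 bits with ones. The
 * implicit conversion in 'myReg = q.quantized()' behaves the same way.
 */
constexpr uint32_t naiveStore(int8_t q)
{
	return static_cast<uint32_t>(q);
}

/* Going through the unsigned 8-bit pattern first keeps only the 8 bits. */
constexpr uint32_t maskedStore(int8_t q)
{
	return toBitfield<8>(q);
}

/* Q<4,4> minimum: quantized value -128 (bit pattern 0x80), float value -8. */
static_assert(naiveStore(int8_t{ -128 }) == 0xffffff80u);
static_assert(maskedStore(int8_t{ -128 }) == 0x80u);
```

If quantized() returned the unsigned type unconditionally, the caller would get the second behaviour without needing such a helper.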
On 2026. 01. 26. 10:57, Laurent Pinchart wrote: > On Mon, Jan 26, 2026 at 09:37:47AM +0100, Stefan Klug wrote: >> Quoting Laurent Pinchart (2026-01-24 02:42:40) >>> On Fri, Jan 23, 2026 at 11:09:22AM +0100, Stefan Klug wrote: >>>> Quoting Kieran Bingham (2026-01-21 18:37:23) >>>>> Extend the new Quantized type infrastructure by providing a >>>>> FixedPointQTraits template. >>>>> >>>>> This allows construction of fixed point types with a Quantized storage >>>>> that allows easy reading of both the underlying quantized type value and >>>>> a floating point representation of that same value. >>>>> >>>>> Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com> >>>>> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> >>>>> >>>>> --- >>>>> v4: >>>>> - Assert that the given type has enough bits for the usage >>>>> - Use unsigned types for calculating qmin/qmax >>>>> - Reorder toFloat/fromFloat and min/max for future inlining >>>>> - Make toFloat and fromFloat constexpr >>>>> >>>>> v5: >>>>> - Make UT, Bits and Bitmask private (and remove doxygen) >>>>> - Remove constexpr from fromFloat which uses std::round (only constexpr >>>>> in C++23) >>>>> - static_assert that min<max when converted >>>>> - Provide new Q and UQ automatic width types (Thanks Barnabás) >>>>> - Convert types to shortened Q/UQ automatic widths >>>>> - Use automatic width Q/UQ for 12,4 >>>>> - change qmin->qMin qmax->qMax Bits->bits BitMask->bitMask >>>>> - Remove typedefs for Q1_7 etc >>>>> >>>>> v6: >>>>> - Use 'quantized' over 'quantised' >>>>> - Document sign is based on T and number of bits includes sign bit >>>>> >>>>> - Document that fromFloat also clamps between [min, max] >>>>> >>>>> - Remove 64 bit support. 
We have 32 bit assumptions on fromFloat >>>>> >>>>> - Restrict to 24 bits, to stay compatible with float types >>>>> >>>>> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> >>>>> --- >>>>> src/ipa/libipa/fixedpoint.cpp | 93 +++++++++++++++++++++++++++++++++++ >>>>> src/ipa/libipa/fixedpoint.h | 74 ++++++++++++++++++++++++++++ >>>>> 2 files changed, 167 insertions(+) >>>>> >>>>> diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp >>>>> index 6b698fc5d680..c9d04e31e4df 100644 >>>>> --- a/src/ipa/libipa/fixedpoint.cpp >>>>> +++ b/src/ipa/libipa/fixedpoint.cpp >>>>> @@ -37,6 +37,99 @@ namespace ipa { >>>>> * \return The converted value >>>>> */ >>>>> >>>>> +/** >>>>> + * \struct libcamera::ipa::FixedPointQTraits >>>>> + * \brief Traits type implementing fixed-point quantisation conversions >>>> >>>> Nit: I saw in the log that quantized is preferred over quantised. >>>> >>>>> + * >>>>> + * The FixedPointQTraits structure defines a policy for mapping floating-point >>>>> + * values to and from fixed-point integer representations. It is parameterised >>>>> + * by the number of integer bits \a I, fractional bits \a F, and the integral >>>>> + * storage type \a T. The traits are used with Quantized<Traits> to create a >>>>> + * quantized type that stores both the fixed-point representation and the >>>>> + * corresponding floating-point value. >>>>> + * >>>>> + * The signedness of the type is determined by the signedness of \a T. For >>>>> + * signed types, the number of integer bits in \a I includes the sign bit. >>>> >>>> So it took me a while to accept that it is a great idea to represent a >>>> signed fixed point as signed int. In all the hardware/register >>>> interfaces I remember, the registers were always represented as unsigned >>>> types. So I tried to break it... >>>> >>>> uint16_t myReg; >>>> using T=Q<6,10>; >>>> T q(T::TraitsType::min); >>>> myReg = q.quantized(); >>>> >>>> ... 
and failed miserably :-) >>>> >>>> So all in all I think this is working great. >>> >>> uint32_t myReg; >>> using T=Q<4,4>; >>> T q(T::TraitsType::min); >>> myReg = q.quantized(); >>> >>> std::cout << "q: " << q << ", myReg: " << utils::hex(myReg) << std::endl; >>> >>> produces >>> >>> q: [0x80:-8], myReg: 0xffffff80 >>> >>> I would expect myReg to be 0x80. >> >> Eeek why didn't I see that case. Yes, that needs to be fixed. That cannot easily be fixed without making things fully unsigned, due to how signed -> unsigned conversion is defined. >> >> For me that raises the question again if quantized() should always >> return an unsigned value as there is imho no real use case for a signed >> representation of the quantized type but a high risk for confusion. > > I would make it unsigned unconditionally. It happens that Q handles > quantized values represented as 2's complement, directly usable for > arithmetics on all the CPU we support, but we know that there are other > hardware representations (Jacopo mentioned that the C55 uses > sign/magnitude for some registers). The quantized values should be > considered as a bitfield, not as a numerical value. Even if one does not assume any particular representation, a signed type has the "advantage" that converting between different signed types will "keep" the value, and that the sign will reflect the "real" sign. On the other hand, keeping everything unsigned is probably the choice of least surprise. > >>>> Reviewed-by: Stefan Klug <stefan.klug@ideasonboard.com> >>>> >>>>> + * >>>>> + * The trait exposes compile-time constants describing the bit layout, limits, >>>>> + * and scaling factors used in the fixed-point representation. 
>>>>> + * >>>>> + * \tparam I Number of integer bits >>>>> + * \tparam F Number of fractional bits >>>>> + * \tparam T Integral type used to store the quantized value >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \typedef FixedPointQTraits::QuantizedType >>>>> + * \brief The integral storage type used for the fixed-point representation >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \var FixedPointQTraits::qMin >>>>> + * \brief Minimum representable quantized integer value >>>>> + * >>>>> + * This corresponds to the most negative value for signed formats or zero for >>>>> + * unsigned formats. >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \var FixedPointQTraits::qMax >>>>> + * \brief Maximum representable quantized integer value >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \var FixedPointQTraits::min >>>>> + * \brief Minimum representable floating-point value corresponding to qMin >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \var FixedPointQTraits::max >>>>> + * \brief Maximum representable floating-point value corresponding to qMax >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \fn FixedPointQTraits::fromFloat(float v) >>>>> + * \brief Convert a floating-point value to a fixed-point integer >>>>> + * \param[in] v The floating-point value to be converted >>>>> + * \return The quantized fixed-point integer representation >>>>> + * >>>>> + * The conversion first clamps the floating-point input \a v to the range [min, >>>>> + * max] and then rounds it to the nearest integer according to the scaling >>>>> + * factor defined by the number of fractional bits F. >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \fn FixedPointQTraits::toFloat(QuantizedType q) >>>>> + * \brief Convert a fixed-point integer to a floating-point value >>>>> + * \param[in] q The fixed-point integer value to be converted >>>>> + * \return The corresponding floating-point value >>>>> + * >>>>> + * The conversion sign-extends the integer value if required and divides by the >>>>> + * scaling factor defined by the number of fractional bits F. 
>>>>> + */ >>>>> + >>>>> +/** >>>>> + * \typedef Q >>>>> + * \brief Define a signed fixed-point quantized type with automatic storage width >>>>> + * \tparam I The number of integer bits >>>>> + * \tparam F The number of fractional bits >>>>> + * >>>>> + * This alias defines a signed fixed-point quantized type using the >>>>> + * \ref FixedPointQTraits trait and a suitable signed integer storage type >>>>> + * automatically selected based on the total number of bits \a (I + F). >>>>> + */ >>>>> + >>>>> +/** >>>>> + * \typedef UQ >>>>> + * \brief Define an unsigned fixed-point quantized type with automatic storage width >>>>> + * \tparam I The number of integer bits >>>>> + * \tparam F The number of fractional bits >>>>> + * >>>>> + * This alias defines an unsigned fixed-point quantized type using the >>>>> + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type >>>>> + * automatically selected based on the total number of bits \a (I + F). >>>>> + */ >>>>> + >>>>> } /* namespace ipa */ >>>>> >>>>> } /* namespace libcamera */ >>>>> diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h >>>>> index 48a9757f9554..33d1f4af4792 100644 >>>>> --- a/src/ipa/libipa/fixedpoint.h >>>>> +++ b/src/ipa/libipa/fixedpoint.h >>>>> @@ -10,6 +10,8 @@ >>>>> #include <cmath> >>>>> #include <type_traits> >>>>> >>>>> +#include "quantized.h" >>>>> + >>>>> namespace libcamera { >>>>> >>>>> namespace ipa { >>>>> @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) >>>>> return static_cast<R>(t) / static_cast<R>(1 << F); >>>>> } >>>>> >>>>> +template<unsigned int I, unsigned int F, typename T> >>>>> +struct FixedPointQTraits { >>>>> +private: >>>>> + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); >>>>> + using UT = std::make_unsigned_t<T>; >>>>> + >>>>> + static constexpr unsigned int bits = I + F; >>>>> + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); >>>>> + >>>>> + /* >>>>> 
+ * If fixed point storage is required with more than 24 bits, consider >>>>> + * updating this implementation to use double-precision floating point. >>>>> + */ >>>>> + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); >>>>> + >>>>> + static constexpr T bitMask = (bits < sizeof(T) * 8) >>>>> + ? static_cast<T>((UT{1} << bits) - 1) >>>>> + : static_cast<T>(~UT{0}); >>>>> + >>>>> +public: >>>>> + using QuantizedType = T; >>>>> + >>>>> + static constexpr T qMin = std::is_signed_v<T> >>>>> + ? static_cast<T>(-(UT{1} << (bits - 1))) >>>>> + : static_cast<T>(0); >>>>> + >>>>> + static constexpr T qMax = std::is_signed_v<T> >>>>> + ? static_cast<T>((UT{1} << (bits - 1)) - 1) >>>>> + : bitMask; >>>>> + >>>>> + static constexpr float toFloat(QuantizedType q) >>>>> + { >>>>> + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); >>>>> + } >>>>> + >>>>> + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); >>>>> + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); >>>>> + >>>>> + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); >>>>> + >>>>> + /* Conversion functions required by Quantized<Traits> */ >>>>> + static QuantizedType fromFloat(float v) >>>>> + { >>>>> + v = std::clamp(v, min, max); >>>>> + return floatingToFixedPoint<I, F, QuantizedType, float>(v); >>>>> + } >>>>> +}; >>>>> + >>>>> +namespace details { >>>>> + >>>>> +template<unsigned int Bits> >>>>> +constexpr auto qtype() >>>>> +{ >>>>> + static_assert(Bits <= 32, >>>>> + "Unsupported number of bits for quantized type"); >>>>> + >>>>> + if constexpr (Bits <= 8) >>>>> + return int8_t(); >>>>> + else if constexpr (Bits <= 16) >>>>> + return int16_t(); >>>>> + else if constexpr (Bits <= 32) >>>>> + return int32_t(); >>>>> +} >>>>> + >>>>> +} /* namespace details */ >>>>> + >>>>> +template<unsigned int I, unsigned int F> >>>>> +using Q = Quantized<FixedPointQTraits<I, F, 
decltype(details::qtype<I + F>())>>; >>>>> + >>>>> +template<unsigned int I, unsigned int F> >>>>> +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; >>>>> + >>>>> } /* namespace ipa */ >>>>> >>>>> } /* namespace libcamera */ >
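The two conversion behaviours mentioned above can be put side by side. This is a standalone sketch rather than code from the series: widenSigned(), widenUnsigned() and signExtend() are hypothetical names used for illustration only. It shows that signed to signed widening keeps the value, that signed to unsigned conversion is reduced modulo 2^N, and how a numerical value could still be recovered from an unsigned bitfield if quantized() were made unsigned.

```cpp
#include <cassert>
#include <cstdint>

/* Signed -> wider signed: the numerical value is preserved. */
constexpr int32_t widenSigned(int8_t q)
{
	return q;
}

/* Signed -> wider unsigned: the value is reduced modulo 2^32. */
constexpr uint32_t widenUnsigned(int8_t q)
{
	return static_cast<uint32_t>(q);
}

/*
 * Hypothetical helper: interpret the low Bits bits of an unsigned bitfield as
 * a two's complement number. The subtraction is done in 64 bits so that no
 * out-of-range conversion is needed.
 */
template<unsigned int Bits>
constexpr int32_t signExtend(uint32_t raw)
{
	static_assert(Bits >= 1 && Bits <= 32, "Bits must be in [1, 32]");

	constexpr uint32_t mask = ~uint32_t{0} >> (32 - Bits);
	constexpr uint32_t signBit = uint32_t{1} << (Bits - 1);

	/* Flipping the sign bit and subtracting it maps [0, 2^Bits) to the signed range. */
	const uint32_t flipped = (raw & mask) ^ signBit;

	return static_cast<int32_t>(static_cast<int64_t>(flipped) -
				    static_cast<int64_t>(signBit));
}

static_assert(widenSigned(int8_t{ -128 }) == -128);
static_assert(widenUnsigned(int8_t{ -128 }) == 0xffffff80u);
static_assert(signExtend<8>(0x80u) == -128);
static_assert(signExtend<8>(0x7fu) == 127);
static_assert(signExtend<16>(0x8000u) == -32768);
```

With an unsigned quantized type, only code that really needs the numerical value would call something like signExtend(); register writes would use the bit pattern as is.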
diff --git a/src/ipa/libipa/fixedpoint.cpp b/src/ipa/libipa/fixedpoint.cpp index 6b698fc5d680..c9d04e31e4df 100644 --- a/src/ipa/libipa/fixedpoint.cpp +++ b/src/ipa/libipa/fixedpoint.cpp @@ -37,6 +37,99 @@ namespace ipa { * \return The converted value */ +/** + * \struct libcamera::ipa::FixedPointQTraits + * \brief Traits type implementing fixed-point quantisation conversions + * + * The FixedPointQTraits structure defines a policy for mapping floating-point + * values to and from fixed-point integer representations. It is parameterised + * by the number of integer bits \a I, fractional bits \a F, and the integral + * storage type \a T. The traits are used with Quantized<Traits> to create a + * quantized type that stores both the fixed-point representation and the + * corresponding floating-point value. + * + * The signedness of the type is determined by the signedness of \a T. For + * signed types, the number of integer bits in \a I includes the sign bit. + * + * The trait exposes compile-time constants describing the bit layout, limits, + * and scaling factors used in the fixed-point representation. + * + * \tparam I Number of integer bits + * \tparam F Number of fractional bits + * \tparam T Integral type used to store the quantized value + */ + +/** + * \typedef FixedPointQTraits::QuantizedType + * \brief The integral storage type used for the fixed-point representation + */ + +/** + * \var FixedPointQTraits::qMin + * \brief Minimum representable quantized integer value + * + * This corresponds to the most negative value for signed formats or zero for + * unsigned formats. 
+ */ + +/** + * \var FixedPointQTraits::qMax + * \brief Maximum representable quantized integer value + */ + +/** + * \var FixedPointQTraits::min + * \brief Minimum representable floating-point value corresponding to qMin + */ + +/** + * \var FixedPointQTraits::max + * \brief Maximum representable floating-point value corresponding to qMax + */ + +/** + * \fn FixedPointQTraits::fromFloat(float v) + * \brief Convert a floating-point value to a fixed-point integer + * \param[in] v The floating-point value to be converted + * \return The quantized fixed-point integer representation + * + * The conversion first clamps the floating-point input \a v to the range [min, + * max] and then rounds it to the nearest integer according to the scaling + * factor defined by the number of fractional bits F. + */ + +/** + * \fn FixedPointQTraits::toFloat(QuantizedType q) + * \brief Convert a fixed-point integer to a floating-point value + * \param[in] q The fixed-point integer value to be converted + * \return The corresponding floating-point value + * + * The conversion sign-extends the integer value if required and divides by the + * scaling factor defined by the number of fractional bits F. + */ + +/** + * \typedef Q + * \brief Define a signed fixed-point quantized type with automatic storage width + * \tparam I The number of integer bits + * \tparam F The number of fractional bits + * + * This alias defines a signed fixed-point quantized type using the + * \ref FixedPointQTraits trait and a suitable signed integer storage type + * automatically selected based on the total number of bits \a (I + F). 
+ */ + +/** + * \typedef UQ + * \brief Define an unsigned fixed-point quantized type with automatic storage width + * \tparam I The number of integer bits + * \tparam F The number of fractional bits + * + * This alias defines an unsigned fixed-point quantized type using the + * \ref FixedPointQTraits trait and a suitable unsigned integer storage type + * automatically selected based on the total number of bits \a (I + F). + */ + } /* namespace ipa */ } /* namespace libcamera */ diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h index 48a9757f9554..33d1f4af4792 100644 --- a/src/ipa/libipa/fixedpoint.h +++ b/src/ipa/libipa/fixedpoint.h @@ -10,6 +10,8 @@ #include <cmath> #include <type_traits> +#include "quantized.h" + namespace libcamera { namespace ipa { @@ -63,6 +65,78 @@ constexpr R fixedToFloatingPoint(T number) return static_cast<R>(t) / static_cast<R>(1 << F); } +template<unsigned int I, unsigned int F, typename T> +struct FixedPointQTraits { +private: + static_assert(std::is_integral_v<T>, "FixedPointQTraits: T must be integral"); + using UT = std::make_unsigned_t<T>; + + static constexpr unsigned int bits = I + F; + static_assert(bits <= sizeof(T) * 8, "FixedPointQTraits: too many bits for type T"); + + /* + * If fixed point storage is required with more than 24 bits, consider + * updating this implementation to use double-precision floating point. + */ + static_assert(bits <= 24, "Floating point precision may be insufficient for more than 24 bits"); + + static constexpr T bitMask = (bits < sizeof(T) * 8) + ? static_cast<T>((UT{1} << bits) - 1) + : static_cast<T>(~UT{0}); + +public: + using QuantizedType = T; + + static constexpr T qMin = std::is_signed_v<T> + ? static_cast<T>(-(UT{1} << (bits - 1))) + : static_cast<T>(0); + + static constexpr T qMax = std::is_signed_v<T> + ? 
static_cast<T>((UT{1} << (bits - 1)) - 1) + : bitMask; + + static constexpr float toFloat(QuantizedType q) + { + return fixedToFloatingPoint<I, F, float, QuantizedType>(q); + } + + static constexpr float min = fixedToFloatingPoint<I, F, float>(qMin); + static constexpr float max = fixedToFloatingPoint<I, F, float>(qMax); + + static_assert(min < max, "FixedPointQTraits: Minimum must be less than maximum"); + + /* Conversion functions required by Quantized<Traits> */ + static QuantizedType fromFloat(float v) + { + v = std::clamp(v, min, max); + return floatingToFixedPoint<I, F, QuantizedType, float>(v); + } +}; + +namespace details { + +template<unsigned int Bits> +constexpr auto qtype() +{ + static_assert(Bits <= 32, + "Unsupported number of bits for quantized type"); + + if constexpr (Bits <= 8) + return int8_t(); + else if constexpr (Bits <= 16) + return int16_t(); + else if constexpr (Bits <= 32) + return int32_t(); +} + +} /* namespace details */ + +template<unsigned int I, unsigned int F> +using Q = Quantized<FixedPointQTraits<I, F, decltype(details::qtype<I + F>())>>; + +template<unsigned int I, unsigned int F> +using UQ = Quantized<FixedPointQTraits<I, F, std::make_unsigned_t<decltype(details::qtype<I + F>())>>>; + } /* namespace ipa */ } /* namespace libcamera */
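As a worked check of the limits defined in the patch, the following standalone sketch mirrors the qMin/qMax expressions and the clamp-then-round behaviour documented for fromFloat(). It is not the libcamera implementation: SketchTraits and the SQ4_4/SUQ4_4/SQ6_10 aliases are hypothetical names, the conversion divides by 2^F directly instead of calling fixedToFloatingPoint(), and it assumes signed values are stored already sign-extended in T.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <type_traits>

/* Hypothetical, simplified stand-in for FixedPointQTraits. */
template<unsigned int I, unsigned int F, typename T>
struct SketchTraits {
	using UT = std::make_unsigned_t<T>;

	static constexpr unsigned int bits = I + F;
	static_assert(bits >= 1 && bits <= sizeof(T) * 8, "too many bits for T");

	/* Most negative two's complement value for signed T, zero otherwise. */
	static constexpr T qMin = std::is_signed_v<T>
				? static_cast<T>(-(UT{1} << (bits - 1)))
				: static_cast<T>(0);

	/* Largest positive value for signed T, all 'bits' ones otherwise. */
	static constexpr T qMax = std::is_signed_v<T>
				? static_cast<T>((UT{1} << (bits - 1)) - 1)
				: static_cast<T>(static_cast<UT>(~UT{0}) >> (sizeof(T) * 8 - bits));

	/* One integer step of the quantized value is worth 1 / 2^F. */
	static constexpr float scale = static_cast<float>(1u << F);
	static constexpr float min = static_cast<float>(qMin) / scale;
	static constexpr float max = static_cast<float>(qMax) / scale;

	/* Clamp to [min, max] first, then round to the nearest step. */
	static T fromFloat(float v)
	{
		v = std::clamp(v, min, max);
		return static_cast<T>(std::round(v * scale));
	}
};

using SQ4_4 = SketchTraits<4, 4, int8_t>;
using SUQ4_4 = SketchTraits<4, 4, uint8_t>;
using SQ6_10 = SketchTraits<6, 10, int16_t>;

/* Signed 4.4: the sign bit is counted in I, so the range is [-8, 7.9375]. */
static_assert(SQ4_4::qMin == -128 && SQ4_4::qMax == 127);
static_assert(SQ4_4::min == -8.0f && SQ4_4::max == 7.9375f);

/* Unsigned 4.4: all four integer bits carry magnitude, range [0, 15.9375]. */
static_assert(SUQ4_4::qMin == 0 && SUQ4_4::qMax == 255);
static_assert(SUQ4_4::min == 0.0f && SUQ4_4::max == 15.9375f);

/* Signed 6.10 in 16 bits, the first case tried in the review. */
static_assert(SQ6_10::qMin == -32768 && SQ6_10::min == -32.0f);
```

The signed 4.4 case matches the "[0x80:-8]" output quoted in the review: the quantized minimum is -128 (bit pattern 0x80) and its float value is -8.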