[v6,03/16] ipa: libipa: fixedpoint: Fix unsigned usage
diff mbox series

Message ID 20260121173737.376113-4-kieran.bingham@ideasonboard.com
State New
Headers show
Series
  • libipa: Introduce a Quantized type
Related show

Commit Message

Kieran Bingham Jan. 21, 2026, 5:37 p.m. UTC
The fixedToFloatingPoint does not support unsigned Q types, and
incorrectly sign-extends all values which have the top most bit set in
the quantized values.

Fix this by ensuring that only signed types perform sign extension, and
simplify the calculation for unsigned types.

Convert the storage of the test cases to signed types to correctly
represent their intended purpose, to prevent test failures.

Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com>
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

---

v6:

 - Use T{1} << F instead of 1 << F

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
---
 src/ipa/libipa/fixedpoint.h    | 3 +++
 test/ipa/libipa/fixedpoint.cpp | 6 +++---
 2 files changed, 6 insertions(+), 3 deletions(-)

Comments

Stefan Klug Jan. 22, 2026, 5:24 p.m. UTC | #1
Hi Kieran,

Quoting Kieran Bingham (2026-01-21 18:37:22)
> The fixedToFloatingPoint does not support unsigned Q types, and
> incorrectly sign-extends all values which have the top most bit set in
> the quantized values.
> 
> Fix this by ensuring that only signed types perform sign extension, and
> simplify the calculation for unsigned types.

I need to digest that a bit. That means a signed fixed point number must
always be represented by a signed integral type?

I didn't expect that, to me it feels strange to represent any fixed
point value as signed int. I need to think about that. But it seems to
introduce an asymmetry. In rkisp1/algorithms/ccm.cpp we convert a float
to uint16_t which carries a signed FP value. So if I'm not mistaken the
backwards conversion of that ccm value would now be broken?

Best regards,
Stefan

> 
> Convert the storage of the test cases to signed types to correctly
> represent their intended purpose, to prevent test failures.
> 
> Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com>
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> ---
> 
> v6:
> 
>  - Use T{1} << F instead of 1 << F
> 
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> ---
>  src/ipa/libipa/fixedpoint.h    | 3 +++
>  test/ipa/libipa/fixedpoint.cpp | 6 +++---
>  2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h
> index 709cf50f0fcd..48a9757f9554 100644
> --- a/src/ipa/libipa/fixedpoint.h
> +++ b/src/ipa/libipa/fixedpoint.h
> @@ -49,6 +49,9 @@ constexpr R fixedToFloatingPoint(T number)
>         static_assert(sizeof(int) >= sizeof(T));
>         static_assert(I + F <= sizeof(T) * 8);
>  
> +       if constexpr (std::is_unsigned_v<T>)
> +               return static_cast<R>(number) / static_cast<R>(T{1} << F);
> +
>         /*
>          * Recreate the upper bits in case of a negative number by shifting the sign
>          * bit from the fixed point to the first bit of the unsigned and then right shifting
> diff --git a/test/ipa/libipa/fixedpoint.cpp b/test/ipa/libipa/fixedpoint.cpp
> index 99eb662ddf4e..4b017e86a74f 100644
> --- a/test/ipa/libipa/fixedpoint.cpp
> +++ b/test/ipa/libipa/fixedpoint.cpp
> @@ -68,7 +68,7 @@ protected:
>                  * The second 7.992 test is to test that unused bits don't
>                  * affect the result.
>                  */
> -               std::map<double, uint16_t> testCases = {
> +               std::map<double, int16_t> testCases = {
>                         { 7.992, 0x3ff },
>                         {   0.2, 0x01a },
>                         {  -0.2, 0x7e6 },
> @@ -81,14 +81,14 @@ protected:
>  
>                 int ret;
>                 for (const auto &testCase : testCases) {
> -                       ret = testSingleFixedPoint<4, 7, uint16_t>(testCase.first,
> +                       ret = testSingleFixedPoint<4, 7, int16_t>(testCase.first,
>                                                                    testCase.second);
>                         if (ret != TestPass)
>                                 return ret;
>                 }
>  
>                 /* Special case with a superfluous one in the unused bits */
> -               ret = testFixedToFloat<4, 7, uint16_t, double>(0xbff, 7.992);
> +               ret = testFixedToFloat<4, 7, int16_t, double>(0xbff, 7.992);
>                 if (ret != TestPass)
>                         return ret;
>  
> -- 
> 2.52.0
>
Kieran Bingham Jan. 22, 2026, 5:43 p.m. UTC | #2
Quoting Stefan Klug (2026-01-22 17:24:56)
> Hi Kieran,
> 
> Quoting Kieran Bingham (2026-01-21 18:37:22)
> > The fixedToFloatingPoint does not support unsigned Q types, and
> > incorrectly sign-extends all values which have the top most bit set in
> > the quantized values.
> > 
> > Fix this by ensuring that only signed types perform sign extension, and
> > simplify the calculation for unsigned types.
> 
> I need to digest that a bit. That means a signed fixed point number must
> always be represented by a signed integral type?

Yes, up to this point - both 'signed' and 'unsigned' types are
represented and stored into an 'unsigned' underlying integral type when
processed through these functions.

> I didn't expect that, to me it feels strange to represent any fixed
> point value as signed int. I need to think about that. But it seems to
> introduce an asymmetry. In rkisp1/algorithms/ccm.cpp we convert a float
> to uint16_t which carries a signed FP value. So if I'm not mistaken the
> backwards conversion of that ccm value would now be broken?

No, the addition of

	if constexpr (std::is_unsigned_v<T>)
		return static_cast<R>(number) / static_cast<R>(T{1} << F);

should keep it working at this stage, though by the end of this series -
as you'll see these functions are completely removed, so please do
certainly check that the CCM value is handled correctly by the new
types.


--
Kieran



> 
> Best regards,
> Stefan
> 
> > 
> > Convert the storage of the test cases to signed types to correctly
> > represent their intended purpose, to prevent test failures.
> > 
> > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com>
> > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > 
> > ---
> > 
> > v6:
> > 
> >  - Use T{1} << F instead of 1 << F
> > 
> > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > ---
> >  src/ipa/libipa/fixedpoint.h    | 3 +++
> >  test/ipa/libipa/fixedpoint.cpp | 6 +++---
> >  2 files changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h
> > index 709cf50f0fcd..48a9757f9554 100644
> > --- a/src/ipa/libipa/fixedpoint.h
> > +++ b/src/ipa/libipa/fixedpoint.h
> > @@ -49,6 +49,9 @@ constexpr R fixedToFloatingPoint(T number)
> >         static_assert(sizeof(int) >= sizeof(T));
> >         static_assert(I + F <= sizeof(T) * 8);
> >  
> > +       if constexpr (std::is_unsigned_v<T>)
> > +               return static_cast<R>(number) / static_cast<R>(T{1} << F);
> > +
> >         /*
> >          * Recreate the upper bits in case of a negative number by shifting the sign
> >          * bit from the fixed point to the first bit of the unsigned and then right shifting
> > diff --git a/test/ipa/libipa/fixedpoint.cpp b/test/ipa/libipa/fixedpoint.cpp
> > index 99eb662ddf4e..4b017e86a74f 100644
> > --- a/test/ipa/libipa/fixedpoint.cpp
> > +++ b/test/ipa/libipa/fixedpoint.cpp
> > @@ -68,7 +68,7 @@ protected:
> >                  * The second 7.992 test is to test that unused bits don't
> >                  * affect the result.
> >                  */
> > -               std::map<double, uint16_t> testCases = {
> > +               std::map<double, int16_t> testCases = {
> >                         { 7.992, 0x3ff },
> >                         {   0.2, 0x01a },
> >                         {  -0.2, 0x7e6 },
> > @@ -81,14 +81,14 @@ protected:
> >  
> >                 int ret;
> >                 for (const auto &testCase : testCases) {
> > -                       ret = testSingleFixedPoint<4, 7, uint16_t>(testCase.first,
> > +                       ret = testSingleFixedPoint<4, 7, int16_t>(testCase.first,
> >                                                                    testCase.second);
> >                         if (ret != TestPass)
> >                                 return ret;
> >                 }
> >  
> >                 /* Special case with a superfluous one in the unused bits */
> > -               ret = testFixedToFloat<4, 7, uint16_t, double>(0xbff, 7.992);
> > +               ret = testFixedToFloat<4, 7, int16_t, double>(0xbff, 7.992);
> >                 if (ret != TestPass)
> >                         return ret;
> >  
> > -- 
> > 2.52.0
> >
Kieran Bingham Jan. 22, 2026, 5:45 p.m. UTC | #3
Quoting Kieran Bingham (2026-01-22 17:43:41)
> Quoting Stefan Klug (2026-01-22 17:24:56)
> > Hi Kieran,
> > 
> > Quoting Kieran Bingham (2026-01-21 18:37:22)
> > > The fixedToFloatingPoint does not support unsigned Q types, and
> > > incorrectly sign-extends all values which have the top most bit set in
> > > the quantized values.
> > > 
> > > Fix this by ensuring that only signed types perform sign extension, and
> > > simplify the calculation for unsigned types.
> > 
> > I need to digest that a bit. That means a signed fixed point number must
> > always be represented by a signed integral type?
> 
> Yes, up to this point - both 'signed' and 'unsigned' types are
> represented and stored into an 'unsigned' underlying integral type when
> processed through these functions.
> 
> > I didn't expect that, to me it feels strange to represent any fixed
> > point value as signed int. I need to think about that. But it seems to
> > introduce an asymmetry. In rkisp1/algorithms/ccm.cpp we convert a float
> > to uint16_t which carries a signed FP value. So if I'm not mistaken the
> > backwards conversion of that ccm value would now be broken?

Oh - wait - do you mean I need to change the storage type of
rkisp1/algorithms/ccm.cpp to an int16_t 'in this patch' before it gets
converted perhaps!?


> 
> No, the addition of
> 
>         if constexpr (std::is_unsigned_v<T>)
>                 return static_cast<R>(number) / static_cast<R>(T{1} << F);
> 
> should keep it working at this stage, though by the end of this series -
> as you'll see these functions are completely removed, so please do
> certainly check that the CCM value is handled correctly by the new
> types.
> 
> 
> --
> Kieran
> 
> 
> 
> > 
> > Best regards,
> > Stefan
> > 
> > > 
> > > Convert the storage of the test cases to signed types to correctly
> > > represent their intended purpose, to prevent test failures.
> > > 
> > > Reviewed-by: Isaac Scott <isaac.scott@ideasonboard.com>
> > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > 
> > > ---
> > > 
> > > v6:
> > > 
> > >  - Use T{1} << F instead of 1 << F
> > > 
> > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > ---
> > >  src/ipa/libipa/fixedpoint.h    | 3 +++
> > >  test/ipa/libipa/fixedpoint.cpp | 6 +++---
> > >  2 files changed, 6 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h
> > > index 709cf50f0fcd..48a9757f9554 100644
> > > --- a/src/ipa/libipa/fixedpoint.h
> > > +++ b/src/ipa/libipa/fixedpoint.h
> > > @@ -49,6 +49,9 @@ constexpr R fixedToFloatingPoint(T number)
> > >         static_assert(sizeof(int) >= sizeof(T));
> > >         static_assert(I + F <= sizeof(T) * 8);
> > >  
> > > +       if constexpr (std::is_unsigned_v<T>)
> > > +               return static_cast<R>(number) / static_cast<R>(T{1} << F);
> > > +
> > >         /*
> > >          * Recreate the upper bits in case of a negative number by shifting the sign
> > >          * bit from the fixed point to the first bit of the unsigned and then right shifting
> > > diff --git a/test/ipa/libipa/fixedpoint.cpp b/test/ipa/libipa/fixedpoint.cpp
> > > index 99eb662ddf4e..4b017e86a74f 100644
> > > --- a/test/ipa/libipa/fixedpoint.cpp
> > > +++ b/test/ipa/libipa/fixedpoint.cpp
> > > @@ -68,7 +68,7 @@ protected:
> > >                  * The second 7.992 test is to test that unused bits don't
> > >                  * affect the result.
> > >                  */
> > > -               std::map<double, uint16_t> testCases = {
> > > +               std::map<double, int16_t> testCases = {
> > >                         { 7.992, 0x3ff },
> > >                         {   0.2, 0x01a },
> > >                         {  -0.2, 0x7e6 },
> > > @@ -81,14 +81,14 @@ protected:
> > >  
> > >                 int ret;
> > >                 for (const auto &testCase : testCases) {
> > > -                       ret = testSingleFixedPoint<4, 7, uint16_t>(testCase.first,
> > > +                       ret = testSingleFixedPoint<4, 7, int16_t>(testCase.first,
> > >                                                                    testCase.second);
> > >                         if (ret != TestPass)
> > >                                 return ret;
> > >                 }
> > >  
> > >                 /* Special case with a superfluous one in the unused bits */
> > > -               ret = testFixedToFloat<4, 7, uint16_t, double>(0xbff, 7.992);
> > > +               ret = testFixedToFloat<4, 7, int16_t, double>(0xbff, 7.992);
> > >                 if (ret != TestPass)
> > >                         return ret;
> > >  
> > > -- 
> > > 2.52.0
> > >

Patch
diff mbox series

diff --git a/src/ipa/libipa/fixedpoint.h b/src/ipa/libipa/fixedpoint.h
index 709cf50f0fcd..48a9757f9554 100644
--- a/src/ipa/libipa/fixedpoint.h
+++ b/src/ipa/libipa/fixedpoint.h
@@ -49,6 +49,9 @@  constexpr R fixedToFloatingPoint(T number)
 	static_assert(sizeof(int) >= sizeof(T));
 	static_assert(I + F <= sizeof(T) * 8);
 
+	if constexpr (std::is_unsigned_v<T>)
+		return static_cast<R>(number) / static_cast<R>(T{1} << F);
+
 	/*
 	 * Recreate the upper bits in case of a negative number by shifting the sign
 	 * bit from the fixed point to the first bit of the unsigned and then right shifting
diff --git a/test/ipa/libipa/fixedpoint.cpp b/test/ipa/libipa/fixedpoint.cpp
index 99eb662ddf4e..4b017e86a74f 100644
--- a/test/ipa/libipa/fixedpoint.cpp
+++ b/test/ipa/libipa/fixedpoint.cpp
@@ -68,7 +68,7 @@  protected:
 		 * The second 7.992 test is to test that unused bits don't
 		 * affect the result.
 		 */
-		std::map<double, uint16_t> testCases = {
+		std::map<double, int16_t> testCases = {
 			{ 7.992, 0x3ff },
 			{   0.2, 0x01a },
 			{  -0.2, 0x7e6 },
@@ -81,14 +81,14 @@  protected:
 
 		int ret;
 		for (const auto &testCase : testCases) {
-			ret = testSingleFixedPoint<4, 7, uint16_t>(testCase.first,
+			ret = testSingleFixedPoint<4, 7, int16_t>(testCase.first,
 								   testCase.second);
 			if (ret != TestPass)
 				return ret;
 		}
 
 		/* Special case with a superfluous one in the unused bits */
-		ret = testFixedToFloat<4, 7, uint16_t, double>(0xbff, 7.992);
+		ret = testFixedToFloat<4, 7, int16_t, double>(0xbff, 7.992);
 		if (ret != TestPass)
 			return ret;