Message ID | 20250912125528.1963619-1-barnabas.pocze@ideasonboard.com |
---|---|
State | New |
On Fri, Sep 12, 2025 at 02:55:27PM +0200, Barnabás Pőcze wrote:
> Other code generation scripts do that already and let pyyaml deal with
> decoding utf-8, etc. So do the same here as well.

How does pyyaml determine the encoding ? Does it just hardcode utf-8 ?

> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
> ---
>  src/py/libcamera/gen-py-formats.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/py/libcamera/gen-py-formats.py b/src/py/libcamera/gen-py-formats.py
> index 0ff1d12ac..6323e237f 100755
> --- a/src/py/libcamera/gen-py-formats.py
> +++ b/src/py/libcamera/gen-py-formats.py
> @@ -37,7 +37,7 @@ def main(argv):
>                          help='Template file name.')
>      args = parser.parse_args(argv[1:])
>
> -    with open(args.input, encoding='utf-8') as f:
> +    with open(args.input, 'rb') as f:
>          formats = yaml.safe_load(f)['formats']
>
>      data = generate(formats)
On 2025-09-12 at 16:00, Laurent Pinchart wrote:
> On Fri, Sep 12, 2025 at 02:55:27PM +0200, Barnabás Pőcze wrote:
>> Other code generation scripts do that already and let pyyaml deal with
>> decoding utf-8, etc. So do the same here as well.
>
> How does pyyaml determine the encoding ? Does it just hardcode utf-8 ?

https://yaml.org/spec/1.2.2/#52-character-encodings says that if there is no
BOM, then it is utf-8. And additionally:

  If a character stream begins with a byte order mark, the character encoding will be
  taken to be as indicated by the byte order mark. Otherwise, the stream must begin
  with an ASCII character. This allows the encoding to be deduced by the pattern of
  null (x00) characters.

So for our purposes it will deduce utf-8 since no yaml file that is used here starts
with a BOM or a "long ascii character" as far as I can tell.

Due to this special behaviour, I'd say opening it in binary mode is the correct choice.

Regards,
Barnabás Pőcze

>
>> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
>> ---
>>  src/py/libcamera/gen-py-formats.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/py/libcamera/gen-py-formats.py b/src/py/libcamera/gen-py-formats.py
>> index 0ff1d12ac..6323e237f 100755
>> --- a/src/py/libcamera/gen-py-formats.py
>> +++ b/src/py/libcamera/gen-py-formats.py
>> @@ -37,7 +37,7 @@ def main(argv):
>>                          help='Template file name.')
>>      args = parser.parse_args(argv[1:])
>>
>> -    with open(args.input, encoding='utf-8') as f:
>> +    with open(args.input, 'rb') as f:
>>          formats = yaml.safe_load(f)['formats']
>>
>>      data = generate(formats)
>
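For illustration only, the deduction rule quoted from the YAML 1.2.2 specification can be sketched in plain Python. This follows the byte-pattern table in section 5.2 of the spec; it is not PyYAML's code, and PyYAML's own detection may be simpler. The function name `detect_yaml_encoding` is hypothetical.

```python
def detect_yaml_encoding(head: bytes) -> str:
    """Guess a YAML stream's encoding from its first bytes (YAML 1.2.2, section 5.2).

    Hypothetical helper for illustration; it only names the encoding and
    does not decode anything.
    """
    b = head[:4]
    # UTF-32 is tested first because the UTF-32LE BOM begins with the
    # UTF-16LE BOM (FF FE).
    if b.startswith(b'\x00\x00\xfe\xff'):        # explicit BOM
        return 'utf-32-be'
    if len(b) == 4 and b[:3] == b'\x00\x00\x00':  # ASCII first character
        return 'utf-32-be'
    if b.startswith(b'\xff\xfe\x00\x00'):        # explicit BOM
        return 'utf-32-le'
    if len(b) == 4 and b[1:] == b'\x00\x00\x00':  # ASCII first character
        return 'utf-32-le'
    if b.startswith(b'\xfe\xff'):                # explicit BOM
        return 'utf-16-be'
    if len(b) >= 2 and b[0] == 0:                # ASCII first character
        return 'utf-16-be'
    if b.startswith(b'\xff\xfe'):                # explicit BOM
        return 'utf-16-le'
    if len(b) >= 2 and b[1] == 0:                # ASCII first character
        return 'utf-16-le'
    # UTF-8 BOM (EF BB BF) or no recognisable pattern: default to UTF-8.
    return 'utf-8'
```

A file that starts with plain ASCII such as `formats:` and carries no BOM falls through to the final `return`, which matches the utf-8 conclusion above.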
On Fri, Sep 12, 2025 at 04:08:50PM +0200, Barnabás Pőcze wrote:
> On 2025-09-12 at 16:00, Laurent Pinchart wrote:
> > On Fri, Sep 12, 2025 at 02:55:27PM +0200, Barnabás Pőcze wrote:
> >> Other code generation scripts do that already and let pyyaml deal with
> >> decoding utf-8, etc. So do the same here as well.
> >
> > How does pyyaml determine the encoding ? Does it just hardcode utf-8 ?
>
> https://yaml.org/spec/1.2.2/#52-character-encodings says that if there is no
> BOM, then it is utf-8. And additionally:
>
>   If a character stream begins with a byte order mark, the character encoding will be
>   taken to be as indicated by the byte order mark. Otherwise, the stream must begin
>   with an ASCII character. This allows the encoding to be deduced by the pattern of
>   null (x00) characters.
>
> So for our purposes it will deduce utf-8 since no yaml file that is used here starts
> with a BOM or a "long ascii character" as far as I can tell.
>
> Due to this special behaviour, I'd say opening it in binary mode is the correct choice.

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> >> Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
> >> ---
> >>  src/py/libcamera/gen-py-formats.py | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/src/py/libcamera/gen-py-formats.py b/src/py/libcamera/gen-py-formats.py
> >> index 0ff1d12ac..6323e237f 100755
> >> --- a/src/py/libcamera/gen-py-formats.py
> >> +++ b/src/py/libcamera/gen-py-formats.py
> >> @@ -37,7 +37,7 @@ def main(argv):
> >>                          help='Template file name.')
> >>      args = parser.parse_args(argv[1:])
> >>
> >> -    with open(args.input, encoding='utf-8') as f:
> >> +    with open(args.input, 'rb') as f:
> >>          formats = yaml.safe_load(f)['formats']
> >>
> >>      data = generate(formats)
diff --git a/src/py/libcamera/gen-py-formats.py b/src/py/libcamera/gen-py-formats.py
index 0ff1d12ac..6323e237f 100755
--- a/src/py/libcamera/gen-py-formats.py
+++ b/src/py/libcamera/gen-py-formats.py
@@ -37,7 +37,7 @@ def main(argv):
                         help='Template file name.')
     args = parser.parse_args(argv[1:])
 
-    with open(args.input, encoding='utf-8') as f:
+    with open(args.input, 'rb') as f:
         formats = yaml.safe_load(f)['formats']
 
     data = generate(formats)
Other code generation scripts do that already and let pyyaml deal with
decoding utf-8, etc. So do the same here as well.

Signed-off-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>
---
 src/py/libcamera/gen-py-formats.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)