[libcamera-devel,v1,4/4] utils: checkstyle: Add trailers checker
diff mbox series

Message ID 20230612224751.4437-5-laurent.pinchart@ideasonboard.com
State Accepted
Commit d06ed87d49ca3d734fd1c2f1409280abb499c625
Headers show
Series
  • utils: checkstyle: Add a commit message trailers checker
Related show

Commit Message

Laurent Pinchart June 12, 2023, 10:47 p.m. UTC
The libcamera git history contains numerous examples of incorrect commit
message trailers due to invalid trailer types (e.g. Change-Id), typos
and other small issues. Those went unnoticed through reviews, which
shows that an automated checker is required.

Add a trailers checker to checkstyle.py to catch invalid or malformed
trailers, with a set of supported trailers that match libcamera's commit
message practices. New trailer keys can easily be added later as new
needs arise.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
This reports a total of 42 issues through the project's history. 37 of
them should not be controversial:

- Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
  and 'Reported-on')

- Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
  of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')

- Typos in e-mail address (missing display name, missing space before
  '<' and missing trailing '>')

- Link in 'Fixes' trailer (should be a commit)

- Too short commit ID in 'Fixes' trailer

- Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
  space or extra ':' after commit ID, missing '"' around commit subject)

The five remaining issues may benefit from discussions:

- Invalid trailer key 'Co-developed-by' (one instance). This is a
  trailer key commonly used in the kernel, but git..b both recommend
  Co-authored-by. I'm sure which option would be best, so I haven't
  included either for now.

- Typo in 'Reported-by' trailer for issues reported by Coverity (one
  instance). 'Reported-by' usually has an e-mail address value, but we
  have commonly used 'Coverity CID=<CID>' for issues reported by
  Coverity. I've tentatively added support for this (feedback is
  welcome), and one commit still got flagged as its 'Reported-by'
  trailer has a space instead of an '=' after 'CID'.

- Usage of a github user URL in 'Reported-by' (one instance). Our policy
  is to be able to identify users by name and e-mail address for
  'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
  trailers too. If someone *really* doesn't want their name included in
  the git log when reporting an issue, we can simply omit the
  'Reported-by' trailer.

- Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
  instances). This should use a 'Link' trailer instead.
---
 utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 78 insertions(+), 2 deletions(-)

Comments

Kieran Bingham July 5, 2023, 1:07 p.m. UTC | #1
Quoting Laurent Pinchart via libcamera-devel (2023-06-12 23:47:51)
> The libcamera git history contains numerous examples of incorrect commit
> message trailers due to invalid trailer types (e.g. Change-Id), typos
> and other small issues. Those went unnoticed through reviews, which
> shows that an automated checker is required.
> 
> Add a trailers checker to checkstyle.py to catch invalid or malformed
> trailers, with a set of supported trailers that match libcamera's commit
> message practices. New trailer keys can easily be added later as new
> needs arise.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> This reports a total of 42 issues through the project's history. 37 of
> them should not be controversial:
> 
> - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
>   and 'Reported-on')
> 
> - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
>   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
> 
> - Typos in e-mail address (missing display name, missing space before
>   '<' and missing trailing '>')
> 
> - Link in 'Fixes' trailer (should be a commit)
> 
> - Too short commit ID in 'Fixes' trailer
> 
> - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
>   space or extra ':' after commit ID, missing '"' around commit subject)
> 
> The five remaining issues may benefit from discussions:
> 
> - Invalid trailer key 'Co-developed-by' (one instance). This is a
>   trailer key commonly used in the kernel, but git..b both recommend
>   Co-authored-by. I'm sure which option would be best, so I haven't
>   included either for now.

I think those can be figured out when we next need them then.

I don't mind either, and the checkstyle won't prevent us using them -
just highlights them for discussion anyway (which is good).


> 
> - Typo in 'Reported-by' trailer for issues reported by Coverity (one
>   instance). 'Reported-by' usually has an e-mail address value, but we
>   have commonly used 'Coverity CID=<CID>' for issues reported by
>   Coverity. I've tentatively added support for this (feedback is
>   welcome), and one commit still got flagged as its 'Reported-by'
>   trailer has a space instead of an '=' after 'CID'.

This is fine with me - if it's defined in the checker we just use that
going forwards.

Coverity CID=<CID> looks good to me.


> 
> - Usage of a github user URL in 'Reported-by' (one instance). Our policy
>   is to be able to identify users by name and e-mail address for
>   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
>   trailers too. If someone *really* doesn't want their name included in
>   the git log when reporting an issue, we can simply omit the
>   'Reported-by' trailer.

Agreed,

> 
> - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
>   instances). This should use a 'Link' trailer instead.

Link sounds fine if it's a full link. Otherwise - we'll end up with
Buildbot: Jenkins: Lava: ... 


This looks good to me anyway, lets see how it runs in production!


Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
--
Kieran


> ---
>  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 78 insertions(+), 2 deletions(-)
> 
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index e68c874609bc..3558740d389d 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -210,13 +210,23 @@ class Commit:
>  
>      def _parse(self):
>          # Get the commit title and list of files.
> -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
>                                self.commit],
>                               stdout=subprocess.PIPE).stdout.decode('utf-8')
>          lines = ret.splitlines()
> -        self._files = [CommitFile(f) for f in lines[1:] if f]
> +
>          self._title = lines[0]
>  
> +        self._trailers = []
> +        for index in range(1, len(lines)):
> +            line = lines[index]
> +            if not line:
> +                break
> +
> +            self._trailers.append(line)
> +
> +        self._files = [CommitFile(f) for f in lines[index:] if f]
> +
>      def files(self, filter='AMR'):
>          return [f.filename for f in self._files if f.status in filter]
>  
> @@ -224,6 +234,10 @@ class Commit:
>      def title(self):
>          return self._title
>  
> +    @property
> +    def trailers(self):
> +        return self._trailers
> +
>      def get_diff(self, top_level, filename):
>          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
>                                 '--', '%s/%s' % (top_level, filename)],
> @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
>                              'possible candidates are ' + candidates)]
>  
>  
> +class TrailersChecker(CommitChecker):
> +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> +
> +    coverity_regex = re.compile(r'Coverity CID=.*')
> +
> +    # Simple e-mail address validator regex, with an additional trailing
> +    # comment. The complexity of a full RFC6531 validator isn't worth the
> +    # additional invalid addresses it would reject.
> +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> +
> +    link_regex = re.compile(r'https?://.*')
> +
> +    @staticmethod
> +    def validate_reported_by(value):
> +        if TrailersChecker.email_regex.fullmatch(value):
> +            return True
> +        if TrailersChecker.coverity_regex.fullmatch(value):
> +            return True
> +        return False
> +
> +    known_trailers = {
> +        'Acked-by': email_regex,
> +        'Bug': link_regex,
> +        'Fixes': commit_regex,
> +        'Link': link_regex,
> +        'Reported-by': validate_reported_by,
> +        'Reviewed-by': email_regex,
> +        'Signed-off-by': email_regex,
> +        'Suggested-by': email_regex,
> +        'Tested-by': email_regex,
> +    }
> +
> +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> +
> +    @classmethod
> +    def check(cls, commit, top_level):
> +        issues = []
> +
> +        for trailer in commit.trailers:
> +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> +            if not match:
> +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> +
> +            key, value = match.groups()
> +
> +            validator = TrailersChecker.known_trailers.get(key)
> +            if not validator:
> +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> +                continue
> +
> +            if isinstance(validator, re.Pattern):
> +                valid = bool(validator.fullmatch(value))
> +            else:
> +                valid = validator(value)
> +
> +            if not valid:
> +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> +                continue
> +
> +        return issues
> +
> +
>  # ------------------------------------------------------------------------------
>  # Style Checkers
>  #
> -- 
> Regards,
> 
> Laurent Pinchart
>
Naushir Patuck July 10, 2023, 10:33 a.m. UTC | #2
Hi Laurent,

I've updated my tree and this change now seems to break my pre-commit hook:

naushir@work:~/libcamera/ $ git commit -a -s
---------------
 Staged changes
---------------
Traceback (most recent call last):
  File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
1052, in <module>
    sys.exit(main(sys.argv))
  File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
1042, in main
    issues += check_style(top_level, commit, args.checkers)
  File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
935, in check_style
    for issue in checker.check(commit, top_level):
  File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
479, in check
    for trailer in commit.trailers:
  File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
239, in trailers
    return self._trailers
AttributeError: 'StagedChanges' object has no attribute '_trailers'.
Did you mean: 'trailers'?

I'm using the same old hook found in ./utils/hooks/pre-commit. I've briefly
tried debugging, but quickly found that I'm in way over my depth with the
checkstyle.py script :-)

Regards,
Naush

On Mon, 12 Jun 2023 at 23:48, Laurent Pinchart via libcamera-devel
<libcamera-devel@lists.libcamera.org> wrote:
>
> The libcamera git history contains numerous examples of incorrect commit
> message trailers due to invalid trailer types (e.g. Change-Id), typos
> and other small issues. Those went unnoticed through reviews, which
> shows that an automated checker is required.
>
> Add a trailers checker to checkstyle.py to catch invalid or malformed
> trailers, with a set of supported trailers that match libcamera's commit
> message practices. New trailer keys can easily be added later as new
> needs arise.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> This reports a total of 42 issues through the project's history. 37 of
> them should not be controversial:
>
> - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
>   and 'Reported-on')
>
> - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
>   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
>
> - Typos in e-mail address (missing display name, missing space before
>   '<' and missing trailing '>')
>
> - Link in 'Fixes' trailer (should be a commit)
>
> - Too short commit ID in 'Fixes' trailer
>
> - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
>   space or extra ':' after commit ID, missing '"' around commit subject)
>
> The five remaining issues may benefit from discussions:
>
> - Invalid trailer key 'Co-developed-by' (one instance). This is a
>   trailer key commonly used in the kernel, but git..b both recommend
>   Co-authored-by. I'm sure which option would be best, so I haven't
>   included either for now.
>
> - Typo in 'Reported-by' trailer for issues reported by Coverity (one
>   instance). 'Reported-by' usually has an e-mail address value, but we
>   have commonly used 'Coverity CID=<CID>' for issues reported by
>   Coverity. I've tentatively added support for this (feedback is
>   welcome), and one commit still got flagged as its 'Reported-by'
>   trailer has a space instead of an '=' after 'CID'.
>
> - Usage of a github user URL in 'Reported-by' (one instance). Our policy
>   is to be able to identify users by name and e-mail address for
>   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
>   trailers too. If someone *really* doesn't want their name included in
>   the git log when reporting an issue, we can simply omit the
>   'Reported-by' trailer.
>
> - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
>   instances). This should use a 'Link' trailer instead.
> ---
>  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 78 insertions(+), 2 deletions(-)
>
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index e68c874609bc..3558740d389d 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -210,13 +210,23 @@ class Commit:
>
>      def _parse(self):
>          # Get the commit title and list of files.
> -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
>                                self.commit],
>                               stdout=subprocess.PIPE).stdout.decode('utf-8')
>          lines = ret.splitlines()
> -        self._files = [CommitFile(f) for f in lines[1:] if f]
> +
>          self._title = lines[0]
>
> +        self._trailers = []
> +        for index in range(1, len(lines)):
> +            line = lines[index]
> +            if not line:
> +                break
> +
> +            self._trailers.append(line)
> +
> +        self._files = [CommitFile(f) for f in lines[index:] if f]
> +
>      def files(self, filter='AMR'):
>          return [f.filename for f in self._files if f.status in filter]
>
> @@ -224,6 +234,10 @@ class Commit:
>      def title(self):
>          return self._title
>
> +    @property
> +    def trailers(self):
> +        return self._trailers
> +
>      def get_diff(self, top_level, filename):
>          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
>                                 '--', '%s/%s' % (top_level, filename)],
> @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
>                              'possible candidates are ' + candidates)]
>
>
> +class TrailersChecker(CommitChecker):
> +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> +
> +    coverity_regex = re.compile(r'Coverity CID=.*')
> +
> +    # Simple e-mail address validator regex, with an additional trailing
> +    # comment. The complexity of a full RFC6531 validator isn't worth the
> +    # additional invalid addresses it would reject.
> +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> +
> +    link_regex = re.compile(r'https?://.*')
> +
> +    @staticmethod
> +    def validate_reported_by(value):
> +        if TrailersChecker.email_regex.fullmatch(value):
> +            return True
> +        if TrailersChecker.coverity_regex.fullmatch(value):
> +            return True
> +        return False
> +
> +    known_trailers = {
> +        'Acked-by': email_regex,
> +        'Bug': link_regex,
> +        'Fixes': commit_regex,
> +        'Link': link_regex,
> +        'Reported-by': validate_reported_by,
> +        'Reviewed-by': email_regex,
> +        'Signed-off-by': email_regex,
> +        'Suggested-by': email_regex,
> +        'Tested-by': email_regex,
> +    }
> +
> +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> +
> +    @classmethod
> +    def check(cls, commit, top_level):
> +        issues = []
> +
> +        for trailer in commit.trailers:
> +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> +            if not match:
> +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> +
> +            key, value = match.groups()
> +
> +            validator = TrailersChecker.known_trailers.get(key)
> +            if not validator:
> +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> +                continue
> +
> +            if isinstance(validator, re.Pattern):
> +                valid = bool(validator.fullmatch(value))
> +            else:
> +                valid = validator(value)
> +
> +            if not valid:
> +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> +                continue
> +
> +        return issues
> +
> +
>  # ------------------------------------------------------------------------------
>  # Style Checkers
>  #
> --
> Regards,
>
> Laurent Pinchart
>
Kieran Bingham July 10, 2023, 3:51 p.m. UTC | #3
Quoting Naushir Patuck via libcamera-devel (2023-07-10 11:33:54)
> Hi Laurent,
> 
> I've updated my tree and this change now seems to break my pre-commit hook:

Yikes - You've gone pre-commit ;-)

I usually use post-commit - as pre-commit makes it harder to actually
work in my experience as it prevents the commits in the first place.

I think post-commit is a much better use of the hooks, as it tells you
of issues while not getting in the way.

Still - pre-commit is supposed to be supported - so I'll swap over and
see if I can debug this.

--
Kieran

> 
> naushir@work:~/libcamera/ $ git commit -a -s
> ---------------
>  Staged changes
> ---------------
> Traceback (most recent call last):
>   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> 1052, in <module>
>     sys.exit(main(sys.argv))
>   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> 1042, in main
>     issues += check_style(top_level, commit, args.checkers)
>   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> 935, in check_style
>     for issue in checker.check(commit, top_level):
>   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> 479, in check
>     for trailer in commit.trailers:
>   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> 239, in trailers
>     return self._trailers
> AttributeError: 'StagedChanges' object has no attribute '_trailers'.
> Did you mean: 'trailers'?
> 
> I'm using the same old hook found in ./utils/hooks/pre-commit. I've briefly
> tried debugging, but quickly found that I'm in way over my depth with the
> checkstyle.py script :-)
> 
> Regards,
> Naush
> 
> On Mon, 12 Jun 2023 at 23:48, Laurent Pinchart via libcamera-devel
> <libcamera-devel@lists.libcamera.org> wrote:
> >
> > The libcamera git history contains numerous examples of incorrect commit
> > message trailers due to invalid trailer types (e.g. Change-Id), typos
> > and other small issues. Those went unnoticed through reviews, which
> > shows that an automated checker is required.
> >
> > Add a trailers checker to checkstyle.py to catch invalid or malformed
> > trailers, with a set of supported trailers that match libcamera's commit
> > message practices. New trailer keys can easily be added later as new
> > needs arise.
> >
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> > This reports a total of 42 issues through the project's history. 37 of
> > them should not be controversial:
> >
> > - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
> >   and 'Reported-on')
> >
> > - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
> >   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
> >
> > - Typos in e-mail address (missing display name, missing space before
> >   '<' and missing trailing '>')
> >
> > - Link in 'Fixes' trailer (should be a commit)
> >
> > - Too short commit ID in 'Fixes' trailer
> >
> > - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
> >   space or extra ':' after commit ID, missing '"' around commit subject)
> >
> > The five remaining issues may benefit from discussions:
> >
> > - Invalid trailer key 'Co-developed-by' (one instance). This is a
> >   trailer key commonly used in the kernel, but git..b both recommend
> >   Co-authored-by. I'm sure which option would be best, so I haven't
> >   included either for now.
> >
> > - Typo in 'Reported-by' trailer for issues reported by Coverity (one
> >   instance). 'Reported-by' usually has an e-mail address value, but we
> >   have commonly used 'Coverity CID=<CID>' for issues reported by
> >   Coverity. I've tentatively added support for this (feedback is
> >   welcome), and one commit still got flagged as its 'Reported-by'
> >   trailer has a space instead of an '=' after 'CID'.
> >
> > - Usage of a github user URL in 'Reported-by' (one instance). Our policy
> >   is to be able to identify users by name and e-mail address for
> >   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
> >   trailers too. If someone *really* doesn't want their name included in
> >   the git log when reporting an issue, we can simply omit the
> >   'Reported-by' trailer.
> >
> > - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
> >   instances). This should use a 'Link' trailer instead.
> > ---
> >  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 78 insertions(+), 2 deletions(-)
> >
> > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > index e68c874609bc..3558740d389d 100755
> > --- a/utils/checkstyle.py
> > +++ b/utils/checkstyle.py
> > @@ -210,13 +210,23 @@ class Commit:
> >
> >      def _parse(self):
> >          # Get the commit title and list of files.
> > -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> > +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
> >                                self.commit],
> >                               stdout=subprocess.PIPE).stdout.decode('utf-8')
> >          lines = ret.splitlines()
> > -        self._files = [CommitFile(f) for f in lines[1:] if f]
> > +
> >          self._title = lines[0]
> >
> > +        self._trailers = []
> > +        for index in range(1, len(lines)):
> > +            line = lines[index]
> > +            if not line:
> > +                break
> > +
> > +            self._trailers.append(line)
> > +
> > +        self._files = [CommitFile(f) for f in lines[index:] if f]
> > +
> >      def files(self, filter='AMR'):
> >          return [f.filename for f in self._files if f.status in filter]
> >
> > @@ -224,6 +234,10 @@ class Commit:
> >      def title(self):
> >          return self._title
> >
> > +    @property
> > +    def trailers(self):
> > +        return self._trailers
> > +
> >      def get_diff(self, top_level, filename):
> >          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
> >                                 '--', '%s/%s' % (top_level, filename)],
> > @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
> >                              'possible candidates are ' + candidates)]
> >
> >
> > +class TrailersChecker(CommitChecker):
> > +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> > +
> > +    coverity_regex = re.compile(r'Coverity CID=.*')
> > +
> > +    # Simple e-mail address validator regex, with an additional trailing
> > +    # comment. The complexity of a full RFC6531 validator isn't worth the
> > +    # additional invalid addresses it would reject.
> > +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> > +
> > +    link_regex = re.compile(r'https?://.*')
> > +
> > +    @staticmethod
> > +    def validate_reported_by(value):
> > +        if TrailersChecker.email_regex.fullmatch(value):
> > +            return True
> > +        if TrailersChecker.coverity_regex.fullmatch(value):
> > +            return True
> > +        return False
> > +
> > +    known_trailers = {
> > +        'Acked-by': email_regex,
> > +        'Bug': link_regex,
> > +        'Fixes': commit_regex,
> > +        'Link': link_regex,
> > +        'Reported-by': validate_reported_by,
> > +        'Reviewed-by': email_regex,
> > +        'Signed-off-by': email_regex,
> > +        'Suggested-by': email_regex,
> > +        'Tested-by': email_regex,
> > +    }
> > +
> > +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> > +
> > +    @classmethod
> > +    def check(cls, commit, top_level):
> > +        issues = []
> > +
> > +        for trailer in commit.trailers:
> > +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> > +            if not match:
> > +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> > +
> > +            key, value = match.groups()
> > +
> > +            validator = TrailersChecker.known_trailers.get(key)
> > +            if not validator:
> > +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> > +                continue
> > +
> > +            if isinstance(validator, re.Pattern):
> > +                valid = bool(validator.fullmatch(value))
> > +            else:
> > +                valid = validator(value)
> > +
> > +            if not valid:
> > +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> > +                continue
> > +
> > +        return issues
> > +
> > +
> >  # ------------------------------------------------------------------------------
> >  # Style Checkers
> >  #
> > --
> > Regards,
> >
> > Laurent Pinchart
> >
Naushir Patuck July 11, 2023, 7:50 a.m. UTC | #4
Hi Kieran,

On Mon, 10 Jul 2023 at 16:51, Kieran Bingham
<kieran.bingham@ideasonboard.com> wrote:
>
> Quoting Naushir Patuck via libcamera-devel (2023-07-10 11:33:54)
> > Hi Laurent,
> >
> > I've updated my tree and this change now seems to break my pre-commit hook:
>
> Yikes - You've gone pre-commit ;-)
>
> I usually use post-commit - as pre-commit makes it harder to actually
> work in my experience as it prevents the commits in the first place.
>
> I think post-commit is a much better use of the hooks, as it tells you
> of issues while not getting in the way.

Ok, that's no problem, I can switch to post-commit hooks.
Perhaps it's worth deprecating utils/hooks/pre-commit?

>
> Still - pre-commit is supposed to be supported - so I'll swap over and
> see if I can debug this.

I see you have a DNI fix for this, but maybe I'll just switch to post-commit
hooks if that's what everyone else is using.

Naush

>
> --
> Kieran
>
> >
> > naushir@work:~/libcamera/ $ git commit -a -s
> > ---------------
> >  Staged changes
> > ---------------
> > Traceback (most recent call last):
> >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > 1052, in <module>
> >     sys.exit(main(sys.argv))
> >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > 1042, in main
> >     issues += check_style(top_level, commit, args.checkers)
> >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > 935, in check_style
> >     for issue in checker.check(commit, top_level):
> >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > 479, in check
> >     for trailer in commit.trailers:
> >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > 239, in trailers
> >     return self._trailers
> > AttributeError: 'StagedChanges' object has no attribute '_trailers'.
> > Did you mean: 'trailers'?
> >
> > I'm using the same old hook found in ./utils/hooks/pre-commit. I've briefly
> > tried debugging, but quickly found that I'm in way over my depth with the
> > checkstyle.py script :-)
> >
> > Regards,
> > Naush
> >
> > On Mon, 12 Jun 2023 at 23:48, Laurent Pinchart via libcamera-devel
> > <libcamera-devel@lists.libcamera.org> wrote:
> > >
> > > The libcamera git history contains numerous examples of incorrect commit
> > > message trailers due to invalid trailer types (e.g. Change-Id), typos
> > > and other small issues. Those went unnoticed through reviews, which
> > > shows that an automated checker is required.
> > >
> > > Add a trailers checker to checkstyle.py to catch invalid or malformed
> > > trailers, with a set of supported trailers that match libcamera's commit
> > > message practices. New trailer keys can easily be added later as new
> > > needs arise.
> > >
> > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > ---
> > > This reports a total of 42 issues through the project's history. 37 of
> > > them should not be controversial:
> > >
> > > - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
> > >   and 'Reported-on')
> > >
> > > - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
> > >   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
> > >
> > > - Typos in e-mail address (missing display name, missing space before
> > >   '<' and missing trailing '>')
> > >
> > > - Link in 'Fixes' trailer (should be a commit)
> > >
> > > - Too short commit ID in 'Fixes' trailer
> > >
> > > - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
> > >   space or extra ':' after commit ID, missing '"' around commit subject)
> > >
> > > The five remaining issues may benefit from discussions:
> > >
> > > - Invalid trailer key 'Co-developed-by' (one instance). This is a
> > >   trailer key commonly used in the kernel, but git..b both recommend
> > >   Co-authored-by. I'm sure which option would be best, so I haven't
> > >   included either for now.
> > >
> > > - Typo in 'Reported-by' trailer for issues reported by Coverity (one
> > >   instance). 'Reported-by' usually has an e-mail address value, but we
> > >   have commonly used 'Coverity CID=<CID>' for issues reported by
> > >   Coverity. I've tentatively added support for this (feedback is
> > >   welcome), and one commit still got flagged as its 'Reported-by'
> > >   trailer has a space instead of an '=' after 'CID'.
> > >
> > > - Usage of a github user URL in 'Reported-by' (one instance). Our policy
> > >   is to be able to identify users by name and e-mail address for
> > >   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
> > >   trailers too. If someone *really* doesn't want their name included in
> > >   the git log when reporting an issue, we can simply omit the
> > >   'Reported-by' trailer.
> > >
> > > - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
> > >   instances). This should use a 'Link' trailer instead.
> > > ---
> > >  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
> > >  1 file changed, 78 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > > index e68c874609bc..3558740d389d 100755
> > > --- a/utils/checkstyle.py
> > > +++ b/utils/checkstyle.py
> > > @@ -210,13 +210,23 @@ class Commit:
> > >
> > >      def _parse(self):
> > >          # Get the commit title and list of files.
> > > -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> > > +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
> > >                                self.commit],
> > >                               stdout=subprocess.PIPE).stdout.decode('utf-8')
> > >          lines = ret.splitlines()
> > > -        self._files = [CommitFile(f) for f in lines[1:] if f]
> > > +
> > >          self._title = lines[0]
> > >
> > > +        self._trailers = []
> > > +        for index in range(1, len(lines)):
> > > +            line = lines[index]
> > > +            if not line:
> > > +                break
> > > +
> > > +            self._trailers.append(line)
> > > +
> > > +        self._files = [CommitFile(f) for f in lines[index:] if f]
> > > +
> > >      def files(self, filter='AMR'):
> > >          return [f.filename for f in self._files if f.status in filter]
> > >
> > > @@ -224,6 +234,10 @@ class Commit:
> > >      def title(self):
> > >          return self._title
> > >
> > > +    @property
> > > +    def trailers(self):
> > > +        return self._trailers
> > > +
> > >      def get_diff(self, top_level, filename):
> > >          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
> > >                                 '--', '%s/%s' % (top_level, filename)],
> > > @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
> > >                              'possible candidates are ' + candidates)]
> > >
> > >
> > > +class TrailersChecker(CommitChecker):
> > > +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> > > +
> > > +    coverity_regex = re.compile(r'Coverity CID=.*')
> > > +
> > > +    # Simple e-mail address validator regex, with an additional trailing
> > > +    # comment. The complexity of a full RFC6531 validator isn't worth the
> > > +    # additional invalid addresses it would reject.
> > > +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> > > +
> > > +    link_regex = re.compile(r'https?://.*')
> > > +
> > > +    @staticmethod
> > > +    def validate_reported_by(value):
> > > +        if TrailersChecker.email_regex.fullmatch(value):
> > > +            return True
> > > +        if TrailersChecker.coverity_regex.fullmatch(value):
> > > +            return True
> > > +        return False
> > > +
> > > +    known_trailers = {
> > > +        'Acked-by': email_regex,
> > > +        'Bug': link_regex,
> > > +        'Fixes': commit_regex,
> > > +        'Link': link_regex,
> > > +        'Reported-by': validate_reported_by,
> > > +        'Reviewed-by': email_regex,
> > > +        'Signed-off-by': email_regex,
> > > +        'Suggested-by': email_regex,
> > > +        'Tested-by': email_regex,
> > > +    }
> > > +
> > > +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> > > +
> > > +    @classmethod
> > > +    def check(cls, commit, top_level):
> > > +        issues = []
> > > +
> > > +        for trailer in commit.trailers:
> > > +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> > > +            if not match:
> > > +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> > > +
> > > +            key, value = match.groups()
> > > +
> > > +            validator = TrailersChecker.known_trailers.get(key)
> > > +            if not validator:
> > > +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> > > +                continue
> > > +
> > > +            if isinstance(validator, re.Pattern):
> > > +                valid = bool(validator.fullmatch(value))
> > > +            else:
> > > +                valid = validator(value)
> > > +
> > > +            if not valid:
> > > +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> > > +                continue
> > > +
> > > +        return issues
> > > +
> > > +
> > >  # ------------------------------------------------------------------------------
> > >  # Style Checkers
> > >  #
> > > --
> > > Regards,
> > >
> > > Laurent Pinchart
> > >
Kieran Bingham July 11, 2023, 8:01 a.m. UTC | #5
Quoting Naushir Patuck (2023-07-11 08:50:13)
> Hi Kieran,
> 
> On Mon, 10 Jul 2023 at 16:51, Kieran Bingham
> <kieran.bingham@ideasonboard.com> wrote:
> >
> > Quoting Naushir Patuck via libcamera-devel (2023-07-10 11:33:54)
> > > Hi Laurent,
> > >
> > > I've updated my tree and this change now seems to break my pre-commit hook:
> >
> > Yikes - You've gone pre-commit ;-)
> >
> > I usually use post-commit - as pre-commit makes it harder to actually
> > work in my experience as it prevents the commits in the first place.
> >
> > I think post-commit is a much better use of the hooks, as it tells you
> > of issues while not getting in the way.
> 
> Ok, that's no problem, I can switch to post-commit hooks.
> Perhaps it's worth deprecating utils/hooks/pre-commit?
> 
> >
> > Still - pre-commit is supposed to be supported - so I'll swap over and
> > see if I can debug this.
> 
> I see you have a DNI fix for this, but maybe I'll just switch to post-commit
> hooks if that's what everyone else is using.

Yes, the DNI quick fix is to just initialise the trailers in the common
class.
Theres:

Commit <- StagedCommit <- AmendedCommit

And the trailers only get initialised if you create a Commit, and the
derived classes don't call Commit::_parse() as they override it
themselves.

I don't think AmendedCommit should derive from StagedCommit either ...
and should also still check the trailers, while StagedCommit can't as
there is no commit message to check trailers in that instance.

But the issue goes away in a post-commit as there's no staged commit
nor amended commit to check - as in 'post-commit' it's just a Commit
instance.

I think post-commit gives a better developer experience anyway and would
always recommend that over pre-commit.

--
Kieran

> 
> Naush
> 
> >
> > --
> > Kieran
> >
> > >
> > > naushir@work:~/libcamera/ $ git commit -a -s
> > > ---------------
> > >  Staged changes
> > > ---------------
> > > Traceback (most recent call last):
> > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > 1052, in <module>
> > >     sys.exit(main(sys.argv))
> > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > 1042, in main
> > >     issues += check_style(top_level, commit, args.checkers)
> > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > 935, in check_style
> > >     for issue in checker.check(commit, top_level):
> > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > 479, in check
> > >     for trailer in commit.trailers:
> > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > 239, in trailers
> > >     return self._trailers
> > > AttributeError: 'StagedChanges' object has no attribute '_trailers'.
> > > Did you mean: 'trailers'?
> > >
> > > I'm using the same old hook found in ./utils/hooks/pre-commit. I've briefly
> > > tried debugging, but quickly found that I'm in way over my depth with the
> > > checkstyle.py script :-)
> > >
> > > Regards,
> > > Naush
> > >
> > > On Mon, 12 Jun 2023 at 23:48, Laurent Pinchart via libcamera-devel
> > > <libcamera-devel@lists.libcamera.org> wrote:
> > > >
> > > > The libcamera git history contains numerous examples of incorrect commit
> > > > message trailers due to invalid trailer types (e.g. Change-Id), typos
> > > > and other small issues. Those went unnoticed through reviews, which
> > > > shows that an automated checker is required.
> > > >
> > > > Add a trailers checker to checkstyle.py to catch invalid or malformed
> > > > trailers, with a set of supported trailers that match libcamera's commit
> > > > message practices. New trailer keys can easily be added later as new
> > > > needs arise.
> > > >
> > > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > ---
> > > > This reports a total of 42 issues through the project's history. 37 of
> > > > them should not be controversial:
> > > >
> > > > - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
> > > >   and 'Reported-on')
> > > >
> > > > - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
> > > >   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
> > > >
> > > > - Typos in e-mail address (missing display name, missing space before
> > > >   '<' and missing trailing '>')
> > > >
> > > > - Link in 'Fixes' trailer (should be a commit)
> > > >
> > > > - Too short commit ID in 'Fixes' trailer
> > > >
> > > > - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
> > > >   space or extra ':' after commit ID, missing '"' around commit subject)
> > > >
> > > > The five remaining issues may benefit from discussions:
> > > >
> > > > - Invalid trailer key 'Co-developed-by' (one instance). This is a
> > > >   trailer key commonly used in the kernel, but git..b both recommend
> > > >   Co-authored-by. I'm sure which option would be best, so I haven't
> > > >   included either for now.
> > > >
> > > > - Typo in 'Reported-by' trailer for issues reported by Coverity (one
> > > >   instance). 'Reported-by' usually has an e-mail address value, but we
> > > >   have commonly used 'Coverity CID=<CID>' for issues reported by
> > > >   Coverity. I've tentatively added support for this (feedback is
> > > >   welcome), and one commit still got flagged as its 'Reported-by'
> > > >   trailer has a space instead of an '=' after 'CID'.
> > > >
> > > > - Usage of a github user URL in 'Reported-by' (one instance). Our policy
> > > >   is to be able to identify users by name and e-mail address for
> > > >   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
> > > >   trailers too. If someone *really* doesn't want their name included in
> > > >   the git log when reporting an issue, we can simply omit the
> > > >   'Reported-by' trailer.
> > > >
> > > > - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
> > > >   instances). This should use a 'Link' trailer instead.
> > > > ---
> > > >  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
> > > >  1 file changed, 78 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > > > index e68c874609bc..3558740d389d 100755
> > > > --- a/utils/checkstyle.py
> > > > +++ b/utils/checkstyle.py
> > > > @@ -210,13 +210,23 @@ class Commit:
> > > >
> > > >      def _parse(self):
> > > >          # Get the commit title and list of files.
> > > > -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> > > > +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
> > > >                                self.commit],
> > > >                               stdout=subprocess.PIPE).stdout.decode('utf-8')
> > > >          lines = ret.splitlines()
> > > > -        self._files = [CommitFile(f) for f in lines[1:] if f]
> > > > +
> > > >          self._title = lines[0]
> > > >
> > > > +        self._trailers = []
> > > > +        for index in range(1, len(lines)):
> > > > +            line = lines[index]
> > > > +            if not line:
> > > > +                break
> > > > +
> > > > +            self._trailers.append(line)
> > > > +
> > > > +        self._files = [CommitFile(f) for f in lines[index:] if f]
> > > > +
> > > >      def files(self, filter='AMR'):
> > > >          return [f.filename for f in self._files if f.status in filter]
> > > >
> > > > @@ -224,6 +234,10 @@ class Commit:
> > > >      def title(self):
> > > >          return self._title
> > > >
> > > > +    @property
> > > > +    def trailers(self):
> > > > +        return self._trailers
> > > > +
> > > >      def get_diff(self, top_level, filename):
> > > >          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
> > > >                                 '--', '%s/%s' % (top_level, filename)],
> > > > @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
> > > >                              'possible candidates are ' + candidates)]
> > > >
> > > >
> > > > +class TrailersChecker(CommitChecker):
> > > > +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> > > > +
> > > > +    coverity_regex = re.compile(r'Coverity CID=.*')
> > > > +
> > > > +    # Simple e-mail address validator regex, with an additional trailing
> > > > +    # comment. The complexity of a full RFC6531 validator isn't worth the
> > > > +    # additional invalid addresses it would reject.
> > > > +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> > > > +
> > > > +    link_regex = re.compile(r'https?://.*')
> > > > +
> > > > +    @staticmethod
> > > > +    def validate_reported_by(value):
> > > > +        if TrailersChecker.email_regex.fullmatch(value):
> > > > +            return True
> > > > +        if TrailersChecker.coverity_regex.fullmatch(value):
> > > > +            return True
> > > > +        return False
> > > > +
> > > > +    known_trailers = {
> > > > +        'Acked-by': email_regex,
> > > > +        'Bug': link_regex,
> > > > +        'Fixes': commit_regex,
> > > > +        'Link': link_regex,
> > > > +        'Reported-by': validate_reported_by,
> > > > +        'Reviewed-by': email_regex,
> > > > +        'Signed-off-by': email_regex,
> > > > +        'Suggested-by': email_regex,
> > > > +        'Tested-by': email_regex,
> > > > +    }
> > > > +
> > > > +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> > > > +
> > > > +    @classmethod
> > > > +    def check(cls, commit, top_level):
> > > > +        issues = []
> > > > +
> > > > +        for trailer in commit.trailers:
> > > > +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> > > > +            if not match:
> > > > +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> > > > +
> > > > +            key, value = match.groups()
> > > > +
> > > > +            validator = TrailersChecker.known_trailers.get(key)
> > > > +            if not validator:
> > > > +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> > > > +                continue
> > > > +
> > > > +            if isinstance(validator, re.Pattern):
> > > > +                valid = bool(validator.fullmatch(value))
> > > > +            else:
> > > > +                valid = validator(value)
> > > > +
> > > > +            if not valid:
> > > > +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> > > > +                continue
> > > > +
> > > > +        return issues
> > > > +
> > > > +
> > > >  # ------------------------------------------------------------------------------
> > > >  # Style Checkers
> > > >  #
> > > > --
> > > > Regards,
> > > >
> > > > Laurent Pinchart
> > > >
Kieran Bingham July 11, 2023, 9:05 a.m. UTC | #6
Quoting Kieran Bingham (2023-07-11 09:01:28)
> Quoting Naushir Patuck (2023-07-11 08:50:13)
> > Hi Kieran,
> > 
> > On Mon, 10 Jul 2023 at 16:51, Kieran Bingham
> > <kieran.bingham@ideasonboard.com> wrote:
> > >
> > > Quoting Naushir Patuck via libcamera-devel (2023-07-10 11:33:54)
> > > > Hi Laurent,
> > > >
> > > > I've updated my tree and this change now seems to break my pre-commit hook:
> > >
> > > Yikes - You've gone pre-commit ;-)
> > >
> > > I usually use post-commit - as pre-commit makes it harder to actually
> > > work in my experience as it prevents the commits in the first place.
> > >
> > > I think post-commit is a much better use of the hooks, as it tells you
> > > of issues while not getting in the way.
> > 
> > Ok, that's no problem, I can switch to post-commit hooks.
> > Perhaps it's worth deprecating utils/hooks/pre-commit?
> > 
> > >
> > > Still - pre-commit is supposed to be supported - so I'll swap over and
> > > see if I can debug this.
> > 
> > I see you have a DNI fix for this, but maybe I'll just switch to post-commit
> > hooks if that's what everyone else is using.
> 
> Yes, the DNI quick fix is to just initialise the trailers in the common
> class.
> Theres:
> 
> Commit <- StagedCommit <- AmendedCommit
> 
> And the trailers only get initialised if you create a Commit, and the
> derived classes don't call Commit::_parse() as they override it
> themselves.
> 
> I don't think AmendedCommit should derive from StagedCommit either ...
> and should also still check the trailers, while StagedCommit can't as
> there is no commit message to check trailers in that instance.
> 
> But the issue goes away in a post-commit as there's no staged commit
> nor amended commit to check - as in 'post-commit' it's just a Commit
> instance.

And I'm starting to think that it really isn't worth checking Trailers
in pre-commit anyway, as it's /before/ the user has even had the
opportunity to adjust them.

The only thing that would make sense is if the trailer reporting was
added as a comment to the commit message so that the warning was visible
at the point that the commit message could be written ... but that
seems a lot more effort to support something that doesn't necessarily
help.

I think for now I'm tempted to suggest we merge the default
initialisation as in the DNI patch I posted to prevent errors in the
checkstyle script, without supporting Trailer checks in the pre-commit
hooks.

--
Kieran


> 
> I think post-commit gives a better developer experience anyway and would
> always recommend that over pre-commit.
> 
> --
> Kieran
> 
> > 
> > Naush
> > 
> > >
> > > --
> > > Kieran
> > >
> > > >
> > > > naushir@work:~/libcamera/ $ git commit -a -s
> > > > ---------------
> > > >  Staged changes
> > > > ---------------
> > > > Traceback (most recent call last):
> > > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > > 1052, in <module>
> > > >     sys.exit(main(sys.argv))
> > > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > > 1042, in main
> > > >     issues += check_style(top_level, commit, args.checkers)
> > > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > > 935, in check_style
> > > >     for issue in checker.check(commit, top_level):
> > > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > > 479, in check
> > > >     for trailer in commit.trailers:
> > > >   File "/home/naushir/projects/libcamera/./utils/checkstyle.py", line
> > > > 239, in trailers
> > > >     return self._trailers
> > > > AttributeError: 'StagedChanges' object has no attribute '_trailers'.
> > > > Did you mean: 'trailers'?
> > > >
> > > > I'm using the same old hook found in ./utils/hooks/pre-commit. I've briefly
> > > > tried debugging, but quickly found that I'm in way over my depth with the
> > > > checkstyle.py script :-)
> > > >
> > > > Regards,
> > > > Naush
> > > >
> > > > On Mon, 12 Jun 2023 at 23:48, Laurent Pinchart via libcamera-devel
> > > > <libcamera-devel@lists.libcamera.org> wrote:
> > > > >
> > > > > The libcamera git history contains numerous examples of incorrect commit
> > > > > message trailers due to invalid trailer types (e.g. Change-Id), typos
> > > > > and other small issues. Those went unnoticed through reviews, which
> > > > > shows that an automated checker is required.
> > > > >
> > > > > Add a trailers checker to checkstyle.py to catch invalid or malformed
> > > > > trailers, with a set of supported trailers that match libcamera's commit
> > > > > message practices. New trailer keys can easily be added later as new
> > > > > needs arise.
> > > > >
> > > > > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > > ---
> > > > > This reports a total of 42 issues through the project's history. 37 of
> > > > > them should not be controversial:
> > > > >
> > > > > - Trailer keys not valid for libcamera ('Cc', 'Change-Id', 'Inspired-by'
> > > > >   and 'Reported-on')
> > > > >
> > > > > - Typos in trailer keys ('Fixed' instead of 'Fixes', 'Reviewed' instead
> > > > >   of 'Reviewed-by' and 'Signed-off-By' instead of 'Signed-off-by')
> > > > >
> > > > > - Typos in e-mail address (missing display name, missing space before
> > > > >   '<' and missing trailing '>')
> > > > >
> > > > > - Link in 'Fixes' trailer (should be a commit)
> > > > >
> > > > > - Too short commit ID in 'Fixes' trailer
> > > > >
> > > > > - Typos in 'Fixes' trailer (extra 'commit' before commit ID, missing
> > > > >   space or extra ':' after commit ID, missing '"' around commit subject)
> > > > >
> > > > > The five remaining issues may benefit from discussions:
> > > > >
> > > > > - Invalid trailer key 'Co-developed-by' (one instance). This is a
> > > > >   trailer key commonly used in the kernel, but git..b both recommend
> > > > >   Co-authored-by. I'm sure which option would be best, so I haven't
> > > > >   included either for now.
> > > > >
> > > > > - Typo in 'Reported-by' trailer for issues reported by Coverity (one
> > > > >   instance). 'Reported-by' usually has an e-mail address value, but we
> > > > >   have commonly used 'Coverity CID=<CID>' for issues reported by
> > > > >   Coverity. I've tentatively added support for this (feedback is
> > > > >   welcome), and one commit still got flagged as its 'Reported-by'
> > > > >   trailer has a space instead of an '=' after 'CID'.
> > > > >
> > > > > - Usage of a github user URL in 'Reported-by' (one instance). Our policy
> > > > >   is to be able to identify users by name and e-mail address for
> > > > >   'Signed-off-by' trailers, and I would prefer covering 'Reported-by'
> > > > >   trailers too. If someone *really* doesn't want their name included in
> > > > >   the git log when reporting an issue, we can simply omit the
> > > > >   'Reported-by' trailer.
> > > > >
> > > > > - Usage of URLs in 'Reported-by' to point to buildbot.libcamera.org (two
> > > > >   instances). This should use a 'Link' trailer instead.
> > > > > ---
> > > > >  utils/checkstyle.py | 80 +++++++++++++++++++++++++++++++++++++++++++--
> > > > >  1 file changed, 78 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > > > > index e68c874609bc..3558740d389d 100755
> > > > > --- a/utils/checkstyle.py
> > > > > +++ b/utils/checkstyle.py
> > > > > @@ -210,13 +210,23 @@ class Commit:
> > > > >
> > > > >      def _parse(self):
> > > > >          # Get the commit title and list of files.
> > > > > -        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
> > > > > +        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
> > > > >                                self.commit],
> > > > >                               stdout=subprocess.PIPE).stdout.decode('utf-8')
> > > > >          lines = ret.splitlines()
> > > > > -        self._files = [CommitFile(f) for f in lines[1:] if f]
> > > > > +
> > > > >          self._title = lines[0]
> > > > >
> > > > > +        self._trailers = []
> > > > > +        for index in range(1, len(lines)):
> > > > > +            line = lines[index]
> > > > > +            if not line:
> > > > > +                break
> > > > > +
> > > > > +            self._trailers.append(line)
> > > > > +
> > > > > +        self._files = [CommitFile(f) for f in lines[index:] if f]
> > > > > +
> > > > >      def files(self, filter='AMR'):
> > > > >          return [f.filename for f in self._files if f.status in filter]
> > > > >
> > > > > @@ -224,6 +234,10 @@ class Commit:
> > > > >      def title(self):
> > > > >          return self._title
> > > > >
> > > > > +    @property
> > > > > +    def trailers(self):
> > > > > +        return self._trailers
> > > > > +
> > > > >      def get_diff(self, top_level, filename):
> > > > >          diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
> > > > >                                 '--', '%s/%s' % (top_level, filename)],
> > > > > @@ -424,6 +438,68 @@ class TitleChecker(CommitChecker):
> > > > >                              'possible candidates are ' + candidates)]
> > > > >
> > > > >
> > > > > +class TrailersChecker(CommitChecker):
> > > > > +    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
> > > > > +
> > > > > +    coverity_regex = re.compile(r'Coverity CID=.*')
> > > > > +
> > > > > +    # Simple e-mail address validator regex, with an additional trailing
> > > > > +    # comment. The complexity of a full RFC6531 validator isn't worth the
> > > > > +    # additional invalid addresses it would reject.
> > > > > +    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
> > > > > +
> > > > > +    link_regex = re.compile(r'https?://.*')
> > > > > +
> > > > > +    @staticmethod
> > > > > +    def validate_reported_by(value):
> > > > > +        if TrailersChecker.email_regex.fullmatch(value):
> > > > > +            return True
> > > > > +        if TrailersChecker.coverity_regex.fullmatch(value):
> > > > > +            return True
> > > > > +        return False
> > > > > +
> > > > > +    known_trailers = {
> > > > > +        'Acked-by': email_regex,
> > > > > +        'Bug': link_regex,
> > > > > +        'Fixes': commit_regex,
> > > > > +        'Link': link_regex,
> > > > > +        'Reported-by': validate_reported_by,
> > > > > +        'Reviewed-by': email_regex,
> > > > > +        'Signed-off-by': email_regex,
> > > > > +        'Suggested-by': email_regex,
> > > > > +        'Tested-by': email_regex,
> > > > > +    }
> > > > > +
> > > > > +    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
> > > > > +
> > > > > +    @classmethod
> > > > > +    def check(cls, commit, top_level):
> > > > > +        issues = []
> > > > > +
> > > > > +        for trailer in commit.trailers:
> > > > > +            match = TrailersChecker.trailer_regex.fullmatch(trailer)
> > > > > +            if not match:
> > > > > +                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
> > > > > +
> > > > > +            key, value = match.groups()
> > > > > +
> > > > > +            validator = TrailersChecker.known_trailers.get(key)
> > > > > +            if not validator:
> > > > > +                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
> > > > > +                continue
> > > > > +
> > > > > +            if isinstance(validator, re.Pattern):
> > > > > +                valid = bool(validator.fullmatch(value))
> > > > > +            else:
> > > > > +                valid = validator(value)
> > > > > +
> > > > > +            if not valid:
> > > > > +                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
> > > > > +                continue
> > > > > +
> > > > > +        return issues
> > > > > +
> > > > > +
> > > > >  # ------------------------------------------------------------------------------
> > > > >  # Style Checkers
> > > > >  #
> > > > > --
> > > > > Regards,
> > > > >
> > > > > Laurent Pinchart
> > > > >

Patch
diff mbox series

diff --git a/utils/checkstyle.py b/utils/checkstyle.py
index e68c874609bc..3558740d389d 100755
--- a/utils/checkstyle.py
+++ b/utils/checkstyle.py
@@ -210,13 +210,23 @@  class Commit:
 
     def _parse(self):
         # Get the commit title and list of files.
-        ret = subprocess.run(['git', 'show', '--format=%s', '--name-status',
+        ret = subprocess.run(['git', 'show', '--format=%s%n%(trailers:only,unfold)', '--name-status',
                               self.commit],
                              stdout=subprocess.PIPE).stdout.decode('utf-8')
         lines = ret.splitlines()
-        self._files = [CommitFile(f) for f in lines[1:] if f]
+
         self._title = lines[0]
 
+        self._trailers = []
+        for index in range(1, len(lines)):
+            line = lines[index]
+            if not line:
+                break
+
+            self._trailers.append(line)
+
+        self._files = [CommitFile(f) for f in lines[index:] if f]
+
     def files(self, filter='AMR'):
         return [f.filename for f in self._files if f.status in filter]
 
@@ -224,6 +234,10 @@  class Commit:
     def title(self):
         return self._title
 
+    @property
+    def trailers(self):
+        return self._trailers
+
     def get_diff(self, top_level, filename):
         diff = subprocess.run(['git', 'diff', '%s~..%s' % (self.commit, self.commit),
                                '--', '%s/%s' % (top_level, filename)],
@@ -424,6 +438,68 @@  class TitleChecker(CommitChecker):
                             'possible candidates are ' + candidates)]
 
 
+class TrailersChecker(CommitChecker):
+    commit_regex = re.compile(r'[0-9a-f]{12}[0-9a-f]* \(".*"\)')
+
+    coverity_regex = re.compile(r'Coverity CID=.*')
+
+    # Simple e-mail address validator regex, with an additional trailing
+    # comment. The complexity of a full RFC6531 validator isn't worth the
+    # additional invalid addresses it would reject.
+    email_regex = re.compile(r'[^<]+ <[^@>]+@[^>]+>( # .*)?')
+
+    link_regex = re.compile(r'https?://.*')
+
+    @staticmethod
+    def validate_reported_by(value):
+        if TrailersChecker.email_regex.fullmatch(value):
+            return True
+        if TrailersChecker.coverity_regex.fullmatch(value):
+            return True
+        return False
+
+    known_trailers = {
+        'Acked-by': email_regex,
+        'Bug': link_regex,
+        'Fixes': commit_regex,
+        'Link': link_regex,
+        'Reported-by': validate_reported_by,
+        'Reviewed-by': email_regex,
+        'Signed-off-by': email_regex,
+        'Suggested-by': email_regex,
+        'Tested-by': email_regex,
+    }
+
+    trailer_regex = re.compile(r'([A-Z][a-zA-Z-]*)\s*:\s*(.*)')
+
+    @classmethod
+    def check(cls, commit, top_level):
+        issues = []
+
+        for trailer in commit.trailers:
+            match = TrailersChecker.trailer_regex.fullmatch(trailer)
+            if not match:
+                raise RuntimeError(f"Malformed commit trailer '{trailer}'")
+
+            key, value = match.groups()
+
+            validator = TrailersChecker.known_trailers.get(key)
+            if not validator:
+                issues.append(CommitIssue(f"Invalid commit trailer key '{key}'"))
+                continue
+
+            if isinstance(validator, re.Pattern):
+                valid = bool(validator.fullmatch(value))
+            else:
+                valid = validator(value)
+
+            if not valid:
+                issues.append(CommitIssue(f"Malformed value '{value}' for commit trailer '{key}'"))
+                continue
+
+        return issues
+
+
 # ------------------------------------------------------------------------------
 # Style Checkers
 #