[libcamera-devel,v2] utils: checkstyle.py: Add commit title checker
diff mbox series

Message ID 20221221221237.3094-1-laurent.pinchart@ideasonboard.com
State Accepted
Headers show
Series
  • [libcamera-devel,v2] utils: checkstyle.py: Add commit title checker
Related show

Commit Message

Laurent Pinchart Dec. 21, 2022, 10:12 p.m. UTC
Add a commit checker to ensure that commit titles start with a prefix.
The commit issue message lists prefix candidates retrieved from the git
log.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
Changes since v1:

- Reduce indentation
---
 utils/checkstyle.py | 62 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)


base-commit: f66a5c447b65bce774a1bc2d01034f437bf764b5
prerequisite-patch-id: 55c457bf621237f99229898adb78e18b259ab74b

Comments

Paul Elder Dec. 23, 2022, 11:56 p.m. UTC | #1
On Thu, Dec 22, 2022 at 12:12:37AM +0200, Laurent Pinchart via libcamera-devel wrote:
> Add a commit checker to ensure that commit titles start with a prefix.
> The commit issue message lists prefix candidates retrieved from the git
> log.
> 
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> Changes since v1:
> 
> - Reduce indentation
> ---
>  utils/checkstyle.py | 62 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
> 
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index a11d95cc5808..364e13d6533d 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -352,6 +352,68 @@ class HeaderAddChecker(CommitChecker):
>          return issues
>  
>  
> +class TitleChecker(CommitChecker):
> +    prefix_regex = re.compile(r'[0-9a-f]+ (([a-zA-Z0-9_.-]+: )+)')

Ah, the first chunk is for the commit ID...

> +    release_regex = re.compile(r'libcamera v[0-9]+\.[0-9]+\.[0-9]+')
> +
> +    @classmethod
> +    def check(cls, commit, top_level):
> +        title = commit.title
> +
> +        # Ignore release commits, they don't need a prefix.
> +        if TitleChecker.release_regex.fullmatch(title):
> +            return []
> +
> +        prefix_pos = title.find(': ')
> +        if prefix_pos != -1 and prefix_pos != len(title) - 2:
> +            return []
> +
> +        # Find prefix candidates by searching the git history
> +        msgs = subprocess.run(['git', 'log', '--no-decorate', '--oneline', '-n100', '--'] + commit.files(),
> +                              stdout=subprocess.PIPE).stdout.decode('utf-8')
> +        prefixes = {}
> +        prefixes_count = 0
> +        for msg in msgs.splitlines():
> +            prefix = TitleChecker.prefix_regex.match(msg)
> +            if not prefix:
> +                continue
> +
> +            prefix = prefix.group(1)
> +            if prefix in prefixes:
> +                prefixes[prefix] += 1
> +            else:
> +                prefixes[prefix] = 1
> +
> +            prefixes_count += 1
> +
> +        if not prefixes:
> +            return [CommitIssue('Commit title is missing prefix')]
> +
> +        # Sort the candidates by number of occurrences and pick the best ones.
> +        # When multiple prefixes are possible without a clear winner, we want to
> +        # display the most common options to the user, but without the most
> +        # unlikely options to avoid too long messages. As a heuristic, select
> +        # enough candidates to cover at least 2/3 of the possible prefixes, but
> +        # never more than 4 candidates.
> +        prefixes = list(prefixes.items())
> +        prefixes.sort(key=lambda x: x[1], reverse=True)
> +
> +        candidates = []
> +        candidates_count = 0
> +        for prefix in prefixes:
> +            candidates.append(f"`{prefix[0]}'")
> +            candidates_count += prefix[1]
> +            if candidates_count >= prefixes_count * 2 / 3 or \
> +               len(candidates) == 4:
> +                break
> +
> +        candidates = candidates[:-2] + [' and '.join(candidates[-2:])]

imo, s/and/or/

> +        candidates = ', '.join(candidates)

Also I feel like surrounding each candidate in quotes might be helpful.

Otherwise,

Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>

> +
> +        return [CommitIssue('Commit title is missing prefix, '
> +                            'possible candidates are ' + candidates)]
> +
> +
>  # ------------------------------------------------------------------------------
>  # Style Checkers
>  #
> 
> base-commit: f66a5c447b65bce774a1bc2d01034f437bf764b5
> prerequisite-patch-id: 55c457bf621237f99229898adb78e18b259ab74b
Laurent Pinchart Dec. 24, 2022, 12:11 a.m. UTC | #2
Hi Paul,

On Fri, Dec 23, 2022 at 05:56:03PM -0600, Paul Elder wrote:
> On Thu, Dec 22, 2022 at 12:12:37AM +0200, Laurent Pinchart via libcamera-devel wrote:
> > Add a commit checker to ensure that commit titles start with a prefix.
> > The commit issue message lists prefix candidates retrieved from the git
> > log.
> > 
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> > Changes since v1:
> > 
> > - Reduce indentation
> > ---
> >  utils/checkstyle.py | 62 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 62 insertions(+)
> > 
> > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > index a11d95cc5808..364e13d6533d 100755
> > --- a/utils/checkstyle.py
> > +++ b/utils/checkstyle.py
> > @@ -352,6 +352,68 @@ class HeaderAddChecker(CommitChecker):
> >          return issues
> >  
> >  
> > +class TitleChecker(CommitChecker):
> > +    prefix_regex = re.compile(r'[0-9a-f]+ (([a-zA-Z0-9_.-]+: )+)')
> 
> Ah, the first chunk is for the commit ID...

That's right.

> > +    release_regex = re.compile(r'libcamera v[0-9]+\.[0-9]+\.[0-9]+')
> > +
> > +    @classmethod
> > +    def check(cls, commit, top_level):
> > +        title = commit.title
> > +
> > +        # Ignore release commits, they don't need a prefix.
> > +        if TitleChecker.release_regex.fullmatch(title):
> > +            return []
> > +
> > +        prefix_pos = title.find(': ')
> > +        if prefix_pos != -1 and prefix_pos != len(title) - 2:
> > +            return []
> > +
> > +        # Find prefix candidates by searching the git history
> > +        msgs = subprocess.run(['git', 'log', '--no-decorate', '--oneline', '-n100', '--'] + commit.files(),
> > +                              stdout=subprocess.PIPE).stdout.decode('utf-8')
> > +        prefixes = {}
> > +        prefixes_count = 0
> > +        for msg in msgs.splitlines():
> > +            prefix = TitleChecker.prefix_regex.match(msg)
> > +            if not prefix:
> > +                continue
> > +
> > +            prefix = prefix.group(1)
> > +            if prefix in prefixes:
> > +                prefixes[prefix] += 1
> > +            else:
> > +                prefixes[prefix] = 1
> > +
> > +            prefixes_count += 1
> > +
> > +        if not prefixes:
> > +            return [CommitIssue('Commit title is missing prefix')]
> > +
> > +        # Sort the candidates by number of occurrences and pick the best ones.
> > +        # When multiple prefixes are possible without a clear winner, we want to
> > +        # display the most common options to the user, but without the most
> > +        # unlikely options to avoid too long messages. As a heuristic, select
> > +        # enough candidates to cover at least 2/3 of the possible prefixes, but
> > +        # never more than 4 candidates.
> > +        prefixes = list(prefixes.items())
> > +        prefixes.sort(key=lambda x: x[1], reverse=True)
> > +
> > +        candidates = []
> > +        candidates_count = 0
> > +        for prefix in prefixes:
> > +            candidates.append(f"`{prefix[0]}'")
> > +            candidates_count += prefix[1]
> > +            if candidates_count >= prefixes_count * 2 / 3 or \
> > +               len(candidates) == 4:
> > +                break
> > +
> > +        candidates = candidates[:-2] + [' and '.join(candidates[-2:])]
> 
> imo, s/and/or/

They are all candidates, but I'm happy to use "or" if that's preferred.

> > +        candidates = ', '.join(candidates)
> 
> Also I feel like surrounding each candidate in quotes might be helpful.

Done already, see

            candidates.append(f"`{prefix[0]}'")

above.

> Otherwise,
> 
> Reviewed-by: Paul Elder <paul.elder@ideasonboard.com>
> 
> > +
> > +        return [CommitIssue('Commit title is missing prefix, '
> > +                            'possible candidates are ' + candidates)]
> > +
> > +
> >  # ------------------------------------------------------------------------------
> >  # Style Checkers
> >  #
> > 
> > base-commit: f66a5c447b65bce774a1bc2d01034f437bf764b5
> > prerequisite-patch-id: 55c457bf621237f99229898adb78e18b259ab74b
Jacopo Mondi Jan. 10, 2023, 8:17 a.m. UTC | #3
I don't really understand python but it seems ok to me

Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>

On Thu, Dec 22, 2022 at 12:12:37AM +0200, Laurent Pinchart via libcamera-devel wrote:
> Add a commit checker to ensure that commit titles start with a prefix.
> The commit issue message lists prefix candidates retrieved from the git
> log.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> Changes since v1:
>
> - Reduce indentation
> ---
>  utils/checkstyle.py | 62 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 62 insertions(+)
>
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index a11d95cc5808..364e13d6533d 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -352,6 +352,68 @@ class HeaderAddChecker(CommitChecker):
>          return issues
>
>
> +class TitleChecker(CommitChecker):
> +    prefix_regex = re.compile(r'[0-9a-f]+ (([a-zA-Z0-9_.-]+: )+)')
> +    release_regex = re.compile(r'libcamera v[0-9]+\.[0-9]+\.[0-9]+')
> +
> +    @classmethod
> +    def check(cls, commit, top_level):
> +        title = commit.title
> +
> +        # Ignore release commits, they don't need a prefix.
> +        if TitleChecker.release_regex.fullmatch(title):
> +            return []
> +
> +        prefix_pos = title.find(': ')
> +        if prefix_pos != -1 and prefix_pos != len(title) - 2:
> +            return []
> +
> +        # Find prefix candidates by searching the git history
> +        msgs = subprocess.run(['git', 'log', '--no-decorate', '--oneline', '-n100', '--'] + commit.files(),
> +                              stdout=subprocess.PIPE).stdout.decode('utf-8')
> +        prefixes = {}
> +        prefixes_count = 0
> +        for msg in msgs.splitlines():
> +            prefix = TitleChecker.prefix_regex.match(msg)
> +            if not prefix:
> +                continue
> +
> +            prefix = prefix.group(1)
> +            if prefix in prefixes:
> +                prefixes[prefix] += 1
> +            else:
> +                prefixes[prefix] = 1
> +
> +            prefixes_count += 1
> +
> +        if not prefixes:
> +            return [CommitIssue('Commit title is missing prefix')]
> +
> +        # Sort the candidates by number of occurrences and pick the best ones.
> +        # When multiple prefixes are possible without a clear winner, we want to
> +        # display the most common options to the user, but without the most
> +        # unlikely options to avoid too long messages. As a heuristic, select
> +        # enough candidates to cover at least 2/3 of the possible prefixes, but
> +        # never more than 4 candidates.
> +        prefixes = list(prefixes.items())
> +        prefixes.sort(key=lambda x: x[1], reverse=True)
> +
> +        candidates = []
> +        candidates_count = 0
> +        for prefix in prefixes:
> +            candidates.append(f"`{prefix[0]}'")
> +            candidates_count += prefix[1]
> +            if candidates_count >= prefixes_count * 2 / 3 or \
> +               len(candidates) == 4:
> +                break
> +
> +        candidates = candidates[:-2] + [' and '.join(candidates[-2:])]
> +        candidates = ', '.join(candidates)
> +
> +        return [CommitIssue('Commit title is missing prefix, '
> +                            'possible candidates are ' + candidates)]
> +
> +
>  # ------------------------------------------------------------------------------
>  # Style Checkers
>  #
>
> base-commit: f66a5c447b65bce774a1bc2d01034f437bf764b5
> prerequisite-patch-id: 55c457bf621237f99229898adb78e18b259ab74b
> --
> Regards,
>
> Laurent Pinchart
>

Patch
diff mbox series

diff --git a/utils/checkstyle.py b/utils/checkstyle.py
index a11d95cc5808..364e13d6533d 100755
--- a/utils/checkstyle.py
+++ b/utils/checkstyle.py
@@ -352,6 +352,68 @@  class HeaderAddChecker(CommitChecker):
         return issues
 
 
+class TitleChecker(CommitChecker):
+    prefix_regex = re.compile(r'[0-9a-f]+ (([a-zA-Z0-9_.-]+: )+)')
+    release_regex = re.compile(r'libcamera v[0-9]+\.[0-9]+\.[0-9]+')
+
+    @classmethod
+    def check(cls, commit, top_level):
+        title = commit.title
+
+        # Ignore release commits, they don't need a prefix.
+        if TitleChecker.release_regex.fullmatch(title):
+            return []
+
+        prefix_pos = title.find(': ')
+        if prefix_pos != -1 and prefix_pos != len(title) - 2:
+            return []
+
+        # Find prefix candidates by searching the git history
+        msgs = subprocess.run(['git', 'log', '--no-decorate', '--oneline', '-n100', '--'] + commit.files(),
+                              stdout=subprocess.PIPE).stdout.decode('utf-8')
+        prefixes = {}
+        prefixes_count = 0
+        for msg in msgs.splitlines():
+            prefix = TitleChecker.prefix_regex.match(msg)
+            if not prefix:
+                continue
+
+            prefix = prefix.group(1)
+            if prefix in prefixes:
+                prefixes[prefix] += 1
+            else:
+                prefixes[prefix] = 1
+
+            prefixes_count += 1
+
+        if not prefixes:
+            return [CommitIssue('Commit title is missing prefix')]
+
+        # Sort the candidates by number of occurrences and pick the best ones.
+        # When multiple prefixes are possible without a clear winner, we want to
+        # display the most common options to the user, but without the most
+        # unlikely options to avoid too long messages. As a heuristic, select
+        # enough candidates to cover at least 2/3 of the possible prefixes, but
+        # never more than 4 candidates.
+        prefixes = list(prefixes.items())
+        prefixes.sort(key=lambda x: x[1], reverse=True)
+
+        candidates = []
+        candidates_count = 0
+        for prefix in prefixes:
+            candidates.append(f"`{prefix[0]}'")
+            candidates_count += prefix[1]
+            if candidates_count >= prefixes_count * 2 / 3 or \
+               len(candidates) == 4:
+                break
+
+        candidates = candidates[:-2] + [' and '.join(candidates[-2:])]
+        candidates = ', '.join(candidates)
+
+        return [CommitIssue('Commit title is missing prefix, '
+                            'possible candidates are ' + candidates)]
+
+
 # ------------------------------------------------------------------------------
 # Style Checkers
 #