utils: checkstyle: Add license commit checker
diff mbox series

Message ID 20260407135632.1810486-1-laurent.pinchart@ideasonboard.com
State New
Headers show
Series
  • utils: checkstyle: Add license commit checker
Related show

Commit Message

Laurent Pinchart April 7, 2026, 1:56 p.m. UTC
Add a checker, based on the reuse tool, to detect files without a valid
license. The tool is optional, the checker will be skipped when not
available.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
I figured out that if we want this in CI, we should have it in
checkstyle too. In which case we could drop the license job from CI.
---
 utils/checkstyle.py | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)


base-commit: 4b6c47bd6675c428c19ea76370f0301c24f23bf1

Comments

Jacopo Mondi April 8, 2026, 6:39 a.m. UTC | #1
Hi Laurent

On Tue, Apr 07, 2026 at 04:56:32PM +0300, Laurent Pinchart wrote:
> Add a checker, based on the reuse tool, to detect files without a valid
> license. The tool is optional, the checker will be skipped when not
> available.
>
> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
> I figured out that if we want this in CI, we should have it in
> checkstyle too. In which case we could drop the license job from CI.

I think this is a way better approach

> ---
>  utils/checkstyle.py | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index fa1355c1738b..f1ba1ee0ed81 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -453,6 +453,31 @@ class HeaderAddChecker(CommitChecker):
>          return issues
>
>
> +class LicenseChecker(CommitChecker):
> +    commit_types = (Commit, StagedChanges, Amendment)
> +    dependencies = ('reuse',)

Is this a build dependency ? Does it need to be ?

> +
> +    missing_license_regex = re.compile(r'^(.*): no license identifier$')
> +
> +    @classmethod
> +    def check(cls, commit, top_level):
> +        issues = []
> +
> +        ret = subprocess.run(['reuse', 'lint-file'] + commit.files('AR'),
> +                             stdout=subprocess.PIPE)
> +
> +        for line in ret.stdout.decode('utf-8').splitlines():
> +            match = LicenseChecker.missing_license_regex.match(line)
> +            if not match:
> +                continue
> +
> +            filename = match.group(1)
> +            issue = CommitIssue(f'File {filename} has no license identifier')
> +            issues.append(issue)
> +
> +        return issues
> +
> +

I trust your Python!

Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>

>  class TitleChecker(CommitChecker):
>      commit_types = (Commit,)
>
>
> base-commit: 4b6c47bd6675c428c19ea76370f0301c24f23bf1
> --
> Regards,
>
> Laurent Pinchart
>
Laurent Pinchart April 8, 2026, 9:14 a.m. UTC | #2
On Wed, Apr 08, 2026 at 08:39:37AM +0200, Jacopo Mondi wrote:
> On Tue, Apr 07, 2026 at 04:56:32PM +0300, Laurent Pinchart wrote:
> > Add a checker, based on the reuse tool, to detect files without a valid
> > license. The tool is optional, the checker will be skipped when not
> > available.
> >
> > Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> > I figured out that if we want this in CI, we should have it in
> > checkstyle too. In which case we could drop the license job from CI.
> 
> I think this is a way better approach
> 
> > ---
> >  utils/checkstyle.py | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > index fa1355c1738b..f1ba1ee0ed81 100755
> > --- a/utils/checkstyle.py
> > +++ b/utils/checkstyle.py
> > @@ -453,6 +453,31 @@ class HeaderAddChecker(CommitChecker):
> >          return issues
> >
> >
> > +class LicenseChecker(CommitChecker):
> > +    commit_types = (Commit, StagedChanges, Amendment)
> > +    dependencies = ('reuse',)
> 
> Is this a build dependency ? Does it need to be ?

I expect most people to not have reuse installed (even if it's packaged
by distributions), so we'll likely catch issues in CI only. I'm however
not too concerned, as missing license information when adding new files
is not a common mistake.

We could implement file parsing in Python as well and avoid depending on
the reuse tool, but that's more code to maintain. I'd rather depend on
the reuse tool for the time being, and reconsider that later if it
causes issues.

> > +
> > +    missing_license_regex = re.compile(r'^(.*): no license identifier$')
> > +
> > +    @classmethod
> > +    def check(cls, commit, top_level):
> > +        issues = []
> > +
> > +        ret = subprocess.run(['reuse', 'lint-file'] + commit.files('AR'),
> > +                             stdout=subprocess.PIPE)
> > +
> > +        for line in ret.stdout.decode('utf-8').splitlines():
> > +            match = LicenseChecker.missing_license_regex.match(line)
> > +            if not match:
> > +                continue
> > +
> > +            filename = match.group(1)
> > +            issue = CommitIssue(f'File {filename} has no license identifier')
> > +            issues.append(issue)
> > +
> > +        return issues
> > +
> > +
> 
> I trust your Python!
> 
> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
> 
> >  class TitleChecker(CommitChecker):
> >      commit_types = (Commit,)
> >
> >
> > base-commit: 4b6c47bd6675c428c19ea76370f0301c24f23bf1
Barnabás Pőcze April 13, 2026, 7:03 a.m. UTC | #3
2026. 04. 08. 11:14 keltezéssel, Laurent Pinchart írta:
> On Wed, Apr 08, 2026 at 08:39:37AM +0200, Jacopo Mondi wrote:
>> On Tue, Apr 07, 2026 at 04:56:32PM +0300, Laurent Pinchart wrote:
>>> Add a checker, based on the reuse tool, to detect files without a valid
>>> license. The tool is optional, the checker will be skipped when not
>>> available.
>>>
>>> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>> ---
>>> I figured out that if we want this in CI, we should have it in
>>> checkstyle too. In which case we could drop the license job from CI.
>>
>> I think this is a way better approach
>>
>>> ---
>>>   utils/checkstyle.py | 25 +++++++++++++++++++++++++
>>>   1 file changed, 25 insertions(+)
>>>
>>> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
>>> index fa1355c1738b..f1ba1ee0ed81 100755
>>> --- a/utils/checkstyle.py
>>> +++ b/utils/checkstyle.py
>>> @@ -453,6 +453,31 @@ class HeaderAddChecker(CommitChecker):
>>>           return issues
>>>
>>>
>>> +class LicenseChecker(CommitChecker):
>>> +    commit_types = (Commit, StagedChanges, Amendment)
>>> +    dependencies = ('reuse',)
>>
>> Is this a build dependency ? Does it need to be ?
> 
> I expect most people to not have reuse installed (even if it's packaged
> by distributions), so we'll likely catch issues in CI only. I'm however
> not too concerned, as missing license information when adding new files
> is not a common mistake.
> 
> We could implement file parsing in Python as well and avoid depending on
> the reuse tool, but that's more code to maintain. I'd rather depend on
> the reuse tool for the time being, and reconsider that later if it
> causes issues.

I think it's better to use the existing tool here.


> 
>>> +
>>> +    missing_license_regex = re.compile(r'^(.*): no license identifier$')
>>> +
>>> +    @classmethod
>>> +    def check(cls, commit, top_level):
>>> +        issues = []
>>> +
>>> +        ret = subprocess.run(['reuse', 'lint-file'] + commit.files('AR'),

This looks at added and renamed files, right? Why not the modified ones as well?

Reviewed-by: Barnabás Pőcze <barnabas.pocze@ideasonboard.com>


>>> +                             stdout=subprocess.PIPE)
>>> +
>>> +        for line in ret.stdout.decode('utf-8').splitlines():
>>> +            match = LicenseChecker.missing_license_regex.match(line)
>>> +            if not match:
>>> +                continue
>>> +
>>> +            filename = match.group(1)
>>> +            issue = CommitIssue(f'File {filename} has no license identifier')
>>> +            issues.append(issue)
>>> +
>>> +        return issues
>>> +
>>> +
>>
>> I trust your Python!
>>
>> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
>>
>>>   class TitleChecker(CommitChecker):
>>>       commit_types = (Commit,)
>>>
>>>
>>> base-commit: 4b6c47bd6675c428c19ea76370f0301c24f23bf1
>

Patch
diff mbox series

diff --git a/utils/checkstyle.py b/utils/checkstyle.py
index fa1355c1738b..f1ba1ee0ed81 100755
--- a/utils/checkstyle.py
+++ b/utils/checkstyle.py
@@ -453,6 +453,31 @@  class HeaderAddChecker(CommitChecker):
         return issues
 
 
+class LicenseChecker(CommitChecker):
+    commit_types = (Commit, StagedChanges, Amendment)
+    dependencies = ('reuse',)
+
+    missing_license_regex = re.compile(r'^(.*): no license identifier$')
+
+    @classmethod
+    def check(cls, commit, top_level):
+        issues = []
+
+        ret = subprocess.run(['reuse', 'lint-file'] + commit.files('AR'),
+                             stdout=subprocess.PIPE)
+
+        for line in ret.stdout.decode('utf-8').splitlines():
+            match = LicenseChecker.missing_license_regex.match(line)
+            if not match:
+                continue
+
+            filename = match.group(1)
+            issue = CommitIssue(f'File {filename} has no license identifier')
+            issues.append(issue)
+
+        return issues
+
+
 class TitleChecker(CommitChecker):
     commit_types = (Commit,)