[libcamera-devel,2/2] utils: checkstyle.py: Support single line hunks

Message ID 20200107154442.17372-3-kieran.bingham@ideasonboard.com
State Accepted
Headers show
Series
  • utils: checkstyle: Two checkstyle fixes
Related show

Commit Message

Kieran Bingham Jan. 7, 2020, 3:44 p.m. UTC
The checkstyle script expects hunks to be declared with a start line and
line count, however the unified diff format [0] declares that a single
line hunk will only have the start line:

> If a hunk contains just one line, only its start line number appears.
> Otherwise its line numbers look like ‘start,count’. An empty hunk is
> considered to start at the line that follows the hunk.

[0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified

Attempting to parse a single line hunk results in the following error:

  File "./utils/checkstyle.py", line 110, in __init__
    raise RuntimeError("Malformed diff hunk header '%s'" % line)
  RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@

The DiffHunk class only makes use of the start line, and does not
utilise the line count, thus update the regex to make the unused
groups optional, along with its comma separator.

Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
---
 utils/checkstyle.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Laurent Pinchart Jan. 7, 2020, 3:58 p.m. UTC | #1
Hi Kieran,

Thank you for the patch.

On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote:
> The checkstyle script expects hunks to be declared with a start line and
> line count, however the unified diff format [0] declares that a single
> line hunk will only have the start line:
> 
> > If a hunk contains just one line, only its start line number appears.
> > Otherwise its line numbers look like ‘start,count’. An empty hunk is
> > considered to start at the line that follows the hunk.
> 
> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified
> 
> Attempting to parse a single line hunk results in the following error:
> 
>   File "./utils/checkstyle.py", line 110, in __init__
>     raise RuntimeError("Malformed diff hunk header '%s'" % line)
>   RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@
> 
> The DiffHunk class only makes use of the start line, and does not
> utilise the line count, thus update the regex to make the unused
> groups optional, along with its comma separator.
> 
> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> ---
>  utils/checkstyle.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> index 41cd3371e81e..ab3b977358c7 100755
> --- a/utils/checkstyle.py
> +++ b/utils/checkstyle.py
> @@ -102,7 +102,7 @@ class DiffHunkSide(object):
>  
>  
>  class DiffHunk(object):
> -    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
> +    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')

How about

    diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@')

so that the count will be correctly captured, in case we need it later ?

>      def __init__(self, line):
>          match = DiffHunk.diff_header_regex.match(line)
Laurent Pinchart Jan. 7, 2020, 4:01 p.m. UTC | #2
Hi Kieran,

On Tue, Jan 07, 2020 at 05:58:04PM +0200, Laurent Pinchart wrote:
> On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote:
> > The checkstyle script expects hunks to be declared with a start line and
> > line count, however the unified diff format [0] declares that a single
> > line hunk will only have the start line:
> > 
> > > If a hunk contains just one line, only its start line number appears.
> > > Otherwise its line numbers look like ‘start,count’. An empty hunk is
> > > considered to start at the line that follows the hunk.
> > 
> > [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified
> > 
> > Attempting to parse a single line hunk results in the following error:
> > 
> >   File "./utils/checkstyle.py", line 110, in __init__
> >     raise RuntimeError("Malformed diff hunk header '%s'" % line)
> >   RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@
> > 
> > The DiffHunk class only makes use of the start line, and does not
> > utilise the line count, thus update the regex to make the unused
> > groups optional, along with its comma separator.
> > 
> > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > ---
> >  utils/checkstyle.py | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> > index 41cd3371e81e..ab3b977358c7 100755
> > --- a/utils/checkstyle.py
> > +++ b/utils/checkstyle.py
> > @@ -102,7 +102,7 @@ class DiffHunkSide(object):
> >  
> >  
> >  class DiffHunk(object):
> > -    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
> > +    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')
> 
> How about
> 
>     diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@')
> 
> so that the count will be correctly captured, in case we need it later ?

With this fixed,

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

> >      def __init__(self, line):
> >          match = DiffHunk.diff_header_regex.match(line)
Kieran Bingham Jan. 7, 2020, 4:04 p.m. UTC | #3
Hi Laurent,

On 07/01/2020 15:58, Laurent Pinchart wrote:
> Hi Kieran,
> 
> Thank you for the patch.
> 
> On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote:
>> The checkstyle script expects hunks to be declared with a start line and
>> line count, however the unified diff format [0] declares that a single
>> line hunk will only have the start line:
>>
>>> If a hunk contains just one line, only its start line number appears.
>>> Otherwise its line numbers look like ‘start,count’. An empty hunk is
>>> considered to start at the line that follows the hunk.
>>
>> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified
>>
>> Attempting to parse a single line hunk results in the following error:
>>
>>   File "./utils/checkstyle.py", line 110, in __init__
>>     raise RuntimeError("Malformed diff hunk header '%s'" % line)
>>   RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@
>>
>> The DiffHunk class only makes use of the start line, and does not
>> utilise the line count, thus update the regex to make the unused
>> groups optional, along with its comma separator.
>>
>> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>> ---
>>  utils/checkstyle.py | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
>> index 41cd3371e81e..ab3b977358c7 100755
>> --- a/utils/checkstyle.py
>> +++ b/utils/checkstyle.py
>> @@ -102,7 +102,7 @@ class DiffHunkSide(object):
>>  
>>  
>>  class DiffHunk(object):
>> -    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
>> +    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')
> 
> How about
> 
>     diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@')
> 
> so that the count will be correctly captured, in case we need it later ?

That could be better, but it doesn't make the ',' invalid in the case of
 @@ -1, +1,12 @@

That likely shouldn't happen, and perhaps isn't an issue. Are you happy
for the ',' to be allowed when no digits follow?

I don't think I can come up with either a reason for how this string
might end up occurring, or how it would be a problem if we parsed it as
such.. (until we make use of the hunk length at least).

If you're happy, I'll update the string (and commit message) and push.
--
Kieran


> 
>>      def __init__(self, line):
>>          match = DiffHunk.diff_header_regex.match(line)
>
Laurent Pinchart Jan. 7, 2020, 4:07 p.m. UTC | #4
Hi Kieran,

On Tue, Jan 07, 2020 at 04:04:00PM +0000, Kieran Bingham wrote:
> On 07/01/2020 15:58, Laurent Pinchart wrote:
> > On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote:
> >> The checkstyle script expects hunks to be declared with a start line and
> >> line count, however the unified diff format [0] declares that a single
> >> line hunk will only have the start line:
> >>
> >>> If a hunk contains just one line, only its start line number appears.
> >>> Otherwise its line numbers look like ‘start,count’. An empty hunk is
> >>> considered to start at the line that follows the hunk.
> >>
> >> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified
> >>
> >> Attempting to parse a single line hunk results in the following error:
> >>
> >>   File "./utils/checkstyle.py", line 110, in __init__
> >>     raise RuntimeError("Malformed diff hunk header '%s'" % line)
> >>   RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@
> >>
> >> The DiffHunk class only makes use of the start line, and does not
> >> utilise the line count, thus update the regex to make the unused
> >> groups optional, along with its comma separator.
> >>
> >> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> >> ---
> >>  utils/checkstyle.py | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
> >> index 41cd3371e81e..ab3b977358c7 100755
> >> --- a/utils/checkstyle.py
> >> +++ b/utils/checkstyle.py
> >> @@ -102,7 +102,7 @@ class DiffHunkSide(object):
> >>  
> >>  
> >>  class DiffHunk(object):
> >> -    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
> >> +    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')
> > 
> > How about
> > 
> >     diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@')
> > 
> > so that the count will be correctly captured, in case we need it later ?
> 
> That could be better, but it doesn't make the ',' invalid in the case of
>  @@ -1, +1,12 @@
> 
> That likely shouldn't happen, and perhaps isn't an issue. Are you happy
> for the ',' to be allowed when no digits follow?
> 
> I don't think I can come up with either a reason for how this string
> might end up occurring, or how it would be a problem if we parsed it as
> such.. (until we make use of the hunk length at least).
> 
> If you're happy, I'll update the string (and commit message) and push.

It's a good point. I'm happy either way, but if we include the ',' in
the group, does the last group need to have a '?' after the ',' ? i.e.
shouldn't it be

    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@')

> >>      def __init__(self, line):
> >>          match = DiffHunk.diff_header_regex.match(line)
Kieran Bingham Jan. 7, 2020, 4:28 p.m. UTC | #5
Hi Laurent,

On 07/01/2020 16:07, Laurent Pinchart wrote:
> Hi Kieran,
> 
> On Tue, Jan 07, 2020 at 04:04:00PM +0000, Kieran Bingham wrote:
>> On 07/01/2020 15:58, Laurent Pinchart wrote:
>>> On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote:
>>>> The checkstyle script expects hunks to be declared with a start line and
>>>> line count, however the unified diff format [0] declares that a single
>>>> line hunk will only have the start line:
>>>>
>>>>> If a hunk contains just one line, only its start line number appears.
>>>>> Otherwise its line numbers look like ‘start,count’. An empty hunk is
>>>>> considered to start at the line that follows the hunk.
>>>>
>>>> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified
>>>>
>>>> Attempting to parse a single line hunk results in the following error:
>>>>
>>>>   File "./utils/checkstyle.py", line 110, in __init__
>>>>     raise RuntimeError("Malformed diff hunk header '%s'" % line)
>>>>   RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@
>>>>
>>>> The DiffHunk class only makes use of the start line, and does not
>>>> utilise the line count, thus update the regex to make the unused
>>>> groups optional, along with its comma separator.
>>>>
>>>> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>>>> ---
>>>>  utils/checkstyle.py | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/utils/checkstyle.py b/utils/checkstyle.py
>>>> index 41cd3371e81e..ab3b977358c7 100755
>>>> --- a/utils/checkstyle.py
>>>> +++ b/utils/checkstyle.py
>>>> @@ -102,7 +102,7 @@ class DiffHunkSide(object):
>>>>  
>>>>  
>>>>  class DiffHunk(object):
>>>> -    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
>>>> +    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')
>>>
>>> How about
>>>
>>>     diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@')
>>>
>>> so that the count will be correctly captured, in case we need it later ?
>>
>> That could be better, but it doesn't make the ',' invalid in the case of
>>  @@ -1, +1,12 @@
>>
>> That likely shouldn't happen, and perhaps isn't an issue. Are you happy
>> for the ',' to be allowed when no digits follow?
>>
>> I don't think I can come up with either a reason for how this string
>> might end up occurring, or how it would be a problem if we parsed it as
>> such.. (until we make use of the hunk length at least).
>>
>> If you're happy, I'll update the string (and commit message) and push.
> 
> It's a good point. I'm happy either way, but if we include the ',' in
> the group, does the last group need to have a '?' after the ',' ? i.e.
> shouldn't it be
>
>     diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@')
> 

Indeed - the ? in the last group was extraneous, sorry it was a left
over from trying out the combinations, or rather an artefact of me
removing the first one but not the second :-D

If you're content that any extraneous ',' provided will do no harm (and
in my initial testing here, I don't think it will) then I'll go with
your variant to facilitate easy extraction of the hunk size later if
it's needed.

>>>>      def __init__(self, line):
>>>>          match = DiffHunk.diff_header_regex.match(line)
>

Patch

diff --git a/utils/checkstyle.py b/utils/checkstyle.py
index 41cd3371e81e..ab3b977358c7 100755
--- a/utils/checkstyle.py
+++ b/utils/checkstyle.py
@@ -102,7 +102,7 @@  class DiffHunkSide(object):
 
 
 class DiffHunk(object):
-    diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@')
+    diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@')
 
     def __init__(self, line):
         match = DiffHunk.diff_header_regex.match(line)