Message ID | 20200107154442.17372-3-kieran.bingham@ideasonboard.com |
---|---|
State | Accepted |
Headers | show |
Series |
|
Related | show |
Hi Kieran, Thank you for the patch. On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote: > The checkstyle script expects hunks to be declared with a start line and > line count, however the unified diff format [0] declares that a single > line hunk will only have the start line: > > > If a hunk contains just one line, only its start line number appears. > > Otherwise its line numbers look like ‘start,count’. An empty hunk is > > considered to start at the line that follows the hunk. > > [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified > > Attempting to parse a single line hunk results in the following error: > > File "./utils/checkstyle.py", line 110, in __init__ > raise RuntimeError("Malformed diff hunk header '%s'" % line) > RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@ > > The DiffHunk class only makes use of the start line, and does not > utilise the line count, thus update the regex to make the unused > groups optional, along with its comma separator. > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > --- > utils/checkstyle.py | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py > index 41cd3371e81e..ab3b977358c7 100755 > --- a/utils/checkstyle.py > +++ b/utils/checkstyle.py > @@ -102,7 +102,7 @@ class DiffHunkSide(object): > > > class DiffHunk(object): > - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') > + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') How about diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@') so that the count will be correctly captured, in case we need it later ? > def __init__(self, line): > match = DiffHunk.diff_header_regex.match(line)
Hi Kieran, On Tue, Jan 07, 2020 at 05:58:04PM +0200, Laurent Pinchart wrote: > On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote: > > The checkstyle script expects hunks to be declared with a start line and > > line count, however the unified diff format [0] declares that a single > > line hunk will only have the start line: > > > > > If a hunk contains just one line, only its start line number appears. > > > Otherwise its line numbers look like ‘start,count’. An empty hunk is > > > considered to start at the line that follows the hunk. > > > > [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified > > > > Attempting to parse a single line hunk results in the following error: > > > > File "./utils/checkstyle.py", line 110, in __init__ > > raise RuntimeError("Malformed diff hunk header '%s'" % line) > > RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@ > > > > The DiffHunk class only makes use of the start line, and does not > > utilise the line count, thus update the regex to make the unused > > groups optional, along with its comma separator. > > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > --- > > utils/checkstyle.py | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/utils/checkstyle.py b/utils/checkstyle.py > > index 41cd3371e81e..ab3b977358c7 100755 > > --- a/utils/checkstyle.py > > +++ b/utils/checkstyle.py > > @@ -102,7 +102,7 @@ class DiffHunkSide(object): > > > > > > class DiffHunk(object): > > - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') > > + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') > > How about > > diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@') > > so that the count will be correctly captured, in case we need it later ? With this fixed, Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> > > def __init__(self, line): > > match = DiffHunk.diff_header_regex.match(line)
Hi Laurent, On 07/01/2020 15:58, Laurent Pinchart wrote: > Hi Kieran, > > Thank you for the patch. > > On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote: >> The checkstyle script expects hunks to be declared with a start line and >> line count, however the unified diff format [0] declares that a single >> line hunk will only have the start line: >> >>> If a hunk contains just one line, only its start line number appears. >>> Otherwise its line numbers look like ‘start,count’. An empty hunk is >>> considered to start at the line that follows the hunk. >> >> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified >> >> Attempting to parse a single line hunk results in the following error: >> >> File "./utils/checkstyle.py", line 110, in __init__ >> raise RuntimeError("Malformed diff hunk header '%s'" % line) >> RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@ >> >> The DiffHunk class only makes use of the start line, and does not >> utilise the line count, thus update the regex to make the unused >> groups optional, along with its comma separator. >> >> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> >> --- >> utils/checkstyle.py | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/utils/checkstyle.py b/utils/checkstyle.py >> index 41cd3371e81e..ab3b977358c7 100755 >> --- a/utils/checkstyle.py >> +++ b/utils/checkstyle.py >> @@ -102,7 +102,7 @@ class DiffHunkSide(object): >> >> >> class DiffHunk(object): >> - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') >> + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') > > How about > > diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@') > > so that the count will be correctly captured, in case we need it later ? That could be better, but it doesn't make the ',' invalid in the case of @@ -1, +1,12 @@ That likely shouldn't happen, and perhaps isn't an issue. Are you happy for the ',' to be allowed when no digits follow? I don't think I can come up with either a reason for how this string might end up occurring, or how it would be a problem if we parsed it as such.. (until we make use of the hunk length at least). If you're happy, I'll update the string (and commit message) and push. -- Kieran > >> def __init__(self, line): >> match = DiffHunk.diff_header_regex.match(line) >
Hi Kieran, On Tue, Jan 07, 2020 at 04:04:00PM +0000, Kieran Bingham wrote: > On 07/01/2020 15:58, Laurent Pinchart wrote: > > On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote: > >> The checkstyle script expects hunks to be declared with a start line and > >> line count, however the unified diff format [0] declares that a single > >> line hunk will only have the start line: > >> > >>> If a hunk contains just one line, only its start line number appears. > >>> Otherwise its line numbers look like ‘start,count’. An empty hunk is > >>> considered to start at the line that follows the hunk. > >> > >> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified > >> > >> Attempting to parse a single line hunk results in the following error: > >> > >> File "./utils/checkstyle.py", line 110, in __init__ > >> raise RuntimeError("Malformed diff hunk header '%s'" % line) > >> RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@ > >> > >> The DiffHunk class only makes use of the start line, and does not > >> utilise the line count, thus update the regex to make the unused > >> groups optional, along with its comma separator. > >> > >> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > >> --- > >> utils/checkstyle.py | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/utils/checkstyle.py b/utils/checkstyle.py > >> index 41cd3371e81e..ab3b977358c7 100755 > >> --- a/utils/checkstyle.py > >> +++ b/utils/checkstyle.py > >> @@ -102,7 +102,7 @@ class DiffHunkSide(object): > >> > >> > >> class DiffHunk(object): > >> - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') > >> + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') > > > > How about > > > > diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@') > > > > so that the count will be correctly captured, in case we need it later ? > > That could be better, but it doesn't make the ',' invalid in the case of > @@ -1, +1,12 @@ > > That likely shouldn't happen, and perhaps isn't an issue. Are you happy > for the ',' to be allowed when no digits follow? > > I don't think I can come up with either a reason for how this string > might end up occurring, or how it would be a problem if we parsed it as > such.. (until we make use of the hunk length at least). > > If you're happy, I'll update the string (and commit message) and push. It's a good point. I'm happy either way, but if we include the ',' in the group, does the last group need to have a '?' after the ',' ? i.e. shouldn't it be diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@') > >> def __init__(self, line): > >> match = DiffHunk.diff_header_regex.match(line)
Hi Laurent, On 07/01/2020 16:07, Laurent Pinchart wrote: > Hi Kieran, > > On Tue, Jan 07, 2020 at 04:04:00PM +0000, Kieran Bingham wrote: >> On 07/01/2020 15:58, Laurent Pinchart wrote: >>> On Tue, Jan 07, 2020 at 03:44:41PM +0000, Kieran Bingham wrote: >>>> The checkstyle script expects hunks to be declared with a start line and >>>> line count, however the unified diff format [0] declares that a single >>>> line hunk will only have the start line: >>>> >>>>> If a hunk contains just one line, only its start line number appears. >>>>> Otherwise its line numbers look like ‘start,count’. An empty hunk is >>>>> considered to start at the line that follows the hunk. >>>> >>>> [0] https://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html#Detailed-Unified >>>> >>>> Attempting to parse a single line hunk results in the following error: >>>> >>>> File "./utils/checkstyle.py", line 110, in __init__ >>>> raise RuntimeError("Malformed diff hunk header '%s'" % line) >>>> RuntimeError: Malformed diff hunk header '@@ -1 +1,2 @@ >>>> >>>> The DiffHunk class only makes use of the start line, and does not >>>> utilise the line count, thus update the regex to make the unused >>>> groups optional, along with its comma separator. >>>> >>>> Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> >>>> --- >>>> utils/checkstyle.py | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/utils/checkstyle.py b/utils/checkstyle.py >>>> index 41cd3371e81e..ab3b977358c7 100755 >>>> --- a/utils/checkstyle.py >>>> +++ b/utils/checkstyle.py >>>> @@ -102,7 +102,7 @@ class DiffHunkSide(object): >>>> >>>> >>>> class DiffHunk(object): >>>> - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') >>>> + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') >>> >>> How about >>> >>> diff_header_regex = re.compile(r'@@ -([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@') >>> >>> so that the count will be correctly captured, in case we need it later ? >> >> That could be better, but it doesn't make the ',' invalid in the case of >> @@ -1, +1,12 @@ >> >> That likely shouldn't happen, and perhaps isn't an issue. Are you happy >> for the ',' to be allowed when no digits follow? >> >> I don't think I can come up with either a reason for how this string >> might end up occurring, or how it would be a problem if we parsed it as >> such.. (until we make use of the hunk length at least). >> >> If you're happy, I'll update the string (and commit message) and push. > > It's a good point. I'm happy either way, but if we include the ',' in > the group, does the last group need to have a '?' after the ',' ? i.e. > shouldn't it be > > diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,[0-9]+)? @@') > Indeed - the ? in the last group was extraneous, sorry it was a left over from trying out the combinations, or rather an artefact of me removing the first one but not the second :-D If you're content that any extraneous ',' provided will do no harm (and in my initial testing here, I don't think it will) then I'll go with your variant to facilitate easy extraction of the hunk size later if it's needed. >>>> def __init__(self, line): >>>> match = DiffHunk.diff_header_regex.match(line) >
diff --git a/utils/checkstyle.py b/utils/checkstyle.py index 41cd3371e81e..ab3b977358c7 100755 --- a/utils/checkstyle.py +++ b/utils/checkstyle.py @@ -102,7 +102,7 @@ class DiffHunkSide(object): class DiffHunk(object): - diff_header_regex = re.compile(r'@@ -([0-9]+),([0-9]+) \+([0-9]+),([0-9]+) @@') + diff_header_regex = re.compile(r'@@ -([0-9]+)(,[0-9]+)? \+([0-9]+)(,?[0-9]+)? @@') def __init__(self, line): match = DiffHunk.diff_header_regex.match(line)