Message ID | 20240805223148.30605-1-kieran.bingham@ideasonboard.com |
---|---|
State | Accepted |
Headers | show |
Series |
|
Related | show |
Hi Kieran, Thank you for the patch. On Mon, Aug 05, 2024 at 11:31:48PM +0100, Kieran Bingham wrote: > The udev_monitor_receive_device() can return NULL on an error as > detailed in the man pages for the function. > > The udevNotify() handler in the DeviceEnumeratorUdev directly uses the > return value of udev_monitor_receive_device() in successive calls to > process the event without having first checked the udev_device. > > Ensure we identify, and handle events where the udev_device can not be > returned successfully. > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=230 > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > --- > This is a proposed fix for the reported CI failure caught by Systemd > runners. It's not clear if this failure was an injected failure, startup > race or otherwise - but given the documentation of this function states > it can return null ... it seems valid to null check before passing into > other APIs especially when a trace is reported directly at this point. > > I don't presently have a way to replicate or reproduce the error either, > but this patch itself has passed our ci tests (which isn't too > unexpected as the change is quite minimal otherwise). > > > src/libcamera/device_enumerator_udev.cpp | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/src/libcamera/device_enumerator_udev.cpp b/src/libcamera/device_enumerator_udev.cpp > index 01c70b6daa82..c6c78651baa0 100644 > --- a/src/libcamera/device_enumerator_udev.cpp > +++ b/src/libcamera/device_enumerator_udev.cpp > @@ -332,6 +332,12 @@ int DeviceEnumeratorUdev::addV4L2Device(dev_t devnum) > void DeviceEnumeratorUdev::udevNotify() > { > struct udev_device *dev = udev_monitor_receive_device(monitor_); > + if (!dev) { > + LOG(DeviceEnumerator, Warning) > + << "Ignoring notfication received without a device"; Is that even worth a warning or only debug? Anyways: Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com> Cheers, Stefan > + return; > + } > + > std::string_view action(udev_device_get_action(dev)); > std::string_view deviceNode(udev_device_get_devnode(dev)); > > -- > 2.45.2 >
On Tue, Aug 06, 2024 at 09:19:37AM +0200, Stefan Klug wrote: > Hi Kieran, > > Thank you for the patch. > > On Mon, Aug 05, 2024 at 11:31:48PM +0100, Kieran Bingham wrote: > > The udev_monitor_receive_device() can return NULL on an error as > > detailed in the man pages for the function. > > > > The udevNotify() handler in the DeviceEnumeratorUdev directly uses the > > return value of udev_monitor_receive_device() in successive calls to > > process the event without having first checked the udev_device. > > > > Ensure we identify, and handle events where the udev_device can not be > > returned successfully. > > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=230 > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > --- > > This is a proposed fix for the reported CI failure caught by Systemd > > runners. It's not clear if this failure was an injected failure, startup > > race or otherwise - but given the documentation of this function states > > it can return null ... it seems valid to null check before passing into > > other APIs especially when a trace is reported directly at this point. > > > > I don't presently have a way to replicate or reproduce the error either, > > but this patch itself has passed our ci tests (which isn't too > > unexpected as the change is quite minimal otherwise). > > > > > > src/libcamera/device_enumerator_udev.cpp | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/src/libcamera/device_enumerator_udev.cpp b/src/libcamera/device_enumerator_udev.cpp > > index 01c70b6daa82..c6c78651baa0 100644 > > --- a/src/libcamera/device_enumerator_udev.cpp > > +++ b/src/libcamera/device_enumerator_udev.cpp > > @@ -332,6 +332,12 @@ int DeviceEnumeratorUdev::addV4L2Device(dev_t devnum) > > void DeviceEnumeratorUdev::udevNotify() > > { > > struct udev_device *dev = udev_monitor_receive_device(monitor_); > > + if (!dev) { > > + LOG(DeviceEnumerator, Warning) > > + << "Ignoring notfication received without a device"; > > Is that even worth a warning or only debug? As we've never seen this product for years, I wonder why it's happening. A warning would help noticing the issue. On the other hand, are such errors means to happen in rare ubt valid cases ? And then, are they meant to be ignored, or do they indicate something is fundamentally wrong and action needs to be taken ? If no action is needed and there are no consequences, then a debug message is more than enough. If we want to diagnose this, we need more information. The message should print the value of errno, and it would be nice if the original reporter of the issues could then report the error number. > Anyways: > Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com> > > > + return; > > + } > > + > > std::string_view action(udev_device_get_action(dev)); > > std::string_view deviceNode(udev_device_get_devnode(dev));
Quoting Laurent Pinchart (2024-08-06 08:40:53) > On Tue, Aug 06, 2024 at 09:19:37AM +0200, Stefan Klug wrote: > > Hi Kieran, > > > > Thank you for the patch. > > > > On Mon, Aug 05, 2024 at 11:31:48PM +0100, Kieran Bingham wrote: > > > The udev_monitor_receive_device() can return NULL on an error as > > > detailed in the man pages for the function. > > > > > > The udevNotify() handler in the DeviceEnumeratorUdev directly uses the > > > return value of udev_monitor_receive_device() in successive calls to > > > process the event without having first checked the udev_device. > > > > > > Ensure we identify, and handle events where the udev_device can not be > > > returned successfully. > > > > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=230 > > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > > --- > > > This is a proposed fix for the reported CI failure caught by Systemd > > > runners. It's not clear if this failure was an injected failure, startup > > > race or otherwise - but given the documentation of this function states > > > it can return null ... it seems valid to null check before passing into > > > other APIs especially when a trace is reported directly at this point. > > > > > > I don't presently have a way to replicate or reproduce the error either, > > > but this patch itself has passed our ci tests (which isn't too > > > unexpected as the change is quite minimal otherwise). > > > > > > > > > src/libcamera/device_enumerator_udev.cpp | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/src/libcamera/device_enumerator_udev.cpp b/src/libcamera/device_enumerator_udev.cpp > > > index 01c70b6daa82..c6c78651baa0 100644 > > > --- a/src/libcamera/device_enumerator_udev.cpp > > > +++ b/src/libcamera/device_enumerator_udev.cpp > > > @@ -332,6 +332,12 @@ int DeviceEnumeratorUdev::addV4L2Device(dev_t devnum) > > > void DeviceEnumeratorUdev::udevNotify() > > > { > > > struct udev_device *dev = udev_monitor_receive_device(monitor_); > > > + if (!dev) { > > > + LOG(DeviceEnumerator, Warning) > > > + << "Ignoring notfication received without a device"; > > > > Is that even worth a warning or only debug? > > As we've never seen this product for years, I wonder why it's happening. > A warning would help noticing the issue. On the other hand, are such > errors means to happen in rare ubt valid cases ? And then, are they > meant to be ignored, or do they indicate something is fundamentally > wrong and action needs to be taken ? If no action is needed and there > are no consequences, then a debug message is more than enough. > > If we want to diagnose this, we need more information. The message > should print the value of errno, and it would be nice if the original > reporter of the issues could then report the error number. Oh interesting idea. I didn't notice if the man pages reference setting errno - but it's worth reporting to help diagnose the issue for sure. https://github.com/systemd/systemd/blob/5a5a7093b8e054cf3bfa65716ffe77a5d93647ff/src/libudev/libudev-monitor.c#L239 calls return_with_errno(NULL, r);, so I think errno is indeed worthwhile reporting. I'll update in v2. -- Kieran > > Anyways: > > Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com> > > > > > + return; > > > + } > > > + > > > std::string_view action(udev_device_get_action(dev)); > > > std::string_view deviceNode(udev_device_get_devnode(dev)); > > -- > Regards, > > Laurent Pinchart
Quoting Stefan Klug (2024-08-06 08:19:37) > Hi Kieran, > > Thank you for the patch. > > On Mon, Aug 05, 2024 at 11:31:48PM +0100, Kieran Bingham wrote: > > The udev_monitor_receive_device() can return NULL on an error as > > detailed in the man pages for the function. > > > > The udevNotify() handler in the DeviceEnumeratorUdev directly uses the > > return value of udev_monitor_receive_device() in successive calls to > > process the event without having first checked the udev_device. > > > > Ensure we identify, and handle events where the udev_device can not be > > returned successfully. > > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=230 > > Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> > > --- > > This is a proposed fix for the reported CI failure caught by Systemd > > runners. It's not clear if this failure was an injected failure, startup > > race or otherwise - but given the documentation of this function states > > it can return null ... it seems valid to null check before passing into > > other APIs especially when a trace is reported directly at this point. > > > > I don't presently have a way to replicate or reproduce the error either, > > but this patch itself has passed our ci tests (which isn't too > > unexpected as the change is quite minimal otherwise). > > > > > > src/libcamera/device_enumerator_udev.cpp | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/src/libcamera/device_enumerator_udev.cpp b/src/libcamera/device_enumerator_udev.cpp > > index 01c70b6daa82..c6c78651baa0 100644 > > --- a/src/libcamera/device_enumerator_udev.cpp > > +++ b/src/libcamera/device_enumerator_udev.cpp > > @@ -332,6 +332,12 @@ int DeviceEnumeratorUdev::addV4L2Device(dev_t devnum) > > void DeviceEnumeratorUdev::udevNotify() > > { > > struct udev_device *dev = udev_monitor_receive_device(monitor_); > > + if (!dev) { > > + LOG(DeviceEnumerator, Warning) > > + << "Ignoring notfication received without a device"; > > Is that even worth a warning or only debug? > > Anyways: > Signed-off-by: Stefan Klug <stefan.klug@ideasonboard.com> Thanks, but I'll assume that was "Reviewed-by: " :-) > > Cheers, > Stefan > > > + return; > > + } > > + > > std::string_view action(udev_device_get_action(dev)); > > std::string_view deviceNode(udev_device_get_devnode(dev)); > > > > -- > > 2.45.2 > >
diff --git a/src/libcamera/device_enumerator_udev.cpp b/src/libcamera/device_enumerator_udev.cpp index 01c70b6daa82..c6c78651baa0 100644 --- a/src/libcamera/device_enumerator_udev.cpp +++ b/src/libcamera/device_enumerator_udev.cpp @@ -332,6 +332,12 @@ int DeviceEnumeratorUdev::addV4L2Device(dev_t devnum) void DeviceEnumeratorUdev::udevNotify() { struct udev_device *dev = udev_monitor_receive_device(monitor_); + if (!dev) { + LOG(DeviceEnumerator, Warning) + << "Ignoring notfication received without a device"; + return; + } + std::string_view action(udev_device_get_action(dev)); std::string_view deviceNode(udev_device_get_devnode(dev));
The udev_monitor_receive_device() can return NULL on an error as detailed in the man pages for the function. The udevNotify() handler in the DeviceEnumeratorUdev directly uses the return value of udev_monitor_receive_device() in successive calls to process the event without having first checked the udev_device. Ensure we identify, and handle events where the udev_device can not be returned successfully. Bug: https://bugs.libcamera.org/show_bug.cgi?id=230 Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> --- This is a proposed fix for the reported CI failure caught by Systemd runners. It's not clear if this failure was an injected failure, startup race or otherwise - but given the documentation of this function states it can return null ... it seems valid to null check before passing into other APIs especially when a trace is reported directly at this point. I don't presently have a way to replicate or reproduce the error either, but this patch itself has passed our ci tests (which isn't too unexpected as the change is quite minimal otherwise). src/libcamera/device_enumerator_udev.cpp | 6 ++++++ 1 file changed, 6 insertions(+)