Message ID | 20220602100017.22272-1-ecurtin@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series |
|
Related | show |
Hi Eric, Thank you for the patch. On Thu, Jun 02, 2022 at 11:00:17AM +0100, Eric Curtin via libcamera-devel wrote: > Existing code is hardcoded to card0. Since recent fedora upgrades, we > have noticed on more than one machine that card1 is present as the > lowest numbered device, could theoretically be higher. Out of curiosity, why is that ? > This technique > tries every file starting with card and continue only when we have > successfully opened one. These devices with card1 as the lowest device > were simply failing when they do not see a /dev/dri/card0 file present. > > Reported-by: Ian Mullins <imullins@redhat.com> > Signed-off-by: Eric Curtin <ecurtin@redhat.com> > > --- > Changes in v3: > - switch back to memcpy, strncpy causing false compiler issue > > Changes in v2: > - return if no valid card found, or directory could not be opened etc. > - memcpy to strncpy for safety > - only adjust the numerical bytes for each iteration of loop since > /dev/dri/card won't change > - initialize ret to zero > --- > src/cam/drm.cpp | 46 ++++++++++++++++++++++++++++++++++++---------- > 1 file changed, 36 insertions(+), 10 deletions(-) > > diff --git a/src/cam/drm.cpp b/src/cam/drm.cpp > index 42c5a3b1..7975a502 100644 > --- a/src/cam/drm.cpp > +++ b/src/cam/drm.cpp > @@ -8,6 +8,7 @@ > #include "drm.h" > > #include <algorithm> > +#include <dirent.h> > #include <errno.h> > #include <fcntl.h> > #include <iostream> > @@ -393,9 +394,13 @@ Device::~Device() > > int Device::init() > { > - constexpr size_t NODE_NAME_MAX = sizeof("/dev/dri/card255"); > - char name[NODE_NAME_MAX]; > - int ret; > + constexpr size_t DIR_NAME_MAX = sizeof("/dev/dri/"); > + constexpr size_t PRE_NODE_NAME_MAX = sizeof("card"); > + constexpr size_t POST_NODE_NAME_MAX = sizeof("255"); > + constexpr size_t NODE_NAME_MAX = > + DIR_NAME_MAX + PRE_NODE_NAME_MAX + POST_NODE_NAME_MAX - 2; > + char name[NODE_NAME_MAX] = "/dev/dri/"; This looks pretty error-prone compared to std::string, let's use the C++ API here instead. This isn't performance-sensitive code. std::filesystem could be nice too, to avoid the manual closedir(), but that's a lesser concern. > + int ret = 0; > > /* > * Open the first DRM/KMS device. The libdrm drmOpen*() functions > @@ -404,14 +409,35 @@ int Device::init() > * from drmOpen() is of no practical use as any modern system will > * handle that through udev or an equivalent component. > */ > - snprintf(name, sizeof(name), "/dev/dri/card%u", 0); > - fd_ = open(name, O_RDWR | O_CLOEXEC); > + DIR *folder = opendir(name); > + memcpy(name + DIR_NAME_MAX - 1, "card", PRE_NODE_NAME_MAX); > + if (folder) { You can turn this into a if (!folder) { ... } to lower the indentation below. > + for (struct dirent *res; (res = readdir(folder));) { > + if (!strncmp(res->d_name, "card", 4)) { Same thing here, if (strncmp(...)) continue; (or rather the equivalent after switching to std::string) > + memcpy(name + DIR_NAME_MAX + > + PRE_NODE_NAME_MAX - 2, > + res->d_name + PRE_NODE_NAME_MAX - 1, > + POST_NODE_NAME_MAX); > + fd_ = open(name, O_RDWR | O_CLOEXEC); > + if (fd_ < 0) { > + ret = -errno; > + std::cerr > + << "Failed to open DRM/KMS device " > + << name << ": " > + << strerror(-ret) << std::endl; > + continue; > + } > + > + break; > + } > + } > + > + closedir(folder); > + } > + > if (fd_ < 0) { > - ret = -errno; > - std::cerr > - << "Failed to open DRM/KMS device " << name << ": " > - << strerror(-ret) << std::endl; > - return ret; > + std::cerr << "Unable to open any DRM/KMS device" << std::endl; > + return ret ? ret : -ENOENT; I don't think it's very meaningful to pick the last error here, you can just return -ENOENT > } It could be interesting to try all available DRM devices to find the requested connector, as well as allowing selection of a particular device on the command line. That's an improvement candidate for the future, if a need arises. > > /*
On Fri, 3 Jun 2022 at 22:52, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > > Hi Eric, > > Thank you for the patch. > > On Thu, Jun 02, 2022 at 11:00:17AM +0100, Eric Curtin via libcamera-devel wrote: > > Existing code is hardcoded to card0. Since recent fedora upgrades, we > > have noticed on more than one machine that card1 is present as the > > lowest numbered device, could theoretically be higher. > > Out of curiosity, why is that ? I don't know, I didn't mind too much, because this patch fixed it and it seemed like the more correct way anyway. CC'ing Javier, just in case he knows, being a DRM guru. I've seen it on 2 machines I own that updated recently and Ian also saw it on his machine. > > > This technique > > tries every file starting with card and continue only when we have > > successfully opened one. These devices with card1 as the lowest device > > were simply failing when they do not see a /dev/dri/card0 file present. > > > > Reported-by: Ian Mullins <imullins@redhat.com> > > Signed-off-by: Eric Curtin <ecurtin@redhat.com> > > > > --- > > Changes in v3: > > - switch back to memcpy, strncpy causing false compiler issue > > > > Changes in v2: > > - return if no valid card found, or directory could not be opened etc. > > - memcpy to strncpy for safety > > - only adjust the numerical bytes for each iteration of loop since > > /dev/dri/card won't change > > - initialize ret to zero > > --- > > src/cam/drm.cpp | 46 ++++++++++++++++++++++++++++++++++++---------- > > 1 file changed, 36 insertions(+), 10 deletions(-) > > > > diff --git a/src/cam/drm.cpp b/src/cam/drm.cpp > > index 42c5a3b1..7975a502 100644 > > --- a/src/cam/drm.cpp > > +++ b/src/cam/drm.cpp > > @@ -8,6 +8,7 @@ > > #include "drm.h" > > > > #include <algorithm> > > +#include <dirent.h> > > #include <errno.h> > > #include <fcntl.h> > > #include <iostream> > > @@ -393,9 +394,13 @@ Device::~Device() > > > > int Device::init() > > { > > - constexpr size_t NODE_NAME_MAX = sizeof("/dev/dri/card255"); > > - char name[NODE_NAME_MAX]; > > - int ret; > > + constexpr size_t DIR_NAME_MAX = sizeof("/dev/dri/"); > > + constexpr size_t PRE_NODE_NAME_MAX = sizeof("card"); > > + constexpr size_t POST_NODE_NAME_MAX = sizeof("255"); > > + constexpr size_t NODE_NAME_MAX = > > + DIR_NAME_MAX + PRE_NODE_NAME_MAX + POST_NODE_NAME_MAX - 2; > > + char name[NODE_NAME_MAX] = "/dev/dri/"; > > This looks pretty error-prone compared to std::string, let's use the C++ > API here instead. This isn't performance-sensitive code. Will std::stringify it. > > std::filesystem could be nice too, to avoid the manual closedir(), but > that's a lesser concern. > > > + int ret = 0; > > > > /* > > * Open the first DRM/KMS device. The libdrm drmOpen*() functions > > @@ -404,14 +409,35 @@ int Device::init() > > * from drmOpen() is of no practical use as any modern system will > > * handle that through udev or an equivalent component. > > */ > > - snprintf(name, sizeof(name), "/dev/dri/card%u", 0); > > - fd_ = open(name, O_RDWR | O_CLOEXEC); > > + DIR *folder = opendir(name); > > + memcpy(name + DIR_NAME_MAX - 1, "card", PRE_NODE_NAME_MAX); > > + if (folder) { > > You can turn this into a > > if (!folder) { > ... > } > > to lower the indentation below. Ok > > > + for (struct dirent *res; (res = readdir(folder));) { > > + if (!strncmp(res->d_name, "card", 4)) { > > Same thing here, > > if (strncmp(...)) > continue; > > (or rather the equivalent after switching to std::string) > > > + memcpy(name + DIR_NAME_MAX + > > + PRE_NODE_NAME_MAX - 2, > > + res->d_name + PRE_NODE_NAME_MAX - 1, > > + POST_NODE_NAME_MAX); > > + fd_ = open(name, O_RDWR | O_CLOEXEC); > > + if (fd_ < 0) { > > + ret = -errno; > > + std::cerr > > + << "Failed to open DRM/KMS device " > > + << name << ": " > > + << strerror(-ret) << std::endl; > > + continue; > > + } > > + > > + break; > > + } > > + } > > + > > + closedir(folder); > > + } > > + > > if (fd_ < 0) { > > - ret = -errno; > > - std::cerr > > - << "Failed to open DRM/KMS device " << name << ": " > > - << strerror(-ret) << std::endl; > > - return ret; > > + std::cerr << "Unable to open any DRM/KMS device" << std::endl; > > + return ret ? ret : -ENOENT; > > I don't think it's very meaningful to pick the last error here, you can > just return -ENOENT Ok. > > > } > > It could be interesting to try all available DRM devices to find the > requested connector, as well as allowing selection of a particular > device on the command line. That's an improvement candidate for the > future, if a need arises. > > > > > /* > > -- > Regards, > > Laurent Pinchart >
On Sun, 5 Jun 2022 at 16:25, Eric Curtin <ecurtin@redhat.com> wrote: > > On Fri, 3 Jun 2022 at 22:52, Laurent Pinchart > <laurent.pinchart@ideasonboard.com> wrote: > > > > Hi Eric, > > > > Thank you for the patch. > > > > On Thu, Jun 02, 2022 at 11:00:17AM +0100, Eric Curtin via libcamera-devel wrote: > > > Existing code is hardcoded to card0. Since recent fedora upgrades, we > > > have noticed on more than one machine that card1 is present as the > > > lowest numbered device, could theoretically be higher. > > > > Out of curiosity, why is that ? > > I don't know, I didn't mind too much, because this patch fixed it and > it seemed like the more correct way anyway. CC'ing Javier, just in > case he knows, being a DRM guru. I've seen it on 2 machines I own that > updated recently and Ian also saw it on his machine. I got an answer on this, it is believed on modern kernels, simpledrm will register as card0, then the real hardware-accelerated driver will register as card1 which will cause simpledrm driver (or card0) to be kicked out. Regardless, this was seen as a userspace bug as it should not be assumed that if there is one card registered that it is card0, there are no guarantees here. Coincidentally I did try this kmssink with a simpledrm driver on raspberry pi and it didn't work, while SDL does (which is a kms sink in itself when ran without desktop environment running). Never got around to figuring out why. > > > > > > This technique > > > tries every file starting with card and continue only when we have > > > successfully opened one. These devices with card1 as the lowest device > > > were simply failing when they do not see a /dev/dri/card0 file present. > > > > > > Reported-by: Ian Mullins <imullins@redhat.com> > > > Signed-off-by: Eric Curtin <ecurtin@redhat.com> > > > > > > --- > > > Changes in v3: > > > - switch back to memcpy, strncpy causing false compiler issue > > > > > > Changes in v2: > > > - return if no valid card found, or directory could not be opened etc. > > > - memcpy to strncpy for safety > > > - only adjust the numerical bytes for each iteration of loop since > > > /dev/dri/card won't change > > > - initialize ret to zero > > > --- > > > src/cam/drm.cpp | 46 ++++++++++++++++++++++++++++++++++++---------- > > > 1 file changed, 36 insertions(+), 10 deletions(-) > > > > > > diff --git a/src/cam/drm.cpp b/src/cam/drm.cpp > > > index 42c5a3b1..7975a502 100644 > > > --- a/src/cam/drm.cpp > > > +++ b/src/cam/drm.cpp > > > @@ -8,6 +8,7 @@ > > > #include "drm.h" > > > > > > #include <algorithm> > > > +#include <dirent.h> > > > #include <errno.h> > > > #include <fcntl.h> > > > #include <iostream> > > > @@ -393,9 +394,13 @@ Device::~Device() > > > > > > int Device::init() > > > { > > > - constexpr size_t NODE_NAME_MAX = sizeof("/dev/dri/card255"); > > > - char name[NODE_NAME_MAX]; > > > - int ret; > > > + constexpr size_t DIR_NAME_MAX = sizeof("/dev/dri/"); > > > + constexpr size_t PRE_NODE_NAME_MAX = sizeof("card"); > > > + constexpr size_t POST_NODE_NAME_MAX = sizeof("255"); > > > + constexpr size_t NODE_NAME_MAX = > > > + DIR_NAME_MAX + PRE_NODE_NAME_MAX + POST_NODE_NAME_MAX - 2; > > > + char name[NODE_NAME_MAX] = "/dev/dri/"; > > > > This looks pretty error-prone compared to std::string, let's use the C++ > > API here instead. This isn't performance-sensitive code. > > Will std::stringify it. > > > > > std::filesystem could be nice too, to avoid the manual closedir(), but > > that's a lesser concern. > > > > > + int ret = 0; > > > > > > /* > > > * Open the first DRM/KMS device. The libdrm drmOpen*() functions > > > @@ -404,14 +409,35 @@ int Device::init() > > > * from drmOpen() is of no practical use as any modern system will > > > * handle that through udev or an equivalent component. > > > */ > > > - snprintf(name, sizeof(name), "/dev/dri/card%u", 0); > > > - fd_ = open(name, O_RDWR | O_CLOEXEC); > > > + DIR *folder = opendir(name); > > > + memcpy(name + DIR_NAME_MAX - 1, "card", PRE_NODE_NAME_MAX); > > > + if (folder) { > > > > You can turn this into a > > > > if (!folder) { > > ... > > } > > > > to lower the indentation below. > > Ok > > > > > > + for (struct dirent *res; (res = readdir(folder));) { > > > + if (!strncmp(res->d_name, "card", 4)) { > > > > Same thing here, > > > > if (strncmp(...)) > > continue; > > > > (or rather the equivalent after switching to std::string) > > > > > + memcpy(name + DIR_NAME_MAX + > > > + PRE_NODE_NAME_MAX - 2, > > > + res->d_name + PRE_NODE_NAME_MAX - 1, > > > + POST_NODE_NAME_MAX); > > > + fd_ = open(name, O_RDWR | O_CLOEXEC); > > > + if (fd_ < 0) { > > > + ret = -errno; > > > + std::cerr > > > + << "Failed to open DRM/KMS device " > > > + << name << ": " > > > + << strerror(-ret) << std::endl; > > > + continue; > > > + } > > > + > > > + break; > > > + } > > > + } > > > + > > > + closedir(folder); > > > + } > > > + > > > if (fd_ < 0) { > > > - ret = -errno; > > > - std::cerr > > > - << "Failed to open DRM/KMS device " << name << ": " > > > - << strerror(-ret) << std::endl; > > > - return ret; > > > + std::cerr << "Unable to open any DRM/KMS device" << std::endl; > > > + return ret ? ret : -ENOENT; > > > > I don't think it's very meaningful to pick the last error here, you can > > just return -ENOENT > > Ok. > > > > > > } > > > > It could be interesting to try all available DRM devices to find the > > requested connector, as well as allowing selection of a particular > > device on the command line. That's an improvement candidate for the > > future, if a need arises. > > > > > > > > /* > > > > -- > > Regards, > > > > Laurent Pinchart > >
Hello, On 6/9/22 18:13, Eric Curtin wrote: [snip] >>>> Existing code is hardcoded to card0. Since recent fedora upgrades, we >>>> have noticed on more than one machine that card1 is present as the >>>> lowest numbered device, could theoretically be higher. >>> >>> Out of curiosity, why is that ? >> >> I don't know, I didn't mind too much, because this patch fixed it and >> it seemed like the more correct way anyway. CC'ing Javier, just in >> case he knows, being a DRM guru. I've seen it on 2 machines I own that >> updated recently and Ian also saw it on his machine. > > I got an answer on this, it is believed on modern kernels, simpledrm > will register as card0, then the real hardware-accelerated driver will > register as card1 which will cause simpledrm driver (or card0) to be > kicked out. Regardless, this was seen as a userspace bug as it should > not be assumed that if there is one card registered that it is card0, > there are no guarantees here. > To elaborate on this, before F36 we used the {vesa,efi,simple}fb drivers to have early video output but since F36 those have been replaced by simpledrm that as any DRM driver registers a DRI device (as card0 since is the first driver probed due being built-in the kernel). Later, a real DRM driver is probed and registers its own DRI device. These should remove all conflicting framebuffers when probed, but not all drivers do this at the same time. For example, the i915 driver first calls to devm_drm_dev_alloc() function [0] and then calls to drm_aperture_remove_conflicting_framebuffers() [1]. There's a small window then where one has two DRI devices between the time when i915 registers its device and the DRM core kicks out the simpledrm driver. Other DRM drivers call drm_aperture_remove_conflicting_framebuffers() at the beginning of their probe function so there will only be a single DRI device at any point. I asked the DRM folks if they would accept a patch to move the conflicting FB removal earlier but they said that's a user-space bug as Eric mentioned, since is a fragile assumption to make anyways. [0]: https://docs.kernel.org/gpu/drm-internals.html?highlight=devm_drm_dev_alloc#c.devm_drm_dev_alloc [1]: https://docs.kernel.org/gpu/drm-internals.html?highlight=remove_conflicting_framebuffers#c.drm_aperture_remove_conflicting_framebuffers
Hi Javier and Eric, On Thu, Jun 09, 2022 at 07:59:58PM +0200, Javier Martinez Canillas wrote: > On 6/9/22 18:13, Eric Curtin wrote: > > [snip] > > >>>> Existing code is hardcoded to card0. Since recent fedora upgrades, we > >>>> have noticed on more than one machine that card1 is present as the > >>>> lowest numbered device, could theoretically be higher. > >>> > >>> Out of curiosity, why is that ? > >> > >> I don't know, I didn't mind too much, because this patch fixed it and > >> it seemed like the more correct way anyway. CC'ing Javier, just in > >> case he knows, being a DRM guru. I've seen it on 2 machines I own that > >> updated recently and Ian also saw it on his machine. > > > > I got an answer on this, it is believed on modern kernels, simpledrm > > will register as card0, then the real hardware-accelerated driver will > > register as card1 which will cause simpledrm driver (or card0) to be > > kicked out. Regardless, this was seen as a userspace bug as it should > > not be assumed that if there is one card registered that it is card0, > > there are no guarantees here. > > > > To elaborate on this, before F36 we used the {vesa,efi,simple}fb drivers to > have early video output but since F36 those have been replaced by simpledrm > that as any DRM driver registers a DRI device (as card0 since is the first > driver probed due being built-in the kernel). > > Later, a real DRM driver is probed and registers its own DRI device. These > should remove all conflicting framebuffers when probed, but not all drivers > do this at the same time. > > For example, the i915 driver first calls to devm_drm_dev_alloc() function [0] > and then calls to drm_aperture_remove_conflicting_framebuffers() [1]. There's > a small window then where one has two DRI devices between the time when i915 > registers its device and the DRM core kicks out the simpledrm driver. > > Other DRM drivers call drm_aperture_remove_conflicting_framebuffers() at the > beginning of their probe function so there will only be a single DRI device > at any point. > > I asked the DRM folks if they would accept a patch to move the conflicting FB > removal earlier but they said that's a user-space bug as Eric mentioned, since > is a fragile assumption to make anyways. Thank you for the information. This makes sense. > [0]: https://docs.kernel.org/gpu/drm-internals.html?highlight=devm_drm_dev_alloc#c.devm_drm_dev_alloc > [1]: https://docs.kernel.org/gpu/drm-internals.html?highlight=remove_conflicting_framebuffers#c.drm_aperture_remove_conflicting_framebuffers
diff --git a/src/cam/drm.cpp b/src/cam/drm.cpp index 42c5a3b1..7975a502 100644 --- a/src/cam/drm.cpp +++ b/src/cam/drm.cpp @@ -8,6 +8,7 @@ #include "drm.h" #include <algorithm> +#include <dirent.h> #include <errno.h> #include <fcntl.h> #include <iostream> @@ -393,9 +394,13 @@ Device::~Device() int Device::init() { - constexpr size_t NODE_NAME_MAX = sizeof("/dev/dri/card255"); - char name[NODE_NAME_MAX]; - int ret; + constexpr size_t DIR_NAME_MAX = sizeof("/dev/dri/"); + constexpr size_t PRE_NODE_NAME_MAX = sizeof("card"); + constexpr size_t POST_NODE_NAME_MAX = sizeof("255"); + constexpr size_t NODE_NAME_MAX = + DIR_NAME_MAX + PRE_NODE_NAME_MAX + POST_NODE_NAME_MAX - 2; + char name[NODE_NAME_MAX] = "/dev/dri/"; + int ret = 0; /* * Open the first DRM/KMS device. The libdrm drmOpen*() functions @@ -404,14 +409,35 @@ int Device::init() * from drmOpen() is of no practical use as any modern system will * handle that through udev or an equivalent component. */ - snprintf(name, sizeof(name), "/dev/dri/card%u", 0); - fd_ = open(name, O_RDWR | O_CLOEXEC); + DIR *folder = opendir(name); + memcpy(name + DIR_NAME_MAX - 1, "card", PRE_NODE_NAME_MAX); + if (folder) { + for (struct dirent *res; (res = readdir(folder));) { + if (!strncmp(res->d_name, "card", 4)) { + memcpy(name + DIR_NAME_MAX + + PRE_NODE_NAME_MAX - 2, + res->d_name + PRE_NODE_NAME_MAX - 1, + POST_NODE_NAME_MAX); + fd_ = open(name, O_RDWR | O_CLOEXEC); + if (fd_ < 0) { + ret = -errno; + std::cerr + << "Failed to open DRM/KMS device " + << name << ": " + << strerror(-ret) << std::endl; + continue; + } + + break; + } + } + + closedir(folder); + } + if (fd_ < 0) { - ret = -errno; - std::cerr - << "Failed to open DRM/KMS device " << name << ": " - << strerror(-ret) << std::endl; - return ret; + std::cerr << "Unable to open any DRM/KMS device" << std::endl; + return ret ? ret : -ENOENT; } /*
Existing code is hardcoded to card0. Since recent fedora upgrades, we have noticed on more than one machine that card1 is present as the lowest numbered device, could theoretically be higher. This technique tries every file starting with card and continue only when we have successfully opened one. These devices with card1 as the lowest device were simply failing when they do not see a /dev/dri/card0 file present. Reported-by: Ian Mullins <imullins@redhat.com> Signed-off-by: Eric Curtin <ecurtin@redhat.com> --- Changes in v3: - switch back to memcpy, strncpy causing false compiler issue Changes in v2: - return if no valid card found, or directory could not be opened etc. - memcpy to strncpy for safety - only adjust the numerical bytes for each iteration of loop since /dev/dri/card won't change - initialize ret to zero --- src/cam/drm.cpp | 46 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 36 insertions(+), 10 deletions(-)