[{"id":34980,"web_url":"https://patchwork.libcamera.org/comment/34980/","msgid":"<175312132883.50296.12967542926126891436@ping.linuxembedded.co.uk>","date":"2025-07-21T18:08:48","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":4,"url":"https://patchwork.libcamera.org/api/people/4/","name":"Kieran Bingham","email":"kieran.bingham@ideasonboard.com"},"content":"Hi Julien,\n\nI've just seen this in my backlog.\n\nIt's ... a substantial change, so it's hard to digest. Is there anything\nwe can break down here?\n\nAre you running this change at NXP ? \n\nQuoting Julien Vuillaumier (2024-12-20 14:55:56)\n> Unix-socket based IPC sometimes times out or hangs, typically\n> when multiple camera are stopped simulaneously. That specific case\n> triggers the concurrent sending by each pipeline handler instance\n> of a synchronous stop() message to its peer IPA process.\n> \n> There is a dedicated IPC socket per camera. Sockets payload receipt\n> signals all run in the camera manager thread.\n> To send a synchronous message to IPA, pipeline invokes from camera\n> thread IPCPipeUnixSocket::sendSync(). This sends the message then blocks\n> busy waiting for the peer acknowledgment. Such busy wait is done by\n> blocking on event loop calling dispatchEvent(), until the ack condition\n> is detected.\n> \n> One issue is that the socket receive slot readyRead() wakes up the\n> blocked thread via libcamera::Message receipt. Even though such message\n> resumes processEvents(), it may reblock immediately because readyRead()\n> does not interrupt() explictly the dispatcher.\n> Most of the time, an other pending event for the thread unblocks the\n> event dispatcher and the ack condition is detected - in worst case the 2\n> sec timeout kicks in. 
Once unblocked, the dispatcher lets the message\n> acknowledgment be detected and the sendSync() completes.\n> \n> The other issue is that in the case of concurrent synchronous IPC messages\n> sent by multiple pipeline handlers, there is a possible recursion\n> of sendSync() / processEvents() nested in the camera thread stack. As\n> commented in the source, that is a dangerous construct that can lead to\n> a hang.\n\nI'm not sure I fully understand - do you have multiple cameras using a\nsingle IPA - or multiple cameras using multiple IPAs using a single\npipeline handler perhaps?\n\n\n\n> The reason is that the last synchronous message sent is the deepest in\n> the stack. It is also the one whose acknowledgment is being busy waited.\n> However, other pending synchronous messages may have been initiated before\n> and are higher in the stack. If they time out, the condition is not\n> detected because of the stack recursion, as the thread is busy waiting\n> for the last message to be acknowledged.\n> \n> This change implements a safer mechanism to handle the synchronous\n> message sending, similar to the one used for non-isolated IPA. 
The\n> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n> receive signal in a dedicated thread.\n> Doing so, the sending thread, when emitting a synchronous message, can be\n> blocked without the event dispatcher's processEvents() usage, which avoids\n> the risky stack recursion.\n> \n> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n> \n\nPaul - you were the author of the commit referenced above - could you\ncheck through this patch please?\n\n> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n> ---\n>  .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n>  src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n>  2 files changed, 178 insertions(+), 77 deletions(-)\n> \n> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> index 8c972613..280639d5 100644\n> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> @@ -16,6 +16,7 @@\n>  namespace libcamera {\n>  \n>  class Process;\n> +class IPCUnixSocketWrapper;\n>  \n>  class IPCPipeUnixSocket : public IPCPipe\n>  {\n> @@ -29,18 +30,8 @@ public:\n>         int sendAsync(const IPCMessage &data) override;\n>  \n> private:\n> -       struct CallData {\n> -               IPCUnixSocket::Payload *response;\n> -               bool done;\n> -       };\n> -\n> -       void readyRead();\n> -       int call(const IPCUnixSocket::Payload &message,\n> -                IPCUnixSocket::Payload *response, uint32_t seq);\n> -\n>         std::unique_ptr<Process> proc_;\n> -       std::unique_ptr<IPCUnixSocket> socket_;\n> -       std::map<uint32_t, CallData> callData_;\n> +       std::unique_ptr<IPCUnixSocketWrapper> socketWrap_;\n>  };\n>  \n>  } /* namespace libcamera */\n> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp\n> index 668ec73b..eb5408d4 100644\n> --- 
a/src/libcamera/ipc_pipe_unixsocket.cpp\n> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n> @@ -9,10 +9,9 @@\n>  \n>  #include <vector>\n>  \n> -#include <libcamera/base/event_dispatcher.h>\n>  #include <libcamera/base/log.h>\n> +#include <libcamera/base/mutex.h>\n>  #include <libcamera/base/thread.h>\n> -#include <libcamera/base/timer.h>\n>  \n>  #include \"libcamera/internal/ipc_pipe.h\"\n>  #include \"libcamera/internal/ipc_unixsocket.h\"\n> @@ -24,67 +23,161 @@ namespace libcamera {\n>  \n>  LOG_DECLARE_CATEGORY(IPCPipe)\n>  \n> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> -                                    const char *ipaProxyWorkerPath)\n> -       : IPCPipe()\n> +class IPCUnixSocketWrapper : Thread\n\nI'm a bit wary of the concept of 'wrapping' an IPCUnixSocket. Doesn't\nthat simply mean IPCUnixSocket should be re-implemented ? Why do we\nwrap it ?\n\n(Sorry - big complicated change - trying to understand as I go through\n...)\n\n>  {\n> -       std::vector<int> fds;\n> -       std::vector<std::string> args;\n> -       args.push_back(ipaModulePath);\n> +public:\n> +       IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n> +               : recv_(recv), ready_(false), sendSyncPending_(false),\n> +                 sendSyncCookie_(0)\n> +       {\n> +               start();\n> +       }\n>  \n> -       socket_ = std::make_unique<IPCUnixSocket>();\n> -       UniqueFD fd = socket_->create();\n> -       if (!fd.isValid()) {\n> -               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> -               return;\n> +       ~IPCUnixSocketWrapper()\n> +       {\n> +               exit();\n> +               wait();\n>         }\n> -       socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n> -       args.push_back(std::to_string(fd.get()));\n> -       fds.push_back(fd.get());\n>  \n> -       proc_ = std::make_unique<Process>();\n> -       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n> -       if (ret) 
{\n> -               LOG(IPCPipe, Error)\n> -                       << \"Failed to start proxy worker process\";\n> -               return;\n> +       void run() override\n> +       {\n> +               /*\n> +                * IPC socket construction and connection to its readyRead\n> +                * signal has to be done from the IPC thread so that the\n> +                * relevant Object instances (EventNotifier, slot) are bound to\n> +                * its context.\n> +                */\n> +               init();\n> +               exec();\n> +               deinit();\n>         }\n>  \n> -       connected_ = true;\n> -}\n> +       int fd() { return fd_.get(); }\n> +       int sendSync(const IPCMessage &in, IPCMessage *out);\n> +       int sendAsync(const IPCMessage &data);\n> +       bool waitReady();\n>  \n> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> -{\n> -}\n> +private:\n> +       void init();\n> +       void deinit();\n> +       void readyRead();\n>  \n> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +       UniqueFD fd_;\n> +       Signal<const IPCMessage &> *recv_;\n> +       ConditionVariable cv_;\n> +       Mutex mutex_;\n> +       bool ready_;\n> +       bool sendSyncPending_;\n> +       uint32_t sendSyncCookie_;\n> +       IPCUnixSocket::Payload *sendSyncResponse_;\n> +\n> +       /* Socket shall be constructed and destructed from IPC thread context */\n> +       std::unique_ptr<IPCUnixSocket> socket_;\n> +};\n> +\n> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)\n>  {\n> +       int ret;\n>         IPCUnixSocket::Payload response;\n>  \n> -       int ret = call(in.payload(), &response, in.header().cookie);\n> +       mutex_.lock();\n> +       ASSERT(!sendSyncPending_);\n> +       sendSyncPending_ = true;\n> +       sendSyncCookie_ = in.header().cookie;\n> +       sendSyncResponse_ = &response;\n> +       mutex_.unlock();\n> +\n> +       ret = socket_->send(in.payload());\n>         if 
(ret) {\n> -               LOG(IPCPipe, Error) << \"Failed to call sync\";\n> -               return ret;\n> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +               goto cleanup;\n> +       }\n> +\n> +       bool complete;\n> +       {\n> +               MutexLocker locker(mutex_);\n> +               auto syncComplete = ([&]() {\n> +                       return sendSyncPending_ == false;\n> +               });\n> +               complete = cv_.wait_for(locker, 1000ms, syncComplete);\n> +       }\n> +\n> +       if (!complete) {\n> +               LOG(IPCPipe, Error) << \"Timeout sending sync message\";\n> +               ret = -ETIMEDOUT;\n> +               goto cleanup;\n>         }\n>  \n>         if (out)\n>                 *out = IPCMessage(response);\n>  \n>         return 0;\n> +\n> +cleanup:\n> +       mutex_.lock();\n> +       sendSyncPending_ = false;\n> +       mutex_.unlock();\n> +\n> +       return ret;\n>  }\n>  \n> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n>  {\n> -       int ret = socket_->send(data.payload());\n> -       if (ret) {\n> -               LOG(IPCPipe, Error) << \"Failed to call async\";\n> -               return ret;\n> +       int ret;\n> +       ret = socket_->send(data.payload());\n> +       if (ret)\n> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +       return ret;\n> +}\n> +\n> +bool IPCUnixSocketWrapper::waitReady()\n> +{\n> +       bool ready;\n> +       {\n> +               MutexLocker locker(mutex_);\n> +               auto isReady = ([&]() {\n> +                       return ready_;\n> +               });\n> +               ready = cv_.wait_for(locker, 1000ms, isReady);\n>         }\n>  \n> -       return 0;\n> +       return ready;\n> +}\n> +\n> +void IPCUnixSocketWrapper::init()\n> +{\n> +       /* Init is to be done from the IPC thread context */\n> +       
ASSERT(Thread::current() == this);\n> +\n> +       socket_ = std::make_unique<IPCUnixSocket>();\n> +       fd_ = socket_->create();\n> +       if (!fd_.isValid()) {\n> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> +               return;\n> +       }\n> +\n> +       socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n> +\n> +       mutex_.lock();\n> +       ready_ = true;\n> +       mutex_.unlock();\n> +       cv_.notify_one();\n>  }\n>  \n> -void IPCPipeUnixSocket::readyRead()\n> +void IPCUnixSocketWrapper::deinit()\n> +{\n> +       /* Deinit is to be done from the IPC thread context */\n> +       ASSERT(Thread::current() == this);\n> +\n> +       socket_->readyRead.disconnect(this);\n> +       socket_.reset();\n> +\n> +       mutex_.lock();\n> +       ready_ = false;\n> +       mutex_.unlock();\n> +}\n> +\n> +void IPCUnixSocketWrapper::readyRead()\n>  {\n>         IPCUnixSocket::Payload payload;\n>         int ret = socket_->receive(&payload);\n> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n>                 return;\n>         }\n>  \n> -       /* \\todo Use span to avoid the double copy when callData is found. 
*/\n\nIs this todo handled by the patch or made redundant by the patch ?\n\n>         if (payload.data.size() < sizeof(IPCMessage::Header)) {\n>                 LOG(IPCPipe, Error) << \"Not enough data received\";\n>                 return;\n>         }\n>  \n> -       IPCMessage ipcMessage(payload);\n> +       const IPCMessage::Header *header =\n> +               reinterpret_cast<IPCMessage::Header *>(payload.data.data());\n> +       bool syncComplete = false;\n> +       mutex_.lock();\n> +       if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n> +               syncComplete = true;\n> +               sendSyncPending_ = false;\n> +               *sendSyncResponse_ = std::move(payload);\n> +       }\n> +       mutex_.unlock();\n>  \n> -       auto callData = callData_.find(ipcMessage.header().cookie);\n> -       if (callData != callData_.end()) {\n> -               *callData->second.response = std::move(payload);\n> -               callData->second.done = true;\n> +       if (syncComplete) {\n> +               cv_.notify_one();\n>                 return;\n>         }\n>  \n>         /* Received unexpected data, this means it's a call from the IPA. 
*/\n> -       recv.emit(ipcMessage);\n> +       IPCMessage ipcMessage(payload);\n> +       recv_->emit(ipcMessage);\n>  }\n>  \n> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n> -                           IPCUnixSocket::Payload *response, uint32_t cookie)\n> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> +                                    const char *ipaProxyWorkerPath)\n> +       : IPCPipe()\n>  {\n> -       Timer timeout;\n> -       int ret;\n> +       socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n> +       if (!socketWrap_->waitReady()) {\n> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> +               return;\n> +       }\n> +       int fd = socketWrap_->fd();\n>  \n> -       const auto result = callData_.insert({ cookie, { response, false } });\n> -       const auto &iter = result.first;\n> +       std::vector<int> fds;\n> +       std::vector<std::string> args;\n> +       args.push_back(ipaModulePath);\n> +       args.push_back(std::to_string(fd));\n> +       fds.push_back(fd);\n>  \n> -       ret = socket_->send(message);\n> +       proc_ = std::make_unique<Process>();\n> +       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>         if (ret) {\n> -               callData_.erase(iter);\n> -               return ret;\n> +               LOG(IPCPipe, Error)\n> +                       << \"Failed to start proxy worker process\";\n> +               return;\n>         }\n>  \n> -       /* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n> -       timeout.start(2000ms);\n> -       while (!iter->second.done) {\n> -               if (!timeout.isRunning()) {\n> -                       LOG(IPCPipe, Error) << \"Call timeout!\";\n> -                       callData_.erase(iter);\n> -                       return -ETIMEDOUT;\n> -               }\n> +       connected_ = true;\n> +}\n>  \n> -               Thread::current()->eventDispatcher()->processEvents();\n> -   
    }\n> +IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> +{\n> +}\n>  \n> -       callData_.erase(iter);\n> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +{\n> +       return socketWrap_->sendSync(in, out);\n> +}\n>  \n> -       return 0;\n> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +{\n> +       return socketWrap_->sendAsync(data);\n>  }\n>  \n>  } /* namespace libcamera */\n> -- \n> 2.34.1\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 38F09BDCC1\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon, 21 Jul 2025 18:08:54 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 44A0F68FF7;\n\tMon, 21 Jul 2025 20:08:53 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id A25E968FB1\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon, 21 Jul 2025 20:08:51 +0200 (CEST)","from pendragon.ideasonboard.com\n\t(cpc89244-aztw30-2-0-cust6594.18-1.cable.virginm.net [86.31.185.195])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 78DE222D2;\n\tMon, 21 Jul 2025 20:08:14 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"BokjnNjU\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; 
t=1753121294;\n\tbh=MKXyzTW4Z1eGUOeQ/OBYdi6aZ0AOhn6suqvsVzJzuaY=;\n\th=In-Reply-To:References:Subject:From:Cc:To:Date:From;\n\tb=BokjnNjU80DEVxcFEeqFbmyNOsU7KCZy55/KQ0BQJ+x+TwSSZYRAnPPMQOiR4bwae\n\tzo7Z+tB51z/JGxIIIfF0GedwYGoMSFH2migWpx+h7p6mG/RqAN5mNyge3Gj0dJyAV3\n\tLtX5pIwoL0TTOguhNGTY3YyCQC6TiNU31bb9Vw+Y=","Content-Type":"text/plain; charset=\"utf-8\"","MIME-Version":"1.0","Content-Transfer-Encoding":"quoted-printable","In-Reply-To":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","From":"Kieran Bingham <kieran.bingham@ideasonboard.com>","Cc":"Julien Vuillaumier <julien.vuillaumier@nxp.com>","To":"Julien Vuillaumier <julien.vuillaumier@nxp.com>,\n\tlibcamera-devel@lists.libcamera.org,\n\tPaul Elder <paul.elder@ideasonboard.com>, ","Date":"Mon, 21 Jul 2025 19:08:48 +0100","Message-ID":"<175312132883.50296.12967542926126891436@ping.linuxembedded.co.uk>","User-Agent":"alot/0.9.1","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" 
<libcamera-devel-bounces@lists.libcamera.org>"}},{"id":35024,"web_url":"https://patchwork.libcamera.org/comment/35024/","msgid":"<2ea85c97-51f1-402e-933b-93d142c23023@nxp.com>","date":"2025-07-22T14:12:27","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":190,"url":"https://patchwork.libcamera.org/api/people/190/","name":"Julien Vuillaumier","email":"julien.vuillaumier@nxp.com"},"content":"Hi Kieran,\n\nOn 21/07/2025 20:08, Kieran Bingham wrote:\n> Hi Julien,\n> \n> I've just seen this in my backlog.\n> \n> It's ... a substantial change, so it's hard to digest. Is there anything\n> we can break down here?\n\nThis change introduces the IPCUnixSocketWrapper class to move a part of \nthe existing IPCPipeUnixSocket class implementation, the IPC socket \nreceive path, in a dedicated thread instead of running it in the camera \nmanager thread. Purpose is to be able to properly block the camera \nmanager thread execution when sending a synchronous message over IPC. \nThe camera manager thread is later resumed by this new thread that \ndetects the completion of the sync message sending.\n\nExisting implementation had to be redistributed between the two classes.\nThe change may look substantial but I don't see how to break it down :/\n\n> \n> Are you running this change at NXP ?\n\nThis change is present in the NXP BSP.\nIt was introduced because of IPC hangs experienced during multi-cameras \noperation, with a third-party IPA running in isolated mode.\n\nRoot cause for that issue is a bit intricate. Background is given in the \ncommit message - message length was limited to keep it readable though.\n\n> \n> Quoting Julien Vuillaumier (2024-12-20 14:55:56)\n>> Unix-socket based IPC sometimes times out or hangs, typically\n>> when multiple camera are stopped simulaneously. 
That specific case\n>> triggers the concurrent sending by each pipeline handler instance\n>> of a synchronous stop() message to its peer IPA process.\n>>\n>> There is a dedicated IPC socket per camera. The sockets' payload receipt\n>> signals all run in the camera manager thread.\n>> To send a synchronous message to the IPA, the pipeline handler invokes\n>> IPCPipeUnixSocket::sendSync() from the camera thread. This sends the\n>> message then blocks, busy waiting for the peer acknowledgment. Such busy\n>> wait is done by blocking on the event loop, calling dispatchEvent(),\n>> until the ack condition is detected.\n>>\n>> One issue is that the socket receive slot readyRead() wakes up the\n>> blocked thread via libcamera::Message receipt. Even though such a message\n>> resumes processEvents(), it may reblock immediately because readyRead()\n>> does not explicitly interrupt() the dispatcher.\n>> Most of the time, another pending event for the thread unblocks the\n>> event dispatcher and the ack condition is detected - in the worst case\n>> the 2 sec timeout kicks in. Once unblocked, the dispatcher lets the\n>> message acknowledgment be detected and the sendSync() completes.\n>>\n>> The other issue is that in the case of concurrent synchronous IPC messages\n>> sent by multiple pipeline handlers, there is a possible recursion\n>> of sendSync() / processEvents() nested in the camera thread stack. As\n>> commented in the source, that is a dangerous construct that can lead to\n>> a hang.\n> \n> I'm not sure I fully understand - do you have multiple cameras using a\n> single IPA - or multiple cameras using multiple IPAs using a single\n> pipeline handler perhaps?\n\nThe wording is not very clear indeed, my apologies for this.\n\nThis issue can be seen with a pipeline handler \ninstance in charge of multiple cameras (same libcamera process/thread). \nEach camera is interfacing with its own instance of the same IPA. 
Each \nIPA instance is running in isolated mode (third-party control algorithms).\nThe typical occurrence of the hang is when the application stops all the \ncameras simultaneously: the pipeline handler, from the camera thread \ncontext, has to send synchronous IPC stop messages to each IPA instance \nand concurrently handle the completion of those messages.\n\n> \n>> The reason is that the last synchronous message sent is the deepest in\n>> the stack. It is also the one whose acknowledgment is being busy waited.\n>> However, other pending synchronous messages may have been initiated before\n>> and are higher in the stack. If they time out, the condition is not\n>> detected because of the stack recursion, as the thread is busy waiting\n>> for the last message to be acknowledged.\n>>\n>> This change implements a safer mechanism to handle the synchronous\n>> message sending, similar to the one used for non-isolated IPA. The\n>> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n>> receive signal in a dedicated thread.\n>> Doing so, the sending thread, when emitting a synchronous message, can be\n>> blocked without the event dispatcher's processEvents() usage, which avoids\n>> the risky stack recursion.\n>>\n>> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n>>\n> \n> Paul - you were the author of the commit referenced above - could you\n> check through this patch please?\n> \n\nLet me know if there's something I can do to help with the review of \nthis change.\n\nThanks,\nJulien","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 79678BDCC1\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 22 Jul 2025 14:12:33 +0000 (UTC)","from lancelot.ideasonboard.com 
(localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 84E2D69036;\n\tTue, 22 Jul 2025 16:12:32 +0200 (CEST)"],"Message-ID":"<2ea85c97-51f1-402e-933b-93d142c23023@nxp.com>","Date":"Tue, 22 Jul 2025 16:12:27 +0200","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","To":"Kieran Bingham <kieran.bingham@ideasonboard.com>,\n\tlibcamera-devel@lists.libcamera.org,\n\tPaul Elder <paul.elder@ideasonboard.com>","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>\n\t<175312132883.50296.12967542926126891436@ping.linuxembedded.co.uk>","Content-Language":"en-US","From":"Julien Vuillaumier <julien.vuillaumier@nxp.com>","In-Reply-To":"<175312132883.50296.12967542926126891436@ping.linuxembedded.co.uk>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","MIME-Version":"1.0","X-OriginatorOrg":"nxp.com","X-MS-Exchange-CrossTenant-OriginalArrivalTime":"22 Jul 2025 
"}},{"id":35570,"web_url":"https://patchwork.libcamera.org/comment/35570/","msgid":"<175620412034.607151.628561646126579934@neptunite.rasen.tech>","date":"2025-08-26T10:28:40","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":17,"url":"https://patchwork.libcamera.org/api/people/17/","name":"Paul Elder","email":"paul.elder@ideasonboard.com"},"content":"Hi Julien,\n\nThanks for the patch.\n\nQuoting Julien Vuillaumier (2024-12-20 23:55:56)\n> Unix-socket based IPC sometimes times out or hangs, typically\n> when multiple cameras are stopped simultaneously. 
That specific case\n> triggers the concurrent sending by each pipeline handler instance\n> of a synchronous stop() message to its peer IPA process.\n> \n> There is a dedicated IPC socket per camera. The sockets' payload receipt\n> signals all run in the camera manager thread.\n> To send a synchronous message to the IPA, the pipeline handler invokes\n> IPCPipeUnixSocket::sendSync() from the camera thread. This sends the\n> message then blocks, busy waiting for the peer acknowledgment. Such busy\n> wait is done by blocking on the event loop, calling dispatchEvent(),\n> until the ack condition is detected.\n> \n> One issue is that the socket receive slot readyRead() wakes up the\n> blocked thread via libcamera::Message receipt. Even though such a message\n> resumes processEvents(), it may reblock immediately because readyRead()\n> does not explicitly interrupt() the dispatcher.\n> Most of the time, another pending event for the thread unblocks the\n> event dispatcher and the ack condition is detected - in the worst case\n> the 2 sec timeout kicks in. Once unblocked, the dispatcher lets the\n> message acknowledgment be detected and the sendSync() completes.\n> \n> The other issue is that in case of concurrent synchronous IPC messages\n> sent by multiple pipeline handlers, there is a possible recursion\n> of sendSync() / processEvents() nested in the camera thread stack. As\n> commented in the source, that is a dangerous construct that can lead to\n> a hang.\n> The reason is that the last synchronous message sent is the deepest in\n> the stack. It is also the one whose acknowledgment is being busy waited.\n> However, other pending synchronous messages may have been initiated\n> before and are higher up in the stack. 
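[Editor's note] The nested busy-wait described above can be modelled with a few lines of standalone C++. This is an illustrative sketch only, not libcamera code: processEvents(), sendSync() and nestingDemo() here are stand-ins for the real dispatcher and IPC calls.

```cpp
// Sketch of the recursion hazard: each sendSync() busy-waits by pumping
// the event loop, so a second synchronous send dispatched from inside
// that loop nests deeper on the same stack. The outer sendSync() cannot
// observe its own timeout until the inner one returns.
#include <algorithm>
#include <functional>
#include <queue>

std::queue<std::function<void()>> eventQueue; /* pending dispatcher events */
bool acked[2] = { false, false };             /* per-message ack flags */
int depth = 0, maxDepth = 0;                  /* stack nesting bookkeeping */

void processEvents()
{
	if (eventQueue.empty())
		return;
	auto event = std::move(eventQueue.front());
	eventQueue.pop();
	event();
}

void sendSync(int msg)
{
	++depth;
	maxDepth = std::max(maxDepth, depth);
	/* Busy wait: pumping events may re-enter sendSync() for another message. */
	while (!acked[msg])
		processEvents();
	--depth;
}

int nestingDemo()
{
	/* A second pipeline handler sends its own sync message... */
	eventQueue.push([] { sendSync(1); });
	/* ...whose ack arrives first, then the ack for message 0. */
	eventQueue.push([] { acked[1] = true; });
	eventQueue.push([] { acked[0] = true; });
	sendSync(0);
	return maxDepth; /* 2: sendSync(1) ran nested inside sendSync(0) */
}
```

Running nestingDemo() shows sendSync(1) completing inside sendSync(0)'s wait loop, which is exactly why an earlier, outer call cannot react to its own timeout while a later call is being busy-waited.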
If they time out, the condition is not\n> detected because of the stack recursion, as the thread is busy waiting\n> for the last message to be acknowledged.\n\nOk, I think I understand the issue.\n\n> \n> This change implements a safer mechanism to handle the synchronous\n> message sending, similar to the one used for the non-isolated IPA. The\n> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n> receive signal in a dedicated thread.\n> Doing so, the sending thread, when emitting a synchronous message, can be\n> blocked without the event dispatcher's processEvents() usage, which avoids\n> the risky stack recursion.\n> \n> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n> \n> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n> ---\n>  .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n>  src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n>  2 files changed, 178 insertions(+), 77 deletions(-)\n> \n> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> index 8c972613..280639d5 100644\n> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> @@ -16,6 +16,7 @@\n>  namespace libcamera {\n>  \n>  class Process;\n> +class IPCUnixSocketWrapper;\n>  \n>  class IPCPipeUnixSocket : public IPCPipe\n>  {\n> @@ -29,18 +30,8 @@ public:\n>         int sendAsync(const IPCMessage &data) override;\n>  \n>  private:\n> -       struct CallData {\n> -               IPCUnixSocket::Payload *response;\n> -               bool done;\n> -       };\n> -\n> -       void readyRead();\n> -       int call(const IPCUnixSocket::Payload &message,\n> -                IPCUnixSocket::Payload *response, uint32_t seq);\n> -\n>         std::unique_ptr<Process> proc_;\n> -       std::unique_ptr<IPCUnixSocket> socket_;\n> -       std::map<uint32_t, CallData> callData_;\n> +       
std::unique_ptr<IPCUnixSocketWrapper> socketWrap_;\n\nafaict you're essentially deleting the whole IPCPipe class and reimplementing\nit. What's wrong with changing it directly instead of adding a wrapper?\n\n>  };\n>  \n>  } /* namespace libcamera */\n> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp\n> index 668ec73b..eb5408d4 100644\n> --- a/src/libcamera/ipc_pipe_unixsocket.cpp\n> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n> @@ -9,10 +9,9 @@\n>  \n>  #include <vector>\n>  \n> -#include <libcamera/base/event_dispatcher.h>\n>  #include <libcamera/base/log.h>\n> +#include <libcamera/base/mutex.h>\n>  #include <libcamera/base/thread.h>\n> -#include <libcamera/base/timer.h>\n>  \n>  #include \"libcamera/internal/ipc_pipe.h\"\n>  #include \"libcamera/internal/ipc_unixsocket.h\"\n> @@ -24,67 +23,161 @@ namespace libcamera {\n>  \n>  LOG_DECLARE_CATEGORY(IPCPipe)\n>  \n> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> -                                    const char *ipaProxyWorkerPath)\n> -       : IPCPipe()\n> +class IPCUnixSocketWrapper : Thread\n>  {\n> -       std::vector<int> fds;\n> -       std::vector<std::string> args;\n> -       args.push_back(ipaModulePath);\n> +public:\n> +       IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n> +               : recv_(recv), ready_(false), sendSyncPending_(false),\n> +                 sendSyncCookie_(0)\n> +       {\n> +               start();\n> +       }\n>  \n> -       socket_ = std::make_unique<IPCUnixSocket>();\n> -       UniqueFD fd = socket_->create();\n> -       if (!fd.isValid()) {\n> -               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> -               return;\n> +       ~IPCUnixSocketWrapper()\n> +       {\n> +               exit();\n> +               wait();\n>         }\n> -       socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n> -       args.push_back(std::to_string(fd.get()));\n> -       
fds.push_back(fd.get());\n>  \n> -       proc_ = std::make_unique<Process>();\n> -       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n> -       if (ret) {\n> -               LOG(IPCPipe, Error)\n> -                       << \"Failed to start proxy worker process\";\n> -               return;\n> +       void run() override\n> +       {\n> +               /*\n> +                * IPC socket construction and connection to its readyRead\n> +                * signal has to be done from the IPC thread so that the\n> +                * relevant Object instances (EventNotifier, slot) are bound to\n> +                * its context.\n> +                */\n> +               init();\n> +               exec();\n> +               deinit();\n>         }\n>  \n> -       connected_ = true;\n> -}\n> +       int fd() { return fd_.get(); }\n> +       int sendSync(const IPCMessage &in, IPCMessage *out);\n> +       int sendAsync(const IPCMessage &data);\n> +       bool waitReady();\n>  \n> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> -{\n> -}\n> +private:\n> +       void init();\n> +       void deinit();\n> +       void readyRead();\n>  \n> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +       UniqueFD fd_;\n> +       Signal<const IPCMessage &> *recv_;\n> +       ConditionVariable cv_;\n> +       Mutex mutex_;\n\nSo the mutex protects everything from here...\n\n> +       bool ready_;\n> +       bool sendSyncPending_;\n> +       uint32_t sendSyncCookie_;\n> +       IPCUnixSocket::Payload *sendSyncResponse_;\n\n...to here\n\n> +\n> +       /* Socket shall be constructed and destructed from IPC thread context */\n> +       std::unique_ptr<IPCUnixSocket> socket_;\n> +};\n> +\n> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)\n>  {\n> +       int ret;\n>         IPCUnixSocket::Payload response;\n>  \n> -       int ret = call(in.payload(), &response, in.header().cookie);\n> +       mutex_.lock();\n> +       
ASSERT(!sendSyncPending_);\n> +       sendSyncPending_ = true;\n> +       sendSyncCookie_ = in.header().cookie;\n> +       sendSyncResponse_ = &response;\n> +       mutex_.unlock();\n> +\n> +       ret = socket_->send(in.payload());\n>         if (ret) {\n> -               LOG(IPCPipe, Error) << \"Failed to call sync\";\n> -               return ret;\n> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +               goto cleanup;\n> +       }\n> +\n> +       bool complete;\n> +       {\n> +               MutexLocker locker(mutex_);\n> +               auto syncComplete = ([&]() {\n> +                       return sendSyncPending_ == false;\n> +               });\n> +               complete = cv_.wait_for(locker, 1000ms, syncComplete);\n> +       }\n\nOk, I think I like this idea.\n\nI still don't see why it needs to be in a wrapper class, though.\n\n> +\n> +       if (!complete) {\n> +               LOG(IPCPipe, Error) << \"Timeout sending sync message\";\n> +               ret = -ETIMEDOUT;\n> +               goto cleanup;\n>         }\n>  \n>         if (out)\n>                 *out = IPCMessage(response);\n>  \n>         return 0;\n> +\n> +cleanup:\n> +       mutex_.lock();\n> +       sendSyncPending_ = false;\n> +       mutex_.unlock();\n> +\n> +       return ret;\n>  }\n>  \n> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n>  {\n> -       int ret = socket_->send(data.payload());\n> -       if (ret) {\n> -               LOG(IPCPipe, Error) << \"Failed to call async\";\n> -               return ret;\n> +       int ret;\n> +       ret = socket_->send(data.payload());\n> +       if (ret)\n> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +       return ret;\n> +}\n> +\n> +bool IPCUnixSocketWrapper::waitReady()\n> +{\n> +       bool ready;\n> +       {\n> +               MutexLocker locker(mutex_);\n> +               auto isReady = ([&]() 
{\n> +                       return ready_;\n> +               });\n> +               ready = cv_.wait_for(locker, 1000ms, isReady);\n>         }\n>  \n> -       return 0;\n> +       return ready;\n> +}\n\nI don't understand why this function is needed. It only waits to make sure that\ninit() is done, but afaict nothing in init() is threaded. Especially if there\nis no wrapper. Why can't IPCUnixSocket live directly in another thread?\n\n> +\n> +void IPCUnixSocketWrapper::init()\n> +{\n> +       /* Init is to be done from the IPC thread context */\n> +       ASSERT(Thread::current() == this);\n> +\n> +       socket_ = std::make_unique<IPCUnixSocket>();\n> +       fd_ = socket_->create();\n> +       if (!fd_.isValid()) {\n> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> +               return;\n> +       }\n> +\n> +       socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n> +\n> +       mutex_.lock();\n> +       ready_ = true;\n> +       mutex_.unlock();\n> +       cv_.notify_one();\n>  }\n>  \n> -void IPCPipeUnixSocket::readyRead()\n> +void IPCUnixSocketWrapper::deinit()\n> +{\n> +       /* Deinit is to be done from the IPC thread context */\n> +       ASSERT(Thread::current() == this);\n> +\n> +       socket_->readyRead.disconnect(this);\n> +       socket_.reset();\n> +\n> +       mutex_.lock();\n> +       ready_ = false;\n> +       mutex_.unlock();\n> +}\n> +\n> +void IPCUnixSocketWrapper::readyRead()\n>  {\n>         IPCUnixSocket::Payload payload;\n>         int ret = socket_->receive(&payload);\n> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n>                 return;\n>         }\n>  \n> -       /* \\todo Use span to avoid the double copy when callData is found. 
*/\n\nOh nice, this is taken care of.\n\n>         if (payload.data.size() < sizeof(IPCMessage::Header)) {\n>                 LOG(IPCPipe, Error) << \"Not enough data received\";\n>                 return;\n>         }\n>  \n> -       IPCMessage ipcMessage(payload);\n> +       const IPCMessage::Header *header =\n> +               reinterpret_cast<IPCMessage::Header *>(payload.data.data());\n> +       bool syncComplete = false;\n> +       mutex_.lock();\n> +       if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n> +               syncComplete = true;\n> +               sendSyncPending_ = false;\n> +               *sendSyncResponse_ = std::move(payload);\n> +       }\n> +       mutex_.unlock();\n\nThat's a cool idea.\n\n>  \n> -       auto callData = callData_.find(ipcMessage.header().cookie);\n> -       if (callData != callData_.end()) {\n> -               *callData->second.response = std::move(payload);\n> -               callData->second.done = true;\n> +       if (syncComplete) {\n> +               cv_.notify_one();\n>                 return;\n>         }\n>  \n>         /* Received unexpected data, this means it's a call from the IPA. 
*/\n> -       recv.emit(ipcMessage);\n> +       IPCMessage ipcMessage(payload);\n> +       recv_->emit(ipcMessage);\n>  }\n>  \n> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n> -                           IPCUnixSocket::Payload *response, uint32_t cookie)\n> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> +                                    const char *ipaProxyWorkerPath)\n> +       : IPCPipe()\n>  {\n> -       Timer timeout;\n> -       int ret;\n> +       socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n> +       if (!socketWrap_->waitReady()) {\n> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> +               return;\n> +       }\n> +       int fd = socketWrap_->fd();\n>  \n> -       const auto result = callData_.insert({ cookie, { response, false } });\n> -       const auto &iter = result.first;\n> +       std::vector<int> fds;\n> +       std::vector<std::string> args;\n> +       args.push_back(ipaModulePath);\n> +       args.push_back(std::to_string(fd));\n> +       fds.push_back(fd);\n\nI'd replace this with:\n\n\tstd::array args{ std::string(ipaModulePath), std::to_string(fd.get()) };\n\tstd::array fds{ fd.get() };\n\n(Since this is how the base code was changed since you wrote this - sorry I\ndidn't notice this patch earlier, I only noticed because Kieran cc'ed me)\n\n\nThanks,\n\nPaul\n\n>  \n> -       ret = socket_->send(message);\n> +       proc_ = std::make_unique<Process>();\n> +       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>         if (ret) {\n> -               callData_.erase(iter);\n> -               return ret;\n> +               LOG(IPCPipe, Error)\n> +                       << \"Failed to start proxy worker process\";\n> +               return;\n>         }\n>  \n> -       /* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n> -       timeout.start(2000ms);\n> -       while (!iter->second.done) {\n> -               if 
(!timeout.isRunning()) {\n> -                       LOG(IPCPipe, Error) << \"Call timeout!\";\n> -                       callData_.erase(iter);\n> -                       return -ETIMEDOUT;\n> -               }\n> +       connected_ = true;\n> +}\n>  \n> -               Thread::current()->eventDispatcher()->processEvents();\n> -       }\n> +IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> +{\n> +}\n>  \n> -       callData_.erase(iter);\n> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +{\n> +       return socketWrap_->sendSync(in, out);\n> +}\n>  \n> -       return 0;\n> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +{\n> +       return socketWrap_->sendAsync(data);\n>  }\n>  \n>  } /* namespace libcamera */\n> -- \n> 2.34.1\n>"},{"id":35622,"web_url":"https://patchwork.libcamera.org/comment/35622/","msgid":"<92c2b0ad-8d62-42af-9e51-36fa61ff81cb@nxp.com>","date":"2025-08-29T09:42:03","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":190,"url":"https://patchwork.libcamera.org/api/people/190/","name":"Julien Vuillaumier","email":"julien.vuillaumier@nxp.com"},"content":"Hi Paul,\n\n\nOn 26/08/2025 12:28, Paul Elder wrote:\n> \n> Hi Julien,\n> \n> Thanks for the patch.\n> \n> Quoting Julien Vuillaumier (2024-12-20 23:55:56)\n>> Unix-socket based IPC sometimes times out or hangs, typically\n>> when multiple cameras are stopped simultaneously. That specific case\n>> triggers the concurrent sending by each pipeline handler instance\n>> of a synchronous stop() message to its peer IPA process.\n>>\n>> There is a dedicated IPC socket per camera. The sockets' payload receipt\n>> signals all run in the camera manager thread.\n>> To send a synchronous message to the IPA, the pipeline handler invokes\n>> IPCPipeUnixSocket::sendSync() from the camera thread. This sends the\n>> message then blocks, busy waiting for the peer acknowledgment. Such busy\n>> wait is done by blocking on the event loop, calling dispatchEvent(),\n>> until the ack condition is detected.\n>>\n>> One issue is that the socket receive slot readyRead() wakes up the\n>> blocked thread via libcamera::Message receipt. Even though such a message\n>> resumes processEvents(), it may reblock immediately because readyRead()\n>> does not explicitly interrupt() the dispatcher.\n>> Most of the time, another pending event for the thread unblocks the\n>> event dispatcher and the ack condition is detected - in the worst case\n>> the 2 sec timeout kicks in. 
Once unblocked, the dispatcher lets the message\n>> acknowledgment be detected and the sendSync() completes.\n>>\n>> The other issue is that in case of concurrent synchronous IPC messages\n>> sent by multiple pipeline handlers, there is a possible recursion\n>> of sendSync() / processEvents() nested in the camera thread stack. As\n>> commented in the source, that is a dangerous construct that can lead to\n>> a hang.\n>> The reason is that the last synchronous message sent is the deepest in\n>> the stack. It is also the one whose acknowledgment is being busy waited.\n>> However, other pending synchronous messages may have been initiated\n>> before and are higher up in the stack. If they time out, the condition\n>> is not detected because of the stack recursion, as the thread is busy\n>> waiting for the last message to be acknowledged.\n> \n> Ok, I think I understand the issue.\n> \n>> This change implements a safer mechanism to handle the synchronous\n>> message sending, similar to the one used for the non-isolated IPA. 
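[Editor's note] The safer mechanism referred to here can be sketched with standard C++ primitives. This is an illustrative model, not the patch itself: std::mutex and std::condition_variable stand in for libcamera's Mutex and ConditionVariable, and the SyncSender type and its int payload are invented for the example.

```cpp
// Sketch of the condition-variable handshake: the sender parks on a timed
// wait instead of re-entering the event loop, and the receiving thread's
// readyRead() wakes it when the acknowledgment arrives.
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;

struct SyncSender {
	std::mutex mutex_;
	std::condition_variable cv_;
	bool pending_ = false;
	int response_ = 0;

	/* Called from the sending thread. */
	int sendSync(int request, int *out)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			pending_ = true;
		}
		(void)request; /* ...the request would be sent on the socket here... */

		std::unique_lock<std::mutex> lock(mutex_);
		bool complete = cv_.wait_for(lock, 1000ms,
					     [&] { return !pending_; });
		if (!complete) {
			pending_ = false; /* lock is still held here */
			return -1;        /* would be -ETIMEDOUT */
		}
		*out = response_;
		return 0;
	}

	/* Called from the receiving (IPC) thread when the ack arrives. */
	void readyRead(int response)
	{
		{
			std::lock_guard<std::mutex> lock(mutex_);
			response_ = response;
			pending_ = false;
		}
		cv_.notify_one();
	}
};
```

Because the wait is a plain timed block rather than a nested event loop, concurrent senders each time out independently and no stack recursion can occur.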
The\n>> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n>> receive signal in a dedicated thread.\n>> Doing so, the sending thread, when emitting a synchronous message, can be\n>> blocked without the event dispatcher's processEvents() usage, which avoids\n>> the risky stack recursion.\n>>\n>> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n>>\n>> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n>> ---\n>>   .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n>>   src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n>>   2 files changed, 178 insertions(+), 77 deletions(-)\n>>\n>> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h\n>> index 8c972613..280639d5 100644\n>> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n>> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n>> @@ -16,6 +16,7 @@\n>>   namespace libcamera {\n>>\n>>   class Process;\n>> +class IPCUnixSocketWrapper;\n>>\n>>   class IPCPipeUnixSocket : public IPCPipe\n>>   {\n>> @@ -29,18 +30,8 @@ public:\n>>          int sendAsync(const IPCMessage &data) override;\n>>\n>>   private:\n>> -       struct CallData {\n>> -               IPCUnixSocket::Payload *response;\n>> -               bool done;\n>> -       };\n>> -\n>> -       void readyRead();\n>> -       int call(const IPCUnixSocket::Payload &message,\n>> -                IPCUnixSocket::Payload *response, uint32_t seq);\n>> -\n>>          std::unique_ptr<Process> proc_;\n>> -       std::unique_ptr<IPCUnixSocket> socket_;\n>> -       std::map<uint32_t, CallData> callData_;\n>> +       std::unique_ptr<IPCUnixSocketWrapper> socketWrap_;\n> \n> afaict you're essentially deleting the whole IPCPipe class and reimplementing\n> it. 
What's wrong with changing it directly instead of adding a wrapper?\n> \n\nThe IPCPipeUnixSocket class is used by libcamera to create the proxy \nworker process and the IPC to communicate with the proxy. The IPC part \nrelies on the IPCUnixSocket class (asynchronous write, no threading) - \nand the same IPCUnixSocket class is also used by the proxy worker in the \nauto-generated code.\n\nThe intent of the IPCUnixSocketWrapper was to provide a higher-level \ninterface for the IPCUnixSocket class, more specific to the \nIPCPipeUnixSocket class usage: threading for the receipt path, \nsync/async transmit, IPCMessage to IPCUnixSocket::Payload conversions...\n\nAs you mentioned, IPCPipeUnixSocket becomes fairly slim.\nNow I can look at the option to move the IPCUnixSocketWrapper class \ncontent into the IPCPipeUnixSocket class, if that is the preferred approach.\n\n\n>>   };\n>>\n>>   } /* namespace libcamera */\n>> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp\n>> index 668ec73b..eb5408d4 100644\n>> --- a/src/libcamera/ipc_pipe_unixsocket.cpp\n>> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n>> @@ -9,10 +9,9 @@\n>>\n>>   #include <vector>\n>>\n>> -#include <libcamera/base/event_dispatcher.h>\n>>   #include <libcamera/base/log.h>\n>> +#include <libcamera/base/mutex.h>\n>>   #include <libcamera/base/thread.h>\n>> -#include <libcamera/base/timer.h>\n>>\n>>   #include \"libcamera/internal/ipc_pipe.h\"\n>>   #include \"libcamera/internal/ipc_unixsocket.h\"\n>> @@ -24,67 +23,161 @@ namespace libcamera {\n>>\n>>   LOG_DECLARE_CATEGORY(IPCPipe)\n>>\n>> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n>> -                                    const char *ipaProxyWorkerPath)\n>> -       : IPCPipe()\n>> +class IPCUnixSocketWrapper : Thread\n>>   {\n>> -       std::vector<int> fds;\n>> -       std::vector<std::string> args;\n>> -       args.push_back(ipaModulePath);\n>> +public:\n>> +       
IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n>> +               : recv_(recv), ready_(false), sendSyncPending_(false),\n>> +                 sendSyncCookie_(0)\n>> +       {\n>> +               start();\n>> +       }\n>>\n>> -       socket_ = std::make_unique<IPCUnixSocket>();\n>> -       UniqueFD fd = socket_->create();\n>> -       if (!fd.isValid()) {\n>> -               LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> -               return;\n>> +       ~IPCUnixSocketWrapper()\n>> +       {\n>> +               exit();\n>> +               wait();\n>>          }\n>> -       socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n>> -       args.push_back(std::to_string(fd.get()));\n>> -       fds.push_back(fd.get());\n>>\n>> -       proc_ = std::make_unique<Process>();\n>> -       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>> -       if (ret) {\n>> -               LOG(IPCPipe, Error)\n>> -                       << \"Failed to start proxy worker process\";\n>> -               return;\n>> +       void run() override\n>> +       {\n>> +               /*\n>> +                * IPC socket construction and connection to its readyRead\n>> +                * signal has to be done from the IPC thread so that the\n>> +                * relevant Object instances (EventNotifier, slot) are bound to\n>> +                * its context.\n>> +                */\n>> +               init();\n>> +               exec();\n>> +               deinit();\n>>          }\n>>\n>> -       connected_ = true;\n>> -}\n>> +       int fd() { return fd_.get(); }\n>> +       int sendSync(const IPCMessage &in, IPCMessage *out);\n>> +       int sendAsync(const IPCMessage &data);\n>> +       bool waitReady();\n>>\n>> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n>> -{\n>> -}\n>> +private:\n>> +       void init();\n>> +       void deinit();\n>> +       void readyRead();\n>>\n>> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n>> +       
UniqueFD fd_;\n>> +       Signal<const IPCMessage &> *recv_;\n>> +       ConditionVariable cv_;\n>> +       Mutex mutex_;\n> \n> So the mutex protects everything from here...\n> \n>> +       bool ready_;\n>> +       bool sendSyncPending_;\n>> +       uint32_t sendSyncCookie_;\n>> +       IPCUnixSocket::Payload *sendSyncResponse_;\n> \n> ...to here\n\nI am not sure about this comment. But the git patch output may be a bit \nmisleading here: that line is only the `Mutex mutex_` variable \ndeclaration, as a member of the IPCUnixSocketWrapper class.\nThe variables that `mutex_` actually protects depend on the \nimplementation of the functions where `mutex_` is used through \nlock()/unlock() sections and MutexLocker-protected scopes.\n\n> \n>> +\n>> +       /* Socket shall be constructed and destructed from IPC thread context */\n>> +       std::unique_ptr<IPCUnixSocket> socket_;\n>> +};\n>> +\n>> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)\n>>   {\n>> +       int ret;\n>>          IPCUnixSocket::Payload response;\n>>\n>> -       int ret = call(in.payload(), &response, in.header().cookie);\n>> +       mutex_.lock();\n>> +       ASSERT(!sendSyncPending_);\n>> +       sendSyncPending_ = true;\n>> +       sendSyncCookie_ = in.header().cookie;\n>> +       sendSyncResponse_ = &response;\n>> +       mutex_.unlock();\n>> +\n>> +       ret = socket_->send(in.payload());\n>>          if (ret) {\n>> -               LOG(IPCPipe, Error) << \"Failed to call sync\";\n>> -               return ret;\n>> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n>> +               goto cleanup;\n>> +       }\n>> +\n>> +       bool complete;\n>> +       {\n>> +               MutexLocker locker(mutex_);\n>> +               auto syncComplete = ([&]() {\n>> +                       return sendSyncPending_ == false;\n>> +               });\n>> +               complete = cv_.wait_for(locker, 1000ms, syncComplete);\n>> +       }\n> \n> Ok, I think I like 
this idea.\n\nThanks :)\n\n> \n> I still don't see why it needs to be in a wrapper class, though.\n\nAs mentioned above, I can look at the option to remove the wrapper class \nand keep only the IPCPipeUnixSocket class, if desired.\n\n> \n>> +\n>> +       if (!complete) {\n>> +               LOG(IPCPipe, Error) << \"Timeout sending sync message\";\n>> +               ret = -ETIMEDOUT;\n>> +               goto cleanup;\n>>          }\n>>\n>>          if (out)\n>>                  *out = IPCMessage(response);\n>>\n>>          return 0;\n>> +\n>> +cleanup:\n>> +       mutex_.lock();\n>> +       sendSyncPending_ = false;\n>> +       mutex_.unlock();\n>> +\n>> +       return ret;\n>>   }\n>>\n>> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n>> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n>>   {\n>> -       int ret = socket_->send(data.payload());\n>> -       if (ret) {\n>> -               LOG(IPCPipe, Error) << \"Failed to call async\";\n>> -               return ret;\n>> +       int ret;\n>> +       ret = socket_->send(data.payload());\n>> +       if (ret)\n>> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n>> +       return ret;\n>> +}\n>> +\n>> +bool IPCUnixSocketWrapper::waitReady()\n>> +{\n>> +       bool ready;\n>> +       {\n>> +               MutexLocker locker(mutex_);\n>> +               auto isReady = ([&]() {\n>> +                       return ready_;\n>> +               });\n>> +               ready = cv_.wait_for(locker, 1000ms, isReady);\n>>          }\n>>\n>> -       return 0;\n>> +       return ready;\n>> +}\n> \n> I don't understand why this function is needed. It only waits to make sure that\n> init() is done, but afaict nothing in init() is threaded. Especially if there\n> is no wrapper. 
Why can't IPCUnixSocket live directly in another thread?\n\nThe init() function is executed from the wrapper thread - see the \nIPCUnixSocketWrapper::run() function that calls, in sequence: init(), \nexec(), then deinit().\nThe Objects bound to the wrapper thread (EventNotifier, slots...) have to \nbe instantiated and later destroyed from the wrapper thread itself. \nThus, they cannot be created by the wrapper constructor, which is \nexecuted in the camera manager thread; their creation has to be \ndeferred to the wrapper thread execution. waitReady() checks that \nthis thread is up and running, which indicates that all objects needed \nfor the class operation have been created.\n\nIf the wrapper is removed because the threading implementation is \nmoved to the IPCPipeUnixSocket class, the Object instances \n(EventNotifier, slots...) associated with the new thread will still have \nto be created from there as well.\nA similar synchronization mechanism will be necessary in the \nIPCPipeUnixSocket constructor to wait for that other thread to be active \nand its associated Objects created.\n\n> \n>> +\n>> +void IPCUnixSocketWrapper::init()\n>> +{\n>> +       /* Init is to be done from the IPC thread context */\n>> +       ASSERT(Thread::current() == this);\n>> +\n>> +       socket_ = std::make_unique<IPCUnixSocket>();\n>> +       fd_ = socket_->create();\n>> +       if (!fd_.isValid()) {\n>> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> +               return;\n>> +       }\n>> +\n>> +       socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n>> +\n>> +       mutex_.lock();\n>> +       ready_ = true;\n>> +       mutex_.unlock();\n>> +       cv_.notify_one();\n>>   }\n>>\n>> -void IPCPipeUnixSocket::readyRead()\n>> +void IPCUnixSocketWrapper::deinit()\n>> +{\n>> +       /* Deinit is to be done from the IPC thread context */\n>> +       ASSERT(Thread::current() == this);\n>> +\n>> +       socket_->readyRead.disconnect(this);\n>> 
+       socket_.reset();\n>> +\n>> +       mutex_.lock();\n>> +       ready_ = false;\n>> +       mutex_.unlock();\n>> +}\n>> +\n>> +void IPCUnixSocketWrapper::readyRead()\n>>   {\n>>          IPCUnixSocket::Payload payload;\n>>          int ret = socket_->receive(&payload);\n>> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n>>                  return;\n>>          }\n>>\n>> -       /* \\todo Use span to avoid the double copy when callData is found. */\n> \n> Oh nice, this is taken care of.\n> \n>>          if (payload.data.size() < sizeof(IPCMessage::Header)) {\n>>                  LOG(IPCPipe, Error) << \"Not enough data received\";\n>>                  return;\n>>          }\n>>\n>> -       IPCMessage ipcMessage(payload);\n>> +       const IPCMessage::Header *header =\n>> +               reinterpret_cast<IPCMessage::Header *>(payload.data.data());\n>> +       bool syncComplete = false;\n>> +       mutex_.lock();\n>> +       if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n>> +               syncComplete = true;\n>> +               sendSyncPending_ = false;\n>> +               *sendSyncResponse_ = std::move(payload);\n>> +       }\n>> +       mutex_.unlock();\n> \n> That's a cool idea.\n> \n>>\n>> -       auto callData = callData_.find(ipcMessage.header().cookie);\n>> -       if (callData != callData_.end()) {\n>> -               *callData->second.response = std::move(payload);\n>> -               callData->second.done = true;\n>> +       if (syncComplete) {\n>> +               cv_.notify_one();\n>>                  return;\n>>          }\n>>\n>>          /* Received unexpected data, this means it's a call from the IPA. 
*/\n>> -       recv.emit(ipcMessage);\n>> +       IPCMessage ipcMessage(payload);\n>> +       recv_->emit(ipcMessage);\n>>   }\n>>\n>> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n>> -                           IPCUnixSocket::Payload *response, uint32_t cookie)\n>> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n>> +                                    const char *ipaProxyWorkerPath)\n>> +       : IPCPipe()\n>>   {\n>> -       Timer timeout;\n>> -       int ret;\n>> +       socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n>> +       if (!socketWrap_->waitReady()) {\n>> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> +               return;\n>> +       }\n>> +       int fd = socketWrap_->fd();\n>>\n>> -       const auto result = callData_.insert({ cookie, { response, false } });\n>> -       const auto &iter = result.first;\n>> +       std::vector<int> fds;\n>> +       std::vector<std::string> args;\n>> +       args.push_back(ipaModulePath);\n>> +       args.push_back(std::to_string(fd));\n>> +       fds.push_back(fd);\n> \n> I'd replace this with:\n> \n>          std::array args{ std::string(ipaModulePath), std::to_string(fd.get()) };\n>          std::array fds{ fd.get() };\n> \n> (Since this is how the base code was changed since you when wrote this (sorry I\n> didn't notice this patch earlier (I only noticed because Kieran cc'ed me)))\n\nSure, I noticed that part had changed in libcamera v0.5.2.\nI will update in the V2.\n\nThanks,\nJulien\n\n\n> \n> Thanks,\n> \n> Paul\n> \n>>\n>> -       ret = socket_->send(message);\n>> +       proc_ = std::make_unique<Process>();\n>> +       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>>          if (ret) {\n>> -               callData_.erase(iter);\n>> -               return ret;\n>> +               LOG(IPCPipe, Error)\n>> +                       << \"Failed to start proxy worker process\";\n>> +               return;\n>>          
}\n>>\n>> -       /* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n>> -       timeout.start(2000ms);\n>> -       while (!iter->second.done) {\n>> -               if (!timeout.isRunning()) {\n>> -                       LOG(IPCPipe, Error) << \"Call timeout!\";\n>> -                       callData_.erase(iter);\n>> -                       return -ETIMEDOUT;\n>> -               }\n>> +       connected_ = true;\n>> +}\n>>\n>> -               Thread::current()->eventDispatcher()->processEvents();\n>> -       }\n>> +IPCPipeUnixSocket::~IPCPipeUnixSocket()\n>> +{\n>> +}\n>>\n>> -       callData_.erase(iter);\n>> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n>> +{\n>> +       return socketWrap_->sendSync(in, out);\n>> +}\n>>\n>> -       return 0;\n>> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n>> +{\n>> +       return socketWrap_->sendAsync(data);\n>>   }\n>>\n>>   } /* namespace libcamera */\n>> --\n>> 2.34.1\n>>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 9B391BD87C\n\tfor <parsemail@patchwork.libcamera.org>;\n\tFri, 29 Aug 2025 09:42:09 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id C3DFA6931A;\n\tFri, 29 Aug 2025 11:42:07 +0200 (CEST)","from PA4PR04CU001.outbound.protection.outlook.com\n\t(mail-francecentralazlp170130007.outbound.protection.outlook.com\n\t[IPv6:2a01:111:f403:c20a::7])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 40EEA692DF\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tFri, 29 Aug 2025 11:42:06 +0200 (CEST)","from AM9PR04MB8147.eurprd04.prod.outlook.com\n\t(2603:10a6:20b:3e0::22)\n\tby AM9PR04MB8130.eurprd04.prod.outlook.com 
(2603:10a6:20b:3b5::14)\n\twith Microsoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9073.15;\n\tFri, 29 Aug 2025 09:42:05 +0000"],"Message-ID":"<92c2b0ad-8d62-42af-9e51-36fa61ff81cb@nxp.com>","Date":"Fri, 29 Aug 2025 11:42:03 +0200","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","To":"Paul Elder <paul.elder@ideasonboard.com>,\n\tlibcamera-devel@lists.libcamera.org","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>\n\t<175620412034.607151.628561646126579934@neptunite.rasen.tech>","Content-Language":"en-US","From":"Julien Vuillaumier <julien.vuillaumier@nxp.com>","In-Reply-To":"<175620412034.607151.628561646126579934@neptunite.rasen.tech>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","MIME-Version":"1.0","List-Id":"<libcamera-devel.lists.libcamera.org>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" 
<libcamera-devel-bounces@lists.libcamera.org>"}},{"id":35629,"web_url":"https://patchwork.libcamera.org/comment/35629/","msgid":"<1b9e7e80-09c1-4598-94dc-74196a92bd39@ideasonboard.com>","date":"2025-08-29T16:02:00","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":216,"url":"https://patchwork.libcamera.org/api/people/216/","name":"Barnabás Pőcze","email":"barnabas.pocze@ideasonboard.com"},"content":"Hi\n\n\n2024. 12. 20. 15:55 keltezéssel, Julien Vuillaumier írta:\n> Unix-socket based IPC sometimes times out or hangs, typically\n> when multiple camera are stopped simulaneously. That specific case\n> triggers the concurrent sending by each pipeline handler instance\n> of a synchronous stop() message to its peer IPA process.\n> \n> There is a dedicated IPC socket per camera. Sockets payload receipt\n> signals all run in the camera manager thread.\n> To send a synchronous message to IPA, pipeline invokes from camera\n> thread IPCPipeUnixSocket::sendSync(). This sends the message then blocks\n> busy waiting for the peer acknowledgment. Such busy wait is done by\n> blocking on event loop calling dispatchEvent(), until the ack condition\n> is detected.\n> \n> One issue is that the socket receive slot readyRead() wakes up the\n> blocked thread via libcamera::Message receipt. Even though such message\n> resumes processEvents(), it may reblock immediately because readyRead()\n> does not interrupt() explictly the dispatcher.\n> Most of the time, an other pending event for the thread unblocks the\n> event dispatcher and the ack condition is detected - in worst case the 2\n> sec timeout kicks in. 
Once unblocked, the dispatcher let the message\n> acknowledgment to be detected and the sendSync() completes.\n\nAs far as I can see:\n\n   * EventDispatcherPoll::processEvents() is blocked in ppoll()\n   * socket receives data, signals POLLIN, ppoll() returns\n   * EventDispatcherPoll::processNotifiers()\n       -> EventNotifier::activated emitted\n       -> IPCUnixSocket::dataNotifier() runs\n       -> IPCUnixSocket::readyRead emitted\n       -> IPCPipeUnixSocket::readyRead() runs\n   * IPCPipeUnixSocket::readyRead() reads the message and modifies `callData_` accordingly\n   * EventDispatcherPoll::processEvents() returns\n   * IPCPipeUnixSocket::call() sees the update in `callData_` and stops\n\nAnd as far as I can tell, every signal emission happens synchronously.\nSo what am I missing? Where does it get blocked?\n\n\n> \n> The other issue is that in case of concurrent synchronous IPC messages\n> sent by multiple pipeline handlers, there is a possible recursion\n> of sendSync() / processEvents() nested in the camera thread stack. As\n> commented in the source, that is a dangerous construct that can lead to\n> a hang.\n> The reason is that the last synchronous message sent is the deepest in\n> the stack. It is also the one whose acknowledgment is being busy waited.\n> However other pending synchronous messages may have been initiated before\n> and are upper in the stack. If they timeout, the condition is not\n> detected because of the stack recursion, as the thread is busy waiting\n> for the last message to be acknowledged.\n\nWithout a large-scale refactor, I see two choices:\n\n   (1) dispatch the event loop recursively;\n   (2) block the event loop;\n\nthe proposed solution here appears to be (2). But then what is not clear\nto me is why you're not just running a poll() in a loop on that single\nsocket? 
Why do we need an extra thread?\n\n\nRegards,\nBarnabás Pőcze\n\n> \n> This change implements a safer mechanism to handle the synchronous\n> message sending, similar to the one used for non isolated IPA. The\n> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n> receive signal in a dedicated thread.\n> Doing so, the sending thread, when emiting a synchronous message, can be\n> blocked without event dispatcher's processEvents() usage, which avoids\n> the risky stack recursion.\n> \n> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n> \n> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n> ---\n>   .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n>   src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n>   2 files changed, 178 insertions(+), 77 deletions(-)\n> \n> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> index 8c972613..280639d5 100644\n> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> @@ -16,6 +16,7 @@\n>   namespace libcamera {\n>   \n>   class Process;\n> +class IPCUnixSocketWrapper;\n>   \n>   class IPCPipeUnixSocket : public IPCPipe\n>   {\n> @@ -29,18 +30,8 @@ public:\n>   \tint sendAsync(const IPCMessage &data) override;\n>   \n>   private:\n> -\tstruct CallData {\n> -\t\tIPCUnixSocket::Payload *response;\n> -\t\tbool done;\n> -\t};\n> -\n> -\tvoid readyRead();\n> -\tint call(const IPCUnixSocket::Payload &message,\n> -\t\t IPCUnixSocket::Payload *response, uint32_t seq);\n> -\n>   \tstd::unique_ptr<Process> proc_;\n> -\tstd::unique_ptr<IPCUnixSocket> socket_;\n> -\tstd::map<uint32_t, CallData> callData_;\n> +\tstd::unique_ptr<IPCUnixSocketWrapper> socketWrap_;\n>   };\n>   \n>   } /* namespace libcamera */\n> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp\n> index 668ec73b..eb5408d4 
100644\n> --- a/src/libcamera/ipc_pipe_unixsocket.cpp\n> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n> @@ -9,10 +9,9 @@\n>   \n>   #include <vector>\n>   \n> -#include <libcamera/base/event_dispatcher.h>\n>   #include <libcamera/base/log.h>\n> +#include <libcamera/base/mutex.h>\n>   #include <libcamera/base/thread.h>\n> -#include <libcamera/base/timer.h>\n>   \n>   #include \"libcamera/internal/ipc_pipe.h\"\n>   #include \"libcamera/internal/ipc_unixsocket.h\"\n> @@ -24,67 +23,161 @@ namespace libcamera {\n>   \n>   LOG_DECLARE_CATEGORY(IPCPipe)\n>   \n> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> -\t\t\t\t     const char *ipaProxyWorkerPath)\n> -\t: IPCPipe()\n> +class IPCUnixSocketWrapper : Thread\n>   {\n> -\tstd::vector<int> fds;\n> -\tstd::vector<std::string> args;\n> -\targs.push_back(ipaModulePath);\n> +public:\n> +\tIPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n> +\t\t: recv_(recv), ready_(false), sendSyncPending_(false),\n> +\t\t  sendSyncCookie_(0)\n> +\t{\n> +\t\tstart();\n> +\t}\n>   \n> -\tsocket_ = std::make_unique<IPCUnixSocket>();\n> -\tUniqueFD fd = socket_->create();\n> -\tif (!fd.isValid()) {\n> -\t\tLOG(IPCPipe, Error) << \"Failed to create socket\";\n> -\t\treturn;\n> +\t~IPCUnixSocketWrapper()\n> +\t{\n> +\t\texit();\n> +\t\twait();\n>   \t}\n> -\tsocket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n> -\targs.push_back(std::to_string(fd.get()));\n> -\tfds.push_back(fd.get());\n>   \n> -\tproc_ = std::make_unique<Process>();\n> -\tint ret = proc_->start(ipaProxyWorkerPath, args, fds);\n> -\tif (ret) {\n> -\t\tLOG(IPCPipe, Error)\n> -\t\t\t<< \"Failed to start proxy worker process\";\n> -\t\treturn;\n> +\tvoid run() override\n> +\t{\n> +\t\t/*\n> +\t\t * IPC socket construction and connection to its readyRead\n> +\t\t * signal has to be done from the IPC thread so that the\n> +\t\t * relevant Object instances (EventNotifier, slot) are bound to\n> +\t\t * its context.\n> +\t\t */\n> +\t\tinit();\n> 
+\t\texec();\n> +\t\tdeinit();\n>   \t}\n>   \n> -\tconnected_ = true;\n> -}\n> +\tint fd() { return fd_.get(); }\n> +\tint sendSync(const IPCMessage &in, IPCMessage *out);\n> +\tint sendAsync(const IPCMessage &data);\n> +\tbool waitReady();\n>   \n> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> -{\n> -}\n> +private:\n> +\tvoid init();\n> +\tvoid deinit();\n> +\tvoid readyRead();\n>   \n> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +\tUniqueFD fd_;\n> +\tSignal<const IPCMessage &> *recv_;\n> +\tConditionVariable cv_;\n> +\tMutex mutex_;\n> +\tbool ready_;\n> +\tbool sendSyncPending_;\n> +\tuint32_t sendSyncCookie_;\n> +\tIPCUnixSocket::Payload *sendSyncResponse_;\n> +\n> +\t/* Socket shall be constructed and destructed from IPC thread context */\n> +\tstd::unique_ptr<IPCUnixSocket> socket_;\n> +};\n> +\n> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)\n>   {\n> +\tint ret;\n>   \tIPCUnixSocket::Payload response;\n>   \n> -\tint ret = call(in.payload(), &response, in.header().cookie);\n> +\tmutex_.lock();\n> +\tASSERT(!sendSyncPending_);\n> +\tsendSyncPending_ = true;\n> +\tsendSyncCookie_ = in.header().cookie;\n> +\tsendSyncResponse_ = &response;\n> +\tmutex_.unlock();\n> +\n> +\tret = socket_->send(in.payload());\n>   \tif (ret) {\n> -\t\tLOG(IPCPipe, Error) << \"Failed to call sync\";\n> -\t\treturn ret;\n> +\t\tLOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +\t\tgoto cleanup;\n> +\t}\n> +\n> +\tbool complete;\n> +\t{\n> +\t\tMutexLocker locker(mutex_);\n> +\t\tauto syncComplete = ([&]() {\n> +\t\t\treturn sendSyncPending_ == false;\n> +\t\t});\n> +\t\tcomplete = cv_.wait_for(locker, 1000ms, syncComplete);\n> +\t}\n> +\n> +\tif (!complete) {\n> +\t\tLOG(IPCPipe, Error) << \"Timeout sending sync message\";\n> +\t\tret = -ETIMEDOUT;\n> +\t\tgoto cleanup;\n>   \t}\n>   \n>   \tif (out)\n>   \t\t*out = IPCMessage(response);\n>   \n>   \treturn 0;\n> +\n> +cleanup:\n> +\tmutex_.lock();\n> 
+\tsendSyncPending_ = false;\n> +\tmutex_.unlock();\n> +\n> +\treturn ret;\n>   }\n>   \n> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n>   {\n> -\tint ret = socket_->send(data.payload());\n> -\tif (ret) {\n> -\t\tLOG(IPCPipe, Error) << \"Failed to call async\";\n> -\t\treturn ret;\n> +\tint ret;\n> +\tret = socket_->send(data.payload());\n> +\tif (ret)\n> +\t\tLOG(IPCPipe, Error) << \"Failed to send sync message\";\n> +\treturn ret;\n> +}\n> +\n> +bool IPCUnixSocketWrapper::waitReady()\n> +{\n> +\tbool ready;\n> +\t{\n> +\t\tMutexLocker locker(mutex_);\n> +\t\tauto isReady = ([&]() {\n> +\t\t\treturn ready_;\n> +\t\t});\n> +\t\tready = cv_.wait_for(locker, 1000ms, isReady);\n>   \t}\n>   \n> -\treturn 0;\n> +\treturn ready;\n> +}\n> +\n> +void IPCUnixSocketWrapper::init()\n> +{\n> +\t/* Init is to be done from the IPC thread context */\n> +\tASSERT(Thread::current() == this);\n> +\n> +\tsocket_ = std::make_unique<IPCUnixSocket>();\n> +\tfd_ = socket_->create();\n> +\tif (!fd_.isValid()) {\n> +\t\tLOG(IPCPipe, Error) << \"Failed to create socket\";\n> +\t\treturn;\n> +\t}\n> +\n> +\tsocket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n> +\n> +\tmutex_.lock();\n> +\tready_ = true;\n> +\tmutex_.unlock();\n> +\tcv_.notify_one();\n>   }\n>   \n> -void IPCPipeUnixSocket::readyRead()\n> +void IPCUnixSocketWrapper::deinit()\n> +{\n> +\t/* Deinit is to be done from the IPC thread context */\n> +\tASSERT(Thread::current() == this);\n> +\n> +\tsocket_->readyRead.disconnect(this);\n> +\tsocket_.reset();\n> +\n> +\tmutex_.lock();\n> +\tready_ = false;\n> +\tmutex_.unlock();\n> +}\n> +\n> +void IPCUnixSocketWrapper::readyRead()\n>   {\n>   \tIPCUnixSocket::Payload payload;\n>   \tint ret = socket_->receive(&payload);\n> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n>   \t\treturn;\n>   \t}\n>   \n> -\t/* \\todo Use span to avoid the double copy when callData is found. 
*/\n>   \tif (payload.data.size() < sizeof(IPCMessage::Header)) {\n>   \t\tLOG(IPCPipe, Error) << \"Not enough data received\";\n>   \t\treturn;\n>   \t}\n>   \n> -\tIPCMessage ipcMessage(payload);\n> +\tconst IPCMessage::Header *header =\n> +\t\treinterpret_cast<IPCMessage::Header *>(payload.data.data());\n> +\tbool syncComplete = false;\n> +\tmutex_.lock();\n> +\tif (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n> +\t\tsyncComplete = true;\n> +\t\tsendSyncPending_ = false;\n> +\t\t*sendSyncResponse_ = std::move(payload);\n> +\t}\n> +\tmutex_.unlock();\n>   \n> -\tauto callData = callData_.find(ipcMessage.header().cookie);\n> -\tif (callData != callData_.end()) {\n> -\t\t*callData->second.response = std::move(payload);\n> -\t\tcallData->second.done = true;\n> +\tif (syncComplete) {\n> +\t\tcv_.notify_one();\n>   \t\treturn;\n>   \t}\n>   \n>   \t/* Received unexpected data, this means it's a call from the IPA. */\n> -\trecv.emit(ipcMessage);\n> +\tIPCMessage ipcMessage(payload);\n> +\trecv_->emit(ipcMessage);\n>   }\n>   \n> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n> -\t\t\t    IPCUnixSocket::Payload *response, uint32_t cookie)\n> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> +\t\t\t\t     const char *ipaProxyWorkerPath)\n> +\t: IPCPipe()\n>   {\n> -\tTimer timeout;\n> -\tint ret;\n> +\tsocketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n> +\tif (!socketWrap_->waitReady()) {\n> +\t\tLOG(IPCPipe, Error) << \"Failed to create socket\";\n> +\t\treturn;\n> +\t}\n> +\tint fd = socketWrap_->fd();\n>   \n> -\tconst auto result = callData_.insert({ cookie, { response, false } });\n> -\tconst auto &iter = result.first;\n> +\tstd::vector<int> fds;\n> +\tstd::vector<std::string> args;\n> +\targs.push_back(ipaModulePath);\n> +\targs.push_back(std::to_string(fd));\n> +\tfds.push_back(fd);\n>   \n> -\tret = socket_->send(message);\n> +\tproc_ = std::make_unique<Process>();\n> +\tint ret = 
proc_->start(ipaProxyWorkerPath, args, fds);\n>   \tif (ret) {\n> -\t\tcallData_.erase(iter);\n> -\t\treturn ret;\n> +\t\tLOG(IPCPipe, Error)\n> +\t\t\t<< \"Failed to start proxy worker process\";\n> +\t\treturn;\n>   \t}\n>   \n> -\t/* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n> -\ttimeout.start(2000ms);\n> -\twhile (!iter->second.done) {\n> -\t\tif (!timeout.isRunning()) {\n> -\t\t\tLOG(IPCPipe, Error) << \"Call timeout!\";\n> -\t\t\tcallData_.erase(iter);\n> -\t\t\treturn -ETIMEDOUT;\n> -\t\t}\n> +\tconnected_ = true;\n> +}\n>   \n> -\t\tThread::current()->eventDispatcher()->processEvents();\n> -\t}\n> +IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> +{\n> +}\n>   \n> -\tcallData_.erase(iter);\n> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> +{\n> +\treturn socketWrap_->sendSync(in, out);\n> +}\n>   \n> -\treturn 0;\n> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> +{\n> +\treturn socketWrap_->sendAsync(data);\n>   }\n>   \n>   } /* namespace libcamera */","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 8D6E3BD87C\n\tfor <parsemail@patchwork.libcamera.org>;\n\tFri, 29 Aug 2025 16:02:06 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 4C95D69325;\n\tFri, 29 Aug 2025 18:02:05 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 2077F6931C\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tFri, 29 Aug 2025 18:02:04 +0200 (CEST)","from [192.168.33.19] (185.221.143.232.nat.pool.zt.hu\n\t[185.221.143.232])\n\tby perceval.ideasonboard.com (Postfix) 
with ESMTPSA id B7E1C2394;\n\tFri, 29 Aug 2025 18:00:58 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com; dkim=pass (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"OHkJuor3\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1756483258;\n\tbh=sm1/Kzk7h08PExdzMt8Zz4TPF/80yz34vMLyruLQbOM=;\n\th=Date:Subject:To:References:From:In-Reply-To:From;\n\tb=OHkJuor3Kt039JWIjSa4DoTDCwyk1fLQ6K45BVYdfJtTzAjvUMt+21+xrd7tl9I99\n\tE9hMgvzuncFnwlii1doROcLId94SGkxzfejeC+xgnuVakOFdEuc82kCPZpgM+IwYub\n\tcHNyayEdvbicE3hKQlz807cb1Yexnqohlba/vXR0=","Message-ID":"<1b9e7e80-09c1-4598-94dc-74196a92bd39@ideasonboard.com>","Date":"Fri, 29 Aug 2025 18:02:00 +0200","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","To":"Julien Vuillaumier <julien.vuillaumier@nxp.com>,\n\tlibcamera-devel@lists.libcamera.org","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>","From":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <barnabas.pocze@ideasonboard.com>","Content-Language":"en-US, hu-HU","In-Reply-To":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>","Content-Type":"text/plain; charset=UTF-8; 
format=flowed","Content-Transfer-Encoding":"8bit","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":35686,"web_url":"https://patchwork.libcamera.org/comment/35686/","msgid":"<8add848a-060e-4fbf-a020-633b1473b759@nxp.com>","date":"2025-09-01T17:12:27","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":190,"url":"https://patchwork.libcamera.org/api/people/190/","name":"Julien Vuillaumier","email":"julien.vuillaumier@nxp.com"},"content":"Hi Barnabás,\n\nOn 29/08/2025 18:02, Barnabás Pőcze wrote:\n> \n> Hi\n> \n> \n> 2024. 12. 20. 15:55 keltezéssel, Julien Vuillaumier írta:\n>> Unix-socket based IPC sometimes times out or hangs, typically\n>> when multiple camera are stopped simulaneously. That specific case\n>> triggers the concurrent sending by each pipeline handler instance\n>> of a synchronous stop() message to its peer IPA process.\n>>\n>> There is a dedicated IPC socket per camera. Sockets payload receipt\n>> signals all run in the camera manager thread.\n>> To send a synchronous message to IPA, pipeline invokes from camera\n>> thread IPCPipeUnixSocket::sendSync(). This sends the message then blocks\n>> busy waiting for the peer acknowledgment. 
Such a busy wait is done by\n>> blocking on the event loop calling dispatchEvent(), until the ack condition\n>> is detected.\n>>\n>> One issue is that the socket receive slot readyRead() wakes up the\n>> blocked thread via libcamera::Message receipt. Even though such a message\n>> resumes processEvents(), it may reblock immediately because readyRead()\n>> does not explicitly interrupt() the dispatcher.\n>> Most of the time, another pending event for the thread unblocks the\n>> event dispatcher and the ack condition is detected - in the worst case the 2\n>> sec timeout kicks in. Once unblocked, the dispatcher lets the message\n>> acknowledgment be detected and the sendSync() completes.\n> \n> As far as I can see\n> \n>    * EventDispatcherPoll::processEvents() is blocked in ppoll()\n>    * socket receives data, signals POLLIN, ppoll() returns\n>    * EventDispatcherPoll::processNotifiers()\n>        -> EventNotifier::activated emitted\n>        -> IPCUnixSocket::dataNotifier() runs\n>        -> IPCUnixSocket::readyRead emitted\n>        -> IPCPipeUnixSocket::readyRead() runs\n>    * IPCPipeUnixSocket::readyRead() reads the message and modifies \n> `callData_` accordingly\n>    * EventDispatcherPoll::processEvents() returns\n>    * IPCPipeUnixSocket::call() sees the update in `callData_` and stops\n> \n> And as far as I can tell every signal emission happens synchronously.\n> So what am I missing? Where does it get blocked?\n\nThat is the nominal behavior - no problems have been seen with single \ncamera operation. Hangs are experienced when a single libcamera instance \n(process) controls multiple cameras. In that case all the cameras are \noperated from the same camera manager thread. If the user wants, for \ninstance, to stop all cameras at the same time, a synchronous stop IPC \nmessage for each camera will be sent to its associated IPA. The first IPC \nmessage will busy-wait for the sync message acknowledgement to \nbe received on the socket. 
While looping on the camera0 ack receipt, the \nmessage for the second camera will be sent from the same thread and also \nbusy-wait for its ack.\n\nWe end up with this kind of thread backtrace - the pipeline handler \n(PipelineHandlerNxpNeo) mentioned here is not upstream yet... but it \ngives an idea of the general case.\n\npoll()\nlibcamera::EventDispatcherPoll::processEvents()\nlibcamera::IPCPipeUnixSocket::call()\nlibcamera::IPCPipeUnixSocket::sendSync()\nlibcamera::ipa::nxpneo::IPAProxyNxpNeo::stopIPC()\nlibcamera::ipa::nxpneo::IPAProxyNxpNeo::stop()\nlibcamera::NxpNeoCameraData::stop()\nlibcamera::PipelineHandlerNxpNeo::stop()\nlibcamera::InvokeMessage::invoke()\nlibcamera::Object::message()\nlibcamera::Thread::dispatchMessages()\n### stop request for camera 1, while blocked on camera 0 IPC stop\nlibcamera::EventDispatcherPoll::processEvents()\nlibcamera::IPCPipeUnixSocket::call()\nlibcamera::IPCPipeUnixSocket::sendSync()\nlibcamera::ipa::nxpneo::IPAProxyNxpNeo::stopIPC()\nlibcamera::ipa::nxpneo::IPAProxyNxpNeo::stop()\nlibcamera::NxpNeoCameraData::stop()\nlibcamera::PipelineHandlerNxpNeo::stop()\nlibcamera::InvokeMessage::invoke()\nlibcamera::Object::message()\nlibcamera::Thread::dispatchMessages()\n### stop request for camera 0\nlibcamera::EventDispatcherPoll::processEvents()\nlibcamera::Thread::exec()\nlibcamera::CameraManager::Private::run()\nlibcamera::Thread::startThread()\n\nThe issue is seen with 4 cameras - in that case up to 4 levels of IPC sync \nsending may be nested in the camera thread stack.\n\nIf any of the IPC sockets receives its sync message ack, the camera \nthread will be scheduled on top of the above stack and execute the sequence:\n\n... 
callData is updated\nIPCPipeUnixSocket::readyRead()\nEventDispatcherPoll::processNotifiers()\nIPCUnixSocket::dataNotifier()\n\nIIRC a couple of things are fragile:\n- In IPCPipeUnixSocket::readyRead(), callData is updated and the next time \nthe thread resumes, the IPC sync ack may be detected from the call() \nbusy-wait loop. But that still requires the thread to be resumed by some \nkind of event, which may not occur when the pipeline is stopping - maybe an \nEventDispatcher::interrupt() would help here\n- If the ack comes for camera 0 on its socket, it will not be able to \ncomplete before camera 1's sync message has been acked, as that is the one on \ntop of the camera thread stack. So the camera 1 stack frames need to unwind \nfirst before the camera 0 IPC message can complete. More recent messages \ndelay the completion of the older messages, which are more likely to \ntime out\n- The timeout construct is not robust in multi-camera: camera 0 may \ntime out while camera 1 is being busy-waited. In that case, the camera 0 \ntimeout message will resume the thread in the camera 1 call() function, \nbe ignored, and the wake-up event for camera 0 gets lost\n\nThis is a fairly complex construct, and the behavior is coupled to the \nEventDispatcher implementation.\nThe intent of this change is to propose a simpler construct without camera \ncontexts nesting in the stack: for each sync message, the camera thread \nis blocked on a condition variable waiting for the message ack. A \nsupplementary thread becomes necessary to handle the socket \nreceive path, and eventually unblock the camera thread.\n\n>>\n>> The other issue is that in case of concurrent synchronous IPC messages\n>> sent by multiple pipeline handlers, there is a possible recursion\n>> of sendSync() / processEvents() nested in the camera thread stack. 
As\n>> commented in the source, that is a dangerous construct that can lead to\n>> a hang.\n>> The reason is that the last synchronous message sent is the deepest in\n>> the stack. It is also the one whose acknowledgment is being busy waited.\n>> However other pending synchronous messages may have been initiated before\n>> and are upper in the stack. If they timeout, the condition is not\n>> detected because of the stack recursion, as the thread is busy waiting\n>> for the last message to be acknowledged.\n> \n> Without a large-scale refactor, I see two choices:\n> \n>    (1) dispatch the event loop recursively;\n>    (2) block the event loop;\n> \n> the proposed solution here appears to be (2). But then what is not clear\n> to me is why you're not just running a poll() in a loop on that single\n> socket? Why do we need an extra thread?\n\nThat is correct, proposed solution is (2).\n\nI am not sure what you mean by running a poll() in loop on that single \nsocket.\n\n From libcamera viewpoint, the socket receive path is used for 2 cases:\n1) Pipeline-handler-originated sync message acknowldegement issued by IPA\n2) Asynchronous message issued by IPA\n\nIPCUnixtSocket already handles a poll() on the socket via a combination \nof EventNotifier() + manual poll in IPCUnixSocket::dataNotifier() to \ncheck the payload availability.\n\nInstead of using an additional thread, I presume we could instead change \nthe existing busy-wait loop in IPCPipeUnixSocket::call() by:\n- removing processEvents() call\n- adding the equivalent of IPCUnixSocket::dataNotifier() without \nreadyRead signal emission, followed by IPCPipeUnixSocket::readyRead()\nThen some sort of timeout management would also still be needed.\n\nI am struggling to figure out if that would play smoothly with the IPA \nasynchronous messages.\nIMO blocking the camera thread and rely on a dedicated thread to handle \nthe IPCUnixSocket::dataNotifier() / IPCPipeUnixSocket::readyRead() is \nsimpler and safer. 
The other advantage would be that it keeps the \nIPCUnixSocket implementation unchanged.\n\nThanks,\nJulien\n\n> \n> \n> Regards,\n> Barnabás Pőcze\n> \n>>\n>> This change implements a safer mechanism to handle the synchronous\n>> message sending, similar to the one used for non isolated IPA. The\n>> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n>> receive signal in a dedicated thread.\n>> Doing so, the sending thread, when emiting a synchronous message, can be\n>> blocked without event dispatcher's processEvents() usage, which avoids\n>> the risky stack recursion.\n>>\n>> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix \n>> socket\")\n>>\n>> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n>> ---\n>>   .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n>>   src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n>>   2 files changed, 178 insertions(+), 77 deletions(-)\n>>\n>> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/ \n>> include/libcamera/internal/ipc_pipe_unixsocket.h\n>> index 8c972613..280639d5 100644\n>> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n>> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n>> @@ -16,6 +16,7 @@\n>>   namespace libcamera {\n>>\n>>   class Process;\n>> +class IPCUnixSocketWrapper;\n>>\n>>   class IPCPipeUnixSocket : public IPCPipe\n>>   {\n>> @@ -29,18 +30,8 @@ public:\n>>       int sendAsync(const IPCMessage &data) override;\n>>\n>>   private:\n>> -     struct CallData {\n>> -             IPCUnixSocket::Payload *response;\n>> -             bool done;\n>> -     };\n>> -\n>> -     void readyRead();\n>> -     int call(const IPCUnixSocket::Payload &message,\n>> -              IPCUnixSocket::Payload *response, uint32_t seq);\n>> -\n>>       std::unique_ptr<Process> proc_;\n>> -     std::unique_ptr<IPCUnixSocket> socket_;\n>> -     std::map<uint32_t, CallData> callData_;\n>> +     std::unique_ptr<IPCUnixSocketWrapper> 
socketWrap_;\n>>   };\n>>\n>>   } /* namespace libcamera */\n>> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ \n>> ipc_pipe_unixsocket.cpp\n>> index 668ec73b..eb5408d4 100644\n>> --- a/src/libcamera/ipc_pipe_unixsocket.cpp\n>> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n>> @@ -9,10 +9,9 @@\n>>\n>>   #include <vector>\n>>\n>> -#include <libcamera/base/event_dispatcher.h>\n>>   #include <libcamera/base/log.h>\n>> +#include <libcamera/base/mutex.h>\n>>   #include <libcamera/base/thread.h>\n>> -#include <libcamera/base/timer.h>\n>>\n>>   #include \"libcamera/internal/ipc_pipe.h\"\n>>   #include \"libcamera/internal/ipc_unixsocket.h\"\n>> @@ -24,67 +23,161 @@ namespace libcamera {\n>>\n>>   LOG_DECLARE_CATEGORY(IPCPipe)\n>>\n>> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n>> -                                  const char *ipaProxyWorkerPath)\n>> -     : IPCPipe()\n>> +class IPCUnixSocketWrapper : Thread\n>>   {\n>> -     std::vector<int> fds;\n>> -     std::vector<std::string> args;\n>> -     args.push_back(ipaModulePath);\n>> +public:\n>> +     IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n>> +             : recv_(recv), ready_(false), sendSyncPending_(false),\n>> +               sendSyncCookie_(0)\n>> +     {\n>> +             start();\n>> +     }\n>>\n>> -     socket_ = std::make_unique<IPCUnixSocket>();\n>> -     UniqueFD fd = socket_->create();\n>> -     if (!fd.isValid()) {\n>> -             LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> -             return;\n>> +     ~IPCUnixSocketWrapper()\n>> +     {\n>> +             exit();\n>> +             wait();\n>>       }\n>> -     socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n>> -     args.push_back(std::to_string(fd.get()));\n>> -     fds.push_back(fd.get());\n>>\n>> -     proc_ = std::make_unique<Process>();\n>> -     int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>> -     if (ret) {\n>> -             LOG(IPCPipe, Error)\n>> - 
                    << \"Failed to start proxy worker process\";\n>> -             return;\n>> +     void run() override\n>> +     {\n>> +             /*\n>> +              * IPC socket construction and connection to its readyRead\n>> +              * signal has to be done from the IPC thread so that the\n>> +              * relevant Object instances (EventNotifier, slot) are \n>> bound to\n>> +              * its context.\n>> +              */\n>> +             init();\n>> +             exec();\n>> +             deinit();\n>>       }\n>>\n>> -     connected_ = true;\n>> -}\n>> +     int fd() { return fd_.get(); }\n>> +     int sendSync(const IPCMessage &in, IPCMessage *out);\n>> +     int sendAsync(const IPCMessage &data);\n>> +     bool waitReady();\n>>\n>> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n>> -{\n>> -}\n>> +private:\n>> +     void init();\n>> +     void deinit();\n>> +     void readyRead();\n>>\n>> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n>> +     UniqueFD fd_;\n>> +     Signal<const IPCMessage &> *recv_;\n>> +     ConditionVariable cv_;\n>> +     Mutex mutex_;\n>> +     bool ready_;\n>> +     bool sendSyncPending_;\n>> +     uint32_t sendSyncCookie_;\n>> +     IPCUnixSocket::Payload *sendSyncResponse_;\n>> +\n>> +     /* Socket shall be constructed and destructed from IPC thread \n>> context */\n>> +     std::unique_ptr<IPCUnixSocket> socket_;\n>> +};\n>> +\n>> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage \n>> *out)\n>>   {\n>> +     int ret;\n>>       IPCUnixSocket::Payload response;\n>>\n>> -     int ret = call(in.payload(), &response, in.header().cookie);\n>> +     mutex_.lock();\n>> +     ASSERT(!sendSyncPending_);\n>> +     sendSyncPending_ = true;\n>> +     sendSyncCookie_ = in.header().cookie;\n>> +     sendSyncResponse_ = &response;\n>> +     mutex_.unlock();\n>> +\n>> +     ret = socket_->send(in.payload());\n>>       if (ret) {\n>> -             LOG(IPCPipe, Error) << \"Failed to call 
sync\";\n>> -             return ret;\n>> +             LOG(IPCPipe, Error) << \"Failed to send sync message\";\n>> +             goto cleanup;\n>> +     }\n>> +\n>> +     bool complete;\n>> +     {\n>> +             MutexLocker locker(mutex_);\n>> +             auto syncComplete = ([&]() {\n>> +                     return sendSyncPending_ == false;\n>> +             });\n>> +             complete = cv_.wait_for(locker, 1000ms, syncComplete);\n>> +     }\n>> +\n>> +     if (!complete) {\n>> +             LOG(IPCPipe, Error) << \"Timeout sending sync message\";\n>> +             ret = -ETIMEDOUT;\n>> +             goto cleanup;\n>>       }\n>>\n>>       if (out)\n>>               *out = IPCMessage(response);\n>>\n>>       return 0;\n>> +\n>> +cleanup:\n>> +     mutex_.lock();\n>> +     sendSyncPending_ = false;\n>> +     mutex_.unlock();\n>> +\n>> +     return ret;\n>>   }\n>>\n>> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n>> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n>>   {\n>> -     int ret = socket_->send(data.payload());\n>> -     if (ret) {\n>> -             LOG(IPCPipe, Error) << \"Failed to call async\";\n>> -             return ret;\n>> +     int ret;\n>> +     ret = socket_->send(data.payload());\n>> +     if (ret)\n>> +             LOG(IPCPipe, Error) << \"Failed to send sync message\";\n>> +     return ret;\n>> +}\n>> +\n>> +bool IPCUnixSocketWrapper::waitReady()\n>> +{\n>> +     bool ready;\n>> +     {\n>> +             MutexLocker locker(mutex_);\n>> +             auto isReady = ([&]() {\n>> +                     return ready_;\n>> +             });\n>> +             ready = cv_.wait_for(locker, 1000ms, isReady);\n>>       }\n>>\n>> -     return 0;\n>> +     return ready;\n>> +}\n>> +\n>> +void IPCUnixSocketWrapper::init()\n>> +{\n>> +     /* Init is to be done from the IPC thread context */\n>> +     ASSERT(Thread::current() == this);\n>> +\n>> +     socket_ = std::make_unique<IPCUnixSocket>();\n>> +     fd_ = 
socket_->create();\n>> +     if (!fd_.isValid()) {\n>> +             LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> +             return;\n>> +     }\n>> +\n>> +     socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n>> +\n>> +     mutex_.lock();\n>> +     ready_ = true;\n>> +     mutex_.unlock();\n>> +     cv_.notify_one();\n>>   }\n>>\n>> -void IPCPipeUnixSocket::readyRead()\n>> +void IPCUnixSocketWrapper::deinit()\n>> +{\n>> +     /* Deinit is to be done from the IPC thread context */\n>> +     ASSERT(Thread::current() == this);\n>> +\n>> +     socket_->readyRead.disconnect(this);\n>> +     socket_.reset();\n>> +\n>> +     mutex_.lock();\n>> +     ready_ = false;\n>> +     mutex_.unlock();\n>> +}\n>> +\n>> +void IPCUnixSocketWrapper::readyRead()\n>>   {\n>>       IPCUnixSocket::Payload payload;\n>>       int ret = socket_->receive(&payload);\n>> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n>>               return;\n>>       }\n>>\n>> -     /* \\todo Use span to avoid the double copy when callData is \n>> found. 
*/\n>>       if (payload.data.size() < sizeof(IPCMessage::Header)) {\n>>               LOG(IPCPipe, Error) << \"Not enough data received\";\n>>               return;\n>>       }\n>>\n>> -     IPCMessage ipcMessage(payload);\n>> +     const IPCMessage::Header *header =\n>> +             reinterpret_cast<IPCMessage::Header \n>> *>(payload.data.data());\n>> +     bool syncComplete = false;\n>> +     mutex_.lock();\n>> +     if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n>> +             syncComplete = true;\n>> +             sendSyncPending_ = false;\n>> +             *sendSyncResponse_ = std::move(payload);\n>> +     }\n>> +     mutex_.unlock();\n>>\n>> -     auto callData = callData_.find(ipcMessage.header().cookie);\n>> -     if (callData != callData_.end()) {\n>> -             *callData->second.response = std::move(payload);\n>> -             callData->second.done = true;\n>> +     if (syncComplete) {\n>> +             cv_.notify_one();\n>>               return;\n>>       }\n>>\n>>       /* Received unexpected data, this means it's a call from the \n>> IPA. 
*/\n>> -     recv.emit(ipcMessage);\n>> +     IPCMessage ipcMessage(payload);\n>> +     recv_->emit(ipcMessage);\n>>   }\n>>\n>> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n>> -                         IPCUnixSocket::Payload *response, uint32_t \n>> cookie)\n>> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n>> +                                  const char *ipaProxyWorkerPath)\n>> +     : IPCPipe()\n>>   {\n>> -     Timer timeout;\n>> -     int ret;\n>> +     socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n>> +     if (!socketWrap_->waitReady()) {\n>> +             LOG(IPCPipe, Error) << \"Failed to create socket\";\n>> +             return;\n>> +     }\n>> +     int fd = socketWrap_->fd();\n>>\n>> -     const auto result = callData_.insert({ cookie, { response, \n>> false } });\n>> -     const auto &iter = result.first;\n>> +     std::vector<int> fds;\n>> +     std::vector<std::string> args;\n>> +     args.push_back(ipaModulePath);\n>> +     args.push_back(std::to_string(fd));\n>> +     fds.push_back(fd);\n>>\n>> -     ret = socket_->send(message);\n>> +     proc_ = std::make_unique<Process>();\n>> +     int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n>>       if (ret) {\n>> -             callData_.erase(iter);\n>> -             return ret;\n>> +             LOG(IPCPipe, Error)\n>> +                     << \"Failed to start proxy worker process\";\n>> +             return;\n>>       }\n>>\n>> -     /* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n>> -     timeout.start(2000ms);\n>> -     while (!iter->second.done) {\n>> -             if (!timeout.isRunning()) {\n>> -                     LOG(IPCPipe, Error) << \"Call timeout!\";\n>> -                     callData_.erase(iter);\n>> -                     return -ETIMEDOUT;\n>> -             }\n>> +     connected_ = true;\n>> +}\n>>\n>> -             Thread::current()->eventDispatcher()->processEvents();\n>> -     }\n>> 
+IPCPipeUnixSocket::~IPCPipeUnixSocket()\n>> +{\n>> +}\n>>\n>> -     callData_.erase(iter);\n>> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n>> +{\n>> +     return socketWrap_->sendSync(in, out);\n>> +}\n>>\n>> -     return 0;\n>> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n>> +{\n>> +     return socketWrap_->sendAsync(data);\n>>   }\n>>\n>>   } /* namespace libcamera */\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 307E2BEFBE\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon,  1 Sep 2025 17:12:34 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 530E569330;\n\tMon,  1 Sep 2025 19:12:33 +0200 (CEST)","from GVXPR05CU001.outbound.protection.outlook.com\n\t(mail-swedencentralazlp170130007.outbound.protection.outlook.com\n\t[IPv6:2a01:111:f403:c202::7])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 999AA69324\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon,  1 Sep 2025 19:12:31 +0200 (CEST)","from AM9PR04MB8147.eurprd04.prod.outlook.com\n\t(2603:10a6:20b:3e0::22)\n\tby DU2PR04MB8677.eurprd04.prod.outlook.com (2603:10a6:10:2dc::14)\n\twith Microsoft SMTP Server (version=TLS1_2,\n\tcipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.16;\n\tMon, 1 Sep 2025 17:12:29 +0000","from AM9PR04MB8147.eurprd04.prod.outlook.com\n\t([fe80::eace:e980:28a4:ef8a]) by\n\tAM9PR04MB8147.eurprd04.prod.outlook.com\n\t([fe80::eace:e980:28a4:ef8a%6]) with mapi id 15.20.9073.014;\n\tMon, 1 Sep 2025 17:12:29 +0000"],"Authentication-Results":["lancelot.ideasonboard.com; dkim=pass (2048-bit key;\n\tunprotected) header.d=nxp.com header.i=@nxp.com 
header.b=\"Oi6FE7oo\";\n\tdkim-atps=neutral","dkim=none (message not signed)\n\theader.d=none;dmarc=none action=none header.from=nxp.com;"],"ARC-Seal":"i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none;\n\tb=kNrDeGcZfm/9k9iegV/BovnYQXoyvJq/X6QP4334savM3rSGuvttOr2QTyWv7PUP3O9xnj+1fC0ULXQMjo9dUTR3Wx67+Nod+dGWuk3GjfR/yDKUse7QfOnWqbjz2fnj4fAEOx5UVTUOAjTRk4qZgkXBRZRbLaLezV2cfT3oeBVL6SCQbYl4OgkP5cr8x2rMt4yjIsrxR66cov9fNMG5WJYdxHhABRAGl2ah5+z7v1M+Vs1hA+i+/X/i0jH5KoWzeJirIokQOjiJxb9X1dDvgPo5JsJ324+yysc/5HFMlYmLA4wxbGdkPFuYqs8W2TEL2VzB6dNrfg3Kdjo4r3thZg==","ARC-Message-Signature":"i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;\n\ts=arcselector10001;\n\th=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;\n\tbh=qpsTOXMzLk0+AGwDYgdRf9L1RdRQYG1vCb7muXNtWts=;\n\tb=L6rDAlxeec+nFw6JS/JkIYPRXZbTf8I55ILHNwgmzv2aQvsuAGRfqRfK0x1fGEwcKLrh9HLIyoSc8QyGUYynCh+dYyeKL/zGo7aRTCX0yTZcolUMhmvArv+uJSxqq0UK7sJZzIIU1fcG4N0iNugvGxEU+hXipQJAD19Ak5BvNlsZJDrnHnq7qyV5Q9Pg5N52LZqYZtHtBKYL4k5NuX/wvN/00/m2ZPyg5KNjQB5t+E2aj/UhwAypFStgLqU3laYAgZQ85i5zfbDPEh1FI3Ga8wQYhQDj9HzLUmfD5P6jQExrVtPs3x3sfibOIoUzwut4ynk4NdVO49xw+pejDOnLVA==","ARC-Authentication-Results":"i=1; mx.microsoft.com 1; spf=pass\n\tsmtp.mailfrom=nxp.com; dmarc=pass action=none header.from=nxp.com;\n\tdkim=pass header.d=nxp.com; arc=none","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=nxp.com; 
","Message-ID":"<8add848a-060e-4fbf-a020-633b1473b759@nxp.com>","Date":"Mon, 1 Sep 2025 19:12:27 +0200","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","To":"=?utf-8?q?Barnab=C3=A1s_P=C5=91cze?= <barnabas.pocze@ideasonboard.com>,\n\tlibcamera-devel@lists.libcamera.org","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>\n\t<1b9e7e80-09c1-4598-94dc-74196a92bd39@ideasonboard.com>","Content-Language":"en-US","From":"Julien Vuillaumier <julien.vuillaumier@nxp.com>","Cc":"Paul Elder <paul.elder@ideasonboard.com>","In-Reply-To":"<1b9e7e80-09c1-4598-94dc-74196a92bd39@ideasonboard.com>","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"8bit","MIME-Version":"1.0","Sender":"\"libcamera-devel\" 
<libcamera-devel-bounces@lists.libcamera.org>"}},{"id":35692,"web_url":"https://patchwork.libcamera.org/comment/35692/","msgid":"<175690657206.2636072.17788644204927420566@neptunite.rasen.tech>","date":"2025-09-03T13:36:12","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":17,"url":"https://patchwork.libcamera.org/api/people/17/","name":"Paul Elder","email":"paul.elder@ideasonboard.com"},"content":"Hi Julien,\n\nQuoting Julien Vuillaumier (2025-08-29 18:42:03)\n> Hi Paul,\n> \n> \n> On 26/08/2025 12:28, Paul Elder wrote:\n> > \n> > Hi Julien,\n> > \n> > Thanks for the patch.\n> > \n> > Quoting Julien Vuillaumier (2024-12-20 23:55:56)\n> >> Unix-socket based IPC sometimes times out or hangs, typically\n> >> when multiple camera are stopped simulaneously. That specific case\n> >> triggers the concurrent sending by each pipeline handler instance\n> >> of a synchronous stop() message to its peer IPA process.\n> >>\n> >> There is a dedicated IPC socket per camera. Sockets payload receipt\n> >> signals all run in the camera manager thread.\n> >> To send a synchronous message to IPA, pipeline invokes from camera\n> >> thread IPCPipeUnixSocket::sendSync(). This sends the message then blocks\n> >> busy waiting for the peer acknowledgment. Such busy wait is done by\n> >> blocking on event loop calling dispatchEvent(), until the ack condition\n> >> is detected.\n> >>\n> >> One issue is that the socket receive slot readyRead() wakes up the\n> >> blocked thread via libcamera::Message receipt. Even though such message\n> >> resumes processEvents(), it may reblock immediately because readyRead()\n> >> does not interrupt() explictly the dispatcher.\n> >> Most of the time, an other pending event for the thread unblocks the\n> >> event dispatcher and the ack condition is detected - in worst case the 2\n> >> sec timeout kicks in. 
Once unblocked, the dispatcher let the message\n> >> acknowledgment to be detected and the sendSync() completes.\n> >>\n> >> The other issue is that in case of concurrent synchronous IPC messages\n> >> sent by multiple pipeline handlers, there is a possible recursion\n> >> of sendSync() / processEvents() nested in the camera thread stack. As\n> >> commented in the source, that is a dangerous construct that can lead to\n> >> a hang.\n> >> The reason is that the last synchronous message sent is the deepest in\n> >> the stack. It is also the one whose acknowledgment is being busy waited.\n> >> However other pending synchronous messages may have been initiated before\n> >> and are upper in the stack. If they timeout, the condition is not\n> >> detected because of the stack recursion, as the thread is busy waiting\n> >> for the last message to be acknowledged.\n> > \n> > Ok, I think I understand the issue.\n>  > >>\n> >> This change implements a safer mechanism to handle the synchronous\n> >> message sending, similar to the one used for non isolated IPA. The\n> >> IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket\n> >> receive signal in a dedicated thread.\n> >> Doing so, the sending thread, when emiting a synchronous message, can be\n> >> blocked without event dispatcher's processEvents() usage, which avoids\n> >> the risky stack recursion.\n\nWhile I showed an initially positive response, after some internal discussion we've\nuncovered some issues...\n\nThe main design goal is that we would like to not block threads.\n\nHowever, by definition, synchronous calls block the calling thread. At the\nmoment we have a single thread for all cameras, so blocking on one camera\nblocks all cameras, thus leading to the problem that you have discovered.\n\nSo how can we design this better?\n\nOne option would be to create one thread per camera, but that will be very\ndifficult as all pipeline handlers would have to deal with thread\nsynchronization. 
One thread per pipeline handler *might* be feasible.\n\nAnother option is to run nested event loops. We would need to generalize the\nmechanism that exists today in IPCPipeUnixSocket::call(), and validate all side\neffects and reentrant calls that the pipeline handler might not expect.\n\nAnother option is to remove sync calls and have only async calls. This seems the\neasiest, as the only sync calls in any IPA interface are: init, start, stop,\nconfigure, mapBuffers, and unmapBuffers. Since none of these are in a hot\npath, I don't think it would cause that much of a problem to convert them all\nto async calls. The pipeline handler would become slightly more complex.\n\nThe main challenge I think is that stop() is called from the application, and\nfrom the application point of view this call must be synchronous and blocking.\nThus we would need some sort of asyncio type of mechanism. Perhaps we could use\nstd::future or std::promise, or some other asyncio library for C++?\n\n\nThanks,\n\nPaul\n\n> >>\n> >> Fixes: 21f1b555b (\"libcamera: Add IPCPipe implementation based on unix socket\")\n> >>\n> >> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>\n> >> ---\n> >>   .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-\n> >>   src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----\n> >>   2 files changed, 178 insertions(+), 77 deletions(-)\n> >>\n> >> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> >> index 8c972613..280639d5 100644\n> >> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h\n> >> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h\n> >> @@ -16,6 +16,7 @@\n> >>   namespace libcamera {\n> >>\n> >>   class Process;\n> >> +class IPCUnixSocketWrapper;\n> >>\n> >>   class IPCPipeUnixSocket : public IPCPipe\n> >>   {\n> >> @@ -29,18 +30,8 @@ public:\n> >>          int sendAsync(const IPCMessage &data) override;\n> >>\n> >>   private:\n> >> -       struct 
CallData {\n> >> -               IPCUnixSocket::Payload *response;\n> >> -               bool done;\n> >> -       };\n> >> -\n> >> -       void readyRead();\n> >> -       int call(const IPCUnixSocket::Payload &message,\n> >> -                IPCUnixSocket::Payload *response, uint32_t seq);\n> >> -\n> >>          std::unique_ptr<Process> proc_;\n> >> -       std::unique_ptr<IPCUnixSocket> socket_;\n> >> -       std::map<uint32_t, CallData> callData_;\n> >> +       std::unique_ptr<IPCUnixSocketWrapper> socketWrap_;\n> > \n> > afaict you're essentially deleting the whole IPCPipe class and reimplementing\n> > it. What's wrong with changing it directly instead of adding a wrapper?\n> > \n> \n> The IPCPipeUnixSocket class is used by libcamera to create the proxy \n> worker process and the IPC to communicate with the proxy. The IPC part \n> relies on the IPCUnixSocket class (asynchronous write, no threading) - \n> and the same IPCUnixSocket class is also used by the proxy worker in the \n> auto generated code.\n> \n> The intent of the IPCUnixSocketWrapper was to provide an higher level \n> interface for the IPCUnixSocket class, more specific to the \n> IPCPipeUnixSocket class usage: threading for the receipt path, \n> sync/async transmit, IPCMessage to IPCUnixSocket::Payload conversions...\n> \n> As you mentioned, IPCPipeUnixSocket becomes fairly slim.\n> Now I can look at the option to move the IPCUnixSocketWrapper class \n> content into the  IPCPipeUnixSocket class, if that is a preferred approach.\n> \n> \n> >>   };\n> >>\n> >>   } /* namespace libcamera */\n> >> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp\n> >> index 668ec73b..eb5408d4 100644\n> >> --- a/src/libcamera/ipc_pipe_unixsocket.cpp\n> >> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp\n> >> @@ -9,10 +9,9 @@\n> >>\n> >>   #include <vector>\n> >>\n> >> -#include <libcamera/base/event_dispatcher.h>\n> >>   #include <libcamera/base/log.h>\n> >> +#include 
<libcamera/base/mutex.h>\n> >>   #include <libcamera/base/thread.h>\n> >> -#include <libcamera/base/timer.h>\n> >>\n> >>   #include \"libcamera/internal/ipc_pipe.h\"\n> >>   #include \"libcamera/internal/ipc_unixsocket.h\"\n> >> @@ -24,67 +23,161 @@ namespace libcamera {\n> >>\n> >>   LOG_DECLARE_CATEGORY(IPCPipe)\n> >>\n> >> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> >> -                                    const char *ipaProxyWorkerPath)\n> >> -       : IPCPipe()\n> >> +class IPCUnixSocketWrapper : Thread\n> >>   {\n> >> -       std::vector<int> fds;\n> >> -       std::vector<std::string> args;\n> >> -       args.push_back(ipaModulePath);\n> >> +public:\n> >> +       IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)\n> >> +               : recv_(recv), ready_(false), sendSyncPending_(false),\n> >> +                 sendSyncCookie_(0)\n> >> +       {\n> >> +               start();\n> >> +       }\n> >>\n> >> -       socket_ = std::make_unique<IPCUnixSocket>();\n> >> -       UniqueFD fd = socket_->create();\n> >> -       if (!fd.isValid()) {\n> >> -               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> >> -               return;\n> >> +       ~IPCUnixSocketWrapper()\n> >> +       {\n> >> +               exit();\n> >> +               wait();\n> >>          }\n> >> -       socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);\n> >> -       args.push_back(std::to_string(fd.get()));\n> >> -       fds.push_back(fd.get());\n> >>\n> >> -       proc_ = std::make_unique<Process>();\n> >> -       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n> >> -       if (ret) {\n> >> -               LOG(IPCPipe, Error)\n> >> -                       << \"Failed to start proxy worker process\";\n> >> -               return;\n> >> +       void run() override\n> >> +       {\n> >> +               /*\n> >> +                * IPC socket construction and connection to its readyRead\n> >> +                * signal has to be done 
from the IPC thread so that the\n> >> +                * relevant Object instances (EventNotifier, slot) are bound to\n> >> +                * its context.\n> >> +                */\n> >> +               init();\n> >> +               exec();\n> >> +               deinit();\n> >>          }\n> >>\n> >> -       connected_ = true;\n> >> -}\n> >> +       int fd() { return fd_.get(); }\n> >> +       int sendSync(const IPCMessage &in, IPCMessage *out);\n> >> +       int sendAsync(const IPCMessage &data);\n> >> +       bool waitReady();\n> >>\n> >> -IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> >> -{\n> >> -}\n> >> +private:\n> >> +       void init();\n> >> +       void deinit();\n> >> +       void readyRead();\n> >>\n> >> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> >> +       UniqueFD fd_;\n> >> +       Signal<const IPCMessage &> *recv_;\n> >> +       ConditionVariable cv_;\n> >> +       Mutex mutex_;\n> > \n> > So the mutex protects everything from here...\n> > \n> >> +       bool ready_;\n> >> +       bool sendSyncPending_;\n> >> +       uint32_t sendSyncCookie_;\n> >> +       IPCUnixSocket::Payload *sendSyncResponse_;\n> > \n> > ...to here\n> \n> I am not sure about this comment. 
But the git patch output may be a bit \n> misleading here: that line is only the `Mutex mutex_` variable \n> declaration, as a member of the IPCUnixSocketWrapper class.\n> The variables that `mutex_` actually protects depend on the functions \n> implementation where `mutex_` is used through lock()/unlock() sections \n> and MutexLocker-protected scopes.\n> \n> > \n> >> +\n> >> +       /* Socket shall be constructed and destructed from IPC thread context */\n> >> +       std::unique_ptr<IPCUnixSocket> socket_;\n> >> +};\n> >> +\n> >> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)\n> >>   {\n> >> +       int ret;\n> >>          IPCUnixSocket::Payload response;\n> >>\n> >> -       int ret = call(in.payload(), &response, in.header().cookie);\n> >> +       mutex_.lock();\n> >> +       ASSERT(!sendSyncPending_);\n> >> +       sendSyncPending_ = true;\n> >> +       sendSyncCookie_ = in.header().cookie;\n> >> +       sendSyncResponse_ = &response;\n> >> +       mutex_.unlock();\n> >> +\n> >> +       ret = socket_->send(in.payload());\n> >>          if (ret) {\n> >> -               LOG(IPCPipe, Error) << \"Failed to call sync\";\n> >> -               return ret;\n> >> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> >> +               goto cleanup;\n> >> +       }\n> >> +\n> >> +       bool complete;\n> >> +       {\n> >> +               MutexLocker locker(mutex_);\n> >> +               auto syncComplete = ([&]() {\n> >> +                       return sendSyncPending_ == false;\n> >> +               });\n> >> +               complete = cv_.wait_for(locker, 1000ms, syncComplete);\n> >> +       }\n> > \n> > Ok, I think I like this idea.\n> \n> Thanks :)\n> \n> > \n> > I still don't see why it needs to be in a wrapper class, though.\n> \n> As mentioned above, I can look at the option to remove the wrapper class \n> to keep only the IPCUnixSocketWrapper if desired.\n> \n> > \n> >> +\n> >> +       if (!complete) {\n> >> +      
         LOG(IPCPipe, Error) << \"Timeout sending sync message\";\n> >> +               ret = -ETIMEDOUT;\n> >> +               goto cleanup;\n> >>          }\n> >>\n> >>          if (out)\n> >>                  *out = IPCMessage(response);\n> >>\n> >>          return 0;\n> >> +\n> >> +cleanup:\n> >> +       mutex_.lock();\n> >> +       sendSyncPending_ = false;\n> >> +       mutex_.unlock();\n> >> +\n> >> +       return ret;\n> >>   }\n> >>\n> >> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> >> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)\n> >>   {\n> >> -       int ret = socket_->send(data.payload());\n> >> -       if (ret) {\n> >> -               LOG(IPCPipe, Error) << \"Failed to call async\";\n> >> -               return ret;\n> >> +       int ret;\n> >> +       ret = socket_->send(data.payload());\n> >> +       if (ret)\n> >> +               LOG(IPCPipe, Error) << \"Failed to send sync message\";\n> >> +       return ret;\n> >> +}\n> >> +\n> >> +bool IPCUnixSocketWrapper::waitReady()\n> >> +{\n> >> +       bool ready;\n> >> +       {\n> >> +               MutexLocker locker(mutex_);\n> >> +               auto isReady = ([&]() {\n> >> +                       return ready_;\n> >> +               });\n> >> +               ready = cv_.wait_for(locker, 1000ms, isReady);\n> >>          }\n> >>\n> >> -       return 0;\n> >> +       return ready;\n> >> +}\n> > \n> > I don't understand why this function is needed. It only waits to make sure that\n> > init() is done, but afaict nothing in init() is threaded. Especially if there\n> > is no wrapper. Why can't IPCUnixSocket live directly in another thread?\n> \n> The init() function is executed from the wrapper thread - see the \n> IPCUnixSocketWrapper::run() function that calls in sequence: init(), \n> exec() then deinit().\n> The Object bound to the wrapper thread (EventNotifier, slots...) have to \n> be instantiated and later destroyed from the wrapper thread itself. 
\n> Thus, they can not be created by the wrapper constructor that is \n> executed in the camera manager thread ; their creation has to be \n> deferred to the wrapper thread execution. The waitReady() checks that \n> this thread is up and running which indicates that all objects needed \n> for the class operation have been created.\n> \n> If there is no more wrapper because the threading implementation is \n> moved to the IPCPipeUnixSocket class, the Object instances \n> (EventNotifier, slots...) associated to the new thread will still have \n> to be created from there as well.\n> Similar synchronization mechanism will be necessary in the \n> IPCPipeUnixSocket constructor to wait for that other thread to be active \n> and its associated Objects created.\n> \n> > \n> >> +\n> >> +void IPCUnixSocketWrapper::init()\n> >> +{\n> >> +       /* Init is to be done from the IPC thread context */\n> >> +       ASSERT(Thread::current() == this);\n> >> +\n> >> +       socket_ = std::make_unique<IPCUnixSocket>();\n> >> +       fd_ = socket_->create();\n> >> +       if (!fd_.isValid()) {\n> >> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> >> +               return;\n> >> +       }\n> >> +\n> >> +       socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);\n> >> +\n> >> +       mutex_.lock();\n> >> +       ready_ = true;\n> >> +       mutex_.unlock();\n> >> +       cv_.notify_one();\n> >>   }\n> >>\n> >> -void IPCPipeUnixSocket::readyRead()\n> >> +void IPCUnixSocketWrapper::deinit()\n> >> +{\n> >> +       /* Deinit is to be done from the IPC thread context */\n> >> +       ASSERT(Thread::current() == this);\n> >> +\n> >> +       socket_->readyRead.disconnect(this);\n> >> +       socket_.reset();\n> >> +\n> >> +       mutex_.lock();\n> >> +       ready_ = false;\n> >> +       mutex_.unlock();\n> >> +}\n> >> +\n> >> +void IPCUnixSocketWrapper::readyRead()\n> >>   {\n> >>          IPCUnixSocket::Payload payload;\n> >>          int ret = 
socket_->receive(&payload);\n> >> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()\n> >>                  return;\n> >>          }\n> >>\n> >> -       /* \\todo Use span to avoid the double copy when callData is found. */\n> > \n> > Oh nice, this is taken care of.\n> > \n> >>          if (payload.data.size() < sizeof(IPCMessage::Header)) {\n> >>                  LOG(IPCPipe, Error) << \"Not enough data received\";\n> >>                  return;\n> >>          }\n> >>\n> >> -       IPCMessage ipcMessage(payload);\n> >> +       const IPCMessage::Header *header =\n> >> +               reinterpret_cast<IPCMessage::Header *>(payload.data.data());\n> >> +       bool syncComplete = false;\n> >> +       mutex_.lock();\n> >> +       if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {\n> >> +               syncComplete = true;\n> >> +               sendSyncPending_ = false;\n> >> +               *sendSyncResponse_ = std::move(payload);\n> >> +       }\n> >> +       mutex_.unlock();\n> > \n> > That's a cool idea.\n> > \n> >>\n> >> -       auto callData = callData_.find(ipcMessage.header().cookie);\n> >> -       if (callData != callData_.end()) {\n> >> -               *callData->second.response = std::move(payload);\n> >> -               callData->second.done = true;\n> >> +       if (syncComplete) {\n> >> +               cv_.notify_one();\n> >>                  return;\n> >>          }\n> >>\n> >>          /* Received unexpected data, this means it's a call from the IPA. 
*/\n> >> -       recv.emit(ipcMessage);\n> >> +       IPCMessage ipcMessage(payload);\n> >> +       recv_->emit(ipcMessage);\n> >>   }\n> >>\n> >> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,\n> >> -                           IPCUnixSocket::Payload *response, uint32_t cookie)\n> >> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,\n> >> +                                    const char *ipaProxyWorkerPath)\n> >> +       : IPCPipe()\n> >>   {\n> >> -       Timer timeout;\n> >> -       int ret;\n> >> +       socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);\n> >> +       if (!socketWrap_->waitReady()) {\n> >> +               LOG(IPCPipe, Error) << \"Failed to create socket\";\n> >> +               return;\n> >> +       }\n> >> +       int fd = socketWrap_->fd();\n> >>\n> >> -       const auto result = callData_.insert({ cookie, { response, false } });\n> >> -       const auto &iter = result.first;\n> >> +       std::vector<int> fds;\n> >> +       std::vector<std::string> args;\n> >> +       args.push_back(ipaModulePath);\n> >> +       args.push_back(std::to_string(fd));\n> >> +       fds.push_back(fd);\n> > \n> > I'd replace this with:\n> > \n> >          std::array args{ std::string(ipaModulePath), std::to_string(fd.get()) };\n> >          std::array fds{ fd.get() };\n> > \n> > (Since this is how the base code was changed since you when wrote this (sorry I\n> > didn't notice this patch earlier (I only noticed because Kieran cc'ed me)))\n> \n> Sure, I noticed that part had changed in libcamera v0.5.2.\n> I will update in the V2.\n> \n> Thanks,\n> Julien\n> \n> \n> > \n> > Thanks,\n> > \n> > Paul\n> > \n> >>\n> >> -       ret = socket_->send(message);\n> >> +       proc_ = std::make_unique<Process>();\n> >> +       int ret = proc_->start(ipaProxyWorkerPath, args, fds);\n> >>          if (ret) {\n> >> -               callData_.erase(iter);\n> >> -               return ret;\n> >> +               LOG(IPCPipe, Error)\n> >> +  
                     << \"Failed to start proxy worker process\";\n> >> +               return;\n> >>          }\n> >>\n> >> -       /* \\todo Make this less dangerous, see IPCPipe::sendSync() */\n> >> -       timeout.start(2000ms);\n> >> -       while (!iter->second.done) {\n> >> -               if (!timeout.isRunning()) {\n> >> -                       LOG(IPCPipe, Error) << \"Call timeout!\";\n> >> -                       callData_.erase(iter);\n> >> -                       return -ETIMEDOUT;\n> >> -               }\n> >> +       connected_ = true;\n> >> +}\n> >>\n> >> -               Thread::current()->eventDispatcher()->processEvents();\n> >> -       }\n> >> +IPCPipeUnixSocket::~IPCPipeUnixSocket()\n> >> +{\n> >> +}\n> >>\n> >> -       callData_.erase(iter);\n> >> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)\n> >> +{\n> >> +       return socketWrap_->sendSync(in, out);\n> >> +}\n> >>\n> >> -       return 0;\n> >> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)\n> >> +{\n> >> +       return socketWrap_->sendAsync(data);\n> >>   }\n> >>\n> >>   } /* namespace libcamera */\n> >> --\n> >> 2.34.1\n> >>\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 8984DBDB13\n\tfor <parsemail@patchwork.libcamera.org>;\n\tWed,  3 Sep 2025 13:36:22 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 4C39C69334;\n\tWed,  3 Sep 2025 15:36:21 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 8AB8E69332\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tWed,  3 Sep 2025 15:36:18 +0200 (CEST)","from 
neptunite.rasen.tech (unknown\n\t[IPv6:2404:7a81:160:2100:30ba:ee2f:ad96:6665])\n\tby perceval.ideasonboard.com (Postfix) with UTF8SMTPSA id 0E55F7E9;\n\tWed,  3 Sep 2025 15:35:08 +0200 (CEST)"],"Content-Type":"text/plain; charset=\"utf-8\"","MIME-Version":"1.0","Content-Transfer-Encoding":"quoted-printable","In-Reply-To":"<92c2b0ad-8d62-42af-9e51-36fa61ff81cb@nxp.com>","References":"<20241220145556.3011657-1-julien.vuillaumier@nxp.com>\n\t<175620412034.607151.628561646126579934@neptunite.rasen.tech>\n\t<92c2b0ad-8d62-42af-9e51-36fa61ff81cb@nxp.com>","Subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","From":"Paul Elder <paul.elder@ideasonboard.com>","To":"Julien Vuillaumier <julien.vuillaumier@nxp.com>,\n\tlibcamera-devel@lists.libcamera.org","Date":"Wed, 03 Sep 2025 22:36:12 
+0900","Message-ID":"<175690657206.2636072.17788644204927420566@neptunite.rasen.tech>","User-Agent":"alot/0.0.0","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":35693,"web_url":"https://patchwork.libcamera.org/comment/35693/","msgid":"<f742b41d-7dc0-4b9f-876f-cb566b380819@nxp.com>","date":"2025-09-03T17:42:29","subject":"Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and\n\thang","submitter":{"id":190,"url":"https://patchwork.libcamera.org/api/people/190/","name":"Julien Vuillaumier","email":"julien.vuillaumier@nxp.com"},"content":"Hi Paul,\n\nOn 03/09/2025 15:36, Paul Elder wrote:\n> Hi Julien,\n> \n> Quoting Julien Vuillaumier (2025-08-29 18:42:03)\n>> Hi Paul,\n>>\n>>\n>> On 26/08/2025 12:28, Paul Elder wrote:\n>>>\n>>> Hi Julien,\n>>>\n>>> Thanks for the patch.\n>>>\n>>> Quoting Julien Vuillaumier (2024-12-20 23:55:56)\n>>>> Unix-socket based IPC sometimes times out or hangs, typically\n>>>> when multiple camera are stopped simulaneously. That specific case\n>>>> triggers the concurrent sending by each pipeline handler instance\n>>>> of a synchronous stop() message to its peer IPA process.\n>>>>\n>>>> There is a dedicated IPC socket per camera. 
Sockets payload receipt
>>>> signals all run in the camera manager thread.
>>>> To send a synchronous message to the IPA, the pipeline handler invokes
>>>> IPCPipeUnixSocket::sendSync() from the camera thread. This sends the
>>>> message, then blocks busy-waiting for the peer's acknowledgment. The
>>>> busy wait is done by blocking on the event loop, calling processEvents()
>>>> until the ack condition is detected.
>>>>
>>>> One issue is that the socket receive slot readyRead() wakes up the
>>>> blocked thread via libcamera::Message receipt. Even though such a
>>>> message resumes processEvents(), it may reblock immediately because
>>>> readyRead() does not explicitly interrupt() the dispatcher.
>>>> Most of the time, another pending event for the thread unblocks the
>>>> event dispatcher and the ack condition is detected - in the worst case
>>>> the 2 sec timeout kicks in. Once unblocked, the dispatcher lets the
>>>> message acknowledgment be detected and sendSync() completes.
>>>>
>>>> The other issue is that, in case of concurrent synchronous IPC messages
>>>> sent by multiple pipeline handlers, there is a possible recursion
>>>> of sendSync() / processEvents() nested in the camera thread stack. As
>>>> commented in the source, that is a dangerous construct that can lead to
>>>> a hang.
>>>> The reason is that the last synchronous message sent is the deepest in
>>>> the stack. It is also the one whose acknowledgment is being busy-waited
>>>> on. However, other pending synchronous messages may have been initiated
>>>> before and sit higher in the stack. If they time out, the condition is
>>>> not detected because of the stack recursion, as the thread is busy
>>>> waiting for the last message to be acknowledged.
>>>
>>> Ok, I think I understand the issue.
>>>
>>>> This change implements a safer mechanism to handle synchronous
>>>> message sending, similar to the one used for the non-isolated IPA.
>>>> The IPCUnixSocketWrapper class is introduced to handle the IPCUnixSocket
>>>> receive signal in a dedicated thread.
>>>> Doing so, the sending thread, when emitting a synchronous message, can
>>>> be blocked without using the event dispatcher's processEvents(), which
>>>> avoids the risky stack recursion.
>
> While I showed an initial positive response, after some internal
> discussion we've uncovered some issues...
>
> The main design goal is that we would like to not block threads.
>
> However, by definition, synchronous calls block the calling thread. At
> the moment we have a single thread for all cameras, so blocking on one
> camera blocks all cameras, thus leading to the problem that you have
> discovered.

IPC synchronous calls target short functions - the ones involving IPA
computations like process(), prepare() etc. are asynchronous. IMO briefly
blocking the camera manager thread (I did not measure the duration, but
would expect tens of microseconds?) is not the main issue. The main issue
is rather the nesting of EventDispatcher::processEvents() calls, which is
a fragile construct in its current form.

When the IPA is running in a thread (no isolation, so no IPC), the camera
manager thread is also blocked on a condition variable during the
synchronous call from the pipeline handler to the IPA thread.

On a stop() command issued by the application thread, the camera manager
thread call stack is something like this:

libcamera::ConditionVariable::wait<>()
libcamera::Semaphore::acquire()
libcamera::BoundMethodBase::activatePack()
libcamera::BoundMethodMember<>::activate()
libcamera::Object::invokeMethod<>(func=IPAProxyNxpNeo::ThreadProxy::stop(),
type=ConnectionTypeBlocking)
libcamera::ipa::nxpneo::IPAProxyNxpNeo::stopThread()
libcamera::ipa::nxpneo::IPAProxyNxpNeo::stop()
libcamera::NxpNeoCameraData::stopDevice()
libcamera::PipelineHandlerNxpNeo::stopDevice()
libcamera::PipelineHandler::stop()
libcamera::BoundMethodMember<>::invoke()
libcamera::BoundMethodArgs()
libcamera::InvokeMessage::invoke()
libcamera::Object::message()
libcamera::Thread::dispatchMessages()
libcamera::EventDispatcherPoll::processEvents()
libcamera::Thread::exec()
libcamera::CameraManager::Private::run()
libcamera::Thread::startThread()

The semaphore is acquired by the camera thread in the
BoundMethodBase::activatePack() implementation, and released in
Object::message() from the IPA thread after the remote function has
executed. During that time the camera thread is blocked, there is no
EventDispatcher::processEvents() loop, and that does not cause problems
even in the multi-camera case.

For that reason, implementing the same blocking scheme for IPC seemed to
be a reasonable option.

>
> So how can we design this better?
>
> One option would be to create one thread per camera, but that will be
> very difficult as all pipeline handlers would have to deal with thread
> synchronization. One thread per pipeline handler *might* be feasible.

Using one thread per pipeline handler - if doable - would not help: a
single pipeline handler instance may handle multiple cameras when they
share media device resources.
So we could still have multiple cameras operated from the same thread.

> Another option is to run nested event loops. We would need to generalize
> the mechanism that exists today in IPCPipeUnixSocket::call(), and
> validate all side effects and reentrant calls that the pipeline handler
> might not expect.

It is an option, but probably a complex task: it is an intricate
construct and requires a very fine understanding of the framework's
implementation of threading, the event loop, messaging, timer management
and so on.

> Another option is to remove sync calls and have only async calls. This
> seems the easiest, as the only sync calls in any IPA interface are: init,
> start, stop, configure, mapBuffers, and unmapBuffers. Since none of these
> are in a hot path, I don't think it would cause that much of a problem to
> convert them all to async calls. The pipeline handler would become
> slightly more complex.
>
> The main challenge I think is that stop() is called from the application,
> and from the application's point of view this call must be synchronous
> and blocking. Thus we would need some sort of asyncio type of mechanism.
> Perhaps we could use std::future or std::promise, or some other asyncio
> library for C++?

Moving all synchronous messages to asynchronous would indeed be an
option. A simple implementation could be for all pipeline handlers to
split their init(), configure(), start() and similar operations into
multiple chunks that wait for an asynchronous IPA confirmation message
before proceeding, where synchronization is currently done via a sync
message.
That is doable, but the cost may be high as it impacts all pipeline
handlers.
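[Editor's illustration] The std::promise / std::future idea discussed above
could be sketched roughly as follows. This is a hypothetical stand-alone
model, not libcamera code: AsyncIpa, stopAsync() and blockingStop() are
invented names standing in for an IPA proxy whose stop() message has become
asynchronous, while the application-facing call stays blocking.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>

using namespace std::chrono_literals;

/*
 * Stand-in for an IPA proxy whose stop() is asynchronous: the call
 * returns immediately, and the completion handler later runs from
 * another thread, much like an IPC acknowledgment would.
 */
class AsyncIpa
{
public:
	void stopAsync(std::function<void()> onStopped)
	{
		worker_ = std::thread([onStopped] {
			std::this_thread::sleep_for(10ms); /* pretend to stop */
			onStopped();
		});
	}

	~AsyncIpa()
	{
		if (worker_.joinable())
			worker_.join();
	}

private:
	std::thread worker_;
};

/*
 * Blocking stop() as seen by the application: send the asynchronous
 * message, then park on the future with a timeout instead of
 * re-entering processEvents(), so no nested event loop is created.
 */
int blockingStop(AsyncIpa &ipa)
{
	/* shared_ptr keeps the promise alive even if we time out first */
	auto stopped = std::make_shared<std::promise<void>>();
	std::future<void> done = stopped->get_future();

	ipa.stopAsync([stopped] { stopped->set_value(); });

	if (done.wait_for(1s) != std::future_status::ready)
		return -1; /* would map to -ETIMEDOUT */

	return 0;
}
```

The same shape would apply to init() or configure(): the caller still
appears synchronous to the application, but the wait is a plain future /
condition-variable wait rather than a recursive event-loop dispatch.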
Also it may negatively impact their readability.

IMO synchronizing the IPCPipeUnixSocket sync message from an IPCUnixSocket
thread is a simpler option that limits the scope of the changes. The
drawback is that it blocks the camera thread for some (short) time, but
that is no more constraining for the camera thread than what we already do
for the threaded IPA case, so I believe (hope?) that is acceptable.

Thanks,
Julien

>
> Thanks,
>
> Paul
>
>>>>
>>>> Fixes: 21f1b555b ("libcamera: Add IPCPipe implementation based on unix socket")
>>>>
>>>> Signed-off-by: Julien Vuillaumier <julien.vuillaumier@nxp.com>
>>>> ---
>>>>    .../libcamera/internal/ipc_pipe_unixsocket.h  |  13 +-
>>>>    src/libcamera/ipc_pipe_unixsocket.cpp         | 242 +++++++++++++-----
>>>>    2 files changed, 178 insertions(+), 77 deletions(-)
>>>>
>>>> diff --git a/include/libcamera/internal/ipc_pipe_unixsocket.h b/include/libcamera/internal/ipc_pipe_unixsocket.h
>>>> index 8c972613..280639d5 100644
>>>> --- a/include/libcamera/internal/ipc_pipe_unixsocket.h
>>>> +++ b/include/libcamera/internal/ipc_pipe_unixsocket.h
>>>> @@ -16,6 +16,7 @@
>>>>    namespace libcamera {
>>>>
>>>>    class Process;
>>>> +class IPCUnixSocketWrapper;
>>>>
>>>>    class IPCPipeUnixSocket : public IPCPipe
>>>>    {
>>>> @@ -29,18 +30,8 @@ public:
>>>>           int sendAsync(const IPCMessage &data) override;
>>>>
>>>>    private:
>>>> -       struct CallData {
>>>> -               IPCUnixSocket::Payload *response;
>>>> -               bool done;
>>>> -       };
>>>> -
>>>> -       void readyRead();
>>>> -       int call(const IPCUnixSocket::Payload &message,
>>>> -                IPCUnixSocket::Payload *response, uint32_t seq);
>>>> -
>>>>           std::unique_ptr<Process> proc_;
>>>> -       std::unique_ptr<IPCUnixSocket> socket_;
>>>> -       std::map<uint32_t, CallData> callData_;
>>>> +       std::unique_ptr<IPCUnixSocketWrapper> socketWrap_;
>>>
>>> afaict you're essentially deleting the whole IPCPipe class and
>>> reimplementing it. What's wrong with changing it directly instead of
>>> adding a wrapper?
>>>
>>
>> The IPCPipeUnixSocket class is used by libcamera to create the proxy
>> worker process and the IPC to communicate with the proxy. The IPC part
>> relies on the IPCUnixSocket class (asynchronous write, no threading) -
>> and the same IPCUnixSocket class is also used by the proxy worker in the
>> auto-generated code.
>>
>> The intent of the IPCUnixSocketWrapper was to provide a higher-level
>> interface for the IPCUnixSocket class, more specific to the
>> IPCPipeUnixSocket class usage: threading for the receipt path,
>> sync/async transmit, IPCMessage to IPCUnixSocket::Payload conversions...
>>
>> As you mentioned, IPCPipeUnixSocket becomes fairly slim.
>> Now I can look at the option to move the IPCUnixSocketWrapper class
>> content into the IPCPipeUnixSocket class, if that is the preferred
>> approach.
>>
>>
>>>>    };
>>>>
>>>>    } /* namespace libcamera */
>>>> diff --git a/src/libcamera/ipc_pipe_unixsocket.cpp b/src/libcamera/ipc_pipe_unixsocket.cpp
>>>> index 668ec73b..eb5408d4 100644
>>>> --- a/src/libcamera/ipc_pipe_unixsocket.cpp
>>>> +++ b/src/libcamera/ipc_pipe_unixsocket.cpp
>>>> @@ -9,10 +9,9 @@
>>>>
>>>>    #include <vector>
>>>>
>>>> -#include <libcamera/base/event_dispatcher.h>
>>>>    #include <libcamera/base/log.h>
>>>> +#include <libcamera/base/mutex.h>
>>>>    #include <libcamera/base/thread.h>
>>>> -#include <libcamera/base/timer.h>
>>>>
>>>>    #include "libcamera/internal/ipc_pipe.h"
>>>>    #include "libcamera/internal/ipc_unixsocket.h"
>>>> @@ -24,67 +23,161 @@ namespace libcamera {
>>>>
>>>>    LOG_DECLARE_CATEGORY(IPCPipe)
>>>>
>>>> -IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,
>>>> -                                    const char *ipaProxyWorkerPath)
>>>> -       : IPCPipe()
>>>> +class IPCUnixSocketWrapper : Thread
>>>>    {
>>>> -       std::vector<int> fds;
>>>> -       std::vector<std::string> args;
>>>> -       args.push_back(ipaModulePath);
>>>> +public:
>>>> +       IPCUnixSocketWrapper(Signal<const IPCMessage &> *recv)
>>>> +               : recv_(recv), ready_(false), sendSyncPending_(false),
>>>> +                 sendSyncCookie_(0)
>>>> +       {
>>>> +               start();
>>>> +       }
>>>>
>>>> -       socket_ = std::make_unique<IPCUnixSocket>();
>>>> -       UniqueFD fd = socket_->create();
>>>> -       if (!fd.isValid()) {
>>>> -               LOG(IPCPipe, Error) << "Failed to create socket";
>>>> -               return;
>>>> +       ~IPCUnixSocketWrapper()
>>>> +       {
>>>> +               exit();
>>>> +               wait();
>>>>           }
>>>> -       socket_->readyRead.connect(this, &IPCPipeUnixSocket::readyRead);
>>>> -       args.push_back(std::to_string(fd.get()));
>>>> -       fds.push_back(fd.get());
>>>>
>>>> -       proc_ = std::make_unique<Process>();
>>>> -       int ret = proc_->start(ipaProxyWorkerPath, args, fds);
>>>> -       if (ret) {
>>>> -               LOG(IPCPipe, Error)
>>>> -                       << "Failed to start proxy worker process";
>>>> -               return;
>>>> +       void run() override
>>>> +       {
>>>> +               /*
>>>> +                * IPC socket construction and connection to its readyRead
>>>> +                * signal have to be done from the IPC thread so that the
>>>> +                * relevant Object instances (EventNotifier, slot) are bound to
>>>> +                * its context.
>>>> +                */
>>>> +               init();
>>>> +               exec();
>>>> +               deinit();
>>>>           }
>>>>
>>>> -       connected_ = true;
>>>> -}
>>>> +       int fd() { return fd_.get(); }
>>>> +       int sendSync(const IPCMessage &in, IPCMessage *out);
>>>> +       int sendAsync(const IPCMessage &data);
>>>> +       bool waitReady();
>>>>
>>>> -IPCPipeUnixSocket::~IPCPipeUnixSocket()
>>>> -{
>>>> -}
>>>> +private:
>>>> +       void init();
>>>> +       void deinit();
>>>> +       void readyRead();
>>>>
>>>> -int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)
>>>> +       UniqueFD fd_;
>>>> +       Signal<const IPCMessage &> *recv_;
>>>> +       ConditionVariable cv_;
>>>> +       Mutex mutex_;
>>>
>>> So the mutex protects everything from here...
>>>
>>>> +       bool ready_;
>>>> +       bool sendSyncPending_;
>>>> +       uint32_t sendSyncCookie_;
>>>> +       IPCUnixSocket::Payload *sendSyncResponse_;
>>>
>>> ...to here
>>
>> I am not sure about this comment. But the git patch output may be a bit
>> misleading here: that line is only the `Mutex mutex_` variable
>> declaration, as a member of the IPCUnixSocketWrapper class.
>> The variables that `mutex_` actually protects depend on the function
>> implementations where `mutex_` is used, through lock()/unlock() sections
>> and MutexLocker-protected scopes.
>>
>>>
>>>> +
>>>> +       /* Socket shall be constructed and destructed from IPC thread context */
>>>> +       std::unique_ptr<IPCUnixSocket> socket_;
>>>> +};
>>>> +
>>>> +int IPCUnixSocketWrapper::sendSync(const IPCMessage &in, IPCMessage *out)
>>>>    {
>>>> +       int ret;
>>>>           IPCUnixSocket::Payload response;
>>>>
>>>> -       int ret = call(in.payload(), &response, in.header().cookie);
>>>> +       mutex_.lock();
>>>> +       ASSERT(!sendSyncPending_);
>>>> +       sendSyncPending_ = true;
>>>> +       sendSyncCookie_ = in.header().cookie;
>>>> +       sendSyncResponse_ = &response;
>>>> +       mutex_.unlock();
>>>> +
>>>> +       ret = socket_->send(in.payload());
>>>>           if (ret) {
>>>> -               LOG(IPCPipe, Error) << "Failed to call sync";
>>>> -               return ret;
>>>> +               LOG(IPCPipe, Error) << "Failed to send sync message";
>>>> +               goto cleanup;
>>>> +       }
>>>> +
>>>> +       bool complete;
>>>> +       {
>>>> +               MutexLocker locker(mutex_);
>>>> +               auto syncComplete = ([&]() {
>>>> +                       return sendSyncPending_ == false;
>>>> +               });
>>>> +               complete = cv_.wait_for(locker, 1000ms, syncComplete);
>>>> +       }
>>>
>>> Ok, I think I like this idea.
>>
>> Thanks :)
>>
>>>
>>> I still don't see why it needs to be in a wrapper class, though.
>>
>> As mentioned above, I can look at the option to remove the wrapper class
>> and keep only the IPCPipeUnixSocket class if desired.
>>
>>>
>>>> +
>>>> +       if (!complete) {
>>>> +               LOG(IPCPipe, Error) << "Timeout sending sync message";
>>>> +               ret = -ETIMEDOUT;
>>>> +               goto cleanup;
>>>>           }
>>>>
>>>>           if (out)
>>>>                   *out = IPCMessage(response);
>>>>
>>>>           return 0;
>>>> +
>>>> +cleanup:
>>>> +       mutex_.lock();
>>>> +       sendSyncPending_ = false;
>>>> +       mutex_.unlock();
>>>> +
>>>> +       return ret;
>>>>    }
>>>>
>>>> -int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)
>>>> +int IPCUnixSocketWrapper::sendAsync(const IPCMessage &data)
>>>>    {
>>>> -       int ret = socket_->send(data.payload());
>>>> -       if (ret) {
>>>> -               LOG(IPCPipe, Error) << "Failed to call async";
>>>> -               return ret;
>>>> +       int ret;
>>>> +       ret = socket_->send(data.payload());
>>>> +       if (ret)
>>>> +               LOG(IPCPipe, Error) << "Failed to send async message";
>>>> +       return ret;
>>>> +}
>>>> +
>>>> +bool IPCUnixSocketWrapper::waitReady()
>>>> +{
>>>> +       bool ready;
>>>> +       {
>>>> +               MutexLocker locker(mutex_);
>>>> +               auto isReady = ([&]() {
>>>> +                       return ready_;
>>>> +               });
>>>> +               ready = cv_.wait_for(locker, 1000ms, isReady);
>>>>           }
>>>>
>>>> -       return 0;
>>>> +       return ready;
>>>> +}
>>>
>>> I don't understand why this function is needed. It only waits to make
>>> sure that init() is done, but afaict nothing in init() is threaded.
>>> Especially if there is no wrapper. Why can't IPCUnixSocket live directly
>>> in another thread?
>>
>> The init() function is executed from the wrapper thread - see the
>> IPCUnixSocketWrapper::run() function that calls in sequence: init(),
>> exec() then deinit().
>> The Objects bound to the wrapper thread (EventNotifier, slots...) have to
>> be instantiated and later destroyed from the wrapper thread itself.
>> Thus, they cannot be created by the wrapper constructor that is
>> executed in the camera manager thread; their creation has to be
>> deferred to the wrapper thread execution. The waitReady() check ensures
>> that this thread is up and running, which indicates that all objects
>> needed for the class operation have been created.
>>
>> If there is no more wrapper because the threading implementation is
>> moved to the IPCPipeUnixSocket class, the Object instances
>> (EventNotifier, slots...) associated with the new thread will still have
>> to be created from there as well.
>> A similar synchronization mechanism will be necessary in the
>> IPCPipeUnixSocket constructor to wait for that other thread to be active
>> and its associated Objects created.
>>
>>>
>>>> +
>>>> +void IPCUnixSocketWrapper::init()
>>>> +{
>>>> +       /* Init is to be done from the IPC thread context */
>>>> +       ASSERT(Thread::current() == this);
>>>> +
>>>> +       socket_ = std::make_unique<IPCUnixSocket>();
>>>> +       fd_ = socket_->create();
>>>> +       if (!fd_.isValid()) {
>>>> +               LOG(IPCPipe, Error) << "Failed to create socket";
>>>> +               return;
>>>> +       }
>>>> +
>>>> +       socket_->readyRead.connect(this, &IPCUnixSocketWrapper::readyRead);
>>>> +
>>>> +       mutex_.lock();
>>>> +       ready_ = true;
>>>> +       mutex_.unlock();
>>>> +       cv_.notify_one();
>>>>    }
>>>>
>>>> -void IPCPipeUnixSocket::readyRead()
>>>> +void IPCUnixSocketWrapper::deinit()
>>>> +{
>>>> +       /* Deinit is to be done from the IPC thread context */
>>>> +       ASSERT(Thread::current() == this);
>>>> +
>>>> +       socket_->readyRead.disconnect(this);
>>>> +       socket_.reset();
>>>> +
>>>> +       mutex_.lock();
>>>> +       ready_ = false;
>>>> +       mutex_.unlock();
>>>> +}
>>>> +
>>>> +void IPCUnixSocketWrapper::readyRead()
>>>>    {
>>>>           IPCUnixSocket::Payload payload;
>>>>           int ret = socket_->receive(&payload);
>>>> @@ -93,55 +186,72 @@ void IPCPipeUnixSocket::readyRead()
>>>>                   return;
>>>>           }
>>>>
>>>> -       /* \todo Use span to avoid the double copy when callData is found. */
>>>
>>> Oh nice, this is taken care of.
>>>
>>>>           if (payload.data.size() < sizeof(IPCMessage::Header)) {
>>>>                   LOG(IPCPipe, Error) << "Not enough data received";
>>>>                   return;
>>>>           }
>>>>
>>>> -       IPCMessage ipcMessage(payload);
>>>> +       const IPCMessage::Header *header =
>>>> +               reinterpret_cast<IPCMessage::Header *>(payload.data.data());
>>>> +       bool syncComplete = false;
>>>> +       mutex_.lock();
>>>> +       if (sendSyncPending_ && sendSyncCookie_ == header->cookie) {
>>>> +               syncComplete = true;
>>>> +               sendSyncPending_ = false;
>>>> +               *sendSyncResponse_ = std::move(payload);
>>>> +       }
>>>> +       mutex_.unlock();
>>>
>>> That's a cool idea.
>>>
>>>>
>>>> -       auto callData = callData_.find(ipcMessage.header().cookie);
>>>> -       if (callData != callData_.end()) {
>>>> -               *callData->second.response = std::move(payload);
>>>> -               callData->second.done = true;
>>>> +       if (syncComplete) {
>>>> +               cv_.notify_one();
>>>>                   return;
>>>>           }
>>>>
>>>>           /* Received unexpected data, this means it's a call from the IPA. */
>>>> -       recv.emit(ipcMessage);
>>>> +       IPCMessage ipcMessage(payload);
>>>> +       recv_->emit(ipcMessage);
>>>>    }
>>>>
>>>> -int IPCPipeUnixSocket::call(const IPCUnixSocket::Payload &message,
>>>> -                           IPCUnixSocket::Payload *response, uint32_t cookie)
>>>> +IPCPipeUnixSocket::IPCPipeUnixSocket(const char *ipaModulePath,
>>>> +                                    const char *ipaProxyWorkerPath)
>>>> +       : IPCPipe()
>>>>    {
>>>> -       Timer timeout;
>>>> -       int ret;
>>>> +       socketWrap_ = std::make_unique<IPCUnixSocketWrapper>(&recv);
>>>> +       if (!socketWrap_->waitReady()) {
>>>> +               LOG(IPCPipe, Error) << "Failed to create socket";
>>>> +               return;
>>>> +       }
>>>> +       int fd = socketWrap_->fd();
>>>>
>>>> -       const auto result = callData_.insert({ cookie, { response, false } });
>>>> -       const auto &iter = result.first;
>>>> +       std::vector<int> fds;
>>>> +       std::vector<std::string> args;
>>>> +       args.push_back(ipaModulePath);
>>>> +       args.push_back(std::to_string(fd));
>>>> +       fds.push_back(fd);
>>>
>>> I'd replace this with:
>>>
>>>           std::array args{ std::string(ipaModulePath), std::to_string(fd.get()) };
>>>           std::array fds{ fd.get() };
>>>
>>> (Since this is how the base code was changed since you wrote this (sorry I
>>> didn't notice this patch earlier (I only noticed because Kieran cc'ed me)))
>>
>> Sure, I noticed that part had changed in libcamera v0.5.2.
>> I will update it in the V2.
>>
>> Thanks,
>> Julien
>>
>>
>>>
>>> Thanks,
>>>
>>> Paul
>>>
>>>>
>>>> -       ret = socket_->send(message);
>>>> +       proc_ = std::make_unique<Process>();
>>>> +       int ret = proc_->start(ipaProxyWorkerPath, args, fds);
>>>>           if (ret) {
>>>> -               callData_.erase(iter);
>>>> -               return ret;
>>>> +               LOG(IPCPipe, Error)
>>>> +                       << "Failed to start proxy worker process";
>>>> +               return;
>>>>           }
>>>>
>>>> -       /* \todo Make this less dangerous, see IPCPipe::sendSync() */
>>>> -       timeout.start(2000ms);
>>>> -       while (!iter->second.done) {
>>>> -               if (!timeout.isRunning()) {
>>>> -                       LOG(IPCPipe, Error) << "Call timeout!";
>>>> -                       callData_.erase(iter);
>>>> -                       return -ETIMEDOUT;
>>>> -               }
>>>> +       connected_ = true;
>>>> +}
>>>>
>>>> -               Thread::current()->eventDispatcher()->processEvents();
>>>> -       }
>>>> +IPCPipeUnixSocket::~IPCPipeUnixSocket()
>>>> +{
>>>> +}
>>>>
>>>> -       callData_.erase(iter);
>>>> +int IPCPipeUnixSocket::sendSync(const IPCMessage &in, IPCMessage *out)
>>>> +{
>>>> +       return socketWrap_->sendSync(in, out);
>>>> +}
>>>>
>>>> -       return 0;
>>>> +int IPCPipeUnixSocket::sendAsync(const IPCMessage &data)
>>>> +{
>>>> +       return socketWrap_->sendAsync(data);
>>>>    }
>>>>
>>>>    } /* namespace libcamera */
>>>> --
>>>> 2.34.1
>>>>
>>
Message-ID: <f742b41d-7dc0-4b9f-876f-cb566b380819@nxp.com>
Date: Wed, 3 Sep 2025 19:42:29 +0200
From: Julien Vuillaumier <julien.vuillaumier@nxp.com>
To: Paul Elder <paul.elder@ideasonboard.com>, libcamera-devel@lists.libcamera.org
Subject: Re: [PATCH] libcamera: ipc_unixsocket: Fix sendSync() timeout and hang
In-Reply-To: <175690657206.2636072.17788644204927420566@neptunite.rasen.tech>