[{"id":18462,"web_url":"https://patchwork.libcamera.org/comment/18462/","msgid":"<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","date":"2021-08-01T23:42:53","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"Hi Nícolas,\n\nThank you for the patch.\n\nOn Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> Pipelines have relied on bufferCount to decide on the number of buffers\n> to allocate internally through allocateBuffers() and on the number of\n> V4L2 buffer slots to reserve through importBuffers(). Instead, the\n> number of internal buffers should be the minimum required by the\n> algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> should overallocate to avoid thrashing dmabuf mappings.\n> \n> For now, just set them to constants and stop relying on bufferCount, to\n> allow for its removal.\n> \n> Signed-off-by: Nícolas F. R. A. 
Prado <nfraprado@collabora.com>\n> ---\n> \n> No changes in v7\n> \n> Changes in v6:\n> - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n>   INTERNAL_BUFFER_COUNT constant\n> \n>  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n>  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n>  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n>  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n>  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n>  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n>  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n>  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n>  src/libcamera/pipeline/simple/converter.h         |  3 +++\n>  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n>  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n>  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n>  12 files changed, 35 insertions(+), 43 deletions(-)\n\nGiven that some of the pipeline handlers will need more intrusive\nchanges to address the comments below, you could split this with one\npatch per pipeline handler (or perhaps grouping the easy ones together).\n\n> \n> diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> index e955bc3456ba..f36e99dacbe7 100644\n> --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n>  /**\n>   * \\brief Allocate buffers for all the ImgU video devices\n>   */\n> -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> +int ImgUDevice::allocateBuffers()\n>  {\n>  \t/* Share buffers between CIO2 output and ImgU input. 
*/\n> -\tint ret = input_->importBuffers(bufferCount);\n> +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n>  \tif (ret) {\n>  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n>  \t\treturn ret;\n>  \t}\n>  \n> -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n>  \tif (ret < 0) {\n>  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n>  \t\tgoto error;\n>  \t}\n>  \n> -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n>  \tif (ret < 0) {\n>  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n>  \t\tgoto error;\n> @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n>  \t * corresponding stream is active or inactive, as the driver needs\n>  \t * buffers to be requested on the V4L2 devices in order to operate.\n>  \t */\n> -\tret = output_->importBuffers(bufferCount);\n> +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n>  \tif (ret < 0) {\n>  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n>  \t\tgoto error;\n>  \t}\n>  \n> -\tret = viewfinder_->importBuffers(bufferCount);\n> +\tret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n>  \tif (ret < 0) {\n>  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n>  \t\tgoto error;\n> diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> index 9d4915116087..f934a951fc75 100644\n> --- a/src/libcamera/pipeline/ipu3/imgu.h\n> +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> @@ -61,7 +61,7 @@ public:\n>  \t\t\t\t\t    outputFormat);\n>  \t}\n>  \n> -\tint allocateBuffers(unsigned int bufferCount);\n> +\tint allocateBuffers();\n>  \tvoid freeBuffers();\n>  \n>  \tint start();\n> @@ -86,6 +86,9 @@ private:\n>  \tstatic constexpr unsigned int PAD_VF = 3;\n>  \tstatic constexpr 
unsigned int PAD_STAT = 4;\n>  \n> +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n\n5 buffer slots is low. It means that if applications cycle more than 5\nbuffers, the V4L2VideoDevice cache that maintains associations between\ndmabufs and buffer slots will be thrashed. Due to the internal queue of\nrequests in the IPU3 pipeline handler (similar to what you have\nimplemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\nqueue\" for other pipeline handlers), we won't fail at queuing requests,\nbut performance will suffer. I thus think we need to increase the number\nof slots to what applications can be reasonably expected to use. We\ncould use 8, or even 16, as buffer slots are cheap. The same holds for\nother pipeline handlers.\n\nThe number of slots for the CIO2 output should match the number of\nbuffer slots for the ImgU input, as the same buffers are used on the two\nvideo devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\ninstead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\nbuffers that are allocated by exportBuffers() in CIO2Device::start(), to\nbe used in case the application doesn't provide any RAW buffer, should\nbe lower, as those are real buffers and are thus expensive. The number of\nbuffers and buffer slots on the CIO2 thus needs to be decoupled.\n\nFor proper operation, the CIO2 will require at least two queued buffers\n(one being DMA'ed to, and one waiting). We need at least one extra\nbuffer queued to the ImgU to keep buffers flowing. Depending on\nprocessing timings, it may be that the ImgU will complete processing of\nits buffer before the CIO2 captures the next one, leading to a temporary\nsituation where the CIO2 will have three buffers queued, or the CIO2\nwill finish the capture first, leading to a temporary situation where\nthe CIO2 will have one buffer queued and the ImgU will have two buffers\nqueued. 
In either case, shortly afterwards, the other component will\ncomplete capture or processing, and we'll get back to a situation with\ntwo buffers queued in the CIO2 and one in the ImgU. That's thus a\nminimum of three buffers for raw images.\n\nFrom an ImgU point of view, we could probably get away with a single\nparameter and a single stats buffer. This would however not allow\nqueuing the next frame for processing in the ImgU before the current\nframe completes, so two buffers would be better. Now, if we take the IPA\ninto account, the statistics buffer will spend some time on the IPA side\nfor processing. It would thus be best to have an extra statistics buffer\nto accommodate that, thus requiring three statistics buffers (and three\nparameters buffers, as we associate them together).\n\nThis rationale leads to using the same number of internal buffers for\nthe CIO2, the parameters and the statistics. We currently use four, and\nwhile the logic above indicates we could get away with three, it would\nbe safer to keep using four in this patch, and possibly reduce the\nnumber of buffers later.\n\nI know documentation isn't fun, but I think this rationale should be\ncaptured in a comment in the IPU3 pipeline handler, along with a \\todo\nitem to try and lower the number of internal buffers to three.\n\n> +\n>  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n>  \t\t      const std::string &sink, unsigned int sinkPad,\n>  \t\t      bool enable);\n> diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> index 5fd1757bfe13..4efd201c05e5 100644\n> --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n>  {\n>  \tIPU3CameraData *data = cameraData(camera);\n>  \tImgUDevice *imgu = data->imgu_;\n> -\tunsigned int bufferCount;\n>  \tint ret;\n>  \n> -\tbufferCount = std::max({\n> 
-\t\tdata->outStream_.configuration().bufferCount,\n> -\t\tdata->vfStream_.configuration().bufferCount,\n> -\t\tdata->rawStream_.configuration().bufferCount,\n> -\t});\n> -\n> -\tret = imgu->allocateBuffers(bufferCount);\n> +\tret = imgu->allocateBuffers();\n>  \tif (ret < 0)\n>  \t\treturn ret;\n>  \n> diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> index d1cd3d9dc082..776e0f92aed1 100644\n> --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n>  {\n>  \tRPiCameraData *data = cameraData(camera);\n>  \tint ret;\n> +\tconstexpr unsigned int bufferCount = 4;\n>  \n>  \t/*\n> -\t * Decide how many internal buffers to allocate. For now, simply look\n> -\t * at how many external buffers will be provided. We'll need to improve\n> -\t * this logic. However, we really must have all streams allocate the same\n> -\t * number of buffers to simplify error handling in queueRequestDevice().\n> +\t * Allocate internal buffers. 
We really must have all streams allocate\n> +\t * the same number of buffers to simplify error handling in\n> +\t * queueRequestDevice().\n>  \t */\n> -\tunsigned int maxBuffers = 0;\n> -\tfor (const Stream *s : camera->streams())\n> -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> -\n>  \tfor (auto const stream : data->streams_) {\n> -\t\tret = stream->prepareBuffers(maxBuffers);\n> +\t\tret = stream->prepareBuffers(bufferCount);\n\nWe have a similar problem here, 4 buffer slots is too little, but when\nthe stream has to allocate internal buffers (!importOnly), which is the\ncase for most streams, we don't want to overallocate.\n\nI'd like to get feedback from Naush here, but I think this means we'll\nhave to relax the requirement documented in the comment above, and\naccept a different number of buffers for each stream.\n\n>  \t\tif (ret < 0)\n>  \t\t\treturn ret;\n>  \t}\n> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> index 11325875b929..f4ea2fd4d4d0 100644\n> --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n>  \tunsigned int ipaBufferId = 1;\n>  \tint ret;\n>  \n> -\tunsigned int maxCount = std::max({\n> -\t\tdata->mainPathStream_.configuration().bufferCount,\n> -\t\tdata->selfPathStream_.configuration().bufferCount,\n> -\t});\n> -\n> -\tret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n>  \tif (ret < 0)\n>  \t\tgoto error;\n>  \n> -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n>  \tif (ret < 0)\n>  \t\tgoto error;\n>  \n> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp 
b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> index 25f482eb8d8e..fea330f72886 100644\n> --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> @@ -172,7 +172,7 @@ int RkISP1Path::start()\n>  \t\treturn -EBUSY;\n>  \n>  \t/* \\todo Make buffer count user configurable. */\n> -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n>  \tif (ret)\n>  \t\treturn ret;\n>  \n> diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> index 91757600ccdc..3c5891009c58 100644\n> --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> @@ -27,6 +27,9 @@ class V4L2Subdevice;\n>  struct StreamConfiguration;\n>  struct V4L2SubdeviceFormat;\n>  \n> +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n\nThe situation should be simpler for the rkisp1, as it has a different\npipeline model (inline ISP as opposed to offline ISP for the IPU3). We\ncan allocate more slots (8 or 16, as for other pipeline handlers), and\nrestrict the number of internal buffers (for stats and parameters) to\nthe number of requests we expect to queue to the device at once, plus\none for the IPA.  Four thus seems good. 
Capturing this rationale in a\ncomment would be good too.\n\nBTW, I may be too tired to think properly, or just unable to see the\nobvious, so please challenge any rationale you think is incorrect.\n\n> +\n>  class RkISP1Path\n>  {\n>  public:\n> diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> index b5e34c4cd0c5..b3bcf01483f7 100644\n> --- a/src/libcamera/pipeline/simple/converter.cpp\n> +++ b/src/libcamera/pipeline/simple/converter.cpp\n> @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n>  \n>  int SimpleConverter::Stream::start()\n>  {\n> -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n\nShouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\nmuch of an issue I suppose.\n\n>  \tif (ret < 0)\n>  \t\treturn ret;\n>  \n> -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n>  \tif (ret < 0) {\n>  \t\tstop();\n>  \t\treturn ret;\n> diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> index 276a2a291c21..7e1d60674f62 100644\n> --- a/src/libcamera/pipeline/simple/converter.h\n> +++ b/src/libcamera/pipeline/simple/converter.h\n> @@ -29,6 +29,9 @@ class SizeRange;\n>  struct StreamConfiguration;\n>  class V4L2M2MDevice;\n>  \n> +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n\nLet's name the variables kSimpleInternalBufferCount and\nkSimpleBufferSlotCount, as that's the naming scheme we're moving to for\nnon-macro constants. Same comment elsewhere in this patch.\n\nThose constants don't belong to converter.h. Could you turn them into\nmember constants of the SimplePipelineHandler class, as\nkNumInternalBuffers (which btw should be removed) ? 
The number of buffer\nslots can be passed as a parameter to SimpleConverter::start().\n\nThere's no stats or parameters here, and no IPA, so the situation is\ndifferent than for IPU3 and RkISP1. The number of internal buffers\nshould just be one more than the minimum number of buffers required by\nthe capture device, I don't think there's another requirement.\n\n> +\n>  class SimpleConverter\n>  {\n>  public:\n> diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> index 1c25a7344f5f..a1163eaf8be2 100644\n> --- a/src/libcamera/pipeline/simple/simple.cpp\n> +++ b/src/libcamera/pipeline/simple/simple.cpp\n> @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n>  \t\t * When using the converter allocate a fixed number of internal\n>  \t\t * buffers.\n>  \t\t */\n> -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n>  \t\t\t\t\t     &data->converterBuffers_);\n>  \t} else {\n> -\t\t/* Otherwise, prepare for using buffers from the only stream. 
*/\n> -\t\tStream *stream = &data->streams_[0];\n> -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n>  \t}\n>  \tif (ret < 0)\n>  \t\treturn ret;\n> diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> index fd39b3d3c72c..755949e7a59a 100644\n> --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> @@ -91,6 +91,8 @@ private:\n>  \t\treturn static_cast<UVCCameraData *>(\n>  \t\t\tPipelineHandler::cameraData(camera));\n>  \t}\n> +\n> +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n>  };\n>  \n>  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n>  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n>  {\n>  \tUVCCameraData *data = cameraData(camera);\n> -\tunsigned int count = data->stream_.configuration().bufferCount;\n>  \n> -\tint ret = data->video_->importBuffers(count);\n> +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n\nFor the uvc and vimc pipeline handlers, we have no internal buffers, so\nit's quite easy. 
We should have 8 or 16 slots, as for other pipeline\nhandlers.\n\n>  \tif (ret < 0)\n>  \t\treturn ret;\n>  \n> diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> index e89d53182c6d..24ba743a946c 100644\n> --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> @@ -102,6 +102,8 @@ private:\n>  \t\treturn static_cast<VimcCameraData *>(\n>  \t\t\tPipelineHandler::cameraData(camera));\n>  \t}\n> +\n> +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n>  };\n>  \n>  namespace {\n> @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n>  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n>  {\n>  \tVimcCameraData *data = cameraData(camera);\n> -\tunsigned int count = data->stream_.configuration().bufferCount;\n>  \n> -\tint ret = data->video_->importBuffers(count);\n> +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n>  \tif (ret < 0)\n>  \t\treturn ret;\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 3F6A8BD878\n\tfor <parsemail@patchwork.libcamera.org>;\n\tSun,  1 Aug 2021 23:43:06 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id B5656687BD;\n\tMon,  2 Aug 2021 01:43:05 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 80151687B6\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon,  2 Aug 2021 01:43:03 +0200 (CEST)","from pendragon.ideasonboard.com (62-78-145-57.bb.dnainternet.fi\n\t[62.78.145.57])\n\tby 
perceval.ideasonboard.com (Postfix) with ESMTPSA id D7AEE87C;\n\tMon,  2 Aug 2021 01:43:02 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"T3i9wdMF\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1627861383;\n\tbh=bwTzo5jHIq9VN+LtccduhkZpQk7cj4fNhMZLQwCCtG8=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=T3i9wdMFAJUe08/N65bsMwYa5DeAxDxwjj4YkWPxandnQK9FvVXGdouAVeevKRE+G\n\tudoLpDpCHYo1mjJeiCFKB/075zakKxZUDJAW8lNOggg/MyO3kzZ+If9aKEPT3srSd6\n\thMZI96q7uTfNp3qSOxl7x/t90B3GJ+1ZsGwXMPSw=","Date":"Mon, 2 Aug 2021 02:42:53 +0300","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","Message-ID":"<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20210722232851.747614-10-nfraprado@collabora.com>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, kernel@collabora.com, =?utf-8?q?A?=\n\t=?utf-8?b?bmRyw6k=?= Almeida <andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18603,"web_url":"https://patchwork.libcamera.org/comment/18603/","msgid":"<20210807150345.o4mcczkjt5vxium4@notapiano>","date":"2021-08-07T15:03:45","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":84,"url":"https://patchwork.libcamera.org/api/people/84/","name":"Nícolas F. R. A. Prado","email":"nfraprado@collabora.com"},"content":"Hi Laurent,\n\nOn Mon, Aug 02, 2021 at 02:42:53AM +0300, Laurent Pinchart wrote:\n> Hi Nícolas,\n> \n> Thank you for the patch.\n> \n> On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > Pipelines have relied on bufferCount to decide on the number of buffers\n> > to allocate internally through allocateBuffers() and on the number of\n> > V4L2 buffer slots to reserve through importBuffers(). 
Instead, the\n> > number of internal buffers should be the minimum required by the\n> > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > should overallocate to avoid thrashing dmabuf mappings.\n> > \n> > For now, just set them to constants and stop relying on bufferCount, to\n> > allow for its removal.\n> > \n> > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > ---\n> > \n> > No changes in v7\n> > \n> > Changes in v6:\n> > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> >   INTERNAL_BUFFER_COUNT constant\n> > \n> >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> >  12 files changed, 35 insertions(+), 43 deletions(-)\n> \n> Given that some of the pipeline handlers will need more intrusive\n> changes to address the comments below, you could split this with one\n> patch per pipeline handler (or perhaps grouping the easy ones together).\n> \n> > \n> > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > index e955bc3456ba..f36e99dacbe7 100644\n> > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > @@ -593,22 +593,22 @@ int 
ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> >  /**\n> >   * \\brief Allocate buffers for all the ImgU video devices\n> >   */\n> > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > +int ImgUDevice::allocateBuffers()\n> >  {\n> >  \t/* Share buffers between CIO2 output and ImgU input. */\n> > -\tint ret = input_->importBuffers(bufferCount);\n> > +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >  \tif (ret) {\n> >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> >  \t\treturn ret;\n> >  \t}\n> >  \n> > -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> >  \tif (ret < 0) {\n> >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> >  \t\tgoto error;\n> >  \t}\n> >  \n> > -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> >  \tif (ret < 0) {\n> >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> >  \t\tgoto error;\n> > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> >  \t * corresponding stream is active or inactive, as the driver needs\n> >  \t * buffers to be requested on the V4L2 devices in order to operate.\n> >  \t */\n> > -\tret = output_->importBuffers(bufferCount);\n> > +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >  \tif (ret < 0) {\n> >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> >  \t\tgoto error;\n> >  \t}\n> >  \n> > -\tret = viewfinder_->importBuffers(bufferCount);\n> > +\tret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >  \tif (ret < 0) {\n> >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n> >  \t\tgoto error;\n> > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > index 9d4915116087..f934a951fc75 
100644\n> > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > @@ -61,7 +61,7 @@ public:\n> >  \t\t\t\t\t    outputFormat);\n> >  \t}\n> >  \n> > -\tint allocateBuffers(unsigned int bufferCount);\n> > +\tint allocateBuffers();\n> >  \tvoid freeBuffers();\n> >  \n> >  \tint start();\n> > @@ -86,6 +86,9 @@ private:\n> >  \tstatic constexpr unsigned int PAD_VF = 3;\n> >  \tstatic constexpr unsigned int PAD_STAT = 4;\n> >  \n> > +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> \n> 5 buffer slots is low. It means that if applications cycle more than 5\n> buffers, the V4L2VideoDevice cache that maintains associations between\n> dmabufs and buffer slots will be thrashed. Due to the internal queue of\n> requests in the IPU3 pipeline handler (similar to what you have\n> implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> queue\" for other pipeline handlers), we won't fail at queuing requests,\n> but performance will suffer. I thus think we need to increase the number\n> of slots to what applications can be reasonably expected to use. We\n> could use 8, or even 16, as buffer slots are cheap. The same holds for\n> other pipeline handlers.\n> \n> The number of slots for the CIO2 output should match the number of\n> buffer slots for the ImgU input, as the same buffers are used on the two\n> video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> be used in case the application doesn't provide any RAW buffer, should\n> be lower, as those are real buffers and are thus expensive. 
The number of\n> buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> \n> For proper operation, the CIO2 will require at least two queued buffers\n> (one being DMA'ed to, and one waiting). We need at least one extra\n> buffer queued to the ImgU to keep buffers flowing. Depending on\n> processing timings, it may be that the ImgU will complete processing of\n> its buffer before the CIO2 captures the next one, leading to a temporary\n> situation where the CIO2 will have three buffers queued, or the CIO2\n> will finish the capture first, leading to a temporary situation where\n> the CIO2 will have one buffer queued and the ImgU will have two buffers\n> queued. In either case, shortly afterwards, the other component will\n> complete capture or processing, and we'll get back to a situation with\n> two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> minimum of three buffers for raw images.\n> \n> From an ImgU point of view, we could probably get away with a single\n> parameter and a single stats buffer. This would however not allow\n> queuing the next frame for processing in the ImgU before the current\n> frame completes, so two buffers would be better. Now, if we take the IPA\n> into account, the statistics buffer will spend some time on the IPA side\n> for processing. It would thus be best to have an extra statistics buffer\n> to accommodate that, thus requiring three statistics buffers (and three\n> parameters buffers, as we associate them together).\n> \n> This rationale leads to using the same number of internal buffers for\n> the CIO2, the parameters and the statistics. 
We currently use four, and\n> while the logic above indicates we could get away with three, it would\n> be safer to keep using four in this patch, and possibly reduce the\n> number of buffers later.\n> \n> I know documentation isn't fun, but I think this rationale should be\n> captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> item to try and lower the number of internal buffers to three.\n\nThis is the IPU3 topology as I understand it:\n\n      Output  .               .   Input        Output .\n      +---+   .               .   +---+        +---+  .\n      |   | --------------------> |   |        |   |  .\n      +---+   .               .   +---+        +---+  .\nCIO2          .   IPA         .          ImgU         .          IPA\n              .        Param  .   Param        Stat   .   Stat\n              .        +---+  .   +---+        +---+  .   +---+ \n              .        |   | ---> |   |        |   | ---> |   | \n              .        +---+  .   +---+        +---+  .   +---+ \n          \nYour suggestions for the minimum number of buffers required are the following,\nfrom what I understand:\n\nCIO2 raw internal buffers:\n- 2x on CIO2 Output (one being DMA'ed, one waiting)\n- 1x on ImgU Input\n\nImgU Param/Stat internal buffers:\n- 2x on ImgU Param/Stat (one being processed, one waiting)\n- 1x on IPA Stat\n\nThis arrangement doesn't seem to take into account that IPU3Frames::Info binds\nCIO2 internal buffers and ImgU Param/Stat buffers together. This means that each\nraw buffer queued to CIO2 Output needs a Param/Stat buffer as well. 
And each\nParam/Stat buffer queued to ImgU for processing needs a CIO2 raw buffer as well.\nAfter ImgU processing though, the raw buffer gets released and reused, so the\nStat buffer queued to the IPA does not require a CIO2 raw buffer.\n\nThis means that to achieve the above minimum, due to the IPU3Frames::Info\nconstraint, we'd actually need:\n\nCIO2 internal buffers:\n- 2x on CIO2 Output (one being DMA'ed, one waiting)\n- 2x on ImgU Input (for the two ImgU Param/Stat buffers we want to have there)\n\nImgU Param/Stat internal buffers:\n- 2x on CIO2 Output (for the two CIO2 raw buffers we want to have there)\n- 2x on ImgU Param/Stat (one being processed, one waiting)\n- 1x on IPA Stat\n\nAlso we're not accounting for parameter filling in the IPA before we queue the\nbuffers to ImgU, but perhaps that's fast enough that it doesn't matter?\n\nDoes this make sense? Or am I missing something?\n\nThanks,\nNícolas\n\n> \n> > +\n> >  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n> >  \t\t      const std::string &sink, unsigned int sinkPad,\n> >  \t\t      bool enable);\n> > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > index 5fd1757bfe13..4efd201c05e5 100644\n> > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> >  {\n> >  \tIPU3CameraData *data = cameraData(camera);\n> >  \tImgUDevice *imgu = data->imgu_;\n> > -\tunsigned int bufferCount;\n> >  \tint ret;\n> >  \n> > -\tbufferCount = std::max({\n> > -\t\tdata->outStream_.configuration().bufferCount,\n> > -\t\tdata->vfStream_.configuration().bufferCount,\n> > -\t\tdata->rawStream_.configuration().bufferCount,\n> > -\t});\n> > -\n> > -\tret = imgu->allocateBuffers(bufferCount);\n> > +\tret = imgu->allocateBuffers();\n> >  \tif (ret < 0)\n> >  \t\treturn ret;\n> >  \n> > diff --git 
a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > index d1cd3d9dc082..776e0f92aed1 100644\n> > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> >  {\n> >  \tRPiCameraData *data = cameraData(camera);\n> >  \tint ret;\n> > +\tconstexpr unsigned int bufferCount = 4;\n> >  \n> >  \t/*\n> > -\t * Decide how many internal buffers to allocate. For now, simply look\n> > -\t * at how many external buffers will be provided. We'll need to improve\n> > -\t * this logic. However, we really must have all streams allocate the same\n> > -\t * number of buffers to simplify error handling in queueRequestDevice().\n> > +\t * Allocate internal buffers. We really must have all streams allocate\n> > +\t * the same number of buffers to simplify error handling in\n> > +\t * queueRequestDevice().\n> >  \t */\n> > -\tunsigned int maxBuffers = 0;\n> > -\tfor (const Stream *s : camera->streams())\n> > -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> > -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > -\n> >  \tfor (auto const stream : data->streams_) {\n> > -\t\tret = stream->prepareBuffers(maxBuffers);\n> > +\t\tret = stream->prepareBuffers(bufferCount);\n> \n> We have a similar problem here, 4 buffer slots is too little, but when\n> the stream has to allocate internal buffers (!importOnly), which is the\n> case for most streams, we don't want to overallocate.\n> \n> I'd like to get feedback from Naush here, but I think this means we'll\n> have to relax the requirement documented in the comment above, and\n> accept a different number of buffers for each stream.\n> \n> >  \t\tif (ret < 0)\n> >  \t\t\treturn ret;\n> >  \t}\n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > index 
11325875b929..f4ea2fd4d4d0 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> >  \tunsigned int ipaBufferId = 1;\n> >  \tint ret;\n> >  \n> > -\tunsigned int maxCount = std::max({\n> > -\t\tdata->mainPathStream_.configuration().bufferCount,\n> > -\t\tdata->selfPathStream_.configuration().bufferCount,\n> > -\t});\n> > -\n> > -\tret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> >  \tif (ret < 0)\n> >  \t\tgoto error;\n> >  \n> > -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> >  \tif (ret < 0)\n> >  \t\tgoto error;\n> >  \n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > index 25f482eb8d8e..fea330f72886 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> >  \t\treturn -EBUSY;\n> >  \n> >  \t/* \\todo Make buffer count user configurable. 
*/\n> > -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> >  \tif (ret)\n> >  \t\treturn ret;\n> >  \n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > index 91757600ccdc..3c5891009c58 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> >  struct StreamConfiguration;\n> >  struct V4L2SubdeviceFormat;\n> >  \n> > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> \n> The situation should be simpler for the rkisp1, as it has a different\n> pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> can allocate more slots (8 or 16, as for other pipeline handlers), and\n> restrict the number of internal buffers (for stats and parameters) to\n> the number of requests we expect to queue to the device at once, plus\n> one for the IPA.  Four thus seems good. Capturing this rationale in a\n> comment would be good too.\n> \n> BTW, I may be too tired to think properly, or just unable to see the\n> obvious, so please challenge any rationale you think is incorrect.\n> \n> > +\n> >  class RkISP1Path\n> >  {\n> >  public:\n> > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> >  \n> >  int SimpleConverter::Stream::start()\n> >  {\n> > -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> \n> Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? 
Overallocating is not\n> much of an issue I suppose.\n> \n> >  \tif (ret < 0)\n> >  \t\treturn ret;\n> >  \n> > -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> >  \tif (ret < 0) {\n> >  \t\tstop();\n> >  \t\treturn ret;\n> > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > index 276a2a291c21..7e1d60674f62 100644\n> > --- a/src/libcamera/pipeline/simple/converter.h\n> > +++ b/src/libcamera/pipeline/simple/converter.h\n> > @@ -29,6 +29,9 @@ class SizeRange;\n> >  struct StreamConfiguration;\n> >  class V4L2M2MDevice;\n> >  \n> > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> \n> Let's name the variables kSimpleInternalBufferCount and\n> kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> non-macro constants. Same comment elsewhere in this patch.\n> \n> Those constants don't belong to converter.h. Could you turn them into\n> member constants of the SimplePipelineHandler class, as\n> kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> slots can be passed as a parameter to SimpleConverter::start().\n> \n> There's no stats or parameters here, and no IPA, so the situation is\n> different than for IPU3 and RkISP1. 
The number of internal buffers\n> should just be one more than the minimum number of buffers required by\n> the capture device, I don't think there's another requirement.\n> \n> > +\n> >  class SimpleConverter\n> >  {\n> >  public:\n> > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > index 1c25a7344f5f..a1163eaf8be2 100644\n> > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> >  \t\t * When using the converter allocate a fixed number of internal\n> >  \t\t * buffers.\n> >  \t\t */\n> > -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> > +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> >  \t\t\t\t\t     &data->converterBuffers_);\n> >  \t} else {\n> > -\t\t/* Otherwise, prepare for using buffers from the only stream. */\n> > -\t\tStream *stream = &data->streams_[0];\n> > -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> > +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> >  \t}\n> >  \tif (ret < 0)\n> >  \t\treturn ret;\n> > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > index fd39b3d3c72c..755949e7a59a 100644\n> > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > @@ -91,6 +91,8 @@ private:\n> >  \t\treturn static_cast<UVCCameraData *>(\n> >  \t\t\tPipelineHandler::cameraData(camera));\n> >  \t}\n> > +\n> > +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> >  };\n> >  \n> >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> >  {\n> >  \tUVCCameraData *data = 
cameraData(camera);\n> > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> >  \n> > -\tint ret = data->video_->importBuffers(count);\n> > +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> \n> For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> it's quite easy. We should have 8 or 16 slots, as for other pipeline\n> handlers.\n> \n> >  \tif (ret < 0)\n> >  \t\treturn ret;\n> >  \n> > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > index e89d53182c6d..24ba743a946c 100644\n> > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > @@ -102,6 +102,8 @@ private:\n> >  \t\treturn static_cast<VimcCameraData *>(\n> >  \t\t\tPipelineHandler::cameraData(camera));\n> >  \t}\n> > +\n> > +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> >  };\n> >  \n> >  namespace {\n> > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> >  {\n> >  \tVimcCameraData *data = cameraData(camera);\n> > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> >  \n> > -\tint ret = data->video_->importBuffers(count);\n> > +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> >  \tif (ret < 0)\n> >  \t\treturn ret;\n> >  \n> \n> -- \n> Regards,\n> \n> Laurent Pinchart","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id A053DBD87D\n\tfor <parsemail@patchwork.libcamera.org>;\n\tSat,  7 Aug 2021 15:03:54 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 
BBAE26882A;\n\tSat,  7 Aug 2021 17:03:53 +0200 (CEST)","from bhuna.collabora.co.uk (bhuna.collabora.co.uk\n\t[IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 76C37687D0\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tSat,  7 Aug 2021 17:03:52 +0200 (CEST)","from notapiano (unknown [IPv6:2804:14c:1a9:2434:b693:c9:5cb6:b688])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\t(Authenticated sender: nfraprado)\n\tby bhuna.collabora.co.uk (Postfix) with ESMTPSA id 680251F42A0C;\n\tSat,  7 Aug 2021 16:03:50 +0100 (BST)"],"Date":"Sat, 7 Aug 2021 12:03:45 -0300","From":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Message-ID":"<20210807150345.o4mcczkjt5vxium4@notapiano>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, kernel@collabora.com, =?utf-8?q?A?=\n\t=?utf-8?b?bmRyw6k=?= Almeida <andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18643,"web_url":"https://patchwork.libcamera.org/comment/18643/","msgid":"<20210809202646.blgq4lyab7ktglsp@notapiano>","date":"2021-08-09T20:26:46","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":84,"url":"https://patchwork.libcamera.org/api/people/84/","name":"Nícolas F. R. A. Prado","email":"nfraprado@collabora.com"},"content":"A few more comments:\n\nOn Sat, Aug 07, 2021 at 12:03:52PM -0300, Nícolas F. R. A. Prado wrote:\n> Hi Laurent,\n> \n> On Mon, Aug 02, 2021 at 02:42:53AM +0300, Laurent Pinchart wrote:\n> > Hi Nícolas,\n> > \n> > Thank you for the patch.\n> > \n> > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > > Pipelines have relied on bufferCount to decide on the number of buffers\n> > > to allocate internally through allocateBuffers() and on the number of\n> > > V4L2 buffer slots to reserve through importBuffers(). 
Instead, the\n> > > number of internal buffers should be the minimum required by the\n> > > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > > should overallocate to avoid thrashing dmabuf mappings.\n> > > \n> > > For now, just set them to constants and stop relying on bufferCount, to\n> > > allow for its removal.\n> > > \n> > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > ---\n> > > \n> > > No changes in v7\n> > > \n> > > Changes in v6:\n> > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > >   INTERNAL_BUFFER_COUNT constant\n> > > \n> > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> > \n> > Given that some of the pipeline handlers will need more intrusive\n> > changes to address the comments below, you could split this with one\n> > patch per pipeline handler (or perhaps grouping the easy ones together).\n> > \n> > > \n> > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > index e955bc3456ba..f36e99dacbe7 100644\n> > > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > +++ 
b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > >  /**\n> > >   * \\brief Allocate buffers for all the ImgU video devices\n> > >   */\n> > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > +int ImgUDevice::allocateBuffers()\n> > >  {\n> > >  \t/* Share buffers between CIO2 output and ImgU input. */\n> > > -\tint ret = input_->importBuffers(bufferCount);\n> > > +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >  \tif (ret) {\n> > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> > >  \t\treturn ret;\n> > >  \t}\n> > >  \n> > > -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > >  \tif (ret < 0) {\n> > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> > >  \t\tgoto error;\n> > >  \t}\n> > >  \n> > > -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > >  \tif (ret < 0) {\n> > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> > >  \t\tgoto error;\n> > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > >  \t * corresponding stream is active or inactive, as the driver needs\n> > >  \t * buffers to be requested on the V4L2 devices in order to operate.\n> > >  \t */\n> > > -\tret = output_->importBuffers(bufferCount);\n> > > +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >  \tif (ret < 0) {\n> > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> > >  \t\tgoto error;\n> > >  \t}\n> > >  \n> > > -\tret = viewfinder_->importBuffers(bufferCount);\n> > > +\tret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >  \tif (ret < 0) {\n> > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder 
buffers\";\n> > >  \t\tgoto error;\n> > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > > index 9d4915116087..f934a951fc75 100644\n> > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > @@ -61,7 +61,7 @@ public:\n> > >  \t\t\t\t\t    outputFormat);\n> > >  \t}\n> > >  \n> > > -\tint allocateBuffers(unsigned int bufferCount);\n> > > +\tint allocateBuffers();\n> > >  \tvoid freeBuffers();\n> > >  \n> > >  \tint start();\n> > > @@ -86,6 +86,9 @@ private:\n> > >  \tstatic constexpr unsigned int PAD_VF = 3;\n> > >  \tstatic constexpr unsigned int PAD_STAT = 4;\n> > >  \n> > > +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> > \n> > 5 buffer slots is low. It means that if applications cycle more than 5\n> > buffers, the V4L2VideoDevice cache that maintains associations between\n> > dmabufs and buffer slots will be thrashed. Due to the internal queue of\n> > requests in the IPU3 pipeline handler (similar to what you have\n> > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > but performance will suffer. I thus think we need to increase the number\n> > of slots to what applications can be reasonably expected to use. We\n> > could use 8, or even 16, as buffer slots are cheap. The same holds for\n> > other pipeline handlers.\n> > \n> > The number of slots for the CIO2 output should match the number of\n> > buffer slots for the ImgU input, as the same buffers are used on the two\n> > video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> > instead of CIO2_BUFFER_COUNT. 
However, the number of internal CIO2\n> > buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> > be used in case the application doesn't provide any RAW buffer, should\n> > be lower, as those are real buffers and are thus expensive. The number of\n> > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> > \n> > For proper operation, the CIO2 will require at least two queued buffers\n> > (one being DMA'ed to, and one waiting). We need at least one extra\n> > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > processing timings, it may be that the ImgU will complete processing of\n> > its buffer before the CIO2 captures the next one, leading to a temporary\n> > situation where the CIO2 will have three buffers queued, or the CIO2\n> > will finish the capture first, leading to a temporary situation where\n> > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > queued. In either case, shortly afterwards, the other component will\n> > complete capture or processing, and we'll get back to a situation with\n> > two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> > minimum of three buffers for raw images.\n> > \n> > From an ImgU point of view, we could probably get away with a single\n> > parameter and a single stats buffer. This would however not allow\n> > queuing the next frame for processing in the ImgU before the current\n> > frame completes, so two buffers would be better. Now, if we take the IPA\n> > into account, the statistics buffer will spend some time on the IPA side\n> > for processing. It would thus be best to have an extra statistics buffer\n> > to accommodate that, thus requiring three statistics buffers (and three\n> > parameters buffers, as we associate them together).\n> > \n> > This rationale leads to using the same number of internal buffers for\n> > the CIO2, the parameters and the statistics. 
We currently use four, and\n> > while the logic above indicates we could get away with three, it would\n> > be safer to keep using four in this patch, and possibly reduce the\n> > number of buffers later.\n> > \n> > I know documentation isn't fun, but I think this rationale should be\n> > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > item to try and lower the number of internal buffers to three.\n> \n> This is the IPU3 topology as I understand it:\n> \n>       Output  .               .   Input        Output .\n>       +---+   .               .   +---+        +---+  .\n>       |   | --------------------> |   |        |   |  .\n>       +---+   .               .   +---+        +---+  .\n> CIO2          .   IPA         .          ImgU         .          IPA\n>               .        Param  .   Param        Stat   .   Stat\n>               .        +---+  .   +---+        +---+  .   +---+ \n>               .        |   | ---> |   |        |   | ---> |   | \n>               .        +---+  .   +---+        +---+  .   +---+ \n>           \n> Your suggestions for the minimum number of buffers required are the following,\n> from what I understand:\n> \n> CIO2 raw internal buffers:\n> - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> - 1x on ImgU Input\n> \n> ImgU Param/Stat internal buffers:\n> - 2x on ImgU Param/Stat (one being processed, one waiting)\n> - 1x on IPA Stat\n> \n> This arrangement doesn't seem to take into account that IPU3Frames::Info binds\n> CIO2 internal buffers and ImgU Param/Stat buffers together. This means that each\n> raw buffer queued to CIO2 Output needs a Param/Stat buffer as well. 
And each\n> Param/Stat buffer queued to ImgU for processing needs a CIO2 raw buffer as well.\n> After ImgU processing though, the raw buffer gets released and reused, so the\n> Stat buffer queued to the IPA does not require a CIO2 raw buffer.\n> \n> This means that to achieve the above minimum, due to the IPU3Frames::Info\n> constraint, we'd actually need:\n> \n> CIO2 internal buffers:\n> - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> - 2x on ImgU Input (for the two ImgU Param/Stat buffers we want to have there)\n> \n> ImgU Param/Stat internal buffers:\n> - 2x on CIO2 Output (for the two CIO2 raw buffers we want to have there)\n> - 2x on ImgU Param/Stat (one being processed, one waiting)\n> - 1x on IPA Stat\n> \n> Also we're not accounting for parameter filling in the IPA before we queue the\n> buffers to ImgU, but perhaps that's fast enough that it doesn't matter?\n> \n> Does this make sense? Or am I missing something?\n> \n> Thanks,\n> Nícolas\n> \n> > \n> > > +\n> > >  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n> > >  \t\t      const std::string &sink, unsigned int sinkPad,\n> > >  \t\t      bool enable);\n> > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> > >  {\n> > >  \tIPU3CameraData *data = cameraData(camera);\n> > >  \tImgUDevice *imgu = data->imgu_;\n> > > -\tunsigned int bufferCount;\n> > >  \tint ret;\n> > >  \n> > > -\tbufferCount = std::max({\n> > > -\t\tdata->outStream_.configuration().bufferCount,\n> > > -\t\tdata->vfStream_.configuration().bufferCount,\n> > > -\t\tdata->rawStream_.configuration().bufferCount,\n> > > -\t});\n> > > -\n> > > -\tret = imgu->allocateBuffers(bufferCount);\n> > > +\tret = imgu->allocateBuffers();\n> > >  \tif (ret < 
0)\n> > >  \t\treturn ret;\n> > >  \n> > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > >  {\n> > >  \tRPiCameraData *data = cameraData(camera);\n> > >  \tint ret;\n> > > +\tconstexpr unsigned int bufferCount = 4;\n> > >  \n> > >  \t/*\n> > > -\t * Decide how many internal buffers to allocate. For now, simply look\n> > > -\t * at how many external buffers will be provided. We'll need to improve\n> > > -\t * this logic. However, we really must have all streams allocate the same\n> > > -\t * number of buffers to simplify error handling in queueRequestDevice().\n> > > +\t * Allocate internal buffers. We really must have all streams allocate\n> > > +\t * the same number of buffers to simplify error handling in\n> > > +\t * queueRequestDevice().\n> > >  \t */\n> > > -\tunsigned int maxBuffers = 0;\n> > > -\tfor (const Stream *s : camera->streams())\n> > > -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > > -\n> > >  \tfor (auto const stream : data->streams_) {\n> > > -\t\tret = stream->prepareBuffers(maxBuffers);\n> > > +\t\tret = stream->prepareBuffers(bufferCount);\n> > \n> > We have a similar problem here, 4 buffer slots is too little, but when\n> > the stream has to allocate internal buffers (!importOnly), which is the\n> > case for most streams, we don't want to overallocate.\n> > \n> > I'd like to get feedback from Naush here, but I think this means we'll\n> > have to relax the requirement documented in the comment above, and\n> > accept a different number of buffers for each stream.\n> > \n> > >  \t\tif (ret < 0)\n> > >  \t\t\treturn 
ret;\n> > >  \t}\n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > >  \tunsigned int ipaBufferId = 1;\n> > >  \tint ret;\n> > >  \n> > > -\tunsigned int maxCount = std::max({\n> > > -\t\tdata->mainPathStream_.configuration().bufferCount,\n> > > -\t\tdata->selfPathStream_.configuration().bufferCount,\n> > > -\t});\n> > > -\n> > > -\tret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > > +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > >  \tif (ret < 0)\n> > >  \t\tgoto error;\n> > >  \n> > > -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > >  \tif (ret < 0)\n> > >  \t\tgoto error;\n> > >  \n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > index 25f482eb8d8e..fea330f72886 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > >  \t\treturn -EBUSY;\n> > >  \n> > >  \t/* \\todo Make buffer count user configurable. 
*/\n> > > -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > >  \tif (ret)\n> > >  \t\treturn ret;\n> > >  \n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > index 91757600ccdc..3c5891009c58 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > >  struct StreamConfiguration;\n> > >  struct V4L2SubdeviceFormat;\n> > >  \n> > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> > \n> > The situation should be simpler for the rkisp1, as it has a different\n> > pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > restrict the number of internal buffers (for stats and parameters) to\n> > the number of requests we expect to queue to the device at once, plus\n> > one for the IPA.  Four thus seems good. Capturing this rationale in a\n> > comment would be good too.\n\nShouldn't we also have one extra buffer queued to the capture device, like for\nthe others, totalling five (four on the capture, one on the IPA)? Or since the\ndriver already requires three buffers the extra one isn't needed?\n\nI'm not sure how it works, but if the driver requires three buffers at all times\nto keep streaming, then I think we indeed should have the extra buffer to avoid\ndropping frames. 
Otherwise, if that requirement is only for starting the stream,\nthen for drivers that require at least two buffers we shouldn't need an extra\none, I'd think.\n\n> > \n> > BTW, I may be too tired to think properly, or just unable to see the\n> > obvious, so please challenge any rationale you think is incorrect.\n> > \n> > > +\n> > >  class RkISP1Path\n> > >  {\n> > >  public:\n> > > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > >  \n> > >  int SimpleConverter::Stream::start()\n> > >  {\n> > > -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > \n> > Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > much of an issue I suppose.\n\nIndeed. 
I was under the impression that we should always importBuffers() using\nBUFFER_SLOT_COUNT, but now, after reading more code, I understand that's not\nalways the case (although this seems to be the only case, due to the presence of\nthe converter).\n\n> > \n> > >  \tif (ret < 0)\n> > >  \t\treturn ret;\n> > >  \n> > > -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > >  \tif (ret < 0) {\n> > >  \t\tstop();\n> > >  \t\treturn ret;\n> > > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > > index 276a2a291c21..7e1d60674f62 100644\n> > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > @@ -29,6 +29,9 @@ class SizeRange;\n> > >  struct StreamConfiguration;\n> > >  class V4L2M2MDevice;\n> > >  \n> > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> > \n> > Let's name the variables kSimpleInternalBufferCount and\n> > kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> > non-macro constants. Same comment elsewhere in this patch.\n> > \n> > Those constants don't belong to converter.h. Could you turn them into\n> > member constants of the SimplePipelineHandler class, as\n> > kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> > slots can be passed as a parameter to SimpleConverter::start().\n> > \n> > There's no stats or parameters here, and no IPA, so the situation is\n> > different than for IPU3 and RkISP1. 
The number of internal buffers\n> > should just be one more than the minimum number of buffers required by\n> > the capture device, I don't think there's another requirement.\n\nPlus one extra to have queued at the converter's 'output' node (which is its\ninput, confusingly)?\n\nThanks,\nNícolas\n\n> > \n> > > +\n> > >  class SimpleConverter\n> > >  {\n> > >  public:\n> > > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> > >  \t\t * When using the converter allocate a fixed number of internal\n> > >  \t\t * buffers.\n> > >  \t\t */\n> > > -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> > > +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > >  \t\t\t\t\t     &data->converterBuffers_);\n> > >  \t} else {\n> > > -\t\t/* Otherwise, prepare for using buffers from the only stream. 
*/\n> > > -\t\tStream *stream = &data->streams_[0];\n> > > -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> > > +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > >  \t}\n> > >  \tif (ret < 0)\n> > >  \t\treturn ret;\n> > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > index fd39b3d3c72c..755949e7a59a 100644\n> > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > @@ -91,6 +91,8 @@ private:\n> > >  \t\treturn static_cast<UVCCameraData *>(\n> > >  \t\t\tPipelineHandler::cameraData(camera));\n> > >  \t}\n> > > +\n> > > +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > >  };\n> > >  \n> > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > >  {\n> > >  \tUVCCameraData *data = cameraData(camera);\n> > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > >  \n> > > -\tint ret = data->video_->importBuffers(count);\n> > > +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> > \n> > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > handlers.\n> > \n> > >  \tif (ret < 0)\n> > >  \t\treturn ret;\n> > >  \n> > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > index e89d53182c6d..24ba743a946c 100644\n> > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > @@ -102,6 +102,8 @@ private:\n> > >  \t\treturn static_cast<VimcCameraData *>(\n> > >  \t\t\tPipelineHandler::cameraData(camera));\n> > >  \t}\n> > > +\n> > > +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > >  };\n> > >  \n> > >  namespace {\n> > > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > >  {\n> > >  \tVimcCameraData *data = cameraData(camera);\n> > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > >  \n> > > -\tint ret = data->video_->importBuffers(count);\n> > > +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > >  \tif (ret < 0)\n> > >  \t\treturn ret;\n> > >  \n> > \n> > -- \n> > Regards,\n> > \n> > Laurent Pinchart","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id BDB0DBD87D\n\tfor <parsemail@patchwork.libcamera.org>;\n\tMon,  9 Aug 2021 20:26:55 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 1361668826;\n\tMon,  9 Aug 2021 22:26:55 +0200 (CEST)","from bhuna.collabora.co.uk (bhuna.collabora.co.uk\n\t[IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id A000C60269\n\tfor 
<libcamera-devel@lists.libcamera.org>;\n\tMon,  9 Aug 2021 22:26:53 +0200 (CEST)","from notapiano (unknown [IPv6:2804:14c:1a9:2434:b693:c9:5cb6:b688])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\t(Authenticated sender: nfraprado)\n\tby bhuna.collabora.co.uk (Postfix) with ESMTPSA id CDA581F42003;\n\tMon,  9 Aug 2021 21:26:51 +0100 (BST)"],"Date":"Mon, 9 Aug 2021 17:26:46 -0300","From":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Message-ID":"<20210809202646.blgq4lyab7ktglsp@notapiano>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<20210807150345.o4mcczkjt5vxium4@notapiano>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20210807150345.o4mcczkjt5vxium4@notapiano>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, kernel@collabora.com, =?utf-8?q?A?=\n\t=?utf-8?b?bmRyw6k=?= Almeida 
<andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18731,"web_url":"https://patchwork.libcamera.org/comment/18731/","msgid":"<CAEmqJPq94iMjF92TivzPkgRk29dVRB4Rut1SEeRAhvRjuPJOuA@mail.gmail.com>","date":"2021-08-12T11:32:28","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":34,"url":"https://patchwork.libcamera.org/api/people/34/","name":"Naushir Patuck","email":"naush@raspberrypi.com"},"content":"Hi Laurent and Nicolas,\n\n\nOn Mon, 2 Aug 2021 at 00:43, Laurent Pinchart <\nlaurent.pinchart@ideasonboard.com> wrote:\n\n> Hi Nícolas,\n>\n> Thank you for the patch.\n>\n> On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > Pipelines have relied on bufferCount to decide on the number of buffers\n> > to allocate internally through allocateBuffers() and on the number of\n> > V4L2 buffer slots to reserve through importBuffers(). Instead, the\n> > number of internal buffers should be the minimum required by the\n> > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > should overallocate to avoid thrashing dmabuf mappings.\n> >\n> > For now, just set them to constants and stop relying on bufferCount, to\n> > allow for its removal.\n> >\n> > Signed-off-by: Nícolas F. R. A. 
Prado <nfraprado@collabora.com>\n> > ---\n> >\n> > No changes in v7\n> >\n> > Changes in v6:\n> > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> >   INTERNAL_BUFFER_COUNT constant\n> >\n> >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> >  12 files changed, 35 insertions(+), 43 deletions(-)\n>\n> Given that some of the pipeline handlers will need more intrusive\n> changes to address the comments below, you could split this with one\n> patch per pipeline handler (or perhaps grouping the easy ones together).\n>\n> >\n> > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp\n> b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > index e955bc3456ba..f36e99dacbe7 100644\n> > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > @@ -593,22 +593,22 @@ int\n> ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> >  /**\n> >   * \\brief Allocate buffers for all the ImgU video devices\n> >   */\n> > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > +int ImgUDevice::allocateBuffers()\n> >  {\n> >       /* Share buffers between CIO2 output and ImgU input. 
*/\n> > -     int ret = input_->importBuffers(bufferCount);\n> > +     int ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >       if (ret) {\n> >               LOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> >               return ret;\n> >       }\n> >\n> > -     ret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > +     ret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT,\n> &paramBuffers_);\n> >       if (ret < 0) {\n> >               LOG(IPU3, Error) << \"Failed to allocate ImgU param\n> buffers\";\n> >               goto error;\n> >       }\n> >\n> > -     ret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > +     ret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT,\n> &statBuffers_);\n> >       if (ret < 0) {\n> >               LOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> >               goto error;\n> > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int\n> bufferCount)\n> >        * corresponding stream is active or inactive, as the driver needs\n> >        * buffers to be requested on the V4L2 devices in order to operate.\n> >        */\n> > -     ret = output_->importBuffers(bufferCount);\n> > +     ret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >       if (ret < 0) {\n> >               LOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> >               goto error;\n> >       }\n> >\n> > -     ret = viewfinder_->importBuffers(bufferCount);\n> > +     ret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> >       if (ret < 0) {\n> >               LOG(IPU3, Error) << \"Failed to import ImgU viewfinder\n> buffers\";\n> >               goto error;\n> > diff --git a/src/libcamera/pipeline/ipu3/imgu.h\n> b/src/libcamera/pipeline/ipu3/imgu.h\n> > index 9d4915116087..f934a951fc75 100644\n> > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > @@ -61,7 +61,7 @@ public:\n> >                                   
        outputFormat);\n> >       }\n> >\n> > -     int allocateBuffers(unsigned int bufferCount);\n> > +     int allocateBuffers();\n> >       void freeBuffers();\n> >\n> >       int start();\n> > @@ -86,6 +86,9 @@ private:\n> >       static constexpr unsigned int PAD_VF = 3;\n> >       static constexpr unsigned int PAD_STAT = 4;\n> >\n> > +     static constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > +     static constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n>\n> 5 buffer slots is low. It means that if applications cycle more than 5\n> buffers, the V4L2VideoDevice cache that maintains associations between\n> dmabufs and buffer slots will be thrashed. Due to the internal queue of\n> requests in the IPU3 pipeline handler (similar to what you have\n> implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> queue\" for other pipeline handlers), we won't fail at queuing requests,\n> but performance will suffer. I thus think we need to increase the number\n> of slots to what applications can be reasonably expected to use. We\n> could use 8, or even 16, as buffer slots are cheap. The same holds for\n> other pipeline handlers.\n>\n> The number of slots for the CIO2 output should match the number of\n> buffer slots for the ImgU input, as the same buffers are used on the two\n> video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> be used in case the application doesn't provide any RAW buffer, should\n> be lower, as those are real buffers and are thus expensive. The number of\n> buffers and buffer slots on the CIO2 thus needs to be decoupled.\n>\n> For proper operation, the CIO2 will require at least two queued buffers\n> (one being DMA'ed to, and one waiting). We need at least one extra\n> buffer queued to the ImgU to keep buffers flowing. 
Depending on\n> processing timings, it may be that the ImgU will complete processing of\n> its buffer before the CIO2 captures the next one, leading to a temporary\n> situation where the CIO2 will have three buffers queued, or the CIO2\n> will finish the capture first, leading to a temporary situation where\n> the CIO2 will have one buffer queued and the ImgU will have two buffers\n> queued. In either case, shortly afterwards, the other component will\n> complete capture or processing, and we'll get back to a situation with\n> two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> minimum of three buffers for raw images.\n>\n> From an ImgU point of view, we could probably get away with a single\n> parameter and a single stats buffer. This would however not allow\n> queuing the next frame for processing in the ImgU before the current\n> frame completes, so two buffers would be better. Now, if we take the IPA\n> into account, the statistics buffer will spend some time on the IPA side\n> for processing. It would thus be best to have an extra statistics buffer\n> to accommodate that, thus requiring three statistics buffers (and three\n> parameters buffers, as we associate them together).\n>\n> This rationale leads to using the same number of internal buffers for\n> the CIO2, the parameters and the statistics. 
We currently use four, and\n> while the logic above indicates we could get away with three, it would\n> be safer to keep using four in this patch, and possibly reduce the\n> number of buffers later.\n>\n> I know documentation isn't fun, but I think this rationale should be\n> captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> item to try and lower the number of internal buffers to three.\n>\n> > +\n> >       int linkSetup(const std::string &source, unsigned int sourcePad,\n> >                     const std::string &sink, unsigned int sinkPad,\n> >                     bool enable);\n> > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > index 5fd1757bfe13..4efd201c05e5 100644\n> > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera\n> *camera)\n> >  {\n> >       IPU3CameraData *data = cameraData(camera);\n> >       ImgUDevice *imgu = data->imgu_;\n> > -     unsigned int bufferCount;\n> >       int ret;\n> >\n> > -     bufferCount = std::max({\n> > -             data->outStream_.configuration().bufferCount,\n> > -             data->vfStream_.configuration().bufferCount,\n> > -             data->rawStream_.configuration().bufferCount,\n> > -     });\n> > -\n> > -     ret = imgu->allocateBuffers(bufferCount);\n> > +     ret = imgu->allocateBuffers();\n> >       if (ret < 0)\n> >               return ret;\n> >\n> > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > index d1cd3d9dc082..776e0f92aed1 100644\n> > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera\n> *camera)\n> >  {\n> >       RPiCameraData *data = cameraData(camera);\n> >       int ret;\n> > +     constexpr 
unsigned int bufferCount = 4;\n> >\n> >       /*\n> > -      * Decide how many internal buffers to allocate. For now, simply\n> look\n> > -      * at how many external buffers will be provided. We'll need to\n> improve\n> > -      * this logic. However, we really must have all streams allocate\n> the same\n> > -      * number of buffers to simplify error handling in\n> queueRequestDevice().\n> > +      * Allocate internal buffers. We really must have all streams\n> allocate\n> > +      * the same number of buffers to simplify error handling in\n> > +      * queueRequestDevice().\n> >        */\n> > -     unsigned int maxBuffers = 0;\n> > -     for (const Stream *s : camera->streams())\n> > -             if (static_cast<const RPi::Stream *>(s)->isExternal())\n> > -                     maxBuffers = std::max(maxBuffers,\n> s->configuration().bufferCount);\n> > -\n> >       for (auto const stream : data->streams_) {\n> > -             ret = stream->prepareBuffers(maxBuffers);\n> > +             ret = stream->prepareBuffers(bufferCount);\n>\n> We have a similar problem here, 4 buffer slots is too little, but when\n> the stream has to allocate internal buffers (!importOnly), which is the\n> case for most streams, we don't want to overallocate.\n>\n> I'd like to get feedback from Naush here, but I think this means we'll\n> have to relax the requirement documented in the comment above, and\n> accept a different number of buffers for each stream.\n>\n\nSorry for the late reply to this thread!\n\nAs is evident from the above comment, this bit of code does need to be\nimproved to avoid over-allocation, which I will get to at some point. However,\nto address this change and the comments, 4 buffer slots sounds like it might be\ntoo little.  
Regarding\nthe requirement on having streams allocate the same number of buffers -\nthat can be\nrelaxed (and the comment removed) as we do handle it correctly.\n\nRegards,\nNaush\n\n\n\n>\n> >               if (ret < 0)\n> >                       return ret;\n> >       }\n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > index 11325875b929..f4ea2fd4d4d0 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera\n> *camera)\n> >       unsigned int ipaBufferId = 1;\n> >       int ret;\n> >\n> > -     unsigned int maxCount = std::max({\n> > -             data->mainPathStream_.configuration().bufferCount,\n> > -             data->selfPathStream_.configuration().bufferCount,\n> > -     });\n> > -\n> > -     ret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > +     ret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT,\n> &paramBuffers_);\n> >       if (ret < 0)\n> >               goto error;\n> >\n> > -     ret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > +     ret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT,\n> &statBuffers_);\n> >       if (ret < 0)\n> >               goto error;\n> >\n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > index 25f482eb8d8e..fea330f72886 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> >               return -EBUSY;\n> >\n> >       /* \\todo Make buffer count user configurable. 
*/\n> > -     ret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > +     ret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> >       if (ret)\n> >               return ret;\n> >\n> > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > index 91757600ccdc..3c5891009c58 100644\n> > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> >  struct StreamConfiguration;\n> >  struct V4L2SubdeviceFormat;\n> >\n> > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n>\n> The situation should be simpler for the rkisp1, as it has a different\n> pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> can allocate more slots (8 or 16, as for other pipeline handlers), and\n> restrict the number of internal buffers (for stats and parameters) to\n> the number of requests we expect to queue to the device at once, plus\n> one for the IPA.  Four thus seems good. Capturing this rationale in a\n> comment would be good too.\n>\n> BTW, I may be too tired to think properly, or just unable to see the\n> obvious, so please challenge any rationale you think is incorrect.\n>\n> > +\n> >  class RkISP1Path\n> >  {\n> >  public:\n> > diff --git a/src/libcamera/pipeline/simple/converter.cpp\n> b/src/libcamera/pipeline/simple/converter.cpp\n> > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > @@ -103,11 +103,11 @@ int\n> SimpleConverter::Stream::exportBuffers(unsigned int count,\n> >\n> >  int SimpleConverter::Stream::start()\n> >  {\n> > -     int ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > +     int ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n>\n> Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? 
Overallocating is not\n> much of an issue I suppose.\n>\n> >       if (ret < 0)\n> >               return ret;\n> >\n> > -     ret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > +     ret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> >       if (ret < 0) {\n> >               stop();\n> >               return ret;\n> > diff --git a/src/libcamera/pipeline/simple/converter.h\n> b/src/libcamera/pipeline/simple/converter.h\n> > index 276a2a291c21..7e1d60674f62 100644\n> > --- a/src/libcamera/pipeline/simple/converter.h\n> > +++ b/src/libcamera/pipeline/simple/converter.h\n> > @@ -29,6 +29,9 @@ class SizeRange;\n> >  struct StreamConfiguration;\n> >  class V4L2M2MDevice;\n> >\n> > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n>\n> Let's name the variables kSimpleInternalBufferCount and\n> kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> non-macro constants. Same comment elsewhere in this patch.\n>\n> Those constants don't belong to converter.h. Could you turn them into\n> member constants of the SimplePipelineHandler class, as\n> kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> slots can be passed as a parameter to SimpleConverter::start().\n>\n> There's no stats or parameters here, and no IPA, so the situation is\n> different than for IPU3 and RkISP1. 
The number of internal buffers\n> should just be one more than the minimum number of buffers required by\n> the capture device, I don't think there's another requirement.\n>\n> > +\n> >  class SimpleConverter\n> >  {\n> >  public:\n> > diff --git a/src/libcamera/pipeline/simple/simple.cpp\n> b/src/libcamera/pipeline/simple/simple.cpp\n> > index 1c25a7344f5f..a1163eaf8be2 100644\n> > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera,\n> [[maybe_unused]] const ControlL\n> >                * When using the converter allocate a fixed number of\n> internal\n> >                * buffers.\n> >                */\n> > -             ret = video->allocateBuffers(kNumInternalBuffers,\n> > +             ret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> >                                            &data->converterBuffers_);\n> >       } else {\n> > -             /* Otherwise, prepare for using buffers from the only\n> stream. 
*/\n> > -             Stream *stream = &data->streams_[0];\n> > -             ret =\n> video->importBuffers(stream->configuration().bufferCount);\n> > +             ret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> >       }\n> >       if (ret < 0)\n> >               return ret;\n> > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > index fd39b3d3c72c..755949e7a59a 100644\n> > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > @@ -91,6 +91,8 @@ private:\n> >               return static_cast<UVCCameraData *>(\n> >                       PipelineHandler::cameraData(camera));\n> >       }\n> > +\n> > +     static constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> >  };\n> >\n> >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera\n> *camera,\n> >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const\n> ControlList *controls)\n> >  {\n> >       UVCCameraData *data = cameraData(camera);\n> > -     unsigned int count = data->stream_.configuration().bufferCount;\n> >\n> > -     int ret = data->video_->importBuffers(count);\n> > +     int ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n>\n> For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> handlers.\n>\n> >       if (ret < 0)\n> >               return ret;\n> >\n> > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp\n> b/src/libcamera/pipeline/vimc/vimc.cpp\n> > index e89d53182c6d..24ba743a946c 100644\n> > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > @@ -102,6 +102,8 @@ private:\n> >               return static_cast<VimcCameraData *>(\n> >                       PipelineHandler::cameraData(camera));\n> >       }\n> > +\n> > +     static constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> >  };\n> >\n> >  namespace {\n> > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera\n> *camera,\n> >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const\n> ControlList *controls)\n> >  {\n> >       VimcCameraData *data = cameraData(camera);\n> > -     unsigned int count = data->stream_.configuration().bufferCount;\n> >\n> > -     int ret = data->video_->importBuffers(count);\n> > +     int ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> >       if (ret < 0)\n> >               return ret;\n> >\n>\n> --\n> Regards,\n>\n> Laurent Pinchart\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 1F762C3240\n\tfor <parsemail@patchwork.libcamera.org>;\n\tThu, 12 Aug 2021 11:32:50 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 14FA768888;\n\tThu, 12 Aug 2021 13:32:48 +0200 (CEST)","from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com\n\t[IPv6:2a00:1450:4864:20::12e])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id DE9AD60264\n\tfor 
<libcamera-devel@lists.libcamera.org>;\n\tThu, 12 Aug 2021 13:32:45 +0200 (CEST)","by mail-lf1-x12e.google.com with SMTP id w20so12851991lfu.7\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 12 Aug 2021 04:32:45 -0700 (PDT)"],"X-Received":"by 2002:ac2:559c:: with SMTP 
id\n\tv28mr2240200lfg.133.1628767964860; \n\tThu, 12 Aug 2021 04:32:44 -0700 (PDT)","MIME-Version":"1.0","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","In-Reply-To":"<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>","From":"Naushir Patuck <naush@raspberrypi.com>","Date":"Thu, 12 Aug 2021 12:32:28 +0100","Message-ID":"<CAEmqJPq94iMjF92TivzPkgRk29dVRB4Rut1SEeRAhvRjuPJOuA@mail.gmail.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Content-Type":"multipart/alternative; boundary=\"000000000000f7a58305c95b17b7\"","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera devel <libcamera-devel@lists.libcamera.org>,\n\tkernel@collabora.com, =?utf-8?q?Andr=C3=A9_Almeida?=\n\t<andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18857,"web_url":"https://patchwork.libcamera.org/comment/18857/","msgid":"<YRsBBa++KC1IdJVz@pendragon.ideasonboard.com>","date":"2021-08-17T00:21:25","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"Hi Naush,\n\nOn Thu, Aug 12, 2021 at 12:32:28PM +0100, Naushir Patuck wrote:\n> On Mon, 2 Aug 2021 at 00:43, Laurent Pinchart wrote:\n> > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > > Pipelines have relied on bufferCount to decide on the number of buffers\n> > > to allocate internally through allocateBuffers() and on the number of\n> > > V4L2 buffer slots to reserve through importBuffers(). Instead, the\n> > > number of internal buffers should be the minimum required by the\n> > > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > > should overallocate to avoid thrashing dmabuf mappings.\n> > >\n> > > For now, just set them to constants and stop relying on bufferCount, to\n> > > allow for its removal.\n> > >\n> > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > ---\n> > >\n> > > No changes in v7\n> > >\n> > > Changes in v6:\n> > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > >   INTERNAL_BUFFER_COUNT constant\n> > >\n> > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > >  
src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> >\n> > Given that some of the pipeline handlers will need more intrusive\n> > changes to address the comments below, you could split this with one\n> > patch per pipeline handler (or perhaps grouping the easy ones together).\n> >\n> > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > index e955bc3456ba..f36e99dacbe7 100644\n> > > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > >  /**\n> > >   * \\brief Allocate buffers for all the ImgU video devices\n> > >   */\n> > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > +int ImgUDevice::allocateBuffers()\n> > >  {\n> > >       /* Share buffers between CIO2 output and ImgU input. */\n> > > -     int ret = input_->importBuffers(bufferCount);\n> > > +     int ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >       if (ret) {\n> > >               LOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> > >               return ret;\n> > >       }\n> > >\n> > > -     ret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > +     ret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > >       if (ret < 0) {\n> > >               LOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> > >               goto error;\n> > >       }\n> > >\n> > > -     ret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > +     ret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > >       if (ret < 0) {\n> > >               LOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> > >               goto error;\n> > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > >       
 * corresponding stream is active or inactive, as the driver needs\n> > >        * buffers to be requested on the V4L2 devices in order to operate.\n> > >        */\n> > > -     ret = output_->importBuffers(bufferCount);\n> > > +     ret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >       if (ret < 0) {\n> > >               LOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> > >               goto error;\n> > >       }\n> > >\n> > > -     ret = viewfinder_->importBuffers(bufferCount);\n> > > +     ret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > >       if (ret < 0) {\n> > >               LOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n> > >               goto error;\n> > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > > index 9d4915116087..f934a951fc75 100644\n> > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > @@ -61,7 +61,7 @@ public:\n> > >                                           outputFormat);\n> > >       }\n> > >\n> > > -     int allocateBuffers(unsigned int bufferCount);\n> > > +     int allocateBuffers();\n> > >       void freeBuffers();\n> > >\n> > >       int start();\n> > > @@ -86,6 +86,9 @@ private:\n> > >       static constexpr unsigned int PAD_VF = 3;\n> > >       static constexpr unsigned int PAD_STAT = 4;\n> > >\n> > > +     static constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > +     static constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> >\n> > 5 buffer slots is low. It means that if applications cycle more than 5\n> > buffers, the V4L2VideoDevice cache that maintains associations between\n> > dmabufs and buffer slots will be trashed. 
Due to the internal queue of\n> > requests in the IPU3 pipeline handler (similar to what you have\n> > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > but performance will suffer. I thus think we need to increase the number\n> > of slots to what applications can be reasonably expected to use. We\n> > could use 8, or even 16, as buffer slots are cheap. The same holds for\n> > other pipeline handlers.\n> >\n> > The number of slots for the CIO2 output should match the number of\n> > buffer slots for the ImgU input, as the same buffers are used on the two\n> > video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> > instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> > buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> > be used in case the application doesn't provide any RAW buffer, should\n> > be lower, as those are real buffers and are thus expensive. The number of\n> > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> >\n> > For proper operation, the CIO2 will require at least two queued buffers\n> > (one being DMA'ed to, and one waiting). We need at least one extra\n> > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > processing timings, it may be that the ImgU will complete processing of\n> > its buffer before the CIO2 captures the next one, leading to a temporary\n> > situation where the CIO2 will have three buffers queued, or the CIO2\n> > will finish the capture first, leading to a temporary situation where\n> > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > queued. In either case, shortly afterwards, the other component will\n> > complete capture or processing, and we'll get back to a situation with\n> > two buffers queued in the CIO2 and one in the ImgU. 
That's thus a\n> > minimum of three buffers for raw images.\n> >\n> > From an ImgU point of view, we could probably get away with a single\n> > parameter and a single stats buffer. This would however not allow\n> > queuing the next frame for processing in the ImgU before the current\n> > frame completes, so two buffers would be better. Now, if we take the IPA\n> > into account, the statistics buffer will spend some time on the IPA side\n> > for processing. It would thus be best to have an extra statistics buffer\n> > to accommodate that, thus requiring three statistics buffers (and three\n> > parameters buffers, as we associate them together).\n> >\n> > This rationale leads to using the same number of internal buffers for\n> > the CIO2, the parameters and the statistics. We currently use four, and\n> > while the logic above indicates we could get away with three, it would\n> > be safer to keep using four in this patch, and possibly reduce the\n> > number of buffers later.\n> >\n> > I know documentation isn't fun, but I think this rationale should be\n> > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > item to try and lower the number of internal buffers to three.\n> >\n> > > +\n> > >       int linkSetup(const std::string &source, unsigned int sourcePad,\n> > >                     const std::string &sink, unsigned int sinkPad,\n> > >                     bool enable);\n> > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> > >  {\n> > >       IPU3CameraData *data = cameraData(camera);\n> > >       ImgUDevice *imgu = data->imgu_;\n> > > -     unsigned int bufferCount;\n> > >       int ret;\n> > >\n> > > -     bufferCount = std::max({\n> > > -             
data->outStream_.configuration().bufferCount,\n> > > -             data->vfStream_.configuration().bufferCount,\n> > > -             data->rawStream_.configuration().bufferCount,\n> > > -     });\n> > > -\n> > > -     ret = imgu->allocateBuffers(bufferCount);\n> > > +     ret = imgu->allocateBuffers();\n> > >       if (ret < 0)\n> > >               return ret;\n> > >\n> > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > >  {\n> > >       RPiCameraData *data = cameraData(camera);\n> > >       int ret;\n> > > +     constexpr unsigned int bufferCount = 4;\n> > >\n> > >       /*\n> > > -      * Decide how many internal buffers to allocate. For now, simply look\n> > > -      * at how many external buffers will be provided. We'll need to improve\n> > > -      * this logic. However, we really must have all streams allocate the same\n> > > -      * number of buffers to simplify error handling in queueRequestDevice().\n> > > +      * Allocate internal buffers. 
 We really must have all streams allocate\n> > > +      * the same number of buffers to simplify error handling in\n> > > +      * queueRequestDevice().\n> > >        */\n> > > -     unsigned int maxBuffers = 0;\n> > > -     for (const Stream *s : camera->streams())\n> > > -             if (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > -                     maxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > > -\n> > >       for (auto const stream : data->streams_) {\n> > > -             ret = stream->prepareBuffers(maxBuffers);\n> > > +             ret = stream->prepareBuffers(bufferCount);\n> >\n> > We have a similar problem here, 4 buffer slots is too little, but when\n> > the stream has to allocate internal buffers (!importOnly), which is the\n> > case for most streams, we don't want to overallocate.\n> >\n> > I'd like to get feedback from Naush here, but I think this means we'll\n> > have to relax the requirement documented in the comment above, and\n> > accept a different number of buffers for each stream.\n> \n> Sorry for the late reply to this thread!\n> \n> As is evident from the above comment, this bit of code does need to be improved\n> to avoid over-allocation, which I will get to at some point. However, to address this\n> change and the comments, 4 buffer slots sounds like it might be too little.  Regarding\n> the requirement on having streams allocate the same number of buffers - that can be\n> relaxed (and the comment removed) as we do handle it correctly.\n\nThanks for the information. I understand that this means that we can\ndrop the comment and have different numbers of buffers for different\nstreams without any other change to the pipeline handler. 
If that's\nincorrect, please let me know.\n\n> > >               if (ret < 0)\n> > >                       return ret;\n> > >       }\n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > >       unsigned int ipaBufferId = 1;\n> > >       int ret;\n> > >\n> > > -     unsigned int maxCount = std::max({\n> > > -             data->mainPathStream_.configuration().bufferCount,\n> > > -             data->selfPathStream_.configuration().bufferCount,\n> > > -     });\n> > > -\n> > > -     ret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > > +     ret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > >       if (ret < 0)\n> > >               goto error;\n> > >\n> > > -     ret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > +     ret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > >       if (ret < 0)\n> > >               goto error;\n> > >\n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > index 25f482eb8d8e..fea330f72886 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > >               return -EBUSY;\n> > >\n> > >       /* \\todo Make buffer count user configurable. 
*/\n> > > -     ret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > +     ret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > >       if (ret)\n> > >               return ret;\n> > >\n> > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > index 91757600ccdc..3c5891009c58 100644\n> > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > >  struct StreamConfiguration;\n> > >  struct V4L2SubdeviceFormat;\n> > >\n> > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> >\n> > The situation should be simpler for the rkisp1, as it has a different\n> > pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > restrict the number of internal buffers (for stats and parameters) to\n> > the number of requests we expect to queue to the device at once, plus\n> > one for the IPA.  Four thus seems good. 
Capturing this rationale in a\n> > comment would be good too.\n> >\n> > BTW, I may be too tired to think properly, or just unable to see the\n> > obvious, so please challenge any rationale you think is incorrect.\n> >\n> > > +\n> > >  class RkISP1Path\n> > >  {\n> > >  public:\n> > > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > >\n> > >  int SimpleConverter::Stream::start()\n> > >  {\n> > > -     int ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > +     int ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> >\n> > Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > much of an issue I suppose.\n> >\n> > >       if (ret < 0)\n> > >               return ret;\n> > >\n> > > -     ret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > +     ret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > >       if (ret < 0) {\n> > >               stop();\n> > >               return ret;\n> > > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > > index 276a2a291c21..7e1d60674f62 100644\n> > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > @@ -29,6 +29,9 @@ class SizeRange;\n> > >  struct StreamConfiguration;\n> > >  class V4L2M2MDevice;\n> > >\n> > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> >\n> > Let's name the variables kSimpleInternalBufferCount and\n> > kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> > non-macro constants. 
Same comment elsewhere in this patch.\n> >\n> > Those constants don't belong to converter.h. Could you turn them into\n> > member constants of the SimplePipelineHandler class, as\n> > kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> > slots can be passed as a parameter to SimpleConverter::start().\n> >\n> > There's no stats or parameters here, and no IPA, so the situation is\n> > different than for IPU3 and RkISP1. The number of internal buffers\n> > should just be one more than the minimum number of buffers required by\n> > the capture device, I don't think there's another requirement.\n> >\n> > > +\n> > >  class SimpleConverter\n> > >  {\n> > >  public:\n> > > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> > >                * When using the converter allocate a fixed number of internal\n> > >                * buffers.\n> > >                */\n> > > -             ret = video->allocateBuffers(kNumInternalBuffers,\n> > > +             ret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > >                                            &data->converterBuffers_);\n> > >       } else {\n> > > -             /* Otherwise, prepare for using buffers from the only stream. 
*/\n> > > -             Stream *stream = &data->streams_[0];\n> > > -             ret = video->importBuffers(stream->configuration().bufferCount);\n> > > +             ret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > >       }\n> > >       if (ret < 0)\n> > >               return ret;\n> > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > index fd39b3d3c72c..755949e7a59a 100644\n> > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > @@ -91,6 +91,8 @@ private:\n> > >               return static_cast<UVCCameraData *>(\n> > >                       PipelineHandler::cameraData(camera));\n> > >       }\n> > > +\n> > > +     static constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > >  };\n> > >\n> > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > >  {\n> > >       UVCCameraData *data = cameraData(camera);\n> > > -     unsigned int count = data->stream_.configuration().bufferCount;\n> > >\n> > > -     int ret = data->video_->importBuffers(count);\n> > > +     int ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> >\n> > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > handlers.\n> >\n> > >       if (ret < 0)\n> > >               return ret;\n> > >\n> > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > index e89d53182c6d..24ba743a946c 100644\n> > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > @@ -102,6 +102,8 @@ private:\n> > >               return static_cast<VimcCameraData *>(\n> > >                       PipelineHandler::cameraData(camera));\n> > >       }\n> > > +\n> > > +     static constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > >  };\n> > >\n> > >  namespace {\n> > > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > >  {\n> > >       VimcCameraData *data = cameraData(camera);\n> > > -     unsigned int count = data->stream_.configuration().bufferCount;\n> > >\n> > > -     int ret = data->video_->importBuffers(count);\n> > > +     int ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > >       if (ret < 0)\n> > >               return ret;\n> > >","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 8F360BD87C\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Aug 2021 00:21:33 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 01B6568895;\n\tTue, 17 Aug 2021 02:21:33 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[IPv6:2001:4b98:dc2:55:216:3eff:fef7:d647])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id E5B1F68889\n\tfor 
<libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Aug 2021 02:21:31 +0200 (CEST)","from pendragon.ideasonboard.com (62-78-145-57.bb.dnainternet.fi\n\t[62.78.145.57])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 5504A3E5;\n\tTue, 17 Aug 2021 02:21:31 +0200 (CEST)"],"Date":"Tue, 17 Aug 2021 03:21:25 +0300","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"Naushir Patuck <naush@raspberrypi.com>","Message-ID":"<YRsBBa++KC1IdJVz@pendragon.ideasonboard.com>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<CAEmqJPq94iMjF92TivzPkgRk29dVRB4Rut1SEeRAhvRjuPJOuA@mail.gmail.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<CAEmqJPq94iMjF92TivzPkgRk29dVRB4Rut1SEeRAhvRjuPJOuA@mail.gmail.com>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera devel <libcamera-devel@lists.libcamera.org>,\n\tkernel@collabora.com, =?utf-8?b?QW5kcsOp?=\n\tAlmeida <andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18858,"web_url":"https://patchwork.libcamera.org/comment/18858/","msgid":"<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","date":"2021-08-17T02:33:43","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":2,"url":"https://patchwork.libcamera.org/api/people/2/","name":"Laurent Pinchart","email":"laurent.pinchart@ideasonboard.com"},"content":"Hi Nícolas,\n\nOn Mon, Aug 09, 2021 at 05:26:46PM -0300, Nícolas F. R. A. Prado wrote:\n> On Sat, Aug 07, 2021 at 12:03:52PM -0300, Nícolas F. R. A. Prado wrote:\n> > On Mon, Aug 02, 2021 at 02:42:53AM +0300, Laurent Pinchart wrote:\n> > > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > > > Pipelines have relied on bufferCount to decide on the number of buffers\n> > > > to allocate internally through allocateBuffers() and on the number of\n> > > > V4L2 buffer slots to reserve through importBuffers(). 
Instead, the\n> > > > number of internal buffers should be the minimum required by the\n> > > > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > > > should overallocate to avoid thrashing dmabuf mappings.\n> > > > \n> > > > For now, just set them to constants and stop relying on bufferCount, to\n> > > > allow for its removal.\n> > > > \n> > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > > ---\n> > > > \n> > > > No changes in v7\n> > > > \n> > > > Changes in v6:\n> > > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > > >   INTERNAL_BUFFER_COUNT constant\n> > > > \n> > > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > > >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> > > \n> > > Given that some of the pipeline handlers will need more intrusive\n> > > changes to address the comments below, you could split this with one\n> > > patch per pipeline handler (or perhaps grouping the easy ones together).\n> > > \n> > > > \n> > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > index e955bc3456ba..f36e99dacbe7 100644\n> > > > --- 
a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > > >  /**\n> > > >   * \\brief Allocate buffers for all the ImgU video devices\n> > > >   */\n> > > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > +int ImgUDevice::allocateBuffers()\n> > > >  {\n> > > >  \t/* Share buffers between CIO2 output and ImgU input. */\n> > > > -\tint ret = input_->importBuffers(bufferCount);\n> > > > +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret) {\n> > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> > > >  \t\treturn ret;\n> > > >  \t}\n> > > >  \n> > > > -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > > +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > >  \tif (ret < 0) {\n> > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> > > >  \t\tgoto error;\n> > > >  \t}\n> > > >  \n> > > > -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > > +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > >  \tif (ret < 0) {\n> > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> > > >  \t\tgoto error;\n> > > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > >  \t * corresponding stream is active or inactive, as the driver needs\n> > > >  \t * buffers to be requested on the V4L2 devices in order to operate.\n> > > >  \t */\n> > > > -\tret = output_->importBuffers(bufferCount);\n> > > > +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret < 0) {\n> > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> > > >  \t\tgoto error;\n> > > >  \t}\n> > > >  \n> > > > -\tret = viewfinder_->importBuffers(bufferCount);\n> > > > +\tret = 
viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret < 0) {\n> > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n> > > >  \t\tgoto error;\n> > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > index 9d4915116087..f934a951fc75 100644\n> > > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > @@ -61,7 +61,7 @@ public:\n> > > >  \t\t\t\t\t    outputFormat);\n> > > >  \t}\n> > > >  \n> > > > -\tint allocateBuffers(unsigned int bufferCount);\n> > > > +\tint allocateBuffers();\n> > > >  \tvoid freeBuffers();\n> > > >  \n> > > >  \tint start();\n> > > > @@ -86,6 +86,9 @@ private:\n> > > >  \tstatic constexpr unsigned int PAD_VF = 3;\n> > > >  \tstatic constexpr unsigned int PAD_STAT = 4;\n> > > >  \n> > > > +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > > +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> > > \n> > > 5 buffer slots is low. It means that if applications cycle more than 5\n> > > buffers, the V4L2VideoDevice cache that maintains associations between\n> > > dmabufs and buffer slots will the trashed. Due to the internal queue of\n> > > requests in the IPU3 pipeline handler (similar to what you have\n> > > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > > but performance will suffer. I thus think we need to increase the number\n> > > of slots to what applications can be reasonably expected to use. We\n> > > could use 8, or even 16, as buffer slots are cheap. The same holds for\n> > > other pipeline handlers.\n> > > \n> > > The number of slots for the CIO2 output should match the number of\n> > > buffer slots for the ImgU input, as the same buffers are used on the two\n> > > video devices. 
One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> > > instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> > > buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> > > be used in case the application doesn't provide any RAW buffer, should\n> > > be lower, as those are real buffer and are thus expensive. The number of\n> > > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> > > \n> > > For proper operation, the CIO2 will require at least two queued buffers\n> > > (one being DMA'ed to, and one waiting). We need at least one extra\n> > > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > > processing timings, it may be that the ImgU will complete processing of\n> > > its buffer before the CIO2 captures the next one, leading to a temporary\n> > > situation where the CIO2 will have three buffers queued, or the CIO2\n> > > will finish the capture first, leading to a temporary situation where\n> > > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > > queued. In either case, shortly afterwards, the other component will\n> > > complete capture or processing, and we'll get back to a situation with\n> > > two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> > > minimum of three buffers for raw images.\n> > > \n> > > From an ImgU point of view, we could probably get away with a single\n> > > parameter and a single stats buffer. This would however not allow\n> > > queuing the next frame for processing in the ImgU before the current\n> > > frame completes, so two buffers would be better. Now, if we take the IPA\n> > > into account, the statistics buffer will spend some time on the IPA side\n> > > for processing. 
It would thus be best to have an extra statistics buffer\n> > > to accommodate that, thus requiring three statistics buffers (and three\n> > > parameters buffers, as we associate them together).\n> > > \n> > > This rationale leads to using the same number of internal buffers for\n> > > the CIO2, the parameters and the statistics. We currently use four, and\n> > > while the logic above indicates we could get away with three, it would\n> > > be safer to keep using four in this patch, and possibly reduce the\n> > > number of buffers later.\n> > > \n> > > I know documentation isn't fun, but I think this rationale should be\n> > > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > > item to try and lower the number of internal buffers to three.\n> > \n> > This is the IPU3 topology as I understand it:\n> > \n> >       Output  .               .   Input        Output .\n> >       +---+   .               .   +---+        +---+  .\n> >       |   | --------------------> |   |        |   |  .\n> >       +---+   .               .   +---+        +---+  .\n> > CIO2          .   IPA         .          ImgU         .          IPA\n> >               .        Param  .   Param        Stat   .   Stat\n> >               .        +---+  .   +---+        +---+  .   +---+ \n> >               .        |   | ---> |   |        |   | ---> |   | \n> >               .        +---+  .   +---+        +---+  .   +---+ \n> >           \n> > Your suggestions for the minimum number of buffers required are the following,\n> > from what I understand:\n> > \n> > CIO2 raw internal buffers:\n> > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > - 1x on ImgU Input\n> > \n> > ImgU Param/Stat internal buffers:\n> > - 2x on ImgU Param/Stat (one being processed, one waiting)\n> > - 1x on IPA Stat\n> > \n> > This arrangement doesn't seem to take into account that IPU3Frames::Info binds\n> > CIO2 internal buffers and ImgU Param/Stat buffers together. 
This means that each\n> > raw buffer queued to CIO2 Output needs a Param/Stat buffer as well. And each\n> > Param/Stat buffer queued to ImgU for processing needs a CIO2 raw buffer as well.\n> > After ImgU processing though, the raw buffer gets released and reused, so the\n> > Stat buffer queued to the IPA does not require a CIO2 raw buffer.\n> > \n> > This means that to achieve the above minimum, due to the IPU3Frames::Info\n> > constraint, we'd actually need:\n> > \n> > CIO2 internal buffers:\n> > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > - 2x on ImgU Input (for the two ImgU Param/Stat buffers we want to have there)\n> > \n> > ImgU Param/Stat internal buffers:\n> > - 2x on CIO2 Output (for the two CIO2 raw buffers we want to have there)\n> > - 2x on ImgU Param/Stat (one being processed, one waiting)\n\nNote that the need to have two buffers here is to ensure back-to-back\nprocessing of frames on the ImgU and thus avoid delays, but this need\nactually depends on how fast the ImgU is. With a very fast ImgU\n(compared to the frame duration), inter-frame delays may not be an\nissue. There's more on this below.\n\n> > - 1x on IPA Stat\n\nProcessing of the statistics can occur after the corresponding raw image\nbuffer has been requeued to the CIO2; the only hard requirement is that\nthe buffer needs to be available by the time the ImgU will process the\ncorresponding raw frame buffer again.\n\n> > Also we're not accounting for parameter filling in the IPA before we queue the\n> > buffers to ImgU, but perhaps that's fast enough that it doesn't matter?\n\nThat's one of the questions we need to answer, I don't think we have\nnumbers at this time. If filling the parameters buffer takes a\nsignificant amount of time, then that would need to be taken into\naccount as an additional step in the pipeline, with an additional set of\nbuffers.\n\n> > Does this make sense? 
Or am I missing something?\n\nOne thing that you may not have taken into account is that the two\nbuffers queued on the CIO2 output and the two buffers queued on the ImgU\nare not necessarily queued at the same time. I'll try to explain.\n\nOn the CIO2 side, we have a strong real time requirement to always keep\nthe CIO2 fed with buffers. The details depend a bit on the hardware and\ndriver implementations, but the base idea is that once a buffer is\ncomplete and the time comes to move to the next buffer for the next\nframe, there has to be a next buffer available. When exactly this occurs\ncan vary. Some drivers will give the buffer for the next frame to the\ndevice when capture for the current frame starts, and some will give it\nwhen the hardware signals completion of the capture of the current frame\n(frame end). In theory this could be delayed even a bit more, but it has\nto happen before the hardware needs the new buffer, and giving it when\nthe DMA completes is often too risky already as vertical blanking can be\nshort and interrupts can be delayed a bit. I tried to check the driver\nto see what the exact requirement is, but I'm not familiar with the\nhardware and the code is not very easy to follow.\n\nNote that frame start is the time when the first pixel of the frame is\nwritten to memory, and frame end the time when the last pixel of the\nframe is written to memory. 
The end of frame N and the start of frame\nN+1 are separated by the vertical blanking time.\n\nLet's assume that the CIO2 needs to be programmed with the buffer for\nframe N+1 at the start of frame N (Edit: I've written all the\nexplanation below based on this assumption, but after further\ninvestigation, I *think* the CIO2 only requires the buffer for frame N+1\nat the beginning of frame N+1, but the driver enforces that the buffer\nmust be present just before the start of frame N to avoid race\nconditions - just before the start of frame N and at the start of frame\nN are practically speaking the same thing. Sakari, do you know if this is\ncorrect ?). We'll constantly transition between the following states,\nfrom the CIO2 point of view.\n\n0. (Initial state) 2x idle buffers in the queue, hardware stopped. The\n   CIO2 is then started, the first buffer in the queue is given to the\n   device to capture the first frame, and the second buffer in the queue\n   is given to the device to capture the second frame. The first frame\n   starts.\n\n1. 1x active buffer being DMA'ed to, 1x pending buffer already given to\n   the hardware for the next frame, 0x idle buffers in the queue. Two\n   events can occur at this point, either completion of the current\n   frame (-> 2), or a new buffer being queued by userspace (-> 4).\n\n2. 0x active buffer being DMA'ed to, 1x pending buffer already given to\n   the hardware for the next frame, 0x idle buffers in the queue. Two\n   events can occur at this point, either start of the next frame (->\n   3), or a new buffer being queued by userspace (-> 5).\n\n   This state lasts for the duration of the vertical blanking only, and\n   can thus be short-lived.\n\n3. The next frame starts. The pending buffer becomes active. We have no\n   buffer in the queue to give to the hardware for the next frame. An\n   underrun has occurred, a frame will be dropped. Game over.\n\n4. 
1x active buffer being DMA'ed to, 1x pending buffer already given to\n   the hardware for the next frame, 1x idle buffers in the queue. The\n   next event that will occur is the start of the next frame (as the\n   other option, a new buffer being queued, will give us additional\n   safety by increasing the number of queued buffers, but isn't\n   meaningful when considering the case where we try to run with the\n   minimum number of buffers possible).\n\n   As the current frame ends, the active buffer is given back to\n   userspace. There's no active buffer (the DMA will start soon, after\n   the vertical blanking, when the next frame starts), the pending\n   buffer stays pending, and the idle buffer stays idle (-> 5).\n\n5. 0x active buffer being DMA'ed to, 1x pending buffer already given to\n   the hardware for the next frame, 1x idle buffers in the queue. The\n   next event that will occur is the start of the next frame (for the\n   same reason as in 4).\n\n   As the next frame starts, the pending buffer becomes active. The\n   queued buffer is given to the hardware for the subsequent frame. The\n   queue of idle buffers becomes empty (-> 1).\n\n   If this state is reached from state 2, it lasts for the remainder of\n   the vertical blanking only. If it is reached from state 4, it lasts\n   for the whole vertical blanking. In both cases, it can be\n   short-lived.\n\nWe can thus cycle either through 1 -> 2 -> 5 -> 1 or through 1 -> 4 -> 5\n-> 1. The first cycle requires two buffers for the CIO2, with an\nintermediate state (2) that has a single buffer only. This is unsafe, as\na failure to queue a second buffer in the short-lived state 2 will lead\nto state 3 and frame drops.\n\nThe second cycle requires three buffers for the CIO2. This is the cycle\nwe want to use, to avoid frame drops. Note that only state 4 requires\nall three buffers, and userspace can queue the third buffer at any point\nin state 1 (before the end of the current frame). 
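The two cycles above can be checked with a small model of the state machine. This is illustrative only; the Cio2State structure and the event ordering are simplifications with made-up names, not driver code:

```cpp
#include <algorithm>
#include <cassert>

/*
 * Minimal model of the CIO2 buffer states: a state counts the buffers that
 * are active (being DMA'ed to), pending (already given to the hardware for
 * the next frame) and idle (queued by userspace, not yet given to the
 * hardware).
 */
struct Cio2State {
	unsigned int active;
	unsigned int pending;
	unsigned int idle;

	unsigned int total() const { return active + pending + idle; }
};

/*
 * Run a few frames through the cycle and return the peak number of buffers
 * queued on the CIO2. queueEarly selects the safe cycle 1 -> 4 -> 5 -> 1
 * (userspace requeues during state 1); otherwise the unsafe cycle
 * 1 -> 2 -> 5 -> 1 is used (requeue during the vertical blanking).
 */
unsigned int peakBuffers(bool queueEarly)
{
	/* State 1: one active, one pending, empty queue. */
	Cio2State s{ 1, 1, 0 };
	unsigned int peak = s.total();

	for (unsigned int frame = 0; frame < 10; ++frame) {
		if (queueEarly) {
			/* Userspace requeues in state 1 (-> state 4). */
			s.idle++;
			peak = std::max(peak, s.total());
		}

		/* Frame end: active buffer returned to userspace (-> 2 or 5). */
		s.active = 0;

		if (!queueEarly) {
			/* Requeue during the short-lived blanking (-> 5). */
			s.idle++;
			peak = std::max(peak, s.total());
		}

		/*
		 * Frame start: pending becomes active, the idle buffer (if
		 * any) is given to the hardware and becomes pending. A missing
		 * pending buffer here would be the state 3 underrun.
		 */
		assert(s.pending == 1);
		s.active = 1;
		s.pending = s.idle;
		s.idle = 0;
	}

	return peak;
}
```

Running both variants reproduces the counts from the discussion: the safe cycle peaks at three buffers (state 4), while the unsafe cycle never needs more than two, at the cost of relying on a requeue landing inside the vertical blanking.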
If userspace queues\nthe frame slightly too late, after the completion of the current frame\nbut before the start of the next one, we'll go to the unsafe cycle but\nwill still not lose frames.\n\nNow, let's look at the ImgU side, and assume we use three buffers in\ntotal. The ImgU operates from memory to memory, it thus has no realtime\nrequirement. It only starts processing a frame when the frame is given\nto it. This occurs, from a CIO2 point of view, in the transition from\nstate 4 to state 5, plus all delays introduced by delivering the CIO2\nframe completion event to userspace, queueing the frame to the ImgU (I'm\nignoring the IPA here), and starting the ImgU itself. The ImgU\nprocessing time will, on average, be lower than the frame duration,\notherwise it won't be able to process all frames. Once the ImgU\ncompletes processing of the frame, it will signal this to userspace.\nThere's also a processing delay there (signalling, task switching, ...),\nand userspace will requeue the frame to the CIO2. This has to occur at\nthe latest before the end of the current frame, otherwise state 1 will\ntransition to state 2.\n\nWe thus see that, in the 3 buffers case, we need to ensure that the\ntotal time to process the frame on the ImgU, from the CIO2 interrupt\nsignalling the end of state 4 to the buffer being requeued to the CIO2,\nthus including all task switching and other delays, doesn't exceed the\nduration of states 5 + 1, which is equal to the duration of a frame. The\nImgU processing time itself is guaranteed to be lower than that, but the\nadditional delays may be problematic. We also need to include a possible\nround-trip to the IPA after end of buffer capture by the CIO2 and start\nof processing by the ImgU to retrieve the ImgU parameters for the frame.\nThree buffers start sounding quite risky. 
I'm thus correcting myself,\nfour buffers seem safer.\n\nNone of this takes the parameters or statistics buffers into account,\nbut I don't think they're particularly problematic in the sense that the\nmost strict realtime constraints come from the raw image buffers. Feel\nfree to prove me wrong though :-)\n\nLet's however note that we can probably fetch the ImgU parameters for\nthe frame that has just been captured before the end of the frame, so\nthat would remove a delay in the ImgU processing. This assumes that the\nalgorithms wouldn't need to know the exact exposure time and analog gain\nthat have been used to capture the current frame in order to compute the\nImgU parameters. This leads to a first question to David: does the\nRaspberry Pi IPA require the sensor metadata to calculate ISP\nparameters, or are they needed only when processing statistics from\nframe N to calculate sensor and ISP parameters of subsequent frames ?\n\nThe next question is for everybody (and that's why I've expanded the CC\nlist to Kieran, Jean-Michel and Sakari too): what did I get wrong in the\nabove explanation ? 
:-)\n\n> > > > +\n> > > >  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n> > > >  \t\t      const std::string &sink, unsigned int sinkPad,\n> > > >  \t\t      bool enable);\n> > > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> > > >  {\n> > > >  \tIPU3CameraData *data = cameraData(camera);\n> > > >  \tImgUDevice *imgu = data->imgu_;\n> > > > -\tunsigned int bufferCount;\n> > > >  \tint ret;\n> > > >  \n> > > > -\tbufferCount = std::max({\n> > > > -\t\tdata->outStream_.configuration().bufferCount,\n> > > > -\t\tdata->vfStream_.configuration().bufferCount,\n> > > > -\t\tdata->rawStream_.configuration().bufferCount,\n> > > > -\t});\n> > > > -\n> > > > -\tret = imgu->allocateBuffers(bufferCount);\n> > > > +\tret = imgu->allocateBuffers();\n> > > >  \tif (ret < 0)\n> > > >  \t\treturn ret;\n> > > >  \n> > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > > >  {\n> > > >  \tRPiCameraData *data = cameraData(camera);\n> > > >  \tint ret;\n> > > > +\tconstexpr unsigned int bufferCount = 4;\n> > > >  \n> > > >  \t/*\n> > > > -\t * Decide how many internal buffers to allocate. For now, simply look\n> > > > -\t * at how many external buffers will be provided. We'll need to improve\n> > > > -\t * this logic. 
However, we really must have all streams allocate the same\n> > > > -\t * number of buffers to simplify error handling in queueRequestDevice().\n> > > > +\t * Allocate internal buffers. We really must have all streams allocate\n> > > > +\t * the same number of buffers to simplify error handling in\n> > > > +\t * queueRequestDevice().\n> > > >  \t */\n> > > > -\tunsigned int maxBuffers = 0;\n> > > > -\tfor (const Stream *s : camera->streams())\n> > > > -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > > -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > > > -\n> > > >  \tfor (auto const stream : data->streams_) {\n> > > > -\t\tret = stream->prepareBuffers(maxBuffers);\n> > > > +\t\tret = stream->prepareBuffers(bufferCount);\n> > > \n> > > We have a similar problem here, 4 buffer slots is too little, but when\n> > > the stream has to allocate internal buffers (!importOnly), which is the\n> > > case for most streams, we don't want to overallocate.\n> > > \n> > > I'd like to get feedback from Naush here, but I think this means we'll\n> > > have to relax the requirement documented in the comment above, and\n> > > accept a different number of buffers for each stream.\n> > > \n> > > >  \t\tif (ret < 0)\n> > > >  \t\t\treturn ret;\n> > > >  \t}\n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > > >  \tunsigned int ipaBufferId = 1;\n> > > >  \tint ret;\n> > > >  \n> > > > -\tunsigned int maxCount = std::max({\n> > > > -\t\tdata->mainPathStream_.configuration().bufferCount,\n> > > > -\t\tdata->selfPathStream_.configuration().bufferCount,\n> > > > -\t});\n> > > > -\n> > > > -\tret = param_->allocateBuffers(maxCount, 
&paramBuffers_);\n> > > > +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > >  \tif (ret < 0)\n> > > >  \t\tgoto error;\n> > > >  \n> > > > -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > > +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > >  \tif (ret < 0)\n> > > >  \t\tgoto error;\n> > > >  \n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > index 25f482eb8d8e..fea330f72886 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > > >  \t\treturn -EBUSY;\n> > > >  \n> > > >  \t/* \\todo Make buffer count user configurable. */\n> > > > -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > > +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret)\n> > > >  \t\treturn ret;\n> > > >  \n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > index 91757600ccdc..3c5891009c58 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > > >  struct StreamConfiguration;\n> > > >  struct V4L2SubdeviceFormat;\n> > > >  \n> > > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> > > \n> > > The situation should be simpler for the rkisp1, as it has a different\n> > > pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> > > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > > restrict the number of internal buffers (for stats and parameters) to\n> > > the number of requests we expect to queue to the device at once, plus\n> > > one for the IPA.  
Four thus seems good. Capturing this rationale in a\n> > > comment would be good too.\n> \n> Shouldn't we also have one extra buffer queued to the capture device, like for\n> the others, totalling five (four on the capture, one on the IPA)? Or since the\n> driver already requires three buffers the extra one isn't needed?\n>\n> I'm not sure how it works, but if the driver requires three buffers at all times\n> to keep streaming, then I think we indeed should have the extra buffer to avoid\n> dropping frames. Otherwise, if that requirement is only for starting the stream,\n> then for drivers that require at least two buffers we shouldn't need an extra\n> one, I'd think.\n\nIt seems to be only needed to start capture. Even then I think it could\nbe lowered to two buffers, I don't see anything in the driver that\nrequires three. Maybe someone from Collabora could comment on this ? And\nmaybe you could give it a try by modifying the driver ?\n\nBy the way, if you try to apply the CIO2 reasoning above to the RkISP1,\nyou will need to take into account the fact that the driver programs the\nhardware with the buffer for frame N+1 not at the beginning of frame N,\nbut at the end of frame N-1.\n\nI think four buffers is enough. 
We currently use four buffers and it\nseems to work :-) Granted, the RkISP1 IPA is a skeleton, so this\nargument isn't very strong, but given that the driver only needs two\nbuffers except at start time, four should be fine.\n\n> > > BTW, I may be too tired to think properly, or just unable to see the\n> > > obvious, so please challenge any rationale you think is incorrect.\n> > > \n> > > > +\n> > > >  class RkISP1Path\n> > > >  {\n> > > >  public:\n> > > > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > > >  \n> > > >  int SimpleConverter::Stream::start()\n> > > >  {\n> > > > -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > > +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > \n> > > Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > > much of an issue I suppose.\n> \n> Indeed. 
I was under the impression that we should always importBuffers() using\n> BUFFER_SLOT_COUNT, but now, after reading more code, I understand that's not\n> always the case (although this seems to be the only case, due to the presence of\n> the converter).\n> \n> > > >  \tif (ret < 0)\n> > > >  \t\treturn ret;\n> > > >  \n> > > > -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > > +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret < 0) {\n> > > >  \t\tstop();\n> > > >  \t\treturn ret;\n> > > > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > > > index 276a2a291c21..7e1d60674f62 100644\n> > > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > > @@ -29,6 +29,9 @@ class SizeRange;\n> > > >  struct StreamConfiguration;\n> > > >  class V4L2M2MDevice;\n> > > >  \n> > > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> > > \n> > > Let's name the variables kSimpleInternalBufferCount and\n> > > kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> > > non-macro constants. Same comment elsewhere in this patch.\n> > > \n> > > Those constants don't belong to converter.h. Could you turn them into\n> > > member constants of the SimplePipelineHandler class, as\n> > > kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> > > slots can be passed as a parameter to SimpleConverter::start().\n> > > \n> > > There's no stats or parameters here, and no IPA, so the situation is\n> > > different than for IPU3 and RkISP1. 
The number of internal buffers\n> > > should just be one more than the minimum number of buffers required by\n> > > the capture device, I don't think there's another requirement.\n> \n> Plus one extra to have queued at the converter's 'output' node (which is its\n> input, confusingly)?\n\nIt depends a bit on the exact timings of the capture device, as is\nprobably clear with the explanation above (or at least is now clearly\nseen as a complicated topic :-)). We need to ensure that the realtime\nrequirements of the device are met, and that the capture buffers that\ncomplete, and are then processed by the converter, will be requeued in\ntime to the capture device to meet those requirements.\n\nAs the simple pipeline handler deals with a variety of devices, we have\ntwo options, either checking the requirements of each device and\nrecording them in the supportedDevices array, or pick a common number of\nbuffers that should be good enough for everybody. I'd start with the\nsecond option for simplicity, and as the pipeline handler currently uses\n3 buffers, I'd stick to that for now.\n\n> > > > +\n> > > >  class SimpleConverter\n> > > >  {\n> > > >  public:\n> > > > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> > > >  \t\t * When using the converter allocate a fixed number of internal\n> > > >  \t\t * buffers.\n> > > >  \t\t */\n> > > > -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> > > > +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > > >  \t\t\t\t\t     &data->converterBuffers_);\n> > > >  \t} else {\n> > > > -\t\t/* Otherwise, prepare for using buffers from the only stream. 
*/\n> > > > -\t\tStream *stream = &data->streams_[0];\n> > > > -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> > > > +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > >  \t}\n> > > >  \tif (ret < 0)\n> > > >  \t\treturn ret;\n> > > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > index fd39b3d3c72c..755949e7a59a 100644\n> > > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > @@ -91,6 +91,8 @@ private:\n> > > >  \t\treturn static_cast<UVCCameraData *>(\n> > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > >  \t}\n> > > > +\n> > > > +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > > >  };\n> > > >  \n> > > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > >  {\n> > > >  \tUVCCameraData *data = cameraData(camera);\n> > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > >  \n> > > > -\tint ret = data->video_->importBuffers(count);\n> > > > +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> > > \n> > > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > > handlers.\n> > > \n> > > >  \tif (ret < 0)\n> > > >  \t\treturn ret;\n> > > >  \n> > > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > index e89d53182c6d..24ba743a946c 100644\n> > > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > @@ -102,6 +102,8 @@ private:\n> > > >  \t\treturn static_cast<VimcCameraData *>(\n> > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > >  \t}\n> > > > +\n> > > > +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > > >  };\n> > > >  \n> > > >  namespace {\n> > > > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > >  {\n> > > >  \tVimcCameraData *data = cameraData(camera);\n> > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > >  \n> > > > -\tint ret = data->video_->importBuffers(count);\n> > > > +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > > >  \tif (ret < 0)\n> > > >  \t\treturn ret;\n> > > >","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id BBCA5BD87D\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Aug 2021 02:33:51 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id E5CB568895;\n\tTue, 17 Aug 2021 04:33:50 +0200 (CEST)","from perceval.ideasonboard.com (perceval.ideasonboard.com\n\t[213.167.242.64])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id D83A068889\n\tfor 
<libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Aug 2021 04:33:49 +0200 (CEST)","from pendragon.ideasonboard.com (62-78-145-57.bb.dnainternet.fi\n\t[62.78.145.57])\n\tby perceval.ideasonboard.com (Postfix) with ESMTPSA id 34D923E5;\n\tTue, 17 Aug 2021 04:33:49 +0200 (CEST)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (1024-bit key;\n\tunprotected) header.d=ideasonboard.com header.i=@ideasonboard.com\n\theader.b=\"cDybG7nz\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/simple; d=ideasonboard.com;\n\ts=mail; t=1629167629;\n\tbh=eao22Z7GsOfEJ027zmoVxe3Lo1PvYHKZpGjDEvs0bkE=;\n\th=Date:From:To:Cc:Subject:References:In-Reply-To:From;\n\tb=cDybG7nzRRJQvcHYU86uLcZLXdoIOZ/uxrRvAKL2XpLg69dxC5FuQMWGdOA756AvJ\n\tOzh4k168d1OAyXh5m4N2mAx5dG1RpSSgcek7H6jTvQlhOKQNvlz5o+T64TyXXMzid6\n\tt23v7ueLGuADliAh8PYzXMAFLr+XDKHYt90Ji5Fo=","Date":"Tue, 17 Aug 2021 05:33:43 +0300","From":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","To":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","Message-ID":"<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<20210807150345.o4mcczkjt5vxium4@notapiano>\n\t<20210809202646.blgq4lyab7ktglsp@notapiano>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<20210809202646.blgq4lyab7ktglsp@notapiano>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, Sakari Ailus <sakari.ailus@iki.fi>, \n\t=?utf-8?b?QW5kcsOp?= Almeida <andrealmeid@collabora.com>,\n\tkernel@collabora.com","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18862,"web_url":"https://patchwork.libcamera.org/comment/18862/","msgid":"<CAEmqJPqqv8K63Mc0ktDpx2ZdcRtxWrM4yFcH9PJHZU_Hpc=4+A@mail.gmail.com>","date":"2021-08-17T06:47:13","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":34,"url":"https://patchwork.libcamera.org/api/people/34/","name":"Naushir Patuck","email":"naush@raspberrypi.com"},"content":"Hi Laurent,\n\n\n\nOn Tue, 17 Aug 2021, 1:21 am Laurent Pinchart, <\nlaurent.pinchart@ideasonboard.com> wrote:\n\n> Hi Naush,\n>\n> On Thu, Aug 12, 2021 at 12:32:28PM +0100, Naushir Patuck wrote:\n> > On Mon, 2 Aug 2021 at 00:43, Laurent Pinchart wrote:\n> > > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. Prado wrote:\n> > > > Pipelines have relied on bufferCount to decide on the number of\n> buffers\n> > > > to allocate internally through allocateBuffers() and on the number of\n> > > > V4L2 buffer slots to reserve through importBuffers(). 
Instead, the\n> > > > number of internal buffers should be the minimum required by the\n> > > > algorithms to avoid wasting memory, and the number of V4L2 buffer\n> slots\n> > > > should overallocate to avoid thrashing dmabuf mappings.\n> > > >\n> > > > For now, just set them to constants and stop relying on bufferCount,\n> to\n> > > > allow for its removal.\n> > > >\n> > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > > ---\n> > > >\n> > > > No changes in v7\n> > > >\n> > > > Changes in v6:\n> > > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > > >   INTERNAL_BUFFER_COUNT constant\n> > > >\n> > > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15\n> +++++----------\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > > >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> > >\n> > > Given that some of the pipeline handlers will need more intrusive\n> > > changes to address the comments below, you could split this with one\n> > > patch per pipeline handler (or perhaps grouping the easy ones\n> together).\n> > >\n> > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp\n> b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > index e955bc3456ba..f36e99dacbe7 100644\n> > > > --- 
a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > @@ -593,22 +593,22 @@ int\n> ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > > >  /**\n> > > >   * \\brief Allocate buffers for all the ImgU video devices\n> > > >   */\n> > > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > +int ImgUDevice::allocateBuffers()\n> > > >  {\n> > > >       /* Share buffers between CIO2 output and ImgU input. */\n> > > > -     int ret = input_->importBuffers(bufferCount);\n> > > > +     int ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >       if (ret) {\n> > > >               LOG(IPU3, Error) << \"Failed to import ImgU input\n> buffers\";\n> > > >               return ret;\n> > > >       }\n> > > >\n> > > > -     ret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > > +     ret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT,\n> &paramBuffers_);\n> > > >       if (ret < 0) {\n> > > >               LOG(IPU3, Error) << \"Failed to allocate ImgU param\n> buffers\";\n> > > >               goto error;\n> > > >       }\n> > > >\n> > > > -     ret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > > +     ret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT,\n> &statBuffers_);\n> > > >       if (ret < 0) {\n> > > >               LOG(IPU3, Error) << \"Failed to allocate ImgU stat\n> buffers\";\n> > > >               goto error;\n> > > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int\n> bufferCount)\n> > > >        * corresponding stream is active or inactive, as the driver\n> needs\n> > > >        * buffers to be requested on the V4L2 devices in order to\n> operate.\n> > > >        */\n> > > > -     ret = output_->importBuffers(bufferCount);\n> > > > +     ret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >       if (ret < 0) {\n> > > >               LOG(IPU3, Error) << \"Failed to import ImgU output\n> buffers\";\n> > > 
>               goto error;\n> > > >       }\n> > > >\n> > > > -     ret = viewfinder_->importBuffers(bufferCount);\n> > > > +     ret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > >       if (ret < 0) {\n> > > >               LOG(IPU3, Error) << \"Failed to import ImgU viewfinder\n> buffers\";\n> > > >               goto error;\n> > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h\n> b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > index 9d4915116087..f934a951fc75 100644\n> > > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > @@ -61,7 +61,7 @@ public:\n> > > >                                           outputFormat);\n> > > >       }\n> > > >\n> > > > -     int allocateBuffers(unsigned int bufferCount);\n> > > > +     int allocateBuffers();\n> > > >       void freeBuffers();\n> > > >\n> > > >       int start();\n> > > > @@ -86,6 +86,9 @@ private:\n> > > >       static constexpr unsigned int PAD_VF = 3;\n> > > >       static constexpr unsigned int PAD_STAT = 4;\n> > > >\n> > > > +     static constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > > +     static constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> > >\n> > > 5 buffer slots is low. It means that if applications cycle more than 5\n> > > buffers, the V4L2VideoDevice cache that maintains associations between\n> > > dmabufs and buffer slots will be thrashed. Due to the internal queue of\n> > > requests in the IPU3 pipeline handler (similar to what you have\n> > > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > > but performance will suffer. I thus think we need to increase the\n> number\n> > > of slots to what applications can be reasonably expected to use. We\n> > > could use 8, or even 16, as buffer slots are cheap. 
The same holds for\n> > > other pipeline handlers.\n> > >\n> > > The number of slots for the CIO2 output should match the number of\n> > > buffer slots for the ImgU input, as the same buffers are used on the\n> two\n> > > video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the\n> CIO2,\n> > > instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> > > buffers that are allocated by exportBuffers() in CIO2Device::start(),\n> to\n> > > be used in case the application doesn't provide any RAW buffer, should\n> > > be lower, as those are real buffers and are thus expensive. The number\n> of\n> > > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> > >\n> > > For proper operation, the CIO2 will require at least two queued buffers\n> > > (one being DMA'ed to, and one waiting). We need at least one extra\n> > > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > > processing timings, it may be that the ImgU will complete processing of\n> > > its buffer before the CIO2 captures the next one, leading to a\n> temporary\n> > > situation where the CIO2 will have three buffers queued, or the CIO2\n> > > will finish the capture first, leading to a temporary situation where\n> > > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > > queued. In either case, shortly afterwards, the other component will\n> > > complete capture or processing, and we'll get back to a situation with\n> > > two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> > > minimum of three buffers for raw images.\n> > >\n> > > From an ImgU point of view, we could probably get away with a single\n> > > parameter and a single stats buffer. This would however not allow\n> > > queuing the next frame for processing in the ImgU before the current\n> > > frame completes, so two buffers would be better. 
Now, if we take the\n> IPA\n> > > into account, the statistics buffer will spend some time on the IPA\n> side\n> > > for processing. It would thus be best to have an extra statistics\n> buffer\n> > > to accommodate that, thus requiring three statistics buffers (and three\n> > > parameters buffers, as we associate them together).\n> > >\n> > > This rationale leads to using the same number of internal buffers for\n> > > the CIO2, the parameters and the statistics. We currently use four, and\n> > > while the logic above indicates we could get away with three, it would\n> > > be safer to keep using four in this patch, and possibly reduce the\n> > > number of buffers later.\n> > >\n> > > I know documentation isn't fun, but I think this rationale should be\n> > > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > > item to try and lower the number of internal buffers to three.\n> > >\n> > > > +\n> > > >       int linkSetup(const std::string &source, unsigned int\n> sourcePad,\n> > > >                     const std::string &sink, unsigned int sinkPad,\n> > > >                     bool enable);\n> > > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera\n> *camera)\n> > > >  {\n> > > >       IPU3CameraData *data = cameraData(camera);\n> > > >       ImgUDevice *imgu = data->imgu_;\n> > > > -     unsigned int bufferCount;\n> > > >       int ret;\n> > > >\n> > > > -     bufferCount = std::max({\n> > > > -             data->outStream_.configuration().bufferCount,\n> > > > -             data->vfStream_.configuration().bufferCount,\n> > > > -             data->rawStream_.configuration().bufferCount,\n> > > > -     });\n> > > > -\n> > > > -     ret = 
imgu->allocateBuffers(bufferCount);\n> > > > +     ret = imgu->allocateBuffers();\n> > > >       if (ret < 0)\n> > > >               return ret;\n> > > >\n> > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > @@ -1149,20 +1149,15 @@ int\n> PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > > >  {\n> > > >       RPiCameraData *data = cameraData(camera);\n> > > >       int ret;\n> > > > +     constexpr unsigned int bufferCount = 4;\n> > > >\n> > > >       /*\n> > > > -      * Decide how many internal buffers to allocate. For now,\n> simply look\n> > > > -      * at how many external buffers will be provided. We'll need\n> to improve\n> > > > -      * this logic. However, we really must have all streams\n> allocate the same\n> > > > -      * number of buffers to simplify error handling in\n> queueRequestDevice().\n> > > > +      * Allocate internal buffers. 
We really must have all streams\n> allocate\n> > > > +      * the same number of buffers to simplify error handling in\n> > > > +      * queueRequestDevice().\n> > > >        */\n> > > > -     unsigned int maxBuffers = 0;\n> > > > -     for (const Stream *s : camera->streams())\n> > > > -             if (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > > -                     maxBuffers = std::max(maxBuffers,\n> s->configuration().bufferCount);\n> > > > -\n> > > >       for (auto const stream : data->streams_) {\n> > > > -             ret = stream->prepareBuffers(maxBuffers);\n> > > > +             ret = stream->prepareBuffers(bufferCount);\n> > >\n> > > We have a similar problem here, 4 buffer slots is too little, but when\n> > > the stream has to allocate internal buffers (!importOnly), which is the\n> > > case for most streams, we don't want to overallocate.\n> > >\n> > > I'd like to get feedback from Naush here, but I think this means we'll\n> > > have to relax the requirement documented in the comment above, and\n> > > accept a different number of buffers for each stream.\n> >\n> > Sorry for the late reply to this thread!\n> >\n> > As is evident from the above comment, this bit of code does need to be\n> improved\n> > to avoid over-allocation which I will get to at some point. However,\n> to address this\n> > change and the comments, 4 buffer slots sounds like it might be too\n> little.  Regarding\n> > the requirement on having streams allocate the same number of buffers -\n> that can be\n> > relaxed (and the comment removed) as we do handle it correctly.\n>\n> Thanks for the information. I understand that this means that we can\n> drop the comment and have different numbers of buffers for different\n> streams without any other change to the pipeline handler. 
If that's\n> incorrect, please let me know.\n>\n\n\nYes, that should be the case now.\n\nHowever, I would probably still prefer to keep the number of Unicam Image\nand Unicam Embedded buffers the same for symmetry.\nI don't think that should cause any issue with this rework.\n\nRegards,\nNaush\n\n>\n> > > >               if (ret < 0)\n> > > >                       return ret;\n> > > >       }\n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > @@ -690,16 +690,11 @@ int\n> PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > > >       unsigned int ipaBufferId = 1;\n> > > >       int ret;\n> > > >\n> > > > -     unsigned int maxCount = std::max({\n> > > > -             data->mainPathStream_.configuration().bufferCount,\n> > > > -             data->selfPathStream_.configuration().bufferCount,\n> > > > -     });\n> > > > -\n> > > > -     ret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > > > +     ret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT,\n> &paramBuffers_);\n> > > >       if (ret < 0)\n> > > >               goto error;\n> > > >\n> > > > -     ret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > > +     ret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT,\n> &statBuffers_);\n> > > >       if (ret < 0)\n> > > >               goto error;\n> > > >\n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > index 25f482eb8d8e..fea330f72886 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > > >               return -EBUSY;\n> > > >\n> > > >       /* \\todo Make buffer count user configurable. 
*/\n> > > > -     ret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > > +     ret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > > >       if (ret)\n> > > >               return ret;\n> > > >\n> > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > index 91757600ccdc..3c5891009c58 100644\n> > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > > >  struct StreamConfiguration;\n> > > >  struct V4L2SubdeviceFormat;\n> > > >\n> > > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> > >\n> > > The situation should be simpler for the rkisp1, as it has a different\n> > > pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> > > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > > restrict the number of internal buffers (for stats and parameters) to\n> > > the number of requests we expect to queue to the device at once, plus\n> > > one for the IPA.  Four thus seems good. 
Capturing this rationale in a\n> > > comment would be good too.\n> > >\n> > > BTW, I may be too tired to think properly, or just unable to see the\n> > > obvious, so please challenge any rationale you think is incorrect.\n> > >\n> > > > +\n> > > >  class RkISP1Path\n> > > >  {\n> > > >  public:\n> > > > diff --git a/src/libcamera/pipeline/simple/converter.cpp\n> b/src/libcamera/pipeline/simple/converter.cpp\n> > > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > > @@ -103,11 +103,11 @@ int\n> SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > > >\n> > > >  int SimpleConverter::Stream::start()\n> > > >  {\n> > > > -     int ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > > +     int ret =\n> m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > >\n> > > Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > > much of an issue I suppose.\n> > >\n> > > >       if (ret < 0)\n> > > >               return ret;\n> > > >\n> > > > -     ret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > > +     ret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > >       if (ret < 0) {\n> > > >               stop();\n> > > >               return ret;\n> > > > diff --git a/src/libcamera/pipeline/simple/converter.h\n> b/src/libcamera/pipeline/simple/converter.h\n> > > > index 276a2a291c21..7e1d60674f62 100644\n> > > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > > @@ -29,6 +29,9 @@ class SizeRange;\n> > > >  struct StreamConfiguration;\n> > > >  class V4L2M2MDevice;\n> > > >\n> > > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> > >\n> > > Let's name the variables kSimpleInternalBufferCount and\n> > > kSimpleBufferSlotCount, as that's the naming 
scheme we're moving to for\n> > > non-macro constants. Same comment elsewhere in this patch.\n> > >\n> > > Those constants don't belong to converter.h. Could you turn them into\n> > > member constants of the SimplePipelineHandler class, as\n> > > kNumInternalBuffers (which btw should be removed) ? The number of\n> buffer\n> > > slots can be passed as a parameter to SimpleConverter::start().\n> > >\n> > > There's no stats or parameters here, and no IPA, so the situation is\n> > > different than for IPU3 and RkISP1. The number of internal buffers\n> > > should just be one more than the minimum number of buffers required by\n> > > the capture device, I don't think there's another requirement.\n> > >\n> > > > +\n> > > >  class SimpleConverter\n> > > >  {\n> > > >  public:\n> > > > diff --git a/src/libcamera/pipeline/simple/simple.cpp\n> b/src/libcamera/pipeline/simple/simple.cpp\n> > > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera\n> *camera, [[maybe_unused]] const ControlL\n> > > >                * When using the converter allocate a fixed number of\n> internal\n> > > >                * buffers.\n> > > >                */\n> > > > -             ret = video->allocateBuffers(kNumInternalBuffers,\n> > > > +             ret =\n> video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > > >                                            &data->converterBuffers_);\n> > > >       } else {\n> > > > -             /* Otherwise, prepare for using buffers from the only\n> stream. 
*/\n> > > > -             Stream *stream = &data->streams_[0];\n> > > > -             ret =\n> video->importBuffers(stream->configuration().bufferCount);\n> > > > +             ret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > >       }\n> > > >       if (ret < 0)\n> > > >               return ret;\n> > > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > index fd39b3d3c72c..755949e7a59a 100644\n> > > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > @@ -91,6 +91,8 @@ private:\n> > > >               return static_cast<UVCCameraData *>(\n> > > >                       PipelineHandler::cameraData(camera));\n> > > >       }\n> > > > +\n> > > > +     static constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > > >  };\n> > > >\n> > > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > > @@ -236,9 +238,8 @@ int\n> PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]]\n> const ControlList *controls)\n> > > >  {\n> > > >       UVCCameraData *data = cameraData(camera);\n> > > > -     unsigned int count = data->stream_.configuration().bufferCount;\n> > > >\n> > > > -     int ret = data->video_->importBuffers(count);\n> > > > +     int ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> > >\n> > > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > > handlers.\n> > >\n> > > >       if (ret < 0)\n> > > >               return ret;\n> > > >\n> > > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp\n> b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > index e89d53182c6d..24ba743a946c 100644\n> > > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > @@ -102,6 +102,8 @@ private:\n> > > >               return static_cast<VimcCameraData *>(\n> > > >                       PipelineHandler::cameraData(camera));\n> > > >       }\n> > > > +\n> > > > +     static constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > > >  };\n> > > >\n> > > >  namespace {\n> > > > @@ -312,9 +314,8 @@ int\n> PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]]\n> const ControlList *controls)\n> > > >  {\n> > > >       VimcCameraData *data = cameraData(camera);\n> > > > -     unsigned int count = data->stream_.configuration().bufferCount;\n> > > >\n> > > > -     int ret = data->video_->importBuffers(count);\n> > > > +     int ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > > >       if (ret < 0)\n> > > >               return ret;\n> > > >\n>\n> --\n> Regards,\n>\n> Laurent Pinchart\n>","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id 60EA6BD87C\n\tfor <parsemail@patchwork.libcamera.org>;\n\tTue, 17 Aug 2021 06:47:27 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 15C16688AC;\n\tTue, 17 Aug 2021 08:47:27 +0200 (CEST)","from mail-lj1-x22c.google.com 
(mail-lj1-x22c.google.com\n\t[IPv6:2a00:1450:4864:20::22c])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 08FAF6025B\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tTue, 17 Aug 2021 08:47:26 +0200 (CEST)","by mail-lj1-x22c.google.com with SMTP id y7so31316758ljp.3\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tMon, 16 Aug 2021 23:47:26 -0700 (PDT)"],"Authentication-Results":"lancelot.ideasonboard.com;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=raspberrypi.com header.i=@raspberrypi.com\n\theader.b=\"Isj3CmkX\"; dkim-atps=neutral","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=raspberrypi.com; s=google;\n\th=mime-version:references:in-reply-to:from:date:message-id:subject:to\n\t:cc; bh=8vsfIeAgW+oTt3vcQQ1kDo8NDgbc/EN+JQK/I1/pOJw=;\n\tb=Isj3CmkXb6FH0nY8oijz2VyU5qcIjUU+zJvqe3wquvWXM7ltQuP3W77rNOiFgVRm+y\n\tsyEpiVpgWFzfryExxg3a7HExefhlfynZ7Ev+LhFVUJHP9Jp+zwG1GIUKIjk4AJkvVXTS\n\tNUifqWQy9/71GjeUJac6SyqvbndqtHW6B6gQdaIkh4pfSjIOm5YLvJ3AYN/ED0EzCKEe\n\tQ/c5DCTe7oMjyaS5c7QaY0IyoYsvUwFenaY8ZjV0fjhNptPc2AlcqmKnpOzqaF/ZfzVC\n\t1SFxqwWI2csfHyTsx919omAclQQvH2Q3UEJSmEv994hJlwCjQIQWj5F4Lv2+1dbaUu0m\n\tj5+w==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:mime-version:references:in-reply-to:from:date\n\t:message-id:subject:to:cc;\n\tbh=8vsfIeAgW+oTt3vcQQ1kDo8NDgbc/EN+JQK/I1/pOJw=;\n\tb=mR1G0Jicvnutnmt7UeGkkgNXisUuo9kOHBQNSox81L178401+zKZNajq2jL70UqgX3\n\tFvlLVwbKVyWkCufQ3zQ1fSk8NUP/Gi7P9DZO7l4Y8KthtEJwkVKNHYIrTw3SNcHHbwVU\n\t9Ag3YOCzLsWlNufy8/hOkcRXHOPZvpdUWaFIsVO/FlIUlEyJU1BLXmBpbtOgj2+NSkkX\n\t9BiAirLjnB3q2cT7vEA7SSbeOJSV5u9ya5wEDMxmJDqGmX5FzDRVqpiAYmi469cwO80e\n\tfi/kbZlLmmsIuQcnI+WZ1djwC1pmiSyfcXyC6vZRSlK76MVTr8y/v0NhJY+SHH2pQKxT\n\tWHCA==","X-Gm-Message-State":"AOAM533fL5QXRQ3pHpra2OXayHJ+Gcd1kDdTd4zeAcZU1+P0sKWcS6pD\n\tvkOTEyHCBibpIEZ1KuBq4j9SsqAYvAlz3NbJvtu44Q==","X-Google-Smtp-Source":"ABdhPJwSTp50zccN2U4Xd/NsLOOYSS9A8mB0DbzbQjR0yfKPZPc8Y2FDecNDrF33hdjwF/ec8BsfwTBi1rRubKZaEy0=","X-Received":"by 2002:a2e:9e59:: with SMTP id\n\tg25mr1774630ljk.499.1629182845199; \n\tMon, 16 Aug 2021 23:47:25 -0700 (PDT)","MIME-Version":"1.0","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<CAEmqJPq94iMjF92TivzPkgRk29dVRB4Rut1SEeRAhvRjuPJOuA@mail.gmail.com>\n\t<YRsBBa++KC1IdJVz@pendragon.ideasonboard.com>","In-Reply-To":"<YRsBBa++KC1IdJVz@pendragon.ideasonboard.com>","From":"Naushir Patuck <naush@raspberrypi.com>","Date":"Tue, 17 Aug 2021 07:47:13 +0100","Message-ID":"<CAEmqJPqqv8K63Mc0ktDpx2ZdcRtxWrM4yFcH9PJHZU_Hpc=4+A@mail.gmail.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Content-Type":"multipart/alternative; boundary=\"000000000000c33ba205c9bbb007\"","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera devel <libcamera-devel@lists.libcamera.org>,\n\tkernel@collabora.com, =?utf-8?q?Andr=C3=A9_Almeida?=\n\t<andrealmeid@collabora.com>","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18947,"web_url":"https://patchwork.libcamera.org/comment/18947/","msgid":"<20210819131212.3gznnqdacmlwsigx@notapiano>","date":"2021-08-19T13:12:12","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":84,"url":"https://patchwork.libcamera.org/api/people/84/","name":"Nícolas F. R. A. Prado","email":"nfraprado@collabora.com"},"content":"Hi Laurent,\n\nOn Tue, Aug 17, 2021 at 05:33:43AM +0300, Laurent Pinchart wrote:\n> Hi Nícolas,\n> \n> On Mon, Aug 09, 2021 at 05:26:46PM -0300, Nícolas F. R. A. Prado wrote:\n> > On Sat, Aug 07, 2021 at 12:03:52PM -0300, Nícolas F. R. A. Prado wrote:\n> > > On Mon, Aug 02, 2021 at 02:42:53AM +0300, Laurent Pinchart wrote:\n> > > > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. 
Prado wrote:\n> > > > > Pipelines have relied on bufferCount to decide on the number of buffers\n> > > > > to allocate internally through allocateBuffers() and on the number of\n> > > > > V4L2 buffer slots to reserve through importBuffers(). Instead, the\n> > > > > number of internal buffers should be the minimum required by the\n> > > > > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > > > > should overallocate to avoid thrashing dmabuf mappings.\n> > > > > \n> > > > > For now, just set them to constants and stop relying on bufferCount, to\n> > > > > allow for its removal.\n> > > > > \n> > > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > > > ---\n> > > > > \n> > > > > No changes in v7\n> > > > > \n> > > > > Changes in v6:\n> > > > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > > > >   INTERNAL_BUFFER_COUNT constant\n> > > > > \n> > > > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > > > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > > > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > > > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > > > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > > > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > > > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > > > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > > > >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > > > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> > > > \n> > > > Given that some of the pipeline handlers will need more intrusive\n> > > > changes to address the 
comments below, you could split this with one\n> > > > patch per pipeline handler (or perhaps grouping the easy ones together).\n> > > > \n> > > > > \n> > > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > index e955bc3456ba..f36e99dacbe7 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > > > >  /**\n> > > > >   * \\brief Allocate buffers for all the ImgU video devices\n> > > > >   */\n> > > > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > > +int ImgUDevice::allocateBuffers()\n> > > > >  {\n> > > > >  \t/* Share buffers between CIO2 output and ImgU input. */\n> > > > > -\tint ret = input_->importBuffers(bufferCount);\n> > > > > +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> > > > >  \t\treturn ret;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > > > +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> > > > >  \t\tgoto error;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > > > +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> > > > >  \t\tgoto error;\n> > > > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > >  \t * corresponding stream is active or inactive, as the driver needs\n> > > > >  \t * buffers to be requested on the V4L2 devices in order to operate.\n> > > > >  
\t */\n> > > > > -\tret = output_->importBuffers(bufferCount);\n> > > > > +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> > > > >  \t\tgoto error;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = viewfinder_->importBuffers(bufferCount);\n> > > > > +\tret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n> > > > >  \t\tgoto error;\n> > > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > index 9d4915116087..f934a951fc75 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > @@ -61,7 +61,7 @@ public:\n> > > > >  \t\t\t\t\t    outputFormat);\n> > > > >  \t}\n> > > > >  \n> > > > > -\tint allocateBuffers(unsigned int bufferCount);\n> > > > > +\tint allocateBuffers();\n> > > > >  \tvoid freeBuffers();\n> > > > >  \n> > > > >  \tint start();\n> > > > > @@ -86,6 +86,9 @@ private:\n> > > > >  \tstatic constexpr unsigned int PAD_VF = 3;\n> > > > >  \tstatic constexpr unsigned int PAD_STAT = 4;\n> > > > >  \n> > > > > +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > > > +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > 5 buffer slots is low. It means that if applications cycle more than 5\n> > > > buffers, the V4L2VideoDevice cache that maintains associations between\n> > > > dmabufs and buffer slots will be thrashed. Due to the internal queue of\n> > > > requests in the IPU3 pipeline handler (similar to what you have\n> > > > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > > > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > > > but performance will suffer. 
I thus think we need to increase the number\n> > > > of slots to what applications can be reasonably expected to use. We\n> > > > could use 8, or even 16, as buffer slots are cheap. The same holds for\n> > > > other pipeline handlers.\n> > > > \n> > > > The number of slots for the CIO2 output should match the number of\n> > > > buffer slots for the ImgU input, as the same buffers are used on the two\n> > > > video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> > > > instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> > > > buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> > > > be used in case the application doesn't provide any RAW buffer, should\n> > > > be lower, as those are real buffers and are thus expensive. The number of\n> > > > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> > > > \n> > > > For proper operation, the CIO2 will require at least two queued buffers\n> > > > (one being DMA'ed to, and one waiting). We need at least one extra\n> > > > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > > > processing timings, it may be that the ImgU will complete processing of\n> > > > its buffer before the CIO2 captures the next one, leading to a temporary\n> > > > situation where the CIO2 will have three buffers queued, or the CIO2\n> > > > will finish the capture first, leading to a temporary situation where\n> > > > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > > > queued. In either case, shortly afterwards, the other component will\n> > > > complete capture or processing, and we'll get back to a situation with\n> > > > two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> > > > minimum of three buffers for raw images.\n> > > > \n> > > > From an ImgU point of view, we could probably get away with a single\n> > > > parameter and a single stats buffer. 
This would however not allow\n> > > > queuing the next frame for processing in the ImgU before the current\n> > > > frame completes, so two buffers would be better. Now, if we take the IPA\n> > > > into account, the statistics buffer will spend some time on the IPA side\n> > > > for processing. It would thus be best to have an extra statistics buffer\n> > > > to accommodate that, thus requiring three statistics buffers (and three\n> > > > parameters buffers, as we associate them together).\n> > > > \n> > > > This rationale leads to using the same number of internal buffers for\n> > > > the CIO2, the parameters and the statistics. We currently use four, and\n> > > > while the logic above indicates we could get away with three, it would\n> > > > be safer to keep using four in this patch, and possibly reduce the\n> > > > number of buffers later.\n> > > > \n> > > > I know documentation isn't fun, but I think this rationale should be\n> > > > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > > > item to try and lower the number of internal buffers to three.\n> > > \n> > > This is the IPU3 topology as I understand it:\n> > > \n> > >       Output  .               .   Input        Output .\n> > >       +---+   .               .   +---+        +---+  .\n> > >       |   | --------------------> |   |        |   |  .\n> > >       +---+   .               .   +---+        +---+  .\n> > > CIO2          .   IPA         .          ImgU         .          IPA\n> > >               .        Param  .   Param        Stat   .   Stat\n> > >               .        +---+  .   +---+        +---+  .   +---+ \n> > >               .        |   | ---> |   |        |   | ---> |   | \n> > >               .        +---+  .   +---+        +---+  .   
+---+ \n> > >           \n> > > Your suggestions for the minimum number of buffers required are the following,\n> > > from what I understand:\n> > > \n> > > CIO2 raw internal buffers:\n> > > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > > - 1x on ImgU Input\n> > > \n> > > ImgU Param/Stat internal buffers:\n> > > - 2x on ImgU Param/Stat (one being processed, one waiting)\n> > > - 1x on IPA Stat\n> > > \n> > > This arrangement doesn't seem to take into account that IPU3Frames::Info binds\n> > > CIO2 internal buffers and ImgU Param/Stat buffers together. This means that each\n> > > raw buffer queued to CIO2 Output needs a Param/Stat buffer as well. And each\n> > > Param/Stat buffer queued to ImgU for processing needs a CIO2 raw buffer as well.\n> > > After ImgU processing though, the raw buffer gets released and reused, so the\n> > > Stat buffer queued to the IPA does not require a CIO2 raw buffer.\n> > > \n> > > This means that to achieve the above minimum, due to the IPU3Frames::Info\n> > > constraint, we'd actually need:\n> > > \n> > > CIO2 internal buffers:\n> > > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > > - 2x on ImgU Input (for the two ImgU Param/Stat buffers we want to have there)\n> > > \n> > > ImgU Param/Stat internal buffers:\n> > > - 2x on CIO2 Output (for the two CIO2 raw buffers we want to have there)\n> > > - 2x on ImgU Param/Stat (one being processed, one waiting)\n> \n> Note that the need to have two buffers here is to ensure back-to-back\n> processing of frame on the ImgU and thus avoid delays, but this need\n> actually depends on how fast the ImgU is. With a very fast ImgU\n> (compared to the frame duration), inter-frame delays may not be an\n> issue. 
There's more on this below.\n> \n> > > - 1x on IPA Stat\n> \n> Processing of the statistics can occur after the corresponding raw image\n> buffer has been requeued to the CIO2; the only hard requirement is that\n> the buffer needs to be available by the time the ImgU will process the\n> corresponding raw frame buffer again.\n\nIPU3CameraData::queuePendingRequests() creates an IPU3Frames::Info with param\nand stat buffers before adding a raw buffer to it and queuing to the CIO2.\nSo in order to have the statistics processing by the IPA happen after the\nraw buffer has been requeued to the CIO2, we would either need to have one extra\nparam/stat buffer compared to the number of CIO2 internal buffers, so 5\nparam/stat buffers, or change that code so that the param/stat buffers are only\nadded to the FrameInfo after we receive the buffer ready from the CIO2.\n\nIn any case, as you've mentioned, we currently use four for both and it works\nwell, so I'll leave it that way, I just wanted to point out that technically the\nIPA stat processing is currently part of the requeue delay for CIO2 buffers.\n\nThanks,\nNícolas\n\n> \n> > > Also we're not accounting for parameter filling in the IPA before we queue the\n> > > buffers to ImgU, but perhaps that's fast enough that it doesn't matter?\n> \n> That's one of the questions we need to answer, I don't think we have\n> numbers at this time. If filling the parameters buffer takes a\n> significant amount of time, then that would need to be taken into\n> account as an additional step in the pipeline, with an additional set of\n> buffers.\n> \n> > > Does this make sense? Or am I missing something?\n> \n> One thing that you may not have taken into account is that the two\n> buffers queued on the CIO2 output and the two buffers queued on the ImgU\n> are not necessarily queued at the same time. I'll try to explain.\n> \n> On the CIO2 side, we have a strong real time requirement to always keep\n> the CIO2 fed with buffers.
The details depend a bit on the hardware and\n> driver implementations, but the base idea is that once a buffer is\n> complete and the time comes to move to the next buffer for the next\n> frame, there has to be a next buffer available. When exactly this occurs\n> can vary. Some drivers will give the buffer for the next frame to the\n> device when capture for the current frame starts, and some will give it\n> when the hardware signals completion of the capture of the current frame\n> (frame end). In theory this could be delayed even a bit more, but it has\n> to happen before the hardware needs the new buffer, and giving it when\n> the DMA completes is often too risky already as vertical blanking can be\n> short and interrupts can be delayed a bit. I tried to check the driver\n> to see what the exact requirement is, but I'm not familiar with the\n> hardware and the code is not very easy to follow.\n> \n> Note that frame start is the time when the first pixel of the frame is\n> written to memory, and frame end the time when the last pixel of the\n> frame is written to memory. The end of frame N and the start of frame\n> N+1 are separated by the vertical blanking time.\n> \n> Let's assume that the CIO2 needs to be programmed with the buffer for\n> frame N+1 at the start of frame N (Edit: I've written all the\n> explanation below based on this assumption, but after further\n> investigation, I *think* the CIO2 only requires the buffer for frame N+1\n> at the beginning of frame N+1, but the driver enforces that the buffer\n> must be present just before the start of frame N to avoid race\n> conditions - just before the start of frame N and at the start of frame\n> N are practically speaking the same thing. Sakari, do you know if this is\n> correct ?). We'll constantly transition between the following states,\n> from the CIO2 point of view.\n> \n> 0. (Initial state) 2x idle buffers in the queue, hardware stopped.
The\n>    CIO2 is then started, the first buffer in the queue is given to the\n>    device to capture the first frame, and the second buffer in the queue\n>    is given to the device to capture the second frame. The first frame\n>    starts.\n> \n> 1. 1x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 0x idle buffers in the queue. Two\n>    events can occur at this point, either completion of the current\n>    frame (-> 2), or a new buffer being queued by userspace (-> 4).\n> \n> 2. 0x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 0x idle buffers in the queue. Two\n>    events can occur at this point, either start of the next frame (->\n>    3), or a new buffer being queued by userspace (-> 5).\n> \n>    This state lasts for the duration of the vertical blanking only, and\n>    can thus be short-lived.\n> \n> 3. The next frame starts. The pending buffer becomes active. We have no\n>    buffer in the queue to give to the hardware for the next frame. An\n>    underrun has occurred, a frame will be dropped. Game over.\n> \n> 4. 1x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 1x idle buffers in the queue. The\n>    next event that will occur is the start of the next frame (as the\n>    other option, a new buffer being queued, will give us additional\n>    safety by increasing the number of queued buffers, but isn't\n>    meaningful when considering the case where we try to run with the\n>    minimum number of buffers possible).\n> \n>    As the current frame ends, the active buffer is given back to the\n>    userspace. There's no active buffer (the DMA will start soon, after\n>    the vertical blanking, when the next frame starts), the pending\n>    buffer stays pending, and the idle buffer stays idle (-> 5).\n> \n> 5.
0x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 1x idle buffers in the queue. The\n>    next event that will occur is the start of the next frame (for the\n>    same reason as in 4).\n> \n>    As the next frame starts, the pending buffer becomes active. The\n>    queued buffer is given to the hardware for the subsequent frame. The\n>    queue of idle buffers becomes empty (-> 1).\n> \n>    If this state is reached from state 2, it lasts for the remainder of\n>    the vertical blanking only. If it is reached from state 4, it lasts\n>    for the whole vertical blanking. In both cases, it can be\n>    short-lived.\n> \n> We can thus cycle either through 1 -> 2 -> 5 -> 1 or through 1 -> 4 -> 5\n> -> 1. The first cycle requires two buffers for the CIO2, with an\n> intermediate state (2) that has a single buffer only. This is unsafe, as\n> a failure to queue a second buffer in the short-lived state 2 will lead\n> to state 3 and frame drops.\n> \n> The second cycle requires three buffers for the CIO2. This is the cycle\n> we want to use, to avoid frame drops. Note that only state 4 requires\n> all three buffers, and userspace can queue the third buffer at any point\n> in state 1 (before the end of the current frame). If userspace queues\n> the frame slightly too late, after the completion of the current frame\n> but before the start of the next one, we'll go to the unsafe cycle but\n> will still not lose frames.\n> \n> Now, let's look at the ImgU side, and assume we use three buffers in\n> total. The ImgU operates from memory to memory, it thus has no realtime\n> requirement. It only starts processing a frame when the frame is given\n> to it. This occurs, from a CIO2 point of view, in the transition from\n> state 4 to state 5, plus all delays introduced by delivering the CIO2\n> frame completion event to userspace, queueing the frame to the ImgU (I'm\n> ignoring the IPA here), and starting the ImgU itself.
The ImgU\n> processing time will, on average, be lower than the frame duration,\n> otherwise it won't be able to process all frames. Once the ImgU\n> completes processing of the frame, it will signal this to userspace.\n> There's also a processing delay there (signalling, task switching, ...),\n> and userspace will requeue the frame to the CIO2. This has to occur at\n> the latest before the end of the current frame, otherwise state 1 will\n> transition to state 2.\n> \n> We thus see that, in the 3 buffers case, we need to ensure that the\n> total time to process the frame on the ImgU, from the CIO2 interrupt\n> signalling the end of state 4 to the buffer being requeued to the CIO2,\n> thus including all task switching and other delays, doesn't exceed the\n> duration of states 5 + 1, which is equal to the duration of a frame. The\n> ImgU processing time itself is guaranteed to be lower than that, but the\n> additional delays may be problematic. We also need to include a possible\n> round-trip to the IPA after end of buffer capture by the CIO2 and start\n> of processing by the ImgU to retrieve the ImgU parameters for the frame.\n> Three buffers start sounding quite risky. I'm thus correcting myself,\n> four buffers seem safer.\n> \n> None of this takes the parameters or statistics buffers into account,\n> but I don't think they're particularly problematic in the sense that the\n> most strict realtime constraints come from the raw image buffers. Feel\n> free to prove me wrong though :-)\n> \n> Let's however note that we can probably fetch the ImgU parameters for\n> the frame that has just been captured before the end of the frame, so\n> that would remove a delay in the ImgU processing. This assumes that the\n> algorithms wouldn't need to know the exact exposure time and analog gain\n> that have been used to capture the current frame in order to compute the\n> ImgU parameters.
This leads to a first question to David: does the\n> Raspberry Pi IPA require the sensor metadata to calculate ISP\n> parameters, or are they needed only when processing statistics from\n> frame N to calculate sensor and ISP parameters of subsequent frames ?\n> \n> The next question is for everybody (and that's why I've expanded the CC\n> list to Kieran, Jean-Michel and Sakari too): what did I get wrong in the\n> above explanation ? :-)\n> \n> > > > > +\n> > > > >  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n> > > > >  \t\t      const std::string &sink, unsigned int sinkPad,\n> > > > >  \t\t      bool enable);\n> > > > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> > > > >  {\n> > > > >  \tIPU3CameraData *data = cameraData(camera);\n> > > > >  \tImgUDevice *imgu = data->imgu_;\n> > > > > -\tunsigned int bufferCount;\n> > > > >  \tint ret;\n> > > > >  \n> > > > > -\tbufferCount = std::max({\n> > > > > -\t\tdata->outStream_.configuration().bufferCount,\n> > > > > -\t\tdata->vfStream_.configuration().bufferCount,\n> > > > > -\t\tdata->rawStream_.configuration().bufferCount,\n> > > > > -\t});\n> > > > > -\n> > > > > -\tret = imgu->allocateBuffers(bufferCount);\n> > > > > +\tret = imgu->allocateBuffers();\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > @@ -1149,20 +1149,15 @@ int 
PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > > > >  {\n> > > > >  \tRPiCameraData *data = cameraData(camera);\n> > > > >  \tint ret;\n> > > > > +\tconstexpr unsigned int bufferCount = 4;\n> > > > >  \n> > > > >  \t/*\n> > > > > -\t * Decide how many internal buffers to allocate. For now, simply look\n> > > > > -\t * at how many external buffers will be provided. We'll need to improve\n> > > > > -\t * this logic. However, we really must have all streams allocate the same\n> > > > > -\t * number of buffers to simplify error handling in queueRequestDevice().\n> > > > > +\t * Allocate internal buffers. We really must have all streams allocate\n> > > > > +\t * the same number of buffers to simplify error handling in\n> > > > > +\t * queueRequestDevice().\n> > > > >  \t */\n> > > > > -\tunsigned int maxBuffers = 0;\n> > > > > -\tfor (const Stream *s : camera->streams())\n> > > > > -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > > > -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > > > > -\n> > > > >  \tfor (auto const stream : data->streams_) {\n> > > > > -\t\tret = stream->prepareBuffers(maxBuffers);\n> > > > > +\t\tret = stream->prepareBuffers(bufferCount);\n> > > > \n> > > > We have a similar problem here, 4 buffer slots is too little, but when\n> > > > the stream has to allocate internal buffers (!importOnly), which is the\n> > > > case for most streams, we don't want to overallocate.\n> > > > \n> > > > I'd like to get feedback from Naush here, but I think this means we'll\n> > > > have to relax the requirement documented in the comment above, and\n> > > > accept a different number of buffers for each stream.\n> > > > \n> > > > >  \t\tif (ret < 0)\n> > > > >  \t\t\treturn ret;\n> > > > >  \t}\n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> 
> > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > > > >  \tunsigned int ipaBufferId = 1;\n> > > > >  \tint ret;\n> > > > >  \n> > > > > -\tunsigned int maxCount = std::max({\n> > > > > -\t\tdata->mainPathStream_.configuration().bufferCount,\n> > > > > -\t\tdata->selfPathStream_.configuration().bufferCount,\n> > > > > -\t});\n> > > > > -\n> > > > > -\tret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > > > > +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\tgoto error;\n> > > > >  \n> > > > > -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > > > +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\tgoto error;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > index 25f482eb8d8e..fea330f72886 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > > > >  \t\treturn -EBUSY;\n> > > > >  \n> > > > >  \t/* \\todo Make buffer count user configurable. 
*/\n> > > > > -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > > > +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > index 91757600ccdc..3c5891009c58 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > > > >  struct StreamConfiguration;\n> > > > >  struct V4L2SubdeviceFormat;\n> > > > >  \n> > > > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > The situation should be simpler for the rkisp1, as it has a different\n> > > > pipeline model (inline ISP as opposed to offline ISP for the IPU3). We\n> > > > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > > > restrict the number of internal buffers (for stats and parameters) to\n> > > > the number of requests we expect to queue to the device at once, plus\n> > > > one for the IPA.  Four thus seems good. Capturing this rationale in a\n> > > > comment would be good too.\n> > \n> > Shouldn't we also have one extra buffer queued to the capture device, like for\n> > the others, totalling five (four on the capture, one on the IPA)? Or since the\n> > driver already requires three buffers the extra one isn't needed?\n> >\n> > I'm not sure how it works, but if the driver requires three buffers at all times\n> > to keep streaming, then I think we indeed should have the extra buffer to avoid\n> > dropping frames. Otherwise, if that requirement is only for starting the stream,\n> > then for drivers that require at least two buffers we shouldn't need an extra\n> > one, I'd think.\n> \n> It seems to be only needed to start capture. 
Even then I think it could\n> be lowered to two buffers, I don't see anything in the driver that\n> requires three. Maybe someone from Collabora could comment on this ? And\n> maybe you could give it a try by modifying the driver ?\n> \n> By the way, if you try to apply the CIO2 reasoning above to the RkISP1,\n> you will need to take into account the fact that the driver programs the\n> hardware with the buffer for frame N+1 not at the beginning of frame N,\n> but at the end of frame N-1.\n> \n> I think four buffers is enough. We currently use four buffers and it\n> seems to work :-) Granted, the RkISP1 IPA is a skeleton, so this\n> argument isn't very strong, but given that the driver only needs two\n> buffers except at start time, four should be fine.\n> \n> > > > BTW, I may be too tired to think properly, or just unable to see the\n> > > > obvious, so please challenge any rationale you think is incorrect.\n> > > > \n> > > > > +\n> > > > >  class RkISP1Path\n> > > > >  {\n> > > > >  public:\n> > > > > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > > > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > > > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > > > >  \n> > > > >  int SimpleConverter::Stream::start()\n> > > > >  {\n> > > > > -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > > > +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > \n> > > > Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > > > much of an issue I suppose.\n> > \n> > Indeed.
I was under the impression that we should always importBuffers() using\n> > BUFFER_SLOT_COUNT, but now, after reading more code, I understand that's not\n> > always the case (although this seems to be the only case, due to the presence of\n> > the converter).\n> > \n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > > > +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tstop();\n> > > > >  \t\treturn ret;\n> > > > > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > > > > index 276a2a291c21..7e1d60674f62 100644\n> > > > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > > > @@ -29,6 +29,9 @@ class SizeRange;\n> > > > >  struct StreamConfiguration;\n> > > > >  class V4L2M2MDevice;\n> > > > >  \n> > > > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > Let's name the variables kSimpleInternalBufferCount and\n> > > > kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> > > > non-macro constants. Same comment elsewhere in this patch.\n> > > > \n> > > > Those constants don't belong to converter.h. Could you turn them into\n> > > > member constants of the SimplePipelineHandler class, as\n> > > > kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> > > > slots can be passed as a parameter to SimpleConverter::start().\n> > > > \n> > > > There's no stats or parameters here, and no IPA, so the situation is\n> > > > different than for IPU3 and RkISP1. 
The number of internal buffers\n> > > > should just be one more than the minimum number of buffers required by\n> > > > the capture device, I don't think there's another requirement.\n> > \n> > Plus one extra to have queued at the converter's 'output' node (which is its\n> > input, confusingly)?\n> \n> It depends a bit on the exact timings of the capture device, as is\n> probably clear with the explanation above (or at least is now clearly\n> seen as a complicated topic :-)). We need to ensure that the realtime\n> requirements of the device are met, and that the capture buffers that\n> complete, and are then processed by the converter, will be requeued in\n> time to the capture device to meet those requirements.\n> \n> As the simple pipeline handler deals with a variety of devices, we have\n> two options, either checking the requirements of each device and\n> recording them in the supportedDevices array, or pick a common number of\n> buffers that should be good enough for everybody. I'd start with the\n> second option for simplicity, and as the pipeline handler currently uses\n> 3 buffers, I'd stick to that for now.\n> \n> > > > > +\n> > > > >  class SimpleConverter\n> > > > >  {\n> > > > >  public:\n> > > > > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > > > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> > > > >  \t\t * When using the converter allocate a fixed number of internal\n> > > > >  \t\t * buffers.\n> > > > >  \t\t */\n> > > > > -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> > > > > +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > > > >  \t\t\t\t\t     &data->converterBuffers_);\n> > > > >  \t} else {\n> > > > > -\t\t/* Otherwise, prepare for 
using buffers from the only stream. */\n> > > > > -\t\tStream *stream = &data->streams_[0];\n> > > > > -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> > > > > +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > >  \t}\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > index fd39b3d3c72c..755949e7a59a 100644\n> > > > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > @@ -91,6 +91,8 @@ private:\n> > > > >  \t\treturn static_cast<UVCCameraData *>(\n> > > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > > >  \t}\n> > > > > +\n> > > > > +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > > > >  };\n> > > > >  \n> > > > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > > > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > > > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > > >  {\n> > > > >  \tUVCCameraData *data = cameraData(camera);\n> > > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > > >  \n> > > > > -\tint ret = data->video_->importBuffers(count);\n> > > > > +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> > > > \n> > > > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > > > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > > > handlers.\n> > > > \n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > index e89d53182c6d..24ba743a946c 100644\n> > > > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > @@ -102,6 +102,8 @@ private:\n> > > > >  \t\treturn static_cast<VimcCameraData *>(\n> > > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > > >  \t}\n> > > > > +\n> > > > > +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > > > >  };\n> > > > >  \n> > > > >  namespace {\n> > > > > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > > > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > > >  {\n> > > > >  \tVimcCameraData *data = cameraData(camera);\n> > > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > > >  \n> > > > > -\tint ret = data->video_->importBuffers(count);\n> > > > > +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> \n> -- \n> Regards,\n> \n> Laurent Pinchart\n> \n> -- \n> To unsubscribe, send mail to kernel-unsubscribe@lists.collabora.co.uk.","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id E776DBD87C\n\tfor <parsemail@patchwork.libcamera.org>;\n\tThu, 19 Aug 2021 13:12:21 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 246FA688A3;\n\tThu, 19 Aug 2021 15:12:21 +0200 (CEST)","from 
bhuna.collabora.co.uk (bhuna.collabora.co.uk [46.235.227.227])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 9C5D160264\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 19 Aug 2021 15:12:20 +0200 (CEST)","from notapiano (unknown [IPv6:2804:14c:1a9:2434:b693:c9:5cb6:b688])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\t(Authenticated sender: nfraprado)\n\tby bhuna.collabora.co.uk (Postfix) with ESMTPSA id D3CDC1F44109;\n\tThu, 19 Aug 2021 14:12:17 +0100 (BST)"],"Date":"Thu, 19 Aug 2021 10:12:12 -0300","From":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Message-ID":"<20210819131212.3gznnqdacmlwsigx@notapiano>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<20210807150345.o4mcczkjt5vxium4@notapiano>\n\t<20210809202646.blgq4lyab7ktglsp@notapiano>\n\t<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, Sakari Ailus <sakari.ailus@iki.fi>, \n\t=?utf-8?b?QW5kcsOp?= Almeida <andrealmeid@collabora.com>,\n\tkernel@collabora.com","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}},{"id":18955,"web_url":"https://patchwork.libcamera.org/comment/18955/","msgid":"<20210819203604.bmx2rg6lavphsa5x@notapiano>","date":"2021-08-19T20:36:04","subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on bufferCount","submitter":{"id":84,"url":"https://patchwork.libcamera.org/api/people/84/","name":"Nícolas F. R. A. Prado","email":"nfraprado@collabora.com"},"content":"Hi again,\n\nOn Tue, Aug 17, 2021 at 05:33:43AM +0300, Laurent Pinchart wrote:\n> Hi Nícolas,\n> \n> On Mon, Aug 09, 2021 at 05:26:46PM -0300, Nícolas F. R. A. Prado wrote:\n> > On Sat, Aug 07, 2021 at 12:03:52PM -0300, Nícolas F. R. A. Prado wrote:\n> > > On Mon, Aug 02, 2021 at 02:42:53AM +0300, Laurent Pinchart wrote:\n> > > > On Thu, Jul 22, 2021 at 08:28:49PM -0300, Nícolas F. R. A. 
Prado wrote:\n> > > > > Pipelines have relied on bufferCount to decide on the number of buffers\n> > > > > to allocate internally through allocateBuffers() and on the number of\n> > > > > V4L2 buffer slots to reserve through importBuffers(). Instead, the\n> > > > > number of internal buffers should be the minimum required by the\n> > > > > algorithms to avoid wasting memory, and the number of V4L2 buffer slots\n> > > > > should overallocate to avoid thrashing dmabuf mappings.\n> > > > > \n> > > > > For now, just set them to constants and stop relying on bufferCount, to\n> > > > > allow for its removal.\n> > > > > \n> > > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>\n> > > > > ---\n> > > > > \n> > > > > No changes in v7\n> > > > > \n> > > > > Changes in v6:\n> > > > > - Added pipeline name as prefix to each BUFFER_SLOT_COUNT and\n> > > > >   INTERNAL_BUFFER_COUNT constant\n> > > > > \n> > > > >  src/libcamera/pipeline/ipu3/imgu.cpp              | 12 ++++++------\n> > > > >  src/libcamera/pipeline/ipu3/imgu.h                |  5 ++++-\n> > > > >  src/libcamera/pipeline/ipu3/ipu3.cpp              |  9 +--------\n> > > > >  .../pipeline/raspberrypi/raspberrypi.cpp          | 15 +++++----------\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1.cpp          |  9 ++-------\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.cpp     |  2 +-\n> > > > >  src/libcamera/pipeline/rkisp1/rkisp1_path.h       |  3 +++\n> > > > >  src/libcamera/pipeline/simple/converter.cpp       |  4 ++--\n> > > > >  src/libcamera/pipeline/simple/converter.h         |  3 +++\n> > > > >  src/libcamera/pipeline/simple/simple.cpp          |  6 ++----\n> > > > >  src/libcamera/pipeline/uvcvideo/uvcvideo.cpp      |  5 +++--\n> > > > >  src/libcamera/pipeline/vimc/vimc.cpp              |  5 +++--\n> > > > >  12 files changed, 35 insertions(+), 43 deletions(-)\n> > > > \n> > > > Given that some of the pipeline handlers will need more intrusive\n> > > > changes to address the 
comments below, you could split this with one\n> > > > patch per pipeline handler (or perhaps grouping the easy ones together).\n> > > > \n> > > > > \n> > > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.cpp b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > index e955bc3456ba..f36e99dacbe7 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > +++ b/src/libcamera/pipeline/ipu3/imgu.cpp\n> > > > > @@ -593,22 +593,22 @@ int ImgUDevice::configureVideoDevice(V4L2VideoDevice *dev, unsigned int pad,\n> > > > >  /**\n> > > > >   * \\brief Allocate buffers for all the ImgU video devices\n> > > > >   */\n> > > > > -int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > > +int ImgUDevice::allocateBuffers()\n> > > > >  {\n> > > > >  \t/* Share buffers between CIO2 output and ImgU input. */\n> > > > > -\tint ret = input_->importBuffers(bufferCount);\n> > > > > +\tint ret = input_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU input buffers\";\n> > > > >  \t\treturn ret;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = param_->allocateBuffers(bufferCount, &paramBuffers_);\n> > > > > +\tret = param_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU param buffers\";\n> > > > >  \t\tgoto error;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = stat_->allocateBuffers(bufferCount, &statBuffers_);\n> > > > > +\tret = stat_->allocateBuffers(IPU3_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to allocate ImgU stat buffers\";\n> > > > >  \t\tgoto error;\n> > > > > @@ -619,13 +619,13 @@ int ImgUDevice::allocateBuffers(unsigned int bufferCount)\n> > > > >  \t * corresponding stream is active or inactive, as the driver needs\n> > > > >  \t * buffers to be requested on the V4L2 devices in order to operate.\n> > > > >  
\t */\n> > > > > -\tret = output_->importBuffers(bufferCount);\n> > > > > +\tret = output_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU output buffers\";\n> > > > >  \t\tgoto error;\n> > > > >  \t}\n> > > > >  \n> > > > > -\tret = viewfinder_->importBuffers(bufferCount);\n> > > > > +\tret = viewfinder_->importBuffers(IPU3_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tLOG(IPU3, Error) << \"Failed to import ImgU viewfinder buffers\";\n> > > > >  \t\tgoto error;\n> > > > >  \t}\n> > > > > diff --git a/src/libcamera/pipeline/ipu3/imgu.h b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > index 9d4915116087..f934a951fc75 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > +++ b/src/libcamera/pipeline/ipu3/imgu.h\n> > > > > @@ -61,7 +61,7 @@ public:\n> > > > >  \t\t\t\t\t    outputFormat);\n> > > > >  \t}\n> > > > >  \n> > > > > -\tint allocateBuffers(unsigned int bufferCount);\n> > > > > +\tint allocateBuffers();\n> > > > >  \tvoid freeBuffers();\n> > > > >  \n> > > > >  \tint start();\n> > > > > @@ -86,6 +86,9 @@ private:\n> > > > >  \tstatic constexpr unsigned int PAD_VF = 3;\n> > > > >  \tstatic constexpr unsigned int PAD_STAT = 4;\n> > > > >  \n> > > > > +\tstatic constexpr unsigned int IPU3_INTERNAL_BUFFER_COUNT = 4;\n> > > > > +\tstatic constexpr unsigned int IPU3_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > 5 buffer slots is low. It means that if applications cycle more than 5\n> > > > buffers, the V4L2VideoDevice cache that maintains associations between\n> > > > dmabufs and buffer slots will be trashed. Due to the internal queue of\n> > > > requests in the IPU3 pipeline handler (similar to what you have\n> > > > implemented in \"[PATCH 0/3] libcamera: pipeline: Add internal request\n> > > > queue\" for other pipeline handlers), we won't fail at queuing requests,\n> > > > but performance will suffer. 
I thus think we need to increase the number\n> > > > of slots to what applications can be reasonably expected to use. We\n> > > > could use 8, or even 16, as buffer slots are cheap. The same holds for\n> > > > other pipeline handlers.\n> > > > \n> > > > The number of slots for the CIO2 output should match the number of\n> > > > buffer slots for the ImgU input, as the same buffers are used on the two\n> > > > video devices. One option is to use IPU3_BUFFER_SLOT_COUNT for the CIO2,\n> > > > instead of CIO2_BUFFER_COUNT. However, the number of internal CIO2\n> > > > buffers that are allocated by exportBuffers() in CIO2Device::start(), to\n> > > > be used in case the application doesn't provide any RAW buffer, should\n> > > > be lower, as those are real buffers and are thus expensive. The number of\n> > > > buffers and buffer slots on the CIO2 thus needs to be decoupled.\n> > > > \n> > > > For proper operation, the CIO2 will require at least two queued buffers\n> > > > (one being DMA'ed to, and one waiting). We need at least one extra\n> > > > buffer queued to the ImgU to keep buffers flowing. Depending on\n> > > > processing timings, it may be that the ImgU will complete processing of\n> > > > its buffer before the CIO2 captures the next one, leading to a temporary\n> > > > situation where the CIO2 will have three buffers queued, or the CIO2\n> > > > will finish the capture first, leading to a temporary situation where\n> > > > the CIO2 will have one buffer queued and the ImgU will have two buffers\n> > > > queued. In either case, shortly afterwards, the other component will\n> > > > complete capture or processing, and we'll get back to a situation with\n> > > > two buffers queued in the CIO2 and one in the ImgU. That's thus a\n> > > > minimum of three buffers for raw images.\n> > > > \n> > > > From an ImgU point of view, we could probably get away with a single\n> > > > parameter and a single stats buffer. 
This would however not allow\n> > > > queuing the next frame for processing in the ImgU before the current\n> > > > frame completes, so two buffers would be better. Now, if we take the IPA\n> > > > into account, the statistics buffer will spend some time on the IPA side\n> > > > for processing. It would thus be best to have an extra statistics buffer\n> > > > to accommodate that, thus requiring three statistics buffers (and three\n> > > > parameters buffers, as we associate them together).\n> > > > \n> > > > This rationale leads to using the same number of internal buffers for\n> > > > the CIO2, the parameters and the statistics. We currently use four, and\n> > > > while the logic above indicates we could get away with three, it would\n> > > > be safer to keep using four in this patch, and possibly reduce the\n> > > > number of buffers later.\n> > > > \n> > > > I know documentation isn't fun, but I think this rationale should be\n> > > > captured in a comment in the IPU3 pipeline handler, along with a \\todo\n> > > > item to try and lower the number of internal buffers to three.\n> > > \n> > > This is the IPU3 topology as I understand it:\n> > > \n> > >       Output  .               .   Input        Output .\n> > >       +---+   .               .   +---+        +---+  .\n> > >       |   | --------------------> |   |        |   |  .\n> > >       +---+   .               .   +---+        +---+  .\n> > > CIO2          .   IPA         .          ImgU         .          IPA\n> > >               .        Param  .   Param        Stat   .   Stat\n> > >               .        +---+  .   +---+        +---+  .   +---+ \n> > >               .        |   | ---> |   |        |   | ---> |   | \n> > >               .        +---+  .   +---+        +---+  .   
+---+ \n> > >           \n> > > Your suggestions for the minimum number of buffers required are the following,\n> > > from what I understand:\n> > > \n> > > CIO2 raw internal buffers:\n> > > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > > - 1x on ImgU Input\n> > > \n> > > ImgU Param/Stat internal buffers:\n> > > - 2x on ImgU Param/Stat (one being processed, one waiting)\n> > > - 1x on IPA Stat\n> > > \n> > > This arrangement doesn't seem to take into account that IPU3Frames::Info binds\n> > > CIO2 internal buffers and ImgU Param/Stat buffers together. This means that each\n> > > raw buffer queued to CIO2 Output needs a Param/Stat buffer as well. And each\n> > > Param/Stat buffer queued to ImgU for processing needs a CIO2 raw buffer as well.\n> > > After ImgU processing though, the raw buffer gets released and reused, so the\n> > > Stat buffer queued to the IPA does not require a CIO2 raw buffer.\n> > > \n> > > This means that to achieve the above minimum, due to the IPU3Frames::Info\n> > > constraint, we'd actually need:\n> > > \n> > > CIO2 internal buffers:\n> > > - 2x on CIO2 Output (one being DMA'ed, one waiting)\n> > > - 2x on ImgU Input (for the two ImgU Param/Stat buffers we want to have there)\n> > > \n> > > ImgU Param/Stat internal buffers:\n> > > - 2x on CIO2 Output (for the two CIO2 raw buffers we want to have there)\n> > > - 2x on ImgU Param/Stat (one being processed, one waiting)\n> \n> Note that the need to have two buffers here is to ensure back-to-back\n> processing of frame on the ImgU and thus avoid delays, but this need\n> actually depends on how fast the ImgU is. With a very fast ImgU\n> (compared to the frame duration), inter-frame delays may not be an\n> issue. 
There's more on this below.\n> \n> > > - 1x on IPA Stat\n> \n> Processing of the statistics can occur after the corresponding raw image\n> buffer has been requeued to the CIO2, the only hard requirement is that\n> the buffer needs to be available by the time the ImgU will process the\n> corresponding raw frame buffer again.\n> \n> > > Also we're not accounting for parameter filling in the IPA before we queue the\n> > > buffers to ImgU, but perhaps that's fast enough that it doesn't matter?\n> \n> That's one of the questions we need to answer, I don't think we have\n> numbers at this time. If filling the parameters buffer takes a\n> significant amount of time, then that would need to be taken into\n> account as an additional step in the pipeline, with an additional set of\n> buffers.\n> \n> > > Does this make sense? Or am I missing something?\n> \n> One thing that you may not have taken into account is that the two\n> buffers queued on the CIO2 output and the two buffers queued on the ImgU\n> are not necessarily queued at the same time. I'll try to explain.\n> \n> On the CIO2 side, we have a strong real time requirement to always keep\n> the CIO2 fed with buffers. The details depend a bit on the hardware and\n> driver implementations, but the base idea is that once a buffer is\n> complete and the time comes to move to the next buffer for the next\n> frame, there has to be a next buffer available. When exactly this occurs\n> can vary. Some drivers will give the buffer for the next frame to the\n> device when capture for the current frame starts, and some will give it\n> when the hardware signals completion of the capture of the current frame\n> (frame end). In theory this could be delayed even a bit more, but it has\n> to happen before the hardware needs the new buffer, and giving it when\n> the DMA completes is often too risky already as vertical blanking can be\n> short and interrupts can be delayed a bit. 
I tried to check the driver\n> to see what the exact requirement is, but I'm not familiar with the\n> hardware and the code is not very easy to follow.\n> \n> Note that frame start is the time when the first pixel of the frame is\n> written to memory, and frame end the time when the last pixel of the\n> frame is written to memory. The end of frame N and the start of frame\n> N+1 are separated by the vertical blanking time.\n> \n> Let's assume that the CIO2 needs to be programmed with the buffer for\n> frame N+1 at the start of frame N (Edit: I've written all the\n> explanation below based on this assumption, but after further\n> investigation, I *think* the CIO2 only requires the buffer for frame N+1\n> at the beginning of frame N+1, but the driver enforces that the buffer\n> must be present just before the start of frame N to avoid race\n> conditions - just before the start of frame N and at the start of frame\n> N are practically speaking the same thing. Sakari, do you know if this is\n> correct ?). We'll constantly transition between the following states,\n> from the CIO2 point of view.\n> \n> 0. (Initial state) 2x idle buffers in the queue, hardware stopped. The\n>    CIO2 is then started, the first buffer in the queue is given to the\n>    device to capture the first frame, and the second buffer in the queue\n>    is given to the device to capture the second frame. The first frame\n>    starts.\n> \n> 1. 1x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 0x idle buffers in the queue. Two\n>    events can occur at this point, either completion of the current\n>    frame (-> 2), or a new buffer being queued by userspace (-> 4).\n> \n> 2. 0x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 0x idle buffers in the queue. 
Two\n>    events can occur at this point, either start of the next frame (->\n>    3), or a new buffer being queued by userspace (-> 5).\n> \n>    This state lasts for the duration of the vertical blanking only, and\n>    can thus be short-lived.\n> \n> 3. The next frame starts. The pending buffer becomes active. We have no\n>    buffer in the queue to give to the hardware for the next frame. An\n>    underrun has occurred, a frame will be dropped. Game over.\n> \n> 4. 1x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 1x idle buffers in the queue. The\n>    next event that will occur is the start of the next frame (as the\n>    other option, a new buffer being queued, will give us additional\n>    safety by increasing the number of queued buffers, but isn't\n>    meaningful when considering the case where we try to run with the\n>    minimum number of buffers possible).\n> \n>    As the current frame ends, the active buffer is given back to the\n>    userspace. There's no active buffer (the DMA will start soon, after\n>    the vertical blanking, when the next frame starts), the pending\n>    buffer stays pending, and the idle buffer stays idle (-> 5).\n> \n> 5. 0x active buffer being DMA'ed to, 1x pending buffer already given to\n>    the hardware for the next frame, 1x idle buffers in the queue. The\n>    next event that will occur is the start of the next frame (for the\n>    same reason as in 4).\n> \n>    As the next frame starts, the pending buffer becomes active. The\n>    queued buffer is given to the hardware for the subsequent frame. The\n>    queue of idle buffers becomes empty (-> 1).\n> \n>    If this state is reached from state 2, it lasts for the remainder of\n>    the vertical blanking only. If it is reached from state 4, it lasts\n>    for the whole vertical blanking. 
In both cases, it can be\n>    short-lived.\n> \n> We can thus cycle either through 1 -> 2 -> 5 -> 1 or through 1 -> 4 -> 5\n> -> 1. The first cycle requires two buffers for the CIO2, with an\n> intermediate state (2) that has a single buffer only. This is unsafe, as\n> a failure to queue a second buffer in the short-lived state 2 will lead\n> to state 3 and frame drops.\n> \n> The second cycle requires three buffers for the CIO2. This is the cycle\n> we want to use, to avoid frame drops. Note that only state 4 requires\n> all three buffers, and userspace can queue the third buffer at any point\n> in state 1 (before the end of the current frame). If userspace queues\n> the frame slightly too late, after the completion of the current frame\n> but before the start of the next one, we'll go to the unsafe cycle but\n> will still not lose frames.\n> \n> Now, let's look at the ImgU side, and assume we use three buffers in\n> total. The ImgU operates from memory to memory, it thus has no realtime\n> requirement. It only starts processing a frame when the frame is given\n> to it. This occurs, from a CIO2 point of view, in the transition from\n> state 4 to state 5, plus all delays introduced by delivering the CIO2\n> frame completion event to userspace, queueing the frame to the ImgU (I'm\n> ignoring the IPA here), and starting the ImgU itself. The ImgU\n> processing time will, on average, be lower than the frame duration,\n> otherwise it won't be able to process all frames. Once the ImgU\n> completes processing of the frame, it will signal this to userspace.\n> There's also a processing delay there (signalling, task switching, ...),\n> and userspace will requeue the frame to the CIO2. 
This has to occur at\n> the latest before the end of the current frame, otherwise state 1 will\n> transition to state 2.\n> \n> We thus see that, in the 3 buffers case, we need to ensure that the\n> total time to process the frame on the ImgU, from the CIO2 interrupt\n> signalling the end of state 4 to the buffer being requeued to the CIO2,\n> thus including all task switching and other delays, doesn't exceed the\n> duration of states 5 + 1, which is equal to the duration of a frame. The\n> ImgU processing time itself is guaranteed to be lower than that, but the\n> additional delays may be problematic. We also need to include a possible\n> round-trip to the IPA after end of buffer capture by the CIO2 and start\n> of processing by the ImgU to retrieve the ImgU parameters for the frame.\n> Three buffers start sounding quite risky. I'm thus correcting myself,\n> four buffers seem safer.\n> \n> None of this takes the parameters or statistics buffers into account,\n> but I don't think they're particularly problematic in the sense that the\n> most strict realtime constraints come from the raw image buffers. Feel\n> free to prove me wrong though :-)\n> \n> Let's however note that we can probably fetch the ImgU parameters for\n> the frame that has just been captured before the end of the frame, so\n> that would remove a delay in the ImgU processing. This assumes that the\n> algorithms wouldn't need to know the exact exposure time and analog gain\n> that have been used to capture the current frame in order to compute the\n> ImgU parameters. This leads to a first question to David: does the\n> Raspberry Pi IPA require the sensor metadata to calculate ISP\n> parameters, or are they needed only when processing statistics from\n> frame N to calculate sensor and ISP parameters of subsequent frames ?\n> \n> The next question is for everybody (and that's why I've expanded the CC\n> list to Kieran, Jean-Michel and Sakari too): what did I get wrong in the\n> above explanation ? 
:-)\n> \n> > > > > +\n> > > > >  \tint linkSetup(const std::string &source, unsigned int sourcePad,\n> > > > >  \t\t      const std::string &sink, unsigned int sinkPad,\n> > > > >  \t\t      bool enable);\n> > > > > diff --git a/src/libcamera/pipeline/ipu3/ipu3.cpp b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > index 5fd1757bfe13..4efd201c05e5 100644\n> > > > > --- a/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > +++ b/src/libcamera/pipeline/ipu3/ipu3.cpp\n> > > > > @@ -681,16 +681,9 @@ int PipelineHandlerIPU3::allocateBuffers(Camera *camera)\n> > > > >  {\n> > > > >  \tIPU3CameraData *data = cameraData(camera);\n> > > > >  \tImgUDevice *imgu = data->imgu_;\n> > > > > -\tunsigned int bufferCount;\n> > > > >  \tint ret;\n> > > > >  \n> > > > > -\tbufferCount = std::max({\n> > > > > -\t\tdata->outStream_.configuration().bufferCount,\n> > > > > -\t\tdata->vfStream_.configuration().bufferCount,\n> > > > > -\t\tdata->rawStream_.configuration().bufferCount,\n> > > > > -\t});\n> > > > > -\n> > > > > -\tret = imgu->allocateBuffers(bufferCount);\n> > > > > +\tret = imgu->allocateBuffers();\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > index d1cd3d9dc082..776e0f92aed1 100644\n> > > > > --- a/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > +++ b/src/libcamera/pipeline/raspberrypi/raspberrypi.cpp\n> > > > > @@ -1149,20 +1149,15 @@ int PipelineHandlerRPi::prepareBuffers(Camera *camera)\n> > > > >  {\n> > > > >  \tRPiCameraData *data = cameraData(camera);\n> > > > >  \tint ret;\n> > > > > +\tconstexpr unsigned int bufferCount = 4;\n> > > > >  \n> > > > >  \t/*\n> > > > > -\t * Decide how many internal buffers to allocate. For now, simply look\n> > > > > -\t * at how many external buffers will be provided. We'll need to improve\n> > > > > -\t * this logic. 
However, we really must have all streams allocate the same\n> > > > > -\t * number of buffers to simplify error handling in queueRequestDevice().\n> > > > > +\t * Allocate internal buffers. We really must have all streams allocate\n> > > > > +\t * the same number of buffers to simplify error handling in\n> > > > > +\t * queueRequestDevice().\n> > > > >  \t */\n> > > > > -\tunsigned int maxBuffers = 0;\n> > > > > -\tfor (const Stream *s : camera->streams())\n> > > > > -\t\tif (static_cast<const RPi::Stream *>(s)->isExternal())\n> > > > > -\t\t\tmaxBuffers = std::max(maxBuffers, s->configuration().bufferCount);\n> > > > > -\n> > > > >  \tfor (auto const stream : data->streams_) {\n> > > > > -\t\tret = stream->prepareBuffers(maxBuffers);\n> > > > > +\t\tret = stream->prepareBuffers(bufferCount);\n> > > > \n> > > > We have a similar problem here, 4 buffer slots is too little, but when\n> > > > the stream has to allocate internal buffers (!importOnly), which is the\n> > > > case for most streams, we don't want to overallocate.\n> > > > \n> > > > I'd like to get feedback from Naush here, but I think this means we'll\n> > > > have to relax the requirement documented in the comment above, and\n> > > > accept a different number of buffers for each stream.\n> > > > \n> > > > >  \t\tif (ret < 0)\n> > > > >  \t\t\treturn ret;\n> > > > >  \t}\n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1.cpp b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > > index 11325875b929..f4ea2fd4d4d0 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1.cpp\n> > > > > @@ -690,16 +690,11 @@ int PipelineHandlerRkISP1::allocateBuffers(Camera *camera)\n> > > > >  \tunsigned int ipaBufferId = 1;\n> > > > >  \tint ret;\n> > > > >  \n> > > > > -\tunsigned int maxCount = std::max({\n> > > > > -\t\tdata->mainPathStream_.configuration().bufferCount,\n> > > > > -\t\tdata->selfPathStream_.configuration().bufferCount,\n> > > > > 
-\t});\n> > > > > -\n> > > > > -\tret = param_->allocateBuffers(maxCount, &paramBuffers_);\n> > > > > +\tret = param_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &paramBuffers_);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\tgoto error;\n> > > > >  \n> > > > > -\tret = stat_->allocateBuffers(maxCount, &statBuffers_);\n> > > > > +\tret = stat_->allocateBuffers(RKISP1_INTERNAL_BUFFER_COUNT, &statBuffers_);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\tgoto error;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > index 25f482eb8d8e..fea330f72886 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.cpp\n> > > > > @@ -172,7 +172,7 @@ int RkISP1Path::start()\n> > > > >  \t\treturn -EBUSY;\n> > > > >  \n> > > > >  \t/* \\todo Make buffer count user configurable. */\n> > > > > -\tret = video_->importBuffers(RKISP1_BUFFER_COUNT);\n> > > > > +\tret = video_->importBuffers(RKISP1_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/rkisp1/rkisp1_path.h b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > index 91757600ccdc..3c5891009c58 100644\n> > > > > --- a/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > +++ b/src/libcamera/pipeline/rkisp1/rkisp1_path.h\n> > > > > @@ -27,6 +27,9 @@ class V4L2Subdevice;\n> > > > >  struct StreamConfiguration;\n> > > > >  struct V4L2SubdeviceFormat;\n> > > > >  \n> > > > > +static constexpr unsigned int RKISP1_INTERNAL_BUFFER_COUNT = 4;\n> > > > > +static constexpr unsigned int RKISP1_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > The situation should be simpler for the rkisp1, as it has a different\n> > > > pipeline model (inline ISP as opposed to offline ISP for the IPU3). 
We\n> > > > can allocate more slots (8 or 16, as for other pipeline handlers), and\n> > > > restrict the number of internal buffers (for stats and parameters) to\n> > > > the number of requests we expect to queue to the device at once, plus\n> > > > one for the IPA.  Four thus seems good. Capturing this rationale in a\n> > > > comment would be good too.\n> > \n> > Shouldn't we also have one extra buffer queued to the capture device, like for\n> > the others, totalling five (four on the capture, one on the IPA)? Or since the\n> > driver already requires three buffers the extra one isn't needed?\n> >\n> > I'm not sure how it works, but if the driver requires three buffers at all times\n> > to keep streaming, then I think we indeed should have the extra buffer to avoid\n> > dropping frames. Otherwise, if that requirement is only for starting the stream,\n> > then for drivers that require at least two buffers we shouldn't need an extra\n> > one, I'd think.\n> \n> It seems to be only needed to start capture. Even then I think it could\n> be lowered to two buffers, I don't see anything in the driver that\n> requires three. Maybe someone from Collabora could comment on this ? And\n> maybe you could give it a try by modifying the driver ?\n> \n> By the way, if you try to apply the CIO2 reasoning above to the RkISP1,\n> you will need to take into account the fact that the driver programs the\n> hardware with the buffer for frame N+1 not at the beginning of frame N,\n> but at the end of frame N-1.\n> \n> I think four buffers is enough. 
We currently use four buffers and it\n> seems to work :-) Granted, the RkISP1 IPA is a skeleton, so this\n> argument isn't very strong, but given that the driver only needs two\n> buffers except at start time, four should be fine.\n\nJust to give some feedback, I lowered the RKISP1_MIN_BUFFERS_NEEDED constant in\nthe driver to 1 and tested capture using cam and everything still works as\nexpected.\n\nUsing a single request obviously causes a lot of frame drops and with two\nthere's still a bit as well. But three requests works completely fine, which\nseems to suggest that two buffers are used internally with the third one\ncovering for the propagation delays to and from userspace (and through the\nskeleton IPA) while the first buffer is requeued back.\n\nSo four buffers should cover for when the IPA is further developed as well like\nyou said.\n\n(That said I was able to get frame drops by choosing a resolution of at least\n600x400 and saving the frames to disk with -F, but this amount of delay may\nalready be enough that the application would consider overallocating buffers)\n\nThanks,\nNícolas\n\n> \n> > > > BTW, I may be too tired to think properly, or just unable to see the\n> > > > obvious, so please challenge any rationale you think is incorrect.\n> > > > \n> > > > > +\n> > > > >  class RkISP1Path\n> > > > >  {\n> > > > >  public:\n> > > > > diff --git a/src/libcamera/pipeline/simple/converter.cpp b/src/libcamera/pipeline/simple/converter.cpp\n> > > > > index b5e34c4cd0c5..b3bcf01483f7 100644\n> > > > > --- a/src/libcamera/pipeline/simple/converter.cpp\n> > > > > +++ b/src/libcamera/pipeline/simple/converter.cpp\n> > > > > @@ -103,11 +103,11 @@ int SimpleConverter::Stream::exportBuffers(unsigned int count,\n> > > > >  \n> > > > >  int SimpleConverter::Stream::start()\n> > > > >  {\n> > > > > -\tint ret = m2m_->output()->importBuffers(inputBufferCount_);\n> > > > > +\tint ret = m2m_->output()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > \n> > > > 
Shouldn't this be SIMPLE_INTERNAL_BUFFER_COUNT ? Overallocating is not\n> > > > much of an issue I suppose.\n> > \n> > Indeed. I was under the impression that we should always importBuffers() using\n> > BUFFER_SLOT_COUNT, but now, after reading more code, I understand that's not\n> > always the case (although this seems to be the only case, due to the presence of\n> > the converter).\n> > \n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > -\tret = m2m_->capture()->importBuffers(outputBufferCount_);\n> > > > > +\tret = m2m_->capture()->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0) {\n> > > > >  \t\tstop();\n> > > > >  \t\treturn ret;\n> > > > > diff --git a/src/libcamera/pipeline/simple/converter.h b/src/libcamera/pipeline/simple/converter.h\n> > > > > index 276a2a291c21..7e1d60674f62 100644\n> > > > > --- a/src/libcamera/pipeline/simple/converter.h\n> > > > > +++ b/src/libcamera/pipeline/simple/converter.h\n> > > > > @@ -29,6 +29,9 @@ class SizeRange;\n> > > > >  struct StreamConfiguration;\n> > > > >  class V4L2M2MDevice;\n> > > > >  \n> > > > > +constexpr unsigned int SIMPLE_INTERNAL_BUFFER_COUNT = 5;\n> > > > > +constexpr unsigned int SIMPLE_BUFFER_SLOT_COUNT = 5;\n> > > > \n> > > > Let's name the variables kSimpleInternalBufferCount and\n> > > > kSimpleBufferSlotCount, as that's the naming scheme we're moving to for\n> > > > non-macro constants. Same comment elsewhere in this patch.\n> > > > \n> > > > Those constants don't belong to converter.h. Could you turn them into\n> > > > member constants of the SimplePipelineHandler class, as\n> > > > kNumInternalBuffers (which btw should be removed) ? The number of buffer\n> > > > slots can be passed as a parameter to SimpleConverter::start().\n> > > > \n> > > > There's no stats or parameters here, and no IPA, so the situation is\n> > > > different than for IPU3 and RkISP1. 
The number of internal buffers\n> > > > should just be one more than the minimum number of buffers required by\n> > > > the capture device, I don't think there's another requirement.\n> > \n> > Plus one extra to have queued at the converter's 'output' node (which is its\n> > input, confusingly)?\n> \n> It depends a bit on the exact timings of the capture device, as is\n> probably clear with the explanation above (or at least is now clearly\n> seen as a complicated topic :-)). We need to ensure that the realtime\n> requirements of the device are met, and that the capture buffers that\n> complete, and are then processed by the converter, will be requeued in\n> time to the capture device to meet those requirements.\n> \n> As the simple pipeline handler deals with a variety of devices, we have\n> two options, either checking the requirements of each device and\n> recording them in the supportedDevices array, or pick a common number of\n> buffers that should be good enough for everybody. I'd start with the\n> second option for simplicity, and as the pipeline handler currently uses\n> 3 buffers, I'd stick to that for now.\n> \n> > > > > +\n> > > > >  class SimpleConverter\n> > > > >  {\n> > > > >  public:\n> > > > > diff --git a/src/libcamera/pipeline/simple/simple.cpp b/src/libcamera/pipeline/simple/simple.cpp\n> > > > > index 1c25a7344f5f..a1163eaf8be2 100644\n> > > > > --- a/src/libcamera/pipeline/simple/simple.cpp\n> > > > > +++ b/src/libcamera/pipeline/simple/simple.cpp\n> > > > > @@ -803,12 +803,10 @@ int SimplePipelineHandler::start(Camera *camera, [[maybe_unused]] const ControlL\n> > > > >  \t\t * When using the converter allocate a fixed number of internal\n> > > > >  \t\t * buffers.\n> > > > >  \t\t */\n> > > > > -\t\tret = video->allocateBuffers(kNumInternalBuffers,\n> > > > > +\t\tret = video->allocateBuffers(SIMPLE_INTERNAL_BUFFER_COUNT,\n> > > > >  \t\t\t\t\t     &data->converterBuffers_);\n> > > > >  \t} else {\n> > > > > -\t\t/* Otherwise, prepare for 
using buffers from the only stream. */\n> > > > > -\t\tStream *stream = &data->streams_[0];\n> > > > > -\t\tret = video->importBuffers(stream->configuration().bufferCount);\n> > > > > +\t\tret = video->importBuffers(SIMPLE_BUFFER_SLOT_COUNT);\n> > > > >  \t}\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > > diff --git a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > index fd39b3d3c72c..755949e7a59a 100644\n> > > > > --- a/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > +++ b/src/libcamera/pipeline/uvcvideo/uvcvideo.cpp\n> > > > > @@ -91,6 +91,8 @@ private:\n> > > > >  \t\treturn static_cast<UVCCameraData *>(\n> > > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > > >  \t}\n> > > > > +\n> > > > > +\tstatic constexpr unsigned int UVC_BUFFER_SLOT_COUNT = 5;\n> > > > >  };\n> > > > >  \n> > > > >  UVCCameraConfiguration::UVCCameraConfiguration(UVCCameraData *data)\n> > > > > @@ -236,9 +238,8 @@ int PipelineHandlerUVC::exportFrameBuffers(Camera *camera,\n> > > > >  int PipelineHandlerUVC::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > > >  {\n> > > > >  \tUVCCameraData *data = cameraData(camera);\n> > > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > > >  \n> > > > > -\tint ret = data->video_->importBuffers(count);\n> > > > > +\tint ret = data->video_->importBuffers(UVC_BUFFER_SLOT_COUNT);\n> > > > \n> > > > For the uvc and vimc pipeline handlers, we have no internal buffers, so\n> > > > it's quite easy. 
We should have 8 or 16 slots, as for other pipeline\n> > > > handlers.\n> > > > \n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> > > > > diff --git a/src/libcamera/pipeline/vimc/vimc.cpp b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > index e89d53182c6d..24ba743a946c 100644\n> > > > > --- a/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > +++ b/src/libcamera/pipeline/vimc/vimc.cpp\n> > > > > @@ -102,6 +102,8 @@ private:\n> > > > >  \t\treturn static_cast<VimcCameraData *>(\n> > > > >  \t\t\tPipelineHandler::cameraData(camera));\n> > > > >  \t}\n> > > > > +\n> > > > > +\tstatic constexpr unsigned int VIMC_BUFFER_SLOT_COUNT = 5;\n> > > > >  };\n> > > > >  \n> > > > >  namespace {\n> > > > > @@ -312,9 +314,8 @@ int PipelineHandlerVimc::exportFrameBuffers(Camera *camera,\n> > > > >  int PipelineHandlerVimc::start(Camera *camera, [[maybe_unused]] const ControlList *controls)\n> > > > >  {\n> > > > >  \tVimcCameraData *data = cameraData(camera);\n> > > > > -\tunsigned int count = data->stream_.configuration().bufferCount;\n> > > > >  \n> > > > > -\tint ret = data->video_->importBuffers(count);\n> > > > > +\tint ret = data->video_->importBuffers(VIMC_BUFFER_SLOT_COUNT);\n> > > > >  \tif (ret < 0)\n> > > > >  \t\treturn ret;\n> > > > >  \n> \n> -- \n> Regards,\n> \n> Laurent Pinchart\n> \n> -- \n> To unsubscribe, send mail to kernel-unsubscribe@lists.collabora.co.uk.","headers":{"Return-Path":"<libcamera-devel-bounces@lists.libcamera.org>","X-Original-To":"parsemail@patchwork.libcamera.org","Delivered-To":"parsemail@patchwork.libcamera.org","Received":["from lancelot.ideasonboard.com (lancelot.ideasonboard.com\n\t[92.243.16.209])\n\tby patchwork.libcamera.org (Postfix) with ESMTPS id BC38ABD87D\n\tfor <parsemail@patchwork.libcamera.org>;\n\tThu, 19 Aug 2021 20:36:14 +0000 (UTC)","from lancelot.ideasonboard.com (localhost [IPv6:::1])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTP id 40F5568892;\n\tThu, 19 Aug 2021 22:36:14 +0200 (CEST)","from 
bhuna.collabora.co.uk (bhuna.collabora.co.uk\n\t[IPv6:2a00:1098:0:82:1000:25:2eeb:e3e3])\n\tby lancelot.ideasonboard.com (Postfix) with ESMTPS id 743A760264\n\tfor <libcamera-devel@lists.libcamera.org>;\n\tThu, 19 Aug 2021 22:36:12 +0200 (CEST)","from notapiano (unknown [IPv6:2804:14c:1a9:2434:b693:c9:5cb6:b688])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\t(Authenticated sender: nfraprado)\n\tby bhuna.collabora.co.uk (Postfix) with ESMTPSA id 8E9D21F44265;\n\tThu, 19 Aug 2021 21:36:09 +0100 (BST)"],"Date":"Thu, 19 Aug 2021 17:36:04 -0300","From":"=?utf-8?b?TsOtY29sYXMgRi4gUi4gQS4=?= Prado <nfraprado@collabora.com>","To":"Laurent Pinchart <laurent.pinchart@ideasonboard.com>","Message-ID":"<20210819203604.bmx2rg6lavphsa5x@notapiano>","References":"<20210722232851.747614-1-nfraprado@collabora.com>\n\t<20210722232851.747614-10-nfraprado@collabora.com>\n\t<YQcxfd4imcmam/IB@pendragon.ideasonboard.com>\n\t<20210807150345.o4mcczkjt5vxium4@notapiano>\n\t<20210809202646.blgq4lyab7ktglsp@notapiano>\n\t<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<YRsgB6M7NE88y34v@pendragon.ideasonboard.com>","Subject":"Re: [libcamera-devel] [PATCH v7 09/11] libcamera: pipeline: Don't\n\trely on 
bufferCount","X-BeenThere":"libcamera-devel@lists.libcamera.org","X-Mailman-Version":"2.1.29","Precedence":"list","List-Id":"<libcamera-devel.lists.libcamera.org>","List-Unsubscribe":"<https://lists.libcamera.org/options/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=unsubscribe>","List-Archive":"<https://lists.libcamera.org/pipermail/libcamera-devel/>","List-Post":"<mailto:libcamera-devel@lists.libcamera.org>","List-Help":"<mailto:libcamera-devel-request@lists.libcamera.org?subject=help>","List-Subscribe":"<https://lists.libcamera.org/listinfo/libcamera-devel>,\n\t<mailto:libcamera-devel-request@lists.libcamera.org?subject=subscribe>","Cc":"libcamera-devel@lists.libcamera.org, Sakari Ailus <sakari.ailus@iki.fi>, \n\t=?utf-8?b?QW5kcsOp?= Almeida <andrealmeid@collabora.com>,\n\tkernel@collabora.com","Errors-To":"libcamera-devel-bounces@lists.libcamera.org","Sender":"\"libcamera-devel\" <libcamera-devel-bounces@lists.libcamera.org>"}}]