From patchwork Fri Jun 6 16:41:41 2025
X-Patchwork-Id: 23489
From: Barnabás Pőcze
To: libcamera-devel@lists.libcamera.org
Subject: [RFC PATCH v1 08/23] Documentation: design: Document `MetadataList`
Date: Fri, 6 Jun 2025 18:41:41 +0200
Message-ID: <20250606164156.1442682-9-barnabas.pocze@ideasonboard.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250606164156.1442682-1-barnabas.pocze@ideasonboard.com>
References: <20250606164156.1442682-1-barnabas.pocze@ideasonboard.com>

Add a document describing the problem, the choices, and the design
of the separate metadata list data structure.

Signed-off-by: Barnabás Pőcze
---
 Documentation/design/metadata-list.rst | 234 +++++++++++++++++++++++++
 Documentation/index.rst                |   1 +
 Documentation/meson.build              |   1 +
 3 files changed, 236 insertions(+)
 create mode 100644 Documentation/design/metadata-list.rst

diff --git a/Documentation/design/metadata-list.rst b/Documentation/design/metadata-list.rst
new file mode 100644
index 000000000..a42f94bdf
--- /dev/null
+++ b/Documentation/design/metadata-list.rst
@@ -0,0 +1,234 @@
+.. SPDX-License-Identifier: CC-BY-SA-4.0
+
+Design of the metadata list
+===========================
+
+This document explains the design and rationale of the metadata list.
+
+
+Description of the problem
+--------------------------
+
+Early metadata
+^^^^^^^^^^^^^^
+
+A pipeline handler might report numerous metadata items to the application about
+a single request. It is likely that different metadata items become available at
+different points in time while a request is being processed.
+
+Simultaneously, an application might desire to carry out potentially non-trivial
+extra processing on the image, etc. using certain metadata items. For such an
+application it is likely best if the final value of each metadata item is reported
+as soon as possible, thus allowing it to start processing as soon as possible.
+
+For this reason, libcamera provides the `metadataAvailable` signal on each `Camera` object.
+This signal is dispatched whenever new metadata items become available for a queued request.
+This mechanism is completely optional: only interested applications need to subscribe, and
+others are free to ignore it completely. `Request::metadata()` will contain the sum of
+all early metadata items at request completion.
+
+Thread safety
+^^^^^^^^^^^^^
+
+At the moment, event handlers of the application are always dispatched in a private
+thread of libcamera. This requires that applications process the various events in a
+thread-safe manner with respect to themselves. The burden of correct synchronization
+falls upon the applications.
+
+Previously, a `ControlList` was used to store the metadata pertaining to a particular
+request. A `ControlList` is implemented using an `std::unordered_map`, meaning that
+its thread-safety is limited. This hints at a need for a separate data structure
+or at least some kind of thread-safe wrapper.
+
+
+Requirements
+------------
+
+We wish to provide a simple, easy-to-use, and hard-to-misuse interface for applications.
+Notably, applications should be able to delegate early metadata processing to their
+own separate threads safely with respect to the metadata list. Consider the following
+scenario: the pipeline handler sends early metadata items to the application, and the
+application delegates them to a separate thread. After that, the private libcamera thread
+is no longer blocked, thus the pipeline handler can continue working on the request, e.g.
+add more metadata items. Simultaneously, the application might be reading the metadata
+items on a separate thread. This situation should be safe and work correctly, ideally
+with any number of threads reading the completed metadata items, until the request
+is destroyed or reused, whichever happens first.
+
+Secondarily, efficiency should be considered: copies, locks, reference counting, etc.
+should be avoided if possible.
+
+Preferably, it should be possible to refer to a contiguous (in insertion order) subset
+of values reasonably efficiently (i.e. avoiding having to store a separate list of
+numeric identifiers, etc.).
+
+
+Options
+-------
+
+Keep using `ControlList`
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using a `ControlList` (and hence `std::unordered_map`) with early metadata completion would
+be possible, but it would place a number of potentially non-intuitive and easy-to-violate
+restrictions on applications, making it harder to use safely. Specifically, the application
+would have to retrieve a pointer to the `ControlValue` object in the metadata `ControlList`,
+and then access it only through that pointer. It wouldn't be able to do lookups on the metadata
+list outside the event handler. As a consequence, the usual way of retrieving metadata using
+the pre-defined `Control` objects would no longer be possible, losing type-safety.
+
+Send a copy
+^^^^^^^^^^^
+
+Passing a separate `ControlList` containing the just completed metadata, and disallowing access
+to the request's metadata list until completion works fine, and avoids the synchronization issues
+on the libcamera side. Nonetheless, it has two significant drawbacks:
+
+1. It moves the issue of synchronization from libcamera to the application: the application still has
+   to access its own data in a thread-safe manner and/or transfer the partial metadata list to its
+   *main* thread of execution.
+2. Early metadata can be reported multiple times for each request, so making copies can have negative
+   performance implications.
+
+
+Design
+------
+
+A separate data structure is introduced to contain the metadata items pertaining to a given request.
+It is referred to as the "metadata list" from now on.
+
+The current design of the metadata list places a number of restrictions on request metadata.
+A metadata list is backed by a pre-allocated (at construction time) contiguous block of
+memory sized appropriately to contain all possible metadata items. This means that the
+number and size of metadata items that a camera can report must be known in advance. The
+newly introduced `MetadataListPlan` type is used for that purpose. At the time of writing
+this does not appear to be a significant limitation since most metadata has a fixed size,
+and each pipeline handler (and IPA) has a fixed set of metadata that it can report. There
+are, however, metadata items that have a variably-sized array type. In those cases an upper
+bound on the number of elements must be provided.
+
+`MetadataListPlan`
+^^^^^^^^^^^^^^^^^^
+
+A `MetadataListPlan` collects the set of possible metadata items. It maps the numeric id
+of the control to a collection of static information (size, etc.). This is most importantly
+used to calculate the size required to store all possible metadata items.
+
+Each camera has its own `MetadataListPlan` object similarly to its `ControlInfoMap`. It is
+used to create requests for the camera with an appropriately sized `MetadataList`.
+Pipeline handlers should fill it during camera initialization or configuration, and they
+are allowed to modify it as long as the camera is not configured, as well as during configuration.
+
+`MetadataList`
+^^^^^^^^^^^^^^
+
+The current metadata list implementation is a single-writer multiple-readers thread-safe
+data structure that provides lock-free lookup and access for any number of threads, while
+allowing a single thread at a time to add metadata items.
+
+The implemented metadata list has two main parts. The first part essentially contains
+a copy of the `MetadataListPlan` used to construct the `MetadataList`. In addition to
+the static information about each metadata item, it contains dynamic information such
+as whether the metadata item has been added to the list or not.
+
+The second part of a metadata list is a completely self-contained serialized list
+of metadata items. The number of bytes used for actually storing metadata items in
+this second part will be referred to as the "fill level" from now on. The self-contained
+nature of the second part leads to a certain level of data duplication between the two
+parts; however, the end goal is to have a serialized version of `ControlList` with the
+same serialized format. This would allow a `MetadataList` to be "trivially" reinterpreted
+as a control list at any point of its lifetime, simplifying the interoperability between the two.
+TODO: do we really want that?
+
+A metadata list, at construction time, calculates the number of bytes necessary to store
+all possible metadata items according to the supplied `MetadataListPlan`. Storage for
+all possible metadata items and the necessary auxiliary structures is then allocated.
+This allocation remains fixed for the entire lifetime of a `MetadataList`, which is
+crucial to satisfy the earlier requirements.
+
+Each metadata item can only be added to a metadata list once. This constraint does not pose
+a significant limitation; instead, it simplifies the interface and implementation, since the
+list is essentially append-only.
+
+Serialization
+'''''''''''''
+
+The actual values are encoded in the "second part" of the metadata list in a fairly
+simple fashion. Each control value is encoded as header + data bytes + padding. Each
+value has a header, which contains information such as the size, alignment, type, etc.
+of the value. The data bytes are aligned to the alignment specified in the header,
+and padding may be inserted after the last data byte to guarantee proper alignment
+for the next header. Padding is present even after the last entry.
+
+The minimum amount of state needed to describe such a serialized list of values is
+merely the number of bytes used, which can reasonably be limited to 4 GiB, meaning
+that a 32-bit unsigned integer is sufficient to store the fill level. This makes it
+possible to easily update the state in a wait-free fashion.
+
+Lookup
+''''''
+
+Lookup in a metadata list is done using the metadata entries in the "first part".
+These entries are sorted by their numeric identifiers, hence binary search is used to
+find the appropriate entry. Then, it is checked whether the given control id has already
+been added, and if it has, then its data can be returned in a `ControlValueView` object.
+
+Insertion
+'''''''''
+
+Similarly to lookup, insertion also starts with binary searching for the entry belonging
+to the given numeric identifier. If an entry is present for the given id and no value
+has been stored with that id yet, then insertion can proceed. The value is appended
+to the serialized list of control values according to the format described earlier.
+Then the fill level is atomically incremented, and the entry is marked as set. After
+that the new value is available for readers to consume.
+
+Having a single writer is an essential requirement to be able to carry out insertion in
+a reasonably efficient and thread-safe manner.
+
+Iteration
+'''''''''
+
+Iteration over a `MetadataList` is carried out using only the serialized list of controls
+in the "second part" of the data structure. An iterator can be implemented as a single
+pointer, pointing to the header of the current entry. The begin iterator simply points
+to the location of the header of the first value. The end iterator is simply the end of the
+serialized list of values, which can be calculated from the begin iterator and the fill
+level of the serialized list.
+
+The above iterator can model a `C++ forward iterator`_, that is, only increments of 1 are
+possible in constant time, and going backwards is not possible. Advancing to the next value
+can be simply implemented by reading the size and alignment from the header, and adjusting
+the iterator's pointer by the necessary amount.
+
+TODO: is a forward iterator enough? is a bidirectional iterator needed?
+
+.. _C++ forward iterator: https://en.cppreference.com/w/cpp/iterator/forward_iterator.html
+
+Clearing
+''''''''
+
+Removing a single value is not supported, but clearing the entire metadata list is.
+This should only be done when there are no readers; otherwise readers might run into
+data races if they keep reading the metadata when new entries are being added after
+clearing it.
+
+Clearing is implemented by resetting each metadata entry in the "first part", as well
+as resetting the stored fill level of the serialized buffer to 0.
+
+Partial view
+''''''''''''
+
+When multiple metadata items are completed early, it is important to provide a way
+for the application to know exactly which metadata items have just been added. The
+serialized values in the data structure are encoded such that a simple byte range
+is capable of representing any number of items that have been added in succession.
+
+The `MetadataList::Checkpoint` type is used to store the state of the serialized
+list (number of bytes and number of items) at a given point in time. From such a
+checkpoint object a `MetadataList::Diff` object can be constructed, which represents
+all values added since the checkpoint. This *diff* object is reasonably small and
+trivially copyable, making it easy to provide to the application. It has many of
+the same features as a `MetadataList`, e.g. it can be iterated and one can do lookups.
+Naturally, both iteration and lookups only consider the values added after the checkpoint
+and before the creation of the `MetadataList::Diff` object.
diff --git a/Documentation/index.rst b/Documentation/index.rst
index 251112fbd..60cb77702 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -24,6 +24,7 @@
    Tracing guide
    Design document: AE
+   Design document: Metadata list
 
 .. toctree::
    :hidden:
diff --git a/Documentation/meson.build b/Documentation/meson.build
index 0fc5909d0..79e687953 100644
--- a/Documentation/meson.build
+++ b/Documentation/meson.build
@@ -127,6 +127,7 @@ if sphinx.found()
         'conf.py',
         'contributing.rst',
         'design/ae.rst',
+        'design/metadata-list.rst',
         'documentation-contents.rst',
         'environment_variables.rst',
         'feature_requirements.rst',
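
The serialization, insertion, iteration, and partial-view ideas described in the new design document can be illustrated with a small, self-contained sketch. It is not the code from this patch series: `SketchList`, `EntryHeader`, `append()`, `checkpoint()`, and `forEach()` are hypothetical names, every entry is padded to a fixed 8-byte alignment instead of a per-entry alignment stored in the header, and the sorted "first part" used for binary-search lookup is left out entirely. A checkpoint is modelled as nothing more than a saved fill level.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header: the document's header also stores alignment and type.
struct EntryHeader {
	std::uint32_t id;   // numeric identifier of the metadata item
	std::uint32_t size; // number of data bytes following the header
};

// Assumption: one writer thread calls append(); any number of threads may
// call checkpoint() and forEach() concurrently with it.
class SketchList {
public:
	// Storage is allocated once and never resized (capacity below 4 GiB).
	explicit SketchList(std::size_t capacity) : storage_(capacity) {}

	// Append "header + data bytes + padding"; returns false when full.
	bool append(std::uint32_t id, const void *data, std::uint32_t size)
	{
		const std::uint32_t fill = fill_.load(std::memory_order_relaxed);
		const std::size_t needed = sizeof(EntryHeader) + alignUp(size);
		if (needed > storage_.size() - fill)
			return false;

		const EntryHeader header{ id, size };
		std::memcpy(storage_.data() + fill, &header, sizeof(header));
		std::memcpy(storage_.data() + fill + sizeof(header), data, size);

		// Publish the entry: this release store pairs with the acquire
		// loads in checkpoint() and forEach().
		fill_.store(static_cast<std::uint32_t>(fill + needed),
			    std::memory_order_release);
		return true;
	}

	// The current fill level doubles as a "checkpoint" in this sketch.
	std::uint32_t checkpoint() const
	{
		return fill_.load(std::memory_order_acquire);
	}

	// Forward iteration over every entry published at or after `from`.
	template<typename F>
	void forEach(std::uint32_t from, F &&visit) const
	{
		const std::uint32_t end = fill_.load(std::memory_order_acquire);
		for (std::uint32_t pos = from; pos < end;) {
			EntryHeader header;
			std::memcpy(&header, storage_.data() + pos, sizeof(header));
			visit(header.id, storage_.data() + pos + sizeof(header),
			      header.size);
			pos += static_cast<std::uint32_t>(sizeof(header) +
							  alignUp(header.size));
		}
	}

private:
	// Round up to a multiple of 8 so that the next header stays aligned.
	static std::size_t alignUp(std::size_t n)
	{
		return (n + 7) & ~std::size_t{ 7 };
	}

	std::vector<std::byte> storage_;
	std::atomic<std::uint32_t> fill_{ 0 };
};
```

Because readers only touch bytes below a fill level they loaded with acquire semantics, and the writer only writes bytes above the fill level it last published, readers never observe a partially written entry. Passing a saved `checkpoint()` value to `forEach()` visits only the entries appended since then, which mirrors the byte-range idea behind the partial view.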