[v2,00/32] rkisp1: pipeline rework for PFC
mbox series

Message ID 20260325151416.2114564-1-stefan.klug@ideasonboard.com
Headers show
Series
  • rkisp1: pipeline rework for PFC
Related show

Message

Stefan Klug March 25, 2026, 3:13 p.m. UTC
Hi all,

This is v2 of the pipeline rework. v1 is available here:
https://patchwork.libcamera.org/project/libcamera/list/?series=5524

After the feedback on the last series it got clear that the name is not
really well chosen as PFC is not the content of the series but rather
the preparatory work needed to later implement PFC on rkisp1/imx8mp.

I'll start with the changes to v1 first, the slightly adapted cover
letter from v1 follows.

The biggest topics in v1 review were the additional accumulation of
logic in the IPA and the quite complicated bookkeeping in the pipeline
handler.

The first Was improved by implementing a callback mechanism in FcQueue
in patch 22. On the long run I envision moving even more logic into a
base class and possibly dropping the FcQueue altogether. A preview of
that is in patch 32.

The latter was improved by adding a BufferQueue class in patch 24. I
left the big rework in patch 25 as complicated as it was. The
BufferQueue is applied in the following patches, to be able to see the
benefit and have a exit strategy if the BufferQueue idea doesn't find
enough support.

Aside from that, a lot of smaller fixes from review were applied and a
few patches from the end of the series were dropped as they were not
absolutely necessary for this (but are still in my pipeline).


Now to the big picture (mostly the old cover-letter):

From a very high level view, this series implements the mechanics needed
to cope with unstable hardware and to later implement per frame control
on the rkisp1. It does not yet touch the possible PFC mechanics like
fast-tracked controls. It was only tested on the imx8mp together with
the dewarper so there might still be issues on other rkisp1 platforms.

To do correct per-frame-control we need to synchronize multiple things:
- The sensor needs to be fed with parameters according to the specific
  delays
- The ISP parameters need to be in sync also. This is especially
  important if for example digital gain in the ISP is used to mitigate
quantization in the sensor gain and therefore need to perfectly match
the sensor timing.

The current IPA model has basically two loops:

The sensor controls loop:
- Pipeline::statsBufferReady()
- IPA::processStats()
- IPA::setSensorControls.emit()
  - Pipeline::DelayedControls.push()
- IPA::metadataReady.emit()
  - Pipeline::tryCompleteRequest()

The ISP params loop:
- Pipeline::queueRequestDevice()
- IPA::queueRequest()
- IPA::computeParams()
- Pipeline::paramsComputed()
  - Pipeline::queueParams()

This means the sensor controls and params are only indirectly coupled.
To add to the complexity, in the sensor controls loop, the frame context
for the frame that produced the stats is used as basis for the call to
setSensorControls(). That is the point where I often fail to follow the
chain of events.

The reworked model now changes the mechanics: In multiple discussions it
got clear that even though theoretically the ISP params have a 0-1 frame
delay and are therefore 1 frame faster than typical sensor delays it is
more important to ensure that sensor data and ISP params are in sync
than to optimiza for that one frame. We can still have a look at that
later. Stating that sensor and params should be in sync also means that
we can produce them at the same time. So the pipeline was reworked with
the following premises in mind:

- Pipeline and IPA now handle sensor sequence numbers instead
  of request numbers.

The stats loop only updates active state and fills metadata:
- ISP calculates stats
- IPA::processStats()
- IPA::metadataReady.emit()
- Pipeline::tryCompleteRequest()

The sensor loop triggers calculation of params + sensor controls:
- Pipeline::startFrame()
- Pipeline::DelayedControls.applyControls(n+delay)
- IPA::computeParams(n+delay+1)
  - IPA::setSensorControls.emit()
    - Pipeline::DelayedControls.push()
- Pipeline::paramsComputed()
  - Pipeline::queueParams()

The request loop:
- Pipeline::queueRequestDevice()
- IPA::queueRequest()

The main change is that sensor controls and ISP params are computed in
response to computeParams(). This has the nice effect, that we just need
to ensure that we call computeParams() early enough to send it out to
the sensor as that is usually the longest delay.

A request underrun is easy to handle as the params/stats machinery just
continues to run with scratch buffers in the kernel. The only change to
do there is to repurpose the IPA::queueRequest to a
IPA::initializeFrameContext and call that on demand when params for a
given frame are needed.

Additionally to this conceptual change, the stats, image and params
buffers were decoupled. Due to the asyncronous nature of V4L2 and the
usage of scratch buffers in the kernel it is impossible to ensure that a
grouping of (stats, image and params) stays in sync. Especially under
high load or with bad connectors, this breaks. Decoupling them makes a
few things easier but also brings a bit of complexity. This rework and
most of the loop structure change is unfortunately in one big commit
(patch 25) as I didn't see a way to split that into easy to digest
parts.

For the (re)synchronization to work properly, I needed to add
synchronization helpers.

In the current code we have a minimal resync
facility that adds 1 if it detects a sequence that is bigger than the
expected one. The problem here is the queue in the kernel. If a offset
is detected when dequing a buffer, we can compensate for that offset
when queuing new buffers. But in case there are n buffers queued when we observe the error, we will
most likely observe that error another n times until the compensation
applies. By then we have overcompensated by n and started a beautiful
oscillation. To handle that we need to track the observed error and the
applied compensation over time.

Another reason for this series was to get started faster. Therefor
controls passed to Camera::start() are now handled properly. In that
context a initial params buffer is also filled. This ensures that the
stats on frame 0 are calculated based on the correct settings and
prevents large oscillations at the beginning of the stream.

Open Todos:
- The semantics of delayed controls was modified to not delay, but more
  to a record and query. Maybe we should rename it?
- The changed delayed controls breaks the unit tests and possibly
  other pipelines
- Missing documentation on SequenceSyncHelper
- As always, a bit more cleanup on the patches and commit messages...

Best regards,
Stefan

---

Changes in v2:
- Reworked lazy initialization of the FrameContext
- Added BufferQueue to simplify queue handling
- Smaller fixes (see local changelogs)

Jacopo Mondi (1):
  libcamera: v4l2_videodevice: Do not hide frame drops

Stefan Klug (31):
  libcamera: delayed_controls: Add push() function that accepts a
    sequence number
  libcamera: delayed_controls: Handle missed pushes
  libcamera: delayed_controls: Increase log level for dummy pushes
  libcamera: delayed_controls: Queue noop when needed, not before
  libcamera: delayed_controls: Add maxDelay() function
  pipeline: rkisp1: Include frame number when pushing to delayed
    controls
  libipa: fc_queue: Rename template argument to FC
  libipa: fc_queue: Add trailing underscore to private members of
    FrameContext
  ipa: rkisp1: Refactor setControls()
  pipeline: rkisp1: Add a frameStart function to handle
    DelayedControls::applyControls
  ipa: rkisp1: Move setSensorControls signal to computeParams
  pipeline: rkisp1: Fix controls in raw mode
  ipa: rkisp1: Add initializeFrameContext() function
  pipeline: rkisp1: Apply initial controls
  ipa: rkisp1: Set frameContext.agc in queueRequest for auto mode also
  ipa: rkisp1: agc: Process frame duration at the right time
  libcamera: delayed_controls: Change semantics of sequence numbers
  libipa: algorithm: Update docs
  libcamera: delayed_controls: Ignore double pushes for the same frame
    number
  ipa: rkisp1: Allow processStats() to be called without stats buffer
  ipa: rkisp1: Lazy initialise frame context
  libcamera: internal: Add SequenceSyncHelper class
  libcamera: internal: Add a BufferQueue class to handle buffer queues
  pipeline: rkisp1: Decouple image, stats and param buffers
  pipline: rkisp1: Reinstantiate maxQueuedRequestsDevice limit
  pipeline: rkisp1: Correctly handle params buffer for frame 0
  pipeline: rkisp1: Fix buffer metadata when using the dewarper
  pipeline: rkisp1: Pass bufferId to paramsComputed()
  pipeline: rkisp1: rkisp1_path: Modify interface to be compatible with
    BufferQueue
  pipeline: rkisp1: Use BufferQueue for buffer handling
  DNI: Move all queue/algo logic into FcLogic class

 include/libcamera/internal/buffer_queue.h     | 131 +++
 include/libcamera/internal/delayed_controls.h |   4 +-
 include/libcamera/internal/meson.build        |   2 +
 .../libcamera/internal/sequence_sync_helper.h |  76 ++
 include/libcamera/internal/v4l2_videodevice.h |   1 -
 include/libcamera/ipa/rkisp1.mojom            |  12 +-
 src/ipa/ipu3/ipu3.cpp                         |  18 +-
 src/ipa/libipa/algorithm.cpp                  |  41 +-
 src/ipa/libipa/algorithm.h                    |   5 +
 src/ipa/libipa/fc_logic.cpp                   |  20 +
 src/ipa/libipa/fc_logic.h                     | 151 +++
 src/ipa/libipa/fc_queue.cpp                   |   8 +-
 src/ipa/libipa/fc_queue.h                     | 126 ++-
 src/ipa/libipa/meson.build                    |   2 +
 src/ipa/mali-c55/mali-c55.cpp                 |  20 +-
 src/ipa/rkisp1/algorithms/agc.cpp             |  31 +-
 src/ipa/rkisp1/algorithms/agc.h               |   3 +-
 src/ipa/rkisp1/algorithms/algorithm.h         |   5 +
 src/ipa/rkisp1/algorithms/awb.cpp             |   2 +
 src/ipa/rkisp1/ipa_context.h                  |   5 +-
 src/ipa/rkisp1/rkisp1.cpp                     | 106 ++-
 src/ipa/simple/soft_simple.cpp                |  18 +-
 src/libcamera/buffer_queue.cpp                | 449 +++++++++
 src/libcamera/delayed_controls.cpp            |  85 +-
 src/libcamera/meson.build                     |   2 +
 src/libcamera/pipeline/ipu3/ipu3.cpp          |   2 +-
 src/libcamera/pipeline/mali-c55/mali-c55.cpp  |   6 +-
 src/libcamera/pipeline/rkisp1/rkisp1.cpp      | 870 +++++++++++-------
 src/libcamera/pipeline/rkisp1/rkisp1_path.cpp |  42 +-
 src/libcamera/pipeline/rkisp1/rkisp1_path.h   |  23 +-
 src/libcamera/sequence_sync_helper.cpp        |  21 +
 src/libcamera/v4l2_videodevice.cpp            |  15 -
 test/meson.build                              |   2 +-
 33 files changed, 1727 insertions(+), 577 deletions(-)
 create mode 100644 include/libcamera/internal/buffer_queue.h
 create mode 100644 include/libcamera/internal/sequence_sync_helper.h
 create mode 100644 src/ipa/libipa/fc_logic.cpp
 create mode 100644 src/ipa/libipa/fc_logic.h
 create mode 100644 src/libcamera/buffer_queue.cpp
 create mode 100644 src/libcamera/sequence_sync_helper.cpp