diff --git a/src/libcamera/camera.cpp b/src/libcamera/camera.cpp
index eff999ec322a..83be4202735a 100644
--- a/src/libcamera/camera.cpp
+++ b/src/libcamera/camera.cpp
@@ -99,6 +99,238 @@
  * on the crop rectangle and the output stream size. The crop rectangle is
  * expressed relatively to the full pixel array size and indicates how the field
  * of view is affected by the pipeline.
+ *
+ * \section pipeline-stages Pipeline Stages
+ *
+ * At the hardware level, pipelines are often more complex. A camera is usually
+ * made of multiple independent stages chained together. For instance, a common
+ * pattern seen in camera hardware architectures splits the image processing,
+ * after the camera sensor, into two parts:
+ *
+ * - The first hardware processing stage is connected to the camera sensor and
+ *   captures raw frames to memory, possibly applying image processing to the
+ *   raw data (such as black level subtraction or lens shading correction).
+ *   This is referred to as inline processing, as frames are processed as they
+ *   arrive, in real time.
+ *
+ * - The second hardware processing stage reads the raw frames from memory,
+ *   applies demosaicing, color space conversion and other processing steps,
+ *   and stores the processed frames in memory in YUV format. This is referred
+ *   to as offline processing, as the timing constraints are not driven by a
+ *   live input.
+ *
+ * More offline processing stages may be chained after the first one to produce
+ * the final images. In libcamera, pipeline stages are by default controlled
+ * behind the scenes by pipeline handlers, hiding the complexity from
+ * applications.
+ *
+ * \subsection pipeline-stages-control Explicit Control of Pipeline Stages
+ *
+ * Applications may have use cases that require explicit control of the
+ * pipeline stages. In the previous example, an application may need to apply
+ * custom processing to the raw images between the inline and offline stages.
+ * libcamera supports this feature by making the pipelines explicit.
+ *
+ * The pipeline concept introduced previously is generalized as a logical view
+ * of processing operations applied to frames, covering one or multiple
+ * hardware stages. Each pipeline receives frames from a single input and
+ * produces one or multiple output streams of frames. The input corresponds to
+ * either the camera sensor, or frames stored in memory. A pipeline that
+ * produces frames generated by the camera sensor is known as a capture
+ * pipeline, while a pipeline that produces frames based on a memory input is
+ * known as a processing pipeline. Not all cameras support processing
+ * pipelines.
+ *
+ * Pipelines operate on streams, which model an input or output of the pipeline.
+ * With the exception of the stream corresponding to the camera sensor, known
+ * as the live stream, all streams operate on memory. Output streams capture
+ * frames to memory, and input streams fetch frames from memory for further
+ * processing.
+ *
+ * Pipelines are constructed by applications when configuring the camera. To
+ * create a pipeline, applications shall select, among all the streams exposed
+ * by the camera, the streams that best match their use case based on the
+ * capabilities the streams expose. libcamera provides helper functions to
+ * assist this stream selection process.
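+ *
+ * For example, with the role-based generateConfiguration() helper, an
+ * application may inspect the capabilities a proposed stream exposes before
+ * committing to a configuration. A minimal sketch (includes and error
+ * handling omitted):
+ *
+ * \code
+ * std::unique_ptr<CameraConfiguration> config =
+ *         camera->generateConfiguration({ StreamRole::Viewfinder });
+ *
+ * // Capabilities exposed by the stream proposed for the viewfinder role.
+ * const StreamFormats &formats = config->at(0).formats();
+ * for (const PixelFormat &format : formats.pixelformats())
+ *         std::cout << format.toString() << " : "
+ *                   << formats.range(format).toString() << std::endl;
+ * \endcode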
+ *
+ * \todo Provide an example of two pipelines being used concurrently in the
+ * form of a diagram
+ *
+ * \subsection pipeline-resources Resource Sharing
+ *
+ * Within a camera, multiple pipelines may share hardware resources. For
+ * instance, with the typical inline/offline hardware architecture described
+ * above, an application may construct a capture pipeline to capture frames to
+ * memory in both raw format and processed YUV format, and a processing
+ * pipeline to process raw frames from memory. The capture pipeline would use
+ * both the inline stage (to capture raw frames) and the offline stage (to
+ * generate processed YUV frames), and the processing pipeline would use the
+ * same offline stage for memory to memory processing. Those two pipelines may
+ * be operated concurrently by an application, resulting in the offline stage
+ * and its streams being shared between the pipelines.
+ *
+ * Resource sharing between multiple pipelines is handled by libcamera as
+ * transparently as possible. The camera configuration API exposes information
+ * to inform of any user-visible impact of resource sharing and allow
+ * applications to make appropriate usage decisions.
+ *
+ * \subsection pipeline-stages-model-mapping Mapping to the Pipeline Model
+ *
+ * Depending on which input and output streams it uses, a pipeline usually
+ * supports a subset of the operations defined by the
+ * \ref camera-pipeline-model "pipeline model". For instance, a capture
+ * pipeline that ends at a stream capturing raw images may support operations
+ * up to pixel readout, or up to lens shading correction. A processing pipeline
+ * operating on raw frames and outputting YUV frames may start at black level
+ * subtraction or at spatial noise filtering.
+ *
+ * \section camera-use-cases Sample Use Cases
+ *
+ * To better understand the usage of pipelines and streams, this section
+ * presents several common use cases and how they map to pipelines and streams.
+ *
+ * \subsection camera-use-case-viewfinder Viewfinder
+ *
+ * The simplest use case captures a single live stream from the camera to
+ * display it on the screen. This is named a viewfinder, due to its usage in
+ * photo applications to display a preview of what will appear in the picture.
+ *
+ * In this use case, the camera operates with a single capture pipeline,
+ * containing a single output stream. The output stream shall be selected for
+ * its ability to produce a format and a size compatible with the display
+ * requirements. It will thus typically support scaling frames. The pixel
+ * format and size of the output stream are selected by the application.
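+ *
+ * A minimal sketch of the viewfinder setup, from camera acquisition to
+ * configuration (includes, namespaces and error handling omitted; the pixel
+ * format and size shown are illustrative):
+ *
+ * \code
+ * CameraManager cm;
+ * cm.start();
+ *
+ * std::shared_ptr<Camera> camera = cm.cameras()[0];
+ * camera->acquire();
+ *
+ * // Request a stream suitable for the viewfinder role and adjust its
+ * // pixel format and size to the display requirements.
+ * std::unique_ptr<CameraConfiguration> config =
+ *         camera->generateConfiguration({ StreamRole::Viewfinder });
+ * StreamConfiguration &viewfinder = config->at(0);
+ * viewfinder.pixelFormat = formats::XRGB8888;
+ * viewfinder.size = { 1280, 720 };
+ *
+ * config->validate();
+ * camera->configure(config.get());
+ * \endcode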
+ *
+ * \subsection camera-use-case-viewfinder-still Viewfinder and Still Image Capture
+ *
+ * A slightly more advanced use case combines the viewfinder from the previous
+ * use case with high resolution still image capture. This is the most common
+ * implementation of a simple point-and-shoot camera, with the viewfinder
+ * offering live display on the screen, and still images occasionally captured
+ * based on user input at a high(er) resolution.
+ *
+ * In this use case, the camera operates with a single capture pipeline,
+ * containing two output streams, respectively named viewfinder and still
+ * capture. Note that the stream naming only serves to ease referring to
+ * streams in the documentation of a particular use case; the streams selected
+ * for the viewfinder and still capture roles may support more use cases and
+ * may not be intrinsically dedicated to these roles.
+ *
+ * As in the previous use case, the output streams shall be selected for their
+ * compatibility with the display and still capture requirements. The still
+ * capture stream may not support scaling, but may offer additional image
+ * quality improvements compared to the viewfinder stream (such as higher
+ * quality noise reduction).
+ *
+ * The pixel format and size of both streams are selected by the application.
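+ *
+ * For instance (sizes illustrative), both roles may be requested in a single
+ * configuration; the generated entries match the order of the requested
+ * roles:
+ *
+ * \code
+ * std::unique_ptr<CameraConfiguration> config =
+ *         camera->generateConfiguration({ StreamRole::Viewfinder,
+ *                                         StreamRole::StillCapture });
+ *
+ * config->at(0).size = { 1280, 720 };   // viewfinder, sized for the display
+ * config->at(1).size = { 4000, 3000 };  // still capture, sensor resolution
+ *
+ * config->validate();
+ * camera->configure(config.get());
+ * \endcode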
+ *
+ * \subsection camera-use-case-viewfinder-video Viewfinder and Video Capture
+ *
+ * Similarly to the previous use case, this use case combines a viewfinder with
+ * a second stream, this time to capture video. This is a common use case for
+ * video recording or video conferencing applications, with the viewfinder
+ * offering live preview of the video on the screen, and the captured video
+ * being sent to an encoder and recorded on permanent storage or sent over the
+ * network.
+ *
+ * In this use case, the camera operates with a single capture pipeline,
+ * containing two output streams, respectively named viewfinder and video.
+ * Selection of the output streams by the application follows the same process
+ * as before. Both the viewfinder and video streams are typically selected for
+ * their ability to scale the image and output a format compatible with the
+ * display and the encoder respectively. The video stream may offer additional
+ * features such as video stabilization, and the viewfinder stream may support
+ * mirroring the image to present a more natural self-view on the screen.
+ *
+ * The video stream in this use case is not limited to being encoded and stored
+ * or streamed. It may be used by the application for other purposes, such as
+ * analysis by computer vision algorithms.
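+ *
+ * Stream selection follows the same process; the sketch below additionally
+ * shows how an application may react when validate() adjusts a requested
+ * configuration (the video size is illustrative):
+ *
+ * \code
+ * std::unique_ptr<CameraConfiguration> config =
+ *         camera->generateConfiguration({ StreamRole::Viewfinder,
+ *                                         StreamRole::VideoRecording });
+ * config->at(1).size = { 1920, 1080 };  // video, sized for the encoder
+ *
+ * switch (config->validate()) {
+ * case CameraConfiguration::Valid:
+ *         break;
+ * case CameraConfiguration::Adjusted:
+ *         // The camera proposed the closest supported parameters; the
+ *         // application decides whether they remain acceptable.
+ *         break;
+ * case CameraConfiguration::Invalid:
+ *         // No supported configuration could be derived from the request.
+ *         return;
+ * }
+ *
+ * camera->configure(config.get());
+ * \endcode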
+ *
+ * \subsection camera-use-case-raw Raw Capture
+ *
+ * In addition to processed frames, cameras may support capturing raw frames as
+ * produced by the camera sensor, with no or minimal processing. Raw frames
+ * may be stored in the Digital Negative (DNG) file format, or used for
+ * camera tuning during system development and integration. This may
+ * be combined with any of the previous use cases.
+ *
+ * In this use case, the camera operates with a single capture pipeline,
+ * containing one raw output stream, in addition to the processed output
+ * streams required for other purposes (such as viewfinder or still image
+ * capture). The raw stream is selected for its ability to generate raw frames.
+ *
+ * While applications select the format and size of processed streams, the raw
+ * stream typically offers less flexibility. Its pixel format is dictated by
+ * what the camera sensor produces, and may allow selection of a lower bit
+ * depth. The raw stream's size may be fixed when the sensor provides no
+ * scaling capability. Otherwise, it interacts with the size of the processed
+ * streams, as both are derived from the same sensor output.
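+ *
+ * A sketch of adding a raw stream alongside a viewfinder, reading back what
+ * the camera proposes for the raw stream instead of requesting arbitrary
+ * values:
+ *
+ * \code
+ * std::unique_ptr<CameraConfiguration> config =
+ *         camera->generateConfiguration({ StreamRole::Viewfinder,
+ *                                         StreamRole::Raw });
+ *
+ * // The raw pixel format and size are dictated by the camera sensor.
+ * const StreamConfiguration &raw = config->at(1);
+ * std::cout << "Raw stream: " << raw.toString() << std::endl;
+ * \endcode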
+ *
+ * \subsection camera-use-case-raw-processing Viewfinder and Still Image Capture with Custom Processing
+ *
+ * In the \ref camera-use-case-viewfinder-still "viewfinder and still image capture"
+ * use case described previously, all the processing applied to the still image
+ * is performed by the device. To further increase the still image quality, it
+ * may be desirable to apply additional processing to the raw frame that is
+ * too complex for the hardware, or simply not supported by the camera. This
+ * includes, for instance, temporal noise reduction that combines multiple
+ * consecutive frames to reduce the average noise.
+ *
+ * Capturing the raw frame and processing it in the application is possible as
+ * explained in the \ref camera-use-case-raw "raw capture" use case. In that
+ * case, however, the application would be responsible for the complete
+ * processing of the raw frame to produce a still capture, severely increasing
+ * the application complexity. To avoid this, cameras can expose processing
+ * pipelines to applications, allowing them to capture raw frames, process them
+ * with custom algorithms, and send those pre-processed frames back to the
+ * camera's processing pipeline to apply all the regular camera processing. The
+ * pre-processing step is in that case fully implemented on the application
+ * side (usually based on custom software running on the main CPU, but nothing
+ * in libcamera would limit the application's ability to offload to a GPU or
+ * another processing engine), while harnessing the full power of the camera's
+ * hardware processing.
+ *
+ * In this use case, the camera operates with two pipelines, a capture pipeline
+ * and a processing pipeline. The capture pipeline contains one viewfinder
+ * output stream and one raw output stream. The processing pipeline contains one
+ * raw input stream and one still image capture output stream. The raw input
+ * stream is selected for its ability to consume raw frames in the same format
+ * as generated by the raw output stream.
+ *
+ * During viewfinder operation, the application uses the capture pipeline only,
+ * to capture viewfinder frames. When a still image capture is needed, the
+ * application additionally captures a raw frame from the capture pipeline,
+ * pre-processes it, and then submits the pre-processed frame to the processing
+ * pipeline to complete the still image capture operation. If the pre-processing
+ * requires more than one frame, the application may capture multiple raw
+ * frames, process them together into one pre-processed raw frame, and submit
+ * that frame back to the processing pipeline.
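+ *
+ * The exact API for submitting frames to a processing pipeline is beyond the
+ * scope of this overview. As a purely hypothetical sketch, reusing the
+ * existing Request mechanism, with rawInput and stillCapture standing for the
+ * processing pipeline's input and output streams:
+ *
+ * \code
+ * // 'processed' holds the frame produced by the application's custom
+ * // pre-processing of one or more captured raw frames.
+ * std::unique_ptr<Request> request = camera->createRequest();
+ * request->addBuffer(rawInput, processed);      // raw input stream
+ * request->addBuffer(stillCapture, output);     // still capture output
+ * camera->queueRequest(request.get());
+ * \endcode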
+ *
+ * \subsection camera-use-case-zsl Viewfinder and Zero Shutter Lag Still Image Capture
+ *
+ * When a user wants to capture a still image in a point-and-shoot camera
+ * application, various delays are involved at all stages of the process. On
+ * the device side, registering a button press or a tap on the screen, and
+ * processing the event, introduces a significant delay. Even if that delay
+ * were minimized, the delay between the scene event that needs to be
+ * captured and the user's action on the device is also significant. This often
+ * results in missed shots when trying to capture fast action.
+ *
+ * To solve this issue, the application can use a technique that captures an
+ * image from the past, compensating for the system's delays with a "negative
+ * delay". The camera uses the same capture and processing pipelines as in the
+ * previous use case. The capture pipeline is operated differently, with raw
+ * frames being captured continuously to a small ring buffer of frames managed
+ * by the application. When the still image capture is requested, the
+ * application selects the appropriate raw frame from the ring buffer, based on
+ * an evaluation of the capture event delay, and submits it to the processing
+ * pipeline to generate a processed still image. This technique is referred to
+ * as zero shutter lag, or ZSL, due to the apparent removal of all delays.
+ *
+ * Zero shutter lag can be combined with application-side processing of raw
+ * frames, for instance using multiple raw frames from the ring buffer to
+ * perform temporal noise reduction, or using image analysis to pick the best
+ * raw frame from the ring buffer.
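+ *
+ * A sketch of the application-side ring buffer, selecting the raw frame whose
+ * sensor timestamp is closest to the user action minus the measured system
+ * delay (rawStream, kRingSize and the timestamp handling are illustrative
+ * assumptions):
+ *
+ * \code
+ * std::deque<FrameBuffer *> ring;
+ *
+ * void requestCompleted(Request *request)
+ * {
+ *         ring.push_back(request->buffers().at(rawStream));
+ *         if (ring.size() > kRingSize)
+ *                 ring.pop_front();     // oldest frame becomes recyclable
+ * }
+ *
+ * FrameBuffer *pickFrame(uint64_t eventTimestamp, uint64_t delay)
+ * {
+ *         uint64_t target = eventTimestamp - delay;
+ *         FrameBuffer *best = nullptr;
+ *         int64_t bestDelta = INT64_MAX;
+ *
+ *         for (FrameBuffer *buf : ring) {
+ *                 int64_t delta = int64_t(buf->metadata().timestamp)
+ *                               - int64_t(target);
+ *                 if (delta < 0)
+ *                         delta = -delta;
+ *                 if (delta < bestDelta) {
+ *                         best = buf;
+ *                         bestDelta = delta;
+ *                 }
+ *         }
+ *
+ *         return best;
+ * }
+ * \endcode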
  */
 
 namespace libcamera {
