From patchwork Sat Jan 13 14:22:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hans de Goede X-Patchwork-Id: 19401 Return-Path: X-Original-To: parsemail@patchwork.libcamera.org Delivered-To: parsemail@patchwork.libcamera.org Received: from lancelot.ideasonboard.com (lancelot.ideasonboard.com [92.243.16.209]) by patchwork.libcamera.org (Postfix) with ESMTPS id 23C23C32BD for ; Sat, 13 Jan 2024 14:23:02 +0000 (UTC) Received: from lancelot.ideasonboard.com (localhost [IPv6:::1]) by lancelot.ideasonboard.com (Postfix) with ESMTP id BAEBA61D57; Sat, 13 Jan 2024 15:23:01 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=libcamera.org; s=mail; t=1705155781; bh=CY3GiS2AK81nKO39eEE1/6hcpdEkV9KEbpl8i+ITXbQ=; h=To:Date:In-Reply-To:References:Subject:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=N9HwK8wKb2z9jAFwsHEbkyVHHOdFRTojuD/zrWU8qZ8qepqt+/raiclc92t3uCeRU arzKRBIe4jYbP9RZYbu+fuFCkEHo6DXKvihaniQUautDsd8af+v4WPq+Fs/XgEkN69 UsbO5IxM7c7gOSlWZMWsSuT6iC35KBdINrGkEsKGaEmgCxwv2EKhVL1xEC2xW72ANk Pphv3r/wKEjuduoZqFp/JBYQVvlAHShGiWcG7uVvDAnlF1u36K/VXEVhDFN1CAAuQO QFMg1MtwoM8i1ooYR1YVyGBv1ID0KpSjr9BzKojsxLuoESBC4rjaV1nXValOxF9c0o TrWwuCv4c44cA== Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lancelot.ideasonboard.com (Postfix) with ESMTPS id 57FE26293E for ; Sat, 13 Jan 2024 15:22:59 +0100 (CET) Authentication-Results: lancelot.ideasonboard.com; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.b="hj5KYIbt"; dkim-atps=neutral DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1705155778; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=smKNwA+SuPq4ZnFIwXRUUmh9rEFsw4rqtF6BjuXZD6M=; b=hj5KYIbtT5HZJ//eRty7aEJ2HJlcSHwj9sXC/2VTAp8r9/dnzmqP3tF5zfPYAX6/VdRZrD x7K+7nmoUIBdNw7X1GCJ2OCHqIPfDX67TUYZHywfNDDhc1fsDH3o5k80+iyIA+X8hoFd4F 4hrYAgwzYmf54QJwRyoT0ju2S65gH6w= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-625--SmlkuPXMKOlMrYSYoprtw-1; Sat, 13 Jan 2024 09:22:52 -0500 X-MC-Unique: -SmlkuPXMKOlMrYSYoprtw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 730263C11C8E; Sat, 13 Jan 2024 14:22:51 +0000 (UTC) Received: from localhost.localdomain (unknown [10.39.192.58]) by smtp.corp.redhat.com (Postfix) with ESMTP id 41CD43C25; Sat, 13 Jan 2024 14:22:49 +0000 (UTC) To: libcamera-devel@lists.libcamera.org, Andrey Konovalov Date: Sat, 13 Jan 2024 15:22:10 +0100 Message-ID: <20240113142218.28063-11-hdegoede@redhat.com> In-Reply-To: <20240113142218.28063-1-hdegoede@redhat.com> References: <20240113142218.28063-1-hdegoede@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Subject: [libcamera-devel] [PATCH v2 10/18] libcamera: software_isp: Add DebayerCpu class X-BeenThere: libcamera-devel@lists.libcamera.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hans de Goede via libcamera-devel From: Hans de Goede Reply-To: Hans de Goede Cc: Maxime Ripard , g.martti@gmail.com, t.langendam@gmail.com, srinivas.kandagatla@linaro.org, Pavel Machek , Bryan O'Donoghue , admin@dennisbonke.com Errors-To: libcamera-devel-bounces@lists.libcamera.org Sender: "libcamera-devel" Add CPU based debayering implementation. This initial implementation only supports debayering packed 10 bits per pixel bayer data in the 4 standard bayer orders. Doxygen documentation by Dennis Bonke. Co-authored-by: Dennis Bonke Signed-off-by: Dennis Bonke Co-authored-by: Andrey Konovalov Signed-off-by: Andrey Konovalov Co-authored-by: Pavel Machek Signed-off-by: Pavel Machek Signed-off-by: Hans de Goede Tested-by: Bryan O'Donoghue # sc8280xp Lenovo x13s Tested-by: Pavel Machek --- .../internal/software_isp/debayer_cpu.h | 131 +++++ .../internal/software_isp/meson.build | 1 + src/libcamera/software_isp/debayer_cpu.cpp | 528 ++++++++++++++++++ src/libcamera/software_isp/meson.build | 1 + 4 files changed, 661 insertions(+) create mode 100644 include/libcamera/internal/software_isp/debayer_cpu.h create mode 100644 src/libcamera/software_isp/debayer_cpu.cpp diff --git a/include/libcamera/internal/software_isp/debayer_cpu.h b/include/libcamera/internal/software_isp/debayer_cpu.h new file mode 100644 index 00000000..78573f44 --- /dev/null +++ b/include/libcamera/internal/software_isp/debayer_cpu.h @@ -0,0 +1,131 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ +/* + * Copyright (C) 2023, Linaro Ltd + * Copyright (C) 2023, Red Hat Inc. + * + * Authors: + * Hans de Goede + * + * debayer_cpu.h - CPU based debayering header + */ + +#pragma once + +#include +#include +#include + +#include + +#include "libcamera/internal/software_isp/swstats_cpu.h" +#include "libcamera/internal/software_isp/debayer.h" + +namespace libcamera { + +/** + * \class DebayerCpu + * \brief Class for debayering on the CPU + * + * Implementation for CPU based debayering + */ +class DebayerCpu : public Debayer, public Object +{ +public: + /* + * FIXME this should be a plain (implementation independent) SwStats + * this can be fixed once getStats() is dropped. + */ + /** + * \brief Constructs a DebayerCpu object. + * \param[in] stats Pointer to the stats object to use. + */ + DebayerCpu(std::unique_ptr stats); + ~DebayerCpu(); + + /* + * Setup the Debayer object according to the passed in parameters. + * Return 0 on success, a negative errno value on failure + * (unsupported parameters). + */ + int configure(const StreamConfiguration &inputCfg, + const std::vector> &outputCfgs); + + /* + * Get width and height at which the bayer-pattern repeats. + * Return pattern-size or an empty Size for an unsupported inputFormat. + */ + Size patternSize(PixelFormat inputFormat); + + std::vector formats(PixelFormat input); + std::tuple + strideAndFrameSize(const PixelFormat &outputFormat, const Size &size); + + void process(FrameBuffer *input, FrameBuffer *output, DebayerParams params); + + /** + * \brief Get the file descriptor for the statistics. + * + * \return the file descriptor pointing to the statistics. + */ + const SharedFD &getStatsFD() { return stats_->getStatsFD(); } + + /** + * \brief Get the output frame size. + * + * \return The output frame size. + */ + unsigned int frameSize() { return outputConfig_.frameSize; } +private: + void initLinePointers(const uint8_t *linePointers[], const uint8_t *src); + void shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src); + void process2(const uint8_t *src, uint8_t *dst); + void process4(const uint8_t *src, uint8_t *dst); + /* CSI-2 packed 10-bit raw bayer format (all the 4 orders) */ + void debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]); + void debayer10P_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]); + void debayer10P_GBGB_BGR888(uint8_t *dst, const uint8_t *src[]); + void debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t *src[]); + + typedef void (DebayerCpu::*debayerFn)(uint8_t *dst, const uint8_t *src[]); + + struct DebayerInputConfig { + Size patternSize; + unsigned int bpp; /* Memory used per pixel, not precision */ + unsigned int stride; + std::vector outputFormats; + }; + + struct DebayerOutputConfig { + unsigned int bpp; /* Memory used per pixel, not precision */ + unsigned int stride; + unsigned int frameSize; + }; + + int getInputConfig(PixelFormat inputFormat, DebayerInputConfig &config); + int getOutputConfig(PixelFormat outputFormat, DebayerOutputConfig &config); + int setDebayerFunctions(PixelFormat inputFormat, PixelFormat outputFormat); + + uint8_t gamma_[1024]; + uint8_t red_[256]; + uint8_t green_[256]; + uint8_t blue_[256]; + debayerFn debayer0_; + debayerFn debayer1_; + debayerFn debayer2_; + debayerFn debayer3_; + Rectangle window_; + DebayerInputConfig inputConfig_; + DebayerOutputConfig outputConfig_; + std::unique_ptr stats_; + uint8_t *lineBuffers_[5]; + unsigned int lineBufferIndex_; + bool enableInputMemcpy_; + float gamma_correction_; + int measuredFrames_; + int64_t frameProcessTime_; + /* Skip 30 frames for things to stabilize then measure 30 frames */ + static const int framesToSkip = 30; + static const int framesToMeasure = 60; +}; + +} /* namespace libcamera */ diff --git a/include/libcamera/internal/software_isp/meson.build b/include/libcamera/internal/software_isp/meson.build index 7e40925e..b5a0d737 100644 --- a/include/libcamera/internal/software_isp/meson.build +++ b/include/libcamera/internal/software_isp/meson.build @@ -2,6 +2,7 @@ libcamera_internal_headers += files([ 'debayer.h', + 'debayer_cpu.h', 'debayer_params.h', 'swisp_stats.h', 'swstats.h', diff --git a/src/libcamera/software_isp/debayer_cpu.cpp b/src/libcamera/software_isp/debayer_cpu.cpp new file mode 100644 index 00000000..e0c3c658 --- /dev/null +++ b/src/libcamera/software_isp/debayer_cpu.cpp @@ -0,0 +1,528 @@ +/* SPDX-License-Identifier: LGPL-2.1-or-later */ +/* + * Copyright (C) 2023, Linaro Ltd + * Copyright (C) 2023, Red Hat Inc. + * + * Authors: + * Hans de Goede + * + * debayer_cpu.cpp - CPU based debayering class + */ + +#include "libcamera/internal/software_isp/debayer_cpu.h" + +#include +#include +#include + +#include + +#include "libcamera/internal/bayer_format.h" +#include "libcamera/internal/framebuffer.h" +#include "libcamera/internal/mapped_framebuffer.h" + +namespace libcamera { + +DebayerCpu::DebayerCpu(std::unique_ptr stats) + : stats_(std::move(stats)), gamma_correction_(1.0) +{ +#ifdef __x86_64__ + enableInputMemcpy_ = false; +#else + enableInputMemcpy_ = true; +#endif + /* Initialize gamma to 1.0 curve */ + for (int i = 0; i < 1024; i++) + gamma_[i] = i / 4; + + for (int i = 0; i < 5; i++) + lineBuffers_[i] = NULL; +} + +DebayerCpu::~DebayerCpu() +{ + for (int i = 0; i < 5; i++) + free(lineBuffers_[i]); +} + +// RGR +// GBG +// RGR +#define BGGR_BGR888(p, n, div) \ + *dst++ = blue_[curr[x] / (div)]; \ + *dst++ = green_[(prev[x] + curr[x - p] + curr[x + n] + next[x]) / (4 * (div))]; \ + *dst++ = red_[(prev[x - p] + prev[x + n] + next[x - p] + next[x + n]) / (4 * (div))]; \ + x++; + +// GBG +// RGR +// GBG +#define GRBG_BGR888(p, n, div) \ + *dst++ = blue_[(prev[x] + next[x]) / (2 * (div))]; \ + *dst++ = green_[curr[x] / (div)]; \ + *dst++ = red_[(curr[x - p] + curr[x + n]) / (2 * (div))]; \ + x++; + +// GRG +// BGB +// GRG +#define GBRG_BGR888(p, n, div) \ + *dst++ = blue_[(curr[x - p] + curr[x + n]) / (2 * (div))]; \ + *dst++ = green_[curr[x] / (div)]; \ + *dst++ = red_[(prev[x] + next[x]) / (2 * (div))]; \ + x++; + +// BGB +// GRG +// BGB +#define RGGB_BGR888(p, n, div) \ + *dst++ = blue_[(prev[x - p] + prev[x + n] + next[x - p] + next[x + n]) / (4 * (div))]; \ + *dst++ = green_[(prev[x] + curr[x - p] + curr[x + n] + next[x]) / (4 * (div))]; \ + *dst++ = red_[curr[x] / (div)]; \ + x++; + +void DebayerCpu::debayer10P_BGBG_BGR888(uint8_t *dst, const uint8_t *src[]) +{ + const int width_in_bytes = window_.width * 5 / 4; + const uint8_t *prev = (const uint8_t *)src[0]; + const uint8_t *curr = (const uint8_t *)src[1]; + const uint8_t *next = (const uint8_t *)src[2]; + + /* + * For the first pixel getting a pixel from the previous column uses + * x - 2 to skip the 5th byte with least-significant bits for 4 pixels. + * Same for last pixel (uses x + 2) and looking at the next column. + * x++ in the for-loop skips the 5th byte with 4 x 2 lsb-s for 10bit packed. + */ + for (int x = 0; x < width_in_bytes; x++) { + /* Even pixel */ + BGGR_BGR888(2, 1, 1) + /* Odd pixel BGGR -> GBRG */ + GBRG_BGR888(1, 1, 1) + /* Same thing for next 2 pixels */ + BGGR_BGR888(1, 1, 1) + GBRG_BGR888(1, 2, 1) + } +} + +void DebayerCpu::debayer10P_GRGR_BGR888(uint8_t *dst, const uint8_t *src[]) +{ + const int width_in_bytes = window_.width * 5 / 4; + const uint8_t *prev = (const uint8_t *)src[0]; + const uint8_t *curr = (const uint8_t *)src[1]; + const uint8_t *next = (const uint8_t *)src[2]; + + for (int x = 0; x < width_in_bytes; x++) { + /* Even pixel */ + GRBG_BGR888(2, 1, 1) + /* Odd pixel GRBG -> RGGB */ + RGGB_BGR888(1, 1, 1) + /* Same thing for next 2 pixels */ + GRBG_BGR888(1, 1, 1) + RGGB_BGR888(1, 2, 1) + } +} + +void DebayerCpu::debayer10P_GBGB_BGR888(uint8_t *dst, const uint8_t *src[]) +{ + const int width_in_bytes = window_.width * 5 / 4; + const uint8_t *prev = (const uint8_t *)src[0]; + const uint8_t *curr = (const uint8_t *)src[1]; + const uint8_t *next = (const uint8_t *)src[2]; + + for (int x = 0; x < width_in_bytes; x++) { + /* Even pixel */ + GBRG_BGR888(2, 1, 1) + /* Odd pixel GBGR -> BGGR */ + BGGR_BGR888(1, 1, 1) + /* Same thing for next 2 pixels */ + GBRG_BGR888(1, 1, 1) + BGGR_BGR888(1, 2, 1) + } +} + +void DebayerCpu::debayer10P_RGRG_BGR888(uint8_t *dst, const uint8_t *src[]) +{ + const int width_in_bytes = window_.width * 5 / 4; + const uint8_t *prev = (const uint8_t *)src[0]; + const uint8_t *curr = (const uint8_t *)src[1]; + const uint8_t *next = (const uint8_t *)src[2]; + + for (int x = 0; x < width_in_bytes; x++) { + /* Even pixel */ + RGGB_BGR888(2, 1, 1) + /* Odd pixel RGGB -> GRBG*/ + GRBG_BGR888(1, 1, 1) + /* Same thing for next 2 pixels */ + RGGB_BGR888(1, 1, 1) + GRBG_BGR888(1, 2, 1) + } +} + +static bool isStandardBayerOrder(BayerFormat::Order order) +{ + return order == BayerFormat::BGGR || order == BayerFormat::GBRG || + order == BayerFormat::GRBG || order == BayerFormat::RGGB; +} + +int DebayerCpu::getInputConfig(PixelFormat inputFormat, DebayerInputConfig &config) +{ + BayerFormat bayerFormat = + BayerFormat::fromPixelFormat(inputFormat); + + if (bayerFormat.bitDepth == 10 && + bayerFormat.packing == BayerFormat::Packing::CSI2 && + isStandardBayerOrder(bayerFormat.order)) { + config.bpp = 10; + config.patternSize.width = 4; /* 5 bytes per *4* pixels */ + config.patternSize.height = 2; + config.outputFormats = std::vector({ formats::RGB888 }); + return 0; + } + + LOG(Debayer, Info) + << "Unsupported input format " << inputFormat.toString(); + return -EINVAL; +} + +int DebayerCpu::getOutputConfig(PixelFormat outputFormat, DebayerOutputConfig &config) +{ + if (outputFormat == formats::RGB888) { + config.bpp = 24; + return 0; + } + + LOG(Debayer, Info) + << "Unsupported output format " << outputFormat.toString(); + return -EINVAL; +} + +/* TODO: this ignores outputFormat since there is only 1 supported outputFormat for now */ +int DebayerCpu::setDebayerFunctions(PixelFormat inputFormat, [[maybe_unused]] PixelFormat outputFormat) +{ + BayerFormat bayerFormat = + BayerFormat::fromPixelFormat(inputFormat); + + if (bayerFormat.bitDepth == 10 && + bayerFormat.packing == BayerFormat::Packing::CSI2) { + switch (bayerFormat.order) { + case BayerFormat::BGGR: + debayer0_ = &DebayerCpu::debayer10P_BGBG_BGR888; + debayer1_ = &DebayerCpu::debayer10P_GRGR_BGR888; + return 0; + case BayerFormat::GBRG: + debayer0_ = &DebayerCpu::debayer10P_GBGB_BGR888; + debayer1_ = &DebayerCpu::debayer10P_RGRG_BGR888; + return 0; + case BayerFormat::GRBG: + debayer0_ = &DebayerCpu::debayer10P_GRGR_BGR888; + debayer1_ = &DebayerCpu::debayer10P_BGBG_BGR888; + return 0; + case BayerFormat::RGGB: + debayer0_ = &DebayerCpu::debayer10P_RGRG_BGR888; + debayer1_ = &DebayerCpu::debayer10P_GBGB_BGR888; + return 0; + default: + break; + } + } + + LOG(Debayer, Error) << "Unsupported input output format combination"; + return -EINVAL; +} + +int DebayerCpu::configure(const StreamConfiguration &inputCfg, + const std::vector> &outputCfgs) +{ + if (getInputConfig(inputCfg.pixelFormat, inputConfig_) != 0) + return -EINVAL; + + if (stats_->configure(inputCfg) != 0) + return -EINVAL; + + const Size &stats_pattern_size = stats_->patternSize(); + if (inputConfig_.patternSize.width != stats_pattern_size.width || + inputConfig_.patternSize.height != stats_pattern_size.height) { + LOG(Debayer, Error) + << "mismatching stats and debayer pattern sizes for " + << inputCfg.pixelFormat.toString(); + return -EINVAL; + } + + inputConfig_.stride = inputCfg.stride; + + if (outputCfgs.size() != 1) { + LOG(Debayer, Error) + << "Unsupported number of output streams: " + << outputCfgs.size(); + return -EINVAL; + } + + const StreamConfiguration &outputCfg = outputCfgs[0]; + SizeRange outSizeRange = sizes(inputCfg.pixelFormat, inputCfg.size); + std::tie(outputConfig_.stride, outputConfig_.frameSize) = + strideAndFrameSize(outputCfg.pixelFormat, outputCfg.size); + + if (!outSizeRange.contains(outputCfg.size) || outputConfig_.stride != outputCfg.stride) { + LOG(Debayer, Error) + << "Invalid output size/stride: " + << "\n " << outputCfg.size << " (" << outSizeRange << ")" + << "\n " << outputCfg.stride << " (" << outputConfig_.stride << ")"; + return -EINVAL; + } + + if (setDebayerFunctions(inputCfg.pixelFormat, outputCfg.pixelFormat) != 0) + return -EINVAL; + + window_.x = ((inputCfg.size.width - outputCfg.size.width) / 2) & + ~(inputConfig_.patternSize.width - 1); + window_.y = ((inputCfg.size.height - outputCfg.size.height) / 2) & + ~(inputConfig_.patternSize.height - 1); + window_.width = outputCfg.size.width; + window_.height = outputCfg.size.height; + + /* Don't pass x,y since process() already adjusts src before passing it */ + stats_->setWindow(Rectangle(window_.size())); + + for (unsigned int i = 0; + i < (inputConfig_.patternSize.height + 1) && enableInputMemcpy_; + i++) { + /* pad with patternSize.Width on both left and right side */ + size_t lineLength = (window_.width + 2 * inputConfig_.patternSize.width) * + inputConfig_.bpp / 8; + + free(lineBuffers_[i]); + lineBuffers_[i] = (uint8_t *)malloc(lineLength); + if (!lineBuffers_[i]) + return -ENOMEM; + } + + measuredFrames_ = 0; + frameProcessTime_ = 0; + + return 0; +} + +Size DebayerCpu::patternSize(PixelFormat inputFormat) +{ + DebayerCpu::DebayerInputConfig config; + + if (getInputConfig(inputFormat, config) != 0) + return {}; + + return config.patternSize; +} + +std::vector DebayerCpu::formats(PixelFormat inputFormat) +{ + DebayerCpu::DebayerInputConfig config; + + if (getInputConfig(inputFormat, config) != 0) + return std::vector(); + + return config.outputFormats; +} + +std::tuple +DebayerCpu::strideAndFrameSize(const PixelFormat &outputFormat, const Size &size) +{ + DebayerCpu::DebayerOutputConfig config; + + if (getOutputConfig(outputFormat, config) != 0) + return std::make_tuple(0, 0); + + /* round up to multiple of 8 for 64 bits alignment */ + unsigned int stride = (size.width * config.bpp / 8 + 7) & ~7; + + return std::make_tuple(stride, stride * size.height); +} + +void DebayerCpu::initLinePointers(const uint8_t *linePointers[], const uint8_t *src) +{ + const int patternHeight = inputConfig_.patternSize.height; + + for (int i = 0; i < patternHeight; i++) + linePointers[i + 1] = src + + (-patternHeight / 2 + i) * (int)inputConfig_.stride; + + if (!enableInputMemcpy_) + return; + + for (int i = 0; i < patternHeight; i++) { + /* pad with patternSize.Width on both left and right side */ + size_t lineLength = (window_.width + 2 * inputConfig_.patternSize.width) * + inputConfig_.bpp / 8; + int padding = inputConfig_.patternSize.width * inputConfig_.bpp / 8; + + memcpy(lineBuffers_[i], linePointers[i + 1] - padding, lineLength); + linePointers[i + 1] = lineBuffers_[i] + padding; + } + + /* Point lineBufferIndex_ to first unused lineBuffer */ + lineBufferIndex_ = patternHeight; +} + +void DebayerCpu::shiftLinePointers(const uint8_t *linePointers[], const uint8_t *src) +{ + const int patternHeight = inputConfig_.patternSize.height; + + for (int i = 0; i < patternHeight; i++) + linePointers[i] = linePointers[i + 1]; + + linePointers[patternHeight] = src + + (patternHeight / 2) * (int)inputConfig_.stride; + + if (!enableInputMemcpy_) + return; + + size_t lineLength = (window_.width + 2 * inputConfig_.patternSize.width) * + inputConfig_.bpp / 8; + int padding = inputConfig_.patternSize.width * inputConfig_.bpp / 8; + memcpy(lineBuffers_[lineBufferIndex_], linePointers[patternHeight] - padding, lineLength); + linePointers[patternHeight] = lineBuffers_[lineBufferIndex_] + padding; + + lineBufferIndex_ = (lineBufferIndex_ + 1) % (patternHeight + 1); +} + +void DebayerCpu::process2(const uint8_t *src, uint8_t *dst) +{ + const unsigned int y_end = window_.y + window_.height; + const uint8_t *linePointers[3]; + + /* Adjust src to top left corner of the window */ + src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; + + initLinePointers(linePointers, src); + + for (unsigned int y = window_.y; y < y_end; y += 2) { + shiftLinePointers(linePointers, src); + stats_->processLine0(y, linePointers); + (this->*debayer0_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + + shiftLinePointers(linePointers, src); + (this->*debayer1_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + } +} + +void DebayerCpu::process4(const uint8_t *src, uint8_t *dst) +{ + const unsigned int y_end = window_.y + window_.height; + const uint8_t *linePointers[5]; + + /* Adjust src to top left corner of the window */ + src += window_.y * inputConfig_.stride + window_.x * inputConfig_.bpp / 8; + + initLinePointers(linePointers, src); + + for (unsigned int y = window_.y; y < y_end; y += 4) { + shiftLinePointers(linePointers, src); + stats_->processLine0(y, linePointers); + (this->*debayer0_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + + shiftLinePointers(linePointers, src); + (this->*debayer1_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + + shiftLinePointers(linePointers, src); + stats_->processLine2(y, linePointers); + (this->*debayer2_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + + shiftLinePointers(linePointers, src); + (this->*debayer3_)(dst, linePointers); + src += inputConfig_.stride; + dst += outputConfig_.stride; + } +} + +static inline int64_t timeDiff(timespec &after, timespec &before) +{ + return (after.tv_sec - before.tv_sec) * 1000000000LL + + (int64_t)after.tv_nsec - (int64_t)before.tv_nsec; +} + +void DebayerCpu::process(FrameBuffer *input, FrameBuffer *output, DebayerParams params) +{ + timespec frameStartTime; + + if (measuredFrames_ < DebayerCpu::framesToMeasure) { + frameStartTime = {}; + clock_gettime(CLOCK_MONOTONIC_RAW, &frameStartTime); + } + + /* Apply DebayerParams */ + if (params.gamma != gamma_correction_) { + for (int i = 0; i < 1024; i++) + gamma_[i] = 255 * powf(i / 1023.0, params.gamma); + + gamma_correction_ = params.gamma; + } + + for (int i = 0; i < 256; i++) { + int idx; + + /* Apply gamma after gain! */ + idx = std::min({ i * params.gainR / 64U, 1023U }); + red_[i] = gamma_[idx]; + + idx = std::min({ i * params.gainG / 64U, 1023U }); + green_[i] = gamma_[idx]; + + idx = std::min({ i * params.gainB / 64U, 1023U }); + blue_[i] = gamma_[idx]; + } + + /* Copy metadata from the input buffer */ + FrameMetadata &metadata = output->_d()->metadata(); + metadata.status = input->metadata().status; + metadata.sequence = input->metadata().sequence; + metadata.timestamp = input->metadata().timestamp; + + MappedFrameBuffer in(input, MappedFrameBuffer::MapFlag::Read); + MappedFrameBuffer out(output, MappedFrameBuffer::MapFlag::Write); + if (!in.isValid() || !out.isValid()) { + LOG(Debayer, Error) << "mmap-ing buffer(s) failed"; + metadata.status = FrameMetadata::FrameError; + return; + } + + stats_->startFrame(); + + if (inputConfig_.patternSize.height == 2) + process2(in.planes()[0].data(), out.planes()[0].data()); + else + process4(in.planes()[0].data(), out.planes()[0].data()); + + metadata.planes()[0].bytesused = out.planes()[0].size(); + + /* Measure before emitting signals */ + if (measuredFrames_ < DebayerCpu::framesToMeasure && + ++measuredFrames_ > DebayerCpu::framesToSkip) { + timespec frameEndTime = {}; + clock_gettime(CLOCK_MONOTONIC_RAW, &frameEndTime); + frameProcessTime_ += timeDiff(frameEndTime, frameStartTime); + if (measuredFrames_ == DebayerCpu::framesToMeasure) { + const int measuredFrames = DebayerCpu::framesToMeasure - + DebayerCpu::framesToSkip; + LOG(Debayer, Info) + << "Processed " << measuredFrames + << " frames in " << frameProcessTime_ / 1000 << "us, " + << frameProcessTime_ / (1000 * measuredFrames) + << " us/frame"; + } + } + + stats_->finishFrame(); + outputBufferReady.emit(output); + inputBufferReady.emit(input); +} + +} /* namespace libcamera */ diff --git a/src/libcamera/software_isp/meson.build b/src/libcamera/software_isp/meson.build index d4ae5ac7..6d7a44d7 100644 --- a/src/libcamera/software_isp/meson.build +++ b/src/libcamera/software_isp/meson.build @@ -2,6 +2,7 @@ libcamera_sources += files([ 'debayer.cpp', + 'debayer_cpu.cpp', 'swstats.cpp', 'swstats_cpu.cpp', ])