[libcamera-devel,v2,1/2] libcamera: ipa_module: add IPA shared library loader

Message ID 20190514223808.27351-1-paul.elder@ideasonboard.com
State Superseded
Headers show
Series
  • [libcamera-devel,v2,1/2] libcamera: ipa_module: add IPA shared library loader
Related show

Commit Message

Paul Elder May 14, 2019, 10:38 p.m. UTC
Implement a class to load a struct IPAModuleInfo of a given symbol name
from a .so shared object library.

Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
---
Changes in v2:
- renamed LibLoader class to IPAModule
- added documentation
- added logging
- check that bitness of the shared object is the same as libcamera
- moved symbol loading ("init") to the constructor, and added isValid()
- added elfPointer() to prevent segfaults when reading data from mmap
- moved struct IPAModuleInfo out of IPAModule
- rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
  const reference
- added munmap after the mmap

 src/libcamera/include/ipa_module.h |  43 +++++
 src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
 src/libcamera/meson.build          |   2 +
 3 files changed, 322 insertions(+)
 create mode 100644 src/libcamera/include/ipa_module.h
 create mode 100644 src/libcamera/ipa_module.cpp

Comments

Kieran Bingham May 15, 2019, 9:28 a.m. UTC | #1
Hi Paul,

Thank you for your patch, This is looking good and it's going in the
right direction but I have a few design comments ... feel free to
disagree though ...


On 14/05/2019 23:38, Paul Elder wrote:
> Implement a class to load a struct IPAModuleInfo of a given symbol name
> from a .so shared object library.
> 
> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> ---
> Changes in v2:
> - renamed LibLoader class to IPAModule
> - added documentation
> - added logging
> - check that bitness of the shared object is the same as libcamera
> - moved symbol loading ("init") to the constructor, and added isValid()
> - added elfPointer() to prevent segfaults when reading data from mmap
> - moved struct IPAModuleInfo out of IPAModule
> - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
>   const reference
> - added munmap after the mmap
> 
>  src/libcamera/include/ipa_module.h |  43 +++++
>  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
>  src/libcamera/meson.build          |   2 +
>  3 files changed, 322 insertions(+)
>  create mode 100644 src/libcamera/include/ipa_module.h
>  create mode 100644 src/libcamera/ipa_module.cpp
> 
> diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
> new file mode 100644
> index 0000000..9eb0094
> --- /dev/null
> +++ b/src/libcamera/include/ipa_module.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> +/*
> + * Copyright (C) 2019, Google Inc.
> + *
> + * ipa_module.h - load IPA module information from shared library at runtime

Hrm ... we have two sides here.

We need a public header which defines the interface between an IPA
module and libcamera. That would include a struct IPAModuleInfo and any
registration details, but should not include any internal libcamera
private details regarding how the module is loaded.


> + */
> +#ifndef __LIBCAMERA_IPA_MODULE_H__
> +#define __LIBCAMERA_IPA_MODULE_H__
> +
> +#include <string>
> +
> +namespace libcamera {
> +
> +struct IPAModuleInfo {
> +	char name[256];
> +	unsigned int version;
> +};
> +

So IPModuleInfo (and then later, the class definition for how a
developer would construct an IPA Module) should live in the public
headers at:

 /include/libcamera/ipa_module.h


> +class IPAModule
> +{
> +public:
> +	explicit IPAModule(const std::string &libPath, const std::string &symbol);
> +
> +	bool isValid() const;
> +
> +	struct IPAModuleInfo const &IPAModuleInfo() const;
> +
> +private:
> +	struct IPAModuleInfo info_;
> +
> +	bool loaded_;
> +	int bitClass_;
> +
> +	int loadELFIdent(int fd);
> +	template<class ElfHeader, class SecHeader, class SymHeader>
> +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
> +		       const char *symbol);
> +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
> +};
> +
> +} /* namespace libcamera */
> +
> +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
> diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
> new file mode 100644
> index 0000000..5ca16e8
> --- /dev/null
> +++ b/src/libcamera/ipa_module.cpp
> @@ -0,0 +1,277 @@
> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> +/*
> + * Copyright (C) 2019, Google Inc.
> + *
> + * ipa_module.cpp - load IPA module information from shared library at runtime
> + */
> +
> +#include "ipa_module.h"
> +
> +#include <elf.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <string.h>
> +#include <sys/mman.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +
> +#include <iostream>
> +
> +#include "log.h"
> +
> +/**
> + * \file ipa_module.h
> + * \brief IPA module information loading
> + */
> +
> +namespace libcamera {
> +
> +/**
> + * \struct IPAModuleInfo
> + * \brief Information of an IPA plugin
> + *
> + * This structure contains the information of an IPA plugin. It is loaded,
> + * read, and validated before anything else is loaded from the plugin.
> + *
> + * \var IPAModuleInfo::name
> + * \brief The name of the IPA plugin
> + *
> + * \var IPAModuleInfo::version
> + * \brief The version of the IPA plugin

I don't think we need to store the 'version' of the plugin, so much as
the version of the API it was compiled against to ensure that it is
compatible with this 'running' instance of libcamera.

I.e. We don't care that it's the 3rd iteration of the rk3399 module -
but we do care that it is compatible with the API defined in Libcamera 1.0.

Maybe we should add compatible strings to match against pipeline
handlers as some point too :-) <I'm not even sure if I'm joking now>

> + */
> +
> +LOG_DEFINE_CATEGORY(IPAModule)
> +
> +template<typename T>
> +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> +						 size_t fileSize)
> +{
> +	size_t size = offset + sizeof(T);
> +	if (size > fileSize || size < sizeof(T))
> +		return nullptr;
> +
> +	return reinterpret_cast<typename std::remove_extent<T>::type *>
> +		(static_cast<char *>(map) + offset);
> +}
> +
> +/**
> + * \class IPAModule
> + * \brief Load IPA module information from an IPA plugin
> + */
> +
> +/**
> + * \brief Construct an IPAModule instance
> + * \param[in] libPath path to IPA plugin
> + * \param[in] symbol name of IPAModuleInfo to load from libPath
> + *
> + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
> + * The IPA plugin shared object file must be of the same endianness and
> + * bitness as libcamera.
> + *
> + * Note that isValid() should be called to check if this loading was successful
> + * or not.

Hrm ... this is a slightly different design pattern than the rest of our
objects which are always constructable, and then actions later can fail.

I don't mind - but I wonder if we should add some documentation
regarding our design patterns somewhere.

(not an action on this patch)

> + */
> +
> +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
> +	: loaded_(false)

So actually, from my text below - what I'm really saying is that I don't
think you should provide a &symbol to this IPAModule constructor :)

> +{
> +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());


I see that you have moved the load into this constructor based on
comments from Laurent in v1. I think that's fine, but I think merging
the loading of the object, and the parsing of the symbol might be
combining too much. But it depends (as ever).


For efficiency, the constructor can load the file once and when created
and .isValid() is true, then we know we can successfully parse symbols.


But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
which symbol name it will expect for an IPAModule. Think of it like a
'main()' entry point for our IPAModule. You would always expect a C
binary to have a main() symbol...

So I would expect a libcamera IPA module to have a registration
something like:


/* Register our module with Libcamera */
const struct IPAModuleInfo ipaModuleInfo = {
	.name = "Image Processing for the RK3399",
	.version = 1, /* Perhaps this should be apiversion to prevent
                       * people thinking it describes the version of the
                       * module ? ...
                       */
};

This could even then be turned into a macro:

#define LIBCAMERA_IPA_MODULE(__NAME) \
const struct IPAModuleInfo ipaModuleInfo = { \
	.name = __NAME,	\
	.apiversion = 1, \
}

so that a module rk3399_ipa.cpp could define:

LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");

(apiversion would be hardcoded by the macro because it defines what
version of the API/libraries it's compiled against..., and we check it's
compatible at runtime)



> +	if (ret < 0) {
> +		loaded_ = false;
> +		return;
> +	}
> +
> +	loaded_ = true;
> +}
> +
> +int IPAModule::loadELFIdent(int fd)
> +{
> +	int ret = lseek(fd, 0, SEEK_SET);
> +	if (ret < 0)
> +		return -errno;
> +
> +	unsigned char e_ident[EI_NIDENT];
> +	ret = read(fd, e_ident, EI_NIDENT);
> +	if (ret < 0)
> +		return -errno;
> +
> +	if (e_ident[EI_MAG0] != ELFMAG0 ||
> +	    e_ident[EI_MAG1] != ELFMAG1 ||
> +	    e_ident[EI_MAG2] != ELFMAG2 ||
> +	    e_ident[EI_MAG3] != ELFMAG3 ||
> +	    e_ident[EI_VERSION] != EV_CURRENT)
> +		return -ENOEXEC;
> +
> +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
> +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
> +		bitClass_ = e_ident[EI_CLASS];
> +	else
> +		return -ENOEXEC;
> +
> +	int a = 1;
> +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
> +				 ? ELFDATA2LSB : ELFDATA2MSB;
> +	if (e_ident[EI_DATA] != endianness)
> +		return -ENOEXEC;
> +
> +	return 0;
> +}
> +
> +template<class ElfHeader, class SecHeader, class SymHeader>
> +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
> +			  const char *symbol)
> +{> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
> +	if (!eHdr)
> +		return -ENOEXEC;
> +
> +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
> +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
> +	if (!sHdr)
> +		return -ENOEXEC;
> +	off_t shnameoff = sHdr->sh_offset;
> +
> +	/* Locate .dynsym section header. */
> +	bool found = false;
> +	SecHeader *dynsym;
> +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
> +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
> +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
> +		if (!sHdr)
> +			return -ENOEXEC;
> +
> +		offset = shnameoff + sHdr->sh_name;
> +		char *name = elfPointer<char>(map, offset, soSize);
> +		if (!name)
> +			return -ENOEXEC;
> +
> +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
> +			found = true;
> +			dynsym = sHdr;
> +			break;
> +		}
> +	}
> +
> +	if (!found) {
> +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
> +		return -ENOEXEC;
> +	}
> +
> +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> +	if (!sHdr)
> +		return -ENOEXEC;
> +	off_t dynsym_nameoff = sHdr->sh_offset;
> +
> +	/* Locate symbol in the .dynsym section. */
> +	found = false;
> +	SymHeader *target_symbol;
> +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
> +	for (unsigned int i = 0; i < dynsym_num; i++) {
> +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
> +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
> +		if (!sym)
> +			return -ENOEXEC;
> +
> +		offset = dynsym_nameoff + sym->st_name;
> +		char *name = elfPointer<char>(map, offset, soSize);
> +		if (!name)
> +			return -ENOEXEC;
> +
> +		if (!strcmp(name, symbol) &&
> +		    sym->st_info & STB_GLOBAL &&
> +		    sym->st_size == sizeof(struct IPAModuleInfo)) {

I think this should be a check on the size_t size passed in?

> +			found = true;
> +			target_symbol = sym;
> +			break;
> +		}
> +	}
> +
> +	if (!found) {
> +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";

This function is called loadSymbol, it doesn't 'know' that it's loading
an IPAModuleInfo symbol.

I'd change this error to "Symbol " << symbol << " not found";

> +		return -ENOEXEC;
> +	}
> +
> +	/* Locate and return data of symbol. */
> +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> +	if (!sHdr)
> +		return -ENOEXEC;
> +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
> +	char *data = elfPointer<char>(map, offset, soSize);
> +	if (!data)
> +		return -ENOEXEC;
> +	memcpy(dst, data, size);

Oh - interesting, you're copying the symbol out. Once it's mmapped ...
why not parse it directly?

Can we do all of the 'work' we need to do on the file, and then close it
at the destructor?

I guess actually this might then keep the whole file mapped in memory
for longer which might not be desirable...



> +
> +	return 0;
> +}
> +
> +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
> +{
> +	int fd = open(libPath, O_RDONLY);
> +	if (fd < 0) {
> +		LOG(IPAModule, Error) << "Failed to open IPA library: "
> +				      << strerror(errno);
> +		return fd;
> +	}
> +
> +	int ret = loadELFIdent(fd);
> +	if (ret)
> +		goto close;
> +
> +	size_t soSize;
> +	char *map;
> +	struct stat st;
> +	ret = fstat(fd, &st);
> +	if (ret < 0)
> +		goto close;
> +	soSize = st.st_size;
> +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> +				       MAP_PRIVATE, fd, 0));

*iff* the constructor did the open, and the symbol parsing was
separated, then I think soSize and map would go to class privates. Then
they could be accessed directly by loadSymbol too. But that's an only if...

Essentially, it would be

IPAModule("SO-path")
 - Constructor opens the file, performs stat, and mmap.
	 (or those become an open() call)
 - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
with the ipaModule info structure and checking or such.




> +	if (map == MAP_FAILED) {
> +		ret = -errno;
> +		goto close;
> +	}
> +
> +	if (bitClass_ == ELFCLASS32)
> +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> +				(&info_, sizeof(info_), map, soSize, symbol);
> +	else if (bitClass_ == ELFCLASS64)
> +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> +				(&info_, sizeof(info_), map, soSize, symbol);
> +	if (ret)
> +		goto unmap;
> +
> +unmap:
> +	munmap(map, soSize);
> +close:
> +	close(fd);
> +	return ret;
> +}
> +
> +/**
> + * \brief Check if construction of the IPAModule instance succeeded
> + *
> + * \return true if the constructor succeeded with no errors, false otherwise
> + */
> +
> +bool IPAModule::isValid() const
> +{
> +	return loaded_;
> +}
> +
> +/**
> + * \brief Get the loaded IPAModuleInfo
> + *
> + * Check isValid() before calling this.
> + *
> + * \return IPAModuleInfo
> + */
> +
> +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
> +{
> +	return info_;
> +}
> +
> +} /* namespace libcamera */
> diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
> index 8796f49..e5b48f2 100644
> --- a/src/libcamera/meson.build
> +++ b/src/libcamera/meson.build
> @@ -10,6 +10,7 @@ libcamera_sources = files([
>      'event_notifier.cpp',
>      'formats.cpp',
>      'geometry.cpp',
> +    'ipa_module.cpp',
>      'log.cpp',
>      'media_device.cpp',
>      'media_object.cpp',
> @@ -31,6 +32,7 @@ libcamera_headers = files([
>      'include/device_enumerator_udev.h',
>      'include/event_dispatcher_poll.h',
>      'include/formats.h',
> +    'include/ipa_module.h',
>      'include/log.h',
>      'include/media_device.h',
>      'include/media_object.h',
>
Paul Elder May 15, 2019, 3:02 p.m. UTC | #2
Hi Kieran,

On Wed, May 15, 2019 at 10:28:46AM +0100, Kieran Bingham wrote:
> Hi Paul,
> 
> Thank you for your patch, This is looking good and it's going in the
> right direction but I have a few design comments ... feel free to
> disagree though ...

Thank you for the review.

> On 14/05/2019 23:38, Paul Elder wrote:
> > Implement a class to load a struct IPAModuleInfo of a given symbol name
> > from a .so shared object library.
> > 
> > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > ---
> > Changes in v2:
> > - renamed LibLoader class to IPAModule
> > - added documentation
> > - added logging
> > - check that bitness of the shared object is the same as libcamera
> > - moved symbol loading ("init") to the constructor, and added isValid()
> > - added elfPointer() to prevent segfaults when reading data from mmap
> > - moved struct IPAModuleInfo out of IPAModule
> > - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
> >   const reference
> > - added munmap after the mmap
> > 
> >  src/libcamera/include/ipa_module.h |  43 +++++
> >  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
> >  src/libcamera/meson.build          |   2 +
> >  3 files changed, 322 insertions(+)
> >  create mode 100644 src/libcamera/include/ipa_module.h
> >  create mode 100644 src/libcamera/ipa_module.cpp
> > 
> > diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
> > new file mode 100644
> > index 0000000..9eb0094
> > --- /dev/null
> > +++ b/src/libcamera/include/ipa_module.h
> > @@ -0,0 +1,43 @@
> > +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> > +/*
> > + * Copyright (C) 2019, Google Inc.
> > + *
> > + * ipa_module.h - load IPA module information from shared library at runtime
> 
> Hrm ... we have two sides here.
> 
> We need a public header which defines the interface between an IPA
> module and libcamera. That would include a struct IPAModuleInfo and any
> registration details, but should not include any internal libcamera
> private details regarding how the module is loaded.

Yes, I agree.

> > + */
> > +#ifndef __LIBCAMERA_IPA_MODULE_H__
> > +#define __LIBCAMERA_IPA_MODULE_H__
> > +
> > +#include <string>
> > +
> > +namespace libcamera {
> > +
> > +struct IPAModuleInfo {
> > +	char name[256];
> > +	unsigned int version;
> > +};
> > +
> 
> So IPModuleInfo (and then later, the class definition for how a
> developer would construct an IPA Module) should live in the public
> headers at:
> 
>  /include/libcamera/ipa_module.h

Yeah.

Then, where should class IPAModule go?

> > +class IPAModule
> > +{
> > +public:
> > +	explicit IPAModule(const std::string &libPath, const std::string &symbol);
> > +
> > +	bool isValid() const;
> > +
> > +	struct IPAModuleInfo const &IPAModuleInfo() const;
> > +
> > +private:
> > +	struct IPAModuleInfo info_;
> > +
> > +	bool loaded_;
> > +	int bitClass_;
> > +
> > +	int loadELFIdent(int fd);
> > +	template<class ElfHeader, class SecHeader, class SymHeader>
> > +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
> > +		       const char *symbol);
> > +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
> > +};
> > +
> > +} /* namespace libcamera */
> > +
> > +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
> > diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
> > new file mode 100644
> > index 0000000..5ca16e8
> > --- /dev/null
> > +++ b/src/libcamera/ipa_module.cpp
> > @@ -0,0 +1,277 @@
> > +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> > +/*
> > + * Copyright (C) 2019, Google Inc.
> > + *
> > + * ipa_module.cpp - load IPA module information from shared library at runtime
> > + */
> > +
> > +#include "ipa_module.h"
> > +
> > +#include <elf.h>
> > +#include <errno.h>
> > +#include <fcntl.h>
> > +#include <string.h>
> > +#include <sys/mman.h>
> > +#include <sys/stat.h>
> > +#include <sys/types.h>
> > +#include <unistd.h>
> > +
> > +#include <iostream>
> > +
> > +#include "log.h"
> > +
> > +/**
> > + * \file ipa_module.h
> > + * \brief IPA module information loading
> > + */
> > +
> > +namespace libcamera {
> > +
> > +/**
> > + * \struct IPAModuleInfo
> > + * \brief Information of an IPA plugin
> > + *
> > + * This structure contains the information of an IPA plugin. It is loaded,
> > + * read, and validated before anything else is loaded from the plugin.
> > + *
> > + * \var IPAModuleInfo::name
> > + * \brief The name of the IPA plugin
> > + *
> > + * \var IPAModuleInfo::version
> > + * \brief The version of the IPA plugin
> 
> I don't think we need to store the 'version' of the plugin, so much as
> the version of the API it was compiled against to ensure that it is
> compatible with this 'running' instance of libcamera.
> 
> I.e. We don't care that it's the 3rd iteration of the rk3399 module -
> but we do care that it is compatible with the API defined in Libcamera 1.0.

Yeah, good point; I agree.

How would we get the API version of libcamera?

> Maybe we should add compatible strings to match against pipeline
> handlers as some point too :-) <I'm not even sure if I'm joking now>
> 
> > + */
> > +
> > +LOG_DEFINE_CATEGORY(IPAModule)
> > +
> > +template<typename T>
> > +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> > +						 size_t fileSize)
> > +{
> > +	size_t size = offset + sizeof(T);
> > +	if (size > fileSize || size < sizeof(T))
> > +		return nullptr;
> > +
> > +	return reinterpret_cast<typename std::remove_extent<T>::type *>
> > +		(static_cast<char *>(map) + offset);
> > +}
> > +
> > +/**
> > + * \class IPAModule
> > + * \brief Load IPA module information from an IPA plugin
> > + */
> > +
> > +/**
> > + * \brief Construct an IPAModule instance
> > + * \param[in] libPath path to IPA plugin
> > + * \param[in] symbol name of IPAModuleInfo to load from libPath
> > + *
> > + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
> > + * The IPA plugin shared object file must be of the same endianness and
> > + * bitness as libcamera.
> > + *
> > + * Note that isValid() should be called to check if this loading was successful
> > + * or not.
> 
> Hrm ... this is a slightly different design pattern than the rest of our
> objects which are always constructable, and then actions later can fail.
> 
> I don't mind - but I wonder if we should add some documentation
> regarding our design patterns somewhere.
> 
> (not an action on this patch)
> 
> > + */
> > +
> > +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
> > +	: loaded_(false)
> 
> So actually, from my text below - what I'm really saying is that I don't
> think you should provide a &symbol to this IPAModule constructor :)
> 
> > +{
> > +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
> 
> 
> I see that you have moved the load into this constructor based on
> comments from Laurent in v1. I think that's fine, but I think merging
> the loading of the object, and the parsing of the symbol might be
> combining too much. But it depends (as ever).
> 
> 
> For efficiency, the constructor can load the file once and when created
> and .isValid() is true, then we know we can successfully parse symbols.
> 
> 
> But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
> which symbol name it will expect for an IPAModule. Think of it like a
> 'main()' entry point for our IPAModule. You would always expect a C
> binary to have a main() symbol...

Yeah, I think that's a good idea.

> So I would expect a libcamera IPA module to have a registration
> something like:
> 
> 
> /* Register our module with Libcamera */
> const struct IPAModuleInfo ipaModuleInfo = {
> 	.name = "Image Processing for the RK3399",
> 	.version = 1, /* Perhaps this should be apiversion to prevent
>                        * people thinking it describes the version of the
>                        * module ? ...
>                        */
> };

Should the info entry point be called ipaModuleInfo?

> This could even then be turned into a macro:
> 
> #define LIBCAMERA_IPA_MODULE(__NAME) \
> const struct IPAModuleInfo ipaModuleInfo = { \
> 	.name = __NAME,	\
> 	.apiversion = 1, \
> }
> 
> so that a module rk3399_ipa.cpp could define:
> 
> LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");
> 
> (apiversion would be hardcoded by the macro because it defines what
> version of the API/libraries it's compiled against..., and we check it's
> compatible at runtime)
> 
> 
> 
> > +	if (ret < 0) {
> > +		loaded_ = false;
> > +		return;
> > +	}
> > +
> > +	loaded_ = true;
> > +}
> > +
> > +int IPAModule::loadELFIdent(int fd)
> > +{
> > +	int ret = lseek(fd, 0, SEEK_SET);
> > +	if (ret < 0)
> > +		return -errno;
> > +
> > +	unsigned char e_ident[EI_NIDENT];
> > +	ret = read(fd, e_ident, EI_NIDENT);
> > +	if (ret < 0)
> > +		return -errno;
> > +
> > +	if (e_ident[EI_MAG0] != ELFMAG0 ||
> > +	    e_ident[EI_MAG1] != ELFMAG1 ||
> > +	    e_ident[EI_MAG2] != ELFMAG2 ||
> > +	    e_ident[EI_MAG3] != ELFMAG3 ||
> > +	    e_ident[EI_VERSION] != EV_CURRENT)
> > +		return -ENOEXEC;
> > +
> > +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
> > +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
> > +		bitClass_ = e_ident[EI_CLASS];
> > +	else
> > +		return -ENOEXEC;
> > +
> > +	int a = 1;
> > +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
> > +				 ? ELFDATA2LSB : ELFDATA2MSB;
> > +	if (e_ident[EI_DATA] != endianness)
> > +		return -ENOEXEC;
> > +
> > +	return 0;
> > +}
> > +
> > +template<class ElfHeader, class SecHeader, class SymHeader>
> > +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
> > +			  const char *symbol)
> > +{> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
> > +	if (!eHdr)
> > +		return -ENOEXEC;
> > +
> > +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
> > +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > +	if (!sHdr)
> > +		return -ENOEXEC;
> > +	off_t shnameoff = sHdr->sh_offset;
> > +
> > +	/* Locate .dynsym section header. */
> > +	bool found = false;
> > +	SecHeader *dynsym;
> > +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
> > +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
> > +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > +		if (!sHdr)
> > +			return -ENOEXEC;
> > +
> > +		offset = shnameoff + sHdr->sh_name;
> > +		char *name = elfPointer<char>(map, offset, soSize);
> > +		if (!name)
> > +			return -ENOEXEC;
> > +
> > +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
> > +			found = true;
> > +			dynsym = sHdr;
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!found) {
> > +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
> > +		return -ENOEXEC;
> > +	}
> > +
> > +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
> > +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > +	if (!sHdr)
> > +		return -ENOEXEC;
> > +	off_t dynsym_nameoff = sHdr->sh_offset;
> > +
> > +	/* Locate symbol in the .dynsym section. */
> > +	found = false;
> > +	SymHeader *target_symbol;
> > +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
> > +	for (unsigned int i = 0; i < dynsym_num; i++) {
> > +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
> > +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
> > +		if (!sym)
> > +			return -ENOEXEC;
> > +
> > +		offset = dynsym_nameoff + sym->st_name;
> > +		char *name = elfPointer<char>(map, offset, soSize);
> > +		if (!name)
> > +			return -ENOEXEC;
> > +
> > +		if (!strcmp(name, symbol) &&
> > +		    sym->st_info & STB_GLOBAL &&
> > +		    sym->st_size == sizeof(struct IPAModuleInfo)) {
> 
> I think this should be a check on the size_t size passed in?

Ah yeah, I totally missed that.

> > +			found = true;
> > +			target_symbol = sym;
> > +			break;
> > +		}
> > +	}
> > +
> > +	if (!found) {
> > +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
> 
> This function is called loadSymbol, it doesn't 'know' that it's loading
> an IPAModuleInfo symbol.
> 
> I'd change this error to "Symbol " << symbol << " not found";

And this.

> > +		return -ENOEXEC;
> > +	}
> > +
> > +	/* Locate and return data of symbol. */
> > +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
> > +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > +	if (!sHdr)
> > +		return -ENOEXEC;
> > +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
> > +	char *data = elfPointer<char>(map, offset, soSize);
> > +	if (!data)
> > +		return -ENOEXEC;
> > +	memcpy(dst, data, size);
> 
> Oh - interesting, you're copying the symbol out. Once it's mmapped ...
> why not parse it directly?

I'm not sure what you mean by "parse"?

> Can we do all of the 'work' we need to do on the file, and then close it
> at the destructor?

I don't want to keep the file open/mapped over multiple method calls.
That's why in my first version there was a constructor that did nothing
and an init() that did everything (which got moved to the constructor in
this version).

> I guess actually this might then keep the whole file mapped in memory
> for longer which might not be desirable...
> 
> 
> 
> > +
> > +	return 0;
> > +}
> > +
> > +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
> > +{
> > +	int fd = open(libPath, O_RDONLY);
> > +	if (fd < 0) {
> > +		LOG(IPAModule, Error) << "Failed to open IPA library: "
> > +				      << strerror(errno);
> > +		return fd;
> > +	}
> > +
> > +	int ret = loadELFIdent(fd);
> > +	if (ret)
> > +		goto close;
> > +
> > +	size_t soSize;
> > +	char *map;
> > +	struct stat st;
> > +	ret = fstat(fd, &st);
> > +	if (ret < 0)
> > +		goto close;
> > +	soSize = st.st_size;
> > +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> > +				       MAP_PRIVATE, fd, 0));
> 
> *iff* the constructor did the open, and the symbol parsing was
> separated, then I think soSize and map would go to class privates. Then
> they could be accessed directly by loadSymbol too. But that's an only if...
> 
> Essentially, it would be
> 
> IPAModule("SO-path")
>  - Constructor opens the file, performs stat, and mmap.
> 	 (or those become an open() call)
>  - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
> with the ipaModule info structure and checking or such.

If loadIPAModuleInfo() is to be called from within the constructor
(after open, stat, and mmap), then I agree. I'm not sure map and soSize
need to be privates though. It could go either way.

> 
> > +	if (map == MAP_FAILED) {
> > +		ret = -errno;
> > +		goto close;
> > +	}
> > +
> > +	if (bitClass_ == ELFCLASS32)
> > +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> > +				(&info_, sizeof(info_), map, soSize, symbol);
> > +	else if (bitClass_ == ELFCLASS64)
> > +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> > +				(&info_, sizeof(info_), map, soSize, symbol);
> > +	if (ret)
> > +		goto unmap;
> > +
> > +unmap:
> > +	munmap(map, soSize);
> > +close:
> > +	close(fd);
> > +	return ret;
> > +}
> > +
> > +/**
> > + * \brief Check if construction of the IPAModule instance succeeded
> > + *
> > + * \return true if the constructor succeeded with no errors, false otherwise
> > + */
> > +
> > +bool IPAModule::isValid() const
> > +{
> > +	return loaded_;
> > +}
> > +
> > +/**
> > + * \brief Get the loaded IPAModuleInfo
> > + *
> > + * Check isValid() before calling this.
> > + *
> > + * \return IPAModuleInfo
> > + */
> > +
> > +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
> > +{
> > +	return info_;
> > +}
> > +
> > +} /* namespace libcamera */
> > diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
> > index 8796f49..e5b48f2 100644
> > --- a/src/libcamera/meson.build
> > +++ b/src/libcamera/meson.build
> > @@ -10,6 +10,7 @@ libcamera_sources = files([
> >      'event_notifier.cpp',
> >      'formats.cpp',
> >      'geometry.cpp',
> > +    'ipa_module.cpp',
> >      'log.cpp',
> >      'media_device.cpp',
> >      'media_object.cpp',
> > @@ -31,6 +32,7 @@ libcamera_headers = files([
> >      'include/device_enumerator_udev.h',
> >      'include/event_dispatcher_poll.h',
> >      'include/formats.h',
> > +    'include/ipa_module.h',
> >      'include/log.h',
> >      'include/media_device.h',
> >      'include/media_object.h',
> > 


Thanks,

Paul
Kieran Bingham May 15, 2019, 3:26 p.m. UTC | #3
Hi Paul,

On 15/05/2019 16:02, Paul Elder wrote:
> Hi Kieran,
> 
> On Wed, May 15, 2019 at 10:28:46AM +0100, Kieran Bingham wrote:
>> Hi Paul,
>>
>> Thank you for your patch, This is looking good and it's going in the
>> right direction but I have a few design comments ... feel free to
>> disagree though ...
> 
> Thank you for the review.
> 
>> On 14/05/2019 23:38, Paul Elder wrote:
>>> Implement a class to load a struct IPAModuleInfo of a given symbol name
>>> from a .so shared object library.
>>>
>>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
>>> ---
>>> Changes in v2:
>>> - renamed LibLoader class to IPAModule
>>> - added documentation
>>> - added logging
>>> - check that bitness of the shared object is the same as libcamera
>>> - moved symbol loading ("init") to the constructor, and added isValid()
>>> - added elfPointer() to prevent segfaults when reading data from mmap
>>> - moved struct IPAModuleInfo out of IPAModule
>>> - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
>>>   const reference
>>> - added munmap after the mmap
>>>
>>>  src/libcamera/include/ipa_module.h |  43 +++++
>>>  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
>>>  src/libcamera/meson.build          |   2 +
>>>  3 files changed, 322 insertions(+)
>>>  create mode 100644 src/libcamera/include/ipa_module.h
>>>  create mode 100644 src/libcamera/ipa_module.cpp
>>>
>>> diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
>>> new file mode 100644
>>> index 0000000..9eb0094
>>> --- /dev/null
>>> +++ b/src/libcamera/include/ipa_module.h
>>> @@ -0,0 +1,43 @@
>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
>>> +/*
>>> + * Copyright (C) 2019, Google Inc.
>>> + *
>>> + * ipa_module.h - load IPA module information from shared library at runtime
>>
>> Hrm ... we have two sides here.
>>
>> We need a public header which defines the interface between an IPA
>> module and libcamera. That would include a struct IPAModuleInfo and any
>> registration details, but should not include any internal libcamera
>> private details regarding how the module is loaded.
> 
> Yes, I agree.
> 
>>> + */
>>> +#ifndef __LIBCAMERA_IPA_MODULE_H__
>>> +#define __LIBCAMERA_IPA_MODULE_H__
>>> +
>>> +#include <string>
>>> +
>>> +namespace libcamera {
>>> +
>>> +struct IPAModuleInfo {
>>> +	char name[256];
>>> +	unsigned int version;
>>> +};
>>> +
>>
>> So IPModuleInfo (and then later, the class definition for how a
>> developer would construct an IPA Module) should live in the public
>> headers at:
>>
>>  /include/libcamera/ipa_module.h
> 
> Yeah.
> 
> Then, where should class IPAModule go?

Stays here in this file I think :-)

> 
>>> +class IPAModule

I'm wondering if we have two sides if this class should be calls
IPAModule though.

Surely an IPAModule would implement a class called  IPAModule, and
perhaps 'this' class is an IPAModuleParser? IPAModuleLoader?

Probably a Loader :D

>>> +{
>>> +public:
>>> +	explicit IPAModule(const std::string &libPath, const std::string &symbol);
>>> +
>>> +	bool isValid() const;
>>> +
>>> +	struct IPAModuleInfo const &IPAModuleInfo() const;
>>> +
>>> +private:
>>> +	struct IPAModuleInfo info_;
>>> +
>>> +	bool loaded_;
>>> +	int bitClass_;
>>> +
>>> +	int loadELFIdent(int fd);
>>> +	template<class ElfHeader, class SecHeader, class SymHeader>
>>> +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
>>> +		       const char *symbol);
>>> +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
>>> +};
>>> +
>>> +} /* namespace libcamera */
>>> +
>>> +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
>>> diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
>>> new file mode 100644
>>> index 0000000..5ca16e8
>>> --- /dev/null
>>> +++ b/src/libcamera/ipa_module.cpp
>>> @@ -0,0 +1,277 @@
>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
>>> +/*
>>> + * Copyright (C) 2019, Google Inc.
>>> + *
>>> + * ipa_module.cpp - load IPA module information from shared library at runtime
>>> + */
>>> +
>>> +#include "ipa_module.h"
>>> +
>>> +#include <elf.h>
>>> +#include <errno.h>
>>> +#include <fcntl.h>
>>> +#include <string.h>
>>> +#include <sys/mman.h>
>>> +#include <sys/stat.h>
>>> +#include <sys/types.h>
>>> +#include <unistd.h>
>>> +
>>> +#include <iostream>
>>> +
>>> +#include "log.h"
>>> +
>>> +/**
>>> + * \file ipa_module.h
>>> + * \brief IPA module information loading
>>> + */
>>> +
>>> +namespace libcamera {
>>> +
>>> +/**
>>> + * \struct IPAModuleInfo
>>> + * \brief Information of an IPA plugin
>>> + *
>>> + * This structure contains the information of an IPA plugin. It is loaded,
>>> + * read, and validated before anything else is loaded from the plugin.
>>> + *
>>> + * \var IPAModuleInfo::name
>>> + * \brief The name of the IPA plugin
>>> + *
>>> + * \var IPAModuleInfo::version
>>> + * \brief The version of the IPA plugin
>>
>> I don't think we need to store the 'version' of the plugin, so much as
>> the version of the API it was compiled against to ensure that it is
>> compatible with this 'running' instance of libcamera.
>>
>> I.e. We don't care that it's the 3rd iteration of the rk3399 module -
>> but we do care that it is compatible with the API defined in Libcamera 1.0.
> 
> Yeah, good point; I agree.
> 
> How would we get the API version of libcamera?

You get to define it :-)

To start with it's probably just:

  #define LIBCAMERA_IPA_API_VERSION 1

This would be a 'local' IPAModule API version number or such I think, as
our IPA modules must go through a defined interface...

Any time an ABI break occurs in the IPAModule 'API' we would have to
bump up the apiversion.



>> Maybe we should add compatible strings to match against pipeline
>> handlers as some point too :-) <I'm not even sure if I'm joking now>
>>
>>> + */
>>> +
>>> +LOG_DEFINE_CATEGORY(IPAModule)
>>> +
>>> +template<typename T>
>>> +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
>>> +						 size_t fileSize)
>>> +{
>>> +	size_t size = offset + sizeof(T);
>>> +	if (size > fileSize || size < sizeof(T))
>>> +		return nullptr;
>>> +
>>> +	return reinterpret_cast<typename std::remove_extent<T>::type *>
>>> +		(static_cast<char *>(map) + offset);
>>> +}
>>> +
>>> +/**
>>> + * \class IPAModule
>>> + * \brief Load IPA module information from an IPA plugin
>>> + */
>>> +
>>> +/**
>>> + * \brief Construct an IPAModule instance
>>> + * \param[in] libPath path to IPA plugin
>>> + * \param[in] symbol name of IPAModuleInfo to load from libPath
>>> + *
>>> + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
>>> + * The IPA plugin shared object file must be of the same endianness and
>>> + * bitness as libcamera.
>>> + *
>>> + * Note that isValid() should be called to check if this loading was successful
>>> + * or not.
>>
>> Hrm ... this is a slightly different design pattern than the rest of our
>> objects which are always constructable, and then actions later can fail.
>>
>> I don't mind - but I wonder if we should add some documentation
>> regarding our design patterns somewhere.
>>
>> (not an action on this patch)
>>
>>> + */
>>> +
>>> +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
>>> +	: loaded_(false)
>>
>> So actually, from my text below - what I'm really saying is that I don't
>> think you should provide a &symbol to this IPAModule constructor :)
>>
>>> +{
>>> +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
>>
>>
>> I see that you have moved the load into this constructor based on
>> comments from Laurent in v1. I think that's fine, but I think merging
>> the loading of the object, and the parsing of the symbol might be
>> combining too much. But it depends (as ever).
>>
>>
>> For efficiency, the constructor can load the file once and when created
>> and .isValid() is true, then we know we can successfully parse symbols.
>>
>>
>> But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
>> which symbol name it will expect for an IPAModule. Think of it like a
>> 'main()' entry point for our IPAModule. You would always expect a C
>> binary to have a main() symbol...
> 
> Yeah, I think that's a good idea.
> 
>> So I would expect a libcamera IPA module to have a registration
>> something like:
>>
>>
>> /* Register our module with Libcamera */
>> const struct IPAModuleInfo ipaModuleInfo = {
>> 	.name = "Image Processing for the RK3399",
>> 	.version = 1, /* Perhaps this should be apiversion to prevent
>>                        * people thinking it describes the version of the
>>                        * module ? ...
>>                        */
>> };
> 
> Should the info entry point be called ipaModuleInfo?

I think variables are in camelCase with a lowerCase first letter in the
project right? or was this not quite your question?


> 
>> This could even then be turned into a macro:
>>
>> #define LIBCAMERA_IPA_MODULE(__NAME) \
>> const struct IPAModuleInfo ipaModuleInfo = { \
>> 	.name = __NAME,	\
>> 	.apiversion = 1, \
>> }
>>
>> so that a module rk3399_ipa.cpp could define:
>>
>> LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");
>>
>> (apiversion would be hardcoded by the macro because it defines what
>> version of the API/libraries it's compiled against..., and we check it's
>> compatible at runtime)
>>
>>
>>
>>> +	if (ret < 0) {
>>> +		loaded_ = false;
>>> +		return;
>>> +	}
>>> +
>>> +	loaded_ = true;
>>> +}
>>> +
>>> +int IPAModule::loadELFIdent(int fd)
>>> +{
>>> +	int ret = lseek(fd, 0, SEEK_SET);
>>> +	if (ret < 0)
>>> +		return -errno;
>>> +
>>> +	unsigned char e_ident[EI_NIDENT];
>>> +	ret = read(fd, e_ident, EI_NIDENT);
>>> +	if (ret < 0)
>>> +		return -errno;
>>> +
>>> +	if (e_ident[EI_MAG0] != ELFMAG0 ||
>>> +	    e_ident[EI_MAG1] != ELFMAG1 ||
>>> +	    e_ident[EI_MAG2] != ELFMAG2 ||
>>> +	    e_ident[EI_MAG3] != ELFMAG3 ||
>>> +	    e_ident[EI_VERSION] != EV_CURRENT)
>>> +		return -ENOEXEC;
>>> +
>>> +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
>>> +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
>>> +		bitClass_ = e_ident[EI_CLASS];
>>> +	else
>>> +		return -ENOEXEC;
>>> +
>>> +	int a = 1;
>>> +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
>>> +				 ? ELFDATA2LSB : ELFDATA2MSB;
>>> +	if (e_ident[EI_DATA] != endianness)
>>> +		return -ENOEXEC;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +template<class ElfHeader, class SecHeader, class SymHeader>
>>> +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
>>> +			  const char *symbol)
>>> +{> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
>>> +	if (!eHdr)
>>> +		return -ENOEXEC;
>>> +
>>> +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
>>> +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
>>> +	if (!sHdr)
>>> +		return -ENOEXEC;
>>> +	off_t shnameoff = sHdr->sh_offset;
>>> +
>>> +	/* Locate .dynsym section header. */
>>> +	bool found = false;
>>> +	SecHeader *dynsym;
>>> +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
>>> +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
>>> +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
>>> +		if (!sHdr)
>>> +			return -ENOEXEC;
>>> +
>>> +		offset = shnameoff + sHdr->sh_name;
>>> +		char *name = elfPointer<char>(map, offset, soSize);
>>> +		if (!name)
>>> +			return -ENOEXEC;
>>> +
>>> +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
>>> +			found = true;
>>> +			dynsym = sHdr;
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	if (!found) {
>>> +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
>>> +		return -ENOEXEC;
>>> +	}
>>> +
>>> +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
>>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
>>> +	if (!sHdr)
>>> +		return -ENOEXEC;
>>> +	off_t dynsym_nameoff = sHdr->sh_offset;
>>> +
>>> +	/* Locate symbol in the .dynsym section. */
>>> +	found = false;
>>> +	SymHeader *target_symbol;
>>> +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
>>> +	for (unsigned int i = 0; i < dynsym_num; i++) {
>>> +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
>>> +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
>>> +		if (!sym)
>>> +			return -ENOEXEC;
>>> +
>>> +		offset = dynsym_nameoff + sym->st_name;
>>> +		char *name = elfPointer<char>(map, offset, soSize);
>>> +		if (!name)
>>> +			return -ENOEXEC;
>>> +
>>> +		if (!strcmp(name, symbol) &&
>>> +		    sym->st_info & STB_GLOBAL &&
>>> +		    sym->st_size == sizeof(struct IPAModuleInfo)) {
>>
>> I think this should be a check on the size_t size passed in?
> 
> Ah yeah, I totally missed that.

No worries.

> 
>>> +			found = true;
>>> +			target_symbol = sym;
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	if (!found) {
>>> +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
>>
>> This function is called loadSymbol, it doesn't 'know' that it's loading
>> an IPAModuleInfo symbol.
>>
>> I'd change this error to "Symbol " << symbol << " not found";
> 
> And this.
> 
>>> +		return -ENOEXEC;
>>> +	}
>>> +
>>> +	/* Locate and return data of symbol. */
>>> +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
>>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
>>> +	if (!sHdr)
>>> +		return -ENOEXEC;
>>> +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
>>> +	char *data = elfPointer<char>(map, offset, soSize);
>>> +	if (!data)
>>> +		return -ENOEXEC;
>>> +	memcpy(dst, data, size);
>>
>> Oh - interesting, you're copying the symbol out. Once it's mmapped ...
>> why not parse it directly?
> 
> I'm not sure what you mean by "parse"?


I mean once you find the symbol and obtain the pointer to it ('data'
here) then you have a pointer to memory which contains the structure.
You could return that pointer, and cast it to the type. Then you can
read the structure members directly.

But that requires knowing that the symbol doesn't require any further
relocations ... which is fine for our simple info structure.

It just seems logical in my head to have a function called findSymbol()
which returns a pointer to the symbol, and then (if you're going to copy
it) a function called loadStructure(), which finds the symbol, checks
the size then copies it or such. Maybe it's generalising the functions
more than we care if we will only use them for one thing

> 
>> Can we do all of the 'work' we need to do on the file, and then close it
>> at the destructor?
> 
> I don't want to keep the file open/mapped over multiple method calls.
> That's why in my first version there was a constructor that did nothing
> and an init() that did everything (which got moved to the constructor in
> this version).

The only things we'll do on this structure is read it and decide if this
library is OK for loading, perhaps even printing the name string in debug.

Once that has occurred, can't the IPAModuleInfo structure be discarded?
The next thing that we'll do is spawn a container/namespace/safe-thingy
and use the normal dl_open methods to re-open the library there...




> 
>> I guess actually this might then keep the whole file mapped in memory
>> for longer which might not be desirable...
>>
>>
>>
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
>>> +{
>>> +	int fd = open(libPath, O_RDONLY);
>>> +	if (fd < 0) {
>>> +		LOG(IPAModule, Error) << "Failed to open IPA library: "
>>> +				      << strerror(errno);
>>> +		return fd;
>>> +	}
>>> +
>>> +	int ret = loadELFIdent(fd);
>>> +	if (ret)
>>> +		goto close;
>>> +
>>> +	size_t soSize;
>>> +	char *map;
>>> +	struct stat st;
>>> +	ret = fstat(fd, &st);
>>> +	if (ret < 0)
>>> +		goto close;
>>> +	soSize = st.st_size;
>>> +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
>>> +				       MAP_PRIVATE, fd, 0));
>>
>> *iff* the constructor did the open, and the symbol parsing was
>> separated, then I think soSize and map would go to class privates. Then
>> they could be accessed directly by loadSymbol too. But that's an only if...
>>
>> Essentially, it would be
>>
>> IPAModule("SO-path")
>>  - Constructor opens the file, performs stat, and mmap.
>> 	 (or those become an open() call)
>>  - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
>> with the ipaModule info structure and checking or such.
> 
> If loadIPAModuleInfo() is to be called from within the constructor
> (after open, stat, and mmap), then I agree. I'm not sure map and soSize
> need to be privates though. It could go either way.

Well, I leave that all up to you now :-)

> 
>>
>>> +	if (map == MAP_FAILED) {
>>> +		ret = -errno;
>>> +		goto close;
>>> +	}
>>> +
>>> +	if (bitClass_ == ELFCLASS32)
>>> +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
>>> +				(&info_, sizeof(info_), map, soSize, symbol);
>>> +	else if (bitClass_ == ELFCLASS64)
>>> +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
>>> +				(&info_, sizeof(info_), map, soSize, symbol);
>>> +	if (ret)
>>> +		goto unmap;
>>> +
>>> +unmap:
>>> +	munmap(map, soSize);
>>> +close:
>>> +	close(fd);
>>> +	return ret;
>>> +}
>>> +
>>> +/**
>>> + * \brief Check if construction of the IPAModule instance succeeded
>>> + *
>>> + * \return true if the constructor succeeded with no errors, false otherwise
>>> + */
>>> +
>>> +bool IPAModule::isValid() const
>>> +{
>>> +	return loaded_;
>>> +}
>>> +
>>> +/**
>>> + * \brief Get the loaded IPAModuleInfo
>>> + *
>>> + * Check isValid() before calling this.
>>> + *
>>> + * \return IPAModuleInfo
>>> + */
>>> +
>>> +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
>>> +{
>>> +	return info_;
>>> +}
>>> +
>>> +} /* namespace libcamera */
>>> diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
>>> index 8796f49..e5b48f2 100644
>>> --- a/src/libcamera/meson.build
>>> +++ b/src/libcamera/meson.build
>>> @@ -10,6 +10,7 @@ libcamera_sources = files([
>>>      'event_notifier.cpp',
>>>      'formats.cpp',
>>>      'geometry.cpp',
>>> +    'ipa_module.cpp',
>>>      'log.cpp',
>>>      'media_device.cpp',
>>>      'media_object.cpp',
>>> @@ -31,6 +32,7 @@ libcamera_headers = files([
>>>      'include/device_enumerator_udev.h',
>>>      'include/event_dispatcher_poll.h',
>>>      'include/formats.h',
>>> +    'include/ipa_module.h',
>>>      'include/log.h',
>>>      'include/media_device.h',
>>>      'include/media_object.h',
>>>
> 
> 
> Thanks,
> 
> Paul
>
Laurent Pinchart May 15, 2019, 9:19 p.m. UTC | #4
Hello,

On Wed, May 15, 2019 at 04:26:48PM +0100, Kieran Bingham wrote:
> On 15/05/2019 16:02, Paul Elder wrote:
> > On Wed, May 15, 2019 at 10:28:46AM +0100, Kieran Bingham wrote:
> >> Hi Paul,
> >>
> >> Thank you for your patch, This is looking good and it's going in the
> >> right direction but I have a few design comments ... feel free to
> >> disagree though ...
> > 
> > Thank you for the review.
> > 
> >> On 14/05/2019 23:38, Paul Elder wrote:
> >>> Implement a class to load a struct IPAModuleInfo of a given symbol name
> >>> from a .so shared object library.
> >>>
> >>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> >>> ---
> >>> Changes in v2:
> >>> - renamed LibLoader class to IPAModule
> >>> - added documentation
> >>> - added logging
> >>> - check that bitness of the shared object is the same as libcamera
> >>> - moved symbol loading ("init") to the constructor, and added isValid()
> >>> - added elfPointer() to prevent segfaults when reading data from mmap
> >>> - moved struct IPAModuleInfo out of IPAModule
> >>> - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
> >>>   const reference
> >>> - added munmap after the mmap
> >>>
> >>>  src/libcamera/include/ipa_module.h |  43 +++++
> >>>  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
> >>>  src/libcamera/meson.build          |   2 +
> >>>  3 files changed, 322 insertions(+)
> >>>  create mode 100644 src/libcamera/include/ipa_module.h
> >>>  create mode 100644 src/libcamera/ipa_module.cpp
> >>>
> >>> diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
> >>> new file mode 100644
> >>> index 0000000..9eb0094
> >>> --- /dev/null
> >>> +++ b/src/libcamera/include/ipa_module.h
> >>> @@ -0,0 +1,43 @@
> >>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> >>> +/*
> >>> + * Copyright (C) 2019, Google Inc.
> >>> + *
> >>> + * ipa_module.h - load IPA module information from shared library at runtime
> >>
> >> Hrm ... we have two sides here.
> >>
> >> We need a public header which defines the interface between an IPA
> >> module and libcamera. That would include a struct IPAModuleInfo and any
> >> registration details, but should not include any internal libcamera
> >> private details regarding how the module is loaded.
> > 
> > Yes, I agree.
> > 
> >>> + */
> >>> +#ifndef __LIBCAMERA_IPA_MODULE_H__
> >>> +#define __LIBCAMERA_IPA_MODULE_H__
> >>> +
> >>> +#include <string>
> >>> +
> >>> +namespace libcamera {
> >>> +
> >>> +struct IPAModuleInfo {
> >>> +	char name[256];
> >>> +	unsigned int version;
> >>> +};
> >>> +
> >>
> >> So IPModuleInfo (and then later, the class definition for how a
> >> developer would construct an IPA Module) should live in the public
> >> headers at:
> >>
> >>  /include/libcamera/ipa_module.h
> > 
> > Yeah.
> > 
> > Then, where should class IPAModule go?
> 
> Stays here in this file I think :-)

That's what I had envisioned, so it sounds good to me.

> > 
> >>> +class IPAModule
> 
> I'm wondering if we have two sides if this class should be calls
> IPAModule though.
> 
> Surely an IPAModule would implement a class called  IPAModule, and
> perhaps 'this' class is an IPAModuleParser? IPAModuleLoader?
> 
> Probably a Loader :D

I was wondering about this. One option would indeed to provide an
IPAModule class with pure virtual functions that the module would need
to implement. Another option would be to use a structure with C-style
function pointers. I was thinking about the latter, but the former is
not a bad idea. However, we need to take care about backward
compatibility. Can a module compiled with an old version of g++ (or
clang++) be incompatible with libcamera built with a newer version if
using a C++ ABI ?

Ie we go the C++ way, we will indeed need two different names for the
classes. I was thinking about IPAModule for this class and
IPAModuleInterface for the new one, in which case no change would be
needed here. Note that IPAModule would inherit from IPAModuleInterface
as it would provide the IPA API to pipeline handlers. I don't like the
name IPAModuleLoader much, as from a pipeline handler point of view
we're using modules, not module loaders. Other options may be possible.

> >>> +{
> >>> +public:
> >>> +	explicit IPAModule(const std::string &libPath, const std::string &symbol);

No need to pass the symbol name here, you can hardcode it in the
implementation. All modules will use the same symbol name.

> >>> +
> >>> +	bool isValid() const;
> >>> +
> >>> +	struct IPAModuleInfo const &IPAModuleInfo() const;

We usually write this const struct IPAModuleInfo &IPAModuleInfo() const.
And I would name the function info(), as IPAModule::info() makes it
clear that we're retrieving the information of an IPA module.

> >>> +
> >>> +private:
> >>> +	struct IPAModuleInfo info_;
> >>> +
> >>> +	bool loaded_;

Let's name this valid_ to match isValid(). We will later have a load()
function to dlopen() the module, and likely a loaded_ flag to match
that.

> >>> +	int bitClass_;
> >>> +
> >>> +	int loadELFIdent(int fd);
> >>> +	template<class ElfHeader, class SecHeader, class SymHeader>
> >>> +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
> >>> +		       const char *symbol);
> >>> +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
> >>> +};
> >>> +
> >>> +} /* namespace libcamera */
> >>> +
> >>> +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
> >>> diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
> >>> new file mode 100644
> >>> index 0000000..5ca16e8
> >>> --- /dev/null
> >>> +++ b/src/libcamera/ipa_module.cpp
> >>> @@ -0,0 +1,277 @@
> >>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> >>> +/*
> >>> + * Copyright (C) 2019, Google Inc.
> >>> + *
> >>> + * ipa_module.cpp - load IPA module information from shared library at runtime

I think we can already update the description to "Image Processing
Algorithm module" or similar

> >>> + */
> >>> +
> >>> +#include "ipa_module.h"
> >>> +
> >>> +#include <elf.h>
> >>> +#include <errno.h>
> >>> +#include <fcntl.h>
> >>> +#include <string.h>
> >>> +#include <sys/mman.h>
> >>> +#include <sys/stat.h>
> >>> +#include <sys/types.h>
> >>> +#include <unistd.h>
> >>> +
> >>> +#include <iostream>
> >>> +
> >>> +#include "log.h"
> >>> +
> >>> +/**
> >>> + * \file ipa_module.h
> >>> + * \brief IPA module information loading

Same here.

> >>> + */
> >>> +
> >>> +namespace libcamera {
> >>> +
> >>> +/**
> >>> + * \struct IPAModuleInfo
> >>> + * \brief Information of an IPA plugin
> >>> + *
> >>> + * This structure contains the information of an IPA plugin. It is loaded,
> >>> + * read, and validated before anything else is loaded from the plugin.
> >>> + *
> >>> + * \var IPAModuleInfo::name
> >>> + * \brief The name of the IPA plugin
> >>> + *
> >>> + * \var IPAModuleInfo::version
> >>> + * \brief The version of the IPA plugin
> >>
> >> I don't think we need to store the 'version' of the plugin, so much as
> >> the version of the API it was compiled against to ensure that it is
> >> compatible with this 'running' instance of libcamera.
> >>
> >> I.e. We don't care that it's the 3rd iteration of the rk3399 module -
> >> but we do care that it is compatible with the API defined in Libcamera 1.0.
> > 
> > Yeah, good point; I agree.
> > 
> > How would we get the API version of libcamera?
> 
> You get to define it :-)
> 
> To start with it's probably just:
> 
>   #define LIBCAMERA_IPA_API_VERSION 1
> 
> This would be a 'local' IPAModule API version number or such I think, as
> our IPA modules must go through a defined interface...
> 
> Any time an ABI break occurs in the IPAModule 'API' we would have to
> bump up the apiversion.

I agree, but I think we can defer this. The IPAModuleInfo structure is
currently a mock up to start implementing the IPAModule class, I expect
it to change a lot.

> >> Maybe we should add compatible strings to match against pipeline
> >> handlers as some point too :-) <I'm not even sure if I'm joking now>

I think we will need that :-)

Paul, could you record there open items ?

> >>> + */
> >>> +
> >>> +LOG_DEFINE_CATEGORY(IPAModule)
> >>> +
> >>> +template<typename T>
> >>> +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> >>> +						 size_t fileSize)
> >>> +{
> >>> +	size_t size = offset + sizeof(T);
> >>> +	if (size > fileSize || size < sizeof(T))
> >>> +		return nullptr;
> >>> +
> >>> +	return reinterpret_cast<typename std::remove_extent<T>::type *>
> >>> +		(static_cast<char *>(map) + offset);
> >>> +}

This function should be static. The C way of doing so is using the
static keyword, the C++ way is to put it in an anonymous namespace.

namespace {

template<typename T>
typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
						 size_t fileSize)
{
...
}

};

> >>> +
> >>> +/**
> >>> + * \class IPAModule
> >>> + * \brief Load IPA module information from an IPA plugin

To match the suggestion above, "IPA module wrapper" ? I think we can do
better than that, maybe "Wrapper around an IPA module shared object" ?
Suggestions are welcome.

> >>> + */
> >>> +
> >>> +/**
> >>> + * \brief Construct an IPAModule instance
> >>> + * \param[in] libPath path to IPA plugin
> >>> + * \param[in] symbol name of IPAModuleInfo to load from libPath
> >>> + *
> >>> + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
> >>> + * The IPA plugin shared object file must be of the same endianness and
> >>> + * bitness as libcamera.
> >>> + *
> >>> + * Note that isValid() should be called to check if this loading was successful
> >>> + * or not.
> >>
> >> Hrm ... this is a slightly different design pattern than the rest of our
> >> objects which are always constructable, and then actions later can fail.
> >>
> >> I don't mind - but I wonder if we should add some documentation
> >> regarding our design patterns somewhere.
> >>
> >> (not an action on this patch)

I've discussed this with Paul, and I think that in this case this
implementation makes sense. There's no need to delay loading, and as the
IPAModule class is a wrapper around a .so, constructing an instance on
which on can call isValid() to check if the IPA is valid makes sense to
me.

Maybe we need to reconsider our other classes indeed. Any
suggestion/preference ?

> >>> + */
> >>> +
> >>> +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
> >>> +	: loaded_(false)
> >>
> >> So actually, from my text below - what I'm really saying is that I don't
> >> think you should provide a &symbol to this IPAModule constructor :)

So we agree :-)

> >>> +{
> >>> +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
> >>
> >> I see that you have moved the load into this constructor based on
> >> comments from Laurent in v1. I think that's fine, but I think merging
> >> the loading of the object, and the parsing of the symbol might be
> >> combining too much. But it depends (as ever).

Note that this only loads the module info, not the module itself (I
foresee a load() function that will dlopen() the module). I'm not sure
what you mean by merging the loading of the object and the parsing of
the symbol.

> >> For efficiency, the constructor can load the file once and when created
> >> and .isValid() is true, then we know we can successfully parse symbols.

But you can't know it's a valid IPA unless you can load the
IPAModuleInfo successfully (and later maybe even cryptographically
verifying the module).

> >> But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
> >> which symbol name it will expect for an IPAModule. Think of it like a
> >> 'main()' entry point for our IPAModule. You would always expect a C
> >> binary to have a main() symbol...
> > 
> > Yeah, I think that's a good idea.
> > 
> >> So I would expect a libcamera IPA module to have a registration
> >> something like:
> >>
> >>
> >> /* Register our module with Libcamera */
> >> const struct IPAModuleInfo ipaModuleInfo = {
> >> 	.name = "Image Processing for the RK3399",
> >> 	.version = 1, /* Perhaps this should be apiversion to prevent
> >>                        * people thinking it describes the version of the
> >>                        * module ? ...
> >>                        */
> >> };
> > 
> > Should the info entry point be called ipaModuleInfo?
> 
> I think variables are in camelCase with a lowerCase first letter in the
> project right? or was this not quite your question?

This sounds good to me.

> >> This could even then be turned into a macro:
> >>
> >> #define LIBCAMERA_IPA_MODULE(__NAME) \
> >> const struct IPAModuleInfo ipaModuleInfo = { \
> >> 	.name = __NAME,	\
> >> 	.apiversion = 1, \
> >> }

We'll have to throw an extern "C" here, otherwise your symbol name will
be _ZL13ipaModuleInfo with g++.

> >>
> >> so that a module rk3399_ipa.cpp could define:
> >>
> >> LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");
> >>
> >> (apiversion would be hardcoded by the macro because it defines what
> >> version of the API/libraries it's compiled against..., and we check it's
> >> compatible at runtime)

Something like that. If the number of fields grows, and if some of them
become complex, we may want to fill the structure manually, be we should
still use a macro to instantiate the right structure with the right name
and the extern "C" keyword.

> >>> +	if (ret < 0) {
> >>> +		loaded_ = false;

This isn't needed, you initialise loaded_ to false above.

> >>> +		return;
> >>> +	}
> >>> +
> >>> +	loaded_ = true;
> >>> +}
> >>> +
> >>> +int IPAModule::loadELFIdent(int fd)

This function doesn't actually load much, how about renaming it to
verifyELFIdent() ?

> >>> +{
> >>> +	int ret = lseek(fd, 0, SEEK_SET);
> >>> +	if (ret < 0)
> >>> +		return -errno;
> >>> +

As we're mmap()ing the whole .so, you could mmap() before calling
loadELFIdent(), and pass the void *map and size to the function, instead
of using read.

> >>> +	unsigned char e_ident[EI_NIDENT];
> >>> +	ret = read(fd, e_ident, EI_NIDENT);
> >>> +	if (ret < 0)
> >>> +		return -errno;
> >>> +
> >>> +	if (e_ident[EI_MAG0] != ELFMAG0 ||
> >>> +	    e_ident[EI_MAG1] != ELFMAG1 ||
> >>> +	    e_ident[EI_MAG2] != ELFMAG2 ||
> >>> +	    e_ident[EI_MAG3] != ELFMAG3 ||
> >>> +	    e_ident[EI_VERSION] != EV_CURRENT)
> >>> +		return -ENOEXEC;
> >>> +
> >>> +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
> >>> +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
> >>> +		bitClass_ = e_ident[EI_CLASS];
> >>> +	else
> >>> +		return -ENOEXEC;

How about

	bitClass_ = sizeof(unsigned long) == 4 ? ELFCLASS32 : ELFCLASS64;
	if (e_ident[EI_CLASS] != bitClass_)
		return -ENOEXEC;

> >>> +
> >>> +	int a = 1;
> >>> +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
> >>> +				 ? ELFDATA2LSB : ELFDATA2MSB;

So you didn't like my union-based solution ? :-)

> >>> +	if (e_ident[EI_DATA] != endianness)
> >>> +		return -ENOEXEC;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +template<class ElfHeader, class SecHeader, class SymHeader>
> >>> +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
> >>> +			  const char *symbol)
> >>> +{
> >>> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
> >>> +	if (!eHdr)
> >>> +		return -ENOEXEC;
> >>> +
> >>> +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
> >>> +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>> +	if (!sHdr)
> >>> +		return -ENOEXEC;
> >>> +	off_t shnameoff = sHdr->sh_offset;
> >>> +
> >>> +	/* Locate .dynsym section header. */
> >>> +	bool found = false;
> >>> +	SecHeader *dynsym;
> >>> +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
> >>> +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
> >>> +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>> +		if (!sHdr)
> >>> +			return -ENOEXEC;
> >>> +
> >>> +		offset = shnameoff + sHdr->sh_name;
> >>> +		char *name = elfPointer<char>(map, offset, soSize);

This should be elfPointer<char[8]> as you want to ensure that at least 8
bytes are available (sizeof(".dynsym"), including the terminating 0).

> >>> +		if (!name)
> >>> +			return -ENOEXEC;
> >>> +
> >>> +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
> >>> +			found = true;
> >>> +			dynsym = sHdr;

You could initialise dynsym to nullptr and remove found.

> >>> +			break;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	if (!found) {
> >>> +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
> >>> +		return -ENOEXEC;
> >>> +	}
> >>> +
> >>> +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
> >>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>> +	if (!sHdr)
> >>> +		return -ENOEXEC;
> >>> +	off_t dynsym_nameoff = sHdr->sh_offset;
> >>> +
> >>> +	/* Locate symbol in the .dynsym section. */
> >>> +	found = false;
> >>> +	SymHeader *target_symbol;

targetSymbol ?

> >>> +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
> >>> +	for (unsigned int i = 0; i < dynsym_num; i++) {
> >>> +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
> >>> +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
> >>> +		if (!sym)
> >>> +			return -ENOEXEC;
> >>> +
> >>> +		offset = dynsym_nameoff + sym->st_name;
> >>> +		char *name = elfPointer<char>(map, offset, soSize);

Same here, you want elfPointer<char[strlen(symbol) + 1]>. I hope this
will compile :-)

> >>> +		if (!name)
> >>> +			return -ENOEXEC;
> >>> +
> >>> +		if (!strcmp(name, symbol) &&
> >>> +		    sym->st_info & STB_GLOBAL &&
> >>> +		    sym->st_size == sizeof(struct IPAModuleInfo)) {
> >>
> >> I think this should be a check on the size_t size passed in?
> > 
> > Ah yeah, I totally missed that.
> 
> No worries.
> 
> >>> +			found = true;
> >>> +			target_symbol = sym;

You could also remove found here.

> >>> +			break;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	if (!found) {
> >>> +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
> >>
> >> This function is called loadSymbol, it doesn't 'know' that it's loading
> >> an IPAModuleInfo symbol.
> >>
> >> I'd change this error to "Symbol " << symbol << " not found";
> > 
> > And this.
> > 
> >>> +		return -ENOEXEC;
> >>> +	}
> >>> +
> >>> +	/* Locate and return data of symbol. */

As a sanity check you may want to verify that target_symbol->st_shndx <
eHdr->e_shnum. Otherwise none of the elfPointer<> checks below may
trigger, and you will load an invalid symbol. This can happen for
instance if the symbol is absolute (SHN_ABS).

> >>> +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
> >>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>> +	if (!sHdr)
> >>> +		return -ENOEXEC;
> >>> +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
> >>> +	char *data = elfPointer<char>(map, offset, soSize);

elfPointer<char[size]>

> >>> +	if (!data)
> >>> +		return -ENOEXEC;
> >>> +	memcpy(dst, data, size);
> >>
> >> Oh - interesting, you're copying the symbol out. Once it's mmapped ...
> >> why not parse it directly?
> > 
> > I'm not sure what you mean by "parse"?
> 
> I mean once you find the symbol and obtain the pointer to it ('data'
> here) then you have a pointer to memory which contains the structure.
> You could return that pointer, and cast it to the type. Then you can
> read the structure members directly.
> 
> But that requires knowing that the symbol doesn't require any further
> relocations ... which is fine for our simple info structure.
> 
> It just seems logical in my head to have a function called findSymbol()
> which returns a pointer to the symbol, and then (if you're going to copy
> it) a function called loadStructure(), which finds the symbol, checks
> the size then copies it or such. Maybe it's generalising the functions
> more than we care if we will only use them for one thing

I have no particular opinion on this. We could indeed rename this
function to findSymbol() and move the copy outside. It may actually be a
good idea, but then this function should return both the address (or
offset) and size (one or both through output arguments).

> >> Can we do all of the 'work' we need to do on the file, and then close it
> >> at the destructor?
> > 
> > I don't want to keep the file open/mapped over multiple method calls.
> > That's why in my first version there was a constructor that did nothing
> > and an init() that did everything (which got moved to the constructor in
> > this version).

I agree, we shouldn't keep it mapped.

> The only things we'll do on this structure is read it and decide if this
> library is OK for loading, perhaps even printing the name string in debug.
> 
> Once that has occurred, can't the IPAModuleInfo structure be discarded?
> The next thing that we'll do is spawn a container/namespace/safe-thingy
> and use the normal dl_open methods to re-open the library there...

No, we need it at least to match the module with pipeline handlers, so
it needs to be copied. I'm sure we'll have other fields that we will
also need to access later.

> >> I guess actually this might then keep the whole file mapped in memory
> >> for longer which might not be desirable...
> >>
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)

I would store libPath in a member variable as we will need it for
dlopen(), and use that member variable in this function.

> >>> +{
> >>> +	int fd = open(libPath, O_RDONLY);
> >>> +	if (fd < 0) {
> >>> +		LOG(IPAModule, Error) << "Failed to open IPA library: "
> >>> +				      << strerror(errno);

You need

		ret = -errno;
		LOG(IPAModule, Error) << "Failed to open IPA library: "
				      << strerror(-ret);
		return ret;

ass errno may be changed by LOG(IPAModule, Error).

> >>> +		return fd;
> >>> +	}
> >>> +
> >>> +	int ret = loadELFIdent(fd);
> >>> +	if (ret)
> >>> +		goto close;
> >>> +
> >>> +	size_t soSize;
> >>> +	char *map;
> >>> +	struct stat st;
> >>> +	ret = fstat(fd, &st);
> >>> +	if (ret < 0)
> >>> +		goto close;
> >>> +	soSize = st.st_size;
> >>> +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> >>> +				       MAP_PRIVATE, fd, 0));

	struct stat st;
	ret = fstat(fd, &st);
	if (ret < 0)
		goto close;

	size_t soSize = st.st_size;
	char *map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
					     MAP_PRIVATE, fd, 0));

I think I would make it a void *map, pass it as a void * to all the
functions, and let elfPointer<> do the case.

> >> *iff* the constructor did the open, and the symbol parsing was
> >> separated, then I think soSize and map would go to class privates. Then
> >> they could be accessed directly by loadSymbol too. But that's an only if...
> >>
> >> Essentially, it would be
> >>
> >> IPAModule("SO-path")
> >>  - Constructor opens the file, performs stat, and mmap.
> >> 	 (or those become an open() call)
> >>  - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
> >> with the ipaModule info structure and checking or such.
> > 
> > If loadIPAModuleInfo() is to be called from within the constructor
> > (after open, stat, and mmap), then I agree. I'm not sure map and soSize
> > need to be privates though. It could go either way.

I thought about it, but it the address and size are really temporary
variables, as we unmap the file after the parsing, so I think it's best
to pass them explicitly to functions.

> Well, I leave that all up to you now :-)
> 
> >>> +	if (map == MAP_FAILED) {
> >>> +		ret = -errno;
> >>> +		goto close;
> >>> +	}
> >>> +
> >>> +	if (bitClass_ == ELFCLASS32)
> >>> +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> >>> +				(&info_, sizeof(info_), map, soSize, symbol);
> >>> +	else if (bitClass_ == ELFCLASS64)
> >>> +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> >>> +				(&info_, sizeof(info_), map, soSize, symbol);

As we don't need to support a module with a different bit class than
libcamera, would we avoid compiling in both version of the function ?
Maybe something like

#if sizeof(long) == 4
	ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
			(&info_, sizeof(info_), map, soSize, symbol);
#else
	ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
			(&info_, sizeof(info_), map, soSize, symbol);
#endif

and given that we only need one version of the function, I even wonder
if we need to keep it defined as a template function, maybe we could
have a

#if sizeof(long) == 4
#define Elf_Ehdr Elf32_Ehdr
#define Elf_Shdr Elf32_Shdr
#define Elf_Sym Elf32_Sym
#else
#define Elf_Ehdr Elf64_Ehdr
#define Elf_Shdr Elf64_Shdr
#define Elf_Sym Elf64_Sym
#endif

just before the function definition. This would however require moving
the function away from the class to a global function, private to this
file, to avoid exposing it in the ipa_module.h header, otherwise we
would have to move the above #if's there, which isn't neat. If we do
this we should move the loadElfIdent function too. Neither function use
any of the class members (except for bitClass_, which isn't needed
anymore), so they're essentially static, and separating them from the
IPAModule class would open the door to sharing them with other classes
if needed later. elfPointer<> is already separate. If you prefer to keep
them as class members, they should be declared as static in the class
definition.

> >>> +	if (ret)
> >>> +		goto unmap;
> >>> +
> >>> +unmap:
> >>> +	munmap(map, soSize);
> >>> +close:
> >>> +	close(fd);
> >>> +	return ret;
> >>> +}
> >>> +
> >>> +/**
> >>> + * \brief Check if construction of the IPAModule instance succeeded

I would say "Check if the IPAModule instance is valid" and then add a
paragraph to detail what a valid (or invalid) IPAModule is.

> >>> + *
> >>> + * \return true if the constructor succeeded with no errors, false otherwise

true if the IPAModule is valid, false otherwise

> >>> + */
> >>> +

No need for this blank line.

> >>> +bool IPAModule::isValid() const
> >>> +{
> >>> +	return loaded_;
> >>> +}
> >>> +
> >>> +/**
> >>> + * \brief Get the loaded IPAModuleInfo

"Get the IPA module information"

> >>> + *
> >>> + * Check isValid() before calling this.

"The content of the IPA module information is loaded from the module,
and is valid only if the module is valid (as returned by isValid()).
Calling this function on an invalid module is an error."

> >>> + *
> >>> + * \return IPAModuleInfo
> >>> + */
> >>> +

No need for a blank line here either

> >>> +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
> >>> +{
> >>> +	return info_;
> >>> +}
> >>> +
> >>> +} /* namespace libcamera */
> >>> diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
> >>> index 8796f49..e5b48f2 100644
> >>> --- a/src/libcamera/meson.build
> >>> +++ b/src/libcamera/meson.build
> >>> @@ -10,6 +10,7 @@ libcamera_sources = files([
> >>>      'event_notifier.cpp',
> >>>      'formats.cpp',
> >>>      'geometry.cpp',
> >>> +    'ipa_module.cpp',
> >>>      'log.cpp',
> >>>      'media_device.cpp',
> >>>      'media_object.cpp',
> >>> @@ -31,6 +32,7 @@ libcamera_headers = files([
> >>>      'include/device_enumerator_udev.h',
> >>>      'include/event_dispatcher_poll.h',
> >>>      'include/formats.h',
> >>> +    'include/ipa_module.h',
> >>>      'include/log.h',
> >>>      'include/media_device.h',
> >>>      'include/media_object.h',
Paul Elder May 16, 2019, 5:51 p.m. UTC | #5
Hi,

On Thu, May 16, 2019 at 12:19:42AM +0300, Laurent Pinchart wrote:
> Hello,
> 
> On Wed, May 15, 2019 at 04:26:48PM +0100, Kieran Bingham wrote:
> > On 15/05/2019 16:02, Paul Elder wrote:
> > > On Wed, May 15, 2019 at 10:28:46AM +0100, Kieran Bingham wrote:
> > >> Hi Paul,
> > >>
> > >> Thank you for your patch, This is looking good and it's going in the
> > >> right direction but I have a few design comments ... feel free to
> > >> disagree though ...
> > > 
> > > Thank you for the review.
> > > 
> > >> On 14/05/2019 23:38, Paul Elder wrote:
> > >>> Implement a class to load a struct IPAModuleInfo of a given symbol name
> > >>> from a .so shared object library.
> > >>>
> > >>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > >>> ---
> > >>> Changes in v2:
> > >>> - renamed LibLoader class to IPAModule
> > >>> - added documentation
> > >>> - added logging
> > >>> - check that bitness of the shared object is the same as libcamera
> > >>> - moved symbol loading ("init") to the constructor, and added isValid()
> > >>> - added elfPointer() to prevent segfaults when reading data from mmap
> > >>> - moved struct IPAModuleInfo out of IPAModule
> > >>> - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
> > >>>   const reference
> > >>> - added munmap after the mmap
> > >>>
> > >>>  src/libcamera/include/ipa_module.h |  43 +++++
> > >>>  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
> > >>>  src/libcamera/meson.build          |   2 +
> > >>>  3 files changed, 322 insertions(+)
> > >>>  create mode 100644 src/libcamera/include/ipa_module.h
> > >>>  create mode 100644 src/libcamera/ipa_module.cpp
> > >>>
> > >>> diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
> > >>> new file mode 100644
> > >>> index 0000000..9eb0094
> > >>> --- /dev/null
> > >>> +++ b/src/libcamera/include/ipa_module.h
> > >>> @@ -0,0 +1,43 @@
> > >>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> > >>> +/*
> > >>> + * Copyright (C) 2019, Google Inc.
> > >>> + *
> > >>> + * ipa_module.h - load IPA module information from shared library at runtime
> > >>
> > >> Hrm ... we have two sides here.
> > >>
> > >> We need a public header which defines the interface between an IPA
> > >> module and libcamera. That would include a struct IPAModuleInfo and any
> > >> registration details, but should not include any internal libcamera
> > >> private details regarding how the module is loaded.
> > > 
> > > Yes, I agree.
> > > 
> > >>> + */
> > >>> +#ifndef __LIBCAMERA_IPA_MODULE_H__
> > >>> +#define __LIBCAMERA_IPA_MODULE_H__
> > >>> +
> > >>> +#include <string>
> > >>> +
> > >>> +namespace libcamera {
> > >>> +
> > >>> +struct IPAModuleInfo {
> > >>> +	char name[256];
> > >>> +	unsigned int version;
> > >>> +};
> > >>> +
> > >>
> > >> So IPModuleInfo (and then later, the class definition for how a
> > >> developer would construct an IPA Module) should live in the public
> > >> headers at:
> > >>
> > >>  /include/libcamera/ipa_module.h
> > > 
> > > Yeah.
> > > 
> > > Then, where should class IPAModule go?
> > 
> > Stays here in this file I think :-)
> 
> That's what I had envisioned, so it sounds good to me.
> 
> > > 
> > >>> +class IPAModule
> > 
> > I'm wondering if we have two sides if this class should be calls
> > IPAModule though.
> > 
> > Surely an IPAModule would implement a class called  IPAModule, and
> > perhaps 'this' class is an IPAModuleParser? IPAModuleLoader?
> > 
> > Probably a Loader :D
> 
> I was wondering about this. One option would indeed to provide an
> IPAModule class with pure virtual functions that the module would need
> to implement. Another option would be to use a structure with C-style
> function pointers. I was thinking about the latter, but the former is
> not a bad idea. However, we need to take care about backward
> compatibility. Can a module compiled with an old version of g++ (or
> clang++) be incompatible with libcamera built with a newer version if
> using a C++ ABI ?
> 
> Ie we go the C++ way, we will indeed need two different names for the
> classes. I was thinking about IPAModule for this class and
> IPAModuleInterface for the new one, in which case no change would be
> needed here. Note that IPAModule would inherit from IPAModuleInterface
> as it would provide the IPA API to pipeline handlers. I don't like the
> name IPAModuleLoader much, as from a pipeline handler point of view
> we're using modules, not module loaders. Other options may be possible.

I don't know about C++ ABI compatability. I think to just avoid the risk
itself we should go with C-style function pointers.

In which case Laurent's shim idea would work well for process isolation
later on. Since the only layers are Pipeline -> IPAModule there isn't
really anywhere else we can break off into another process without
making the Pipeline -> IPAModule interface ugly.

> > >>> +{
> > >>> +public:
> > >>> +	explicit IPAModule(const std::string &libPath, const std::string &symbol);
> 
> No need to pass the symbol name here, you can hardcode it in the
> implementation. All modules will use the same symbol name.
> 
> > >>> +
> > >>> +	bool isValid() const;
> > >>> +
> > >>> +	struct IPAModuleInfo const &IPAModuleInfo() const;
> 
> We usually write this const struct IPAModuleInfo &IPAModuleInfo() const.
> And I would name the function info(), as IPAModule::info() makes it
> clear that we're retrieving the information of an IPA module.
> 
> > >>> +
> > >>> +private:
> > >>> +	struct IPAModuleInfo info_;
> > >>> +
> > >>> +	bool loaded_;
> 
> Let's name this valid_ to match isValid(). We will later have a load()
> function to dlopen() the module, and likely a loaded_ flag to match
> that.
> 
> > >>> +	int bitClass_;
> > >>> +
> > >>> +	int loadELFIdent(int fd);
> > >>> +	template<class ElfHeader, class SecHeader, class SymHeader>
> > >>> +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
> > >>> +		       const char *symbol);
> > >>> +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
> > >>> +};
> > >>> +
> > >>> +} /* namespace libcamera */
> > >>> +
> > >>> +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
> > >>> diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
> > >>> new file mode 100644
> > >>> index 0000000..5ca16e8
> > >>> --- /dev/null
> > >>> +++ b/src/libcamera/ipa_module.cpp
> > >>> @@ -0,0 +1,277 @@
> > >>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> > >>> +/*
> > >>> + * Copyright (C) 2019, Google Inc.
> > >>> + *
> > >>> + * ipa_module.cpp - load IPA module information from shared library at runtime
> 
> I think we can already update the description to "Image Processing
> Algorithm module" or similar
> 
> > >>> + */
> > >>> +
> > >>> +#include "ipa_module.h"
> > >>> +
> > >>> +#include <elf.h>
> > >>> +#include <errno.h>
> > >>> +#include <fcntl.h>
> > >>> +#include <string.h>
> > >>> +#include <sys/mman.h>
> > >>> +#include <sys/stat.h>
> > >>> +#include <sys/types.h>
> > >>> +#include <unistd.h>
> > >>> +
> > >>> +#include <iostream>
> > >>> +
> > >>> +#include "log.h"
> > >>> +
> > >>> +/**
> > >>> + * \file ipa_module.h
> > >>> + * \brief IPA module information loading
> 
> Same here.
> 
> > >>> + */
> > >>> +
> > >>> +namespace libcamera {
> > >>> +
> > >>> +/**
> > >>> + * \struct IPAModuleInfo
> > >>> + * \brief Information of an IPA plugin
> > >>> + *
> > >>> + * This structure contains the information of an IPA plugin. It is loaded,
> > >>> + * read, and validated before anything else is loaded from the plugin.
> > >>> + *
> > >>> + * \var IPAModuleInfo::name
> > >>> + * \brief The name of the IPA plugin
> > >>> + *
> > >>> + * \var IPAModuleInfo::version
> > >>> + * \brief The version of the IPA plugin
> > >>
> > >> I don't think we need to store the 'version' of the plugin, so much as
> > >> the version of the API it was compiled against to ensure that it is
> > >> compatible with this 'running' instance of libcamera.
> > >>
> > >> I.e. We don't care that it's the 3rd iteration of the rk3399 module -
> > >> but we do care that it is compatible with the API defined in Libcamera 1.0.
> > > 
> > > Yeah, good point; I agree.
> > > 
> > > How would we get the API version of libcamera?
> > 
> > You get to define it :-)
> > 
> > To start with it's probably just:
> > 
> >   #define LIBCAMERA_IPA_API_VERSION 1
> > 
> > This would be a 'local' IPAModule API version number or such I think, as
> > our IPA modules must go through a defined interface...
> > 
> > Any time an ABI break occurs in the IPAModule 'API' we would have to
> > bump up the apiversion.
> 
> I agree, but I think we can defer this. The IPAModuleInfo structure is
> currently a mock up to start implementing the IPAModule class, I expect
> it to change a lot.
> 
> > >> Maybe we should add compatible strings to match against pipeline
> > >> handlers as some point too :-) <I'm not even sure if I'm joking now>
> 
> I think we will need that :-)
> 
> Paul, could you record there open items ?

I'm not sure what you mean.

> > >>> + */
> > >>> +
> > >>> +LOG_DEFINE_CATEGORY(IPAModule)
> > >>> +
> > >>> +template<typename T>
> > >>> +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> > >>> +						 size_t fileSize)
> > >>> +{
> > >>> +	size_t size = offset + sizeof(T);
> > >>> +	if (size > fileSize || size < sizeof(T))
> > >>> +		return nullptr;
> > >>> +
> > >>> +	return reinterpret_cast<typename std::remove_extent<T>::type *>
> > >>> +		(static_cast<char *>(map) + offset);
> > >>> +}
> 
> This function should be static. The C way of doing so is using the
> static keyword, the C++ way is to put it in an anonymous namespace.
> 
> namespace {
> 
> template<typename T>
> typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> 						 size_t fileSize)
> {
> ...
> }
> 
> };
> 
> > >>> +
> > >>> +/**
> > >>> + * \class IPAModule
> > >>> + * \brief Load IPA module information from an IPA plugin
> 
> To match the suggestion above, "IPA module wrapper" ? I think we can do
> better than that, maybe "Wrapper around an IPA module shared object" ?
> Suggestions are welcome.
> 
> > >>> + */
> > >>> +
> > >>> +/**
> > >>> + * \brief Construct an IPAModule instance
> > >>> + * \param[in] libPath path to IPA plugin
> > >>> + * \param[in] symbol name of IPAModuleInfo to load from libPath
> > >>> + *
> > >>> + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
> > >>> + * The IPA plugin shared object file must be of the same endianness and
> > >>> + * bitness as libcamera.
> > >>> + *
> > >>> + * Note that isValid() should be called to check if this loading was successful
> > >>> + * or not.
> > >>
> > >> Hrm ... this is a slightly different design pattern than the rest of our
> > >> objects which are always constructable, and then actions later can fail.
> > >>
> > >> I don't mind - but I wonder if we should add some documentation
> > >> regarding our design patterns somewhere.
> > >>
> > >> (not an action on this patch)
> 
> I've discussed this with Paul, and I think that in this case this
> implementation makes sense. There's no need to delay loading, and as the
> IPAModule class is a wrapper around a .so, constructing an instance on
> which on can call isValid() to check if the IPA is valid makes sense to
> me.
> 
> Maybe we need to reconsider our other classes indeed. Any
> suggestion/preference ?

Do you mean to make the other classes follow the same model? As long as
we document the exception (well the usage is already documented...) then
shouldn't it be fine?

> > >>> + */
> > >>> +
> > >>> +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
> > >>> +	: loaded_(false)
> > >>
> > >> So actually, from my text below - what I'm really saying is that I don't
> > >> think you should provide a &symbol to this IPAModule constructor :)
> 
> So we agree :-)
> 
> > >>> +{
> > >>> +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
> > >>
> > >> I see that you have moved the load into this constructor based on
> > >> comments from Laurent in v1. I think that's fine, but I think merging
> > >> the loading of the object, and the parsing of the symbol might be
> > >> combining too much. But it depends (as ever).
> 
> Note that this only loads the module info, not the module itself (I
> foresee a load() function that will dlopen() the module). I'm not sure
> what you mean by merging the loading of the object and the parsing of
> the symbol.
> 
> > >> For efficiency, the constructor can load the file once and when created
> > >> and .isValid() is true, then we know we can successfully parse symbols.
> 
> But you can't know it's a valid IPA unless you can load the
> IPAModuleInfo successfully (and later maybe even cryptographically
> verifying the module).
> 
> > >> But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
> > >> which symbol name it will expect for an IPAModule. Think of it like a
> > >> 'main()' entry point for our IPAModule. You would always expect a C
> > >> binary to have a main() symbol...
> > > 
> > > Yeah, I think that's a good idea.
> > > 
> > >> So I would expect a libcamera IPA module to have a registration
> > >> something like:
> > >>
> > >>
> > >> /* Register our module with Libcamera */
> > >> const struct IPAModuleInfo ipaModuleInfo = {
> > >> 	.name = "Image Processing for the RK3399",
> > >> 	.version = 1, /* Perhaps this should be apiversion to prevent
> > >>                        * people thinking it describes the version of the
> > >>                        * module ? ...
> > >>                        */
> > >> };
> > > 
> > > Should the info entry point be called ipaModuleInfo?
> > 
> > I think variables are in camelCase with a lowerCase first letter in the
> > project right? or was this not quite your question?
> 
> This sounds good to me.

My question was about the entry point name, not the casing :p

> > >> This could even then be turned into a macro:
> > >>
> > >> #define LIBCAMERA_IPA_MODULE(__NAME) \
> > >> const struct IPAModuleInfo ipaModuleInfo = { \
> > >> 	.name = __NAME,	\
> > >> 	.apiversion = 1, \
> > >> }
> 
> We'll have to throw an extern "C" here, otherwise your symbol name will
> be _ZL13ipaModuleInfo with g++.
> 
> > >>
> > >> so that a module rk3399_ipa.cpp could define:
> > >>
> > >> LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");
> > >>
> > >> (apiversion would be hardcoded by the macro because it defines what
> > >> version of the API/libraries it's compiled against..., and we check it's
> > >> compatible at runtime)
> 
> Something like that. If the number of fields grows, and if some of them
> become complex, we may want to fill the structure manually, be we should
> still use a macro to instantiate the right structure with the right name
> and the extern "C" keyword.

Yeah I think I'll do that.

> > >>> +	if (ret < 0) {
> > >>> +		loaded_ = false;
> 
> This isn't needed, you initialise loaded_ to false above.
> 
> > >>> +		return;
> > >>> +	}
> > >>> +
> > >>> +	loaded_ = true;
> > >>> +}
> > >>> +
> > >>> +int IPAModule::loadELFIdent(int fd)
> 
> This function doesn't actually load much, how about renaming it to
> verifyELFIdent() ?
> 
> > >>> +{
> > >>> +	int ret = lseek(fd, 0, SEEK_SET);
> > >>> +	if (ret < 0)
> > >>> +		return -errno;
> > >>> +
> 
> As we're mmap()ing the whole .so, you could mmap() before calling
> loadELFIdent(), and pass the void *map and size to the function, instead
> of using read.

I didn't want to have to do elfPointer for every single byte that I read
it, but I forgot that I could just use elfPointer on an array.

> > >>> +	unsigned char e_ident[EI_NIDENT];
> > >>> +	ret = read(fd, e_ident, EI_NIDENT);
> > >>> +	if (ret < 0)
> > >>> +		return -errno;
> > >>> +
> > >>> +	if (e_ident[EI_MAG0] != ELFMAG0 ||
> > >>> +	    e_ident[EI_MAG1] != ELFMAG1 ||
> > >>> +	    e_ident[EI_MAG2] != ELFMAG2 ||
> > >>> +	    e_ident[EI_MAG3] != ELFMAG3 ||
> > >>> +	    e_ident[EI_VERSION] != EV_CURRENT)
> > >>> +		return -ENOEXEC;
> > >>> +
> > >>> +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
> > >>> +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
> > >>> +		bitClass_ = e_ident[EI_CLASS];
> > >>> +	else
> > >>> +		return -ENOEXEC;
> 
> How about
> 
> 	bitClass_ = sizeof(unsigned long) == 4 ? ELFCLASS32 : ELFCLASS64;
> 	if (e_ident[EI_CLASS] != bitClass_)
> 		return -ENOEXEC;

Yeah. I guess we're going to demote bitClass_ to a local variable.

> > >>> +
> > >>> +	int a = 1;
> > >>> +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
> > >>> +				 ? ELFDATA2LSB : ELFDATA2MSB;
> 
> So you didn't like my union-based solution ? :-)

Not really :p
imo this was more readable.

> > >>> +	if (e_ident[EI_DATA] != endianness)
> > >>> +		return -ENOEXEC;
> > >>> +
> > >>> +	return 0;
> > >>> +}
> > >>> +
> > >>> +template<class ElfHeader, class SecHeader, class SymHeader>
> > >>> +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
> > >>> +			  const char *symbol)
> > >>> +{
> > >>> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
> > >>> +	if (!eHdr)
> > >>> +		return -ENOEXEC;
> > >>> +
> > >>> +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
> > >>> +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > >>> +	if (!sHdr)
> > >>> +		return -ENOEXEC;
> > >>> +	off_t shnameoff = sHdr->sh_offset;
> > >>> +
> > >>> +	/* Locate .dynsym section header. */
> > >>> +	bool found = false;
> > >>> +	SecHeader *dynsym;
> > >>> +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
> > >>> +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
> > >>> +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > >>> +		if (!sHdr)
> > >>> +			return -ENOEXEC;
> > >>> +
> > >>> +		offset = shnameoff + sHdr->sh_name;
> > >>> +		char *name = elfPointer<char>(map, offset, soSize);
> 
> This should be elfPointer<char[8]> as you want to ensure that at least 8
> bytes are available (sizeof(".dynsym"), including the terminating 0).

Ah right, I forgot that sizeof(T) was there.

> > >>> +		if (!name)
> > >>> +			return -ENOEXEC;
> > >>> +
> > >>> +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
> > >>> +			found = true;
> > >>> +			dynsym = sHdr;
> 
> You could initialise dynsym to nullptr and remove found.
> 
> > >>> +			break;
> > >>> +		}
> > >>> +	}
> > >>> +
> > >>> +	if (!found) {
> > >>> +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
> > >>> +		return -ENOEXEC;
> > >>> +	}
> > >>> +
> > >>> +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
> > >>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > >>> +	if (!sHdr)
> > >>> +		return -ENOEXEC;
> > >>> +	off_t dynsym_nameoff = sHdr->sh_offset;
> > >>> +
> > >>> +	/* Locate symbol in the .dynsym section. */
> > >>> +	found = false;
> > >>> +	SymHeader *target_symbol;
> 
> targetSymbol ?
> 
> > >>> +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
> > >>> +	for (unsigned int i = 0; i < dynsym_num; i++) {
> > >>> +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
> > >>> +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
> > >>> +		if (!sym)
> > >>> +			return -ENOEXEC;
> > >>> +
> > >>> +		offset = dynsym_nameoff + sym->st_name;
> > >>> +		char *name = elfPointer<char>(map, offset, soSize);
> 
> Same here, you want elfPointer<char[strlen(symbol) + 1]>. I hope this
> will compile :-)
> 
> > >>> +		if (!name)
> > >>> +			return -ENOEXEC;
> > >>> +
> > >>> +		if (!strcmp(name, symbol) &&
> > >>> +		    sym->st_info & STB_GLOBAL &&
> > >>> +		    sym->st_size == sizeof(struct IPAModuleInfo)) {
> > >>
> > >> I think this should be a check on the size_t size passed in?
> > > 
> > > Ah yeah, I totally missed that.
> > 
> > No worries.
> > 
> > >>> +			found = true;
> > >>> +			target_symbol = sym;
> 
> You could also remove found here.
> 
> > >>> +			break;
> > >>> +		}
> > >>> +	}
> > >>> +
> > >>> +	if (!found) {
> > >>> +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
> > >>
> > >> This function is called loadSymbol, it doesn't 'know' that it's loading
> > >> an IPAModuleInfo symbol.
> > >>
> > >> I'd change this error to "Symbol " << symbol << " not found";
> > > 
> > > And this.
> > > 
> > >>> +		return -ENOEXEC;
> > >>> +	}
> > >>> +
> > >>> +	/* Locate and return data of symbol. */
> 
> As a sanity check you may want to verify that target_symbol->st_shndx <
> eHdr->e_shnum. Otherwise none of the elfPointer<> checks below may
> trigger, and you will load an invalid symbol. This can happen for
> instance if the symbol is absolute (SHN_ABS).
> 
> > >>> +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
> > >>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> > >>> +	if (!sHdr)
> > >>> +		return -ENOEXEC;
> > >>> +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
> > >>> +	char *data = elfPointer<char>(map, offset, soSize);
> 
> elfPointer<char[size]>
> 
> > >>> +	if (!data)
> > >>> +		return -ENOEXEC;
> > >>> +	memcpy(dst, data, size);
> > >>
> > >> Oh - interesting, you're copying the symbol out. Once it's mmapped ...
> > >> why not parse it directly?
> > > 
> > > I'm not sure what you mean by "parse"?
> > 
> > I mean once you find the symbol and obtain the pointer to it ('data'
> > here) then you have a pointer to memory which contains the structure.
> > You could return that pointer, and cast it to the type. Then you can
> > read the structure members directly.
> > 
> > But that requires knowing that the symbol doesn't require any further
> > relocations ... which is fine for our simple info structure.
> > 
> > It just seems logical in my head to have a function called findSymbol()
> > which returns a pointer to the symbol, and then (if you're going to copy
> > it) a function called loadStructure(), which finds the symbol, checks
> > the size then copies it or such. Maybe it's generalising the functions
> > more than we care if we will only use them for one thing
> 
> I have no particular opinion on this. We could indeed rename this
> function to findSymbol() and move the copy outside. It may actually be a
> good idea, but then this function should return both the address (or
> offset) and size (one or both through output arguments).

I don't think we really need to separate it into find and load. I
suppose if findSymbol turns out to be useful somewhere down the road it
could be changed.

> > >> Can we do all of the 'work' we need to do on the file, and then close it
> > >> at the destructor?
> > > 
> > > I don't want to keep the file open/mapped over multiple method calls.
> > > That's why in my first version there was a constructor that did nothing
> > > and an init() that did everything (which got moved to the constructor in
> > > this version).
> 
> I agree, we shouldn't keep it mapped.
> 
> > The only things we'll do on this structure is read it and decide if this
> > library is OK for loading, perhaps even printing the name string in debug.
> > 
> > Once that has occurred, can't the IPAModuleInfo structure be discarded?
> > The next thing that we'll do is spawn a container/namespace/safe-thingy
> > and use the normal dl_open methods to re-open the library there...
> 
> No, we need it at least to match the module with pipeline handlers, so
> it needs to be copied. I'm sure we'll have other fields that we will
> also need to access later.
> 
> > >> I guess actually this might then keep the whole file mapped in memory
> > >> for longer which might not be desirable...
> > >>
> > >>> +
> > >>> +	return 0;
> > >>> +}
> > >>> +
> > >>> +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
> 
> I would store libPath in a member variable as we will need it for
> dlopen(), and use that member variable in this function.
> 
> > >>> +{
> > >>> +	int fd = open(libPath, O_RDONLY);
> > >>> +	if (fd < 0) {
> > >>> +		LOG(IPAModule, Error) << "Failed to open IPA library: "
> > >>> +				      << strerror(errno);
> 
> You need
> 
> 		ret = -errno;
> 		LOG(IPAModule, Error) << "Failed to open IPA library: "
> 				      << strerror(-ret);
> 		return ret;
> 
> ass errno may be changed by LOG(IPAModule, Error).
> 
> > >>> +		return fd;
> > >>> +	}
> > >>> +
> > >>> +	int ret = loadELFIdent(fd);
> > >>> +	if (ret)
> > >>> +		goto close;
> > >>> +
> > >>> +	size_t soSize;
> > >>> +	char *map;
> > >>> +	struct stat st;
> > >>> +	ret = fstat(fd, &st);
> > >>> +	if (ret < 0)
> > >>> +		goto close;
> > >>> +	soSize = st.st_size;
> > >>> +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> > >>> +				       MAP_PRIVATE, fd, 0));
> 
> 	struct stat st;
> 	ret = fstat(fd, &st);
> 	if (ret < 0)
> 		goto close;
> 
> 	size_t soSize = st.st_size;
> 	char *map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> 					     MAP_PRIVATE, fd, 0));

Last time I tried that the compiler was complaining about the goto
jumping over declarations, or something along those lines. I'll try it
again.

> I think I would make it a void *map, pass it as a void * to all the
> functions, and let elfPointer<> do the case.
> 
> > >> *iff* the constructor did the open, and the symbol parsing was
> > >> separated, then I think soSize and map would go to class privates. Then
> > >> they could be accessed directly by loadSymbol too. But that's an only if...
> > >>
> > >> Essentially, it would be
> > >>
> > >> IPAModule("SO-path")
> > >>  - Constructor opens the file, performs stat, and mmap.
> > >> 	 (or those become an open() call)
> > >>  - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
> > >> with the ipaModule info structure and checking or such.
> > > 
> > > If loadIPAModuleInfo() is to be called from within the constructor
> > > (after open, stat, and mmap), then I agree. I'm not sure map and soSize
> > > need to be privates though. It could go either way.
> 
> I thought about it, but it the address and size are really temporary
> variables, as we unmap the file after the parsing, so I think it's best
> to pass them explicitly to functions.
> 
> > Well, I leave that all up to you now :-)
> > 
> > >>> +	if (map == MAP_FAILED) {
> > >>> +		ret = -errno;
> > >>> +		goto close;
> > >>> +	}
> > >>> +
> > >>> +	if (bitClass_ == ELFCLASS32)
> > >>> +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> > >>> +				(&info_, sizeof(info_), map, soSize, symbol);
> > >>> +	else if (bitClass_ == ELFCLASS64)
> > >>> +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> > >>> +				(&info_, sizeof(info_), map, soSize, symbol);
> 
> As we don't need to support a module with a different bit class than
> libcamera, would we avoid compiling in both version of the function ?
> Maybe something like
> 
> #if sizeof(long) == 4
> 	ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> 			(&info_, sizeof(info_), map, soSize, symbol);
> #else
> 	ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> 			(&info_, sizeof(info_), map, soSize, symbol);
> #endif
> 
> and given that we only need one version of the function, I even wonder
> if we need to keep it defined as a template function, maybe we could
> have a
> 
> #if sizeof(long) == 4
> #define Elf_Ehdr Elf32_Ehdr
> #define Elf_Shdr Elf32_Shdr
> #define Elf_Sym Elf32_Sym
> #else
> #define Elf_Ehdr Elf64_Ehdr
> #define Elf_Shdr Elf64_Shdr
> #define Elf_Sym Elf64_Sym
> #endif
> 
> just before the function definition. This would however require moving
> the function away from the class to a global function, private to this
> file, to avoid exposing it in the ipa_module.h header, otherwise we
> would have to move the above #if's there, which isn't neat. If we do
> this we should move the loadElfIdent function too. Neither function use
> any of the class members (except for bitClass_, which isn't needed
> anymore), so they're essentially static, and separating them from the
> IPAModule class would open the door to sharing them with other classes
> if needed later. elfPointer<> is already separate. If you prefer to keep
> them as class members, they should be declared as static in the class
> definition.

I was actually thinking about separating them from IPAModule before
anyway, so I think I'll go this route.

> > >>> +	if (ret)
> > >>> +		goto unmap;
> > >>> +
> > >>> +unmap:
> > >>> +	munmap(map, soSize);
> > >>> +close:
> > >>> +	close(fd);
> > >>> +	return ret;
> > >>> +}
> > >>> +
> > >>> +/**
> > >>> + * \brief Check if construction of the IPAModule instance succeeded
> 
> I would say "Check if the IPAModule instance is valid" and then add a
> paragraph to detail what a valid (or invalid) IPAModule is.
> 
> > >>> + *
> > >>> + * \return true if the constructor succeeded with no errors, false otherwise
> 
> true if the IPAModule is valid, false otherwise
> 
> > >>> + */
> > >>> +
> 
> No need for this blank line.

Huh, I thought I saw some other file do this. Maybe I missaw :/

> > >>> +bool IPAModule::isValid() const
> > >>> +{
> > >>> +	return loaded_;
> > >>> +}
> > >>> +
> > >>> +/**
> > >>> + * \brief Get the loaded IPAModuleInfo
> 
> "Get the IPA module information"
> 
> > >>> + *
> > >>> + * Check isValid() before calling this.
> 
> "The content of the IPA module information is loaded from the module,
> and is valid only if the module is valid (as returned by isValid()).
> Calling this function on an invalid module is an error."
> 
> > >>> + *
> > >>> + * \return IPAModuleInfo
> > >>> + */
> > >>> +
> 
> No need for a blank line here either
> 
> > >>> +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
> > >>> +{
> > >>> +	return info_;
> > >>> +}
> > >>> +
> > >>> +} /* namespace libcamera */
> > >>> diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
> > >>> index 8796f49..e5b48f2 100644
> > >>> --- a/src/libcamera/meson.build
> > >>> +++ b/src/libcamera/meson.build
> > >>> @@ -10,6 +10,7 @@ libcamera_sources = files([
> > >>>      'event_notifier.cpp',
> > >>>      'formats.cpp',
> > >>>      'geometry.cpp',
> > >>> +    'ipa_module.cpp',
> > >>>      'log.cpp',
> > >>>      'media_device.cpp',
> > >>>      'media_object.cpp',
> > >>> @@ -31,6 +32,7 @@ libcamera_headers = files([
> > >>>      'include/device_enumerator_udev.h',
> > >>>      'include/event_dispatcher_poll.h',
> > >>>      'include/formats.h',
> > >>> +    'include/ipa_module.h',
> > >>>      'include/log.h',
> > >>>      'include/media_device.h',
> > >>>      'include/media_object.h',

Everything else is either an ack or an affirmative.


Thanks,

Paul
Laurent Pinchart May 16, 2019, 6:16 p.m. UTC | #6
Hi Paul,

On Thu, May 16, 2019 at 01:51:16PM -0400, Paul Elder wrote:
> On Thu, May 16, 2019 at 12:19:42AM +0300, Laurent Pinchart wrote:
> > On Wed, May 15, 2019 at 04:26:48PM +0100, Kieran Bingham wrote:
> >> On 15/05/2019 16:02, Paul Elder wrote:
> >>> On Wed, May 15, 2019 at 10:28:46AM +0100, Kieran Bingham wrote:
> >>>> Hi Paul,
> >>>>
> >>>> Thank you for your patch, This is looking good and it's going in the
> >>>> right direction but I have a few design comments ... feel free to
> >>>> disagree though ...
> >>> 
> >>> Thank you for the review.
> >>> 
> >>>> On 14/05/2019 23:38, Paul Elder wrote:
> >>>>> Implement a class to load a struct IPAModuleInfo of a given symbol name
> >>>>> from a .so shared object library.
> >>>>>
> >>>>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> >>>>> ---
> >>>>> Changes in v2:
> >>>>> - renamed LibLoader class to IPAModule
> >>>>> - added documentation
> >>>>> - added logging
> >>>>> - check that bitness of the shared object is the same as libcamera
> >>>>> - moved symbol loading ("init") to the constructor, and added isValid()
> >>>>> - added elfPointer() to prevent segfaults when reading data from mmap
> >>>>> - moved struct IPAModuleInfo out of IPAModule
> >>>>> - rename getIPAModuleInfo() to IPAModuleInfo(), and make it return a
> >>>>>   const reference
> >>>>> - added munmap after the mmap
> >>>>>
> >>>>>  src/libcamera/include/ipa_module.h |  43 +++++
> >>>>>  src/libcamera/ipa_module.cpp       | 277 +++++++++++++++++++++++++++++
> >>>>>  src/libcamera/meson.build          |   2 +
> >>>>>  3 files changed, 322 insertions(+)
> >>>>>  create mode 100644 src/libcamera/include/ipa_module.h
> >>>>>  create mode 100644 src/libcamera/ipa_module.cpp
> >>>>>
> >>>>> diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
> >>>>> new file mode 100644
> >>>>> index 0000000..9eb0094
> >>>>> --- /dev/null
> >>>>> +++ b/src/libcamera/include/ipa_module.h
> >>>>> @@ -0,0 +1,43 @@
> >>>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> >>>>> +/*
> >>>>> + * Copyright (C) 2019, Google Inc.
> >>>>> + *
> >>>>> + * ipa_module.h - load IPA module information from shared library at runtime
> >>>>
> >>>> Hrm ... we have two sides here.
> >>>>
> >>>> We need a public header which defines the interface between an IPA
> >>>> module and libcamera. That would include a struct IPAModuleInfo and any
> >>>> registration details, but should not include any internal libcamera
> >>>> private details regarding how the module is loaded.
> >>> 
> >>> Yes, I agree.
> >>> 
> >>>>> + */
> >>>>> +#ifndef __LIBCAMERA_IPA_MODULE_H__
> >>>>> +#define __LIBCAMERA_IPA_MODULE_H__
> >>>>> +
> >>>>> +#include <string>
> >>>>> +
> >>>>> +namespace libcamera {
> >>>>> +
> >>>>> +struct IPAModuleInfo {
> >>>>> +	char name[256];
> >>>>> +	unsigned int version;
> >>>>> +};
> >>>>> +
> >>>>
> >>>> So IPModuleInfo (and then later, the class definition for how a
> >>>> developer would construct an IPA Module) should live in the public
> >>>> headers at:
> >>>>
> >>>>  /include/libcamera/ipa_module.h
> >>> 
> >>> Yeah.
> >>> 
> >>> Then, where should class IPAModule go?
> >> 
> >> Stays here in this file I think :-)
> > 
> > That's what I had envisioned, so it sounds good to me.
> > 
> >>> 
> >>>>> +class IPAModule
> >> 
> >> I'm wondering if we have two sides if this class should be calls
> >> IPAModule though.
> >> 
> >> Surely an IPAModule would implement a class called  IPAModule, and
> >> perhaps 'this' class is an IPAModuleParser? IPAModuleLoader?
> >> 
> >> Probably a Loader :D
> > 
> > I was wondering about this. One option would indeed to provide an
> > IPAModule class with pure virtual functions that the module would need
> > to implement. Another option would be to use a structure with C-style
> > function pointers. I was thinking about the latter, but the former is
> > not a bad idea. However, we need to take care about backward
> > compatibility. Can a module compiled with an old version of g++ (or
> > clang++) be incompatible with libcamera built with a newer version if
> > using a C++ ABI ?
> > 
> > Ie we go the C++ way, we will indeed need two different names for the
> > classes. I was thinking about IPAModule for this class and
> > IPAModuleInterface for the new one, in which case no change would be
> > needed here. Note that IPAModule would inherit from IPAModuleInterface
> > as it would provide the IPA API to pipeline handlers. I don't like the
> > name IPAModuleLoader much, as from a pipeline handler point of view
> > we're using modules, not module loaders. Other options may be possible.
> 
> I don't know about C++ ABI compatability. I think to just avoid the risk
> itself we should go with C-style function pointers.

I think the entry point at least should be a factory C function pointer,
but it could return a C++ object pointer, if we can guarantee ABI
compatibility.

> In which case Laurent's shim idea would work well for process isolation
> later on. Since the only layers are Pipeline -> IPAModule there isn't
> really anywhere else we can break off into another process without
> making the Pipeline -> IPAModule interface ugly.
> 
> >>>>> +{
> >>>>> +public:
> >>>>> +	explicit IPAModule(const std::string &libPath, const std::string &symbol);
> > 
> > No need to pass the symbol name here, you can hardcode it in the
> > implementation. All modules will use the same symbol name.
> > 
> >>>>> +
> >>>>> +	bool isValid() const;
> >>>>> +
> >>>>> +	struct IPAModuleInfo const &IPAModuleInfo() const;
> > 
> > We usually write this const struct IPAModuleInfo &IPAModuleInfo() const.
> > And I would name the function info(), as IPAModule::info() makes it
> > clear that we're retrieving the information of an IPA module.
> > 
> >>>>> +
> >>>>> +private:
> >>>>> +	struct IPAModuleInfo info_;
> >>>>> +
> >>>>> +	bool loaded_;
> > 
> > Let's name this valid_ to match isValid(). We will later have a load()
> > function to dlopen() the module, and likely a loaded_ flag to match
> > that.
> > 
> >>>>> +	int bitClass_;
> >>>>> +
> >>>>> +	int loadELFIdent(int fd);
> >>>>> +	template<class ElfHeader, class SecHeader, class SymHeader>
> >>>>> +	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
> >>>>> +		       const char *symbol);
> >>>>> +	int loadIPAModuleInfo(const char *libPath, const char *symbol);
> >>>>> +};
> >>>>> +
> >>>>> +} /* namespace libcamera */
> >>>>> +
> >>>>> +#endif /* __LIBCAMERA_IPA_MODULE_H__ */
> >>>>> diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
> >>>>> new file mode 100644
> >>>>> index 0000000..5ca16e8
> >>>>> --- /dev/null
> >>>>> +++ b/src/libcamera/ipa_module.cpp
> >>>>> @@ -0,0 +1,277 @@
> >>>>> +/* SPDX-License-Identifier: LGPL-2.1-or-later */
> >>>>> +/*
> >>>>> + * Copyright (C) 2019, Google Inc.
> >>>>> + *
> >>>>> + * ipa_module.cpp - load IPA module information from shared library at runtime
> > 
> > I think we can already update the description to "Image Processing
> > Algorithm module" or similar
> > 
> >>>>> + */
> >>>>> +
> >>>>> +#include "ipa_module.h"
> >>>>> +
> >>>>> +#include <elf.h>
> >>>>> +#include <errno.h>
> >>>>> +#include <fcntl.h>
> >>>>> +#include <string.h>
> >>>>> +#include <sys/mman.h>
> >>>>> +#include <sys/stat.h>
> >>>>> +#include <sys/types.h>
> >>>>> +#include <unistd.h>
> >>>>> +
> >>>>> +#include <iostream>
> >>>>> +
> >>>>> +#include "log.h"
> >>>>> +
> >>>>> +/**
> >>>>> + * \file ipa_module.h
> >>>>> + * \brief IPA module information loading
> > 
> > Same here.
> > 
> >>>>> + */
> >>>>> +
> >>>>> +namespace libcamera {
> >>>>> +
> >>>>> +/**
> >>>>> + * \struct IPAModuleInfo
> >>>>> + * \brief Information of an IPA plugin
> >>>>> + *
> >>>>> + * This structure contains the information of an IPA plugin. It is loaded,
> >>>>> + * read, and validated before anything else is loaded from the plugin.
> >>>>> + *
> >>>>> + * \var IPAModuleInfo::name
> >>>>> + * \brief The name of the IPA plugin
> >>>>> + *
> >>>>> + * \var IPAModuleInfo::version
> >>>>> + * \brief The version of the IPA plugin
> >>>>
> >>>> I don't think we need to store the 'version' of the plugin, so much as
> >>>> the version of the API it was compiled against to ensure that it is
> >>>> compatible with this 'running' instance of libcamera.
> >>>>
> >>>> I.e. We don't care that it's the 3rd iteration of the rk3399 module -
> >>>> but we do care that it is compatible with the API defined in Libcamera 1.0.
> >>> 
> >>> Yeah, good point; I agree.
> >>> 
> >>> How would we get the API version of libcamera?
> >> 
> >> You get to define it :-)
> >> 
> >> To start with it's probably just:
> >> 
> >>   #define LIBCAMERA_IPA_API_VERSION 1
> >> 
> >> This would be a 'local' IPAModule API version number or such I think, as
> >> our IPA modules must go through a defined interface...
> >> 
> >> Any time an ABI break occurs in the IPAModule 'API' we would have to
> >> bump up the apiversion.
> > 
> > I agree, but I think we can defer this. The IPAModuleInfo structure is
> > currently a mock up to start implementing the IPAModule class, I expect
> > it to change a lot.
> > 
> >>>> Maybe we should add compatible strings to match against pipeline
> >>>> handlers as some point too :-) <I'm not even sure if I'm joking now>
> > 
> > I think we will need that :-)
> > 
> > Paul, could you record there open items ?
> 
> I'm not sure what you mean.

Sorry, s/there/the/

Could you keep a list of the todo items (possibly as \todo comments),
such as adding compability strings to the IPA module info ?

> >>>>> + */
> >>>>> +
> >>>>> +LOG_DEFINE_CATEGORY(IPAModule)
> >>>>> +
> >>>>> +template<typename T>
> >>>>> +typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> >>>>> +						 size_t fileSize)
> >>>>> +{
> >>>>> +	size_t size = offset + sizeof(T);
> >>>>> +	if (size > fileSize || size < sizeof(T))
> >>>>> +		return nullptr;
> >>>>> +
> >>>>> +	return reinterpret_cast<typename std::remove_extent<T>::type *>
> >>>>> +		(static_cast<char *>(map) + offset);
> >>>>> +}
> > 
> > This function should be static. The C way of doing so is using the
> > static keyword, the C++ way is to put it in an anonymous namespace.
> > 
> > namespace {
> > 
> > template<typename T>
> > typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
> > 						 size_t fileSize)
> > {
> > ...
> > }
> > 
> > };
> > 
> >>>>> +
> >>>>> +/**
> >>>>> + * \class IPAModule
> >>>>> + * \brief Load IPA module information from an IPA plugin
> > 
> > To match the suggestion above, "IPA module wrapper" ? I think we can do
> > better than that, maybe "Wrapper around an IPA module shared object" ?
> > Suggestions are welcome.
> > 
> >>>>> + */
> >>>>> +
> >>>>> +/**
> >>>>> + * \brief Construct an IPAModule instance
> >>>>> + * \param[in] libPath path to IPA plugin
> >>>>> + * \param[in] symbol name of IPAModuleInfo to load from libPath
> >>>>> + *
> >>>>> + * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
> >>>>> + * The IPA plugin shared object file must be of the same endianness and
> >>>>> + * bitness as libcamera.
> >>>>> + *
> >>>>> + * Note that isValid() should be called to check if this loading was successful
> >>>>> + * or not.
> >>>>
> >>>> Hrm ... this is a slightly different design pattern than the rest of our
> >>>> objects which are always constructable, and then actions later can fail.
> >>>>
> >>>> I don't mind - but I wonder if we should add some documentation
> >>>> regarding our design patterns somewhere.
> >>>>
> >>>> (not an action on this patch)
> > 
> > I've discussed this with Paul, and I think that in this case this
> > implementation makes sense. There's no need to delay loading, and as the
> > IPAModule class is a wrapper around a .so, constructing an instance on
> > which on can call isValid() to check if the IPA is valid makes sense to
> > me.
> > 
> > Maybe we need to reconsider our other classes indeed. Any
> > suggestion/preference ?
> 
> Do you mean to make the other classes follow the same model? As long as
> we document the exception (well the usage is already documented...) then
> shouldn't it be fine?

Sure, it's fine having two models when two models are needed, the
question is whether they are actually needed :-) Some of our existing
classes may benefit from using the model used here.

> >>>>> + */
> >>>>> +
> >>>>> +IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
> >>>>> +	: loaded_(false)
> >>>>
> >>>> So actually, from my text below - what I'm really saying is that I don't
> >>>> think you should provide a &symbol to this IPAModule constructor :)
> > 
> > So we agree :-)
> > 
> >>>>> +{
> >>>>> +	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
> >>>>
> >>>> I see that you have moved the load into this constructor based on
> >>>> comments from Laurent in v1. I think that's fine, but I think merging
> >>>> the loading of the object, and the parsing of the symbol might be
> >>>> combining too much. But it depends (as ever).
> > 
> > Note that this only loads the module info, not the module itself (I
> > foresee a load() function that will dlopen() the module). I'm not sure
> > what you mean by merging the loading of the object and the parsing of
> > the symbol.
> > 
> >>>> For efficiency, the constructor can load the file once and when created
> >>>> and .isValid() is true, then we know we can successfully parse symbols.
> > 
> > But you can't know it's a valid IPA unless you can load the
> > IPAModuleInfo successfully (and later maybe even cryptographically
> > verifying the module).
> > 
> >>>> But I think loadIPAModuleInfo() should be 'hardcoded' to know exactly
> >>>> which symbol name it will expect for an IPAModule. Think of it like a
> >>>> 'main()' entry point for our IPAModule. You would always expect a C
> >>>> binary to have a main() symbol...
> >>> 
> >>> Yeah, I think that's a good idea.
> >>> 
> >>>> So I would expect a libcamera IPA module to have a registration
> >>>> something like:
> >>>>
> >>>>
> >>>> /* Register our module with Libcamera */
> >>>> const struct IPAModuleInfo ipaModuleInfo = {
> >>>> 	.name = "Image Processing for the RK3399",
> >>>> 	.version = 1, /* Perhaps this should be apiversion to prevent
> >>>>                        * people thinking it describes the version of the
> >>>>                        * module ? ...
> >>>>                        */
> >>>> };
> >>> 
> >>> Should the info entry point be called ipaModuleInfo?
> >> 
> >> I think variables are in camelCase with a lowerCase first letter in the
> >> project right? or was this not quite your question?
> > 
> > This sounds good to me.
> 
> My question was about the entry point name, not the casing :p

Ah :-) yes, ipaModuleInfo seems good to me.

> >>>> This could even then be turned into a macro:
> >>>>
> >>>> #define LIBCAMERA_IPA_MODULE(__NAME) \
> >>>> const struct IPAModuleInfo ipaModuleInfo = { \
> >>>> 	.name = __NAME,	\
> >>>> 	.apiversion = 1, \
> >>>> }
> > 
> > We'll have to throw an extern "C" here, otherwise your symbol name will
> > be _ZL13ipaModuleInfo with g++.
> > 
> >>>>
> >>>> so that a module rk3399_ipa.cpp could define:
> >>>>
> >>>> LIBCAMERA_IPA_MODULE("Image Processing Algorithms for the RK3399");
> >>>>
> >>>> (apiversion would be hardcoded by the macro because it defines what
> >>>> version of the API/libraries it's compiled against..., and we check it's
> >>>> compatible at runtime)
> > 
> > Something like that. If the number of fields grows, and if some of them
> > become complex, we may want to fill the structure manually, be we should
> > still use a macro to instantiate the right structure with the right name
> > and the extern "C" keyword.
> 
> Yeah I think I'll do that.
> 
> >>>>> +	if (ret < 0) {
> >>>>> +		loaded_ = false;
> > 
> > This isn't needed, you initialise loaded_ to false above.
> > 
> >>>>> +		return;
> >>>>> +	}
> >>>>> +
> >>>>> +	loaded_ = true;
> >>>>> +}
> >>>>> +
> >>>>> +int IPAModule::loadELFIdent(int fd)
> > 
> > This function doesn't actually load much, how about renaming it to
> > verifyELFIdent() ?
> > 
> >>>>> +{
> >>>>> +	int ret = lseek(fd, 0, SEEK_SET);
> >>>>> +	if (ret < 0)
> >>>>> +		return -errno;
> >>>>> +
> > 
> > As we're mmap()ing the whole .so, you could mmap() before calling
> > loadELFIdent(), and pass the void *map and size to the function, instead
> > of using read.
> 
> I didn't want to have to do elfPointer for every single byte that I read
> it, but I forgot that I could just use elfPointer on an array.
> 
> >>>>> +	unsigned char e_ident[EI_NIDENT];
> >>>>> +	ret = read(fd, e_ident, EI_NIDENT);
> >>>>> +	if (ret < 0)
> >>>>> +		return -errno;
> >>>>> +
> >>>>> +	if (e_ident[EI_MAG0] != ELFMAG0 ||
> >>>>> +	    e_ident[EI_MAG1] != ELFMAG1 ||
> >>>>> +	    e_ident[EI_MAG2] != ELFMAG2 ||
> >>>>> +	    e_ident[EI_MAG3] != ELFMAG3 ||
> >>>>> +	    e_ident[EI_VERSION] != EV_CURRENT)
> >>>>> +		return -ENOEXEC;
> >>>>> +
> >>>>> +	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
> >>>>> +	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
> >>>>> +		bitClass_ = e_ident[EI_CLASS];
> >>>>> +	else
> >>>>> +		return -ENOEXEC;
> > 
> > How about
> > 
> > 	bitClass_ = sizeof(unsigned long) == 4 ? ELFCLASS32 : ELFCLASS64;
> > 	if (e_ident[EI_CLASS] != bitClass_)
> > 		return -ENOEXEC;
> 
> Yeah. I guess we're going to demote bitClass_ to a local variable.
> 
> >>>>> +
> >>>>> +	int a = 1;
> >>>>> +	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
> >>>>> +				 ? ELFDATA2LSB : ELFDATA2MSB;
> > 
> > So you didn't like my union-based solution ? :-)
> 
> Not really :p
> imo this was more readable.

Your code, your choice :-)

> >>>>> +	if (e_ident[EI_DATA] != endianness)
> >>>>> +		return -ENOEXEC;
> >>>>> +
> >>>>> +	return 0;
> >>>>> +}
> >>>>> +
> >>>>> +template<class ElfHeader, class SecHeader, class SymHeader>
> >>>>> +int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
> >>>>> +			  const char *symbol)
> >>>>> +{
> >>>>> +	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
> >>>>> +	if (!eHdr)
> >>>>> +		return -ENOEXEC;
> >>>>> +
> >>>>> +	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
> >>>>> +	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>>>> +	if (!sHdr)
> >>>>> +		return -ENOEXEC;
> >>>>> +	off_t shnameoff = sHdr->sh_offset;
> >>>>> +
> >>>>> +	/* Locate .dynsym section header. */
> >>>>> +	bool found = false;
> >>>>> +	SecHeader *dynsym;
> >>>>> +	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
> >>>>> +		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
> >>>>> +		sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>>>> +		if (!sHdr)
> >>>>> +			return -ENOEXEC;
> >>>>> +
> >>>>> +		offset = shnameoff + sHdr->sh_name;
> >>>>> +		char *name = elfPointer<char>(map, offset, soSize);
> > 
> > This should be elfPointer<char[8]> as you want to ensure that at least 8
> > bytes are available (sizeof(".dynsym"), including the terminating 0).
> 
> Ah right, I forgot that sizeof(T) was there.
> 
> >>>>> +		if (!name)
> >>>>> +			return -ENOEXEC;
> >>>>> +
> >>>>> +		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
> >>>>> +			found = true;
> >>>>> +			dynsym = sHdr;
> > 
> > You could initialise dynsym to nullptr and remove found.
> > 
> >>>>> +			break;
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>> +	if (!found) {
> >>>>> +		LOG(IPAModule, Error) << "ELF has no .dynsym section";
> >>>>> +		return -ENOEXEC;
> >>>>> +	}
> >>>>> +
> >>>>> +	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
> >>>>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>>>> +	if (!sHdr)
> >>>>> +		return -ENOEXEC;
> >>>>> +	off_t dynsym_nameoff = sHdr->sh_offset;
> >>>>> +
> >>>>> +	/* Locate symbol in the .dynsym section. */
> >>>>> +	found = false;
> >>>>> +	SymHeader *target_symbol;
> > 
> > targetSymbol ?
> > 
> >>>>> +	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
> >>>>> +	for (unsigned int i = 0; i < dynsym_num; i++) {
> >>>>> +		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
> >>>>> +		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
> >>>>> +		if (!sym)
> >>>>> +			return -ENOEXEC;
> >>>>> +
> >>>>> +		offset = dynsym_nameoff + sym->st_name;
> >>>>> +		char *name = elfPointer<char>(map, offset, soSize);
> > 
> > Same here, you want elfPointer<char[strlen(symbol) + 1]>. I hope this
> > will compile :-)
> > 
> >>>>> +		if (!name)
> >>>>> +			return -ENOEXEC;
> >>>>> +
> >>>>> +		if (!strcmp(name, symbol) &&
> >>>>> +		    sym->st_info & STB_GLOBAL &&
> >>>>> +		    sym->st_size == sizeof(struct IPAModuleInfo)) {
> >>>>
> >>>> I think this should be a check on the size_t size passed in?
> >>> 
> >>> Ah yeah, I totally missed that.
> >> 
> >> No worries.
> >> 
> >>>>> +			found = true;
> >>>>> +			target_symbol = sym;
> > 
> > You could also remove found here.
> > 
> >>>>> +			break;
> >>>>> +		}
> >>>>> +	}
> >>>>> +
> >>>>> +	if (!found) {
> >>>>> +		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
> >>>>
> >>>> This function is called loadSymbol, it doesn't 'know' that it's loading
> >>>> an IPAModuleInfo symbol.
> >>>>
> >>>> I'd change this error to "Symbol " << symbol << " not found";
> >>> 
> >>> And this.
> >>> 
> >>>>> +		return -ENOEXEC;
> >>>>> +	}
> >>>>> +
> >>>>> +	/* Locate and return data of symbol. */
> > 
> > As a sanity check you may want to verify that target_symbol->st_shndx <
> > eHdr->e_shnum. Otherwise none of the elfPointer<> checks below may
> > trigger, and you will load an invalid symbol. This can happen for
> > instance if the symbol is absolute (SHN_ABS).
> > 
> >>>>> +	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
> >>>>> +	sHdr = elfPointer<SecHeader>(map, offset, soSize);
> >>>>> +	if (!sHdr)
> >>>>> +		return -ENOEXEC;
> >>>>> +	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
> >>>>> +	char *data = elfPointer<char>(map, offset, soSize);
> > 
> > elfPointer<char[size]>
> > 
> >>>>> +	if (!data)
> >>>>> +		return -ENOEXEC;
> >>>>> +	memcpy(dst, data, size);
> >>>>
> >>>> Oh - interesting, you're copying the symbol out. Once it's mmapped ...
> >>>> why not parse it directly?
> >>> 
> >>> I'm not sure what you mean by "parse"?
> >> 
> >> I mean once you find the symbol and obtain the pointer to it ('data'
> >> here) then you have a pointer to memory which contains the structure.
> >> You could return that pointer, and cast it to the type. Then you can
> >> read the structure members directly.
> >> 
> >> But that requires knowing that the symbol doesn't require any further
> >> relocations ... which is fine for our simple info structure.
> >> 
> >> It just seems logical in my head to have a function called findSymbol()
> >> which returns a pointer to the symbol, and then (if you're going to copy
> >> it) a function called loadStructure(), which finds the symbol, checks
> >> the size then copies it or such. Maybe it's generalising the functions
> >> more than we care if we will only use them for one thing
> > 
> > I have no particular opinion on this. We could indeed rename this
> > function to findSymbol() and move the copy outside. It may actually be a
> > good idea, but then this function should return both the address (or
> > offset) and size (one or both through output arguments).
> 
> I don't think we really need to separate it into find and load. I
> suppose if findSymbol turns out to be useful somewhere down the road it
> could be changed.

I agree. Please however keep in mind that separating them already would
make the find function a bit smaller, so the code may become more
readable (especially as we would be closer to the "one function, one
purpose" idiom). Up to you.

> >>>> Can we do all of the 'work' we need to do on the file, and then close it
> >>>> at the destructor?
> >>> 
> >>> I don't want to keep the file open/mapped over multiple method calls.
> >>> That's why in my first version there was a constructor that did nothing
> >>> and an init() that did everything (which got moved to the constructor in
> >>> this version).
> > 
> > I agree, we shouldn't keep it mapped.
> > 
> >> The only things we'll do on this structure is read it and decide if this
> >> library is OK for loading, perhaps even printing the name string in debug.
> >> 
> >> Once that has occurred, can't the IPAModuleInfo structure be discarded?
> >> The next thing that we'll do is spawn a container/namespace/safe-thingy
> >> and use the normal dl_open methods to re-open the library there...
> > 
> > No, we need it at least to match the module with pipeline handlers, so
> > it needs to be copied. I'm sure we'll have other fields that we will
> > also need to access later.
> > 
> >>>> I guess actually this might then keep the whole file mapped in memory
> >>>> for longer which might not be desirable...
> >>>>
> >>>>> +
> >>>>> +	return 0;
> >>>>> +}
> >>>>> +
> >>>>> +int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
> > 
> > I would store libPath in a member variable as we will need it for
> > dlopen(), and use that member variable in this function.
> > 
> >>>>> +{
> >>>>> +	int fd = open(libPath, O_RDONLY);
> >>>>> +	if (fd < 0) {
> >>>>> +		LOG(IPAModule, Error) << "Failed to open IPA library: "
> >>>>> +				      << strerror(errno);
> > 
> > You need
> > 
> > 		ret = -errno;
> > 		LOG(IPAModule, Error) << "Failed to open IPA library: "
> > 				      << strerror(-ret);
> > 		return ret;
> > 
> > ass errno may be changed by LOG(IPAModule, Error).
> > 
> >>>>> +		return fd;
> >>>>> +	}
> >>>>> +
> >>>>> +	int ret = loadELFIdent(fd);
> >>>>> +	if (ret)
> >>>>> +		goto close;
> >>>>> +
> >>>>> +	size_t soSize;
> >>>>> +	char *map;
> >>>>> +	struct stat st;
> >>>>> +	ret = fstat(fd, &st);
> >>>>> +	if (ret < 0)
> >>>>> +		goto close;
> >>>>> +	soSize = st.st_size;
> >>>>> +	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> >>>>> +				       MAP_PRIVATE, fd, 0));
> > 
> > 	struct stat st;
> > 	ret = fstat(fd, &st);
> > 	if (ret < 0)
> > 		goto close;
> > 
> > 	size_t soSize = st.st_size;
> > 	char *map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
> > 					     MAP_PRIVATE, fd, 0));
> 
> Last time I tried that the compiler was complaining about the goto
> jumping over declarations, or something along those lines. I'll try it
> again.

Oops, you're right. That's a pesky error :-(

> > I think I would make it a void *map, pass it as a void * to all the
> > functions, and let elfPointer<> do the case.
> > 
> >>>> *iff* the constructor did the open, and the symbol parsing was
> >>>> separated, then I think soSize and map would go to class privates. Then
> >>>> they could be accessed directly by loadSymbol too. But that's an only if...
> >>>>
> >>>> Essentially, it would be
> >>>>
> >>>> IPAModule("SO-path")
> >>>>  - Constructor opens the file, performs stat, and mmap.
> >>>> 	 (or those become an open() call)
> >>>>  - loadIPAModuleInfo() calls loadSymbol("ipaModuleInfo"); and deals only
> >>>> with the ipaModule info structure and checking or such.
> >>> 
> >>> If loadIPAModuleInfo() is to be called from within the constructor
> >>> (after open, stat, and mmap), then I agree. I'm not sure map and soSize
> >>> need to be privates though. It could go either way.
> > 
> > I thought about it, but it the address and size are really temporary
> > variables, as we unmap the file after the parsing, so I think it's best
> > to pass them explicitly to functions.
> > 
> >> Well, I leave that all up to you now :-)
> >> 
> >>>>> +	if (map == MAP_FAILED) {
> >>>>> +		ret = -errno;
> >>>>> +		goto close;
> >>>>> +	}
> >>>>> +
> >>>>> +	if (bitClass_ == ELFCLASS32)
> >>>>> +		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> >>>>> +				(&info_, sizeof(info_), map, soSize, symbol);
> >>>>> +	else if (bitClass_ == ELFCLASS64)
> >>>>> +		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> >>>>> +				(&info_, sizeof(info_), map, soSize, symbol);
> > 
> > As we don't need to support a module with a different bit class than
> > libcamera, would we avoid compiling in both version of the function ?
> > Maybe something like
> > 
> > #if sizeof(long) == 4
> > 	ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
> > 			(&info_, sizeof(info_), map, soSize, symbol);
> > #else
> > 	ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
> > 			(&info_, sizeof(info_), map, soSize, symbol);
> > #endif
> > 
> > and given that we only need one version of the function, I even wonder
> > if we need to keep it defined as a template function, maybe we could
> > have a
> > 
> > #if sizeof(long) == 4
> > #define Elf_Ehdr Elf32_Ehdr
> > #define Elf_Shdr Elf32_Shdr
> > #define Elf_Sym Elf32_Sym
> > #else
> > #define Elf_Ehdr Elf64_Ehdr
> > #define Elf_Shdr Elf64_Shdr
> > #define Elf_Sym Elf64_Sym
> > #endif
> > 
> > just before the function definition. This would however require moving
> > the function away from the class to a global function, private to this
> > file, to avoid exposing it in the ipa_module.h header, otherwise we
> > would have to move the above #if's there, which isn't neat. If we do
> > this we should move the loadElfIdent function too. Neither function use
> > any of the class members (except for bitClass_, which isn't needed
> > anymore), so they're essentially static, and separating them from the
> > IPAModule class would open the door to sharing them with other classes
> > if needed later. elfPointer<> is already separate. If you prefer to keep
> > them as class members, they should be declared as static in the class
> > definition.
> 
> I was actually thinking about separating them from IPAModule before
> anyway, so I think I'll go this route.
> 
> >>>>> +	if (ret)
> >>>>> +		goto unmap;
> >>>>> +
> >>>>> +unmap:
> >>>>> +	munmap(map, soSize);
> >>>>> +close:
> >>>>> +	close(fd);
> >>>>> +	return ret;
> >>>>> +}
> >>>>> +
> >>>>> +/**
> >>>>> + * \brief Check if construction of the IPAModule instance succeeded
> > 
> > I would say "Check if the IPAModule instance is valid" and then add a
> > paragraph to detail what a valid (or invalid) IPAModule is.
> > 
> >>>>> + *
> >>>>> + * \return true if the constructor succeeded with no errors, false otherwise
> > 
> > true if the IPAModule is valid, false otherwise
> > 
> >>>>> + */
> >>>>> +
> > 
> > No need for this blank line.
> 
> Huh, I thought I saw some other file do this. Maybe I missaw :/
> 
> >>>>> +bool IPAModule::isValid() const
> >>>>> +{
> >>>>> +	return loaded_;
> >>>>> +}
> >>>>> +
> >>>>> +/**
> >>>>> + * \brief Get the loaded IPAModuleInfo
> > 
> > "Get the IPA module information"
> > 
> >>>>> + *
> >>>>> + * Check isValid() before calling this.
> > 
> > "The content of the IPA module information is loaded from the module,
> > and is valid only if the module is valid (as returned by isValid()).
> > Calling this function on an invalid module is an error."
> > 
> >>>>> + *
> >>>>> + * \return IPAModuleInfo
> >>>>> + */
> >>>>> +
> > 
> > No need for a blank line here either
> > 
> >>>>> +struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
> >>>>> +{
> >>>>> +	return info_;
> >>>>> +}
> >>>>> +
> >>>>> +} /* namespace libcamera */
> >>>>> diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
> >>>>> index 8796f49..e5b48f2 100644
> >>>>> --- a/src/libcamera/meson.build
> >>>>> +++ b/src/libcamera/meson.build
> >>>>> @@ -10,6 +10,7 @@ libcamera_sources = files([
> >>>>>      'event_notifier.cpp',
> >>>>>      'formats.cpp',
> >>>>>      'geometry.cpp',
> >>>>> +    'ipa_module.cpp',
> >>>>>      'log.cpp',
> >>>>>      'media_device.cpp',
> >>>>>      'media_object.cpp',
> >>>>> @@ -31,6 +32,7 @@ libcamera_headers = files([
> >>>>>      'include/device_enumerator_udev.h',
> >>>>>      'include/event_dispatcher_poll.h',
> >>>>>      'include/formats.h',
> >>>>> +    'include/ipa_module.h',
> >>>>>      'include/log.h',
> >>>>>      'include/media_device.h',
> >>>>>      'include/media_object.h',
> 
> Everything else is either an ack or an affirmative.

Patch

diff --git a/src/libcamera/include/ipa_module.h b/src/libcamera/include/ipa_module.h
new file mode 100644
index 0000000..9eb0094
--- /dev/null
+++ b/src/libcamera/include/ipa_module.h
@@ -0,0 +1,43 @@ 
+/* SPDX-License-Identifier: LGPL-2.1-or-later */
+/*
+ * Copyright (C) 2019, Google Inc.
+ *
+ * ipa_module.h - load IPA module information from shared library at runtime
+ */
+#ifndef __LIBCAMERA_IPA_MODULE_H__
+#define __LIBCAMERA_IPA_MODULE_H__
+
+#include <string>
+
+namespace libcamera {
+
+struct IPAModuleInfo {
+	char name[256];
+	unsigned int version;
+};
+
+class IPAModule
+{
+public:
+	explicit IPAModule(const std::string &libPath, const std::string &symbol);
+
+	bool isValid() const;
+
+	struct IPAModuleInfo const &IPAModuleInfo() const;
+
+private:
+	struct IPAModuleInfo info_;
+
+	bool loaded_;
+	int bitClass_;
+
+	int loadELFIdent(int fd);
+	template<class ElfHeader, class SecHeader, class SymHeader>
+	int loadSymbol(void *data, size_t size, char *map, size_t soSize,
+		       const char *symbol);
+	int loadIPAModuleInfo(const char *libPath, const char *symbol);
+};
+
+} /* namespace libcamera */
+
+#endif /* __LIBCAMERA_IPA_MODULE_H__ */
diff --git a/src/libcamera/ipa_module.cpp b/src/libcamera/ipa_module.cpp
new file mode 100644
index 0000000..5ca16e8
--- /dev/null
+++ b/src/libcamera/ipa_module.cpp
@@ -0,0 +1,277 @@ 
+/* SPDX-License-Identifier: LGPL-2.1-or-later */
+/*
+ * Copyright (C) 2019, Google Inc.
+ *
+ * ipa_module.cpp - load IPA module information from shared library at runtime
+ */
+
+#include "ipa_module.h"
+
+#include <elf.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <iostream>
+
+#include "log.h"
+
+/**
+ * \file ipa_module.h
+ * \brief IPA module information loading
+ */
+
+namespace libcamera {
+
+/**
+ * \struct IPAModuleInfo
+ * \brief Information of an IPA plugin
+ *
+ * This structure contains the information of an IPA plugin. It is loaded,
+ * read, and validated before anything else is loaded from the plugin.
+ *
+ * \var IPAModuleInfo::name
+ * \brief The name of the IPA plugin
+ *
+ * \var IPAModuleInfo::version
+ * \brief The version of the IPA plugin
+ */
+
+LOG_DEFINE_CATEGORY(IPAModule)
+
+template<typename T>
+typename std::remove_extent<T>::type *elfPointer(void *map, off_t offset,
+						 size_t fileSize)
+{
+	size_t size = offset + sizeof(T);
+	if (size > fileSize || size < sizeof(T))
+		return nullptr;
+
+	return reinterpret_cast<typename std::remove_extent<T>::type *>
+		(static_cast<char *>(map) + offset);
+}
+
+/**
+ * \class IPAModule
+ * \brief Load IPA module information from an IPA plugin
+ */
+
+/**
+ * \brief Construct an IPAModule instance
+ * \param[in] libPath path to IPA plugin
+ * \param[in] symbol name of IPAModuleInfo to load from libPath
+ *
+ * Loads the IPAModuleInfo of the given symbol from the IPA plugin at libPath.
+ * The IPA plugin shared object file must be of the same endianness and
+ * bitness as libcamera.
+ *
+ * Note that isValid() should be called to check if this loading was successful
+ * or not.
+ */
+
+IPAModule::IPAModule(const std::string &libPath, const std::string &symbol)
+	: loaded_(false)
+{
+	int ret = loadIPAModuleInfo(libPath.c_str(), symbol.c_str());
+	if (ret < 0) {
+		loaded_ = false;
+		return;
+	}
+
+	loaded_ = true;
+}
+
+int IPAModule::loadELFIdent(int fd)
+{
+	int ret = lseek(fd, 0, SEEK_SET);
+	if (ret < 0)
+		return -errno;
+
+	unsigned char e_ident[EI_NIDENT];
+	ret = read(fd, e_ident, EI_NIDENT);
+	if (ret < 0)
+		return -errno;
+
+	if (e_ident[EI_MAG0] != ELFMAG0 ||
+	    e_ident[EI_MAG1] != ELFMAG1 ||
+	    e_ident[EI_MAG2] != ELFMAG2 ||
+	    e_ident[EI_MAG3] != ELFMAG3 ||
+	    e_ident[EI_VERSION] != EV_CURRENT)
+		return -ENOEXEC;
+
+	if ((e_ident[EI_CLASS] == ELFCLASS32 && sizeof(unsigned long) == 4) ||
+	    (e_ident[EI_CLASS] == ELFCLASS64 && sizeof(unsigned long) == 8))
+		bitClass_ = e_ident[EI_CLASS];
+	else
+		return -ENOEXEC;
+
+	int a = 1;
+	unsigned char endianness = *reinterpret_cast<char *>(&a) == 1
+				 ? ELFDATA2LSB : ELFDATA2MSB;
+	if (e_ident[EI_DATA] != endianness)
+		return -ENOEXEC;
+
+	return 0;
+}
+
+template<class ElfHeader, class SecHeader, class SymHeader>
+int IPAModule::loadSymbol(void *dst, size_t size, char *map, size_t soSize,
+			  const char *symbol)
+{
+	ElfHeader *eHdr = elfPointer<ElfHeader>(map, 0, soSize);
+	if (!eHdr)
+		return -ENOEXEC;
+
+	off_t offset = eHdr->e_shoff + eHdr->e_shentsize * eHdr->e_shstrndx;
+	SecHeader *sHdr = elfPointer<SecHeader>(map, offset, soSize);
+	if (!sHdr)
+		return -ENOEXEC;
+	off_t shnameoff = sHdr->sh_offset;
+
+	/* Locate .dynsym section header. */
+	bool found = false;
+	SecHeader *dynsym;
+	for (unsigned int i = 0; i < eHdr->e_shnum; i++) {
+		offset = eHdr->e_shoff + eHdr->e_shentsize * i;
+		sHdr = elfPointer<SecHeader>(map, offset, soSize);
+		if (!sHdr)
+			return -ENOEXEC;
+
+		offset = shnameoff + sHdr->sh_name;
+		char *name = elfPointer<char>(map, offset, soSize);
+		if (!name)
+			return -ENOEXEC;
+
+		if (sHdr->sh_type == SHT_DYNSYM && !strcmp(name, ".dynsym")) {
+			found = true;
+			dynsym = sHdr;
+			break;
+		}
+	}
+
+	if (!found) {
+		LOG(IPAModule, Error) << "ELF has no .dynsym section";
+		return -ENOEXEC;
+	}
+
+	offset = eHdr->e_shoff + eHdr->e_shentsize * dynsym->sh_link;
+	sHdr = elfPointer<SecHeader>(map, offset, soSize);
+	if (!sHdr)
+		return -ENOEXEC;
+	off_t dynsym_nameoff = sHdr->sh_offset;
+
+	/* Locate symbol in the .dynsym section. */
+	found = false;
+	SymHeader *target_symbol;
+	unsigned int dynsym_num = dynsym->sh_size / dynsym->sh_entsize;
+	for (unsigned int i = 0; i < dynsym_num; i++) {
+		offset = dynsym->sh_offset + dynsym->sh_entsize * i;
+		SymHeader *sym = elfPointer<SymHeader>(map, offset, soSize);
+		if (!sym)
+			return -ENOEXEC;
+
+		offset = dynsym_nameoff + sym->st_name;
+		char *name = elfPointer<char>(map, offset, soSize);
+		if (!name)
+			return -ENOEXEC;
+
+		if (!strcmp(name, symbol) &&
+		    sym->st_info & STB_GLOBAL &&
+		    sym->st_size == sizeof(struct IPAModuleInfo)) {
+			found = true;
+			target_symbol = sym;
+			break;
+		}
+	}
+
+	if (!found) {
+		LOG(IPAModule, Error) << "IPAModuleInfo symbol not found";
+		return -ENOEXEC;
+	}
+
+	/* Locate and return data of symbol. */
+	offset = eHdr->e_shoff + target_symbol->st_shndx * eHdr->e_shentsize;
+	sHdr = elfPointer<SecHeader>(map, offset, soSize);
+	if (!sHdr)
+		return -ENOEXEC;
+	offset = sHdr->sh_offset + (target_symbol->st_value - sHdr->sh_addr);
+	char *data = elfPointer<char>(map, offset, soSize);
+	if (!data)
+		return -ENOEXEC;
+	memcpy(dst, data, size);
+
+	return 0;
+}
+
+int IPAModule::loadIPAModuleInfo(const char *libPath, const char *symbol)
+{
+	int fd = open(libPath, O_RDONLY);
+	if (fd < 0) {
+		LOG(IPAModule, Error) << "Failed to open IPA library: "
+				      << strerror(errno);
+		return fd;
+	}
+
+	int ret = loadELFIdent(fd);
+	if (ret)
+		goto close;
+
+	size_t soSize;
+	char *map;
+	struct stat st;
+	ret = fstat(fd, &st);
+	if (ret < 0)
+		goto close;
+	soSize = st.st_size;
+	map = static_cast<char *>(mmap(NULL, soSize, PROT_READ,
+				       MAP_PRIVATE, fd, 0));
+	if (map == MAP_FAILED) {
+		ret = -errno;
+		goto close;
+	}
+
+	if (bitClass_ == ELFCLASS32)
+		ret = loadSymbol<Elf32_Ehdr, Elf32_Shdr, Elf32_Sym>
+				(&info_, sizeof(info_), map, soSize, symbol);
+	else if (bitClass_ == ELFCLASS64)
+		ret = loadSymbol<Elf64_Ehdr, Elf64_Shdr, Elf64_Sym>
+				(&info_, sizeof(info_), map, soSize, symbol);
+	if (ret)
+		goto unmap;
+
+unmap:
+	munmap(map, soSize);
+close:
+	close(fd);
+	return ret;
+}
+
+/**
+ * \brief Check if construction of the IPAModule instance succeeded
+ *
+ * \return true if the constructor succeeded with no errors, false otherwise
+ */
+
+bool IPAModule::isValid() const
+{
+	return loaded_;
+}
+
+/**
+ * \brief Get the loaded IPAModuleInfo
+ *
+ * Check isValid() before calling this.
+ *
+ * \return IPAModuleInfo
+ */
+
+struct IPAModuleInfo const &IPAModule::IPAModuleInfo() const
+{
+	return info_;
+}
+
+} /* namespace libcamera */
diff --git a/src/libcamera/meson.build b/src/libcamera/meson.build
index 8796f49..e5b48f2 100644
--- a/src/libcamera/meson.build
+++ b/src/libcamera/meson.build
@@ -10,6 +10,7 @@  libcamera_sources = files([
     'event_notifier.cpp',
     'formats.cpp',
     'geometry.cpp',
+    'ipa_module.cpp',
     'log.cpp',
     'media_device.cpp',
     'media_object.cpp',
@@ -31,6 +32,7 @@  libcamera_headers = files([
     'include/device_enumerator_udev.h',
     'include/event_dispatcher_poll.h',
     'include/formats.h',
+    'include/ipa_module.h',
     'include/log.h',
     'include/media_device.h',
     'include/media_object.h',