[libcamera-devel] lc-compliance: simple_capture: Free Requests properly
diff mbox series

Message ID 20221129110005.4067748-1-paul.elder@ideasonboard.com
State Accepted
Commit 9a913eb9107483fcea01bd0e82c21e3733a5ddb2
Headers show
Series
  • [libcamera-devel] lc-compliance: simple_capture: Free Requests properly
Related show

Commit Message

Paul Elder Nov. 29, 2022, 11 a.m. UTC
In the simple capture tests, in the capture functions, the Requests were
auto-deallocated by the function going out of scope after the test
completed. However, before the end of the scope, the Framebuffers that
the Requests referred to were manually freed. Thus when the Requests
were deallocated, they tried to cancel their Framebuffers, which involve
setting the status to cancelled, which obviously causes a segfault
because the Framebuffers had already been freed.

Fix this by moving the list of Requests to a member variable and
deallocating them before deallocating the Framebuffers at stop() time.

Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
---
 src/apps/lc-compliance/simple_capture.cpp | 7 +++----
 src/apps/lc-compliance/simple_capture.h   | 1 +
 2 files changed, 4 insertions(+), 4 deletions(-)

Comments

Kieran Bingham Nov. 29, 2022, 5:39 p.m. UTC | #1
Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> In the simple capture tests, in the capture functions, the Requests were
> auto-deallocated by the function going out of scope after the test
> completed. However, before the end of the scope, the Framebuffers that
> the Requests referred to were manually freed. Thus when the Requests
> were deallocated, they tried to cancel their Framebuffers, which involve
> setting the status to cancelled, which obviously causes a segfault
> because the Framebuffers had already been freed.

This worries me that we have a path that could let application cause a
segfault in libcamera. That's not good.

> Fix this by moving the list of Requests to a member variable and
> deallocating them before deallocating the Framebuffers at stop() time.

Fixing this here seems reasonable, but I'm concerned that we now need a
unit test in libcamera unit tests which ensure or validate the lifetimes
of freeing buffers/requests.

If I understand it, that means that we have a lifetime management issue
still between the Request and the Buffer object that may be associated
with it.

Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>

But I am not sure we should consider the underlying issue resolved yet.
I feel like the crash in lc-compliance is simply telling us we have a
fault with our API.

The question is if we should still fix lc-compliance here or not - or if
we should keep it 'broken' so that we can validate whatever fix we do
internally in libcamera?

--
Kieran


> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> ---
>  src/apps/lc-compliance/simple_capture.cpp | 7 +++----
>  src/apps/lc-compliance/simple_capture.h   | 1 +
>  2 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> index ab5cb35c..cf4d7cf3 100644
> --- a/src/apps/lc-compliance/simple_capture.cpp
> +++ b/src/apps/lc-compliance/simple_capture.cpp
> @@ -65,6 +65,7 @@ void SimpleCapture::stop()
>         camera_->requestCompleted.disconnect(this);
>  
>         Stream *stream = config_->at(0).stream();
> +       requests_.clear();
>         allocator_->free(stream);
>  }
>  
> @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>         captureLimit_ = numRequests;
>  
>         /* Queue the recommended number of reqeuests. */
> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>                 std::unique_ptr<Request> request = camera_->createRequest();
>                 ASSERT_TRUE(request) << "Can't create request";
> @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>  
>                 ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
>  
> -               requests.push_back(std::move(request));
> +               requests_.push_back(std::move(request));
>         }
>  
>         /* Run capture session. */
> @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>         captureLimit_ = numRequests;
>  
>         /* Queue the recommended number of reqeuests. */
> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>                 std::unique_ptr<Request> request = camera_->createRequest();
>                 ASSERT_TRUE(request) << "Can't create request";
> @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>  
>                 ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
>  
> -               requests.push_back(std::move(request));
> +               requests_.push_back(std::move(request));
>         }
>  
>         /* Run capture session. */
> diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> index fd9d2a97..2911d601 100644
> --- a/src/apps/lc-compliance/simple_capture.h
> +++ b/src/apps/lc-compliance/simple_capture.h
> @@ -32,6 +32,7 @@ protected:
>         std::shared_ptr<libcamera::Camera> camera_;
>         std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
>         std::unique_ptr<libcamera::CameraConfiguration> config_;
> +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
>  };
>  
>  class SimpleCaptureBalanced : public SimpleCapture
> -- 
> 2.35.1
>
Umang Jain Nov. 30, 2022, 10:35 a.m. UTC | #2
Hi Kieran,

On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
> Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
>> In the simple capture tests, in the capture functions, the Requests were
>> auto-deallocated by the function going out of scope after the test
>> completed. However, before the end of the scope, the Framebuffers that
>> the Requests referred to were manually freed. Thus when the Requests
>> were deallocated, they tried to cancel their Framebuffers, which involve
>> setting the status to cancelled, which obviously causes a segfault
>> because the Framebuffers had already been freed.
> This worries me that we have a path that could let application cause a
> segfault in libcamera. That's not good.
>
>> Fix this by moving the list of Requests to a member variable and
>> deallocating them before deallocating the Framebuffers at stop() time.
> Fixing this here seems reasonable, but I'm concerned that we now need a
> unit test in libcamera unit tests which ensure or validate the lifetimes
> of freeing buffers/requests.
>
> If I understand it, that means that we have a lifetime management issue
> still between the Request and the Buffer object that may be associated
> with it.

If the buffers to be associated with the Requests are allocated and 
mis-managed by the application themselves - how would that be address by 
libcamera-core?

I am reading the Request->addBuffer() documentation and clearly mentioned:

  *
  * A reference to the buffer is stored in the request. The caller is 
responsible
  * for ensuring that the buffer will remain valid until the request 
complete
  * callback is called.
*

In this case the buffers from manually freed before the request got 
completed (via set 'cancelled' status and completeRequest() codepath in 
PIpelineHandler::Stop()). So that's not the right thing to do / happen.

>
> Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>
> But I am not sure we should consider the underlying issue resolved yet.
> I feel like the crash in lc-compliance is simply telling us we have a
> fault with our API.
>
> The question is if we should still fix lc-compliance here or not - or if
> we should keep it 'broken' so that we can validate whatever fix we do
> internally in libcamera?
>
> --
> Kieran
>
>
>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
>> ---
>>   src/apps/lc-compliance/simple_capture.cpp | 7 +++----
>>   src/apps/lc-compliance/simple_capture.h   | 1 +
>>   2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
>> index ab5cb35c..cf4d7cf3 100644
>> --- a/src/apps/lc-compliance/simple_capture.cpp
>> +++ b/src/apps/lc-compliance/simple_capture.cpp
>> @@ -65,6 +65,7 @@ void SimpleCapture::stop()
>>          camera_->requestCompleted.disconnect(this);
>>   
>>          Stream *stream = config_->at(0).stream();
>> +       requests_.clear();
>>          allocator_->free(stream);
>>   }
>>   
>> @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>>          captureLimit_ = numRequests;
>>   
>>          /* Queue the recommended number of reqeuests. */
>> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>>          for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>>                  std::unique_ptr<Request> request = camera_->createRequest();
>>                  ASSERT_TRUE(request) << "Can't create request";
>> @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>>   
>>                  ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
>>   
>> -               requests.push_back(std::move(request));
>> +               requests_.push_back(std::move(request));
>>          }
>>   
>>          /* Run capture session. */
>> @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>>          captureLimit_ = numRequests;
>>   
>>          /* Queue the recommended number of reqeuests. */
>> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>>          for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>>                  std::unique_ptr<Request> request = camera_->createRequest();
>>                  ASSERT_TRUE(request) << "Can't create request";
>> @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>>   
>>                  ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
>>   
>> -               requests.push_back(std::move(request));
>> +               requests_.push_back(std::move(request));
>>          }
>>   
>>          /* Run capture session. */
>> diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
>> index fd9d2a97..2911d601 100644
>> --- a/src/apps/lc-compliance/simple_capture.h
>> +++ b/src/apps/lc-compliance/simple_capture.h
>> @@ -32,6 +32,7 @@ protected:
>>          std::shared_ptr<libcamera::Camera> camera_;
>>          std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
>>          std::unique_ptr<libcamera::CameraConfiguration> config_;
>> +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
>>   };
>>   
>>   class SimpleCaptureBalanced : public SimpleCapture
>> -- 
>> 2.35.1
>>
Kieran Bingham Nov. 30, 2022, 10:53 a.m. UTC | #3
Quoting Umang Jain (2022-11-30 10:35:47)
> Hi Kieran,
> 
> On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
> > Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> >> In the simple capture tests, in the capture functions, the Requests were
> >> auto-deallocated by the function going out of scope after the test
> >> completed. However, before the end of the scope, the Framebuffers that
> >> the Requests referred to were manually freed. Thus when the Requests
> >> were deallocated, they tried to cancel their Framebuffers, which involve
> >> setting the status to cancelled, which obviously causes a segfault
> >> because the Framebuffers had already been freed.
> > This worries me that we have a path that could let application cause a
> > segfault in libcamera. That's not good.
> >
> >> Fix this by moving the list of Requests to a member variable and
> >> deallocating them before deallocating the Framebuffers at stop() time.
> > Fixing this here seems reasonable, but I'm concerned that we now need a
> > unit test in libcamera unit tests which ensure or validate the lifetimes
> > of freeing buffers/requests.
> >
> > If I understand it, that means that we have a lifetime management issue
> > still between the Request and the Buffer object that may be associated
> > with it.
> 
> If the buffers to be associated with the Requests are allocated and 
> mis-managed by the application themselves - how would that be address by 
> libcamera-core?
> 
> I am reading the Request->addBuffer() documentation and clearly mentioned:
> 
>   *
>   * A reference to the buffer is stored in the request. The caller is 
> responsible
>   * for ensuring that the buffer will remain valid until the request 
> complete
>   * callback is called.
> *
> 
> In this case the buffers from manually freed before the request got 
> completed (via set 'cancelled' status and completeRequest() codepath in 
> PIpelineHandler::Stop()). So that's not the right thing to do / happen.

aha, ok - if it's documented, then we're probably good with this for
now.

I do wonder if we should have some reverse association such that when
you add a buffer to a request, the buffer knows which request it's
contained in, and can 'remove' itself when it's freed... but ... that
makes me wonder if we could legitimately have a single buffer in
multiple requests which would break that possibility ... so maybe we're
fine with just the documentation as we have it.

--
Kieran


> 
> >
> > Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> >
> > But I am not sure we should consider the underlying issue resolved yet.
> > I feel like the crash in lc-compliance is simply telling us we have a
> > fault with our API.
> >
> > The question is if we should still fix lc-compliance here or not - or if
> > we should keep it 'broken' so that we can validate whatever fix we do
> > internally in libcamera?
> >
> > --
> > Kieran
> >
> >
> >> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> >> ---
> >>   src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> >>   src/apps/lc-compliance/simple_capture.h   | 1 +
> >>   2 files changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> >> index ab5cb35c..cf4d7cf3 100644
> >> --- a/src/apps/lc-compliance/simple_capture.cpp
> >> +++ b/src/apps/lc-compliance/simple_capture.cpp
> >> @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> >>          camera_->requestCompleted.disconnect(this);
> >>   
> >>          Stream *stream = config_->at(0).stream();
> >> +       requests_.clear();
> >>          allocator_->free(stream);
> >>   }
> >>   
> >> @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> >>          captureLimit_ = numRequests;
> >>   
> >>          /* Queue the recommended number of reqeuests. */
> >> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> >>          for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> >>                  std::unique_ptr<Request> request = camera_->createRequest();
> >>                  ASSERT_TRUE(request) << "Can't create request";
> >> @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> >>   
> >>                  ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> >>   
> >> -               requests.push_back(std::move(request));
> >> +               requests_.push_back(std::move(request));
> >>          }
> >>   
> >>          /* Run capture session. */
> >> @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> >>          captureLimit_ = numRequests;
> >>   
> >>          /* Queue the recommended number of reqeuests. */
> >> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> >>          for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> >>                  std::unique_ptr<Request> request = camera_->createRequest();
> >>                  ASSERT_TRUE(request) << "Can't create request";
> >> @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> >>   
> >>                  ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> >>   
> >> -               requests.push_back(std::move(request));
> >> +               requests_.push_back(std::move(request));
> >>          }
> >>   
> >>          /* Run capture session. */
> >> diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> >> index fd9d2a97..2911d601 100644
> >> --- a/src/apps/lc-compliance/simple_capture.h
> >> +++ b/src/apps/lc-compliance/simple_capture.h
> >> @@ -32,6 +32,7 @@ protected:
> >>          std::shared_ptr<libcamera::Camera> camera_;
> >>          std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> >>          std::unique_ptr<libcamera::CameraConfiguration> config_;
> >> +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> >>   };
> >>   
> >>   class SimpleCaptureBalanced : public SimpleCapture
> >> -- 
> >> 2.35.1
> >>
>
Umang Jain Nov. 30, 2022, 11:24 a.m. UTC | #4
Hi Kieran,

On 11/30/22 6:53 PM, Kieran Bingham wrote:
> Quoting Umang Jain (2022-11-30 10:35:47)
>> Hi Kieran,
>>
>> On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
>>> Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
>>>> In the simple capture tests, in the capture functions, the Requests were
>>>> auto-deallocated by the function going out of scope after the test
>>>> completed. However, before the end of the scope, the Framebuffers that
>>>> the Requests referred to were manually freed. Thus when the Requests
>>>> were deallocated, they tried to cancel their Framebuffers, which involve
>>>> setting the status to cancelled, which obviously causes a segfault
>>>> because the Framebuffers had already been freed.
>>> This worries me that we have a path that could let application cause a
>>> segfault in libcamera. That's not good.
>>>
>>>> Fix this by moving the list of Requests to a member variable and
>>>> deallocating them before deallocating the Framebuffers at stop() time.
>>> Fixing this here seems reasonable, but I'm concerned that we now need a
>>> unit test in libcamera unit tests which ensure or validate the lifetimes
>>> of freeing buffers/requests.
>>>
>>> If I understand it, that means that we have a lifetime management issue
>>> still between the Request and the Buffer object that may be associated
>>> with it.
>> If the buffers to be associated with the Requests are allocated and
>> mis-managed by the application themselves - how would that be address by
>> libcamera-core?
>>
>> I am reading the Request->addBuffer() documentation and clearly mentioned:
>>
>>    *
>>    * A reference to the buffer is stored in the request. The caller is
>> responsible
>>    * for ensuring that the buffer will remain valid until the request
>> complete
>>    * callback is called.
>> *
>>
>> In this case the buffers from manually freed before the request got
>> completed (via set 'cancelled' status and completeRequest() codepath in
>> PIpelineHandler::Stop()). So that's not the right thing to do / happen.
> aha, ok - if it's documented, then we're probably good with this for
> now.
>
> I do wonder if we should have some reverse association such that when
> you add a buffer to a request, the buffer knows which request it's
> contained in, and can 'remove' itself when it's freed... but ... that
> makes me wonder if we could legitimately have a single buffer in
> multiple requests which would break that possibility ... so maybe we're
> fine with just the documentation as we have it.

We can enforce a std::shared_ptr <> model (but that's also not 100% 
safety) but not sure how well it'll scale with language bindings in the 
future.

All in all, similar to dangling pointers problem.
>
> --
> Kieran
>
>
>>> Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
>>> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>>> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
>>>
>>> But I am not sure we should consider the underlying issue resolved yet.
>>> I feel like the crash in lc-compliance is simply telling us we have a
>>> fault with our API.
>>>
>>> The question is if we should still fix lc-compliance here or not - or if
>>> we should keep it 'broken' so that we can validate whatever fix we do
>>> internally in libcamera?
>>>
>>> --
>>> Kieran
>>>
>>>
>>>> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
>>>> ---
>>>>    src/apps/lc-compliance/simple_capture.cpp | 7 +++----
>>>>    src/apps/lc-compliance/simple_capture.h   | 1 +
>>>>    2 files changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
>>>> index ab5cb35c..cf4d7cf3 100644
>>>> --- a/src/apps/lc-compliance/simple_capture.cpp
>>>> +++ b/src/apps/lc-compliance/simple_capture.cpp
>>>> @@ -65,6 +65,7 @@ void SimpleCapture::stop()
>>>>           camera_->requestCompleted.disconnect(this);
>>>>    
>>>>           Stream *stream = config_->at(0).stream();
>>>> +       requests_.clear();
>>>>           allocator_->free(stream);
>>>>    }
>>>>    
>>>> @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>>>>           captureLimit_ = numRequests;
>>>>    
>>>>           /* Queue the recommended number of reqeuests. */
>>>> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>>>>           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>>>>                   std::unique_ptr<Request> request = camera_->createRequest();
>>>>                   ASSERT_TRUE(request) << "Can't create request";
>>>> @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
>>>>    
>>>>                   ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
>>>>    
>>>> -               requests.push_back(std::move(request));
>>>> +               requests_.push_back(std::move(request));
>>>>           }
>>>>    
>>>>           /* Run capture session. */
>>>> @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>>>>           captureLimit_ = numRequests;
>>>>    
>>>>           /* Queue the recommended number of reqeuests. */
>>>> -       std::vector<std::unique_ptr<libcamera::Request>> requests;
>>>>           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
>>>>                   std::unique_ptr<Request> request = camera_->createRequest();
>>>>                   ASSERT_TRUE(request) << "Can't create request";
>>>> @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
>>>>    
>>>>                   ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
>>>>    
>>>> -               requests.push_back(std::move(request));
>>>> +               requests_.push_back(std::move(request));
>>>>           }
>>>>    
>>>>           /* Run capture session. */
>>>> diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
>>>> index fd9d2a97..2911d601 100644
>>>> --- a/src/apps/lc-compliance/simple_capture.h
>>>> +++ b/src/apps/lc-compliance/simple_capture.h
>>>> @@ -32,6 +32,7 @@ protected:
>>>>           std::shared_ptr<libcamera::Camera> camera_;
>>>>           std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
>>>>           std::unique_ptr<libcamera::CameraConfiguration> config_;
>>>> +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
>>>>    };
>>>>    
>>>>    class SimpleCaptureBalanced : public SimpleCapture
>>>> -- 
>>>> 2.35.1
>>>>
Paul Elder Nov. 30, 2022, 11:27 a.m. UTC | #5
On Tue, Nov 29, 2022 at 05:39:16PM +0000, Kieran Bingham wrote:
> Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> > In the simple capture tests, in the capture functions, the Requests were
> > auto-deallocated by the function going out of scope after the test
> > completed. However, before the end of the scope, the Framebuffers that
> > the Requests referred to were manually freed. Thus when the Requests
> > were deallocated, they tried to cancel their Framebuffers, which involve
> > setting the status to cancelled, which obviously causes a segfault
> > because the Framebuffers had already been freed.
> 
> This worries me that we have a path that could let application cause a
> segfault in libcamera. That's not good.

That's true.

Should we fix this instead in Request then? By checking that
FramBuffer::_d() == nullptr in Request::Private::doCancelRequest() like
rsglobal does in the github issue [1]?

[1] https://github.com/kbingham/libcamera/issues/60#issue-1450103563

> 
> > Fix this by moving the list of Requests to a member variable and
> > deallocating them before deallocating the Framebuffers at stop() time.
> 
> Fixing this here seems reasonable, but I'm concerned that we now need a
> unit test in libcamera unit tests which ensure or validate the lifetimes
> of freeing buffers/requests.

Yeah that's true, this should probably be split into another compliance
test in itself.

> 
> If I understand it, that means that we have a lifetime management issue
> still between the Request and the Buffer object that may be associated
> with it.
> 
> Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> 
> But I am not sure we should consider the underlying issue resolved yet.
> I feel like the crash in lc-compliance is simply telling us we have a
> fault with our API.
> 
> The question is if we should still fix lc-compliance here or not - or if
> we should keep it 'broken' so that we can validate whatever fix we do
> internally in libcamera?

I think split this specific issue into a separate smaller test. I
don't think there's a need to test this specific issue in all
combinations that we test currently in the simple capture test.


Thanks,

Paul

> 
> 
> > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > ---
> >  src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> >  src/apps/lc-compliance/simple_capture.h   | 1 +
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> > index ab5cb35c..cf4d7cf3 100644
> > --- a/src/apps/lc-compliance/simple_capture.cpp
> > +++ b/src/apps/lc-compliance/simple_capture.cpp
> > @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> >         camera_->requestCompleted.disconnect(this);
> >  
> >         Stream *stream = config_->at(0).stream();
> > +       requests_.clear();
> >         allocator_->free(stream);
> >  }
> >  
> > @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> >         captureLimit_ = numRequests;
> >  
> >         /* Queue the recommended number of reqeuests. */
> > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> >         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> >                 std::unique_ptr<Request> request = camera_->createRequest();
> >                 ASSERT_TRUE(request) << "Can't create request";
> > @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> >  
> >                 ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> >  
> > -               requests.push_back(std::move(request));
> > +               requests_.push_back(std::move(request));
> >         }
> >  
> >         /* Run capture session. */
> > @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> >         captureLimit_ = numRequests;
> >  
> >         /* Queue the recommended number of reqeuests. */
> > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> >         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> >                 std::unique_ptr<Request> request = camera_->createRequest();
> >                 ASSERT_TRUE(request) << "Can't create request";
> > @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> >  
> >                 ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> >  
> > -               requests.push_back(std::move(request));
> > +               requests_.push_back(std::move(request));
> >         }
> >  
> >         /* Run capture session. */
> > diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> > index fd9d2a97..2911d601 100644
> > --- a/src/apps/lc-compliance/simple_capture.h
> > +++ b/src/apps/lc-compliance/simple_capture.h
> > @@ -32,6 +32,7 @@ protected:
> >         std::shared_ptr<libcamera::Camera> camera_;
> >         std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> >         std::unique_ptr<libcamera::CameraConfiguration> config_;
> > +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> >  };
> >  
> >  class SimpleCaptureBalanced : public SimpleCapture
> > -- 
> > 2.35.1
> >
Paul Elder Nov. 30, 2022, 11:31 a.m. UTC | #6
On Wed, Nov 30, 2022 at 07:24:52PM +0800, Umang Jain wrote:
> Hi Kieran,
> 
> On 11/30/22 6:53 PM, Kieran Bingham wrote:
> > Quoting Umang Jain (2022-11-30 10:35:47)
> > > Hi Kieran,
> > > 
> > > On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
> > > > Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> > > > > In the simple capture tests, in the capture functions, the Requests were
> > > > > auto-deallocated by the function going out of scope after the test
> > > > > completed. However, before the end of the scope, the Framebuffers that
> > > > > the Requests referred to were manually freed. Thus when the Requests
> > > > > were deallocated, they tried to cancel their Framebuffers, which involve
> > > > > setting the status to cancelled, which obviously causes a segfault
> > > > > because the Framebuffers had already been freed.
> > > > This worries me that we have a path that could let application cause a
> > > > segfault in libcamera. That's not good.
> > > > 
> > > > > Fix this by moving the list of Requests to a member variable and
> > > > > deallocating them before deallocating the Framebuffers at stop() time.
> > > > Fixing this here seems reasonable, but I'm concerned that we now need a
> > > > unit test in libcamera unit tests which ensure or validate the lifetimes
> > > > of freeing buffers/requests.
> > > > 
> > > > If I understand it, that means that we have a lifetime management issue
> > > > still between the Request and the Buffer object that may be associated
> > > > with it.
> > > If the buffers to be associated with the Requests are allocated and
> > > mis-managed by the application themselves - how would that be address by
> > > libcamera-core?
> > > 
> > > I am reading the Request->addBuffer() documentation and clearly mentioned:
> > > 
> > >    *
> > >    * A reference to the buffer is stored in the request. The caller is
> > > responsible
> > >    * for ensuring that the buffer will remain valid until the request
> > > complete
> > >    * callback is called.
> > > *
> > > 
> > > In this case the buffers from manually freed before the request got
> > > completed (via set 'cancelled' status and completeRequest() codepath in
> > > PIpelineHandler::Stop()). So that's not the right thing to do / happen.
> > aha, ok - if it's documented, then we're probably good with this for
> > now.
> > 
> > I do wonder if we should have some reverse association such that when
> > you add a buffer to a request, the buffer knows which request it's
> > contained in, and can 'remove' itself when it's freed... but ... that
> > makes me wonder if we could legitimately have a single buffer in
> > multiple requests which would break that possibility ... so maybe we're
> > fine with just the documentation as we have it.
> 
> We can enforce a std::shared_ptr <> model (but that's also not 100% safety)
> but not sure how well it'll scale with language bindings in the future.
> 
> All in all, similar to dangling pointers problem.

Oops, I didn't notice this thread in the meantime.

So this patch is good on its own and I don't have to do anything extra
then? :)


Paul

> > 
> > --
> > Kieran
> > 
> > 
> > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> > > > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > 
> > > > But I am not sure we should consider the underlying issue resolved yet.
> > > > I feel like the crash in lc-compliance is simply telling us we have a
> > > > fault with our API.
> > > > 
> > > > The question is if we should still fix lc-compliance here or not - or if
> > > > we should keep it 'broken' so that we can validate whatever fix we do
> > > > internally in libcamera?
> > > > 
> > > > --
> > > > Kieran
> > > > 
> > > > 
> > > > > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > > > > ---
> > > > >    src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> > > > >    src/apps/lc-compliance/simple_capture.h   | 1 +
> > > > >    2 files changed, 4 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> > > > > index ab5cb35c..cf4d7cf3 100644
> > > > > --- a/src/apps/lc-compliance/simple_capture.cpp
> > > > > +++ b/src/apps/lc-compliance/simple_capture.cpp
> > > > > @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> > > > >           camera_->requestCompleted.disconnect(this);
> > > > >           Stream *stream = config_->at(0).stream();
> > > > > +       requests_.clear();
> > > > >           allocator_->free(stream);
> > > > >    }
> > > > > @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > >           captureLimit_ = numRequests;
> > > > >           /* Queue the recommended number of reqeuests. */
> > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > >                   ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > -               requests.push_back(std::move(request));
> > > > > +               requests_.push_back(std::move(request));
> > > > >           }
> > > > >           /* Run capture session. */
> > > > > @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > >           captureLimit_ = numRequests;
> > > > >           /* Queue the recommended number of reqeuests. */
> > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > >                   ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > -               requests.push_back(std::move(request));
> > > > > +               requests_.push_back(std::move(request));
> > > > >           }
> > > > >           /* Run capture session. */
> > > > > diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> > > > > index fd9d2a97..2911d601 100644
> > > > > --- a/src/apps/lc-compliance/simple_capture.h
> > > > > +++ b/src/apps/lc-compliance/simple_capture.h
> > > > > @@ -32,6 +32,7 @@ protected:
> > > > >           std::shared_ptr<libcamera::Camera> camera_;
> > > > >           std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> > > > >           std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > > > +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> > > > >    };
> > > > >    class SimpleCaptureBalanced : public SimpleCapture
> > > > > -- 
> > > > > 2.35.1
> > > > > 
>
Kieran Bingham Nov. 30, 2022, 11:35 a.m. UTC | #7
Quoting Paul Elder (2022-11-30 11:27:40)
> On Tue, Nov 29, 2022 at 05:39:16PM +0000, Kieran Bingham wrote:
> > Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> > > In the simple capture tests, in the capture functions, the Requests were
> > > auto-deallocated by the function going out of scope after the test
> > > completed. However, before the end of the scope, the Framebuffers that
> > > the Requests referred to were manually freed. Thus when the Requests
> > > were deallocated, they tried to cancel their Framebuffers, which involve
> > > setting the status to cancelled, which obviously causes a segfault
> > > because the Framebuffers had already been freed.
> > 
> > This worries me that we have a path that could let application cause a
> > segfault in libcamera. That's not good.
> 
> That's true.
> 
> Should we fix this instead in Request then? By checking that
> FramBuffer::_d() == nullptr in Request::Private::doCancelRequest() like
> rsglobal does in the github issue [1]?

As this is clearly a situation that applciations could get themselves into -
perhaps we need to have 'something' but with a clear message to tellng
them they've messed up ?


 
> [1] https://github.com/kbingham/libcamera/issues/60#issue-1450103563
> 
> > 
> > > Fix this by moving the list of Requests to a member variable and
> > > deallocating them before deallocating the Framebuffers at stop() time.
> > 
> > Fixing this here seems reasonable, but I'm concerned that we now need a
> > unit test in libcamera unit tests which ensure or validate the lifetimes
> > of freeing buffers/requests.
> 
> Yeah that's true, this should probably be split into another compliance
> test in itself.
> 
> > 
> > If I understand it, that means that we have a lifetime management issue
> > still between the Request and the Buffer object that may be associated
> > with it.
> > 
> > Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > 
> > But I am not sure we should consider the underlying issue resolved yet.
> > I feel like the crash in lc-compliance is simply telling us we have a
> > fault with our API.
> > 
> > The question is if we should still fix lc-compliance here or not - or if
> > we should keep it 'broken' so that we can validate whatever fix we do
> > internally in libcamera?
> 
> I think split this specific issue into a separate smaller test. I
> don't think there's a need to test this specific issue in all
> combinations that we test currently in the simple capture test.

Absolutely. It's just about figuring out what the applications could be
likely to get wrong, and if /we've/ got it wrong ... then I guess
applications could.

--
Kieran


> 
> 
> Thanks,
> 
> Paul
> 
> > 
> > 
> > > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > > ---
> > >  src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> > >  src/apps/lc-compliance/simple_capture.h   | 1 +
> > >  2 files changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> > > index ab5cb35c..cf4d7cf3 100644
> > > --- a/src/apps/lc-compliance/simple_capture.cpp
> > > +++ b/src/apps/lc-compliance/simple_capture.cpp
> > > @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> > >         camera_->requestCompleted.disconnect(this);
> > >  
> > >         Stream *stream = config_->at(0).stream();
> > > +       requests_.clear();
> > >         allocator_->free(stream);
> > >  }
> > >  
> > > @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > >         captureLimit_ = numRequests;
> > >  
> > >         /* Queue the recommended number of reqeuests. */
> > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > >         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > >                 std::unique_ptr<Request> request = camera_->createRequest();
> > >                 ASSERT_TRUE(request) << "Can't create request";
> > > @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > >  
> > >                 ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> > >  
> > > -               requests.push_back(std::move(request));
> > > +               requests_.push_back(std::move(request));
> > >         }
> > >  
> > >         /* Run capture session. */
> > > @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > >         captureLimit_ = numRequests;
> > >  
> > >         /* Queue the recommended number of reqeuests. */
> > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > >         for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > >                 std::unique_ptr<Request> request = camera_->createRequest();
> > >                 ASSERT_TRUE(request) << "Can't create request";
> > > @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > >  
> > >                 ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> > >  
> > > -               requests.push_back(std::move(request));
> > > +               requests_.push_back(std::move(request));
> > >         }
> > >  
> > >         /* Run capture session. */
> > > diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> > > index fd9d2a97..2911d601 100644
> > > --- a/src/apps/lc-compliance/simple_capture.h
> > > +++ b/src/apps/lc-compliance/simple_capture.h
> > > @@ -32,6 +32,7 @@ protected:
> > >         std::shared_ptr<libcamera::Camera> camera_;
> > >         std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> > >         std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> > >  };
> > >  
> > >  class SimpleCaptureBalanced : public SimpleCapture
> > > -- 
> > > 2.35.1
> > >
Kieran Bingham Nov. 30, 2022, 11:36 a.m. UTC | #8
Quoting Paul Elder (2022-11-30 11:31:06)
> On Wed, Nov 30, 2022 at 07:24:52PM +0800, Umang Jain wrote:
> > Hi Kieran,
> > 
> > On 11/30/22 6:53 PM, Kieran Bingham wrote:
> > > Quoting Umang Jain (2022-11-30 10:35:47)
> > > > Hi Kieran,
> > > > 
> > > > On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
> > > > > Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> > > > > > In the simple capture tests, in the capture functions, the Requests were
> > > > > > auto-deallocated by the function going out of scope after the test
> > > > > > completed. However, before the end of the scope, the Framebuffers that
> > > > > > the Requests referred to were manually freed. Thus when the Requests
> > > > > > were deallocated, they tried to cancel their Framebuffers, which involve
> > > > > > setting the status to cancelled, which obviously causes a segfault
> > > > > > because the Framebuffers had already been freed.
> > > > > This worries me that we have a path that could let application cause a
> > > > > segfault in libcamera. That's not good.
> > > > > 
> > > > > > Fix this by moving the list of Requests to a member variable and
> > > > > > deallocating them before deallocating the Framebuffers at stop() time.
> > > > > Fixing this here seems reasonable, but I'm concerned that we now need a
> > > > > unit test in libcamera unit tests which ensure or validate the lifetimes
> > > > > of freeing buffers/requests.
> > > > > 
> > > > > If I understand it, that means that we have a lifetime management issue
> > > > > still between the Request and the Buffer object that may be associated
> > > > > with it.
> > > > If the buffers to be associated with the Requests are allocated and
> > > > mis-managed by the application themselves - how would that be address by
> > > > libcamera-core?
> > > > 
> > > > I am reading the Request->addBuffer() documentation and clearly mentioned:
> > > > 
> > > >    *
> > > >    * A reference to the buffer is stored in the request. The caller is
> > > > responsible
> > > >    * for ensuring that the buffer will remain valid until the request
> > > > complete
> > > >    * callback is called.
> > > > *
> > > > 
> > > > In this case the buffers from manually freed before the request got
> > > > completed (via set 'cancelled' status and completeRequest() codepath in
> > > > PIpelineHandler::Stop()). So that's not the right thing to do / happen.
> > > aha, ok - if it's documented, then we're probably good with this for
> > > now.
> > > 
> > > I do wonder if we should have some reverse association such that when
> > > you add a buffer to a request, the buffer knows which request it's
> > > contained in, and can 'remove' itself when it's freed... but ... that
> > > makes me wonder if we could legitimately have a single buffer in
> > > multiple requests which would break that possibility ... so maybe we're
> > > fine with just the documentation as we have it.
> > 
> > We can enforce a std::shared_ptr <> model (but that's also not 100% safety)
> > but not sure how well it'll scale with language bindings in the future.
> > 
> > All in all, similar to dangling pointers problem.
> 
> Oops, I didn't notice this thread in the meantime.
> 
> So this patch is good on its own and I don't have to do anything extra
> then? :)

That depends on how likely it is that applications could /cause/ this
problem?

--
Kieran


> 
> 
> Paul
> 
> > > 
> > > --
> > > Kieran
> > > 
> > > 
> > > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> > > > > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > > Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > > 
> > > > > But I am not sure we should consider the underlying issue resolved yet.
> > > > > I feel like the crash in lc-compliance is simply telling us we have a
> > > > > fault with our API.
> > > > > 
> > > > > The question is if we should still fix lc-compliance here or not - or if
> > > > > we should keep it 'broken' so that we can validate whatever fix we do
> > > > > internally in libcamera?
> > > > > 
> > > > > --
> > > > > Kieran
> > > > > 
> > > > > 
> > > > > > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > > > > > ---
> > > > > >    src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> > > > > >    src/apps/lc-compliance/simple_capture.h   | 1 +
> > > > > >    2 files changed, 4 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> > > > > > index ab5cb35c..cf4d7cf3 100644
> > > > > > --- a/src/apps/lc-compliance/simple_capture.cpp
> > > > > > +++ b/src/apps/lc-compliance/simple_capture.cpp
> > > > > > @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> > > > > >           camera_->requestCompleted.disconnect(this);
> > > > > >           Stream *stream = config_->at(0).stream();
> > > > > > +       requests_.clear();
> > > > > >           allocator_->free(stream);
> > > > > >    }
> > > > > > @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > > >           captureLimit_ = numRequests;
> > > > > >           /* Queue the recommended number of reqeuests. */
> > > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > > @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > > >                   ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > > -               requests.push_back(std::move(request));
> > > > > > +               requests_.push_back(std::move(request));
> > > > > >           }
> > > > > >           /* Run capture session. */
> > > > > > @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > > >           captureLimit_ = numRequests;
> > > > > >           /* Queue the recommended number of reqeuests. */
> > > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > > @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > > >                   ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > > -               requests.push_back(std::move(request));
> > > > > > +               requests_.push_back(std::move(request));
> > > > > >           }
> > > > > >           /* Run capture session. */
> > > > > > diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> > > > > > index fd9d2a97..2911d601 100644
> > > > > > --- a/src/apps/lc-compliance/simple_capture.h
> > > > > > +++ b/src/apps/lc-compliance/simple_capture.h
> > > > > > @@ -32,6 +32,7 @@ protected:
> > > > > >           std::shared_ptr<libcamera::Camera> camera_;
> > > > > >           std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> > > > > >           std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > > > > +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> > > > > >    };
> > > > > >    class SimpleCaptureBalanced : public SimpleCapture
> > > > > > -- 
> > > > > > 2.35.1
> > > > > > 
> >
Nícolas F. R. A. Prado Nov. 30, 2022, 5:30 p.m. UTC | #9
On Tue, Nov 29, 2022 at 08:00:05PM +0900, Paul Elder via libcamera-devel wrote:
> In the simple capture tests, in the capture functions, the Requests were
> auto-deallocated by the function going out of scope after the test
> completed. However, before the end of the scope, the Framebuffers that
> the Requests referred to were manually freed. Thus when the Requests
> were deallocated, they tried to cancel their Framebuffers, which involve
> setting the status to cancelled, which obviously causes a segfault
> because the Framebuffers had already been freed.
> 
> Fix this by moving the list of Requests to a member variable and
> deallocating them before deallocating the Framebuffers at stop() time.
> 
> Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>

I ended up sending almost exactly the same patch [1], so that's kind of a
R-by/T-by tag in itself :), but here they are explicitly:

Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>

Thanks,
Nícolas

[1] https://patchwork.libcamera.org/patch/17918/
Laurent Pinchart Dec. 1, 2022, 1:31 a.m. UTC | #10
Hello,

On Wed, Nov 30, 2022 at 11:36:15AM +0000, Kieran Bingham via libcamera-devel wrote:
> Quoting Paul Elder (2022-11-30 11:31:06)
> > On Wed, Nov 30, 2022 at 07:24:52PM +0800, Umang Jain wrote:
> > > On 11/30/22 6:53 PM, Kieran Bingham wrote:
> > > > Quoting Umang Jain (2022-11-30 10:35:47)
> > > > > On 11/30/22 1:39 AM, Kieran Bingham via libcamera-devel wrote:
> > > > > > Quoting Paul Elder via libcamera-devel (2022-11-29 11:00:05)
> > > > > > > In the simple capture tests, in the capture functions, the Requests were
> > > > > > > auto-deallocated by the function going out of scope after the test
> > > > > > > completed. However, before the end of the scope, the Framebuffers that
> > > > > > > the Requests referred to were manually freed. Thus when the Requests
> > > > > > > were deallocated, they tried to cancel their Framebuffers, which involve
> > > > > > > setting the status to cancelled, which obviously causes a segfault
> > > > > > > because the Framebuffers had already been freed.
> > > > > >
> > > > > > This worries me that we have a path that could let application cause a
> > > > > > segfault in libcamera. That's not good.

Well, if applications really mess it up, it's bound to happen. I do
however agree that in this case it seems possibly too easy for
applications to get it wrong.

> > > > > > > Fix this by moving the list of Requests to a member variable and
> > > > > > > deallocating them before deallocating the Framebuffers at stop() time.
> > > > > >
> > > > > > Fixing this here seems reasonable, but I'm concerned that we now need a
> > > > > > unit test in libcamera unit tests which ensure or validate the lifetimes
> > > > > > of freeing buffers/requests.
> > > > > > 
> > > > > > If I understand it, that means that we have a lifetime management issue
> > > > > > still between the Request and the Buffer object that may be associated
> > > > > > with it.

We have an issue if the application gets it wrong. How would a unit test
help here ?

> > > > > If the buffers to be associated with the Requests are allocated and
> > > > > mis-managed by the application themselves - how would that be address by
> > > > > libcamera-core?
> > > > > 
> > > > > I am reading the Request->addBuffer() documentation and clearly mentioned:
> > > > > 
> > > > >    *
> > > > >    * A reference to the buffer is stored in the request. The caller is responsible
> > > > >    * for ensuring that the buffer will remain valid until the request complete
> > > > >    * callback is called.
> > > > >    *
> > > > > 
> > > > > In this case the buffers from manually freed before the request got
> > > > > completed (via set 'cancelled' status and completeRequest() codepath in
> > > > > PIpelineHandler::Stop()). So that's not the right thing to do / happen.
> > > >
> > > > aha, ok - if it's documented, then we're probably good with this for
> > > > now.
> > > > 
> > > > I do wonder if we should have some reverse association such that when
> > > > you add a buffer to a request, the buffer knows which request it's
> > > > contained in, and can 'remove' itself when it's freed... but ... that
> > > > makes me wonder if we could legitimately have a single buffer in
> > > > multiple requests which would break that possibility ... so maybe we're
> > > > fine with just the documentation as we have it.
> > > 
> > > We can enforce a std::shared_ptr <> model (but that's also not 100% safety)
> > > but not sure how well it'll scale with language bindings in the future.

std::shared_ptr<> are usually painful, and not just for language
bindings, they tend to creep in everywhere when you start using them. I
wouldn't go that route if possible.

> > > All in all, similar to dangling pointers problem.
> > 
> > Oops, I didn't notice this thread in the meantime.
> > 
> > So this patch is good on its own and I don't have to do anything extra
> > then? :)
> 
> That depends on how likely it is that applications could /cause/ this
> problem?

There's something I don't quite get. The requests are destroyed after
stop(). I thus expect all requests to have completed, either normally or
because they have been cancelled, before the stop() function returns.
How comes Request::Private::doCancelRequest(), called from the request
destructor, sees a non-empty pending_ list ?

> > > > > > Bug: https://bugs.libcamera.org/show_bug.cgi?id=171
> > > > > > Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > > > Tested-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
> > > > > > 
> > > > > > But I am not sure we should consider the underlying issue resolved yet.
> > > > > > I feel like the crash in lc-compliance is simply telling us we have a
> > > > > > fault with our API.
> > > > > > 
> > > > > > The question is if we should still fix lc-compliance here or not - or if
> > > > > > we should keep it 'broken' so that we can validate whatever fix we do
> > > > > > internally in libcamera?
> > > > > > 
> > > > > > --
> > > > > > Kieran
> > > > > > 
> > > > > > 
> > > > > > > Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
> > > > > > > ---
> > > > > > >    src/apps/lc-compliance/simple_capture.cpp | 7 +++----
> > > > > > >    src/apps/lc-compliance/simple_capture.h   | 1 +
> > > > > > >    2 files changed, 4 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
> > > > > > > index ab5cb35c..cf4d7cf3 100644
> > > > > > > --- a/src/apps/lc-compliance/simple_capture.cpp
> > > > > > > +++ b/src/apps/lc-compliance/simple_capture.cpp
> > > > > > > @@ -65,6 +65,7 @@ void SimpleCapture::stop()
> > > > > > >           camera_->requestCompleted.disconnect(this);
> > > > > > >           Stream *stream = config_->at(0).stream();
> > > > > > > +       requests_.clear();
> > > > > > >           allocator_->free(stream);
> > > > > > >    }
> > > > > > > @@ -95,7 +96,6 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > > > >           captureLimit_ = numRequests;
> > > > > > >           /* Queue the recommended number of reqeuests. */
> > > > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > > > @@ -104,7 +104,7 @@ void SimpleCaptureBalanced::capture(unsigned int numRequests)
> > > > > > >                   ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > > > -               requests.push_back(std::move(request));
> > > > > > > +               requests_.push_back(std::move(request));
> > > > > > >           }
> > > > > > >           /* Run capture session. */
> > > > > > > @@ -156,7 +156,6 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > > > >           captureLimit_ = numRequests;
> > > > > > >           /* Queue the recommended number of reqeuests. */
> > > > > > > -       std::vector<std::unique_ptr<libcamera::Request>> requests;
> > > > > > >           for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
> > > > > > >                   std::unique_ptr<Request> request = camera_->createRequest();
> > > > > > >                   ASSERT_TRUE(request) << "Can't create request";
> > > > > > > @@ -165,7 +164,7 @@ void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
> > > > > > >                   ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
> > > > > > > -               requests.push_back(std::move(request));
> > > > > > > +               requests_.push_back(std::move(request));
> > > > > > >           }
> > > > > > >           /* Run capture session. */
> > > > > > > diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
> > > > > > > index fd9d2a97..2911d601 100644
> > > > > > > --- a/src/apps/lc-compliance/simple_capture.h
> > > > > > > +++ b/src/apps/lc-compliance/simple_capture.h
> > > > > > > @@ -32,6 +32,7 @@ protected:
> > > > > > >           std::shared_ptr<libcamera::Camera> camera_;
> > > > > > >           std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
> > > > > > >           std::unique_ptr<libcamera::CameraConfiguration> config_;
> > > > > > > +       std::vector<std::unique_ptr<libcamera::Request>> requests_;
> > > > > > >    };
> > > > > > >    class SimpleCaptureBalanced : public SimpleCapture

Patch
diff mbox series

diff --git a/src/apps/lc-compliance/simple_capture.cpp b/src/apps/lc-compliance/simple_capture.cpp
index ab5cb35c..cf4d7cf3 100644
--- a/src/apps/lc-compliance/simple_capture.cpp
+++ b/src/apps/lc-compliance/simple_capture.cpp
@@ -65,6 +65,7 @@  void SimpleCapture::stop()
 	camera_->requestCompleted.disconnect(this);
 
 	Stream *stream = config_->at(0).stream();
+	requests_.clear();
 	allocator_->free(stream);
 }
 
@@ -95,7 +96,6 @@  void SimpleCaptureBalanced::capture(unsigned int numRequests)
 	captureLimit_ = numRequests;
 
 	/* Queue the recommended number of reqeuests. */
-	std::vector<std::unique_ptr<libcamera::Request>> requests;
 	for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
 		std::unique_ptr<Request> request = camera_->createRequest();
 		ASSERT_TRUE(request) << "Can't create request";
@@ -104,7 +104,7 @@  void SimpleCaptureBalanced::capture(unsigned int numRequests)
 
 		ASSERT_EQ(queueRequest(request.get()), 0) << "Failed to queue request";
 
-		requests.push_back(std::move(request));
+		requests_.push_back(std::move(request));
 	}
 
 	/* Run capture session. */
@@ -156,7 +156,6 @@  void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
 	captureLimit_ = numRequests;
 
 	/* Queue the recommended number of reqeuests. */
-	std::vector<std::unique_ptr<libcamera::Request>> requests;
 	for (const std::unique_ptr<FrameBuffer> &buffer : buffers) {
 		std::unique_ptr<Request> request = camera_->createRequest();
 		ASSERT_TRUE(request) << "Can't create request";
@@ -165,7 +164,7 @@  void SimpleCaptureUnbalanced::capture(unsigned int numRequests)
 
 		ASSERT_EQ(camera_->queueRequest(request.get()), 0) << "Failed to queue request";
 
-		requests.push_back(std::move(request));
+		requests_.push_back(std::move(request));
 	}
 
 	/* Run capture session. */
diff --git a/src/apps/lc-compliance/simple_capture.h b/src/apps/lc-compliance/simple_capture.h
index fd9d2a97..2911d601 100644
--- a/src/apps/lc-compliance/simple_capture.h
+++ b/src/apps/lc-compliance/simple_capture.h
@@ -32,6 +32,7 @@  protected:
 	std::shared_ptr<libcamera::Camera> camera_;
 	std::unique_ptr<libcamera::FrameBufferAllocator> allocator_;
 	std::unique_ptr<libcamera::CameraConfiguration> config_;
+	std::vector<std::unique_ptr<libcamera::Request>> requests_;
 };
 
 class SimpleCaptureBalanced : public SimpleCapture