Runtime error: invalid memory address or nil pointer dereference #97

Open
tcmbilozub opened this issue Jun 19, 2023 · 11 comments

Comments

@tcmbilozub

tcmbilozub commented Jun 19, 2023

The following error occurs when executing several simultaneous requests:


19 Jun 2023 22:05:39,909 [INFO] (rapid) ReserveFailed: AlreadyReserved
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x68c30f]

goroutine 131 [running]:
go.amzn.com/lambda/rapidcore.(*Server).Invoke.func2()
/LambdaRuntimeLocal/lambda/rapidcore/server.go:653 +0xef
created by go.amzn.com/lambda/rapidcore.(*Server).Invoke
/LambdaRuntimeLocal/lambda/rapidcore/server.go:636 +0x23d


Docker image: public.ecr.aws/lambda/python:3.8

@nparker2020

nparker2020 commented Aug 16, 2023

Hey @tcmbilozub
I ran into this same error message, but it appears to be a different issue.
For me, it turned out to be an incorrect path to the python executable that was passed to the runtime interface emulator. Posting here to hopefully help others.


Starting mock Lambda runtime:
16 Aug 2023 16:42:49,811 [INFO] (rapid) exec '/usr/bin/python' (cwd=/, handler=awslambdaric)
16 Aug 2023 16:42:51,574 [INFO] (rapid) extensionsDisabledByLayer(/opt/disable-extensions-jwigqn8j) -> stat /opt/disable-extensions-jwigqn8j: no such file or directory
16 Aug 2023 16:42:51,574 [INFO] (rapid) Configuring and starting Operator Domain
16 Aug 2023 16:42:51,574 [INFO] (rapid) Starting runtime domain
16 Aug 2023 16:42:51,574 [WARNING] (rapid) Cannot list external agents error=open /opt/extensions: no such file or directory
16 Aug 2023 16:42:51,574 [INFO] (rapid) Starting runtime without AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN , Expected?: false
START RequestId: e4769e06-19d6-45c7-969f-a310d8769df9 Version: $LATEST
16 Aug 2023 16:42:51,574 [WARNING] (rapid) First fatal error stored in appctx: Runtime.InvalidEntrypoint
******************************************************************************************************************************************
16 Aug 2023 16:42:51,574 [ERROR] (rapid) Init failed InvokeID= error=fork/exec /usr/bin/python: no such file or directory
******************************************************************************************************************************************

16 Aug 2023 16:42:51,574 [INFO] (rapid) Starting runtime domain
16 Aug 2023 16:42:51,574 [WARNING] (rapid) Cannot list external agents error=open /opt/extensions: no such file or directory
16 Aug 2023 16:42:51,575 [INFO] (rapid) Starting runtime without AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN , Expected?: false
16 Aug 2023 16:42:51,575 [WARNING] (rapid) Omitting fatal error Runtime.InvalidEntrypoint: Runtime.InvalidEntrypoint already stored
START RequestId: 9098b9f6-2ec1-4574-976c-c1f708284870 Version: $LATEST
16 Aug 2023 16:42:55,910 [INFO] (rapid) ReserveFailed: AlreadyReserved
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x699b8f]

goroutine 51 [running]:
go.amzn.com/lambda/rapidcore.(*Server).Invoke.func2()
        /LambdaRuntimeLocal/lambda/rapidcore/server.go:653 +0xef
created by go.amzn.com/lambda/rapidcore.(*Server).Invoke
        /LambdaRuntimeLocal/lambda/rapidcore/server.go:636 +0x23d

@tcmbilozub
Author

Temporarily solved this problem by downgrading to a lower version.

RUN curl -Lo /usr/local/bin/aws-lambda-rie https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/download/v1.10/aws-lambda-rie \
    && chmod +x /usr/local/bin/aws-lambda-rie

@benoit-laplante

Having same issue with simultaneous requests under python 3.8

(rapid) ReserveFailed: AlreadyReserved
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x699b8f]
goroutine 45 [running]:
go.amzn.com/lambda/rapidcore.(*Server).Invoke.func2()
        /LambdaRuntimeLocal/lambda/rapidcore/server.go:653 +0xef
created by go.amzn.com/lambda/rapidcore.(*Server).Invoke
        /LambdaRuntimeLocal/lambda/rapidcore/server.go:636 +0x23d

@achernyakov-rvbd

I too am hitting this when making simultaneous requests. Is there any workaround other than downgrading to an old version?

@fs-aikito

The offending code is here:

https://github.com/aws/aws-lambda-runtime-interface-emulator/blob/develop/lambda/rapidcore/server.go#L661-L666

It checks whether Reserve() fails and logs a message, but then carries on as if nothing were wrong. reserveResp is nil at that point, so the code segfaults immediately when it tries to access reserveResp.Token.

@jmehnle

jmehnle commented May 16, 2024

The permanent link for this code section is:

reserveResp, err := s.Reserve("", "", "")
if err != nil {
	log.Infof("ReserveFailed: %s", err)
}
invoke.DeadlineNs = fmt.Sprintf("%d", metering.Monotime()+reserveResp.Token.FunctionTimeout.Nanoseconds())

I looked at this code, and I'm not sure what a good fix would be. One possibility would be to pull the s.Reserve() call out of the go func into the main Invoke() body, which would make the call synchronous and allow Invoke() to return an error. This appears to be roughly what the previous version of Invoke() did:

func (s *Server) Invoke(responseWriter http.ResponseWriter, invoke *interop.Invoke) error {
	resetCtx, resetCancel := context.WithCancel(context.Background())
	defer resetCancel()

	timeoutChan := make(chan error)
	go func() {
		select {
		case <-time.After(s.GetInvokeTimeout()):
			timeoutChan <- ErrInvokeTimeout
			s.Reset(autoresetReasonTimeout, resetDefaultTimeoutMs)
		case <-resetCtx.Done():
			log.Debugf("execute finished, autoreset cancelled")
		}
	}()

	reserveResp, err := s.Reserve(invoke.ID, "", "")
	if err != nil {
		switch err {

… but it seems undoing that would contravene the intent behind moving the s.Reserve() call into an asynchronous goroutine.

AWS, can someone please comment on what your plans are for addressing this regression?

@jperezr21

Any updates on this? Makes testing locally impossible for some use cases.

@myedibleenso

Temporarily solved this problem by downgrading to a lower version.

RUN curl -Lo /usr/local/bin/aws-lambda-rie https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/download/v1.10/aws-lambda-rie \
    && chmod +x /usr/local/bin/aws-lambda-rie

Thank you, @tcmbilozub , for this workaround. With the earlier version you identified at least the first request made can complete (tested with python 3.11).

Given its effect on concurrency and the popularity of lambda, I'm surprised this regression hasn't garnered more attention by now. Someone once said "speed matters in business." AWS lambda team, please consider prioritizing this for your customers.

@OJFord

OJFord commented Sep 19, 2024

As alluded to above, downgrading isn't really a solution: it will still fail with '[INFO] (rapid) ReserveFailed: AlreadyReserved'; it just won't panic on it.

The issue seems to be that Reserve tries to update the single context, on the single Server, and so it just inherently can't serve multiple concurrent invocations:

func (s *Server) setNewInvokeContext(invokeID string, traceID, lambdaSegmentID string) (*ReserveResponse, error) {
	s.mutex.Lock()
	defer s.mutex.Unlock()

	if s.invokeCtx != nil {
		return nil, ErrAlreadyReserved
	}

	s.invokeCtx = &InvokeContext{
		Token: interop.Token{
			ReservationToken: uuid.New().String(),
			InvokeID:         invokeID,
			VersionID:        standaloneVersionID,
			FunctionTimeout:  s.invokeTimeout,
			TraceID:          traceID,
			LambdaSegmentID:  lambdaSegmentID,
			InvackDeadlineNs: math.MaxInt64, // no INVACK in standalone
		},
	}

	resp := &ReserveResponse{
		Token: s.invokeCtx.Token,
	}

	s.reservationContext, s.reservationCancel = context.WithCancel(context.Background())

	return resp, nil
}

type ReserveResponse struct {
	Token         interop.Token
	InternalState *statejson.InternalStateDescription
}

// Reserve allocates invoke context
func (s *Server) Reserve(id string, traceID, lambdaSegmentID string) (*ReserveResponse, error) {
	invokeID := uuid.New().String()
	if len(id) > 0 {
		invokeID = id
	}
	resp, err := s.setNewInvokeContext(invokeID, traceID, lambdaSegmentID)

but I'm not that familiar with Go, never mind this project, and I don't know the solution.

Are you able to comment @valerena?

OJFord added a commit to OJFord/aws-lambda-runtime-interface-emulator that referenced this issue Sep 19, 2024
@OJFord

OJFord commented Sep 19, 2024

I believe I have a fix/workaround (?) in #133.

@valerena
Contributor

The situation here is that for requests running in the Lambda service in the cloud, there will never be concurrent requests, because Lambda manages concurrent requests in different sandboxes/environments that don't interact with each other (therefore different instances of the Lambda Runtime API). This emulator package is a close representation of what's actually executed in the real Lambda hosts, and that's why the concurrent use case is not supported here either.

About the proposed fix, it's modifying behavior that is coming from that internal functionality, and we try to not have deviations on that side.

We will analyze it a bit more, to see if we can come up with a solution that works well for the local invocations, but without affecting the actual execution in the real Lambda environments.
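
Given that one emulator instance handles one invocation at a time, a test harness could serialize its own calls on the client side. The sketch below is an assumption about how a caller might do that, not something proposed by the maintainers; serialize and peakThroughSerializer are hypothetical names, and the backend function stands in for whatever actually sends the request.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serialize wraps call so that at most one invocation is in flight at a time.
func serialize(call func(payload string) string) func(string) string {
	var mu sync.Mutex
	return func(payload string) string {
		mu.Lock()
		defer mu.Unlock()
		return call(payload)
	}
}

// peakThroughSerializer sends n concurrent calls through a serialized
// backend and reports the highest concurrency the backend observed.
func peakThroughSerializer(n int) int32 {
	var current, peak int32
	backend := func(payload string) string {
		c := atomic.AddInt32(&current, 1)
		for {
			old := atomic.LoadInt32(&peak)
			if c <= old || atomic.CompareAndSwapInt32(&peak, old, c) {
				break
			}
		}
		atomic.AddInt32(&current, -1)
		return "ok:" + payload
	}

	invoke := serialize(backend)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			invoke(fmt.Sprintf("request-%d", i))
		}(i)
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency seen by backend:", peakThroughSerializer(20))
}
```

Requests queue up behind the mutex, so this trades throughput for never hitting AlreadyReserved; running several emulator containers would be the other obvious route.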
