Each DynamoDB stream event handler duplicates the IAM policy, causing "Maximum policy size exceeded" #12313

Open
4 tasks done
tibbe opened this issue Dec 27, 2023 · 10 comments · May be fixed by #12320

Comments

tibbe commented Dec 27, 2023

Are you certain it's a bug?

  • Yes, it looks like a bug

Is the issue caused by a plugin?

  • It is not a plugin issue

Are you using the latest v3 release?

  • Yes, I'm using the latest v3 release

Is there an existing issue for this?

  • I have searched existing issues, it hasn't been reported yet

Issue description

We recently started seeing a deployment error:

UPDATE_FAILED: IamRoleLambdaExecution (AWS::IAM::Role)
Resource handler returned message: "Maximum policy size of 10240 bytes exceeded for role backend-xxxxxx-eu-central-1-lambdaRole (Service: Iam, Status Code: 409, Request ID: 0ab39c5b-fef4-491b-be1d-d1a730xxxxxx)" (RequestToken: 7591a9ce-5117-b2b5-c915-161cb4xxxxxx, HandlerErrorCode: ServiceLimitExceeded)

Debugging further, I noticed the same IAM policy statement repeated 29 times (two repeats shown below for demonstration purposes) in the generated backend-xxxxxx-eu-central-1-lambdaRole role:

        {
            "Action": [
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:DescribeStream",
                "dynamodb:ListStreams"
            ],
            "Resource": [
                "arn:aws:dynamodb:eu-central-1:227135xxxxxx:table/backend-xxxxxx-Users/stream/2023-07-11T08:50:35.845"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "dynamodb:GetRecords",
                "dynamodb:GetShardIterator",
                "dynamodb:DescribeStream",
                "dynamodb:ListStreams"
            ],
            "Resource": [
                "arn:aws:dynamodb:eu-central-1:227135xxxxxx:table/backend-xxxxxx-Users/stream/2023-07-11T08:50:35.845"
            ],
            "Effect": "Allow"
        },

(The resource is the same in each section so this is truly duplicated.)

It seems that for each

  events:
    - stream:
        type: dynamodb

section in the YAML config we get a new repeat of the above policy, eventually causing the size to go over the limit.
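
To check a generated role for this kind of duplication, a small Node.js script along these lines may help. It is only a sketch: `statementKey` and `summarizePolicy` are hypothetical helper names, not part of the Serverless Framework, and the size check is an approximation that compares the length of the minified JSON against the 10240-byte limit quoted in the error message.

```javascript
// Hypothetical diagnostic helpers; not part of the Serverless Framework.
// The limit is the figure quoted in the deployment error above.
const ROLE_POLICY_SIZE_LIMIT = 10240;

// Build a stable key so that statements differing only in key order
// or in the order of array entries compare as equal.
function statementKey(statement) {
  const normalized = {};
  for (const name of Object.keys(statement).sort()) {
    const value = statement[name];
    normalized[name] = Array.isArray(value) ? [...value].sort() : value;
  }
  return JSON.stringify(normalized);
}

// Count identical statements and estimate the policy size.
function summarizePolicy(policyDocument) {
  const counts = new Map();
  for (const statement of policyDocument.Statement) {
    const key = statementKey(statement);
    counts.set(key, (counts.get(key) || 0) + 1);
  }

  const duplicates = [...counts.entries()]
    .filter(([, count]) => count > 1)
    .map(([key, count]) => ({ statement: JSON.parse(key), count }));

  // JSON.stringify without an indent argument emits no extra whitespace,
  // so the string length approximates the size for ASCII-only policies.
  const minifiedSize = JSON.stringify(policyDocument).length;

  return {
    duplicates,
    minifiedSize,
    overLimit: minifiedSize > ROLE_POLICY_SIZE_LIMIT,
  };
}
```

Calling `summarizePolicy` on the parsed policy document of the role returns each repeated statement with its count, which makes it easy to see how much of the size budget the duplicates consume.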

Service configuration (serverless.yml) content

# The whole config is too large to include (and to mask the sensitive parts of). Here's the relevant part:
ProjectDeleter:
  image:
    name: backend
    command:
      - xxxxxx.clean_up.project_deleted_handler.handle
  timeout: 300  # 5 minutes
  events:
    - stream:
        type: dynamodb
        arn:
          { "Fn::GetAtt": ["DynamoDBTable", "StreamArn"] }
        filterPatterns:
          - eventName: [REMOVE]
            dynamodb:
              OldImage:
                Type:
                  S: [PROJECT]
        batchWindow: 0
        batchSize: 10
        functionResponseType: ReportBatchItemFailures
  environment:
    POWERTOOLS_SERVICE_NAME: ProjectDeleter

(We have 29 of these in our config.)

Command name and used flags

serverless deploy --stage=xxxxxx

Command output

UPDATE_FAILED: IamRoleLambdaExecution (AWS::IAM::Role)
Resource handler returned message: "Maximum policy size of 10240 bytes exceeded for role backend-xxxxxx-eu-central-1-lambdaRole (Service: Iam, Status Code: 409, Request ID: 0ab39c5b-fef4-491b-be1d-d1a730xxxxxx)" (RequestToken: 7591a9ce-5117-b2b5-c915-161cb4xxxxxx, HandlerErrorCode: ServiceLimitExceeded)

Environment information

Framework Core: 3.35.2 (local) 3.33.0 (global)
Plugin: 7.0.5
SDK: 4.4.0

tibbe commented Dec 27, 2023

Assuming I'm looking at the right code, it's not obvious why this happens:

dynamodbStreamStatement.Resource.push(EventSourceArn);

The above code seems to add each resource to the same statement (although without any deduping based on the resource, from what I can see). I don't see how we end up getting multiple statements.

tibbe commented Dec 27, 2023

Could it be that CloudFormation (perhaps when doing an incremental deployment) appends rather than overwrites the policy?

tibbe commented Dec 27, 2023

The IAM policy editor shows "Suggestions: Redundant Statement", so it agrees these statements are redundant.

tibbe commented Dec 27, 2023

After I manually deleted all but one of the statements containing dynamodb:GetRecords, a redeploy didn't recreate them. This suggests the duplication is caused either by something an earlier serverless version did or by incremental updates of some sort.

tibbe commented Dec 29, 2023

My workaround from #12313 (comment) no longer works. I'm not sure why it ever did. Deploys are now completely blocked due to the number of dynamodb:GetRecords statements in the lambdaRole.

This is a rather serious problem. Our serverless stack is no longer deployable and we need to consider rather undesirable workarounds, like manually merging Lambdas to reduce the size of the IAM document.

tibbe commented Dec 29, 2023

I tried using a custom role as a workaround. However, that doesn't help: the default lambdaRole policy still contains a copy of dynamodb:GetRecords and friends, even if the Lambda doesn't use the default role.

tibbe commented Dec 29, 2023

I've updated to the latest version and confirmed the problem persists there:

Framework Core: 3.38.0 (local) 3.38.0 (global)
Plugin: 7.2.0
SDK: 4.5.1

tibbe added a commit to tibbe/serverless that referenced this issue Jan 4, 2024
…erless#12313)

Each function consuming a stream event would emit its own PolicyDocument
statement. This statement would contain a list of actions that doesn't
change between functions. For DynamoDB streams the list is:

```
"Action": [
    "dynamodb:GetRecords",
    "dynamodb:GetShardIterator",
    "dynamodb:DescribeStream",
    "dynamodb:ListStreams"
],
```

Duplicating these for each function causes the IAM policy to exceed the
AWS limit after about 30 functions.

The resource names are still duplicated, if they happen to be the same.

tibbe commented Jan 4, 2024

Upon reading the code again it's clear what's going on. We're adding the same statement (in particular, the same list of allowed actions) for each event stream processing function. This is unnecessary and only the list of resources needs to be updated per function. Fix: #12320
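
The idea can be sketched roughly as follows. This is not the actual patch in #12320: `buildStreamStatement` and `DYNAMODB_STREAM_ACTIONS` are hypothetical names used for illustration, and the sketch goes one step further than the commit message describes by also skipping resources that were already added.

```javascript
// Hypothetical sketch; not the Serverless Framework's actual code.
// The four actions are the ones listed in the duplicated statements.
const DYNAMODB_STREAM_ACTIONS = [
  'dynamodb:GetRecords',
  'dynamodb:GetShardIterator',
  'dynamodb:DescribeStream',
  'dynamodb:ListStreams',
];

// Build ONE statement for all stream-consuming functions instead of
// one statement per function. Only the Resource list grows.
function buildStreamStatement(eventSourceArns) {
  const statement = {
    Effect: 'Allow',
    Action: [...DYNAMODB_STREAM_ACTIONS],
    Resource: [],
  };

  const seen = new Set();
  for (const arn of eventSourceArns) {
    // An ARN may be a plain string or a CloudFormation intrinsic such as
    // { 'Fn::GetAtt': ['DynamoDBTable', 'StreamArn'] }, so compare by
    // serialized form rather than by object identity.
    const key = JSON.stringify(arn);
    if (seen.has(key)) continue;
    seen.add(key);
    statement.Resource.push(arn);
  }

  return statement;
}
```

With 29 functions all consuming the same table stream, this yields a single statement with a single resource entry rather than 29 identical statements.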

tibbe commented Jan 13, 2024

We're currently running a patched version of serverless to be able to deploy to production due to this bug. Could someone please take a look at the provided PR with the fix?

tibbe commented Jan 24, 2024

This is a production-blocking issue for larger-scale users of serverless, and a simple fix is provided as a PR. Could someone please take a look?
