
Lambda package size #2162

Closed
kyeotic opened this issue Jan 7, 2019 · 23 comments

Comments

@kyeotic
Contributor

kyeotic commented Jan 7, 2019

AWS Lambda containers are especially sensitive to their startup time. The scaling model of Lambda (one container per concurrent request) means containers are frequently starting, unlike "normal" Node servers that start once.

A lot of performance testing has gone into measuring the "cold start" time of Lambdas, and code parsing and loading plays a significant role in it. A dependency-free Node Lambda running on 256 MB can cold-start in around ~250ms. A Node Lambda with 2 MB of dependencies running on 256 MB can take around ~2 seconds.

apollo-server-lambda has some rather large dependencies, given that its task is receiving and sending JSON objects.

Full report
  • busboy: 539 kb
  • apollo-engine-reporting-protobuf: 177 kb
  • lodash: 69 kb

  • Busboy is a streaming parser for HTML form data in HTTP request bodies. This seems entirely unnecessary given that Lambdas receive their events from API Gateway. It is a solid 50% of the package, and I think it can be safely removed.
  • apollo-engine-reporting-protobuf is for the reporting engine. This may be better served as a separate package, so that performance-critical Lambdas can drop it.
  • Lodash is tough. It's used, but not all of it. Lodash exposes its functions as separate modules, and a performance increase could be realized by importing only what is needed.
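The lodash point can be sketched like this (illustrative only; the require lines are left as comments so the snippet has no hard dependency on lodash being installed):

```javascript
// Loads the entire ~70kb library at startup:
// const _ = require('lodash');
// _.pick(obj, ['a', 'b']);

// Loads only pick and its internal helpers:
// const pick = require('lodash/pick');
// pick(obj, ['a', 'b']);
```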

The Challenge
Of course, none of these are direct dependencies of apollo-server-lambda; they are transitive dependencies, and they all come from apollo-server-core. Fixing them will require work on the core to break out these "not-actually-core" dependencies. However, from the performance tests we've done, a drop of just 500kb (busboy) could save as much as 500ms during cold starts. Hopefully that kind of improvement is worth the effort.

@kyeotic
Contributor Author

kyeotic commented Feb 5, 2019

I've got some more scientific info. Here is a flame chart showing a cold start on a beefy 512 MB Lambda:

[image: flame chart of the cold start]
here is an interactive version

I tried to prevent graphql-upload from loading with the uploads: false constructor option, but as you can see, busboy still loads.

These are some of the pain points I think should be moved out of core, or given some way to be prevented from loading inside the Lambda package:

  • 110ms: apollo-server-core/dist/runHttpQuery
  • 78ms: apollo-engine-reporting
  • 68ms: graphql-upload
  • 33ms: util.promisify/index.js (I can't figure out why this is even loading; it's running on Node 8.10)

@kyeotic kyeotic mentioned this issue Feb 6, 2019
4 tasks
@hotgazpacho

I'm looking at the contents of the node_modules directory of the zip file for my graphql handler. It is 15 MB. Three of the top four largest modules make up over 1/3 of the size of the package. These are:

  1. protobuf.js at 3.2 MB
  2. graphql at 1.8 MB
  3. lodash at 1.4 MB
  4. busboy at 683 KB
> npm ls protobufjs                                                                                                                                                                                                                                                                         
myproject-api@0.1.0 /Users/will.green/Development/myproject/api
└─┬ apollo-server-lambda@2.3.3
  └─┬ apollo-server-core@2.3.3
    └─┬ apollo-engine-reporting@0.2.1
      └─┬ apollo-engine-reporting-protobuf@0.2.0
        └── protobufjs@6.8.8

apollo-engine-reporting is supposed to be opt-in, so I believe that it should be listed under peerDependencies, not dependencies.

> npm ls busboy                                                                                                                                                                                                                                                                                
myproject-api@0.1.0 /Users/will.green/Development/myproject/api
└─┬ apollo-server-lambda@2.3.3
  └─┬ apollo-server-core@2.3.3
    └─┬ graphql-upload@8.0.4
      └── busboy@0.3.0

File uploads with graphql are also opt-in, so graphql-upload should probably also be listed under peerDependencies.

Those two changes would drop my bundle size by nearly 1/3.
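The shape being suggested would look roughly like this in apollo-server-core's package.json (a hypothetical sketch, not the actual manifest; version ranges are taken from the trees above, and peerDependenciesMeta is the modern way to mark a peer as optional):

```json
{
  "peerDependencies": {
    "graphql-upload": "^8.0.4",
    "apollo-engine-reporting": "^0.2.1"
  },
  "peerDependenciesMeta": {
    "graphql-upload": { "optional": true },
    "apollo-engine-reporting": { "optional": true }
  }
}
```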

@kyeotic
Contributor Author

kyeotic commented Feb 11, 2019

@hotgazpacho yea, ideally those wouldn’t be in the bundle. However, changing the dependencies like that is a breaking change, and probably couldn’t be done before the next major version.

@hotgazpacho

Totally understand and appreciate that. My hope is that it does indeed get placed and committed to on the roadmap 😄

@abernix
Member

abernix commented Jul 16, 2019

Per my comment, #2324 (comment), this should have been improved with 2.6.0, and I hope that you have noticed some substantial reduction. We'll continue to work on improving this (See #2360), but I just want to thank you very much for your very clear investigative approach here!

@abernix abernix closed this as completed Jul 16, 2019
@ghost

ghost commented Aug 11, 2019

Hi @abernix - It looks like this issue was resolved and released, but I still have similar issues in apollo-server-lambda@2.8.1.

It looks like apollo-engine-reporting-protobuf is now being imported through apollo-server-types.

[screenshot taken 2019-08-11]

I couldn't find a current issue for this but wanted to make sure it wasn't something in my setup before raising one.

@hotgazpacho

Sorry, but this has not improved the package size.
protobufjs is still getting bundled into the archive, as is busboy.

@kyeotic
Contributor Author

kyeotic commented Aug 12, 2019

@aaronplummeridgelumira @hotgazpacho Yep, looks like this has regressed. I don't have the time right now to re-run this investigation. Is anyone else available to fix this?

@hotgazpacho

hotgazpacho commented Aug 12, 2019

Looks like the regression happened in #2900, per the blame: https://github.com/apollographql/apollo-server/blame/master/packages/apollo-server-types/src/index.ts

@hotgazpacho

Upon further digging, it looks like the protobuf dependency shifted from apollo-server-core to apollo-server-types between 2.7.0 and 2.7.1.

2.7.0

npm ls protobufjs

└─┬ apollo-server-lambda@2.7.0
  └─┬ apollo-server-core@2.7.0
    └─┬ apollo-engine-reporting-protobuf@0.4.0
      └── protobufjs@6.8.8

2.7.1

npm ls protobufjs

└─┬ apollo-server-lambda@2.7.1
  └─┬ apollo-server-types@0.2.1
    └─┬ apollo-engine-reporting-protobuf@0.4.0
      └── protobufjs@6.8.8

BTW, the 2.6.0 release did not address the issue:

npm ls protobufjs

└─┬ apollo-server-lambda@2.6.0
  └─┬ apollo-server-core@2.6.0
    └─┬ apollo-engine-reporting@1.2.0
      └─┬ apollo-engine-reporting-protobuf@0.3.0
        └── protobufjs@6.8.8

@ghost

ghost commented Aug 13, 2019

Here it is for busboy with version 2.8.1 too.

npm ls busboy

└─┬ apollo-server-lambda@2.8.1
  └─┬ apollo-server-core@2.8.1
    └─┬ graphql-upload@8.0.7
      └── busboy@0.3.1

Hopefully I'll get time to look at this at some point next week.

@abernix
Member

abernix commented Aug 26, 2019

I think it would be prudent to open a new issue, though I would like to ask that we draw a distinction between bundle size and runtime cost.

The actual bundle size is unlikely to have ever changed, but we specifically moved the evaluation of the module into a conditional to avoid it in environments where Engine is not used. That is to say, I believe we previously fixed only the runtime cost.

As an explanation for the new state of affairs: with the new configuration, it is certainly plausible that importing Trace from apollo-engine-reporting-protobuf (which is only necessary for TypeScript typings!) results in runtime evaluation of the entirety of apollo-engine-reporting-protobuf.

I believe that a workaround worth trying is to switch to importing Trace.IQueryPlanNode (the type) directly from apollo-engine-reporting-protobuf/protobuf rather than the top-level module entry point (i.e. index).
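In code, the suggested workaround looks roughly like this (module paths as named in the comment; lines shown as comments since the import is typing-only):

```javascript
// Before: the top-level entry point, whose evaluation pulls in the
// whole generated protobuf module at runtime:
// import { Trace } from 'apollo-engine-reporting-protobuf';

// Suggested workaround: import from the generated file directly,
// bypassing the package's index:
// import { Trace } from 'apollo-engine-reporting-protobuf/protobuf';
```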

Anyhow, still, same request for opening a new issue and referencing this from that, but please do try to take a shot at my above suggestion. Also, we should avoid evaluating any of these changes based on npm ls since that would never have been any different.

abernix added a commit that referenced this issue Aug 26, 2019
…obuf`.

This is an experiment, but hopefully will resolve the issue noted in:

#2162 (comment)
@kyeotic
Contributor Author

kyeotic commented Aug 26, 2019

@abernix is exactly right; the tools I used to test only measure whether the package was loaded at runtime, since they measure how long it takes to load. Lambda is not impacted by bundle size, only by the code that is loaded. npm ls is not an effective way to check whether or not the package is used at runtime.

@hotgazpacho

Lambda is impacted by package size.

https://read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76

There’s a 50mb limit to the package size. Packaging something if it isn’t going to be used by your function increases not only deployment time, but also cold start time.

So, yes, npm ls on the uncompressed package is a good measure.

@kyeotic
Contributor Author

kyeotic commented Aug 26, 2019

@hotgazpacho unused code does not impact cold start time. I have verified this with dozens of tests. Even the article you linked draws this conclusion:

bigger deployment package size does not increase cold start time

@hotgazpacho

I’m sorry, but you are mistaken. Downloading the package from S3 is the very first thing that happens during a cold start. A 3mb package downloads faster than a 30mb one. Therefore, reducing deployment package size reduces cold start time.

https://lumigo.io/blog/how-to-improve-aws-lambda-cold-start-performance/

@kyeotic
Contributor Author

kyeotic commented Aug 26, 2019 via email

@hotgazpacho

You may not think that you upload your packages to S3, but that's exactly what "deploying to Lambda" does... it stores your package in an S3 bucket, and when the Lambda service invokes it from a cold start, it pulls it down from that bucket. At least, that is how it used to work. I've asked around some Lambda experts, who seem to think they've put some optimizations in place to mitigate this.

I apologize if my tone was a bit harsh. At the end of the day, I just want to get my deployment package smaller. If I have to manually strip the protobuf stuff from my node_modules directory in the package before deploying, I can live with that so long as it doesn't cause my code to fail. Ideally, the stuff that depends on protobuf would be a peerDependency, because it's not needed for the regular functioning of this package.

@kyeotic
Contributor Author

kyeotic commented Aug 27, 2019

It may silently use S3 as a storage layer, but it definitely doesn't create an S3 bucket. The file mode and S3 mode are mutually exclusive; perhaps you are only aware of the latter?

@kyeotic
Contributor Author

kyeotic commented Aug 27, 2019

In either case, whatever secret sauce, black magic, or operational wizardry AWS uses, unloaded bundle size doesn't seem to have an impact, only what is loaded into memory. I won't pretend to know how Lambda works under the hood, but I can reliably reproduce those measurements. I can share the test harness I use if you are interested (it's not as automated as it could be).

@hotgazpacho

I really don’t care to argue this further. As I said in my last comment:

At the end of the day, I just want to get my deployment package smaller. If I have to manually strip the protobuf stuff from my node_modules directory in the package before deploying, I can live with that so long as it doesn't cause my code to fail. Ideally, the stuff that depends on protobuf would be a peerDependency, because it's not needed for the regular functioning of this package.

I’m really unconcerned with how this unused code gets removed from my final deployment package. I just want it gone.

@abernix
Member

abernix commented Aug 27, 2019

@hotgazpacho I can understand if you want to be able to control how that unused code gets removed in an environment that you can control. Since the runtime cost and the bundle size appear to have actually remained unchanged, if removing the apollo-engine-reporting-protobuf package is important to you, I suggest removing it in the same way you were removing it before. As one suggestion, you could just remove it from the lockfile for your project.

I will note though, to truly achieve the precision you seem to desire, you'll likely want to introduce a more complicated build step that allows you to do precise dead code elimination because there is likely more to remove. It might even make sense for you to fork Apollo Server and change it to emit ECMAScript modules so you can use a build tool like parcel or rollup which can do more thorough tree-shaking with the extra detail. (We may do this in Apollo Server 3.x, but not before experimental ECMAScript modules are less experimental.)
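The kind of build step being described might look like this rollup config (a hypothetical sketch; it only pays off once the dependency graph is published as ECMAScript modules, as noted above):

```javascript
// rollup.config.js (hypothetical): bundle the handler entry point with
// tree-shaking enabled so unused exports are dropped from the
// deployment package. 'src/handler.js' is an assumed entry point.
export default {
  input: 'src/handler.js',
  output: { file: 'dist/handler.js', format: 'cjs' },
  treeshake: true, // dead code elimination across the module graph
};
```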

I do think it's worth noting that, in my opinion, you could have exercised a bit more discretion in your choice of references, since it makes for a confusing discussion when both of the articles you've cited note that there is no discernible difference in cold-boot time corresponding to bundle size:

https://read.acloud.guru/does-coding-language-memory-or-package-size-affect-cold-starts-of-aws-lambda-a15e26d12c76 states as a Conclusion:

bigger deployment package size does not increase cold start time

https://lumigo.io/blog/how-to-improve-aws-lambda-cold-start-performance/ states that:

We’ve seen that the biggest impact on AWS Lambda cold start times is not the size of the package but the initialization time when the package is actually loaded for the first time.

Of course, without some concrete examples demonstrating your own increased Lambda boot times, speculating about Amazon's internal operational dynamics is not going to do us much good, though I would be a bit surprised if the 732KB uncompressed size of apollo-engine-reporting-protobuf (65KB compressed) was an obstacle for their infrastructure.

But regardless, the safe suggestion to you is to just lop off apollo-engine-reporting-protobuf if it's worrisome to you. We hope to make this even easier in future versions of Apollo Server.

@hotgazpacho

You're right; cold start appears not to be affected by this. What is affected, however, is the time to package and deploy, and the extra attack surface presented by code that has no use in this context. I should have focused my discussion on those points.

Since I don’t seem to be able to communicate effectively why reducing the package size is important, I’ll drop off this thread, and as you suggest, look for other ways to easily prune megabytes of unused files from the deployment package.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2023