
Pricing of cloud functions compared to preemptible machines #52

Open
josephrocca opened this issue May 15, 2019 · 2 comments

josephrocca commented May 15, 2019

I've just started researching options for a large scale batch computing job and although pricing isn't a huge worry (since it's a one-off job), it is something that interests me in case I end up building a deployment script/framework that can be used on future projects:

So the cloud function approach appears to be ~6x the price of the preemptible machine approach? And if you take into account clock speed then it's somewhere near 12x the price? I also haven't taken into account the invocation pricing of the cloud functions, but I'm assuming that would be small. The preemptible machine approach also has the advantage of being able to run code for longer than 10 or 15 mins, which is very handy for me - it saves me splitting up files that need to be processed into smaller, digestible-in-10-minutes chunks and coding all the logic to handle merging and stuff.
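For concreteness, here's the kind of back-of-envelope arithmetic behind that ~6x figure. All rates below are illustrative placeholders I made up for the example, not actual quoted prices; the point is only the shape of the comparison (GB-seconds of configured memory vs. vCPU-hours):

```typescript
// Cloud functions bill per GB-second of *configured* memory;
// preemptible VMs bill roughly per vCPU-hour.

/** Cost of `cpuHours` of work on a cloud function configured with `gbConfigured` memory. */
function cloudFunctionCost(cpuHours: number, gbConfigured: number, pricePerGbSecond: number): number {
  return cpuHours * 3600 * gbConfigured * pricePerGbSecond;
}

/** Cost of the same work on a preemptible VM billed per vCPU-hour. */
function preemptibleCost(cpuHours: number, pricePerVcpuHour: number): number {
  return cpuHours * pricePerVcpuHour;
}

// Hypothetical rates, for illustration only:
const fnCost = cloudFunctionCost(100, 2, 0.0000166667); // ~$12.00
const vmCost = preemptibleCost(100, 0.02);              // $2.00
const ratio = fnCost / vmCost;                          // ~6x
```

This ignores per-invocation pricing (which, as noted, is usually small) and any clock-speed difference, which would widen the gap further.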

@BrandiATMuhkuh, who you were talking to on the Hacker News thread, seems to have switched to many-core machine(s) after trying the cloud function approach, and I wonder if pricing was a consideration there?

If I've not made any mistakes or incorrect assumptions, then is it within the scope of faast.js to consider preemptible machines? Since you've implemented a local provider, I'm assuming that you've done most of the leg-work required to be able to deploy it to an actual machine, rather than a cloud function.

acchou (Collaborator) commented May 15, 2019

This is a good topic to explore, and I was planning to write a blog post about it. Using cloud functions vs. preemptible instances involves different trade-offs, and price is one of them. In general I've found that it's difficult to determine the performance/cost of a given workload without experiments, so I'm reluctant to give advice on preemptible instances until I've had a chance to experiment. That said, here's what I can say with confidence about cloud functions:

  • If your workload is CPU-bound, then executing it on Google Cloud Functions generally gets more expensive as you add memory (i.e. the performance increase doesn't reduce run time enough to compensate for the higher price per unit time).

  • On AWS, for CPU-bound workloads you're better off going straight to the 1728MB-2048MB region (unless your workload requires more memory). Below that range, performance has higher variance and often higher cost (because decreased performance means longer run times). Above that range, single-core performance doesn't increase.

  • Comparing Google Cloud Functions vs. AWS Lambda for CPU-bound workloads, Lambda scales much faster to a higher number of cores. Also, AWS Lambda in the optimal 1728-2048MB region is faster than anything available on GCF today. From a cost perspective, the optimal region for AWS Lambda tends to cost about the same as GCF with 1024MB memory, with ~20% better performance, and it is significantly cheaper than GCF with 2048MB memory. However, if your workload can execute with 512MB or less, GCF can be somewhat less costly, but will run much slower (2-5x longer runtime).

  • With cloud functions you can access more aggregate CPU and I/O bandwidth with less effort than with any one instance type. On AWS you get up to 1000 concurrent invocations as a default limit, and that quota can be raised (I've asked for and gotten a 5k limit). Using 2048MB as an example memory size, that's 1000 cores with ~2TB aggregate memory, and I've measured effective aggregate download bandwidth from S3 exceeding 200Gbps using almost 2000 instances. No single instance gives you that much performance; using multiple instances is possible but is much more logistically challenging than simply spinning up more cloud function invocations.
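To illustrate the memory/price trade-off in the points above, here's a toy model built on my own simplifying assumptions (not measured data): Lambda's CPU share grows roughly linearly with configured memory until a full vCPU at ~1792MB, and price per second is proportional to memory.

```typescript
// Toy model: CPU share scales linearly with memory up to one full vCPU,
// then single-core speed plateaus while price keeps rising with memory.

const FULL_VCPU_MB = 1792; // assumed memory size at which one full vCPU is allocated

/** Relative single-core speed at a given memory size under the linear model. */
function relativeSpeed(memoryMb: number): number {
  return Math.min(memoryMb, FULL_VCPU_MB) / FULL_VCPU_MB;
}

/** Relative cost per unit of CPU-bound work: price scales with memory,
 *  run time scales with 1/speed. */
function relativeCostPerWork(memoryMb: number): number {
  return memoryMb / relativeSpeed(memoryMb);
}

relativeCostPerWork(1024); // ~1792: slower but proportionally cheaper per second
relativeCostPerWork(1792); // 1792: the plateau begins
relativeCostPerWork(3008); // 3008: paying for memory with no single-core speedup
```

Under this idealized model, cost per unit of work is flat below the full-vCPU point and climbs above it. The measurements above suggest reality is worse than the flat line in the sub-vCPU region (higher variance, often higher cost), which is why the 1728-2048MB region tends to win.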

I do plan on exploring preemptible instances to see if they could be a good fit. One way to use them today is to simply start an instance and run a workload using the local provider. This won't be robust to the instance getting preempted, but it should make it possible to measure performance and get a sense of cost from those measurements. I've tried some workloads in local mode on single large instances, and they typically hit a "scaling wall": a bottleneck in CPU performance or I/O bandwidth to S3 well below the limits available through cloud functions. But, as I mentioned, I haven't looked at relative cost that carefully yet.
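The "run the local provider on an instance you started yourself" idea might look like the sketch below. This is hedged: the API shape follows the faast.js docs as I understand them, and the `./functions` module, `processChunk`, and `inputs` are hypothetical stand-ins for your workload.

```typescript
import { faast } from "faastjs";
// Hypothetical module exporting the workload functions faast.js will proxy:
import * as funcs from "./functions";

async function main() {
  const inputs = ["chunk-0", "chunk-1", "chunk-2"]; // placeholder work items

  // "local" runs invocations as child processes on this machine (e.g. a
  // preemptible instance you launched); swap in "aws" with a memorySize such
  // as 2048 to target Lambda instead.
  const m = await faast("local", funcs, { concurrency: 32 });
  try {
    await Promise.all(inputs.map((input) => m.functions.processChunk(input)));
  } finally {
    await m.cleanup();
  }
}

main();
```

Timing this run against the instance's hourly price gives the cost measurement described above, though as noted it won't survive the instance being preempted mid-run.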

The nature of the workload also has an impact on the choice. If your workload requires long run times (>~15min), then cloud functions won't work. Workloads that run continuously with few load spikes can be sized once and run on an appropriate instance. However, if the workload is a "race to completion" that only needs to run ad hoc, has spiky load, or needs to finish as soon as possible, using faast.js with AWS specifically can provide much more flexibility and lower cost than you might think (no need to overprovision a large instance and keep it running in case a workload comes in at any moment, faster startup and scaling, and higher core limits and overall I/O bandwidth).

AWS Batch and similar services can also avoid the limits of single instances while keeping the benefits of pay-for-usage and scalability. I don't have enough experience with them to have an opinion on performance or cost at this point, and they come with more operational overhead and are (in my opinion) just plain hard to use. Still, I'll have to give them a spin; they might even become a target backend for faast.js if there's interest.

If you do run experiments on preemptible instances vs. AWS Lambda in the optimal region, I'd welcome any data you could provide.

josephrocca (Author) commented
Many, many great points! Thank you for your thoughts. Cloud functions seem to "have their cake and eat it too" in many respects, except perhaps for pricing; but like you said, you're paying to avoid a bunch of project-specific code handling whatever inherent bottlenecks you run up against when your cores compete for memory, bandwidth, disk, etc. I'm very new to this whole area, but if I end up producing any meaningful data in my journeys I'll definitely report back!
