AWS Announcements at a Glance: Predictive Scaling for EC2

Craig Milam

AWS Announcements
November 21, 2018


Onica’s Announcements At A Glance series analyzes the latest AWS news and announcements, simplifying and explaining the significance for AWS consumers.

Earlier this week, AWS announced Predictive Scaling for EC2, Powered by Machine Learning, which allows customers to scale proactively instead of reactively and helps avoid costly over-provisioning.

In the past, scaling policies were used to determine when to scale in and out based on CloudWatch metrics such as average CPU utilization, network in/out, and load balancer requests. With the addition of Predictive Scaling, Amazon’s well-trained machine learning models can predict traffic and EC2 usage from daily and weekly patterns. The models require at least 24 hours of historical data to make predictions, and the predictions are re-evaluated every day to create a forecast for the next 48 hours.
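To make the idea of forecasting from daily patterns concrete, here is a minimal sketch in Python. It is not AWS’s model, which is not public; it is a naive stand-in that averages past load at the same hour of day. The function name and the assumption that the history starts at midnight are ours, for illustration only.

```python
from statistics import mean

HOURS_PER_DAY = 24
FORECAST_HOURS = 48


def forecast_next_48_hours(hourly_load):
    """Naive seasonal forecast: average past observations at the same hour of day.

    hourly_load is a list of hourly load values, oldest first, assumed
    (for this sketch) to start at hour 0 of a day.
    """
    if len(hourly_load) < HOURS_PER_DAY:
        raise ValueError("need at least 24 hours of history")
    start = len(hourly_load)  # index of the first hour being forecast
    forecast = []
    for step in range(FORECAST_HOURS):
        hour_of_day = (start + step) % HOURS_PER_DAY
        # Every historical sample taken at this hour of day.
        same_hour = hourly_load[hour_of_day::HOURS_PER_DAY]
        forecast.append(mean(same_hour))
    return forecast


# Two days of history with a midday peak (made-up numbers).
one_day = [10] * 8 + [50] * 8 + [20] * 8
forecast = forecast_next_48_hours(one_day * 2)
print(forecast[:24])
```

With only daily averaging, the second forecast day simply repeats the first. A real model would also weigh weekly patterns and recent trends.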

To enable Predictive Scaling, log in to the AWS Management Console, go to the Auto Scaling section, and click Get started.

Select Choose EC2 Auto Scaling groups, then select the groups on which you want to enable Predictive Scaling.

Give the scaling plan a name and pick a scaling strategy.

AWS provides three out-of-the-box scaling strategies for predictive autoscaling:

Optimize for Availability keeps the average CPU utilization of Auto Scaling groups at 40% to provide high availability and ensure capacity to absorb spikes in demand.
Optimize for Cost keeps the average CPU utilization of Auto Scaling groups at 70% to ensure lower costs.
Balance Availability and Cost keeps the average CPU utilization of Auto Scaling groups at 50% to provide optimal availability and reduce costs.
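As a rough illustration of what these targets mean for fleet size, the sketch below estimates how many instances each strategy would need for the same load. The target percentages come from the list above. The sizing formula and the `instances_needed` helper are our own simplifying assumptions, not how AWS computes capacity.

```python
import math

# Target average CPU utilization (%) per strategy, as listed above.
STRATEGY_TARGET_CPU = {
    "Optimize for Availability": 40,
    "Balance Availability and Cost": 50,
    "Optimize for Cost": 70,
}


def instances_needed(total_cpu_load, strategy):
    """Estimate the instance count that keeps average CPU at the strategy's target.

    total_cpu_load is the summed CPU utilization across the group, in percent
    (hypothetical unit: 6 instances at 70% each is a total load of 420).
    """
    target = STRATEGY_TARGET_CPU[strategy]
    return max(1, math.ceil(total_cpu_load / target))


total_load = 420  # e.g. 6 instances each running at 70% CPU
for name in STRATEGY_TARGET_CPU:
    print(name, instances_needed(total_load, name))
```

For this example load, the availability strategy needs 11 instances, the balanced strategy 9, and the cost strategy 6, which shows the trade-off between headroom and spend.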

There is also an option to specify your own custom policy, which allows customers to configure which metrics to use to scale out or in.

Customers can also choose to enable predictive scaling, dynamic scaling, or both. Predictive scaling works by forecasting load and scheduling the minimum capacity for an Auto Scaling group. Dynamic scaling uses target tracking to adjust capacity as the chosen metric changes. Enabling both allows scaling to happen proactively rather than only reactively. This avoids over-provisioning while improving the overall user experience for customer applications.
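One way to picture how the two modes combine is that predictive scaling sets a floor and dynamic scaling can raise capacity above it when real load exceeds the forecast. The Python sketch below is a simplified model under that assumption. The function names and the proportional target-tracking rule are ours for illustration and are not AWS’s actual algorithm.

```python
import math


def predictive_minimum(forecast_load, target_utilization):
    """Minimum capacity scheduled ahead of time from the forecast load."""
    return math.ceil(forecast_load / target_utilization)


def target_tracking_desired(current_capacity, observed_utilization, target_utilization):
    """Simplified target tracking: resize in proportion to observed vs. target."""
    return math.ceil(current_capacity * observed_utilization / target_utilization)


def effective_capacity(scheduled_min, dynamic_desired, max_capacity):
    """Dynamic scaling may exceed the scheduled minimum but never drop below it."""
    return min(max_capacity, max(scheduled_min, dynamic_desired))


# Forecast total load of 300 (percent units) at a 50% target: floor of 6.
floor = predictive_minimum(300, 50)

# Unexpected spike: 6 instances running at 75% CPU ask for 9.
spike = effective_capacity(floor, target_tracking_desired(6, 75, 50), max_capacity=20)

# Quiet period: 6 instances at 30% CPU would ask for 4, but the floor holds.
quiet = effective_capacity(floor, target_tracking_desired(6, 30, 50), max_capacity=20)

print(floor, spike, quiet)  # prints: 6 9 6
```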

On the next screen, there are a number of additional settings customers can use to fine-tune predictive and dynamic scaling.

Of particular interest here are the Load Metric (under General Settings) and Predictive scaling mode (under Predictive scaling settings).

Load Metric

This allows customers to choose which metric history is used in the forecast for predictive scaling. If the built-in metrics aren’t enough, custom metrics can be specified instead.

Predictive scaling mode

This setting lets customers either show only the forecasts on the dashboard, or forecast and scale, which allows the Auto Scaling group to scale out and in based on the metric specified earlier.
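For readers who prefer to script this rather than use the console, the sketch below builds a scaling instruction that carries both the load metric and the predictive scaling mode. The field names follow the AWS Auto Scaling Plans API as we understand it, so please verify them against the current API reference. The helper function, its defaults, and the group name `my-web-asg` are hypothetical.

```python
def build_scaling_instruction(asg_name, min_capacity, max_capacity,
                              target_cpu=50.0, scale=False):
    """Build one scaling instruction for an EC2 Auto Scaling group.

    scale=False only produces forecasts; scale=True also acts on them.
    """
    mode = "ForecastAndScale" if scale else "ForecastOnly"
    return {
        "ServiceNamespace": "autoscaling",
        "ResourceId": f"autoScalingGroup/{asg_name}",
        "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        # Dynamic scaling: target tracking on average CPU.
        "TargetTrackingConfigurations": [{
            "PredefinedScalingMetricSpecification": {
                "PredefinedScalingMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu,
        }],
        # Load metric whose history feeds the forecast.
        "PredefinedLoadMetricSpecification": {
            "PredefinedLoadMetricType": "ASGTotalCPUUtilization",
        },
        "PredictiveScalingMode": mode,
    }


# Hypothetical group name; start in forecast-only mode to review predictions.
instruction = build_scaling_instruction("my-web-asg", 2, 10)
print(instruction["PredictiveScalingMode"])  # prints: ForecastOnly
```

Starting in forecast-only mode is a cautious choice: it lets you compare the forecasts with real traffic before allowing them to change capacity. A dictionary like this would be one entry in the list of scaling instructions passed when creating a scaling plan through the API or an SDK.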

Once the scaling plan has been created, customers can use the dashboard to view the predictive scaling forecasts and scheduled actions. Keep in mind that at least 24 hours of historical data is needed in order to make predictions.

Predictive scaling is a great new tool for customers hosting websites or other applications that experience periodic and predictable traffic spikes, helping to reduce the cost of infrastructure and provide an overall better experience for users.

To learn more about how to save costs on your AWS environment, please click here.