How can you put in place an autoscaling strategy adapted to your business needs?
Cloud computing is becoming a standard for business. Your company probably already uses web-hosted services. In a cloud ecosystem, the key to performance (and cost rationalization) is to manage resources: at all times, you must ensure that the number of machines or servers dedicated to the task is consistent with the number of simultaneous connections.
This tailor-made resource management has a name: autoscaling, or “automatic scaling”, and it is one of the fundamentals of the public cloud. What is it and how do you put in place a strategy that suits your business needs?
What is autoscaling?
Scalability is the ability of a system to change scale as needed: resources are added when activity intensifies and withdrawn when it decreases. For a physical store, it is like hiring extra sales staff during sales periods.
This principle also applies to web services: for a website, scalability refers to the ability of the system to change its size so that it can absorb a sudden increase in traffic (and thus in the number of simultaneous visitors) by adding resources or servers at the appropriate time.
The problem is that the resources consumed by web platforms fluctuate constantly, so it is virtually impossible to manage them manually and remain efficient. This is where autoscaling comes into play: resources are automatically increased or decreased according to what the infrastructure needs to support the workload at a given time.
This approach works along two axes:
- Vertical scaling: increasing the capacity of a single machine (in terms of CPU and RAM).
- Horizontal scaling: increasing the number of machines in use.
If your website needs to absorb a sudden increase in traffic, say from 1,000 to 10,000 visitors within a few minutes, autoscaling is triggered to increase the number of servers running. The goal? Ensure application robustness, avoid latency problems, distribute traffic more efficiently across the infrastructure (through a load balancer) and continue to deliver a good user experience despite peak activity.
Conversely, when activity drops, autoscaling reduces the resources allocated to the application, so that just enough are provided to support a nominal, acceptable load while keeping some headroom.
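As a rough illustration of horizontal scaling, the sketch below computes how many servers to run for a given number of simultaneous visitors. The function name, the capacity of 500 visitors per server, the 20% headroom and the minimum and maximum bounds are all hypothetical values chosen for the example, not figures from a real platform.

```python
def desired_servers(visitors: int,
                    visitors_per_server: int = 500,  # assumed capacity of one server
                    headroom_pct: int = 20,          # assumed safety margin
                    min_servers: int = 2,
                    max_servers: int = 40) -> int:
    """Return the number of servers to run for the current visitor count."""
    if visitors < 0:
        raise ValueError("visitors must be non-negative")
    # Integer ceiling division of (visitors + headroom) by per-server capacity.
    load = visitors * (100 + headroom_pct)
    capacity = 100 * visitors_per_server
    needed = (load + capacity - 1) // capacity
    # Never go below the floor or above the ceiling.
    return max(min_servers, min(max_servers, needed))


if __name__ == "__main__":
    for visitors in (1_000, 10_000):
        print(visitors, "visitors ->", desired_servers(visitors), "servers")
```

The lower bound keeps a nominal capacity available when traffic is quiet, and the upper bound caps spending if traffic exceeds anything that was planned for.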
What are the advantages of autoscaling?
This elasticity has a double advantage:
- It helps to maximize the availability of the application and to ensure a satisfactory response time for users during periods of intense activity. This is a good point for branding in sectors such as media and e-commerce.
- It helps to optimize the cost of hosting during off-peak periods: fewer servers are running, which means significant savings.
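To make the cost argument concrete, here is a small worked example comparing a platform sized permanently for its peak with one that follows demand hour by hour. The daily load profile is invented purely for illustration; real savings depend on your own traffic and on your provider's pricing.

```python
# Hypothetical day: servers needed during each of the 24 hours (illustrative only).
servers_needed = [2] * 8 + [6] * 10 + [20] * 2 + [6] * 4


def server_hours_fixed(profile):
    """Static provisioning: run enough servers for the peak, all day long."""
    return max(profile) * len(profile)


def server_hours_autoscaled(profile):
    """Autoscaling: run only what each hour actually needs."""
    return sum(profile)


if __name__ == "__main__":
    fixed = server_hours_fixed(servers_needed)
    scaled = server_hours_autoscaled(servers_needed)
    print(f"fixed: {fixed} server-hours, autoscaled: {scaled} server-hours")
    print(f"share of server-hours saved: {1 - scaled / fixed:.0%}")
```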
For both reasons, autoscaling is ideally suited to enterprises whose workloads vary, whether predictably or not. This applies to e-commerce websites facing traffic increases during calendar events (sales, end-of-year holidays, etc.), media outlets whose load rises sharply with the news, and any company that attracts publicity (events, media appearances, planned or not).
Beyond these prominent examples, most industries can benefit from an autoscaling strategy: it also addresses needs beyond the web, especially for companies running heavily loaded internal applications.
How can an effective autoscaling strategy be put in place?
But autoscaling is not magic: the automatic sizing of resources is never spontaneous. It requires a suitable strategy, supported by relevant indicators, to determine precisely how many resources are needed in each period of activity.
Such a strategy must be based on a monitoring service. The information collected helps to relate machine consumption to the bandwidth used at any time. It helps answer the following questions:
- What are the periods during which the number of visitors to the site is growing strongly?
- What are the time slots or intervals of the year when activity declines drastically?
In sum, this information is the pillar on which your autoscaling strategy is built, so that it matches your needs as closely as possible.
A monitoring agent is installed on each machine to be monitored and sends the information it collects back to the monitoring server(s). The data is displayed on a metrics dashboard and can be used immediately. These processes are fully automated and run in real time.
But to be effective, an autoscaling strategy depends on continuous improvement. The monitoring agent collects new metrics at regular intervals, for instance every minute for more precise results.
There are two benefits:
- Learning is constant: new information arrives all the time and improves the accuracy of the service. Over time, your strategy fits your needs more and more closely, increasing resources for the busy periods of the year and decreasing them for the quiet ones.
- Unforeseen events are handled: continuous monitoring helps to eliminate the risks associated with unexpected overloads. Your brand has been mentioned in the media and your website is experiencing a significant increase in traffic? When the alert threshold for resource shortage is reached, autoscaling is triggered to absorb the additional load.
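The threshold mechanism described above can be sketched as follows. This is a minimal, hypothetical example rather than the behaviour of any particular monitoring product: the class name, the 75% and 30% CPU thresholds, and the window size are all assumptions. It averages the most recent per-minute samples and adds or removes one server when the average crosses a threshold.

```python
from collections import deque
from statistics import mean


class ThresholdScaler:
    """Reactive scaler driven by per-minute CPU samples (illustrative sketch)."""

    def __init__(self, scale_out_at=75.0, scale_in_at=30.0,
                 window=5, min_servers=2, max_servers=20):
        self.scale_out_at = scale_out_at   # average CPU % that triggers scale-out
        self.scale_in_at = scale_in_at     # average CPU % that triggers scale-in
        self.min_servers = min_servers
        self.max_servers = max_servers
        self.samples = deque(maxlen=window)
        self.servers = min_servers

    def observe(self, cpu_percent: float) -> int:
        """Record one sample and return the (possibly updated) server count."""
        self.samples.append(cpu_percent)
        if len(self.samples) < self.samples.maxlen:
            return self.servers            # not enough data for a decision yet
        average = mean(self.samples)
        if average >= self.scale_out_at and self.servers < self.max_servers:
            self.servers += 1
            self.samples.clear()           # wait for a fresh window before acting again
        elif average <= self.scale_in_at and self.servers > self.min_servers:
            self.servers -= 1
            self.samples.clear()
        return self.servers


if __name__ == "__main__":
    scaler = ThresholdScaler(window=3)
    for cpu in [40, 85, 90, 95, 92, 88]:
        print(f"cpu={cpu}% -> {scaler.observe(cpu)} servers")
```

Averaging over a window and clearing it after each action avoids reacting to a single noisy sample, at the price of a short delay before each scaling decision.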
As such, an effective strategy may also rely on “proactive” rather than “reactive” autoscaling. In some sectors, such as e-commerce, retailers know in advance when peak activity will occur and can therefore schedule resources to be added ahead of predictable increases. This approach, which could be called “planned” scaling, anticipates overloads before monitoring-based autoscaling has time to take effect, since automatic scaling is not instantaneous.
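One simple way to combine planned and reactive scaling is to treat planned events as a minimum number of servers and let the reactive target raise the count further if needed. The sketch below assumes a hypothetical schedule format (start, end, minimum servers); the dates, the function names and the numbers are illustrative only.

```python
from datetime import datetime

# Hypothetical planned events: (start, end, minimum servers to keep running).
PLANNED_FLOORS = [
    (datetime(2030, 11, 29, 6, 0), datetime(2030, 11, 30, 0, 0), 15),
]


def planned_floor(now, schedule=PLANNED_FLOORS, default_floor=2):
    """Return the minimum server count required by the schedule at `now`."""
    floors = [floor for start, end, floor in schedule if start <= now < end]
    return max(floors, default=default_floor)


def target_servers(now, reactive_target, schedule=PLANNED_FLOORS):
    """Combine both approaches: never run fewer servers than planned."""
    return max(reactive_target, planned_floor(now, schedule))


if __name__ == "__main__":
    # During the planned event, the floor applies even if monitoring asks for less.
    print(target_servers(datetime(2030, 11, 29, 7, 0), reactive_target=4))
    # Outside the event, the reactive target decides.
    print(target_servers(datetime(2030, 11, 28, 7, 0), reactive_target=4))
```

With this design, capacity is already in place when the planned peak begins, while monitoring-based scaling remains free to add more if the peak is larger than expected.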
One of the fundamentals of the public cloud is the automatic sizing of resources based on workload: without autoscaling, it is virtually impossible to manage instances efficiently in real time. But the implementation of an autoscaling strategy must be carefully planned, and the choice of a managed service provider plays a crucial role in it: it is essential to work with a partner that has experience in your sector, understands your needs, and works transparently. These qualities are at the heart of Iguana Solutions’ value proposition!
- Autoscaling tunes platform resources to optimize performance and reduce costs.
- A good autoscaling policy is not fixed but evolves over time; it assumes continuous improvement. For this reason, it must be based on accurate and reliable monitoring indicators.
- Automatic scaling doesn’t mean instant scaling. It is therefore sometimes useful to plan proactive actions.