Autoscaling is the ability of Google Compute Engine to automatically add or remove virtual machine instances according to an autoscaling policy. The most common use is to scale out when demand on a service spikes.

The policies an autoscaler can use to decide when to scale are:

  • Average CPU utilization
  • HTTP load balancing serving capacity
    • Max CPU
    • Max requests per second
  • Stackdriver Monitoring metrics, standard or custom (Stackdriver is now Cloud Monitoring)
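
As a rough illustration of how a target-based policy such as average CPU utilization turns into an instance count, the sketch below uses a simple proportional rule. It is a simplification for intuition only, not Google's actual autoscaler algorithm, and the function name `recommended_size` and its parameters are hypothetical.

```python
def recommended_size(current_size, observed_pct, target_pct, min_size, max_size):
    """Estimate how many instances keep average utilization near the target.

    Utilization values are whole percentages (for example 90 for 90%).
    Assumption: load spreads evenly, so the total load is simply
    current_size * observed_pct. A real autoscaler is more involved.
    """
    if current_size <= 0 or target_pct <= 0:
        raise ValueError("current_size and target_pct must be positive")
    total_load = current_size * observed_pct
    # Integer ceiling division: round up so the group is never undersized.
    desired = -(-total_load // target_pct)
    # Clamp to the limits set in the policy.
    return max(min_size, min(max_size, desired))


# 4 instances at 90% CPU with a 60% target: scale out.
print(recommended_size(4, 90, 60, min_size=1, max_size=10))
# 4 instances at 30% CPU with a 60% target: scale in.
print(recommended_size(4, 30, 60, min_size=1, max_size=10))
```

The minimum and maximum sizes matter as much as the target: they cap cost on the way out and preserve a baseline of capacity on the way in.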

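Autoscaling operates on a managed instance group of virtual machines; the load balancing policy additionally requires a load balancer configured in front of the group. The group creates each new VM from an instance template, so think of the template as the blueprint GCP uses to spin up additional servers.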

To make autoscaling work, you must:

  • Prepare your image with the boot sequence and software it needs
  • Define your autoscaler policy (when to scale out and when to scale in)
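
The policy step can be sketched as building the request body for an autoscaler. The field names below are assumed to follow the Compute Engine `autoscalers` REST resource and should be checked against the current API reference. The project, zone, and group names are placeholders, and `build_autoscaler_body` is a hypothetical helper, not part of any Google library.

```python
import json


def build_autoscaler_body(name, group_url, min_replicas, max_replicas,
                          cpu_target, cooldown_sec=60):
    """Build a dict shaped like a Compute Engine autoscaler resource."""
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    if not 0.0 < cpu_target <= 1.0:
        raise ValueError("cpu_target is a fraction in (0, 1]")
    return {
        "name": name,
        # URL of the managed instance group this autoscaler controls.
        "target": group_url,
        "autoscalingPolicy": {
            "minNumReplicas": min_replicas,
            "maxNumReplicas": max_replicas,
            # Time allowed for a new VM to finish booting before its
            # metrics are trusted.
            "coolDownPeriodSec": cooldown_sec,
            "cpuUtilization": {"utilizationTarget": cpu_target},
        },
    }


# Placeholder resource URL; substitute a real project, zone, and group.
GROUP_URL = ("https://www.googleapis.com/compute/v1/projects/example-project"
             "/zones/us-central1-a/instanceGroupManagers/example-group")

body = build_autoscaler_body("example-autoscaler", GROUP_URL, 2, 10, 0.6)
print(json.dumps(body, indent=2))
```

A single utilization target covers both directions: the group scales out when the average rises above it and scales in when the average falls below it, within the replica limits.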

For further information on autoscaling, see the Google documentation.