GCP Load Balancing
As web traffic grows, so does the need to handle it efficiently. Load balancing addresses this by distributing traffic across multiple servers. Google Cloud Platform (GCP) offers a load-balancing service with features that help businesses improve their website’s performance and reliability.
Load balancing is an essential tool for managing web traffic. It ensures that your website remains available and responsive even during periods of high traffic. By distributing requests across multiple servers, load balancing helps to prevent bottlenecks and downtime. Additionally, load balancing can help optimize resource utilization, reducing costs and improving efficiency.
In this blog post, we’ll explore GCP Load Balancing in-depth, covering its types, benefits, and how to configure and use it.
What is Load Balancing?
Load balancing is the process of distributing network traffic across multiple servers to ensure that no single server is overwhelmed. It helps to improve website performance, reliability, and scalability by spreading the workload across multiple servers. Load balancing can be implemented at different levels, including network, transport, and application.
At the network level, load balancing can be used to distribute traffic across multiple data centres or geographic regions. At the transport level, load balancing can be used to balance traffic across multiple servers within a data centre. At the application level, load balancing can be used to distribute traffic across multiple instances of an application running on different servers.
Load balancing can be achieved through various methods, including round-robin, IP hash, and least connections. Each method has its advantages and disadvantages, and the choice of method depends on the application’s specific needs.
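To make the three methods concrete, here is a minimal Python sketch of each selection strategy. The class names, the `pick` and `release` methods, and the server labels are hypothetical and chosen only for illustration; this is not how GCP implements them.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self, client_ip=None):
        return next(self._rotation)


class IPHash:
    """Map each client IP to the same server on every request."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, client_ip):
        # A stable hash keeps the mapping identical across processes.
        digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._servers)
        return self._servers[index]


class LeastConnections:
    """Send each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def pick(self, client_ip=None):
        # Ties go to the server listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        self._active[server] -= 1
```

Round-robin is the simplest but ignores how busy each server is, IP hash keeps a given client on one server, and least connections adapts to uneven request durations at the cost of tracking state.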
Why use GCP Load Balancing?
GCP Load Balancing offers several benefits that make it an attractive option for businesses looking to improve their website’s performance and reliability:
Scalability: GCP Load Balancing can scale to handle millions of requests per second, making it suitable for even the largest websites and applications.
Types of Load Balancers in GCP
GCP Load Balancing offers four types of load balancers, each designed for different use cases:
Network Load Balancing
Network Load Balancing is a regional service that distributes traffic across backend instances within a single region. It operates at the transport layer and passes connections through to the backends rather than terminating them, which helps to keep latency low. Network Load Balancing is ideal for applications that require low latency and high availability, such as gaming or streaming services.
Network Load Balancing supports both TCP and UDP traffic and can handle millions of requests per second.
HTTP(S) Load Balancing
HTTP(S) Load Balancing is designed to distribute HTTP and HTTPS traffic across multiple instances of an application running on different servers. It uses advanced algorithms to route traffic to the backend that can handle the request most efficiently based on factors such as server health, available capacity, and proximity to the client.
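As a rough illustration of routing on health, capacity, and proximity, the sketch below filters out unhealthy or saturated backends and then picks the one with the lowest latency. The `Backend` fields and the `choose_backend` function are assumptions made for this example, with latency standing in for proximity; Google's actual algorithm is not described in this post.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool        # result of the most recent health check
    capacity_rps: int    # hypothetical: maximum requests per second
    current_rps: int     # hypothetical: current requests per second
    latency_ms: float    # hypothetical stand-in for proximity to the client


def choose_backend(backends):
    """Return the nearest healthy backend with spare capacity, or None."""
    candidates = [
        b for b in backends
        if b.healthy and b.current_rps < b.capacity_rps
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.latency_ms)
```

For example, a backend that is closest but fails its health check is skipped, and so is a healthy one that is already at capacity.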
HTTP(S) Load Balancing supports both IPv4 and IPv6 traffic and can handle millions of requests per second. It also offers advanced features such as SSL offloading, content-based routing, and session affinity.
Internal Load Balancing
Internal Load Balancing is designed to distribute traffic across multiple instances of an application running within a VPC network. It uses private IP addresses to route traffic to the backend, ensuring that traffic stays within the VPC network and does not traverse the public internet.
Internal Load Balancing is ideal for applications that require high-speed and secure communication between backend services, such as microservices architectures.
Global Load Balancing
Global Load Balancing is designed to distribute traffic across multiple regions or data centres. It also offers features such as content-based routing, SSL offloading, and session affinity, making it suitable for more complex applications.
Global Load Balancing uses Google’s global anycast IP addresses to route traffic to the nearest available backend that can handle the request. This helps to minimize latency and improve performance.
How to Configure and Use GCP Load Balancing
Configuring and using GCP Load Balancing is straightforward and can be done using the GCP Console, CLI, or API. The steps involved include:
Create a target pool: A target pool is a group of backend instances that receive traffic from the load balancer. Target pools are used with Network Load Balancing; proxy-based load balancers such as HTTP(S) Load Balancing use backend services instead.
In conclusion, GCP Load Balancing is a powerful tool for managing web traffic and improving website performance and reliability. It offers a range of load balancing types, each designed for different use cases and with a range of features and options to meet the specific needs of your application. Whether you’re running a small website or a large-scale application, GCP Load Balancing can help you achieve your performance and reliability goals.
Load Balancing Fact Sheet
Types of Load Balancing
There are two types of load balancing (LB): global and regional.
|Global external load balancing|Regional external load balancing|
|HTTP(S) Load Balancing|Network Load Balancing|
|SSL Proxy Load Balancing|Regional Internal Load Balancing|
|TCP Proxy Load Balancing|Internal Load Balancing|
The following explains each type of load balancer available on GCP.
HTTP(S) Load Balancing
- Global load balancing of HTTP(S) traffic
- URL rules can be configured
- Traffic is routed to the closest instance group
- Cross-region load balancing
- Load is balanced by one of two methods
  - Requests per second
  - CPU utilization
- Session affinity
  - Client IP affinity
  - Cookie affinity
- WebSocket proxy support
  - 30-second timeout set by default
  - Timeout can be increased via the API
- LB interfaces
  - gcloud CLI
  - GCP Console
  - The REST API
- LB timeouts and retries
  - Timeout is 30 seconds
  - TCP session times out after 10 minutes (600 seconds)
  - API: GET requests are retried, POST requests are not
- LB is logged by Stackdriver
- Server firewall must be configured, if one is used
- LB does not keep instances in sync
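The rule that GET requests are retried while POST requests are not follows from idempotency: repeating a GET is safe, whereas repeating a POST could apply a side effect twice. The sketch below shows that policy in isolation. The `send_with_retry` function, its `max_attempts` parameter, and the use of `ConnectionError` are assumptions for illustration, not the load balancer's real interface.

```python
def send_with_retry(method, send, max_attempts=3):
    """Call send() and retry on connection failure, but only for GET.

    `send` is any zero-argument callable that performs the request.
    """
    # Non-GET methods get a single attempt so side effects are never repeated.
    attempts_allowed = max_attempts if method.upper() == "GET" else 1
    last_error = None
    for _ in range(attempts_allowed):
        try:
            return send()
        except ConnectionError as error:
            last_error = error
    raise last_error
```

A caller that needs POST requests to survive a backend failure has to handle the retry itself, ideally with its own safeguard against duplicate submissions.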
Typical HTTP Load balancer setup
Illegal request handling
For HTTP/1.1 compliance, the load balancer blocks a request in any of the following cases:
- It cannot parse the first line of the request.
- A header is missing the : delimiter.
- Headers or the first line contain invalid characters.
- The content length is not a valid number, or there are multiple content length headers.
- There are multiple transfer encoding keys, or there are unrecognized transfer encoding values.
- There’s a non-chunked body and no content length specified.
- Body chunks are unparseable. This is the only case where some data will make it to the backend. The load balancer will close the connections to client and backend when it receives an unparseable chunk.
The load balancer also blocks the request if any of the following are true:
- The combination of request URL and headers is longer than about 15KB.
- The request method does not allow a body, but the request has one.
- The request contains an upgrade header.
- The HTTP version is unknown.
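Several of the checks above can be sketched as a small validator over the request line and headers. This is an illustrative subset only: the `rejection_reason` function, the set of known versions, and the exact 15 * 1024 byte limit are assumptions made for this example, and body-related checks are left out.

```python
MAX_HEAD_BYTES = 15 * 1024  # "about 15KB" above; the exact figure is an assumption
KNOWN_VERSIONS = {"HTTP/1.0", "HTTP/1.1"}  # assumption for this sketch


def rejection_reason(head):
    """Return why a request would be blocked, or None if it passes.

    `head` is the request line plus headers as a string, without the body.
    """
    if len(head.encode("utf-8")) > MAX_HEAD_BYTES:
        return "request line and headers too long"

    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return "cannot parse first line"
    if parts[2] not in KNOWN_VERSIONS:
        return "unknown HTTP version"

    content_lengths = []
    for line in lines[1:]:
        if not line:
            continue  # blank line that ends the header block
        if ":" not in line:
            return "header missing ':' delimiter"
        name, value = line.split(":", 1)
        name = name.strip().lower()
        if name == "upgrade":
            return "upgrade header present"
        if name == "content-length":
            content_lengths.append(value.strip())

    if len(content_lengths) > 1:
        return "multiple content-length headers"
    if content_lengths:
        value = content_lengths[0]
        if not (value.isascii() and value.isdigit()):
            return "invalid content length"
    return None
```

Running a check like this in a test suite can help catch malformed client requests before they reach the load balancer.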
SSL Proxy Load Balancing
- SSL (TLS) connections are terminated at the load balancing layer, then balanced across all instances
- Intelligent routing
- Better use of instances
- Certificate management
- Security patching
- Supports ports 25, 43, 110, 143, 195, 443, 465, 587, 700, 993, 995
- Health checking
- Backend services
- SSL certificate and key
- Global forwarding rules
TCP Proxy Load Balancing
- Same properties as SSL Proxy Load Balancing
Internal Load Balancing
- Scales services behind a private LB IP that is accessible only to instances in the VPC
- Lower latency (traffic stays within the GCP network)
- Supports auto mode VPC, custom mode VPC and legacy networks
- Can be implemented with regional managed instance groups (enables autoscaling across zones within a region)
- LB selection algorithm
  - By default, the internal LB uses a 5-tuple hash
    - Client source IP
    - Client port
    - Destination IP (the LB IP)
    - Destination port
    - Protocol (either TCP or UDP)
  - To control how traffic is distributed to backends, use one of the following options
    - 3-tuple hash (client IP, destination IP, protocol)
    - 2-tuple hash (client IP, destination IP)
- Internal to GCP only
- Cannot send traffic to VPN tunnel
- 50 rules max
- 250 forwarding rules max
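The difference between the 5-tuple, 3-tuple, and 2-tuple hashes is easiest to see in code. The sketch below hashes the chosen fields and maps the result onto a backend list. The `pick_backend` function, its `mode` parameter, and the use of SHA-256 are assumptions for illustration; the hash function GCP actually uses is not stated here.

```python
import hashlib


def pick_backend(backends, client_ip, client_port, dest_ip, dest_port,
                 protocol, mode="5-tuple"):
    """Choose a backend by hashing a subset of the connection fields."""
    if mode == "5-tuple":
        key = (client_ip, client_port, dest_ip, dest_port, protocol)
    elif mode == "3-tuple":
        key = (client_ip, dest_ip, protocol)
    elif mode == "2-tuple":
        key = (client_ip, dest_ip)
    else:
        raise ValueError(f"unknown mode: {mode}")

    # Hash the selected fields and map the result onto the backend list.
    digest = hashlib.sha256("|".join(map(str, key)).encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

With the 5-tuple hash, a client that opens connections from different source ports may land on different backends. The 2-tuple and 3-tuple hashes leave the ports out of the key, so all connections from one client IP to the LB IP reach the same backend, which is the control over backend traffic that the options above provide.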
Network Load Balancing
- Balances load based on incoming IP data: address, port and protocol
- Routes traffic to multiple backend services
- Load distribution algorithm
- Target pools
- Session affinity
- Health checking
- Firewall rules and network load balancing
- Connection draining
  - Can be triggered manually or by the autoscaler
  - A timeout duration must be set (1-3600 seconds)
  - Existing user sessions terminate gracefully; new sessions are re-routed
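Connection draining can be pictured as a small state machine: once draining starts, the backend refuses new sessions but keeps existing ones until the timeout expires. The `DrainingBackend` class below is a hypothetical model for illustration only. It takes the current time as an argument so the behaviour is easy to follow, and it enforces the 1-3600 second range noted above.

```python
class DrainingBackend:
    """Toy model of a backend instance that supports connection draining."""

    MIN_TIMEOUT_S = 1
    MAX_TIMEOUT_S = 3600

    def __init__(self, name, drain_timeout_s):
        if not self.MIN_TIMEOUT_S <= drain_timeout_s <= self.MAX_TIMEOUT_S:
            raise ValueError("drain timeout must be between 1 and 3600 seconds")
        self.name = name
        self.drain_timeout_s = drain_timeout_s
        self.draining_since = None  # time draining began, or None
        self.sessions = set()

    def start_draining(self, now):
        # Triggered manually or by an autoscaler removing the instance.
        self.draining_since = now

    def accepts_new_sessions(self):
        return self.draining_since is None

    def open_session(self, session_id):
        if not self.accepts_new_sessions():
            raise RuntimeError("backend is draining; route the session elsewhere")
        self.sessions.add(session_id)

    def tick(self, now):
        """Drop any remaining sessions once the drain timeout has elapsed."""
        if (self.draining_since is not None
                and now - self.draining_since >= self.drain_timeout_s):
            self.sessions.clear()
```

A longer timeout gives in-flight sessions more time to finish, at the cost of keeping the instance around longer before it can be removed.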