What is Load Balancing?
Load balancing is the process of distributing network traffic or computational workloads across multiple servers or resources. It aims to optimize resource use, maximize throughput, minimize response time, and avoid overload on any single resource, thus improving availability and reliability of applications.
Imagine a popular website; if all users connected to a single server, it would quickly become overwhelmed. Load balancers act as a "traffic cop" sitting in front of your servers, routing client requests across all healthy servers capable of fulfilling those requests.
Common Load Balancing Algorithms
1. Round Robin
This is one of the simplest methods. Requests are distributed sequentially across the group of servers in a rotating manner.
- How it works: The first request goes to server 1, the second to server 2, the third to server 3, and so on. When it reaches the last server, the next request goes back to server 1.
- Pros: Very simple to implement, predictable distribution.
- Cons: Doesn't account for server capacity or current load. A slow or overloaded server will still receive requests in its turn.
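The rotation described above can be sketched in a few lines of Python. The server names here are hypothetical placeholders, and `itertools.cycle` does the rotating for us:

```python
from itertools import cycle

# Hypothetical backend pool for illustration.
servers = ["server1", "server2", "server3"]

def round_robin(servers):
    """Yield servers in a fixed, repeating order."""
    return cycle(servers)

rr = round_robin(servers)
assignments = [next(rr) for _ in range(5)]
# First five requests go to:
# server1, server2, server3, server1, server2
```

Note that this sketch keeps no state about server health or load; a real balancer would skip servers that fail health checks before handing out the next slot.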
2. Least Connections
This algorithm is more dynamic. It directs traffic to the server with the fewest active connections at the time the request is received.
- How it works: The load balancer tracks the number of open connections to each server. The next incoming request is assigned to the server with the lowest number.
- Pros: Considers current server load (approximated by connection count), generally leads to better performance than Round Robin if requests have varying completion times.
- Cons: Requires the load balancer to actively track connections. Can sometimes cause issues if connections are long-lived but idle. Ties (multiple servers with the same lowest count) need a tie-breaking rule (often Round Robin).
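A minimal sketch of this bookkeeping, with hypothetical server names and Python's `min()` providing the first-server tie-break (real balancers often rotate among tied servers instead):

```python
# Active connection count per server (hypothetical names).
active = {"server1": 0, "server2": 0, "server3": 0}

def pick_least_connections(active):
    # min() returns the first server among ties, giving a simple
    # deterministic tie-break.
    return min(active, key=active.get)

def on_request(active):
    server = pick_least_connections(active)
    active[server] += 1          # connection opened
    return server

def on_finish(active, server):
    active[server] -= 1          # connection closed

s1 = on_request(active)   # server1 (all tied at 0)
s2 = on_request(active)   # server2 (fewest remaining)
on_finish(active, s1)     # server1 frees its connection
s3 = on_request(active)   # server1 again, back to 0 connections
```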
3. Weighted Round Robin (WRR)
This is a variation of Round Robin designed for servers with different capacities. Each server is assigned a weight (an integer), usually based on its processing power or memory. Servers with higher weights receive proportionally more requests.
- How it works: If Server A has weight 3 and Server B has weight 1, a common WRR sequence might be A, A, A, B, A, A, A, B... More sophisticated implementations might use Greatest Common Divisor (GCD) calculations for smoother distribution.
- Pros: Allows utilizing servers with different capabilities effectively.
- Cons: Still doesn't account for dynamic load changes unless weights are adjusted manually or dynamically. Defining accurate weights can be challenging.
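One of the "more sophisticated" smoothing schemes mentioned above is the smooth weighted round robin used by nginx, which interleaves picks rather than sending bursts to the heaviest server. A sketch with hypothetical weights (A three times the capacity of B):

```python
def weighted_round_robin(weights):
    """Smooth weighted round robin: each turn, add every server's
    weight to its running score, pick the highest score, then
    subtract the total weight from the winner."""
    current = {s: 0 for s in weights}
    total = sum(weights.values())
    while True:
        for s, w in weights.items():
            current[s] += w
        best = max(current, key=current.get)
        current[best] -= total
        yield best

# Hypothetical weights: A should receive 3x the traffic of B.
wrr = weighted_round_robin({"A": 3, "B": 1})
sequence = [next(wrr) for _ in range(8)]
# Yields A, A, B, A, A, A, B, A -- over every 4 picks,
# A appears 3 times and B once, but B is never starved for long.
```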
4. IP Hash
This algorithm uses the client's IP address to determine which server receives the request. A hash function is applied to the client's source IP address.
- How it works: Compute hash(client_IP) % number_of_servers; the result determines the server index.
- Pros: Ensures that requests from a specific client consistently go to the same server (session persistence or "stickiness"). This is useful for applications that store session state on the server.
- Cons: Can lead to uneven distribution if many clients originate from behind a single NAT gateway (sharing the same public IP) or if the hash function doesn't distribute well. Doesn't adapt to server load or health changes inherently (a client is stuck with its assigned server even if it's slow). Doesn't work well if the number of servers changes frequently (causes remapping).
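The hash-modulo rule above can be sketched as follows. The addresses are hypothetical; note that a stable hash (here `hashlib`) is used instead of Python's built-in `hash()`, which is randomized per process for strings and would break stickiness across balancer restarts:

```python
import hashlib

# Hypothetical backend pool.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def pick_server(client_ip, servers):
    # Stable hash of the client IP, reduced modulo the pool size.
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

# The same client IP always maps to the same server.
a = pick_server("203.0.113.7", servers)
b = pick_server("203.0.113.7", servers)
# a == b on every run
```

This also illustrates the remapping problem: changing `len(servers)` changes the modulus, so most clients get reassigned; consistent hashing is the usual remedy.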
Visualize and Play
Configure servers, select an algorithm, and send requests to see how they are distributed.