Load Balancing Basics

Server load balancing is a technique, implemented in dedicated hardware or in software, for distributing IP-based requests from the Internet or an intranet across the servers in a server farm, so that no single server carries the whole workload.

The most common approach is to group servers into clusters, especially high-availability clusters.

After the initial setup, the administrator adapts the balancing methods, or scheduling rules, to your specific requirements. Put simply, server load balancing takes several individual servers and makes them appear as one giant server. This is called clustering. In the most extreme cases, you can even run several clusters of servers and load balance across them. You might see this on a high-demand site like YouTube.

Supported Server Protocols

The server load balancer supports:

  • Most UDP- and TCP/IP-based protocols
  • DNS
  • HTTP
  • HTTPS (SSL)
  • SMTP
  • POP
  • IMAP
  • NNTP
  • FTP

How server load balancing can benefit your business

There are many reasons a company benefits from upgrading to a load-balanced solution, but the most common are scalability, high availability, and predictability. Consider a company whose website is hosted on a single dedicated server and accessed thousands of times a day. Regardless of the company's size, its IT managers need to be confident that:

  • The site can handle sudden and/or gradual increases in traffic;
  • The site is available all the time, 24 hours a day, 7 days a week, 365 days a year;
  • The site gives a consistent experience on each visit.

Applications Supported

The load balancer also supports popular applications such as:

  • Streaming Media
  • Active Server Pages (ASP)
  • SQL
  • UDP/IP-based protocols that include DNS, WAP, RADIUS, and others.

Meanwhile, back on the farm...

1. Scalability

Scalability lets you handle sudden and/or gradual increases in traffic. Ideally, as you grow your web presence, you will have more people visiting your site, viewing more pages, and making more requests. If you have a single server, these increases can overwhelm your server and result in downtime. If you need to upgrade, you will experience similar downtime as you move to the new server. Server load balancing allows you to seamlessly scale your hardware as needed.

Imagine you have a farm, and each year you plow a bigger field—or one year start a second field. If you have a one-horse plow, you might plan to buy a younger, stronger horse or get a new plow, but you'd need to take time away from farming to go find and buy your selections. Instead, you could simply add a second horse to the team to balance the extra work needed. Eventually, you may need to have two teams of two-horse plows, but you can easily plan to scale up to that.

The field is like your web traffic, and the horses are like your servers. By adding a second server, you can balance the workload across both, and if needed, you can add an entire additional cluster of servers, scaling up as far as you need.
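As a rough illustration of the scaling idea above, the following Python sketch deals a fixed number of requests across a pool in turn and counts how many each server receives as servers are added. The server names and the request count are invented for the example, and real traffic is rarely this uniform.

```python
from collections import Counter
from itertools import cycle, islice


def spread_requests(servers, request_count):
    """Deal requests to servers in turn and count how many each receives."""
    return Counter(islice(cycle(servers), request_count))


# Hypothetical server names and traffic figure, for illustration only.
one_server = spread_requests(["web1"], 1200)
two_servers = spread_requests(["web1", "web2"], 1200)
three_servers = spread_requests(["web1", "web2", "web3"], 1200)

print(one_server["web1"])     # 1200
print(two_servers["web1"])    # 600
print(three_servers["web1"])  # 400
```

Each server added to the pool lowers the share that any one server has to carry, which is the "second horse" in the analogy.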

2. High Availability

High availability means your site is available all the time, 24x7x365. With only a single server, the site must be taken down whenever you perform maintenance or upgrade your hard drive or RAM. During that downtime, customers might find their way to your competition. With server load balancing, you can upgrade or run maintenance on one server while its requests are automatically sent to the rest of the servers in the cluster.

Using our farm analogy, if you have a one-horse team, your horse may get sick or need to be reshod, meaning you can't plow your field. If you have a multi-horse team, the rest of the horses can work a bit harder to complete the plowing without anyone really noticing the difference.

3. Predictability

Predictability gives users a consistent experience each time they visit. More people may visit your site at certain times of day, or you may run a promotion or release a new service. The resulting surge is similar to the Slashdot effect, where a link from a popular site sends a sudden flood of visitors. With a single server, peak times may mean a painfully slow website or web application, and frustrated customers may leave. With multiple servers, these peaks are absorbed by sharing the workload.

Comparing this to our farm, in the summer your trusty horse may slow down or may not be able to work at all in the afternoon sun. With several horses working together, the heat isn't as big an issue because the workload is shared.

Adding a more efficient server or improved network hardware alone will not meet all of your requirements, because it improves only performance; this is like simply replacing your horse or your plow. To attain a high level of availability with minimum downtime and fast access speeds, we recommend that you have two or more dedicated servers operating simultaneously. These mirrored servers must then be load balanced so that failover is automatic and poor application performance on any online server is detected. If one mirror server fails, another mirror server takes over automatically. The server load balancer tracks the load on each server, so it can direct each query to the server best able to handle it.
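The failover behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer is implemented: the Pool class, its mark_down and mark_up methods, and the server names are all hypothetical, and a real product would set the health flags from periodic health checks rather than by hand.

```python
class Pool:
    """Round-robin pool that skips servers marked as down (assumed design)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = {name: True for name in self.servers}
        self._next = 0

    def mark_down(self, name):
        """Take a server out of rotation (failure or planned maintenance)."""
        self.healthy[name] = False

    def mark_up(self, name):
        """Return a server to rotation."""
        self.healthy[name] = True

    def pick(self):
        """Return the next healthy server, or raise if none is available."""
        for _ in range(len(self.servers)):
            name = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if self.healthy[name]:
                return name
        raise RuntimeError("no healthy servers in the pool")


# Hypothetical mirror names, for illustration only.
pool = Pool(["mirror1", "mirror2"])
pool.mark_down("mirror1")  # simulate a failure or a maintenance window
during_outage = [pool.pick() for _ in range(3)]
pool.mark_up("mirror1")
after_recovery = [pool.pick() for _ in range(2)]

print(during_outage)   # ['mirror2', 'mirror2', 'mirror2']
print(after_recovery)  # ['mirror1', 'mirror2']
```

While mirror1 is out of rotation, every request lands on mirror2; once it is marked up again, requests alternate between the two as before.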

Different Methods for Server Load Balancing

Several methods for server load balancing are listed below.
NOTE: The best method for your business depends on your application and the types of servers that you have running.

  • Round Robin - This is the most common method of server load balancing. Each load-balanced server is assigned an arbitrary label (e.g., A, B, C, D), and visitors are sent to the servers in that order. The first visitor goes to A, the second to B, and so on. In this example, the fifth visitor would be directed back to server A.
  • Weighted Round Robin - This is a similar method that is often used when the servers being balanced differ in capacity. For example, you may have a single-processor dual-core server and a dual-processor quad-core server. Suppose the dual-processor quad-core server is approximately twice as fast and can handle twice the workload of the single-processor dual-core server. In this example, you could have the first and second visitors sent to the dual-processor quad-core server, while the third visitor would be sent to the single-processor dual-core server. Here the servers are given a 2:1 ratio, or weight, meaning twice as many visitors are sent to the more powerful server.
  • Least Connections - This method checks which server currently has the fewest connections (the least activity in terms of the number of visitors making requests) and sends new visitors to that server. It is recommended if your application has a big variance in session length, with a mix of very short and very long sessions. For HTTP, it generally does not work as well as Weighted Round Robin unless you have some large file downloads of 100 MB or more mixed in with web pages.
  • Fastest Response - As the name implies, this method of server load balancing checks to see which server responds to a query the fastest and sends new visitors to that server. This works well for applications where response time of the application and the operating system is directly related to the server load. However, this method is not advisable for most web servers.
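
The four methods above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation used by any particular load balancer: the function names, server labels, connection counts, and response times are all invented, and the weighted variant simply repeats each server according to its weight rather than interleaving them.

```python
from itertools import cycle


def round_robin(servers):
    """Yield servers in a fixed repeating order: A, B, C, D, A, ..."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Yield each server `weight` times per pass (simple, non-interleaved form)."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def least_connections(active):
    """Pick the server with the fewest open connections."""
    return min(active, key=active.get)


def fastest_response(response_ms):
    """Pick the server with the lowest last-measured response time."""
    return min(response_ms, key=response_ms.get)


# Round Robin: the fifth visitor goes back to server A.
rr = round_robin(["A", "B", "C", "D"])
first_five = [next(rr) for _ in range(5)]
print(first_five)  # ['A', 'B', 'C', 'D', 'A']

# Weighted Round Robin with a 2:1 ratio (hypothetical server labels).
wrr = weighted_round_robin({"big": 2, "small": 1})
first_six = [next(wrr) for _ in range(6)]
print(first_six)  # ['big', 'big', 'small', 'big', 'big', 'small']

# Least Connections and Fastest Response, using made-up measurements.
fewest = least_connections({"A": 12, "B": 3, "C": 7})
fastest = fastest_response({"A": 40.0, "B": 95.5, "C": 22.3})
print(fewest)   # B
print(fastest)  # C
```

Round Robin and Weighted Round Robin need no information about the servers' current state, while Least Connections and Fastest Response depend on the balancer keeping live measurements for each server.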