Scaling your services with ZXTM Global Load Balancer
Contributed by: Zeus Technology, Inc.
However you measure it, the cost of application downtime can be very high for many organizations. For organizations that provide applications and services over the Internet, the probability of downtime is even higher.
There are two commonly used techniques to minimize the chance of a failure causing downtime in network-based applications. These are Server Load Balancing and Global Server Load Balancing.
Server Load Balancing within a Datacenter
Techniques like server load balancing and clustering are often used within a datacenter to build clusters of fault-tolerant, scalable applications. These clusters are resilient to isolated failures - for example, a server machine developing a hardware fault - and they allow the administrator to add more capacity to his application when required.
However, a clustered, fault-tolerant application running in a single datacenter is still vulnerable to downtime:
Organizations who wish to protect against these risks often choose to deploy a Global Server Load Balancing solution which routes application traffic to multiple distinct datacenters and removes the single point of failure.
Global Server Load Balancing between Datacenters
Global Server Load Balancing (GSLB) systems manage how clients are connected to a datacenter, when a service is hosted in multiple distinct datacenters.
The primary purpose of a GSLB system is Business Continuity - to ensure that services are always available, even when one or more service locations (datacenters) become unavailable.
A second purpose of GSLB is Improved Customer Experience - to load-balance each user to the best datacenter from a choice of several. The choice can be based on datacenter performance and proximity, so that clients are directed to the datacenter that is closest and performing best. This way, the client gets the best possible level of service.
Who might use a Global Server Load Balancing solution?
A GSLB solution is relevant to any organization:
1. That provides or depends on an internet-based service, such as a public-facing web site, or a network-based application for internal use.
2. That cannot countenance service failure, whether this results in lost productivity, lost revenue or lost customers.
3. That wishes to establish an advantageous SLA (service level agreement) with its users or customers, providing them with a superior and competitive level of service.
This white paper discusses the implementation details of a DNS-based Global Server Load Balancing solution, with particular reference to Zeus' ZXTM GLB product.
A specialist music and book retailer turns over orders in excess of $10,000 per day. Any period where users could not access the online shop would result in significant loss of revenue and reputation.
The retailer hosts their primary website in a hosting facility in New York, and replicates all database transactions to a second backup website in Boston. During normal operation, users are directed to the New York website, but if that website becomes unavailable, a GSLB system directs all users to the backup site in Boston.
When a contractor severed a fiber optic cable in the New York hosting facility, the GSLB device detected that the site was no longer accessible and immediately started directing users to the backup site in Boston instead. Because the database was continually replicated, users were able to continue with their transactions and complete their purchases.
Providing high levels of service
A UK-based publishing company publishes several prestigious scientific journals. Universities and research institutions across the world pay a subscription to access the content of these journals electronically.
A disaster recovery solution is required because the paid subscribers will not tolerate downtime. In addition, many of the subscribers in the US, Far East and Australasia report that the website is slow, and it can take too long to download the PDF content they have paid for.
The publishing company establishes mirror sites in the US and Japan and uses a GSLB device to seamlessly direct each user to the site that is geographically closest to them. Download times for many customers drop by up to 75%.
Upselling services to Hosting Customers
An innovative ISP was seeking additional services they could provide to their hosting customers.
Using data replication to a server platform located in a different datacenter, the ISP was able to synchronize customers' web content between two locations. With a GSLB device, the ISP was able to direct traffic for some customer sites to the City North datacenter, and other sites to the City South datacenter, and thus control and manage the bandwidth used by each datacenter.
The ISP's customer SLA contracts contained exclusions for major datacenter failure caused by elements outside the ISP's control. For an additional fee, the ISP was able to upsell a premium hosting package that included a datacenter failover service to minimize the risk of a datacenter failure rendering a customer's site inaccessible.
How does Global Server Load Balancing work?
DNS-based Global Server Load Balancing
The majority of GSLB devices function by manipulating the DNS (Domain Name System) resolution process.
An application such as a web browser needs to locate a service on the intranet before it can use it. Services are published using a Domain Name, such as www.zeus.com.
Behind the scenes, the application uses a process called 'DNS Resolution' to find out the IP Address of the internet server that provides the service with the given domain name. The DNS system is very much like a global internet phone book - you may know an individual by their full name (for example, "Tim Berners-Lee"), but you need to look up their phone number before you can get in touch with them.
Different servers in different locations will have different IP addresses. A GSLB device controls how domain names are resolved to IP addresses, and thus controls which datacenter clients are directed to.
Several users access http://www.zeus.com, but are directed to different datacenters:
In order to effectively deploy a GSLB solution, you need a good understanding of how the DNS system functions. For background reading, you may find the Zeus publication "A Layman's Guide to DNS" useful.
Other GSLB designs
A user accesses www.zeus.com, but receives a redirect sending him to us.zeus.com, which resolves to just one of the datacenters.
This method is effective at controlling precisely which datacenter a user is sent to, but it does not cater for datacenter failure, and users may bookmark or distribute links to us.zeus.com, bypassing the load-balancing decision.
Generally, this method needs to be combined with a DNS-based GSLB system to ensure that www.zeus.com is always available, and with a traffic management device to control how and when users are redirected.
With triangulation, incoming network traffic is distributed across one or more datacenters using round-robin DNS. When a datacenter receives a request, it determines whether it is best suited to respond to the request, or whether it should forward the request to a different datacenter.
With Layer 4 triangulation, the first datacenter forwards the request to the second datacenter, and the second responds directly to the remote client. The request and response data take three hops across the network. Layer 4 triangulation may not be possible if one of the service providers deploys egress filtering to defeat connection source-address spoofing (a technique often used to prevent SPAM email).
With Layer 7 triangulation, the first datacenter forwards the request to the second, and the second datacenter replies back to the first. The first datacenter then relays the response back to the client. The request and response data take four hops over the network.
Triangulation can load-balance very compute-intensive application requests, but it generally does not improve response time, it is bandwidth-intensive, and it does not cater for primary datacenter failure.
BGP Routing Control
BGP (Border Gateway Protocol) is the core routing protocol of the Internet. By manipulating BGP routing tables, it is possible to move blocks of IP addresses from one physical network location to another in a very different location.
BGP routing control can be used by an ISP to provide large-scale failover, but it is too expensive and coarse to provide fine-grained load balancing control for an individual service.
Introducing ZXTM Global Load Balancer
www.zeus.com is hosted in two different locations, with IP addresses 188.8.131.52 and 184.108.40.206. Without a GSLB device, the DNS server would normally be configured to return both of these IP addresses when queried about www.zeus.com. The IP addresses would be returned in a different order each time using a process called Round-Robin DNS, and clients would connect to one of the datacenters.
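The round-robin behavior described above can be sketched in a few lines of Python. This is a toy model rather than a real DNS server: the class name RoundRobinDNS, the domain www.example.com and the two documentation-range addresses are all assumptions chosen for illustration.

```python
from collections import deque


class RoundRobinDNS:
    """Toy DNS server that rotates the order of its addresses on every query."""

    def __init__(self, records):
        # records: mapping of domain name -> list of IP addresses
        self._records = {name: deque(ips) for name, ips in records.items()}

    def query(self, name):
        ips = self._records[name]
        answer = list(ips)  # snapshot of the current order
        ips.rotate(-1)      # the next query starts with the following address
        return answer


# Hypothetical service hosted in two locations (placeholder addresses).
dns = RoundRobinDNS({"www.example.com": ["192.0.2.10", "198.51.100.20"]})
first = dns.query("www.example.com")
second = dns.query("www.example.com")
third = dns.query("www.example.com")
```

Every answer contains both addresses; only the order changes. A client that always connects to the first address in the answer therefore alternates between the two datacenters.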
Add in ZXTM GLB
ZXTM GLB builds on this standard configuration by manipulating the round-robin DNS responses:
2. ZXTM GLB forwards the DNS request to the existing DNS server.
3. The DNS server responds with all IP addresses in a round-robin fashion.
4. ZXTM GLB chooses one IP address and masks out the others from the response.
The key load-balancing decision that ZXTM GLB performs is to decide which IP address(es) should be returned to each remote user. This decision directly controls which datacenter each remote user uses.
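As an illustration of the masking step, the following Python sketch filters a round-robin answer down to a single address. It is not ZXTM GLB code: the function name mask_response, the set of healthy addresses, the optional choose callback and the pass-through fallback when nothing is healthy are assumptions made for the example.

```python
def mask_response(dns_answer, healthy, choose=None):
    """Return a DNS answer reduced to one address from a healthy datacenter.

    dns_answer: addresses returned by the existing DNS server, in order
    healthy:    set of addresses whose datacenter is currently usable
    choose:     optional function that picks one address from the candidates
    """
    candidates = [ip for ip in dns_answer if ip in healthy]
    if not candidates:
        # Assumption: with no healthy datacenter, pass the answer through unchanged.
        return list(dns_answer)
    pick = choose(candidates) if choose else candidates[0]
    return [pick]  # every other address is masked out


answer = ["192.0.2.10", "198.51.100.20"]  # placeholder addresses
only_second_up = mask_response(answer, healthy={"198.51.100.20"})
both_up = mask_response(answer, healthy=set(answer))
```

The choose callback is where a load-balancing policy would plug in; without it the sketch simply keeps the first healthy address in the order the DNS server supplied.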
Just one change needs to be made to the DNS information so that clients make DNS lookups through the GLB device rather than directly to the DNS servers. This change can be made by altering the NS record for the domain, or by adding a CNAME. Please refer to the ZXTM GLB documentation for more information.
DNS information is commonly cached (remembered) by intermediaries across the network. This caching behavior is advantageous because it reduces the amount of DNS traffic, but can impede the operation of a DNS-based Global Server Load Balancing device.
An important element in a DNS response is the TTL (time to live) value. This value informs any intermediaries as to how long the DNS response can be cached for. ZXTM GLB can rewrite TTL values in the DNS responses it has managed, overwriting a long default value with a much shorter one. The effect of the change (increased DNS traffic) can be easily observed using the real-time visualization tools in ZXTM GLB, so you can choose a suitable value that balances traffic rates with responsive failover.
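The TTL rewrite can be pictured with a small Python sketch. The ARecord type and the cap_ttl function are hypothetical names used only here, and the values (a one-day default lowered to 30 seconds) are example figures rather than recommendations.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ARecord:
    name: str
    address: str
    ttl: int  # seconds that intermediaries may cache this record


def cap_ttl(records, max_ttl):
    """Return copies of the records with any TTL above max_ttl lowered to max_ttl."""
    return [replace(r, ttl=min(r.ttl, max_ttl)) for r in records]


original = [ARecord("www.example.com", "192.0.2.10", ttl=86400)]
rewritten = cap_ttl(original, max_ttl=30)
```

With a 30 second TTL, a cache that honors the value asks again within 30 seconds, so a failover decision reaches clients quickly, at the price of many more DNS queries than a one-day TTL would generate.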
How does ZXTM GLB work in practice?
One or more ZXTM GLB devices are deployed in each datacenter. The ZXTM GLB devices monitor the performance and availability of their own datacenter, and broadcast that information to the other ZXTM GLB devices in the other datacenters.
This way, every ZXTM GLB device knows the availability and performance of every datacenter.
Active-Active load balancing configurations
Any ZXTM GLB device may receive a DNS request for a service running in the datacenters. When the datacenters are running in active-active mode, the ZXTM GLB device chooses which datacenter the user should be directed to. This decision is based on three criteria:
The decision can be tuned so that it is based purely on load, purely on geographic location, or on a mixture of the two:
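One way to picture this tuning is a single weight that blends a load score with a distance score. The Python sketch below is an assumption about how such a blend could work, not a description of the ZXTM GLB algorithm; pick_datacenter, geo_weight, the datacenter names and the numbers are all invented for the example.

```python
def pick_datacenter(datacenters, geo_weight):
    """Pick the datacenter with the lowest blended score.

    datacenters: mapping of name -> (load, distance_km); lower is better for both
    geo_weight:  0.0 = purely load-based, 1.0 = purely geographic
    """
    if not 0.0 <= geo_weight <= 1.0:
        raise ValueError("geo_weight must be between 0 and 1")
    # Normalize each metric by its maximum so the two are comparable.
    max_load = max(load for load, _ in datacenters.values()) or 1.0
    max_dist = max(dist for _, dist in datacenters.values()) or 1.0

    def score(item):
        load, dist = item[1]
        return geo_weight * dist / max_dist + (1.0 - geo_weight) * load / max_load

    return min(datacenters.items(), key=score)[0]


# Hypothetical view for one user: a nearby but busy site, and a distant idle one.
sites = {"us-east": (0.9, 300.0), "eu-west": (0.2, 5500.0)}
by_geo = pick_datacenter(sites, geo_weight=1.0)
by_load = pick_datacenter(sites, geo_weight=0.0)
```

Setting geo_weight to 1.0 sends the user to the nearest site regardless of load, 0.0 sends the user to the least loaded site regardless of distance, and values in between trade one against the other.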
However, you may not wish to use an active-active configuration if the applications you are balancing cannot be run in multiple datacenters simultaneously - for example, because they depend on a single database or SAN that cannot be continuously replicated over multiple sites. In this case, an active-passive configuration is more appropriate.
Additionally, one side-effect of an active-active load balancing mode is that an end user may spontaneously be redirected from one datacenter to another when his client software makes a fresh DNS request. For example, the datacenter he is accessing may become overloaded and the load-balancing algorithm may assign him to a different datacenter.
If this behavior is undesirable, you can overcome it by several methods. You can use the fully deterministic 'Geo' load-balancing method, or you can use Application-level redirection to detect a user's session and forcibly direct him to a particular datacenter when required. Please consult the 'Multi-site session persistence with ZXTM GLB and ZXTM' document for a full description of this technique.
Active-Passive load balancing configurations
When the datacenters are running in active-passive mode, the load balancing decision is much simpler. You first specify the order in which the datacenters should be used:
If the first datacenter fails, all users are directed to the second datacenter (Cambridge); you can build arbitrarily long chains of datacenters for multiple levels of failover.
If the first datacenter recovers, you can specify how the service should fail back. If automatic failback is enabled, users will immediately be directed to the first datacenter again. If it is disabled, users continue to use the second datacenter until the administrator manually indicates that the first datacenter is ready to receive traffic again.
The benefit of this configuration is that it gives a very deterministic, controllable disaster recovery solution, ideally suited for complex, stateful applications.
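The failover chain and the two failback modes can be sketched as a small state machine in Python. The class ActivePassive, its method names and the datacenter name "London" are assumptions for illustration; only the behavior (ordered chain, automatic or manual failback) comes from the description above.

```python
class ActivePassive:
    """Ordered failover chain with optional automatic failback."""

    def __init__(self, order, auto_failback=True):
        self.order = list(order)          # datacenters, most preferred first
        self.auto_failback = auto_failback
        self.current = self.order[0]

    def _preferred(self, healthy):
        # First datacenter in the chain that is currently available, if any.
        return next((dc for dc in self.order if dc in healthy), None)

    def select(self, healthy):
        """Return the datacenter users should be sent to, or None if all are down."""
        preferred = self._preferred(healthy)
        if preferred is None:
            return None
        # Move when failback is automatic, or when the current site has failed.
        if self.auto_failback or self.current not in healthy:
            self.current = preferred
        return self.current

    def fail_back(self, healthy):
        """Manual failback: the administrator declares the preferred site ready."""
        preferred = self._preferred(healthy)
        if preferred is not None:
            self.current = preferred


both = {"London", "Cambridge"}
glb = ActivePassive(["London", "Cambridge"], auto_failback=False)
before = glb.select(both)                  # primary in use
during = glb.select({"Cambridge"})         # primary failed
after_recovery = glb.select(both)          # primary back, no manual failback yet
glb.fail_back(both)
after_failback = glb.select(both)
```

With automatic failback disabled, users stay on the second datacenter after the first recovers, and return only once fail_back is called. Constructing the object with auto_failback=True makes select return the first datacenter as soon as it is healthy again.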
Availability and Performance Checking
ZXTM GLB checks the performance and correct operation of the services in the local datacenter using a range of application monitors. These monitors can run simple tests like network pings, or complex tests like HTTP GETs to verify that returned pages match particular criteria.
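A page-content monitor of the kind described above might look like the following Python sketch. It uses only the standard library; the function names, the choice of "status 200 plus an expected string" as the pass criterion and the 5 second timeout are assumptions, not ZXTM GLB settings.

```python
import urllib.request


def page_is_healthy(status, body, must_contain):
    """Pass only if the status is 200 and the page contains the expected text."""
    return status == 200 and must_contain in body


def http_monitor(url, must_contain, timeout=5.0):
    """Fetch url with an HTTP GET and apply the content check.

    Network-level failures are reported as an unhealthy result.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return page_is_healthy(response.status, body, must_contain)
    except OSError:
        # Covers connection errors, timeouts and HTTP error statuses
        # (urllib's URLError and HTTPError are subclasses of OSError).
        return False
```

A call such as http_monitor("http://www.example.com/", "Example Domain") would perform one check. Separating the criterion from the fetch keeps the pass/fail rule easy to test without network access.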
Performance data can optionally be deduced from the response times from selected monitors, or it can be supplied separately using a standards-compliant SOAP interface. This performance data is used to weight how much each datacenter is used when the Load or Adaptive load balancing algorithm is selected.
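The source does not state how response times are turned into weights. As one plausible illustration, the Python sketch below gives each datacenter a traffic share inversely proportional to its measured response time; the function name load_weights and the sample timings are assumptions.

```python
def load_weights(response_times):
    """Map name -> response time (seconds) to name -> share of traffic.

    Assumption: the share is proportional to 1 / response time, so a
    datacenter that answers three times faster receives three times the traffic.
    """
    inverse = {dc: 1.0 / t for dc, t in response_times.items() if t > 0}
    total = sum(inverse.values())
    return {dc: value / total for dc, value in inverse.items()}


# Hypothetical monitor timings: 100 ms and 300 ms.
weights = load_weights({"us-east": 0.1, "eu-west": 0.3})
```

With these timings the faster site receives roughly three quarters of the traffic and the slower site roughly one quarter.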
ZXTM GLB can also run an external connectivity monitor to verify that its datacenter has connectivity to an upstream location on the Internet.
ZXTM GLB broadcasts the health and performance data to the other ZXTM GLB devices in the other datacenters. It deduces that other datacenters are available if it hears the health and performance information from the ZXTM GLBs in those datacenters. For this reason, organizations typically operate a pair of ZXTM GLB devices in each datacenter, thus removing a possible single-point-of-failure within each datacenter.
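The idea of deducing availability from received reports can be sketched as a table of last-heard timestamps. In this Python example the class PeerTable, the 10 second timeout and the datacenter names are assumptions; timestamps are passed in explicitly so the behavior is easy to follow.

```python
class PeerTable:
    """Tracks when each remote datacenter was last heard from."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a peer counts as down
        self._last_seen = {}

    def record_report(self, datacenter, now):
        """Note that a health and performance report arrived at time `now`."""
        self._last_seen[datacenter] = now

    def available(self, now):
        """Datacenters whose most recent report is no older than the timeout."""
        return {
            dc for dc, seen in self._last_seen.items()
            if now - seen <= self.timeout
        }


peers = PeerTable(timeout=10.0)
peers.record_report("New York", now=0.0)
peers.record_report("Boston", now=0.0)
peers.record_report("Boston", now=12.0)   # New York has gone silent
up_at_5 = peers.available(now=5.0)
up_at_15 = peers.available(now=15.0)
```

This also shows why a pair of devices per datacenter is useful: if a lone device fails, its datacenter falls silent and is treated as unavailable even though the service itself may be healthy.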
ZXTM GLB is a complete DNS-based Global Server Load Balancing solution that provides:
ZXTM GLB is very easy to deploy, with minimal infrastructure changes and very little operational risk.
The rich real-time visualization and reporting in ZXTM GLB gives a clear picture of the effectiveness of the Global Server Load Balancing configuration and the activity of your users globally at any time.
For Further Information
To find out more about ZXTM Global Load Balancer or to arrange a demonstration or product evaluation, please visit http://www.zeus.com/products/zxtmglb/
The ZXTM KnowledgeHub is a key resource for developers and system administrators wishing to learn about ZXTM and Zeus' Traffic Management solutions. It is located at http://knowledgehub.zeus.com/
Nothing you read in The Business Forum Journal should ever be construed to be the opinion of, statements condoned by, or advice from, The Business Forum Institute, its staff, workers, officers, members, directors, sponsors or shareholders. We pass no opinion whatsoever on the content of what we publish, nor do we accept any responsibility for the claims, or any of the statements made, within anything published herein. We merely aim to provide an academic forum and an information sourcing vehicle for the benefit of the business and the academic communities of the Pacific States of America and the World. Therefore, readers must always determine for themselves where the statistics, comments, statements and advice that are published herein are gained from and act, or not act, upon such entirely and always at their own risk. We accept absolutely no liability whatsoever, nor take any responsibility for what anyone does, or does not do, based upon what is published herein, or information gained through the use of links to other web sites included herein.