The Business Forum

"It is impossible for ideas to compete in the marketplace if no forum for
  their presentation is provided or available."         Thomas Mann, 1896

Accelerating Web Applications with ZXTM

Contributed by: Zeus Technology, Inc.




Zeus Extensible Traffic Manager (ZXTM) is a software load balancer for networked and web-enabled applications. It improves the performance, reliability and security of these applications, and reduces operational costs across complex, multi-tiered and fragile infrastructures.

What performance problems are incurred?

Many common web application platforms suffer severe performance problems. Their workload gives them a range of tasks they are not optimized for; they scale poorly when handling large numbers of clients; they under-perform with connections over slow, high latency networks.

These problems are particularly common with thread- or process-based server applications, such as the Apache Web Server, and many Java-based application servers.

They are exacerbated further by virtualization software such as VMware and Xen, which adds extra networking layers.

What is the solution?

Various ‘point’ solutions address some aspects of the performance problems; for example, SSL or XML accelerators offload some of the processing tasks onto specialized hardware.

ZXTM provides a complete solution for all of the performance problems. This white paper describes these problems in detail, and explains how ZXTM is able to solve them.  For independent validation of the performance benefits described in this document, refer to the BroadBand Testing reports published at:

Reasons for Poor Performance

Poor performance arises because server applications perform tasks they are not optimized for, and because many server applications use a concurrency model that is easy to program but scales badly.

Reason 1: Out of the application’s comfort zone

A Java application server is designed for one key purpose - to manage and run application code in a Java environment. The Apache server is designed to be as portable as possible across many hardware and OS platforms, and to be feature-rich to support a great many third-party application modules.

Performance is a secondary concern in the design of these servers, and few development resources are devoted to the performance of compute-intensive ancillary tasks such as SSL decryption, response compression or XML operations.

In some cases, the application design makes it impossible to select the best possible implementation. Java applications are reliant on Java-based implementations of computing tasks; Apache uses lower-performance open source cryptographic libraries rather than the higher performance commercial ones that are available.

A dedicated acceleration device that performs these compute-intensive tasks can bring great performance benefits. For example, the Apache server processes SSL-encrypted traffic between 10 and 20 times more slowly than non-encrypted traffic, so offloading the SSL decryption onto a separate device can increase its capacity for SSL traffic by a factor of 10 to 20.

The return-on-investment is very clear when you consider commercial application servers that are priced on a per-CPU basis. In this case, it’s vital to offload all unnecessary processing from the application server, in order to gain the best possible return on the per-CPU licensing cost.

Reason 2: The Concurrency Model

Concurrency is the number of simultaneous connections that a server can handle. 

The Concurrency Model used in the design of a server application greatly influences how the application processes many simultaneous connections.

The limited Concurrency Model used by server applications like Apache and many Java Application Servers is the biggest cause of the performance problems they experience.

How much concurrency is needed?

A typical web client like Internet Explorer or Mozilla Firefox makes several simultaneous connections to a web server to download the content, and holds these connections open for 15 seconds (the ‘keepalive timeout’) afterwards in case they need to be reused.

Suppose your web server can handle 256 concurrent connections (a common limitation in Apache), each client uses 2 connections to download a web page, and each connection is open for slightly more than 15 seconds. Then you are immediately limited to about 8 new users per second on your server: 256 / (2 × 15) ≈ 8.5, and slightly less once the extra connection time is counted.
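The arithmetic behind this limit can be sketched in a few lines of Python. The function and parameter names are illustrative only, and the 15.5-second duration is an assumed stand-in for "slightly more than 15 seconds".

```python
def max_users_per_second(concurrency, connections_per_client, connection_seconds):
    """Upper bound on new users per second for a slot-limited server."""
    # Each user ties up this many "slot-seconds" of the server's concurrency.
    slot_seconds_per_user = connections_per_client * connection_seconds
    return concurrency / slot_seconds_per_user

# 256 slots, 2 connections per client, 15.5 s per connection (assumed value)
rate = max_users_per_second(256, 2, 15.5)
print(f"{rate:.1f} users per second")  # prints: 8.3 users per second
```

Halving the connection duration doubles the bound, which is why keepalive timeouts are the first setting administrators tend to reduce.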

Why is there a concurrency limit?

Web servers like Apache, and the majority of Java application servers, use a simple design to cope with concurrent connections: they use a separate process or thread to handle each connection.

This design lends itself to straightforward, reliable code because each connection is handled in its own, isolated environment. It is very suitable for application servers and servers like Apache which run third-party code because it has a much simpler programming model. The developer can write application code that performs blocking operations (such as a database transaction) without needing to concern himself with interactions between other connections that the server is managing.
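As a rough illustration of this thread-per-connection design (a Python sketch, not code from Apache or any application server), each connection below gets its own thread and is served with plain blocking calls. The handler and the upper-casing "work" are hypothetical stand-ins for real application code.

```python
import socket
import threading

def handle_connection(conn):
    """Serve one connection with ordinary blocking calls."""
    with conn:
        request = conn.recv(1024)        # blocks this thread only
        conn.sendall(request.upper())    # stand-in for real application work

def serve_thread_per_connection(connections):
    """Start one thread per connection, as the simple design does."""
    threads = [threading.Thread(target=handle_connection, args=(conn,))
               for conn in connections]
    for thread in threads:
        thread.start()
    return threads

# Usage: three in-process socket pairs stand in for three remote clients.
pairs = [socket.socketpair() for _ in range(3)]
threads = serve_thread_per_connection([server for _client, server in pairs])
replies = []
for number, (client, _server) in enumerate(pairs):
    client.sendall(b"request %d" % number)
    replies.append(client.recv(1024))
    client.close()
for thread in threads:
    thread.join()
```

The handler never has to think about other connections, which is the appeal of the model; the cost is one operating system thread per connection, however idle that connection is.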

However, the design is inefficient because it puts a significant load on the underlying operating system. Compared to a simple network connection, a process or thread is a very heavyweight operating system object that consumes many more resources than the connection itself.

The operating system experiences severe scalability problems when managing a large number of concurrent processes or threads. It spends a large proportion of its time housekeeping the hundreds of threads or processes (this proportion grows polynomially), and fewer and fewer CPU resources are left to run the web server or application server code, so the capacity of the system drops and the response time suffers.

A runaway web server can easily overwhelm a machine with so many threads or processes that it becomes completely unresponsive. For this reason, the Apache server is artificially limited to 256 concurrent connections.  Java-based application servers have similar limitations.

Note: The Apache Software Foundation has implemented an ‘event’ MPM in Apache 2.2 to cope with their ‘keep alive problem’; at the time of writing, this module is experimental, not available on Apache 2.0.x, and incompatible with other Apache features including SSL support.

What is the effect of the concurrency limit?

The concurrency limit imposes three restrictions on the performance of the web or application server:

1. Restricting HTTP Keepalives

HTTP Keepalives produce a much better end-user browsing experience because they make a web site much more responsive. However, concurrency-limited servers like Apache Web Server disable them by default because they can reserve the limited concurrency ‘slots’ for too long.

For maximum benefit, keepalives should persist for the period of time that a user views a web page, so that when the user clicks on a new link, the TCP connection to download the new content is already established. AJAX applications may need even longer keepalives in order to remain responsive.

However, the majority of high traffic Apache sites either disable Keepalives completely, or reduce the timeout to less than 5 seconds.

Recall the calculation used to determine the maximum number of users a server can handle:

maximum users per second = concurrency / (simultaneous connections per client × connection duration)

The concurrency for an Apache server is 256 (when using the worker MPM, it is 150); 2 is the number of simultaneous connections a typical client makes. The only control an administrator has is to reduce the connection duration by limiting keepalives.

2. Fewer Simultaneous Users

The concurrency limit puts a fixed upper limit on the number of users. Additional users are locked out; they won’t be able to access the service at all until an existing user’s connection completes.

The following graph shows the level of service that a new user experiences when he tries to access an Apache server that is already heavily used by up to 300 current users.

Once there are more users than the concurrency limit can handle, the level of service for new users becomes very poor. The existing users acquire the concurrency slots, and requests from new users do not get processed until an existing user relinquishes his slot.
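The lock-out effect can be reproduced with a small simulation. This is a simplified model under stated assumptions, not a measurement: every connection holds its slot for a fixed time, waiting requests are served first-come first-served, and the function name and figures are illustrative.

```python
import heapq

def wait_times(arrival_times, slots, hold_seconds):
    """Return how long each arriving connection waits for a free slot.

    arrival_times must be sorted in ascending order.
    """
    busy_until = []   # min-heap of the times at which occupied slots free up
    waits = []
    for arrival in arrival_times:
        # Release every slot whose holder finished before this arrival.
        while busy_until and busy_until[0] <= arrival:
            heapq.heappop(busy_until)
        if len(busy_until) < slots:
            start = arrival                       # a slot is free right now
        else:
            start = heapq.heappop(busy_until)     # wait for the earliest release
        waits.append(start - arrival)
        heapq.heappush(busy_until, start + hold_seconds)
    return waits

# 300 connections arrive together at a server with 256 slots held for 15 s each.
waits = wait_times([0.0] * 300, slots=256, hold_seconds=15.0)
locked_out = sum(1 for wait in waits if wait > 0)
```

In this model the first 256 connections are served immediately, and the remaining 44 each wait the full 15 seconds for a slot.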

3. Poor performance on a WAN

Slow, high-latency networks generate much longer-duration connections. These connections spend most of their time idle as they wait for more network data, but they still occupy the limited concurrency slots.

Increasing the concurrency by increasing the number of processes or threads only has a limited effect.
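A back-of-the-envelope model shows why slow links are so costly for a slot-limited server. All figures below are assumptions chosen for illustration (a 100 KB response, a 1 Mbit/s link with a 200 ms round trip, two round trips per request), and the model ignores server processing time and TCP slow start.

```python
def connection_seconds(response_bytes, bandwidth_bits_per_s, round_trip_s, round_trips=2):
    """Rough time a connection occupies a slot: latency plus transfer time."""
    transfer_s = response_bytes * 8 / bandwidth_bits_per_s
    return round_trips * round_trip_s + transfer_s

def requests_per_second(slots, seconds_per_connection):
    """Throughput ceiling when every request holds one slot for its duration."""
    return slots / seconds_per_connection

# Hypothetical links: a 1 Gbit/s LAN (0.5 ms RTT) and a 1 Mbit/s WAN (200 ms RTT).
lan_seconds = connection_seconds(100_000, 1_000_000_000, 0.0005)
wan_seconds = connection_seconds(100_000, 1_000_000, 0.2)
wan_ceiling = requests_per_second(256, wan_seconds)
```

Under these assumptions each WAN connection occupies a slot for about 1.2 seconds, so 256 slots cap the server at roughly 213 requests per second regardless of how fast its CPUs are.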

Solving the concurrency problems

Alternative server architectures can be used. Zeus Web Server and ZXTM use a higher-performance ‘select-based’ architecture which runs a single process for each processing unit (core or processor). Each process is capable of managing many thousands of connections simultaneously, switching between them using the OS ‘epoll’ system call. This model scales evenly with the concurrency of the host hardware.

This architecture is commonly described as ‘select-based’ because early implementations used the ‘select’ system call to inspect many connections and determine which can be processed without blocking. The ‘epoll’ system call is a more efficient and scalable alternative to ‘select’ when inspecting large numbers of connections.
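The general technique can be sketched with Python's standard selectors module, whose DefaultSelector uses epoll on Linux. This is an illustration of a select-based loop in general, not ZXTM's implementation; the function name and the upper-casing "handler" are hypothetical.

```python
import selectors
import socket

def serve_ready_connections(connections, expected_requests):
    """Handle many connections from one thread, touching only the ready ones."""
    selector = selectors.DefaultSelector()   # picks epoll where the OS offers it
    for conn in connections:
        conn.setblocking(False)
        selector.register(conn, selectors.EVENT_READ)
    handled = 0
    while handled < expected_requests and selector.get_map():
        # select() returns only the connections that can be read without blocking.
        for key, _events in selector.select(timeout=1.0):
            conn = key.fileobj
            data = conn.recv(1024)
            if data:
                conn.sendall(data.upper())   # stand-in for real request handling
                handled += 1
            else:                            # peer closed the connection
                selector.unregister(conn)
                conn.close()
    selector.close()
    return handled

# Usage: in-process socket pairs stand in for remote clients.
pairs = [socket.socketpair() for _ in range(3)]
for number, (client, _server) in enumerate(pairs):
    client.sendall(b"request %d" % number)
handled = serve_ready_connections([server for _client, server in pairs], 3)
replies = [client.recv(1024) for client, _server in pairs]
```

A single thread serves every connection here, so an idle connection costs only a registered socket rather than a whole thread or process.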

This architecture is appropriate for high-speed web servers and traffic managers, but is not appropriate for complex application servers that run third-party code because the programming model is much more complicated. For example, it is extremely difficult to write code that must perform blocking operations (such as a DNS lookup or a database transaction) within this architecture.

You can overcome the limitations of the concurrency model by using ZXTM to manage the many slow keepalive connections on behalf of the Apache server or Application Server.

How does ZXTM help?

ZXTM has a range of capabilities that improve the performance of servers and applications that it manages traffic to:

• SSL Decryption

• Content Compression

• XML processing (XSLT transformations, XPath queries)

• Content Caching

• TCP offload and buffering

• HTTP multiplexing

• Performance-sensitive load-balancing

Offloading Operations

SSL Transactions

ZXTM’s proven SSL stack is optimized for 64-bit x86 platforms like AMD Opteron and Intel Xeon. ZXTM running on a dual-processor dual-core Opteron 285 machine can decrypt and load-balance over 9000 SSL transactions per second.

In recent tests, it was demonstrated that:

• Using ZXTM to decrypt SSL traffic to a single Apache server provides up to 20 times the transaction rate and up to 20 times faster transactions, with no connection errors. The ZXTM 7000 was running at 30% utilization, so could comfortably accelerate three Apache servers at the same rate.

• Using ZXTM to decrypt SSL traffic for WebLogic provided over 15 times the SSL performance. The ZXTM 7000 was running at less than 20% utilization, so could simultaneously manage and decrypt traffic to 5 WebLogic servers if required.

Content Compression

ZXTM can perform on-the-fly content compression, offloading this compute intensive task from the back-end servers. Content compression reduces the bandwidth used by a service by up to 50%, and can improve the response time by a small amount over slow, high latency networks because it reduces the amount of data that has to be transferred.
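The idea of on-the-fly compression in a proxy can be sketched with Python's standard gzip module. This is an illustration only, not ZXTM's implementation: the function name is hypothetical, and the Accept-Encoding check is deliberately naive (it ignores quality values such as "gzip;q=0").

```python
import gzip

def compress_response(body, accept_encoding):
    """Compress a response body if the client advertises gzip support."""
    if "gzip" in accept_encoding.lower():
        compressed = gzip.compress(body)
        if len(compressed) < len(body):          # only send it if it is smaller
            return compressed, {"Content-Encoding": "gzip"}
    return body, {}                              # otherwise pass through unchanged

# Usage: repetitive HTML compresses well; a client without gzip gets the original.
page = b"<li>product description</li>\n" * 200
small, headers = compress_response(page, "gzip, deflate")
plain, no_headers = compress_response(page, "identity")
```

The saving depends heavily on the content: markup and text shrink substantially, while already-compressed images gain little.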

Visit  to access the BroadBand Testing Apache and BEA WebLogic performance white papers.

XML Operations

XML operations are extremely compute intensive and often perform poorly on Java-based servers. ZXTM can perform XML validation, XPath queries and XSLT transformations to request and response data on behalf of an application server.
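As a small illustration of an XPath query applied to request data, the Python sketch below checks that a request body is well-formed XML and then uses a path expression to pick a server pool. The element names, pool names and routing rule are all hypothetical, and the standard library's ElementTree supports only a subset of XPath; it does not perform schema validation or XSLT.

```python
import xml.etree.ElementTree as ET

def choose_pool(xml_body):
    """Route a request using a value extracted from its XML body."""
    try:
        root = ET.fromstring(xml_body)           # fails on malformed XML
    except ET.ParseError:
        return "reject"
    tier = root.findtext("./customer/tier")      # simple XPath-style query
    return "premium-pool" if tier == "gold" else "standard-pool"

# Usage with hypothetical order documents.
gold = choose_pool("<order><customer><tier>gold</tier></customer></order>")
basic = choose_pool("<order><customer><tier>silver</tier></customer></order>")
broken = choose_pool("<order><customer>")
```

Doing this work in front of the application server means malformed documents never consume an application thread.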

Published benchmarks demonstrate a ZXTM 7400 Appliance system performing XSLT transformations at over 1 Gbit/s.

Content Caching

The simplest way to accelerate an application is to cache as much as possible of its response data to minimize the number of requests that it must handle.

ZXTM’s content caching capability does precisely that, using a high performance memory based cache that is fully compliant with the recommendations of RFC 2616. The content caching decisions can be controlled by TrafficScript, and ZXTM allows the administrator to define unique cache keys to cache multiple versions of the same content - for example, different home pages depending on whether the user is logged in or not.
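The role of a custom cache key can be shown with a minimal in-memory cache. This Python sketch is not TrafficScript and omits the expiry and freshness rules of RFC 2616; the class, the key layout and the render_home function are hypothetical.

```python
class ResponseCache:
    """In-memory cache that stores one response per (URL, user state) key."""

    def __init__(self):
        self._entries = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def cache_key(url, logged_in):
        # The login state is part of the key, so each variant is cached separately.
        return (url, "member" if logged_in else "anonymous")

    def get(self, url, logged_in, fetch_from_backend):
        key = self.cache_key(url, logged_in)
        if key in self._entries:
            self.hits += 1
        else:
            self.misses += 1
            self._entries[key] = fetch_from_backend(url, logged_in)
        return self._entries[key]

backend_calls = []

def render_home(url, logged_in):
    """Stand-in for the application server generating a page."""
    backend_calls.append((url, logged_in))
    return "Welcome back" if logged_in else "Please log in"

cache = ResponseCache()
pages = [cache.get("/", logged_in, render_home)
         for logged_in in (False, True, False, True, False)]
```

Five requests reach the cache but only two reach the back end, one for each version of the home page.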

ZXTM allows the administrator to create very cost-effective content caches. Its software architecture means that it can be run on hardware that is appropriately sized for the cache required, and its 64-bit memory addressing means that cache sizes greater than 2 GB can be easily achieved.

Managing client-side connections

ZXTM functions as a full proxy, managing large numbers of slow, unreliable connections on behalf of the back-end applications.

In the case of HTTP:

1. ZXTM performs the slow TCP connection ‘accept’ and reads the entire request before connecting to a back-end server (for large requests such as HTTP POSTs, ZXTM reads the request headers and then streams the POST body to the back-end server, which prevents memory starvation on the ZXTM system);

2. ZXTM connects to a back end server and writes the client request over the fast local network, reusing an existing server-side Keepalive connection if possible;

3. ZXTM reads the entire response from the back-end server rapidly over the network;

4. ZXTM then either closes the server-side connection (if the server requested it), or holds the keepalive connection open for reuse;

5. ZXTM writes the response back to the remote client over the slow remote network.

The server application operates just as if it were talking to a small number of clients on a fast, local network.
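The reuse of server-side keepalive connections in steps 2 and 4 can be sketched as a small connection pool. This Python model is an illustration, not ZXTM's implementation: FakeBackendConnection is a hypothetical stand-in for a real network connection, and requests are forwarded one at a time.

```python
class FakeBackendConnection:
    """Hypothetical stand-in for a keepalive connection to a back-end server."""

    def __init__(self):
        self.requests_served = 0
        self.closed = False

    def round_trip(self, request):
        self.requests_served += 1
        return b"response to " + request

    def close(self):
        self.closed = True


class BackendPool:
    """Keeps idle server-side connections so later requests can reuse them."""

    def __init__(self, connect):
        self._connect = connect
        self._idle = []
        self.opened = 0

    def acquire(self):
        if self._idle:
            return self._idle.pop()      # reuse an existing keepalive connection
        self.opened += 1
        return self._connect()           # otherwise open a new one

    def release(self, conn, server_wants_close=False):
        if server_wants_close:
            conn.close()
        else:
            self._idle.append(conn)      # hold it open for the next request


def forward(pool, request):
    """Send one fully buffered client request over a pooled connection."""
    conn = pool.acquire()
    response = conn.round_trip(request)
    pool.release(conn)
    return response

pool = BackendPool(FakeBackendConnection)
responses = [forward(pool, b"GET /page%d" % n) for n in range(100)]
```

One hundred client requests are carried over a single back-end connection here, so the server's concurrency slots are occupied only while it is actually doing work.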

ZXTM manages client-side connections completely separately from server-side ones. It fully supports all HTTP performance optimizations: keepalive, HTTP pipelining, compression and chunked transfer encoding.


ZXTM’s Application Acceleration results in faster, more responsive and more reliable web sites, with significantly better return-on-investment on application hardware and software. The benchmarks that illustrate these results are fully documented in the BroadBand Testing white papers that are available from


The Business Forum
Beverly Hills, California U.S.A.
