By John Bley
November 6, 2003 12:00 AM EST | Reads: 19,361
So you've been told to diagnose a performance problem in a WebLogic J2EE application. Because Java systems are so complex, this can be a bit like diagnosing a rare illness.
To pinpoint the problem accurately, you need to understand the symptoms thoroughly, be prepared to do a fair amount of investigative work, and then determine the proper remedy. This article discusses some of the most common types of J2EE application performance issues and their causes, followed by suggested guidelines for properly diagnosing and eliminating them.
The Symptoms
What are the symptoms of a WebLogic application performance problem? The symptoms you see guide your search through all possible illnesses. Get a notebook and start asking people for data. Try to separate speculation and theory about the root cause of the problem from hard evidence about the system's behavior. Here's a list of common symptom sets:
Why Is Problem Diagnosis So Complicated?
There is no set formula that you can apply to derive the performance for a particular usage pattern of a WebLogic application (in hard realtime engineering, techniques like rate monotonic analysis can do exactly this, but let's ignore that for the purposes of this article). The resulting performance is sensitive to whether or not another system somewhere else on the network is hitting a shared back-end service hard. Perhaps it's also dependent on matching the exact versions of the JDBC driver and the database. Maybe a developer wrote some code three years ago that happened to swallow a particular type of exception and the feedback you desperately need to solve the problem is contained in that exception.
In essence, the performance of a typical business system is an emergent property resulting from thousands of interacting variables and decisions. As with a human body, there are too many interlocking parts and processes to comprehend the system in its totality. So we simplify, and look for overarching patterns.
The Diseases
What are the possible root causes for the symptoms you're seeing? Is it your basic flu or the beginnings of pneumonia? Is the underlying problem internal to the application or is it external to its JVM? See Table 1 for some of the most common causes of poor application performance.

Measuring Vital Statistics
As the person charged with diagnosing the problem, you should be able to keep track of vital statistics about the health of your WebLogic application. What can you measure? What tools are available to help?
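As one illustration, a JVM that provides the standard `java.lang.management` API (Java 5 and later, which postdates this article) can report a few basic vital signs from inside the process. The class and field names below are hypothetical; this is a minimal sketch under that assumption, not a substitute for a dedicated monitoring tool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: one snapshot of basic JVM "vital signs".
final class VitalSigns {
    final long heapUsedBytes;
    final long heapCommittedBytes;
    final int liveThreads;
    final double systemLoadAverage; // negative when the platform cannot report it

    private VitalSigns(long used, long committed, int threads, double load) {
        this.heapUsedBytes = used;
        this.heapCommittedBytes = committed;
        this.liveThreads = threads;
        this.systemLoadAverage = load;
    }

    // Reads the current values from the JVM's standard management beans.
    static VitalSigns sample() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        return new VitalSigns(heap.getUsed(), heap.getCommitted(),
                threads.getThreadCount(), load);
    }

    @Override
    public String toString() {
        return String.format("heapUsed=%d heapCommitted=%d threads=%d load=%.2f",
                heapUsedBytes, heapCommittedBytes, liveThreads, systemLoadAverage);
    }
}
```

Logging `VitalSigns.sample()` at a fixed interval during a benchmark run gives a time series that can be lined up against the response times you record.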
Lab Work
Sometimes the data obtained during one benchmark run will not reveal the answer. And chances are that you will have a limited budget for running experiments and doing lab work to complete your diagnosis. What kinds of experiments can you run? What variables can you change or watch?
- If a particular user's logon seems to trigger a problem, it might be that user's account profile (e.g., loading the full purchase history of 2,000 orders), or it might be the way he uses the system (e.g., the order of page accesses or the exact query string he uses to search for a particular document).
- If you've got a clustered system, try compartmentalizing by individual machines. Despite best efforts, sometimes boxes don't have the latest app server or OS patches, which can contribute to different performance characteristics. Also, pay attention to the load balancer or nanny process to see if it's distributing work fairly and keeping up with inbound requests.
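To make the per-machine comparison concrete, response-time samples can be grouped by the host that served them. The sketch below assumes you have already collected (host, elapsed time) pairs from access logs or a load-testing tool; the `HostComparison` and `Sample` names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: compares mean response time across cluster members.
final class HostComparison {
    // One observed request: which machine served it and how long it took.
    static final class Sample {
        final String host;
        final long millis;

        Sample(String host, long millis) {
            this.host = host;
            this.millis = millis;
        }
    }

    // Returns the mean response time in milliseconds per host, sorted by host name.
    static Map<String, Double> meanMillisByHost(List<Sample> samples) {
        Map<String, long[]> totals = new TreeMap<>(); // host -> {sum, count}
        for (Sample s : samples) {
            long[] t = totals.computeIfAbsent(s.host, h -> new long[2]);
            t[0] += s.millis;
            t[1]++;
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, long[]> e : totals.entrySet()) {
            means.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }
}
```

A host whose mean stands well apart from its peers is a candidate for a patch-level or configuration check, and comparing the sample counts per host gives a rough view of how evenly the load balancer is distributing work.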
Diagnosis: Testing Your Theories
At this point, you should have enough information to form theories about the cause of the performance bottleneck (see Table 1). To confirm that your theory is correct or to differentiate between multiple competing theories, you'll need to analyze more information or run additional benchmarks on the system. Here are a few guidelines to help you out.
Example Diagnosis
Let's step through an example. Your WebLogic application exhibits the symptom of increasing slowness under load. The more users you pile on, the slower things get. Once the load is removed, the system cools down with no side effects. You measure this primary symptom (the time measurement is for end-to-end completion of a single typical transaction) and get the results shown in Table 2.

You form a few theories. Perhaps the disease here is a badly coded component or perhaps it's a bottleneck on a back-end system. It could be a synchronization chokepoint. How can you tell the difference? Suppose you also measured aggregate CPU usage of the application server during the load runs and got the results shown in Table 3.

It looks like the system isn't CPU bound, which means that it's spending most of its time waiting. But is it internal (a synchronization traffic jam, for instance) or external (slow database)? Well, let's suppose we gather a few more numbers and come up with Table 4.
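One way such numbers might be gathered is to time each phase of the transaction separately, so that time spent waiting for a pooled connection is not lumped together with time spent in the query. The sketch below is an assumption about how you could instrument the code by hand, not a description of any particular tool; `PhaseTimer`, `timeQuery`, and the phase labels are hypothetical names.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import javax.sql.DataSource;

// Hypothetical helper: accumulates elapsed time per named phase.
// Not thread-safe; use one instance per measured transaction.
final class PhaseTimer {
    private final Map<String, Long> nanosByPhase = new LinkedHashMap<>();

    // Runs the work and adds its elapsed time to the named phase, even if it throws.
    <T> T time(String phase, Callable<T> work) throws Exception {
        long start = System.nanoTime();
        try {
            return work.call();
        } finally {
            nanosByPhase.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    // Total time recorded for a phase, in milliseconds (0 if never recorded).
    long millis(String phase) {
        return nanosByPhase.getOrDefault(phase, 0L) / 1_000_000;
    }

    // Illustrative use: separates connection wait, query execution, and row fetch.
    static PhaseTimer timeQuery(DataSource ds, String sql) throws Exception {
        PhaseTimer timer = new PhaseTimer();
        try (Connection con = timer.time("connection wait", () -> ds.getConnection());
             PreparedStatement ps = con.prepareStatement(sql);
             ResultSet rs = timer.time("query", () -> ps.executeQuery())) {
            timer.time("fetch", () -> {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            });
        }
        return timer;
    }
}
```

If "connection wait" dominates, the bottleneck is more likely inside the application server (for example, an undersized connection pool); if "query" dominates, attention shifts to the driver and the database.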

It doesn't appear to be an internal bottleneck waiting for database connections; instead, it appears to be the JDBC query itself. Not only does the JDBC query time vary with the overall transaction time, but it also accounts for the bulk of the overall poor performance. We're still not quite done, though. You still have three major theories to sort through: Is it the database itself that's slow, is the application making unreasonable demands on the database, or does the problem lie in some layer between the application and the database? You pull up the database's vendor-specific tool to see response times from its point of view. You'd hope to see numbers such as those in Table 5.

If the database did not report correspondingly slow response times, then you might dive back into the JDBC driver and hope to find some sort of synchronization problem inside it (remember, the CPU isn't overwhelmed). Fortunately, in this case you've narrowed the specific problem down to a database query. Figuring out whether the query's demands are reasonable requires some domain knowledge and familiarity with the system, but perhaps in this case it simply turns out that the query compares an unindexed field against a foreign key. You work with the DBA, who changes the indexing scheme to make this query faster, and you've found your cure.
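Had the investigation instead pointed toward a synchronization problem, one hedged way to look for it from inside the JVM is to count threads by state using the standard `java.lang.management` API (Java 5 and later). The `ThreadStateCensus` class is a hypothetical name, and the result is only a coarse signal: threads waiting on socket I/O typically report as RUNNABLE, so a full thread dump is still needed to see where each thread is stuck.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper: counts live JVM threads by their current state.
final class ThreadStateCensus {
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread exited between the two calls
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }
}
```

A large and growing BLOCKED count under load, sampled repeatedly, would be consistent with threads queuing on a monitor; a low count makes that theory less likely.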
Conclusion
Diagnosing a performance bottleneck in a WebLogic J2EE application can be a difficult journey. Keep your wits about you, separate fact from speculation, and always confirm your hypotheses with hard evidence. Hopefully I've given you a taxonomy of useful ideas to think about and experiment with. Like debugging, this is still a black art, but careful thinking will see you through. Good luck!
Copyright © 2003 SYS-CON Media, Inc. — All Rights Reserved.
John Bley is a software engineer for Wily Technology. He has extensive experience with Java programming and architecture. For this article, he has drawn on the experiences of Wily's enterprise customers, who are responsible for managing complex J2EE environments.


