|By Krishna Markande, Sridhar Murthy||
|March 15, 2012 09:00 AM EDT||
Harness the power of Cloud to gain cost and competitive advantages by leveraging cloud infrastructure for simulating high volume of user load and complex processing for near real time user simulation.
It has become the norm for current enterprises to rollout their systems to cater global market or users. While building these high scalable and high performing systems is a complex exercise, it is equally challenging exercise to test these systems effectively by simulating near real-time user load and volume. This simulation has to address various aspects such as geographic distribution, network bandwidth, volume of transactions, and combination of different personas and use case combinations.
This article provides an approach and reference framework to build near real-time high volume load simulation for testing the large scaled high performance based services or solutions. This suggested high load simulation framework can leverage any public cloud infrastructure of choice and any robust load testing tools of choice of an enterprise to meet specific aspects of target test application or services. This solution approach can be leveraged in performance testing and capacity planning of any on-premise services or cloud deployed services by cutting down the testing cost and the actual beta test duration to achieve more accurate capacity planning and better performance.
Testing an enterprise application is an important part of software development life cycle and so is the case with application deployment environment. Application has to be tested for correctness, completeness, security and quality. It also has to be tested for the performance of the system under varied load, check when it fails so that the application can be constantly improvised. With respect to Software engineering, performance testing is determining the system behaviour in terms of responsiveness, stability, scalability, reliability and resource usage under a particular workload .
While developing and testing these complex systems itself poses enormous challenges, it is important to be able to economize on overall capital costs, operational costs and time to market, so as to beat the competition. Apart from the cost, key criteria of the services such as usage of network, compute and storage resources, and security should also be considered with equal priority. Cloud computing with its infinite computational, storage resources, elasticity and data centers spread across globe provides an ideal infrastructure for enterprise applications which needs additional resource to meet the demand on service hosting side as well as just in time need of high amount of infrastructure for testing side.
Performance failures result in damaged customer relations, poor productivity for users, lost revenue, cost overruns due to rework and redesign/tuning and missed market windows . For example, Aberdeen Research conducted an poll and reported that nearly 60% of the organizations said that they were not happy with the performance of business critical applications and also the responders said that the application performance issues are causing corporate revenue by 9% .To address the following, it is important to ensure that systems being rolled out in the market are ready to handle real- time loads and scale to proportions it is expected to. This scale cannot ideally and accurately be simulated in existing known methodologies by using one's own data canter, lab or rented data centers/labs. By leveraging the capabilities of cloud service provider and in concert with the cloud based testing framework, the above applications can be quickly and efficiently tested by deploying the load injection test servers in the cloud, which can simulate more closely the near-production scenarios.
Apart from quality of testing and meeting performance requirements, enterprises are looking for optimizing testing cost . Infrastructure cost is one of the major components in testing costs and this can be reduced and optimized well, by leveraging cloud infrastructure instead of legacy data center or legacy test labs approach. As pointed out by Jeffrey Rayport and Andrew Heyward, cloud computing has the potential to produce "an explosion in creativity, diversity, and democratization predicated on creating ubiquitous access to high-powered computing resources."  
Subsequent section describes each of the elements of ‘cloud-based load simulation framework', including the function and role of the framework. Additionally, architectural details on the implementation of the framework and its interaction with core web services components will be described. Finally, details are provided on what are required for the users new to cloud-based load simulation to gain real benefits of this solution.
Current Solution and its Limitations
A high load simulation testing demands a high amount of infrastructure for simulating the load. Also this infrastructure is needed for a prolong duration throughout performance testing phase and also for each release phases.
Currently these infrastructures are provisioned within the own data centers/Lab or from the partners data centers/Labs. Also considerable effort and time are spent in procurement, setup and configuration. Testers use suitable open-source or commercial load testing tools to simulate the load.
In addition there are a few options to partner with some of the testing service providers who help high volume load testing (Load Testing Services in SaaS model). But this option is not cost effective (compared to the Cloud Providers' base infrastructure cost) and not flexible for the customization needed (as they primarily target only Web Applications) for enterprise needs.
- Data centers that currently host the dedicated servers or shared hosting to test the application doesn't have a governing mechanism that can govern the infrastructural resources used or service opted for. Most of the time complete infrastructural resources requested by any enterprise customer for deployment of their test applications are allocated. And the service provider will charge for complete resources allocated irrespective of whether resources have been fully utilized or not. 
- Testing of the application is a time bound activity. Once started, a series of tests are run and the test reports are generated to understand the health of the application and performance of the application. In addition, the infrastructure needs for the performance testing are fluctuating and the current solutions lack the elasticity and the flexibility to setup the infrastructure quickly and optimally as per the usage needs.. This makes the current options limited to simulate actual high load which can identify actual product issues in advance.
- Mostly the load testing servers are in the same location as the test application. This won't simulate the real time test scenario where the user requests have to be initiated from different geographical location/data centers and the user behaviour has to be analysed from those different physical locations, by capturing application's various user side performance metrics.
Existing load testing tools available in the market are complex and are not flexible to adopt cloud infrastructure easily.. An integrated real time testing of application in cloud as a unified functionality is missing with many service providers as most commercial products are focused for large enterprises with limited functionalities. Overall integrated cloud based real time testing solution and its automated management framework is needed with high level of granularity, customization, near real-time testing and flexibility to support complex scenarios .
Solution Approach - Platform for Real time testing
Cloud-based load simulation platform showcases simulation of near real time loads in a manner that is economical, quick to start, deploy and satisfies the diverse needs of load, stress testing of small, medium or very large user base services/applications. This platform can be leveraged irrespective of the hosting model of the ‘Application Under Test (AUT)' (i.e AUT hosted in on-premise or hosted in the Cloud. Following are the high level features of the proposed cloud based load simulation platform.
- Real time testing of application using the cloud infrastructure and configurable to meet the needs of specific application
- Setting up and configuring the load test infrastructure in multiple Geographic's data centers (public or private clouds) and testing the various performance metrics (eg. response time, failure rate etc) simulating closer to real-time production scenario.
- Unified management platform to provision, configure, execute, monitor and operate multiple cloud service provider
- Simulating the load generation for near real-time production scenario across various platforms, technologies, geographic locations, Network bandwidth and Internet backbones
- Simulating high user volume with configurable usage pattern and use case combinations
- Optimizing load test infrastructure cost by leveraging usage/on-demand based cloud infrastructure
- Built in components and reports for analyzing and monitoring performance parameters on server side and client side.
Note: Diagram depicts a sample set-up. Actual infrastructure can be much more granular as per chosen number of cloud providers.
Core Components of the Platform
The implementation of a cloud-based testing solution involves four Components: a compute service, a Queuing service, a storage system and a comprehensive framework to interconnect with these components and ensure proper message flow. To demonstrate this platform, the capabilities of Cloud service provider's services are used, any public, private or hybrid cloud can be easily plugged with this platform. This platform uses queuing service (for eg: SQS of Amazon, Azure Queues in Azure etc), Storage service (for eg: S3 Services in Amazon and Azure Blobs in Azure etc) and Compute service (for eg: EC2 services in Amazon and Azure Compute in Azure etc) for providing the required components for successful implementation of cloud-based testing solution.
Brief overview on each of the components used by the platform is as below, and details on the interrelations of these components and with the cloud service providers to implement cloud-based testing is detailed in the next section.
Compute Service: The Elastic Compute Cloud service provides compute capacity in the cloud, enabling users more flexibility in easily processing computationally intensive applications. Elasticity of this service provides great benefit in implementing scalable test servers, which expand and contract based on dynamic traffic patterns.
Platform: The platform coordinates the test job and test results as they orchestrated through the compute, storage, and messaging processes. Platform interlinks all the core components and provides a mechanism to implement elasticity of the cloud depending on the application needs. The input queue of the platform is continually monitored, and additional test instances are launched to handle the increased load. When the number of test job in the input queue decreases, these test instances are terminated, taking full advantage of "utility computing" pay only for the resources used.
Queuing Service: Queue Service offers a reliable, scalable and easy retrieval of test job as they are passed from one test instance to another for processing of the job. There can be specific limit on message size and storage duration depending upon the cloud providers. Messages are queued and dequeued via simple API calls
Storage Service: Storage Service provides a storage mechanism for the test-server template, image and application configuration data. Storage files are limited to few GBs in size, but there is no upper limit on the total volume of data that can be stored in storage repository. But depending on the cloud service provider there is a practical limit, but can be thought as a limitless storage bucket. 
Anatomy of the Testing an Application from Cloud
This section describes end-to-end flow of a testing an application through the platform, detailing the management and orchestration components. Cloud based testing platform includes managing the cloud infrastructure components, the test manager processes, and the test jobs. The target test Application (Customer application) and its components along with the test scripts are shown in while in the figure below. Application can be hosted in the Customer enterprise data centers or provisioned in the cloud.
Prior to creating a test environment, the two queues indicated in Figure 4 (input and output) need to be created. This is performed by running the test macro within the Cloud Management dashboard to create the queues automatically. The platform also provides predefined macro to assist in various systems configuration tasks.
The test script details are submitted to the input job queue by the test admin, it contains the input files needed to test the Target Test Application. The test manager execute the script as a job via a test server node(s) dedicated per application, and finally it also performs any post-processing operations required specific for the given test job, the results ( Error / Reports) are submits to the output queue for the test teams consumption. Steps below depicting the chain of events performed by the platform
- Test job message is sent to the cloud test platform with the test script and application details. All the required test server(s), script details are uploaded to the appropriate location in the cloud repository
- After message is retrieved from the queue, the test manager checks the validity of the message
- The test manager creates the necessary test server environment and passes the required information to the setup manger.
- Set up manager processes the message and moves the required files from repository to an input directory on the local file system
- Setup Manager creates the compute Instance of the Test Node(s) for the test environment with the application configuration details
- The Test server runs the test script on the application running inside the enterprise / cloud and places the result in an output directory on the file system or error files.
- Test manager moves the test results to the appropriate location in the repository for the test admin to download them.
- On completion of the job successfully, a test result is sent to the Output Queue along with status
- Test admin views the status and the test result
Figure 4: Overall cloud based load simulation platform architecture in detail
Deploying the test servers in the cloud by the platform
Additional details of the testing process and the flow of event between the Enterprise, Cloud platform and Cloud Fabric is detailed below.
This flow of events is repeated for every new test message that has to be tested by the platform. The test script can be of a Test scripts created for any load testing tools such as JMeter, Grinder or Load runner etc. The platform can run single or multiple scripts parallel to test the application. In addition the platform provides a mechanism to run a test scripts at a pre- schedule time interval as a batch job.
The platform contains a scheduler whose function is to monitor the input queue, and to launch test manager instance to process the job from the queue. Different scaling metrics are used to determine the number of test manager instances to be launch. If the number of test jobs in the queue is higher additional test instances are launched to handle the test jobs. Within the Cloud dashboard, the admin can specify a new test manager instance should be launched for every N jobs or it can be determined based on the complexity of particular test and application scenario.
Once the scheduler has launched the required number of test manager instances, a call is made within the platform to initiate the allocation of server resources for testing. Setup manager will setup the required test servers in the compute environment. It will execute the test server installation scripts as defined by the test environment. Prior to launching a test server, a server template for the setup manager is created indicating the test server details, such as the characteristics of the test server instance, the test image to use with the base operating system, the region to deploy and number of test instance per availability zone to be launched, along with application configuration information.
For every test application the test server template can be configured manually or by selecting a pre-built template macro. Another key aspect of the server template is that, it can perform the installation of test server and any additional components needed to run the test. Any test specific configuration code can be downloaded from a secure file share repository, and installed on the instance as specified in the test installation script at the end of the instance's boot cycle.
Once the test server(s) are ready, the configuration scripts are run to build the required test environment. If the application under test (AUT) is in the cloud, these instances are created similar to the test server instance. If the AUT is in the remote/enterprise data centers then the test server node(s) instance is configured with the application details. Test scripts are placed in appropriate location so that node can identify the scripts and run them. Test manager triggers the scripts on the test node(s) to perform the required test operations. Each test node is responsible for managing and executing a set of test scripts on the application and uploading the test results or error details to repository.
The test server environment is deployment in the cloud service provider Infrastructure environment as a service. While deploying the platform specify the number of servers needed with appropriate application configuration information (Start-up, test-script, post-script, clean-up operation etc.,) and connection details of the application under test. In this model IT administrator have the greatest degree of control and a familiar operating network topology. The platform handles the elasticity by ensuring that the number of test servers and network elements are adequate provisioned, configured and connected in the specified network topology. On-demand resource addition and removal are also provided by the platform. Complete control is with the IT administrator for security, application usage and management.
Figure 5: Flow of event for deploying the test server(s) on the cloud infrastructure by the platform
End users, Enterprise IT administrator, test admin, Project team members and the client team interact with the platform through browser based unified dashboards.
Testing discrete applications using the Cloud platform helps not only to host and run the test infrastructure and test applications but also to use test-bed which can help projects/products reduce the overheads in setting up world class test facility. Cloud infrastructure along with Cloud test platform benefit in terms of reduced time, effort and cost in setting up the various test environments. Automated provisioning of test infrastructure on cloud enables to achieve instantaneous high scale as needed for the target test application helping in more accurate capacity planning and improved user experiences.
A Cloud Based Load Simulation platform offers to the enterprises a full service catalogue to test a range of real-time production scenarios. This platform with provisioning and de-provisioning on-demand cloud infrastructure shrinks test cycles from months to weeks by eliminating the procurement time and infrastructure setup time drastically. Configurable tools using macros, simplifies the testing process, early stage analysis of weak links and ensures business continuity. This Cloud platform reduces the complexity of using cloud infrastructure for a developer/tester by providing those as part of the feature of platform itself.
- Connie U Smith and Lloyd G Williams, Software Performance Engineering, http://www.springerlink.com/content/g311888355nh7120
- AN SPE APPROACH; By:CONNIE U. SMITH, AND LLOYD G. WILLIAMS http://www.perfeng.com/papers/pdcp.pdf
- Aberdeen Research on performance of business critical applications http://www.aberdeen.com/Aberdeen-Library/5807/RA-application-performance-management.aspx
- Ryan Roop, Deliver cloud network control to the user, http://www.ibm.com/developerworks/cloud/library/cl-cloudvirtualnetwork/
- Qiyang Chen and Rubin Xin, Montclair State University, Montclair, NJ, USA, Optimizing Enterprise IT Infrastructure through Virtual Server Consolidation, http://informingscience.org/proceedings/InSITE2005/P07f19Chen.pdf
- LJUBOMIR LAZICa, NIKOS MASTORAKISba Technical Faculty, University of Novi Pazar, Vuka Karadžića bb, 36300 Novi Pazar, SERBIA http://www.jameslewiscoleman.info/jlc_stuff/project_research/CostEffectiveSoftwa reTestMetrics_2008.pdf
- Darrell M. West, Saving Money Through Cloud Computing,
- Filippos I. Vokolos, Elaine J. Weyuker, AT&T Labs, Performance Testing of Software Systems, http://dl.acm.org/citation.cfm?id=287337
- Scott Tilley Florida Institute of Technology, 3rd International Workshop Software Testing in the Cloud (STITC 2011)
- Sidharth Subhash Ghag, Divya Sharma, Trupti Sarang, Infosys Limited, Software alidation of application deployed on Windows Azure, http://www.infosys.com/cloud/resource-center/Documents/software-validation-applications.pdf
- Shyam Kumar Doddavula, Raghuvan Subramanian, Brijesh Deb, Infosys Limited, Cloud Computing, What beyond operational Efficency?, http://www.infosys.com/cloud/resource-center/Documents/beyond-operational-efficiency.pdf
- Sumit Bose, Anjaneyulu Pasala, Dheepak RA, Sridhar Murthy, Ganesan Malaiyandisamy, Infosys Limited, SLA Management in Cloud Computing, Cloud Computing Principles and Paradigms 2011, pp. 413-436
- S. Bose and S. Sudarrajan, Optimizing migration of virtual machines across datacenters, in Proceeding of the 38th International Conference on Parallel Processing (ICPP) Workshops, Vienna, Austria, September 22-25 2009, pp. 306-313
- B. Van Halle, Business Rules Applied: Building Better Systems Using Business Rules Approach, John Wiley & Sons, Hoboken, NJ, 2002.
- Open Virtualization Format Specification, DMTF standard version 1.0.0, Doc. no.DSP0243, February 2009, http://www.dmtf.org/standards/published_documents/DSP0243_1.0.0.pdf, accessed on April 16, 2010.
- D. Mensce and V. Almeida, Capacity Planning for Web Performance: Metrics, Models and Methods, Prentice-Hall, Englewood Cliffs, NJ, 1998.
- E. de Souza E. Silva, and M. Gerla, Load balancing in distributed systems with multiple classes and site constraints, in Proceedings of the 10th International REFERENCES 435 Symposium on Computer Performance Modeling, Measurement and Evaluation, Paris, France, December 19-21 1984, pp. 17-33.
- J. Carlstrom and R. Rom, Application-aware admission control and scheduling in web servers, in Proceedings of the 21st IEEE Infocom, New York, June 23-27 2002, pp. 824-831.
- S. Bose, N. Tiwari, A. Pasala, and S. Padmanabhuni, SLA Aware "on-boarding" of applications on the cloud, Infosys Lab briefings, 7(7):27-32, 2009.
- Amazon, Amazon Elastic Compute Cloud. http://aws.amazon.com/ec2
- Microsoft, Azure, http://www.windowsazure.com/en-us/