Statistical Principles for the Performance Tester

Over the years I have found that members of software development teams (developers, testers, administrators, and managers alike) often have an insufficient grasp of how to apply mathematics or interpret statistical data on the job.
As performance testers, we must know and be able to apply certain mathematical and statistical concepts.
Exemplar Data Sets
This section refers to three exemplar data sets for the purposes of illustration.
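Data Sets Summary
The following is a summary of Data Sets A, B, and C, computed from the data listed below (standard deviations are population values, rounded to two decimals).
Data Set      Sample Size   95th %   Std Dev.
Data Set A    100           6        1.45
Data Set B    100           16       6.00
Data Set C    100           8        2.54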
Data Set A
100 total data points, distributed as follows:
●      5 data points have a value of 1.
●      10 data points have a value of 2.
●      20 data points have a value of 3.
●      30 data points have a value of 4.
●      20 data points have a value of 5.
●      10 data points have a value of 6.
●      5 data points have a value of 7.
Data Set B

100 total data points, distributed as follows:

●      80 data points have a value of 1.
●      20 data points have a value of 16.
Data Set C
  100 total data points, distributed as follows:
●      11 data points have a value of 0.
●      10 data points have a value of 1.
●      11 data points have a value of 2.
●      13 data points have a value of 3.
●      11 data points have a value of 4.
●      11 data points have a value of 5.
●      11 data points have a value of 6.
●      12 data points have a value of 7.
●      10 data points have a value of 8.
Averages
Also known as the arithmetic mean, or mean for short, the average is probably the most commonly used and most commonly misunderstood statistic of them all. Just add up all the numbers and divide by how many numbers you just added. What could be simpler?
In this example, Data Sets A, B, and C each have an average of exactly 4.
In terms of application response times, however, these sets of data have extremely different meanings.
Given a response time goal of 5 seconds, looking at only the average of these sets, all three seem to meet the goal.
Looking at the data, however, shows that none of the data sets is composed only of data that meets the goal, and that Data Set B probably demonstrates some kind of performance anomaly.
Use caution when using averages to discuss response times, and, if at all possible, avoid using averages as your only reported statistic.
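The point about averages can be checked with a short Python sketch. The value counts are copied from the three data set descriptions; the helper names (`expand`, `summarize`) and the idea of reporting the share of measurements at or under the goal are my own illustration, not part of the original analysis.

```python
from statistics import mean

# value -> number of data points, copied from the data set descriptions
DATA_SET_A = {1: 5, 2: 10, 3: 20, 4: 30, 5: 20, 6: 10, 7: 5}
DATA_SET_B = {1: 80, 16: 20}
DATA_SET_C = {0: 11, 1: 10, 2: 11, 3: 13, 4: 11, 5: 11, 6: 11, 7: 12, 8: 10}

GOAL_SECONDS = 5  # the response time goal used in the text


def expand(counts):
    """Turn a {value: count} mapping into a flat list of measurements."""
    return [value for value, count in counts.items() for _ in range(count)]


def summarize(counts, goal=GOAL_SECONDS):
    """Return (average, share of measurements at or under the goal)."""
    data = expand(counts)
    within_goal = sum(1 for x in data if x <= goal) / len(data)
    return mean(data), within_goal


for name, counts in (("A", DATA_SET_A), ("B", DATA_SET_B), ("C", DATA_SET_C)):
    average, within_goal = summarize(counts)
    print(f"Data Set {name}: average={average}, within goal={within_goal:.0%}")
```

All three sets report the same average, yet the share of measurements that meets the goal differs from set to set, which is why the average alone is a poor summary.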
Percentiles
A percentile is a straightforward concept that is easier to demonstrate than to define. Consider the 95th percentile as an example. If you have 100 measurements ordered from greatest to least, and you count down the five largest measurements, the next largest measurement represents the 95th percentile of those measurements. For the purposes of response times, this statistic is read “Ninety-five percent of the simulated users experienced a response time of this value or less under the same conditions as the test execution.”
The 95th percentile of data set B above is 16 seconds. Obviously this does not give the impression of achieving our five-second response-time goal. Interestingly, this can be misleading as well: If we were to look at the 80th percentile on the same data set, it would be one second. Despite this possibility, percentiles remain the statistic that I find to be the most effective most often. That said, percentile statistics can stand alone only when used to represent data that’s uniformly or normally distributed and has an acceptable number of outliers.
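The counting procedure described above can be sketched in Python as follows. The function name `percentile_by_counting` is hypothetical, and this is only one of several percentile definitions; statistics libraries often interpolate between neighboring values and may return slightly different numbers.

```python
def percentile_by_counting(measurements, pct):
    """Percentile by counting: order the measurements from greatest to
    least, skip the largest (100 - pct) percent of them, and return the
    next largest value."""
    ordered = sorted(measurements, reverse=True)
    skip = int(len(ordered) * (100 - pct) / 100)  # 5 when n=100 and pct=95
    skip = min(skip, len(ordered) - 1)            # stay inside the list
    return ordered[skip]


# Data Set B: 80 measurements of 1 second and 20 measurements of 16 seconds
data_set_b = [1] * 80 + [16] * 20

print("95th percentile:", percentile_by_counting(data_set_b, 95))
print("80th percentile:", percentile_by_counting(data_set_b, 80))
```

Run against Data Set B, the sketch reproduces the two values discussed in the text: 16 seconds at the 95th percentile and 1 second at the 80th.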
Uniform Distributions
Uniform distribution is a term that represents a collection of data roughly equivalent to a set of random numbers that are evenly distributed between the upper and lower bounds of the data set. The key is that every number in the data set is represented approximately the same number of times. Uniform distributions are frequently used when modeling user delays, but aren’t particularly common results in actual response-time data. I’d go so far as to say that uniformly distributed results in response-time data are a pretty good indicator that someone should probably double-check the test or take a hard look at the application.
Normal Distributions
Also called a bell curve, a data set whose member data are weighted toward the center (or median value) is a normal distribution. When graphed, the shape of the “bell” of normally distributed data can vary from tall and narrow to short and squat, depending on the standard deviation of the data set; the smaller the standard deviation, the taller and more narrow the bell. Quantifiable human activities often result in normally distributed data. Normally distributed data is also common for response time data.
Standard Deviations
Standard deviation measures how widely the measurements in a set are spread around the mean. For normally distributed data, approximately 68 percent of all measurements fall within one standard deviation of the mean; what that means in English is that knowing the standard deviation of your data set tells you how densely the data points are clustered around the mean. Simply put, the smaller the standard deviation, the more consistent the data. To illustrate, the standard deviation of Data Set A is approximately 1.45, while the standard deviation of Data Set B is approximately 6.
Another rule of thumb is this: Data with a standard deviation greater than half of its mean should be treated as suspect.
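A minimal Python sketch of this rule of thumb follows. It assumes the population standard deviation (`statistics.pstdev`); the text does not say whether the population or sample form is intended, and the function name `is_suspect` is my own.

```python
from statistics import mean, pstdev

# Measurements rebuilt from the data set descriptions
data_set_a = [1] * 5 + [2] * 10 + [3] * 20 + [4] * 30 + [5] * 20 + [6] * 10 + [7] * 5
data_set_b = [1] * 80 + [16] * 20


def is_suspect(measurements):
    """Rule of thumb: a standard deviation greater than half of the mean
    marks the data as suspect. Uses the population standard deviation,
    which is an assumption of this sketch."""
    return pstdev(measurements) > mean(measurements) / 2


for name, data in (("A", data_set_a), ("B", data_set_b)):
    print(f"Data Set {name}: mean={mean(data)}, "
          f"std dev={pstdev(data):.2f}, suspect={is_suspect(data)}")
```

Under that assumption, Data Set A passes the rule while Data Set B is flagged as suspect.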
Statistical Significance
Statistical significance, also known as reliability, can be calculated mathematically, but doing so requires complex equations that call for volumes of data. Whenever possible, ensure that you collect at least 100 measurements from at least two independent tests.
There’s no hard-and-fast rule for deciding which results are statistically similar without those equations. If you’re not sure after your first two tests, compare the results from at least five test executions and apply these rules to help you determine whether the results are similar enough to be considered reliable:
1.    If more than 20 percent (or one out of five) of the test execution results appear not to be similar to the rest, something is generally wrong with either the test environment, the application or the test itself.
2.    If a 95th percentile value for any test execution is greater than the maximum or less than the minimum value for any of the other test executions, it’s probably not statistically similar.
3.    If a measurement from a test is noticeably higher or lower, when charted side-by-side, than the results of the rest of the test executions, it’s probably not statistically similar.
4.    If a single measurement category (for example, the response time for a specific object) in a test is noticeably higher or lower, when charted side-by-side with all the rest of the test execution results, but the results for all the rest of the measurements in that test are not, the test itself is probably statistically similar.
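Rules 1 and 2 are mechanical enough to sketch in Python. Everything below is illustrative: the function names are hypothetical, the response times are made-up sample data, and the percentile uses the simple counting method rather than any particular tool's definition. Rules 3 and 4 rely on visual judgment and are not covered.

```python
def percentile_by_counting(measurements, pct=95):
    """Order from greatest to least, skip the top (100 - pct) percent,
    and return the next largest value."""
    ordered = sorted(measurements, reverse=True)
    skip = min(int(len(ordered) * (100 - pct) / 100), len(ordered) - 1)
    return ordered[skip]


def runs_failing_rule_two(runs, pct=95):
    """Indexes of runs whose percentile lies above the maximum or below
    the minimum of at least one other run (rule 2)."""
    flagged = []
    for i, run in enumerate(runs):
        value = percentile_by_counting(run, pct)
        others = [other for j, other in enumerate(runs) if j != i]
        if any(value > max(other) or value < min(other) for other in others):
            flagged.append(i)
    return flagged


def fails_rule_one(flagged, total_runs):
    """Rule 1: more than 20 percent of the executions look dissimilar."""
    return len(flagged) / total_runs > 0.20


# Hypothetical response times (seconds) from five executions of one test
typical = [1, 2, 3, 4, 5] * 20
outlier = [3, 4, 5, 6, 20] * 20
runs = [typical, typical, typical, typical, outlier]

flagged = runs_failing_rule_two(runs)
print("Runs flagged by rule 2:", flagged)
print("Rule 1 triggered:", fails_rule_one(flagged, len(runs)))
```

With this sample data only the last execution is flagged, and since one run out of five is not more than 20 percent, rule 1 is not triggered.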

Economics of test automation

How to calculate the cost of test automation:
Cost of test automation = Cost of tool(s) + Labor costs of script creation + Labor costs of script maintenance
If a test script will be run every week for the next two years, automate the test if the cost of automation is less than the cost of manually executing the test 104 times.
Automate if:
Cost of automation < Cost of manually executing the test as many times as the automated test will be executed
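The two formulas above translate directly into code. In this Python sketch the function names and all monetary figures are hypothetical; only the formulas and the 104-run example come from the text.

```python
def automation_cost(tool_cost, script_creation_labor, script_maintenance_labor):
    """Cost of test automation = tools + script creation + script maintenance."""
    return tool_cost + script_creation_labor + script_maintenance_labor


def should_automate(cost_of_automation, manual_cost_per_run, planned_runs):
    """Automate if automation costs less than running the test manually
    as many times as the automated test will be executed."""
    return cost_of_automation < manual_cost_per_run * planned_runs


# Hypothetical figures, in any single currency
cost = automation_cost(tool_cost=2000,
                       script_creation_labor=1500,
                       script_maintenance_labor=1000)

weekly_runs_for_two_years = 52 * 2  # the 104 executions from the example

print("Cost of automation:", cost)
print("Automate:", should_automate(cost, manual_cost_per_run=50,
                                   planned_runs=weekly_runs_for_two_years))
```

The comparison is only as good as the estimates fed into it, and maintenance labor in particular tends to be the hardest input to predict.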


Monitoring Windows server using Nagios

Monitoring a Windows server requires installing NSClient++ on the Windows host.

How does it work?

For example, suppose disk space usage needs to be monitored on a Windows host.
1. Nagios executes the check_nt command on the nagios-server and asks it to monitor disk usage on the Windows machine.
2. check_nt on the nagios-server contacts the NSClient++ service on the remote Windows host and asks it to execute USEDDISKSPACE there. NSClient++ then returns the result to check_nt on the nagios-server.

How to Install and configure NSClient++?

1. Download and install NSClient ++ from
2. Modify the NSClient++ service
Type services.msc in the Run dialog, then double-click the NSClient++ service in the list. Select the check box labeled “Allow service to interact with desktop”.

3. Modify the NSC.ini file
Edit the C:\Program Files\NSClient++\NSC.ini file:
– Uncomment everything under [modules] except RemoteConfiguration.dll and CheckWMI.dll.
– Uncomment allowed_hosts under [Settings] and add the IP address of the nagios-server.
– Uncomment the port setting under the [NSClient] section.

If you have a firewall running, open the port used by NSClient++. The default ports are 12489, 5666, and 5667.
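Before configuring Nagios, it can help to confirm from the monitoring server that the NSClient++ port is reachable through the firewall. The following is a minimal Python sketch, not part of Nagios or NSClient++; the function name `port_is_open` and the placeholder host address are assumptions, and a successful TCP connection only shows that something is listening, not that NSClient++ is configured correctly.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed
        return False


# Placeholder address: replace with the IP address of your Windows host
NSCLIENT_HOST = "127.0.0.1"

# 12489 is the port that check_nt uses in the command definitions in this article
print("Port 12489 reachable:", port_is_open(NSCLIENT_HOST, 12489, timeout=1.0))
```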

Configuration on the Nagios Monitoring Server

1. Verify check_nt command and windows-server template
– Verify that the check_nt command is enabled in /usr/local/nagios/etc/objects/commands.cfg
# 'check_nt' command definition
define command{
    command_name check_nt
    command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
}

– Verify that the windows-server template is enabled under
# Windows host definition template - This is NOT a real host, just a template!
define host{
    name windows-server ; The name of this host template
    use generic-host ; Inherit default values from the generic-host template
    check_period 24x7 ; By default, Windows servers are monitored round the clock
    check_interval 5 ; Actively check the server every 5 minutes
    retry_interval 1 ; Schedule host check retries at 1 minute intervals
    max_check_attempts 10 ; Check each server 10 times (max)
    check_command check-host-alive ; Default command to check if servers are "alive"
    notification_period 24x7 ; Send notification out at any time - day or night
    notification_interval 30 ; Resend notifications every 30 minutes
    notification_options d,r ; Only send notifications for specific host states
    contact_groups admins ; Notifications get sent to the admins by default
    hostgroups windows-servers ; Host groups that Windows servers should be a member of
    register 0 ; Do not register this definition, it is only a template
}

2. Uncomment windows.cfg in /usr/local/nagios/etc/nagios.cfg
# Definitions for monitoring a Windows machine

3. Modify /usr/local/nagios/etc/objects/windows.cfg
By default, windows.cfg contains a sample host definition for a Windows server. Modify it to reflect the Windows server that needs to be monitored through Nagios.
# Define a host for the Windows machine we'll be monitoring
# Change the host_name, alias, and address to fit your situation
define host{
    use windows-server ; Inherit default values from a template
    host_name remote-windows-host ; The name we're giving to this host
    alias Remote Windows Host ; A longer name associated with the host
    address ; IP address of the remote windows host
}

4. Define windows services that should be monitored.
The following Windows services are already enabled by default in the sample windows.cfg. Make sure to update the host_name in each service to match the host_name defined in the previous step.
define service{
    use generic-service
    host_name remote-windows-host
    service_description NSClient++ Version
    check_command check_nt!CLIENTVERSION
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description Uptime
    check_command check_nt!UPTIME
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description CPU Load
    check_command check_nt!CPULOAD!-l 5,80,90
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description Memory Usage
    check_command check_nt!MEMUSE!-w 80 -c 90
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description C:\ Drive Space
    check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description W3SVC
    check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
}
define service{
    use generic-service
    host_name remote-windows-host
    service_description Explorer
    check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}

5. Enable Password Protection
If you specified a password in the NSC.ini configuration file of NSClient++ on the Windows machine, you’ll need to modify the check_nt command definition to include the password.

Modify the /usr/local/nagios/etc/objects/commands.cfg file and add the password as shown below.
# A literal $ in the password must be written as $$ in command_line
define command{
    command_name check_nt
    command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s My2Secure$$Password -v $ARG1$ $ARG2$
}

6. Verify the configuration

Verify the nagios configuration files as shown below.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

7. Restart Nagios

In the next article I will cover how to install the Nagios server.

JMeter's limitations when monitoring servers under load, and how to overcome them

JMeter lets you add a monitor test plan to monitor application servers, but out of the box it works only with the Tomcat 5 status servlet. Any servlet container that supports JMX (Java Management Extensions) can port the status servlet to provide the same information.

If you want to use the monitor with other servlet or EJB containers, Tomcat’s status servlet will work with them for the memory statistics without any modifications. To get thread information, the MBeanServer lookup must be changed to retrieve the correct MBeans.

Even so, this approach does not work for IIS on Windows Server.

One way to overcome this limitation is to use Nagios in conjunction with JMeter: JMeter generates the load, and Nagios monitors the application server's performance under that load.

So what is Nagios, and how does it work?

Nagios is a system and network monitoring tool. It watches the hosts and services that you specify and alerts you when things go bad (that is, when a configurable threshold value is crossed) and again when they get better.

It runs on Linux and Linux-like systems, but it can also monitor Windows servers, which is its most important aspect here.

The only requirements for running Nagios are a machine running Linux (or a variant) and a C compiler.

In the next article I will explain in detail how to configure it to monitor a Windows machine.

Load testing MS SQL 2005 and MS SQL 2000 using JMeter

This article covers load testing Microsoft SQL Server 2000 and SQL Server 2005 database servers with JMeter. The same approach also works with SQL Server 2012, SQL Server 2008 R2, SQL Server 2008, and SQL Azure.

1. Download the “Microsoft JDBC Driver 4.0 for SQL Server” from Microsoft download center.

2. Extract the sqljdbc.jar

3. Add the jar file path to the CLASSPATH, or copy the jar file into the jmeter\lib folder.

4. Create a JMeter test plan containing a JDBC Connection Configuration and a JDBC Request.

5. Update Database URL as


Server Name or IP is the database server name
1433 is the default port. It can be different in your case.

E.g: jdbc:sqlserver://localhost:1433;databasename=TestDB;
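When the same test plan targets several servers, the Database URL can be assembled programmatically. This Python sketch infers the URL layout from the example above only; the function name `build_sqlserver_jdbc_url` is hypothetical, and the helper is not part of JMeter or the JDBC driver.

```python
def build_sqlserver_jdbc_url(server, database, port=1433):
    """Build a Database URL in the same layout as the example above.
    1433 is the default port mentioned in the text."""
    return f"jdbc:sqlserver://{server}:{port};databasename={database};"


# Reproduces the example URL from the text
print(build_sqlserver_jdbc_url("localhost", "TestDB"))
```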

6. Update JDBC Driver Class as

7. Write the SQL query that needs to be tested in JDBC Request Sampler.

How to configure and use Selenium with C#

Step 1: Download Selenium Core, Selenium IDE and Selenium RC from the website

Step 2: Install the IDE. It consists of an XPI file that needs to be added to Firefox.

Step 3: Unzip the Selenium RC folder.

Step 4: To record scripts, open the IDE: open Firefox and select Selenium IDE from the Firefox Tools menu.

To start recording, click the red icon.

Step 5: After recording the script, select the C# format to convert the script to C#.


Step 6: Configure the .NET client driver.
The .NET client driver can be used with Microsoft Visual Studio. Launch Visual Studio and create a new project of type Class Library.
Copy the converted code into the Class1.cs file (the default name) and rename the file as required.
Add references to the following DLLs:
ThoughtWorks.Selenium.Core.dll

Step 7: Build the application.

Functional Testing using Selenium – Introduction

Selenium is a portable software testing framework for web applications. Selenium provides a record/playback tool for authoring tests without learning a test scripting language. Selenium provides a rich set of testing functions for testing web applications. These operations are highly flexible, allowing many options for locating UI elements and comparing expected test results against actual application behavior.

It also supports multiple browsers. Selenium is highly flexible: there are multiple ways to customize Selenium’s framework and add functionality to it. This is one of its strongest characteristics when compared with proprietary test automation tools and other open source tools. Selenium tests can be written in a variety of programming and scripting languages, including Java, C#, Perl, PHP, and Ruby.

Selenium has three major tools: Selenium IDE, Selenium RC (Remote Control), and Selenium Core.


Selenium-IDE

Selenium-IDE is the Integrated Development Environment for building Selenium test cases. It operates as a Firefox add-on that lets you record user actions and store them as scripts. It also adds a context menu to the Firefox browser that lets the user pick from a list of assertions.

Selenium-RC (Remote Control)

Selenium-RC allows the test automation developer to use a programming language for maximum flexibility and extensibility in developing test scripts. Selenium-RC provides an API (Application Programming Interface) and library for each of its supported languages: HTML, Java, C#, Perl, PHP, Python, and Ruby. This helps in writing high-quality automated test scripts that can also be integrated with the project’s automated build environment.

For more details visit the official website