YouTube Video Link: 90th Percentile in Performance Testing
‘Percentile’ is a common and important term, especially in performance testing. From the very first day we started analysing performance test reports, we heard about the 90th percentile. Even our mentors put more stress on the 90th percentile figures for response time. So, what exactly is the 90th percentile?
Let’s try to understand it with an example. Suppose you have 10 sheep and each sheep eats a few kilograms of grass every day. One day you weigh the grass and note each sheep’s intake. Refer to the table below:
Sheep# | S1 | S2 | S3 | S4 | S5 | S6 | S7 | S8 | S9 | S10 |
Grass(kg) | 3 | 3.2 | 4 | 4.8 | 3.6 | 2.9 | 3.4 | 3 | 3.8 | 3.9 |
Now, you need to find out the maximum amount of grass consumed by 90% of the sheep. Simply sort the values by the amount of grass consumed and ignore the last (highest) value.
Rank | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th |
Sheep# | S6 | S1 | S8 | S2 | S7 | S5 | S9 | S10 | S3 | S4 |
Grass(kg) | 2.9 | 3 | 3 | 3.2 | 3.4 | 3.6 | 3.8 | 3.9 | 4 | 4.8 |
The 90th percentile of 10 entries is the 9th value, which is 4, so just ignore S4 with 4.8 (keep it hungry for some days, it eats so much).
The conclusion is that 90% of the sheep eat 4 kg of grass or less, so you have an upper limit of grass consumption for them. In terms of performance testing, you sort the response times of a particular transaction or request in increasing order and then ignore the 10% of values at the high end. The highest of the remaining values is the 90th percentile.
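The sort-and-cut procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the nearest-rank approach, not the implementation of any particular tool; the function name `nearest_rank_percentile` and the restriction to whole-number percentiles are assumptions made for this example.

```python
def nearest_rank_percentile(values, p):
    """Return the p-th percentile (p is a whole number, 1-100) by nearest rank."""
    if not values:
        raise ValueError("values must not be empty")
    if not 1 <= p <= 100:
        raise ValueError("p must be between 1 and 100")
    ordered = sorted(values)
    # 1-based rank = ceil(p * n / 100), computed with integer arithmetic
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Grass intake (kg) of sheep S1..S10, taken from the table above
grass_kg = [3, 3.2, 4, 4.8, 3.6, 2.9, 3.4, 3, 3.8, 3.9]
p90 = nearest_rank_percentile(grass_kg, 90)
print(p90)  # 4
```

With 10 values the rank works out to 9, so the function returns the 9th sorted value and leaves S4 (4.8 kg) out, exactly as in the sorted table.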
Example:
A performance test script is executed for 25 iterations. The response time of the login transaction of each iteration is:
S. No. | Iteration No. | Login Response Time (sec) |
1 | 1 | 1.5 |
2 | 2 | 1.6 |
3 | 3 | 1.1 |
4 | 4 | 0.9 |
5 | 5 | 2.1 |
6 | 6 | 1.9 |
7 | 7 | 1.4 |
8 | 8 | 1 |
9 | 9 | 0.8 |
10 | 10 | 1.5 |
11 | 11 | 1.8 |
12 | 12 | 1.1 |
13 | 13 | 1.6 |
14 | 14 | 1.7 |
15 | 15 | 1.3 |
16 | 16 | 0.9 |
17 | 17 | 1 |
18 | 18 | 1.5 |
19 | 19 | 2.3 |
20 | 20 | 1.9 |
21 | 21 | 1.8 |
22 | 22 | 1.2 |
23 | 23 | 1.4 |
24 | 24 | 0.9 |
25 | 25 | 1.5 |
Now, sort the list in increasing order with respect to response time.
S. No. | Iteration No. | Login Response Time (sec) |
1 | 9 | 0.8 |
2 | 4 | 0.9 |
3 | 16 | 0.9 |
4 | 24 | 0.9 |
5 | 8 | 1 |
6 | 17 | 1 |
7 | 3 | 1.1 |
8 | 12 | 1.1 |
9 | 22 | 1.2 |
10 | 15 | 1.3 |
11 | 7 | 1.4 |
12 | 23 | 1.4 |
13 | 1 | 1.5 |
14 | 10 | 1.5 |
15 | 18 | 1.5 |
16 | 25 | 1.5 |
17 | 2 | 1.6 |
18 | 13 | 1.6 |
19 | 14 | 1.7 |
20 | 11 | 1.8 |
21 | 21 | 1.8 |
22 | 6 | 1.9 |
23 | 20 | 1.9 |
24 | 5 | 2.1 |
25 | 19 | 2.3 |
Now, 90% of the number of transactions (25) is 22.5:
=> 25 x (90/100) = 22.5
Round up to 23. So the 23rd value is the 90th percentile, which is 1.9 seconds. It means 90% of the iterations have a response time of 1.9 seconds or less. Similarly, you can calculate other percentile values such as the 70th, 80th or 95th percentile.
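To check the arithmetic, here is a small Python sketch that repeats the calculation for the 25 login response times and a few other percentiles. The helper name `percentile_rank` is hypothetical. It rounds up with `math.ceil` on purpose: Python's built-in `round(22.5)` returns 22 (halves go to the nearest even number), which would pick the wrong row.

```python
import math

# Login response times (sec) for iterations 1..25, in execution order
login_sec = [1.5, 1.6, 1.1, 0.9, 2.1, 1.9, 1.4, 1, 0.8, 1.5,
             1.8, 1.1, 1.6, 1.7, 1.3, 0.9, 1, 1.5, 2.3, 1.9,
             1.8, 1.2, 1.4, 0.9, 1.5]


def percentile_rank(count, p):
    """1-based row of the p-th percentile in a sorted list of `count` values."""
    return math.ceil(count * p / 100)


ordered = sorted(login_sec)
results = {}
for p in (70, 80, 90, 95):
    rank = percentile_rank(len(ordered), p)
    results[p] = ordered[rank - 1]
    print(f"{p}th percentile: row {rank} -> {results[p]} sec")
```

For this data set the rows are 18, 20, 23 and 24, which gives 1.6, 1.8, 1.9 and 2.1 seconds respectively.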
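How is the 90th percentile calculated in MS Excel?
MS Excel uses the formula below to find the position (rank) of the 90th percentile in the sorted list:
Rank of 90th Percentile = 0.9 * (Number of Values - 1) + 1
When the rank is not a whole number, Excel interpolates between the two neighbouring values. For the 25 login values above, the rank is 0.9 * 24 + 1 = 22.6, which falls between the 22nd and 23rd values (both 1.9 seconds), so the result is again 1.9 seconds.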
The same formula was used by LoadRunner version 6.5. The method to find out the 90th percentile has been changed in LoadRunner version 7 and above.
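The rank formula above can be tried out with a short Python sketch. It assumes linear interpolation between the two neighbouring sorted values whenever the rank is fractional, which is how Excel's PERCENTILE.INC behaves; the function name `interpolated_percentile` is made up for this example, and the code is only an approximation of the idea, not Excel's or LoadRunner's actual implementation.

```python
import math
from statistics import quantiles


def interpolated_percentile(values, p):
    """Percentile using rank = p/100 * (n - 1) + 1 with linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    n = len(ordered)
    rank = p / 100 * (n - 1) + 1   # 1-based position, may be fractional
    lower = math.floor(rank)       # row at or just below that position
    if lower >= n:                 # p = 100, or only one value
        return ordered[-1]
    fraction = rank - lower
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


grass_kg = [3, 3.2, 4, 4.8, 3.6, 2.9, 3.4, 3, 3.8, 3.9]
login_sec = [1.5, 1.6, 1.1, 0.9, 2.1, 1.9, 1.4, 1, 0.8, 1.5,
             1.8, 1.1, 1.6, 1.7, 1.3, 0.9, 1, 1.5, 2.3, 1.9,
             1.8, 1.2, 1.4, 0.9, 1.5]

sheep_p90 = interpolated_percentile(grass_kg, 90)   # position 9.1, between 4 and 4.8
login_p90 = interpolated_percentile(login_sec, 90)  # position 22.6, between 1.9 and 1.9
print(round(sheep_p90, 2), login_p90)

# Cross-check: statistics.quantiles(method="inclusive") interpolates the same way
assert math.isclose(sheep_p90, quantiles(grass_kg, n=10, method="inclusive")[-1])
```

Note the difference from the sort-and-cut method: for the sheep data the interpolated result is about 4.08 kg instead of 4 kg, while for the login data both methods give 1.9 seconds because the 22nd and 23rd values are equal.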
Why do we need the 90th percentile in Performance Testing?
- A percentile is often used as a performance goal. If the SLA defines a 90th percentile NFR and the test meets it, then 90% of the users had an experience that matches your performance goals. This gives the client additional confidence in the application.
- Sometimes the average response time appears extremely high even though most individual data points look normal. Even a couple of peaks in response time can skew the average and distort the test result. In such scenarios, the 90th percentile (or another percentile value) excludes the unusual spikes from the result.
- In reality, most applications show only a few high spikes in the graph; a statistician would say that the curve has a long tail. A long tail does not imply many slow transactions, but a few that are orders of magnitude slower than the norm. In that case, the 90th percentile is helpful because it ignores the slowest 10% of requests, which contain the spikes.
- If the 50th percentile (median) of response time is 5 seconds, it means that 50% of the transactions take 5 seconds or less. If the 90th percentile of the same transaction is 8 seconds, it means that 90% take 8 seconds or less and only 10% are slower. The average, in this case, could be lower than 5 seconds, somewhere in between, or even above 8 seconds if the slowest 10% are extreme. A percentile gives a much better sense of real-world performance because it shows a slice of the response time curve.
- If we calculate the difference between the 90th percentile value and the average response time, and divide this difference by the average response time, the result gives an idea of the spread of the data points. If the ratio is very small, the average and the 90th percentile are close to each other, which indicates good and consistent performance of the application. However, if the ratio is large, it shows a high deviation in response time and non-uniform performance. This is one of the ways in which the 90th percentile is useful, although I would recommend drawing your conclusion using standard deviation only.
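The points about spikes and the spread ratio can be illustrated with a small Python sketch based on the login response times from the example. The 30-second spike is hypothetical data added only for this illustration, and the helper names `p90_nearest_rank` and `spread_ratio` are assumptions, not functions of any testing tool.

```python
from statistics import mean


def p90_nearest_rank(values):
    """90th percentile: sort, then take the row at ceil(0.9 * n)."""
    ordered = sorted(values)
    rank = (90 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def spread_ratio(values):
    """(90th percentile - average) / average, as described in the last bullet."""
    avg = mean(values)
    return (p90_nearest_rank(values) - avg) / avg


baseline = [1.5, 1.6, 1.1, 0.9, 2.1, 1.9, 1.4, 1, 0.8, 1.5,
            1.8, 1.1, 1.6, 1.7, 1.3, 0.9, 1, 1.5, 2.3, 1.9,
            1.8, 1.2, 1.4, 0.9, 1.5]
# Hypothetical: the slowest request (2.3 sec) turns into a 30 sec spike
spiky = [30 if v == 2.3 else v for v in baseline]

for label, data in (("baseline", baseline), ("spiky", spiky)):
    print(label,
          "avg:", round(mean(data), 3),
          "p90:", p90_nearest_rank(data),
          "ratio:", round(spread_ratio(data), 3))
```

A single spike moves the average from about 1.43 to about 2.54 seconds, while the 90th percentile stays at 1.9 seconds in both runs, which is why the percentile is the more stable figure to report.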
Conclusion
Percentiles are a simple and effective way of understanding the real performance characteristics of your application. They also provide a good basis for automatic baselining, learning application behaviour and optimizing your application with a proper focus. Averages alone, by contrast, are too simplistic and one-dimensional to be effective. In short, the percentile (90th, 95th, 99th) is a great tool in the performance testing world!
Thanks, nice explanation
Welcome
Hi Gagan,
Can you elaborate on the following scenarios?
If the 50th percentile (median) of response time is 5 seconds that means that 50% of the transactions are either as fast or faster than 5 seconds. If the 90th percentile of the same transaction is at 8 seconds it means that 90% are as fast or faster and only 10% are slower. The average, in this case, could either be lower than 5 seconds or somewhere in between. A percentile gives a much better sense of real-world performance because it shows a slice of response time curve.
If we calculate the difference of the 90th percentile value and the average response time value and divide this difference with the average response time value then it gives an idea of the spread of different data points. If the ratio is extremely small, it means that average and 90th percentile values are very close to each other and will indicate good and constant performance of the application. However, if the ratio is large, it shows high deviation in response time and non-uniform performance of the application. This is one of the methods where 90th percentile is useful, although I would recommend to draw your conclusion using standard deviation only.
Hi Alekh,
Yes, in such a case we can judge the system performance by checking the standard deviation, which is explained in this post:
Link: https://www.perfmatrix.com/standard-deviation-in-performance-testing/
Thanks so much, nice explanation.
“It means 90% of total iterations having response time 1.9 seconds or less than it.”
You explain this part very well.