We are in the middle of performance testing the project we are implementing with Documentum D6. In the process I learnt that the requirements specs must be drafted very carefully when it comes to performance testing. Here are some key points that may help you:
Define what the following terms mean in the context of performance testing:
Concurrent users
Possible confusion here – Does it mean 500 users logged into the server and maintaining an idle session? OR 500 users logged in and performing active transactions?
Clearly specify that the performance testing shall be done with a sustained load of 500 concurrent users performing active transactions. There should not be any idle time between transactions except for the specified think-times.
Think-time
Think time is the time when the user is supposed to be reading documents or doing something that does not actively stress the system.
Obviously, using zero think time during performance testing will unduly load the system and is not realistic. On the other hand, if you choose a think time that is too long, the system will see far less load than it would in production and the test will prove little.
The think times chosen should be realistic. Typically, 5-6 seconds between requests is a good start (since the user will get used to an application he uses frequently). If you want to be accurate, time the actions of a few users while they use a pilot/demo system.
If you are writing requirements specs you MUST specify a think time that you need your vendor to comply with. Leaving this open is a sure way to invite arguments in the future.
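To see why the think time matters so much, here is a rough sketch of the request rate that a fixed population of simulated users generates. It assumes each user simply alternates between waiting for a response and thinking, so the rate is users / (response time + think time). The 4 second response time and the 0 and 30 second think times are assumptions picked for illustration only.

```python
def offered_throughput(users, response_time_s, think_time_s):
    """Requests per second generated by a fixed population of users.

    Assumes every user repeats the cycle: send request, wait for the
    response, think, send the next request.
    """
    cycle_s = response_time_s + think_time_s
    if cycle_s <= 0:
        raise ValueError("response time plus think time must be positive")
    return users / cycle_s


# 500 users with an assumed 4 s response time and three think times.
for think_s in (0, 5, 30):
    rate = offered_throughput(users=500, response_time_s=4, think_time_s=think_s)
    print(f"think time {think_s:>2} s -> {rate:.1f} requests/s")
```

With these assumed numbers, going from zero think time to 5 seconds more than halves the load on the server, which is why the figure needs to be agreed in writing.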
Response Time
Is this to be measured from user PC to server? OR is it the time taken for the server to return a response to the load testing tool?
Usually performance testing is done using a tool like LoadRunner or Rational Performance Tester. The tool will be set up on one or more test servers and run to generate the user load and perform transactions. There are two problems to take note of here:
- The tool will send and receive data at a network level (TCP). So the time taken by a typical browser to render the HTML, JavaScript, etc. is not recorded.
- The end-users will probably be accessing the application from different network locations. The response time will depend on the network latency.
It is not possible to measure the network latency and browser render time accurately. So you will need to make an assumption of this latency – say 4 seconds.
Finally, if you think your users can accept a response time of 8 seconds for a transaction, you will have to specify that, when measured with a tool like LoadRunner, the expected response time for the transaction is 8 - 4 = 4 seconds.
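The subtraction above can be written down as a small helper so the same rule is applied to every transaction in the spec. This is only a sketch: the function name is made up, and the 4 second allowance is the assumed latency and render time from the example, not a measured value.

```python
def tool_response_target(user_acceptable_s, assumed_overhead_s):
    """Response time the load testing tool should report.

    user_acceptable_s:  response time the end user is willing to accept.
    assumed_overhead_s: assumed network latency plus browser render time,
                        which the tool does not measure.
    """
    if assumed_overhead_s >= user_acceptable_s:
        raise ValueError("assumed overhead leaves no time for the server")
    return user_acceptable_s - assumed_overhead_s


# The example from the text: 8 s acceptable, 4 s assumed overhead.
print(tool_response_target(user_acceptable_s=8, assumed_overhead_s=4))
```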
Ramp-up period
Specify that 500 users shall be logged into the system within a period of 10 minutes.
You have two options after the user logs in: 1) continue with the next transaction and log out after completion, OR 2) log in and wait until the load of 500 users has been attained, and only then start performing transactions. Specify which of the above options you prefer.
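As a rough illustration of what such a ramp-up means in practice, the sketch below spreads the logins evenly over the ramp-up period. Even spacing is an assumption; a load testing tool may offer other ramp-up patterns, and the function name is made up for this example.

```python
def login_offsets(total_users, ramp_up_s):
    """Seconds after test start at which each simulated user logs in,
    assuming logins are spaced evenly across the ramp-up period."""
    if total_users <= 0 or ramp_up_s <= 0:
        raise ValueError("total_users and ramp_up_s must be positive")
    interval_s = ramp_up_s / total_users
    return [i * interval_s for i in range(total_users)]


# 500 users within 10 minutes (600 seconds).
offsets = login_offsets(total_users=500, ramp_up_s=600)
print(f"one login every {offsets[1] - offsets[0]:.1f} s")
print(f"last login at {offsets[-1]:.1f} s")
```

For 500 users in 10 minutes this works out to one login roughly every 1.2 seconds.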
Sustained user load
You must specify clearly that during the test period, the number of users logged into the system will be sustained at a level of 500 users. Users that complete transactions and logout will be replaced by logging in new users. Otherwise, the system will be loaded initially but the user load will fall rapidly as the simulated users logout.
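The toy simulation below shows the difference between the two behaviours. It is a deliberately simplified sketch: time moves in fixed ticks, every session lasts the same number of ticks, and the initial sessions are staggered so that they do not all end at once. None of these assumptions come from a real load testing tool.

```python
def simulate(target_users, session_ticks, total_ticks, replace):
    """Return the number of logged-in users at the start of each tick."""
    # Ticks remaining for each active session, staggered from 1 to session_ticks.
    remaining = [1 + i % session_ticks for i in range(target_users)]
    history = []
    for _ in range(total_ticks):
        history.append(len(remaining))
        # Sessions on their last tick log out; the rest move one tick forward.
        remaining = [r - 1 for r in remaining if r > 1]
        if replace:
            # Log in new users to bring the load back up to the target.
            remaining += [session_ticks] * (target_users - len(remaining))
    return history


print(simulate(500, session_ticks=5, total_ticks=7, replace=False))
print(simulate(500, session_ticks=5, total_ticks=7, replace=True))
```

Without replacement the load drains away within a few ticks; with replacement it stays at 500 for the whole run.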
Caching
You cannot avoid caching in an enterprise system.
You should specify that caching caused by running the same search queries or downloading the same document should be eliminated. One way to achieve this is by randomly picking keywords or documents to download.
You can also use pacing creatively to avoid logging in the same user at short intervals.
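Random selection of test inputs might look something like the sketch below. The keyword pool is invented for illustration; in a real test it would come from your own test data. Sampling without replacement means no keyword is repeated within one batch, which is one simple way to keep repeated queries from hitting the cache.

```python
import random


def pick_inputs(pool, count, rng):
    """Pick `count` distinct items from `pool` in random order."""
    if count > len(pool):
        raise ValueError("pool is too small to pick that many distinct items")
    return rng.sample(pool, count)


# Hypothetical search keywords, for illustration only.
keywords = ["invoice", "contract", "policy", "memo", "report", "tender"]

# A fixed seed makes a test run repeatable; leave it out for a fresh order each run.
rng = random.Random(42)
picks = pick_inputs(keywords, count=4, rng=rng)
print(picks)
```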
Response Time measured at 95% (or 99%) Percentile Value
Don’t use Average response time as a measure of acceptable response times. The mathematical average tends to mask the actual response times. An accepted measure is the 95th percentile value.
If you have 100 transactions, sort all the response times in ascending order; the value at the 95th position is the 95th percentile value. In other words, 95% of the transactions completed with a response time faster than or equal to the 95th percentile value.
However, don’t go by this number alone. Plot a graph of the response time against the number of users (or transactions) that saw that response time. This will show you whether most users are clustered at the fast response times or towards the longer ones.
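Here is a minimal sketch of both ideas: the percentile taken as the value at the 95th position of the sorted list, and a simple count of transactions per response-time bucket that can be fed into a graph. The sample data is made up to show how the average can hide a slow tail.

```python
import math
from collections import Counter


def percentile_by_position(samples, pct):
    """Value at position ceil(pct% of N) in the sorted samples (1-based)."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    position = math.ceil(pct * len(ordered) / 100)
    return ordered[position - 1]


def bucket_counts(samples, width_s):
    """Number of samples per response-time bucket, keyed by bucket start."""
    counts = Counter(int(s // width_s) * width_s for s in samples)
    return dict(sorted(counts.items()))


# Made-up data: 90 fast transactions, 5 slower ones, 5 very slow ones.
times_s = [2.0] * 90 + [6.0] * 5 + [20.0] * 5

print("average:", sum(times_s) / len(times_s))
print("95th percentile:", percentile_by_position(times_s, 95))
print("distribution:", bucket_counts(times_s, width_s=5))
```

In this made-up data set the average is close to 3 seconds, the 95th percentile is 6 seconds, and the bucket counts reveal five transactions that took 20 seconds.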
Stable Test Runs
When you conduct performance testing at high loads, some of the transactions will fail. This could be due to hung threads, lack of server resources or other factors. However, if a lot of transactions fail, then perhaps the system is not stable enough. Depending on your requirement, you should specify that a test run is considered stable and valid for analysis if a suitable percentage (say 95%) of the transactions succeed.
It is also advisable to investigate why the remaining 5% of the requests failed. A 5% failure rate means that 5% of user transactions end in errors. If you are running an eCommerce application or a financial application then this is a substantial number of failed transactions and you may not be willing to accept this.
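A check like this could be applied to the results of each run before any analysis is done. It is only a sketch: the function name is made up, and the 95% default is the example threshold from above and should be replaced by whatever your spec states.

```python
def is_stable_run(succeeded, total, min_success_pct=95.0):
    """True if enough transactions succeeded for the run to be analysed."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= succeeded <= total:
        raise ValueError("succeeded must be between 0 and total")
    return succeeded * 100 / total >= min_success_pct


print(is_stable_run(succeeded=950, total=1000))   # exactly at the threshold
print(is_stable_run(succeeded=949, total=1000))   # just below the threshold
```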
To be continued…..




