An introduction to load testing

  • Load testing is done by running load-generation software on one machine (or cluster of machines) to fire a large number of requests at the web server on a second machine (or cluster).
  • Common things to test include: whether server resources (CPU, memory, etc.) can handle anticipated loads; how quickly the server responds to users; how efficiently the application runs; whether you need to scale up hardware or scale out to multiple servers; which pages or API calls are particularly resource-intensive; and the maximum requests per second the server can sustain.
  • In general, higher request rates mean higher latency, but in practice it pays to test several times at different request rates. Though a full page can take 2-5 seconds to load, web server latency should typically be around 50-200 milliseconds. Remember that even ‘imperceptible’ improvements add up in the aggregate for a better UX.
  • As a first step, monitor resources on the server under test, chiefly CPU load and free memory.
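As a minimal sketch (assuming a Linux host, where Unix load averages and `/proc/meminfo` are available), a resource snapshot taken during a test run might look like this:

```python
import os

# Assumption: a Linux-style /proc filesystem; adjust for other platforms.
MEMINFO_PATH = "/proc/meminfo"

def resource_snapshot():
    # 1-, 5- and 15-minute load averages (Unix only)
    load1, load5, load15 = os.getloadavg()
    mem_available_kb = None
    try:
        with open(MEMINFO_PATH) as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    mem_available_kb = int(line.split()[1])  # value is in kB
                    break
    except OSError:
        pass  # no /proc/meminfo on this platform; report load only
    return {"load1": load1, "load5": load5,
            "mem_available_kb": mem_available_kb}
```

In a real test you would sample this in a loop (or use `vmstat`/`top`) on the server while the load generator runs on a separate machine.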
  • Next, find the maximum response rate of your web server: pick a concurrency level (100 is a safe default, but check settings like MaxClients, MaxThreads, etc. for your server) and a test duration in any load-testing tool. If your tool only tests one URL at a time, repeat the test with a few different URLs of varying resource requirements. This should push CPU idle time to 0% and raise response times beyond real-world expectations.
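To make the mechanics concrete, here is a sketch of a tiny concurrent load generator; the throwaway local server exists only so the example is self-contained, and a real test would point `run_load` (a hypothetical helper, not part of any tool above) at your actual web server:

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def run_load(url, concurrency, total_requests):
    """Fire total_requests GETs at url with the given concurrency,
    returning per-request latencies in seconds."""
    latencies = []
    lock = threading.Lock()

    def one_request():
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()
        with lock:
            latencies.append(time.perf_counter() - start)

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for fut in [pool.submit(one_request) for _ in range(total_requests)]:
            fut.result()  # re-raise any request failure
    return latencies

# Demo target: a throwaway local server, purely for illustration.
class _Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")
    def log_message(self, *_):  # keep the demo quiet
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), _Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
lats = run_load(f"http://127.0.0.1:{server.server_address[1]}/",
                concurrency=8, total_requests=40)
server.shutdown()
print(f"{len(lats)} requests, max latency {max(lats) * 1000:.1f} ms")
```

Dedicated tools do this far better (coordinated-omission handling, ramp-up, reporting), but the structure — a worker pool issuing timed requests — is the same.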
  • Then dial back the load to see how your server performs when not pushed to its absolute limit: specify an exact requests-per-second rate, starting at half the maximum found in the previous step. Halve the interval again each time, stepping the rate up or down, until you converge on the maximum rate that still gives acceptable latency, measured at the 99th or even 99.999th percentile.
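The step-up/step-down procedure is effectively a binary search over request rates. A sketch, where `p99_at_rate` stands in for actually running a test at a given rate, and the simulated latency curve at the end is purely hypothetical:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    rank = max(math.ceil(pct / 100 * len(ordered)) - 1, 0)
    return ordered[rank]

def find_max_rate(highest_rate, p99_at_rate, latency_budget):
    """Binary-search for the highest request rate whose p99 latency
    stays within latency_budget; p99_at_rate is a measurement callback."""
    lo, hi = 0, highest_rate          # lo is known-good, hi is known-bad
    while hi - lo > 1:
        mid = (lo + hi) // 2          # cut the remaining interval in half
        if p99_at_rate(mid) <= latency_budget:
            lo = mid                  # latency acceptable: step the rate up
        else:
            hi = mid                  # latency too high: step the rate down
    return lo

# Hypothetical latency curve (in ms) standing in for real measurements:
simulated = lambda rate: 50 + max(0, rate - 800)
print(find_max_rate(1000, simulated, latency_budget=200))  # → 950
```

Each probe here is cheap; in a real test each `p99_at_rate(mid)` call is a full timed run at that request rate, which is why halving the interval beats stepping linearly.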
  • Load-testing tools worth exploring include ab (ApacheBench), JMeter, Siege, Locust, and wrk2.

Full post here, 13 mins read