New HTTP benchmark tool: pycurlb

Many tools exist to collect performance data about an HTTP connection, but most of them are really stress tools: they focus on launching a large number of requests and outputting statistical aggregations of latency, throughput and request rate. ApacheBench (ab), wrk, httperf: we regularly use these programs for what they do well, but their methodology isn't suited to some of our goals. We were looking for a tool able to:

  • Run only a single request: we wanted to be able to probe a link while it is idle, or while it is being hammered by another stress tool
  • Report the other connection timings: DNS resolution, SSL handshake and more

In the past, to meet these requirements, I used the well-known command line tool "Client URL Request Library", better known as cURL. This software is considered the Swiss Army knife of HTTP clients and, although it supports many more protocols, most people use it just to download a file or talk to REST APIs. But if you dig under the surface, curl is actually just a user interface for its powerful library, libcurl: where cURL exposes more than 50 command line options, libcurl lets you forge virtually any kind of HTTP request. For debugging purposes, curl has an option named --write-out that lets users export data about the connection and the response. Here's an example:

$ curl --write-out '%{time_total}' https://www.cloud-mercato.com/ -o /dev/null -s
0.230652

The command above reaches our goal, but there are options we would have to pass every single time: -o and -s, because we don't care about curl's own output. Moreover, a richer report is awkward to build from the command line: for something complex like a JSON document, you have to create a template file or fight with character escaping. To ease our work, we decided to create a tool based on libcurl and designed for exactly two tasks: run a single HTTP request and report the connection information. This is how pycurlb was born. pycurlb, short for Python cURL Benchmark, is built on pycurl, the Python wrapper around libcurl. It is a very simple command line tool that mimics curl's behavior but outputs a JSON document with a lot of information. The command equivalent to the one presented above would be:

$ pycurlb https://www.cloud-mercato.com/
{
  "appconnect_time": 5.673696,
  "compressed": false,
  "connect_time": 5.581115,
  "connect_timeout": 300,
  "content_length_download": 219.0,
  "content_length_upload": -1.0,
  "content_type": "text/html; charset=UTF-8",
  "effective_url": "https://www.cloud-mercato.com/",
  "header_size": 516,
  "http_code": 200,
  "http_connectcode": 0,
  "httpauth_avail": 0,
  "local_ip": "10.0.0.1",
  "local_port": 34740,
  "max_time": 0,
  "method": "GET",
  "namelookup_time": 5.520988,
  "num_connects": 1,
  "os_errno": 0,
  "pretransfer_time": 5.673749,
  "primary_ip": "1.2.3.4",
  "primary_port": 443,
  "proxyauth_avail": 0,
  "redirect_count": 0,
  "redirect_time": 0.0,
  "redirect_url": "https://www.cloud-mercato.com/",
  "request_size": 190,
  "size_download": 219.0,
  "size_upload": 0.0,
  "speed_download": 38.0,
  "speed_upload": 0.0,
  "ssl_engines": [
    "rdrand",
    "dynamic"
  ],
  "ssl_verifyresult": 0,
  "starttransfer_time": 5.74181,
  "total_time": 5.741879
}
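
For the curious, here is what this kind of collection looks like underneath. The sketch below is not pycurlb's actual source, only an illustration of the pycurl calls such a tool relies on: perform a single transfer, then read libcurl's counters with getinfo() and dump them as JSON.

import io
import json

import pycurl

buffer = io.BytesIO()
curl = pycurl.Curl()
curl.setopt(pycurl.URL, "https://www.cloud-mercato.com/")
curl.setopt(pycurl.WRITEDATA, buffer)  # keep the body out of stdout, we only want the metadata
curl.perform()

# Each getinfo() key below maps to one field of the JSON output above.
info = {
    "namelookup_time": curl.getinfo(pycurl.NAMELOOKUP_TIME),
    "connect_time": curl.getinfo(pycurl.CONNECT_TIME),
    "appconnect_time": curl.getinfo(pycurl.APPCONNECT_TIME),
    "pretransfer_time": curl.getinfo(pycurl.PRETRANSFER_TIME),
    "starttransfer_time": curl.getinfo(pycurl.STARTTRANSFER_TIME),
    "total_time": curl.getinfo(pycurl.TOTAL_TIME),
    "speed_download": curl.getinfo(pycurl.SPEED_DOWNLOAD),
    "http_code": curl.getinfo(pycurl.HTTP_CODE),
    "primary_ip": curl.getinfo(pycurl.PRIMARY_IP),
}
curl.close()

print(json.dumps(info, indent=2, sort_keys=True))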

Easy and useful. Let's look at the most helpful metrics:

  • namelookup_time: time to resolve the DNS name
  • connect_time: time to establish the TCP connection
  • appconnect_time: time until the SSL/TLS handshake is done and HTTP communication can start
  • pretransfer_time: time until the transfer is about to begin
  • starttransfer_time: time until the first byte is received
  • total_time: total request/response time
  • speed_download and speed_upload: throughput in each direction
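
All of these libcurl timers are cumulative, measured from the very start of the transfer, so the duration of an individual stage is the difference between two consecutive checkpoints. Here is a minimal sketch of that arithmetic, using the values from the example run above:

# Cumulative checkpoints taken from the JSON output above.
info = {
    "namelookup_time": 5.520988,
    "connect_time": 5.581115,
    "appconnect_time": 5.673696,
    "pretransfer_time": 5.673749,
    "starttransfer_time": 5.74181,
    "total_time": 5.741879,
}

dns = info["namelookup_time"]                                 # DNS resolution
tcp = info["connect_time"] - info["namelookup_time"]          # TCP handshake
tls = info["appconnect_time"] - info["connect_time"]          # SSL/TLS handshake
setup = info["pretransfer_time"] - info["appconnect_time"]    # transfer preparation
ttfb = info["starttransfer_time"] - info["pretransfer_time"]  # wait for the first byte
body = info["total_time"] - info["starttransfer_time"]        # body download

print(f"DNS {dns:.3f}s, TCP {tcp:.3f}s, TLS {tls:.3f}s, "
      f"setup {setup:.3f}s, first byte {ttfb:.3f}s, download {body:.3f}s")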

What other benchmark tools simply call latency is split here into six values, each describing a stage of an application-level TCP/IP connection. Detail is the key word, and we also tried to stay as compatible as possible with the original curl and keep the same command line arguments, so even advanced scenarios such as including custom headers should be possible. This software is an open source project hosted on GitHub. Feel free to use it, contribute or open issues, you are very welcome.