How to properly benchmark your broadband connection

For a while now my broadband connection has frequently been slow, so I wanted to run regular benchmark probes and create a graph that shows the actual uplink and downlink speeds.

Your first approach might be to download and upload a payload, measure how long each transfer took, and divide the size of the transferred files by that time. But this approach is seriously flawed… Why? Simple. In a typical setup a router terminates your internet connection, so other LAN clients may well cause traffic while you’re running your probe. That traffic takes bandwidth away from your probe and thus artificially lowers the speed you calculate.

So how do you do it properly? Ask your internet gateway (your router) how much traffic it sees.
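To make the difference concrete, here is a minimal sketch with made-up numbers. The helper name mbit_per_s, the payload size, the counter delta and the duration are all assumptions for illustration. It uses the same convention as the benchmarking script further down (1 MByte = 1024² bytes).

```shell
#!/bin/sh
# Convert a byte count and a duration in seconds to Mbit/s
# (1 MByte = 1024*1024 bytes, 8 bits per byte).
mbit_per_s() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.2f\n", bytes / secs / (1024 * 1024) * 8 }'
}

# Hypothetical probe: a 10 MByte payload fetched in 8 seconds, while the
# router's WAN counter grew by 16 MByte because another LAN client was busy.
payload_bytes=$((10 * 1024 * 1024))
wan_delta_bytes=$((16 * 1024 * 1024))
probe_secs=8

echo "naive estimate:   $(mbit_per_s "$payload_bytes" "$probe_secs") Mbit/s"
echo "gateway estimate: $(mbit_per_s "$wan_delta_bytes" "$probe_secs") Mbit/s"
```

With these numbers the payload-based figure undersells the line by more than a third, while the counter-based figure includes the other client's traffic.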

I’m running OpenWRT on my router. My first approach was to use SNMP to retrieve the interface statistics. But that didn’t work out, since the SNMP interface statistics are only updated periodically (about every 10 seconds, it seems), so my measurements of the traffic that flowed through the WAN port (the uplink) of my router were very imprecise.

Then I had a great idea. ifconfig shows the current RX and TX byte counters for each interface. So I created two tiny CGI scripts that output the current values. On my intranet server, where cron runs the benchmark every 5 minutes, I can easily retrieve the current values with a simple HTTP request (for example a curl command) to the CGI URL.

The script that does the actual benchmark probes is modeled on Vodafone Germany’s speed check. I snooped the traffic to understand what they do, and my script basically does the same thing, except that I don’t download or send the smaller payloads, just the largest ones. I also run two of these downloads or uploads at the same time, exactly as Vodafone does.

OK, so here are the two CGI scripts that read the interface counters:

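#!/bin/ash
# get-rx-bytes.sh: print the "RX bytes" counter of the WAN interface.
# "read" strips the leading whitespace, so the line starts with "RX bytes:".
RX_BYTES=$(ifconfig pppoe-wan | grep "RX bytes" | while read l; do expr "$l" : 'RX bytes:\([0-9]*\) (.*$'; done)

echo "Content-Type: text/plain"
echo ""
echo "$RX_BYTES"

#!/bin/ash
# get-tx-bytes.sh: print the "TX bytes" counter of the WAN interface.
# "TX bytes:" sits further right on the same line, hence the leading ".*".
TX_BYTES=$(ifconfig pppoe-wan | grep "TX bytes" | while read l; do expr "$l" : '.*TX bytes:\([0-9]*\) (.*$'; done)

echo "Content-Type: text/plain"
echo ""
echo "$TX_BYTES"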
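The counter extraction can be tried without a router. The following is a minimal sketch that uses sed instead of expr (a swapped-in technique, not what the scripts above do). The function name extract_bytes and the sample counter line are assumptions for illustration; the line only mimics the format the expr patterns above expect.

```shell
#!/bin/sh
# Extract one byte counter ("RX" or "TX") from ifconfig-style text on stdin.
extract_bytes() {
    sed -n "s/.*$1 bytes:\([0-9][0-9]*\) .*/\1/p"
}

# Hypothetical sample of the counter line printed by ifconfig.
sample='          RX bytes:123456789 (117.7 MiB)  TX bytes:9876543 (9.4 MiB)'

printf '%s\n' "$sample" | extract_bytes RX
printf '%s\n' "$sample" | extract_bytes TX
```

On the router the same function could be fed from the real interface, for example with ifconfig pppoe-wan | extract_bytes RX.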

And here’s the actual benchmarking script:

#!/usr/bin/env perl
use warnings;
use strict;
use Config;
use Time::HiRes qw ( time );
use LockFile::Simple qw(unlock trylock);
use File::Basename;
use LWP::Simple;
use threads ('yield',
	     'stack_size' => 64*4096,
	     'exit' => 'threads_only',
	     'stringify');
use LWP::UserAgent;
use RRDs;  # provides RRDs::updatev and RRDs::error, which main() calls

use constant DEBUG => 0;

# URL which will return the current "RX bytes" and "TX bytes" counters on WAN interface
use constant RX_BYTES_URL => "http://gw/cgi-bin/get-rx-bytes.sh";
use constant TX_BYTES_URL => "http://gw/cgi-bin/get-tx-bytes.sh";

use constant URL_DN => "http://www.speedcheck.vodafone.de/speedtest/random2000x2000.jpg";
use constant URL_UP => "http://www.speedcheck.vodafone.de/speedtest/upload.php";

use constant RRD_FILENAME => "netbench.rrd";

# Function prototypes
sub main();
sub rnd_str(@);

main();

sub do_download() {
    my $before = time();

    # Download dummy file twice, in two parallel threads
    my $thr1 = threads->create(sub { get(URL_DN . '?y=1'); });
    my $thr2 = threads->create(sub { get(URL_DN . '?y=2'); });
    $thr1->join();
    $thr2->join();

    my $after = time();

    my $time_dn = $after - $before;
    return($time_dn);
}

sub do_upload() {
    my $payload = rnd_str 54784, 'A'..'Z';

    my $ua = LWP::UserAgent->new;
    $ua->timeout(10);

    my $b4 = time();
    my $thr1 = threads->create(sub { my $response = $ua->post(URL_UP, Content_Type => 'application/x-www-form-urlencoded',
							      Content => [content0 => $payload, content1 => $payload,
									  content2 => $payload, content3 => $payload]);});
    my $thr2 = threads->create(sub { my $response = $ua->post(URL_UP, Content_Type => 'application/x-www-form-urlencoded',
							      Content => [content0 => $payload, content1 => $payload,
									  content2 => $payload, content3 => $payload]);});
    $thr1->join();
    $thr2->join();

    my $aftr = time();

    return($aftr - $b4);
}

# Generate random string of specified length from specified set of characters
# Usage: print rnd_str 8, 'a'..'z', 0..9;
sub rnd_str(@) {
    join'', @_[ map{ rand @_ } 1 .. shift ]
}

sub main() {
    my $LOCKFILE_DIR = "/run/lock/";
    if ("$Config{osname}" eq "darwin") {
	$LOCKFILE_DIR = "/var/tmp/";
    }

    my $LOCKFILE = basename($0, ".pl");
    $LOCKFILE = $LOCKFILE_DIR . $LOCKFILE;
    die "Cannot obtain lock ${LOCKFILE}.lock, already locked.\n" unless trylock($LOCKFILE);

    my $in_octets_1 = get(RX_BYTES_URL);
    my $time_dn = do_download();
    my $in_octets_2 = get(RX_BYTES_URL);

    if (DEBUG) {
	printf "Total octets downloaded: %d = %.1f MByte\n", $in_octets_2 - $in_octets_1,
		  ($in_octets_2 - $in_octets_1) / 1024**2;
	printf "Download took %.2f sec\n", $time_dn;
    }

    my $out_octets_1 = get(TX_BYTES_URL);
    my $time_up = do_upload();
    my $out_octets_2 = get(TX_BYTES_URL);

    if (DEBUG) {
	printf "Total octets uploaded: %d = %.1f MByte\n", $out_octets_2 - $out_octets_1,
		  ($out_octets_2 - $out_octets_1) / 1024**2;
	printf "Upload took %.2f sec\n", $time_up;
    }

    my $down = ($in_octets_2 - $in_octets_1) / $time_dn / 1024**2 * 8;
    my $up = ($out_octets_2 - $out_octets_1) / $time_up / 1024**2 * 8;
    my $now = time();
    printf "%d\t%.2f\t%.2f\n", $now, $down, $up;

    my $values = sprintf("%d:%.2f:%.2f", $now, $down, $up);
    RRDs::updatev (RRD_FILENAME, "--template", "down:up", $values);
    my $err = RRDs::error;
    if ($err) {
        warn "ERROR while updating ", RRD_FILENAME, ": ", $err;
    }

    unlock($LOCKFILE);
    exit(0);

}

The output of my script looks like this:

1394096383	11.59	0.67

That’s three values separated by tabs: the first is the time in seconds since the epoch, the second is the download speed in MBit/s, and the third is the upload speed in MBit/s (the script counts 1 MByte as 1024² bytes).
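If the script's output is collected in a log file, the tab-separated format is easy to post-process. Here is a minimal sketch that averages both speed columns. The function name summarize is an assumption, and only the first sample line comes from the output above; the second line is made up.

```shell
#!/bin/sh
# Average the download and upload columns of tab-separated probe lines on stdin.
summarize() {
    awk -F '\t' '
        NF == 3 { down += $2; up += $3; n++ }
        END {
            if (n > 0)
                printf "%d samples, avg down %.2f, avg up %.2f\n", n, down / n, up / n
        }
    '
}

# The first line is the sample shown above; the second one is hypothetical.
printf '1394096383\t11.59\t0.67\n1394096683\t10.41\t0.61\n' | summarize
```

Lines that do not have exactly three fields are skipped, so stray warnings in the log do not distort the averages.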

The script above also updates an RRD archive with the current throughput values. drraw, a very nice frontend to RRDtool, can then be used to create graphs from that RRD file.

To create the RRD database I used the following command:

rrdtool create netbench.rrd --start 1392936675 --step 300 \
    DS:down:GAUGE:600:0:20 DS:up:GAUGE:600:0:1 \
    RRA:AVERAGE:0.5:1:576 RRA:AVERAGE:0.5:3:960 \
    RRA:AVERAGE:0.5:6:1920 RRA:AVERAGE:0.5:12:4320 \
    RRA:AVERAGE:0.5:72:7300

This creates two data sources, down and up, and five round-robin archives: one for 48 hours with 5 min precision, one for 10 days with 15 min precision, one for 40 days with 30 min precision, one for 180 days with 1 h precision, and one for 5 years with 6 h precision.
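The retention of each archive follows from step × steps per row × rows. As a quick check of the figures above, here is a minimal sketch; the helper name rra_hours is an assumption, while the numbers are taken from the rrdtool create command.

```shell
#!/bin/sh
STEP_SECS=300   # the --step value from the rrdtool create command

# Hours covered by an RRA that consolidates "steps" primary data points
# per row (first argument) and keeps "rows" rows (second argument).
rra_hours() {
    echo $(( STEP_SECS * $1 * $2 / 3600 ))
}

for rra in "1 576" "3 960" "6 1920" "12 4320" "72 7300"; do
    set -- $rra   # split the pair into $1 (steps) and $2 (rows)
    echo "steps=$1 rows=$2 -> $(rra_hours "$1" "$2") hours"
done
```

The five results are 48, 240, 960, 4320 and 43800 hours, which is 48 hours, 10 days, 40 days, 180 days and 1825 days (5 years).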

In case you wonder about those “strange” time ranges: I like to have some extra time on top of the standard ranges, so instead of 24 hours I keep 48 hours, and instead of a week I keep 10 days.

The consolidation function to use is AVERAGE.

So, this is it. I hope you find this useful. If you do, please let me know. 🙂
