Varnish


Varnish is one of my favorite things for building scalable web services these days - it can dramatically improve just about any site’s performance, and with a few configuration tweaks it can even provide per-session caching and other niceties.

It’s used as this site’s primary front-end, and has served me well so far.

Capturing varnishncsa output for R

I’ve been playing around with R, and HTTP access logs are always an interesting dataset - easy to understand, but hard to get meaningful results out of.

So I built the following script to capture varnishncsa output and build a table of timestamped requests listing the host, wiki namespace, HTTP result code and (crucially for me) cache misses and back-end response times (Varnish can log the time from request to first byte from the back-end):

import csv
import time
from subprocess import Popen, PIPE

# the fields we want, in output order
fields = ['timestamp','hostname','namespace','page','result','size','responsetime','cache']

# request line, status, size, time to first byte and cache handling, comma-separated
child = Popen(["varnishncsa","-F","%r,%s,%b,%{Varnish:time_firstbyte}x,%{Varnish:handling}x"],
              stdout=PIPE, universal_newlines=True)

with open('output.csv','w',newline='') as f:
    o = csv.writer(f)
    # output the header
    o.writerow(fields)
    # read one line at a time so each request is timestamped as it arrives
    for line in iter(child.stdout.readline, ''):
        try:
            # split from the right, since the URL itself may contain commas
            (req,result,size,responsetime,cache) = line.strip().rsplit(',',4)
            (method,url,protocol) = req.split(' ')
        except ValueError:
            # skip malformed request lines
            continue
        # try to split the GET url HTTP/1.0 stuff into components of interest
        try:
            (dummy,dummy,hostname,namespace,page) = url.split('/',4)
        except ValueError:
            (dummy,dummy,hostname,dummy) = url.split('/',3)
            namespace = page = ''
        # two decimal places are plenty for request timestamps
        timestamp = str(round(time.time(), 2))
        # varnishncsa logs some requests with a null size
        if size == '-':
            size = ''
        row = [timestamp,hostname,namespace,page,result,size,responsetime,cache]
        print(','.join(row))
        o.writerow(row)

This outputs lines in the following format:

timestamp,hostname,namespace,page,result,size,responsetime,cache
1336745775.29,taoofmac.com,space,HOWTO/Setup/daapd,200,9076,0.038134813,miss
1336745775.39,taoofmac.com,themes,serif/css/serif-min.css,200,5997,0.000071287,hit
1336745775.49,taoofmac.com,themes,serif/js/site-min.js,200,41993,0.000078678,hit
1336745775.84,taoofmac.com,themes,serif/img/noise.png,200,8431,0.000062704,hit
1336745775.84,taoofmac.com,themes,serif/img/sitelogo_2011.png,200,16346,0.000038862,hit
1336745775.94,taoofmac.com,themes,serif/img/error.png,200,666,0.000059605,hit
1336745784.61,the.taoofmac.com,space,RecentChanges?format=rss,302,167,0.000052452,hit
1336745787.52,planet.taoofmac.com,,,404,358,0.000077963,hit
1336745797.29,the.taoofmac.com,,,200,129190,0.094822168,miss
1336745802.3,the.taoofmac.com,space,HOWTO/Merge%20Folders,304,0,0.006964445,pass

See Also: