Planet Varnish

October 20, 2016

Ingvar Hagelund: varnish-5.0, varnish-modules-0.9.2 and hitch-1.4.1, packages for Fedora and EPEL

The Varnish Cache project recently released varnish-5.0, and Varnish Software released hitch-1.4.1. I have wrapped packages for Fedora and EPEL.

varnish-5.0 has configuration changes, so the updated package has been pushed to rawhide, but will not replace the ones currently in EPEL nor in Fedora stable. Those who need varnish-5.0 for EPEL may use my COPR repos at https://copr.fedorainfracloud.org/coprs/ingvar/varnish50/. They include the varnish-5.0 and matching varnish-modules packages, and are compatible with EPEL 5, 6, and 7.

hitch-1.4.1 is configuration-file compatible with earlier releases, so packages for Fedora and EPEL are available in their respective repos, or will be once they trickle down to stable.

As always, feedback is warmly welcome. Please report via Red Hat’s Bugzilla or, while the packages are cooking in testing, Fedora’s Package Update System.

Varnish Cache is a powerful and feature rich front side web cache. It is also very fast, and that is, fast as in powered by The Dark Side of the Force. On steroids. And it is Free Software.

Redpill Linpro is the market leader for professional Open Source and Free Software solutions in the Nordics, though we have customers from all over. For professional managed services, all the way from small web apps, to massive IPv4/IPv6 multi data center media hosting, and everything through container solutions, in-house, cloud, and data center, contact us at www.redpill-linpro.com.

August 10, 2016

Ingvar Hagelund: varnish-4.1.3 and varnish-modules-0.9.1 for Fedora and EPEL

The Varnish Cache project recently released varnish-4.1.3 and varnish-modules-0.9.1. Of course, we want updated rpms for Fedora and EPEL.

While there are official packages for el6 and el7, I prefer to use my Fedora downstream package for EPEL as well. So I have pushed updates for Fedora, and updated COPR builds for epel5, epel6, and epel7.

An update of the officially supported bundle of Varnish modules, varnish-modules-0.9.1, was also released a few weeks ago. I recently wrapped it for Fedora, and it is awaiting review in BZ #1324863. Packages for epel5, epel6, and epel7 are in COPR as well.

Fedora updates for varnish-4.1.3 may be found at https://bodhi.fedoraproject.org/updates/?packages=varnish

The Copr repos for epel are here: https://copr.fedorainfracloud.org/coprs/ingvar/varnish41/

Tests and reports are very welcome.



April 27, 2016

Ingvar Hagelund: hitch-1.2.0 for Fedora and EPEL

Hitch is a libev-based high-performance SSL/TLS proxy. It is developed by Varnish Software, and may be used to add HTTPS support in front of Varnish Cache.

hitch-1.2.0 was recently released. Among the new features in 1.2.0 is more granular per-site configuration. Packages for Fedora and EPEL6/7 were submitted for testing today. Please test and report feedback.


April 16, 2016

Tollef Fog Heen: Blog moved, new tech

I moved my blog around a bit and it appears that static pages are now in favour, so I switched to that, by way of Hugo. CSS and such need more tweaking, but it’ll make do for now.

As part of this, the RSS feeds and such changed. If you want to subscribe to this (very seldom updated) blog, use https://err.no/personal/blog/index.xml

December 25, 2015

Kacper Wysocki: Full-stack Varnish all the things

While attending VUGX Rotterdam, which was a jolly time for the Varnish family, I was pleasantly surprised by some little-known but very useful new things available for your site.

Varnish Security Firewall

I’ll tell you about them in a moment, but I’ll start by plugging VSF: The Varnish Security Firewall, a really useful security framework for Varnish 4 that we recently made really, really easy to install. You can give it a spin right now to stop many of the most common attacks against your website today.

Shameless plugs out of the way: Varnish itself is painless to operate, but most sites out there also need to run web servers, encrypt their traffic and load balance, and all of this together becomes pretty messy, requiring haproxy, nginx and sticky putty goo in a glorious sandwich.


SSL frontend with Hitch

One problem that keeps popping up is the lack of SSL support. No, Varnish is not going to link OpenSSL before hell freezes over in PHK’s basement, because let’s be honest: it belongs in its own binary. Still, you can now replace your heavy-handed Nginx stack with the lightweight Hitch, a revamped stud, which even has rudimentary Let’s Encrypt support.


An SSL frontend config for Hitch is just a handful of lines:

frontend = "[*]:443"
backend  = "[127.0.0.1]:8000"
pem-file = "/etc/ssl/private/projects.hackeriet.pem"
user = nobody
ciphers = "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH"
write-proxy = on

With the write-proxy = on option, we can even use the PROXY protocol to pass the client.ip to Varnish instead of trusting X-Forwarded-For, but we’ll have to update Varnish to accept it by copying /lib/systemd/system/varnish.service to /etc/systemd/system/varnish.service and editing ExecStart to include -a :8000,PROXY like so:

ExecStart=/usr/sbin/varnishd -a :80 -a :8000,PROXY -a :6081 -T localhost:6082 -f /etc/varnish/default.vcl -S /etc/varnish/secret -s malloc,256m

On the backend side, it’s fairly straightforward to write SSL-backends for Varnish using stunnel.
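A minimal stunnel client-mode configuration for such an SSL backend might look like the following sketch. The service name, local port and backend hostname are illustrative assumptions, not from the post:

```ini
; stunnel in client mode: Varnish talks plain HTTP to 127.0.0.1:4433,
; and stunnel re-encrypts the connection towards the real TLS backend
[tls-backend]
client = yes
accept = 127.0.0.1:4433
connect = backend.example.com:443
```

A VCL backend pointing at 127.0.0.1:4433 then reaches the TLS-protected origin.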

Varnish as a Web Server

For years we’ve been half-joking that what Varnish needs is web server capabilities, but little did we know that this has been around for years!


Today we can all enjoy Martin’s fsbackend vmod, which lets us serve (and cache) static files straight off the file system in a secure and fast way. It doesn’t come with much documentation though, so here’s how you use it to serve your sites with the standard “index.html”, some minimal trailing-slashes support and some content-typing:

vcl 4.0;
import fsbackend;
sub vcl_init {
  new webroot = fsbackend.root("/var/www");
  webroot.add_header("Cache-Control: public, max-age=3600");
} 
sub vcl_backend_fetch {
  if(bereq.url ~ "^/mysite") {
    set bereq.url = regsub(bereq.url, "/mysite/?", "/");
    if(bereq.url ~ "/$") {
      set bereq.url = bereq.url + "index.html";
    }
    set bereq.backend = webroot.backend();
  }
}
sub vcl_backend_response {
  if (bereq.url ~ "\.html") {
    set beresp.http.Content-Type = "text/html";
  } elsif (bereq.url ~ "\.txt") {
    set beresp.http.Content-Type = "text/plain; charset=utf8";
  } else {
    set beresp.http.Content-Type = "text/plain";
  }
}


This module was historically preceded by Dridi’s efforts for Varnish 3 and 4, which by his own words were “a hack”, as well as std.file(), which was not good enough for serving web pages.

Coupled with Hitch for SSL, this has allowed me to drop the Nginx sandwich completely on the static parts of my sites.

Regexes with sub-captures


Innate regexes are awesome in Varnish, but sub-captures are barely accessible and usually force you to copy and match a single string many times to get multiple captures. This is both messy and inefficient. Enter the re vmod from Geoff at Uplex, which lets us access all the sub-captures of a single match efficiently:

import re;
sub vcl_init {
  new myregex = re.regex("bar(\d+)");
}
sub vcl_recv {
  if (myregex.match(req.http.Foo)) {
    set req.http.Baz = myregex.backref(1, "");
  }	
}
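For comparison, the same match-once-capture-many idea sketched with Python’s re module (the pattern and input are made up for illustration):

```python
import re

# Compile once, match once: every sub-capture is then available from
# the match object without re-running the pattern, which is the idea
# the re vmod brings to VCL.
pattern = re.compile(r"(\w+)=(\d+);(\w+)=(\d+)")
m = pattern.match("foo=1;bar=2")
groups = m.groups() if m else ()
print(groups)  # ('foo', '1', 'bar', '2')
```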

Consistent Hashing

If you’re running a really large site where you’re regularly pinning your server bandwidth you know that load-balancing and sharding is where it’s at. That’s where the stateless persistent hashing vmod that Niels from Uplex presented comes in.
When a frontend request hits your cluster, the vslp vmod looks up which cache is the master for that request, and then picks which backend owns the request in a consistent way across n servers. If you ever add frontends or backends, only about 1/(n+1) of the objects have to be rebalanced, giving you a limited extra load cost for adding new servers. The code has been in heavy production for the last couple of years!

This means you can keep adding frontends and backends to your pool, and you can round-robin load balance while scaling (near-) linearly.
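The rebalancing property is easy to demonstrate with a toy consistent-hash ring. This is a from-scratch sketch of the general technique, not the vslp vmod’s actual algorithm; node names and key patterns are invented:

```python
import hashlib
from bisect import bisect

def ring(nodes, vnodes=100):
    """Place each node at many pseudo-random points on a hash ring."""
    points = []
    for node in nodes:
        for i in range(vnodes):
            h = int(hashlib.md5(f"{node}:{i}".encode()).hexdigest(), 16)
            points.append((h, node))
    points.sort()
    return points

def owner(points, key):
    """The first node clockwise from the key's hash owns the key."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    i = bisect(points, (h,))
    return points[i % len(points)][1]

keys = [f"/article/{i}" for i in range(5000)]
before = ring(["cache1", "cache2", "cache3"])
after = ring(["cache1", "cache2", "cache3", "cache4"])
moved = sum(1 for k in keys if owner(before, k) != owner(after, k))
print(moved / len(keys))  # roughly 1/4: only the new node's share moves
```

Going from 3 to 4 nodes, a plain modulo scheme would reshuffle around three quarters of the keys; the ring moves only about a quarter of them.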

Dynamic Backends

When running Varnish on EC2 with an ELB, it’s really tricky to keep the backend list up to date. Dynamic backend vmods will fix this problem Real Soon Now.

With consistent hashing in the Varnish core, and dynamic backends, we might no longer need haproxy nor any other messy load balancer, simplifying the stack for quite a lot of sites out there.

There are also rumours of a FCGI or even WSGI vmod, which will allow us to forgo the ancient dogmatic cows of web adminning entirely, and go Full Stack Varnish. Let’s just pray that day will come soon enough.

Ho Ho Ho! I wish you a merry X-Forwarded-For!

PS The title is a joke, and so is everything “full stack”.

December 20, 2015

Kristian Lyngstøl: Varnish Foo - Chapter 3


I have just pushed the third chapter of Varnish Foo online. Unlike the first two chapters, I will not post the entire chapter here on my blog.

Instead, you can head over to the official location to read it:

https://varnishfoo.info/chapter-3.html

There you will also find the first two chapters and the appendices, along with a PDF version.

I still expect changes to the chapter, but the content is largely complete and ready to be consumed. The front page of https://varnishfoo.info has instructions for how to best give feedback. I've already started getting some, and hope to get more as time passes.


December 11, 2015

Lasse Karstensen: Varnish Device detection looking for maintainer

We’re looking to have someone take over maintenance of the Varnish Device detection VCL set.

The last couple of years we haven’t really served it right, and it is time to let someone with more practical/hands-on experience take over.

I’ve written an explanation in the README.rst file. The short version is that we’re good at fixing Varnish crashes and configuring Varnish; expert knowledge about User-Agents and reasonable ways to group 2015-era clients is something we don’t really have.

Compensation is the usual for open source community projects: minor fame from your peers and some contributor gear. T-shirts and Varnish Cache coffee mugs in abundance shall be yours.

So, if you’re up for it, please drop me an email and we’ll see what we can figure out.

 


December 02, 2015

Kristian Lyngstøl: Varnish Foo - Working With HTTP caching


Note

This is the second chapter of Varnish Foo, the book I'm writing on Varnish cache. You can find the source code at https://github.com/KristianLyng/varnishfoo . Feedback welcome.

Before you dig into the inner workings of Varnish, it's important to make sure you have the tools you need and some background information on basic caching.

This chapter looks at how HTTP caching works at multiple points in the delivery chain, and how these mechanisms work together. Not every aspect of HTTP caching is covered, but those relevant to Varnish are covered in detail, including several browser-related concerns.

There are a multitude of tools to choose from when you are working with Varnish. This chapter provides a few suggestions and a quick guide to each tool, but makes no claim about which tool is best. The goal is to establish what sort of tasks your chosen tool needs to be able to accomplish.

Only the absolute minimum of actual Varnish configuration is covered - yet several mechanisms to control Varnish through backend responses are provided. Most of these mechanisms are well defined in the HTTP/1.1 standard, RFC 2616.

Tools: The browser

A browser is an important tool. Most of today's web traffic is, unsurprisingly, through a web browser. Therefore, it is important to be able to dig deeper into how browsers work with regard to caching. Most browsers have a developer or debug console, but we will focus on Chrome.

Both Firefox and Chrome will open the debug console if you hit <F12>. It's a good habit to test and experiment with more than one browser, and luckily these consoles are very similar. A strong case in favor of Chrome is Incognito Mode, activated through <Ctrl>+<Shift>+N. This is an advantage both because it removes old cookies and because most extensions are disabled. Most examples use Chrome to keep things consistent and simple, but could just as well have been performed on Firefox.

The importance of Incognito Mode can be easily demonstrated. The following is a test with a typical Chrome session:

/img/chromium-dev-plugins.png

Notice the multiple extensions that are active, one of them is inserting a bogus call to socialwidgets.css. The exact same test in Incognito Mode:

/img/chromium-dev-incognito.png

The extra request is gone. Regardless of browser choice, your test environment should be devoid of most extensions and let you easily get rid of all cookies.

You will also quickly learn that a refresh isn't always just a refresh. In both Firefox and Chrome, a refresh triggered by <F5> or <Ctrl>+r will be "cache aware". What does that mean?

Look closer at the screenshots above, especially the return code. The return code is a 304 Not Modified, not a 200 OK. The browser had the image in cache already and issued a conditional GET request. A closer inspection:

/img/chromium-dev-304-1.png

The browser sends Cache-Control: max-age=0 and an If-Modified-Since-header. The web server correctly responds with 304 Not Modified. We'll look closer at those, but for now, let's use a different type of refresh: <Shift>+<F5> in Chrome or <Shift>+<Ctrl>+r in Firefox:

/img/chromium-dev-304-2.png

The cache-related headers have changed somewhat, and the browser is no longer sending an If-Modified-Since header. The result is a 200 OK with a response body instead of an empty 304 Not Modified.

These details are both the reason you need to test with a browser - because this is how they operate - and why a simpler tool is needed in addition to the browser.

Tools: The command line tool

The browser does a lot more than issue HTTP requests, especially with regard to caching. A good request synthesizer is a must for debugging and experimenting with HTTP and HTTP caching without stumbling over the browser. There are countless alternatives available.

Your requirements for a simple HTTP request synthesizer should be:

  • Complete control over request headers and request method - even invalid input.
  • Stateless behavior - no caching at all
  • Show complete response headers.

Some suggestions for Windows users are curl in PowerShell, Charles Web Debugging Proxy, the "Test and REST Client" in PhpStorm, an "Advanced REST client" Chrome extension, or simply SSH'ing to a GNU/Linux VM and using one of the many tools available there. The list goes on, and so it could for Mac OS X and Linux too.

HTTPie is a small CLI tool which has the above properties. It's used throughout this book because it is a good tool, but also because it's easy to see what's going on without knowledge of the tool.

HTTPie is available on Linux, Mac OS X and Windows. On a Debian or Ubuntu system HTTPie can be installed with apt-get install httpie. For other platforms, see http://httpie.org. Testing httpie is simple:

$ http http://kly.no/misc/dummy.png
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 178
Content-Type: image/png
Date: Wed, 25 Nov 2015 18:49:33 GMT
Last-Modified: Wed, 02 Sep 2015 06:46:21 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15849590



+-----------------------------------------+
| NOTE: binary data not shown in terminal |
+-----------------------------------------+

In many situations, the actual data is not that interesting, while a full set of request headers is very interesting. HTTPie can show us exactly what we want:

$ http -p Hh http://kly.no/misc/dummy.png
GET /misc/dummy.png HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: kly.no
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 81
Connection: keep-alive
Content-Length: 178
Content-Type: image/png
Date: Wed, 25 Nov 2015 18:49:33 GMT
Last-Modified: Wed, 02 Sep 2015 06:46:21 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: HIT from access-gateway.hospitality.swisscom.com
X-Varnish: 15849590

The -p option to http can be used to control output. Specifically:

  • -p H will print request headers.
  • -p h will print response headers.
  • -p B will print request body.
  • -p b will print response body.

These can be combined. In the above example, -p H and -p h combine to form -p Hh. See http --help and man http for details. Be aware that there has in the past been some mismatch between the actual command line arguments and what the documentation claims; this depends on the version of HTTPie.

The example shows the original request headers and full response headers.

Faking a Host header is frequently necessary to avoid changing DNS just to test a Varnish setup. A decent request synthesizer like HTTPie does this:

$ http -p Hh http://kly.no/ "Host: example.com"
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host:  example.com
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Wed, 25 Nov 2015 18:58:10 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Transfer-Encoding: chunked
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15577233

Adding other headers is done the same way:

$ http -p Hh http://kly.no/ "If-Modified-Since: Tue, 24 Nov 2015 20:51:14 GMT"
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: kly.no
If-Modified-Since:  Tue, 24 Nov 2015 20:51:14 GMT
User-Agent: HTTPie/0.8.0

HTTP/1.1 304 Not Modified
Age: 5
Connection: keep-alive
Content-Encoding: gzip
Content-Type: text/html
Date: Wed, 25 Nov 2015 18:59:28 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15880392 15904200

We just simulated what our browser did, and verified that it really was the If-Modified-Since header that made the difference earlier. To send multiple headers, just list them one after another:

$ http -p Hh http://kly.no/ "Host: example.com" "User-Agent: foo" "X-demo: bar"
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host:  example.com
User-Agent:  foo
X-demo:  bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 10
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 24681
Content-Type: text/html
Date: Wed, 25 Nov 2015 19:01:08 GMT
Last-Modified: Tue, 24 Nov 2015 20:51:14 GMT
Server: Really new stuff so people don't complain
Via: 1.1 varnish-v4
X-Cache: MISS from access-gateway.hospitality.swisscom.com
X-Varnish: 15759349 15809060

Tools: A web server

Regardless of what web server is picked as an example in this book, it's the wrong one. So the first on an alphabetical list was chosen: Apache.

Any decent web server will do what you need. What you want is a web server where you can easily modify response headers. If you are comfortable doing that with NodeJS or some other slightly more modern tool than Apache, then go ahead. If you really don't care and just want a test environment, then keep reading. To save some time, these examples are oriented around Debian and/or Ubuntu-systems, but largely apply to any modern GNU/Linux distribution (and other UNIX-like systems).

Note that commands that start with # are executed as root, while commands starting with $ can be run as a regular user. This means you either have to login as root directly, through su - or sudo -i, or prefix the command with sudo if you've set up sudo on your system.

The first step is getting it installed and configured:

# apt-get install apache2
(...)
# a2enmod cgi
# cd /etc/apache2
# sed -i 's/80/8080/g' ports.conf sites-enabled/000-default.conf
# service apache2 restart

This installs Apache httpd, enables the CGI module, changes the listening port from port 80 to 8080, then restarts the web server. The listening port is changed because eventually Varnish will take up residence on port 80.

You can verify that it works through two means:

# netstat -nlpt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State PID/Program name
tcp6       0      0 :::8080                 :::*                    LISTEN 1101/apache2
# http -p Hh http://localhost:8080/
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Connection: Keep-Alive
Content-Encoding: gzip
Content-Length: 3078
Content-Type: text/html
Date: Wed, 25 Nov 2015 20:23:09 GMT
ETag: "2b60-525632b42b90d-gzip"
Keep-Alive: timeout=5, max=100
Last-Modified: Wed, 25 Nov 2015 20:19:01 GMT
Server: Apache/2.4.10 (Debian)
Vary: Accept-Encoding

netstat reveals that apache2 is listening on port 8080. The second command issues an actual request. Both are useful to ensure the correct service is answering.

To provide a platform for experimenting with response headers, it's time to drop in a CGI script:

# cd /usr/lib/cgi-bin
# cat > foo.sh <<'_EOF_'
#!/bin/bash
echo "Content-type: text/plain"
echo
echo "Hello. Random number: ${RANDOM}"
date
_EOF_
# chmod a+x foo.sh
# ./foo.sh
Content-type: text/plain

Hello. Random number: 12111
Wed Nov 25 20:26:59 UTC 2015

You may want to use an editor, like nano, vim or emacs instead of using cat. To clarify, the exact content of foo.sh is:

#!/bin/bash
echo "Content-type: text/plain"
echo
echo "Hello. Random number: ${RANDOM}"
date

We then change permissions for foo.sh, making it executable by all users, then verify that it does what it's supposed to. If everything is set up correctly, scripts under /usr/lib/cgi-bin are accessible through http://localhost:8080/cgi-bin/:

# http -p Hhb http://localhost:8080/cgi-bin/foo.sh
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:31:00 GMT
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 12126
Wed Nov 25 20:31:00 UTC 2015

If you've been able to reproduce the above example, you're ready to start testing and experimenting.

Tools: Varnish

We need an intermediary cache, and what better example than Varnish? We'll refrain from configuring Varnish beyond the defaults for now, though.

For now, let's just install Varnish. This assumes you're using a Debian or Ubuntu system and that you have a web server listening on port 8080, since Varnish expects a backend web server on port 8080 by default:

# apt-get install varnish
# service varnish start
# http -p Hhb http://localhost:6081/cgi-bin/foo.sh
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:38:09 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5

Hello. Random number: 26
Wed Nov 25 20:38:09 UTC 2015

As you can see from the above example, a typical Varnish installation listens on port 6081 by default, and uses 127.0.0.1:8080 as the backend web server. If the above example doesn't work, you can change the listening port of Varnish by altering the -a argument in /etc/default/varnish, and the backend web server can be changed in /etc/varnish/default.vcl; either change requires a service varnish restart to take effect. We'll cover both of these files in detail in later chapters.

Conditional GET requests

In the tool examples earlier we saw real examples of conditional GET requests. In many ways, they are quite simple mechanisms that allow an HTTP client - typically a browser - to verify that it has the most up-to-date version of an HTTP object. There are two different types of conditional GET requests: If-Modified-Since and If-None-Match.

If a server sends a Last-Modified header, the client can issue an If-Modified-Since header on later requests for the same content, indicating that the server only needs to transmit the response body if it has been updated.

Sometimes it isn't trivial to know the modification time, but you might be able to uniquely identify the content anyway. For that matter, the content might have been changed back to a previous state. This is where the entity tag, or ETag response header, is useful.

An ETag header can be used to provide an arbitrary ID for an HTTP response, and the client can then re-use that in an If-None-Match request header.
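The server-side decision behind both mechanisms can be sketched in a few lines of Python. This is a simplification for illustration - real servers also handle weak validators, lists of ETags in If-None-Match, and more:

```python
from email.utils import parsedate_to_datetime

def conditional_get(resp_headers, body, req_headers):
    """Return (status, body): 304 with no body when the client's cached
    copy is still current, 200 with the full body otherwise."""
    etag = resp_headers.get("ETag")
    if etag is not None and req_headers.get("If-None-Match") == etag:
        return 304, b""
    last_mod = resp_headers.get("Last-Modified")
    ims = req_headers.get("If-Modified-Since")
    if last_mod and ims and \
            parsedate_to_datetime(last_mod) <= parsedate_to_datetime(ims):
        return 304, b""
    return 200, body

hdrs = {"ETag": "xyzzy", "Last-Modified": "Tue, 24 Nov 2015 20:51:14 GMT"}
print(conditional_get(hdrs, b"<html>", {"If-None-Match": "xyzzy"}))  # (304, b'')
print(conditional_get(hdrs, b"<html>", {}))                          # (200, b'<html>')
```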

Modifying /usr/lib/cgi-bin/foo.sh, we can make it provide a static ETag header:

#!/bin/bash
echo "Content-type: text/plain"
echo "Etag: testofetagnumber1"
echo
echo "Hello. Random number: ${RANDOM}"
date

Let's see what happens when we talk directly to Apache:

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:43:25 GMT
Etag: testofetagnumber1
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 51126
Wed Nov 25 20:43:25 UTC 2015

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:43:28 GMT
Etag: testofetagnumber1
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 12112
Wed Nov 25 20:43:28 UTC 2015

Two successive requests yielded updated content, but with the same Etag. Now let's see how Varnish handles this:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:44:53 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32770

Hello. Random number: 5213
Wed Nov 25 20:44:53 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 2
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:44:53 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32773 32771

Hello. Random number: 5213
Wed Nov 25 20:44:53 UTC 2015

It's pretty easy to see the difference in the output. However, there are two interesting things happening here. First, the Etag doesn't matter for this test because we never send If-None-Match! So our http command gets a 200 OK, not the 304 Not Modified that we were looking for. Let's try that again:

# http http://localhost:6081/cgi-bin/foo.sh "If-None-Match: testofetagnumber1"
HTTP/1.1 304 Not Modified
Age: 0
Connection: keep-alive
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:48:52 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 8

Now we see Etag and If-None-Match at work. Also note the absence of a body: we just saved bandwidth.

Let's try to change our If-None-Match header a bit:

# http http://localhost:6081/cgi-bin/foo.sh "If-None-Match: testofetagnumber2"
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Wed, 25 Nov 2015 20:51:10 GMT
Etag: testofetagnumber1
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 11

Hello. Random number: 12942
Wed Nov 25 20:51:10 UTC 2015

Content!

To summarize:

Server          Client               Server
--------------  -------------------  ------------------------------------------
Last-Modified   If-Modified-Since    200 OK with full response body, or
ETag            If-None-Match        304 Not Modified with no response body.

Warning

The examples above also demonstrate that supplying static Etag headers or bogus Last-Modified headers can have unexpected side effects. foo.sh provides new content every time. Talking directly to the web server resulted in the desired behavior of the client getting the updated content, but only because the web server ignored the conditional part of the request.

The danger is not necessarily Varnish, but proxy servers outside the control of the web site, sitting between the client and the web server. Even if a web server ignores If-None-Match and If-Modified-Since headers, there is no guarantee that other proxies do! Make sure to provide only Etag and Last-Modified headers that are correct, or don't provide them at all.
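One way to stay on the safe side is to derive the ETag from the content itself, so the tag changes exactly when the body does. A sketch - the helper name and tag format are invented for illustration:

```python
import hashlib

def make_etag(body: bytes) -> str:
    """A strong ETag derived from the body: identical content yields an
    identical tag, and any change in content yields a new tag."""
    return '"%s"' % hashlib.sha1(body).hexdigest()[:20]

print(make_etag(b"Hello. Random number: 12111\n"))
```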

Cache control, age and grace

An HTTP object has an age. This is how long it is since the object was fetched or validated from the origin source. In most cases, an object starts acquiring age once it leaves a web server.

Age is measured in seconds. The HTTP response header Age is used to forward the information regarding age to HTTP clients. You can specify the maximum age allowed from both the client and the server. The most interesting aspect of this is the HTTP header Cache-Control. This is both a response and a request header, which means that both clients and servers can emit it.

The Age header has a single value: the age of the object measured in seconds. The Cache-Control header, on the other hand, has a multitude of variables and options. We'll begin with the simplest: max-age=. This is a variable that can be used both in a request header and a response header, but it is most useful in the response header. Most web servers and many intermediary caches (including Varnish) ignore a max-age field received in an HTTP request header.
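The freshness rule that max-age expresses boils down to a single comparison. A sketch - real caches honor many more Cache-Control directives than this:

```python
import re

def is_fresh(age: int, cache_control: str) -> bool:
    """A cached object is fresh while its Age stays below max-age."""
    m = re.search(r"max-age=(\d+)", cache_control)
    if m is None:
        return False  # no max-age: this sketch simply refuses to guess
    return age < int(m.group(1))

print(is_fresh(8, "max-age=10"))   # True: still fresh
print(is_fresh(12, "max-age=10"))  # False: expired
```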

Setting max-age=0 effectively disables caching, assuming the cache obeys:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:41:53 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32776

Hello. Random number: 19972
Fri Nov 27 15:41:53 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:41:57 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32779

Hello. Random number: 92124
Fri Nov 27 15:41:57 UTC 2015

This example issues two requests against a modified http://localhost:6081/cgi-bin/foo.sh. The modified version has set max-age=0 to tell Varnish - and browsers - not to cache the content at all. A similar example can be used for max-age=10:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 14

Hello. Random number: 19982
Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32782 15

Hello. Random number: 19982
Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 12
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:32 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 19 15

Hello. Random number: 19982
Fri Nov 27 15:44:32 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 2
Cache-Control: max-age=10
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 15:44:44 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65538 20

Hello. Random number: 9126
Fri Nov 27 15:44:44 UTC 2015

This example demonstrates several things:

  • Varnish emits an Age header, telling you how old the object is.
  • Varnish now caches.
  • Varnish delivers a 12-second-old object, despite max-age=10!
  • Varnish then delivers a 2-second-old object, despite no other request in between.

What this example shows is Varnish's default grace mode. The simple explanation is that Varnish keeps an object a little longer (10 seconds by default) than the regular cache duration. If the object is requested during this period, the cached variant of the object is sent to the client, while Varnish issues a request to the backend server in parallel. This is also called stale-while-revalidate. This happens even with zero configuration for Varnish, and is covered in detail in later chapters. For now, it's good to just get used to issuing an extra request to Varnish after the expiry time to see the update take place.

Let's do another example of this, using a browser, 60 seconds of max age, and an ETag header set to something random so our browser can do conditional GET requests:

/img/c2/age-1.png

On the first request we get a 27 second old object.

/img/c2/age-2.png

The second request is a conditional GET request because we had it in cache. Note that our browser has already exceeded the max-age, but still made a conditional GET request. A cache (browser or otherwise) may keep an object longer than the suggested max-age, as long as it verifies the content before using it. The result is the same object, now with an age of 65 seconds.

/img/c2/age-3.png

The third request takes place just 18 seconds later. This is not a conditional GET request, most likely because our browser correctly saw that the Age of the previous object was 65, while max-age=60 instructed the browser to only keep the object until it reached an age of 60 - a time which had already passed. Our browser thus did not keep the object at all this time.

Similarly, we can modify foo.sh to emit max-age=3600 and Age: 3590, pretending to be a cache. Speaking directly to Apache:

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Age: 3590
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:07:36 GMT
ETag: 11235
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 54251
Fri Nov 27 16:07:36 UTC 2015

# http http://localhost:8080/cgi-bin/foo.sh
HTTP/1.1 200 OK
Age: 3590
Cache-Control: max-age=3600
Connection: Keep-Alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:07:54 GMT
ETag: 12583
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

Hello. Random number: 68323
Fri Nov 27 16:07:54 UTC 2015

Nothing too exciting, but the requests return what we should have learned to expect by now.

Let's try three requests through Varnish:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3590
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:08:50 GMT
ETag: 9315
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65559

Hello. Random number: 22609
Fri Nov 27 16:08:50 UTC 2015

The first request is almost identical to the one we issued to Apache, except for a few added headers.

15 seconds later, we issue the same command again:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3605
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:08:50 GMT
ETag: 9315
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32803 65560

Hello. Random number: 22609
Fri Nov 27 16:08:50 UTC 2015

Varnish replies with a version from grace, and has issued an update to Apache in the background. Note that the Age header is now increased, and is clearly beyond the age limit of 3600.

4 seconds later, the third request:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3594
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 16:09:05 GMT
ETag: 24072
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65564 32804

Hello. Random number: 76434
Fri Nov 27 16:09:05 UTC 2015

Updated content!

The lessons to pick up from this are:

  • Age is not just an informative header. It is used by intermediary caches and by browser caches.
  • max-age is relative to Age, not to when the request was made.
  • You can have multiple tiers of caches, and max-age=x will be correct for the end user only if all intermediary caches correctly obey it and add to Age.
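
The arithmetic behind the last two points can be sketched directly (illustrative Python, not Varnish code):

```python
def age_after_tiers(origin_age, residence_times):
    """Each well-behaved intermediary cache adds the time the object
    spent in it to the Age header before passing the object on."""
    return origin_age + sum(residence_times)

def remaining_freshness(max_age, age):
    """Seconds of freshness left for the next hop; zero or less means
    the object is already stale when it arrives."""
    return max_age - age
```

An object that spent 3590 seconds in one cache and 15 seconds in the next arrives with Age: 3605, leaving a browser that sees max-age=3600 with nothing to cache.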

The Cache-Control header

The Cache-Control header has a multitude of possible values, and can be supplied as both a request-header and response-header. Varnish ignores any Cache-Control header received from a client - other caches might obey them.

It is defined in RFC2616, 14.9. As Varnish ignores all Cache-Control headers in a client request, we will focus on the parts relevant to an HTTP response. Here's an excerpt from RFC2616:

Cache-Control   = "Cache-Control" ":" 1#cache-directive

cache-directive = cache-request-directive
     | cache-response-directive

(...)

 cache-response-directive =
       "public"                               ; Section 14.9.1
     | "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1
     | "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1
     | "no-store"                             ; Section 14.9.2
     | "no-transform"                         ; Section 14.9.5
     | "must-revalidate"                      ; Section 14.9.4
     | "proxy-revalidate"                     ; Section 14.9.4
     | "max-age" "=" delta-seconds            ; Section 14.9.3
     | "s-maxage" "=" delta-seconds           ; Section 14.9.3
     | cache-extension                        ; Section 14.9.6

cache-extension = token [ "=" ( token | quoted-string ) ]

Among the above directives, Varnish only obeys s-maxage and max-age by default. It's worth looking especially closely at must-revalidate. This allows a client to cache the content, but requires it to revalidate the content with a conditional GET request once it has become stale, before using it.

s-maxage is of special interest to Varnish users. It instructs intermediate caches, but not clients (e.g. browsers). Varnish will pick the value of s-maxage over max-age, which makes it possible for a web server to emit a Cache-Control header that gives different instructions to browsers and Varnish:

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Transfer-Encoding: chunked
Via: 1.1 varnish-v4
X-Varnish: 2

Hello. Random number: 7684
Fri Nov 27 23:21:47 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5 3

Hello. Random number: 7684
Fri Nov 27 23:21:47 UTC 2015

# http http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 16
Cache-Control: s-maxage=3600,max-age=5
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:21:47 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 7 3

Hello. Random number: 7684
Fri Nov 27 23:21:47 UTC 2015

The first request populates the cache, the second returns a cache hit after 8 seconds, while the third confirms that no background fetch has caused an update by returning the same object a third time.

Two important things to note here:

  • The Age header is accurately reported. This effectively disables client-side caching after Age has reached 5 seconds.
  • There could be other intermediate caches that would also use s-maxage.

The solution to both these issues is the same: Remove or reset the Age-header and remove or reset the s-maxage-part of the Cache-Control header. Varnish does not do this by default, but we will do both in later chapters. For now, just know that these are challenges.
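
In practice this clean-up happens in VCL, but the transformation itself is simple enough to sketch in Python (the function name and header dictionary are illustrative only):

```python
import re

def scrub_for_clients(headers):
    """Reset Age and strip the s-maxage part of Cache-Control, so
    downstream caches and browsers only see max-age (illustrative)."""
    headers = dict(headers)
    headers["Age"] = "0"
    cc = headers.get("Cache-Control", "")
    # Drop the s-maxage directive and any trailing comma.
    cc = re.sub(r"\s*s-maxage=\d+\s*,?", "", cc).strip(", ")
    if cc:
        headers["Cache-Control"] = cc
    else:
        headers.pop("Cache-Control", None)
    return headers
```

Applied to the example above, s-maxage=3600,max-age=5 becomes max-age=5, and Age starts from zero again.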

stale-while-revalidate

In addition to RFC2616, there's also the more recent RFC5861, which defines two additional variables for Cache-Control:

stale-while-revalidate = "stale-while-revalidate" "=" delta-seconds

and:

stale-if-error = "stale-if-error" "=" delta-seconds

These two variables map very well to Varnish's grace mechanics, which existed a few years before RFC5861 came about.

Varnish 4.1 implements stale-while-revalidate for the first time, but not stale-if-error. Varnish has a default stale-while-revalidate value of 10 seconds. Earlier examples ran into this: You could see responses that were a few seconds older than max-age, while a request to revalidate the response was happening in the background.

A demo of default grace, pay attention to the Age header:

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 2

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 4
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 5 3

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:10:56 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32770 3

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 4
Cache-Control: max-age=5
Connection: keep-alive
Content-Length: 56
Content-Type: text/plain
Date: Sun, 29 Nov 2015 15:11:03 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 65538 32771

On the third request, Varnish is returning an object that is 8 seconds old, despite max-age=5. When this request was received, Varnish immediately fired off a request to the web server to revalidate the object, but returned the result from cache. This is also demonstrated by the fourth request, where Age is already 4. The fourth request gets the result from the backend request started when the third request was received. So:

  1. Request: Nothing in cache. Varnish requests content from backend, waits, and responds with that result.
  2. Request: Standard cache hit.
  3. Request: Varnish sees that the object in cache is stale, initiates a request to a backend server, but does NOT wait for the response. Instead, the result from cache is returned.
  4. Request: By now, the backend-request initiated from the third request is complete. This is thus a standard cache hit.

This behavior means that slow backends will not affect client requests if content is cached.
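
The four-step sequence above boils down to a small decision table. A sketch, assuming the default grace window of 10 seconds (the names are illustrative, not Varnish internals):

```python
def serve(age, max_age, grace=10):
    """Return (where the response comes from, whether a background
    fetch is started) for a cached object of the given age."""
    if age < max_age:
        return ("cache", False)   # fresh: plain cache hit
    if age < max_age + grace:
        # Stale but within grace: serve stale, revalidate in background.
        return ("cache", True)
    return ("backend", False)     # too old: fetch synchronously and wait
```

With max_age=5 this reproduces the transcript: the 8-second-old object is still served from cache while a background fetch runs.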

If this behavior is unwanted, you can disable grace by setting stale-while-revalidate=0:

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:36 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 12

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 3
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:36 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32773 13

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:42 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 32775

# http -p h http://localhost:6081/cgi-bin/foo.sh
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 1
Cache-Control: max-age=5, stale-while-revalidate=0
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Thu, 03 Dec 2015 12:50:42 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 15 32776

This was added in Varnish 4.1.0. We can now see that no background fetching was done at all, and no stale objects were delivered. In other words:

  1. Request: Nothing in cache. Varnish requests content from backend, waits, and responds with that result.
  2. Request: Standard cache hit.
  3. Request: Nothing in cache. Varnish fetches content from the backend, waits, and responds with that result.
  4. Request: Standard cache hit.

Vary

The Vary-header is exclusively meant for intermediate caches, such as Varnish. It is a comma-separated list of references to request headers that will cause the web server to produce a different variant of the same content. An example is needed:

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: foo"
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo:  foo

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 6
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:47 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 12 32771

Hello. Random number: 21126
Fri Nov 27 23:56:47 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: bar"
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo:  bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:57 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 32773

Hello. Random number: 126
Fri Nov 27 23:56:57 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: foo"
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo:  foo

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 15
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:47 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 14 32771

Hello. Random number: 21126
Fri Nov 27 23:56:47 UTC 2015

# http -p Hhb http://localhost:6081/cgi-bin/foo.sh "X-demo: bar"
GET /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0
X-demo:  bar

HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 8
Cache-Control: s-maxage=3600
Connection: keep-alive
Content-Length: 57
Content-Type: text/plain
Date: Fri, 27 Nov 2015 23:56:57 GMT
Server: Apache/2.4.10 (Debian)
Vary: X-demo
Via: 1.1 varnish-v4
X-Varnish: 32776 32774

Hello. Random number: 126
Fri Nov 27 23:56:57 UTC 2015

These four requests demonstrate that two objects are entered into the cache for the same URL, accessible by modifying the arbitrarily chosen X-demo request header - which is not a real header.
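
The mechanics can be sketched as a lookup key built from the URL plus the exact value of every header listed in Vary (illustrative Python; Varnish actually stores the variants per object):

```python
def cache_key(url, vary_headers, request_headers):
    """The cache key is the URL plus the verbatim value of each
    request header named in the Vary response header."""
    varied = tuple(request_headers.get(h, "") for h in vary_headers)
    return (url, varied)

cache = {}
cache[cache_key("/cgi-bin/foo.sh", ["X-demo"], {"X-demo": "foo"})] = "variant foo"
cache[cache_key("/cgi-bin/foo.sh", ["X-demo"], {"X-demo": "bar"})] = "variant bar"
# Two variants of the same URL now coexist in the cache.
```

Because the header value is used verbatim, any difference at all - even ordering or whitespace - produces a new variant, which is the root of the problems discussed below.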

The most important use-case for Vary is to support content encoding such as gzip. In earlier versions of Varnish, the web server needed to do the compression, and Varnish would store the compressed content and (assuming a client asked for it) the uncompressed content. This was supported through the Vary header, which the server would set to Vary: Accept-Encoding. Today, Varnish understands gzip and this isn't needed. Two more examples of Vary usage follow.

Mobile devices are often served different variants of the same contents, so-called mobile-friendly pages. To make sure intermediate caches support this, the web server must emit a Vary: User-Agent header, suggesting that for each different User-Agent header sent, a unique variant of the content must be cached.

The second such header is the nefarious Cookie header. Whenever a page is rendered differently based on a cookie, the web server should send Vary: Cookie. However, hardly anyone does this in the real world, which has resulted in cookies being treated differently. By default, Varnish does not cache any content if it's requested with a cookie, nor does it cache any response with a Set-Cookie-header. This clearly needs to be overridden, and will be covered in detail in later chapters.

The biggest problem with the Vary-header is the lack of semantic details. The Vary header simply states that any variation in the request header, however small, mandates a new object in the cache. This causes numerous headaches. Here are some examples:

  • Accept-Encoding: gzip,deflate and Accept-Encoding: deflate,gzip will result in two different variants.
  • Vary: User-Agent will cause a tremendous amount of variants, since the level of detail in modern User-Agent headers is extreme.
  • It's impossible to say that only THAT cookie will matter, not the others.

Many of these things can be remedied or at least worked around in Varnish. All of it will be covered in detail in separate chapters.

On a last note, Varnish has a special case where it refuses to cache any content with a response header of Vary: *.

Request methods

Only the GET request method is cached. However, Varnish will rewrite a HEAD request to a GET request, cache the result, and strip the response body before answering the client. A HEAD request is supposed to be exactly the same as a GET request, with the response body stripped, so this makes sense. To see this effect, issue a HEAD request first directly to Apache:

# http -p Hhb HEAD http://localhost:8080/cgi-bin/foo.sh
HEAD /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8080
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 29
Content-Type: text/plain
Date: Sat, 28 Nov 2015 00:30:33 GMT
Keep-Alive: timeout=5, max=100
Server: Apache/2.4.10 (Debian)

# tail -n1 /var/log/apache2/access.log
::1 - - [28/Nov/2015:00:30:33 +0000] "HEAD /cgi-bin/foo.sh HTTP/1.1" 200 190 "-" "HTTPie/0.8.0"

The access log shows a HEAD request. Issuing the same request to Varnish:

# http -p Hhb HEAD http://localhost:6081/cgi-bin/foo.sh
HEAD /cgi-bin/foo.sh HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:6081
User-Agent: HTTPie/0.8.0

HTTP/1.1 200 OK
Age: 0
Connection: keep-alive
Content-Length: 29
Content-Type: text/plain
Date: Sat, 28 Nov 2015 00:32:05 GMT
Server: Apache/2.4.10 (Debian)
Via: 1.1 varnish-v4
X-Varnish: 2

# tail -n1 /var/log/apache2/access.log
127.0.0.1 - - [28/Nov/2015:00:32:05 +0000] "GET /cgi-bin/foo.sh HTTP/1.1" 200 163 "-" "HTTPie/0.8.0"

The client sees the same result, but the web server has logged a GET request. Please note that HEAD requests include a Content-Length as if a GET request was issued. It is only the response body itself that is absent.
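
The rewrite can be sketched as follows: serve the HEAD request from the cached GET response with the body stripped, but with all headers - including Content-Length - intact (illustrative, not Varnish code):

```python
def answer_head(cached_get_response):
    """A HEAD answer is the cached GET answer minus the body;
    Content-Length still describes the body a GET would return."""
    status, headers, body = cached_get_response
    return (status, headers, b"")
```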

Cached status codes

Only a subset of response codes allow caching, even if an s-maxage or similar is provided. Quoting directly from the Varnish source code, specifically bin/varnishd/cache/cache_rfc2616.c, the list is:

case 200: /* OK */
case 203: /* Non-Authoritative Information */
case 204: /* No Content */
case 300: /* Multiple Choices */
case 301: /* Moved Permanently */
case 304: /* Not Modified - handled like 200 */
case 404: /* Not Found */
case 410: /* Gone */
case 414: /* Request-URI Too Large */

That means that if you provide s-maxage on a 500 Internal Server Error, Varnish will still not cache it by default. Varnish will cache the above status codes even without any cache control headers. The default cache duration is 2 minutes.

In addition to the above, there are two more status codes worth mentioning:

case 302: /* Moved Temporarily */
case 307: /* Temporary Redirect */
        /*
         * https://tools.ietf.org/html/rfc7231#section-6.1
         *
         * Do not apply the default ttl, only set a ttl if Cache-Control
         * or Expires are present. Uncacheable otherwise.
         */
        expp->ttl = -1.;

Responses with status codes 302 Moved Temporarily or 307 Temporary Redirect are cached only if Cache-Control or Expires explicitly allows it; they are not cached by default.

In other words:

  • max-age=10 + 500 Internal Server Error: Not cached
  • max-age=10 + 302 Moved Temporarily: Cached
  • No Cache-Control + 302 Moved Temporarily: Not cached
  • No Cache-Control + 404 Not Found: Cached
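
These rules can be summarized in a short sketch (the status-code set comes from the cache_rfc2616.c excerpt quoted above; the function names are illustrative):

```python
CACHEABLE_BY_DEFAULT = {200, 203, 204, 300, 301, 304, 404, 410, 414}

def cacheable(status, has_cache_control_or_expires):
    """Whether Varnish will consider caching a response at all,
    based on its status code."""
    if status in CACHEABLE_BY_DEFAULT:
        return True
    if status in (302, 307):
        # 302/307 are cached only when explicitly allowed.
        return has_cache_control_or_expires
    return False
```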

Cookies and authorization

Requests with a Cookie header or an HTTP basic authorization header are tricky at best to cache. Varnish takes a "better safe than sorry" approach, and by default does not cache responses to requests with either a Cookie-header or an Authorization-header. Responses with Set-Cookie are not cached either.

Because cookies are so common, this will generally mean that any modern site is not cached by default. Fortunately, Varnish has the means to override that default. We will investigate that in detail in later chapters.

Summary

There are a few other headers worth mentioning. The ancient Pragma header is still seen; it is completely ignored by Varnish and has generally been replaced by Cache-Control. One header Varnish does care about is Expires. Expires is generally deprecated, but still valid.

If s-maxage and max-age are missing from Cache-Control, then Varnish will use an Expires header. The format of the Expires header is that of an absolute date - the same format as Date and Last-Modified. Don't use this unless you want a headache.

In other words, to cache by default:

  • The request method must be GET or HEAD.
  • There can be no Cookie-header or Authorization-header in the request.
  • There can be no Set-Cookie on the reply.
  • The status code needs to be 200, 203, 204, 300, 301, 304, 404, 410, 414.
  • OR the status code can be 302 or 307 IF Cache-Control or Expires enables caching.
  • Vary must NOT be *.

Varnish decides cache duration (TTL) in the following order:

  • If Cache-Control has s-maxage, that value is used.
  • Otherwise, if Cache-Control has max-age, that value is used.
  • Otherwise, if Expires is present, that value is used.
  • Lastly, Varnish uses a default fall-back value. This is 2 minutes, as dictated by the default_ttl parameter.
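
That ordering can be sketched directly (illustrative Python; expires_delta stands in for the parsed Expires header minus the Date header):

```python
import re

def choose_ttl(cache_control, expires_delta=None, default_ttl=120):
    """TTL order: s-maxage, then max-age, then Expires, then the
    default_ttl parameter (2 minutes by default)."""
    for directive in ("s-maxage", "max-age"):
        m = re.search(directive + r"=(\d+)", cache_control or "")
        if m:
            return int(m.group(1))
    if expires_delta is not None:
        return expires_delta
    return default_ttl
```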

Our goal when designing cache policies is to push as much of the logic to the right place. The right place for setting cache duration is usually in the application, not in Varnish. A good policy is to use s-maxage.


November 24, 2015

Kristian Lyngstøl: Varnish Foo - Introduction

Posted on 2015-11-24

This is the only chapter written in first person.

I've worked on Varnish since late 2008, first for Redpill Linpro, then Varnish Software, then, after a brief pause, for Redpill Linpro again. Over the years I've written code, written Varnish modules and blog posts, tried to push the boundaries of what Varnish can do, debugged or analyzed countless Varnish sites, probably held more training courses than anyone else, written training material, and helped shape the Varnish community.

Today I find myself in a position where the training material I once maintained is no longer my responsibility. But I still love writing, and there's an obvious need for documentation for Varnish.

I came up with a simple solution: I will write a book. Because I couldn't imagine that I would ever finish it if I attempted writing a whole book in one go, I decided I would publish one chapter at a time on my blog. This is the first chapter of that book.

You will find the source on https://github.com/KristianLyng/varnishfoo. This is something I am doing in my spare time, and I hope to get help from the Varnish community in the form of feedback. While the format will be that of a book, I intend to keep it alive with updates as long as I can.

I intend to cover as much Varnish-related content as possible, from administration to web development and infrastructure. And my hope is that one day, this will be good enough that it will be worth printing as more than just a leaflet.

I am writing this in my spare time, and I retain full ownership of the material. For now, the material is available under a Creative Commons "CC-BY-SA-NC" license. The NC-part of that license will be removed when I feel the material has matured enough and the time is right. To clarify, the "non-commercial" clause is aimed at people wanting to sell the book or use it in commercial training (or similar) - it is not intended to prevent you from reading the material at work.

Target audience and format

This book covers a large spectrum of subjects related to Varnish. It is suitable for system administrators, infrastructure architects, and web developers. The first few chapters are general enough to be of interest to all, while later chapters specialize in certain aspects of Varnish usage.

Each chapter is intended to stand well on its own, but there will be some cross-references. The book focuses on best practices and good habits that will help you beyond what just a few examples or explanations will do.

Each chapter provides both theory and practical examples. Each example is tested with a recent Varnish version where relevant, and is based on experience from real-world Varnish installations.

What is Varnish

Varnish is a web server.

Unlike most web servers, Varnish does not read content from a hard drive, or run programs that generate content from SQL databases. Varnish acquires the content from other web servers. Usually it will keep a copy of that content around in memory for a while to avoid fetching the same content multiple times, but not necessarily.

There are numerous reasons you might want Varnish:

  1. Your web server/application is a beastly nightmare where performance is measured in page views per hour - on a good day.
  2. Your content needs to be available from multiple geographically diverse locations.
  3. Your web site consists of numerous different little parts that you need to glue together in a sensible manner.
  4. Your boss bought a service subscription and now has to justify the budget post.
  5. You like Varnish.
  6. ???

Varnish is designed around two simple concepts: Give you the means to fix or work around technical challenges. And speed. Speed was largely handled very early on, and Varnish is quite simply fast. This is achieved by being, at the core, simple. The less you have to do for each request, the more requests you can handle.

The name suggests what it's all about:

From The Collaborative International Dictionary of English v.0.48 [gcide]:

  Varnish \Var"nish\, v. t. [imp. & p. p. {Varnished}; p. pr. &
     vb. n. {Varnishing}.] [Cf. F. vernir, vernisser. See
     {Varnish}, n.]
     [1913 Webster]
     1. To lay varnish on; to cover with a liquid which produces,
        when dry, a hard, glossy surface; as, to varnish a table;
        to varnish a painting.
        [1913 Webster]

     2. To cover or conceal with something that gives a fair
        appearance; to give a fair coloring to by words; to gloss
        over; to palliate; as, to varnish guilt. "Beauty doth
        varnish age." --Shak.
        [1913 Webster]

Varnish can be used to smooth over rough edges in your stack, to give a fair appearance.

History

The Varnish project began in 2005. The issue to be solved was that of VG, a large Norwegian news site (or alternatively a tiny international site). The first release came in 2006, and worked well for pretty much one site: www.vg.no. In 2008 came Varnish 2.0, which opened Varnish up to more sites, as long as they looked and behaved similarly to www.vg.no. As time progressed and more people started using Varnish, Varnish has been adapted to a large and varied set of use cases.

From the beginning, the project was administered through Redpill Linpro, with the majority of development being done by Poul-Henning Kamp through his own company and his Varnish Moral License. In 2010, Varnish Software sprung out from Redpill Linpro. Varnish Cache has always been a free software project, and while Varnish Software has been custodians of the infrastructure and large contributors of code and cash, the project is independent.

Varnish Plus was born some time during 2011, although it didn't go by that name at the time. It was the result of somewhat conflicting interests. Varnish Software had customer obligations that required features, and the development power to implement them, but they did not necessarily align with the goals and time frames of Varnish Cache. Varnish Plus became a commercial test-bed for features that were not yet in Varnish Cache for various reasons. Many of the features have since trickled into Varnish Cache proper in one way or another (streaming, surrogate keys, and more), and some have still to make it. Some may never make it. This book will focus on Varnish Cache proper, but will reference Varnish Plus where it makes sense.

With Varnish 3.0, released in 2011, Varnish modules started becoming a big thing. These are modules that are not part of the Varnish Cache code base, but are loaded at run-time to add features such as cryptographic hash functions (vmod-digest) and memcached support. The number of vmods available grew quickly, but even with Varnish 4.1, the biggest issue with them was that they required source-compilation for use. That, however, is being fixed almost as I am writing this sentence.

Varnish would not be where it is today without a large number of people and businesses. Varnish Software has contributed and continues to contribute numerous tools, vmods, and core features. Poul-Henning Kamp is still the gatekeeper of Varnish Cache code, for better or worse, and does the majority of the architectural work. Over the years, there have been too many companies and individuals involved to list them all in a book, so I will leave that to the official Varnish Cache project.

Today, Varnish is used by CDNs and newspapers, APIs and blogs.

More than just cache

Varnish caches content, but can do much more. In 2008, it was used to rewrite URLs, normalize HTTP headers and similar things. Today, it is used to implement paywalls (whether you like them or not), API metering, load balancing, CDNs, and more.

Varnish has a powerful configuration language, the Varnish Configuration Language (VCL). VCL isn't parsed the traditional way a configuration file is, but is translated to C code, compiled and linked into the running Varnish. From the beginning, it was possible to bypass the entire translation process and provide C code directly, which was never recommended. With Varnish modules, it's possible to write proper modules to replace the in-line C code that was used in the past.
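As a small sketch of what that shift looks like (the subroutine is standard VCL 4 syntax; the logging use case is just an illustrative example), a vmod call can replace what once required in-line C:

```vcl
# vmod_std ships with Varnish; importing it replaces a common
# historical use of in-line C (logging) with a proper module call.
import std;

sub vcl_recv {
    # Writes a line to the shared memory log, visible in varnishlog.
    std.log("request for: " + req.url);
}
```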

There is also an often overlooked Varnish agent that provides an HTTP REST interface for managing Varnish. This can be used to extract metrics, review or optionally change configuration, stop and start Varnish, and more. The agent lives at https://github.com/varnish/vagent2, and is packaged for most distributions today. There's also a commercial administration console that builds further on the agent.

Using Varnish to gracefully handle operational issues is also common. Serving cached content past its expiry time while a web server is down, or switching to a different server, will give your users a better browsing experience. And in a worst case scenario, at least the user can be presented with a real error message instead of a refused or timed out connection.

An often overlooked feature of Varnish is Edge Side Includes. This is a means to build a single HTTP object (like an HTML page) from multiple smaller objects, with different caching properties. This lets content writers provide more fine-grained caching strategies without having to be too smart about it.
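For illustration, a minimal sketch of enabling ESI processing (the URL check is a made-up example policy, not a recommendation):

```vcl
sub vcl_backend_response {
    # Parse this page for <esi:include src="..."/> tags, so each
    # fragment is fetched and cached with its own TTL.
    if (bereq.url == "/index.html") {
        set beresp.do_esi = true;
    }
}
```

The page itself would then embed fragments with `<esi:include src="/fragment"/>`, letting, say, a personalized box expire faster than the surrounding page.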

Where to get help

The official varnish documentation is available both as manual pages (run man -k varnish on a machine with a properly installed Varnish package), and as Sphinx documentation found under http://varnish-cache.org/docs/.

Varnish Software has also published their official training material, which is called "The Varnish Book" (not to be confused with THIS book about Varnish). This is available freely through their site at http://varnish-software.com, after registration.

An often overlooked source of information for Varnish is the flow charts/dot-graphs used to document the VCL state engine. The official location for these is the source code of Varnish, under doc/graphviz/. They can be generated simply, assuming you have graphviz installed:

# git clone http://github.com/varnish/Varnish-Cache/
Cloning into 'Varnish-Cache'...
(...)
# cd Varnish-Cache/
# cd doc/graphviz/
# for a in *.dot; do dot -Tpng "$a" > "${a%.dot}.png"; done
# ls *png

Alternatively, replace -Tpng and .png with -Tsvg and .svg respectively to get vector graphics, or -Tpdf/.pdf for pdfs.

You've now made three graphs that you might as well print right now and glue to your desk if you will be working with Varnish a lot.

For convenience, the graphs from Varnish 4.1 are included. If you don't quite grasp what these tell you yet, don't be too alarmed. These graphs are provided early as they are useful to have around as reference material. A brief explanation for each is included, mostly to help you in later chapters.

cache_req_fsm.png

/img/cache_req_fsm.png

This can be used when writing VCL. You want to look for the blocks that read vcl_ to identify VCL functions. The lines tell you how a return-statement in VCL will affect the VCL state engine at large, and which return statements are available where. You can also see which objects are available where.

This particular graph details the client-specific part of the VCL state engine.
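As a taste of what the graph documents, here is a hedged sketch of vcl_recv using two of the return statements available there (the POST check is just an example policy):

```vcl
sub vcl_recv {
    if (req.method == "POST") {
        # POSTs are not cacheable; "pass" leaves vcl_recv
        # straight toward a backend fetch, per the flow chart.
        return (pass);
    }
    # "hash" continues to cache lookup.
    return (hash);
}
```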

cache_fetch.png

/img/cache_fetch.png

This graph has the same format as cache_req_fsm.png, but from the perspective of a backend request.

cache_http1_fsm.png

/img/cache_http1_fsm.png

Of the three, this is the least practical flow chart, mainly included for completeness. It does not document much related to VCL or practical Varnish usage, but the internal state engine of an HTTP request in Varnish. It can sometimes be helpful for debugging internal Varnish issues.


November 16, 2015

Kristian Lyngstøl: Visualizing VCL


I was preparing to upgrade a customer, and ran across a semi-extensive VCL setup. It quickly became a bit hard to get a decent overview of what was going on.

The actual VCL is fairly simple.

To deal with this, I ended up hacking together a tiny awk/shell script to generate a dot graph of how things were glued together. You can find the script at http://kly.no/code/script/vcl-visualizer.sh .

The output is somewhat ugly, but useful.

/varnish/vcl-visualizer-min.png


Of note:

  • This is so far Varnish 3.0-ish, mainly because of the error-syntax. (So it'll work for 4.x, just without vcl_error-tracking)
  • Green borders: Found the reference and everything is OK.
  • Black border: The sub was referenced, but not found in any VCL file.
  • Red border: The sub was found, but never referenced (doesn't count for subroutines beginning with vcl_, e.g. vcl_recv, etc)

No idea if it's of interest to anyone but me, but I found it useful.
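As a hedged illustration of the idea (a simplified sketch, not the actual vcl-visualizer.sh; the VCL and subroutine names are made up), the core is just scanning for sub definitions and call statements and emitting dot:

```shell
# A toy VCL file to visualize.
cat > /tmp/example.vcl <<'EOF'
sub vcl_recv {
    call check_host;
}
sub check_host {
}
EOF

# Emit a dot graph of which subroutine calls which:
# "sub NAME" starts a node, "call NAME;" adds an edge from it.
awk '
BEGIN                 { print "digraph vcl {" }
/^sub /               { cur = $2; print "  \"" cur "\";" }
/call [A-Za-z0-9_]+;/ {
    callee = $2
    sub(/;$/, "", callee)
    print "  \"" cur "\" -> \"" callee "\";"
}
END                   { print "}" }
' /tmp/example.vcl > /tmp/vcl-graph.dot

cat /tmp/vcl-graph.dot
```

Feed the result to `dot -Tpng` as shown earlier and you get a picture of the call structure.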


October 12, 2015

Ingvar Hagelund: Varnish-4.1.0 released, packages for Fedora and EPEL

Varnish-4.1.0 was recently released, and as usual, I have patched and wrapped up packages for Fedora and EPEL. As 4.1.0 is not API/ABI compatible with varnish-4.0, packages for stable releases of EPEL and Fedora are not updated. Varnish-4.1.x will be available in a stable Fedora from f24 at the latest, though the package recompiles fine on anything from el5 to f23 as well.

Prebuilt packages for epel5, epel6, and epel7 are available here: http://users.linpro.no/ingvar/varnish/4.1.0/.

If you are a fedora contributor, please test the f23 package. The package should install directly on el7 and all supported fedoras, including f23. Then report feedback and add karma points. With a little luck, varnish-4.1 will go into fedora 23 before it freezes.

Ingvar

Varnish Cache is a powerful and feature rich front side web cache. It is also very fast, and that is, fast as in powered by The Dark Side of the Force. On steroids. And it is Free Software.

Redpill Linpro is the market leader for professional Open Source and Free Software solutions in the Nordics, though we have customers from all over. For professional managed services, all the way from small web apps, to massive IPv4/IPv6 multi data center media hosting, and everything through container solutions, in-house, cloud, and data center, contact us at www.redpill-linpro.com.

September 25, 2015

Kristian Lyngstøl: Magic Grace


I was hacking together a JavaScript varnishstat implementation for a customer a few days ago when I noticed something strange. I have put Varnish in front of the agent delivering stats, but I'm only caching the statistics for 1 second.

But the cache hit rate was 100%.

And the stats were updating?

Logically speaking, how can you hit cache 100% of the time and still get fresh content all the time?

Enter Grace

Grace mode is a feature Varnish has had since version 2.0 back in 2008. It is a fairly simple mechanic: Add a little bit of extra cache duration to an object. This is the grace period. If a request is made for the object during that grace period, the object is updated and the cached copy is used while updating it.

This reduces the thundering horde problem when a large amount of users request recently expired content, and it can drastically improve user experience when updating content is expensive.

The big change that happened in Varnish 4 was background fetches.

Varnish uses a very simple thread model (so to speak). Essentially, each session is handled by one thread. In prior versions of Varnish, requests to the backend were always tied to a client request.

  • Thread 1: Accept request from client 1
  • Thread 1: Look up content in cache
  • Thread 1: Cache miss
  • Thread 1: Request content from web server
  • Thread 1: Block
  • Thread 1: Get content from web server
  • Thread 1: Respond

If the cache is empty, there isn't much of a reason NOT to do this. Grace mode always complicated this. What PHK did to solve this was, in my opinion, quite brilliant in its simplicity. Even if it was a trade-off.

With grace mode, you HAVE the content, you just need to make sure it's updated. It looked something like this:

  • Thread 1: Accept request from client 1
  • Thread 1: Look up content in cache
  • Thread 1: Cache miss
  • Thread 1: Request content from web server
  • Thread 1: Block
  • Thread 1: Get content from web server
  • Thread 1: Respond

So ... NO CHANGE. For a single client, you don't have grace mode in earlier Varnish versions.

But enter client number 2 (or 3, 4, 5...):

  • Thread 1: Accept request from client 1
  • Thread 1: Look up content in cache
  • Thread 1: Cache miss
  • Thread 1: Request content from web server
  • Thread 1: Block
  • Thread 2: Accept request from client 2
  • Thread 2: Look up content in cache
  • Thread 2: Cache hit - grace copy is now eligible - Respond
  • Thread 1: Get content from web server
  • Thread 1: Respond

So with Varnish 2 and 3, only the first client will block waiting for new content. This is still an issue, but it does the trick for the majority of use cases.

Background fetches!

Background fetches changed all this. It's more complicated in many ways, but from a grace perspective, it massively simplifies everything.

With Varnish 4 you get:

  • Thread 1: Accept request from client 1
  • Thread 1: Look up content in cache
  • Thread 1: Cache hit - grace copy is now eligible - Respond
  • Thread 2: Request content from web server
  • Thread 2: Block
  • Thread 3: Accept request from client 2
  • Thread 3: Look up content in cache
  • Thread 3: Cache hit - grace copy is now eligible - Respond
  • Thread 2: Get content from web server

And so forth. Strictly speaking, I suppose this makes grace /less/ magical...

In other words: The first client will also get a cache hit, but Varnish will update the content in the background for you.

It just works.

Statistics?

What is a cache hit?

If I tell you that I have 100% cache hit rate, how much backend traffic would you expect?

We want to keep track of two ratios:

  • Cache hit rate - how much content is delivered directly from cache (same as today). Target value: 100%.
  • Fetch/request ratio: How many backend fetches do you initiate per client request. Target value: 0%.

For my application, a single user will result in a 100% cache hit rate, but also a fetch/request ratio of 100%. The cache isn't really offloading the backend load significantly until I have multiple users of the app. Mind you, if the application was slow, this would still benefit that one user.
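A quick sketch of the arithmetic, with made-up numbers matching the single-user scenario above:

```shell
# Single client polling a 1-second-cached resource: every request is a
# grace hit, but every request also kicks off a background fetch.
hits=100; misses=0; fetches=100

hit_rate=$(awk "BEGIN { print 100 * $hits / ($hits + $misses) }")
fetch_ratio=$(awk "BEGIN { print 100 * $fetches / ($hits + $misses) }")

echo "cache hit rate:      ${hit_rate}%"    # looks great...
echo "fetch/request ratio: ${fetch_ratio}%" # ...but the backend is not offloaded at all
```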

The latter is also interesting from a security point of view. If you find the right type of request, you could end up with more backend fetches than client requests (e.g. due to restarts/retries).

How to use grace

You already have it, most likely. Grace is turned on by default, using a 10 second grace period. For frequently updated content, this is enough.

Varnish 4 changed some of the VCL and parameters related to grace. The important bits are:

  • Use beresp.grace in VCL to adjust grace for an individual object.
  • Use the default_grace parameter to adjust the ... default grace for objects.
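For instance, a minimal sketch (standard Varnish 4 VCL; the one-hour value is just an example) of adjusting grace per object:

```vcl
sub vcl_backend_response {
    # Keep serving this object for up to an hour past its TTL
    # while a background fetch refreshes it.
    set beresp.grace = 1h;
}
```

The default can likewise be changed at runtime with varnishadm param.set default_grace 10.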

If you want to override grace mechanics, you can do so in vcl_recv by setting req.ttl, which defines a maximum TTL to be used for an object, regardless of the object's actual TTL. That bit is a bit mysterious.

Or you can look at vcl_hit. Here you'll be able to do:

if (obj.ttl + obj.grace > 0s && obj.ttl <= 0s) {
        // TTL has expired, but the object is still within its grace period
        if (req.http.x-magic-skip-grace-header ~ "yes") {
                return (miss);
        } else {
                return (deliver);
        }
}

The above example-snippet checks whether the object has an expired TTL but is still within its grace period. If so, it looks for a client header called "X-Magic-Skip-Grace-Header" and checks if it contains the string "yes". If it does, the request is treated as a cache miss; otherwise, the cached object is delivered.


September 19, 2015

Kristian Lyngstøl: Varnish Wishlist


I recently went back to working for Redpill Linpro, and thus started working with Varnish again, after being on the sidelines for a few years.

I've been using Varnish since 2008. And a bit more than just using it too. There's been a lot of great change over time, but there are still things missing. I recently read http://kacper.blog.redpill-linpro.com/archives/743 and while I largely agree with Kacper, I think some of the bigger issues are missing from the list.

So here's my attempt to add to the debate.

TLS/SSL

Varnish needs TLS/SSL.

It's the elephant in the room that nobody wants to talk about.

The world is not the same as it was in 2006. Varnish is used for more and more sensitive sites. A larger percentage of Varnish installations now have some sort of TLS/SSL termination attached to it.

TLS/SSL has been a controversial issue in the history of Varnish Cache, with PHK (Principal architect of Varnish Cache - https://www.varnish-cache.org/docs/trunk/phk/ssl.html) being an outspoken opponent of adding TLS in Varnish. There are valid reasons, and heartbleed has most certainly proven many of PHK's grievances right. But what does that matter when we use TLS/SSL anyway? It's already in the stack, we're just closing our eyes to it.

Setting up nginx in front of Varnish to get TLS/SSL, then nginx behind Varnish to get TLS/SSL... That's just silly. Why not just use nginx to cache then? The lack of TLS/SSL in Varnish is a great advertisement for nginx.

There are a lot of things I dislike about TLS/SSL, but we need it anyway. There's the hitch project (http://hitch-tls.org), but it's not really enough. We also need TLS/SSL to the backends, and a tunnel-based solution isn't enough. How would you do smart load balancing through that? If we don't add TLS/SSL, we might as well just forget about backend directors altogether. And it has to be an integral part of all backends.

We can't have a situation where some backend directors support TLS/SSL and some don't.

Varnish Software is already selling this through Varnish Cache Plus, their proprietary version of Varnish. That is obviously because it's a deal breaker in a lot of situations. The same goes for basically any serious commercial actor out there.

So we need TLS/SSL. And we need it ASAP.

Note

After speaking to PHK, let me clarify: He's not against adding support for TLS, but against adding TLS itself. Varnish now supports the PROXY protocol, which was added explicitly to improve support for TLS termination. Further such additions would likely be acceptable, always doing the TLS outside of Varnish.

Better procedures for VCL changes

With every Varnish version, VCL (The configuration language for Varnish) changes either a little bit, or a lot. Some of these changes are unavoidable due to internal Varnish changes. Some changes are to tweak the language to be more accurate (e.g. changing req.request to req.method, to reflect that it's the request method).

If Varnish is part of your day-to-day work, then this might not be a huge deal. You probably keep up-to-date on what's going on with Varnish anyway. But most users aren't there. We want Varnish to be a natural part of your stack, not a special thing that requires a "varnish-admin".

This isn't necessarily an easy problem to solve. We want to be able to improve VCL and get rid of old mistakes (e.g., changing req.request to req.method is a good thing for VCL). We've also changed the way to do error messages (or custom varnish-generated messages) numerous times. And how to create hitpass objects (a complicated aspect of any cache).

A few simple suggestions:

  • All VCL changes reviewed in public as a whole before the release process even starts. To avoid having to change it again two versions down the line.
  • Backward compatibility when possible. With warnings or even requiring an extra option to allow it. E.g.: req.request could easily still work, there's no conflict there. Never for forever, but perhaps to the end of a major version. Not everything will be backwards compatible, but some can.

I've had numerous complaints from highly skilled sysadmins who are frustrated by this aspect of Varnish. They just don't want to upgrade because they have to do what feels like arbitrary VCL changes every single time. Let's see if we can at least REDUCE that.

Documentation?

There's a lot of documentation for Varnish, but there's also a lot of bad documentation. Some issues:

  • People Google and end up on random versions on varnish-cache.org. No, telling people "but there's a version right there so it's your own fault!" is not an acceptable solution. Varnish Software themselves recently had a link in their Varnish Book that pointed to "trunk" instead of "4.0", whereupon the "here is a complete list of changes between Varnish 3 and Varnish 4" link was actually a link to changes between Varnish 4.0 and the next version of Varnish.

  • "user guide" and "tutorial" and "installation"? Kill at least two and leave the others for blog posts or whatever. Hard enough to maintain one with decent quality.

  • Generated documentation needs to be improved. Example:

    Prototype
            STRING fileread(PRIV_CALL, STRING)
    Description
            Reads a file and returns a string with the content. Please
            note that it is not recommended to send variables to this
            function the caching in the function doesn't take
            this into account. Also, files are not re-read.
    Example
            set beresp.http.served-by = std.fileread("/etc/hostname");
    

    PRIV_CALL should clearly not be exposed! Other examples are easy enough to find.

    In addition, the Description is a mixture of reference documentation style and elaboration. Reference documentation should be clearly separated from analysis of consequences so technical users don't have to reverse-engineer a sentence of "don't do this because X" to figure out what the code actually does.

    And where are the details? What happens if the file can't be opened? What are the memory constraints? It says it returns the content of the file as a string, but what happens with binary content? There's clearly some caching of the file, but how does that work? Per session? Per VCL? Does that cache persist when you do varnishadm stop; varnishadm start? That's completely left out.

  • Rants mixed in with documentation? Get rid of "doc/sphinx/phk" (https://www.varnish-cache.org/docs/4.0/phk/) and instead reference it somewhere else. Varnish-cache.org/doc should not be a weird blog-space. It clutters the documentation space. Varnish is not a small little project any more; it's grown past this.

VMOD packages

Varnish vmods are awesome. You can design some truly neat solutions using Open Source vmods, or proprietary ones.

But there are not even semi-official package repositories for the open source vmods. Varnish Software offers this to customers, but I really want it for the public too. Both for my own needs, and because it's important to improve Varnish and VMOD adoption.

Until you can do "apt-get install varnish-vmod-foo" or something like that, VMODS will not get the attention they deserve.

There are some projects in the works here, though, so stay tuned.

TLS/SSL

In case you missed it, I want TLS/SSL.

I want to be able to type https://<varnish host>

BTW: Regarding terminology, I decided to go with "TLS/SSL" instead of either "SSL" or "TLS" after some feedback. I suppose "TLS" is correct, but "SSL" is more recognized, whether we like it or not.


August 23, 2015

Kacper Wysocki: My Varnish pet peeves

I’ve been meaning to write a blog entry about Varnish for years now. The closest I’ve come is to write a blog about how to make Varnish cache your debian repos, make you a WikiLeaks cache and I’ve released Varnish Secure Firewall, but that without a word on this blog. So? SO? Well, after years it turns out there is a thing or two to say about Varnish. Read on to find out what annoys me and people I meet the most.


Although you could definitely call me a “Varnish expert” and even a sometimes contributor, and I do develop programs, I cannot call myself a Varnish developer because I’ve shamefully never participated in a Monday evening bug wash. My role in the Varnish world is more… operative. I am often tasked with helping ops people use Varnish correctly, justify its use and cost to their bosses, defend it from expensive and inferior competitors, sit up long nites with load tests just before launch days. I’m the guy that explains the low risk and high reward of putting Varnish in front of your critical site, and the guy that makes it actually be low risk, with long nites on load tests and I’ll be the first guy on the scene when the code has just taken a huge dump on the CEO’s new pet Jaguar. I am also sometimes the guy who tells these stories to the Varnish developers, although of course they also have other sources. The consequences of this .. lifestyle choice .. is that what code I do write is either short and to the point or .. incomplete.


I know we all love Varnish, which is why after nearly 7 years of working with this software I’d like to share with you my pet peeves about the project. There aren’t many problems with this lovely and lean piece of software but those which are there are sharp edges that pretty much everyone snubs a toe or snags their head on. Some of them are specific to a certain version, while others are “features” present in nearly all versions.

And for you Varnish devs who will surely read this, I love you all. I write this critique of the software you contribute to, knowing full well that I haven’t filed bug reports on any of these issues and therefore I too am guilty in contributing to the problem and not the solution. I aim to change that starting now :-) Also, I know that some of these issues are better lived with than fixed, the medicine being more hazardous than the disease, so take this as all good cooking; with a grain of salt.

Silent error messages in init scripts

Some genius keeps inserting 1>/dev/null 2>&1 into the startup scripts on most Linux distros. This might be in line with some wacko distro policy, but it makes conf errors, and in particular VCL errors, way harder to debug for the common man. Even worse, the `service varnish reload` script calls `varnish-vcl-reload -q`, that’s q for please-silence-my-fatal-conf-mistakes, and the best way to fix this is to *edit the init script and remove the offender*. Mind your p’s and q’s eh, it makes me sad every time, but where do I file this particular bug report?
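A sketch of why this hurts (the failing "reload" and its error text are made up for illustration; the redirection is the distro-style one quoted above):

```shell
# A stand-in for a reload that fails with a VCL error on stderr.
fail_reload() {
    echo "Message from VCC-compiler: syntax error in default.vcl" >&2
    return 1
}

# With the distro-style redirection, the diagnostic is thrown away...
silenced=$( { fail_reload 1>/dev/null 2>&1; } 2>&1 ) || true

# ...without it, the actual cause is right there.
verbose=$( fail_reload 2>&1 ) || true

echo "with redirection:    '${silenced}'"
echo "without redirection: '${verbose}'"
```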


debug.health still not adequately documented

People go YEARS using Varnish without discovering watch varnishadm debug.health. Not to mention that it’s anyone’s guess that this has to do with probes, and that there are no other debug.* parameters, except for the totally unrelated debug parameter. Perhaps this was decided to be dev-internal at some point, but the probe status is actually really useful in precisely this form. debug.health is still absent from the param.show list and the man pages, while in 4.0 some probe status and backend info has been put into varnishstat, for which I am surely not the only one to be very thankful indeed.

Bad naming

Designing a language is tricky.


Explaining why purge is now ban, and that what is now purge is something else, is mind-boggling. This issue will be fixed in 10 years, when people are no longer running varnish 2.1 anywhere. Explaining all the three-letter acronyms that start with V is just a gas.
Showing someone ban("req.url = "+ req.url) for the first time is bound to make them go “oh” like a raccoon just caught sneaking through your garbage.
Grace and Saint mode… that’s biblical, man. Understanding what it does and how to demonstrate the functionality is still for Advanced Users, explaining this to noobs is downright futile, and I am still unsure whether we wouldn’t all be better off for just enabling it by default and forgetting about it.
I suppose if you’re going to be awesome at architecting and writing software, it’s going to get in the way of coming up with really awesome names for things, and I’m actually happy that’s still the way they prioritize what gets done first.

Only for people who grok regex

Sometimes you’ll meet Varnish users who do code but just don’t grok regex. It’s weak, I know, but this language isn’t for them.

Uncertain current working directory

This is a problem on some rigs which have VCL code in stacked layers, or really anywhere where it’s more appropriate to call the VCL a Varnish program, as in “a program written for the Varnish runtime”, rather than simply a configuration for Varnish.

You’ll typically want to organize your VCL so that each VCL file is standalone, with if-wrapped rules, and they’re all included from one main VCL file, stacking all the vcl_recv’s and vcl_fetches.

Because distros don’t agree on where to put varnishd’s current working directory — it is simply wherever varnishd was launched from, instead of always chdir $(dirname $CURRENT_VCL_FILE) — you can’t reliably specify include statements with relative paths. This forces us to use hardcoded absolute paths in includes, which is neither pretty nor portable.
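In practice, that means includes end up looking like this, whether you like it or not (the file names are examples):

```vcl
# Absolute paths everywhere, because the working directory -
# and thus what a relative include resolves against - varies by distro.
include "/etc/varnish/acl.vcl";
include "/etc/varnish/site-one.vcl";
```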

Missing default director in 4.0

When translating VCL to 4.0 there is no longer any language for director definitions, which means they are done in vcl_init(), which means your default backend is no longer the director you specified at the top, which means you’ll have to rewrite some logic lest it bite you in the ass.

director.backend() has no string representation, unlike backend_hint, so you cannot do old-style name comparisons; that is, backends are first-class objects but directors are another class of objects.


VCL doesn’t allow unused backends or probes

Adding and removing backends is a routine ordeal in Varnish.
Quite often you’ll find it useful to keep backup backends around that aren’t enabled, either as manual failover backups, because you’re testing something or just because you’re doing something funky. Unfortunately, the VCC is a strict and harsh mistress on this matter: you are forced to comment out or delete unused backends :-(

Workarounds include using the backends inside some dead code or constructs like

sub vcl_recv {
	set req.backend_hint = unused;
	set req.backend_hint = default;
	...
}

It’s impossible to determine how many bugs this error message has avoided by letting you know that backend you just added, er yes that one isn’t in use sir, but you can definitely count the number of Varnish users inconvenienced by having to “comment out that backend they just temporarily removed from the request flow”.

I am sure it is wise to warn about this, but couldn’t it have been just that, a warning? Well, I guess maybe not, considering distro packaging is silencing error messages in init and reload scripts..

To be fair, this is now configurable in Varnish by setting vcc_err_unref to false, but couldn’t this be the default?

saintmode_threshold default considered harmful


If many different URLs keep returning bad data or error codes, you might conceivably want the whole backend to be declared sick instead of growing some huge list of sick URLs for this backend. What if I told you your developers just deployed an application which generates 50x error codes, triggering your saint mode for an infinite number of URLs? Well, then you have just DoSed yourself, because you hit this threshold. I usually enable saintmode only after giving my clients a big fat warning about this one, because quite frankly it easily comes straight out of left field every time. Either saintmode is off, or the threshold is Really Large™ or even ∞, and only in some special cases do you actually want this set to an actual number.

Then again, maybe it is just my clients and the wacky applications they put behind Varnish.

What is graceful about the saint in V4?

While we are on the subject, grace mode being the most often misunderstood feature of Varnish, the thing has changed so radically in Varnish 4 that it is no longer recognizable to users, and they often make completely reasonable but devastating mistakes trying to predict its behavior.

To be clear on what has happened: saint mode is deprecated as a core feature in V4.0, while the new architecture now allows a type of “stale-while-revalidate” logic. A saintmode vmod is slated for Varnish 4.1.

But as of 4.0, say you have a bunch of requests hitting a slow backend. They’ll all queue up while we fetch a new one, right? Well yes, and then they all error out when that request times out, or if the backend fetch errors out. That sucks. So lets turn on grace mode, and get “stale-while-revalidate” and even “stale-if-error” logic, right? And send If-Modified-Since headers too, sweet as.

Now that’s gonna work when the request times out, but you might be surprised that it does not when the request errors out with 50x errors. Since beresp.saint_mode isn’t a thing anymore in V4, those error codes are actually going to knock the old object out of cache, and each request is going to break your precious stale-if-error until the backend probe declares the backend sick and your requests become grace candidates.

Ouch, you didn’t mean for it to do that, did you?


And if, gods forbid, your apphost returns 404's when some backend app is not resolving, bam, you are in a cascading hell fan fantasy.

What did you want it to do, behave sanely? A backend response always replaces another backend response for the same URL – not counting vary-headers. To get a poor mans saint mode back in Varnish 4.0, you’ll have to return (abandon) those erroneous backend responses.
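That workaround can be sketched like this (Varnish 4 VCL; whether you also want to catch your apphost's 404s is up to you):

```vcl
sub vcl_backend_response {
    # Poor man's saint mode: don't let an erroneous backend response
    # replace the (possibly graced) object already in cache.
    if (beresp.status >= 500) {
        return (abandon);
    }
}
```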

Evil grace on unloved objects

For frequently accessed URLs grace is fantastic, and will save you loads of grief, and those objects could have large grace times. However, rarely accessed URLs suffer a big penalty under grace, especially when they are dynamic and meant to be updated from the backend. If that URL is meant to be refreshed from the backend every hour, and Varnish sees many hours between each access, it’s going to serve up that many-hour-old stale object while it revalidates its cache.

(Diagram: stale while revalidate.) This diagram might help you understand what happens in the “200 OK” and “50x error” cases of graceful request flow through Varnish 4.0.

Language breaks on major versions

This is a funny one, because the first major language break I remember was the one that I caused myself. We were making security.vcl, and I was translating rules from mod_security and having trouble with it because Varnish used POSIX regexes at the time, and I was writing this really god-awful script to translate PCRE into POSIX when Kristian, who conceived of security.vcl, went to Tollef (both were working in the same department at the time) and asked, in his classical brook-no-argument kind of way, "why don’t we just support Perl regexes?".
Needless to say, (?i) spent a full 12 months afterwards cursing myself while rewriting tons of nasty client VCL code from POSIX to PCRE and fixing occasional site-devastating bugs related to case-sensitivity.

Of course, Varnish is all the better for the change, and would get no where fast if the devs were to hang on to legacy, but there is a lesson in here somewhere.


So what's a couple of sed 's/req.request/req.method/'s every now and again?
This is actually the main reason I created the VCL.BNF. For one, it got the devs thinking about the grammar itself as an actual thing (which may or may not have resulted in the cleanups that make VCL a very regular and clean language today), but my intent was to write a parser that could parse any version of VCL and spit out any other version of VCL, optionally pruning and pretty-printing of course. That is still really high on my todo list. Funny how my clients will book all my time to convert their code for days but will not spend a dime on me writing code that would basically make the conversion free and painless for everyone forever.

Indeed, most of these issues are really hard-to-predict consequences of implementation decisions, and I am unsure whether it would be possible to predict these consequences without actually getting snagged by the issues in the first place. So again: varnish devs, I love you, what are your pet peeves? Varnish users, what are your pet peeves?

Errata: vcc_err_unref has existed since Varnish 3.

June 26, 2015

Ingvar Hagelundhitch-1.0.0-beta for Fedora and EPEL

The Varnish project has a new little free software baby arriving soon: Hitch, a scalable TLS proxy. It will also be made available with support by Varnish Software as part of their Varnish Plus product.

A bit of background:

Varnish is a high-performance HTTP accelerator, widely used over the Internet. To use Varnish with HTTPS, it is often fronted by general HTTP/proxy servers like nginx or Apache, though a more specific, proxy-only, high-performance tool would be preferable. So they looked at stud.

hitch is a fork of stud. The fork is maintained by the Varnish development team, as stud appears abandoned by its creators, with no new commits since 2012, after the project was taken over by Google.

I wrapped hitch for Fedora, epel6 and epel7, and submitted the packages for Fedora and EPEL. Please test the latest builds and add feedback: https://admin.fedoraproject.org/updates/search/hitch . The default config is for a single instance of hitch.

The package has been reviewed and was recently accepted into Fedora and EPEL (bz #1235305). Update august 2015: Packages are pushed for testing. They will trickle down to stable eventually.

Note that there also exists a Fedora package of the (old) version of stud. If you use stud on Fedora and want to test hitch, the two packages can coexist and should install in parallel.

To test hitch in front of varnish, in front of apache, you may do something like this (tested on el7):

  • Install varnish, httpd and hitch
      sudo yum install httpd varnish
      sudo yum --enablerepo=epel-testing install hitch || sudo yum --enablerepo=updates-testing install hitch
    
  • Start apache
      sudo systemctl start httpd.service
    
  • Edit the Varnish config to point to the local httpd, that is, change the default backend definition in /etc/varnish/default.vcl, like this:
      backend default {
        .host = "127.0.0.1";
        .port = "80";
      }
    
  • Start varnish
      sudo systemctl start varnish.service
    
  • Add an ssl certificate to the hitch config. For a dummy certificate,
    the example.com certificate from the hitch source may be used:

      sudo wget -O /etc/pki/tls/private/default.example.com.pem http://users.linpro.no/ingvar/varnish/hitch/default.example.com.pem
    
  • Edit /etc/hitch/hitch.conf. Change the pem-file option to use that cert
      pem-file = "/etc/pki/tls/private/default.example.com.pem"
    
  • Start hitch
      sudo systemctl start hitch.service
    
  • Open your local firewall if necessary, by something like this:
      sudo firewall-cmd --zone=public --add-port=8443/tcp
    
  • Point your web browser to https://localhost:8443/ . You should be greeted with a warning about a non-official certificate. Past that, you will get the apache frontpage through varnish and hitch.

    Enjoy, and let me hear about any interesting test results.

    Ingvar

    Varnish Cache is a powerful and feature-rich front side web cache. It is also very fast, that is, fast as in on steroids, and powered by The Dark Side of the Force.

    Redpill Linpro is the market leader for professional Open Source and Free Software solutions in the Nordics, though we have customers from all over. For professional managed services, all the way from small web apps, to massive IPv4/IPv6 multi data center media hosting, and everything through container solutions, in-house, cloud, and data center, contact us at www.redpill-linpro.com.

    May 15, 2015

    Lasse KarstensenIntroducing hitch – a scalable TLS terminating proxy.

    The last couple of weeks we’ve been pretty busy making SSL/TLS support for Varnish Cache Plus 4. Now that the news is out, I can follow up with some notes here.

    The setup will be a TLS terminating proxy in front, speaking PROXY protocol to Varnish. Backend/origin support for SSL/TLS has been added, so VCP can now talk encrypted to your backends.

    On the client-facing side we are forking the abandoned TLS proxy called stud, and giving it a new name: hitch.

    hitch will live on github as a standalone open source project, and we are happy to review patches/pull requests made by the community. Here is the source code: https://github.com/varnish/hitch

    We’ve picked all the important patches from the flora of forks and merged them into a hopefully stable tool. Some of the new stuff includes: TLS 1.1, TLS 1.2, SNI, wildcard certs, and multiple listening sockets. See the CHANGES.rst file for details.

    Varnish Software will provide commercial support for it under the current Varnish Plus product package.


    April 24, 2015

    Stefan CaunterRogers needs to pay to solve its CFL problem

    So, there’s an article on the Toronto Sun website.  http://www.torontosun.com/2015/04/24/live-leiweke-updates-bmo-field-argos-talks Understandably, they won’t publish my comment, which appears below. Spam and pointless bickering is fine, apparently. Sigh. Here is my take. Rogers Communications and the Jays situation in the dome is driving the debate about BMO Field. It has nothing to do with the Argos […]

    April 22, 2015

    Tollef Fog HeenTemperature monitoring using a Beaglebone Black and 1-wire

    I’ve had a half-broken temperature monitoring setup at home for quite some time. It started out with a Atom-based NAS, a USB-serial adapter and a passive 1-wire adapter. It sometimes worked, then stopped working, then started when poked with a stick. Later, the NAS was moved under the stairs and I put a Beaglebone Black in its old place. The temperature monitoring thereafter never really worked, but I didn’t have the time to fix it. Over the last few days, I’ve managed to get it working again, of course by replacing nearly all the existing components.

    I’m using the DS18B20 sensors. They’re about USD 1 a piece on eBay (when buying small quantities) and seem to work quite well.

    My first task was to address the reliability problems: Dropouts and really poor performance. I thought the passive adapter was problematic, in particular with the wire lengths I’m using and I therefore wanted to replace it with something else. The BBB has GPIO support, and various blog posts suggested using that. However, I’m running Debian on my BBB which doesn’t have support for DTB overrides, so I needed to patch the kernel DTB. (Apparently, DTB overrides are landing upstream, but obviously not in time for Jessie.)

    I’ve never even looked at Device Tree before, but the structure was reasonably simple and with a sample override from bonebrews it was easy enough to come up with my patch. This uses pin 11 (yes, 11, not 13, read the bonebrews article for explanation on the numbering) on the P8 block. This needs to be compiled into a .dtb. I found the easiest way was just to drop the patched .dts into an unpacked kernel tree and then running make dtbs.

    Once this works, you need to compile the w1-gpio kernel module, since Debian hasn’t yet enabled that. Run make menuconfig, find it under “Device drivers”, “1-wire”, “1-wire bus master”, build it as a module. I then had to build a full kernel to get the symversions right, then build the modules. I think there is or should be an easier way to do that, but as I cross-built it on a fast AMD64 machine, I didn’t investigate too much.

    Insmod-ing w1-gpio then works, but for me, it failed to detect any sensors. Reading the data sheet, it looked like a pull-up resistor on the data line was needed. I had enabled the internal pull-up, but apparently that wasn’t enough, so I added a 4.7 kOhm resistor between pin 3 (VDD_3V3) on P9 and pin 11 (GPIO_45) on P8. With that in place, my sensors showed up in /sys/bus/w1/devices and you can read the values using cat.

    In my case, I wanted the data to go into collectd and then to graphite. I first tried using an Exec plugin, but never got it to work properly. Using a python plugin worked much better and my graphite installation is now showing me temperatures.
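The kernel exposes each sensor as a w1_slave file containing a CRC line and a t=<millidegrees> line. A parsing helper like the sketch below is roughly what such a collectd Python plugin needs; this is my own illustrative code, not the plugin used in the post (a real plugin must also register a read callback with collectd):

```python
def parse_w1_slave(text):
    """Parse /sys/bus/w1/devices/28-*/w1_slave output; return Celsius, or None on bad CRC."""
    lines = text.strip().splitlines()
    if not lines[0].endswith("YES"):      # first line ends in YES when the CRC check passed
        return None
    _, _, millideg = lines[1].partition("t=")
    return int(millideg) / 1000.0

sample = (
    "72 01 4b 46 7f ff 0e 10 57 : crc=57 YES\n"
    "72 01 4b 46 7f ff 0e 10 57 t=23125\n"
)
print(parse_w1_slave(sample))  # 23.125
```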

    Now I just need to add more probes around the house.

    The most useful references were

    In addition, various searches for DS18B20 pinout and similar, of course.

    April 13, 2015

    Stefan CaunterThe unconditional interest of Leafs fans

    Leafs tickets are seen as an investment to hold, not a conditional payment on success.The Leaf team is an incredibly valuable sport property that is basically destroyed every year by the media that keep them incredibly valuable. How? The players are given exalted status based on next to nothing on an achievement scale. People get […]

    March 05, 2015

    Ingvar Hagelundvarnish-4.0.3 for Fedora and EPEL

    varnish-4.0.3 was released recently. I have wrapped packages for Fedora and EPEL, and requested updates for epel7, f21 and f22. They will trickle down as stable updates within some days. I have also built packages for el6, and after some small patching, even for el5. These builds are based on the Fedora package, but should be only cosmetically different from the el6 and el7 packages available from http://varnish-cache.org/.

    Also note that Red Hat finally caught up, and imported the necessary selinux-policy changes for Varnish from fedora into el7. With selinux-policy-3.13.1-23.el7, Varnish starts fine in enforcing mode. See RHBA-2015-0458.

    My builds for el5 and el6 are available here: http://users.linpro.no/ingvar/varnish/4.0.3/. Note that they need other packages from EPEL to work.

    Update 1: I also provide an selinux module for those running varnish-4.0 on el6. It should work for all versions of varnish-4.0, including mine and the ones from varnish-cache.org.

    Update 2: Updated builds with a patch for bugzilla ticket 1200034 are pushed for testing in f21, f22 and epel7. el5 and el6 builds are available on link above.

    Enjoy.

    Ingvar


    January 19, 2015

    Lasse KarstensenPROXY protocol in Varnish

    Dag has been working on implementing support for HAProxy’s PROXY protocol[1] in Varnish. This protocol adds a small header on each incoming TCP connection that describes who the real client is, added by (for example) an SSL-terminating process (since the source IP Varnish sees is that of the terminating proxy).
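The human-readable v1 header is a single CRLF-terminated line that the proxy prepends before any application data. A minimal Python sketch of building and parsing such a line (my own illustrative helpers, not Dag's code):

```python
def build_proxy_v1(client_ip, client_port, proxy_ip, proxy_port):
    """Build a PROXY protocol v1 header line for TCP over IPv4."""
    return f"PROXY TCP4 {client_ip} {proxy_ip} {client_port} {proxy_port}\r\n"

def parse_proxy_v1(line):
    """Extract (real_client_ip, client_port) from a v1 header line."""
    parts = line.strip().split(" ")
    assert parts[0] == "PROXY" and parts[1] == "TCP4"
    return parts[2], int(parts[4])

hdr = build_proxy_v1("203.0.113.7", 51234, "10.0.0.1", 443)
print(parse_proxy_v1(hdr))  # ('203.0.113.7', 51234)
```

This is why the receiving side (Varnish, in this case) can log and act on the real client address even though the TCP connection comes from the terminator.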

    We’re aiming for merging this into Varnish master (so perhaps in 4.1?) when it is ready.

    The code is still somewhat unfinished, timeouts are lacking and some polishing is needed, but it works and can be played with in a development setup.

    Code can be found here: https://github.com/daghf/varnish-cache/tree/PROXY

    I think Dag is using haproxy to test it with. I’ve run it with stunnel (some connection:close issues to figure out still), and I’d love it if someone could test it with ELB, stud or other PROXY implementations.

    1: http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt


    January 08, 2015

    Ingvar Hagelundrpm packages of vmod-ipcast

    Still on varnish-3.0? Missing the ability to filter X-Forwarded-For through ACLs? Use vmod ipcast by Lasse Karstensen.

    I cleaned up and rolled an rpm package of vmod-ipcast-1.2 for varnish-3.0.6 on el6. It’s available here: http://users.linpro.no/ingvar/varnish/vmod-ipcast/.

    Note that the usage has changed a bit since the last version. You are no longer permitted to change client.ip (and that’s probably a good thing). Now it’s called like this, returning an IP address object:

    ipcast.ip("string","fallback_ip");

    If the string does not resemble an IP address, the fallback IP is returned. Note that if the fallback IP is an invalid address, varnishd will crash!

    So, if you want to filter X-Forwarded-For through an ACL, you would do something like this:

    import ipcast;
    sub vcl_recv {
       # Add some code to sanitize X-Forwarded-For above here, so it resembles one single IP address
       if ( ipcast.ip(req.http.X-Forwarded-For, "198.51.100.255") ~ someacl ) {
         # Do something special
       }
    }

    And that’s all for today.


    November 16, 2014

    Tollef Fog HeenResigning as a Debian systemd maintainer

    Apparently, people care when you, as privileged person (white, male, long-time Debian Developer) throw in the towel because the amount of crap thrown your way just becomes too much. I guess that’s good, both because it gives me a soap box for a short while, but also because if enough people talk about how poisonous the well that Debian is has become, we can fix it.

    This morning, I resigned as a member of the systemd maintainer team. I then proceeded to leave the relevant IRC channels and announced this on twitter. The responses I’ve gotten have been almost all been heartwarming. People have generally been offering hugs, saying thanks for the work put into systemd in Debian and so on. I’ve greatly appreciated those (and I’ve been getting those before I resigned too, so this isn’t just a response to that). I feel bad about leaving the rest of the team, they’re a great bunch: competent, caring, funny, wonderful people. On the other hand, at some point I had to draw a line and say “no further”.

    Debian and its various maintainer teams are a bunch of tribes (with possibly Debian itself being a supertribe). Unlike many other situations, you can be part of multiple tribes. I’m still a member of the DSA tribe for instance. Leaving pkg-systemd means leaving one of my tribes. That hurts. It hurts even more because it feels like a forced exit rather than because I’ve lost interest or been distracted by other shiny things for long enough that you don’t really feel like part of a tribe. That happened with me with debian-installer. It was my baby for a while (with a then quite small team), then a bunch of real life thing interfered and other people picked it up and ran with it and made it greater and more fantastic than before. I kinda lost touch, and while it’s still dear to me, I no longer identify as part of the debian-boot tribe.

    Now, how did I, standing stout and tall, get forced out of my tribe? I’ve been a DD for almost 14 years, I should be able to weather any storm, shouldn’t I? It turns out that no, the mountain does get worn down by the rain. It’s not a single hurtful comment here and there. There’s a constant drum about this all being some sort of conspiracy and there are sometimes flares where people wish people involved in systemd would be run over by a bus or just accusations of incompetence.

    Our code of conduct says, “assume good faith”. If you ever find yourself not doing that, step back, breathe. See if there’s a reasonable explanation for why somebody is saying something or behaving in a way that doesn’t make sense to you. It might be as simple as your native tongue being English and their being something else.

    If you do genuinely disagree with somebody (something which is entirely fine), try not to escalate, even if the stakes are high. Examples from the last year include talking about this as a war and talking about “increasingly bitter rear-guard battles”. By using and accepting this terminology, we, as a project, poison ourselves. Sam Hartman puts this better than me:

    I’m hoping that we can all take a few minutes to gain empathy for those who disagree with us. Then I’m hoping we can use that understanding to reassure them that they are valued and respected and their concerns considered even when we end up strongly disagreeing with them or valuing different things.

    I’d be lying if I said I didn’t ever feel the urge to demonise my opponents in discussions. That they’re worse, as people, than I am. However, it is imperative to never give in to this, since doing that will diminish us as humans and make the entire project poorer. Civil disagreements with reasonable discussions lead to better technical outcomes, happier humans and a healthier projects.

    October 13, 2014

    Lasse KarstensenVarnish VMOD static code analysis

    I recently went looking for something similar to pep8/pylint when writing Varnish VMODs, and ended up with OCLint.

    I can’t really speak to how good it is, but it catches the basic stuff I was interested in.

    The documentation is mostly for cmake, so I’ll give a small tutorial for automake:

  • (download+install oclint to somewhere in $PATH)
  • apt-get install bear
  • cd libvmod-xxx
  • ./autogen.sh; ./configure --prefix=/usr
  • bear make # “build ear” == bear. writes compile_commands.json
  • cd src
  • oclint libvmod-xxx.c # profit
    This will tell you about unused variables, useless parentheses, dead code and so on.


    October 03, 2014

    Lasse KarstensenAnnouncing libvmod-tcp: Adjust Varnish congestion control algorithm.

    I’ve uploaded my new TCP VMOD for Varnish 4 to github, you can find it here:
    http://github.com/lkarsten/libvmod-tcp.

    This VMOD allows you to get the estimated client socket round trip time, and then let you change the TCP connection’s congestion control algorithm if you’re so inclined.

    Research[tm][0] says that Hybla is better for long high latency links, so currently that is what it is used for.

    Here is a quick VCL example:

    if (tcp.get_estimated_rtt() > 300) {
        set req.http.x-tcp = tcp.congestion_algorithm("hybla");
    }

    One thing to note is that VCL handling happens very early in the TCP connection lifetime. We’ve only just read and acked the HTTP request, so the readings may be off; I’m analyzing this currently.
    (As I understand it, the Linux kernel keeps per-IP statistics, so for subsequent requests this should get better and better.)

    References:
    0: Esterhuizen, A., and A. E. Krzesinski. “TCP Congestion Control Comparison.” (2012).


    September 30, 2014

    Lasse KarstensenFresh Varnish packages for Debian/Ubuntu and Redhat systems

    We use continuous integration when developing Varnish Cache. This means that we run our internal test suite (varnishtest) on all commits, so we catch our mistakes earlier.

    This pipeline of build jobs sometimes ends up with binary packages of Varnish, which may be useful to people when they know they exist. They are not the easiest to find, which this blog post tries to remedy.

    Development-wise, Varnish Cache is developed in Git, with a master branch for development and a set of production branches, currently 3.0 and 4.0.

    Unreleased packages for Varnish master can be found here: https://jenkins.varnish-software.com/view/varnish-master/

    Unreleased packages of Varnish 4.0 can be found here: https://jenkins.varnish-software.com/view/varnish-4.0/

    (There is also a set of 3.0 jobs, but you should really go for 4.0 these days.)

    The latest commits in each of the production branches may contain fixes we’ve added after the last production release, but haven’t cut a formal release for yet. (For example there are some gzip fixes in the 3.0 branch awaiting a 3.0.6 release, which I really should get out soon.)

    Some jobs in the job listing just check that Varnish builds, without creating any output (or artifacts as Jenkins calls it.) This applies for any jobs with “-build-” in the name, for example varnish-4.0-build-el7-x86_64 and varnish-4.0-build-freebsd10-amd64.

    The Debian and Ubuntu packages are all built from one job currently, called varnish-VERSION-deb-debian-wheezy-amd64. Press “Expand all” under artifacts to get the full list.

    Redhat/RHEL packages are built in the different el5/el6/el7 jobs.

    The unreleased packages built for 3.0 and 4.0 are safe. This is the process used to build the officially released packages, just a step earlier in the process. The varnish-master packages are of course failing from time to time, but that is to be expected.

    The version numbers in the packages produced may be a bit strange, but that is what you get with unreleased software builds.

    I’m happy to improve this process and system if it can help you run newer versions of Varnish; comments (either here or on IRC) are appreciated.


    June 03, 2014

    Lasse KarstensenWhat happened to ban.url in Varnish 4.0?

    tl;dr: when using Varnish 4 and bans via varnishadm, instead of “ban.url EXPRESSION”, use “ban req.url ~ EXPRESSION”.

    In Varnish 3.0 we had the ban.url command in the varnishadm CLI. This was a shortcut expanding to the somewhat cryptic (but powerful) ban command. In essence, ban.url just took your expression, prefixed it with “req.url ~ ”, and fed it to ban. No magic.
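That expansion really is just string prefixing, which two lines of Python can demonstrate (a hypothetical helper for illustration; the real logic lived inside the Varnish CLI):

```python
def ban_url(expression):
    """What Varnish 3's ban.url shortcut did: wrap the expression for the ban command."""
    return "ban req.url ~ " + expression

# In Varnish 4 you type the expanded form yourself in varnishadm:
print(ban_url("^/images/"))  # ban req.url ~ ^/images/
```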

    We deprecated this in Varnish 4.0, and now everyone has to update their CMS’s cache-invalidation plugin. Hence this blog post. Perhaps it will help. Perhaps not. :-)

    Some references:


    April 09, 2014

    MacYvesBuilding vagent2 for Varnish Cache 4.0.0 beta 1 for OS X 10.9.2

    For those keen bunnies who wish to jump in and help us test Varnish Cache 4.0.0 beta 1 with varnish-agent 2, here’s how to do it on OS X 10.9.2 Mavericks.

    Prerequisites

    Homebrew dependencies

    Install the following with Homebrew

    • automake 1.14.1
    • libtool 2.4.2
    • pkg-config 0.28
    • pcre 8.34
    • libmicrohttpd 0.9.34

    Build varnish cache 4.0.0 beta 1

    1. Download and extract varnish cache 4 https://repo.varnish-cache.org/source/varnish-4.0.0-beta1.tar.gz
    2. run ./autogen.sh
    3. run ./configure
    4. make

    Build varnish-agent2 for varnish cache 4.0.0 beta 1

    1. Clone varnish-agent from repo https://github.com/varnish/vagent2
    2. Checkout the varnish-4.0-experimental branch
    3. export VARNISHAPI_CFLAGS=-I/tmp/varnish/varnish-4.0.0-beta1/include
    4. export VARNISHAPI_LIBS="-L/tmp/varnish/varnish-4.0.0-beta1/lib/libvarnishapi/.libs -lvarnishapi"
    5. run ./autogen.sh
    6. run ./configure
    7. make

    Note that running make install for Varnish Cache 4 or varnish-agent will install each of them for you.


    December 19, 2013

    Lasse KarstensenConverting a Varnish 3.0 VMOD to 4.0

    So we’re getting closer to releasing the first proper 4.0 version of Varnish Cache. One of the things we need to fix is to get all the vmod writers to make sure their vmod works with the new version.

    Here are my notes from doing just that, in the hope to make it simpler for others.

    In 4.0, you don’t need the source tree of Varnish any more. The include files will be enough, and pkg-config will find them for you.

    Make sure that /usr/lib/pkgconfig/varnishapi.pc and /usr/share/aclocal/varnish.m4 exist. If you installed Varnish in the standard path/prefix, that should work out of the box. Otherwise, you might need to add some symlinks for pkg-config and automake to find the source. (I need multiple varnishd versions when debugging customer problems, so I let them live in /opt/varnishX.Y/ on my laptop.)

    Pull/merge the new Makefile.am files from the master branch of libvmod-example.

    Header files: remove bin/varnishd/cache.h and add cache/cache.h.

    Vmod functions are now called with a vrt context as first argument. %s/struct sess \*sp/const struct vrt_ctx \*ctx/g

    The old sess struct has been split; some data is in ctx->req, and some is in ctx->req->sp. Look up what is where in cache/cache.h.

    I’ve put up the 3.0->4.0 diff for vmod_policy.c as a gist: https://gist.github.com/lkarsten/8039861

    There was a bit of trouble finding varnishtest, as src/Makefile was missing the reference entirely. I just fixed it by hand for now. Another thing for the 4.0 todo list, then.

    And finally:

    lkarsten@immer:~/work/libvmod-policy/src$ make check
    /opt/varnish/bin/varnishtest -Dvarnishd=/opt/varnish/sbin/varnishd -Dvmod_topbuild=/home/lkarsten/work/libvmod-policy tests/test01.vtc
    # top TEST tests/test01.vtc passed (1.574)

     

    I have a working Varnish 4.0 vmod. :-D


    December 11, 2013

    Lasse KarstensenDNS RBL test address for development

    If you are writing code that checks a DNS real-time blackhole list (RBL), it looks like 127.0.0.2 is the standard address that is always in the black/white -list.

    This is probably known to most sysadmins/security people and whatnot, but it wasn’t entirely trivial to find using Google.
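The query name is the client IP with its octets reversed, prepended to the list's zone, which is easy to get wrong. A small helper (my own, for illustration) shows the construction used in the dig command below:

```python
def rbl_query_name(ip, zone="dnsbl.sorbs.net"):
    """Build the DNS name to look up for an IPv4 address in an RBL zone."""
    reversed_octets = ".".join(reversed(ip.split(".")))
    return f"{reversed_octets}.{zone}"

print(rbl_query_name("127.0.0.2"))  # 2.0.0.127.dnsbl.sorbs.net
```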

    lkarsten@immer:~$ dig 2.0.0.127.dnsbl.sorbs.net @8.8.8.8
    ; <<>> DiG 9.8.4-rpz2+rl005.12-P1 <<>> 2.0.0.127.dnsbl.sorbs.net @8.8.8.8
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55083
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 10, AUTHORITY: 0, ADDITIONAL: 0
    ;; QUESTION SECTION:
    ;2.0.0.127.dnsbl.sorbs.net. IN A
    ;; ANSWER SECTION:
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.10
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.5
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.7
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.2
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.3
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.9
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.14
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.4
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.6
    2.0.0.127.dnsbl.sorbs.net. 2562 IN A 127.0.0.8
    ;; Query time: 17 msec
    ;; SERVER: 8.8.8.8#53(8.8.8.8)
    ;; WHEN: Wed Dec 11 14:12:20 2013
    ;; MSG SIZE rcvd: 203
    lkarsten@immer:~$

    It’s good to be able to actually test your code for hits as well.

    (this is for libvmod-policy, so you can deny/reject POST/PUT from spammers in Varnish)


    November 29, 2013

    Tollef Fog HeenRedirect loop with interaktiv.nsb.no (and how to fix it)

    I’m running a local unbound instance on my laptop to get working DNSSEC. It turns out that with the captive portal NSB (the Norwegian national rail company), this doesn’t work too well and you get into an endless series of redirects. Changing resolv.conf so you use the DHCP-provided resolver stops the redirect loop and you can then log in. Afterwards, you’re free to switch back to using your own local resolver.

    October 03, 2013

    Tollef Fog HeenFingerprints as lightweight authentication

    Dustin Kirkland recently wrote that “Fingerprints are usernames, not passwords”. I don’t really agree, I think fingerprints are fine for lightweight authentication. iOS at least allows you to only require a pass code after a time period has expired, so you don’t have to authenticate to the phone all the time. Replacing no authentication with weak authentication (but only for a fairly short period) will improve security over the current status, even if it’s not perfect.

    Having something similar for Linux would also be reasonable, I think. Allow authentication with a fingerprint if I’ve only been gone for lunch (or maybe just for a trip to the loo), but require password or token if I’ve been gone for longer. There’s a balance to be struck between convenience and security.

    September 25, 2013

    Lasse KarstensenVarnish and Ghost blogging software

    So there is a new shiny blogging platform out called Ghost. Looks pretty good to me.

    If you want to run it behind Varnish, you’ll soon notice it has the usual problem of setting session cookies everywhere, leading to a 0% hit rate.

    I have written a Varnish VCL configuration for filtering this in the necessary places, while keeping the admin interface working still.

    You can find it here:

    https://gist.github.com/lkarsten/6683179

    Have fun.


    September 10, 2013

    Lasse KarstensenTesting VMODs with Travis (travis-ci.org)

    Travis CI is a service where open source software can run tests automatically on commits. It hooks into github in a silky smooth way.

    If you’re developing a Varnish module (VMOD), you probably have started out with our libvmod-example package. It has all the automake magic you need, as well as some simple test cases to test your vmod. Given that you’ve written some varnishtest test cases for it (you really should), you can now get Travis to run them as well!

    I’ve put a small recipe for this into the libvmod-example package.

    Feel free to play around with it, feedback appreciated. For the travis setup bits, consult the travis getting started guide. The final result is something like this, shown for libvmod-cookie:

    https://travis-ci.org/lkarsten/libvmod-cookie


    September 09, 2013

    MacYvesVagrant, Varnish and vmods

    Our development environment has been plaguing us for a while in my product development department. From dependency hell to complex setup in operations, it has gone through the usual gauntlet of pains and complaints.

    This has changed with Vagrant. It is the single tool that gels the devs with the ops; the quintessential devops tool, if you will. Not only has Vagrant helped eliminate the “works on my machine” bugs, we also use it for automated integration tests. In addition, this one tool has made our development environment setup quick and simple for our HCI guys too.

    We do a lot of integration work with Varnish Cache, and I thought I would take this opportunity to share this simple Vagrantfile, as an example, to help get started with installing Varnish and the libvmod-digest VMOD from source.

    Note that the provisioning process is rather crude in this example. The intention here is to outline the steps required to get Varnish and VMODs installed and running via Vagrant. For production and future maintainability, do use Chef or Puppet, as they can be seamlessly integrated within the Vagrantfile.


    # -*- mode: ruby -*-
    # vi: set ft=ruby :

    VAGRANTFILE_API_VERSION = "2"

    Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

      config.vm.define :varnish do |varnish|
        varnish.vm.box = "varnish"
        varnish.vm.box_url = "http://files.vagrantup.com/precise64.box"
        $script_varnish = <<SCRIPT
    echo Installing dependencies, curl
    sudo apt-get update
    sudo apt-get install curl -y
    sudo apt-get install git -y
    curl http://repo.varnish-cache.org/debian/GPG-key.txt | sudo apt-key add -
    echo "deb http://repo.varnish-cache.org/ubuntu/ precise varnish-3.0" | sudo tee -a /etc/apt/sources.list
    echo "deb-src http://repo.varnish-cache.org/ubuntu/ precise varnish-3.0" | sudo tee -a /etc/apt/sources.list
    sudo apt-get update
    echo ==== Compiling and installing Varnish from source ====
    sudo apt-get install build-essential -y
    sudo apt-get build-dep varnish -y
    apt-get source varnish
    cd varnish-3.0.4
    ./autogen.sh
    ./configure
    make
    sudo make install
    cd ..
    echo done
    echo ==== Compiling and installing lib-digest vmod from source ===
    git clone https://github.com/varnish/libvmod-digest.git
    sudo apt-get install libmhash-dev libmhash2 -y
    cd libvmod-digest
    ./autogen.sh
    ./configure VARNISHSRC=/home/vagrant/varnish-3.0.4 VMODDIR=/usr/local/lib/varnish/vmods
    sudo make install
    cd ..
    echo ===== done ====
    echo ===== firing up varnish via commandline ====
    sudo varnishd -a :80 -T :6081 -f /vagrant/test.vcl
    touch varnish_vm
    SCRIPT

        varnish.vm.provision :shell, :inline => $script_varnish
      end

    end


    July 29, 2013

    Lasse KarstensenBuilding a Varnish VMOD on Debian

    From the tutorials department, here are some quick notes on how to install a Varnish VMOD from source.

    This is slightly complicated because Varnish demands that a VMOD must be built against the same git commit (or release) as the one that is running. This will be relaxed in future versions.

    The current setup is a standalone Varnish VM on Debian Wheezy with Varnish installed from the varnish-cache.org package archives (3.0.4-1~wheezy).

    1. Get the vmod

    lkarsten@lb1:~$ git clone https://github.com/lkarsten/libvmod-cookie.git
    Cloning into 'libvmod-cookie'...
    remote: Counting objects: 253, done.
    remote: Compressing objects: 100% (131/131), done.
    remote: Total 253 (delta 132), reused 232 (delta 112)
    Receiving objects: 100% (253/253), 49.51 KiB, done.
    Resolving deltas: 100% (132/132), done.
    lkarsten@lb1:~$

    2. Get and configure the source tree for the running Varnish

    Verify first that you have the necessary package repositories enabled:

    lkarsten@lb1:~$ grep varnish /etc/apt/sources.list
    deb http://repo.varnish-cache.org/debian/ wheezy varnish-3.0
    deb-src http://repo.varnish-cache.org/debian/ wheezy varnish-3.0
    lkarsten@lb1:~$

    After that, continue with the juicy parts:

    lkarsten@lb1:~$ apt-get source varnish 
    Reading package lists... Done 
    Building dependency tree 
    Reading state information... Done 
    NOTICE: 'varnish' packaging is maintained in the 'Git' version control system at: 
    git://git.debian.org/pkg-varnish/pkg-varnish.git 
    Need to get 2,060 kB of source archives. 
    Get:1 http://repo.varnish-cache.org/debian/ wheezy/varnish-3.0 varnish 3.0.4-1 (dsc) [2,334 B] 
    Get:2 http://repo.varnish-cache.org/debian/ wheezy/varnish-3.0 varnish 3.0.4-1 (tar) [2,044 kB] 
    Get:3 http://repo.varnish-cache.org/debian/ wheezy/varnish-3.0 varnish 3.0.4-1 (diff) [14.1 kB] 
    Fetched 2,060 kB in 0s (11.4 MB/s) 
    gpgv: keyblock resource `/home/lkarsten/.gnupg/trustedkeys.gpg': file open error 
    gpgv: Signature made Fri 14 Jun 2013 11:56:48 CEST using RSA key ID 87218D9C 
    gpgv: Can't check signature: public key not found 
    dpkg-source: warning: failed to verify signature on ./varnish_3.0.4-1.dsc 
    dpkg-source: info: extracting varnish in varnish-3.0.4 
    dpkg-source: info: unpacking varnish_3.0.4.orig.tar.gz 
    dpkg-source: info: applying varnish_3.0.4-1.diff.gz 
    lkarsten@lb1:~$
    lkarsten@lb1:~$ cd varnish-3.0.4
    lkarsten@lb1:~/varnish-3.0.4$ ./autogen.sh
    [..]
    lkarsten@lb1:~/varnish-3.0.4$ ./configure --prefix=/usr
    [..]
    lkarsten@lb1:~/varnish-3.0.4$ make

    If configure or make fails, you might need some additional packages. Run apt-get build-dep varnish and work from there. (If editline fails on you, remember to rerun configure after installing it.)

    3. Build and install the vmod

    lkarsten@lb1:~$ cd libvmod-cookie/
    lkarsten@lb1:~/libvmod-cookie$ ./autogen.sh
    + aclocal -I m4
    + libtoolize --copy --force
    libtoolize: putting auxiliary files in `.'.
    libtoolize: copying file `./ltmain.sh'
    libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
    libtoolize: copying file `m4/libtool.m4'
    libtoolize: copying file `m4/ltoptions.m4'
    libtoolize: copying file `m4/ltsugar.m4'
    libtoolize: copying file `m4/ltversion.m4'
    libtoolize: copying file `m4/lt~obsolete.m4'
    + autoheader
    + automake --add-missing --copy --foreign
    configure.ac:8: installing `./config.guess'
    configure.ac:8: installing `./config.sub'
    configure.ac:11: installing `./install-sh'
    configure.ac:11: installing `./missing'
    src/Makefile.am: installing `./depcomp'
    + autoconf
    lkarsten@lb1:~/libvmod-cookie$ 
    lkarsten@lb1:~/libvmod-cookie$ ./configure VARNISHSRC=~/varnish-3.0.4/
    [..]
    # and finally
    lkarsten@lb1:~/libvmod-cookie$ make
    [..]
    libtool: link: ( cd ".libs" && rm -f "libvmod_cookie.la" && ln -s "../libvmod_cookie.la" "libvmod_cookie.la" )
    make[2]: Leaving directory `/home/lkarsten/libvmod-cookie/src'
    make[2]: Entering directory `/home/lkarsten/libvmod-cookie'
    rst2man README.rst vmod_cookie.3
    make[2]: Leaving directory `/home/lkarsten/libvmod-cookie'
    make[1]: Leaving directory `/home/lkarsten/libvmod-cookie'
    lkarsten@lb1:~/libvmod-cookie$ 
    lkarsten@lb1:~/libvmod-cookie$ sudo make install
    [..]
    /bin/mkdir -p '/usr/local/share/man/man3'
     /usr/bin/install -c -m 644 vmod_cookie.3 '/usr/local/share/man/man3'
    make[2]: Leaving directory `/home/lkarsten/libvmod-cookie'
    make[1]: Leaving directory `/home/lkarsten/libvmod-cookie'
    lkarsten@lb1:~/libvmod-cookie$

    At this point you should have the two vmod files available for Varnish:

    lkarsten@lb1:~/libvmod-cookie$ ls -l /usr/lib/varnish/vmods/
    total 64
    -rwxr-xr-x 1 root root 966 Jul 29 11:11 libvmod_cookie.la
    -rwxr-xr-x 1 root root 41538 Jul 29 11:11 libvmod_cookie.so
    -rw-r--r-- 1 root root 16128 Jun 17 13:38 libvmod_std.so
    lkarsten@lb1:~/libvmod-cookie$

    And you are done!

    “import cookie” should now work without issue in your /etc/varnish/default.vcl.


    July 22, 2013

    Lasse KarstensenSetting client.ip in Varnish VCL with libvmod-ipcast

    I’ve written a new Varnish 3.0 VMOD called ipcast.

    It has a single function, ipcast.clientip(ipstring), which sets the internal Varnish variable client.ip to whatever IPv4/IPv6 address you give as the argument.

    You need this if you want to do ACL checks on connections made through a load balancer or SSL terminator. In those cases client.ip would be 127.0.0.1, and the real client’s IP address arrives in the X-Forwarded-For (or similar) header.
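    What clientip() makes possible can be traced in a quick Python sketch (a hypothetical helper, not part of the VMOD): pick the last entry of the X-Forwarded-For chain, since that is the one your own load balancer appended, and reject anything that does not parse as an address.

```python
import ipaddress

def client_ip_from_xff(xff):
    """Pick the last entry of an X-Forwarded-For chain.

    The last entry is the one appended by your own load balancer,
    so it is the only one you can trust; earlier entries are
    client-supplied and spoofable.
    """
    candidate = xff.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return None  # malformed header -> reject the request (HTTP 400)

# e.g. an LB appended 192.0.2.7 after a spoofed client-supplied entry:
client_ip_from_xff("203.0.113.9, 192.0.2.7")  # -> '192.0.2.7'
```

    The example VCL in this post does the same selection with regsub before handing the result to ipcast.clientip().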

    You can find it here:

    https://github.com/lkarsten/libvmod-ipcast

    Here is some example VCL to illustrate how it works. I think the regex needs some work; suggestions/pull requests are welcome.

    import ipcast;
    acl friendly_network {
        "192.0.2.0"/24;
    }
    sub vcl_recv {
        if (req.http.X-Forwarded-For !~ ",") {
            set req.http.xff = req.http.X-Forwarded-For;
        } else {
            set req.http.xff = regsub(req.http.X-Forwarded-For,
                    "^[^,]+.?.?(.*)$", "\1");
        }
    
        if (ipcast.clientip(req.http.xff) != 0) {
            error 400 "Bad request";
        }
    
        if (client.ip !~ friendly_network) {
                error 403 "Forbidden";
        }
    }

    July 18, 2013

    cd34Varnish and Node.js

    While working with a client installation, they wanted to run Varnish in front of their node.js-powered site to eliminate having node serve the static assets. Socket.io uses HTTP/1.0 and cannot be cached. Minimally these few lines can be added to their respective functions and things will work. Obviously you’ll want to set expires on […]

    June 30, 2013

    ops42Properly redirect to mobile pages

    It is just amazing how much advice and examples one can find for how to redirect to a mobile equivalent of a given HTTP address. Oversimplified, wrong and harmful advice that is. And no, I’m not talking about that 301 vs 302 bullshit. For the love of God, stop listening to those overpaid, know-nothing SEO […]

    June 27, 2013

    Tollef Fog HeenGetting rid of NSCA using Python and Chef

    NSCA is a tool used to submit passive check results to nagios. Unfortunately, an incompatibility was recently introduced between wheezy clients and old servers. Since I don’t want to upgrade my server, this caused some problems and I decided to just get rid of NSCA completely.

    The server side of NSCA is pretty trivial, it basically just adds a timestamp and a command name to the data sent by the client, then changes tabs into semicolons and stuffs all of that down Nagios’ command pipe.

    The script I came up with was:

    #! /usr/bin/python
    # -* coding: utf-8 -*-
    
    import time
    import sys
    
    # format is:
    # [TIMESTAMP] COMMAND_NAME;argument1;argument2;…;argumentN
    #
    # For passive checks, we want PROCESS_SERVICE_CHECK_RESULT with the
    # format:
    #
    # PROCESS_SERVICE_CHECK_RESULT;<host_name>;<service_description>;<return_code>;<plugin_output>
    #
    # return code is 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    #
    # Read lines from stdin with the format:
    # $HOSTNAME\t$SERVICE_NAME\t$RETURN_CODE\t$TEXT_OUTPUT
    
    if len(sys.argv) != 2:
        print "Usage: {0} HOSTNAME".format(sys.argv[0])
        sys.exit(1)
    HOSTNAME = sys.argv[1]
    
    timestamp = int(time.time())
    nagios_cmd = file("/var/lib/nagios3/rw/nagios.cmd", "w")
    for line in sys.stdin:
        (_, service, return_code, text) = line.split("\t", 3)
        nagios_cmd.write(u"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{hostname};{service};{return_code};{text}\n".format
                         (timestamp = timestamp,
                          hostname = HOSTNAME,
                          service = service,
                          return_code = return_code,
                          text = text))
    

    The reason for the hostname in the line (even though it’s overridden) is to be compatible with send_nsca’s input format.
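    The transformation is easy to check in isolation. Here is a small sketch (Python 3, while the script above is Python 2; the helper name is mine) of how one input line maps to one Nagios external command:

```python
import time

def to_nagios_command(line, hostname, timestamp=None):
    """Turn one send_nsca-style input line into a Nagios external command.

    Input:  $HOSTNAME\t$SERVICE_NAME\t$RETURN_CODE\t$TEXT_OUTPUT
    Output: [TIMESTAMP] PROCESS_SERVICE_CHECK_RESULT;host;service;rc;text
    The hostname field of the input is ignored (kept only for send_nsca
    compatibility) and replaced by the trusted, server-side one.
    """
    if timestamp is None:
        timestamp = int(time.time())
    _, service, return_code, text = line.rstrip("\n").split("\t", 3)
    return "[{0}] PROCESS_SERVICE_CHECK_RESULT;{1};{2};{3};{4}".format(
        timestamp, hostname, service, return_code, text)

to_nagios_command("ignored\tdisk\t2\tDISK CRITICAL", "web1", timestamp=1371546000)
# -> '[1371546000] PROCESS_SERVICE_CHECK_RESULT;web1;disk;2;DISK CRITICAL'
```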

    Machines submit check results over SSH using its excellent ForceCommand capabilities; the Chef template for the authorized_keys file looks like:

    <% for host in @nodes %>
    command="/usr/local/lib/nagios/nagios-passive-check-result <%= host[:hostname] %>",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ssh-rsa <%= host[:keys][:ssh][:host_rsa_public] %> <%= host[:hostname] %>
    <% end %>
    

    The actual chef recipe looks like:

    nodes = []
    search(:node, "*:*") do |n|
      # Ignore not-yet-configured nodes                                                                       
      next unless n[:hostname]
      next unless n[:nagios]
      next if n[:nagios].has_key?(:ignore)
      nodes << n
    end
    nodes.sort! { |a,b| a[:hostname] <=> b[:hostname] }
    print nodes
    
    template "/etc/ssh/userkeys/nagios" do
      source "authorized_keys.erb"
      mode 0400
      variables({
                  :nodes => nodes
                })
    end
    
    cookbook_file "/usr/local/lib/nagios/nagios-passive-check-result" do
      mode 0555
    end
    
    user "nagios" do
      action :manage
      shell "/bin/sh"
    end
    

    To submit a check, hosts do:

    printf "$HOSTNAME\t$SERVICE_NAME\t$RET\t$TEXT\n" | ssh -i /etc/ssh/ssh_host_rsa_key -o BatchMode=yes -o StrictHostKeyChecking=no -T nagios@$NAGIOS_SERVER
    

    June 18, 2013

    Tollef Fog HeenAn otter, please (or, a better notification system)

    Recently, there’s been discussions on IRC and the debian-devel mailing list about how to notify users, typically from a cron script or a system daemon needing to tell the user their hard drive is about to expire. The current way is generally “send email to root” and for some bits “pop up a notification bubble, hoping the user will see it”. Emailing me means I get far too many notifications. They’re often not actionable (apt-get update failed two days ago) and they’re not aggregated.

    I think we need a system that at its core has level and edge triggers and some way of doing flap detection. Level interrupts means “tell me if a disk is full right now”. Edge means “tell me if the checksums have changed, even if they now look ok”. Flap detection means “tell me if the nightly apt-get update fails more often than once a week”. It would be useful if it could extrapolate some notifications too, so it could tell me “your disk is going to be full in $period unless you add more space”.
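    To pin down that vocabulary, here is a toy sketch (hypothetical, not an existing tool) of the three trigger types:

```python
from collections import deque

def level_trigger(value, threshold):
    """Level: alert whenever the condition holds right now (disk is full)."""
    return value >= threshold

class EdgeTrigger:
    """Edge: alert only when the observed value changes (checksums differ)."""
    def __init__(self):
        self.last = None

    def fire(self, value):
        changed = self.last is not None and value != self.last
        self.last = value
        return changed

class FlapDetector:
    """Flap: alert when too many failures occur within a window of runs."""
    def __init__(self, window=7, max_failures=1):
        self.history = deque(maxlen=window)
        self.max_failures = max_failures

    def fire(self, ok):
        self.history.append(ok)
        return sum(1 for r in self.history if not r) > self.max_failures
```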

    The system needs to be able to take in input in a variety of formats: syslog, unstructured output from cron scripts (including their exit codes), snmp, nagios notifications, sockets and fifos and so on. Based on those inputs and any correlations it can pull out of it, it should try to reason about what’s happening on the system. If the conclusion there is “something is broken”, it should see if it’s something that it can reasonably fix by itself. If so, fix it and record it (so it can be used for notification if appropriate: I want to be told if you restart apache every two minutes). If it can’t fix it, notify the admin.

    It should also group similar messages so a single important message doesn’t drown in a million unimportant ones. Ideally, this should be cross-host aggregation. The notifications should be possible to escalate if they’re not handled within some time period.
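    The grouping part, at its simplest, is just counting duplicates before notifying; a toy sketch (hypothetical):

```python
from collections import Counter

def aggregate(messages):
    """Collapse duplicate notifications into one line with a count,
    so one important message doesn't drown in repeats."""
    counts = Counter(messages)
    return ["{0} (x{1})".format(msg, n) if n > 1 else msg
            for msg, n in counts.items()]

aggregate(["apt-get update failed"] * 3 + ["disk full on /var"])
# -> ['apt-get update failed (x3)', 'disk full on /var']
```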

    I’m not aware of such a tool. Maybe one could be rigged together by careful application of logstash, nagios, munin/ganglia/something and sentry. If anybody knows of such a tool, let me know, or if you’re working on one, also please let me know.

    March 25, 2013

    Mikko OhtamaaVarnish at the front of WordPress + Apache and Plone CMS virtual hosts

    When moving some sites to a new server, I upgraded the Varnish cache server configuration serving this setup. Here are my notes on how one can use Varnish in front of virtual hosting.

    The setup is following

    • Multiple sites are hosted on the same server. The sites are a mix of PHP and Plone sites.
    • Varnish accepts HTTP requests on port 80 and forwards each request to the corresponding backend through HTTP proxying
    • Our virtual host rules capture domain names with or without www-prefix, or with any subdomain name prefix
    • Apache runs in non-standard port localhost:81, serving PHP and WordPress. WordPress caching rules are defined in Apache <virtualhost> config file.
    • Every Plone site runs in its own port and process. Plone uses VirtualHostMonster to rewrite publicly facing site URLS. Plone caching HTTP headers are set by plone.app.caching addon.
    • We do extensive cookie sanitization for PHP (generic), WordPress and Plone. Google Analytics etc. cookies don’t bust the cache, and we can still log in to WordPress and Plone as admin
    • As a special trick, there is a cleaned-cookie debugging facility through HTTP response headers


    Don’t worry, Varnish can handle a little load

     

    Pros

    • Blazingly fast, as Varnish is
    • With Plone’s plone.app.caching, one does not need to touch configuration files but Plone caching HTTP headers can be configured through-the-web

    Cons

    • Varnish does not have Nginx- or Apache-style virtual host configuration file facilities by default, and making includes is a little bit tricky: with many virtual hosts, the default.vcl config file grows long.
    • Because WordPress cannot serve static resources as smartly as Plone, which has unique URLs for all static media revisions, you need to purge Varnish manually from the command line if you update any static media files like CSS, JS or images.

    Varnish /etc/varnish/default.vcl example for Varnish 3.0:

    #
    # This backend never responds... we get hit in the case of bad virtualhost name
    #
    backend default {
        .host = "127.0.0.1";
        .port = "55555";
    }
    
    backend myplonesite {
        .host = "127.0.0.1";
        .port = "6699";
    }
    
    #
    # Apache running on server port 81
    #
    backend apache {
        .host = "127.0.0.1";
        .port = "81";
    }
    
    #
    # Guess which site / virtualhost we are diving into.
    # Apache, Nginx or Plone directly
    #
    sub choose_backend {
    
        # WordPress site
        if (req.http.host ~ "^(.*\.)?opensourcehacker\.com(:[0-9]+)?$") {
            set req.backend = apache;
        }
    
        # Example Plone site
        if (req.http.host ~ "^(.*\.)?myplonesite\.fi(:[0-9]+)?$") {
            set req.backend = myplonesite;
    
            # Zope VirtualHostMonster
            set req.url = "/VirtualHostBase/http/" + req.http.host + ":80/Plone/VirtualHostRoot" + req.url;
    
        }
    
    }
    
    sub vcl_recv {
    
        #
        # Do Plone cookie sanitization, so cookies do not destroy cacheable anonymous pages.
    # Also, make sure we do not destroy WordPress admin and login cookies in the process
        #
        if (req.http.Cookie && !(req.url ~ "wp-(login|admin)")) {
            set req.http.Cookie = ";" + req.http.Cookie;
            set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
            set req.http.Cookie = regsuball(req.http.Cookie, ";(statusmessages|__ac|_ZopeId|__cp|php|PHP|wordpress_(.*))=", "; \1=");
            set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
            set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    
            if (req.http.Cookie == "") {
                remove req.http.Cookie;
            }
        }
    
        call choose_backend;
    
        if (req.request != "GET" &&
          req.request != "HEAD" &&
          req.request != "PUT" &&
          req.request != "POST" &&
          req.request != "TRACE" &&
          req.request != "OPTIONS" &&
          req.request != "DELETE") {
            /* Non-RFC2616 or CONNECT which is weird. */
            return (pipe);
        }
        if (req.request != "GET" && req.request != "HEAD") {
            /* We only deal with GET and HEAD by default */
            return (pass);
        }
        if (req.http.Authorization || req.http.Cookie) {
            /* Not cacheable by default */
            return (pass);
        }
        return (lookup);
    }
    
    sub vcl_fetch {
    
        /* Use to see what cookies go through our filtering code to the server */
        /* set beresp.http.X-Varnish-Cookie-Debug = "Cleaned request cookie: " + req.http.Cookie; */
    
        if (beresp.ttl <= 0s ||
            beresp.http.Set-Cookie ||
            beresp.http.Vary == "*") {
            /*
             * Mark as "Hit-For-Pass" for the next 2 minutes
             */
            set beresp.ttl = 120 s;
            return (hit_for_pass);
        }
        return (deliver);
    }
    
    #
    # Show custom helpful 500 page when the upstream does not respond
    #
    sub vcl_error {
      // Let's deliver a friendlier error page.
      // You can customize this as you wish.
      set obj.http.Content-Type = "text/html; charset=utf-8";
      synthetic {"
      <?xml version="1.0" encoding="utf-8"?>
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      <html>
        <head>
          <title>"} + obj.status + " " + obj.response + {"</title>
          <style type="text/css">
          #page {width: 400px; padding: 10px; margin: 20px auto; border: 1px solid black; background-color: #FFF;}
          p {margin-left:20px;}
          body {background-color: #DDD; margin: auto;}
          </style>
        </head>
        <body>
        <div id="page">
        <h1>Sivu ei ole saatavissa</h1>
        <p>Pahoittelemme, mutta palvelua ei ole saatavilla.</p>
        <hr />
        <h4>Debug Info:</h4>
        <pre>Status: "} + obj.status + {"
    Response: "} + obj.response + {"
    XID: "} + req.xid + {"</pre>
          </div>
        </body>
       </html>
      "};
      return(deliver);
    }
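    The cookie-filtering chain in vcl_recv above is dense, so here is the same sequence of substitutions traced in Python (a sketch using re.sub in place of regsuball; the helper name is mine), showing that only whitelisted cookies survive:

```python
import re

# The same whitelist as in the VCL's regsuball() chain
KEEP = r"(statusmessages|__ac|_ZopeId|__cp|php|PHP|wordpress_(.*))"

def sanitize_cookie(cookie):
    """Replicate vcl_recv's cookie filtering: keep only session/auth
    cookies, drop analytics noise like Google Analytics cookies."""
    c = ";" + cookie
    c = re.sub(r"; +", ";", c)                 # normalize separators
    c = re.sub(";" + KEEP + "=", r"; \1=", c)  # mark cookies we keep
    c = re.sub(r";[^ ][^;]*", "", c)           # drop everything unmarked
    c = re.sub(r"^[; ]+|[; ]+$", "", c)        # trim leftovers
    return c or None                           # empty -> remove the header

sanitize_cookie("_ga=GA1.2.3; __ac=secret; _gid=GA1.9")  # -> '__ac=secret'
```

    Analytics cookies vanish, the Plone login cookie stays, and an all-noise cookie header is removed entirely, so anonymous pages stay cacheable.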

    WordPress does not support setting HTTP response headers natively like Plone. We set them in Apache virtual host configuration file in /etc/apache2/sites-enabled:

    <VirtualHost 127.0.0.1:81>
    
        ServerName opensourcehacker.com
        ServerAlias www.opensourcehacker.com
        ServerAdmin mikko@opensourcehacker.com
    
        LogFormat       combined
        TransferLog     /var/log/apache2/opensourcehacker.com.log
    
        # Basic WordPress setup
    
        Options +Indexes FollowSymLinks +ExecCGI
    
        DocumentRoot /srv/php/opensourcehacker/wordpress
    
        <Directory /srv/php/opensourcehacker/wordpress>
            Options FollowSymlinks
            AllowOverride All
        </Directory>
    
        AddType text/css .css
        AddType application/x-httpd-php .php .php3 .php4 .php5
        AddType application/x-httpd-php-source .phps
    
        #
        # Set expires headers manually
        #
        ExpiresActive On
        ExpiresByType text/html A0
        ExpiresByType image/gif A3600
        ExpiresByType image/png A3600
        ExpiresByType image/vnd.microsoft.icon A3600
        ExpiresByType image/jpeg A3600
        ExpiresByType text/css A3600
        ExpiresByType text/javascript A3600
        ExpiresByType application/x-javascript A3600
    
    </VirtualHost>


    March 22, 2013

    Tollef Fog HeenSharing an SSH key, securely

    Update: This isn’t actually that much better than letting them access the private key, since nothing is stopping the user from running their own SSH agent, which can be run under strace. A better solution is in the works. Thanks Timo Juhani Lindfors and Bob Proulx for both pointing this out.

    At work, we have a shared SSH key between the different people manning the support queue. So far, this has just been a file in a directory where everybody could read it and people would sudo to the support user and then run SSH.

    This has bugged me a fair bit, since there was nothing stopping a person from making a copy of the key onto their laptop, except policy.

    Thanks to a tip, I got around to implementing this and figured writing up how to do it would be useful.

    First, you need a directory readable by root only; I use /var/local/support-ssh here. The other bits you need are a small sudo snippet and a profile.d script.

    My sudo snippet looks like:

    Defaults!/usr/bin/ssh-add env_keep += "SSH_AUTH_SOCK"
    %support ALL=(root)  NOPASSWD: /usr/bin/ssh-add /var/local/support-ssh/id_rsa
    

    Everybody in group support can run ssh-add as root.

    The profile.d goes in /etc/profile.d/support.sh and looks like:

    if [ -n "$(groups | grep -E "(^| )support( |$)")" ]; then
        export SSH_AUTH_ENV="$HOME/.ssh/agent-env"
        if [ -f "$SSH_AUTH_ENV" ]; then
            . "$SSH_AUTH_ENV"
        fi
        ssh-add -l >/dev/null 2>&1
        if [ $? = 2 ]; then
            mkdir -p "$HOME/.ssh"
            rm -f "$SSH_AUTH_ENV"
            ssh-agent > "$SSH_AUTH_ENV"
            . "$SSH_AUTH_ENV"
        fi
        sudo ssh-add /var/local/support-ssh/id_rsa
    fi
    

    The key is unavailable to the user in question because ssh-add is sgid and so runs with group ssh, and the process is only debuggable by root. The only thing missing is that there's no way to have the agent prompt before using a key; I would also like it to die, or at least unload keys, when the last session for a user is closed, but that doesn't seem trivial to do.

    February 15, 2013

    Kristian LyngstølThe Architecture of the Varnish Agent

    Posted on 2013-02-15

    Designing software architecture is fun.

    The Varnish Agent 2 was written as a replacement for the original Varnish Agent. They both share the same purpose: expose node-specific Varnish features to a management system. They are designed very differently, though.

    In this post I'd like to explain some choices that were made, and show you how to write your own code for the Varnish Agent 2. It's really not that hard.

    The code can be found at: https://github.com/varnish/vagent2

    Why C?

    The choice of C as a language was made fairly early. One of the main reasons is that Varnish itself is written in C, as are all the tools for Varnish. This means that by far the best-supported APIs for talking to Varnish are written in C.

    But another reason is that C is a very good language. It has become a false truth, more or less, that you "never write web apps in C". There are good reasons for this: it takes time to set things up in C, C isn't very forgiving, and perhaps most importantly, people generally suck at C.

    In the end, we chose C because it was the right tool for the job.

    Requirements

    When designing a new system, it's important to know what you're trying to achieve, and perhaps just as important to know what you're /not/ trying to achieve.

    The Varnish Agent is designed to:

    • Manage a single Varnish server.
    • Remove the need for management frontends to know the Varnish CLI language.
    • Expose log data
    • Persist configuration changes
    • Require "0" configuration of the agent itself
    • Ensure that Varnish works on boot, even if there is no management front-end present.
    • Be expandable without major re-factoring.
    • Be easy to expand

    What we did NOT want was:

    • Support for running the agent on a different machine than the Varnish server.
    • Elaborate self-management of the agent (e.g: support for users, and management of them).
    • Mechanisms that are opaque to a system administrator
    • Front-end code mixed with back-end code
    • "Sessions"

    We've achieved pretty much all of these goals.

    The heart of the agent: The module

    At the heart of the agent, there is the module. As of this writing, there are 14 modules written. The average module is 211 lines of C code (including copyright and license). The smallest module, the echo module, is 92 lines of code (the echo plugin is an example plugin with extensive self documentation). The largest modules, the vlog and vcl modules, are both 387 lines of code.

    To make modules useful, I spent most of the initial work on carving out how modules should work. This is currently how it works:

    • You define a module, say, src/modules/foobar.c
    • You write foobar_init(). This function is the only absolutely required part of the module. It will be run in the single-threaded stage of the agent.
    • You either hook into other modules (like the httpd-module), or define a start function.
    • After all plugins are initialized, the start function of each plugin is executed, if present.

    That's it.

    Since a common task is inter-operation between plugins, an IPC mechanism was needed. I threw together a simple message-passing mechanism, inspired by Varnish. This lives in src/ipc.c and include/ipc.h. The only other way to currently talk to other modules is through httpd_register() (and logger(), but that's just a macro for ipc_run()).

    If you want your foobar.c-plugin to talk to the varnish CLI, you want to go through the vadmin-plugin. This is a two-step process:

    int handle;
    
    void foobar_init(struct agent_core_t *core)
    {
        handle = ipc_register(core, "vadmin");
    }
    

    This part of the code gives you a socket to talk to the vadmin module. Actually talking to other modules in foobar_init() is not going to work, since the module isn't started yet.

    And proper etiquette is not to use a global variable, but to use the plugin structure for your plugin, present in core:

    struct foobar_priv_t {
            int vadmin;
    };
    void foobar_init(struct agent_core_t *core)
    {
            struct foobar_priv_t *priv = malloc(sizeof(struct foobar_priv_t));
            struct agent_plugin_t *plug;
            plug = plugin_find(core,"foobar");
            assert(plug);
            priv->vadmin = ipc_register(core,"vadmin");
            plug->data = (void *)priv;
            plug->start = NULL;
    }
    

    In this example, we have a private data structure for the module, which we allocate in the init function. Every plugin has a generic struct agent_plugin_t data structure already allocated for it and hooked onto the core->plugins list. This allows you to store generic data, as the core data structure is the one typically passed around.

    Note

    The varnish agent uses a lot of assert()s. This is similar to what Varnish does. It lets you, the developer, state that we assume this worked, but if it didn't you really shouldn't just continue. It's excellent for catching obscure bugs before they actually become obscure. And it's excellent for letting you know where you actually need proper error-handling code.

    Let's take a closer look at the generic struct agent_plugin_t:

    struct agent_plugin_t {
            const char *name;
            void *data;
            struct ipc_t *ipc;
            struct agent_plugin_t *next;
            pthread_t *(*start)(struct
                                agent_core_t *core, const
                                char *name);
            pthread_t *thread;
    };
    

    The name should be obvious. The void *data is left for the plugin to define. It can be ignored if your plugin doesn't need any data at all (what does it do?).

    struct ipc_t *ipc is the IPC-structure for the plugin. This tells you that all plugins have an IPC present. This is to allow you to run ipc_register() before a plugin has initialized itself. Otherwise we'd have to worry a lot more about which order modules were loaded.

    Next is *next. This is simply because the plugins are part of a linked list.

    The start() function pointer is used to define a function that will start your plugin. This function can do pretty much anything, but it has to return fairly fast. If it spawns off a thread, it's expected to return the pthread_t * data structure, as the agent will later wait for it to join. Similarly, *thread is used for the same purpose.
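    The two-phase lifecycle (single-threaded init, then start functions whose threads the core later joins) can be sketched outside C like this (purely illustrative; none of these names exist in the agent):

```python
import threading

class Plugin:
    """Toy analogue of struct agent_plugin_t: a name, optional init
    and start callbacks, and a slot for the thread start() returns."""
    def __init__(self, name, init=None, start=None):
        self.name, self.init, self.start = name, init, start
        self.thread = None  # filled in by start(), joined at shutdown

def run_agent(plugins):
    for p in plugins:          # phase 1: single-threaded init of every plugin
        if p.init:
            p.init(p)
    for p in plugins:          # phase 2: start functions may spawn threads
        if p.start:
            p.thread = p.start(p)
    for p in plugins:          # the core waits for every spawned thread
        if p.thread:
            p.thread.join()

events = []

def worker_start(plug):
    # A start function must return fast, so it spawns a thread and returns it.
    t = threading.Thread(target=lambda: events.append(plug.name + " ran"))
    t.start()
    return t

run_agent([Plugin("vping", init=lambda p: events.append("vping init"),
                  start=worker_start)])
# events == ['vping init', 'vping ran']
```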

    Using the IPC

    You've got a handle to work with, let's use it. To do that, let's look at the vping plugin, starting with init and start:

    static pthread_t *
    vping_start(struct agent_core_t *core, const char *name)
    {
            (void)name;
            pthread_t *thread = malloc(sizeof (pthread_t));
            pthread_create(thread,NULL,(*vping_run),core);
            return thread;
    }
    
    void
    vping_init(struct agent_core_t *core)
    {
            struct agent_plugin_t *plug;
            struct vping_priv_t *priv = malloc(sizeof(struct vping_priv_t));
            plug = plugin_find(core,"vping");
    
            priv->vadmin_sock = ipc_register(core,"vadmin");
            priv->logger = ipc_register(core,"logger");
            plug->data = (void *)priv;
            plug->start = vping_start;
    }
    

    vping_init() grabs a handle for the vadmin (varnish admin interface) plugin, and the logger. It also assigns vping_start() to the relevant pointer.

    vping_start() simply spawns a thread that runs vping_run.

    static void *vping_run(void *data)
    {
            struct agent_core_t *core = (struct agent_core_t *)data;
            struct agent_plugin_t *plug;
            struct vping_priv_t *ping;
            struct ipc_ret_t vret;
    
            plug = plugin_find(core,"vping");
            ping = (struct vping_priv_t *) plug->data;
    
            logger(ping->logger, "Health check starting at 30 second intervals");
            while (1) {
                    sleep(30);
                    ipc_run(ping->vadmin_sock, &vret, "ping");
                    if (vret.status != 200)
                            logger(ping->logger, "Ping failed. %d ", vret.status);
                    free(vret.answer);
    
                    ipc_run(ping->vadmin_sock, &vret, "status");
                    if (vret.status != 200 || strcmp(vret.answer,"Child in state running"))
                            logger(ping->logger, "%d %s", vret.status, vret.answer);
                    free(vret.answer);
            }
            return NULL;
    }
    

The vping module was the first module written, dating from before the varnish admin interface was itself a module. It simply pings Varnish over the admin interface.

    This also illustrates how to use the logger: Grab a handle, then use logger(handle,fmt,...), similar to how you'd use printf().

    The IPC mechanism returns data through a vret-structure. For vadmin, this is precisely how Varnish would return it.

    Warning

    ipc_run() dynamically allocates memory for ret->answer. FREE IT.

    The logger also returns a vret-like structure, but the logger() macro handles this for you.

    Hooking up to HTTP!

    Hooking up to HTTP is ridiculously easy.

    Let's look at echo, comments removed:

    struct echo_priv_t {
            int logger;
    };
    
    static unsigned int echo_reply(struct httpd_request *request, void *data)
    {
            struct echo_priv_t *echo = data;
            logger(echo->logger, "Responding to request");
            send_response(request->connection, 200, request->data, request->ndata);
            return 0;
    }
    
    void echo_init(struct agent_core_t *core)
    {
            struct echo_priv_t *priv = malloc(sizeof(struct echo_priv_t));
            struct agent_plugin_t *plug;
            plug = plugin_find(core,"echo");
            assert(plug);
            priv->logger = ipc_register(core,"logger");
            plug->data = (void *)priv;
            plug->start = NULL;
            httpd_register_url(core, "/echo", M_POST | M_PUT | M_GET, echo_reply, priv);
    }
    

This is the ENTIRE echo plugin. httpd_register_url() is the key here. It registers a URL base (/echo in this case), a set of request methods (POST, PUT and GET in this case; DELETE is also supported), a callback to execute, and some optional private data.

    The echo_reply function is now executed every time a POST, PUT or GET request is received for URLs starting with /echo.

    You can respond with send_response() as demonstrated above, or the shorthands send_response_ok(request->connection, "Things are all OK!"); and send_response_fail(request->connection, "THINGS WENT BAD");.

    Warning

Currently all HTTP requests are handled in a single thread. This means you really, really shouldn't block.

Still, make sure your handler is written with thread safety in mind: we might switch to a multi-threaded request handler in the future.

    Know your HTTP

    "REST"-interfaces are great, if implemented correctly. A short reminder:

    • GET requests are idempotent and should not cause side effects. They should be purely informational.
    • PUT requests are idempotent, but can cause side effects. Example: PUT /start can be run multiple times.
    • POST requests do not have to be idempotent, and can cause side effects. Example: POST /vcl/ will upload new copies of the VCL.
    • DELETE requests are idempotent, and can have side effects. Example: DELETE /vcl/foobar.
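
The distinctions above map directly onto how you would drive an agent-style REST interface from the shell. A hedged sketch with curl: the host, port and exact endpoints are assumptions for illustration (the /start, /vcl/ and /vcl/foobar paths mirror the examples in the list), not a documented API.

```shell
# Hypothetical curl calls mirroring the method semantics above.
# The agent URL and endpoints are assumptions, not a documented API.
agent_rest_demo() {
    local agent="http://localhost:6085"
    curl -s "$agent/status"                              # GET: informational only
    curl -s -X PUT "$agent/start"                        # PUT: idempotent, repeatable
    curl -s -X POST --data-binary @new.vcl "$agent/vcl/" # POST: uploads a new copy each time
    curl -s -X DELETE "$agent/vcl/foobar"                # DELETE: idempotent removal
}
```

The function is only defined here, not called; actually running it would require a live agent.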

    Test your code!

    Unused code is broken code. Untested code is also broken code.

    Pretty much all functionality is tested. Take a look in tests/.

    If your code is to be included in an official release, someone has to write test cases.

I also advise you to add something to html/index.html to exercise it, if that's feasible. It also tends to be quite fun.

    Getting started

    To get started, grab the code and get crackin'.

    I advise you to read include/*.h thoroughly.

    Comments

    February 11, 2013

MacYves: Dirrty Hax0r in Perl 5.12 Pid.pm for varnish-agent

Right, so I am not particularly proud of this, but again, it is more of a note to self than anything. Secondly, I am most certainly not a Perl expert. This little hack should probably never be used or put into any production Mac setup. If you know of a proper fix, please do let me know.

Lastly, this is varnish-agent and not varnish-agent2. The former is written in Perl, while the latter is written in C and uses the Varnish API directly.

    OK! Now that the disclaimer is out of the way, onward with the dodgy! So when running varnish-agent on OSX 10.8.2, one may encounter the following issue:

    Can't kill a non-numeric process ID at /Library/Perl/5.12/File/Pid.pm line 124.

The quick fix proposed here involves adding this line to the Pid.pm file at line 124.

    return undef unless $pid > 0;

So the running subroutine in Pid.pm will end up looking more like this:

    sub running {
       my $self = shift;
       my $pid  = $self->_get_pid_from_file;
       return undef unless $pid > 0;
       return   kill(0, $pid)
          ? $pid
          : undef;
    }


    February 06, 2013

Mikko Ohtamaa: Varnish shell singleliners: reload config, purge cache and test hits

Varnish is a server-side caching daemon. On our production server, Varnish listens on HTTP port 80 and serves as the production server's front-end cache; we use it mainly to serve JS, CSS and static images blazingly fast.

    This blog post is based on default Varnish Ubuntu / Debian installation using apt-get. These instructions were tested on Ubuntu 12.04 LTS and Varnish 3.0.2.

    1. Reload edited Varnish configs

Varnish caching rules live in the /etc/varnish folder. The config entry point (main file) is /etc/varnish/default.vcl. The daemon itself (ports, etc.) is configured by /etc/default/varnish.

Varnish is controlled by a utility program, varnishadm. You can use it in console mode or issue direct command evaluations (think of a shell or the MySQL client). On a default Ubuntu / Debian installation, the varnishadm command as-is is enough to control Varnish. However, on a custom setup, you might need to point it at a special console port or a secret file.

Loading a Varnish config is a two-stage process:

• Parse and load the VCL file into Varnish memory, giving it a handle you can refer to later
• Activate the config by its handle (only possible if step 1 succeeded)

Below is a one-liner shell script which generates a random handle and uses it to activate the config if the config parses successfully.

    HANDLE=varnish-cfg-$RANDOM ; \
      varnishadm vcl.load $HANDLE /etc/varnish/default.vcl && \
      varnishadm vcl.use $HANDLE
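
The one-liner above can also be wrapped in a small reusable function; this sketch swaps $RANDOM for a timestamped handle so repeated reloads never collide (it assumes varnishadm works without extra -T/-S arguments, as on a default Ubuntu / Debian install):

```shell
# Sketch of a VCL reload helper. vcl.use only runs if vcl.load
# (parse + load) succeeded, mirroring the two-stage process above.
reload_vcl() {
    local handle="varnish-cfg-$(date +%Y%m%d-%H%M%S)"
    varnishadm vcl.load "$handle" /etc/varnish/default.vcl \
        && varnishadm vcl.use "$handle" \
        && echo "activated $handle"
}
```

Only the function definition is shown; calling it requires a running varnishd.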

    2. Purging Varnish cache from command line

Another useful snippet purges the entire Varnish cache from the command line (invalidating everything):

    varnishadm "ban.url ."  # Matches all URLs

    Note: Command is purge.url in Varnish 2.x.

The cache is kept as a shared memory-mapped file in /var/lib/varnish/$INSTANCE/varnish_storage.bin. When Varnish is running, it should map 1 GB (the default) of your virtual memory to this file (as seen in ps and top).

    You could also ban by a hostname:

    varnishadm "ban req.http.host == opensourcehacker.com"
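
Ban expressions can combine several conditions, so you can invalidate just part of a host. A sketch (Varnish 3 ban syntax; the helper name and the example pattern are made up):

```shell
# Ban only URLs matching a regex on a given host; conditions are
# joined with && and req.url is matched with ~ (regular expression).
ban_host_pattern() {
    varnishadm "ban req.http.host == $1 && req.url ~ $2"
}
# Example (needs a running varnishd):
# ban_host_pattern opensourcehacker.com '\.css$'
```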

Here is a shell transcript where we observe, using the wget utility, that the ban works as intended.

    # Go to /tmp because wget leaves files around
    cd /tmp
    
    # 1st load: uncached file, one X-Varnish stamp
    wget -S http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    
    --2013-02-06 20:02:18--  http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    Resolving opensourcehacker.com (opensourcehacker.com)... 188.40.123.220
    Connecting to opensourcehacker.com (opensourcehacker.com)|188.40.123.220|:80... connected.
    HTTP request sent, awaiting response... 
      HTTP/1.1 200 OK
      Server: Apache/2.2.22 (Ubuntu)
      Last-Modified: Sun, 14 Aug 2011 22:55:01 GMT
      ETag: "2000893-108ec-4aa7f09555b40"
      Cache-Control: max-age=3600
      Expires: Wed, 06 Feb 2013 23:02:19 GMT
      Content-Type: image/jpeg
      Content-Length: 67820
      Accept-Ranges: bytes
      Date: Wed, 06 Feb 2013 22:02:19 GMT
      X-Varnish: 705602514
    
    # 2nd load: cached file, two X-Varnish stamps
    wget -S http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    --2013-02-06 20:02:21--  http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    Resolving opensourcehacker.com (opensourcehacker.com)... 188.40.123.220
    Connecting to opensourcehacker.com (opensourcehacker.com)|188.40.123.220|:80... connected.
    HTTP request sent, awaiting response... 
      HTTP/1.1 200 OK
      Server: Apache/2.2.22 (Ubuntu)
      Last-Modified: Sun, 14 Aug 2011 22:55:01 GMT
      ETag: "2000893-108ec-4aa7f09555b40"
      Cache-Control: max-age=3600
      Expires: Wed, 06 Feb 2013 23:02:19 GMT
      Content-Type: image/jpeg
      Content-Length: 67820
      Accept-Ranges: bytes
      Date: Wed, 06 Feb 2013 22:02:22 GMT
      X-Varnish: 705602515 705602514
    
    # Purge
    varnishadm "ban.url ."
    
    # It's non-cached again
    wget -S http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    --2013-02-06 20:02:34--  http://opensourcehacker.com/wp-content/uploads/2011/08/Untitled-41.jpg
    Resolving opensourcehacker.com (opensourcehacker.com)... 188.40.123.220
    Connecting to opensourcehacker.com (opensourcehacker.com)|188.40.123.220|:80... connected.
    HTTP request sent, awaiting response... 
      HTTP/1.1 200 OK
      Server: Apache/2.2.22 (Ubuntu)
      Last-Modified: Sun, 14 Aug 2011 22:55:01 GMT
      ETag: "2000893-108ec-4aa7f09555b40"
      Cache-Control: max-age=3600
      Expires: Wed, 06 Feb 2013 23:02:35 GMT
      Content-Type: image/jpeg
      Content-Length: 67820
      Accept-Ranges: bytes
      Date: Wed, 06 Feb 2013 22:02:35 GMT
      X-Varnish: 705602516

    3. Restart Varnish on Ubuntu

This forces a config reload; I am not sure whether the file storage cache gets reset.

    service varnish restart

    4. Further ideas

If someone knows where to get a Varnish VCL syntax highlighter for Sublime Text 2 (TextMate), that would make my life easier, used in combination with the SFTP plug-in.


MacYves: Building on varnish-agent2

The existing varnish-agent2 code base is pretty solid; rather beautiful, I might add. The simple words "keep it simple" have been realised through the following rules of thumb:

    – Close to 0 configuration
    – “Just works”
    – Maintainable
    – Generic
    – Stateless

    For those that are keen to get started on the varnish-agent2 code base, I hope that this document will be of some use. I have used a tiny subset of the Concurrent Object Modeling and architectural design mEThod (COMET), particularly the Requirements Modeling, Analysis Modeling and snippets of the actual Design Model.

    Lastly, this document mostly serves as a mental note for myself🙂

    Requirements Modeling

The requirement for vagent2 is to provide an extendible RESTful interface for Varnish Cache. In addition, vagent2 acts as the point of integration for Varnish Cache with other systems, e.g. an administrative or monitoring system.

The use cases are simple right now, and are depicted in the use case diagram below.

    vagent2 use cases


    Analysis Modeling

vagent2 is designed to support the full spectrum of HTTP method types. A user of vagent2 issues these HTTP requests and receives JSON data in response where applicable. Furthermore, vagent2 is built with modules in mind to address a potentially expanding feature set. Lastly, each module should be able to communicate with, and be reused by, other modules.

IPC lies at the heart of the varnish-agent2 code base, and message passing is the norm here for implementing an event-driven model. Each module follows vagent2’s plugin paradigm and comes equipped with its own private data set. The plugin therefore implements the IPC set of callback methods, such as ipc_start and ipc_run; these are assigned to the appropriate functions within each module.

For a module to be exposed as a RESTful interface, a simple httpd_register call hooks in the module’s reply method of choice and exposes it appropriately.

For any module, the basic dependencies are depicted below.

    Module breakdown

    basic module dependencies

    Static Modeling

varnish-agent2 ships with a few core modules, such as the logger, shmlog access, httpd and ipc modules. vstatus and vadmin provide access to the shmlog via the Varnish API. Note that at the time of writing, Varnish 3.0.3 was used.

    These aforementioned modules provide the building blocks for managing varnish cache. For an overview of the static models, see the class diagram below.

    Static Model

    Static overview of vagent2

    Dynamic Modeling

    Initialisation
The process of initialising a module is rather straightforward. First, add a new init method for the module in plugins.h, ensure that you call the init method in main.c, and of course allocate some memory for the new plugin in main.c too.

This new module must provide an implementation of that init method. See the diagram below, depicting vban’s initialisation process.

    vban initialisation process


Once initialised by hooking onto the struct agent_plugin structure, the new module participates in the IPC life cycle.

    plugin->start  = ipc_start;
    plugin->ipc->priv = your_private_plugin_data;
    plugin->ipc->cb = your_plugin_callback;

    plugin->start is called when your plugin starts. Note that you need to assign a start method if you want the IPC to execute your callback.

plug->ipc->priv refers to persisted data for your plugin. This can be anything; as a rule of thumb, it is a good place to hold references to other modules.

plug->ipc->cb refers to the callback method invoked when ipc_run is issued by another module.

    A sample execution path

To tie it all together, the collaboration diagram below illustrates the execution path of issuing a ban to vagent2. Note that IPC is used to reach the vadmin module.

    Issue a ban



    January 31, 2013

Kristian Lyngstøl: The Varnish Agent 2.1

    Posted on 2013-01-31

    We just released the Varnish Agent 2.1.

    (Nice when you can start a blog post with some copy/paste!)

    Two-ish weeks ago we released the first version of the new Varnish Agent, and now I have the pleasure of releasing a slightly more polished variant.

The work I've put into it over the last couple of weeks has gone towards increasing stability, resilience and fault tolerance. Some changes:

    • Fixed several memory leaks
    • Fixed JSON formatting that broke the log output
    • Add privilege separation
    • Handle varnishd restart much better
    • Handle changing of the -T option without restarting the agent
    • Log assert errors to syslog
    • Add site specific javascript

    For a complete-ish log, see the closed tickets for the 2.1 milestone on github.

    This underlines what we seek to achieve with the agent: A rock stable operational service that just works.

    If you've got any features you'd like to see in the agent, this is the time to bring them forth!

    I've already started working on 2.2 which will include a much more powerful API for the varnishlog data (see docs/LOG-API.rst in the repo), and improved HTTP handling, including authentication.

    So head over to the demo, play with it, if you break it, let me know! Try to install the packages and tell me about any part of the installation process that you feel is awkward or not quite right.

    Comments

    January 29, 2013

Tollef Fog Heen: Abusing sbuild for fun and profit

    Over the last couple of weeks, I have been working on getting binary packages for Varnish modules built. In the current version, you need to have a built, unpacked source tree to build a module against. This is being fixed in the next version, but until then, I needed to provide this in the build environment somehow.

    RPMs were surprisingly easy, since our RPM build setup is much simpler and doesn’t use mock/mach or other chroot-based tools. Just make a source RPM available and unpack + compile that.

    Debian packages on the other hand, they were not easy to get going. My first problem was to just get the Varnish source package into the chroot. I ended up making a directory in /var/lib/sbuild/build which is exposed as /build once sbuild runs. The other hard part was getting Varnish itself built. sbuild exposes two hooks that could work: a pre-build hook and a chroot-setup hook. Neither worked: Pre-build is called before the chroot is set up, so we can’t build Varnish. Chroot-setup is run before the build-dependencies are installed and it runs as the user invoking sbuild, so it can’t install packages.

Sparc32 and similar architectures use the linux32 tool to set the personality before building packages. I ended up abusing this, so I set HOME to a temporary directory where I create a .sbuildrc which sets $build_env_cmnd to a script which in turn unpacks the Varnish source, builds it and then chains to dpkg-buildpackage. Of course, the build-dependencies for modules don’t include all the build-dependencies for Varnish itself, so I have to extract those from the Varnish source package too.

    No source available at this point, mostly because it’s beyond ugly. I’ll see if I can get it cleaned up.

    January 28, 2013

Tollef Fog Heen: FOSDEM talk: systemd in Debian

    Michael Biebl and I are giving a talk on systemd in Debian at FOSDEM on Sunday morning at 10. We’ll be talking a bit about the current state in Wheezy, what our plans for Jessie are and what Debian packagers should be aware of. We would love to get input from people about what systemd in Jessie should look like, so if you have any ideas, opinions or insights, please come along. If you’re just curious, you are also of course welcome to join.

    January 22, 2013

Kristian Lyngstøl: The Varnish Agent

    Posted on 2013-01-22

    We just released the Varnish Agent 2.0.

The Varnish Agent is an HTTP REST interface to control Varnish. It also provides a proof-of-concept front-end in HTML/JavaScript. In other words: a fully functional Web UI for Varnish.

We use the agent to interface between our commercial Varnish Administration Console and Varnish. This is the first agent written in C and the first version exposing an HTTP REST interface, so while 2.0 might suggest some maturity, it might be wiser to consider it a tech preview.

    /misc/agent-2.0.png

I've been writing the agent for the last few weeks, and it's been quite fun. This is the first time I've ever written JavaScript; it was initially just an afterthought that quickly turned into something quite fun.

    Some features:

    • Mostly self documenting, so it should be simple to integrate in other environments.
    • Close to 0 configuration. In fact, it currently requires 0 configuration, but might eventually require a tiny bit for authentication.
    • "Unit tests" for most functionality.
    • Upload and download VCL. Uploaded VCL is stored to disk.
• Deploy VCL (i.e. use it). A hard link points to the most recently deployed VCL, allowing you to use it on Varnish boot.
    • Show and change parameters.
    • Stop/start Varnish, show/delete panics, show status, etc
    • Varnishstat: Retrieve varnishstat data in json format
    • Varnishlog data: Retrieve varnishlog data in json format (Historic data only atm).
    • Modularised.

    I've had a lot of fun hacking on this and I hope you will have some fun playing with it too!

    Comments

    January 19, 2013

ops42: When speed doesn’t matter

    Let’s talk about speed. Speed is important. Varnish is a synonym for speed. That far, that good. But am I really the only one who doesn’t get why it would be so important to be able to PURGE 4000 objects a second? Really, who the fuck cares? Wouldn’t it show a bigger problem if I […]

    January 17, 2013

Tollef Fog Heen: Gitano – git hosting with ACLs and other shininess

gitano is not entirely unlike the non-web, server side of github. It allows you to create and manage users and their SSH keys, groups and repositories from the command line. Repositories have ACLs associated with them. Those can be complex (“allow user X to push to master in the doc/ subtree”) or trivial (“admin can do anything”). Gitano is written by Daniel Silverstone, and I’d like to thank him both for writing it and for holding my hand as I went stumbling through my initial gitano setup.

    Getting started with Gitano can be a bit tricky, as it’s not yet packaged and fairly undocumented. Until it is packaged, it’s install from source time. You need luxio, lace, supple, clod, gall and gitano itself.

    luxio needs a make install LOCAL=1, the others will be installed to /usr/local with just make install.

    Once that is installed, create a user to hold the instance. I’ve named mine git, but you’re free to name it whatever you would like. As that user, run gitano-setup and answer the prompts. I’ll use git.example.com as the host name and john as the user I’m setting this up for.

    To create users, run ssh git@git.example.com user add john john@example.com John Doe, then add their SSH key with ssh git@git.example.com as john sshkey add workstation < /tmp/john_id_rsa.pub.

To create a repository, run ssh git@git.example.com repo create myrepo. Out of the box, this only allows the owner (typically “admin”, unless overridden) to do anything with it. To change ACLs, you’ll want to grab the refs/gitano/admin branch. This lives outside of the space git usually uses for branches, so you can’t just check it out. The easiest way to check it out is to use git-admin-clone. Run it as git-admin-clone git@git.example.com:myrepo ~/myrepo-admin and then edit in ~/myrepo-admin. Use git to add, commit and push as normal from there.

    To change ACLs for a given repo, you’ll want to edit the rules/main.lace file. A real-world example can be found in the NetSurf repository and the lace syntax might be useful. A lace file consists of four types of lines:

• Comments, which start with -- or #
    • defines, look like define name conditions
    • allows, look like allow "reason" definition [definition…]
    • denials, look like deny "reason" definition [definition…]

Rules are processed one by one from the top, and processing terminates whenever a matching allow or deny is found.

Conditions can be matches against an update, such as ref refs/heads/master to match updates to the master branch. To create groupings, you can use the anyof or allof verbs in a definition. Allows and denials are checked against all the definitions listed, and if all of them match, the appropriate action is taken.

    Pay some attention to what conditions you group together, since a basic operation (is_basic_op, aka op_read and op_write) happens before git is even involved and you don’t have a tree at that point, so rules like:

    define is_master ref refs/heads/master
    allow "Devs can push" op_is_basic is_master
    

    simply won’t work. You’ll want to use a group and check on that for basic operations and then have a separate rule to restrict refs.

    January 09, 2013

MacYves: Building Varnish Cache 3.0.3 from source in Mountain Lion, OSX 10.8.2

    If you want to install Varnish Cache in OSX, I highly recommend using Homebrew’s recipe for installing Varnish. Easy!

    But if you want to build and run Varnish Cache from github, here is a little step-by-step checklist for you.

    A shopping list of applications you’ll need for your Mountain Lion:

    • Homebrew 0.9.3
    • Xcode 4.5.2, via Appstore
    • Command Line Tool for Mountain Lion – November 1st 2012

    You’ll need the following dependencies via Homebrew

• automake; note that this has to be version 1.12. I have tested 1.13, and you will run into the obsolete AM_CONFIG_HEADER macro. 
    • libtool
    • pcre
    • pkg-config

    You’ll need docutils for rst2man

    Run the following steps:

    1. brew install the above dependencies
    2. for automake, you will need to switch it to 1.12. See below
    3. install docutils
    4. git checkout 3.0.3 branch of Varnish Cache
    5. run ./autogen.sh
    6. run ./configure
    7. make install
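
The numbered steps might look like this as one shell session (a sketch; the repository URL is an assumption, adjust to wherever you cloned Varnish Cache):

```shell
# Clone and build Varnish Cache 3.0.3; assumes the Homebrew
# dependencies and docutils listed above are already installed.
build_varnish_303() {
    git clone https://github.com/varnish/Varnish-Cache.git &&
    cd Varnish-Cache &&
    git checkout 3.0.3 &&
    ./autogen.sh &&
    ./configure &&
    make install
}
```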

    To install another version of automake with Homebrew:

    1. Unlink the existing 1.13 version – brew unlink automake
    2. Get the list of versions available for automake – brew versions automake
    3. Copy the 1.12 information from git checkout…, for example git checkout 1e5eb62 /usr/local/Library/Formula/automake.rb
    4. cd into /usr/local/, or where ever else you have Homebrew storing your formula
    5. Paste the copied git information and this will check out the appropriate version of automake for you
    6. Install 1.12 – brew install automake
7. Voila! You now have version 1.12, and you can switch between 1.13 and 1.12 by simply running brew switch automake 1.13
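
Put together, the pinning procedure looks roughly like this (a sketch; it assumes the default /usr/local Homebrew prefix and reuses the 1e5eb62 commit quoted above, which you should verify against your own brew versions automake output):

```shell
# Pin Homebrew's automake formula at 1.12 next to 1.13.
pin_automake_112() {
    brew unlink automake &&
    cd /usr/local &&
    git checkout 1e5eb62 Library/Formula/automake.rb &&
    brew install automake &&
    brew switch automake 1.12
}
```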

    November 13, 2012

ops42: Varnish is smarter than you

    I just realized that a week without offending your intelligence is no good week by my standards. So here goes. Did you know that Varnish works out of the box? Provided, of course, you specify a backend to use. In a perfect world your backend would actually respond with a proper Cache-Control Header and Varnish […]

    November 06, 2012

ops42: Neuter Varnish’s keep-alive connections to skip that pesky session replication

    Varnish is all about speed. You know that. I know that. Heck, everybody knows that. That’s what a catchphrase is for. To that end, Varnish will keep backend connections open using keep-alive to shave off that precious milliseconds it would waste opening new ones all the time. In a world without your average application server […]

    November 05, 2012

ops42: How to rid the interwebs of millions of posts about Varnish Purging

    If I had to complain about something – and let’s face it, that’s what I do – I’d have to say it’s the millions of posts explaining the technique to purge something in Varnish. If it’s explained in the docs then spare me the details about how it’s supposedly done. Either stay quiet or link […]

    October 29, 2012

ops42: Varnish needs to compensate for the shortcomings of oversimplified Apache configurations

    I am in the fortunate situation that, for the most part, I can trust my backend responses regarding the Cache-Control headers. This spares me the tedious work of setting a proper ttl and cleaning up the response headers. Let’s face it, the cache layer cannot (and should not) always know how long a certain response […]

ops42: Varnish the gatekeeper, Rule #1

    I know, it just seems too obvious to be worth mentioning, but if I have learned anything from scanning through access log files from a website with 3-digit million requests a day, it is that there is nothing that won’t be thrown your way. So better be safe than sorry and make sure only valid […]

ops42: One Varnish configuration to rule them all

    As part of my day job I am “the Varnish guy” and thus have to keep track of quite some servers for the development, QA, staging and production environment. Of course there are technical differences, yet I like to have “the one” Varnish configuration for all of them. For Varnish this means different backend and […]