Planet Varnish

May 02, 2012

Per Buer: Sneak peek at the administration console 2.0

We're just about to release the 2.0 version of the Varnish Administration Console. Below is a very short screencast showing you some of the management features.

<iframe allowfullscreen="" frameborder="0" height="480" src="http://www.youtube.com/embed/u5DjFZdgsUM" width="640"></iframe>

Watch this on youtube.

April 25, 2012

Kristian Lyngstøl: Announcing Spew

Posted on 2012-04-25

I just pushed my HTTP request spewer, spew, to github.

http://github.com/varnish/spew

It's Linux-specific, since it uses epoll, and the http.c-code is still nasty, but it's also fast.

A reminder of what it can do:

http://kly.no/misc/no_stress_2.png

The feature list contains:

  • Spew opens N connections, sends M requests over each connection and then re-opens the connection and repeats.
  • Fast request generation
  • "Keep-Alive support" (sort of!)
  • No analysis at all.
  • Native IPv6 support
  • Configurable, both on the command line and config file
  • It survives if you bring your HTTP server down for a moment. It might need some time to catch up, though.

Most of the boilerplate code is actually from an old defunct project I abandoned in 2009. All the stuff that deals with options and config files and debug messages and whatnot. The only thing I've done recently is src/http.c and integration.

Also: I know the code is still horrible. Patches are welcome, as are requests, (constructive!) comments, etc.

April 23, 2012

Kristian Lyngstøl: No Stress

Posted on 2012-04-23

After my last post about testing Varnish (http://kly.no/posts/2012_04_19_Testing_Varnish.html), and a few years of frustration, I decided to take a look at what is actually possible.

So this is an example:

http://kly.no/misc/no_stress.png

What you are seeing is Varnish doing 183k req/s on my home machine. The important thing, however, is that the tool generating this load, cleverly called a.out, is running at 22% CPU load, and it's the single-threaded result of one day of dirty hacking. Compare this to httperf which is hard to get over 100k req/s, or siege which kills itself at about 15k req/s on the same machine.

This being a prototype, the code will never see the light of day. Trust me when I say it's horrible - that's what you get from a day of fiddling. However, it has demonstrated to me what's possible, and I might re-start this project now that I have an idea of what I want to do.

As for how a.out works? It's connection-oriented, so it maintains N open connections at any given time and spews M requests over each connection in rather large (configurable) bursts. It also manages to NOT die if you stop the server for a while (httperf doesn't like this and siege doesn't really need any help to murder itself). It collects roughly 0 statistics and it does not really care about response.
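The connection-oriented idea can be sketched in a few lines of Python. This is not the a.out code (which is not published); the function names, the host and the burst size below are made up for illustration. It only shows how M requests for one connection could be split into bursts, and what a burst of pipelined requests looks like on the wire.

```python
def plan_bursts(requests_per_conn, burst_size):
    """Split M requests into bursts of at most burst_size requests."""
    if requests_per_conn < 0 or burst_size < 1:
        raise ValueError("need requests_per_conn >= 0 and burst_size >= 1")
    full, rest = divmod(requests_per_conn, burst_size)
    return [burst_size] * full + ([rest] if rest else [])


def build_burst(host, path, count):
    """Return `count` pipelined HTTP/1.1 GET requests as one byte string."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")
    # Writing all requests in one go is what makes a burst cheap to send.
    return request * count


if __name__ == "__main__":
    # Hypothetical numbers: 50 requests per connection, 20 per burst.
    for size in plan_bursts(50, 20):
        payload = build_burst("www.example.com", "/misc/dummy.png", size)
        print(size, "requests,", len(payload), "bytes")
```

A real generator would keep N sockets open and write one such payload to each of them in turn; that part is left out here.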

So no, it's not very good. But it's fast, and that was what I set out to achieve.

Update: So this tool is a bit more powerful. I've been able to do 280-290k req/s with a single process (and thread). This is the same machine I did the 275k req/s record with using httperf, but that required two extra machines to generate traffic.... Will be interesting to try booting those tomorrow.

http://kly.no/misc/no_stress_2.png

April 19, 2012

Kristian Lyngstøl: Testing Varnish

Posted on 2012-04-19

  • How do I benchmark Varnish?
  • How do I make sure Varnish is ready for production?
  • How do I test Varnish?

These are questions I see people ask frequently, and there is no simple answer. In this blog post, I'll go through some of what you could do to test a Varnish-site.

What this blog post is not

This is not about benchmarking. I don't do benchmarking, never have. Why not? Because it's exceedingly hard and very few people succeed at proper benchmarking.

Neither is this a blog post about testing functionality on your site. You should be doing that already. I'll only say that you should test functionality, and it's often best done by browsing the site.

Also, don't expect it to be all that complete. Ask questions in comments and I might expand upon it!

What to test

Despite what most people ask about, the tools you choose are not nearly as important as what you want to test.

If you are hosting videos, I doubt testing request/second is a sensible metric. There are a few things you need to ask yourself:

  • What is going to be the bounding factor of my site? Bandwidth? Disk I/O? CPU? Memory? Something else?
  • How much of my site is it possible to cache?
  • Is there any way to tell ahead of time?

These questions are important and they relate.

If your site is already in production under some different architecture, you are in luck. Your access logs can tell you a lot about the traffic pattern you can expect. This is a great start.
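One rough way to get a first picture from an existing access log is to count requests and bytes per URL. The sketch below assumes the common/combined log format (request line in quotes, followed by status and size); the function name and sample lines are invented for the example.

```python
import re
from collections import Counter

# Matches the quoted request plus status and size in a common/combined
# log format line, e.g.: "GET /index.html HTTP/1.1" 200 1234
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def summarize(lines):
    """Count requests and bytes per URL; lines that do not match are skipped."""
    hits = Counter()
    sent = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue
        url = match.group("url")
        size = match.group("size")
        hits[url] += 1
        sent[url] += 0 if size == "-" else int(size)
    return hits, sent


if __name__ == "__main__":
    sample = [
        '127.0.0.1 - - [19/Apr/2012:10:00:00 +0200] "GET / HTTP/1.1" 200 38915',
        '127.0.0.1 - - [19/Apr/2012:10:00:01 +0200] "GET /misc/dummy.png HTTP/1.1" 200 178',
        '127.0.0.1 - - [19/Apr/2012:10:00:02 +0200] "GET / HTTP/1.1" 200 38915',
    ]
    hits, sent = summarize(sample)
    for url, count in hits.most_common():
        print(count, sent[url], url)
```

A handful of URLs taking most of the hits usually means a lot of the traffic can be cached; a long tail of unique URLs is a hint that it will be harder.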

If this is a new site, though, it can be harder to estimate the answers. I recommend starting with "How much of my site is it possible to cache?". If your site is mostly static content, then Varnish will be able to help you a lot, assuming you set things up accordingly. If it's a site for logged in users, you have a much harder task. It's still possible to cache content with Varnish, but it's much harder. The details of how to do that are beyond the scope of this post.

As long as you can cache the majority of the content, chances are you will not be CPU bound as far as Varnish is concerned.

Getting a baseline

Testing a Varnish-site can be really fast or you can spend the next six months doing it. Let's start by getting a baseline.

I usually start out by something truly simple: Look at varnishstat and varnishlog while you use a browser to browse the site. It's important that this is not a script, because your users are likely using browsers too and you want to catch all the stuff they catch, like cookies.

To set this up, the best way is to modify /etc/hosts (or the Windows equivalent (there is one, all the viruses use it)). The reason you don't want to just add a test-domain is because your site will go on-line using a real domain, not a test-domain. A typical /etc/hosts file could look like this for me:

127.0.0.1       localhost
127.0.1.1       freud.kly.no freud
127.0.0.1       www.example.com example.com media.example.com
# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Even better is if this is an external server. Then make sure you block any access to other port-80 services for that site. This will ensure that you don't miss any sub-domains.

What you are looking for is cache hits, misses and hitpasses. This should reveal if the site handles cookies properly or not. You may also want to fire up a different browser.

You also want to keep a look out for:

  • Vary headers.
  • Strange or unexpected Cache-Control headers
  • Set-Cookie headers.
  • And of course: 404s or other errors.
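The header checks in the list above can be sketched as a small Python function. Which values get flagged here is my own choice for illustration, not a description of what Varnish does with them; the function name is made up.

```python
def cache_warnings(headers):
    """Return a list of warnings for one response's headers (a dict)."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    warnings = []

    if "set-cookie" in lowered:
        warnings.append("Set-Cookie present")

    # Vary is a comma-separated list of request header names.
    for field in lowered.get("vary", "").split(","):
        field = field.strip().lower()
        if field in ("cookie", "user-agent", "*"):
            warnings.append(f"Vary on {field}")

    # Cache-Control is a comma-separated list of directives.
    suspicious = {"private", "no-cache", "no-store", "max-age=0"}
    for directive in lowered.get("cache-control", "").split(","):
        directive = directive.strip().lower()
        if directive in suspicious:
            warnings.append(f"Cache-Control contains {directive}")

    return warnings


if __name__ == "__main__":
    print(cache_warnings({
        "Set-Cookie": "sid=1",
        "Vary": "Accept-Encoding, Cookie",
        "Cache-Control": "private, max-age=0",
    }))
```

An empty list does not prove the response is cacheable; it only means none of these particular warning signs showed up.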

Once you've got this nailed down, and if you doubt the speed of Varnish, you can always throw in wget too:

wget --delete-after -p -r www.example.com

This will give you a recursive request for www.example.com with all prerequisites (CSS, images, etc). It's not very useful in itself but it will give you a feel for how fast the site is without Varnish and then after you've cached it with Varnish. You can easily run multiple wget-commands in parallel to gauge the bandwidth usage:

while sleep 1; do wget --delete-after -p -r www.example.com; done &
while sleep 1; do wget --delete-after -p -r www.example.com; done &
while sleep 1; do wget --delete-after -p -r www.example.com; done &

Ideally this should be network-bound, but realistically speaking, wget is not /that/ fast when it comes to tiny requests.

Warning

Keep in mind that you are likely going to be hitting a DNS server frequently, especially if you don't use /etc/hosts. I've had DNS servers running at 50-70% CPU when I've done stress testing in the past, which means the DNS server is affecting the test more than you want it to.

So far none of these tricks have been very fancy.

Bringing out the big guns

So you won't reach 275 kreq/s using wget. I'm not sure that should be a goal either, but it's worthwhile taking a look at.

If you are moving on to testing just Varnish, not the site itself, then it's time to move away from browsers and wget. There are several tools available for this, and I tend to prefer httperf. It's not a good tool by any sensible measure, but it's a fast one. The best way to learn httperf is to stick all the arguments into a text file and set up a shell script that randomly picks them until you find something that works. The manual pages are unhelpful at best.

An alternative to httperf is siege. I'm sure siege is great, if you don't mind that it'll run into a wall and kill itself long before your web server. If you want further proof, take a look at this part of siegerc, documenting Keep-Alive:

# Connection directive. Options "close" and "keep-alive"
# Starting with release 2.57b3, siege implements persistent.
# connections in accordance to RFC 2068 using both chunked
# encoding and content-length directives to determine the.
# page size. To run siege with persistent connections set
# the connection directive to keep-alive. (Default close)
# CAUTION: use the keep-alive directive with care.
# DOUBLE CAUTION: this directive does not work well on HPUX
# TRIPLE CAUTION: don't use keep-alives until further notice
# ex: connection = close
#     connection = keep-alive
#
connection = close

A stress testing tool that doesn't support keep-alive properly isn't very helpful. Whenever I use siege, it tends to max out at about 5000-10000 requests/second.

There's also Apache Bench, commonly known as just ab. I've rarely used it, but what little use I've seen from it has not been impressive. It supports KeepAlive, but my brief look at it showed no way to control the KeepAlive-ness. From basic tests of it, it also seemed slightly slower than httperf. It does seem better today than it was the first time I looked at it, though. For this blog post, I'll use httperf simply because it's the tool I'm most familiar with and which has given me the right combination of control and performance.

However, httperf has several flaws:

  • It is single-threaded. It can do multiple concurrent requests, but only on a single thread. This can be worked around by running multiple instances.
  • The documentation, while mostly complete, does not really answer enough questions.
  • It tends to max out at 1022-ish concurrent connections due to an internal limit. This might be possible to avoid if you compile it yourself. I've never bothered.
  • Busy-loops! Beware that httperf using 100% CPU does NOT mean that it is running at full capacity.
  • No graceful slowing down if you try to hit a --rate that's too fast. It'll simply give connection errors instead.

The trick to httperf is to use --rate when you can. A typical httperf command might look like this (run on my laptop):

$ httperf --rate 2000 --num-conns=10000 \
        --num-calls 20 --burst-length 20 --server localhost \
        --port 8080 --uri /misc/dummy.png

httperf --client=0/1 --server=localhost --port=8080
        --uri=/misc/dummy.png --rate=2000 --send-buffer=4096
        --recv-buffer=16384 --num-conns=10000 --num-calls=20
        --burst-length=20

httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
         open files to FD_SETSIZE

Maximum connect burst length: 20

Total: connections 10000 requests 200000 replies 200000 test-duration 7.076 s

Connection rate: 1413.2 conn/s (0.7 ms/conn, <=266 concurrent connections)
Connection time [ms]: min 1.6 avg 70.4 max 3049.8 median 33.5 stddev 286.4
Connection time [ms]: connect 27.9
Connection length [replies/conn]: 20.000

Request rate: 28264.9 req/s (0.0 ms/req)
Request size [B]: 76.0

Reply rate [replies/s]: min 39514.8 avg 39514.8 max 39514.8 stddev 0.0 (1 samples)
Reply time [ms]: response 35.9 transfer 0.0
Reply size [B]: header 317.0 content 178.0 footer 0.0 (total 495.0)
Reply status: 1xx=0 2xx=200000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 1.33 system 5.61 (user 18.8% system 79.3% total 98.0%)
Net I/O: 15761.0 KB/s (129.1*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

Note that httperf will echo the command you ran it with back to you, with all options expanded. I took the liberty of formatting the output a bit more to make it easier to read. The options I use here are:

  • --rate 2000 - Tries to open 2000 connections per second.
  • --num-conns=10000 - Open a total of 10 000 connections for this test.
  • --num-calls=20 - Perform 20 requests per connection. (so a total of 10 000*20 requests (200 000)).
  • --burst-length 20 - Pipeline the requests. This is mainly to speed up httperf itself since it's much faster to send all 20 requests in one go than send them individually. Varnish handles it correctly anyway.
  • The rest should be self explanatory.
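The arithmetic behind these options is worth writing down once. The helper below is only a sketch (the function name is invented); it uses nothing but the three option values from the command above.

```python
def planned_load(rate, num_conns, num_calls):
    """Derive totals from httperf's --rate, --num-conns and --num-calls."""
    total_requests = num_conns * num_calls
    # Connections are opened at `rate` per second, so opening all of them
    # takes at least this long, however fast the server answers.
    min_duration = num_conns / rate
    # Request rate the test asks for if every connection keeps up.
    requested_rate = rate * num_calls
    return total_requests, min_duration, requested_rate


if __name__ == "__main__":
    total, seconds, req_per_s = planned_load(2000, 10000, 20)
    print(total, "requests, at least", seconds, "s, asking for", req_per_s, "req/s")
```

For the run above this gives 200 000 requests, a minimum of 5 seconds and 40 000 req/s asked for. The test actually took 7.076 s and opened 1413.2 conn/s, so httperf did not manage the 2000 connections per second it was told to attempt.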

The first thing you should look at in the output is Errors:. If you get errors, there's a very good chance you were too optimistic with your --rate setting. Also note that the uri matters greatly. /misc/dummy.png is just that: a dummy-png I have to test. (see for yourself at http://kly.no/misc/dummy.png). Let's try the same with the front page:

$ httperf --rate 2000 --num-conns=10000 --num-calls 20 \
        --burst-length 20 --server localhost --port 8080 --uri /

httperf --client=0/1 --server=localhost --port=8080 --uri=/
        --rate=2000 --send-buffer=4096 --recv-buffer=16384
        --num-conns=10000 --num-calls=20 --burst-length=20

httperf: warning: open file limit > FD_SETSIZE; limiting max. # of
         open files to FD_SETSIZE

Maximum connect burst length: 42

Total: connections 1738 requests 34760 replies 34760 test-duration 10.589 s

Connection rate: 164.1 conn/s (6.1 ms/conn, <=1018 concurrent connections)
Connection time [ms]: min 477.3 avg 4592.4 max 8549.5 median 5276.5 stddev 2233.3
Connection time [ms]: connect 8.7
Connection length [replies/conn]: 20.000

Request rate: 3282.8 req/s (0.3 ms/req)
Request size [B]: 62.0

Reply rate [replies/s]: min 3077.3 avg 3311.2 max 3545.1 stddev 330.8 (2 samples)
Reply time [ms]: response 3772.9 transfer 49.8
Reply size [B]: header 326.0 content 38915.0 footer 2.0 (total 39243.0)
Reply status: 1xx=0 2xx=34760 3xx=0 4xx=0 5xx=0

CPU time [s]: user 0.61 system 9.54 (user 5.8% system 90.1% total 95.9%)
Net I/O: 125998.3 KB/s (1032.2*10^6 bps)

Errors: total 8262 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 8262 addrunavail 0 ftab-full 0 other 0

Now see how the errors piled up. This is because we exceeded the performance httperf could offer. Yeah, httperf is far from perfect. Also note the bandwidth usage and CPU usage. I'm not sure if it's a coincidence that we're so close to gigabit, since this is a Varnish server running on localhost.
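If you run many tests, checking the Errors: lines by eye gets old. A minimal sketch of pulling the counters out of the output, assuming the output looks like the two Errors: lines shown above (the function name is made up):

```python
def parse_errors(output):
    """Collect the counters from httperf's 'Errors:' lines into one dict."""
    counters = {}
    for line in output.splitlines():
        if not line.startswith("Errors:"):
            continue
        fields = line.split()[1:]  # drop the "Errors:" label
        # The rest alternates between a counter name and its value.
        for name, value in zip(fields[0::2], fields[1::2]):
            counters[name] = int(value)
    return counters


SAMPLE = """\
Errors: total 8262 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 8262 addrunavail 0 ftab-full 0 other 0
"""

if __name__ == "__main__":
    errors = parse_errors(SAMPLE)
    nonzero = {k: v for k, v in errors.items() if v and k != "total"}
    print(errors["total"], nonzero)
```

For the front page run, the only non-zero counter besides the total is fd-unavail, which points at the client running out of file descriptors rather than at Varnish.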

What you also may want to look at is reply status, to check the status codes. You also want to pay attention to connection times. Let's take a look at the first example again:

Connection rate: 1413.2 conn/s (0.7 ms/conn, <=266 concurrent connections)
Connection time [ms]: min 1.6 avg 70.4 max 3049.8 median 33.5 stddev 286.4
Connection time [ms]: connect 27.9
Connection length [replies/conn]: 20.000

This tells me the average connection time was 70.4ms, with a maximum at 3049.8ms. 3 seconds is quite a long time. You may want to look at that. What I do when I debug stuff like this is make sure that I rule out the tool itself as the source of worry. There is no 100% accurate method of doing this, but given the CPU load of httperf at the time, it's reasonable to assume httperf is part of the problem here. You can experiment by slightly adjusting the --rate option to see if you're close to the breaking point of httperf.

You also want to watch varnishstat during these tests.

So what did you just learn?

Frankly very little.

Sure, this means I can run Varnish at around 30k req/s on my laptop, testing FROM my laptop too. But this is not that helpful.

What settings should I use?

Well, first of all, running 20 requests over a single connection is pointless. There's almost no browser or site out there which will cause this to happen. Depending on the site, numbers between 4 and 10 requests per connection are more realistic.

If all you want is a big number, then tons of requests over a single connection is fine. But it has nothing to do with reality.

You can get httperf to do some pretty cool things if you invest time in setting up thorough tests. It can generate URLs, for instance, if that's your thing. Or simulate sessions where it asks for one page, then three other pages over the same connection X amount of time later, etc etc. This is where the six-month testing period comes into play.

I consider it a much better practice to look at access logs you have and use something simpler to iterate the list. wget can do it, and I know several newspapers that use curl for just this purpose. It was actually curl that first showed me what happens when Varnish becomes CPU bound without having a session_linger set (this is set by default now, but for the curious, what happened was that the request rate dropped to a 20th of what it was a moment before, due to context switching).
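Iterating an access log does not take much code either. The Python sketch below is a stand-in for the wget/curl approach, not what those newspapers run; it assumes a common/combined log format, a Varnish listening on localhost:8080, and invented function names.

```python
import re
import urllib.error
import urllib.request

# Picks the path out of the quoted request line of a GET request.
GET_RE = re.compile(r'"GET (\S+) [^"]*"')


def replay_urls(log_lines, base="http://localhost:8080"):
    """Yield one URL per GET request in the log, in the original order."""
    for line in log_lines:
        match = GET_RE.search(line)
        if match:
            yield base + match.group(1)


def fetch(url, host):
    """Request one URL with the given Host header; return the HTTP status."""
    request = urllib.request.Request(url, headers={"Host": host})
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is still interesting.
        return error.code


if __name__ == "__main__":
    log = ['127.0.0.1 - - [19/Apr/2012:10:00:00 +0200] "GET / HTTP/1.1" 200 38915']
    for url in replay_urls(log):
        print(url)  # call fetch(url, "www.example.com") to actually send it
```

This is sequential and slow compared to httperf, which is fine: the point is to send the URLs real users asked for, not to produce a big number.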

Conclusion

Test your site and by all means test Varnish, but do not assume that just because httperf or some other tool gives you 80 000 requests/second that this will match real-life traffic.

Proper testing is an art and this is just a small look at some techniques I hope people find interesting.

April 18, 2012

Per Buer: Counting syscall in Varnish Cache

Way back when we did a rough count on how many syscalls Varnish uses to deliver a piece of cache content. I think Anders counted seven calls per hit, roughly based on some system counters. I just did a recount, just to see how we are doing. After all, calling the kernel is one of the more expensive things to do in a program. Here is what I found.

Edit: I think the original claim was 7 system calls + 5 locks.

The first two syscalls - the acceptor threads

read more

March 30, 2012

Per Buer: Warming up Varnish Cache with varnishreplay

Image: (c) 2011 jasonwoodhead23. Used under CC BY 2.0.

Varnish Cache 2.0 included a feature that hasn't really been used much but might be a very useful tool should you ever need it. It can be used to replay traffic from the log onto another.

read more

March 28, 2012

Kristian Lyngstøl: The Varnish Book

Posted on 2012-03-28

In 2008 Tollef Fog Heen wrote the initial slides used for our first Varnish System Administration training course. Since then, I've been the principal maintainer and author of the material. We contracted Jérôme Renard (http://39web.fr) to adapt the course for web developers in 2011 and I've spent some time pulling it all together into one coherent book after that.

Today we make the Varnish Book available on-line under a Creative Commons CC-BY-NC-SA license.

http://kly.no/Blog/varnish-book-screenie.png

See http://www.varnish-software.com/static/book/ for the sphinx-rendered HTML variant.

The All-important content!

The book contains all the content we use for both the system administration course and the web developer course. While each of those courses omits some chapters (Sysadmin omits HTTP and Content Composition while Webdev omits the Tuning and Saving a Request chapters), the content is structured with this in mind.

The book does not teach you everything about Varnish.

If you read the book and perform the exercises (without cheating!) you will:

  • Understand basic Varnish configuration and administration
  • Know when modifying parameters is sensible
  • Have extensive understanding of VCL as a state engine and language
  • Understand how caching in HTTP affects Varnish, intermediary caches and browser caches
  • Know how to use Varnish to recover from or prevent error-situations
  • Have in-depth knowledge of the different methods of cache-invalidation that Varnish can provide
  • Know several good ways to approach cookies and user-specific content
  • Know of Edge Side Includes
  • Much more!

I've gradually restructured the book to be about specific tasks instead of specific features. Because of that, there are several features or techniques that are intentionally left out or down-played to make room for more vital concepts. This has been a gradual transition, and I'm still not entirely satisfied, but I believe this approach to learning Varnish is much more effective than trying to just teach you the facts.

One of my favorite examples of this is probably the Cache Invalidation chapter. We used to cover the equivalent of purging in the VCL chapters, since it is a simple enough feature, then cover banning in a separate chapter. The problem with that mentality is that when you are setting up Varnish, you don't think "I need to write VCL". You think "I need to figure out how to invalidate my cache" or "how do I make Varnish fall back to a different web server if the first one failed".

I have learned a great deal about Varnish, about how people learn and about the web in general while holding these training courses and writing this book. I hope that by releasing it in the open, even more people will get to know Varnish.

Future Content

The book will continue to change. We at Varnish Software (https://www.varnish-software.com) will update it for new versions of Varnish and take care of general maintenance.

I hope that we will also get some significant feedback from all you people out there who will read it. We appreciate everything from general evaluation and proof reading to more in-depth discussions.

One of the more recent topics I want to cover in the book is Varnish Modules. This is still quite new, so I'm in no rush. I still haven't decided what that chapter should cover. It might be about available vmods and practical usage of them, or we might go more in depth and talk about how to start writing your own. I really don't know.

Another topic I really wish to expand upon is content composition. The material Jérôme provided for us was excellent, but I wish to go broader and also make it available in a couple of other languages than just PHP. There is some work in this area already, I just can't say much more about it yet...

You will probably also see rewrites of the first chapter and the Programs-chapter in the near future. They are both overdue for a brush-up.

In the end, though, this is a book that will continue to evolve as long as people take interest. What it covers will be defined largely based on feedback from course participants, feedback from people reading it on-line and the resources needed to implement those changes.

License

We chose a CC-BY-NC-SA license because we both want to make the material available to as many people as possible, and make sure that we don't put ourselves out of the training business by providing a ready-made course for potential competitors.

Being one of those people who actually read licenses and try to interpret their legal ramifications, I've obviously also read the CC-BY-NC-SA license we use. It is (intentionally) vague when it comes to specifying what "non-commercial" means. What I interpret it as with regards to our training material is that you can read it as much as you want regardless of whether you are at work or what commercial benefits you have from understanding the content. You can also hold courses in non-profit settings (your local LUG for instance), and some internal training will probably be a non-issue too. However, the moment you offer training for a profit to other parties, you're violating the license. You'll also be violating it if you print and sell the book for a profit. Printing and selling it to cover the cost of printing is allowed (it's one of the few things where the license actually clarifies this).

Since we are using a "NC"-license, we'll also be asking for copyright assignment and/or a contributor's agreement if you wish to contribute significantly. This is so we can use your material in commercial activities. Exactly how this will be done is not yet clarified.

One last point: If you are contributing to the documentation we keep at www.varnish-cache.org, we will not consider it a breach of license if you borrow ideas from the book. Our goal is to make sure the book interacts well with the other documentation while covering our expenses.

March 26, 2012

Per Buer: Why I don't like SPDY

 

Last week, at the user group meetup in Paris, there were quite a few discussions about SPDY. Currently Google is pushing for SPDY to turn into HTTP 2.0. I can't say I like it. There is one big problem with SPDY - SSL. SPDY mandates SSL and it causes problems.

read more

March 18, 2012

cd34: REMOTE_ADDR handling with Varnish and Load Balancers

While working with the ever present spam issue on this blog, I’ve started to have issues with many of the plugins not using the correct IP address lookup. While each plugin author can be contacted, trackbacks and comments through WordPress still have the Varnish server’s IP address. In our vcl, in vcl_recv, we put the [...]

March 15, 2012

Lasse Karstensen: Invalidation API in Varnish Administration Console

So, we’re getting somewhere.

Here is a quick example of the new ban(/purge/invalidation) distribution API we have in the development version of the Varnish Administration Console:

$ curl -X POST --user user:pw -H 'Content-Type: text/plain' -d 'req.http.url ~ "/articles/FOO"' http://vac.local/api/v1/cachegroup/production/ban
{"status":"200","text":"Ban executed successfully"}

With this you can invalidate content on your N active Varnish servers with a simple HTTP request from the backend/CMS. Super easy!
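From a backend or CMS written in Python, the same call could look roughly like the sketch below. It only reuses what the curl example shows (URL, credentials, Content-Type and ban expression); the function name is made up, and since the API belongs to a development version, treat the details as assumptions. The request is built but not sent, so the sketch runs without a VAC at hand.

```python
import base64
import urllib.request


def build_ban_request(api_url, user, password, expression):
    """Build (but do not send) the POST request for a ban expression."""
    # curl's --user sends HTTP Basic authentication by default.
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return urllib.request.Request(
        api_url,
        data=expression.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "text/plain",
            "Authorization": "Basic " + credentials.decode("ascii"),
        },
    )


request = build_ban_request(
    "http://vac.local/api/v1/cachegroup/production/ban",
    "user",
    "pw",
    'req.http.url ~ "/articles/FOO"',
)

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request) would send it.
```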


March 08, 2012

Ingvar Hagelund: varnish-3.0.2 for fedora

I finally got around to wrapping up varnish-3.0.2 for fedora 17 and rawhide. Please test and report karma.

In this release, I have merged changes from the upstream rpm, and added native systemd support for f17 and rawhide. It also builds nicely for epel5 and epel6, providing packages quite similar to those available from the varnish project repo.

As epel does not allow changes in a package API after release, varnish-3.0.2 won’t be available through epel5 or epel6, so use the varnish project repo, or my precompiled packages for epel 4, 5 and 6 available here.

As always, feedback is very welcome.

February 24, 2012

Lasse Karstensen: Mobile device detection in Varnish

As part of my DAY JOB[tm] I’ve been working on device detection with Varnish. There is some content about this on blogs here and there, but no single place to get rulesets or VCL. We want to fix that by announcing a community updated VCL set for this. Check out the Github project:

https://github.com/varnish/varnish-devicedetect

It is a VCL set for Varnish which uses regular expressions to group clients into pools like PC, mobile-iphone, mobile-android, tablet-ipad and more based on their User-Agent.

With this you can serve per-device-type content directly from the Varnish cache, without hitting your backend web servers. It is super easy to install with nothing to compile. Just pull the files and include them in default.vcl.

The backend sees the first request like http://example.com/foo.html?devicetype=mobile-android, picks out the GET argument and produces the content that is best suited for small androids. Later requests are cached in Varnish. The main points are: 1) no need to redirect, saving probably a second in page load time since mobile networks are high latency and low bandwidth, 2) increased cache hit rate, which means you don’t have to pay for as many backends, 3) you don’t have to maintain the regular expressions all by yourself.
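To make the mechanism concrete, here is a toy version of the grouping in Python. The real project is a set of VCL files with a much larger ruleset; the three regular expressions, the "pc" fallback name and the function names below are simplifications made up for this sketch.

```python
import re

# Hypothetical, deliberately tiny rule list, checked in order.
RULES = [
    ("tablet-ipad", re.compile(r"iPad", re.I)),
    ("mobile-iphone", re.compile(r"iPhone|iPod", re.I)),
    ("mobile-android", re.compile(r"Android", re.I)),
]


def classify(user_agent):
    """Return the device pool for a User-Agent string."""
    for device, pattern in RULES:
        if pattern.search(user_agent):
            return device
    return "pc"


def backend_url(url, user_agent):
    """Append the device type as a GET argument, as the backend would see it."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}devicetype={classify(user_agent)}"


if __name__ == "__main__":
    ua = "Mozilla/5.0 (Linux; U; Android 2.3.4; en-us) AppleWebKit/533.1"
    print(backend_url("http://example.com/foo.html", ua))
```

Because every client in the same pool ends up with the same device type, the cache only needs one copy of each page per pool instead of one per User-Agent string.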

In addition to the regular expression set and the supporting VCL code to send headers/GET parameter to the backend, you get a system for overriding the detection. Go to /set_ua_device/mobile-android with your usual browser, and later requests will be served as if you had an android phone. Simple as PIE.

The regular expression set will evolve over time (by us, Varnish Software, when a customer asks for it), or by community input. For example, if anyone has good suggestions on how to differentiate android tablets and android phones, your input is very welcome :)


January 28, 2012

cd34: First beta of my new webapp, SnapReplay.com

After a late night working until 2:45am on the finishing touches, my phone alerted me to the fact it was 5:35am and time for the first Live Photostream to take place. The stream? Roger Waters – The Wall, from the Burswood Dome in Perth, AUSTRALIA. Special thanks to Paul Andrews for taking the pictures and [...]

January 19, 2012

Per Buer: Where will Varnish Cache be in 2020?

The last couple of days we've been discussing the future. 2011 was a good year, both for the Varnish Cache Project and Varnish Software itself. Both have now proven themselves. During the last two months of 2011 Varnish was downloaded over 25000 times. Varnish Software has also proven itself. We now know that our services are sought after and that people find our offering valuable. Now what?

read more

January 11, 2012

Per Buer: The superbunny under a CC license

The Varnish Superbunny

Together with our graphics and HCI guru, Morten Zetlitz, we've decided to release the superbunny graphic under a CC license.

You can find a PNG and an SVG version of this image below.

 

read more

January 10, 2012

cd34: Finally, a formal release for my WordPress + Varnish + ESI plugin

A while back I wrote a plugin to take care of a particular client traffic problem. As the traffic came in very quickly and unexpectedly, I had only minutes to come up with a solution. As I knew Varnish pretty well, my initial reaction was to put the site behind Varnish. But, there’s a problem [...]

January 05, 2012

David Harrigan: Banning URLs from Varnish using Apache Camel and RabbitMQ – Part 2

Welcome Back! I hope you found Part 1 on this tutorial useful. You should by now have a running instance of Varnish cache along with a running instance of RabbitMQ. You should also have cloned the Varnish-Ban project from Bitbucket and perhaps had a look through the project structure and source code. I hope there is nothing [...]

January 04, 2012

Per Buer: The hash collision attacks

December 28, 2011

David Harrigan: Banning URLs from Varnish using Apache Camel and RabbitMQ – Part 1

Introduction Hello and Welcome! Over the course of three postings, I would like to present a tutorial on using RabbitMQ and Apache Camel to BAN (their parlance for removing) URLs (objects) held within Varnish Cache. This proposed approach allows for a complete decoupling of application logic from the caching system thus promoting greater flexibility, scalability [...]

December 13, 2011

David Harrigan: Introducing LVUG

I’m a big fan of Varnish Cache. It truly is an amazing piece of software. I’m also keen to promote its use in the town I work in and to help share ideas, experiences and solutions between others. Therefore, I decided to create a new meetup group. I’m happy to announce that The London Varnish [...]

December 01, 2011

Kristian Lyngstøl: Varnish Training

Posted on 2011-12-01

As anyone who's worked with me should realize by now, I'm big on documentation, be it source code comments or other types of documentation. The only reason I'm not more active in the documentation section of Varnish Cache is because I've maintained our (Varnish Software's) training material ever since Tollef Fog Heen wrote the initial slides in 2009.

I've held the course more times than I can remember, and usually done improvements after every course. Others have also held the course, including Redpill Linpro's Kacper Wysocki, maintainer of security.vcl (https://github.com/comotion/security.vcl) and Magnus Hagander (PostgreSQL god/swede). Feedback and gradual improvements have turned a set of slides into a pretty good course material.

We recently started holding on-line courses too. This revealed several new challenges. The obvious challenges are things like getting basic voice communication working (it sounds easy, but you'd be amazed...). It was also interesting when I held the course in my apartment on Australian time, and my ISP decided to perform maintenance on my cable modem (it was 2AM local, after all). So I had to hold the course on a 3G connection, communicating with Australia. Fun. Then there's the lack of, or severely reduced, feedback, which presents challenges in how we do exercises and generally deal with the course. In a classroom I can easily determine if the participants are able to keep up, if I'm going too slow or too fast and whether or not a subject is more interesting than another. All of that is, at best, very difficult in an on-line session.

The last few weeks I've finally gotten around to merging the sysadmin course with the web development course that Jérôme Renard has written for us. It proved the perfect opportunity to give the course another big update. While the course was already updated for Varnish 3, I've made several other Varnish 3-related additions. More important is that the flow of the course has changed from one oriented on Varnish functionality to one oriented on tasks you wish to accomplish with Varnish. Instead of teaching you about Varnish architecture first, then Varnish parameters, the course now has a chapter devoted to Tuning.

Instead of just throwing in purging or banning when talking about VCL, there's now a chapter called Cache Invalidation, which attempts to give a broader understanding of the alternatives you have and when to use which solution. Similarly, there's a chapter called Saving The Request, which starts out with the core Grace mechanisms, moves on to health checks, saint mode, req.hash_always_miss, directors and more.

There are several reasons I write about this. First of all: I'm very excited about the material. I've worked on it regularly for several years, doing everything from hacking rst2s5, tweaking the typesetting and design to updating the content, reorganizing the course and of course holding it. It may seem like one big marketing stunt, but I can promise you that I never blog about something I'm not passionate about, regardless of whether it is work-related or not.

The other reason is that I'm holding the course next week. This will be the first time we hold it using the changed structure. I would have wanted to hold it in a classroom first, but holding it on-line is still exciting.

If you wish to participate, head over to https://www.varnish-software.com/products-services/training/varnish-administration-course and convince your boss it will be awesome!

Comments

November 23, 2011

Per BuerVAC Screencast

So, on Friday we finally made it. The 1.0 release of the Varnish Admin Console (VAC) was shipped. A lot of people have been wondering what it looks like. Here is a screencast:

<iframe allowfullscreen="allowfullscreen" frameborder="0" height="353" src="http://www.youtube.com/embed/M2LznlsrYMQ" width="694"></iframe>

Watch this video on Youtube.

Comments and feedback are welcome!

November 18, 2011

Yves Hwangvac-cottontail-1.0 release

The road to a software release is filled with sweat and tears. As the product manager of the Varnish Administration Console, it is my absolute pleasure to present you the vac-cottontail-1.0 release, available to silver and above subscribers of Varnish Software. Harking back to my Aussie roots and straight from Hilltop Hoods, this release is for our Varnish people in the front, in the nose bleed section.

read more

November 16, 2011

Mikko OhtamaaVarnish, caching and HTTP cookies

These short notes on caching and HTTP cookies are based on my experience with Varnish, Plone CMS and WordPress.

Sanitizing cookies for caching

Any cookie set on the server side (session cookie) or on the client side (e.g. Google Analytics JavaScript cookies) is poison for caching the anonymous visitor content.

Common cookies across CMS systems are usually:

  • Session cookie (anonymous user session): ZopeId,  PHPSESSID
  • Logged in user cookie: __ac (Plone)
  • Active language cookie: I18N_LANGUAGE (Plone)
  • Analytics cookies (Google Analytics et al.): various ugly cookies
  • Some other status information e.g. status message: statusmessages (Plone)

HTTP caching needs to deal with both HTTP request and response cookie handling:

  • HTTP request Cookie header. A request that carries a Cookie header confuses the Varnish cache look-up. This header can also be set by JavaScript, not just by the server. The cookie can be preprocessed in vcl_recv.
  • HTTP response Set-Cookie header. This is set by the server. If your server is setting cookies, Varnish does not cache these responses by default. However, this might be desirable behavior if e.g. multilingual content is served from one URL with language cookies. Set-Cookie can be post-processed in vcl_fetch.

Example of how to remove all Plone-related cookies except the ones dealing with logged-in users (content authors):

sub vcl_recv {

  if (req.http.Cookie) {
      # (logged in user, status message - NO session storage or language cookie)
      set req.http.Cookie = ";" req.http.Cookie;
      set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
      set req.http.Cookie = regsuball(req.http.Cookie, ";(statusmessages|__ac)=", "; \1=");
      set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

      if (req.http.Cookie == "") {
          remove req.http.Cookie;
      }
  }
  ...

# Let's not remove Set-Cookie header in VCL fetch
sub vcl_fetch {

    # Here we could unset cookies explicitly,
    # but we assume the plone.app.caching extension does its job
    # and no extra cookies fall through for HTTP responses we'd like to cache
    # (like images)

    if (!obj.cacheable) {
        return (pass);
    }
    if (obj.http.Set-Cookie) {
        return (pass);
    }
    set obj.prefetch =  -30s;
    return (deliver);
}
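The chain of regsuball calls in vcl_recv above is easier to follow when replayed step by step outside Varnish. The following is a minimal Python sketch of the same substitutions, assuming that Python's re.sub behaves like regsuball for these particular patterns. The function name strip_cookies and the KEEP tuple are made up for illustration and are not part of the original configuration.

```python
import re

# Cookie names kept by the VCL above (assumption: same whitelist)
KEEP = ("statusmessages", "__ac")

def strip_cookies(header, keep=KEEP):
    """Replay the regsuball chain from vcl_recv on a Cookie header value."""
    names = "|".join(re.escape(n) for n in keep)
    c = ";" + header                            # prefix so every cookie starts with ';'
    c = re.sub(r"; +", ";", c)                  # normalise '; ' to ';'
    c = re.sub(r";(%s)=" % names, r"; \1=", c)  # mark cookies to keep with a space
    c = re.sub(r";[^ ][^;]*", "", c)            # drop every cookie that is not marked
    c = re.sub(r"^[; ]+|[; ]+$", "", c)         # trim leftover separators
    return c

print(strip_cookies("ZopeId=abc; __ac=xyz; I18N_LANGUAGE=fi"))
```

The trick is the space inserted in the third step: only whitelisted cookies end up as "; name=", so the fourth pattern, which requires a non-space character after the semicolon, removes everything else.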

Another example, showing how to remove only the Google Analytics cookies and leave other cookies intact:

sub vcl_recv {

         # Remove Google Analytics cookies - will prevent caching of anon content
         # when using GA Javascript. Also you will lose the information about
         # time spent on the site etc.
         if (req.http.cookie) {
            set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
            if (req.http.cookie ~ "^ *$") {
                remove req.http.cookie;
            }
          }

          ....
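To see what the Google Analytics pattern above does to a concrete header, here is a small Python sketch of the same substitution. It assumes re.sub matches regsuball for this pattern. The function name strip_ga is hypothetical, and returning None stands in for the VCL "remove" statement.

```python
import re

def strip_ga(header):
    """Drop __utm* cookies; return None when nothing is left (header removed)."""
    cleaned = re.sub(r"__utm.=[^;]+(; )?", "", header)
    if re.match(r"^ *$", cleaned):
        return None
    return cleaned

print(strip_ga("__utma=1.2; __utmz=3; PHPSESSID=abc"))
```

One detail worth knowing: when the analytics cookie comes last, the separator in front of it is left behind, so "PHPSESSID=abc; __utma=1" becomes "PHPSESSID=abc; " with a trailing "; ". Note that the cleaned header then differs from a plain "PHPSESSID=abc" as a string.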

November 11, 2011

Stefan CaunterSet up Internal Load Balancer Pools on a Barracuda

Often we call services internally, whether it is a mail gateway or a web service. In production, if high availability is a requirement, we want to create a virtual IP for the service and add two or more physical servers as listeners. I usually use layer 7 service pools when network load is low, and [...]

November 04, 2011

Per BuerHTTP Streaming in Varnish

Since April this year we've been working with a few customers implementing proper support for streaming of objects in Varnish. The code was feature complete in August and has matured since, now reaching a stage where we are comfortable releasing it.

read more

October 26, 2011

Dr CarterVarnish 3.0.2 released

Summary of changes from 3.0.1 to 3.0.2

  • A crasher bug when requests were queued and the backend sent a response with Vary has been fixed.
  • A crash when a too large synthetic response was produced has been fixed.
  • The ban lurker now properly sleeps the 1 second it is supposed to.
  • Varnish now releases disk space properly if no -s argument is provided, and the default cache size is now 100MB instead of 50% of the available disk space.

Download here.

October 08, 2011

Dr CarterCTRL F5 to force the update of Varnish cache

Add the following code to your VCL so that clients listed in the ACL can use Ctrl+F5 to force a refresh:

acl CTRLF5 {
   "xxx.xxx.xxx.xxx";
}

sub vcl_hit {

  if (client.ip ~ CTRLF5) {
    if (req.http.pragma ~ "no-cache" || req.http.Cache-Control ~ "no-cache")
    {
      set obj.ttl = 0s;
      return(pass);
    }
    else { return(deliver); }
  }
  else { return(deliver); }
}
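The decision made in vcl_hit above can be written out as a plain function, which makes it easy to check which requests would bypass the cache. This is only a sketch: the name wants_refresh is invented, the ACL is modelled as a simple set of IP strings (a real Varnish ACL also accepts networks and host names), and the addresses in the usage line are documentation examples.

```python
def wants_refresh(client_ip, headers, allowed_ips):
    """Return True when a cached hit should be bypassed for this request."""
    if client_ip not in allowed_ips:
        return False  # clients outside the ACL always get the cached object
    h = {k.lower(): v for k, v in headers.items()}
    # Browsers send one or both of these headers on a forced reload
    return ("no-cache" in h.get("pragma", "")
            or "no-cache" in h.get("cache-control", ""))

print(wants_refresh("192.0.2.10", {"Cache-Control": "no-cache"}, {"192.0.2.10"}))
```

Restricting the behaviour to an ACL matters: without it, any visitor could force backend traffic simply by holding down the reload key.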

Dr CarterFew corrections for wordpress.vcl in the wordpress plugin for Varnish

I’ve just migrated this blog to a dedicated server.
In order to use Varnish 3.0.1, I have made a few adjustments to the wordpress.vcl contained in the WordPress plugin for Varnish:

backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

acl purge {
  "localhost";
}

sub vcl_recv {
  if (req.request == "PURGE") {
    if(!client.ip ~ purge) {
      error 405 "Not allowed.";
    }

    return (lookup);
  }

  if (req.request != "GET" &&
      req.request != "HEAD" &&
      req.request != "PUT" &&
      req.request != "POST" &&
      req.request != "TRACE" &&
      req.request != "OPTIONS" &&
      req.request != "DELETE") {
    return (pipe);
  }

  if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
  }

  if (req.url ~ "wp-(login|admin)" || req.url ~ "preview=true") {
    return (pass);
  }

  remove req.http.cookie;
  return (lookup);
}

sub vcl_fetch {
  if (req.url ~ "wp-(login|admin)" || req.url ~ "preview=true") {
    return (hit_for_pass);
  }

  set beresp.ttl = 24h;
  return (deliver);
}

sub vcl_hit {
  if (req.request == "PURGE") {
     purge;
     error 200 "Purged.";
  }

  return(deliver);
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }

  return (fetch);
}

October 05, 2011

Yves Hwang‎Browsers, HTML5, Websocket & Varnish - The Cook, the Thief, His Wife & Her Lover ‎

Varnish Cache and the support for HTML5 is akin to the movie titled “The Cook, the Thief, His Wife & Her Lover.” The intricate web of relationships based on misunderstanding and extramarital affairs can be somewhat compared to the current state of websocket implementation in browsers, in backend servers, and ultimately, in Varnish Cache. See IMDb for more movie info.

read more

September 26, 2011

Dr CarterSample of string concatenation in Varnish 2.1.5

In Varnish 2.1.5, to concatenate strings, separate the expressions with a space character:

set req.url = regsub(req.http.host, "(?:www\.)?(.*)\.test\.com", "/\1" ) regsub(req.url, "^/foo(.*)$", "\1");

If req.http.host == "www.blabla.test.com" and req.url == "/foo/index.php?a=1", then req.url becomes "/blabla/index.php?a=1".

Since Varnish 3.0, the string concatenation operator is +.
Example:

set req.url = regsub(req.http.host, "(?:www\.)?(.*)\.test\.com", "/\1" ) + regsub(req.url, "^/foo(.*)$", "\1");
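The result of the concatenation can be checked outside Varnish. Below is a short Python sketch of the same two substitutions, assuming re.sub with count=1 is an adequate stand-in for regsub (which replaces only the first match). The function name rewrite_url is made up for this example.

```python
import re

def rewrite_url(host, url):
    """Build the new URL from the host name and the original URL."""
    # "www.blabla.test.com" -> "/blabla"
    prefix = re.sub(r"(?:www\.)?(.*)\.test\.com", r"/\1", host, count=1)
    # "/foo/index.php?a=1" -> "/index.php?a=1"
    suffix = re.sub(r"^/foo(.*)$", r"\1", url, count=1)
    return prefix + suffix

print(rewrite_url("www.blabla.test.com", "/foo/index.php?a=1"))
```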

Stefan Caunterfms4.0 JSEngine out of memory? Change tag to ScriptEngine

When you get “JavaScript runtime is out of memory; server shutting down instance” errors in your FMS logs, and your app won’t stay loaded, and none of your clients can connect, you need to increase the amount of memory available to the script engine. In the docs for FMS 4.0, the JSEngine tag is “deprecated”. [...]

September 19, 2011

Yves HwangVarnish Administration Console - "Cache Groups" sneak peek!

This is the last installment of the VAC sneak peek - introducing Cache Groups!

Cache Groups

read more

September 14, 2011

Yves HwangVarnish Administration Console - "Ban Management" sneak peek!

Contrary to popular belief, sometimes having long TTL objects in your cache is bad for business. In this instant-update era, Google+, Twitter or Facebook socialites crave the latest memes, while being repulsed by old content - even if it is mere seconds past the use-by date. Enter purges and bans in Varnish Cache.

read more

September 08, 2011

Mikko OhtamaaPurging Varnish cache from Python web application

Varnish is a popular front end caching server. If you are using it in front of your web site, there are situations where you wish to force cached content out of the cache, e.g. when site pages have been updated.

Below is an example of how to create an action to purge selected pages from the Varnish cache. The example shows how to purge the whole cache, but you can use regular expressions to limit the effect of the purge.

This tutorial has been written for Ubuntu 8.04, Python 2.6, Varnish 2.x and Plone 4.x. However, it should be trivial to apply the instructions for other systems.

We use the Requests library, which is a great improvement over the Python stdlib’s infamous urllib. After you try Requests you never want to use urllib or curl again.

Configuring Varnish

First you need to allow HTTP PURGE request handler in default.vcl from localhost. We’ll create a special PURGE command which takes URLs to be purged out of the cache in a special header:

acl purge_hosts {
        "localhost";
        # XXX: Add your local computer public IP here if you
        # want to test the code against the production server
        # from the development instance
}

...

sub vcl_recv {

        ...

        # PURGE requests call purge() with a regular expression payload from
        # the X-Purge-Regex header
        if (req.request == "PURGE") {
                # Check IP for access control list
                if (!client.ip ~ purge_hosts) {
                        error 405 "Not allowed.";
                }
                # Purge for the current host using reg-ex from X-Purge-Regex header
                purge("req.http.host == " req.http.host " && req.url ~ " req.http.X-Purge-Regex);
                error 200 "Purged.";
        }
}

Creating a view

Then let’s create a Plone view which will make a request from Plone to Varnish (upstream localhost:80) and issue a PURGE command. We do this using the Requests Python library.

This is useful when very fine-tuned per-page purging fails and the site admins simply want to make sure that everything in the front end cache will be gone.

This view is created so that it can be hooked in portal_actions menu and only the site admins have permission to call it.

Example view code:

import requests

from Products.CMFCore.interfaces import ISiteRoot
from five import grok

# XXX: The following monkey-patch will soon be unneeded,
# as the author has updated the Requests library to handle this case

from requests.models import Request

if "PURGE" not in Request._METHODS:
    lst = list(Request._METHODS)
    lst.append("PURGE")
    Request._METHODS = tuple(lst)

class Purge(grok.CodeView):
    """
    Purge upstream cache from all entries.

    This is ideal to hook up for admins e.g. through portal_actions menu.

    You can access it as admin:: http://site.com/@@purge
    """

    grok.context(ISiteRoot)

    # Only site admins can use this
    grok.require("cmf.ManagePortal")

    def render(self):
        """
        Call the parent cache using the Requests Python library and issue a PURGE command for all URLs.

        Pipe through the response as is.
        """

        site_url = "http://www.site.com/"

        # We are cleaning everything, but we could limit the effect of purging here
        headers = {
                   # Match all pages
                   "X-Purge-Regex" : ".*"
        }

        resp = requests.request("PURGE", site_url, headers=headers)

        self.request.response["Content-type"] = "text/plain"
        text = []

        text.append("HTTP " + str(resp.status_code))

        # Dump response headers as is to the Plone user,
        # so he/she can diagnose the problem
        for key, value in resp.headers.items():
            text.append(str(key) + ": " + str(value))

        # Add payload message from the server (if any)

        text.append(str(resp.body))

        return "\n".join(text)

Now, if you are a Plone site admin and visit the URL:

http://site.com/@@purge

You’ll clear the front end cache.
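The view above always sends ".*" and therefore clears everything. To limit the purge, only the X-Purge-Regex header needs to change. The helper below is a hypothetical sketch (the name purge_headers is invented) that builds the header for a path prefix. It assumes the VCL shown earlier passes the header value through unchanged as the right-hand side of "req.url ~", which has not been verified against a running Varnish 2.x.

```python
import re

def purge_headers(path_prefix=None):
    """Build headers for the PURGE request; None means purge everything."""
    if path_prefix is None:
        return {"X-Purge-Regex": ".*"}
    # Anchor at the start of the URL and escape regex metacharacters
    return {"X-Purge-Regex": "^" + re.escape(path_prefix)}

headers = purge_headers("/news/")
print(headers)
```

The resulting dictionary can be passed as the headers argument of the requests.request("PURGE", ...) call in the view. Escaped characters add backslashes to the header value, so it is worth testing such prefixes against your own Varnish before relying on them.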

September 06, 2011

Dr CarterVarnish 3.0.1 released

Summary of changes from 3.0.0 to 3.0.1

  • Objects with grace and keep set were mistakenly seen as candidates for the shortlived storage, but would not be cleaned up quickly, something that manifested as if there was a memory leak. This is now fixed.
  • When multiple clients were waiting for an object, all clients would be woken up when an object became available, leading to stuck threads. This has now been fixed.
  • A bug in how XML entities were handled with ESI has been fixed.
  • The documentation has seen numerous updates.
  • varnishncsa is now more stable and has support for showing arbitrary request and response fields.

Download here.

August 30, 2011

Per BuerThe fourth Varnish User Group meeting is coming up

What is that?

The Varnish User Group meetings are informal gatherings where users and developers meet, exchange experiences and discuss the roadmap. The schedule for this year's meeting isn't ready yet, but I know Tollef is planning a "How to write your first VMOD" talk and Martin is going to talk about his upcoming implementation of streaming. If you have any suggestions or have something you want to share, please send me an email or leave a comment.

read more

August 26, 2011

Ingvar Hagelundrpm packages of varnish-3.0.0

Varnish is a state of the art http accelerator, or frontside cache, if you like.

varnish-3.0.0 was released some weeks ago. I have built packages for Fedora and EPEL 4/5/6. Packages may be found at the usual http://users.linpro.no/ingvar/varnish/. The RHEL packages require some dependencies pulled from EPEL.

Varnish Software produces their own packages, based on the specfile I maintain for Fedora. The changes from their rpm spec are mostly cosmetic to fit better to Fedora’s packaging standards.

August 09, 2011

Per BuerUpgrading from Varnish Cache 2.1 to 3.0

Andreas Plesner Jacobsen wrote a guide for upgrading Varnish Cache 2.1 to 3.0. Tollef pulled the document into the official documentation. It is now available online.

July 21, 2011

cd34W3 Total Cache and Varnish

Last week I got called into a firestorm to fix a set of machines that were having problems. As Varnish was in the mix, the first thing I noticed was the hit rate was extremely low as Varnish’s VCL wasn’t really configured well for WordPress. Since WordPress uses a lot of cookies and Varnish passes [...]

July 17, 2011

cd34Gracefully Degrading Site with Varnish and High Load

If you run Varnish, you might want to gracefully degrade your site when traffic comes unexpectedly. There are other solutions listed on the net which maintain a Three State Throttle, but, it seemed like this could be done easily within Varnish without needing too many external dependencies. The first challenge was to figure out how [...]

July 16, 2011

cd34Updated WordPress VCL – still not complete, but, closer

Worked with a new client this week and needed to get the VCL working for their installation. They were running W3TC, but, this VCL should work for people running WP-Varnish or any plugin that allows Purging. This VCL is for Varnish 2.x. There are still some tweaks, but, this appears to be working quite well. [...]

June 24, 2011

Per BuerVideo: Kristian Lyngstøl on VMODs

Third installment of the Varnish <3 video series: Kristian Lyngstøl on VMODs. This is the final video of the first batch. Hopefully, I'll find time to do this from time to time; we have a couple of ideas which will make some relevant content.

Cheers!

June 23, 2011

Per BuerInterview with Tollef Fog Heen on the tools

The Varnish Cache tools were given a facelift in Varnish Cache 3.0. In this video, Tollef gives you an overview of the changes.

I'll stop embedding the video into the webpage. It gets a bit weird when the article gets aggregated.

 

June 22, 2011

Ingvar HagelundThe usage of Varnish revisited

Varnish is a high-performance HTTP accelerator, or frontside cache if you like. Working with Varnish is part of my day job. Among other things, I maintain the packages for Fedora and EPEL.

To celebrate the release of Varnish version 3, I decided to poke around lists again, to look for Varnish in common use.

This is more or less a repost, with updated numbers. There is no deep magic here. I just parse some of the available top lists that I know of, and peek at the HTML headers of the sites that are listed. If there are subsites linked from the front page of the site, I scan them too. Subsites with a Varnish match are shown in parenthesis in the results.
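The actual parser is not shown in the post, but the header check it describes can be sketched in a few lines of Python. The sketch assumes a site reveals Varnish through the X-Varnish or Via response headers, which Varnish adds by default. Sites that strip these headers go undetected. The function name looks_like_varnish is invented for this example.

```python
def looks_like_varnish(headers):
    """Guess from response headers whether a response passed through Varnish."""
    h = {k.lower(): v for k, v in headers.items()}
    if "x-varnish" in h:
        return True
    # Varnish also identifies itself in the Via header by default
    return "varnish" in h.get("via", "").lower()

print(looks_like_varnish({"Via": "1.1 varnish", "Server": "Apache"}))
```

A negative answer proves nothing, since both headers are easy to remove in VCL, so counts obtained this way are a lower bound.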

For the Nordic countries, I have quite good lists, that is, upload result lists from the probably most visited media sites in the respective countries. Remember of course, that these are generally pay-to-be-included lists, and there may exist sites with far more hits than the ones listed.

For a global overview, I have used Alexa and Google’s Top 1000 lists.

Now for the results. Varnish is sponsored by large Norwegian sites, so it is no big surprise that there are a lot of hits in Norway. Of the TNS Gallup top list, Varnish runs at a stunning 51 of the top 100 sites. That’s 15 up since my last probe.

For Denmark, I use FDIM’s list. Sorted on page hits, we now rule 15 of the top 100, and 29 in the top 200.

For Sweden, I use the KIA Index list. It is a bit harder to parse, but I think I got it right. Sorting on page hits, in the top 100, we are up to 13, and in the top 200, we find 26 sites running Varnish.

Iceland is finally on the list, with one single item on Modernus’ top list. The lucky site is www.vb.is, which looks like a financial publication.

I haven’t got results for Finland yet, I have to rebuild my parser, it seems.

For what it’s worth, I’ll toss in Germany as well. Four sites in Google’s top 100 sites for Germany, and 13 on the Netcraft toolbar users’ list, sounds like a good start to me. And Der Spiegel and Die Zeit are well-known publications.

For Alexa’s World top 500 list, we have 17 instances of Varnish. That is the same result as last year. Still no World domination. Google shows us a similar result, with 32 sites running Varnish in its top 1000 list.

We know Facebook, the World’s most visited site, runs Varnish for several of their services, but it is hidden from my probes.

All the gory details are available here.

Other sites more or less worth mentioning that have been reported to use Varnish, but do not show up in my lists, include Slashdot, The Pirate Bay, e.Republik, WOWwiki, Globo.com, PCWelt.de, BlackPlanet, funnyordie.com, n-tv.de, 20minutos.es, theglobeandmail.com and hackint0sh.org, to name a few.

Do you know of other famous sites running Varnish? Use the comments.

June 21, 2011

cd34When to Cache, What to Cache, How to Cache

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Collide Factory on Saturday, April 2, 2011. These are 5 minute talks where each slide auto-advances after fifteen seconds which limits the amount of detail that can be conveyed. A brief introduction What makes [...]

June 17, 2011

Dr CarterVarnish 3.0.0 released

Summary of changes from 2.1.5 to 3.0.0

  • Module support through VMODs.
  • Compression and decompression support, including stitching together compressed ESI fragments.
  • Preliminary streaming support, both on miss and on pass.
  • Much improved documentation.
  • Better default values for parameters.
  • Varnishncsa now has custom log format support.
  • Varnishlog, varnishncsa and varnishhist can now filter out records that match multiple expressions.

Download here.

June 15, 2011

Yves HwangVarnish Administration Console - "Dashboard" sneak peek!

June 09, 2011

Kristian LyngstølVarnish 3.0.0 - RSN!

Posted on 2011-06-09

Varnish 3.0.0 beta2 was just released (http://www.varnish-cache.org/lists/pipermail/varnish-announce/2011-June/000032.html), and we're aiming for 3.0.0 next week.

The release date is set for Thursday the 16th (next week; June 2011, for the potential future archive crawlers), and several release parties are planned all around the world (http://v3party.varnish-cache.org/).

This will be a very special day for everyone in Varnish Software. Varnish 2.0.0 was released roughly the same week I started working at Redpill Linpro, before I was really involved with Varnish, and I still regret that I didn't snatch one of those fancy 2.0-t-shirts.

What's new?

While there are too many new features to mention them all, there are two that stand out more than anything.

The first is gzip compression and the second is the Varnish module-architecture, or simply vmods. That most of ESI has been re-factored under the hood will be evident in future releases, and the same goes for streaming delivery.

Compression

  • "What, you don't already have compression?!"

Remember, Varnish is a caching service. Most of the time you don't need it to do the compression, because the web server will do it for you. But there are good reasons for why you want it. ESI is the primary use-case.

With ESI, Varnish needs to parse the content, and it can't do that if it doesn't understand the transfer encoding. With Varnish 2.1, you have to send uncompressed data to Varnish if you want to use ESI. This means you have to either deliver uncompressed content to your clients, or use yet another service in front of Varnish.

But let me get a bit technical, because Varnish 3.0's compression is pretty awe-inspiring.

So with ESI, you have multiple, individually cached elements that make up a single user-visible page. So what we could do, is glue it all together and compress it before we send it. The downside is that we'll do the compression over and over. An alternative would be to cache the compressed result as long as the individual elements are unchanged, but that will require more space and more complexity.

So what does Varnish 3.0 do? It stores the elements compressed, modifies the right gzip-bits and glues it all together on the fly, without decompressing it. If a single element is changed, only that element needs to be updated. This is probably the best solution, even if the complexity of meddling with binary gzip headers directly can lead to some pretty tricky code. I challenge you to find a solution that handles compression in a smarter way.
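The idea of gluing compressed pieces together without recompressing them can be illustrated with a simpler property of the gzip format: a file may consist of several concatenated members, and a decoder returns their contents back to back. The Python sketch below relies on that property. It is not how Varnish 3.0 does it (the post describes editing gzip bits so that the output is a single stream), and the fragment contents and the name assemble are invented for the example.

```python
import gzip

# Three hypothetical ESI fragments, each compressed once and stored separately
fragments = [b"<header/>", b"<body>hello</body>", b"<footer/>"]
stored = [gzip.compress(f) for f in fragments]

def assemble(parts):
    """Join stored gzip members into one response body without recompressing."""
    return b"".join(parts)

page = assemble(stored)
print(gzip.decompress(page))

# When one fragment changes, only that fragment is compressed again
stored[1] = gzip.compress(b"<body>updated</body>")
print(gzip.decompress(assemble(stored)))
```

The cost profile is the point: compression work is paid once per fragment update, while assembling a page is plain byte concatenation.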

Varnish 3.0 also does decompression. If your client doesn't support compression (possibly a script or other tool; real browsers support compression), Varnish will decompress the object for you on the fly. This is another huge improvement over Varnish 2.1. In Varnish 2.1, this is solved using Vary: and storing different copies of the same object based on compression.

We can also do the same with the backend-data: If ESI needs to read the data, Varnish 3.0 can decompress it on the fly, parse it, then re-compress it before storing it.
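The difference from the Vary-based approach is that only one compressed copy is stored, and the decision is made at delivery time. A minimal Python sketch of that decision follows. It assumes a simplified reading of Accept-Encoding that ignores q-values, and the function name deliver is hypothetical.

```python
import gzip

def deliver(stored_gzip, accept_encoding):
    """Return (extra_headers, body) for one stored, gzip-compressed object."""
    # Simplified parsing: "gzip;q=0.5, deflate" -> ["gzip", "deflate"]
    encodings = [e.split(";")[0].strip().lower()
                 for e in accept_encoding.split(",")]
    if "gzip" in encodings:
        return {"Content-Encoding": "gzip"}, stored_gzip
    # Client cannot handle gzip: decompress on the fly, keep the stored copy as is
    return {}, gzip.decompress(stored_gzip)

obj = gzip.compress(b"<html>cached page</html>")
print(deliver(obj, "")[1])
```

With a single stored copy, the cache no longer holds a compressed and an uncompressed variant of every object.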

And for you, the user, the complexity is fairly non-existent. Push the button, remove that nginx (no hard feelings) server you had doing compression, ????, profit!

VMODS!

VCL, the Varnish Configuration Language, is a flexible way of configuring Varnish. Since it is already translated from VCL to C, then compiled and linked into the running Varnish instance, VCL has "always" had in-line C: anywhere you want to, you can tell the VCL compiler that this is pure C code, and it is passed directly to the C compiler instead of being translated from VCL to C first. This was mainly provided because:

  1. It didn't add any complexity and was actually less work.
  2. It provided an escape chute for features we didn't want in Varnish-proper, but were valid for some users.

What features could that be? Syslog-support, geoip-integration, cryptographic verification of authorization headers, etc.

It turned out to be very useful, but impractical and difficult to re-use. Since it was all glued to your VCL, it meant that you had to stick C code in the middle of your regular configuration, and it made it very hard to combine two different features if you didn't know C yourself. And linking towards external libraries required parameter changes.

Enter varnish modules.

Simply put, vmods are a way of letting you do the same thing you can do with in-line C, but in a more sustainable manner. A vmod will typically have a few functions exposed to VCL, and the VCL just has to declare that it imports the vmod to use the functions without a single line of C-code in your VCL.

This also means that the vmod has its own build system, its own linking process and a flexible development environment, and is much easier to share.

In the time to come, I expect us to have a community-site for vmods. I also expect that you will see a lot of minor yet important changes to Varnish during the Varnish 3.0-cycle that exposes more of the internal varnish utilities so vmods can use them.

Small disclaimer, though: There is no API guarantee. We don't wish to slow the pace of Varnish-development by restricting what we can change. That said, we won't tear things apart just to see our vmod-contributors bleed.

I plan to write a blog post on vmod-hacking in the near future, so expect more detail there.

Finishing words

Varnish 1.0 was, as any 1.0-release is, important. I was not involved with the project back then, but as I understand it, Varnish 1 was very much tailored to a specific site, or type of site.

With Varnish 2.0, Varnish became useful for anyone with a high-traffic site. The 2.0-series was a good release-series, adding a healthy mixture of bug fixes and non-intrusive features with each release, while staying constantly focused on real-world usage. We also saw the first Varnish User Group meeting during that time.

Varnish 2.1 has been a sort of intermediate release. Director-refactoring, ESI and to a certain degree persistent storage paved the way for architectural changes that had to be done. Meanwhile, the user-base really exploded.

Now, with Varnish 3.0.0 coming out, we are already seeing how useful vmods are for Varnish Software-customers, like the header vmod (https://github.com/KristianLyng/libvmod-header) (still under development). Gzip also means there are no drawbacks to using ESI with Varnish. You don't need to add a second service to compress data.

It all boils down to Varnish being a more useful part of your architecture. It's easy to get fast and maintainable C-code in there if you need it, and you can even pay someone to write just that bit for you, without being confined to Varnish' own road map and release cycle. There are no longer any downsides to using ESI. It's fast. It's free software. There are professional service and support offerings available (http://www.varnish-software.com). Varnish follows standards - unless you tell it not to. And so forth.

I'm not usually one to pat myself too much on the back, but as I'm writing this, I feel proud to be part of the team that gives you Varnish 3.0.

If you're in Oslo, I'll see you next Thursday!

Comments

June 06, 2011

cd34WordPress, Varnish and ESI Plugin

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Whitetable Foundation on Saturday, June 4, 2011. Briefly, I created a Plugin that enabled Fragment Caching with WordPress and Varnish. The problem we ran into with normal page caching methods was related to the [...]

June 03, 2011

Per BuerShort interview with Poul Henning Kamp

Hi.

As an experiment, we've been creating short videos about Varnish Cache. We're not quite sure what the format should be, so bear with us. First up is this video, in which Poul Henning Kamp gives us his view on what he thinks are the two most important features of Varnish Cache 3.0.

There will be a couple more videos coming soon. If you want us to discuss a certain topic, let us know.

June 02, 2011

Per BuerVarnish jobs in Oslo

We're growing! 

We're looking for two technical people in Oslo. The specifications aren't very specific. Basically, we're looking for smart people that might help us with one or more of the following:

read more

May 27, 2011

Per BuerFirst third party module spotted in the wild

May 23, 2011

Per BuerVarnish 3.0 - the Standard Module

Module support is perhaps the most exciting feature of Varnish Cache 3.0. It makes it really easy to add quite complex business logic in Varnish or to connect Varnish to any external data source. Do you want to connect Varnish to your MySQL database, so that it will authorize users against a table in MySQL? It's now easy to write a module that does that. It will probably wreak havoc on performance, but that's your decision - not ours.

read more

May 20, 2011

Per BuerMobile detection in Varnish

Hi.

We're considering releasing a Varnish 3.0 module that makes the content of the WURFL database accessible in VCL. WURFL is a crowd-sourced database consisting of some 13 thousand user-agent strings with accompanying characteristics.

If you are interested in seeing such a thing developed and are willing to spend some money to see it released, either send me an email (perbu [at] varnish-software [dot] com) or leave a comment below.

May 11, 2011

Per BuerVarnish Cache 3.0 beta 1 is out!

The big new features are compression, basic streaming support and vmods. But there are a bunch of other new features as well. Installation notes and such, and a full change log.

 

Per BuerStreaming in Varnish 3.0

Varnish Cache 2.1 does what you can call store-and-forward proxying. It gets the request, then it turns around, gets the whole object from the backend, then turns around and delivers it to the client. If your backend takes time to deliver the object, the client might get restless.
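
To make the difference concrete, here is a toy Python model of the two delivery strategies. It is only a sketch and has nothing to do with how Varnish is actually implemented: the backend, store_and_forward, streaming and run names are all made up for illustration. The log simply records the order in which the backend produces chunks and the client receives them.

```python
from typing import Callable, Iterator, List


def backend(log: List[str], chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical backend that produces an object one chunk at a time."""
    for i in range(chunks):
        log.append(f"backend sent chunk {i}")
        yield f"chunk-{i}".encode()


def store_and_forward(source: Iterator[bytes]) -> Iterator[bytes]:
    """Fetch the whole object first, then deliver it to the client."""
    body = list(source)  # nothing is delivered until the backend is done
    yield from body


def streaming(source: Iterator[bytes]) -> Iterator[bytes]:
    """Pass each chunk on to the client as soon as it arrives."""
    for chunk in source:
        yield chunk


def run(proxy: Callable[[Iterator[bytes]], Iterator[bytes]]) -> List[str]:
    """Drive one request through the given proxy and return the event log."""
    log: List[str] = []
    for i, _chunk in enumerate(proxy(backend(log))):
        log.append(f"client got chunk {i}")
    return log


if __name__ == "__main__":
    for proxy in (store_and_forward, streaming):
        print(proxy.__name__)
        for event in run(proxy):
            print("  " + event)
```

With store_and_forward the client sees nothing until the backend has sent every chunk; with streaming the two sets of events interleave, so the first chunk reaches the client right away.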

read more

May 05, 2011

Per BuerBans and purges in Varnish 3.0

In Varnish 1.0 there was only one way of ejecting content from Varnish. You had to add VCL code that could find the object and set the TTL to zero. The typical, and squid-compatible, way of doing it was by creating a new HTTP method and calling it "PURGE". The VCL would typically look like this:

read more

May 03, 2011

Per BuerVarnish 3.0 changes - ESI and gzip

Hi.

Finally, Varnish 3.0 is feature complete and we're about to roll out a beta. All in all we've done quite a lot of testing on this release and the code seems to be quite stable. We've been testing with some rather busy websites and both speed and stability are very good.

read more

April 27, 2011

Per BuerUpdates on www.varnish-cache.org

As the snow is melting in Oslo, we've spent some time doing spring renovations on www.varnish-cache.org. We've added a forum and a new FAQ.

read more