Planet Varnish

January 19, 2012

Per Buer: Where will Varnish Cache be in 2020?

The last couple of days we've been discussing the future. 2011 was a good year, both for the Varnish Cache Project and Varnish Software itself. Both have now proven themselves. During the last two months of 2011 Varnish was downloaded over 25000 times. Varnish Software has also proven itself. We now know that our services are sought after and that people find our offering valuable. Now what?


January 11, 2012

Per Buer: The superbunny under a CC license

The Varnish Superbunny

Together with our graphics and HCI guru, Morten Zetlitz, we've decided to release the superbunny graphic under a CC license.

You can find a PNG and an SVG version of this image below.

 


January 10, 2012

cd34: Finally, a formal release for my WordPress + Varnish + ESI plugin

A while back I wrote a plugin to take care of a particular client traffic problem. As the traffic came in very quickly and unexpectedly, I had only minutes to come up with a solution. As I knew Varnish pretty well, my initial reaction was to put the site behind Varnish. But, there’s a problem [...]

January 05, 2012

David Harrigan: Banning URLs from Varnish using Apache Camel and RabbitMQ – Part 2

Welcome Back! I hope you found Part 1 on this tutorial useful. You should by now have a running instance of Varnish cache along with a running instance of RabbitMQ. You should also have cloned the Varnish-Ban project from Bitbucket and perhaps had a look through the project structure and source code. I hope there is nothing [...]

January 04, 2012

Per Buer: The hash collision attacks

December 28, 2011

David Harrigan: Banning URLs from Varnish using Apache Camel and RabbitMQ – Part 1

Introduction Hello and Welcome! Over the course of three postings, I would like to present a tutorial on using RabbitMQ and Apache Camel to BAN (their parlance for removing) URLs (objects) held within Varnish Cache. This proposed approach allows for a complete decoupling of application logic from the caching system thus promoting greater flexibility, scalability [...]

December 13, 2011

David Harrigan: Introducing LVUG

I’m a big fan of Varnish Cache. It truly is an amazing piece of software. I’m also keen to promote its use in the town I work in and to help share ideas, experiences and solutions between others. Therefore, I decided to create a new meetup group. I’m happy to announce that The London Varnish [...]

December 01, 2011

Kristian Lyngstøl: Varnish Training

As anyone who’s worked with me should realize by now, I’m big on documentation, be it source code comments or other types of documentation. The only reason I’m not more active in the documentation section of Varnish Cache is because I’ve maintained our (Varnish Software’s) training material ever since Tollef Fog Heen wrote the initial slides in 2009.

I’ve held the course more times than I can remember, and have usually made improvements after every course. Others have also held the course, including Redpill Linpro’s Kacper Wysocki, maintainer of security.vcl, and Magnus Hagander (Postgresql god/swede). Feedback and gradual improvements have turned a set of slides into pretty good course material.

We recently started holding on-line courses too. This revealed several new challenges. The obvious challenges are things like getting basic voice communication working (it sounds easy, but you’d be amazed…). It was also interesting when I held the course in my apartment on Australian time, and my ISP decided to perform maintenance on my cable modem (it was 2AM local, after all). So I had to hold the course on a 3G connection, communicating with Australia. Fun. Then there’s the lack of, or severely reduced, feedback, which presents challenges in how we do exercises and generally deal with the course. In a class room I can easily determine if the participants are able to keep up, if I’m going too slow or too fast and whether or not a subject is more interesting than another. All of that is, at best, very difficult in an on-line session.

The last few weeks I’ve finally gotten around to merging the sysadmin course with the web development course that Jérôme Renard has written for us. It proved the perfect opportunity to give the course another big update. While the course was already updated for Varnish 3, I’ve made several other Varnish 3-related additions. More importantly, the flow of the course has changed from one oriented on Varnish functionality to one oriented on the tasks you wish to accomplish with Varnish. Instead of teaching you about Varnish architecture first, then Varnish parameters, the course now has a chapter devoted to Tuning.

Instead of just throwing in purging or banning when talking about VCL, there’s now a chapter called Cache Invalidation, that attempts to give broader understanding of the alternatives you have and when to use which solution. Similarly, there’s a chapter called Saving The Request, which starts out with the core Grace mechanisms, moves on to health checks, saint mode, req.hash_always_miss, directors and more.

There are several reasons I write about this. First of all: I’m very excited about the material. I’ve worked on it regularly for several years, doing everything from hacking rst2s5, tweaking the typesetting and design to updating the content, reorganizing the course and of course holding it. It may seem like one big marketing stunt, but I can promise you that I never blog about something I’m not passionate about, regardless of whether it is work-related or not.

The other reason is that I’m holding the course next week. This will be the first time we hold it using the changed structure. I would have wanted to hold it in a class room first, but holding it on-line is still exciting.

If you wish to participate, head over to https://www.varnish-software.com/products-services/training/varnish-administration-course and convince your boss it will be awesome!


November 23, 2011

Per Buer: VAC Screencast

So, on Friday we finally made it. The 1.0 release of the Varnish Admin Console (VAC) has shipped. A lot of people have been wondering what it looks like. Here is a screencast:

Watch this video on YouTube: http://www.youtube.com/embed/M2LznlsrYMQ

Comments and feedback are welcome!

November 18, 2011

Yves Hwang: vac-cottontail-1.0 release

The road to a software release is filled with sweat and tears. As the product manager of the Varnish Administration Console, it is my absolute pleasure to present you the vac-cottontail-1.0 release, available to silver and above subscribers of Varnish Software. Harking back to my Aussie roots and straight from Hilltop Hoods, this release is for our Varnish people in the front, in the nose bleed section.


November 16, 2011

Mikko Ohtamaa: Varnish, caching and HTTP cookies

These short notes on caching and HTTP cookies are based on my experience with Varnish, Plone CMS and WordPress.

Sanitizing cookies for caching

Any cookie set on the server side (session cookie) or on the client side (e.g. Google Analytics Javascript cookies) is poison for caching content served to anonymous visitors.

Common cookies across CMS systems usually include:

  • Session cookie (anonymous user session): ZopeId,  PHPSESSID
  • Logged in user cookie: __ac (Plone)
  • Active language cookie: I18N_LANGUAGE (Plone)
  • Analytics cookies (Google Analytics et al.): various ugly cookies
  • Some other status information e.g. status message: statusmessages (Plone)

HTTP caching needs to deal with cookie handling in both the HTTP request and the response:

  • HTTP request Cookie header. A browser sending an HTTP request with a Cookie header confuses the Varnish cache look-up. This header can also be set by Javascript, not just by the server. The Cookie header can be preprocessed in vcl_recv.
  • HTTP response Set-Cookie header. This is a cookie set on the server side. If your server is setting cookies, Varnish does not cache these responses by default. However, this might be desirable behavior if e.g. multi-lingual content is served from one URL with language cookies. Set-Cookie can be post-processed in vcl_fetch.

Example of how to remove all Plone-related cookies except the ones dealing with logged-in users (content authors):

sub vcl_recv {

  if (req.http.Cookie) {
      # (logged in user, status message - NO session storage or language cookie)
      set req.http.Cookie = ";" req.http.Cookie;
      set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
      set req.http.Cookie = regsuball(req.http.Cookie, ";(statusmessages|__ac)=", "; \1=");
      set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
      set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");

      if (req.http.Cookie == "") {
          remove req.http.Cookie;
      }
  }
  ...
}

# Let's not remove Set-Cookie header in VCL fetch
sub vcl_fetch {

    # Here we could unset cookies explicitly,
    # but we assume the plone.app.caching extension does its job
    # and no extra cookies fall through for HTTP responses we'd like to cache
    # (like images)

    if (!obj.cacheable) {
        return (pass);
    }
    if (obj.http.Set-Cookie) {
        return (pass);
    }
    set obj.prefetch =  -30s;
    return (deliver);
}
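
The regsuball chain in vcl_recv above is easier to follow when tried against sample headers. Here is a minimal Python sketch of the same whitelist technique. The function name keep_cookies and the sample cookie values are made up for illustration; the regular expressions mirror the VCL lines, under the assumption that Python's re module behaves like Varnish's regex engine for these simple patterns.

```python
import re


def keep_cookies(cookie_header, keep=("statusmessages", "__ac")):
    """Drop every cookie except the whitelisted names (mirrors the VCL above)."""
    names = "|".join(re.escape(name) for name in keep)
    # Prefix with ";" so every cookie, including the first, starts with one.
    cookies = ";" + cookie_header
    # Normalise "; " separators to a bare ";".
    cookies = re.sub(r"; +", ";", cookies)
    # Mark the cookies we want to keep by putting a space after the ";".
    cookies = re.sub(r";(" + names + r")=", r"; \1=", cookies)
    # Remove every cookie that was not marked (no space after the ";").
    cookies = re.sub(r";[^ ][^;]*", "", cookies)
    # Trim leftover separators at both ends.
    return re.sub(r"^[; ]+|[; ]+$", "", cookies)


# Hypothetical header: session and language cookies are dropped, __ac survives.
remaining = keep_cookies("ZopeId=abc; __ac=xyz; I18N_LANGUAGE=fi")
```

An empty result corresponds to the VCL branch that removes the Cookie header altogether.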

Another example, showing how to strip only the Google Analytics cookies and leave other cookies intact:

sub vcl_recv {

  # Remove Google Analytics cookies - they will prevent caching of anon
  # content when using GA Javascript. Also you will lose the information
  # of time spent on the site etc.
  if (req.http.Cookie) {
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
    if (req.http.Cookie ~ "^ *$") {
      remove req.http.Cookie;
    }
  }

  ....
}
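
To check what that expression actually leaves behind, the same substitution can be tried in Python. This is only a sketch: strip_ga_cookies and the sample headers are hypothetical, and returning None stands in for the VCL branch that removes the header.

```python
import re

# Same pattern as in the VCL: any __utmX=value pair plus an optional "; ".
GA_COOKIE = re.compile(r"__utm.=[^;]+(; )?")


def strip_ga_cookies(cookie_header):
    """Remove Google Analytics cookies; return None if nothing is left."""
    cleaned = GA_COOKIE.sub("", cookie_header)
    if re.match(r"^ *$", cleaned):
        return None
    return cleaned


leading = strip_ga_cookies("__utma=1.2; __utmz=3; PHPSESSID=abc")
trailing = strip_ga_cookies("PHPSESSID=abc; __utma=1.2")
```

Note that when a GA cookie is the last one in the header, the separator in front of it is not matched by the pattern, so a trailing "; " stays in the result.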


November 11, 2011

Stefan Caunter: Set up Internal Load Balancer Pools on a Barracuda

Often we call services internally, whether it is a mail gateway or a web service. In production, if high availability is a requirement, we want to create a virtual ip for the service and add two or more physical servers as listeners. I usually use layer 7 service pools when network load is low, and [...]

November 04, 2011

Per Buer: HTTP Streaming in Varnish

Since April this year we've been working with a few customers implementing proper support for streaming of objects in Varnish. The code was feature complete in August and has matured since, now reaching a stage where we are comfortable releasing it.


October 26, 2011

Dr Carter: Varnish 3.0.2 released

Summary of changes from 3.0.1 to 3.0.2

  • A crasher bug when requests were queued and the backend sent a response with Vary has been fixed.
  • A crash when a too large synthetic response was produced has been fixed.
  • The ban lurker now properly sleeps the 1 second it is supposed to.
  • Varnish now releases disk space properly if no -s argument is provided, and the default cache size is now 100MB instead of 50% of the available disk space.

Download here.


October 08, 2011

Dr Carter: CTRL F5 to force the update of Varnish cache

Add the following code to your VCL so that you can use CTRL F5 to force a refresh:

acl CTRLF5 {
   "xxx.xxx.xxx.xxx";
}

sub vcl_hit {

  if (client.ip ~ CTRLF5) {
    if (req.http.pragma ~ "no-cache" || req.http.Cache-Control ~ "no-cache")
    {
      set obj.ttl = 0s;
      return(pass);
    }
    else { return(deliver); }
  }
  else { return(deliver); }
}
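
The decision made in vcl_hit above can be modelled in a few lines of Python, which may help when reasoning about who is allowed to bypass the cache. This is a sketch only: hit_action, the header dictionary and the documentation-range addresses are assumptions for illustration, and the sketch does not model the `set obj.ttl = 0s` side effect, which also expires the cached object.

```python
import ipaddress


def hit_action(client_ip, headers, allowed_networks):
    """Return "pass" to bypass the cached object, or "deliver" to serve it."""
    ip = ipaddress.ip_address(client_ip)
    # Equivalent of the CTRLF5 acl: is the client in a trusted network?
    trusted = any(ip in ipaddress.ip_network(net) for net in allowed_networks)
    # A forced browser reload typically sends one or both of these headers.
    pragma = headers.get("Pragma", "")
    cache_control = headers.get("Cache-Control", "")
    if trusted and ("no-cache" in pragma or "no-cache" in cache_control):
        return "pass"
    return "deliver"


# Hypothetical office network that is allowed to force a refresh.
office = ["192.0.2.0/24"]
forced = hit_action("192.0.2.10", {"Cache-Control": "no-cache"}, office)
```

Restricting the bypass to an ACL matters: without it, any visitor pressing CTRL F5 could push requests through to the backend.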

Dr Carter: Few corrections for wordpress.vcl in the wordpress plugin for Varnish

I’ve just migrated this blog to a dedicated server.
In order to use Varnish 3.0.1, I have made a few adjustments to the wordpress.vcl contained in the wordpress plugin for Varnish:

backend default {
  .host = "127.0.0.1";
  .port = "8080";
}

acl purge {
  "localhost";
}

sub vcl_recv {
  if (req.request == "PURGE") {
    if(!client.ip ~ purge) {
      error 405 "Not allowed.";
    }

    return (lookup);
  }

  if (req.request != "GET" &&
      req.request != "HEAD" &&
      req.request != "PUT" &&
      req.request != "POST" &&
      req.request != "TRACE" &&
      req.request != "OPTIONS" &&
      req.request != "DELETE") {
    return (pipe);
  }

  if (req.request != "GET" && req.request != "HEAD") {
    return (pass);
  }

  if (req.url ~ "wp-(login|admin)" || req.url ~ "preview=true") {
    return (pass);
  }

  remove req.http.cookie;
  return (lookup);
}

sub vcl_fetch {
  if (req.url ~ "wp-(login|admin)" || req.url ~ "preview=true") {
    return (hit_for_pass);
  }

  set beresp.ttl = 24h;
  return (deliver);
}

sub vcl_hit {
  if (req.request == "PURGE") {
     purge;
     error 200 "Purged.";
  }

  return(deliver);
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }

  return (fetch);
}
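
The order of the checks in vcl_recv above determines what happens to each request. The following Python sketch mirrors that order so it can be tried against sample requests; recv_action and the client_is_purger flag (standing in for the purge ACL check) are hypothetical names, not part of the plugin.

```python
import re

KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}
BYPASS_URL = re.compile(r"wp-(login|admin)|preview=true")


def recv_action(method, url, client_is_purger=False):
    """Mirror the order of the checks in the vcl_recv listing above."""
    if method == "PURGE":
        # Only clients in the purge ACL may continue to the cache lookup.
        return "lookup" if client_is_purger else "error 405"
    if method not in KNOWN_METHODS:
        return "pipe"
    if method not in ("GET", "HEAD"):
        return "pass"
    if BYPASS_URL.search(url):
        return "pass"
    # Everything else has its cookies removed and is looked up in the cache.
    return "lookup"


front_page = recv_action("GET", "/")
admin_page = recv_action("GET", "/wp-admin/index.php")
```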

October 05, 2011

Yves Hwang: Browsers, HTML5, Websocket & Varnish - The Cook, the Thief, His Wife & Her Lover

Varnish Cache and the support for HTML5 is akin to the movie titled “The Cook, the Thief, His Wife & Her Lover.” The intricate web of relationships based on misunderstanding and extramarital affairs can be somewhat compared to the current state of websocket implementation in browsers, in backend servers, and ultimately, in Varnish Cache. See imdb for more movie info.


September 26, 2011

Dr Carter: Sample of string concatenation in Varnish 2.1.5

In Varnish 2.1.5, to concatenate, separate your variables with a space character:

set req.url = regsub(req.http.host, "(?:www\.)?(.*)\.test\.com", "/\1" ) regsub(req.url, "^/foo(.*)$", "\1");

If req.http.host == "www.blabla.test.com" and req.url == "/foo/index.php?a=1", then req.url becomes "/blabla/index.php?a=1".

Since Varnish 3.0, the string concatenation operator is +.
Example:

set req.url = regsub(req.http.host, "(?:www\.)?(.*)\.test\.com", "/\1" ) + regsub(req.url, "^/foo(.*)$", "\1");
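
The rewrite can be checked outside Varnish with a small Python sketch. It assumes that Python's re.sub with count=1 is a fair stand-in for VCL's regsub (which replaces the first match only) for these two patterns; rewrite_url is a made-up helper name.

```python
import re


def rewrite_url(host, url):
    """Build the new URL from the host prefix and the URL with /foo stripped."""
    # "www.blabla.test.com" -> "/blabla" (the www. part is optional).
    prefix = re.sub(r"(?:www\.)?(.*)\.test\.com", r"/\1", host, count=1)
    # "/foo/index.php?a=1" -> "/index.php?a=1"
    rest = re.sub(r"^/foo(.*)$", r"\1", url, count=1)
    # Python uses +, like Varnish 3.0; Varnish 2.1.5 uses a space instead.
    return prefix + rest


new_url = rewrite_url("www.blabla.test.com", "/foo/index.php?a=1")
```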

Stefan Caunter: fms4.0 JSEngine out of memory? Change tag to ScriptEngine

When you get “JavaScript runtime is out of memory; server shutting down instance” errors in your FMS logs, and your app won’t stay loaded, and none of your clients can connect, you need to increase the amount of memory available to the script engine. In the docs for FMS 4.0, the JSEngine tag is “deprecated”. [...]

September 19, 2011

Yves Hwang: Varnish Administration Console - "Cache Groups" sneak peek!

This is the last installment of the VAC sneak peek - introducing Cache Groups!

Cache Groups


September 14, 2011

Yves Hwang: Varnish Administration Console - "Ban Management" sneak peek!

Ban Managing Bunny!

Contrary to popular belief, sometimes having long TTL objects in your cache is bad for business. In this instant-update era, Google+, Twitter or Facebook socialites crave the latest memes, while being repulsed by old content - even if it is mere seconds past the use-by date. Enter purges and bans in Varnish Cache.


September 08, 2011

Mikko Ohtamaa: Purging Varnish cache from Python web application

Varnish is a popular front end caching server. If you are using it in front of your web site, there are situations where you wish to force cached content out of the cache, e.g. when site pages have been updated.

Below is an example of how to create an action that purges selected pages from the Varnish cache. The example shows how to purge the whole cache, but you can use regular expressions to limit the effect of the purge.

This tutorial has been written for Ubuntu 8.04, Python 2.6, Varnish 2.x and Plone 4.x. However, it should be trivial to apply the instructions for other systems.

We use the Requests library, which is a great improvement over Python stdlib’s infamous urllib. After you try Requests you never want to use urllib or curl again.

Configuring Varnish

First you need to allow HTTP PURGE request handler in default.vcl from localhost. We’ll create a special PURGE command which takes URLs to be purged out of the cache in a special header:

acl purge_hosts {
        "localhost";
        # XXX: Add your local computer public IP here if you
        # want to test the code against the production server
        # from the development instance
}

...

sub vcl_recv {

        ...

        # PURGE requests call purge() with a regular expression payload taken from
        # the X-Purge-Regex header
        if (req.request == "PURGE") {
                # Check IP for access control list
                if (!client.ip ~ purge_hosts) {
                        error 405 "Not allowed.";
                }
                # Purge for the current host using reg-ex from X-Purge-Regex header
                purge("req.http.host == " req.http.host " && req.url ~ " req.http.X-Purge-Regex);
                error 200 "Purged.";
        }
}

Creating a view

Then let’s create a Plone view which will make a request from Plone to Varnish (upstream localhost:80) and issue PURGE command. We do this using Requests Python lib.

This is useful when very fine-tuned per-page purging fails and the site admins simply want to make sure that everything from the front end cache will be gone.

This view is created so that it can be hooked in portal_actions menu and only the site admins have permission to call it.

Example view code:

import requests

from Products.CMFCore.interfaces import ISiteRoot
from five import grok

# XXX: The following monkey-patch will soon be unnecessary,
# as the author has updated the Requests library to handle this case

from requests.models import Request

if "PURGE" not in Request._METHODS:
    lst = list(Request._METHODS)
    lst.append("PURGE")
    Request._METHODS = tuple(lst)

class Purge(grok.CodeView):
    """
    Purge upstream cache from all entries.

    This is ideal to hook up for admins e.g. through portal_actions menu.

    You can access it as admin:: http://site.com/@@purge
    """

    grok.context(ISiteRoot)

    # Only site admins can use this
    grok.require("cmf.ManagePortal")

    def render(self):
        """
        Call the parent cache using the Requests Python library and issue a PURGE command for all URLs.

        Pipe through the response as is.
        """

        site_url = "http://www.site.com/"

        # We are cleaning everything, but we could limit the effect of purging here
        headers = {
                   # Match all pages
                   "X-Purge-Regex" : ".*"
        }

        resp = requests.request("PURGE", site_url, headers=headers)

        self.request.response["Content-type"] = "text/plain"
        text = []

        text.append("HTTP " + str(resp.status_code))

        # Dump response headers as is to the Plone user,
        # so he/she can diagnose the problem
        for key, value in resp.headers.items():
            text.append(str(key) + ": " + str(value))

        # Add payload message from the server (if any)

        text.append(str(resp.body))

        return "\n".join(text)

Now if you are a Plone site admin and visit the URL:

http://site.com/@@purge

You’ll clear the front end cache.
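
To see what travels between the view and Varnish without a running Plone site, here is a dependency-free sketch. It builds, but does not send, the PURGE request, and reproduces the string that the VCL above hands to purge(). The helper names build_purge_request and ban_expression are made up, and urllib is used here only so that the sketch runs on the standard library alone; the view above uses Requests.

```python
import urllib.request


def build_purge_request(site_url, regex=".*"):
    """Prepare (but do not send) a PURGE request carrying the regex header."""
    return urllib.request.Request(
        site_url,
        method="PURGE",
        headers={"X-Purge-Regex": regex},
    )


def ban_expression(host, regex):
    """Reproduce the string the VCL concatenates and passes to purge()."""
    return "req.http.host == " + host + " && req.url ~ " + regex


# Hypothetical example: limit the purge to URLs under /news/.
request = build_purge_request("http://www.site.com/", regex="^/news/")
expression = ban_expression("www.site.com", "^/news/")
```

Passing the prepared request to urllib.request.urlopen would send it. Keeping the regex narrow, as in this example, limits the purge to one part of the site instead of emptying the whole cache.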


September 06, 2011

Dr Carter: Varnish 3.0.1 released

Summary of changes from 3.0.0 to 3.0.1

  • Objects with grace and keep set were mistakenly seen as candidates for the shortlived storage, but would not be cleaned up quickly, something that manifested as if there was a memory leak. This is now fixed.
  • When multiple clients were waiting for an object, all clients would be woken up when an object became available, leading to stuck threads. This has now been fixed.
  • A bug in how XML entities were handled with ESI has been fixed.
  • The documentation has seen numerous updates.
  • varnishncsa is now more stable and has support for showing arbitrary request and response fields.

Download here.

August 30, 2011

Per Buer: The fourth Varnish User Group meeting is coming up

What is that?

The Varnish User Group meetings are informal meetings where users and developers meet, exchange experiences and discuss the roadmap. The schedule for this year's meeting isn't ready yet, but I know Tollef is planning a "How to write your first VMOD" talk and Martin is going to talk about his upcoming implementation of streaming. If you have any suggestions or have something you want to share, please send me an email or leave a comment.


August 26, 2011

Ingvar Hagelund: rpm packages of varnish-3.0.0

Varnish is a state of the art http accelerator, or frontside cache, if you like.

varnish-3.0.0 was released some weeks ago. I have built packages for Fedora and epel4/5/6. Packages may be found at the usual http://users.linpro.no/ingvar/varnish/. The rhel packages require some dependencies pulled from epel.

Varnish Software produces their own packages, based on the specfile I maintain for Fedora. The changes from their rpm spec are mostly cosmetic to fit better to Fedora’s packaging standards.

August 09, 2011

Per Buer: Upgrading from Varnish Cache 2.1 to 3.0

Andreas Plesner Jacobsen wrote a guide for upgrading Varnish Cache 2.1 to 3.0. Tollef pulled the document into the official documentation. It is now available online.

July 21, 2011

cd34: W3 Total Cache and Varnish

Last week I got called into a firestorm to fix a set of machines that were having problems. As Varnish was in the mix, the first thing I noticed was the hit rate was extremely low as Varnish’s VCL wasn’t really configured well for WordPress. Since WordPress uses a lot of cookies and Varnish passes [...]

July 17, 2011

cd34: Gracefully Degrading Site with Varnish and High Load

If you run Varnish, you might want to gracefully degrade your site when traffic comes unexpectedly. There are other solutions listed on the net which maintain a Three State Throttle, but, it seemed like this could be done easily within Varnish without needing too many external dependencies. The first challenge was to figure out how [...]

July 16, 2011

cd34: Updated WordPress VCL – still not complete, but, closer

Worked with a new client this week and needed to get the VCL working for their installation. They were running W3TC, but, this VCL should work for people running WP-Varnish or any plugin that allows Purging. This VCL is for Varnish 2.x. There are still some tweaks, but, this appears to be working quite well. [...]

June 24, 2011

Per Buer: Video: Kristian Lyngstøl on VMODs

Third installment in the Varnish <3 video series: Kristian Lyngstøl on VMODs. This is the final video of the first batch. Hopefully I'll find time to do more of these from time to time; we have a couple of ideas that should make for relevant content.

Cheers!

June 23, 2011

Per Buer: Interview with Tollef Fog Heen on the tools

The Varnish Cache tools were given a facelift in Varnish Cache 3.0. In this video, Tollef gives you an overview of the changes.

I'll stop embedding the video in the webpage. It gets a bit weird when the article gets aggregated.

 

June 22, 2011

Ingvar Hagelund: The usage of Varnish revisited

Varnish is a high-performance HTTP accelerator, or frontside cache if you like. Working with Varnish is part of my day job. Among other things, I maintain the packages for Fedora and EPEL.

To celebrate the release of Varnish version 3, I decided to poke around lists again, to look for Varnish in common use.

This is more or less a repost, with updated numbers. There is no deep magic here. I just parse some of the available top lists that I know of, and peek at the HTTP headers of the sites that are listed. If there are subsites linked from the front page of the site, I scan them too. Subsites with a Varnish match are shown in parentheses in the results.
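
The post does not say which headers the probe looks at, so the following Python sketch is only a guess at the kind of check involved. It assumes a site counts as a match when its response carries an X-Varnish header or mentions "varnish" in Via or Server, which is what a default Varnish installation adds; looks_like_varnish is a hypothetical name.

```python
def looks_like_varnish(headers):
    """Guess from response headers whether a Varnish cache answered."""
    # Header names are case-insensitive, so normalise before comparing.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    if "x-varnish" in lowered:
        return True
    return "varnish" in lowered.get("via", "") or "varnish" in lowered.get("server", "")


# Hypothetical response headers from two sites.
cached_site = looks_like_varnish({"X-Varnish": "1234 5678", "Via": "1.1 varnish"})
plain_site = looks_like_varnish({"Server": "Apache"})
```

A check like this only finds sites that leave the default headers in place, so sites that strip them stay hidden from it.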

For the Nordic countries, I have quite good lists, that is, upload result lists from the probably most visited media sites in the respective countries. Remember of course, that these are generally pay-to-be-included lists, and there may exist sites with far more hits than the ones listed.

For a global overview, I have used Alexa and Google’s Top 1000 lists.

Now for the results. Varnish is sponsored by large Norwegian sites, so it is no big surprise that there are a lot of hits in Norway. Of the TNS Gallup top list, Varnish runs at a stunning 51 of the top 100 sites. That’s 15 up since my last probe.

For Denmark, I use FDIM’s list. Sorted on page hits, we now rule 15 of the top 100, and 29 in the top 200.

For Sweden, I use the KIA Index list. It is a bit harder to parse, but I think I got it right. Sorting on page hits, in the top 100, we are up to 13, and in the top 200, we find 26 sites running Varnish.

Iceland is finally on the list, with one single item on Modernus’ top list. The lucky site is www.vb.is, which looks like a financial publication.

I haven’t got results for Finland yet, I have to rebuild my parser, it seems.

For what it’s worth, I’ll toss in Germany as well. Four sites in Google’s top 100 sites for Germany, and 13 on the Netcraft toolbar users’ list sounds like a good start to me. And Der Spiegel and Die Zeit are well-known publications.

For Alexa’s World top 500 list, we have 17 instances of Varnish in the top 500. That is the same result as last year. Still no World domination. Google shows us a similar result, with 32 sites running Varnish in its top 1000 list.

We know Facebook, the World’s most visited site, runs Varnish for several of their services, but it is hidden from my probes.

All the gory details are available here.

Other more or less noteworthy sites that have been reported to use Varnish but do not show up in my lists include Slashdot, The Pirate Bay, e.Republik, WOWwiki, Globo.com, PCWelt.de, BlackPlanet, funnyordie.com, n-tv.de, 20minutos.es, theglobeandmail.com and hackint0sh.org, to name a few.

Do you know of other famous sites running Varnish? Use the comments.

June 21, 2011

cd34: When to Cache, What to Cache, How to Cache

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Collide Factory on Saturday, April 2, 2011. These are 5 minute talks where each slide auto-advances after fifteen seconds which limits the amount of detail that can be conveyed. A brief introduction What makes [...]

June 17, 2011

Dr Carter: Varnish 3.0.0 released

Summary of changes from 2.1.5 to 3.0.0

  • Module support through VMODs.
  • Compression and uncompression support, including stitching together compressed ESI fragments.
  • Preliminary streaming support, both on miss and on pass.
  • Much improved documentation.
  • Better default values for parameters.
  • Varnishncsa now has custom log format support.
  • Varnishlog, varnishncsa and varnishhist can now filter out records that match multiple expressions.

Download here.

June 15, 2011

Yves Hwang: Varnish Administration Console - "Dashboard" sneak peek!

June 09, 2011

Kristian Lyngstøl: Varnish 3.0.0 – RSN!

Varnish 3.0.0 beta2 was just released, and we’re aiming for 3.0.0 next week.

The release date is set for Thursday the 16th (next week; June 2011, for the potential future archive crawlers), and several release parties are planned. Varnish Software will be present in Santa Clara, London and Oslo, but numerous parties are planned all around the world.

This will be a very special day for everyone in Varnish Software. Varnish 2.0.0 was released roughly the same week I started working at Redpill Linpro, before I was really involved with Varnish, and I still regret that I didn’t snatch one of those fancy 2.0-t-shirts.

What’s new?

While there are too many new features to mention, there are two that stand out more than anything.

The first is gzip compression and the second is the Varnish module-architecture, or simply vmods. That most of ESI has been re-factored under the hood will be evident in future releases, and the same goes for streaming delivery.

Compression

- “What, you don’t already have compression?!”

Remember, Varnish is a caching service. Most of the time you don’t need it to do the compression, because the web server will do it for you. But there are good reasons for why you want it. ESI is the primary use-case.

With ESI, Varnish needs to parse the content, and it can’t do that if it doesn’t understand the transfer encoding. With Varnish 2.1, you have to send uncompressed data to Varnish if you want to use ESI. This means you have to either deliver uncompressed content to your clients, or use yet another service in front of Varnish.

But let me get a bit technical, because Varnish 3.0′s compression is pretty awe-inspiring.

So with ESI, you have multiple, individually cached elements that make up a single user-visible page. So what we could do, is glue it all together and compress it before we send it. The downside is that we’ll do the compression over and over. An alternative would be to cache the compressed result as long as the individual elements are unchanged, but that will require more space and more complexity.

So what does Varnish 3.0 do? It stores the elements compressed, modifies the right gzip-bits and glues it all together on the fly, without decompressing it. If a single element is changed, only that element needs to be updated. This is probably the best solution, even if the complexity of meddling with binary gzip headers directly can lead to some pretty tricky code. I challenge you to find a solution that handles compression in a smarter way.
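
The idea of gluing compressed pieces together without decompressing them can be illustrated with a simplified Python sketch. This is an analogy and not Varnish's implementation: the text describes Varnish modifying gzip bits directly, while the sketch relies on the simpler property that several complete gzip members can be concatenated and still decompress as one stream. The fragment contents and the stitch helper are made up.

```python
import gzip


def stitch(compressed_parts):
    """Join already-compressed gzip members without decompressing them."""
    return b"".join(compressed_parts)


# Hypothetical page fragments, each compressed and cached on its own.
fragments = [b"<header>site</header>", b"<main>old article</main>", b"<footer>2011</footer>"]
cache = [gzip.compress(part) for part in fragments]

# Only the changed fragment is recompressed; the others are reused as-is.
cache[1] = gzip.compress(b"<main>new article</main>")

# A client that understands gzip sees a single page.
page = gzip.decompress(stitch(cache))
```

The point of the exercise is the cost model: updating one element means recompressing that element alone, and assembling the page is plain byte concatenation.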

Varnish 3.0 also does decompression. If your browser doesn’t support compression (possibly a script or other tool; real browsers support compression), Varnish will decompress the object for you on the fly. This is another huge improvement over Varnish 2.1. In Varnish 2.1, this is solved using Vary: and storing different copies of the same object based on compression.

We can also do the same with the backend-data: If ESI needs to read the data, Varnish 3.0 can decompress it on the fly, parse it, then re-compress it before storing it.

And for you, the user, the complexity is fairly non-existent. Push the button, remove that nginx (no hard feelings) server you had doing compression, ????, profit!

VMODS!

VCL, the Varnish Configuration Language, is a flexible way of configuring Varnish. Since it is already translated from VCL to C, then compiled and linked into the running Varnish instance, VCL has “always” had in-line C: Anywhere you want to, you can tell the VCL compiler that this is pure C code, and pass it directly to the compiler instead of trying to translate from VCL to C first. This was mainly provided because:

1. It didn’t add any complexity and was actually less work.
2. It provided an escape chute for features we didn’t want in Varnish-proper, but were valid for some users.

What features could that be? Syslog-support, geoip-integration, cryptographic verification of authorization headers, etc.

It turned out to be very useful, but impractical and difficult to re-use. Since it was all glued to your VCL, it meant that you had to stick C code in the middle of your regular configuration, and it made it very hard to combine two different features if you didn’t know C yourself. And linking towards external libraries required parameter changes.

Enter varnish modules.

Simply put, vmods are a way of letting you do the same thing you can do with in-line C, but in a more sustainable manner. A vmod will typically have a few functions exposed to VCL, and the VCL just has to declare that it imports the vmod to use the functions without a single line of C-code in your VCL.

This also means that a vmod has its own build system, its own linking process and a flexible development environment, and is much easier to share.

In the time to come, I expect us to have a community-site for vmods. I also expect that you will see a lot of minor yet important changes to Varnish during the Varnish 3.0-cycle that exposes more of the internal varnish utilities so vmods can use them.

Small disclaimer, though: There is no API guarantee. We don’t wish to slow the pace of Varnish-development by restricting what we can change. That said, we won’t tear things apart just to see our vmod-contributors bleed.

I plan to write a blog post on vmod-hacking in the near future, so expect more detail there.

Finishing words

Varnish 1.0 was, as any 1.0-release is, important. I was not involved with the project back then, but as I understand it, Varnish 1 was very much tailored to a specific site, or type of site.

With Varnish 2.0, Varnish became useful for anyone with a high-traffic site. The 2.0-series was a good release-series, adding a healthy mixture of bug fixes and non-intrusive features with each release, while staying focused on real-world usage. We also saw the first Varnish User Group meeting during that time.

Varnish 2.1 has been a sort of intermediate release. Director-refactoring, ESI and to a certain degree persistent storage paved the way for architectural changes that had to be done. Meanwhile, the user-base really exploded.

Now, with Varnish 3.0.0 coming out, we are already seeing how useful vmods are for Varnish Software-customers like Softonic, who are sponsoring a VMOD to allow header-modification targeted at the Set-Cookie header (which can’t be treated the way RFC 2616 specifies, due to how this has been implemented historically). Doing this in a vmod allows the VCL to remain clean. It also lets us share the code that isn’t site-specific but widely useful, since it is all contained in a header vmod (still under development). Gzip also means there are no drawbacks to using ESI with Varnish. You don’t need to add a second service to compress data.

It all boils down to Varnish being a more useful part of your architecture. It’s easy to get fast and maintainable C-code in there if you need it; you can even pay someone to write just that bit for you, without being confined to Varnish’s own road map and release cycle. There are no longer any downsides to using ESI. It’s fast. It’s free software. There are professional service and support offerings available. Varnish follows standards – unless you tell it not to. And so forth.

I’m not usually one to pat myself too much on the back, but as I’m writing this, I feel proud to be part of the team that gives you Varnish 3.0.

If you’re in Oslo, I’ll see you next Thursday!


June 06, 2011

cd34: WordPress, Varnish and ESI Plugin

This post is a version of the slideshow presentation I did at Hack and Tell in Fort Lauderdale, Florida at The Whitetable Foundation on Saturday, June 4, 2011. Briefly, I created a Plugin that enabled Fragment Caching with WordPress and Varnish. The problem we ran into with normal page caching methods was related to the [...]

June 03, 2011

Per Buer: Short interview with Poul Henning Kamp

Hi.

As an experiment, we've been creating short videos about Varnish Cache. We're not quite sure what the format should be, so bear with us. First up is this video, where Poul Henning Kamp gives us his view on what he thinks are the two most important features of Varnish Cache 3.0.

There will be a couple more videos coming soon. If you want us to discuss a certain topic, let us know.

June 02, 2011

Per Buer: Varnish jobs in Oslo

We're growing! 

We're looking for two technical people in Oslo. The specifications aren't very specific. Basically, we're looking for smart people that might help us with one or more of the following:

read more

May 27, 2011

Per Buer: First third party module spotted in the wild

May 23, 2011

Per Buer: Varnish 3.0 - the Standard Module

Module support is perhaps the most exciting feature of Varnish Cache 3.0. It makes it really easy to add quite complex business logic in Varnish or to connect Varnish to any external data source. Do you want to connect Varnish to your MySQL database, so that it will authorize users against a table in MySQL? It's now easy and simple to write a module that does that. It will probably wreak havoc on performance but that's your decision - not ours.

read more

May 20, 2011

Per Buer: Mobile detection in Varnish

Hi.

We're considering releasing a Varnish 3.0 module that makes the content of the WURFL database accessible in VCL. WURFL is a crowd-sourced database consisting of some 13 thousand user-agent strings with accompanying characteristics.

If you are interested in seeing such a thing get developed and willing to spend some money to see it released, either send me an email (perbu [at] varnish-software [dot] com) or leave a comment below.

May 11, 2011

Per Buer: Varnish Cache 3.0 beta 1 is out!

The big new features are compression, basic streaming support and vmods. But there are a bunch of other new features as well. Installation notes and such and full change log.

 

Per Buer: Streaming in Varnish 3.0

Varnish Cache 2.1 does what you can call store-and-forward proxying. It gets the request, then it turns around, gets the whole object from the backend, then turns around and delivers it to the client. If your backend takes time to deliver the object the client might get restless.

read more

May 05, 2011

Per Buer: Bans and purges in Varnish 3.0

In Varnish 1.0 there was only one way of ejecting content from Varnish. You had to add VCL code that could find the object and set the TTL to zero. The typical, and Squid-compatible, way of doing it was by creating a new HTTP method and calling it "PURGE". The VCL would typically look like this:

read more

May 03, 2011

Per Buer: Varnish 3.0 changes - ESI and gzip

Hi.

Finally Varnish 3.0 is feature complete and we're about to roll out a beta. All in all we've done quite a lot of testing on this release and the code seems to be quite stable. We've been testing with some rather busy websites and both speed and stability are very good.

read more

April 27, 2011

Per Buer: Updates on www.varnish-cache.org

As the snow is melting in Oslo we've spent some time doing some spring renovations on www.varnish-cache.org. We've added a forum and a new FAQ.

read more

April 01, 2011

Leszek Urbanski: Content authorization with Varnish

I’ve been asked about this so many times that I thought I should just post it here. It’s actually very simple to do using restarts.

The problem: you need to check if a user is authorized for an object (which may or may not already be cached by Varnish) by means of an external application.

The solution: the following VCL will pass GET requests from the users to the authorization app. You can modify the URLs, e.g. insert a custom query string if required by the app.

The request is then either denied (if the auth app returns anything other than a 200) or restarted and served from the real backend or from cache.

This is only an example; you can extend it to cache authorization responses, add a control header if you use restarts anywhere else in your VCL, etc.

sub vcl_recv {
        if (req.url ~ "^/authorized_content") {
          if (req.restarts == 0) {
            set req.backend = authorization_backend;
            return(pass);
          } else {
            set req.backend = real_backend;
            set req.url = regsub(req.url, "_authorize_me", "");
          }
        }
}

sub vcl_fetch {
        if (req.url ~ "^/authorized_content" && req.restarts == 0) {
          if (beresp.status == 200) {
            restart;
          } else {
            error 403 "Not authorized";
          }
        }
}
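The VCL above assumes two backends that are defined elsewhere. For completeness, a sketch of what those definitions might look like; the addresses and ports are placeholders, not values from a real setup:

```vcl
# Hypothetical address of the application that answers 200 or non-200.
backend authorization_backend {
    .host = "127.0.0.1";
    .port = "8081";
}

# Hypothetical address of the server that holds the actual content.
backend real_backend {
    .host = "127.0.0.1";
    .port = "8080";
}
```

Declare them before vcl_recv so that the backend names are known when the subroutines are compiled.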

March 16, 2011

Kristian Lyngstøl: The many pitfalls of benchmarking

I was made aware of a synthetic benchmark that concerned Varnish today, and it looked rather suspicious. The services tested were Varnish, nginx, Apache and G-Wan. And G-Wan came out an order of magnitude faster than Varnish. This made me question the result. The first thing I noticed was AB, a tool I’ve long since given up trying to make behave properly. As there was no detailed data, I decided to give it a spin myself.

You will not find graphs. You will not find “this is best!”-quotes. I’m not even backing up my statements with httperf-output.

Disclaimer

This is not a comparison of G-Wan versus Varnish. It is not complete. It is not even a vague attempt at making either G-Wan or Varnish perform better or worse. It is not realistic. Not complete and in no way a reflection on the overall functionality, usability or performance of G-Wan.

Why not? Because I would be stupid to publicize such things without directly consulting the developers of G-Wan so that the comparison would be fair. I am a Varnish-developer.

This is a text about stress testing. Not the result of stress testing. Nothing more.

The basic idea

So G-Wan was supposedly much faster than Varnish. The feature-set is also very narrow, as it goes about things differently. The test showed that Varnish, Apache and nginx were almost comparable in performance, whereas G-Wan was ridiculously much faster. The test was also conducted on a local machine (so no networking) and using AB. As I know that it’s hard to get nginx, Apache and Varnish to perform within the same level, this indicated to me that G-Wan did something differently that affected the test.

I installed G-Wan and Varnish on a virtual machine and started playing with httperf.

What to test

The easiest number to demonstrate in a test is the maximum request rate. It tells you what the server can do under maximum load. However, it is also the hardest test to do precisely and fairly across daemons of vastly different nature.

Another thing I have rarely written about is the response time of Varnish for average requests. This is often much more interesting to the end user, as your server isn’t going to be running at full capacity anyway. Fairness and concurrency are also highly relevant. A user doing a large download shouldn’t adversely affect other users.

I wasn’t going to bother with all that.

First test

The first test I did was “max req/s”-like. It quickly showed that G-Wan was very fast, and in fact faster than Varnish. At first glance. The actual request-rate was faster. The CPU-usage was lower. However, Varnish is massively multi-threaded, which offsets the cpu measurements greatly and I wasn’t about to trust it.

Looking closer I realized that the real bottleneck was in fact httperf. With Varnish, it was able to keep more connections open and busy at the same time, and thus hit the upper limit of concurrency. This in turn gave subtle and easily ignored errors on the client which Varnish can do little about. It seemed G-Wan was dealing with fewer sessions at the same time, but faster, which gave httperf an easier time. This does not benefit G-Wan in the real world (nor does it necessarily detract from the performance), but it does create an unbalanced synthetic test.

I experimented with this quite a bit, and quickly concluded that the level of concurrency was much higher with Varnish. But it was difficult to measure. Really difficult. Because I did not want to test httperf.

The hardware I used was my home-computer, which is ridiculously overpowered. The VM (KVM) was running with two CPU cores and I executed the clients from the host-OS instead of booting up physical test-servers. (… That 275k req/s that’s so much quoted? Spotify didn’t skip a beat while it was running (on the same machine). ;) )

Conclusion

The more I tested this, the more I was able to produce any result I wanted by tweaking the level of concurrency, the degree of load, the amount of bandwidth required and so forth.

The response time of G-Wan seemed to deteriorate with load. But that might as well be the test environment. As the load went up, it took a long time to get a response. This is just not the case with Varnish at all. I ended up doing a little hoodwinking at the end to see how far this went, and the results varied extremely with tiny variations of test-parameters. The concurrency is a major factor. And the speed of Varnish at each individual connection played a huge part. At large amounts of parallel requests Varnish would be sufficiently fast with all the connections that httperf never ran into problems, while G-Wan would be more uneven and thus trigger failures (and look slower)…

My only conclusion is that it will take me several days to properly map out the performance patterns of Varnish compared to G-Wan. They treat concurrent connections vastly differently and perform very differently depending on the load-pattern you throw at them. Relating this to real traffic is very hard.

But this confirms my suspicion of the bogus-ness of the blog post that led me to perform these tests. It’s not that I mind Varnish losing performance tests if we are actually slower, but it’s very hard to stomach when the nature of the test is so dubious. The art of measuring realistic performance with synthetic testing is not one that can be mastered in an afternoon.

Lessons learned

(I think conclusions are supposed to be last, but never mind)

First: Be skeptical of unbalanced results. And of even results.

Second: Measure more than one factor. I’ve mainly focused on request-rate in my posts because I do not compare Varnish to anything but itself. Without a comparison it doesn’t make that much sense to provide reply latency (though I suppose I should start supplying a measure of concurrency, since that’s one of the huge strong-points of Varnish.).

Third: Conclude carefully. This is an extension of the first lesson.

A funny detail: While I read the license for the non-free G-Wan, which I always do for proprietary software, I was happy to see that it didn’t have a benchmark-clause (Oracle, anyone?). But it does forbid removing or modifying the Server:-header. It also forces me to give the G-Wan-guys permission to use my using of G-Wan in their marketing… Hmm — maybe I should … — err, never mind.


Kristian Lyngstøl: Varnish Seminar in Paris

I will be in Paris next week to participate in a seminar on Varnish at Capgemini’s premises. If you are in the area and interested in Varnish, take a look at https://www.varnish-software.com/paris. The nature of the event is informational for technical minds.

(This must be my shortest blog-post by far)


March 15, 2011

Per Buer: Getting a pretty panic

We're busy debugging trunk these days. Varnish does a fairly good job of catching a stack trace when the child process crashes. However, as the stack trace is sent to syslog, syslog mangles it and loses a bit of information. Tollef grew tired of getting people to fish out traces from syslog and implemented the CLI command panic.show. It looks like this....

read more

March 11, 2011

Per Buer: Online Training and the world's second best OSS project

March 09, 2011

Mikko Ohtamaa: Lazily load elements becoming visible using jQuery

It is a useful trick to lazily load comments or similar elements at the bottom of a page. Some elements may be loaded only when they are scrolled into view.

  • Not all users are interested in the information, and they do not necessarily read the article long enough to see it
  • By lazily loading such elements one can speed up the initial page load time
  • You save bandwidth
  • If you use AJAX for the dynamic elements of the page you can more easily cache your pages in static page cache (Varnish) even if the pages contain personalized bits

For example, Disqus is doing this (see comments in jQuery API documentation).

You can achieve this with the in-view plug-in for jQuery.

Below is an example for Plone triggering productappreciation_view loading when our placeholder div tag becomes visible.

...
<head>
  <script type="text/javascript" tal:attributes="src string:${portal_url}/++resource++your.app/in-view.js"></script>
</head>
...
<div id="comment-placeholder">

 <!-- Display spinning AJAX indicator gif until our AJAX call completes -->

 <p>
 <!-- Image is in Products.CMFPlone/skins/plone_images -->
 <img tal:attributes="src string:${context/@@plone_portal_state/portal_url}/spinner.gif" /> Loading comments
 </p>

 <!-- Hidden link to a view URL which will render the view containing the snippet for comments -->                       
 <a rel="nofollow" style="display:none" tal:attributes="href string:${context/absolute_url}/productappreciation_view" />

 <script>

 jq(document).ready(function() {

   // http://remysharp.com/2009/01/26/element-in-view-event-plugin/                                        
   jq("#comment-placeholder").bind("inview", function() {

     // This function is executed when the placeholder becomes visible

     // Extract URL from HTML page
     var commentURL = jq("#comment-placeholder a").attr("href");

     if (commentURL) {
     // Trigger AJAX call
       jq("#comment-placeholder").load(commentURL);
     }

   });                                     

 });     
 </script>
</div>


March 04, 2011

Mikko Ohtamaa: Setting up Apache virtual hosts behind Varnish

Varnish is a very fast front end cache server. You might want to use it at the front of Apache to speed up loading of your static pages and static media, for example for your WordPress blog. You can also use Varnish backends to multiplex the requests between Plone and Apache based PHP software running on the same server using different backend directives.

However, if you wish to use Apache virtual hosts with Varnish, there is a trick to it.

We use the following setup:

  • Varnish listens to port 80, HTTP
  • Apache listens to port 81
  • Varnish uses Apache as a backend

The related varnish.vcl is:

backend backend_apache {
.host = "127.0.0.1";
.port = "81";
}

sub vcl_recv {
 ...
 elsif (req.http.host ~ "^blog.mfabrik.com(:[0-9]+)?$") {
    set req.backend = backend_apache;
 }
 ...
}

Note that the backend IP is 127.0.0.1 (localhost). By default, with Debian or Ubuntu Linux, Apache configuration does not do virtual hosting for this.

Suppose /etc/apache2/sites-enabled/blog.mfabrik.com looks like this:

<VirtualHost *:81>

 ServerName blog.mfabrik.com
 ...
 LogFormat       combined
 TransferLog     /var/log/apache2/blog.mfabrik.com.log

 ...

 ExpiresActive On
 ExpiresByType image/gif A3600
 ExpiresByType image/png A3600
 ExpiresByType image/vnd.microsoft.icon A3600
 ExpiresByType image/jpeg A3600
 ExpiresByType text/css A3600
 ExpiresByType text/javascript A3600
 ExpiresByType application/x-javascript A3600
</VirtualHost>

And now the trick – you need to add the following to /etc/apache2/httpd.conf

NameVirtualHost *:81

Unless you do all this, Apache will just pick the first virtualhost file in /etc/apache2/sites-enabled and use it for all requests.

Also you need to edit ports.conf and change Apache to listen to port 81:

Listen 81


February 28, 2011

Per Buer: Why ICP isn't happening (and is generally a bad idea)

Two features are very often asked for in Varnish. One is SSL, and we'll be coming back to that one later. The other is ICP, the Internet Cache Protocol as defined in RFC 2186.

read more

February 15, 2011

Dr Carter: Need to purge Varnish of all websites containing a string?

With Varnish 2.1.4, try this:

If you want to purge everything matching this regexp: ^(www\.)?(.*)xxx

you must run: varnishadm -T localhost:6082 purge "req.http.host ~ \"^(www\\\.)?(.*)xxx\""

Be careful to use the correct backslash sequence, in order to avoid: Syntax Error: Invalid backslash sequence

February 04, 2011

Ingvar Hagelund: rpm packages of varnish-2.1.5

Varnish is a state of the art http accelerator, or frontside cache, if you like.

varnish-2.1.5 was released the other day. I have updated my packages in Fedora and epel6. Builds for rhel4 and rhel5 may be found at the usual http://users.linpro.no/ingvar/varnish/. The rhel5 packages require some dependencies pulled from epel5.

Varnish Software produces their own packages, based on the specfile I maintain for Fedora. The only important change is that my spins link against a system installed jemalloc, instead of the one provided with the source. This gives us the opportunity to update jemalloc to the latest version without recompiling varnish.

I also build packages for rhel4. While probably unsupported from Varnish Software, it compiles and runs the test suite after some small fixes to the build. jemalloc packages are provided as well.

February 03, 2011

Per Buer: Varnish Source Code - visualized

This morning Kristian tried to throw Gource at the Varnish Git repository. The results were spectacular.

Video: http://www.youtube.com/embed/5mZA-KbP5WQ

February 01, 2011

Factual Dev Blog: A Practical Guide to Varnish

Why Varnish Matters…

What is Varnish?

Varnish is an open source, high performance http accelerator that sits in front of a web stack and caches pages.  This caching layer is very configurable and can be used for both static and dynamic content.

One great thing about Varnish is that it can improve the performance of your website without requiring any code changes.  If you haven’t heard of Varnish (or have heard of it, but haven’t used it), please read on.  Adding Varnish to your stack can be completely noninvasive, but if you tweak your stack to play along with some of varnish’s more advanced features, you’ll be able to increase performance by orders of magnitude.

Some of the high profile companies using Varnish include: Twitter, Facebook, Heroku and LinkedIn.

Our Use Case

One of Factual’s first high profile projects was Newsweek’s “America’s Best High Schools: The List”. After realizing that we had only a few weeks to increase our throughput tenfold, we looked into a few options. We decided to go with Varnish because it was noninvasive, extremely fast and battlefield tested by other companies. The result was a system that performed 15 times faster and a successful launch that hit the front page of msn.com.  Varnish now plays a major role in our stack and we’re looking to implement more performance tweaks designed with Varnish in mind.

A Simple Use Case

The easiest and safest way to add Varnish to your stack is to serve and cache static content.  Aside from using a CDN, Varnish is probably the next best thing that you can use for free.  However, dynamic content is where you can squeeze real performance out of your stack if you know where and how to use it.  This guide will only scratch the surface on how Varnish can drastically improve performance.  Advanced features such as edge side includes and header manipulation allow you to leverage Varnish for even higher throughput.  Hopefully, we’ll get to more of these advanced features in future blog posts, but for now, we’ll just give you an introduction.

“Hello World”

Installation

Please follow the installation guide on Varnish’s documentation page. http://www.varnish-cache.org/docs

Assuming you’ve installed it correctly, you should be able to run both your webserver and Varnish on different ports. The rest of this guide will assume that you have your webserver running on port 8080 and Varnish running on port 80.

Varnish Configuration Language: VCL

Varnish uses its own domain specific language for configuration. Unlike a lot of other projects, Varnish’s configuration language is not declarative. It’s very expressive and yet easy to follow. On Ubuntu, Varnish’s config file is located here: /etc/varnish/default.vcl. A lot of the examples we’ll dive into are based on Varnish’s own documentation.

This is a simple Varnish config file that will cache all requests whose URI begins with “/stylesheets”. There are a few things to note here that we’ll explain later:

  • the removal of the Accept-Encoding header
  • the removal of Set-Cookie
  • return(lookup) and return(pass) in vcl_recv
# Defining your webserver.
backend default {
  .host = "127.0.0.1";
  .port = "8080";
}
 
# Incoming request
# can return pass or lookup (or pipe, but not used often)
sub vcl_recv {
 
  # set default backend
  set req.backend = default;
 
  # remove Accept-Encoding so the webserver sends unencoded content
  unset req.http.Accept-Encoding;
 
 
  # lookup stylesheets in the cache
  if (req.url ~ "^/stylesheets") {
    return(lookup);
  }
 
  return(pass);
}
 
# called after recv and before fetch
# allows for special hashing before cache is accessed
sub vcl_hash {
 
}
 
 
# Called after the response has been fetched from the webserver
# returns pass or deliver
sub vcl_fetch {
  if (req.url ~ "^/stylesheets") {
    # removing cookie
    unset beresp.http.Set-Cookie;
 
    # Cache for 1 day
    set beresp.ttl = 1d;
    return(deliver);
  }
}
 
# called after fetch or lookup yields a hit
sub vcl_deliver {
 
}
 
# called when an error response is generated
sub vcl_error {
 
}

Now let’s look at a few things in detail:

Removing Accept-Encoding Header

The reason this is done is because Varnish doesn’t handle encodings (gzip, deflate, etc…). Instead, Varnish will defer to the webservers to do this. For now, we’re going to ignore this header and just have the webservers give us non-encoded content. The proper way to handle encodings is to have the encoding normalized, but we’ll discuss this later.

Removal of Set-Cookie

We do this because we don’t want the webserver giving us session-specific content. This is just a safeguard and is probably a little unnecessary, but it’s a good thing to note when caching. We’ll discuss session-specific content later.

Returning “pass” vs “lookup”

Returning “pass” tells Varnish to not even try to do a cache lookup. Returning “lookup” tells Varnish to look up the object in its cache instead of fetching it from the webserver. If the object is cached, the webserver is never hit. If it isn’t in the cache, Varnish fetches the content from the webserver and then calls vcl_fetch on the response before storing it.

Manipulating the Hashing Function

User/Session Specific Content

Let’s say that we want to cache every user’s “/profile” page. This can be done by including the cookie in the hash function like this:

sub vcl_hash {
  if (req.url ~ "^/profile$") {
    set req.hash += req.http.cookie;
  }
}
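Note that set req.hash += is Varnish 2.x syntax. In Varnish 3.0 the same idea is written with hash_data(). A minimal sketch, assuming the same /profile URL; the vcl_recv part is an assumption about how to get past the default VCL, which passes any request that carries a Cookie header:

```vcl
sub vcl_recv {
  if (req.url ~ "^/profile$") {
    # Without this, the default vcl_recv would pass requests with cookies.
    return (lookup);
  }
}

sub vcl_hash {
  if (req.url ~ "^/profile$") {
    # Add the cookie to the hash; the default vcl_hash still adds URL and host.
    hash_data(req.http.Cookie);
  }
}
```

Keep in mind that hashing on the whole Cookie header gives one cached copy per distinct cookie string, so this only pays off when users request the same page repeatedly.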

Canonicalized Url Caching

In Ruby on Rails, it is common practice to attach trailing timestamps at the end of static content to ensure that the web browser doesn’t cache it (e.g. /stylesheets/main.css?123232113). Let’s say we don’t want to include this when we cache our stylesheets. Here is an example that will remove the trailing timestamp.

sub vcl_hash {
  if (req.url ~ "^/stylesheets") {
    set req.url = regsub(req.url, "\?\d+", "");
  }
}
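The example above rewrites req.url itself, so the webserver will most likely see the stripped URL as well. If you would rather leave the request untouched, one option is to hash a cleaned copy instead. This is a sketch in Varnish 2.1 syntax that mirrors what the default vcl_hash does for the host, not something taken from the Varnish documentation:

```vcl
sub vcl_hash {
  if (req.url ~ "^/stylesheets") {
    # Hash the URL without the trailing ?digits; req.url is not modified.
    set req.hash += regsub(req.url, "\?[0-9]+$", "");

    # Same host handling as the default vcl_hash.
    if (req.http.host) {
      set req.hash += req.http.host;
    } else {
      set req.hash += server.ip;
    }

    # Return here so the default vcl_hash does not add the full URL again.
    return (hash);
  }
}
```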

Browser Specific CSS

Caching browser specific content: one trick we use is to have a small portion of our css be browser specific to handle various differences between browsers.  We do this by having a dynamic call that will serve up css based on the User-Agent header.  The problem with this technique is that we’ll have different css being served by the same url.   Varnish can still cache this by adding the User-Agent header to the hash like so:

sub vcl_hash {
  if (req.url ~ "^/stylesheets/browser_specific.css") {
    set req.hash += req.http.User-Agent;
  }
}

ACLs

Varnish has options to create ACLs to restrict access to certain requests:

# create ACL
acl admin {
  "localhost";
  "192.168.2.20";
}
 
sub vcl_recv {
  # protect admin urls from unauthorized ip's
  if (req.url ~ "^/admin") {
    if (client.ip ~ admin) {
      return(pass);
    } else {
      error 405 "Not allowed in admin area.";
    }
  }
}
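ACL entries are not limited to single hosts. A sketch with a made-up subnet and one excluded address (the acl name and all addresses are illustrative values only):

```vcl
acl admin_net {
  "localhost";
  # a whole subnet in CIDR notation
  "192.168.2.0"/24;
  # ...except this one host
  ! "192.168.2.99";
}

sub vcl_recv {
  if (req.url ~ "^/admin") {
    if (!client.ip ~ admin_net) {
      error 403 "Forbidden.";
    }
  }
}
```

Here 403 is used instead of 405, since the request method is fine and only the client address is refused; that choice is a matter of taste.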

Purging

There are times when we need to purge certain cached objects without restarting the server. Varnish allows 2 ways to purge: lookup and url. These examples are based on the Varnish documentation page on purging: http://www.varnish-cache.org/trac/wiki/VCLExamplePurging

Purge by lookup

Purging by lookup uses the vcl_hit function and a “PURGE” HTTP method:

acl purgeable {
  "localhost";
  "192.168.2.20";
}
 
sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purgeable) {
      error 405 "Not allowed to purge.";
    }
    # look the object up so that vcl_hit or vcl_miss handles the purge
    return(lookup);
  }
}

sub vcl_hit {
  if (req.request == "PURGE") {
    # expire the cached object immediately
    set obj.ttl = 0s;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if (req.request == "PURGE") {
    # nothing is cached under this hash, so there is nothing to expire
    error 404 "Not in cache.";
  }
}

Purge by URL

Purging by url is probably a safer bet if you are using cookies or any other tricks in your hash function:

sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purgeable) {
      error 405 "Not allowed.";
    }
    purge("req.url == " req.url " && req.http.host == " req.http.host);
    error 200 "Purged.";
  }
}
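In Varnish 3.0 the purge() function used here is called ban(), and strings are joined with +. A sketch of the same subroutine in 3.0 syntax, assuming the purgeable ACL from the earlier example is still defined:

```vcl
sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purgeable) {
      error 405 "Not allowed.";
    }
    # Add a ban that matches this exact URL on this exact host.
    ban("req.url == " + req.url + " && req.http.host == " + req.http.host);
    error 200 "Banned.";
  }
}
```

A ban does not remove anything right away. It is added to a list that cached objects are checked against, so matching objects stop being served from that point on.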

Handling Encodings

It’s good to canonicalize your encoded requests because you could either get redundant cached objects, or you could end up returning incorrectly encoded objects. For more details, please refer to the Varnish FAQ on Compression. Below is a snippet from that page.

if (req.http.Accept-Encoding) {
  if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
    # No point in compressing these
    remove req.http.Accept-Encoding;
  } elsif (req.http.Accept-Encoding ~ "gzip") {
    set req.http.Accept-Encoding = "gzip";
  } elsif (req.http.Accept-Encoding ~ "deflate" && req.http.user-agent !~ "Internet Explorer") {
    set req.http.Accept-Encoding = "deflate";
  } else {
    # unknown algorithm
    remove req.http.Accept-Encoding;
  }
}

Advanced Backends

Multiple Backends

Let’s pretend that we have a special assets server that serves up just our stylesheets. Here is an example of having multiple backends for this purpose:

backend default {
  .host = "127.0.0.1";
  .port = "8080";
}
 
backend stylesheets {
  .host = "10.0.0.10";
  .port = "80";
}
 
sub vcl_recv {
  if (req.url ~ "^/stylesheets") {
    # set stylesheets backend
    set req.backend = stylesheets;
    return(lookup);
  }
 
  # set default backend
  set req.backend = default;
  return(pass);
}

Round Robin and Random Multiple Server Backend

backend server1 {
  .host = "10.0.0.10";
}
 
backend server2 {
  .host = "10.0.0.11";
}
 
director multi_servers1 round-robin {
  {
    .backend = server1;
  }
  {
    .backend = server2;
  }
}
 
director multi_servers2 random {
  {
    .backend = server1;
    # the random director expects a weight for each backend
    .weight = 1;
  }
  {
    .backend = server2;
    .weight = 1;
  }
}
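Defining a director does nothing until a request is pointed at it. A minimal sketch, assuming you want stylesheets on the round-robin director and everything else on the random one (that split is invented for the example):

```vcl
sub vcl_recv {
  if (req.url ~ "^/stylesheets") {
    # directors are assigned just like ordinary backends
    set req.backend = multi_servers1;
  } else {
    set req.backend = multi_servers2;
  }
}
```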

Varnish stays on our stack happily ever after…

When we first started using Varnish, it was out of desperation and all new to us.  Over the past year, we’ve been figuring out ways to leverage its performance in more creative ways.  At this point, we couldn’t imagine putting together a stack that didn’t include this great project.

We hope this post has been helpful for anyone interested in getting Varnish set up for the first time.  Future blog posts will include more advanced features.


January 30, 2011

Dr Carter: Varnish 2.1.5 released

Summary of changes from 2.1.4 to 2.1.5

  • Two bugs relating to Content-Length and possible duplication of Content-Length headers have been resolved.
  • Support for Bourne-like “here”-documents in the command line interface, allowing <<__EOF__ and similar schemes.
  • Fixed an issue with re-using connections after Chunked-Encoding.
  • Fix a bug that would inflate the “lost header” count and could cause problems during heavy traffic over a single connection, typically seen by load testing.
  • Use the time of cache-insertion for “If-Modified-Since” requests if a “Last-Modified” header isn’t provided by the backend.
  • Merge multi-line Vary and Cache-Control headers from clients, which Google Chromium seems to split up.
  • Various build fixes and documentation improvements
  • Various bug fixes.

Download here.