Kristian Lyngstol's Blog

A free software-hacker's blog

Category Archives: Varnish

The Varnish Book

In 2008 Tollef Fog Heen wrote the initial slides used for our first Varnish System Administration training course. Since then, I’ve been the principal maintainer and author of the material. We contracted Jérôme Renard to adapt the course for web developers in 2011, and I’ve since spent some time pulling it all together into one coherent book.

Today we make the Varnish Book available on-line under a Creative Commons CC-BY-NC-SA license.

See the overview page, or the Sphinx-rendered HTML variant.

The All-important content!

The book contains all the content we use for both the system administration course and the web developer course. While each of those courses omits some chapters (the sysadmin course omits HTTP and Content Composition, while the webdev course omits the Tuning and Saving a Request chapters), the content is structured with this in mind.

The book does not teach you everything about Varnish.

If you read the book and perform the exercises (without cheating!) you will:

  • Understand basic Varnish configuration and administration
  • Know when modifying parameters is sensible
  • Have extensive understanding of VCL as a state engine and language
  • Understand how caching in HTTP affects Varnish, intermediary caches and browser caches
  • Know how to use Varnish to recover from or prevent error-situations
  • Have in-depth knowledge of the different methods of cache-invalidation that Varnish can provide
  • Know several good ways to approach cookies and user-specific content
  • Know of Edge Side Includes
  • Much more!

I’ve gradually restructured the book to be about specific tasks instead of specific features. Because of that, there are several features or techniques that are intentionally left out or down-played to make room for more vital concepts. This has been a gradual transition, and I’m still not entirely satisfied, but I believe this approach to learning Varnish is much more effective than trying to just teach you the facts.

One of my favorite examples of this is probably the Cache Invalidation chapter. We used to cover the equivalent of purging in the VCL chapters, since it is a simple enough feature, then cover banning in a separate chapter. The problem with that mentality is that when you are setting up Varnish, you don’t think “I need to write VCL”. You think “I need to figure out how to invalidate my cache” or “how do I make Varnish fall back to a different web server if the first one failed”.

I have learned a great deal about Varnish, about how people learn and about the web in general while holding these training courses and writing this book. I hope that by releasing it in the open, even more people will get to know Varnish.

Future Content

The book will continue to change. We at Varnish Software will update it for new versions of Varnish and take care of general maintenance.

I hope that we will also get some significant feedback from all you people out there who will read it. We appreciate everything from general evaluation and proof reading to more in-depth discussions.

One of the more recent topics I want to cover in the book is Varnish Modules. This is still quite new, so I’m in no rush. I still haven’t decided what that chapter should cover. It might be about available vmods and practical usage of them, or we might go more in depth and talk about how to start writing your own. I really don’t know.

Another topic I really wish to expand upon is content composition. The material Jérôme provided for us was excellent, but I wish to go broader and also make it available in a couple of languages other than just PHP. There is some work in this area already; I just can’t say much more about it yet…

You will probably also see rewrites of the first chapter and the Programs-chapter in the near future. They are both overdue for a brush-up.

In the end, though, this a book that will continue to evolve as long as people take interest. What it covers will be defined largely based on feedback from course participants, feedback from people reading it on-line and the resources needed to implement those changes.


We chose a CC-BY-NC-SA license because we want both to make the material available to as many people as possible and to make sure that we don’t put ourselves out of the training business by providing a ready-made course for potential competitors.

Being one of those people who actually read licenses and try to interpret their legal ramifications, I’ve obviously also read the CC-BY-NC-SA license we use. It is (intentionally) vague when it comes to specifying what “non-commercial” means. My interpretation, with regard to our training material, is that you can read it as much as you want, regardless of whether you are at work or what commercial benefit you gain from understanding the content. You can also hold courses in non-profit settings (your local LUG, for instance), and some internal training will probably be a non-issue too. However, the moment you offer training for a profit to other parties, you’re violating the license. You’ll also be violating it if you print and sell the book for a profit. Printing and selling it to cover the cost of printing is allowed (it’s one of the few things the license actually clarifies).

Since we are using a “NC”-license, we’ll also be asking for copyright assignment and/or a contributor’s agreement if you wish to contribute significantly. This is so we can use your material in commercial activities. Exactly how this will be done is not yet clarified.

One last point: if you are contributing to the upstream Varnish documentation, we will not consider it a breach of license if you borrow ideas from the book. Our goal is to make sure the book interacts well with the other documentation while covering our expenses.

Varnish Training

As anyone who’s worked with me should realize by now, I’m big on documentation, be it source code comments or other types of documentation. The only reason I’m not more active in the documentation section of Varnish Cache is because I’ve maintained our (Varnish Software’s) training material ever since Tollef Fog Heen wrote the initial slides in 2009.

I’ve held the course more times than I can remember, and have usually made improvements after every course. Others have also held it, including Redpill Linpro’s Kacper Wysocki, maintainer of security.vcl, and Magnus Hagander (PostgreSQL god/Swede). Feedback and gradual improvements have turned a set of slides into pretty good course material.

We recently started holding on-line courses too. This revealed several new challenges. The obvious ones are things like getting basic voice communication working (it sounds easy, but you’d be amazed…). It was also interesting when I held the course from my apartment on Australian time, and my ISP decided to perform maintenance on my cable modem (it was 2AM local, after all). So I’ve had to hold the course over a 3G connection, communicating with Australia. Fun. Then there’s the lack of, or severely reduced, feedback, which presents challenges in how we do exercises and generally run the course. In a classroom I can easily determine whether the participants are able to keep up, whether I’m going too slow or too fast, and whether one subject is more interesting than another. All of that is, at best, very difficult in an on-line session.

The last few weeks I’ve finally gotten around to merging the sysadmin course with the web development course that Jérôme Renard wrote for us. It proved the perfect opportunity to give the course another big update. While the course was already updated for Varnish 3, I’ve made several other Varnish 3-related additions. More importantly, the flow of the course has changed from one oriented around Varnish functionality to one oriented around tasks you wish to accomplish with Varnish. Instead of teaching you about Varnish architecture first, then Varnish parameters, the course now has a chapter devoted to Tuning.

Instead of just throwing in purging or banning when talking about VCL, there’s now a chapter called Cache Invalidation that attempts to give a broader understanding of the alternatives you have and when to use which solution. Similarly, there’s a chapter called Saving The Request, which starts out with the core grace mechanisms, then moves on to health checks, saint mode, req.hash_always_miss, directors and more.
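As a rough sketch of the kind of ground the Cache Invalidation chapter covers, here is a minimal VCL example (Varnish 3 syntax) of accepting HTTP PURGE requests from trusted addresses only. The ACL name and network are placeholders, not anything prescribed by the book:

```vcl
# Hypothetical example: allow HTTP PURGE from a trusted network only.
acl purgers {
    "127.0.0.1";
    "192.168.0.0"/24;  # placeholder network
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ purgers) {
            error 405 "Not allowed";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        # Purge on miss too, so variants under the same hash are evicted.
        purge;
        error 200 "Purged";
    }
}
```

The point the chapter makes is exactly that this is one of several invalidation tools; banning and forced misses are alternatives with different trade-offs.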

There are several reasons I write about this. First of all: I’m very excited about the material. I’ve worked on it regularly for several years, doing everything from hacking rst2s5, tweaking the type setting and design to updating the content, reorganizing the course and of course holding it. It may seem like one big marketing stunt, but I can promise you that I never blog about something I’m not passionate about, regardless of whether it is work-related or not.

The other reason is that I’m holding the course next week. This will be the first time we hold it using the changed structure. I would have preferred to hold it in a classroom first, but holding it on-line is still exciting.

If you wish to participate, sign up and convince your boss it will be awesome!

Varnish backend selection through DNS

A common challenge to using a cache is maintaining a mapping between public site names and actual web servers (backends). If you only have one type of web server (or maybe two?), and it’s fairly static, this isn’t a big deal. However, if your infrastructure spans tens of different types of web servers, then it starts getting iffy. Here’s an example of how this could look:

director sports round-robin {
    { .backend = { .host = "sports1.example.com"; .port = "80"; } }
    { .backend = { .host = "sports2.example.com"; .port = "80"; } }
    { .backend = { .host = "sports3.example.com"; .port = "80"; } }
}
director shop round-robin {
    { .backend = { .host = "shop1.example.com"; .port = "80"; } }
    { .backend = { .host = "shop2.example.com"; .port = "80"; } }
    { .backend = { .host = "shop3.example.com"; .port = "80"; } }
}
director economy round-robin {
    { .backend = { .host = "economy1.example.com"; .port = "80"; } }
    { .backend = { .host = "economy2.example.com"; .port = "80"; } }
    { .backend = { .host = "economy3.example.com"; .port = "80"; } }
}
director main round-robin {
    { .backend = { .host = "www1.example.com"; .port = "80"; } }
    { .backend = { .host = "www2.example.com"; .port = "80"; } }
    { .backend = { .host = "www3.example.com"; .port = "80"; } }
}

# Host names and patterns are illustrative.
sub vcl_recv {
    if (req.http.host ~ "^sports\.example\.com$") {
        set req.backend = sports;
    } elsif (req.http.host ~ "^shop\.example\.com$") {
        set req.backend = shop;
    } elsif (req.http.host ~ "^economy\.example\.com$") {
        set req.backend = economy;
    } else {
        set req.backend = main;
    }
}

This is obviously a bit of a drag, and we have only added four sites so far.

Enter the DNS director

The DNS director allows you to define a single director containing one or more backends, just like any other backend director, but uses DNS to decide which one to pick. Simply put, it does a DNS lookup on the Host header and sees if it has a backend that matches.

Notice that it does NOT automatically try whatever IP the Host header resolves to. It has to know about the backend in advance. This might sound like a bit of a major flaw, but I choose to look at it as a safety net.

The DNS director also allows you to add a suffix to the host name before it is looked up (appending an internal domain, for instance). It has rudimentary DNS round-robin support and caches the DNS lookups (both successful lookups and misses). Since there isn’t a practical way of obtaining the TTL of a DNS result without hand-coding the resolver or adding some obscure dependency, the lifetime of the DNS cache is defined by a setting in the director, cleverly named .ttl.

As a last added bonus, I also added a really easy way to shoot yourself in the foot. With the DNS director, you can specify a range of backends using .list and an ACL-like syntax. However, remember that adding a /8 means Varnish will internally generate 16 million backends. That’s PROBABLY not a good idea, so do use some moderation. A /24 or two shouldn’t be a big deal, but I’d try to narrow it down as much as possible.

Here’s the above example rewritten to use the DNS director instead of four static ones, assuming that the web servers all sit within a couple of known address ranges:

director mydir dns {
    .list = {
        .port = "80";
        .connection_timeout = 0.4;
        "192.168.15.0"/24;   # illustrative ranges; use the networks
        "192.168.16.128"/25; # your web servers actually live in
    }
    .ttl = 5m;
    .suffix = "internal.example.net"; # illustrative suffix
}

sub vcl_recv {
    set req.backend = mydir;
}

Specifying the connection timeout and similar attributes in .list is optional, but they have to come before the list of IPs. You do not have to use .list; you can also add backends the same way you would with the random or round-robin director.

The above example caches the DNS results for 5 minutes. I’ve also added some counters (visible through varnishstat): the number of DNS lookups, DNS cache hits, failed DNS lookups and how often the DNS cache is full. You may still want to do some basic sanitizing of domain names to reduce DNS spam, but now you can probably just use one regsub to match a number of sites.
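A sketch of the kind of Host header sanitizing mentioned above, assuming you only need to strip an optional port suffix and a leading "www." (the patterns are illustrative, not from the original post):

```vcl
sub vcl_recv {
    # Normalize the Host header before the DNS director looks it up:
    # strip any ":80" port suffix, then a leading "www.".
    set req.http.host = regsub(req.http.host, ":80$", "");
    set req.http.host = regsub(req.http.host, "^www\.", "");
}
```

Normalizing like this also keeps the cache from fragmenting across equivalent host names.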


The DNS director was committed to Varnish development trunk yesterday (Sunday, August 1st 2010) and I expect it to be available in Varnish 2.1.4. It has already been used in production at a few customer sites, with good results. Like any non-trivial piece of code, there are certain aspects of it I want to improve, but I do not foresee that as a blocker for including it in a release. It does not affect the rest of Varnish at all if it is not used (unless you count adding 4 counters to varnishstat).

If you want to test it, you’ll have to use Varnish trunk. Alternatively, you can check out my GitHub repo, which is currently sitting at Varnish 2.1.2 + DNS director + return(refresh). It does look like you’ll have to compile from source, though. Unless, of course, you’re a Varnish Software customer; then you just drop us a mail and you’ll get your rpms or .debs shortly. (My marketing hat is currently firmly planted on my head.)

The development of this feature was sponsored by Globo and Mercado Libre and implemented by myself/Varnish Software.

Varnish 2.1.3 – Use it!

We just released Varnish 2.1.3. And it’s good.

I am a person who values stability and predictability over pretty much everything else. I will gladly use 3-year-old software over the latest version if it’s working well. I only upgrade my desktop if I see a really good reason to do it. I would rather wait 2 years for a new package to enter my favorite distribution than grab it from a source package to get a new feature.

Why do I tell you this? Because I am now ready to truly, whole-heartedly recommend Varnish 2.1.3. Varnish 2.0.6 was a great release. It was stable, it worked well, and we knew how to get the most out of it. There were no unknowns. When we released Varnish 2.1.0, we knew that it was going to take a release or two to get the 2.1 releases equally good. I finally believe we are there, and that using Varnish 2.1.3 is (almost) as safe as 2.0.6. This is the version I will be recommending to our customers.

Varnish 2.0.6 compared to 2.1

Varnish 2.1 represents two years of development. Roughly. The performance of Varnish 2.1.3 is roughly the same as that of Varnish 2.0.6, with a few exceptions.

First, we now use the “critbit” hashing algorithm instead of “classic” as default. This switch revealed a few weaknesses in the implementation that we gradually resolved between Varnish 2.1.0 and 2.1.3. The benefit of critbit is that it requires far less locking to deliver content. It scales better with large data sets and is generally a nice thing to have.

We have also refactored much of the code that relates to directors, which allowed us to add multiple new directors, including directors that pick a backend based on source IP, URL hash and so on.

With Varnish 2.1.3 we also added a “log” command to VCL, which allows you to add generic log messages to the SHM-log through VCL, a much-requested feature.
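For illustration, a use of it might look roughly like this. I’m assuming the 2.1 keyword syntax here; in Varnish 3 the equivalent became std.log() from the std vmod, and the exact concatenation rules may differ between versions:

```vcl
sub vcl_recv {
    # Writes a custom entry to the shared memory log,
    # visible in varnishlog output.
    log "handling a request in vcl_recv";
}
```

This is handy for leaving breadcrumbs that show which VCL branch a request took.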

We have also added basic support for Range-headers. This is not the smartest version of it around, but it fits into the KISS-approach of Varnish. When Range-support is enabled, Varnish will fetch the entire object upon a Range request, but deliver only the range that the client requested. This allows Varnish to cache the entire object, but deliver it in smaller segments.

Another important change between Varnish 2.0 and 2.1 is the removal of the object workspace. The most immediate effect of this is that you will have to write “beresp” instead of “obj” in the vcl_fetch part of your VCL. The bigger consequence is that you no longer have an obj_workspace parameter: all work previously done in obj_workspace is now done in sess_workspace, and Varnish then allocates exactly as much space as it needs for the object once it’s finished. This should save you some memory on large data sets.
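Concretely, a vcl_fetch snippet that set a TTL under 2.0 just needs the variable renamed under 2.1 (the 5-minute TTL is only an example value):

```vcl
# Varnish 2.0:
#   sub vcl_fetch { set obj.ttl = 5m; }
#
# Varnish 2.1 equivalent:
sub vcl_fetch {
    set beresp.ttl = 5m;
}
```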

Now, there are several other changes, but most of them are internal. This is partly to make way for persistent storage, and partly general housekeeping. Another important reason why the perceived difference between Varnish 2.0.6 and 2.1 is not that big is that many of the features written for Varnish 2.1 were ported to Varnish 2.0. This includes new purging mechanisms, saint mode, restarts in vcl_error and numerous bug fixes.

What’s next?

We are still working on persistent storage. It is available in Varnish 2.1 as an experimental feature, but it is missing certain key aspects – like LRU support. You can compare this to how critbit evolved: Critbit was available in Varnish 2.0, but not stable. We used the 2.0 release to fine tune critbit, and we will use 2.1 to improve persistent storage.

For Varnish 2.1.4, I will be merging the DNS director, which has been ready for some time now. I wanted to investigate some reports of memory leaks before I merged it, and those seem to be debunked now.

I will also be merging my return(refresh) code, which is fairly simple stuff. All it does is “guarantee” a cache miss, even if there is valid content in the cache. The use case for this is when you update content and want to control who does the initial waiting. The typical example: when your front page updates, you send a script to it with a magic header (X-Refresh: Yes, for instance), then you look for that header in VCL, make sure the client is coming from an allowed IP (if (client.ip ~ purgers), for example) and issue return(refresh), which will (oddly enough) refresh the content. Your clients won’t have to wait, and the front page is updated immediately.
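Put together, the flow described above might look roughly like this sketch. The header name and the purgers ACL are the hypothetical examples from the text, and the network is a placeholder:

```vcl
acl purgers {
    "192.168.0.0"/24;  # placeholder: addresses allowed to force a refresh
}

sub vcl_recv {
    if (req.http.X-Refresh == "Yes" && client.ip ~ purgers) {
        # Guarantee a cache miss: fetch fresh content from the backend
        # and replace whatever is currently cached.
        return (refresh);
    }
}
```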

In the longer run, we are also looking at proper support for gzip. For the uninitiated, it should be emphasized that for normal operation, Varnish doesn’t need to support gzip. Normally, Varnish will simply forward the Accept-Encoding header to the web server, which will compress the content as it sees fit and return it with a Vary header. That way, Varnish can deliver compressed content without having to compress it itself. This works fine, until you introduce Edge Side Includes (ESI) into the mix. With ESI, Varnish has to parse the content returned to check for ESI commands, and it can’t do that if the content is compressed. So today, Varnish only supports uncompressed ESI. We wish to solve that. Properly.
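For context, enabling ESI processing in Varnish 2.x is done per object in vcl_fetch; the URL check below is illustrative:

```vcl
sub vcl_fetch {
    if (req.url == "/") {
        # Parse this (uncompressed) response for ESI markup such as
        # <esi:include src="/header"/> (Varnish 2.x syntax; Varnish 3
        # replaced the esi; statement with set beresp.do_esi = true;).
        esi;
    }
}
```

This is exactly where the gzip conflict arises: the parser needs the plain-text body, so compressed responses cannot be ESI-processed today.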

I am sure I have forgotten some key elements, but this should hopefully be enough to make this a worth-while read.

