Drupal-lovers

Who love to Drupal

Drupal and Varnish

Posted by drupallovers on June 12, 2012

Varnish is an HTTP accelerator (or reverse proxy) capable of serving 100,000 requests a second. Somewhat faster than Drupal, even with page-caching on!

How does it work?

[Diagram: the Varnish request flow, explained in more detail in the list below.]

  • Cache-hit
    1. User requests a URL
    2. Varnish checks its cache
    3. Varnish retrieves the data from the cache
    4. Varnish delivers the data to the user.
  • Cache-miss
    1. User requests a URL
    2. Varnish checks its cache – but the data isn’t cached
    3. Varnish requests the URL from the backend
    4. Drupal processes the request and delivers a response to Varnish
    5. Varnish caches the response
    6. Varnish forwards the response to the user

Varnish magic

Varnish is capable of some cool features. It can be used for:

  • Load balancing between a series of backend Drupal servers
  • Serving assets (images/css/swfs…) from a light-weight backend whilst serving content from Drupal
  • ESI – Edge Side Includes – allowing personalised pages to be cached
  • Maintenance mode, where Varnish can serve a “Site is being updated” page without traffic hitting the backend
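
As a rough sketch of the first two ideas, this is roughly how load balancing and a separate asset backend might look in Varnish 3.x syntax (the same dialect as the vcl_fetch examples in this post; Varnish 4 and later renamed several of these keywords). The backend names, IP addresses and ports are invented for illustration, so swap in your own.

```vcl
# Two Drupal web servers (hypothetical addresses)
backend web1 { .host = "192.168.0.11"; .port = "8080"; }
backend web2 { .host = "192.168.0.12"; .port = "8080"; }

# A lightweight server that only serves static files (hypothetical)
backend assets { .host = "192.168.0.20"; .port = "8080"; }

# Alternate requests between the Drupal servers
director drupal round-robin {
  { .backend = web1; }
  { .backend = web2; }
}

sub vcl_recv {
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
    # Static files go to the lightweight backend
    set req.backend = assets;
  } else {
    # Everything else is shared between the Drupal servers
    set req.backend = drupal;
  }
}
```

Round-robin is the simplest director; Varnish 3 also ships a weighted random director if your servers are not equally powerful.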

What are the cons?

Like any solution, Varnish brings its own set of issues.

  • Statistics – content served by Varnish won’t hit the backend, so traditional stats (log files, the statistics module) won’t show the correct results. Use a client-side solution such as Google Analytics instead.
  • Personalisation – personalised pages are hard to cache. You will need things like ESI and custom VCL logic. Not for the faint-hearted.
  • Caching-rules complexity – choosing what to cache, how long to cache it for, and how to purge the cache when content is updated is complex, and needs more modules and more VCL rules. Choose a balance between complexity, performance, and investment in hardware.
  • Decreased performance – What? I hear you cry… I thought Varnish was meant to improve performance! Well, if something’s cached, it’s quicker. But cache-misses are generally slower via Varnish than a direct request, because every additional system in the request route adds latency. In reality, most web sites would benefit (if only to cache images and CSS), but it might not be suitable for a web service.
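
To make the personalisation and caching-rules points a bit more concrete, here is a minimal sketch (Varnish 3.x syntax) of one common approach: logged-in users bypass the cache, and anonymous users share cached pages. It assumes Drupal's usual session cookie naming (SESS or SSESS followed by a hash) plus the NO_CACHE cookie mentioned later in this post; check which cookies your own site actually sets before relying on it.

```vcl
sub vcl_recv {
  # Assumption: a session or NO_CACHE cookie means the page is personalised,
  # so send the request straight to Drupal without touching the cache.
  if (req.http.Cookie ~ "(S?SESS[a-z0-9]+|NO_CACHE)=") {
    return (pass);
  }

  # Anonymous visitor: drop any remaining cookies (analytics and so on)
  # so that everyone shares the same cached copy of the page.
  unset req.http.Cookie;
}
```

The trade-off is all-or-nothing: logged-in users get no benefit from the page cache at all, which is where ESI comes in.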

Getting set up: how to add Varnish to your existing web site

This how-to presumes that you already have your Drupal site up and running, using Apache on port 80.

  1. Install Varnish (apt-get install varnish / yum install varnish)
  2. Change the Varnish config to listen on port 80
    Edit /etc/default/varnish
    Change the VARNISH_LISTEN_PORT to 80
    NB: the behaviour has changed somewhat across Varnish versions – check the docs for your version!
  3. Change the Apache config to listen on port 8080 (or another suitable port, if something’s already running on 8080)
  4. Edit the VCL to forward backend requests to Apache
    Edit /etc/varnish/default.vcl
    backend default {
      .host = "127.0.0.1";
      .port = "8080"; 
    }
  5. Restart Apache
    I usually use “apache2ctl graceful”, rather than “/etc/init.d/apache2 restart”, because this sanity-checks the configs before restarting!
  6. Restart Varnish: “/etc/init.d/varnish restart”
    NB: each time Varnish is restarted, its cache is cleared.

This will give you a very basic implementation – to see the benefit of Varnish, you’ll want to do more.

Optimising Varnish and Drupal to work together

  • Varnish module: The varnish module provides an admin dashboard to show you the status of the Varnish server(s), and provides integration points which will clear the Varnish cache when a node is edited (when integrated with the expire module).
  • Expire module: The expire module has been separated out from Boost, and provides a generic cache-management tool which runs at page-level. Expire will purge the appropriate pages from the varnish cache when the content changes.
  • Memcache et al: use all of the normal performance tools as well. So memcache, APC, cache-sets/gets in your custom modules, etc. Varnish is an addition to the performance tools, not a replacement.
  • ESI module: Edge-side includes are difficult, but the ESI module makes them a little easier.
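
For a feel of what ESI involves on the Varnish side, here is a minimal sketch in Varnish 3.x syntax. It is not what the ESI module generates: the /esi/ path prefix is a hypothetical placeholder for wherever your personalised fragments are served from.

```vcl
sub vcl_recv {
  # Hypothetical: personalised fragments live under /esi/ and are never cached
  if (req.url ~ "^/esi/") {
    return (pass);
  }
}

sub vcl_fetch {
  # Ask Varnish to parse HTML responses for <esi:include> tags
  if (beresp.http.Content-Type ~ "text/html") {
    set beresp.do_esi = true;
  }
}
```

The cached page would then contain a tag such as <esi:include src="/esi/user-block"/> (again, a made-up URL), which Varnish fills in per request while the rest of the page comes from the cache.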

Some custom VCL logic

Long live assets!

Assets should live a long time. CSS, images, JS and SWFs should all hang around in the cache.

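sub vcl_recv {
  # Assets are pulled from the cache, even if we have a NO_CACHE cookie.
  # The optional (\?.*)? also matches URLs with a query string, e.g. style.css?abc
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)(\?.*)?$") {
    return (lookup);
  }
}
sub vcl_fetch {
  # Only touch asset responses; everything else keeps its cookies and TTL
  if (req.url ~ "\.(png|gif|jpg|swf|css|js)(\?.*)?$") {
    # Don't cache cookies
    unset beresp.http.Set-Cookie;
    # Set a long TTL (1 day)
    set beresp.ttl = 86400s;
    return (deliver);
  }
}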

Is it working?

To give you some reassurance that Varnish is doing its job, it’s nice to see if you’re getting hits or misses. This code will add an HTTP header to report back.

sub vcl_deliver {
  if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
  } else {
    set resp.http.X-Cache = "MISS";
  }
}

How to clear the Varnish cache

In case you follow all these instructions, but end up with some really private data in your cache, here’s a quick clear-cache how-to!

  • Command line: SSH into the server, and run: /etc/init.d/varnish restart
  • Telnet: Telnet to the admin port of Varnish (by default port 6082) and use the purge command
    telnet 192.168.0.10 6082
    url.purge /

    url.purge takes a regex, so “/” matches every URL. The command name depends on your version (url.purge in Varnish 2.0, purge.url in 2.1, ban.url in 3.0) – see the Varnish docs for more info.
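
If you would rather clear a single page than the whole cache, another option (not used in this post) is to let Varnish accept an HTTP PURGE request. This sketch follows the pattern from the Varnish 3.x documentation; the ACL name and the addresses in it are assumptions, so list only the machines you trust.

```vcl
# Hosts that are allowed to purge (adjust to your own network)
acl purgers {
  "127.0.0.1";
}

sub vcl_recv {
  if (req.request == "PURGE") {
    if (!client.ip ~ purgers) {
      error 405 "Not allowed.";
    }
    # Look the object up so vcl_hit / vcl_miss can remove it
    return (lookup);
  }
}

sub vcl_hit {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}

sub vcl_miss {
  if (req.request == "PURGE") {
    purge;
    error 200 "Purged.";
  }
}
```

With this in place, sending a PURGE request for a URL from an allowed host removes just that URL from the cache, which is much gentler than a restart.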
