Using an OpenPGP Smartcard on Ubuntu 10.04

We recently bought a few OpenPGP SmartCards (version 2.0) at Varnish Software, with card readers to match, and I’ve been using mine for a month or so now. However, there are a few challenges involved, especially with the PCMCIA-based Omnikey CM4040 card reader. Until today, mine has been flaky at best.

The use case for this is to store your ssh key on a smart card, along with your encryption and signature key.

In addition to the hardware, version 2.0 of the OpenPGP card requires gnupg 1.4.10 or gnupg 2.0.10 or newer, which is not necessarily easy to come by. It is available on Ubuntu 10.04, however.

I’ve tried to describe the necessary steps, though I might have forgotten some elements of it. That being said, this isn’t meant to be a recipe for people who don’t know what they are doing, it’s intended for those of you who understand what’s actually being done. Let me know if anything crucial is missing.

Setting up the software

You will need gnupg2, gnupg-agent, pcscd and libpcsclite1 to begin with. If you are using the Omnikey CM4040, you will most likely also need pcsc-omnikey. The software stack works by having gpg2 or ssh talk to gpg-agent (which will also act as an ssh-agent); this in turn spawns scdaemon, which talks to pcscd, which talks to the actual card, if my understanding is correct.

apt-get install gnupg2 gnupg-agent pcscd libpcsclite1

To be able to use the ssh bit, you have to remove the ssh-agent startup script from /etc/X11/Xsession.d/; simply move it out of the way: mv /etc/X11/Xsession.d/90x11-common_ssh-agent /etc/X11/. Now edit /etc/X11/Xsession.d/90gpg-agent, find the STARTUP="$GPGAGENT --daemon …" line and add --enable-ssh-support. After that, you’ll be using gpg-agent as your ssh agent the next time you log in.
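As a sketch, assuming the stock Ubuntu 10.04 file names (the exact contents of your 90gpg-agent may differ):

mv /etc/X11/Xsession.d/90x11-common_ssh-agent /etc/X11/

# In /etc/X11/Xsession.d/90gpg-agent, the STARTUP line should end up
# looking roughly like this:
STARTUP="$GPGAGENT --daemon --enable-ssh-support $STARTUP"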

Setting up the Omnikey CM4040

To get the 4040 up and working, you need to install the pcsc-omnikey package if you didn’t already do that. Unfortunately, it doesn’t drop the required config in /etc/reader.conf.d/ as it’s supposed to, so grab it from the example doc, then regenerate /etc/reader.conf (which is just a collection of /etc/reader.conf.d/) and restart pcscd:


# cp /usr/share/doc/pcsc-omnikey/examples/reader.conf /etc/reader.conf.d/
# update-reader.conf
# service pcscd restart

If you had already tried gpg2 --card-status or similar, you will most likely have to kill scdaemon, and it’s fairly stubborn, so it typically takes 3-4 SIGTERMs before it dies.

After this, though, you should be all set up.

Trying the card

gpg2 --card-status

A couple of things can happen:

  1. You get card information and it’s all nice and dandy
  2. You get most of the card information, but things like the vendor ID are blank or incorrect – upgrade gnupg.
  3. It hangs – kill scdaemon.
  4. You get a “no card found” error: remove and reinsert the card, restart pcscd and kill scdaemon, then pound your head into the wall if it still doesn’t change.
  5. You get a more generic error: see point 4.

To make sure the card status isn’t being cached, try removing the card and then typing gpg2 --card-status. You should obviously get an error, and if you don’t, it’s likely scdaemon that needs a kill. This typically happens if you remove the card reader or restart pcscd.
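In other words, something along these lines:

# With the card removed:
gpg2 --card-status    # should now fail with an error
# If it still prints card data, scdaemon is serving stale state:
killall scdaemon      # repeat a few times; it is stubborn
# Reinsert the card and try again:
gpg2 --card-status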

Assuming the card is showing up, it’s time to use it: gpg2 --card-edit

Setting up most of this should be easy, but there’s at least one point that needs an explanation: your public key is _NOT_ stored on the card! After you’ve created a new key (or imported one), you will have to send it to a key server the normal way and set the url on the card to match. The reason you want to set the url is that it makes things far easier when you use your card on a different machine: all you do is run gpg2 --card-edit and type ‘fetch’, and it will download your public keys.
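Roughly like this – the key server and key ID are made up for the example:

# On the machine where the key was created:
gpg2 --keyserver hkp://keys.gnupg.net --send-keys DEADBEEF
gpg2 --card-edit
#   gpg/card> admin
#   gpg/card> url      (set it to wherever the public key can be fetched from)
#   gpg/card> quit

# On any other machine:
gpg2 --card-edit
#   gpg/card> fetch    (downloads the public key from the stored url)
#   gpg/card> quit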

The other thing you must remember, which you probably already know from reading the notes that came with your card: if you type the admin pin code wrong 3 times in a row, your card is essentially bust. Luckily, you rarely need the admin pin.

If you haven’t managed to set your pin yet, type help and read. To generate a new set of keys, simply type generate. You can also move existing keys onto the card using normal gpg2 commands, but as I haven’t done that myself, I can’t rule out pitfalls.

Once your keys are generated, that’s really all there is. You can add uids later like you would with any other gpg key (ie: gpg2 --edit-key Your-fpr-here).

Using ssh

This should already Just Work, but to verify that ssh can see your key, use ssh-add -l. You can also add your old ssh keys to gpg-agent that way. You’ll be prompted for a passphrase twice: the original passphrase and the new one. See the gpg-agent man page for details.

Exporting the public key can be done either with ssh-copy-id or with ssh-add -L.
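For example (the key path and host are placeholders):

ssh-add -l                 # list fingerprints; the card's key should show up
ssh-add ~/.ssh/id_rsa      # import an old key: prompts for the old, then a new passphrase
ssh-add -L                 # print the public keys, e.g. for pasting into authorized_keys
ssh-copy-id user@somehost  # or let ssh-copy-id push them for you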

Using GPG

You probably want to make sure you set up key trust for your key (gpg2 --edit-key ..; trust) when you’re on a new computer, but besides that, it’ll be just like normal GPG. I’d post a pretty picture of the PIN-dialog, but it seems it grabs most of the screen, so that’s a bit difficult.

Hooking it up to PAM

I’ve had moderate success with libpam-poldi. I’m saying moderate because only one application can hold your card at any given time, which is fine as long as everything goes through gpg-agent and scdaemon, but if you try to use your smartcard for sudo, you’ll likely find it doesn’t do you much good. The poldi package worked pretty much out of the box for me, after gluing it into PAM. It has fairly good instructions, so I’ll leave it as an exercise for the reader to actually set it up.

One interesting thing, though, is that if you hook it up to gnome-screen-saver, it’ll work flawlessly, and even take pin-cache into consideration. Login also works fine, as gpg-agent hasn’t started yet.

Caching a Debian repository with Varnish

In the past I’ve both run my own debian mirror and used apt-cacher to reduce the amount of duplicate package downloads I do at home when I upgrade multiple computers (virtual or otherwise). Mirroring sort of defeated the purpose as I used far more bandwidth than I needed, and apt-cacher was not horribly robust. The reason I want this is twofold: 1. I’m a geek. 2. I often have 5-10 debian-based hosts (virtual+physical) at home.

So now I use Varnish instead.

Debian archives (in this context, Debian and Ubuntu are the same) have two parts: information about packages and the packages themselves. I.e.: “apt-get update” versus “apt-get install”. I cache “^/debian/.*\.deb$” for 21 days (a random number) and everything else in “^/debian” for 12 hours.

I’ve set up repo.kristian.int to point to the web host, which means that if I for some reason don’t want to use my local Varnish cache, I can just point it to a real debian mirror and the clients wouldn’t notice the difference.

Just for the heck of it, Varnish is set up to use 50GB of -smalloc memory. It’s fun to have disk space.

Pros:

  • No maintenance needed – it’s just a HTTP cache.
  • Reduced bandwidth usage.
  • Faster local upgrades.

Cons:

  • No streaming delivery yet, so it adds a delay on a cache miss. Since the final delivery is gbit, this is hardly a real issue. And streaming delivery is on the todo-list for Varnish anyway…
  • Restarting Varnish flushes the cache. I will be using persistence to solve this.

Vcl:

backend lo {
        .host = "127.0.0.1";
        .port = "8080";
}

backend debian {
        .host = "ftp.no.debian.org";
        .port = "80";
}

sub vcl_recv {
        if (req.url == "/purgeall") {
                purge("req.url ~ .*");
                error 200 "Purged all";
        }
        if (req.url ~ "^/debian/.*") {
                set req.backend = debian;
        } else {
                set req.backend = lo;
        }
}

sub vcl_fetch {
        if (req.url ~ "^/debian") {
                if (req.url ~ ".deb$") {
                        set beresp.ttl = 21d;
                } else {
                        set beresp.ttl = 12h;
                }
        } elsif (req.url ~ "^/slaughter") {
                set beresp.ttl = 1s;
        } elsif (req.url ~ "^/munin/") {
                set beresp.ttl = 30s;
        } else {
                set beresp.ttl = 10s;
                set beresp.cacheable = false;
        }
}


Notes on the vcl

The VCL is an unedited copy/paste of the actual VCL I use, and it’s running on an internal Varnish server. I’m not protecting things like /purgeall, which you should if you copy this.

Also note that I consistently fall through to the default VCL instead of trying to out-smart it. That’s how I recommend you write a VCL file, as the default VCL handles cookies, authorization headers and strange HTTP requests (ie: TRACE) in a sensible way, in addition to adding X-Forwarded-For logic.

Adding support for Ubuntu would just mean adding an ubuntu mirror and copying the logic for ^/debian to ^/ubuntu.
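A sketch of what that might look like, with the mirror hostname picked as an example:

backend ubuntu {
        .host = "no.archive.ubuntu.com";
        .port = "80";
}

# In vcl_recv:
if (req.url ~ "^/ubuntu/.*") {
        set req.backend = ubuntu;
}

# In vcl_fetch:
if (req.url ~ "^/ubuntu") {
        if (req.url ~ ".deb$") {
                set beresp.ttl = 21d;
        } else {
                set beresp.ttl = 12h;
        }
}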

Varnish purges

Varnish purges can be tricky. Both the model of purging that Varnish uses and the syntax you need to use to take advantage of it can be difficult to grasp. It took me about five Varnish Administration Courses until I was happy with how I explained it to the participants, especially because the syntax is the most confusing syntax we have in VCL. However, it’s not very hard to work with once you understand the magic at work.

0. Separating purges and forced expiry

There are two ways to throw something out of the cache before the TTL is up in Varnish. You can either find the object you want gone and set its TTL to 0, forcing it to expire, or use Varnish’s purge mechanism. Setting the ttl to 0 has its advantages, since you evict the object immediately, but it also means you have to evict one object at a time. This is fairly easy and usually done by having Varnish look for a “PURGE” request and handle it, roughly as sketched below. That approach is not what I’ll talk about today, though. Read http://varnish-cache.org/wiki/VCLExamplePurging for more information on forcibly expiring an object.
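For reference, that approach looks roughly like this in 2.x VCL (the acl is an assumption; adapt it to your own network):

acl purgers {
        "127.0.0.1";
}

sub vcl_recv {
        if (req.request == "PURGE") {
                if (!client.ip ~ purgers) {
                        error 405 "Not allowed";
                }
                lookup;
        }
}

sub vcl_hit {
        if (req.request == "PURGE") {
                set obj.ttl = 0s;
                error 200 "Purged";
        }
}

sub vcl_miss {
        if (req.request == "PURGE") {
                error 404 "Not in cache";
        }
}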

1. The challenges of purging a cache

The main reason people purge their cache is to make room for updated content. Some journalist updated an article and you want the old one – possibly cached for days – gone. In addition, you may not know exactly what to purge, or it might be broader than just one item. An example would be a template used to generate multiple php files. Or all sports articles.

All in all, you do not purge to conserve memory, because you expect that the cache will be filled again soon.

If you are to purge all your php pages and you have 150 000 objects, you may not want to go looking for them either. This is the reason some competing cache products are slow at large-scale purging: by looking for all these objects, you might have to hit the disk to fetch cold objects.

In Varnish, we also leave it up to the VCL to define what’s unique to an object. That is to say: you can override the cache hash. By default it’s the host name or server IP combined with the URL. This is usually what people want, but sometimes you may want to add a cookie into the mix, for instance. The point is, we don’t know exactly what people cache on.

2. How Varnish attacks the problem

In Varnish, you purge by adding a purge to a list. This list can grow large if you add several very specific purges, but we try to reduce the overlap as much as possible. The purge in question can be pretty much anything you can match in VCL, including regular expressions on URLs, host names and user-agents for that matter. You can see the list by typing “purge.list” in the command line interface (CLI, or telnet).

Each object in your cache points to the last purge it was tested against. When you hit an object, Varnish checks whether there are any new purges on the list, tests the object against them, and then either evicts the object and fetches a new one, or updates the “last tested against” pointer.

Because of this, the ‘req’ structure you are evaluating is actually that of the next client to access the object, not the client who pulled the object in from the backend. It also means that every single object in your cache that is hit will be tested against all purges to see if it matches, but this is spread out over time. It might sound wasteful, but it means you can add purges in constant time, and not really think about the cost of evaluating them.

It also means that an object which is never hit stays in the cache until it expires, so purging doesn’t free up memory.

3. Adding purges “by hand”

Want to purge a http://example.com/somedirectory/ and everything beneath that path?

purge req.http.host == example.com && req.url ~ ^/somedirectory/.*$

or

purge req.url ~ ^/somedirectory/ && req.http.host == example.com

Want to purge all objects with a “Cache-Control: max-age=” set to 3600?

purge obj.http.Cache-Control ~ max-age=3600

or, to take white space into account and avoid matching trailing digits:

purge obj.http.Cache-Control ~ max-age ?= ?3600[^0-9]

Notice that all of the variables are in the same “VCL context” as the next client to hit the object, so if you purge on req.http.user-agent, it’s fairly random whether the object is really purged, because you (probably) can’t predict what user-agent the next person to visit a specific object is using. If you wish to purge based on a parameter sent from the “original” client, you will have to store that parameter somewhere in obj.http and remove it in vcl_deliver if you don’t want to expose it.
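A sketch of that trick – the header names are made up, and I’m using the 2.0-style obj.* naming in vcl_fetch to match the fetch example further down (on 2.1 this would be beresp.http):

sub vcl_fetch {
        # Stash a value from the client that pulled the object in:
        set obj.http.X-Orig-UA = req.http.User-Agent;
}

sub vcl_deliver {
        # Don't expose the stashed header to clients:
        remove resp.http.X-Orig-UA;
}

A purge can then match deterministically, e.g.: purge("obj.http.X-Orig-UA ~ " req.http.X-Purge-UA);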

4. Adding purges in VCL

This is where it gets tricky. The normal example of why is this: purge("req.url == " req.url);

Normal programming-thinking would tell you that this would match everything, since the url is always equal to itself. This is where VCL string concatenation comes into the picture. In reality, you are saying: add the literal text "req.url == ", followed by the value of the variable req.url, to the purge list.

In other words, if a client accessing http://example.com/foobar hits the code above, the string "req.url == /foobar" is added to the purge list. The quotation marks are essential!

I find it easier to think of it as preparing a string for the purge command on the CLI. Varnish concatenates two strings without any special operator between them.

In the end, this is the rule of thumb: Put everything you expect to see literally when you type “purge.list” inside quotation marks, and put things you wish to replace with the variable of the calling session outside.

So you actually have three different VCL contexts to worry about:

  1. The context that originally pulled the object in from a backend (not much you can do here unless you hide things in obj.http)
  2. The context that will hit the object and thereby test the object against the purge. Any variable in this context has to be inside quotation marks.
  3. The context that triggered the purge, variables from this context should be outside quotation marks, so they are replaced with their string values before being added to the purge list.

The reason you do not need quotation marks if you enter the purge command on the command line interface is because you don’t have the third context. There is no req.url in telnet, since you are not going through VCL at all.

Some examples, note that when I say “supplied by the client” I mean the client initiating the purge, typically some smart system you’ve set up:

Purge object on the current host and URLs matching the regex stored in the X-Purge-Regex header supplied by the client:

purge("req.http.host == " req.http.host " && req.url ~ " req.http.X-Purge-Regex);

Purge all php for any example.com-domain:

purge("req.http.host ~ example.com$ && req.url ~ ^/.*\.php");

Same, but for the host provided in the X-Purge-HostPHP header:

purge("req.http.host ~ " req.http.X-Purge-HostPHP " && req.url ~ ^/.*\.php");

Purge objects with X-Cache-Channel set to “sport”:

purge("obj.http.X-Cache-Channel ~ sport");

Same, but purge the cache channel set in the ‘X-Purge-CC’ header:

purge("obj.http.X-Cache-Channel ~ " X-Purge-CC);

Purge in vcl_fetch if the backend sent a X-Purge-URL header (weird thing to do, but fun example):

sub vcl_fetch {
        (....)
        if (obj.http.X-Purge-URL) {
                purge("req.url ~ " obj.http.X-Purge-URL);
        }
        (...)
}

(PS: I have not actually tested all these examples, but they look correct)

Varnish best practices

A while ago I wrote about common Varnish issues, and I think it’s time for an updated version. This time, I’ve decided to include a few somewhat uncommon issues that, if set, can be difficult to spot or track down. A sort of pitfall-avoidance, if you will. I’ll add a little summary with parameters and such at the end.

1. Run Varnish on a 64 bit operating system

Varnish works on 32-bit, but was designed for 64-bit. It’s all about virtual memory: things like stack size suddenly matter on 32-bit. If you must use Varnish on 32-bit, you’re somewhat on your own. However, try to fit it within 2GB. I wouldn’t recommend a cache larger than 1GB, and no more than a few hundred threads… (Why are you on 32-bit again?)

2. Watch /var/log/syslog

Varnish is flexible, and has a relatively robust architecture. If a Varnish worker thread were to do something Bad and Varnish noticed, an assert would be triggered, Varnish would shut down and the management process would start it up again almost instantly. This is logged. If it weren’t, there’s a decent chance you wouldn’t notice, since the downtime is often sub-second. However, your cache is emptied. We’ve had several customers contact us about performance issues, only to realize they were essentially restarting Varnish several times per minute.

This might make it sound like Varnish is unstable: It’s not. But there are bugs, and I happen to see a lot of them, since that’s my job.

An extra note: on Debian-based systems, /var/log/messages and /var/log/syslog are not the same. Varnish will log the restart in /var/log/messages, but the actual assert error is only found in /var/log/syslog, so make sure you look there too.

The best way to deal with assert errors is to search our bug tracker for the relevant function name.
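Something along these lines usually digs out what you need (the exact wording of the assert varies):

# The assert error, including the function name to search the bug tracker for:
grep -i assert /var/log/syslog
# The restart itself is logged by the management process:
grep -i child /var/log/messages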

3. Threads

The default values for threads are based on a philosophy I’ve since come to realize isn’t optimal. The idea was to minimize the memory footprint of Varnish, so by default, Varnish uses 5 threads per thread pool. With the default of two pools, that’s 10 threads minimum. The maximum is far higher, but in reality, threads are fairly cheap. If you expect to handle 500 concurrent requests, tune Varnish for that.

A little clarification of the thread parameters: thread_pool_min is the minimum number of threads for each thread pool, while thread_pool_max is the maximum total number of threads. That means the values are not on the same scale. The thread_pools parameter can safely be ignored (tests have indicated that it doesn’t matter as much as we thought), but if you do want to modify it, one thread pool per CPU core is the rule of thumb.

You also do not want more than 5000 as the thread_pool_max. It’s dangerous, though fixed in trunk. More often than not, needing that many is an indication that something else is wrong: if you find yourself using 5000 threads, the solution is to find out why it’s happening, not to increase the number of threads.

To reduce the startup time, you also want to reduce the thread_pool_add_delay parameter. ‘2’ is a good value (as opposed to 20, which makes for a slow start).
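On a running instance, these can be set through the management interface – assuming it listens on localhost:6082:

varnishadm -T localhost:6082 param.set thread_pool_min 100
varnishadm -T localhost:6082 param.set thread_pool_max 4000
varnishadm -T localhost:6082 param.set thread_pool_add_delay 2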

4. Tune based on necessity

I often look at sites where someone has tried to tune Varnish to get the most out of it, but taken it a bit too far. After working with Varnish I’ve realized that you do not really need to tune Varnish much: the defaults are tuned. The only real exceptions I’ve found to this are the number of threads and possibly the work spaces.

Varnish is – by default – tuned for high performance on the vast majority of real-life production sites. And it scales well, in most directions. By default. Do yourself a favor and don’t fix a problem which isn’t there. Of all the issues I’ve dealt with on Varnish, the vast majority have been related to finding out the real problem and either using Varnish to work around it, or fix it on the related system. Off the top of my head, I can really only remember one or two cases where Varnish itself has been the problem with regards to performance.

To be more specific:

  • Do not modify lru_interval. I often see the value “3600”, which is a 180 000% (one hundred and eighty thousand percent) increase from the default. This is downright dangerous if you suddenly need the lru-list, and so far my tests haven’t been able to prove any noticeable performance improvement.
  • Setting sess_timeout to a higher value increases your file descriptor consumption, and there’s little to gain by doing it: you risk running out of file descriptors. At least until we can get the fix into a released version.

So the rule of thumb is: Adjust your threads, then leave the rest until you see a reason to change it.

5. Pay attention to work spaces

To avoid locking, Varnish allocates a chunk of memory to each thread, session and object. While keeping the object workspace small is a good thing to reduce the memory footprint (this has been improved vastly in trunk), sometimes the session workspace is a bit too small, especially when ESI is in use. The default sess_workspace is 16kB, but I know we have customers running with a 5MB sess_workspace without trouble. We’re obviously looking to fix this, but so far it seems that having some extra sess_workspace isn’t that bad. The way to tell is, unfortunately, by asserts: typically something related to “(p != NULL) Condition not true” (though there can obviously be other reasons for that). Look for it in our bug tracker, then try increasing the session workspace.

6. Keep your VCL simple

Most of your VCL-work should be focused around vcl_recv and vcl_fetch. That’s where you define the majority of your caching policies. If that’s where you do your work, you’re fairly safe.

If you want to add extra headers, do it in vcl_deliver. Adding a header in vcl_hit is not safe. You can use the “obj.hits” variable in vcl_deliver to determine if it was a cache hit or not.
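That typically looks like this – the X-Cache header name is just a common convention:

sub vcl_deliver {
        if (obj.hits > 0) {
                set resp.http.X-Cache = "HIT";
        } else {
                set resp.http.X-Cache = "MISS";
        }
}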

You should also review the default VCL, and if you can, let Varnish fall through to it. When you define your VCL, Varnish appends the default VCL, but if you terminate a function, the default is never run. This is an important detail in vcl_recv, where requests with cookies or Authorization headers are passed if present. That’s far safer than forcing a lookup. The default vcl_recv code also ensures that only GET and HEAD requests go through the cache.

Focus on caching policy and remember that the default VCL is appended to your own VCL – and use it.

7. Choosing storage backend (malloc or file?)

If you can contain your cache in memory, use malloc. If you have 32GB of physical memory, using -smalloc,30G is a good choice. The size you specify is for the cache, and does not include session workspace and such, that’s why you don’t want to specify -smalloc,32G on a 32GB-system.
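As a start-up example (ports and paths are illustrative):

varnishd -a :80 -T localhost:6082 -f /etc/varnish/default.vcl -s malloc,30G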

If you can not contain your cache in memory, first consider whether you really need that big a cache. Then consider buying more memory. Then sleep on it. Then, if you still think you need to use disk, use -sfile. On Linux, -sfile performs far better than -smalloc once you start hitting disk. We’re talking pie-chart material. You should also make sure the filesystem is mounted with noatime, though it shouldn’t be necessary. On Linux, my cold-hit tests (a cold hit being a cache hit that has to be read from disk, as opposed to a hot hit which is read from memory) take about 6000 seconds to run on -smalloc, while they take 4000 seconds on -sfile with the same hardware. Consistently. However, your mileage may vary with things such as kernel version, so test both anyway. My tests are easy enough: run httperf through x-thousand urls in order, then do it again in the same order.

Some of the most challenging setups we work with are disk-intensive setups, so try to avoid it. SSD is a relatively cheap way to buy yourself out of disk-issues though.

8. Use packages and supplied scripts

While it may seem easier to just write your own script and/or install from source, it rarely pays off in the long run. Varnish usually runs on machines where downtime has to be planned, and you don’t want a surprise when you upgrade it. Nor do you want to risk missing that little bug we realized was a problem on your distro but not others. If you do insist on running home-brew, make sure you at least get the ulimit commands from the startup scripts.

This is really something you want regardless of what sort of software you run, though.

9. Firewall and sysctl-tuning

Do not set “tw_reuse” to 1 (sysctl). It will work perfectly fine for everyone – except the thousands of people behind various NAT-based firewalls. And it’s a pain to track down. Unfortunately, this has been recommended in the past.
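To check (0 is the default, and what you want):

sysctl net.ipv4.tcp_tw_reuse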

Avoid connection-tracking on the Varnish server too. If you need it, you’ll need to tune it for high performance, but the best approach is simply to not do connection-tracking on a server with potentially thousands of new connections per second.
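On an iptables-based setup, one way to skip connection tracking for the HTTP traffic is the raw table’s NOTRACK target; a sketch, assuming Varnish listens on port 80:

iptables -t raw -A PREROUTING -p tcp --dport 80 -j NOTRACK
iptables -t raw -A OUTPUT -p tcp --sport 80 -j NOTRACK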

10. Service agreements

(Service agreements are partly responsible for my salary, so with that “conflict of interest” in mind….)

You do not need a service agreement to run Varnish. It’s free software.

However, if you intend to run Varnish and your site is business critical, it’s sound financial advice to invest some money in it. We are the best at finding potential problems with your Varnish-setup before they occur, and solving them fast when they do occur.

We typically start out by doing a quick sanity-test of your configuration. This is something we can do fast, both with regards to parameters, VCL and system configuration. Some of our customers only contact us when there’s something horribly wrong, others more frequently to sanity-check their plans or check up on how to use varnishncsa with their particular logging tool and so on. It’s all up to you.

We also have a public bug tracker anyone can access and submit to. We do not have a private bug tracker, though there are bugs that never hit the public bug tracker – but that’s because we fix them immediately. Just like any other free software project, really. We have several public mailing lists, and we answer them to the best of our ability, but there is no guarantee and our time is far more limited. If you run into a bug, my work on other bugs will be postponed until your problems are solved. Better yet: if you run into something you don’t know is a bug, we can track it down.

A service agreement gives you safety. And your needs will get priority when we decide where we want to take Varnish in the future.

We also offer training on Varnish, if you prefer not to rely on outside competence.

Oh, and I get to eat. Yum.

Summary

Keep it simple and clean. Do not use connection tracking or tw_reuse. Try to fit your cache into memory on a 64-bit system.

Watch your logs.

Parameters:

thread_pool_add_delay=2
thread_pools = <Number of cpu cores>
thread_pool_min = <800/number of cpu cores>
thread_pool_max = 4000
session_linger = 50
sess_workspace = <16k to 5m>

So if you have two quad-core CPUs, you have 8 CPU cores. This would make sense: thread_pools=8, thread_pool_min=100, thread_pool_max=4000. The number 800 is semi-random: it seems to cover most use cases. I added session_linger into the mix because it’s a default in Varnish 2.0.5 and 2.0.6 but not in prior versions, and it makes good sense.
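As varnishd arguments, that would be something like this (not a complete command line, and add sess_workspace to taste depending on your ESI usage):

varnishd ... -p thread_pools=8 -p thread_pool_min=100 -p thread_pool_max=4000 \
        -p thread_pool_add_delay=2 -p session_linger=50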
