mod_auth_cas on CentOS7 / Apache 2.4

For CentOS, this is now available in the EPEL repo

mod_auth_cas (https://github.com/Jasig/mod_auth_cas) is an Apache module that plugs into the Apache mod_auth framework to provide authentication against a Jasig CAS SSO server. Unfortunately development on it seems to have stalled, and it currently doesn’t support Apache 2.4 – https://github.com/Jasig/mod_auth_cas/issues/49.

There is a fork that does support Apache 2.4, so this post will cover how to build and install it.

# Clone the forked github repo
git clone https://github.com/klausdieterkrannich/mod_auth_cas.git .

# install development libraries
yum install gcc httpd-devel openssl-devel libcurl-devel automake

# Run configure
./configure

# Run Make
make

# If you get errors like:
/opt/mod_auth_cas/missing: line 81: aclocal-1.12: command not found
WARNING: 'aclocal-1.12' is missing on your system.
# then symlink the binaries as follows:
ln -s /usr/bin/aclocal /usr/bin/aclocal-1.12
ln -s /usr/bin/automake /usr/bin/automake-1.12
# and run make again

# Assuming it built OK, then install it:
make install

# On CentOS this will put the binaries into:
#/usr/lib64/httpd/modules
# and you can then copy them from here to your production systems.

 

DHCP fingerprinting

We wanted to find out what sort of devices are active on the wireless network, and the vendor tools we’ve got don’t quite give us the level of detail we were after.

However, everything which hits our wireless network gets a DHCP lease from our DHCP servers.  With a bit of dhcpd.conf magic, you can make dhcpd profile each client when it requests or renews a lease, and record a fingerprint in the logs.

dhcpd.conf – collecting fingerprints

# put the dhcp options request fingerprint in the leases file
set dhcp-op-req-string = binary-to-ascii(10,8,":",option dhcp-parameter-request-list);

# log the fingerprint in the format:
# Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50

log(info,
concat("FINGERPRINT ",
binary-to-ascii(10,8,",",option dhcp-parameter-request-list),
" for ",
concat (  # MAC
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 1, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 2, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 3, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 4, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 5, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 6, 1))),2)
       )        # End MAC
));
# End DHCP fingerprinting

Now every time a device interacts with our DHCP server, we get a FINGERPRINT line appearing in our logs, along with the MAC address which requested the lease.

So far, so good. Now we need to process those logs into something anonymous, but meaningful.

Data Prep
The easiest approach is to cat our logfile, strip out just the fields we’re interested in (MAC address and fingerprint), sort them to remove duplicates (we only want to count each machine once!), and then finally throw away the MAC addresses (because all we really want are the fingerprints).

We can do that easily enough with a lovely long pipeline

cat /var/log/dhcpd.log | grep FINGERPRINT | awk '{ print $9 " " $7 }' | sort -u | awk '{ print $2 }'

There are probably more elegant ways to do it, but the above isn’t really the interesting bit. All you get out of it is a list of fingerprints. The magic is in converting those into something meaningful.

Chewing on your fingerprints
To process, identify and count these fingerprints, we need the help of the fingerbank project who have collected DHCP fingerprints from all over the place.

I’m grabbing the fingerprint list as a config file from their github repo: https://github.com/inverse-inc/fingerbank/blob/master/dhcp_fingerprints.conf although since I first started playing with this about 6 months ago, it seems they’ve made their fingerprint database available as an SQLite DB – which would have been much easier to wrangle than parsing the config file.

So here’s a slightly shonky perl script to parse the config file and produce a CSV summary of the output. This is probably not as elegantly done as it could be, please don’t judge too harshly! I’ve tried to make it readable, but some of the datastructures are a bit on the deep side. If you want to see what’s going on, make plenty of use of “Data::Dumper” – I know I had to when writing it.

It assumes dhcp_fingerprints.conf is in the same folder as the script, and expects to be fed fingerprints over STDIN one line at a time – so you can stick it on the end of the pipeline I mentioned earlier.

#!/usr/bin/perl -wT

use strict;

use Config::IniFiles;
use Data::Dumper;

my %dhcp_fingerprints; # tied version of the config file
my ($fprint_db, $fprint_class, $os_counter); # DStructs which we query later

# Tie fingerprint config file from fingerbank to a DS so we can parse it
tie %dhcp_fingerprints, 'Config::IniFiles', ( -file => "dhcp_fingerprints.conf" );

# Build $fprint_class (maps OS name to "class")
foreach my $class (tied(%dhcp_fingerprints)->GroupMembers("class") ) {
  my ($min,$max) = split /\D/, $dhcp_fingerprints{$class}{"members"};
  $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{min}=$min;
  $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{max}=$max;
}

# Build $fprint_db (maps fingerprint to OS name)
foreach my $os ( tied(%dhcp_fingerprints)->GroupMembers("os") ) {
  $os =~ m/os (.*)$/gi;
  my $os_id = $1;

  if ( exists( $dhcp_fingerprints{$os}{"fingerprints"} ) ) {
    if ( ref( $dhcp_fingerprints{$os}{"fingerprints"} ) eq "ARRAY" ) {
      foreach my $dhcp_fingerprint ( @{ $dhcp_fingerprints{$os}{"fingerprints"} } ) {
        $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
        $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;   
      }
    } else {
      if (defined $dhcp_fingerprints{$os}{"fingerprints"}) {
        foreach my $dhcp_fingerprint (split(/\n/, $dhcp_fingerprints{$os}{"fingerprints"})) {
          $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
          $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;
        }
      }
    }
  }
}

# now we loop through all the fingerprints we've been given on STDIN and try to ID them
while (<STDIN>) {
  chomp;
  my $fingerprint = $_;

  # See if it appears in $fprint_db...
  if(defined $$fprint_db{$fingerprint}) {
    # Count it
    $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"count"}++;

    # Try to identify the type of OS
    foreach my $class (keys %{$fprint_class}) {
      if ($$fprint_db{$fingerprint}{"os"} >= $$fprint_class{$class}{"min"} && $$fprint_db{$fingerprint}{"os"} <= $$fprint_class{$class}{"max"}) {
        $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}=$class;
      }
    }
    
    # If we haven't yet set the OS class, set it to "unknown"
    $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}="unknown" unless (defined $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"});

  } else {
    # No idea what it was, so add it to the unknown count
    $$os_counter{"unknown"}{"count"}++;
    $$os_counter{"unknown"}{"class"}="unknown";
  }
}

# Print summary output as a CSV
print "\n\nClass,OS,Count\n";
foreach my $os(keys %$os_counter) {
  print qq["$$os_counter{$os}{class}","$os","$$os_counter{$os}{count}"\n];
}
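
To tie it together, feed the output of the earlier pipeline straight into the script (fingerprint_summary.pl is just a name I’ve given the script above for the sake of the example):

cat /var/log/dhcpd.log | grep FINGERPRINT | awk '{ print $9 " " $7 }' | sort -u | awk '{ print $2 }' | ./fingerprint_summary.pl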

If I let that chew on a decent chunk of today’s logs (from about 7am to 2pm) it spits out the following:

Class | OS | Count
Smartphones/PDAs/Tablets | Samsung Galaxy Tab 3 7.0 SM-T210R | 39
Home Audio/Video Equipment | Slingbox | 49
Dead OSes | OS/2 Warp | 1
Gaming Consoles | Xbox 360 | 6
Windows | Microsoft Windows Vista/7 or Server 2008 | 1694
Printers | Lexmark Printer | 1
Network Boot Agents | Novell Netware Client | 1
Macintosh | Mac OS X Lion | 2783
Misc | Eye-Fi Wireless Memory Card | 1
Printers | Kyocera Printer | 1
unknown | unknown | 40
Smartphones/PDAs/Tablets | LG Nexus 5 & 7 | 1797
Printers | HP Printer | 54
CD-Based OSes | PHLAK | 1
Smartphones/PDAs/Tablets | Nokia | 13
Smartphones/PDAs/Tablets | Motorola Android | 2
Macintosh | Mac OS X | 145
Smartphones/PDAs/Tablets | Generic Android | 2989
Gaming Consoles | Playstation 2 | 1
Linux | Chrome OS | 39
Linux | Ubuntu/Debian 5/Knoppix 6 | 5
Routers and APs | Cisco Wireless Access Point | 69
Linux | Generic Linux | 7
Linux | Ubuntu 11.04 | 21
Windows | Microsoft Windows 8 | 1792
Routers and APs | Apple Airport | 2
Routers and APs | DD-WRT Router | 3
Smartphones/PDAs/Tablets | Sony Ericsson Android | 1
Linux | Debian-based Linux | 51
Smartphones/PDAs/Tablets | Symbian OS | 2
Storage Devices | LaCie NAS | 27
Windows | Microsoft Windows XP | 30
Smartphones/PDAs/Tablets | Android Tablet | 24
Monitoring Devices | Tripplite UPS | 1
Smartphones/PDAs/Tablets | Apple iPod, iPhone or iPad | 12289
Smartphones/PDAs/Tablets | Samsung S5260 Star II | 2
Smartphones/PDAs/Tablets | RIM BlackBerry | 63

I’m not sure I 100% believe the above (OS/2 Warp? Really?) but the bits I disbelieve are largely in the noise.

Chewing on the above stats a bit shows us that the wireless network is roughly 27% laptops and 72% mobile devices (tablets etc). Amongst the laptops, Windows is just about in the lead with 53%, and OS X is close behind at 44% (which is probably higher than a lot of people think). Linux laptops are trailing behind at only 2%.

The mobile device landscape is less evenly split, with 71% iOS and 28% Android.
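
For the curious, here’s roughly how those numbers fall out of the table above. I’m guessing slightly at exactly which rows get counted as “laptops” (for instance I’ve lumped Chrome OS in with Linux), so treat this as a sanity check of the arithmetic rather than the definitive working:

Laptops: Windows 3516 (1694+1792+30), OS X 2928 (2783+145), Linux 123 = 6567 total
         Windows 53.5%, OS X 44.6%, Linux 1.9%
Mobile:  iOS 12289, Android 4813, other 119 = 17221 total
         iOS 71.4%, Android 27.9%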

I wouldn’t read too much into the above analysis though, as it represents a comparatively small time slice (and only 23775 of the 37000 devices we see on the wireless each week).

Who knows, perhaps we’ve got 13K windows phones owned by people who just don’t come onto campus on a Monday…

Update 2015-05-11: I’ve been asked under what license I’ve released the perl script in this post. I didn’t put any thought into licenses at the time (I was just trying to solve a problem and answer a question I’d been asked!) but I’ll put my hand up, part of the script is based on prior-art.

The section which parses the fingerprint database is taken from the process_fingerprints() function in https://github.com/inverse-inc/fingerbank/blob/master/obsolete/tools/fingerprint-find-candidate-matches.pl – a script which seems to be covered by the GPLv2 licence.

As I understand it, under the terms of the GPLv2 license, that means that the script above should also be distributed under the GPLv2 license (which I’m OK with) and that under the terms of that license it should be distributed along with a copy of the GPLv2 license… which can be found here: https://www.gnu.org/licenses/gpl-2.0.html

Puppet — making array expansions in resource calls unique

Let’s say you have an array of NFS clients to allow access to an export. You want to expand this into a list of resources, which you’d normally do via something like:

$allow = hiera_array('some_variable', undef)
nfs::export { $allow:
  ...
}

This works fine until you want to have multiple exports to the same name/IP, at which point you have two nfs::export resources with the same name. But you need the expansion of the array to generate multiple calls to nfs::export, so you need to prefix each string in the array:

$allow = hiera_array('some_variable', undef)
$hacked_allow = prefix($allow, "${fqdn} export of ${path} to client ")
nfs::export { $hacked_allow:
  ...
}

..then in the called define you extract the client fqdn like so:

define nfs::export ( ... ) {
  $client = regsubst($name, "^.* export of .* to client (.*)$", '\1')
  validate_string($client)

  ... [use $client as fqdn in here]
}
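
As a concrete (made-up) illustration of what the prefixing buys you, assuming a host called nfs1.example.org exporting /srv/data to two clients, the hiera data and the resulting resource titles end up looking roughly like this:

some_variable:
  - 'client1.example.org'
  - 'client2.example.org'

# After prefix(), the nfs::export resource titles become:
#   "nfs1.example.org export of /srv/data to client client1.example.org"
#   "nfs1.example.org export of /srv/data to client client2.example.org"
# and the regsubst() in the define pulls the client fqdn back out of each title.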

This seems like a terrible hack, but it’s a reasonable workaround for the need to expand arrays and keep the names unique to avoid duplicate resources.

mcollective with SSL — the short(ish) version

Prerequisites

  • A working puppet master and some clients to test with
  • Hiera in use, with some “common” section which is applied to all nodes. This isn’t strictly required, but it makes the set-up a lot simpler.

What you’ll end up with

  • SSL’d connections between all involved daemons
  • An admin user who can run “mco ping” and get responses from all the mcollective-enabled nodes

Basic setup (without SSL)

First you’ll want to get mcollective up and running *without* SSL to be sure that the module is working, before leaping on to use SSL and certificates.

Since this is a very basic setup, all the examples will use the puppet master as the “broker”, where activemq queues all the requests and replies centrally, rather than a separate machine. Replace the puppet.example.org and node[12].example.org references with your host names as appropriate 🙂

  • Install the puppetlabs mcollective module and its dependencies with:
    puppet module install puppetlabs-mcollective
    This installs many other modules to help it along.
  • Add the mcollective class to your included class list for all nodes. e.g.:
    classes:
      ... (probably plenty of others)
      - mcollective
    
  • Add these variables to your hiera configs for all hosts:
    mcollective::middleware_hosts: puppet.example.org
    
  • Add these variables to apply only to your puppet master:
    mcollective::middleware: true
    mcollective::server    : true
    mcollective::client    : true
    
  • If you have firewalling on your test puppet master (and you should!), open port 61613. For me this required adding another hiera entry:
     site::firewall:
       '040 accept all stomp/activemq connections':
         state : ['NEW']
         proto : 'tcp'
         dport : 61613
         action: accept
    
  • Now run puppet (or wait for a run to occur automatically, if you have that set up) on the puppet master and a couple of nodes, and you should get the mcollective agent installed and configured on all nodes, plus the activemq daemon set up and running on the broker.

At this point you should be able to run mco ping and get some ping time results back from the nodes which are running the mcollective agent. You can also look at the output of netstat on the activemq server and see a number of ESTABLISHED connections to the 61613 port from the mcollective nodes.
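
For example, a quick check along these lines on the broker (adjust to taste) will show those connections:

netstat -an | grep ':61613' | grep ESTABLISHED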

Adding SSL support

Now you have a basic set-up working the steps required to get SSL working are :

  • Allow firewall connections to port 61614 on the broker (same host as the puppet master in this simple case)
  • Generate a strong random password for the mcollective user — I use pwgen -s -y 20 to generate good long random passwords
  • Generate a certificate (public/private) pair for the new admin user, using puppet cert generate admin-user-name on the puppet master
  • Generate a certificate (public/private) pair for the mcollective and activemq daemons to use, with puppet cert generate mcollective-servers
  • Create a site-local mcollective certificates set of directories under your module directories. For example, we have:
      /etc/puppet/modules/project_zoned/files/mcollective/
      /etc/puppet/modules/project_zoned/files/mcollective/client_certs
      /etc/puppet/modules/project_zoned/files/mcollective/certs
      /etc/puppet/modules/project_zoned/files/mcollective/private_keys
    
  • Copy the certificates to the appropriate places:
      cp /var/lib/puppet/ssl/certs/ca.pem /etc/puppet/modules/project_zoned/files/mcollective/certs/
      cp /var/lib/puppet/ssl/certs/mcollective-servers.pem /etc/puppet/modules/project_zoned/files/mcollective/certs/
      cp /var/lib/puppet/ssl/private_keys/mcollective-servers.pem /etc/puppet/modules/project_zoned/files/mcollective/private_keys/
      cp /var/lib/puppet/ssl/certs/admin-user-name.pem /etc/puppet/modules/project_zoned/files/mcollective/client_certs/
      cp /var/lib/puppet/ssl/private_keys/admin-user-name.pem /etc/puppet/modules/project_zoned/files/mcollective/private_keys/
    
  • Add a declaration of an mcollective::user. We have various site-local classes which get auto-expanded into multiple calls to puppet defines, so we add a list like this:
    site::mcollective::users:
      - 'admin-user-name':
        group : 'admin-user-group'
        certificate: 'puppet:///modules/project_zoned/mcollective/client_certs/admin-user-name.pem'
        private_key: 'puppet:///modules/project_zoned/mcollective/private_keys/admin-user-name.pem'
    
  • Re-run puppet on the puppet master and you’ll have an install of mcollective which is set up to use SSL, and an admin user who has a ~/.mcollective config file, along with keys in their ~/.mcollective.d directory. This user should be able to run mco ping and get results from any nodes which have had the mcollective setup installed on them too (which is any that you run puppet on after adding the mcollective class and settings into a common yaml file for all nodes).

Now you should have a basic SSL-encrypted setup which works, and you can start adding mcollective plugins to do useful stuff like manage services and the puppet agent runs etc.

Adding more admin users just requires generating certificates for the user, copying them to the mcollective files repository in your “project” module, and adding them to the site::mcollective::users hash. Then they get the config added to their homedir automatically.
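
For example (the user and group names here are invented), a second admin user is just another entry in that hash, pointing at their freshly generated certificate and key:

site::mcollective::users:
  - 'admin-user-name':
    group : 'admin-user-group'
    certificate: 'puppet:///modules/project_zoned/mcollective/client_certs/admin-user-name.pem'
    private_key: 'puppet:///modules/project_zoned/mcollective/private_keys/admin-user-name.pem'
  - 'second-admin':
    group : 'admin-user-group'
    certificate: 'puppet:///modules/project_zoned/mcollective/client_certs/second-admin.pem'
    private_key: 'puppet:///modules/project_zoned/mcollective/private_keys/second-admin.pem'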

I can see this being rather useful as the number of machines we are managing continues to grow 😀

Docs which were useful / further reading

Juniper VPN – official Linux instructions

After much effort, we now have a packaged version of the Juniper VPN Linux client available! We also have a helper script which allows you to easily connect from your Linux desktop. It’s available for rpm and deb based distros, and as a tgz if you’re using a different package manager.

The instructions have been published here: http://www.bris.ac.uk/it-services/advice/homeusers/uobonly/uobvpn/howto/linux/

The old PPTP based VPN will be switched off on the 30th June 2014, and anyone using Linux to VPN in will need to migrate to the new service before then.

-Paul

Moving a Rocks 6 cluster and changing the IP address of the head node

We have a Rocks 6 cluster which was definitely out-growing the server room/air-conditioned broom-cupboard in which it was housed, in terms of cooling, power consumption and a growing lack of physical space.
There were also many new nodes which wanted to be added but couldn’t without considerable upgrades to the room or a move to a more suitable space.

Moving it to a better home meant a change of upstream router and hence a change of IP address. All signs pointed to this being a terrible idea and a potential disaster, and all replies to related questions on the Rocks mailing list said that reinstalling the head node was the only viable solution. (Possibly using a “restore roll”)

We weren’t convinced that it would be less effort (or time taken) to reinstall than hack the config…

Internally, others had moved IP and changed the hostname of a Rocks 5 cluster and had ended up with a long, long list of changes required. This was clearly going to be error-prone and looked like an even worse idea than reinstalling.

But then we found a page which claimed it was only a few “rocks set” commands and that’s all that was needed on Rocks 6!
( see http://davidmnoriega.wordpress.com/2012/06/22/rocks-cluster-changing-the-external-ip-address/ )

It did mention that there would be a couple of files in /etc to change too, but didn’t explicitly list everything.

Testing on VMs before the move…
We set about installing a VM on VirtualBox as a test head node, just with a default Rocks 6 config from the same install image which was used for the cluster (Rocks 6.0 “Mamba”). We then made a couple of compute node VMs for it to install, and set about testing whether changing the IP using those instructions worked fine.

First snag: VirtualBox Open-Source Edition doesn’t allow PXE boot from Intel ethernet adapters, because there is some non-GPL-compatible code which cannot be bundled in with it (ref: https://forums.virtualbox.org/viewtopic.php?f=9&t=34681 ).

Just pick a non-Intel network adapter for the compute nodes and all is fine with the PXE boot, once the right boot order is set (PXE then HDD).

To ensure that we caught all references to the IP address, we used this:

find /etc -type f -print0 | xargs -0 grep a.b.c

..where a.b.c is the first three octets of the IP address as this is a /24 subnet we’re part of.
(This might be a bit more complicated if your subnet isn’t a /8, /16 or /24)

This finds:
/etc/sysconfig/network
/etc/sysconfig/network-scripts/ifcfg-em2
/etc/sysconfig/static-routes
/etc/yum.repos.d/Rocks-6.0.repo
/etc/hosts

Likewise we searched /opt as well as /etc, as the maui config and a couple of other bits and bobs are stored there; we found nothing which needed changing in /opt.
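
As an example of the “more complicated” case, if your subnet doesn’t sit on an octet boundary you need a pattern rather than a simple prefix. A sketch only (the 10.20.32.0/20 range is made up):

# matches 10.20.32.x through 10.20.47.x
find /etc -type f -print0 | xargs -0 grep -E '(^|[^0-9])10\.20\.(3[2-9]|4[0-7])\.'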

Then simply :

  • change all these IPs to the updated one (including changing static gateways)
  • run the few “rocks set” commands to update the ethernet interface and kickstart settings :


    rocks set host interface ip xxxxxxx ethX x.x.x.x
    rocks set attr Kickstart_PublicAddress x.x.x.x
    rocks set attr Kickstart_PublicNetwork x.x.x.x
    rocks set attr Kickstart_PublicBroadcast x.x.x.x
    rocks set attr Kickstart_PublicGateway x.x.x.x
    rocks set attr Kickstart_PublicNetmask x.x.x.x

  • physically move everything
  • (re)start the head node
  • rebuild all the nodes with:
    rocks set host boot compute action=install and (re)boot the nodes.
    They need this to point at the newly-configured head node IP, as it uses the external IP for static routing. (No idea why it doesn’t use the internal IP, but that’s for another post…)

And everything worked rather nicely!
This was then repeated on the physical machines with renewed confidence that it would work just fine.

Remember to check that everything is working at this point — submit jobs to the queues, ensure NFS is happy and that the nodes can talk to the required external services such as licence servers.

Problems encountered and glossed over by this:

  • Remember to make the configuration changes before you move the head node, otherwise it can take an age to boot whilst it times out on
    remote NFS mounts and other network operations
  • Ensure you have all sockets enabled (network and power) for the racks you are moving to, otherwise this could add another unwanted delay
  • Physically moving machines requires a fair bit of downtime, a van and some careful moving to avoid damage to disks (especially the head node)
  • Networking in the destination room was not terribly structured, and required us to thread a long cable between the head node and internal
    switch, as the head node is in a UPS’d rack which is the opposite side of the (quite large) room to the compute nodes. This will be more of a pain if any compute nodes need to also be on UPS (e.g. they
    have additional storage).
  • Don’t try to be optimistic on the time it will take — users will be happier if you allocate 3 days and it takes 2 than if you say it will take 1 and it takes 2.
  • Don’t have just broken your arm before attempting to move a cluster — it makes moving nodes or doing cabling rather more awkward/impossible! (Unlikely to be a problem for most people 🙂)

What’s talking to my DNS server?

I currently look after 6 DNS resolvers, which are used to provide DNS for two large sections of the university network.

While they’re all running BIND, there’s a real mix of versions so I want to consolidate them down, retire the oldest pair and generally tidy up.

The two oldest are in an IP address range that I can’t easily move onto any of the other servers, so any clients using them need to be pointed at the newer servers before I can retire them.

The problem is, I’m not 100% sure which clients are using them.

So, how do you find out what machines are using a given DNS server?
The temptation is to turn on querylogging and then look at the log files after a week. There are two problems with that: the first is technical (querylogging appears to block responses while waiting for the logging IO to happen, which slows your DNS server down significantly and sends the load average through the roof), but the second, more important, problem is privacy.

Querylogging logs every DNS request made by your clients, as well as the IP the client machine is using. Taken together, those two pieces of information are enough to identify what a given user is looking at on the internet. That’s a pretty serious breach of privacy, needs sign-off from on-high and is generally a world of paperwork and hassle which is best avoided.

So we need an approach which is lightweight, only identifies the machines which are talking to us (not what they’re asking for), and gathers the minimum of data required to allow us to make the decision we’re interested in.

On Linux, that’s pretty easy to arrange with a bit of tcpdump and some good old awk:

#!/bin/bash
/usr/bin/nohup /usr/sbin/tcpdump -ln port 53 2>&1 | awk '/\.domain:/ { print $3; system("") }' | awk -F'.' '{ print $1"."$2"."$3"."$4; system("") }' > /tmp/dns_clients.log &

What does that do then?
It’s quite a long pipeline, so if we split it into chunks, it looks a bit like this:

/usr/bin/nohup

nohup combined with the ‘&’ at the end runs the whole pipeline in such a way that it doesn’t die when you log out. Which is useful if you want to set something running and then come back to it later.

/usr/sbin/tcpdump -ln port 53 2>&1

We then use tcpdump to dump all traffic on port 53 (DNS). The -n tells it not to resolve any IP addresses to DNS names, and the -l changes the STDOUT buffering to be line buffered. This is helpful if you want to be able to see clients hitting your box in real time, for example to make sure the script is doing what you’re expecting it to.

“2>&1” merges STDERR and STDOUT into a single STDOUT stream.

awk '/\.domain:/ { print $3; system("") }'

Next we use awk to look for lines which contain the string “.domain:”; these are the ones which are headed towards our DNS server, rather than outbound requests made by processes running on the server. “print $3” prints the third field on the line, which is the source IP address/port of the request.

system("") is a useful bodge which makes the output from awk line buffered, in the same way as we used -l for tcpdump.

awk -F'.' '{ print $1"."$2"."$3"."$4; system("") }'

A bit more awk to strip off the port number, leaving us with just the IPv4 address. The observant amongst you will notice that this does nothing for IPv6 addresses. I should probably come up with a better solution, but this is sufficient for the boxes I’m currently interested in.

> /tmp/dns_clients.log &

redirects all the output into the /tmp/dns_clients.log file.
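
On the IPv6 point mentioned above, one possible tweak (untested, just a sketch) is to strip the trailing “.port” which tcpdump appends to the source address, rather than rebuilding the dotted quad. That works for both IPv4 and IPv6 clients, and sed -u keeps the output line buffered in the same way as before:

#!/bin/bash
/usr/bin/nohup /usr/sbin/tcpdump -ln port 53 2>&1 | awk '/\.domain:/ { print $3; system("") }' | sed -u 's/\.[^.]*$//' > /tmp/dns_clients.log &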

Handling the log file
So set that running on a DNS server, and client IP addresses will be logged to /tmp/dns_clients.log without logging what they’re looking at or slowing down the server too much. However, it logs one line for every request so it can get large quite quickly on a busy server.

I’ve been rotating that logfile out daily and running it through “sort -u” to get a list of unique clients. Once I’ve tracked them all down and pointed them at a more appropriate DNS server I’ll be able to retire the old ones.
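
The daily tidy-up is nothing clever; a one-liner along these lines (output filename invented for the example) does the job:

sort -u /tmp/dns_clients.log > /root/dns_clients.$(date +%F).uniq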

Win!

Solaris Bonus Round
If you need to do this on solaris, tcpdump isn’t generally available (unless you’re willing to build it yourself) but /usr/sbin/snoop has roughly equivalent functionality. The output format is a little different, although it’s a bit easier to handle.

#!/usr/bin/bash
/usr/bin/nohup /usr/sbin/snoop -r dst port 53 | awk '{ print $1 }' > /tmp/dns_clients.log &

That does broadly the right thing. It includes traffic originated by the server as well as traffic hitting it from clients (so you could also pipe it through “grep -v $MY_IP” to weed that out if you wanted)

Nagios alerts via Google Talk

Every server built by my team in the last 3 years gets automatically added to our Nagios monitoring. We monitor the heck out of everything, and Nagios is configured to send us an alert email when something needs our attention.

Recently we experienced an issue with some servers which are used to authenticate a large volume of users on a high profile service. Nagios spotted the issue and sent us an email, which was then delayed and didn’t actually hit our inboxes until 1.5hrs after the fault developed. That’s 1.5hrs of downtime on a service, in the middle of the day, because our alerting process failed us.

This is clearly Not Good Enough(tm) so we’re stepping it up a notch.

With help from David Gardner we’ve added Google Talk alerts to our Nagios instance. Now when something goes wrong, as well as getting an email about it, my instant messenger client goes “bong!” and starts flashing. Win!

If you squint at it in the right way, on a good day, Google Talk is pretty much Jabber or XMPP and that appears to be fairly easy to automate.

To do this, you need three things:

  1. An account for Nagios to send the alerts from
  2. A script which sends the alerts
  3. Some Nagios config to tie it all together

XMPP Account

We created a free gmail.com account for this step. If you prefer, you could use another provider such as jabber.org. I would strongly recommend you don’t re-use any account you use for email or anything else. Keep it self-contained (and certainly don’t use your UoB gmail account!)

Once you’ve created that account, sign in to it and send an invite to the accounts you want the alerts to be sent to. This establishes a trust relationship between the accounts so the messages get through.

The Script

This is a modified version of a perl script David Gardner helped us out with. It uses the Net::XMPP perl module to send the messages. I prefer to install my perl modules via yum or apt (so that they get updated like everything else) but you can install it from CPAN if you prefer. Whatever floats your boat. On CentOS it’s the perl-Net-XMPP package, and on Debian/Ubuntu it’s libnet-xmpp-perl
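
So, depending on your distro, one of:

# CentOS
yum install perl-Net-XMPP

# Debian/Ubuntu
apt-get install libnet-xmpp-perl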

#!/usr/bin/perl -T
use strict;
use warnings;
use Net::XMPP;
use Getopt::Std;

my %opts;
getopts('f:p:r:t:m:', \%opts);

my $from = $opts{f} or usage();
my $password = $opts{p} or usage();
my $resource = $opts{r} || "nagios";
my $recipients = $opts{t} or usage();
my $message = $opts{m} or usage();

unless ($from =~ m/(.*)@(.*)/gi) {
  usage();
}
my ($username, $componentname) = ($1,$2);

my $conn = Net::XMPP::Client->new;
my $status = $conn->Connect(
  hostname => 'talk.google.com',
  port => 443,
  componentname => $componentname,
  connectiontype => 'http',
  tls => 0,
  ssl => 1,
);

# Change hostname
my $sid = $conn->{SESSION}->{id};
$conn->{STREAM}->{SIDS}->{$sid}->{hostname} = $componentname;

die "Connection failed: $!" unless defined $status;
my ( $res, $msg ) = $conn->AuthSend(
  username => $username,
  password => $password,
  resource => $resource, # client name
);
die "Auth failed ", defined $msg ? $msg : '', " $!"
unless defined $res and $res eq 'ok';

foreach my $recipient (split(',', $recipients)) {
  $conn->MessageSend(
    to => $recipient,
    resource => $resource,
    subject => 'message via ' . $resource,
    type => 'chat',
    body => $message,
  );
}

sub usage {
print qq{
$0 - Usage

-f "from account" (eg nagios\@myhost.org)
-p password
-r resource (default is "nagios")
-t "to account" (comma separated list or people to send the message to)
-m "message"
};
exit;
}

The Nagios Config

You need to install the script in a place where Nagios can execute it. Usually the place where your Nagios check plugins are installed is fine – on our systems that’s /usr/lib64/nagios/plugins.
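
For example (the filename notify_by_xmpp matches the command definitions below; adjust the path to wherever your plugins live):

cp notify_by_xmpp /usr/lib64/nagios/plugins/
chmod 755 /usr/lib64/nagios/plugins/notify_by_xmpp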

Now we need to make Nagios aware of this plugin by defining a command. In your command.cfg, add two blocks like this (one for service notifications, one for host notifications). Replace <username> with the gmail account you registered, excluding the suffix, and replace <password> with the account password.

define command {
  command_name notify-by-xmpp
  command_line $USER1$/notify_by_xmpp -f <username@gmail.com> -p <password> -t $CONTACTEMAIL$ -m "$NOTIFICATIONTYPE$ $HOSTNAME$ $SERVICEDESC$ $SERVICESTATE$ $SERVICEOUTPUT$ $LONGDATETIME$"
}

define command {
  command_name host-notify-by-xmpp
  command_line $USER1$/notify_by_xmpp -f <username@gmail.com> -p <password> -t $CONTACTEMAIL$ -m "Host $HOSTALIAS$ is $HOSTSTATE$ - Info"
}

This command uses the Nagios user’s registered email address for XMPP as well as email. For this to work you will probably have to use the short (non-aliased) version of the email address, i.e. ab12345@bristol.ac.uk rather than alpha.beta@bristol.ac.uk. Change this in contact.cfg if necessary.

With the script in place and registered with Nagios, we just need to tell Nagios to use it. You can enable it per-user or just enable it for everyone. In this example we’ll enable it for everyone.

Look in templates.cfg and edit the generic contact template to look like this. The relevant changes are the two *_notification_commands lines.

define contact{
        name                            generic-contact
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,c,r,f,s
        host_notification_options       d,u,r,f,s
        service_notification_commands   notify-service-by-email, notify-by-xmpp
        host_notification_commands      notify-host-by-email, host-notify-by-xmpp
        register                        0
}

Restart Nagios for the changes to take effect. Now break something, and wait for your IM client to receive the notification.

Quicktip – mp4 playback in chromium

I use Ubuntu as my main desktop both at work and at home, and chromium as my default browser. For ages it’s been annoying me that some videos on YouTube don’t play. Well, I’ve finally gotten around to fixing it.

It seems mp4 playback is missing from chromium on ubuntu, and can be added with:

sudo apt-get install chromium-codecs-ffmpeg-extra

Restart chromium and you can go back to watching kittens falling off tables.

More Juniper VPN Client…

As mentioned at the bottom of my previous post, it is possible to connect to the Juniper VPN without using the Java GUI; you just need the right command line arguments.

Well, I’ve worked out what those are.

./ncsvc -h uobnet.bris.ac.uk -f ./uobnet.crt -r UoB-Users -u ab12345

(replacing ab12345 with your UoB username obviously)

The list of 32bit dependencies is significantly smaller (as it doesn’t involve Java!) and on Ubuntu the client seems to be satisfied with:

sudo apt-get install libc6-i386 lib32z1 lib32nss-mdns

As before, the above has had very little testing at all.  Any reports of what it does on non-ubuntu systems would be very welcome indeed!