Publish a Module on Puppet Forge

Posted on 2014/11/20 by Jonathan Gazeley

I’ve started publishing as many of my Puppet modules as possible on Puppet Forge. It isn’t hard to do, but there are a few things to know. This guide is largely based on Puppet Labs’ own guide, Publishing Modules on the Puppet Forge.

For home-grown modules that have grown organically, you are likely to have at least some site-specific data mixed in with the code. Before publishing, you’ll need to abstract this out. I recommend using parametrised classes with sane defaults for your inputs. If necessary, you can have a local wrapper class to pass site-specific values into your module.
The vast majority of Puppet modules are on GitHub, but this isn’t actually a requirement. GitHub offers public collaboration and issue tracking, but you can keep your code wherever you like.
Before you can publish, you need to include some metadata with your module. Look at the output of puppet module generate. If you’re starting from scratch, this command is an excellent place to start. If you’re patching up an old module for publication, run it in a different location and selectively copy the useful files into your module. The mandatory files are metadata.json and README.md.
When you’re ready to publish, run puppet module build. This creates a tarball of your module and metadata which is ready to upload to Puppet Forge.
Create an account on Puppet Forge and upload your tarball. It will automatically fill in the metadata.
Install your module on your Puppetmaster by running puppet module install myname/mymodule
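
The command-line side of the steps above can be sketched as a small shell script. This is only an illustration: myname and mymodule are placeholders, the run helper and publish_workflow function are made up for this example, and it assumes the Puppet 3.x-era puppet module subcommands that this post describes. By default it only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the publishing workflow using the commands named in the post.
# "myname" and "mymodule" are placeholders, not real Forge names.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

publish_workflow() {
  # In a scratch directory: generate a skeleton to copy metadata.json
  # and README.md from.
  run puppet module generate myname-mymodule
  # In the root of your own module: build the tarball for upload.
  run puppet module build
  # After uploading via the Forge website: install on the Puppetmaster.
  run puppet module install myname/mymodule
}

publish_workflow
```

Set DRY_RUN=0 to actually execute the commands, bearing in mind that the first two need to be run from different directories as noted in the comments.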

Building a Gitlab server with Puppet

Posted on 2014/11/17 by Jonathan Gazeley

GitHub is an excellent tool for code-sharing, but it has the major disadvantage of being fully public. You probably don’t want to put your confidential stuff and shared secrets in there! You can pay for private repositories, but the issue still stands that we shouldn’t be putting confidential UoB things in a non-approved cloud provider.

I briefly investigated several self-hosted pointy-clicky Git interfaces, including Gitorious, Gitolite, GitLab, Phabricator and Stash. They all have their relative merits, but they all seem to be a total pain to install and run in a production environment, often requiring that we randomly git clone something into the webroot and then not providing a sane upgrade mechanism. Many of them have dependencies on modules not included with the enterprise Linux distributions.

In the end, the easiest-to-deploy option seemed to be to use the GitLab Omnibus installer. This bundles the GitLab application with all its dependencies in a single RPM for ease of deployment. There’s also a Puppet Forge module called spuder/gitlab which makes it nice and easy to install on a Puppet-managed node.

After fiddling, my final solution invokes the Forge module like this:

class { 'gitlab' : 
  puppet_manage_config          => true,
  puppet_manage_backups         => true,
  puppet_manage_packages        => false,
  gitlab_branch                 => '7.4.3',
  external_url                  => "https://${::fqdn}",
  ssl_certificate               => '/etc/gitlab/ssl/gitlab.crt',
  ssl_certificate_key           => '/etc/gitlab/ssl/gitlab.key',
  redirect_http_to_https        => true,
  backup_keep_time              => 5184000, # 5184000 = 60 days
  gitlab_default_projects_limit => 100,
  gitlab_download_link          => 'https://downloads-packages.s3.amazonaws.com/centos-6.5/gitlab-7.4.3_omnibus.5.1.0.ci-1.el6.x86_64.rpm',
  gitlab_email_from             => 'gitlab@example.com',
  ldap_enabled                  => true,
  ldap_host                     => 'ldap.example.com',
  ldap_base                     => 'CN=Users,DC=example,DC=com',
  ldap_port                     => '636',
  ldap_uid                      => 'uid',
  ldap_method                   => 'ssl',
  ldap_bind_dn                  => 'uid=ldapuser,ou=system,dc=example,dc=com',
  ldap_password                 => '*********',
}

I also added a couple of resources to install the certificates and create a firewall exception, to make a complete working deployment.

The upgrade path requires manual intervention, but is mostly automatic. You just need to change gitlab_download_link to point to a newer RPM and change gitlab_branch to match.

If anyone is interested, I’d be happy to write something about the experience of using GitLab after a while, when I’ve found out some of the quirks.

Update by DaveG! (in lieu of comments currently on this site)

GitLab have changed their install process to require use of their repo, so this module doesn’t like it very much. They’ve also changed the package name to ‘gitlab-ce’ rather than just ‘gitlab’.

To work around this I needed to:

Add name => 'gitlab-ce' to the package { 'gitlab': ... } params in gitlab/manifests/install.pp
Find the package RPM for a shiny new version of GitLab (7.11.4 in this case) via https://packages.gitlab.com/gitlab/gitlab-ce?filter=rpms
Copy the RPM to a local web-accessible location as a mirror, and use this as the location for the gitlab_download_link class parameter

This seems to have allowed it to work fine!
(Caveat: I had some strange behaviour with whether it would run the gitlab instance correctly, but I’m not sure if that’s because of left-overs from a previous install attempt. Needs more testing!)

DHCP fingerprinting

Posted on 2014/11/10 by Paul Seward

We wanted to find out what sort of devices are active on the wireless network, and the vendor tools we’ve got don’t quite give us the level of detail we were after.

However, everything which hits our wireless network gets a DHCP lease from our DHCP servers. With a bit of dhcpd.conf magic, you can make it profile each client when it requests or renews a lease and record a fingerprint in the logs.

dhcpd.conf – collecting fingerprints

# put the dhcp options request fingerprint in the leases file
set dhcp-op-req-string = binary-to-ascii(10,8,":",option dhcp-parameter-request-list);

# log the fingerprint in the format:
# Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50

log(info,
concat("FINGERPRINT ",
binary-to-ascii(10,8,",",option dhcp-parameter-request-list),
" for ",
concat (  # MAC
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 1, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 2, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 3, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 4, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 5, 1))),2), ":",
        suffix (concat ("0", binary-to-ascii (16, 8, "",
          substring (hardware, 6, 1))),2)
       )        # End MAC
));
# End DHCP fingerprinting

Now every time a device interacts with our DHCP server, we get a FINGERPRINT line appearing in our logs along with the MAC address which requested the lease.

So far, so good. Now we need to process those logs into something anonymous, but meaningful.

Data Prep
The easiest approach is to cat our logfile, strip out just the fields we’re interested in (MAC address and fingerprint), sort them to remove duplicates (we only want to count each machine once!) and finally throw away the MAC addresses (because all we really want are the fingerprints).

We can do that easily enough with a lovely long pipeline:

cat /var/log/dhcpd.log | grep FINGERPRINT | awk '{ print $9 " " $7 }' | sort -u | awk '{ print $2 }'

There are probably more elegant ways to do it, but the above isn’t really the interesting bit. All you get out of it is a list of fingerprints. The magic is in converting those into something meaningful.
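
As one possible tidier variant, the same filtering can be done in a single awk call. This is a sketch rather than what we actually ran: unique_fingerprints is a made-up name, and it assumes log lines in exactly the format shown in the dhcpd.conf comment above, so the fingerprint is field 7 and the MAC address is field 9. It de-duplicates on the MAC/fingerprint pair, just like sort -u does in the pipeline, but prints in first-seen order rather than sorted order.

```shell
#!/bin/sh
# Hypothetical helper: print one fingerprint per unique MAC/fingerprint pair.
# Assumes syslog lines shaped like:
#   Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50
# Reads the named files, or STDIN if none are given.
unique_fingerprints() {
  awk '$6 == "FINGERPRINT" && !seen[$9 " " $7]++ { print $7 }' "$@"
}

# Small demo on made-up log lines: the repeated lease is only counted once,
# and the non-FINGERPRINT line is ignored.
printf '%s\n' \
  'Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50' \
  'Jul 17 14:40:11 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50' \
  'Jul 17 14:41:02 dhcp2 dhcpd: DHCPACK on 10.0.0.1' \
  'Jul 17 14:42:30 dhcp2 dhcpd: FINGERPRINT 1,3,6 for 00:10:20:30:40:51' |
  unique_fingerprints
```

To use it on a real log you would call unique_fingerprints /var/log/dhcpd.log in place of the longer pipeline.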

Chewing on your fingerprints
To process, identify and count these fingerprints, we need the help of the Fingerbank project, who have collected DHCP fingerprints from all over the place.

I’m grabbing the fingerprint list as a config file from their GitHub repo: https://github.com/inverse-inc/fingerbank/blob/master/dhcp_fingerprints.conf although since I first started playing with this about 6 months ago, it seems they’ve made their fingerprint database available as an SQLite DB – which would have been much easier to wrangle than parsing the config file.

So here’s a slightly shonky Perl script to parse the config file and produce a CSV summary of the results. This is probably not as elegantly done as it could be, so please don’t judge too harshly! I’ve tried to make it readable, but some of the data structures are a bit on the deep side. If you want to see what’s going on, make plenty of use of “Data::Dumper” – I know I had to when writing it.

It assumes dhcp_fingerprints.conf is in the same folder as the script, and expects to be fed fingerprints over STDIN one line at a time – so you can stick it on the end of the pipeline I mentioned earlier.

#!/usr/bin/perl -wT

use strict;

use Config::IniFiles;
use Data::Dumper;

my %dhcp_fingerprints; # tied version of the config file
my ($fprint_db, $fprint_class, $os_counter); # DStructs which we query later

# Tie fingerprint config file from fingerbank to a DS so we can parse it
tie %dhcp_fingerprints, 'Config::IniFiles', ( -file => "dhcp_fingerprints.conf" );

# Build $fprint_class (maps OS name to "class")
foreach my $class (tied(%dhcp_fingerprints)->GroupMembers("class") ) {
  my ($min,$max) = split /\D/, $dhcp_fingerprints{$class}{"members"};
  $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{min}=$min;
  $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{max}=$max;
}

# Build $fprint_db (maps fingerprint to OS name)
foreach my $os ( tied(%dhcp_fingerprints)->GroupMembers("os") ) {
  $os =~ m/os (.*)$/gi;
  my $os_id = $1;

  if ( exists( $dhcp_fingerprints{$os}{"fingerprints"} ) ) {
    if ( ref( $dhcp_fingerprints{$os}{"fingerprints"} ) eq "ARRAY" ) {
      foreach my $dhcp_fingerprint ( @{ $dhcp_fingerprints{$os}{"fingerprints"} } ) {
        $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
        $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;   
      }
    } else {
      if (defined $dhcp_fingerprints{$os}{"fingerprints"}) {
        foreach my $dhcp_fingerprint (split(/\n/, $dhcp_fingerprints{$os}{"fingerprints"})) {
          $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
          $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;
        }
      }
    }
  }
}

# now we loop through all the fingerprints we've been given on STDIN and try to ID them
while (<STDIN>) {
  chomp;
  my $fingerprint = $_;

  # See if it appears in $fprint_db...
  if(defined $$fprint_db{$fingerprint}) {
    # Count it
    $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"count"}++;

    # Try to identify the type of OS
    foreach my $class (keys %$fprint_class) {
      if ($$fprint_db{$fingerprint}{"os"} >= $$fprint_class{$class}{"min"} && $$fprint_db{$fingerprint}{"os"} <= $$fprint_class{$class}{"max"}) {
        $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}=$class;
      }
    }
    
    # If we haven't yet set the OS class, set it to "unknown"
    $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}="unknown" unless (defined $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"});

  } else {
    # No idea what it was, so add it to the unknown count
    $$os_counter{"unknown"}{"count"}++;
    $$os_counter{"unknown"}{"class"}="unknown";
  }
}

# Print summary output as a CSV
print "\n\nClass,OS,Count\n";
foreach my $os(keys %$os_counter) {
  print qq["$$os_counter{$os}{class}","$os","$$os_counter{$os}{count}"\n];
}

If I let that chew on a decent chunk of today’s logs (from about 7am to 2pm) it spits out the following:

Class	OS	Count
Smartphones/PDAs/Tablets	Samsung Galaxy Tab 3 7.0 SM-T210R	39
Home Audio/Video Equipment	Slingbox	49
Dead OSes	OS/2 Warp	1
Gaming Consoles	Xbox 360	6
Windows	Microsoft Windows Vista/7 or Server 2008	1694
Printers	Lexmark Printer	1
Network Boot Agents	Novell Netware Client	1
Macintosh	Mac OS X Lion	2783
Misc	Eye-Fi Wireless Memory Card	1
Printers	Kyocera Printer	1
unknown	unknown	40
Smartphones/PDAs/Tablets	LG Nexus 5 & 7	1797
Printers	HP Printer	54
CD-Based OSes	PHLAK	1
Smartphones/PDAs/Tablets	Nokia	13
Smartphones/PDAs/Tablets	Motorola Android	2
Macintosh	Mac OS X	145
Smartphones/PDAs/Tablets	Generic Android	2989
Gaming Consoles	Playstation 2	1
Linux	Chrome OS	39
Linux	Ubuntu/Debian 5/Knoppix 6	5
Routers and APs	Cisco Wireless Access Point	69
Linux	Generic Linux	7
Linux	Ubuntu 11.04	21
Windows	Microsoft Windows 8	1792
Routers and APs	Apple Airport	2
Routers and APs	DD-WRT Router	3
Smartphones/PDAs/Tablets	Sony Ericsson Android	1
Linux	Debian-based Linux	51
Smartphones/PDAs/Tablets	Symbian OS	2
Storage Devices	LaCie NAS	27
Windows	Microsoft Windows XP	30
Smartphones/PDAs/Tablets	Android Tablet	24
Monitoring Devices	Tripplite UPS	1
Smartphones/PDAs/Tablets	Apple iPod, iPhone or iPad	12289
Smartphones/PDAs/Tablets	Samsung S5260 Star II	2
Smartphones/PDAs/Tablets	RIM BlackBerry	63

I’m not sure I 100% believe the above (OS/2 Warp? Really?) but the bits I disbelieve are largely in the noise.

Chewing on the above stats a bit shows us that the wireless network is roughly 27% laptops and 72% mobile devices (tablets etc). Amongst the laptops, Windows is just about in the lead with 53%, and OS X is close behind at 44% (which is probably higher than a lot of people think). Linux laptops are trailing behind at only 2%.

The mobile device landscape is less evenly split, with 71% iOS and 28% Android.
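
For anyone wanting to check the sums, the percentages can be recomputed from the table with a bit of awk. Treat this as a sketch: share_report is a made-up name, and the way rows are grouped into laptops, iOS and Android is an assumption about how the figures were derived, so the total it arrives at differs slightly from the 23775 quoted in this post.

```shell
#!/bin/sh
# Recompute the headline percentages from subtotals of the table above.
# The row groupings are an assumption, not necessarily the original method.
share_report() {
  awk 'BEGIN {
    windows = 1694 + 1792 + 30               # Vista/7, Windows 8, XP
    mac     = 2783 + 145                     # Mac OS X Lion, Mac OS X
    linux   = 39 + 5 + 7 + 21 + 51           # every row in the Linux class
    ios     = 12289                          # Apple iPod, iPhone or iPad
    android = 39 + 1797 + 2 + 2989 + 1 + 24  # rows naming Android devices
    other   = 13 + 2 + 2 + 63                # Nokia, Symbian, S5260, BlackBerry
    mobile  = ios + android + other          # whole Smartphones/PDAs/Tablets class
    laptops = windows + mac + linux
    total   = laptops + mobile

    printf "laptops %.1f%% mobile %.1f%%\n", 100 * laptops / total, 100 * mobile / total
    printf "windows %.1f%% mac %.1f%% linux %.1f%%\n", 100 * windows / laptops, 100 * mac / laptops, 100 * linux / laptops
    printf "ios %.1f%% android %.1f%%\n", 100 * ios / mobile, 100 * android / mobile
  }'
}

share_report
```

Under those assumptions the results agree with the figures quoted above to within rounding.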

I wouldn’t read too much into the above analysis, though, as it represents a comparatively small time slice (and only 23775 of the 37000 devices we see on the wireless each week).

Who knows, perhaps we’ve got 13K Windows phones owned by people who just don’t come onto campus on a Monday…

Update 2015-05-11: I’ve been asked under what license I’ve released the Perl script in this post. I didn’t put any thought into licenses at the time (I was just trying to solve a problem and answer a question I’d been asked!) but I’ll put my hand up: part of the script is based on prior art.

The section which parses the fingerprint database is taken from the process_fingerprints() function in https://github.com/inverse-inc/fingerbank/blob/master/obsolete/tools/fingerprint-find-candidate-matches.pl – a script which seems to be covered by the GPLv2 license.

As I understand it, under the terms of the GPLv2 license, that means that the script above should also be distributed under the GPLv2 license (which I’m OK with) and that under the terms of that license it should be distributed along with a copy of the GPLv2 license… which can be found here: https://www.gnu.org/licenses/gpl-2.0.html

UoB Unix

Linux and Unix at the University of Bristol
