We wanted to find out what sort of devices are active on our wireless network, but the vendor tools we’ve got don’t quite give us the level of detail we’re after.
However, everything which hits our wireless network gets a lease from our DHCP servers. With a bit of dhcpd.conf magic, you can make dhcpd profile each client when it requests or renews a lease and record a fingerprint (the list of options the client asks for in its DHCP parameter request list) in the logs.
dhcpd.conf – collecting fingerprints
# put the dhcp options request fingerprint in the leases file
set dhcp-op-req-string = binary-to-ascii(10,8,":",option dhcp-parameter-request-list);
# log the fingerprint in the format:
# Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50
log(info,
  concat("FINGERPRINT ",
    binary-to-ascii(10,8,",",option dhcp-parameter-request-list),
    " for ",
    concat ( # MAC
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 1, 1))),2), ":",
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 2, 1))),2), ":",
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 3, 1))),2), ":",
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 4, 1))),2), ":",
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 5, 1))),2), ":",
      suffix (concat ("0", binary-to-ascii (16, 8, "",
        substring (hardware, 6, 1))),2)
    ) # End MAC
));
# End DHCP fingerprinting
Now every time a device interacts with our DHCP server, a FINGERPRINT line appears in our logs along with the MAC address which requested the lease.
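To make the log format concrete, here is a minimal Python sketch that pulls the MAC address and fingerprint out of one of those lines. It assumes the exact layout shown in the dhcpd.conf comment; the regular expression and the function name `parse_fingerprint_line` are my own illustration, not part of dhcpd or any library.

```python
import re

# Assumed log layout, taken from the comment in the dhcpd.conf snippet:
#   Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50
FINGERPRINT_RE = re.compile(
    r"FINGERPRINT (?P<fingerprint>\d+(?:,\d+)*) for "
    r"(?P<mac>[0-9a-f]{2}(?::[0-9a-f]{2}){5})\s*$",
    re.IGNORECASE,
)


def parse_fingerprint_line(line):
    """Return (mac, fingerprint) for a FINGERPRINT log line, or None otherwise."""
    match = FINGERPRINT_RE.search(line)
    if match is None:
        return None
    return match.group("mac").lower(), match.group("fingerprint")


example = "Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50"
print(parse_fingerprint_line(example))
```

Lines that don't carry a fingerprint simply come back as `None`, so the function can be mapped over a whole logfile.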
So far, so good. Now we need to process those logs into something anonymous, but meaningful.
Data Prep
The easiest approach is to cat our logfile, strip out just the fields we’re interested in (MAC address and fingerprint), sort them to remove duplicates (we only want to count each machine once!) and finally throw away the MAC addresses (because all we really want are the fingerprints).
We can do that easily enough with a lovely long pipeline:
cat /var/log/dhcpd.log | grep FINGERPRINT | awk '{ print $9 " " $7 }' | sort -u | awk '{ print $2 }'
There are probably more elegant ways to do it, but the above isn’t really the interesting bit. All you get out of it is a list of fingerprints. The magic is in converting those into something meaningful.
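For comparison, the same dedupe-then-anonymise step can be sketched in Python. This is only an illustration of what the pipeline does, using the same whitespace-separated field positions as the awk commands. The function name and all sample lines except the first are made up for the example.

```python
def unique_device_fingerprints(lines):
    """Yield one fingerprint per distinct (MAC, fingerprint) pair, without the MAC."""
    seen = set()
    for line in lines:
        fields = line.split()
        # awk's $6, $7 and $9 are fields[5], fields[6] and fields[8] here
        if len(fields) < 9 or fields[5] != "FINGERPRINT":
            continue
        pair = (fields[8], fields[6])  # (MAC, fingerprint)
        if pair not in seen:
            seen.add(pair)
            yield fields[6]  # drop the MAC, keep only the fingerprint


sample_log = [
    "Jul 17 14:36:06 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50",
    # hypothetical renewal from the same device: must not be counted twice
    "Jul 17 14:40:11 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:50",
    # hypothetical second device with the same fingerprint
    "Jul 17 14:41:02 dhcp2 dhcpd: FINGERPRINT 1,3,6,12,15,28 for 00:10:20:30:40:51",
    # hypothetical unrelated log line
    "Jul 17 14:41:03 dhcp2 dhcpd: DHCPACK on 10.0.0.5 to 00:10:20:30:40:51",
]
print(list(unique_device_fingerprints(sample_log)))
```

Like the pipeline, this counts a device twice if it sends two different fingerprints. Unlike `sort -u`, it keeps first-seen order rather than sorting, which makes no difference to the counts.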
Chewing on your fingerprints
To process, identify and count these fingerprints, we need the help of the Fingerbank project, which has collected DHCP fingerprints from all over the place.
I’m grabbing the fingerprint list as a config file from their GitHub repo: https://github.com/inverse-inc/fingerbank/blob/master/dhcp_fingerprints.conf. Since I first started playing with this about 6 months ago, it seems they’ve also made their fingerprint database available as an SQLite DB, which would have been much easier to wrangle than parsing the config file.
So here’s a slightly shonky Perl script to parse the config file and produce a CSV summary of the results. This is probably not as elegantly done as it could be, so please don’t judge too harshly! I’ve tried to make it readable, but some of the data structures are a bit on the deep side. If you want to see what’s going on, make plenty of use of “Data::Dumper” – I know I had to when writing it.
It assumes dhcp_fingerprints.conf is in the same folder as the script, and expects to be fed fingerprints over STDIN one line at a time – so you can stick it on the end of the pipeline I mentioned earlier.
#!/usr/bin/perl -wT
use strict;
use Config::IniFiles;
use Data::Dumper;
my %dhcp_fingerprints; # tied version of the config file
my ($fprint_db, $fprint_class, $os_counter); # DStructs which we query later
# Tie fingerprint config file from fingerbank to a DS so we can parse it
tie %dhcp_fingerprints, 'Config::IniFiles', ( -file => "dhcp_fingerprints.conf" );
# Build $fprint_class (maps OS name to "class")
foreach my $class (tied(%dhcp_fingerprints)->GroupMembers("class") ) {
    my ($min,$max) = split /\D/, $dhcp_fingerprints{$class}{"members"};
    $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{min}=$min;
    $$fprint_class{ $dhcp_fingerprints{$class}{"description"} }{max}=$max;
}
# Build $fprint_db (maps fingerprint to OS name)
foreach my $os ( tied(%dhcp_fingerprints)->GroupMembers("os") ) {
    $os =~ m/os (.*)$/gi;
    my $os_id = $1;
    if ( exists( $dhcp_fingerprints{$os}{"fingerprints"} ) ) {
        if ( ref( $dhcp_fingerprints{$os}{"fingerprints"} ) eq "ARRAY" ) {
            foreach my $dhcp_fingerprint ( @{ $dhcp_fingerprints{$os}{"fingerprints"} } ) {
                $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
                $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;
            }
        } else {
            if (defined $dhcp_fingerprints{$os}{"fingerprints"}) {
                foreach my $dhcp_fingerprint (split(/\n/, $dhcp_fingerprints{$os}{"fingerprints"})) {
                    $$fprint_db{$dhcp_fingerprint}{"description"}=$dhcp_fingerprints{$os}{"description"};
                    $$fprint_db{$dhcp_fingerprint}{"os"}=$os_id;
                }
            }
        }
    }
}
# now we loop through all the fingerprints we've been given on STDIN and try to ID them
while (<STDIN>) {
    chomp;
    my $fingerprint = $_;
    # See if it appears in $fprint_db...
    if(defined $$fprint_db{$fingerprint}) {
        # Count it
        $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"count"}++;
        # Try to identify the type of OS
        foreach my $class (keys %$fprint_class) {
            if ($$fprint_db{$fingerprint}{"os"} >= $$fprint_class{$class}{"min"} && $$fprint_db{$fingerprint}{"os"} <= $$fprint_class{$class}{"max"}) {
                $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}=$class;
            }
        }
        # If we haven't yet set the OS class, set it to "unknown"
        $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"}="unknown" unless (defined $$os_counter{$$fprint_db{$fingerprint}{"description"}}{"class"});
    } else {
        # No idea what it was, so add it to the unknown count
        $$os_counter{"unknown"}{"count"}++;
        $$os_counter{"unknown"}{"class"}="unknown";
    }
}
# Print summary output as a CSV
print "\n\nClass,OS,Count\n";
foreach my $os (keys %$os_counter) {
    print qq["$$os_counter{$os}{class}","$os","$$os_counter{$os}{count}"\n];
}
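If the nested hashes in the script above are hard to follow, the core lookup can be sketched in a few lines of Python. This is an illustration of the logic only: the class ranges, OS IDs, descriptions and fingerprint strings below are invented placeholders, not real Fingerbank data, and the function names are mine.

```python
from collections import Counter

# Hypothetical stand-ins for the two structures the Perl script builds.
# class name -> (lowest OS id, highest OS id), like $fprint_class
class_ranges = {
    "Windows": (100, 199),
    "Macintosh": (200, 299),
}
# fingerprint -> OS description and numeric OS id, like $fprint_db
fingerprint_db = {
    "1,15,3,6,44": {"description": "Example Windows release", "os": 101},
    "1,3,6,15,119": {"description": "Example Mac release", "os": 201},
    "1,3,6": {"description": "Example unclassified device", "os": 9999},
}


def classify(fingerprint):
    """Return (class, OS description); both are 'unknown' if the fingerprint is new."""
    entry = fingerprint_db.get(fingerprint)
    if entry is None:
        return "unknown", "unknown"
    for name, (low, high) in class_ranges.items():
        if low <= entry["os"] <= high:
            return name, entry["description"]
    # Known fingerprint, but its OS id falls outside every class range
    return "unknown", entry["description"]


def summarise(fingerprints):
    """Count how often each (class, OS description) pair occurs."""
    return Counter(classify(fp) for fp in fingerprints)


counts = summarise(["1,15,3,6,44", "1,15,3,6,44", "1,3,6,15,119", "7,7,7", "1,3,6"])
for (os_class, os_name), count in counts.items():
    print(f'"{os_class}","{os_name}","{count}"')
```

One small difference: this sketch stops at the first class whose range contains the OS ID, whereas the Perl loop keeps the last match. That only matters if class ranges overlap.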
If I let the script chew on a decent chunk of today’s logs (from about 7am to 2pm), it spits out the following:
Class | OS | Count
Smartphones/PDAs/Tablets | Samsung Galaxy Tab 3 7.0 SM-T210R | 39
Home Audio/Video Equipment | Slingbox | 49
Dead OSes | OS/2 Warp | 1
Gaming Consoles | Xbox 360 | 6
Windows | Microsoft Windows Vista/7 or Server 2008 | 1694
Printers | Lexmark Printer | 1
Network Boot Agents | Novell Netware Client | 1
Macintosh | Mac OS X Lion | 2783
Misc | Eye-Fi Wireless Memory Card | 1
Printers | Kyocera Printer | 1
unknown | unknown | 40
Smartphones/PDAs/Tablets | LG Nexus 5 & 7 | 1797
Printers | HP Printer | 54
CD-Based OSes | PHLAK | 1
Smartphones/PDAs/Tablets | Nokia | 13
Smartphones/PDAs/Tablets | Motorola Android | 2
Macintosh | Mac OS X | 145
Smartphones/PDAs/Tablets | Generic Android | 2989
Gaming Consoles | Playstation 2 | 1
Linux | Chrome OS | 39
Linux | Ubuntu/Debian 5/Knoppix 6 | 5
Routers and APs | Cisco Wireless Access Point | 69
Linux | Generic Linux | 7
Linux | Ubuntu 11.04 | 21
Windows | Microsoft Windows 8 | 1792
Routers and APs | Apple Airport | 2
Routers and APs | DD-WRT Router | 3
Smartphones/PDAs/Tablets | Sony Ericsson Android | 1
Linux | Debian-based Linux | 51
Smartphones/PDAs/Tablets | Symbian OS | 2
Storage Devices | LaCie NAS | 27
Windows | Microsoft Windows XP | 30
Smartphones/PDAs/Tablets | Android Tablet | 24
Monitoring Devices | Tripplite UPS | 1
Smartphones/PDAs/Tablets | Apple iPod, iPhone or iPad | 12289
Smartphones/PDAs/Tablets | Samsung S5260 Star II | 2
Smartphones/PDAs/Tablets | RIM BlackBerry | 63
I’m not sure I 100% believe the above (OS/2 Warp? Really?) but the bits I disbelieve are largely in the noise.
Chewing on the above stats a bit shows us that the wireless network is roughly 27% laptops and 72% mobile devices (tablets etc). Amongst the laptops, Windows is just about in the lead with 53%, and OS X is close behind at 44% (which is probably higher than a lot of people think). Linux laptops are trailing behind at only 2%.
The mobile device landscape is less evenly split, with 71% iOS and 28% Android.
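For anyone who wants to check the arithmetic, here is a rough Python sketch using the counts from the table above. The grouping is my assumption: Windows, Macintosh and Linux rows are treated as laptops, the whole Smartphones/PDAs/Tablets class as mobile, the Apple row as iOS, and the rows naming Android, Nexus or Galaxy Tab as Android. The post doesn't spell out exactly which rows went into each figure.

```python
# Counts copied from the table; the grouping into buckets is an assumption.
laptops = {
    "Windows": 1694 + 1792 + 30,        # Vista/7, 8, XP
    "OS X": 2783 + 145,                 # Lion, generic OS X
    "Linux": 39 + 5 + 7 + 21 + 51,      # every row in the Linux class
}
ios = 12289                             # Apple iPod, iPhone or iPad
android = 39 + 1797 + 2 + 2989 + 1 + 24 # Galaxy Tab, Nexus, Motorola, Generic, Sony Ericsson, Android Tablet
other_mobile = 13 + 2 + 2 + 63          # Nokia, Symbian, Samsung S5260, BlackBerry

laptop_total = sum(laptops.values())
mobile_total = ios + android + other_mobile
grand_total = laptop_total + mobile_total


def share(part, whole):
    """Percentage of whole made up by part."""
    return 100 * part / whole


print(f"laptops: {share(laptop_total, grand_total):.1f}%")
print(f"mobile:  {share(mobile_total, grand_total):.1f}%")
for name, count in laptops.items():
    print(f"{name} share of laptops: {share(count, laptop_total):.1f}%")
print(f"iOS share of mobile:     {share(ios, mobile_total):.1f}%")
print(f"Android share of mobile: {share(android, mobile_total):.1f}%")
```

Under that grouping, every share lands within a percentage point of the figures quoted in the text.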
I wouldn’t read too much into the above analysis, though, as it represents a comparatively small time slice (and only 23775 of the 37000 devices we see on the wireless each week).
Who knows, perhaps we’ve got 13K Windows phones owned by people who just don’t come onto campus on a Monday…
Update 2015-05-11: I’ve been asked under what license I’ve released the Perl script in this post. I didn’t put any thought into licenses at the time (I was just trying to solve a problem and answer a question I’d been asked!) but I’ll put my hand up: part of the script is based on prior art.
The section which parses the fingerprint database is taken from the process_fingerprints() function in https://github.com/inverse-inc/fingerbank/blob/master/obsolete/tools/fingerprint-find-candidate-matches.pl – a script which seems to be covered by the GPLv2 license.
As I understand it, under the terms of the GPLv2 license, that means that the script above should also be distributed under the GPLv2 license (which I’m OK with) and that under the terms of that license it should be distributed along with a copy of the GPLv2 license… which can be found here: https://www.gnu.org/licenses/gpl-2.0.html