Arpwatch - Upgraded and explained
Friday, July 24. 2020
For many years I've run my own systems at home. It's a given that most of you do much less system running than I do. There are servers and network switches and wireless routers, battery-backed power supplies and so on. Most of that I've written about in this blog earlier.
There are many security aspects any regular Jane lay-person won't spend a second thinking of. One of them is: What hardware runs on my home network? In my thinking that question is on the top 3 list.
The answer to that one is very trivial and can be found easily from your own network. Ask the network! It knows.
ARP - Address Resolution Protocol
This is among the basics of IPv4 networking. A really good explanation can be found at a CCNA (Cisco Certified Network Associate) study site https://study-ccna.com/arp/: a network protocol used to find out the hardware (MAC) address of a device from an IP address. Well, to elaborate on that. Every single piece of network hardware has a unique identifier in it. You may have heard of IMEI in your 3G/4G/5G phone, but as your phone also supports Wi-Fi, it needs to have an identifier for Wi-Fi too. A MAC-address.
Since the Internet works with IP-addresses and not with MAC-addresses, a translation between the two is needed. Hence, ARP.
Why would you want to watch ARPs?
Simple: security.
If you know every single MAC-address in your own network, you'll know which devices are connected to it. If you think of it, there exists a limited set of devices you WANT to have in your network. Most of them are most probably your own, but what if one isn't? Wouldn't it be cool to get an alert quickly every time your network sees a device it has never seen before? In my thinking, yes! That would be really cool.
OUIs
Like in shopping-TV, there is more! A 48-bit MAC-address uniquely identifies the hardware connected to an Ethernet network, but it also identifies the manufacturer. Since IEEE is the standards body for both wired and wireless Ethernet (aka. Wi-Fi), they maintain a database of Organizationally unique identifiers.
An organizationally unique identifier (OUI) is a 24-bit number that uniquely identifies a vendor, manufacturer, or other organization.
OUIs are purchased from the Institute of Electrical and Electronics Engineers (IEEE) Registration Authority by the assignee (IEEE term for the vendor, manufacturer, or other organization).
The list is freely available at http://standards-oui.ieee.org/oui/oui.csv in CSV-format. Running a couple of sample queries for hardware seen in my own network:
$ fgrep "MA-L,544249," oui.csv
MA-L,544249,Sony Corporation,Gotenyama Tec 5-1-2 Tokyo Shinagawa-ku JP 141-0001
$ fgrep "MA-L,3C15C2," oui.csv
MA-L,3C15C2,"Apple, Inc.",1 Infinite Loop Cupertino CA US 95014
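The same lookup can be sketched in Python. This is only an illustration: the header row is assumed to match the IEEE export, the two data rows are the ones from the queries above, and the device halves of the example MAC-addresses are made up.

```python
import csv
import io

# Header row is an assumption about the IEEE export; the two data rows
# are copied from the fgrep queries above.
SAMPLE_CSV = '''Registry,Assignment,Organization Name,Organization Address
MA-L,544249,Sony Corporation,Gotenyama Tec 5-1-2 Tokyo Shinagawa-ku JP 141-0001
MA-L,3C15C2,"Apple, Inc.",1 Infinite Loop Cupertino CA US 95014
'''


def load_oui_table(csv_file):
    """Map each 6-hex-digit assignment to its organization name."""
    reader = csv.DictReader(csv_file)
    return {row["Assignment"].upper(): row["Organization Name"] for row in reader}


def vendor_for_mac(mac, table):
    """Return the vendor for a MAC-address, or None when the OUI is unknown."""
    digits = "".join(ch for ch in mac if ch in "0123456789abcdefABCDEF").upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    # The first 24 bits (6 hex digits) are the OUI.
    return table.get(digits[:6])


table = load_oui_table(io.StringIO(SAMPLE_CSV))
# Device halves below are hypothetical, only the OUI halves are real.
print(vendor_for_mac("54:42:49:aa:bb:cc", table))
print(vendor_for_mac("3c-15-c2-00-11-22", table))
```

To run it against the real list, pass an opened oui.csv to load_oui_table() instead of the embedded sample.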
As we all know, CSV is handy but ugly. My favorite tool Wireshark does pre-process the ugly CSV into something it can chew without gagging. In Wireshark source code there is a tool, make-manuf.py, producing an output file called manuf, containing the information in a more user-friendly way.
Same queries there against Wireshark-processed database:
$ egrep "(54:42:49|3C:15:C2)" manuf
3C:15:C2 Apple Apple, Inc.
54:42:49 Sony Sony Corporation
However, arpwatch doesn't read that file as-is, so a minor tweak is required. I'm running the following:
perl -ne 'next if (!/^([0-9A-F:]+)\s+(\S+)\s+(.+)$/); print "$1\t$3\n"' manuf
... and it will produce a new database usable for arpwatch.
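For those who don't read Perl fluently, here is a rough Python equivalent of the one-liner. It is a sketch using the same regular expression; the sample lines are the two entries shown above plus a made-up comment line to show what gets skipped.

```python
import re

# Same pattern as the Perl one-liner: prefix, short name, long name.
LINE_RE = re.compile(r"^([0-9A-F:]+)\s+(\S+)\s+(.+)$")


def manuf_to_arpwatch(lines):
    """Yield 'prefix<TAB>long name' for every line the pattern matches."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            # Drop the short name (group 2), keep prefix and long name.
            yield f"{match.group(1)}\t{match.group(3)}"


sample = [
    "# comment lines do not match and are skipped",
    "3C:15:C2 Apple Apple, Inc.",
    "54:42:49 Sony Sony Corporation",
]
converted = list(manuf_to_arpwatch(sample))
for row in converted:
    print(row)
```

Feeding it the lines of a real manuf file instead of the sample list gives the same result as the Perl version.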
Trivial piece of information: Apple, Inc. has 789 OUI-blocks in the manuf-file. Given 24-bit addressing they have 789 times 16M addresses available for their devices. That's over 13 billion device MAC-addresses reserved. Nokia has only 248 blocks.
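The arithmetic is easy to verify. A quick check in Python, using the block counts quoted above:

```python
ADDRESSES_PER_OUI = 2 ** 24   # 24 device bits are left in each OUI block

apple_blocks = 789            # count quoted above
nokia_blocks = 248            # count quoted above

apple_addresses = apple_blocks * ADDRESSES_PER_OUI
nokia_addresses = nokia_blocks * ADDRESSES_PER_OUI

print(f"{ADDRESSES_PER_OUI:,}")   # 16,777,216
print(f"{apple_addresses:,}")     # 13,237,223,424
print(f"{nokia_addresses:,}")     # 4,160,749,568
```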
Practical ARP with a Blu-ray -player
Let's take a snapshot of traffic.
This is a typical boot sequence of a Sony Blu-ray player BDP-S370. What happens is:
- (Frames 1 & 2) Device will obtain an IPv4-address with DHCP. The Discover / Offer / Request sequence is missing the middle piece. Hm. Weird.
- (Frame 3) Instantly after learning its own IPv4-address, the device will ARP-request the router (192.168.1.1) MAC-address as the device wants to talk to the Internet.
- (Frames 5 & 6) Device will ping (ICMP echo request) the router to verify its existence and availability.
- (Frames 7-9) Device won't use DHCP-assigned DNS, but will do some querying of its own (discouraged!) and check if a new firmware is available at blu-ray.update.sony.net.
- (Frame 12) Device starts populating its own ARP-cache and will query for a device it saw in the network. Response is not displayed.
- (Frames 13 & 14) Router at 192.168.1.1 needs to populate its ARP-cache and will query for the Blu-ray player's IPv4-address. Device will respond to request.
- Other parts of the capture will contain ARP-requests going back and forth.
Practical ARP with a Linux 5.3
Internet & computers do evolve. What we saw there in a 10 year old device is simply the old way of doing things. This is how ARP works in a modern operating system:
This is a typical boot sequence. I omitted all the weird and unrelated stuff, which makes the first frame #8. What happens in the sequence is:
- (Frames 8-11) Device will obtain an IPv4-address with DHCP. The Discover / Offer / Request / Ack sequence is captured in full.
- (Frames 12-14) Instantly after learning its own IPv4-address, the device will ARP-request the IPv4 address assigned to it. This is a collision-check to confirm nobody else in the same LAN is using the same address.
- (Frame 15) Go for a Gratuitous ARP to make everybody else's life easier in the network.
- Merriam-Webster will define "gratuitous" as: not called for by the circumstances : not necessary, appropriate, or justified : unwarranted
- No matter what, Gratuitous ARP is a good thing!
- (Frame 16) Join IGMPv3 group to enable multicast. This has nothing to do with ARP, though.
The obvious difference is the existence of the Gratuitous ARP "request" the device sent instantly after joining the network.
- A gratuitous ARP request is an Address Resolution Protocol request packet where the source and destination IP are both set to the IP of the machine issuing the packet and the destination MAC is the broadcast address ff:ff:ff:ff:ff:ff. A new device literally is asking questions regarding the network it just joined from itself! However, the question asking is done in a very public manner, everybody in the network will be able to participate.
- Ordinarily, no reply packet will occur. There is no need to respond to an own question into the network.
- In other words: A gratuitous ARP reply is a reply to which no request has been made.
- Doing this seems not-so-smart, but gratuitous ARPs are useful for four reasons:
- They can help detect IP conflicts. Note how Linux does aggressive collision checking on its own too.
- They assist in the updating of other machines' ARP tables. Given Gratuitous ARP, in the network capture, there is nobody doing traditional ARPing for the new device. They already have the information. The crazy public-talking did the trick.
- They inform switches of the MAC address of the machine on a given switch port. My LAN-topology is trivial enough for my switches to know which port is hosting which MAC-addresses, but when eyeballing the network capture, sometimes switches need to ARP for a host to update their MAC-cache.
- Every time an IP interface or link goes up, the driver for that interface will typically send a gratuitous ARP to preload the ARP tables of all other local hosts. This sums up reasons 1-3.
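To make the difference between the collision-check and the gratuitous ARP concrete, here is a minimal Python sketch that builds and classifies the 28-byte ARP payload for Ethernet/IPv4. A few assumptions: the all-zero sender address for the probe follows RFC 5227 and is not read from the capture above, the MAC and IP of the new device are made up, and only the ARP payload is modelled, not the Ethernet header carrying the ff:ff:ff:ff:ff:ff broadcast destination.

```python
import socket
import struct

# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def build_arp(oper, sender_mac, sender_ip, target_mac, target_ip):
    """Pack an Ethernet/IPv4 ARP payload (28 bytes)."""
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, oper,
        bytes.fromhex(sender_mac.replace(":", "")), socket.inet_aton(sender_ip),
        bytes.fromhex(target_mac.replace(":", "")), socket.inet_aton(target_ip),
    )


def classify_arp(payload):
    """Return 'probe', 'gratuitous', 'request', 'reply' or 'other'."""
    fields = struct.unpack(ARP_FORMAT, payload[:28])
    oper = fields[4]
    sender_ip = socket.inet_ntoa(fields[6])
    target_ip = socket.inet_ntoa(fields[8])
    if oper == ARP_REQUEST and sender_ip == "0.0.0.0":
        return "probe"        # collision check: "is anybody using this address?"
    if sender_ip == target_ip:
        return "gratuitous"   # source and destination IP are the same
    if oper == ARP_REQUEST:
        return "request"
    if oper == ARP_REPLY:
        return "reply"
    return "other"


MY_MAC = "54:42:49:00:00:01"   # hypothetical device address
MY_IP = "192.168.1.50"         # hypothetical DHCP-assigned address
NO_MAC = "00:00:00:00:00:00"

probe = build_arp(ARP_REQUEST, MY_MAC, "0.0.0.0", NO_MAC, MY_IP)
gratuitous = build_arp(ARP_REQUEST, MY_MAC, MY_IP, NO_MAC, MY_IP)
ordinary = build_arp(ARP_REQUEST, MY_MAC, MY_IP, NO_MAC, "192.168.1.1")

for packet in (probe, gratuitous, ordinary):
    print(len(packet), classify_arp(packet))
```

The only thing separating the three packets is the pair of IP-address fields, which is why a single comparison is enough to spot a gratuitous ARP.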
How can you watch ARPs in a network?
Simple: run arpwatch on your Linux-router.
Nice people at Lawrence Berkeley National Laboratory (LBNL) in Berkeley, California have written a piece of software and are publishing it (among others) at https://ee.lbl.gov/. This ancient, but maintained, daemon has been packaged into many Linux-distros since the dawn of time (or Linux, pick the one which suits you).
As already established, all devices will ARP on boot. They will ARP also later during normal operations, but that's beside the point. All a device needs to do is to ARP once and its existence is revealed. When the daemon sees a previously unknown device in your network, it will emit a notification in form of an email. Example:
Here, my router running arpwatch saw a Sony Blu-ray player BDP-S370. The ethernet address contains the 24-bit OUI-part of 54:42:49 and the remaining 24 bits of the 48-bit MAC will identify the device. Any new devices are recorded into a time-stamped database and no other notifications will be made for that device.
Having the information logged into a system log and receiving the notification enables me to ignore or investigate the device. Devices I know can be ignored, but anything suspicious I'll always track.
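The "notify once, then stay quiet" idea can be sketched in a few lines of Python. This is a simplified illustration keyed on the MAC-address only. It is not how arpwatch stores its data, and the class name, the device half of the MAC and the IP-address are all hypothetical.

```python
import time


def split_mac(mac):
    """Split a colon-separated MAC into its OUI half and its device half."""
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(octets[:3]), ":".join(octets[3:])


class StationLog:
    """Hypothetical time-stamped record of every MAC-address seen so far."""

    def __init__(self):
        self.first_seen = {}

    def observe(self, mac, ip, now=None):
        """Record a sighting; return True only the first time a MAC shows up."""
        key = mac.lower()
        if key in self.first_seen:
            return False      # already known, no new notification
        timestamp = time.time() if now is None else now
        self.first_seen[key] = (ip, timestamp)
        return True           # new station, this is where an email would go out


log = StationLog()
# Only the OUI half 54:42:49 comes from the text, the rest is made up.
print(split_mac("54:42:49:AA:BB:CC"))
print(log.observe("54:42:49:AA:BB:CC", "192.168.1.50"))
print(log.observe("54:42:49:aa:bb:cc", "192.168.1.50"))
```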
IPv6 and ARP
Waitaminute! IPv6 doesn't do ARP, it does Neighbor Discovery Protocol (NDP).
True. Any practical implementation does use dual-stack IPv4 and IPv6, making ARP still a viable option for tracking MAC-addresses. In case you use a pure-IPv6 network, then go for addrwatch https://github.com/fln/addrwatch. It supports both ARP and NDP in the same tool. There are some shortcomings in the reporting side, but maybe I should take some time to tinker with this and create a patch and a pull-request to the author.
Avoiding ARP completely?
Entirely possible. All a stealth device needs to do is to piggy-back an existing device's MAC-address on the same wire (or wireless) and impersonate that device to remain hidden-in-plain-sight. ARP-watching is not foolproof.
Fedora updated arpwatch 3.1 RPM
All these years passed and nobody at Fedora / Red Hat did anything to arpwatch.
Three big problems:
- No proper support for /etc/sysconfig/ in systemd-service.
- Completely outdated list of Organizationally Unique Identifiers (OUIs) used as the Ethernet manufacturers list, showing anything less than 10 years old as unknown.
- Packaged version was 2.1 from year 2006. Latest is 3.1 from April 2020.
Here you go. Now there is an updated version available. Bug 1857980 - Update arpwatch into latest upstream contains all the new changes, fixes and the latest upstream version.
Given systemd, for running arpwatch my accumulated command-line seems to be:
/usr/sbin/arpwatch -F -w 'root (Arpwatch)' -Z -i eth0
That will target only my own LAN, both wired and wireless.
Finally
Happy ARPing!
OpenSSH 8.3 client fails with: load pubkey invalid format
Saturday, July 11. 2020
Update 13th Sep 2020:
There is a follow-up article with key format conversion information.
Ever since updating into OpenSSH 8.3, I started getting this on a connection:
$ ssh my-great-linux-server
load pubkey "/home/me/.ssh/id_ecdsa-my-great-linux-server": invalid format
Whaaaat!
Double what on the fact that the connection works. There is no change in the connection besides the warning.
8.3 release notes won't mention anything about that (OpenSSH 8.3 released (and ssh-rsa deprecation notice)). My key-pairs have been elliptic for years and this hasn't bothered me. What's going on!?
Adding verbosity to output with a -vvv reveals absolutely nothing:
debug1: Connecting to my-great-linux-server [192.168.244.1] port 22.
debug1: Connection established.
load pubkey "/home/me/.ssh/id_ecdsa-ecdsa-my-great-linux-server": invalid format
debug1: identity file /home/me/.ssh/id_ecdsa-ecdsa-my-great-linux-server type -1
debug1: identity file /home/me/.ssh/id_ecdsa-ecdsa-my-great-linux-server-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.3
Poking around, I found this article from Arch Linux forums: [SOLVED] openssh load pubkey "mykeyfilepath": invalid format
Apparently OpenSSH-client now requires both the private AND public keys to be available for connecting. Mathematically the public key isn't a factor. Why would it be needed? I cannot understand the decision to throw a warning about assumed missing key. I do have the key, but as I won't need it in my client, I don't have it available.
Simply touching an empty file with correct name won't clear the warning. The actual public key of the pair needs to be available to make the ridiculous message go away.
A little bit of debugging points to the problem in ssh.c:
check_load(sshkey_load_public(cp, &public, NULL),
filename, "pubkey");
Link: https://github.com/openssh/openssh-portable/blob/V_8_3_P1/ssh.c#L2207
Tracking the change:
$ git checkout V_8_3_P1
$ git log -L 2207,2207:ssh.c
.. points to a commit 5467fbcb09528ecdcb914f4f2452216c24796790 (Github link), which was made exactly two years ago on July 11th 2018 to introduce this checking of the loaded public key and emitting a hugely misleading error message on failure.
To repeat:
Connecting to a server requires only private key. The public key is used only at the server end and is not mathematically required to establish encrypted connection from a client.
So, this change is nothing new. Still the actual reason for introducing the check_load()-call with a most likely non-existing public key is a mystery. None of the changes made in the mentioned commit or before it explain this addition, nor are there any significant changes made in the actual public key loading. A check is added, nothing more.
Fast forward two years to present day. Now that 8.3 is actually used by a LOT of people, the problem was fixed less than a month ago. Commit c514f3c0522855b4d548286eaa113e209051a6d2 (Github link) fixes the problem by simulating a POSIX ENOENT when the public key was not found from expected locations. More details about that error are in the errno(3) man page.
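The idea behind the fix can be illustrated with a small Python sketch. This is not OpenSSH code and not a translation of the commit. It only shows the principle described above: a missing public key file (ENOENT) is silently ignored, while any other failure still produces the warning. The function names are borrowed from ssh.c for readability and the key file name is hypothetical.

```python
import errno
import os
import tempfile


def load_public_key(private_key_path):
    """Read '<private key>.pub'; a missing file raises an ENOENT OSError."""
    with open(private_key_path + ".pub", "r", encoding="ascii") as handle:
        return handle.read().strip()


def check_load(private_key_path):
    """Return a warning string, or None when there is nothing worth reporting."""
    try:
        key = load_public_key(private_key_path)
    except OSError as err:
        if err.errno == errno.ENOENT:
            return None   # no public key around: not an error for a client
        return f'load pubkey "{private_key_path}": {err.strerror}'
    if not key:
        # An empty file exists but holds no key, as with the touch experiment.
        return f'load pubkey "{private_key_path}": invalid format'
    return None


with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "id_ecdsa-example")   # hypothetical key name
    print(check_load(key_path))                        # no .pub file present
    open(key_path + ".pub", "w").close()               # same as touch
    print(check_load(key_path))                        # empty .pub file present
```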
Problem solved. All we need to do is wait for this change to propagate to the new clients. Nobody knows how long that will take as I just updated this.