Fedora dhclient broken
Monday, December 10. 2018
I'm not a huge fan of NetworkManager. Since I am a fan of many RedHat products, it creates a nice conflict. They develop it, so it is pre-installed in all of RedHat's Linuxes. Luckily its very easy to yank off and replace with something that actually works and is suitable for server computing.
Also, a third player exists in the Linux networking setup -scene. systemd-networkd (https://www.freedesktop.org/software/systemd/man/systemd-networkd.html) does exactly the same as NetworkManager or classic network-scripts would do. It is the newcomer, but since everybody's box already has systemd, using it to run your networking makes sense to some.
I don't know exactly when, but at some point Fedora simply abandoned all the classic ways of doing networking. I know for a fact, that in Fedora 26 ISC's dhclient worked ok, but looks like around the time of 26 release, they simply broke it. Now we're at 29 and it has the same code as 28 did. Since almost nobody uses classic networking, this bug went unnoticed for a while. There is a bug in RedHat's Bugzilla which looks similar to what I'm experiencing: Bug 1314203 - dhclient establishes a lease on the explicitly specified interface, but then endlessly retries old leases on other interfaces, but looks like it didn't get any attention. To make this bug even more difficult to spot, you need to have multiple network interfaces in your machine for this problem to even exist. Most people don't, and looks like those who do, aren't running dhclient
.
The issue, in detail, is following:
When run ifup eno1
, no IP-address will be issued by my ISP for that interface.
When running dhclient in diagnostics mode, with following command:
/sbin/dhclient -1 -d -pf /run/dhclient-eno1.pid -H myPCame eno1
output will be:
Internet Systems Consortium DHCP Client 4.3.6
Copyright 2004-2017 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/
Listening on LPF/enp3s0f1/90:e2:ba:00:00:01
Sending on LPF/enp3s0f1/90:e2:ba:00:00:01
Listening on LPF/eno1/60:a4:4c:00:00:01
Sending on LPF/eno1/60:a4:4c:00:00:01
Listening on LPF/enp3s0f0/90:e2:ba:00:00:02
Sending on LPF/enp3s0f0/90:e2:ba:00:00:02
Sending on Socket/fallback
DHCPDISCOVER on enp3s0f1 to 255.255.255.255 port 67 interval 3 (xid=0x90249a1f)
DHCPREQUEST on eno1 to 255.255.255.255 port 67 (xid=0xe612e570)
DHCPDISCOVER on enp3s0f0 to 255.255.255.255 port 67 interval 5 (xid=0xb568cb15)
DHCPACK from 62.248.219.2 (xid=0xe612e570)
DHCPREQUEST on enp3s0f0 to 255.255.255.255 port 67 (xid=0xb568cb15)
DHCPOFFER from 84.249.192.3
DHCPACK from 84.249.192.3 (xid=0xb568cb15)
DHCPDISCOVER on enp3s0f1 to 255.255.255.255 port 67 interval
Notice how DHCP-client was requested on network interface eno1
, but it is actually run for all there are. For me, this is a real problem, so I spent a while on it. Bug report is at Bug 1657848 - dhclient ignores given interface and it contains my patch:
--- ../dhcp-4.3.6/common/discover.c 2018-12-10 16:14:50.983316937 +0200
+++ common/discover.c 2018-12-10 15:20:12.825557954 +0200
@@ -587,7 +587,7 @@
state == DISCOVER_REQUESTED))
ir = 0;
else if (state == DISCOVER_UNCONFIGURED)
- ir = INTERFACE_REQUESTED | INTERFACE_AUTOMATIC;
+ ir = INTERFACE_AUTOMATIC;
else {
ir = INTERFACE_REQUESTED;
if (state == DISCOVER_RELAY && local_family == AF_INET) {
My fix is to break the functionality of dhclient
. If you don't specify an interface for dhclient
to run on, it will run on all. To me (or my network-scripts) that won't make any sense, so I'm choosing to run only on specified interfaces, or interface in my case. This patch when applied and compiled to a binary will fully fix the problem.
Windows 10 with Microsoft's OpenSSH
Sunday, December 2. 2018
Without thinking it too much, I just went for a SSH-connection on a Windows 10 box with the way it's typically done:
ssh -i .ssh/id_nistp521 user@linux-box
When the result was not a private key passphrase question, but a: Bad owner or permissions on .ssh/config
, I started thinking about it. What! What? Why! It worked earlier, why won't it anymore?
For the record:
I am an avid Cygwin user. So, I get a very very Linuxish experience also on my Windows-boxes. So, don't be confused with the rest of the story. This is on a Windows 10, even if it doesn't appear to be so.
Back to the failure. I started poking around the permissions. SSH-clients are picky on private key permissions (also config file), so I thought that something was going on. Doing a ls -l
:
-rw------- 1 jari None 547 Nov 12 17:25 config
-rw------- 1 jari None 444 Jul 15 2017 id_nistp521
Nope. Nothing wrong with that. Still, something HAD to be wrong, as this wasn't acceptable for the client. Doing a which ssh
gave it away. I was expecting to see /usr/bin/ssh
, not /cygdrive/c/Windows/System32/OpenSSH/ssh
!!
So, who put an OpenSSH client to my Windows-directory? Why? When? What kind of sorcery is that? Environment variables:
Yes, there is a PATH-entry for OpenSSH before Cygwin. In the directory, there is a full set of OpenSSH-tools:
Version is 7.6:
OpenSSH_for_Windows_7.6p1, LibreSSL 2.6.4
The expected version in Cygwin is 7.9:
OpenSSH_7.9p1, OpenSSL 1.0.2p 14 Aug 2018
So, to fix this, I yanked the PATH-entry away. Now my SSH-connections worked as expected.
Little bit of googling around landed me on an article OpenSSH in Windows 10! in MSDN blogs. This is on January 2018 and apparently this stuff landed on my Windows in April update (build 1803). Also in article What’s new for the Command Line in Windows 10 version 1803, I found out that also tar
and curl
were added.
Ultimately this is a good thing. Now that I know this stuff is there as default, there is no need to go load a PuTTY or something for a random SSH-thing you just want to get fixed on a remote box.