Adding a source of randomness to a Linux

Monday, June 3. 2019

Randomness in computers

You don't need to know much about computers to understand, that computers cannot do random things. Yes, all programming languages and libraries do offer you a rand()-function to emulate randomness. However, the resulting output will follow the carefully crafted programming implementing this "randomness". The most trivial pseudo-random functions will merely provide a sequence of numbers appearing random, but this sequence can be reset to start from beginning making the "randomness" predicatable. That's not really very random, huh!

Improved randomness in computers

To be fair, there does exist improved pseudo-random algorithms which take their initial seed-values from something volatile (time is one such volatile parameter) making the quality of randomness better. Still, even high-quality pseudo-random algorithm is just complex sequence of operations, which will produce duplicate results on same input values. Sometimes its just very tricky to craft a situation where all of the input values would match.

If somebody is capable of doing that, your randomness changes into predictability. Read the story of Dual_EC_DRBG on Wikipedia https://en.wikipedia.org/wiki/Dual_EC_DRBG. When you're generating your precious private keys, you don't want anybody (including NSA) to be able to guess what you have there.

Random source in Linux

Since a proper random source is something every single user, developer and sysadmin would love to have, the problem has been approached on your Linux by authors of the operating system. An excellent description can be found from Wikipedia article https://en.wikipedia.org/wiki//dev/random#Linux. Briefly put, your Linux will collect environmental entropy from number of sources (including human interaction with keyboard and mouse) to a pool, which can then be used to produce naturally random numbers. It actually works very well, the quality of randomness is top-notch.

Obvious problem with this approach is, that you cannot pull too many random numbers out of this source without exhausting it. The fix is to keep typing something while moving your mouse (not a joke!) to generate entropy for the random source. This will eventually help fill the entropy pool and /dev/random will spit couple bytes more.

Those users who have exhausted their /dev/random on an idling rack server without a console keyboard, mouse and video know that it takes painfully long for the entropy pool to fill. A busy server doing something will be able to fill the pool much faster.

A real random source

If you need a real proper random source, which works without human intervention and can provide really good randomness as a stream, there are possibilities on hardware. I know of two good ones, Simtec Electronics Entropy Key and ubld.it TrueRNG Hardware Random Number Generator.

Note: if you consider getting one, get the TrueRNG version 3 (http://ubld.it/truerng_v3). Its just that I have the 1st gen version at hand and haven't found the reason to upgrade.

My TrueRNG looks like this:

It is essentially an USB-stick.

Linux lsusb info essentially identifies it as a Microchip (vendor ID 0x04d8) manufactured USB-device (with ID 0xf5fe) providing RS-232 communications:

Bus 002 Device 009: ID 04d8:f5fe Microchip Technology, Inc. Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 2 Communications bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 8 idVendor 0x04d8 Microchip Technology, Inc. idProduct 0xf5fe bcdDevice 1.00 iManufacturer 1 ubld.it iProduct 2 TrueRNG iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0043 bNumInterfaces 2 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 100mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Union: bMasterInterface 0 bSlaveInterface 1 CDC Call Management: bmCapabilities 0x00 bDataInterface 1 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x000a 1x 10 bytes bInterval 1 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 0 Device Status: 0x0001 Self Powered

And by looking at /dev/, there is a /dev/ttyACM0. That's how udevd will populate a CDC-device when it sees one.

How is this a "true" random source?

Oh, that's easy. The device will produce a random 0 or 1 bit constantly when its on. Or to be precise, there is an internal algorithm producing those based on a constant flow of electrons on a transistor PN-surface. The exact phenomenon is called avalance effect or avalance breakdown. For those who can do electronics, there is a good explanation about this in Difference Between Avalanche Breakdown and Zener Breakdown (I borrowed the visualisation pic from above link).

To (over)simplify that, in a carefully constructed electronic circuit, inside a transistor an electron may or may not be emitted on the other side of a semiconducting surface. The occurrence is as random as it can be in nature. Other circuitry will detect this random flow of electrons (or lack of flow) to produce ones and zeros.

What makes this a really good for randomness, as it is well established that this avalance of electrons will happen. Also, it will happen often enough to produce a stream of events. It's just that we don't know exactly WHEN the avalance of electrons will happen. If you time-slice this to slots, a slot can be empty (no avalance) or full (electrons avalanching).

Linux tweaking:

udev

Anybody having multiple devices in their Linuxes knows, that you really cannot control which device name some specific device will get on reboot. To overcome that, udevd can be instructed to do things when it sees a device. My rules for TrueRNG include setting it to highest possible speed and creating a symlink-device so, that I can point to a known source of random. Also, I'm loosening access to that source of randomness to any users belonging to dialout-group. If I wouldn't do that, only root would have access to this fine random-source.

My /etc/udev/rules.d/99-TrueRNG.rules contains:

SUBSYSTEM=="tty", ATTRS{product}=="TrueRNG", SYMLINK+="TrueRNG", RUN+="/bin/stty raw -echo -ixoff -F /dev/%k speed 3000000" ATTRS{idVendor}=="04d8", ATTRS{idProduct}=="f5fe", ENV{ID_MM_DEVICE_IGNORE}="1", GROUP="dialout", MODE="0664"

If you want to take your random-device for a spin, you can do something like:

dd if=/dev/TrueRNG of=random.bytes bs=64 count=1024

That would create a file of 64 KiB containing very very random bytes. In theory you can just cp data out of the character device, but since it has an infite flow, you'll need to cut it at one point.

rngd

Remember the part I said earlier about Linux using your keypresses and mouse movements as entropy source for randomness. Even with the USB-stick popped into a PC, that still remains the case. What needs to be done next is to offer a helping hand to the Linux kernel and make sure the entropy pool is always full.

My Fedora has package called rng-tools. It is packaged from Mr. Horman's https://github.com/nhorman/rng-tools. What's in there are the tools for pumping those precious truly random bits out of the USB-source to Linux kernel's entropy pool. As default, rngd will use /dev/hwrng as the source for randomness. Some Linuxes don't have that device at all, some Linuxes point that into CPU's random source. What's guaranteed, it will not point to your USB-stick! We need to change that.

Btw. you might be horrified by the fact, that something is fidding with your randomness. The exact bits transferred from USB to entropy pool won't be the actual bits getting out of /dev/random. Your keypresses and many other events are still a factor. Its still a good idea to not run randomness-monitoring malware or spyware in your Linux.

Systemd works so, that I did create a copy of /usr/lib/systemd/system/rngd.service into /etc/systemd/system/rngd.service. The contents of the copy in /etc/systemd/system/ can be freely modified and it has priority over the /usr/lib/systemd/system/ one. The only change I made was to have the ExecStart-line say as:

ExecStart=/sbin/rngd -f --rng-device=/dev/TrueRNG --fill-watermark=4000

When rngd-service would be started, it will use the USB-stick as source and make sure, there are at least 4000 bits of entropy in the pool.

Making sure rngd setup works

At any given point, you can query how many bits are available in the Linux entropy-pool:

cat /proc/sys/kernel/random/entropy_avail

Since my setup is working correctly, it will display a number greater than 4000 and smaller than 4096. The upper limit comes from /proc/sys/kernel/random/poolsize, which is a hard-coded number from Linux kernel source.

Hint: If you do the stupid thing like I did and set the /proc/sys/kernel/random/write_wakeup_threshold (using --fill-watermark) into 4096 (or above), your rngd will keep hogging CPU like there is no tomorrow. It is impossible for the pool to contain maximum number of bits at any given time. Give your system a break and set the threshold bit lower than max.

Finally

It's always nice to know for a fact, that random numbers are random. This fact can be verified and has been verified by number of other people.

Enjoy!

by Jari Turkia in Hardware, Linux, Security at 11:24 | Comments (0) | Share in LinkedIn

Bacula 9 vchanger: Tape count fix

Sunday, June 2. 2019

One of the first ever blog posts I've written here is about Bacula, the open-source backup software (more at https://www.bacula.org/). I published Fedora 17 binaries for the virtual tape changer running for Bacula 7. The post from year 2013 is here.

Running Bacula in Fedora Linux isn't much of a trick, ready-made binaries are available by the distro and configuring one is covered in Bacula's documentation. Then again, running Bacula with a NAS (see Wikipedia for Network-attached storage) as storage backend is where things get very very tricky. I've written about my Qnap NAS-device's support earlier, see the post about that.

Since its inception, Bacula is baked to require a tape drive (or drives) and a set of tapes (a single tape is supported also). Given modern day computing environment, actual physical tapes aren't used that much. Even I stopped using DLT (Wikipedia Digital Linear Tape) or LTO (Wikipedia Linear Tape-Open) tapes years ago and went for an easy, fast and inexpensive solution for storing my backups on a NAS. So, I really do need to have a concept of a "tape" somehow. That's where the virtual Bacula tape changer steps in. It is a piece of software attaching to Bacula autochanger API emulating a virtual "tape" drive and set of tapes with all the necessary operations, but doing all that on a filesystem. More details about autochangers can be found from Bacula Autochanger Resource page.

The obvious idea is to create a set of files to act as a set of "tapes". For system administration purposes, the tapes are just files in a subdirectory. Smart thing to do is to make that particular subdirectory located on a NAS to store the backups where there is plenty of external capacity outside your system. In my case, I'll access them over an iSCSI-mounted filesystem. More details about iSCSI on a Linux can be found from RedHat Enterprise Linux 7 manual pages at https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/online-storage-management#osm-target-setup. For those planning something similar: I chose NOT to go with a NFS. A NFS-connection get stuck very easily and you will be rebooting your stuff a lot. See How-To: Release Stuck NFS Mounts without a Reboot @ Linux Journal for more about that.

When I went to Fedora 29, my Bacula setup got an automatic bump to version 9. My previous setup was for Bacula version 7 and quite soon I realized that I needed to alter my vchanger somehow to get it to support version 9. Bacula-guys did make changes to autochanger-API in their version-bump process. Luckily vchanger author was ahead of me and I got the code from http://sourceforge.net/projects/vchanger/. Soon realized that when I did a simple command of vchanger /etc/qnap.conf LIST, it displayed an extra tape which didn't exist in reality. I was puzzled. Old setup displayed the tape count correctly.

I did some C++ debugging and found out an obvious bug in the code. In src/diskchanger.cpp, DiskChanger-class InitializeVirtSlots()-method calculates the last changer slot numer incorrectly. It is guaranteed to be one-off. After fixing this, I contacted the vchanger author Mr. J. Fisher about my findings, and he agreed, there was a bug in his code.

Unfortunately, couple of months have passed and there is no 1.0.3 release yet, so the fix isn't in the SourceForge git-repo yet. For Fedora-users, my RPMs are available at http://opensource.hqcodeshop.com/Bacula/vchanger for B9/. Go get them there! I've been using those since last December, so I think my fix is correct and doesn't introduce any issues.

by Jari Turkia in Linux, Software at 13:01 | Comments (4) | Share in LinkedIn

Fedora dhclient broken

Monday, December 10. 2018

I'm not a huge fan of NetworkManager. Since I am a fan of many RedHat products, it creates a nice conflict. They develop it, so it is pre-installed in all of RedHat's Linuxes. Luckily its very easy to yank off and replace with something that actually works and is suitable for server computing.

Also, a third player exists in the Linux networking setup -scene. systemd-networkd (https://www.freedesktop.org/software/systemd/man/systemd-networkd.html) does exactly the same as NetworkManager or classic network-scripts would do. It is the newcomer, but since everybody's box already has systemd, using it to run your networking makes sense to some.

I don't know exactly when, but at some point Fedora simply abandoned all the classic ways of doing networking. I know for a fact, that in Fedora 26 ISC's dhclient worked ok, but looks like around the time of 26 release, they simply broke it. Now we're at 29 and it has the same code as 28 did. Since almost nobody uses classic networking, this bug went unnoticed for a while. There is a bug in RedHat's Bugzilla which looks similar to what I'm experiencing: Bug 1314203 - dhclient establishes a lease on the explicitly specified interface, but then endlessly retries old leases on other interfaces, but looks like it didn't get any attention. To make this bug even more difficult to spot, you need to have multiple network interfaces in your machine for this problem to even exist. Most people don't, and looks like those who do, aren't running dhclient.

The issue, in detail, is following:
When run ifup eno1, no IP-address will be issued by my ISP for that interface.

When running dhclient in diagnostics mode, with following command:
/sbin/dhclient -1 -d -pf /run/dhclient-eno1.pid -H myPCame eno1
output will be:

Internet Systems Consortium DHCP Client 4.3.6 Copyright 2004-2017 Internet Systems Consortium. All rights reserved. For info, please visit https://www.isc.org/software/dhcp/ Listening on LPF/enp3s0f1/90:e2:ba:00:00:01 Sending on LPF/enp3s0f1/90:e2:ba:00:00:01 Listening on LPF/eno1/60:a4:4c:00:00:01 Sending on LPF/eno1/60:a4:4c:00:00:01 Listening on LPF/enp3s0f0/90:e2:ba:00:00:02 Sending on LPF/enp3s0f0/90:e2:ba:00:00:02 Sending on Socket/fallback DHCPDISCOVER on enp3s0f1 to 255.255.255.255 port 67 interval 3 (xid=0x90249a1f) DHCPREQUEST on eno1 to 255.255.255.255 port 67 (xid=0xe612e570) DHCPDISCOVER on enp3s0f0 to 255.255.255.255 port 67 interval 5 (xid=0xb568cb15) DHCPACK from 62.248.219.2 (xid=0xe612e570) DHCPREQUEST on enp3s0f0 to 255.255.255.255 port 67 (xid=0xb568cb15) DHCPOFFER from 84.249.192.3 DHCPACK from 84.249.192.3 (xid=0xb568cb15) DHCPDISCOVER on enp3s0f1 to 255.255.255.255 port 67 interval

Notice how DHCP-client was requested on network interface eno1, but it is actually run for all there are. For me, this is a real problem, so I spent a while on it. Bug report is at Bug 1657848 - dhclient ignores given interface and it contains my patch:

--- ../dhcp-4.3.6/common/discover.c 2018-12-10 16:14:50.983316937 +0200 +++ common/discover.c 2018-12-10 15:20:12.825557954 +0200 @@ -587,7 +587,7 @@ state == DISCOVER_REQUESTED)) ir = 0; else if (state == DISCOVER_UNCONFIGURED) - ir = INTERFACE_REQUESTED | INTERFACE_AUTOMATIC; + ir = INTERFACE_AUTOMATIC; else { ir = INTERFACE_REQUESTED; if (state == DISCOVER_RELAY && local_family == AF_INET) {

My fix is to break the functionality of dhclient. If you don't specify an interface for dhclient to run on, it will run on all. To me (or my network-scripts) that won't make any sense, so I'm choosing to run only on specified interfaces, or interface in my case. This patch when applied and compiled to a binary will fully fix the problem.

by Jari Turkia in Linux at 17:13 | Comments (0) | Share in LinkedIn

Arch Linux failing to start network interface, part 2

Saturday, March 17. 2018

I genuinely love my Arch Linux. It is a constant source of mischief. In a positive sense. There is always something changing making the entire setup explode. The joy I get, is when I need to get the pieces back together.

In the Wikipedia article of Arch Linux, there is a phrase:

... and expects the user to be willing to make some effort to understand the system's operation

The is precisely what I use my Arch Linux for. I want the practical experience and understanding on the system. And given it's rolling release approaches, it explodes plenty.

Back in 2014, Arch Linux impemented Consistent Network Device Naming. At that time the regular network interface names changed. For example my eth0 become ens3. My transition was not smooth. See my blog post about that.

Now it happened again! Whaat?

Symptoms:

Failure to access the Linux-box via SSH
Boot taking very long time
Error message about service sys-subsystem-net-devices-ens3.device failing on startup

Failure:

Like previous time, the fix is about DHCP-client failing.

You vanilla query for DHCP-client status:

systemctl status dhcpcd@*

... resulted as nothingness. A more specific query for the failing interface:

systemctl status dhcpcd@ens3

... results:

* dhcpcd@ens3.service - dhcpcd on ens3 Loaded: loaded (/usr/lib/systemd/system/dhcpcd@.service; enabled; vendor pre> Active: inactive (dead)

Yup. DHCP failure. Like previously, running ip addr show revealed the network interface name change:

2: enp0s3: mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 52:54:52:54:52:54 brd ff:ff:ff:ff:ff:ff

There is no more ens3, it is enp0s3 now. Ok.

Fix:

A simple disable for the non-existent interface's DHCP, and enable for the new one:

systemctl disable dhcpcd@ens3 systemctl enable dhcpcd@enp0s3

To test that, I rebooted the box. Yup. Working again!

Optional fix 2, for the syslog:

Debugging this wasn't as easy as I expected. dmesg had nothing on DHCP-clients and there was no kernel messages log at all! Whoa! Who ate that? I know, that default installation of Arch does not have syslog. I did have it running (I think) and now it was gone. Weird.

Documentation is at https://wiki.archlinux.org/index.php/Syslog-ng, but I simply did a:

pacman -S syslog-ng
systemctl enable syslog-ng@default systemctl start syslog-ng@default

... and a 2nd reboot to confim, that the syslog existed and contained boot information. Done again!

What:

The subject of Consistent Network Device Naming is described in more detail here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/networking_guide/ch-consistent_network_device_naming

Apparently, there are five different approaches on how to actually implement the CNDN. And given the old ens-device, which is according to PCI hotplug slot (enS for slot) index number (Scheme 2), the new naming scheme was chosen to be physical location (enP for physical) of the connector (Scheme 3).

The information of when/what/why the naming scheme change was made eludes me. I tried searching Arch discussion forums at https://bbs.archlinux.org/, but nothing there that I could find. But anyway, I got the pieces back together. Again!

Update 30th March 2018:
Yup. The interface naming rolled back. Now ens3 is the interface used again. Darnnation this naming flapping!

by Jari Turkia in Linux at 19:02 | Comment (1) | Share in LinkedIn

Arch Linux failing to update man-db

Friday, January 26. 2018

This week seems to be especially hard on my Linuxes. Doing a regular pacman -Syu started spitting crap on me:

error: failed to commit transaction (conflicting files) man-db: /usr/bin/accessdb exists in filesystem man-db: /usr/bin/apropos exists in filesystem man-db: /usr/bin/catman exists in filesystem man-db: /usr/bin/convert-mans exists in filesystem man-db: /usr/bin/lexgrog exists in filesystem man-db: /usr/bin/man exists in filesystem man-db: /usr/bin/mandb exists in filesystem man-db: /usr/bin/manpath exists in filesystem man-db: /usr/bin/whatis exists in filesystem man-db: /usr/lib/man-db/globbing exists in filesystem man-db: /usr/lib/man-db/libman-2.7.6.1.so exists in filesystem man-db: /usr/lib/man-db/libman.so exists in filesystem man-db: /usr/lib/man-db/libmandb-2.7.6.1.so exists in filesystem ...

A simple query for what's wrong:

# pacman -Qkk man-db man-db: 363 total files, 0 altered files

So, nothing wrong with it. It just loves busting my balls!

Using a bigger hammer:

# pacman -S --force man-db ... :: Processing package changes... (1/1) upgrading man-db [######################] 100% New optional dependencies for man-db gzip [installed] :: Running post-transaction hooks... (1/2) Creating temporary files... (2/2) Arming ConditionNeedsUpdate...

Now my pacman -Syu works. Weird case, that.

by Jari Turkia in Linux at 20:06 | Comments (0) | Share in LinkedIn

Open Management Infrastructure in Azure gone wild

Thursday, January 25. 2018

I opened my mail, and I had 730 new e-mails there! Whaat!

One of my Azure boxes has (for reason unknown to me), following crontab-entry on root's crontab:

* * * * * [ $ ! -f /etc/opt/omi/creds/omi.keytab $ -o $ /etc/krb5.keytab -nt /etc/opt/omi/creds/omi.keytab $ ] && /opt/omi/bin/support/ktstrip /etc/krb5.keytab /etc/opt/omi/creds/omi.keytab

/opt/omi/bin/support/ktstrip keeps failing, because /etc/krb5.keytab is missing. And that command is run every single minute on my machine. So, every single minute I get a new information about the failure. Nice!

The sequence of events is totally unclear to me. I haven't touched anything, but this morning an influx of e-mails stated pouring in.

OMI, or Open Management Infrastructure is something Linux-images in Azure have, so it shouldn't be anything dangerous.

The obvious fix as to remove that stupid line.

by Jari Turkia in Linux at 18:58 | Comments (2) | Share in LinkedIn

HOWTO: Configuring a router on a Mini-PC with CentOS

Thursday, January 18. 2018

Over half an year later, I realized, that I never published my article about operating system and software setup of my Mini-PC router. This is a follow-up post about the Qotom-hardware I wrote earlier. So, its probably about time to do that!

To get the ball rolling on new CentOS installation, a good start is to download it, Rufus it into an USB-stick and install the minimal setup into router-PC. The CentOS installation is so well documented and trivial process, I won't go into any details of it. Read something like Installing Red Hat Enterprise Linux 7.4 on all architectures for details of that.

Goal

Every project needs a goal. In any kind of engineering there is a specification and criteria, that the goal has been met.

The goal of this project is to create a Linux-server capable of securing a local network from the Internet and allow traffic to pass from the LAN to the wild-wild-net.

Spec:

There is a working CentOS Linux running on the MiniPC
ISP's cable modem is configured as bridge, no double NATting done
- MiniPC gets a public IP-address from ISP
- MiniPC can be accessed from the Net via the IP-address
Configurations persist a reboot on the MiniPC
MiniPC issues dynamic IP-addresses to LAN-clients
MiniPC acts as a caching nameserver to LAN-clients
- Any requests from the Net are not served
Wireless access point is configured not do do any routing, aka. it is in access point mode
The setup is secure with attack surface minimized
LAN IP-address range is 192.168.1.0/24

Definition of done:

Internet works!
- MiniPC can connect to net
- MiniPC can be connected from net and LAN via SSH
- Wired clients can connect to net via Ethernet cable without any manual configuration
- Wireless clients can connec to the net via Wi-Fi without any manual configuration

Step 1: Packages

After minimal installation, the set of tools and packages required includes:
net-tools bind-utils screen tcpdump policycoreutils-python setools

net-tools: mostly for netstat, using route or ifconfig is deprecated
bind-utils: for dig and nslookup
screen: for full-screen window manager
tcpdump: taking a look into Ethernet and TCP/IP-packages, when something goes wrong, getting detailed view is very important
policycoreutils-python setools: for managing SELinux

Step 2: Remove NetworkManager

Packages to install: -none needed-

Why a server would have GNOME NetworkManager installed on a server is beyond me. I simply cannot comprehend what CentOS-people are thinking, when they as default treat my server as a laptop. But the main thing is, that this piece of shit needs to go! The quicker, the better!

DANGER!

When you actually run the yum-command to remove NetworkManager, your system will lose all network connectivity. So, please, do run this at a console, not via SSH-connection.

DANGER!

Run command as root on console:

yum erase NetworkManager

Now your system's networking is royally messed up.

Step 3: Setup NICs

Packages to install: -none needed-

Check that NetworkManager created and left ifcfg-files into /etc/sysconfig/network-scripts/. If the appropriate ifcfg-files (one for each interface) are gone, you need to start learning how to write one fast. A good starting point on that would be RedHat Enterprise Linux 7 product documentation, Networking Guide, section 2.2 Editing Network Configuration Files.

LAN interface

Out of the two Ethernet-interfaces, 50/50 coin-flip ended as enp3s0 LAN and enp1s0 WAN. For any practical purposes, it really doesn't matter which one is which, but I'm describing here my setup. If you're using some other hardware, your interface names won't match those.

For any sensible use of your LAN-side, this interface should be connected to a network switch, so that your local network can be shared by your PC, Playstation, TV, Wi-Fi access point or whatever you have there running. Of course you can run it with only one host connected directly to the router.

This is critical: Your LAN-interface MUST have a static IP-address for it. It really cannot act as LAN-side of a router without one.

I chose my LAN to be private IP-range 192.168.1.0/24, so I edited /etc/sysconfig/network-scripts/ifcfg-enp3s0 to contain:

TYPE=Ethernet BOOTPROTO=none DEFROUTE=yes IPV6INIT=yes NAME=enp3s0 UUID=-don't-touch-this- DEVICE=enp3s0 ONBOOT=yes NETWORK=192.168.1.0 BROADCAST=193.168.1.255 USERCTL=no IPADDR=192.168.1.1 PREFIX=24 IPV4_FAILURE_FATAL=no IPV6_AUTOCONF=yes IPV6_DEFROUTE=yes IPV6_PEERDNS=yes IPV6_PEERROUTES=yes IPV6_FAILURE_FATAL=no

WAN interface

For WAN, there is no need to touch enp1s0 setup (much). When your WAN-interface (enp1s0) starts, it will obtain an IP-address from your ISP. With that, it will also get your ISP's DNS-address an overwrite your precious manual changes in /etc/resolv.conf. You don't want that to happen. So, prevent that and edit /etc/sysconfig/network-scripts/ifcfg-enp1s0 and add:

PEERDNS="no"

Well, that was easy!

IP-forwarding

For routing to work, it requires Linux kernel to have IP-forwarding enabled. It will allow network packets to travel between interfaces.

Enable IP-forwarding immediately:

sysctl -w net.ipv4.ip_forward=1

Enable IP-forwarding on boot:

sysctl net.ipv4.ip_forward > /etc/sysctl.d/1_ip_forward.conf

Finalize network setup

When your network interface configurations are ok, restart everything by running following as root:

systemctl enable network
systemctl restart network

Now your system:

has both interfaces on-line
is reachable from a machine on your wired-LAN using a static IP-address other than 192.168.1.1.
Note: your router doesn'ht have DHCPd running yet, so you need to figure out how to configure a static IP-address to your device
still gets an IP-address from your ISP from your external interface
can reach IP-addresses via both external and internal interfaces

If these criteria are not met, there is simply no point in proceeding. Your system won't work as a router without those prerequisites.

Finally, make sure that your IPtables-rules have effect. Your box is connected to Internet and it can be accessed/bombarded from there, so run following to secure your setup:

systemctl restart firewalld

Now your system is ready to become a router.

Step 4: Firewalld

Packages to install: -none needed-

Zones

Out-of-box CentOS has firewalld enabled. It has only one zone defined for public wild-wild-net, also TCP/22 SSH is open for the world. This needs to be run as root. First split off LAN into own zone home:

# firewall-cmd --zone home --change-interface enp3s0 --permanent

Check the zones and their assigned interfaces:

# firewall-cmd --get-active-zones home interfaces: enp3s0 public interfaces: enp1s0

Setup network address translation (NAT) and allow traffic to flow from your LAN to outside world. Any relevant traffic is allowed to flow in from Internet back to your LAN. Commands to run:

# firewall-cmd --permanent --direct --add-rule ipv4 nat POSTROUTING 0 -o enp1s0 -j MASQUERADE # firewall-cmd --permanent --direct --add-rule ipv4 filter FWDI_home_allow 0 -o enp1s0 -j ACCEPT # firewall-cmd --permanent --direct --add-rule ipv4 filter FWDI_public_allow 0 -o enp3s0 -m state --state RELATED,ESTABLISHED -j ACCEPT

Enable the DNS-server we'll setup later, also disable any outgoing DNS-queries from your LAN (a security measure):

# firewall-cmd --permanent --zone home --add-service dns # firewall-cmd --permanent --direct --add-rule ipv4 filter FWDI_home_deny 0 -p udp -m udp --dport 53 -j REJECT

At this point do a reload:

# firewall-cmd --reload

... and test your firewall setup from router:

You still must be able to access Internet from your router
Your LAN does work at this point. A client with a static IP must be able to access Internet.

Step 5: Named

Packages to install: bind-chroot

You can continue to use your ISP's nameserver, but I never do that. It makes much more sense to have a caching nameserver running at your own router. This allows your box to go directly to Internet root servers and do all the name queries for you. In many countries ISPs intentionally drop some domains out or are forced by government to do that. Running your own resolver makes sure that you get all the records as is and in case of changes you can flush caches whenever you want and don't have to wait for a record to expire.

Out-of-box the BIND 9.9.4 does not server anybody else than localhost. To fix this, find following two lines in /etc/named.conf:

listen-on port 53 { 127.0.0.1; }; allow-query { localhost; };

Edit them to contain:

listen-on port 53 { 127.0.0.1; 192.168.1.1; };
allow-query { localhost; 192.168.1.0/24; };

Finally, change your system's default name resolver by editing /etc/resolv.conf to contain a single line:

nameserver 127.0.0.1

Start the server and enable it to start on boot:

systemctl start named-chroot
systemctl enable named-chroot

Now you're ready to test the setup. Just host www.google.com or your favorite site. Successful reply will include IP-address(es) for your query.

Now your caching nameserver is ready and most probably you'll never have to touch it again.

Step 6: DHCP

Packages to install: dhcp

Edit /etc/dhcp/dhcpd.conf and have it contain:

ddns-update-style interim; ignore client-updates; authoritative; default-lease-time 14400; max-lease-time 86400; option subnet-mask 255.255.255.0; option broadcast-address 192.168.1.255; option routers 192.168.1.1; option domain-name "my.own.lan"; option domain-name-servers 192.168.1.1; subnet 192.168.1.0 netmask 255.255.255.0 { range 192.168.1.50 192.168.1.99; }

That piece of configuration will use your router as DNS for the clients and issue them addresses from range .50 - .99.

Start the server and enable it to start on boot:

systemctl start dhcpd
systemctl enable dhcpd

At this point, configure your client to use DHCP for IP-addressing. You must get an IP from the above range, also DNS-resolve and NAT should work, but that's the next step to do. Test it all.

Step 7: Testing it

Make sure:

A client in your LAN gets an IP-address from DHCP
A client in your LAN can ping your router at 192.168.1.1
A client in your LAN can ping something in the Internet, like Google's name server at 8.8.8.8
A client in your LAN resolves names, for example: nslookup www.google.com returns IP-addresses
A client in your LAN can access https://www.google.com/ via a web-browser

That's it! What else would you need?

Done!

Congratulations on your new router!

What I did next was set up my own DNS-zone so that my hosts had FQDNs. But that's beyond this blog post. Read something like How To Configure BIND as a Private Network DNS Server on CentOS 7 by DigitalOcean for that.

by Jari Turkia in Linux at 19:21 | Comments (18) | Share in LinkedIn

Saving the day - Android tethering with Linux

Sunday, December 3. 2017

The fail

On a peaceful Sunday, I was just minding my own business and BOOM! Internet connection was gone. After a quick debugging session, restarting the router and eyeballing the LEDs, it was evident: something with my ISP Com Hem was down:

Ok, ISP down, what next?
I whipped up the iPhone and went for any possible service announcements. And yes, the above announcement was placed on my user account information. I was stunned by this, it was so cool to have:

confirmation, that something was down with ISP: Yup, it's broken.
that information tailored with the geographical location of my subscription: Yup, that fail affects you.

No Finnish ISP or telco has that. I was very impressed with such detail (and still am).

The fix

There is no way I'm sitting on my thumbs on such an event. I was just about to start playing Need for Speed and now Origin wouldn't even log me in, so, no Internet, no gaming.

I have an el-cheapo Huawei Android lying around somewhere, with a Swedish SIM-card in it. My dirt cheap subscription has couple of gigs data transfer per month in it, which I never use. I came up with a plan to temporarily use the cell phone as an Internet connection. The idea would be to hook it up into my Linux router with an USB-cable, make sure the Android pops up as a network interface and then configure the Linux to use that network interface as primary connection.

Thethering

I found tons of information about Android-tethering from Arch Linux wiki. It basically says:

Make sure your Android is newer than 2.2
Connect the phone to a Linux
Enable USB-tethering from the phone's connection sharing -menu
Confirm the new network interface's existence on the Linux end

On my phone, there was two settings for personal hotspot. Wifi/Bluetooth and USB:

Connection

New phones have USB-C, but its such a new connector type, that anything older than couple years, has most likely micro-USB -connector:

Hooking it up to a Linux will output tons of dmesg and and ultimately result in a brand new network interface:

# ip addr show 5: enp0s20u4u3: link/ether 82:49:a8:b4:96:c9 brd ff:ff:ff:ff:f inet 192.168.42.90/24 brd 192.168.42.255 scope valid_lft 3595sec preferred_lft 3595sec inet6 fe80::7762:e1a9:9fa:69f5/64 scope link valid_lft forever preferred_lft forever

Routing configuration

Now that there was a new connection, I tried pinging something in the wild world:
ping -I enp0s20u4u3 193.166.3.2

Nope. Didn't work.
I confirmed, that the default network gateway was still set up into the broken link:
# ip route show default via 192.168.100.1 dev enp1s0 proto static metric 100

That needs to go to enable some functionality. But what to replace the bad gateway with?
Since the connection had IP-address from Telco DHCP, there is a lease-file with all the necessary information:
# cat /var/lib/NetworkManager/dhclient-*-enp0s20u4u3.lease lease { interface "enp0s20u4u3"; fixed-address 192.168.42.90; option subnet-mask 255.255.255.0; option routers 192.168.42.129;

The fixed-address in the file matches the above ip addr show -information. Required information was gathered, and the idea was to ditch the original gateway and replace it with a one from the Android phone's telco:
# ip route del default via 192.168.100.1 # ip route add default via 192.168.42.129 dev enp0s20u4u3 # ip route show default via 192.168.42.129 dev enp0s20u4u3 proto static metric 101

Now it started cooking:
# ping -c 5 ftp.funet.fi PING ftp.funet.fi (193.166.3.2) 56(84) bytes of data. 64 bytes from ftp.funet.fi (193.166.3.2): icmp_seq=1 ttl=242 time=35.6 ms 64 bytes from ftp.funet.fi (193.166.3.2): icmp_seq=2 ttl=242 time=31.7 ms

To finalize the access from my LAN, I ran following firewall-cmd --direct commands:
--remove-rule ipv4 nat POSTROUTING 0 -o enp1s0 -j MASQUERADE --add-rule ipv4 nat POSTROUTING 0 -o enp0s20u4u3 -j MASQUERADE --add-rule ipv4 filter FORWARD 0 -i enp3s0 -o enp0s20u4u3 -j ACCEPT --add-rule ipv4 filter FORWARD 0 -i enp0s20u4u3 -o enp3s0 \ -m state --state RELATED,ESTABLISHED -j ACCEPT

There is no firewall-cmd --permanent on purpose. I don't intend those to stick too long. I just wanted to play the darn game!

Done!

Now my gaming PC would connect to The Big Net. I could suft the web, read mail and even Origin logged me in.

That's it! Day saved!

by Jari Turkia in Linux at 18:37 | Comments (0) | Share in LinkedIn

Cygwin X11 with window manager

Saturday, November 4. 2017

Altough, I'm a Cygwin fan, I have to admit, that the X11-port is not one of their finest work. Every once in a while I've known to run it.

Since there are number of window managers made available for Cygwin, I found it surprisingly difficult to start using one. According to docs (Chapter 3. Using Cygwin/X) and /usr/bin/startxwin, XWin-command is executed with a -multiwindow option. Then XWin man page says: "In this mode XWin uses its own integrated window manager in order to handle the top-level X windows, in such a way that they appear as normal Windows windows."

As a default, that's ok. But what if somebody like me would like to use a real Window Manager?

When startxwin executes xinit, it optionally can run a ~/.xserverrc as a server instead of XWin. So, I created one, and made it executable. In the script, I replace -multiwindow with -rootless to not use the default window manager.

This is what I have:

#!/bin/bash # If there is now Window Maker installed, just do the standard thing. # Also, if xinit wasn't called without a DISPLAY, then just quit. if [ ! -e /usr/bin/wmaker ] || [ -z "$1" ]; then exec XWin "$@" # This won't be reached. fi # Alter the arguments: # Make sure, there is no "-multiwindow" -argument. args_out=() for arg; do [ $arg == "-multiwindow" ] && arg="-rootless" args_out+=("$arg") done exec XWin "${args_out[@]}" & # It takes a while for the XWin to initialize itself. # Use xset to check if it's available yet. while [ ! DISPLAY="${args_out[0]}" xset q > /dev/null ]; do sleep 1 done sleep 1 # Kick on a Window Manager DISPLAY="${args_out[0]}" exec /usr/bin/wmaker & wait

The script assumes, that there is a Window Maker installed (wmaker.exe). The operation requires xset.exe to exist. Please, install it from package xset, as it isn't installed by default.

by Jari Turkia in Linux, Windows at 18:38 | Comments (2) | Share in LinkedIn

Fedora 26: SElinux-policy failing on StrongSWAN IPsec-tunnel

Monday, September 4. 2017

SElinux is a beautiful thing. I love it! However, the drawback of a very fine-grained security control is, that the policy needs to be exactly right. Almost right won't do it.

This bite me when I realized, that systemd couldn't control StrongSWAN's charon - IKE-daemon. It worked flawlessly, when running a simple strongswan start, but failing on systemctl start strongswan. Darn! When the thing works, but doesn't work as a daemon, to me it has the instant smell of SElinux permission being the culprit.

Very brief googling revealed, that other people were suffering from that same issue:

Bug 1444607 - SELinux is preventing starter from execute_no_trans access on the file /usr/libexec/stro
ngswan/charon.
Bug 1467940 - SELinux is preventing starter from 'execute_no_trans' accesses on the file /usr/libexec/strongswan/charon.

Others had made the same conclusion: it's a SElinux -policy failure. Older bug report was from April. That's a month before Fedora 26 was released! But neither bug report had a fix for it. I went to browse Bodhi and found out that there is a weekly release of selinux-policy .rpm-file, but this hadn't gotten the love it desperately needed from RedHat guys.

Quite often self-help is the best help, so I ran audit2allow -i /var/log/audit/audit.log and deduced a following addition to my local policy:

#============= ipsec_t ============== allow ipsec_t ipsec_exec_t:file execute_no_trans; allow ipsec_t var_run_t:sock_file { unlink write };

I have no idea if that fix is ever going to be picked up by RedHat, but it definitely works for me. Now my IPsec tunnels survive a reboot of my server.

Update 10th Sep 2017:

Package selinux-policy-3.13.1-260.8.fc26.noarch.rpm has following changelog entry:

2017-08-31 - Lukas Vrabec <lvrabec@redhat.com> - 3.13.1-260.8
- Allow ipsec_t can exec ipsec_exec_t

... which fixes the problem.
To test that, I dropped my own modifications out of local policy and tested. Yes, working perfectly! Thank you Fedora guys for this.

by Jari Turkia in Linux at 07:49 | Comments (0) | Share in LinkedIn

Handling /run with systemd, Part II

Sunday, June 4. 2017

It took me less than 4 years to finally revisit this subject. I'd like to thank all the people who commented the original blog post. It looks like for those years SystemD (am I writing it wrong?) was in constant evolution and new features were added.

This is what I'm running in production. Containing System and omitting Unit and Install -parts as they are unchanged:

[Service] Type=forking PrivateTmp=yes User=nobody Group=nobody RuntimeDirectory=dhis RuntimeDirectoryMode=0750 ExecStart=/usr/sbin/dhid -P /run/dhis/dhid.pid PIDFile=/run/dhis/dhid.pid

This also makes my RPM spec-file simpler, I got to remove stuff there, because temporary directory creation is taken care. Finally, I think that this one is done and ready!

If you want to download my source RPM-package, go here.
If you want to know more about RPM-specs, read Maximum RPM - Taking the RPM Package Manager to the Limit.

by Jari Turkia in Linux at 19:12 | Comment (1) | Share in LinkedIn

Wi-Fi access point - TRENDnet TEW-818DRU - Part 2: Software

Monday, May 8. 2017

In my previous post, I un-boxed my new Wi-Fi access point. This is the part for running something in it.

For this to happen, the obvious prerequisite is DD-WRT binary image built specifically for TEW-818DRU. DD-WRT supported devices -list doesn't say much. Little bit of poking around results in build 23720 back in the 2014 for this one. It is at: https://www.dd-wrt.com/site/support/other-downloads?path=betas%2F2014%2F03-13-2014-r23720%2Ftrendnet-818DRU%2F. As I wanted something newer, I went for November 2016 build 30880 at: ftp://ftp.dd-wrt.com/betas/2016/11-14-2016-r30880/trendnet-818DRU/.

My typical approach for flashing new firmware is to stay connected with a wire. In practice that means, that I'll hook up an ethernet cable to my laptop and the other end to the access point's LAN-switch. Then I'll configure a static IP-address at the laptop's operating system. This makes sure, that I'm 100% connected whenever the box is running. Doing this over wireless connection and/or using dynamicly assigned IP-address may or may not work. As these boxes are expensive enough, I didn't push my luck. The downside of this approach is, that I'll need to know what the actual management IP-address will be.

Ok, let's start!

On out-of-box-experience the web GUI is at 192.168.10.1:

After login, there is a nice setup-wizard. Which of course, we'll just skip by acknowleding the alert:

Now we're at the normal administrator environment:

For me, the word "advanced" is like honey to a grizzly bear . I'll always home towards it, I know that all the goodies are stored there:

And also this time I was right, firmware upload/upgrade has its own menu item. Its clear, that this device is 100% designed by engineers, they cannot even seem to be able to agree on a single terminology. Menu has "upload", page title has "upgrade". Any self-respecting user experience designer would yell "You're confusing the user with that!", but I guess this stuff is for nerds only, and they don't care.

After selecting the trendnet-818dru-webflash.bin file to be uploaded, there is yet again a nice warning:

It will take couple minutes for the flashing to complete:

There is very little indication, that the process completed. I didn't notice any lights blinking or something like that. It just completed, rebooted and stayed silent.

Now the IP-addess will change. DD-WRT is 192.168.1.1 at out-of-box-experience:

And that's pretty much it for firmware upgrade. At this point I did my wireless access point -setup including:

Admin username and password
AP's LAN IP-address, my LAN isnt' at 192.168.1/24
Enable SSH-service
Enable GUI-access for HTTPS and SSH
Wireless network setup for 2.4 GHz and 5 GHz, WPA2 Personal with pre-shared key as security

DD-WRT is for knowledgeable administrators, no setup wizards or mumbo-jumbo. Just the settings.
Btw. configuration docs can be found at: https://www.dd-wrt.com/wiki/index.php/Configuration_HOWTOs

by Jari Turkia in Linux at 08:10 | Comments (0) | Share in LinkedIn

Fedora 21 DHCP client failing to get an IP-address from Elisa [Solved!]

Monday, July 18. 2016

One of my own boxes runs a Fedora Linux. A while back my upgrade failed miserably due to Fedora installer not getting an IP-address from my ISP, Elisa. I had a minor skirmish for an hour or so with the installer, but no avail, Fedora installer beat me on that one. As I love to have that box up and running, I gave up and decided to investigate that later. Now that day came and I'm victorious!

Basics

DHCP is what pretty much everybody has for getting an IPv4 address in 2016. Mobile connections have something different, but everything else including Wi-Fi hotspots, ADSL-routers, Fiber-to-the-Home -connections, etc. etc. issue an IP-address (mostly IPv4, sometimes IPv6) to any well-behaving customer of theirs. Today, the de-facto is that the IP-address is allocated out of a well known broadband address range, or pool. Lists of those are generally available, so that home customers can be differentiated from data centers and companies.

To put DHCP simply, it is a mechanism for allocating an unique address for your Internet connection. The Wikipedia definition for Dynamic Host Configuration Protocol uses more words and isn't as concise as mine, but you'll get the idea.

Details of the problem

In case of mis-use or unpaid internet bill, they'd naturally decline any DHCP-requests for an IP-address. Since everything I tested, including various Windowses, OS Xes and Linuxes worked it wasn't about that. The connection was ok, the DHCP server issued a valid DHCP-lease as it had been doing for couple years, but not for my Fedora installer. Duh?

At this point I went to google for the symptoms and quite soon I landed into RedHat Bugzilla. It contains bug 1154200 which is titled as "not getting a dhcp address assigned". Mr. Krovich reports that his Fedora 21 installer won't get an IP-address from ISP. I pulled up a Fedora 20 installer for the previous version. It worked ok! Yep, they changed something into Fedora 21. The change affects Fedora 22, 23 and the latest 24. It does not affect RedHat nor CentOS (yet).

Fix (aka. trial and error)

In the comments of bug 1154200 they're talking about Option 61 commit which was introduced for Fedora 21 release. A possible fix would be to use DHCP-configuration: send dhcp-client-identifier = hardware;

I downloaded Fedora 24 installer and tested it out. It didn't help any. After a lot of wiresharking the traffic, I isolated this:
In the DHCP Discover -packet, there was an Option 61 present.

More poking for man 5 dhclient-options revealed that it was possible to specify a fixed string for identifier. So, again I edited /etc/dhcp/dhclient.conf (btw. the file didn't exist to begin with, I had to create one) to contain:
send dhcp-client-identifier = "";

That did the trick! Now even Fedora 24 installer got a a valid IP-address and it was possible to install.

Specs

In the early days, all IP-addresses were assigned manually. Everybody was given an IP-address and they punched all the details manually. That was frustrating and error-prone, so somebody invented BOOTP to automate the entire setup. Quite soon, that evolved into DHCP, defined by RFC 1531. When DHCP gained traction and more and more vendors joined the dynamically allocated -game, couple of clarifying iterations later, we're at RFC 2131 for current breed of DHCPing. It is basically the original BOOTP, but with most wrinkles ironed out.

The options are defined at RFC 1533 for DHCP Options and BOOTP Vendor Extensions. Looks like nobody supported Option 61 for a long time. Windowses don't, Apple doesn't, most Linux Distros don't, but RFC 4361 for Node-specific Client Identifiers for Dynamic Host Configuration Protocol Version Four (DHCPv4) must have done it for Fedora-guys. They chose to implement request 560361 (Dhclient doesn't use client-identifier; may cause issues in certain bridged environments) and make sure everybody uses it, with assumption that all ISPs that won't support Client Identifiers will merrily ignore the option. Nice!

Afterwards

I'll target equal blame to my ISP. The Option 61 is well-defined and it should be possible to ignore it. Looks like they're running Alcatel-Lucent hardware there and for some reason it is configured to spit on Option 61 requests.

Naturally I reported the error to my ISP, but you can assume how well that goes. Any regular customer facing clerk won't know DHCP or what it does, nor any options of it. So all I got back is the classic "we'll investigate" -style response. I'm not keeping my hopes up. I have more hope on my own Bugzilla request 1357469 to have an option to enable or disable usage of Option 61 on Fedora. They might even implement that one day.

Anyway, I'm hoping that this post will help somebody strugging to install their Fedora.

by Jari Turkia in Linux at 13:17 | Comments (0) | Share in LinkedIn

CentOS 7.2 network install fail [Solved]

Sunday, June 5. 2016

I was about to upgrade an old CentOS 6 box into 7. It was all planned, backups taken, necessary information gathered and USB stick prepared with 7.2 DVD image in it. A shutdown and boot from the installation USB, bunch of settings, date/time, keyboard, network, but Däng! No dice.

My initial attempt was to install from USB, but for some reason the server didn't see the USB volume as a valid installation source. No problem, I thought, let's go for network-install then. The interface was already up and the box could reach Internet ok. Installing from a mirror shouldn't take too long. But no. All I could accomplish was a "Error setting up base repository". I went googling about this and found CentOS 7.2 Netinstall Guide – Network Installation Screenshots.

First I set up installation source as On the network: http://mirror.centos.org/centos/7.2.1511/os/x86_64/ and then This URL refers to a mirror list: Checked. No avail. It took about 8 minutes to get the error, but this approach failed miserably. What /tmp/packaging.log had was:

ERR packaging: failed to grab repo metadata for anaconda: Cannot find a valid baseurl for repo: anaconda
ERR packaging: metadata download for repo anaconda failed after 10 retries

Argh! 8 minutes to determine, that the thing didn't work.

There was plenty of time to plan for the next move. I went to see CentOS mirror list, and picked the local Finnish mirror at nic.FUNET. Setting that as source: http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/ and with This URL refers to a mirror list: set as Unchecked got me butkus, /tmp/packaging.log had:

ERR packaging: failed to grab repo metadata for anaconda: failure: repodata/6990209f63a9fd811f13e830ac3c6de4c5d70a42b1c6873e4329b523d394c3bd-primary.xml.gz from anaconda: [Errno 256] No more mirrors to try.
http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/repodata/
6990209f63a9fd811f13e830ac3c6de4c5d70a42b1c6873e4329b523d394c3bd-primary.xml.gz: [Errno 14] HTTP Error 404 - Not Found

Finally a tangible result. Obviously the HTTP/404 was correct. There is no such file in that directory. It took me about 15 seconds to determine, that the URL should be http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/repodata/
0e54cd65abd3621a0baf9a963eafb1a0ffd53603226f02aadce59635329bc937-primary.xml.gz. Something was off in the installer metadata. But where?

I checked treeinfo at http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/.treeinfo, but no avail. Then my poking around landed at /var/run/install/repo/repodata. It has among others, a file named repomd.xml. Looking at the network version from http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/repodata/repomd.xml made everything clear as crystal.

Drive had:
- revision 1449702798
- 6990209f63a9fd811f13e830ac3c6de4c5d70a42b1c6873e4329b523d394c3bd, the file that doesn't exist in the mirror
Network had:
- revision 1449700451
- 0e54cd65abd3621a0baf9a963eafb1a0ffd53603226f02aadce59635329bc937, the file that does exist

But how to fix this?

My initial attempt was to wget http://ftp.funet.fi/pub/mirrors/centos.org/7.2.1511/os/x86_64/repodata/repomd.xml into /var/run/install/repo/repodata and retry, but that didn't change anything, still the same frustrating error after 10 minute delay.

I rebooted the box and relized, that my change persisted on the USB-drive. Whoa! Anyway, I got things cooking this time. Finally the base repository was accepted, I got to go make installation selection and got the install forward.

What the hell was going on there? Where did the incorrect repomd.xml come from? It isn't in the installation image. Or it is, but it comes from a place I didn't find. Whatever it is, there is something seriously off in the process. Why doesn't the installer try to get the most recent version from the network. It is a network install, after all!! After frustrating couple hours later than anticipated, I finally got the box upgraded. Hopefully this information saves you that time.

by Jari Turkia in Linux at 21:07 | Comment (1) | Share in LinkedIn

Improving Nuvoton NCT6776 lm_sensors output

Monday, November 16. 2015

Problem

My home Linux-box was outputting more-or-less useless lm_sensor output. Example:

coretemp-isa-0000 Adapter: ISA adapter Physical id 0: +36.0°C (high = +80.0°C, crit = +98.0°C) Core 0: +34.0°C (high = +80.0°C, crit = +98.0°C) Core 1: +31.0°C (high = +80.0°C, crit = +98.0°C) Core 2: +36.0°C (high = +80.0°C, crit = +98.0°C) Core 3: +33.0°C (high = +80.0°C, crit = +98.0°C) nct6776-isa-0290 Adapter: ISA adapter Vcore: +0.97 V (min = +0.00 V, max = +1.74 V) in1: +1.02 V (min = +0.00 V, max = +0.00 V) ALARM AVCC: +3.33 V (min = +2.98 V, max = +3.63 V) +3.3V: +3.31 V (min = +2.98 V, max = +3.63 V) in4: +1.01 V (min = +0.00 V, max = +0.00 V) ALARM in5: +2.04 V (min = +0.00 V, max = +0.00 V) ALARM in6: +0.84 V (min = +0.00 V, max = +0.00 V) ALARM 3VSB: +3.42 V (min = +2.98 V, max = +3.63 V) Vbat: +3.36 V (min = +2.70 V, max = +3.63 V) fan1: 0 RPM (min = 0 RPM) fan2: 703 RPM (min = 0 RPM) fan3: 0 RPM (min = 0 RPM) fan4: 819 RPM (min = 0 RPM) fan5: 0 RPM (min = 0 RPM) SYSTIN: +36.0°C (high = +0.0°C, hyst = +0.0°C) ALARM sensor = thermistor CPUTIN: -60.0°C (high = +80.0°C, hyst = +75.0°C) sensor = thermal diode AUXTIN: +35.0°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor PECI Agent 0: +26.0°C (high = +80.0°C, hyst = +75.0°C) (crit = +88.0°C) PCH_CHIP_TEMP: +0.0°C PCH_CPU_TEMP: +0.0°C PCH_MCH_TEMP: +0.0°C

That's all great and all, but what the heck are in 1, 4-6 and fan 1-5? Are the in 1, 4-6 readings really reliable? Why are there sensors with 0 RPM readings? CPUTIN indicating -60 degrees, really? PCH-temps are all 0, why?

Investigation

In order to get to bottom of all this, let's start from the chip in question. lm_sensors -setup identified it as NCT6776. For some reason Nuvoton doesn't have the data sheet anymore, but by little bit of googling, a PDF with title NCT6776F / NCT6776D Nuvoton LPC I/O popped up.

Analog inputs:

Following information can be found:

It contains following analog inputs:

AVCC
VBAT
3VSB
3VCC
CPUVCORE
VIN0
VIN1
VIN2
VIN3

The good thing is, that first 5 of them are clearly labeled, but inputs 0 through 3 are not. They can be pretty much anything.

Revolution Pulse counters:

When it comes to RPM-readings, following information is available:

That lists following inputs:

SYSFANIN
CPUFANIN
AUXFANIN0
AUXFANIN1
AUXFANIN2

Looks like all of those have connectors on my motherboard.

Temperature Sources:

For the temperature measurements, the chip has:

The analog temperature inputs are:

SMIOVT1
SMIOVT2
SMIOVT3
SMIOVT4
SMIOVT5
SMIOVT6

According to the above table, they're mapped into AUXTIN, CPUTIN and SYSTIN.

Also on top of those, there is PECI (Platform Environment Control Interface). A definition says "PECI is a new digital interface to read the CPU temperature of Intel® CPUs". So, there aren't any analog pins for that, but there are readings available, when questioned.

Configuration

A peek in to /etc/sensors3.conf at the definition of the chip shows:

chip "w83627ehf-*" "w83627dhg-*" "w83667hg-*" "nct6775-*" "nct6776-*" label in0 "Vcore" label in2 "AVCC" label in3 "+3.3V" label in7 "3VSB" label in8 "Vbat" set in2_min 3.3 * 0.90 set in2_max 3.3 * 1.10 set in3_min 3.3 * 0.90 set in3_max 3.3 * 1.10 set in7_min 3.3 * 0.90 set in7_max 3.3 * 1.10 set in8_min 3.0 * 0.90 set in8_max 3.3 * 1.10

And that's all. I guess that would be ok for the generic case, but in my particular box that list of settings doesn't cover half of the inputs.

Solution

Configuration changes

I added following settings for temperature into "chip "w83627ehf-*" "w83627dhg-*" "w83667hg-*" "nct6775-*" "nct6776-*""-section:

label in0 "Vcore" set in0_min 1.1 * 0.9 set in0_max 1.1 * 1.15 label in1 "+12V" compute in1 @ * 12, @ / 12 set in1_min 12 * 0.95 set in1_max 12 * 1.1 label in2 "AVCC" set in2_min 3.3 * 0.95 set in2_max 3.3 * 1.1 label in3 "+3.3V" set in3_min 3.3 * 0.95 set in3_max 3.3 * 1.1 label in4 "+5V" compute in4 @ * 5, @ / 5 set in4_min 5 * 0.95 set in4_max 5 * 1.1 ignore in5 ignore in6 label in7 "3VSB" set in7_min 3.3 * 0.95 set in7_max 3.3 * 1.1 label in8 "Vbat" set in8_min 3.3 * 0.95 set in8_max 3.3 * 1.1

The obvious problem still stands: what are the undocumented in 1, 4, 5 and 6? Mr. Ian Dobson at Ubuntuforums.org discussion about NCT6776 claims, that in1 is for +12 VDC power and in4 is for +5VDC power. I cannot deny nor confirm that for my board. The Novoton-chip only provides the inputs, but there is absolutely no way of telling how the manufacturer chooses to connect them to various parts of the MoBo. I took the same assumption, so all that was necessary, was to multiply the input data by 12 and 5 to get a proper reading. I don't know what in5 and in6 are for, that's why I remove them from the display. All the other ones are min and max boundaries for the known readings.

The fan settings are machine specific, in my case:

label fan2 "CPU fan" set fan2_min 200 label fan4 "HDD fan" set fan4_min 200 ignore fan1 ignore fan3 ignore fan5

As I only have fans connected to 2 out of 5, I'll ignore the not connected ones. For the connected, I set a lower limit of 200 RPM.

Temperatures are motherboard-specific. In my case, I did following additions:

label temp1 "MB" set temp1_max 38 set temp1_max_hyst 35 label temp3 "CPU" label temp7 "CPU?" ignore temp2 ignore temp8 ignore temp9 ignore temp10

The easy part is to remove the values not displaying anything. The hard part is to try to figure out what the measurements indicate. Based on the other readings, temp3 is CPU combined somehow. The other sensor is displaying rougly same values for each core I have there. However, the temp7 is for PECI, but it doesn't behave anything like CPU-temps. It should, but it doesn't. That's why I left a question mark after it.

Resulting output

After the additions, following output is available:

coretemp-isa-0000 Adapter: ISA adapter Physical id 0: +48.0°C (high = +80.0°C, crit = +98.0°C) Core 0: +48.0°C (high = +80.0°C, crit = +98.0°C) Core 1: +40.0°C (high = +80.0°C, crit = +98.0°C) Core 2: +43.0°C (high = +80.0°C, crit = +98.0°C) Core 3: +39.0°C (high = +80.0°C, crit = +98.0°C) nct6776-isa-0290 Adapter: ISA adapter Vcore: +1.22 V (min = +0.99 V, max = +1.26 V) +12V: +12.29 V (min = +11.42 V, max = +13.25 V) AVCC: +3.33 V (min = +3.14 V, max = +3.63 V) +3.3V: +3.31 V (min = +3.14 V, max = +3.63 V) +5V: +5.04 V (min = +4.76 V, max = +5.52 V) 3VSB: +3.42 V (min = +3.14 V, max = +3.63 V) Vbat: +3.38 V (min = +3.14 V, max = +3.63 V) CPU fan: 912 RPM (min = 200 RPM) HDD fan: 897 RPM (min = 200 RPM) MB: +35.0°C (high = +38.0°C, hyst = +35.0°C) sensor = thermistor CPU: +37.0°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor CPU?: +37.0°C (high = +80.0°C, hyst = +75.0°C) (crit = +88.0°C)

Before taking the readings, I ran sensors -s to set the min/max values.

Now my output starts making sense and I can actually monitor any changes.

PS.
At the time of writing this article, website http://www.lm-sensors.org/ was down for multiple days in a row. I can only hope, that project personnel solves the issue with the web site and it is up at the time you're seeing this.

by Jari Turkia in Linux at 18:20 | Comments (3) | Share in LinkedIn

Mon	Tue	Wed	Thu	Fri	Sat	Sun
← Back	July '25
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Monday, June 3. 2019

Randomness in computers

Improved randomness in computers

Random source in Linux

A real random source

How is this a "true" random source?

Linux tweaking:

udev

rngd

Making sure rngd setup works

Finally

Sunday, June 2. 2019

Monday, December 10. 2018

Saturday, March 17. 2018

Symptoms:

Failure:

Fix:

Optional fix 2, for the syslog:

What:

Friday, January 26. 2018

Thursday, January 25. 2018

Thursday, January 18. 2018

Goal

Spec:

Definition of done:

Step 1: Packages

Step 2: Remove NetworkManager

Step 3: Setup NICs

LAN interface

WAN interface

IP-forwarding

Finalize network setup

Step 4: Firewalld

Zones

Step 5: Named

Step 6: DHCP

Step 7: Testing it

Done!

Sunday, December 3. 2017

The fail

The fix

Thethering

Connection

Routing configuration

Done!

Saturday, November 4. 2017

Monday, September 4. 2017

Sunday, June 4. 2017

Monday, May 8. 2017

Monday, July 18. 2016

Basics

Details of the problem

Fix (aka. trial and error)

Specs

Afterwards

Sunday, June 5. 2016

Monday, November 16. 2015

Problem

Investigation

Analog inputs:

Revolution Pulse counters:

Temperature Sources:

Configuration

Solution

Configuration changes

Resulting output

Calendar

Quicksearch

Archives

Categories

RSS feeds of this Blog

Blog Administration

Powered by