Nuvoton NCT6793D lm_sensors output
Monday, July 3. 2023
LM-Sensors is a set of libraries and tools for accessing your Linux server's motherboard sensors. See more @ https://github.com/lm-sensors/lm-sensors.
Have you ever wondered why, in Windows, it is tricky to get readings of your CPU fan's rotation speed or core temperatures of your fancy GPU without manufacturer utilities? Obviously vendors do provide all the possible readings in their own tools, but for people who want to read, record and store the data for their own purposes, things get hairy. Nothing generic exists and, for unknown reasons, such an API isn't even planned.
In Linux, The One toolkit to use is LM-Sensors. On the kernel side, there exists the Linux Hardware Monitoring kernel API. For this stack to work, you also need a kernel module specific to your motherboard providing the requested sensor information via this beautiful API. It's also worth noting that your PC's hardware will have multiple sensor data providers. An incomplete list would include: motherboard, CPU, GPU, SSD, PSU, etc.
Now that sensors-detect has found all your sensors, confirm that sensors outputs what you'd expect it to. In my case there was a major malfunction. On boot, the following happened when the system started sensord (in case you didn't know, kernel messages can be read via dmesg):
systemd[1]: Starting lm_sensors.service - Hardware Monitoring Sensors...
kernel: nct6775: Enabling hardware monitor logical device mappings.
kernel: nct6775: Found NCT6793D or compatible chip at 0x2e:0x290
kernel: ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (_GPE.HWM) (20221020/utaddress-204)
kernel: ACPI: OSL: Resource conflict; ACPI support missing from driver?
systemd[1]: Finished lm_sensors.service - Hardware Monitoring Sensors.
This conflict resulted in no motherboard readings at all! NVMe, GPU and CPU cores were OK, but the part I was mostly looking for, fan RPMs and mobo temperatures, just to verify my system health, was missing. No such joy. Uff.
It seems this particular Linux kernel module has issues. Or, to state it another way: mobo manufacturers have trouble integrating the Nuvoton chip into their boards. On the Gentoo forums, there is a helpful thread: [solved] nct6775 monitoring driver conflicts with ACPI
Disclaimer: For the ROG Maximus X Code mobo, adding acpi_enforce_resources=no to the kernel parameters is the correct solution. Results will vary depending on which mobo you have.
This ACPI setting can be made permanent by first querying which Linux kernel is the default (I run Fedora): grubby --info=$(grubby --default-index). The default kernel entry can then be updated with: grubby --args="acpi_enforce_resources=no" --update-kernel DEFAULT. A reboot shows the fix in effect: the ACPI warning is gone and mobo sensor data can be seen.
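As a recap, the whole sequence on Fedora is (the commands are the same ones from above; the reboot simply activates the change):

# Show the default kernel entry and its current arguments
grubby --info=$(grubby --default-index)
# Append the ACPI parameter to the default kernel entry
grubby --args="acpi_enforce_resources=no" --update-kernel DEFAULT
reboot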
As the next step, you'll need userland tooling to interpret the raw data into human-readable information with semantics. A few years back, I wrote about Improving Nuvoton NCT6776 lm_sensors output. It's mainly about bridging the flow of zeros and ones into something that has meaning to humans. This is my LM-Sensors configuration for the ROG Maximus X Code:
chip "nct6793-isa-0290"
# 1. voltages
ignore in0
ignore in1
ignore in2
ignore in3
ignore in4
ignore in5
ignore in6
ignore in7
ignore in8
ignore in9
ignore in10
ignore in11
label in12 "Cold Bug Killer"
set in12_min 0.936
set in12_max 2.613
set in12_beep 1
label in13 "DMI"
set in13_min 0.550
set in13_max 2.016
set in13_beep 1
ignore in14
# 2. fans
label fan1 "Chassis fan1"
label fan2 "CPU fan"
ignore fan3
ignore fan4
label fan5 "Ext fan?"
# 3. temperatures
label temp1 "MoBo"
label temp2 "CPU"
set temp2_max 90
set temp2_beep 1
ignore temp3
ignore temp5
ignore temp6
ignore temp9
ignore temp10
ignore temp13
ignore temp14
ignore temp15
ignore temp16
ignore temp17
ignore temp18
# 4. other
set beep_enable 1
ignore intrusion0
ignore intrusion1
I'd like to credit Mr. Peter Sulyok for his work on the ASRock Z390 Taichi. That mobo happens to use the same Nuvoton NCT6793D chip for LPC/eSPI SI/O (I have no idea what those acronyms stand for, I just copy/pasted them from the chip data sheet). The configuration is on GitHub for everybody to see: https://github.com/petersulyok/asrock_z390_taichi
Also, I'd like to state my ignorance. After reading less than 500 pages of the NCT6793D data sheet, I have no idea what these are:
- Cold Bug Killer voltage
- DMI voltage
- What AUXTIN1 is or exactly what temperature measurement it serves
- PECI Agent 0 temperature
- PECI Agent 0 Calibration temperature
Remember, I did mention semantics. From the sensors command output I can read a value; what it translates into, no idea! Luckily, some of the readings are easy to understand and interpret. As an example, fan RPMs are really easy to verify by removing the fan from its connector. Here is an excerpt from my mobo manual explaining the fan connectors:

With data quality taken care of and the output made meaningful, the next step is to start recording data. In LM-Sensors, there is sensord for that. It is a system service taking a snapshot (you can define the frequency) and storing it for later use. I'll enrich the stored data points with system load averages; this enables me to estimate the relation between high temperatures and/or fan RPMs and how hard my system is working.
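A minimal sketch of such an invocation, assuming the interval, RRD and load-average flags from the sensord man page (-i, -r and -a respectively); the RRD file path here is made up, and distributions typically wire these flags into the sensord service definition:

# Scan sensors every 30 seconds, log them into an RRD database and
# include system load averages with each data point
sensord -i 30s -r /var/lib/sensord/sensors.rrd -a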
Finally, all the data gathered into an RRDtool database can be easily visualized with rrdcgi into HTML plus a set of PNG images, presenting a web page like this:

Nice!
File 'repomd.xml' from repository is unsigned
Thursday, March 23. 2023
In all those years I've been running SUSE Linux, I've never bumped into this one while running a trivial zypper update:
Warning: File 'repomd.xml' from repository 'Update repository with updates from SUSE Linux Enterprise 15' is unsigned.
Note: Signing data enables the recipient to verify that no modifications occurred
after the data were signed. Accepting data with no, wrong or unknown signature can
lead to a corrupted system and in extreme cases even to a system compromise.
Note: File 'repomd.xml' is the repositories master index file. It ensures the
integrity of the whole repo.
Warning: We can't verify that no one meddled with this file, so it might not be
trustworthy anymore! You should not continue unless you know it's safe.
continue? [yes/no] (no):

This error slash warning being weird and potentially dangerous, my obvious reaction was to hit Ctrl-C and go investigate. First, my package verification mechanism should be intact and able to verify whether downloaded updates are unaltered or not. Second, there should not have been any breaking changes to my system; at least I didn't make any. As my system didn't seem to be breached, I assumed a system malfunction and went investigating.
Quite soon I learned this is a less-than-rare event. It has happened multiple times for other people. According to the article Signature verification failed for file ‘repomd.xml’ from repository ‘openSUSE-Leap-42.2-Update’, there exists a simple fix.
By running two commands, zypper clean --all and zypper ref, the problem should dissolve.
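As a block, the whole cycle is:

zypper clean --all
zypper ref
zypper update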
Yes, that is the case. After a simple wash/clean/rinse cycle, zypper update worked again.
It was just weird to bump into this for the first time now; I'd have assumed it would have occurred some time earlier.
Writing a secure Systemd daemon with Python
Sunday, March 5. 2023
This is a deep dive into systems programming using Python. For those unfamiliar with programming, systems programming sits on top of hardware / electronics design, firmware programming and operating system programming. However, it is not applications programming, which mostly targets end users. Systems programming targets the running system. Mr. Yadav of Dark Bears has an article, Systems Programming is Hard to Do – But Somebody’s Got to Do it, where he describes the limitations and requirements of doing so.
Typically systems programming is done with C, C++, Perl or Bash. As Python is gaining popularity, I definitely want to take a swing at systems programming with Python. In fact, there aren't many resources about the topic on the entire Internet.
Requirements
This is the list of basic requirements I have for a Python-based system daemon:
- Run as service: Must run as a Linux daemon, https://man7.org/linux/man-pages/man7/daemon.7.html
- Start running on system boot and stop running on system shutdown
- Modern: systemd-compatible, https://systemd.io/
- Not interested in ancient SysV init support anymore, https://danielmiessler.com/study/the-difference-between-system-v-and-systemd/
- Modern: D-bus -connected
- Service provided will have an interface on system D-bus, https://www.freedesktop.org/wiki/Software/dbus/
- All Linux systems are built on top of D-bus, I absolutely want to be compatible
- Monitoring: Must support systemd watchdog, https://0pointer.de/blog/projects/watchdog.html
- Surprisingly many out-of-the-box Linux daemons don't support this. This is most likely because they're still SysV-init based and haven't modernized their operation.
- I most definitely want to have this feature!
- Security: Must use Linux capabilities to run only with necessary permissions, https://man7.org/linux/man-pages/man7/capabilities.7.html
- Security: Must support SElinux to run only with required permissions, https://github.com/SELinuxProject/selinux-notebook/blob/main/src/selinux_overview.md
- Isolation: Must be independent from system Python
- venv
- virtualenv
- Any possible changes to system Python won't affect daemon or its dependencies at all.
- Modern: Asynchronous Python
- Event-based is the key to success.
- D-bus and systemd watchdog pretty much nail this. Absolutely must be asynchronous.
- Packaging: Installation from RPM-package
- This is the only one I'll support for any foreseeable future.
- The package will contain all necessary parts, libraries and dependencies to run a self-contained daemon.
That's a tall order. Selecting only two or three of those is enough to add tons of complexity to my project. Also, I initially expected somebody else on The Net to be doing this same thing or something similar. Looks like I was wrong. Most systems programmers love sticking to their old habits, staying at the SysV-init state with their synchronous C / Perl daemons.
Scope / Target daemon
I've previously blogged about running my own email server and fighting spam. Let's automate a lot of those tasks and, while automating, create a Maildir monitor for the junk mail folder.
This is the project I wrote for that purpose: Spammer Blocker
The toolkit will query the AS number of the spam-sending SMTP server. Typically I'll copy/paste the IP address from SpamCop's report and produce a CIDR table for Postfix. The table will add headers to the email about to be stored, so that Procmail / Maildrop can act on them if needed. As the junk mail folder is constantly monitored, any manually moved mail will be processed as well.
Having these features brings your own Linux box's spam handling capabilities pretty close to any of those free-but-spy-on-everything services commonly used by everybody.
Addressing the list of requirements
Let's take a peek at what I did to meet the above requirements.
Systemd daemon
This is nearly trivial. See the service definition in spammer-reporter.service. What's in the file is your run-of-the-mill systemd service with the appropriate unit, service and install definitions. That triplet makes Linux run a systemd service as a daemon.
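The unit file itself isn't reproduced here, but pulling together the details mentioned later in this post (Type=notify, WatchdogSec=20s, Restart=on-failure, the capability bounding set and the /usr/libexec install location), a sketch of such a triplet would look roughly like this; the description and ExecStart path are my placeholders, see the actual spammer-reporter.service for the real thing:

[Unit]
Description=Spammer reporter, Maildir junk folder monitor

[Service]
Type=notify
# Placeholder path; the packaging section below explains the virtualenv location
ExecStart=/usr/libexec/spammer-block/bin/python -m spammer_reporter
WatchdogSec=20s
Restart=on-failure
CapabilityBoundingSet=CAP_AUDIT_WRITE CAP_DAC_READ_SEARCH CAP_IPC_LOCK CAP_SYS_NICE

[Install]
WantedBy=multi-user.target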
Python venv isolation
For any Python developer this is somewhat trivial. You create the environment box/jail, install the requirements via setup.py and that's it. You're done. This same isolation mechanism will be used later for packaging and deploying the ready-made daemon into a system.
What's missing, or to-do, is to start using a pyproject.toml. That is something I've yet to learn. Obviously there is always something. Nobody, nowhere, is "ready". Ever.
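For reference, the classic setup.py-era flow goes roughly like this; the target directory mirrors the /usr/libexec location used for packaging below, adjust to taste:

python3 -m venv /usr/libexec/spammer-block
/usr/libexec/spammer-block/bin/pip install --upgrade pip
# Install the daemon and its dependencies into the isolated environment
/usr/libexec/spammer-block/bin/pip install .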
Asynchronous code
Talking to the systemd watchdog and providing a service endpoint on the system D-bus requires a little bit of effort. Read: lots of it.
To get a D-bus service properly running, I'll first become asynchronous. For that I'll initiate an event loop with dbus.mainloop.glib. While there are multiple options for the event loop, that is the only one that actually works. The majority of Python code won't work with GLib; it needs asyncio. For that I'll use asyncio_glib to pair GLib's loop with asyncio. It took me a while to learn and understand how to actually achieve that. When successfully done, everything needed will run in a single asynchronous event loop. Great success!
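A minimal sketch of that pairing, assuming the dbus-python and asyncio_glib packages:

import asyncio

import asyncio_glib
from dbus.mainloop.glib import DBusGMainLoop

# Route dbus-python's incoming messages through the GLib main loop
DBusGMainLoop(set_as_default=True)
# Make asyncio schedule its tasks on that very same GLib loop
asyncio.set_event_loop_policy(asyncio_glib.GLibEventLoopPolicy())

async def main():
    ...  # D-bus service, Maildir watcher and watchdog task are started here

asyncio.get_event_loop().run_until_complete(main())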
With a solid foundation, as the main task I'll create an asynchronous task for monitoring filesystem changes and run it in a forever-loop. See inotify(7) for the non-Python mechanism and the asyncinotify library for details of the Pythonic version. What I'll be monitoring is users' Maildirs configured to receive junk/spam. When there is a change, a check is made to see if the change is about newly received spam.
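A sketch of the monitoring task using asyncinotify; the Maildir path and the spam-check handler are my placeholders:

from asyncinotify import Inotify, Mask

async def monitor_junk(maildir_new="/home/example/Maildir/.Junk/new"):
    with Inotify() as inotify:
        # New mail appears as files created in, or moved into, the directory
        inotify.add_watch(maildir_new, Mask.CREATE | Mask.MOVED_TO)
        async for event in inotify:
            check_if_new_spam(event.path)  # hypothetical handler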
As side tasks, there is a D-bus service provider, and if the daemon is running under systemd, the required watchdog handler is attached to the event loop as a periodic task. Out of the box, my service definition states max. 20 seconds between watchdog notifications (see service Type=notify and WatchdogSec=20s). In the daemon configuration file spammer-reporter.toml, I use 15 seconds as the interval. That 5 seconds should be plenty of headroom.
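The watchdog handler itself can be as simple as this sketch, assuming the python-systemd bindings:

import asyncio
from systemd import daemon

async def watchdog_keepalive(interval: float = 15.0):
    # Ping systemd comfortably within the WatchdogSec=20s deadline
    while True:
        daemon.notify("WATCHDOG=1")
        await asyncio.sleep(interval)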
The systemd.service documentation for WatchdogSec states the following:
If the time between two such calls is larger than the configured time, then the service is placed in a failed state and it will be terminated
For any failed service, there is the obvious Restart=on-failure.
If, for ANY possible reason, the process is stuck, the systemd scaffolding will take control of the failure and act instantly. That's resiliency and self-healing in my books!
Security: Capabilities
My obvious choice would be to not run as root. But as my main task is to provide a D-bus service while reading all users' mailboxes, there is absolutely no way of avoiding root permissions. Trust me! I tried everything I could find to grant nobody's process enough permissions to do all that. To no avail. See the documentation about the UID / GID rationale.
As standard root has waaaaaay too much power (especially if misapplied), I'll take some of it away. There is a still rarely used mechanism of capabilities(7) in Linux. My documentation of what CapabilityBoundingSet=CAP_AUDIT_WRITE CAP_DAC_READ_SEARCH CAP_IPC_LOCK CAP_SYS_NICE means in the system service definition is also in the source code. That set grants the process permission to monitor, read and write any user's files.
A couple of other super-user permissions are left, but most of the unneeded powers a regular root would have are stripped, though. If there is a security leak and my daemon is used to do something funny, quite a lot of the potential impact is already mitigated, as the process isn't allowed to do much.
Security: SElinux
For even more improved security, I define a policy in spammer-block_policy.te. All my systems are hardened and run SElinux in enforcing mode. If something leaks, there is a limiting box in place already. In 2014, I wrote a post about a security flaw with no impact on an SElinux-hardened system.
The policy will allow my daemon to:
- read and write FIFO files
- create new Unix sockets
- use STDIN, STDOUT and STDERR streams
- read files in /etc/ (note: read, not write)
- read I18n (internationalization) files on the system
- use capabilities
- use TCP and UDP sockets
- access D-bus sockets in /run/
- access the D-bus watchdog UDP socket
- access user passwd information on the system via SSSd
- read and search users' home directories, as mail is stored in them (note: not write)
- send email via SMTPd
- create, write, read and delete temporary files in /tmp/
The above list is a comprehensive description of the accesses needed in a system to meet the given task of monitoring received emails and acting on detected junk/spam. As the policy is very carefully crafted not to allow any destruction, writing, deletion or mangling outside /tmp/, in my thinking such hardening makes the daemon very secure.
Yes, in /tmp/ there is stuff that can be altered with potential security implications. But first you have to access the process. And while hacking the daemon, make sure to keep the event loop running, or systemd will zap the process within the next 20 seconds or less. I really did consider quite a few scenarios for if, and only if, something/somebody pops the cork on my daemon.
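For reference, a .te policy module like this is typically compiled and loaded with the stock SELinux userland tools:

checkmodule -M -m -o spammer-block_policy.mod spammer-block_policy.te
semodule_package -o spammer-block_policy.pp -m spammer-block_policy.mod
semodule -i spammer-block_policy.pp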
RPM Packaging
To wrap all of this into a nice package, I'm using rpmenv. This toolkit automatically wraps everything needed by the daemon into a nice virtualenv and deploys it to /usr/libexec/spammer-block/. See rpm.json for details.
The SElinux policy has its own spammer-block_policy_selinux.spec. Having these two in separate packages is mandatory, as the mechanisms to build them are completely different. Also, this is the typical approach in other pieces of software: not everybody has strict requirements to harden their systems.
Where to place an entire virtualenv in Linux? That one is a ball-buster. The RPM Packaging Guide really doesn't say how to handle your Python-based system daemons. Remember? Up there ↑, I tried explaining how all of this is rather novel and there isn't much information on The Net regarding it. However, I found somebody asking What is the purpose of /usr/libexec? on Stack Exchange and decided that libexec/ is fine for this purpose. I do install shell wrappers into /bin/ to make everybody's life easier; having the entire Python environment there wouldn't be sensible.
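Such a wrapper can be a two-liner; a sketch, with the entry-point module name being a placeholder:

#!/bin/sh
# Delegate to the Python interpreter inside the bundled virtualenv
exec /usr/libexec/spammer-block/bin/python -m spammer_reporter "$@"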
Final words
Only time will tell if I made the right design choices. I totally see Python- and Rust-based daemons gaining popularity in the future. The obvious difference is that Rust is a compiled language like Go, C and C++. Python isn't.
MacBook Pro - Fedora 36 sleep wake - part 2
Friday, September 30. 2022
This topic won't go away. It just keeps bugging me. Back in '19 I wrote about GPE06 and a couple of months ago I wrote about sleep wake. As there is no real solution in existence and I've kept using my Mac with Linux, I've come to the conclusion that they are in fact the same problem.
When I boot my Mac, log into Linux and observe what's going on, the following CPU hog can be seen in top:
RES SHR S %CPU %MEM TIME+ COMMAND
0 0 I 41.5 0.0 2:01.50 [kworker/0:1-kacpi_notify]
ACPI-notify will chomp quite a lot of CPU. As previously stated, all of this will go to zero if /sys/firmware/acpi/interrupts/gpe06 is disabled. Note how GPE06 and ACPI are intertwined: they have a cause and effect.
Also, doing what I suggested earlier and applying the acpi=strict noapic kernel arguments:
grubby --args="acpi=strict noapic" --update-kernel=$(ls -t1 /boot/vmlinuz-*.x86_64 | head -1)
... will in fact reduce GPE06 interrupt storm quite a lot:
RES SHR S %CPU %MEM TIME+ COMMAND
0 0 I 10.0 0.0 0:22.92 [kworker/0:1-kacpi_notify]
The storm won't be removed, but it is drastically reduced. Also, the aluminium case of the MBP will be a lot cooler.
However, changes made by running grubby won't stick. Fedora User Docs, System Administrator’s Guide, Kernel, Module and Driver Configuration, Working with the GRUB 2 Boot Loader tells the following:
To reflect the latest system boot options, the boot menu is rebuilt automatically when the kernel is updated or a new kernel is added.
Translation: when you install a new kernel, whatever changes you made with grubby won't carry over to the new one. To make things really stick, edit the file /etc/default/grub and have the GRUB_CMDLINE_LINUX line contain these ACPI changes as before: acpi=strict noapic
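In other words, something along these lines (the grub.cfg path varies between BIOS and UEFI setups, so check yours):

# /etc/default/grub: keep any existing arguments and append the ACPI ones
GRUB_CMDLINE_LINUX="... acpi=strict noapic"

# Regenerate the boot configuration so new kernels inherit the change
grub2-mkconfig -o /boot/grub2/grub.cfg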
Many people are suffering from this same issue. Example: Bug 98501 - [i915][HSW] ACPI GPE06 storm
Even this change won't fix the problem. A lot of CPU resources are still wasted. When you close the lid for the first time and open it again, this GPE06 storm miraculously disappears. What will also happen is that your next lid-open wake will take a couple of minutes. It seems the entire Mac is stuck, but it necessarily isn't (sometimes it really is); it just takes a while for the hardware to wake up. Without noapic, it never will. Also, reading the Freedesktop bug report, there is a hint of the problem's source being the Intel i915 GPU.
Hopefully somebody will direct some development resources to this. Linux on Mac hardware runs well, besides these sleep/sleep-wake issues.
MacBook Pro - Fedora 36 sleep wake
Thursday, August 25. 2022
A few years back I wrote about running Linux on a MacBook Pro. A year ago, OpenSUSE failed to boot on the Mac. After a little bit of debugging, I realized the problem wasn't in the hardware. That particular kernel update simply didn't work on that particular hardware anymore. Totally fair, who would be stupid enough to attempt using an 8-year-old laptop? Well, I do.
There aren't that many distros I use, and I had always wanted to try Fedora Workstation. It booted from USB and, unlike OpenSUSE Leap, it also booted when installed. So, ever since I've been running Fedora Workstation on an encrypted root drive.
One glitch, though: it didn't always wake from sleep. Quite easily, I found stories of MBPs not sleeping. Here's one: Macbook Pro doesn't suspend properly. Unlike that 2015 model, this 2013 puppy slept OK, but went into such a deep state that it had major trouble regaining consciousness. Pressing the power button for 10 seconds obviously recycled power, but it always felt like too big a cannon for such a simple task.
Checking what ACPI has at /proc/acpi/wakeup:
Device S-state Status Sysfs node
P0P2 S3 *enabled pci:0000:00:01.0
PEG1 S3 *disabled
EC S4 *disabled platform:PNP0C09:00
GMUX S3 *disabled pnp:00:03
HDEF S3 *disabled pci:0000:00:1b.0
RP03 S3 *enabled pci:0000:00:1c.2
ARPT S4 *disabled pci:0000:02:00.0
RP04 S3 *enabled pci:0000:00:1c.3
RP05 S3 *enabled pci:0000:00:1c.4
XHC1 S3 *enabled pci:0000:00:14.0
ADP1 S4 *disabled platform:ACPI0003:00
LID0 S4 *enabled platform:PNP0C0D:00
For those had-to-sleep cases, disabling XHC1 and LID0 did help, but made wakeup troublesome. While troubleshooting my issue, I tried whether disabling XHC1 and/or LID0 would make a difference. It didn't.
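For the record, toggling a wakeup source is done by writing its name back into the same file; each write flips the enabled/disabled state:

echo XHC1 > /proc/acpi/wakeup
echo LID0 > /proc/acpi/wakeup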
Also, I found it very difficult to find any detailed information on what those registered ACPI wakeup sources translate into. Lid is kinda obvious, but the rest remain relatively unknown.
While reading System Sleep States from Intel, a thought occurred to me: let's make this one hibernate to see if that would work. Sleep semi-worked, but I wanted to see if hibernation was equally unreliable.
Going for systemctl hibernate didn't quite go as well as I expected. It simply resulted in an error: Failed to hibernate system via logind: Not enough swap space for hibernation
With free, the point was made obvious:
total used free shared buff/cache available
Mem: 8038896 1632760 2424492 1149792 3981644 4994500
Swap: 8038396 0 8038396
For those not aware: modern Linux systems don't have disk-backed swap anymore. They have zram instead. If you're really interested, go study zram: Compressed RAM-based block devices.
To verify the previous, running zramctl displayed pretty much the above information in the form of:
NAME ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram0 lzo-rle 7.7G 4K 80B 12K 8 [SWAP]
I finally gave up on that after bumping into the article Supporting hibernation in Workstation ed., draft 3. It states the following:
The Fedora Workstation working group recognizes hibernation can be useful, but due to impediments it's currently not practical to support it.
Ok ok ok. Got the point, no hibernate.
Looking into the sleep wake issue more, I bumped into the thread Ubuntu Processor Power State Management. There, a merited user, Toz, suggested the following:
It may be a bit of a stretch, but can you give the following kernel parameters a try:
acpi=strict
noapic
I had attempted lots of options already, and that didn't sound too radical. Finding the active kernel file /boot/vmlinuz-5.18.18-200.fc36.x86_64, then adding the mentioned kernel arguments to GRUB2 with: grubby --args=acpi=strict --args=noapic --update-kernel=vmlinuz-5.18.18-200.fc36.x86_64 ... aaand a reboot!
To my surprise, it improved the situation. Closing the lid and opening it now works robustly. However, it does not solve the problem where the battery is nearly running out and I plug in the MagSafe. Any power input to the system taints sleep and it's back to the deep freeze. I'm happy about the improvement, though.
This is definitely a long shot, but still: if you have an idea how to fix Linux sleep wake on ancient Apple hardware, drop me a comment. I'll be sure to test it out.
ArchLinux - Pacman - GnuPG - Signature trust fail
Wednesday, July 27. 2022
In ArchLinux, this is what happens too often when you're running a simple upgrade with pacman -Syu:
error: libcap: signature from "-an-author-" is marginal trust
:: File /var/cache/pacman/pkg/-a-package-.pkg.tar.zst is corrupted (invalid or corrupted package (PGP signature)).
Do you want to delete it? [Y/n] y
error: failed to commit transaction (invalid or corrupted package)
Errors occurred, no packages were upgraded.
This error has occurred multiple times over the years and, by googling, it has a simple solution. Looks like that solution went sour at some point. Deleting obscure directories and running pacman-key --init and pacman-key --populate archlinux won't do the trick. I tried that fix, multiple times. Exactly the same error will be emitted.
The correct way of fixing the problem is running the following sequence (as root):
paccache -ruk0
pacman -Syy archlinux-keyring
pacman-key --populate archlinux
Now you're good to go for pacman -Syu and enjoy upgraded packages.
Disclaimer: I'll give you really good odds of the above solution eventually rotting as well. It does work at the time of writing, with archlinux-keyring 20220713-2.
Fedora 35: Name resolver fail - Solved!
Monday, April 25. 2022
One of my boxes started failing after working pretty OK for years. On dnf update it simply said:
Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-35&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org]
Whoa! Where did that come from?
This is one of my test / devel / not-so-important boxes, so it might have been broken for a while. I simply hadn't noticed it happening.
Troubleshooting
Step 1: Making sure DNS works
Test: dig mirrors.fedoraproject.org
Result:
;; ANSWER SECTION:
mirrors.fedoraproject.org. 106 IN CNAME wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 44 IN A 185.141.165.254
wildcard.fedoraproject.org. 44 IN A 152.19.134.142
wildcard.fedoraproject.org. 44 IN A 18.192.40.85
wildcard.fedoraproject.org. 44 IN A 152.19.134.198
wildcard.fedoraproject.org. 44 IN A 38.145.60.21
wildcard.fedoraproject.org. 44 IN A 18.133.140.134
wildcard.fedoraproject.org. 44 IN A 209.132.190.2
wildcard.fedoraproject.org. 44 IN A 18.159.254.57
wildcard.fedoraproject.org. 44 IN A 38.145.60.20
wildcard.fedoraproject.org. 44 IN A 85.236.55.6
Conclusion: The network works (obviously, I just SSHd into the box) and the box is capable of making DNS requests.
Step 2: What's in /etc/resolv.conf?
Test: cat /etc/resolv.conf
Result:
options edns0 trust-ad
; generated by /usr/sbin/dhclient-script
nameserver 185.12.64.1
nameserver 185.12.64.2
Test 2: ls -l /etc/resolv.conf
Result:
lrwxrwxrwx. 1 root root 39 Apr 25 20:31 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
Conclusion: Well, well. dhclient isn't supposed to overwrite /etc/resolv.conf if systemd-resolved is used. But is it?
Step 3: Is /etc/nsswitch.conf OK?
The previous step hinted that something was not OK. Just to cover all the bases, I need to verify the NS switch order.
Test: cat /etc/nsswitch.conf
Result:
# Generated by authselect on Wed Apr 29 05:44:18 2020
# Do not modify this file manually.
# If you want to make changes to nsswitch.conf please modify
# /etc/authselect/user-nsswitch.conf and run 'authselect apply-changes'.
...
# In order of likelihood of use to accelerate lookup.
hosts: files myhostname resolve [!UNAVAIL=return] dns
Conclusion: No issues there, all ok.
Step 4: Is systemd-resolved running?
Test: systemctl status systemd-resolved
Result:
● systemd-resolved.service - Network Name Resolution
Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; vendor pr>
Active: active (running) since Mon 2022-04-25 20:15:59 CEST; 7min ago
Docs: man:systemd-resolved.service(8)
man:org.freedesktop.resolve1(5)
https://www.freedesktop.org/wiki/Software/systemd/writing-network-configurat>
https://www.freedesktop.org/wiki/Software/systemd/writing-resolver-clients
Main PID: 6329 (systemd-resolve)
Status: "Processing requests..."
Tasks: 1 (limit: 2258)
Memory: 8.5M
CPU: 91ms
CGroup: /system.slice/systemd-resolved.service
└─6329 /usr/lib/systemd/systemd-resolved
Conclusion: No issues there, all ok.
Step 5: If DNS resolution is there, can systemd-resolved do any resolving?
Test: systemd-resolve --status
Result:
Global
Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Link 2 (eth0)
Current Scopes: LLMNR/IPv4 LLMNR/IPv6
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Test 2: resolvectl query mirrors.fedoraproject.org
Result:
mirrors.fedoraproject.org: resolve call failed: No appropriate name servers or networks for name found
Conclusion: Problem found! systemd-resolved and dhclient are disconnected.
Fix
Edit the file /etc/systemd/resolved.conf and have it contain the Hetzner recursive resolvers on a line:
DNS=185.12.64.1 185.12.64.2
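For clarity, the directive lives under the file's [Resolve] section, so the relevant fragment of /etc/systemd/resolved.conf becomes:

[Resolve]
DNS=185.12.64.1 185.12.64.2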
Make changes effective: systemctl restart systemd-resolved
Test: resolvectl query mirrors.fedoraproject.org. As everything worked and the correct result was returned, verify the systemd-resolved status: systemd-resolve --status
Result now:
Global
Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Current DNS Server: 185.12.64.1
DNS Servers: 185.12.64.1 185.12.64.2
Link 2 (eth0)
Current Scopes: LLMNR/IPv4 LLMNR/IPv6
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Success!
Why it broke? No idea.
I just hate these modern Linuxes where every single thing gets more layers of garbage added between the actual system and the admin. Ufff.
Databricks CentOS 8 stream containers
Monday, February 7. 2022
Last November I created CentOS 8 -based Databricks containers.
At the time of tinkering with them, I failed to realize my base was off. I simply used the CentOS 8.4 image available at Docker Hub. On later inspection that was a failure. Even for 8.4, the image was old and was going to be EOLd soon after. Now that 31st Dec 2021 had passed, I couldn't get any security patches into my system. To put it mildly: that's bad!
What I was supposed to be using, was the CentOS 8 stream image from quay.io. Initially my reaction was: "What's a quay.io? Why would I want to use that?"
[Image: Merriam-Webster dictionary definition of "quay"]
Thanks Merriam-Webster for that, but it doesn't help.
On a closer look, it turns out all the RedHat container stuff is not at docker.io; it's at quay.io.
Simple thing: update the base image, rebuild all Databricks images and done, right? Yup. Nope. The images built from stream didn't work anymore. Uff! They failed so badly that not even the Apache Spark driver was available. No querying driver logs for errors. A major fail, that!
Well. Seeing why the driver won't work should be easy, just SSH into the driver and take a peek, right? The operation is documented by Microsoft at SSH to the cluster driver node. Well, no! According to me and a couple of people asking questions like How to login SSH on Azure Databricks cluster, it is NOT possible to SSH into an Azure Databricks node.
Looking at the Azure Databricks architecture overview gave no clues on how to see inside a node. I started to think nobody had ever done it. Also, enabling diagnostic logging required the premium (high-priced) edition of Databricks, which wasn't available to me.
At this point I was in a full whatta-hell-is-going-on!? -mode.
Digging into documentation, I found out it was possible to run cluster node initialization scripts, and then I knew what to do next. As I knew it was possible to make requests to the Internet from a job running in a node, I could write an initialization script which, during execution, would dig me an SSH tunnel from the node being initialized into something I fully control. Obviously I chose one of my own servers, and from there SSH-tunneled back into the node's SSH server. Double SSH, yes, but then I was able to get an interactive session into the well-protected node. An interactive session is what all bad people would want on any machine they crack into. Tunneling credit to other people: quite a lot of my implementation details came from How does reverse SSH tunneling work?
To implement my plan, I crafted the following cluster initialization script:
#!/bin/bash
# Log all init-script output into DBFS for later inspection.
LOG_FILE="/dbfs/cluster-logs/$DB_CLUSTER_ID-init-$(date +"%F-%H:%M").log"
exec >> "$LOG_FILE"

echo "$(date +"%F %H:%M:%S") Setup SSH-tunnel"
mkdir -p /root/.ssh
cat > /root/.ssh/authorized_keys <<EOT
ecdsa-sha2-nistp521 AAAAE2V0bV+TrsFVcsA==
EOT

echo "$(date +"%F %H:%M:%S") Install and setup SSH"
dnf install openssh-server openssh-clients -y
/usr/libexec/openssh/sshd-keygen ecdsa
/usr/libexec/openssh/sshd-keygen rsa
/usr/libexec/openssh/sshd-keygen ed25519
/sbin/sshd

echo "$(date +"%F %H:%M:%S") - Add p-key"
cat > /root/.ssh/nobody_id_ecdsa <<EOT
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAA
1zaGEyLW5pc3RwNTIxAAAACG5pc3RwNTIxAAAAhQQA2I7t7xx9R02QO2
rsLeYmp3X6X5qyprAGiMWM7SQrA1oFr8jae+Cqx7Fvi3xPKL/SoW1+l6
Zzc2hkQHZtNC5ocWNvZGVzaG9wLmZpAQIDBA==
-----END OPENSSH PRIVATE KEY-----
EOT
chmod go= /root/.ssh/nobody_id_ecdsa

echo "$(date +"%F %H:%M:%S") - SSH dir content:"
ls -la /root/.ssh

echo "$(date +"%F %H:%M:%S") Open SSH-tunnel"
# Reverse tunnel: port 22222 on the remote box forwards back into this
# node's sshd. Outgoing port 443 is used, as plain TCP/22 is firewalled.
ssh -f -N -T \
    -R22222:localhost:22 \
    -i /root/.ssh/nobody_id_ecdsa \
    -o StrictHostKeyChecking=no \
    nobody@my.own.box.example.com -p 443
Note: The ECDSA keys above have been heavily shortened, making them invalid. Don't copy passwords or keys from the public Internet, generate your own secrets. Always! And if you're wondering, the original keys have been removed.
Note 2: My init-script writes its log into DBFS, see exec >> "$LOG_FILE" above.
My plan succeeded. I got in, did the snooping around, and then it took a couple of minutes before the Azure/Databricks plumbing realized the driver was dead, killed the node and retried the startup sequence. A couple of minutes was plenty of time to eyeball /databricks/spark/logs/ and /databricks/driver/logs/ and deduce what was going on and what was failing.
Looking at a simplified Databricks (Apache Spark) architecture diagram:
[Image: simplified Databricks (Apache Spark) architecture diagram]
The Spark driver failed to start because it couldn't connect to the cluster manager. Subsequently, the cluster manager failed to start as the ps command wasn't available. It was in good old CentOS, but removed from the stream base. As I made progress, the ip command turned out to be needed too. I added both and got the desired result: a working CentOS 8 stream Spark cluster.
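The fix itself is a one-line addition to the image build. A sketch of what the relevant Dockerfile part might look like; the base is the stream image from quay.io, and on CentOS 8 the procps-ng package provides ps while iproute provides ip:
FROM quay.io/centos/centos:stream8
# Restore the tools the cluster manager and driver expect:
RUN dnf install -y procps-ng iproute && dnf clean all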
Notice how I'm specifying the HTTPS port (TCP/443) in the outgoing SSH command (see: -p 443). In my attempts to get a session into the node, I deduced the following:
As Databricks runs in its own sandbox, outgoing traffic is heavily firewalled too. Any attempts to access SSH (TCP/22) are blocked. Both HTTP and HTTPS are known to work as exit ports, so I spoofed my SSHd there.
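On the receiving end this requires nothing fancy, just an sshd willing to answer on 443. A sketch of the sshd_config change on the my.own.box.example.com side, assuming no web server already occupies the port:
# /etc/ssh/sshd_config: listen on the normal port and on the disguised one
Port 22
Port 443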
There are a number of different containers. To clarify which one to choose, I drew this diagram:
[Image: diagram of the available Databricks container variants]
In my sparking, I'll need both Python and DBFS, so my choice is dbfsfuse. Most users would be happy with standard, but it only adds SSHd, which is known not to work. ssh has the exact same problem. The reason for them to exist is that in AWS SSHd does work. Among the changes from good old CentOS to stream is the lack of FUSE. The old one had FUSE even in minimal, but not anymore. From now on, you can access DBFS only with dbfsfuse or standard.
If you want to take my CentOS 8 brick-containers for a spin, they are still here: https://hub.docker.com/repository/docker/kingjatu/databricks, now they are maintained and get security patches too!
pkexec security flaw (CVE-2021-4034)
Wednesday, January 26. 2022
This is something Qualys found: PwnKit: Local Privilege Escalation Vulnerability Discovered in polkit’s pkexec (CVE-2021-4034)
In your Linux there are sudo and su, but many won't realize you also have pkexec, which does pretty much the same. More details about the vulnerability can be found at Vuldb.
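For the unfamiliar, pkexec usage mirrors sudo; a quick taste:
# Run a command as root via polkit; an authentication prompt appears first:
pkexec /usr/bin/id
uid=0(root) gid=0(root) groups=0(root)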
I downloaded the proof-of-concept by BLASTY, but found it not working in any of my systems. I simply could not convert a nobody into a root.
Another one, https://github.com/arthepsy/CVE-2021-4034, states "verified on Debian 10 and CentOS 7."
The various errors I got include simple non-functionality:
[~] compile helper..
[~] maybe get shell now?
The value for environment variable XAUTHORITY contains suscipious content
This incident has been reported.
Or one which stopped to ask for the root password:
[~] compile helper..
[~] maybe get shell now?
==== AUTHENTICATING FOR org.freedesktop.policykit.exec ====
Authentication is needed to run `GCONV_PATH=./lol' as the super user
Authenticating as: root
Hitting enter at the password prompt makes the following happen:
Password:
polkit-agent-helper-1: pam_authenticate failed: Authentication failure
==== AUTHENTICATION FAILED ====
Error executing command as another user: Not authorized
This incident has been reported.
Or one which simply doesn't get the thing right:
[~] compile helper..
[~] maybe get shell now?
pkexec --version |
       --help |
       --disable-internal-agent |
       [--user username] [PROGRAM] [ARGUMENTS...]
See the pkexec manual page for more details.
Report bugs to: http://lists.freedesktop.org/mailman/listinfo/polkit-devel
polkit home page: <http://www.freedesktop.org/wiki/Software/polkit>
Maybe there is a PoC which would actually compromise one of the systems I have to test with. Or maybe not. Still, I'm disappointed.
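Disappointed or not, any box still waiting for a patched polkit package deserves the stop-gap from the Qualys advisory: strip the SUID bit from pkexec so the exploit has nothing to escalate with.
# Temporary mitigation; reinstalling polkit restores the original mode:
chmod 0755 /usr/bin/pkexec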
Databricks CentOS 8 containers
Wednesday, November 17. 2021
In my previous post, I mentioned writing code into Databricks.
If you want to take a peek into my work, the Dockerfiles are at https://github.com/HQJaTu/containers/tree/centos8. My hope, obviously, is for Databricks to approve my PR so that those CentOS images would be part of the actual source code bundle.
If you want to run my stuff, they're publicly available at https://hub.docker.com/r/kingjatu/databricks/tags.
To take my Python-container for a spin, pull it with:
docker pull kingjatu/databricks:python
And run with:
docker run -it --entrypoint bash kingjatu/databricks:python
Databricks containers really don't have an ENTRYPOINT in them. This atypical configuration is because the Databricks runtime takes care of setting everything up and running the commands in the container.
As always, any feedback is appreciated.
MySQL Java JDBC connector TLSv1 deprecation in CentOS 8
Friday, November 12. 2021
Yeah, a mouthful. Running CentOS 8 Linux, a Java (JRE) connection to MySQL / MariaDB seems to run into trouble. I think this is a transient issue and eventually it will resolve itself. Right now the issue is real.
Here is the long story.
I was tinkering with Databricks. The nodes for my bricks were on CentOS 8 and I was connecting to a MariaDB in AWS RDS with MySQL Connector/J. As you've figured out, it didn't work! The following errors were in the exception backtrace:
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure
javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
Weird.
Going to the database with a simple CLI command (test run on OpenSUSE):
$ mysql -h db-instance-here.rds.amazonaws.com -P 3306 \
-u USER-HERE -p \
--ssl-ca=/var/lib/ca-certificates/ca-bundle.pem \
--ssl-verify-server-cert
... works ok.
Note: This RDS-instance enforces encrypted connection (see AWS docs for details).
Note 2: The term used by AWS is SSL. However, SSL was deprecated decades ago; the protocol actually used is TLS.
Two details popped out instantly: TLSv1 and TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA cipher. Both deprecated. Both deemed highly insecure and potentially leaking your private information.
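Those server-side details can be double-checked with plain OpenSSL; a sketch, assuming OpenSSL 1.1.1+ for its MySQL STARTTLS support:
# Probe which TLS versions the server accepts on the MySQL port:
openssl s_client -starttls mysql -tls1_2 \
    -connect db-instance-here.rds.amazonaws.com:3306 < /dev/null
# If -tls1_2 fails where -tls1 succeeds, the server end is the limiter.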
Why would anybody use those? Don't the MySQL/MariaDB/AWS people remove insecure stuff from their software? What! Why!
Troubleshooting started. First I found SSLHandShakeException No Appropriate Protocol on Stackoverflow. It contains a hint about JVM security settings. Then MySQL documentation 6.3.2 Encrypted Connection TLS Protocols and Ciphers, where they explicitly state "As of MySQL 5.7.35, the TLSv1 and TLSv1.1 connection protocols are deprecated and support for them is subject to removal in a future MySQL version." Well, fair enough, but the bad stuff was still there in AWS RDS. I even found Changes in MySQL 5.7.35 (2021-07-20, General Availability) which clearly states TLSv1 and TLSv1.1 removal is coming quite soon.
No amount of tinkering with jdk.tls.disabledAlgorithms in file /etc/java/*/security/java.security helped. I even created a simple Java tester to make my debugging easier:
import java.sql.*;

// Code from: https://www.javatpoint.com/example-to-connect-to-the-mysql-database
// 1) Compile: javac mysql-connect-test.java
// 2) Run: CLASSPATH=.:./mysql-connector-java-8.0.27.jar java MysqlCon
class MysqlCon {
    public static void main(String args[]) {
        try {
            Class.forName("com.mysql.cj.jdbc.Driver");
            Connection con = DriverManager.getConnection(
                "jdbc:mysql://db.amazonaws.com:3306/db", "user", "password");
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery("select * from emp");
            while (rs.next())
                System.out.println(rs.getInt(1) + " " + rs.getString(2) + " " + rs.getString(3));
            con.close();
        } catch (Exception e) {
            System.out.println(e);
            e.printStackTrace(System.out);
        }
    }
}
Hours passed by, but to no avail. Then I found the command update-crypto-policies. RedHat documentation Chapter 8. Security, 8.1. Changes in core cryptographic components, 8.1.5. TLS 1.0 and TLS 1.1 are deprecated contains a mention of the command:
As it did the trick, I followed up on it. In CentOS / RedHat / Fedora there is /etc/crypto-policies/back-ends/java.config, a symlink pointing to a file containing:
jdk.tls.ephemeralDHKeySize=2048
jdk.certpath.disabledAlgorithms=MD2, MD5, DSA, RSA keySize < 2048
jdk.tls.disabledAlgorithms=DH keySize < 2048, TLSv1.1, TLSv1, SSLv3, SSLv2, DHE_DSS, RSA_EXPORT, DHE_DSS_EXPORT, DHE_RSA_EXPORT, DH_DSS_EXPORT, DH_RSA_EXPORT, DH_anon, ECDH_anon, DH_RSA, DH_DSS, ECDH, 3DES_EDE_CBC, DES_CBC, RC4_40, RC4_128, DES40_CBC, RC2, HmacMD5
jdk.tls.legacyAlgorithms=
That's the culprit! It turns out any changes in the java.security file won't have any effect, as the policy is loaded later. Running the policy change to set it into legacy mode has the desired effect. However, running the ENTIRE system with such a bad security policy is bad. I only want to connect to RDS; why can't I lower the security for that connection only? Well, that's not how Java works.
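For the record, RedHat's OpenJDK builds do allow bypassing the system-wide crypto policy for a single process, which would have spared the rest of the box from LEGACY mode. A sketch, where relaxed.security is a hypothetical local file carrying a trimmed jdk.tls.disabledAlgorithms line:
# Ignore the system crypto policy for this one JVM only:
java -Djava.security.disableSystemPropertiesFile=true \
    -Djava.security.properties=./relaxed.security \
    -cp .:./mysql-connector-java-8.0.27.jar MysqlCon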
The entire troubleshooting session was way too much work. People! Get the hint already: no insecure protocols!
Wi-Fi 6 - Part 2 of 2: Practical wireless LAN with Linksys E8450
Sunday, August 15. 2021
There is a previous post in this series about wireless technology.
Wi-Fi 6 hardware is available, but uncommon. Since its introduction three years ago, it is finally gaining popularity. A practical example of a sometimes-difficult-to-obtain part is a USB dongle. Those have existed for at least 15 years now, yet there simply is none with Wi-Fi 6 capability.
An additional twist is thrown at me, a person living in the EU region. For some weird (to me) reason, manufacturers aren't getting their radio transmitters licensed in the EU, only in the US/UK. This makes Wi-Fi 6 appliances even less common here.
When I throw in my absolute non-negotiable requirement of running a reasonable firmware in my access point, I limit my options to almost nil. Almost! I found this in the OpenWRT Table-of-Hardware: Linksys E8450 (aka. Belkin RT3200). It is an early build considered a beta, but hey! All of my requirements align there, so I went for it on Amazon UK:
Wi-Fi 6 Access Point: Belkin RT3200
A couple of days waiting for UPS delivery, and here goes:
[Images: unboxing photos of the Belkin RT3200]
This is exactly what I wanted and needed! A four-port gigabit switch for wired LAN, an incoming Internet gigabit connector, and a 12 VDC / 2 A barrel connector for the transformer. Given UK power plugs are from the 1870s, they're widely incompatible with EU ones. Luckily manufacturers are aware of this and the box contains both UK and EU plugs in an easily interchangeable form. Thanks for that!
Notice how this is a Belkin "manufactured" unit. In reality it is a relabeled Linksys RT3200. Even the OpenWRT firmware is exactly the same. Me personally, I don't care what the cardboard box says as long as my Wi-Fi is 6, is fast and is secure.
Illustrated OpenWRT Installation Guide
The thing with moving away from vendor firmware to OpenWRT is that it can be tricky. It's almost never easy, so this procedure is not for everyone.
To achieve this, there are a few steps needed. Actual documentation is at https://openwrt.org/toh/linksys/e8450, but be warned: the amount of handholding there is low; for a newbie there are not many details. To elaborate the installation process, I'm walking through what I did to get OpenWRT running in the box.
Step 0: Preparation
You will need:
- Linksys/Belkin RT3200 access point
- Wallsocket to power the thing
- A computer with Ethernet port
- Any Windows / Mac / Linux will do, no software needs to be installed, all that is required is a working web browser
- Ethernet cable with RJ-45 connectors to access the access point's admin panel via LAN
- OpenWRT firmware from https://github.com/dangowrt/linksys-e8450-openwrt-installer
- Download files into a laptop you'll be doing your setup from
- Linksys-compatible firmware is at: https://github.com/dangowrt/linksys-e8450-openwrt-installer/releases, get openwrt-mediatek-mt7622-linksys_e8450-ubi-initramfs-recovery-installer.itb
- Also download optimized firmware openwrt-mediatek-mt7622-linksys_e8450-ubi-squashfs-sysupgrade.itb
- Skills and rights to administer your workstation, giving its Ethernet port a fixed IPv4 address from the 192.168.1.0/24 network (see the sketch after this list)
- Any other IPv4 address on that net will do, I used 192.168.1.10
- No DNS nor gateway will be needed for this temporary setup
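On a Linux laptop that temporary address can be set straight from the shell. A minimal sketch; the interface name eth0 is an assumption, check yours with ip link:
# Temporary address for talking to the AP; disappears at reboot:
ip addr add 192.168.1.10/24 dev eth0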
Make sure not to connect the WAN / Internet port into anything. The Big Net is scary; don't rush into that yet. You can do it later when all installing and setupping is done.
Mandatory caution:
If you just want to try OpenWrt and still plan to go back to the vendor firmware, use the non-UBI version of the firmware which can be flashed using the vendor's web interface.
Process described here is the UBI-version which does not allow falling back to vendor firmware.
Step 1: Un-box and replace Belkin firmware
After plugging the access point into a wall socket, flicking the I/O switch on, attaching an Ethernet cable to one of the LAN switch ports and the other end directly to a laptop, going to http://192.168.1.1 with your browser will display something like this:
[Image: Belkin out-of-box setup wizard]
What you need to do is try to exit the out-of-box-experience setup wizard:
[Images: setup wizard error and exit dialogs]
For the "Ethernet cable is not connected" you need to click Exit. When you think of the error message bit harder, if you get the message, your Ethernet IS connected. Ok, ok. It is for the WAN Ethernet, not LAN.
Notice how setup "did not complete successfully". That is fully intentional. Click "Do not set up". Doing that will land you on a login:
[Image: Belkin admin login screen]
This is your unconfigured admin / admin -scenario. Log into your Linksys ... erhm. Belkin.
Select Configuration / Administration / Firmware Upgrade. Choose File. Out of the two binaries you downloaded while preparing, go for the ubi-initramfs-recovery-installer.itb. That OpenWRT firmware file isn't from the manufacturer, but it is packaged in a way which makes it compatible, allowing easy installation:
[Image: firmware upgrade dialog with the recovery installer selected]
On "Start Upgrade" there will be a warning. Click "Ok" and wait patiently for couple minutes.
Step 2: Upgrade your OpenWRT recovery into a real OpenWRT
When all the firmware flashing is done, your factory firmware is gone:
[Image: OpenWRT recovery login]
There is no password. Just "Login". An OpenWRT welcome screen will be shown:
[Image: OpenWRT welcome screen]
Now that you're running OpenWRT, your next task is to go from recovery to the real thing. I'm not sure if I'll ever want to go back, but as recommended by the OpenWRT instructions, I did take backups of all four mtdblocks: bl2, fip, factory and ubi. This step is optional:
[Image: mtdblock backup downloads]
When you're ready, go for the firmware upgrade. This time select openwrt-mediatek-mt7622-linksys_e8450-ubi-squashfs-sysupgrade.itb:
[Images: flashing the sysupgrade image]
To repeat the UBI / non-UBI firmware difference: This is the UBI version. It is recommended as it has better optimization for layout and management of the SPI flash, but it does not allow falling back to vendor firmware.
I unchecked the "Keep settings and retain the current configuration" to make sure I got a fresh start with OpenWRT. On "Continue", yet another round of waiting will occur:
[Image: sysupgrade flashing progress]
Step 3: Setup your wireless AP
You have seen this exact screen before. Login (there is no password yet):
[Image: OpenWRT login]
Second time, same screen, but this time there is a proper firmware in the AP. Go set the admin account properly to get rid of the "There is no password set on this router" nag. Among all the settings, go to the wireless configuration to verify both 2.4 and 5 GHz radios are off:
[Image: wireless overview with both radios disabled]
Go fix that. Select "Edit" for the 5 GHz radio and you'll be greeted by a regular wireless access point configuration dialog. It will include a section about wireless security:
[Image: wireless security settings]
As I wanted to improve my WLAN security, I steered away from WPA2 and went for WPA3-SAE security. Supporting both at the same time is possible, but security-wise it isn't wise. If your system allows wireless clients to associate with a weaker solution, they will.
Also for security, check KRACK attack countermeasures. For more details on KRACK, see: https://www.krackattacks.com/
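For those configuring from the shell instead of LuCI, the same WPA3-SAE setup lives in /etc/config/wireless. A sketch; the radio name, SSID and key are placeholders:
config wifi-iface 'default_radio1'
        option device 'radio1'
        option mode 'ap'
        option ssid 'MyHomeWLAN'
        option encryption 'sae'
        option key 'use-a-long-passphrase-here'
        # SAE requires management frame protection:
        option ieee80211w '2'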
When you're done, you should see the radio enabled on a dialog like this:
[Image: wireless overview with the 5 GHz radio enabled]
Step 4: Done! Test.
That's it! Now you're running a proper firmware on our precious Wi-Fi 6 AP. But how fast is it?
[Images: speed test results on iPad and Android]
As I said, I don't have many Wi-Fi 6 clients to test with. On my 1 gig fiber, the iPad seems to be pretty fast. Also my Android phone's speed is ... well ... acceptable.
For that speed test I didn't even go for the "one foot distance" which manufacturers love to use. As nobody uses their mobile devices right next to their AP, I tested this in a real-life scenario where both the AP and I were located the way I would use the Internet in my living room.
Final words
After a three-year wait, Wi-Fi 6 is here! Improved security, improved speed, improved everything!
DynDNS updates to your Cloud DNS - Updated
Monday, July 12. 2021
There is a tool I've been running for a few years now. In 2018 I published it on GitHub and wrote a blog post about it. Later, I extended it to support Azure DNS.
As this code is something I do run in my production system(s), I do keep it maintained and working. Latest changes include:
- Proper logging via the logging module
- Proper setup via pip install .
- Library done as a proper Python package
- Python 3.9 support
- Rackspace Cloud DNS library via Pyrax maintained:
  - Supporting Python 3.7+
  - Keyword argument async renamed into async_call
- Improved documentation a lot!
  - Setup docs improved
  - systemd service docs improved
This long-running project of mine starts to feel like a real thing. I'm planning to publish it into PyPI later.
Enjoy!
Python 3.9 in RedHat Enterprise Linux 8.4
Tuesday, May 25. 2021
Back in 2018, RHEL 8 had its future-binoculars set on transitioning from deprecated Python 2 to 3 and was the first not to have a python command. Obviously the distro had Python, but the command didn't exist. See What, No Python in RHEL 8 Beta? for more info.
Last week 8.4 was officially out. RHEL has a history of being "stuck" in the past. They move to newer versions rarely, generating a feeling of obsoletion and staleness, being stable to the point of RedHat supporting otherwise obsoleted software themselves.
The only problem with that approach is the trouble of getting newer versions of any rapid-moving piece of software, like GCC or NodeJS or Python or MariaDB or ... any. The price to pay for stability is pretty steep with RHEL. Finally they have a solution for this and have made different versions of stable software available. I wonder why it took them that many years.
Seeing the alternatives:
# alternatives --list
ifup auto /usr/libexec/nm-ifup
ld auto /usr/bin/ld.bfd
python auto /usr/libexec/no-python
python3 auto /usr/bin/python3.6
As promised, there is no python, but there is python3. However, official support for 3.6 will end in 7 months. See PEP 494 -- Python 3.6 Release Schedule for more. As mentioned, RedHat is likely to offer their own support after that end-of-life.
Easy fix. First, dnf install python39. Then:
# alternatives --set python3 /usr/bin/python3.9
# python3 --version
Python 3.9.2
For options, see the output of dnf list python3*. You can choose between the existing 3.6, or install 3.8 or 3.9 alongside it.
Now you're set!
Behind the scenes: Reality of running a blog - Story of a failure
Monday, March 22. 2021
... or any (un)social media activity.
IMHO the mentioned "social" media isn't. There are statistics and research establishing the un-social aspect of it. The dopamine loop in your brain keeps feeding regular doses to keep a person addicted to an activity and leeching for more material. This very effectively disconnects people from the real world and makes them dive deeper into the rabbit hole of (un)social media.
What most dopamine-dosed viewers of any published material keep ignoring is the tip-of-an-iceberg phenomenon. What I mean is: a random visitor gets to see something amazingly cool, a video or picture depicting something very impressive, and assumes that person's life consists of a series of such events. Also, humans tend to compare. What that random visitor does next is compare the amazing thing to his/her own "dull" personal life, which does not consist of such an imaginary sequence of wonderful events. Imaginary, because reality is always harsh. As most of the time we don't know the real story, it is possible for 15 seconds of video footage to take months of preparation, numerous failures, reasonable amounts of money and a lot of effort.
An example of harsh reality: the story of me trying to get a wonderful piece of tech-blogging published.
I started tinkering with a Raspberry Pi 4B. That's something I've planned for a while; I ordered some parts and most probably will publish the actual story of the success later. Current status of the project: well planned, underway, but nowhere near finished.
What happened was that the Linux console output looked like this:
[Image: garbled Raspberry Pi console output]
That's "interesting" at best. Broken to say the least.
For debugging this, I rebooted the Raspi into the previous Linux kernel, 5.8, and ta-daa! Everything was working again. Most of you are running Raspbian, which has Linux 5.4. As I have the energy to burn into hating all of those crappy debians and ubuntus, my obvious choice is a Fedora Linux Workstation AArch64 build.
To clarify the naming: the ARM build of Fedora Linux is a community-driven effort; it is not run by Red Hat, Inc. nor The Fedora Project.
Ok, enough name/org -talk, back to Raspi.
When graphics in a Linux go that wrong, I always disable the graphical boot in the Plymouth splash screen. Running plymouth-set-default-theme details --rebuild-initrd will do the trick of displaying all-text at boot. However, it did not fix the problem on my display. Next came a string of attempts at all kinds of kernel parameter tinkering: deactivating the frame buffer, learning all I could about KMS (Kernel Mode Setting), attempting to build Raspberry Pi's userland utilities to gain insight into the EDID information just to realize they'll never build on a 64-bit Linux, and failing with nomodeset and vga=0 as kernel parameters. No matter what I told the kernel, the display would fail. Every. Single. Time.
It hit me quite late in troubleshooting. While observing the boot process, during its early stages everything worked and the display was un-garbled. Then later, when Fedora was starting system services, everything fell apart. Obviously something funny happened in that particular Linux build when the GPU driver for the Broadcom BCM2711 chip's VideoCore 4 (aka. vc4) was loaded. Creating file /etc/modprobe.d/vc4-blacklist.conf with contents blacklist vc4, preventing the VideoCore 4 driver from ever loading, did solve the issue! Yay! Finally found the problem.
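In shell form the workaround is a one-liner:
# Stop the broken vc4 KMS driver from ever loading:
echo 'blacklist vc4' > /etc/modprobe.d/vc4-blacklist.conf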
All of this took several hours, I'd say 4-5 hours of straight work. What happened next was surprising. Now that I had the problem isolated to the GPU driver, people on IRC's #fedora-arm channel said the vc4 HDMI output was a known problem and already fixed in Linux 5.11. Dumbfounded by this answer, I insisted 5.10 was the latest version and 5.11 wasn't available. They insisted back. A couple of hours before my asking, 5.11 had been deployed to mirror sites for everybody to receive. This happened while I was investigating, failing and investigating more.
dnf update, reboot and poof. Problem was gone!
There is no real story here. In pursuit of getting the thing fixed, it fixed itself with time. All I had to do was wait (which obviously I did not do). Failure after failure, but no juicy story on how to fix the HDMI output. In a typical scenario, this type of story would not get published. No sane person would shine any light on failure and time wasted.
However, this is what most of us do with computers. Fail, retry and attempt to get results. No glory, just hard work.