Post-passwords life: Biometrics for your PC
Monday, July 4. 2022
Last year I did a few posts about passwords (here is an example). The topic is getting worn out: we have established that passwords are a poor means of authentication, how easily they leak from unsuspecting users to bad people, and that you really should be using super-complex passwords stored in a vault. Personally, I don't think there are many interesting password avenues left to explore.
This year my sights are set on life after passwords: how we are going to authenticate ourselves and what we need to do to get there.
Biometrics. A "password" everybody of us carries everywhere and is readily available to be used. Do the implementation wrong, leak that "password" and that human will be in big trouble. Biometric "password" isn't so easy to change. Impossible even (in James Bond movies, maybe). Given all the potential downsides, biometrics still beats traditional password in one crucial point: physical distance. To authenticate with biometrics you absolutely, positively need to be near the device you're about to use. A malicious cracker from other side of the world won't be able to brute-force their way trough authentication unless they have your precious device at their hand. Even attempting any hacks remotely is impossible.
Eyeballing some of the devices and computers I have at hand:



The pics are from an iPhone 7, a MacBook Pro and a Lenovo T570: hardware that I use regularly, but rarely type a password on. There obviously exist other types of biometrics and password replacements, but I think you'll catch the general idea of life after passwords.
Then, looking at the keyboard of my gaming PC:

Something I use on a daily basis, but it really puzzles me why the Logitech G-513 doesn't have a fingerprint reader like most reasonable computer appliances do. Or, generally speaking, if not on the keyboard, why couldn't my self-assembled PC have the biometric reader most devices do? Why must I suffer from the lack of a simple, fast and reliable method of authentication? Uh??
Back in the day, fingerprint readers were expensive, bulky devices that weren't that accurate, and OS support was mostly missing or injected by modifying operating system files. Improvement in this area is something I'd credit Apple for: they made biometric authentication commonly available to their users, and when it became popular and sensor prices dropped, others followed suit.
So, I went looking for a suitable product. This is the one I ended up with:

A note: I do love their "brief" product naming! 
It is a Kensington® VeriMark™ Fingerprint Key supporting Windows Hello™ and FIDO U2F for universal 2nd-factor authentication. Pricing is reasonable; I paid 50€ for it. As I own other types of USB/Bluetooth security devices, what they're asking is on par with the market. I personally wouldn't want a security device that is "cheapest on the market"; I'd definitely go for a higher price range. My thinking is, this is the appropriate price range for these devices.
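A side note on the U2F part: FIDO U2F runs over standard USB HID, so a key like this should be visible to FIDO-capable software without any vendor drivers. A minimal sketch, assuming Yubico's python-fido2 package (pip install fido2); I haven't verified this against the VeriMark specifically:

# Enumerate FIDO-capable HID devices; a U2F key should show up in the list.
from fido2.hid import CtapHidDevice

for dev in CtapHidDevice.list_devices():
    print(dev)  # prints the descriptor of each FIDO authenticator found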
Second note: Yes, I ended up buying a security device from a company whose principal market is mechanical locks.

Here is one of those lock slots on the corner of my T570:

From left to right, there is an HDMI port, an Ethernet RJ-45 and a Kensington lock slot. You could bolt the laptop to a suitable physical object, making theft of the device really hard. Disclaimer: any security measure can be defeated, given enough time.
Back to the product. Here is what's in the box:


That is a very tiny USB device. Similarly sized items would be your Logitech mouse receiver or the smallest of WiFi dongles.
On a Linux box, running lsusb retrieves the following information:
Bus 001 Device 006: ID 06cb:0088 Synaptics, Inc.
Doing the verbose version, lsusb -s 1:6 -vv, makes tons more available:
Bus 001 Device 006: ID 06cb:0088 Synaptics, Inc.
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass        16
  bDeviceProtocol       255
  bMaxPacketSize0         8
  idVendor           0x06cb Synaptics, Inc.
  idProduct          0x0088
  bcdDevice            1.54
  iManufacturer           0
  iProduct                0
  iSerial                 1 -redacted-
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength       0x0035
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower              100mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           5
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0
      bInterfaceProtocol      0
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x01  EP 1 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval               4
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x84  EP 4 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval              10
Device Status:     0x0000
  (Bus Powered)
So, this "Kensington" device is ultimately something Synaptics made. Synaptics have a solid track-record with biometrics and haptic input, so I should be safe with the product of my choice here.
For non-Windows users, the critical thing worth mentioning here is: there is no Linux support. There is no macOS support. This is Windows-only. Apparently you can go back even to Windows 7, but sticking with 10 or 11 should be fine. A natural implication of being Windows-only: Windows Hello is mandatory (you should get the hint from the product name already).
Even without biometrics, I kinda catch the idea of Windows Hello. You can define a 123456-style PIN to log into your device, something very simple for anybody to remember. It's about physical proximity: you need to enter the PIN at the device; it won't work over the network. So that's kinda ok(ish), but with biometrics Windows Hello kicks into high gear. What I typically do is define a rather complex alphanumeric PIN for my Windows and never use it again. Once you go biometrics, you won't be needing it. Simple!
Back to the product. As these Kensington people aren't really software people, for installation they go with the absolute bare minimum. There is no setup.exe or anything any half-good Windows developer would be able to whip up. A setup executing pnputil -i -a synaWudfBioUsbKens.inf, built with free-of-charge tools like WiX, would be rather trivial to do. But noooo. Nothing that fancy! They just provide a Zip-file of Synaptics drivers and instruct you to right-click the .inf-file:

For Windows users not accustomed to installing device drivers like that: it is a fast, no-questions-asked process resulting in a popup:

When taking a peek into Device Manager:

My gaming PC has a biometric device in it! Whoo!
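As an aside, the "rather trivial" setup I grumble about above really boils down to a single driver-store call. A minimal sketch of what such an installer could do, written in Python purely for illustration and assuming the driver Zip has been unpacked into the working directory:

# Install the Synaptics driver package via pnputil (ships with Windows).
# Must be run from an elevated (Administrator) prompt.
import subprocess
import sys

result = subprocess.run(["pnputil", "-i", "-a", "synaWudfBioUsbKens.inf"])
sys.exit(result.returncode)  # a non-zero exit code signals a failed install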
Obviously this isn't enough; only half of the job is done. The other half is to enroll some of my fingers with the reader. Again, this isn't Apple, so the user experience (aka UX) is poor: there seems to be no way to list enrolled fingers or to remove/update them. I don't really understand the reasoning for this sucky approach by Microsoft. To move forward, go to Windows Settings and enable Windows Hello:

During the Windows Hello setup flow, you'll land at the crucial PIN question:

Remember to include letters and symbols; you don't have to stick with just numbers! Of course, if that suits your needs, you can.
After that you're set! Just hit ⊞ Win+L to lock your computer and test how easy it is to log back in. Now, looking at my G-513, it has the required feature my iPhone 7, MBP and Lenovo have:

Nicely done!
123RF Leak - Fallout
Sunday, July 3. 2022
Your data and mine are leaked left and right, all the time, by organizations who at the moment you enter your precious PII (Personally Identifiable Information) solemnly swear to take really good care of it. Still they fail at it. To adapt the famous speech by John F. Kennedy: "not because it is easy, but because it is hard." Even companies whose business is information security get hacked, or misconfigure their systems, or their human administrators fail at a minor thing causing a major disaster, and data leaks. Because it is very, very hard to protect your data. Let alone companies whose business is not information security: they simply want to conduct their business online, not focus on nurturing YOUR information. When a flaw in their systems or procedures is found by malicious actors, data will leak.
I don't think this is about to change anytime soon. Unfortunately.
Back in 2020, a site called 123RF.com was pwned. How, or by whom, is irrelevant: they failed at protecting MY data. More details are available in the Bleeping Computer article Popular stock photo service hit by data breach, 8.3M records for sale.
As anybody can expect, such leaks have negative effects. Today, that negativity manifests itself in my mailbox as follows:

A company which has nothing to do with me is sending email to my unique, 123RF-dedicated address, stating that my attempt to request password recovery on their online service has failed. Ok, interesting. Not cool.
However inconvenient this is to me, I fail to find the attacker's angle here. Sending 8.3 million password recovery attempts to a public endpoint is a far fetch: they might, by sheer luck, get access to somebody's account on the targeted site. Most likely not. But it's worth a try, as it costs them nothing and is very easy to do.
This exact same thing has happened to me multiple times. An example from 2018 is my blog post DocuSign hacked: Fallout of leaked E-mail addresses. Given all the time and effort I put into creating a mail system where I can have a unique address for every possible use case, it is easy to identify the leaking source. Many others and I thought the EU's GDPR was supposed to help every single Internet user with this, but still Wired writes in their May 2022 article How GDPR Is Failing (sorry about the paywall, I'm a subscriber).
I'd like to end this by expressing my generic despair and agony, not targeted at anybody in particular while still targeted at all stakeholders. It's a movie quote from the original 1968 Planet of the Apes, where at the end of the film Charlton Heston's character understands what happened and how he ended up in that mess:
Ah, damn you! God damn you all to hell!
FTC-case: Cafepress leak
Thursday, June 30. 2022
A couple of days ago I got an email to a really old address of mine. I happen to own a number of domains, and most of them are connected to the mail system on my home Linux server. Many, many years ago I stopped using a single email address, for the simple reason that I couldn't identify whom I had given that address to. This is a design flaw in archaic asynchronous messaging methods like snail mail, telephone calls or SMTP: you simply don't know who is contacting you or where they obtained your information.
To defeat this design flaw, I have a unique email address for every single service I ever hand my information to. The particular address in question pre-dates my system and is roughly 20 years old, so there is no way of knowing who originally leaked it or how many times it has been leaked. As I don't use it anywhere anymore, an email sent there is junk with 99.9% confidence.
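My actual mail system configuration is not described here, but for the curious: one common way to get per-service addresses without creating each one in advance is a catch-all alias. A minimal sketch, assuming Postfix and a dedicated domain; the domain and user are hypothetical:

# /etc/postfix/virtual -- hypothetical example, not my actual configuration.
# Every address under the domain lands in one local mailbox, so
# service-specific addresses work without any per-address setup.
@example.com    localuser

Enable the map in main.cf with virtual_alias_maps = hash:/etc/postfix/virtual and run postmap /etc/postfix/virtual to compile the lookup table.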
Why I'm writing about that email is obvious: it had very interesting content. The sender claimed to be (a company?) called CafePress, something I've never heard of. Another reason I kept investigating this maybe-spam was the subject: "Notice of FTC Settlement - 2019 Data Breach". Ok. The Federal Trade Commission of the USA was addressing me via CafePress. Really interesting!
First, the technical checks. To deduce whether this was spam or not, I always check the mail headers. This baby had both SPF and DKIM verifying correctly. Both technical measures are pretty much a must for anybody to accept the incoming set of bytes. Well, spammers know this too, and typically use Google or Microsoft as spam platforms via their free-of-charge mail offerings.
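For illustration, this kind of header check is easy to script. A minimal sketch using Python's standard email module, assuming the message is saved as an .eml file; the file name is hypothetical and the exact Authentication-Results layout varies by receiving MTA:

# Read a saved message and check the receiving server's SPF/DKIM verdicts.
from email import message_from_binary_file

with open("cafepress-notice.eml", "rb") as f:  # hypothetical file name
    msg = message_from_binary_file(f)

auth_results = msg.get("Authentication-Results", "")
print("SPF pass: ", "spf=pass" in auth_results)
print("DKIM pass:", "dkim=pass" in auth_results)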
As the technical details checked out, next up: the actual sending server. The mail originated from something called https://www.sparkpost.com/. Again, something I've never heard of. Ultimately this mail passed all technical checks, and my thinking started leaning towards this being the Real Deal.
Email contents:
Dear Valued Customer,
We are contacting you about the 2019 breach of your information collected by the prior owners of CafePress. This notice is about that breach, which you may have already been notified of.
We recently reached a settlement with the Federal Trade Commission, the nation's consumer protection agency, to resolve issues related to the 2019 data breach, and to make sure CafePress keeps your information safe.
What happened?
Before November 2019, CafePress didn't have reasonable practices to keep your information safe. When the company had a security breach, the following information about you may have been stolen: your email address, password, name, address, phone number, answers to your security questions, and the expiration date and last four digits of your credit card.
What you can do to protect yourself
Here are some steps to reduce the risk of identity theft and protect your information online:
<blah blah removed>
Sincerely,
Chris Klingebiel
General Manager, CafePress
Hm. Interesting.
Somebody in a company I've never heard of leaked my data in 2019 via a hack. Well, why did they have my information in the first place?!
Googling the topic for more details:

The article Commission orders e-commerce platform to bolster data security and provide redress to small businesses is available on FTC.gov. It was legit after all!
The press release mentioned the following: "Residual Pumpkin Entity, LLC, the former owner of CafePress, and PlanetArt, LLC, which bought CafePress in 2020". Whoa! Even more companies I've never heard of.
Some highlights:
According to the complaint, a hacker exploited the company’s security failures in February 2019 to access millions of email addresses and passwords with weak encryption; millions of unencrypted names, physical addresses, and security questions and answers; more than 180,000 unencrypted Social Security numbers; and tens of thousands of partial payment card numbers and expiration dates.
Obviously this CafePress (or whoever it is) didn't do much of a job protecting their stolen/bought/downloaded-from-DarkNet data. It's a good thing the FTC gave them a slap-on-the-wrist punishment, as the company is clearly crooked. Why on earth didn't the FTC do more follow-up on the origins of the data? I'd definitely love to hear how this CafePress got my data into their hands. I didn't volunteer it, that's for sure!
A really puzzling case, this. I suppose it speaks volumes about the modern state of the Internet.
Breaking the paywall - Follow up
Wednesday, June 22. 2022
Back in the day, roughly three years ago, I wrote a JavaScriptlet for breaking through a press website's paywall. At the time, I promised not to "piss" into their proverbial coffee pot and not to publish updated versions of my hack. I kept my word. It was surprisingly easy! They didn't change a thing. Until now.
This particular ICT-media outlet finally realized their clientele consists mostly of smart people with good computer skills. Just clicking a button on the shortcut bar of your chosen web browser after reading 5 articles was too easy. They finally went into full-blown hedgehog defense and enabled two measures:
- Without authentication, your IPv4 address is tracked:
  - No cookies required.
  - Doesn't care which browser on which computer you're using.
  - There is a very small number of free-of-charge articles you can read per day. I'd say 5.
  - Unfortunately their AWS setup doesn't have IPv6 enabled. With that I'd have many more IP addresses at my disposal.
- With authentication, the number of free-of-charge articles is limited. I'd guess to 10 of them.
For proper paywall breakage, lots of IPv4 addresses would be required. I do have some, but I'd rather not play this game anymore. I'll call it quits and stick with the headlines of their RSS feed. Obviously, they limit the feed to the last 10 articles, but if you keep reading and accumulating the data 24/7, you'll get all of them.
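The accumulation part is a few lines of code. A minimal sketch, assuming the feedparser package (pip install feedparser); the feed URL is a placeholder, not the actual site:

# Poll the RSS feed around the clock and remember every entry ever seen.
# The feed itself only carries the latest ~10 items; accumulation is on us.
import time
import feedparser  # pip install feedparser

FEED_URL = "https://ict-media.example/rss"  # hypothetical placeholder
seen: dict[str, str] = {}

while True:
    for entry in feedparser.parse(FEED_URL).entries:
        if entry.link not in seen:
            seen[entry.link] = entry.title
            print(entry.title)
    time.sleep(15 * 60)  # poll every 15 minutes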
Good job! One potential customer lost.
Google Analytics 4 migration
Monday, June 20. 2022
For a number of reasons, instead of migration I should write migraine. The first is about moving from an old system to a new one; the latter is a medical condition causing severe headaches. In this case, there doesn't seem to be much of a difference.
Google's own Make the switch to Google Analytics 4 migration guide has 12 steps. Given the complexity, the time I have to invest in this and my lack of really diving into it, I probably missed most of them.
Going into the GA4 console, I have this:

Clicking on Get Tagging Instructions, I get further information:

There is a snippet of JavaScript which I can inject into this site:
<script async="" src="https://www.googletagmanager.com/gtag/js?id=G-859QZJED94"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-859QZJED94');
</script>
Note: My GA4 stream ID is G-859QZJED94. Make sure to use yours, not mine. As the stream ID cannot be a secret, it's ok to publish it here; the same ID can be found in the HTML code of this page.
To verify the installation succeeded, I went to the GA4 portal and chose Reports, then Realtime. It didn't take long, less than a minute, for that report to indicate user activity. Many other places keep saying "No data received from your website yet", "No data received in past 48 hours" or "Data collection isn't active for your website". IMHO all of those are incorrect, as I do have stats. We'll see if I managed to do it right.
Megazoning (or Laser Tagging) part 2, Year 2022 edition
Sunday, May 22. 2022
Quite close to six years ago, I posted about my visit to Megazone. Now I got to go there again. This time there were more of us, and we had a private game for the company.
Round 1
This was me getting the look and feel of how this game worked.
Oh boy our team sucked! (Sorry my colleagues, not sorry.)
Obviously I was the best on our green team, but I ranked 11th out of 30. Looking at the scores, the red team was way too good for us, and the blue team had a couple of guys better than me.
Round 2
I'm not exactly sure how, but somehow this turned into base defense. The other teams were convinced they had to raid our base, and another member of my red team and I were almost constantly defending it.
This turn of events ended with me placing 4th out of 32 players. Actually, I didn't miss 2nd place by much.
Round 3
This round was full-on base defense, 100% of the time. There were 5 or 6 players of our blue team doing base defense. Way too many players on the other teams were keen on getting their 2000 pts from our base.
Btw, I'm pretty sure we fended them off on each attack wave.
Mentally, I'd love to do more free-roaming than area defense. I didn't enjoy this kind of game that much, so I simply gave up. I wasn't moving nearly as much as reasonable laser tag tactics dictate. Being stationary resulted in a below-median score and a position of 18th out of 30.
Overall
I can still get my kick out of round 2, zapping attackers left and right. That was fun!
However, I'll be waiting for the release of Sniper Elite 5 next week; I know I will enjoy video gaming more than IRL gaming.
Diverse spam
Thursday, May 5. 2022
It seems detecting and blocking email spam is easy. That's something I stated in my blog post back in 2020.
Here I'm introducing three types of spam I'm getting.
Spam from free email accounts
But wait! That's not new. We've seen email originating from Yahoo for many years.
What is new is the fact that an increasing number of spammers are creating tons of accounts on Gmail or Outlook.com. As both free-of-charge services limit the amount of email one can send in 24 hours to 500, quite a lot of accounts are needed. As an example, see Gmail's Limits for sending & getting mail.
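Back-of-the-envelope, just to show the scale (the campaign size below is a hypothetical figure, not from any source):

# How many throwaway accounts does a daily sending cap force a spammer to run?
campaign_mails_per_day = 1_000_000  # hypothetical campaign size
per_account_daily_cap = 500         # the Gmail / Outlook.com sending limit
print(campaign_mails_per_day // per_account_daily_cap)  # -> 2000 accounts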
The reason those two services are used for spamming is their good reputation: both are well accepted among postmasters. As an example, I won't block either of them, and I consider both reliable. As an end result, the majority of spam I receive comes from one of those two. To weed the bad ones from the good ones, traditional spam filtering will do the trick.
What's good is that both Microsoft and Google are really good at detecting ill behaviour and stopping the activity. However, criminals are really good at spending their 500 mails effectively and making their behaviour look as "natural" as possible. Neither service provider is fast/good enough to detect and stop the activity. Unfortunately.
SMS / iMessage spam
Getting spam via SMS isn't new, but getting junk from Apple iMessage is. Here is a sample:

For non-Finnish readers: the spam says I've been selected for a part-time job, with a daily salary ranging from 10 to 200 euros.
It seems this weird scam is active all over Europe; see iPhone users lament for spam messages via iMessage. Unlike Android, the iPhone doesn't have any mechanism for detecting SMS spam. Maybe it should?
Probably those spammers didn't even break anything (much). It is likely they've automated something to be able to send crap to everybody. I don't think the iMessage protocol is broken, nor Apple's authentication. All those jerks did was automate the thing to the hilt.
GPG spam
When spam arrives in a heavily encrypted email, you'll know it must be very important! Here is one I got, encrypted with my GPG key:

What's weird is the choice of language. That sample is in Burmese, but I got one in Hindi and one in Telugu, all of which point heavily to Asia, and specifically the Indian region. I have no idea why I got those, as I definitely need Google Translate to decipher them.
I have no idea how my address leaked, but the one they're using is from https://keyserver.pgp.com/, a directory where I chose to publish my PGP / GPG public key. Maybe somebody hacked that? Or just iterated over all possible keys. I dunno.
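Harvesting a key directory wouldn't even require a hack if it speaks the standard HKP lookup protocol. A minimal sketch of what such iteration could look like; note that I'm assuming a generic HKP-style /pks/lookup endpoint here, which this particular directory may or may not expose, and both the server name and search term are placeholders:

# Query a keyserver index in machine-readable form; the "uid" lines of the
# response carry the names and email addresses attached to published keys.
import requests  # pip install requests

resp = requests.get(
    "https://keyserver.example/pks/lookup",  # hypothetical HKP endpoint
    params={"op": "index", "options": "mr", "search": "example.com"},
    timeout=30,
)
print(resp.text)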
Wrap-up
I see all this as a beginning. Email in general is losing emphasis, and Discord / Teams / Slack are gaining. Spammers are getting really creative, and I'm sure criminals will diversify more. Back in the day, Skype spam was a real thing; see my article from 2016 on that.
Fedora 35: Name resolver fail - Solved!
Monday, April 25. 2022
One of my boxes started failing after working pretty ok for years. On dnf update, it simply said:
Curl error (6): Couldn't resolve host name for https://mirrors.fedoraproject.org/metalink?repo=fedora-35&arch=x86_64 [Could not resolve host: mirrors.fedoraproject.org]
Whoa! Where did that come from?
This is one of my test / devel / not-so-important boxes, so it might have been broken for a while; I simply hadn't noticed it happening.
Troubleshooting
Step 1: Making sure DNS works
Test: dig mirrors.fedoraproject.org
Result:
;; ANSWER SECTION:
mirrors.fedoraproject.org. 106 IN CNAME wildcard.fedoraproject.org.
wildcard.fedoraproject.org. 44 IN A 185.141.165.254
wildcard.fedoraproject.org. 44 IN A 152.19.134.142
wildcard.fedoraproject.org. 44 IN A 18.192.40.85
wildcard.fedoraproject.org. 44 IN A 152.19.134.198
wildcard.fedoraproject.org. 44 IN A 38.145.60.21
wildcard.fedoraproject.org. 44 IN A 18.133.140.134
wildcard.fedoraproject.org. 44 IN A 209.132.190.2
wildcard.fedoraproject.org. 44 IN A 18.159.254.57
wildcard.fedoraproject.org. 44 IN A 38.145.60.20
wildcard.fedoraproject.org. 44 IN A 85.236.55.6
Conclusion: the network works (obviously, I just SSHd into the box) and the capability of making DNS requests is ok.
Step 2: What's in /etc/resolv.conf?
Test: cat /etc/resolv.conf
Result:
options edns0 trust-ad
; generated by /usr/sbin/dhclient-script
nameserver 185.12.64.1
nameserver 185.12.64.2
Test 2: ls -l /etc/resolv.conf
Result:
lrwxrwxrwx. 1 root root 39 Apr 25 20:31 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf
Conclusion: Well, well. dhclient isn't supposed to overwrite /etc/resolv.conf if systemd-resolved is used. But is it?
Step 3: Is /etc/nsswitch.conf ok?
The previous step hinted that something was not ok. Just to cover all the bases, I need to verify the NS switch order.
Test: cat /etc/nsswitch.conf
Result:
# Generated by authselect on Wed Apr 29 05:44:18 2020
# Do not modify this file manually.
# If you want to make changes to nsswitch.conf please modify
# /etc/authselect/user-nsswitch.conf and run 'authselect apply-changes'.
d...
# In order of likelihood of use to accelerate lookup.
hosts: files myhostname resolve [!UNAVAIL=return] dns
Conclusion: No issues there, all ok.
Step 4: Is systemd-resolved running?
Test: systemctl status systemd-resolved
Result:
● systemd-resolved.service - Network Name Resolution
     Loaded: loaded (/usr/lib/systemd/system/systemd-resolved.service; enabled; vendor pr>
     Active: active (running) since Mon 2022-04-25 20:15:59 CEST; 7min ago
       Docs: man:systemd-resolved.service(8)
             man:org.freedesktop.resolve1(5)
             https://www.freedesktop.org/wiki/Software/systemd/writing-network-configurat>
             https://www.freedesktop.org/wiki/Software/systemd/writing-resolver-clients
   Main PID: 6329 (systemd-resolve)
     Status: "Processing requests..."
      Tasks: 1 (limit: 2258)
     Memory: 8.5M
        CPU: 91ms
     CGroup: /system.slice/systemd-resolved.service
             └─6329 /usr/lib/systemd/systemd-resolved
Conclusion: No issues there, all ok.
Step 5: If DNS resolution is there, can systemd-resolved do any resolving?
Test: systemd-resolve --status
Result:
Global
Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Link 2 (eth0)
Current Scopes: LLMNR/IPv4 LLMNR/IPv6
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Test 2: resolvectl query mirrors.fedoraproject.org
Result:
mirrors.fedoraproject.org: resolve call failed: No appropriate name servers or networks for name found
Conclusion: Problem found! systemd-resolved and dhclient are disconnected.
Fix
Edit the file /etc/systemd/resolved.conf, and have it contain Hetzner's recursive resolvers on one line:
DNS=185.12.64.1 185.12.64.2
Make changes effective: systemctl restart systemd-resolved
Test: resolvectl query mirrors.fedoraproject.org
As everything worked and the correct result was returned, verify systemd-resolved status: systemd-resolve --status
Result now:
Global
Protocols: LLMNR=resolve -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Current DNS Server: 185.12.64.1
DNS Servers: 185.12.64.1 185.12.64.2
Link 2 (eth0)
Current Scopes: LLMNR/IPv4 LLMNR/IPv6
Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Success!
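As a final sanity check, name resolution now works from application code too, not just via resolvectl. A quick one-liner in Python, for example:

# Resolve the host dnf was choking on; raises socket.gaierror on failure.
import socket
print(socket.gethostbyname("mirrors.fedoraproject.org"))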
Why did it break? No idea.
I just hate these modern Linuxes, where every single thing gets more layers of garbage added between the actual system and the admin. Ufff.
Gran Turismo 7
Thursday, March 31. 2022
I've taken a pause from blogging; lots of things have been happening on the other side of the blog. I've been doing tons of coding and maintaining my systems. Aaaand playing GT7:

I've played Gran Turismo since version 1 on the PSX, so I wasn't about to skip this one on my PS5. Actually, the points where I bought a PS2, a PS3 and a PS4 were when a new GT was released.
Quite many of my projects are in a much more stable state, and I think there is now more time to tell about them here. Let's see.
Sanuli-Konuli RPA-edition
Sunday, February 13. 2022
Previously I wrote a Wordle solver, Sanuli-Konuli. The command-line-based user interface isn't especially useful for non-programmers, so I took the solver a bit further and automated the heck out of it.

There is a sample YouTube video (13+ minutes) of the solver solving 247 puzzles in a row. Go watch it here: https://youtu.be/Q-L946fLBgU
If you want to run this automated solver on your own computer against Sanuli, some Python assembly is required: process the dictionary and make sure you can run Selenium from your development environment. The solver script is at https://github.com/HQJaTu/sanuli-konuli/blob/master/cli-utils/sanuli-solver.py
The logic I wrote earlier is rather straightforward and too naive. I added some Selenium magic to the solver to write a screenshot of every win (overwriting the previous one) and every failure (timestamped). A quick analysis of 5-6 failures revealed the flaw in my logic. As the above screenshot depicts, in a scenario where four out of five letters are known and two guesses are left, changing only one letter may or may not work; succeeding before running out of guesses depends on pure luck. An improved approach would be to detect this scenario and make a completely different kind of "guess": throw away those four known letters, determine the set of potential letters from the remaining candidate words, and find a dictionary word containing as many of those unknown letters as possible. Then one letter should come back green and the rest not, and a final guess with that letter should solve the puzzle. So, my logic is good up to the point where only one letter is missing; from there on, it's fully luck-based.
In the above game, guess #4 was "liima" with M not being correct. At that point of the puzzle, the potentially matching words "liiga", "liina" and "liira" would yield the potential letters G, N and R, so guess #5 would be "rangi" (found in the dictionary). Making such a guess would reveal the letter N as the correct missing letter, making guess #6 succeed with "liina".
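A sketch of what that smarter probe could look like; this is not implemented in the repo, the function name is made up and the tiny inline dictionary is only for the demo:

def probe_word(candidates, dictionary):
    """Pick a probe word covering as many still-undecided letters as possible.

    candidates: words still matching all clues, e.g. ["liiga", "liina", "liira"]
    dictionary: all valid 5-letter words
    """
    # Letters every candidate shares are already known; the rest decide the game.
    common = set.intersection(*(set(word) for word in candidates))
    undecided = {letter for word in candidates for letter in word} - common
    # Probe with the dictionary word hitting most of the undecided letters.
    return max(dictionary, key=lambda word: len(set(word) & undecided))

# With the game above, the undecided letters are G, N and R:
print(probe_word(["liiga", "liina", "liira"],
                 ["liiga", "liina", "liira", "rangi", "salko"]))  # -> "rangi"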
As I said, this logic has not been implemented and I don't think improving the guessing algorithm any further is beneficial. I may pick a new project to work on.
Standard disclaimer applies: If you have any comments or feedback, please drop me a line.
Databricks CentOS 8 stream containers
Monday, February 7. 2022
Last November I created CentOS 8 -based Databricks containers.
At the time of tinkering with them, I failed to realize my base was off. I simply used the CentOS 8.4 image available at Docker Hub. On later inspection that was a mistake. Even for 8.4, the image was old and was going to be EOLd soon after. Now that 31st Dec -21 had passed, I couldn't get any security patches into my system. To put it mildly: that's bad!
What I was supposed to be using, was the CentOS 8 stream image from quay.io. Initially my reaction was: "What's a quay.io? Why would I want to use that?"

Thanks Merriam-Webster for that, but it doesn't help.
On a closer look, it seems all Red Hat container stuff isn't at docker.io, it's at quay.io.
A simple thing: update the base image, rebuild all Databricks images and done, right? Yup. Nope. The images built from stream didn't work anymore. Uff! They failed so badly that not even the Apache Spark driver came up. No querying driver logs for errors. A major fail, that!
Well. Seeing why the driver won't start should be easy, just SSH into the driver and take a peek, right? The operation is documented by Microsoft at SSH to the cluster driver node. Well, no! According to me and a couple of people asking questions like How to login SSH on Azure Databricks cluster, it is NOT possible to SSH into an Azure Databricks node.
Looking at the Azure Databricks architecture overview gave no clues on how to see inside a node. I started to think nobody had ever done it. Also, enabling diagnostic logging required the premium (high-priced) edition of Databricks, which wasn't available to me.
At this point I was in a full whatta-hell-is-going-on!? -mode.
Digging into the documentation, I found out it was possible to run cluster node initialization scripts; then I knew what to do next. As I knew it was possible to make requests to the Internet from a job running in a node, I could write an initialization script which, during execution, would dig me an SSH-tunnel from the node being initialized into something I would fully control. Obviously I chose one of my own servers, and from there SSH-tunneled back into the node's SSH-server. Double SSH, yes, but that way I got an interactive session into the well-protected node. An interactive session is what all bad people would want into any of the machines they crack into. Tunneling-credit to other people: quite a lot of my implementation details came from How does reverse SSH tunneling work?
To implement my plan, I crafted the following cluster initialization script:
#!/bin/bash

# Log everything this script does into DBFS for later inspection.
LOG_FILE="/dbfs/cluster-logs/$DB_CLUSTER_ID-init-$(date +"%F-%H:%M").log"
exec >> "$LOG_FILE"

echo "$(date +"%F %H:%M:%S") Setup SSH-tunnel"
# Allow incoming root logins from my own box with this public key.
mkdir -p /root/.ssh
cat > /root/.ssh/authorized_keys <<EOT
ecdsa-sha2-nistp521 AAAAE2V0bV+TrsFVcsA==
EOT

echo "$(date +"%F %H:%M:%S") Install and setup SSH"
# The node image has no SSH; install it, generate host keys and start the daemon.
dnf install openssh-server openssh-clients -y
/usr/libexec/openssh/sshd-keygen ecdsa
/usr/libexec/openssh/sshd-keygen rsa
/usr/libexec/openssh/sshd-keygen ed25519
/sbin/sshd

echo "$(date +"%F %H:%M:%S") - Add p-key"
# Private key used for the outgoing tunnel connection.
cat > /root/.ssh/nobody_id_ecdsa <<EOT
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAA
1zaGEyLW5pc3RwNTIxAAAACG5pc3RwNTIxAAAAhQQA2I7t7xx9R02QO2
rsLeYmp3X6X5qyprAGiMWM7SQrA1oFr8jae+Cqx7Fvi3xPKL/SoW1+l6
Zzc2hkQHZtNC5ocWNvZGVzaG9wLmZpAQIDBA==
-----END OPENSSH PRIVATE KEY-----
EOT
chmod go= /root/.ssh/nobody_id_ecdsa

echo "$(date +"%F %H:%M:%S") - SSH dir content:"
echo "$(date +"%F %H:%M:%S") Open SSH-tunnel"
# Reverse tunnel: port 22222 on my own box forwards back into this node's SSHd.
# Outgoing port is 443, as the sandbox firewall blocks outgoing TCP/22.
ssh -f -N -T \
  -R22222:localhost:22 \
  -i /root/.ssh/nobody_id_ecdsa \
  -o StrictHostKeyChecking=no \
  nobody@my.own.box.example.com -p 443
Note: The above ECDSA keys have been heavily shortened, making them invalid. Don't copy passwords or keys from the public Internet, generate your own secrets. Always! And if you're wondering, the original keys have been removed.
Note 2: My init-script writes its log into DBFS, see exec >> "$LOG_FILE" about that.
My plan succeeded. I got in, did the snooping around, and then it took a couple of minutes until the Azure/Databricks plumbing realized the driver was dead, killed the node and retried the startup sequence. A couple of minutes was plenty of time to eyeball /databricks/spark/logs/ and /databricks/driver/logs/ and deduce what was going on and what was failing.
Looking at simplified Databricks (Apache Spark) architecture diagram:

The Spark driver failed to start because it couldn't connect to the cluster manager. Subsequently, the cluster manager failed to start as the ps command wasn't available. It was there in good old CentOS, but in the stream base image it was removed. As I made progress, the ip command turned out to be needed as well. I added both and got the desired result: a working CentOS 8 stream Spark cluster.
Notice how I'm specifying the HTTPS port (TCP/443) in the outgoing SSH command (see: -p 443). In my attempts to get a session into the node, I deduced the following:

As Databricks runs in its own sandbox, outgoing traffic is heavily firewalled too. Any attempts to access SSH (TCP/22) are blocked. Both HTTP and HTTPS are known to work as exit ports, so I parked my SSHd behind the HTTPS port.
There are a number of different containers. To clarify which one to choose, I drew this diagram:

For my sparking, I'll need both Python and DBFS, so my choice is dbfsfuse. Most users would be happy with standard, but it only adds an SSHd which is known not to work on Azure. ssh has the exact same problem. The reason they exist is that on AWS, SSHd does work. Among the changes from good old CentOS to stream is the lack of FUSE. The old one had FUSE even in minimal, but not anymore. From now on, you can access DBFS only with dbfsfuse or standard.
If you want to take my CentOS 8 brick-containers for a spin, they are still here: https://hub.docker.com/repository/docker/kingjatu/databricks, now they are maintained and get security patches too!
Sanuli-konuli - Wordle solver
Sunday, February 6. 2022
Wordle is a popular word game. In just a couple of months its popularity rose so fast that The New York Times bought the whole thing from Josh Wardle, the original author.
As always, a popular game gets the cloners moving. Another thing about Wordle is the English language: what if somebody (like me) isn't a native speaker? It is very difficult in the middle of a game to come up with a word like "skirl". There is an obvious need for localized versions. In Finland, such a clone is Sanuli. A word game, a clone of Wordle, but with Finnish words.
From a software engineering point of view, this is a simple enough problem to solve. Get a dictionary of words, filter out all 5-letter words, and store them in a list for filtering by game criteria. And that's exactly what I did in my "word machine", https://github.com/HQJaTu/sanuli-konuli/.
The code is generic to support any language and any dictionary.
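The heart of it is one filter function. A simplified sketch of the idea, using the same three clue arguments as the commands shown below (known greens, excluded grays, misplaced yellows); duplicate-letter corner cases are ignored here, and the function names are made up:

import random


def matches(word, greens, grays, yellows):
    """True if a 5-letter word fits all the clues so far.

    greens:  letters known to be in place, '.' for unknown, e.g. "s...."
    grays:   letters known not to be in the word, e.g. "ae"
    yellows: letters in the word but NOT at this position, '.' for none, e.g. ".l.k."
    """
    for pos, letter in enumerate(greens):
        if letter != "." and word[pos] != letter:
            return False
    if any(letter in word for letter in grays):
        return False
    for pos, letter in enumerate(yellows):
        if letter != "." and (letter not in word or word[pos] == letter):
            return False
    return True


def find_matching_word(words, greens, grays, yellows):
    """Random pick from all words still matching the clues."""
    return random.choice([w for w in words if matches(w, greens, grays, yellows)])

# e.g. find_matching_word(words, "s....", "ae", ".l.k.") could return "skirl"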
Example Wordle game

Attempt #1
To get the initial word, I ran:
./cli-utils/get-initial-word.py words/nltk-wordnet2021-5-words_v1.dat
The command's output was a randomly selected word: "mucor".
As seen above, that resulted in all gray tiles. No matches.
Attempt #2
Second attempt with excluded letters:
get-initial-word.py words/nltk-wordnet2021-5-words_v1.dat "mucor"
The command's output was a randomly selected word: "slake".
This time the game started bouncing my way! The first letter 'S' came up green, plus two yellow ones for 'L' and 'K'.
Attempt #3
The third attempt wasn't for an initial word; this time I had clues to narrow down the word. The command to do this with the above clues would be:
find-matching-word.py words/nltk-wordnet2021-5-words_v1.dat "s...." "ae" ".l.k."
Out of multiple possible options, a randomly selected word matching the criteria was: "skirl"
Nice result there! Four green ones.
Note: Later, when writing this blog post, I realized the command has an obvious flaw. I should have excluded all known gray letters, "mucorae". Had my command been:
find-matching-word.py words/nltk-wordnet2021-5-words_v1.dat "s...." "mucorae" ".l.k."
The ONLY matching word would have been "skill" at this point.
Attempt #4 - Win!
Applying all the green letters, the command to run would be:
find-matching-word.py words/nltk-wordnet2021-5-words_v1.dat "ski.l" "ar" "....."
This results in a single word: "skill" which will win the game!
Finally
I've been told this brute-force approach of mine takes tons of joy out of the word-game. You have to forgive me, my hacker brain is wired to instantly think of a software solution to a problem not needing one. 
pkexec security flaw (CVE-2021-4034)
Wednesday, January 26. 2022
This is something Qualys found: PwnKit: Local Privilege Escalation Vulnerability Discovered in polkit’s pkexec (CVE-2021-4034)
In your Linux there are sudo and su, but many won't realize you also have pkexec, which does pretty much the same. More details about the vulnerability can be found at Vuldb.
I downloaded the proof-of-concept by BLASTY, but found it not working on any of my systems. I simply could not convert a nobody into a root.
Another one, https://github.com/arthepsy/CVE-2021-4034, states "verified on Debian 10 and CentOS 7."
Various errors I could get include a simple non-functionality:
[~] compile helper..
[~] maybe get shell now?
The value for environment variable XAUTHORITY contains suscipious content
This incident has been reported.
Or one which would stop to ask for the root password:
[~] compile helper..
[~] maybe get shell now?
==== AUTHENTICATING FOR org.freedesktop.policykit.exec ====
Authentication is needed to run `GCONV_PATH=./lol' as the super user
Authenticating as: root
Hitting enter at the password prompt results in the following:
Password:
polkit-agent-helper-1: pam_authenticate failed: Authentication failure
==== AUTHENTICATION FAILED ====
Error executing command as another user: Not authorized
This incident has been reported.
Or one which simply doesn't get the thing right:
[~] compile helper..
[~] maybe get shell now?
pkexec --version |
--help |
--disable-internal-agent |
[--user username] [PROGRAM] [ARGUMENTS...]
See the pkexec manual page for more details.
Report bugs to: http://lists.freedesktop.org/mailman/listinfo/polkit-devel
polkit home page: <http://www.freedesktop.org/wiki/Software/polkit>
Maybe there is a PoC which would actually compromise one of the systems I have to test with. Or maybe not. Still, I'm disappointed.
EU Digital COVID Certificate usage in the World
Saturday, January 22. 2022
In my previous post, I gave some statistics about which countries have issued how many X.509 certificates for signing the COVID Certificates. I also realized how wide the adoption of this open-source system is among non-EU countries. Actually, in the list of countries using the system, there are more non-EU countries than EU countries.
As I wanted to learn how to programmatically alter SVG files (they are essentially XML), I decided to write some code. Here is the result:

The updated map is available at https://blog.hqcodeshop.fi/vacdec-map/, also in SVG format. The source code for creating all this is at: https://github.com/HQJaTu/vacdec-map
The process begins with importing flags of the World from the Wikipedia page Gallery of sovereign state flags (using the Wikipedia API, of course). There is one shortcoming: the Faroe Islands flag isn't in the list, as it is not a sovereign state. That one is downloaded separately.
Then the flags are mapped against the ISO-3166 country list. Again, some manual helping is needed; the list of flags doesn't use all of the exact country names from the ISO standard. The good thing is that this needs to be done only once. Flags and countries don't change that often.
Fetching the X.509 signing certificates is something my code has successfully done since August -21. So, a list of known COVID Certificate signing certificates is fetched. A simple iteration is done over those X.509 certificates to find each country's oldest date. That date is assumed to be the system adoption date. In reality it is not, but for this coarse visualisation that accuracy will do.
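Finding that oldest date is only a few lines with the cryptography package. A sketch, assuming a country's signing certs sit on disk as PEM files (the directory layout and function name are made up):

from pathlib import Path

from cryptography import x509


def adoption_date(cert_dir):
    """Oldest notBefore among a country's signing certs ~ adoption date."""
    dates = [
        x509.load_pem_x509_certificate(pem.read_bytes()).not_valid_before
        for pem in Path(cert_dir).glob("*.pem")
    ]
    return min(dates)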
Now there is a repository of all the flags with their appropriate ISO-3166-2 codes and all COVID signing certificates with their issuing dates. Then all country flags are placed on top of a World map, which has a second part at the bottom: a timeline of adoption dates for non-EU countries. Drawing EU countries there is uninteresting, as they pretty much did it all at the same time. The map SVG comes from Wikimedia Commons Carte du monde vierge (Allemagnes séparées).svg.
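As SVG is XML, placing a flag boils down to creating one element in the SVG namespace. A minimal sketch with the standard library; the coordinates and file names are made up, and this is not the exact code in the repo:

import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)

tree = ET.parse("world-map.svg")

# Place one flag as an <image> element at the country's map coordinates.
flag = ET.SubElement(tree.getroot(), f"{{{SVG_NS}}}image", {
    "x": "512", "y": "300", "width": "16", "height": "10",
    f"{{{XLINK_NS}}}href": "flags/fi.svg",
})

tree.write("world-map-with-flags.svg")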
For optional PNG rendering, I'm using Inkscape. There were failed attempts to do SVG rendering with CairoSVG, PyVips and Html2Image. The first two failed on accuracy: the rendering result was missing parts, had incorrect scales on various parts and was generally of low quality. Html2Image kinda worked, but wasn't applicable to run on a server.
EU Digital COVID Certificate trust-lists
Sunday, January 9. 2022
It seems vacdec is one of my most popular public projects. I'm getting pull requests, comments and questions from a number of people.
Thank you for all your input! Keep it coming. All pull requests, bug fixes and such are appreciated.
Briefly, this is about decoding EU digital COVID certificate QR-codes. For more details, see my blog post from August -21 or the Python source-code at https://github.com/HQJaTu/vacdec.
Quote from my original post: "the most difficult part was to get my hands into the national public certificates to do some ECDSA signature verifying".
My original code has a tool called fetch-signing-certificates.py to do the heavy lifting. It fetches a list of X.509 certificates which are used for signing the QR-code's payload. That list is stored on your hard drive. When you do any COVID certificate decoding, that list is iterated with the purpose of finding the key used when the QR-code was generated. If the trust-list contains the cert, the signature will be verified and the result reported.
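For the curious: the QR-code's COSE header carries a kid identifying the signing cert, and to my understanding the EU scheme derives it as the first 8 bytes of the SHA-256 over the DER-encoded certificate. A sketch of the lookup against the stored trust-list (function names are made up):

import hashlib

from cryptography import x509
from cryptography.hazmat.primitives import serialization


def kid_of(cert):
    """kid = first 8 bytes of SHA-256 over the DER-encoded certificate."""
    der = cert.public_bytes(serialization.Encoding.DER)
    return hashlib.sha256(der).digest()[:8]


def find_signer(kid, trust_list):
    """Return the trust-list cert matching the COSE header kid, or None."""
    return next((cert for cert in trust_list if kid_of(cert) == kid), None)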
To state the obvious: that precious trust-list will contain all signature certificates from all countries using the same EU COVID certificate QR-code system. The list of countries is long and includes all 27 EU countries. Among the participating non-EU countries are: Andorra, United Arab Emirates, Switzerland, Georgia, Iceland, Liechtenstein, Morocco, Norway, Singapore, San Marino, Ukraine and the Vatican. Looking at the country list, it is easy to see how the EU COVID passport is a triumph of open-source!
On the flipside, getting signing certificates from all those countries requires some effort. A participating country first needs to generate and secure (emphasis: SECURE the private key) the required number of ECDSA public/private key pairs for all organizations/parties doing the COVID certificate issuing. From those key pairs, COVID certificate signing X.509 certificates can be created, and the public certificate will be added to the common trust-list.
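To get a feel for what each issuing party does, here is a simplified sketch of generating such an ECDSA key pair and a signing certificate with the cryptography package. Self-signed for brevity; real signing certs are issued through a national CA, and the names here are illustrative:

import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID

# The private key is the part to keep SECURE.
key = ec.generate_private_key(ec.SECP256R1())

name = x509.Name([
    x509.NameAttribute(NameOID.COUNTRY_NAME, "FI"),
    x509.NameAttribute(NameOID.COMMON_NAME, "Example COVID signing cert"),
])
cert = (
    x509.CertificateBuilder()
    .subject_name(name)
    .issuer_name(name)
    .public_key(key.public_key())
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime.utcnow())
    .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=365))
    .sign(key, hashes.SHA256())
)
# The public certificate is what gets added to the common trust-list.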
As an example, in my home country of Finland, up until recently there used to be one (1, as in a single) X.509 cert for the entire nation of 5.6 million people; now there are two. Ultimately, pretty much everybody gets their COVID certs signed by a single cert. This is because in Finland issuing the certs is centralized. On the other hand, bigger countries need to have multiple X.509 certs, as their government organization may be different and less centralized. A good example is Germany, the most populous country in the EU and a federal state consisting of 16 constituent states. A single cert would not make sense for their government.
Moving on. Countries deliver their public certificates into some EU "central repository". It is not known to me how and where this is, but I'm assuming one has to exist. From this repository the participating countries are allowed to retrieve the trust-list for their verification needs. Subsequently, the list is distributed by each country to those organizations and people needing to do COVID certificate verification. Remember "get my hands into the national public certificates"? Not all countries do the distribution very easily nor publicly. Some do, and that's where I get my data from.
Now that there is access to trust-lists, factor in time. That little devil is always messing everything up!
In August, when I first downloaded the Austrian list, the entire list had 150+ valid signing certs. In early December -21 there were 192. In early January -22 there were already 236. Today, the Austrian list contains 281 signing certificates. However, the Swedish list has only 267. Argh! I'd really, really need access to that EU endpoint for trust-lists.
As the Austrian list is larger, here are the per-country statistics as of 9th January 2022; expect this to change almost daily:
Number of EU COVID certificate signing X.509 certificates issued by country

Country ISO 3166-2 code | Count signing certs | EU member country name | Participating country name | Test data exists
---|---|---|---|---
DE | 57 | Germany | | x
FR | 50 | France | | x
ES | 23 | Spain | | x
NL | 21 | Netherlands | | x
CH | 13 | | Switzerland | x
MT | 12 | Malta | | x
LU | 8 | Luxembourg | | x
GB | 6 | | UK |
LI | 6 | | Liechtenstein | x
NO | 6 | | Norway | x
IT | 4 | Italy | | x
NZ | 4 | | New Zealand |
LT | 3 | Lithuania | | x
LV | 3 | Latvia | | x
MC | 3 | | Monaco |
PL | 3 | Poland | | x
AL | 2 | | Albania |
AT | 2 | Austria | | x
BE | 2 | Belgium | | x
BG | 2 | Bulgaria | | x
CZ | 2 | Czech Republic | | x
DK | 2 | Denmark | | x
EE | 2 | Estonia | | x
FI | 2 | Finland | | x
FO | 2 | | Faroe Islands |
GE | 2 | | Georgia | x
HR | 2 | Croatia | | x
IE | 2 | Ireland | | x
IS | 2 | | Iceland | x
MK | 2 | | North Macedonia |
SE | 2 | Sweden | | x
SG | 2 | | Singapore | x
AD | 1 | | Andorra | x
AE | 1 | | United Arab Emirates | x
AM | 1 | | Armenia |
CV | 1 | | Cabo Verde |
CY | 1 | Cyprus | | x
GR | 1 | Greece | | x
HU | 1 | Hungary | |
IL | 1 | | Israel |
LB | 1 | | Lebanon |
MA | 1 | | Morocco | x
MD | 1 | | Moldova |
ME | 1 | | Montenegro |
PA | 1 | | Panama |
PT | 1 | Portugal | | x
RO | 1 | Romania | | x
RS | 1 | | Serbia |
SI | 1 | Slovenia | | x
SK | 1 | Slovakia | | x
SM | 1 | | San Marino | x
TG | 1 | | Togo |
TH | 1 | | Thailand |
TN | 1 | | Tunisia |
TR | 1 | | Turkey |
TW | 1 | | Taiwan |
UA | 1 | | Ukraine | x
UY | 1 | | Uruguay |
VA | 1 | | Vatican | x
Total | 281 | 27 | 32 | 38
Any test data provided by a participating country is at: https://github.com/eu-digital-green-certificates/dgc-testdata
Things to note from the above table:
- Countries have arranged their governments in a lot of different ways.
- More certificates --> more things to secure --> more things to be worried about.
- In Germany somebody got access to a cert and issued QR-codes not belonging to actual persons.
- A lot of countries outside the EU chose to use this open-source system.
- Expect more countries to join. There is no need for everybody to reinvent this system.
- Lack of test data
- In the EU, it is easy to make test data mandatory. Outside the EU, not so much.
- Way too many countries have not provided any sample certificates.
- Faroe Islands have 50k people, two certs.
- Okokok. It makes sense to have already issued two. If one cert leaks or gets "burned" for any reason, it is fast to re-issue the COVID certificates by using the other good cert.
- Country stats:
- Liechtenstein has 40k people, 6 certs! Whooo!
- Malta and Cabo Verde both have 500k people. The Maltese need 12 certs for all their signing! In Cabo Verde they make do with only one.
- The Vatican is the smallest country in the entire World. A single cert is needed for the 800+ persons living there.
- Italy has 60M people, yet they manage their passes with only 4 certs. Nice!
- In total, 59 countries participate in this common system. Adoption is wide; the system is well designed.
Also note how the "EU" COVID passport has more participating countries outside the EU than inside it.
A true testament to open-source, indeed!