Decoding EU Digital COVID Certificate into human-readable format
Saturday, December 11. 2021
Since I started working with vacdec, something has been bugging me. I couldn't quite put my finger on it at first. When a reader asked me to decode some Unix timestamps from the CBOR payload into human-readable format, it hit me. The output needs to be easily understandable!
Now there exists a version with that capability (see: https://github.com/HQJaTu/vacdec for details).
Using Finnish government provided test data, this is the raw CBOR/JSON data:
{
"1": "FI",
"4": 1655413199,
"6": 1623833669,
"-260": {
"1": {
"v": [
{
"ci": "URN:UVCI:01:FI:DZYOJVJ6Y8MQKNEI95WBTOEIM#X",
"co": "FI",
"dn": 1,
"dt": "2021-03-05",
"is": "The Social Insurance Institution of Finland",
"ma": "ORG-100001417",
"mp": "EU/1/20/1525",
"sd": 1,
"tg": "840539006",
"vp": "J07BX03"
}
],
"dob": "1967-02-01",
"nam": {
"fn": "Testaaja",
"gn": "Matti Kari Yrjänä",
"fnt": "TESTAAJA",
"gnt": "MATTI<KARI<YRJAENAE"
},
"ver": "1.0.0"
}
}
}
Exactly the same data with the improved version of vacdec:
{
"issuer": "Finland",
"expiry:": "2022-06-16 20:59:59",
"issued:": "2021-06-16 08:54:29",
"Health certificate": {
"1": {
"Vaccination": [
{
"Unique Certificate Identifier: UVCI": "URN:UVCI:01:FI:DZYOJVJ6Y8MQKNEI95WBTOEIM#X",
"Country of Vaccination": "Finland",
"Dose Number": 1,
"Date of Vaccination": "2021-03-05",
"Certificate Issuer": "The Social Insurance Institution of Finland",
"Marketing Authorization Holder / Manufacturer": "Janssen-Cilag International",
"Medicinal product": "EU/1/20/1525: COVID-19 Vaccine Janssen",
"Total Series of Doses": 1,
"Targeted disease or agent": "COVID-19",
"Vaccine or prophylaxis": "COVID-19 vaccines"
}
],
"Date of birth": "1967-02-01",
"Name": {
"Surname": "Testaaja",
"Forename": "Matti Kari Yrjänä",
"ICAO 9303 standardised surname": "TESTAAJA",
"ICAO 9303 standardised forename": "MATTI<KARI<YRJAENAE"
},
"Version": "1.0.0"
}
}
}
Much easier on the eye. Raw data can still be displayed, but it is not the default. Use the switch --output-raw to get the original result.
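The timestamp conversion that triggered this change can be sketched in a few lines of Python. This is a minimal illustration, not the actual vacdec code: the function name and the output keys are made up for the example, and it assumes keys 1, 4 and 6 carry the issuer, expiry and issued-at claims as in the raw data above, with times rendered in UTC.

```python
from datetime import datetime, timezone


def humanize_claims(claims: dict) -> dict:
    """Turn numeric CBOR header claims into readable values (hypothetical helper)."""

    def to_utc(ts: int) -> str:
        # Unix timestamp (seconds since 1970-01-01) to a UTC date-time string
        return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

    return {
        "issuer": claims["1"],
        "expiry": to_utc(claims["4"]),
        "issued": to_utc(claims["6"]),
    }


# Header values copied from the Finnish test certificate above
raw = {"1": "FI", "4": 1655413199, "6": 1623833669}
print(humanize_claims(raw))
```

With the test data, the two timestamps come out as 2022-06-16 20:59:59 and 2021-06-16 08:54:29, matching the improved output above.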
There are comments in my Python code, but for those wanting to eyeball the specs themselves, go see https://github.com/ehn-dcc-development/ehn-dcc-schema and https://ec.europa.eu/health/sites/default/files/ehealth/docs/digital-green-certificates_v1_en.pdf for exact details of the CBOR header and payload fields. The certificate JSON schema describes all the value sets used.
Note: The JSON schema especially is a living thing. If you read this in the future, something might have changed.
Please, drop me a line when that happens.
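The renaming seen in the two JSON listings above boils down to two lookups: one for field names and one for value-set codes. Here is a rough sketch of that idea. The tables contain only the entries visible in this post, the names FIELD_LABELS, VALUE_SETS and humanize_vaccination are invented for the example, and a real decoder would load the complete value sets from the schema repository instead.

```python
# Labels and value-set entries copied from the example output in this post.
# The full value sets live in the ehn-dcc-schema repository.
FIELD_LABELS = {
    "ci": "Unique Certificate Identifier: UVCI",
    "co": "Country of Vaccination",
    "dn": "Dose Number",
    "dt": "Date of Vaccination",
    "is": "Certificate Issuer",
    "ma": "Marketing Authorization Holder / Manufacturer",
    "mp": "Medicinal product",
    "sd": "Total Series of Doses",
    "tg": "Targeted disease or agent",
    "vp": "Vaccine or prophylaxis",
}

VALUE_SETS = {
    "co": {"FI": "Finland"},
    "ma": {"ORG-100001417": "Janssen-Cilag International"},
    "mp": {"EU/1/20/1525": "EU/1/20/1525: COVID-19 Vaccine Janssen"},
    "tg": {"840539006": "COVID-19"},
    "vp": {"J07BX03": "COVID-19 vaccines"},
}


def humanize_vaccination(entry: dict) -> dict:
    """Replace short keys and coded values with readable text (hypothetical helper)."""
    readable = {}
    for key, value in entry.items():
        label = FIELD_LABELS.get(key, key)  # unknown keys pass through unchanged
        readable[label] = VALUE_SETS.get(key, {}).get(value, value)  # so do unknown codes
    return readable


print(humanize_vaccination({"tg": "840539006", "vp": "J07BX03", "dn": 1}))
```

Passing unknown keys and codes through untouched is a deliberate choice here: when the schema changes, the output degrades to raw values instead of failing.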
Passwords, December 2021 Follow up
Monday, December 6. 2021
This year, I've done quite a few posts about passwords. There are still a couple of days left before 2022 begins, so another piece about passwords is needed.
Most commonly used passwords
For a reason not really known to me, at the end of each year there is a recurring topic: lists of the most commonly used passwords. Here is one example, a Techspot article from November 2021: The most common passwords of 2021 are outright embarrassing. In the article, they refer to another article by Nordpass: Top 200 most common passwords. Nordpass seems to be the password manager software by NordVPN.
In the Nordpass article, you can select a list to look at. Behind the scenes, the front-end app loads the data as JSON from per-country lists. Example: the all-countries list has the URL https://nordpass.com/json-data/top-worst-passwords/findings/all.json. The one I, as a Finn, was most interested in was the list of most used Finnish passwords from https://nordpass.com/json-data/top-worst-passwords/findings/fi.json.
A bit of processing with jq extracts the top-30 list with usage counts:
jq -r '.[] | "\(.Rank): \(.Password) [\(.User_count)]"' all.json | head -30
Resulting:
1: 123456 [103170552]
2: 123456789 [ 46027530]
3: 12345 [ 32955431]
4: qwerty [ 22317280]
5: password [ 20958297]
6: 12345678 [ 14745771]
7: 111111 [ 13354149]
8: 123123 [ 10244398]
9: 1234567890 [ 9646621]
10: 1234567 [ 9396813]
11: qwerty123 [ 8933334]
12: 000000 [ 8377094]
13: 1q2w3e [ 8204700]
14: aa12345678 [ 8098805]
15: abc123 [ 7184645]
16: password1 [ 5771586]
17: 1234 [ 5544971]
18: qwertyuiop [ 5197596]
19: 123321 [ 5168171]
20: password123 [ 4681010]
21: 1q2w3e4r5t [ 4624323]
22: iloveyou [ 4387925]
23: 654321 [ 4384762]
24: 666666 [ 4329996]
25: 987654321 [ 4239959]
26: 123 [ 3606937]
27: 123456a [ 3493177]
28: qwe123 [ 3284938]
29: 1q2w3e4r [ 3197899]
30: 7777777 [ 3112046]
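For those who prefer Python over jq, roughly the same processing could look like this. It is only a sketch: it assumes the downloaded file is a JSON array of objects with the Rank, Password and User_count fields used in the jq filter above, and the function names are my own.

```python
import json


def format_entry(entry: dict) -> str:
    """Format one record as 'rank: password [count]' with a right-aligned count."""
    return f"{entry['Rank']}: {entry['Password']} [{entry['User_count']:>9}]"


def top_passwords(path: str, limit: int = 30) -> list[str]:
    """Read the JSON list from a local file and format its first entries."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return [format_entry(record) for record in records[:limit]]


# Two records copied from the list above, standing in for the downloaded file
sample = [
    {"Rank": 1, "Password": "123456", "User_count": 103170552},
    {"Rank": 2, "Password": "123456789", "User_count": 46027530},
]
for record in sample:
    print(format_entry(record))
```

To process the real file, download all.json first and call top_passwords("all.json").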
Ok. Now we learn that using 123456 is really common. Nordpass has recorded that password being used 103 million times.
Q: Where does this password data come from?
Yeah, I know the above list contains really commonly used passwords. Or does it?
From which data do we know the above list to contain the top-30 most commonly used passwords? Isn't it kinda suspicious for a password vault company to publish this kind of information? Do they know what passwords you use? How do they know that 123456 is being used over 100 million times?
There is no answer to this set of questions. Nordpass doesn't say, so I need to speculate and guess.
Harvesting the passwords people do use
According to John Wetzel of Recorded Future, passwords are leaked constantly.
Actually, dumps with leaked passwords are easily available on the net. Even I have millions of passwords from various leaks. That could be one source for measuring bad passwords: see which ones are leaked the most.
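Measuring "leaked the most" is nothing more than counting occurrences. A minimal sketch, assuming the dumps have already been reduced to one password per entry (the sample values below are invented for illustration, and this is a guess at the approach, not what Nordpass is known to do):

```python
from collections import Counter
from typing import Iterable


def most_leaked(passwords: Iterable[str], top: int = 3) -> list[tuple[str, int]]:
    """Count how often each password occurs and return the most frequent ones."""
    counts = Counter(p.strip() for p in passwords if p.strip())  # skip blank lines
    return counts.most_common(top)


# Invented sample standing in for the combined contents of several leaks
sample = ["123456", "qwerty", "123456", "password", "qwerty", "123456"]
print(most_leaked(sample))
```

Note that such a count only describes passwords that leaked, which is exactly the issue raised in the next question.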
Q: What if only bad passwords leak?
Either Nordpass cheats and they do know which passwords their customers use, OR this is a case of survivorship bias. The bad password ended up being hijacked, put into a database, leaked and picked up by a number of people wanting to see what passwords are being used, just because it was a poor one to begin with. Back in the day, there were cases where users' passwords
I don't know if that's the case. Not disclosing the source makes me wonder if Nordpass's ethics are bad or if their password manager is bad, allowing them to see what the password is.
Life after passwords
A lot of systems in use allow users to change their password and, while doing so, pick a super simple one. Another big problem is the lists of default passwords for devices on sale: as the defaults are commonly known, the password is easily guessable or effectively non-existent.
In the UK, they're fighting hard against default passwords: Ban on default passwords in new UK law. Nice! As a lot of device manufacturers choose to go the cheapest way, some legislation will be needed to smack some sense into them. If a device has a simple default password, selling it won't be allowed. We definitely should have that ruling in effect on the EU side too!
One mechanism to get rid of passwords is WebAuthn. As it doesn't seem to be getting traction, yet another initiative was launched: Decentralized Identity Foundation (DIF). What they're proposing is to kinda reverse the authentication problem and let you control your own data, allowing you to define which service will verify you're you to a website you're logging into. That would solve a bunch of problems if it were commonly used. However, DIF is such a new proposal that we don't know yet if it's going to fly or not.
EU Digital COVID Certificate, December 2021 edition
Sunday, December 5. 2021
Since my August blog post about vacdec, the utility to take a peek into COVID-19 passport internals, I've got lots of feedback and questions. The topic, obviously, touches all of us. As this global pandemic won't end, I've been following different events, occurrences and incidents around digital COVID certificates.
TV and Print media bloopers
To anybody working with software, computers, data, encoding and transmitting data, especially in the era of GDPR, it's second nature to classify data. There is public data, confidential data and data containing GDPR-covered personal details, to name three simple classes.
What's puzzling is how EU Digital COVID Certificate QR-codes were shown in the media. Personally, I contacted four different journalists along the lines of "Hey, you published all of your personal details in form of a QR-code. Please, retract and do not ever do that again." I talked with one journalist in person, and his comment was "I couldn't fathom this black-and-white blob should be kept secret!"
These leaks weren't by small and insignificant media. The codes I saw appeared in the 9 o'clock evening TV news (cinematographer's cert), on the website of a print publication (journalist's own cert), in an Internet news video (producer's cert) and in a printed newspaper (not sure whose cert that was).
As these bloopers aren't regularly seen anymore, it seems all of the media outlets have issued internal memos telling people not to display the QR-codes in a readable format anymore. I know for a fact of one media outlet's internal memo informing people not to publish real certs publicly. One positive change is that most of them are using sample certs when they really want to show a QR-code.
Brute-forcing ECDSA cryptography
In one instance, I bumped into a script kiddie's masterpiece. He was convinced of the possibility of brute-forcing the ECDSA-256 private key of one COVID Certificate issuer. I eyeballed the Python code and indeed, it approached the problem with a nearly-forever loop of picking 256 random bits and matching those bits against the public key. Nice! By carefully choosing the bits, that is how the thing works.
However, some suspicions arose. There was a comment in the discussion thread saying: "I don't think this code is working. My home computer has been crunching this for two weeks now and there doesn't seem to be any results."
Really! REALLY!!
These kids really have no clue.
Hints for brute-forcing ECDSA-crypto:
- If it were that easy, everybody would be cracking private keys.
- Don't do random bits; instead, go through all 2^256 combinations of the 256 bits in sequence. By going random, there is no mechanism to check if that combination has been attempted already.
- Go parallel: split the sequence into chunks and work on them simultaneously with multiple computers.
- Two to the 256th power is: 115792089237316195423570985008687907853269984665640564039457584007913129639936. If for some reason you come to a conclusion of that particular number being on the larger scale, your conclusion is correct. IT IS!
- If some magic lottery bounces your way and you happen to find the correct sequence, for the love of god, do not tell anybody else of your findings. It takes you a couple of gazillion years to brute-force the key, and the opposing party a couple of seconds to revoke it.
- Make sure to put in a couple of lottery tickets while at it. Chances are you'll become rich before finding the private key.
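To put the size of that number into perspective, here is a back-of-the-envelope calculation in Python. The guessing speed and the number of machines are pure assumptions picked for the example, and wildly optimistic ones for home computers.

```python
KEYSPACE = 2 ** 256  # number of possible 256-bit combinations
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_exhaust(keys_per_second: int, machines: int = 1) -> int:
    """Years needed to walk the whole keyspace in sequence, split evenly across machines."""
    total_rate = keys_per_second * machines
    return KEYSPACE // (total_rate * SECONDS_PER_YEAR)


print(KEYSPACE)
# Assumption: a million machines, each testing a trillion keys per second
print(f"{years_to_exhaust(10**12, machines=10**6):.3e} years")
```

Even with those made-up resources, the result has more than 50 digits, while the universe is roughly 1.4e10 years old.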
Forged Digital COVID Certificates
Somebody in Germany got their hands on an actual certificate-issuing private key and churned out a couple of fully valid and totally verifiable COVID certificates for different names. There were at least three different ones circulating that I know of. Here is one (notice the green color indicating validity):
This one targeted the Finnish customer pool and had the first name "Rokotepassieu", which translates as "Vaccination passport EU". The surname translated: "Contact me via Wickr" (for those who don't do Wickr, it's an AWS-owned messaging app with the highest standards of security).
According to the Swedish government's list of valid certificates, Germany has 57 sets of keys in use (disclaimer: at the time of writing this post). What the German government had to do was revoke the public key whose private key was misplaced. Thousands of people who had a valid COVID cert had to get theirs again with a different signing key.
This kind of incident/leakage obviously caught the attention of tons of officials, making the sellers' business dry up rather swiftly. Unfortunately, no details of this incident were disclosed. The general guess is that some underpaid person was doing darknet moonlighting on the side with his access to this rare and protected resource.
Leaked Digital COVID Certificates
In Italy, roughly a thousand COVID certs issued to actual people were made available. I downloaded and processed a couple of thousand files in four different leaks, only to find out most of the leaked certs were duplicates. I combined and de-duplicated the publicly available data. Here are the stats:
All leaked certs are still valid (disclaimer: at the time of writing this post). However, the Italian government has only three sets of keys in place, and replacing one would mean rendering 1/3 of all issued certificates invalid. They have chosen not to do that, most likely because the certs in circulation were harvested by somebody doing the checking with a malicious mobile app.
No surprises there. Most of them are for twice-vaccinated people. A couple of certs are for people vaccinated three times, test results and recoveries. What's sad is the fact that such a leak exists. These QR-codes are for real human beings and their data should be handled with care. Not cool.
What next
This unfortunate pandemic isn't going anywhere. More and more countries are expanding the use of COVID certs in daily life. For any cert checking to make sense, the cert must be paired with the person's ID card. The crypto math behind the certs is rock solid; humans are the weak link here.