Passwords, December 2021 Follow up
Monday, December 6. 2021
This year, I've done quite a few posts about passwords. There are still a couple of days left before 2022 begins, so another piece about passwords is needed.
Most commonly used passwords
For reasons not really known to me, at the end of each year there is a recurring topic: lists of the most commonly used passwords. Here is one example, a Techspot article from November 2021: "The most common passwords of 2021 are outright embarrassing". In the article, they refer to another article by Nordpass, "Top 200 most common passwords". Nordpass seems to be the password manager software by NordVPN.
In the Nordpass article, you can select a list to look at. Behind the scenes, the front-end app loads the data as JSON from per-country lists. Example: the all-countries list has the URL https://nordpass.com/json-data/top-worst-passwords/findings/all.json. The one I, as a Finn, was most interested in was the list of most used Finnish passwords from https://nordpass.com/json-data/top-worst-passwords/findings/fi.json.
A bit of processing with jq to extract the top-30 list with usage counts:
jq -r '.[] | "\(.Rank): \(.Password) [\(.User_count)]"' all.json | head -30
Resulting:
1: 123456 [103170552]
2: 123456789 [ 46027530]
3: 12345 [ 32955431]
4: qwerty [ 22317280]
5: password [ 20958297]
6: 12345678 [ 14745771]
7: 111111 [ 13354149]
8: 123123 [ 10244398]
9: 1234567890 [ 9646621]
10: 1234567 [ 9396813]
11: qwerty123 [ 8933334]
12: 000000 [ 8377094]
13: 1q2w3e [ 8204700]
14: aa12345678 [ 8098805]
15: abc123 [ 7184645]
16: password1 [ 5771586]
17: 1234 [ 5544971]
18: qwertyuiop [ 5197596]
19: 123321 [ 5168171]
20: password123 [ 4681010]
21: 1q2w3e4r5t [ 4624323]
22: iloveyou [ 4387925]
23: 654321 [ 4384762]
24: 666666 [ 4329996]
25: 987654321 [ 4239959]
26: 123 [ 3606937]
27: 123456a [ 3493177]
28: qwe123 [ 3284938]
29: 1q2w3e4r [ 3197899]
30: 7777777 [ 3112046]
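For those who would rather not use jq, the same extraction can be sketched in Python. This is only an illustration: it assumes the JSON is an array of objects with the Rank, Password and User_count keys used in the jq filter above, and it uses three records copied from the list instead of downloading the real file. The function name format_top is made up for this example. The counts are right-aligned to the widest value, the way they appear in the listing.

```python
import json

# Three records copied from the top of the list above. The real all.json is
# ASSUMED to be an array of objects with these same three keys.
SAMPLE_JSON = """[
  {"Rank": 1, "Password": "123456", "User_count": 103170552},
  {"Rank": 2, "Password": "123456789", "User_count": 46027530},
  {"Rank": 3, "Password": "12345", "User_count": 32955431}
]"""


def format_top(entries, limit=30):
    """Return 'rank: password [count]' lines with right-aligned counts."""
    top = entries[:limit]
    if not top:
        return []
    # str() so the code works whether the counts are numbers or strings.
    width = max(len(str(e["User_count"])) for e in top)
    return [
        f'{e["Rank"]}: {e["Password"]} [{str(e["User_count"]):>{width}}]'
        for e in top
    ]


lines = format_top(json.loads(SAMPLE_JSON))
print("\n".join(lines))
```

To run it against the downloaded file, replace json.loads(SAMPLE_JSON) with json.load() on an opened all.json.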
Ok. Now we've learned that using 123456 is really common. Nordpass has recorded that password being used over 103 million times.
Q: Where does this password data come from?
Yeah, I know the above list contains really commonly used passwords. Or does it?
From which data do we know that the above list contains the top-30 most commonly used passwords? Isn't it kinda suspicious for a password vault company to publish this kind of information? Do they know what passwords you use? How do they know that 123456 is being used over 100 million times?
There is no answer to this set of questions. Nordpass doesn't say, so I need to speculate and guess.
Harvesting the passwords people do use
According to John Wetzel of Recorded Future, passwords are leaked constantly.
Actually, dumps with leaked passwords are easily available on the net. Even I have millions of passwords from various leaks. That could be one source for measuring bad passwords: see which ones are leaked the most.
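Measuring "leaked the most" is basically a frequency count. Here is a minimal sketch of the idea in Python, assuming each dump is plain text with one password per line. The two tiny dumps are made-up stand-ins, not real leak data, and nothing here is known to be how Nordpass does it.

```python
from collections import Counter


def count_passwords(lines):
    """Count how many times each password appears in an iterable of lines."""
    counts = Counter()
    for line in lines:
        password = line.rstrip("\r\n")
        if password:  # skip blank lines
            counts[password] += 1
    return counts


# Made-up stand-ins for two leaked dumps, one password per line.
dump_a = ["123456\n", "qwerty\n", "123456\n", "hunter2\n"]
dump_b = ["123456\n", "password\n", "qwerty\n"]

# Counters can be added together, so per-dump counts merge into a total.
totals = count_passwords(dump_a) + count_passwords(dump_b)

top = totals.most_common(2)
for rank, (password, seen) in enumerate(top, start=1):
    print(f"{rank}: {password} [{seen}]")
```

With real dumps, an opened file object can be passed to count_passwords() instead of a list, as iterating a file yields its lines.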
Q: What if only bad passwords leak?
Either Nordpass cheats and they do know which passwords their customers use, OR this is a case of survivorship bias. The bad password ended up being hijacked, put into a database, leaked and picked up by a number of people wanting to see what passwords are being used, just because it was a poor one to begin with. Back-in-the-days there were cases where users' passwords
I don't know if that's the case. Not disclosing the source makes me wonder whether Nordpass's ethics are bad or whether their password manager is bad, allowing them to see what the passwords are.
Life after passwords
A lot of systems in use allow users to change their password and, while doing so, pick a super simple one. Another big problem is the lists of default passwords for devices on sale, which make a password easily guessable, or effectively non-existent, as the defaults are commonly known.
In the UK, they're fighting hard against such passwords: "Ban on default passwords in new UK law". Nice! As a lot of device manufacturers choose to go the cheapest way, some legislation will be needed to smack some sense into them. If a device has a simple default password, they won't allow it to be sold. We definitely should have that ruling in effect on the EU side too!
One mechanism to get rid of passwords is WebAuthn. As it doesn't seem to be gaining traction, yet another initiative was launched: Decentralized Identity Foundation (DIF). What they're proposing is to kinda reverse the authentication problem and let you control your own data, allowing you to define which service will verify you're you to a website you're logging into. That would solve a bunch of problems if it were commonly used. However, DIF is such a new proposal that we don't know yet if it's going to fly or not.
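A system could refuse the super simple ones at password change time. Below is a rough sketch of such a check in Python. The denylist holds a handful of entries from the top-30 list above, the minimum length of 8 is purely my assumption, and the name is_acceptable is invented for this example. A real service would want a much bigger list.

```python
# A few entries taken from the top-30 list earlier in this post.
COMMON_PASSWORDS = {
    "123456", "123456789", "12345", "qwerty", "password",
    "12345678", "1234567890", "qwerty123", "password1", "iloveyou",
}


def is_acceptable(candidate, denylist=COMMON_PASSWORDS, min_length=8):
    """Reject a new password if it is too short or on the denylist.

    min_length=8 is an assumed policy value, not a recommendation.
    """
    if len(candidate) < min_length:
        return False
    # Compare case-insensitively so "Password1" is caught as well.
    return candidate.lower() not in denylist


for attempt in ["123456", "Password1", "correct horse battery staple"]:
    verdict = "ok" if is_acceptable(attempt) else "rejected"
    print(f"{attempt}: {verdict}")
```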