Pwned Passwords, Version 5

Almost 2 years ago to the day, I wrote about Passwords Evolved: Authentication Guidance for the Modern Era. This wasn't so much an original work on my behalf as it was a consolidation of advice from the likes of NIST, the NCSC and Microsoft about how we should be doing authentication today. I love that piece because so much of it flies in the face of traditional thinking about passwords, for example:

  1. Don't impose composition rules (upper case, lower case, numbers, etc)
  2. Don't mandate password rotation (enforced changing of it every few months)
  3. Never implement password hints

And of most relevance to the discussion here today, don't allow people to use passwords that have already been exposed in a data breach. Shortly after that blog post I launched Pwned Passwords with 306M passwords from previous breach corpuses. I made the data downloadable and also made it searchable via an API, except there are obvious issues with enabling someone to send passwords to me even if they're hashed as they were in that first instance. Fast forward to Feb last year and with Cloudflare's help, I launched Pwned Passwords version 2 with a k-anonymity model. The data was all still downloadable if you wanted to run the whole thing offline, but k-anonymity also gave people the ability to hit the API without disclosing the original password. Subsequent updates to the corpus of breached passwords saw versions 3 and 4 arrive as more passwords flowed in from new breaches whilst the system also continued to grow and grow:

Today, after another 6 months of collecting passwords, I'm releasing version 5 of the service. During this time I collected 65M passwords from breaches where they were made available in plain text (I don't crack passwords for this service). Due to Pwned Passwords already having 551M records as of V4, increasingly new corpuses of passwords are actually adding very few new ones so V5 contributes an additional... 3,768,890 passwords. That may not seem like a lot in comparison, but my virtue of an entire half year passing I wanted to get the existing public set updated to the current numbers. It doesn't just add new ones though, those 65M occurrences all contribute to the exiting prevalence counts for passwords that have been seen before.

New passwords include such strings as "Mynoob" (seen 1,208 times), "Find_pass" (303 times) and "guns and robots" (134 times). There's often biases in password distribution due to the sources they're obtained from, for example the prevalence of the service's name or other attributes or relationships to the breached site.

The entire 555,278,657 passwords are now available for download if you're running the service offline. If you're using the k-anonymity API then there's nothing more to do - I've already flushed cache at Cloudflare so you're now getting the latest and greatest set of bad passwords. If you want to be sure you're getting the latest data via the API, check the "last-modified" response header has a July date rather than a January date.

And just while I'm here talking about updates to the corpus of Pwned Passwords, I'm really conscious that releases are happening on a half-yearly cadence which means a bunch of new passwords sit on my side for months before anyone can start black-listing them. This is one of the things that's high on my post-Project Svalbard list; I'd love to see a constant firehose of new passwords being integrated into this service. Not six-monthly, not monthly and frankly, not even weekly - I want to see passwords in there as soon as I get them. The shorter the period between a breached password entering circulation and it appearing in Pwned Passwords, the more impact the service can have on the scourge of credential stuffing. Stay tuned!

As time has passed and more organisations have implemented the service, there's been some really fantastic implementations come out of the community. I wrote about a bunch of them last year in my post on Pwned Passwords in Practice, but it's the work they've done at EVE Online that really stands out:

Obviously these are all some of my favourite things (HIBP, 1Password and Report URI), but it's the improvements made to the user selection of passwords that makes me particularly happy:

When we first implemented the check, about 19% of logins were greeted with the message that their password was not safe enough. Today, this has dropped down to around 11-12% and hopefully will continue to go down.

That's a massive drop that has a profoundly positive impact not just on the individuals using EVE Online, but to the company itself too. Account takeover attacks are a massive problem on the web today and if you reduce the proportion of customers using known bad passwords by up to 42%, you make a direct impact on the cost the organisation has to bear when dealing with the problem.

The NTLM hashes have been really well-received too as they've allowed organisations to quickly check the proportion of their Active Directory users with known bad passwords. Consistently, I'm hearing the results of this exercise are... alarming:

I've been really happy to see a bunch of community offerings appear around the NTLM hashes in particular. Most notable is this one by Ryan Newington:

What's great about this work is that not only can it stop people from making bad password choices in the first place, you'll see there's a reference towards the bottom that'll allow you to run it against your entire set of AD users on demand. And just like Pwned Passwords itself, it's 100% free and you can go and grab it all right now.

So that's Pwned Passwords V5 now live. Implement the k-anonymity API with a few lines of code or if you want to run it all offline, download the data directly. Either way, take it and do awesome things with it!

Have I Been Pwned Pwned Passwords
Tweet Post Update Email RSS

Hi, I'm Troy Hunt, I write this blog, create courses for Pluralsight and am a Microsoft Regional Director and MVP who travels the world speaking at events and training technology professionals