It was the Synthient threat data that ate most of my time this week, and it continues to do so now, the weekend after recording this video. Data like this is equal parts enormously damaging to victims and frustratingly noisy to process. I have to be confident that it's new enough, legit enough and impactful enough to justify loading, and that the value presented to breach victims sufficiently offsets the inevitable chorus of "what am I meant to do with this, tell me exactly what password was exposed for my record". It's an expensive exercise too; we're currently running an Azure SQL Hyperscale database at 80 cores to analyse the ~2 billion credential stuffing email addresses in this corpus. That's 2 billion unique email addresses too 😮 More on that in the next video; let's just work out if it's going to go live in the system first.
References
- Sponsored by: Report URI: Guarding you from rogue JavaScript! Don’t get pwned; get real-time alerts & prevent breaches #SecureYourSite
- We poured 183M email addresses from Synthient's threat data collection into HIBP (over 16M of those hadn't been seen by us before)
- We're now up to well over 17 billion monthly queries on Pwned Passwords (every month seems to add another billion... or so)
- I've had loads of good feedback on the PC build Gist (I've now sent that to a couple of local builders, I'll share the results)