I spent longer than I expected talking about Trello this week, in part because I don't feel the narrative they presented properly acknowledges their responsibility for the incident and in part because I think the impact of scraping in general is misunderstood. I suspect many of us are prone to looking at this in a very binary fashion: if the data is publicly accessible anyway, scraping it poses no risk. But in my view, there's a hell of a big difference between, say, looking at one person's personal info on LinkedIn via the browser and having a corpus of millions of records of the same data saved offline. That's before we even get into the issue of whether, in Trello's case, it should ever be possible for a third party to match an email address to a username and IRL name.
To add some more perspective, I posted a poll just before publishing this blog post. Let's see what the masses have to say:
Scraping: should we be concerned if an individual's personal data is scraped, aggregated en mass and redistributed if that same data is already publicly accessible on the service anyway? Vote and if possible, add more context in a reply.— Troy Hunt (@troyhunt) January 28, 2024
- Trello had 15M records scraped and posted publicly (somehow the narrative feels like it's pushing back on things that were never said to begin with)
- The "Mother of all Breaches"... which isn't (someone leaving their personal stash of existing breaches doesn't make everything re-breached)
- HIBP got a nice little shout-out from our MP for Cyber Security (I'm still fascinated at just how mainstream this little service has become 😊)