Bluesky — Data Breach
major
A Hugging Face employee published a dataset of 1 million Bluesky posts scraped via the public Firehose API, including post text, metadata, and users' decentralized identifiers (DIDs). After immediate backlash, the dataset was removed within a day. Larger datasets quickly appeared, however, including one of nearly 300 million non-anonymized posts (roughly 42.5% of all Bluesky posts). Bluesky acknowledged that it could not enforce users' consent preferences outside its own systems.