Discord — Data Breach
Executive Summary
Researchers at the Federal University of Minas Gerais (Brazil) published a dataset of over 2 billion Discord messages scraped from 3,167 public servers, spanning 2015–2024 and covering 4.7 million users. The researchers said they had anonymized the data, but the scraping nonetheless violated Discord's Terms of Service. A separate tool called 'Searchcord' appeared, built on non-anonymized data from a different scrape, compounding user privacy concerns.
What Happened
Researchers at the Federal University of Minas Gerais in Brazil scraped and published a dataset containing over 2 billion Discord messages from 3,167 public servers, covering communications from 4.7 million users between 2015 and 2024. The researchers claimed to have anonymized the data by replacing usernames with pseudonyms and hashing identifiers, but Discord confirmed the scraping violated its Terms of Service, which explicitly prohibit mining or scraping data without written consent. Separately, a programmer released a tool called Searchcord, which used a different dataset containing non-anonymized chat histories.
Who Is Affected
Any user who participated in public Discord servers between 2015 and 2024 may have messages included in the dataset, which covers approximately 10% of Discord's open servers and 4.7 million users. The researchers stated the data was anonymized, but experts note that anonymization can often be circumvented, particularly when conversations and message patterns can be pieced together to identify individuals.
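To illustrate why experts consider this kind of anonymization fragile, here is a minimal Python sketch. It is not the researchers' actual pipeline, which is not described here. The key, the sample messages, and the helper names (`pseudonymize`, `find_self_disclosures`) are all hypothetical. The sketch replaces user IDs with a keyed hash, then shows that the unchanged message text can still carry identifying details, and that a stable pseudonym links every message by the same author.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret key; a real pipeline would keep this private.
SECRET_KEY = b"hypothetical-secret"

def pseudonymize(user_id: str) -> str:
    """Replace a user ID with a stable keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

# Hypothetical sample messages, invented for illustration only.
messages = [
    {"user_id": "1001", "text": "anyone up for ranked tonight?"},
    {"user_id": "1002", "text": "sure, after work"},
    {"user_id": "1001", "text": "add me, same name as my twitch: pixel_otter"},
]

# "Anonymize": the identifier is hashed, but the message text is untouched.
anonymized = [
    {"author": pseudonymize(m["user_id"]), "text": m["text"]}
    for m in messages
]

# The same pseudonym appears on every message by one user,
# so an author's full history can be grouped back together.
by_author = defaultdict(list)
for m in anonymized:
    by_author[m["author"]].append(m["text"])

def find_self_disclosures(texts, markers=("my twitch", "my email", "my name is")):
    """Return messages containing simple self-identifying phrases."""
    return [t for t in texts if any(k in t.lower() for k in markers)]

# Pseudonyms whose own messages reveal an identifying detail.
leaks = {}
for author, texts in by_author.items():
    hits = find_self_disclosures(texts)
    if hits:
        leaks[author] = hits

for author, hits in leaks.items():
    print(author, "->", hits)
```

In this toy example, a single self-disclosing message ties a pseudonym to an outside account, and with it every other message grouped under that pseudonym. Real re-identification attempts would likely rely on richer signals such as writing style, timing, and cross-references between users.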
Why It Matters
This incident demonstrates how messages posted in public online spaces can be collected and published without user knowledge or consent, even when platform terms explicitly prohibit such activity. The scale of the dataset, covering nearly a decade of communications from millions of users, raises concerns about the effectiveness of anonymization techniques and the potential for re-identification of users through contextual analysis of their conversations.
What You Should Do
Be aware that messages posted in public Discord servers can be scraped and archived despite platform policies against such activity. Review your past activity in public servers and consider whether you are comfortable with those messages potentially being preserved in public datasets. Going forward, treat public Discord servers as truly public spaces where content may be collected and republished.
AI-Assisted
Event summaries are generated by Claude AI from verified sources and reviewed by humans before publication.
Sources