New techniques that enable administrators of WhatsApp public groups to identify junk senders, and that automatically filter junk/spam messages for WhatsApp users, have been developed by Assistant Professor of Library and Information Science Kiran Garimella and his co-authors.
“Our methods are very practical and applicable,” Garimella said. “WhatsApp can apply them to stop the spread of spam in their groups, and our techniques can be used on the platform centrally while still respecting the end-to-end encryption guarantees WhatsApp offers users to protect their privacy.”
Their study, “Jettisoning Junk Messaging in the Era of End-to-End Encryption: A Case Study of WhatsApp,” the first to examine junk messages on WhatsApp public groups, has been accepted for publication by The Web Conference (WWW) 2022. Garimella and his co-authors will present the paper at the conference, which will be held virtually April 25–29, 2022.
Garimella’s newly developed methods for moderating WhatsApp public groups are a significant breakthrough because, unlike platforms such as email and Twitter, WhatsApp cannot read or moderate users’ content, which is protected by end-to-end encryption. While this protects users’ privacy, WhatsApp’s inability to moderate content means that spam and unwanted messages posted by junk senders can degrade users’ experience of public groups on the platform.
“WhatsApp has made some progress with blocking and deleting unwanted messages sent to individuals,” Garimella said, “but there was no solution for detecting users who send messages to WhatsApp groups, done at scale.”
To collect data for the study, Garimella and his co-authors examined public, politics-related WhatsApp groups in India. They analyzed 2.6 million messages from 5,051 such groups, looking at the content, URLs, and patterns of spam message posting over time.
Defining junk messages as “those which are not considered of interest or suitable by administrators for a group, leading such posters to be removed,” the researchers found that the prevalence of junk posted to these groups was much higher than they had anticipated. “We found that 1 in 10 messages posted to these public groups were junk messages,” Garimella said.
Eliminating unwanted messages is key to improving information consumption for people who are bombarded by spam, Garimella said, and to reducing the financial risk to users, as some junk senders aim to steal users’ credit card information.
Garimella said the researchers also found that spam senders in these public WhatsApp groups tend to post across many groups and to appear and disappear a number of times to avoid being detected and removed by administrators. These senders also spread the same spam messages over a small number of “active” days, a strategy that, according to the paper, “might improve the visibility of junk by providing a longer ‘shelf life’ in the recent messages.”
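The cross-posting pattern described above can be illustrated with a short sketch. This is not the method from the paper. It is a minimal illustration that assumes each post is available as a hypothetical (sender ID, group ID) pair, and the function names and the `min_groups` threshold are made up for this example.

```python
from collections import defaultdict


def groups_per_sender(posts):
    """Count the distinct groups each sender has posted in.

    `posts` is an iterable of (sender_id, group_id) pairs; this record
    shape is an assumption made for the sketch.
    """
    seen = defaultdict(set)
    for sender_id, group_id in posts:
        seen[sender_id].add(group_id)
    return {sender_id: len(groups) for sender_id, groups in seen.items()}


def flag_cross_posters(posts, min_groups=3):
    """Return senders active in at least `min_groups` distinct groups.

    The threshold is illustrative only; a real system would have to
    tune it against labeled data.
    """
    counts = groups_per_sender(posts)
    return sorted(s for s, n in counts.items() if n >= min_groups)


if __name__ == "__main__":
    sample = [
        ("a", "g1"), ("a", "g2"), ("a", "g3"),  # sender "a" posts in three groups
        ("b", "g1"), ("b", "g1"),               # sender "b" posts twice in one group
    ]
    print(flag_cross_posters(sample))
```

A count like this only surfaces candidates for an administrator to review, since a legitimate user may also belong to many groups.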
“We also discovered that a key indicator of the presence of junk are particular URLs and phone numbers,” Garimella said. “We show that these can be used for automatic detection, and we demonstrated that models can be trained to detect them. In addition, we found there are very simple approaches – simple classifiers – that can do the job of detecting junk messages well.”
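To make the idea of URL and phone-number indicators concrete, here is a minimal sketch of a rule-based check. It is not the classifier from the study. The regular expressions, the function names and the threshold are assumptions chosen for illustration, and real URL and phone formats vary far more widely than these patterns cover.

```python
import re

# Illustrative patterns only (assumptions, not taken from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def extract_features(message: str) -> dict:
    """Count URLs and phone-number-like strings in one message."""
    urls = URL_RE.findall(message)
    # Strip URLs first so digits inside links are not counted as phone numbers.
    text_without_urls = URL_RE.sub(" ", message)
    phones = PHONE_RE.findall(text_without_urls)
    return {"n_urls": len(urls), "n_phones": len(phones)}


def looks_like_junk(message: str, threshold: int = 1) -> bool:
    """Flag a message whose URL and phone counts reach `threshold`.

    A crude stand-in for a trained classifier; the default threshold
    is hypothetical and would flag any message containing a link.
    """
    feats = extract_features(message)
    return feats["n_urls"] + feats["n_phones"] >= threshold


if __name__ == "__main__":
    print(extract_features("Win a prize! Visit http://example.com/win or call +91 98765 43210"))
    print(looks_like_junk("See you at the rally tomorrow"))
```

In practice, counts like these would more likely serve as input features to a trained model than as a hard rule, since many legitimate messages also contain links.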
As part of a broad and highly committed effort to reduce the spam on WhatsApp public groups, Garimella and his co-authors are offering to share their annotated dataset and code with WhatsApp and to make them publicly available for other researchers to use.
Garimella’s co-authors include researchers from King’s College London, Telefónica Research, Queen Mary University of London, the University of Surrey and the Hong Kong University of Science and Technology.