Hackers Are So Fed Up With Twitter Bots They’re Hunting Them Down Themselves

As Twitter ramps up its efforts against fake accounts, researchers are spending their spare time devising algorithms to distinguish humans from bots.

Illustration: Angie Wang for The Intercept

Once a mere nuisance for Twitter, accounts created by software programs pretending to be human — “bots” — have become a major headache for the social network. In October, Twitter’s general counsel told a Senate committee investigating disinformation that Russian bots tweeted 1.4 million times during the run-up to the last presidential election, and such bots would later be implicated in hundreds of tweets that followed a school shooting in Florida. In January, the New York Times detailed how U.S. companies, executives, journalists, and celebrities often purchase bots as followers in an attempt to make themselves seem more popular.

The fallout for the company has been withering. In Vanity Fair last month, writer Nick Bilton, who has tracked the company closely as an author and journalist, accused Twitter of “turning a blind eye to the problem” of bots for years in order to artificially inflate its count of active users. Meanwhile, disgruntled former Twitter executives told Maya Kosoff, also in Vanity Fair, that the social network was throwing too many humans and too little technology at the problem of bots and other misbehavior. “You had this unsophisticated human army with no real scalable platform to plug into,” one said.

Even if Twitter hasn’t invested much in anti-bot software, some of its most technically proficient users have. They’re writing and refining code that can use Twitter’s public application programming interface, or API, as well as Google and other online interfaces, to ferret out fake accounts and bad actors. The effort, at least among the researchers I spoke with, has begun with hunting bots designed to promote pornographic material (a type of fake account that is particularly easy to spot), but the plan is to eventually broaden the hunt to other types of bots. The bot-hunting programming and research has been a strictly volunteer, part-time endeavor, but the efforts have collectively identified tens of thousands of fake accounts, underlining just how much low-hanging fruit remains for Twitter to pick.

Autodidacts at Automaton Detection

Among the part-time bot-hunters is French security researcher and freelance Android developer Baptiste Robert, who in February of this year noticed that Twitter accounts with profile photos of scantily clad women were liking his tweets or following him on Twitter. Aside from the sexually suggestive images, the bots had other traits in common. Not only did these Twitter accounts typically include profile photos of adult actresses, but they also had similar bios, followed similar accounts, liked more tweets than they retweeted, had fewer than 1,000 followers, and directed readers to click the link in their bios.
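The shared traits Robert describes lend themselves to simple rule-based checks. What follows is a minimal sketch of that idea, not Robert's actual code, which the article does not show: the `Account` fields, the `LINK_PHRASES` wording, and the `bot_signals` function are all hypothetical names chosen for illustration, and only three of the listed traits are covered.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical record for illustration; these are not Twitter API field names.
@dataclass
class Account:
    handle: str
    bio: str
    followers: int
    likes: int
    retweets: int


# Assumed phrasings for "click the link in my bio"; real bios will vary.
LINK_PHRASES = ("link in bio", "link in my bio", "click the link")


def bot_signals(account: Account) -> List[str]:
    """Return the names of the warning signs this account shows."""
    signals = []
    if account.followers < 1000:
        signals.append("fewer_than_1000_followers")
    if account.likes > account.retweets:
        signals.append("likes_exceed_retweets")
    bio = account.bio.lower()
    if any(phrase in bio for phrase in LINK_PHRASES):
        signals.append("bio_pushes_link")
    return signals
```

No single sign proves anything, since plenty of real people have fewer than 1,000 followers. A tool built this way would presumably flag an account only when several signs coincide, and would still need a way to handle false positives.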

One of the first accounts Robert looked at, which is now suspended, linked to the site datewith-me1.com, which was registered with a Russian email address also connected to instaflirtbook.com, hookupviplocators1.com, yoursexydream11.com, and sex4choice.com. Robert said it looked like various phishing sites had an identical schema and were likely operated by the same person.

So Robert decided to create a proof-of-concept bot to show his followers that finding these accounts is pretty easy. After determining a set of similarities between the bots, he used advanced Google queries and Google reverse image search to find more of them. He then wrote a few hundred lines of code in the programming language Python to create a bot to hunt down and expose fake accounts. “It took less than one hour to write the first version of @PornBotHunter,” Robert said. “The first idea was to show how easy it was to find these bots by just using Google search.” Every hour, the bot hunter tweets information on porn bots it has detected, including Twitter handles, profile pictures, number of followers, longevity of the profile, biographical links, and whether Twitter has suspended the account. It also posts lengthier reports about its activities to Pastebin, a text hosting site popular among security researchers. Robert also allows people to report false positives to his regular Twitter account.

Robert is quick to admit that the software is just a proof of concept. He plans to rewrite it to catch other types of bots as well, ranging from cryptocurrency bots to political bots. He also hopes to create a framework that’ll help people see how many bots are following them on Twitter. Once the project is stable and reviewed, he plans to release the code as open source.

Still, it’s fascinating that a tool put together in just an hour is catching bots before Twitter itself does. As of March 1, @PornBotHunter has listed 197 spammy, apparently fake accounts in Pastebin, and 66, roughly a third, have yet to be suspended by Twitter. The others were suspended soon after being indexed by Google or after Robert reported them to Twitter. (A handful of the remaining active accounts, which may have only been compromised temporarily, do not appear to be spam accounts run by bots.)

Top/Left: A visualization of websites linked by tweets from bots analyzed by Swedish research team Botjakten. Bottom/Right: A visualization of accounts mentioned in a set of 100 contemporaneous bot tweets examined by Botjakten. Image: Botjakten

Meanwhile, in Sweden, a group of five archivists, data journalists, and academic researchers also noticed that many Twitter users were getting inundated with automated accounts sharing sexually explicit links and images. They decided to analyze some of these porn bots through a project called Botjakten (“jakten” means “hunt” in Swedish). All of Botjakten’s code is up on GitHub.

Botjakten started in mid-January as a crowdsourced project using a simple Google form that allowed users to report suspected bots that had followed or retweeted them. That, along with individuals who reached out to share their blacklists, gave them about 5,000 bots. Although some people provided false information on the form, it gave them a great starting point to look for patterns in location, profile photos, account creation date, numbers of retweets, outbound links, and more. They then wrote software to identify still more bots, querying Twitter’s livestream API with different terms, including sexually explicit phrases and hashtags associated with bots, and visualize the bots’ activity, for example by showing the webpages they linked. (The visualization code was written in the JavaScript programming language, and the bot detection code in Python.)
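The two steps described here, matching tweets against tracked terms and tallying the webpages they link to, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not Botjakten's published code: the tweet dictionaries with `text` and `urls` keys are a made-up shape rather than Twitter's payload format, and `matches_terms` and `linked_domains` are hypothetical helper names.

```python
from collections import Counter
from urllib.parse import urlparse


def matches_terms(text, terms):
    """True if any tracked phrase or hashtag appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def linked_domains(tweets, terms):
    """Count the hostnames linked from tweets that match the tracked terms.

    Each tweet is assumed to be a dict with a "text" string and a "urls" list;
    that shape is hypothetical and chosen only for this example.
    """
    counts = Counter()
    for tweet in tweets:
        if not matches_terms(tweet["text"], terms):
            continue
        for url in tweet["urls"]:
            host = urlparse(url).hostname  # None when the URL has no scheme or host
            if host:
                counts[host] += 1
    return counts
```

A tally like this is the kind of input a link visualization needs: each hostname becomes a node, weighted by how many matching tweets pointed at it.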

After a few weeks, the 30,000 bots identified by the Botjakten group’s software suddenly ceased all activity, so they attempted to find more bots by examining accounts that followed those 30,000 bots. This produced 20,000 more bots to track. “It seemed likely that they would follow one another, and we were right about that assumption,” said Andreas Segerberg, who works at the Gothenburg City Archives and teaches digital preservation/archival theory at the University of Gothenburg when not working on the project. Twitter’s API limits the number of requests per minute, so the process was very slow and took about a month.
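The follower-based expansion Segerberg describes can be sketched as a single pass over the known bots. This is a rough illustration under stated assumptions rather than the team's implementation: `fetch_followers` and `looks_like_bot` are hypothetical callables supplied by the caller, and `pause_seconds` stands in for whatever delay is needed to stay under the API's request limit, a figure the article does not give.

```python
import time


def followers_that_look_like_bots(seeds, fetch_followers, looks_like_bot, pause_seconds=0.0):
    """Return accounts that follow a known bot and also look like bots themselves.

    seeds           -- iterable of already-identified bot handles
    fetch_followers -- hypothetical callable: handle -> iterable of follower handles
    looks_like_bot  -- hypothetical callable: handle -> bool
    pause_seconds   -- assumed delay between lookups to respect a rate limit
    """
    known = set(seeds)
    found = set()
    for account in known:
        for follower in fetch_followers(account):
            # Skip accounts already on the list; test everything else.
            if follower not in known and looks_like_bot(follower):
                found.add(follower)
        time.sleep(pause_seconds)  # one pause per follower lookup
    return found
```

With one lookup per known bot and a pause after each, run time grows linearly with the size of the seed list, which helps explain why a pass over 30,000 accounts was slow.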

Botjakten is now analyzing the data: mapping the online behavior of these porn bots, trying to understand where they originate, and gathering data on how widespread the issue is. The team hopes to determine what percentage of the bots have been suspended and what percentage were compromised accounts that have since been returned to their rightful owners.

But even though the team has paused the collection part of the project to focus on analysis, it has seen a second wave of bots. Botjakten plans to keep monitoring them and eventually publish its findings, then report the bots en masse to Twitter, Segerberg said.

Then, like Robert, they want to look at other data as well. “There’s an election coming up in Sweden in October; that will be something we will keep a close eye on,” said Segerberg.

Two sites tweeted by bots examined by Botjakten. Image: Botjakten

They’re still not done analyzing their data, but the Botjakten project has already identified similarities in porn bot accounts: profile images drawn from a small number of adult film actresses; a specific shared syntax for posting links that point to two websites with sexual content; and the use of obscure link shorteners whose URLs lead to questionnaire sites that then direct users to one of two dating websites. Aside from the redirects bouncing visitors from one place to another, some sites linked by the porn bots run cryptocurrency miners in users’ browsers, hijacking their computers’ processors to make money for scammers. Some accounts appear to be autogenerated for the sole purpose of spamming users as part of a bot network, while others appear to be hacked accounts from real users. (This can happen when people use bad passwords or reuse usernames and passwords on multiple accounts, since hackers can test those combinations out on multiple accounts after just one is compromised.)

Twitter Is “More Proactive Than They’ve Ever Been”

Although bots have clearly become a more prominent issue for Twitter over the past year and a half, freelance bot hunting has something of a history. Security analyst Rob Cook began hunting porn bots in 2016 and recently found about a fifth of the ones he discovered back then are still active. Also in 2016, security firm Symantec discovered that 2,500 popular Twitter accounts were hijacked by porn bots within a two-week span; profile photos were replaced with sexually suggestive images, display names and bios were changed, and tweets included links to sketchy adult dating sites. Research released in March 2017 estimated that a whopping 9 to 15 percent of active Twitter accounts were bots. Information security journalist Joseph Cox followed a trail of Twitter porn bots for Vice’s Motherboard a few months after that and found “a network of over a dozen interlinked dodgy-looking dating websites.”

The ubiquity of these accounts and the relative ease with which users and researchers alike can find them raise the question of why Twitter seems to be two steps behind. When reached for comment, a Twitter spokesperson said, “Our team uses a combination of technology and human review to identify and remove content that is spammy and attempts to manipulate the service. This is important, ongoing work that we will continue to prioritize.” The spokesperson further provided The Intercept with information on Twitter’s rules on spam.

When asked why so many accounts continue to proliferate in spite of these efforts — why hackers can build tools to detect active accounts that Twitter hasn’t caught — the spokesperson stated, “There are hundreds of millions of Tweets sent every day, and we use both reports and our technology to help enforce the Twitter Rules. Given the scale we’re working at and erring on the side of protecting peoples’ voices, including types of activity that on the outside can appear spammy, context is crucial when we review content. We employ a variety of techniques to combat spam, depending on the type of behavior we observe on the account; this approach helps us make the most informed, thoughtful decisions. For example, we may be able to help an original account owner regain control of a hacked account, rather than deleting it entirely. Every day we observe new ways people attempt to spam or manipulate content on the service, and our team works hard to ensure we’re enforcing our rules fairly and consistently.” While Twitter insists that it proactively monitors for spammy or manipulative behavior and removes accounts in violation of its spam rules, it would not comment on individual accounts for privacy and security reasons.

In the summer of 2017, Twitter did take down SIREN, a massive spam pornography botnet of over 90,000 accounts that included profiles with women’s names and sexually suggestive photographs with canned sexually explicit phrases in broken English. After Baltimore-based security firm ZeroFOX disclosed the profiles and posts to the Twitter security team, they were removed — but this was after the botnet generated almost 30 million clicks through Twitter, as well as spam emails.

A visualization of websites linked from the Twitter profiles of porn bots identified by the Swedish research project Botjakten. Image: Botjakten

Erin Gallagher, a multimedia artist, writer, and translator, said Twitter has improved in the past couple of years. Gallagher would know because she researches Twitter bot networks and diagrams them, showing clusters of retweet networks, hub accounts, and other information, using the visualization software tool Gephi. “In their most recent report, [Twitter] said that in December of 2017, their automatic systems identified more than 6.4 million suspicious accounts per week, which seems bananas, and actually they’ve increased their detection system 60 percent since October,” she told me. “It seems that they’re more proactive than they have ever been.” Further, Gallagher explained that Twitter takes more heat than other social media platforms because it has an open API, which allows independent researchers to access the data and study it. And there’s the possibility that while these porn bots have found a way around Twitter’s detection mechanisms, they wouldn’t have taken steps to circumvent detection from smaller projects of which they’re unaware.

But Segerberg, of the Botjakten project, points out that it’s much easier to create a bot on Twitter than it is to report one. “In essence, there are no captchas or anything when signing up for an account, but it takes, like, five steps to report a spam account.” To do so, Twitter requires users to click on that account’s profile, click the overflow icon, select “report” from the menu, select “they are posting spam” from the drop-down menu, and follow recommendations for additional actions to take. Reporting tweets is equally cumbersome: One must navigate to the appropriate tweet, tap the icon, select “report tweet,” select “it’s spam,” and then follow the recommendations.

Perhaps creating a more user-friendly reporting process and tapping into some of the data from independent projects such as @PornBotHunter and Botjakten could help Twitter improve its processes and find some of the bots it has missed.
