Twitter last month submitted a Digital Millennium Copyright Act notice to GitHub — a web service designed to host user-uploaded source code — demanding that certain content be taken down because it was allegedly “[p]roprietary source code for Twitter’s platform and internal tools.” Twitter subsequently filed a declaration in federal court supporting its request for a DMCA subpoena, the ostensible aim of which was “to identify the alleged infringer or infringers who posted Twitter’s source code on systems operated by GitHub without Twitter’s authorization.”
However, Twitter appears to have revised its DMCA notice, which is essentially a claim of copyright infringement, on the same day it was filed. The revised notice requests not only information about the uploader, but also “any related upload / download / access history (and any contact info, IP addresses, or other session info related to same), and any associated logs related to this repo or any forks thereof.” In other words, Twitter is now seeking information not only about the alleged leaker, but also about anyone who interacted in any way with the GitHub repository (the online space where the source code was stored), including by simply accessing it. Trying to strong-arm GitHub, via a request for a subpoena, into revealing information about visitors to a particular repository it hosts is a move reminiscent of the Justice Department attempting to compel a web-hosting company to reveal information about visitors to an anti-Trump website.
DMCA: The Doxxing and Censorship Tool of Choice
This isn’t the first time that corporations have tried to use DMCA subpoenas to identify leakers. A Marvel Studios affiliate recently petitioned for DMCA subpoenas to force Reddit and Google to reveal information about someone who uploaded a film script to Google and posted about it on Reddit before the movie was released. DMCA claims also have a sordid history of being used in doxxing attempts. A false DMCA claim can be filed to lure a targeted user into filing a counterclaim, which requires them to provide their name and address; that information is then passed on to the original filer. At other times, the DMCA is used simply to censor content, whether to muzzle members of civil society or for reputation management.
No Subpoena Required?
GitHub has seemed all too willing to provide information about both its repository owners and its visitors, even without a subpoena. When the owner of another, unrelated repository recently asked GitHub to provide access logs of users who had visited it, GitHub appears to have readily complied, obscuring only the last octet of the visitor IP address, with the unredacted portion still potentially revealing information such as a user’s internet service provider and approximate location.
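To illustrate why hiding only the last octet offers limited protection, here is a minimal Python sketch. It is not a reconstruction of GitHub's actual redaction process, which is not publicly documented; the function names and the sample address (taken from a range reserved for documentation) are hypothetical. It simply shows that a masked IPv4 address still pins the visitor to a block of 256 addresses.

```python
import ipaddress


def mask_last_octet(ip: str) -> str:
    """Replace the final octet of an IPv4 address with 'x' (hypothetical redaction)."""
    octets = ip.split(".")
    octets[-1] = "x"
    return ".".join(octets)


def candidate_network(masked: str) -> ipaddress.IPv4Network:
    """Return the /24 network that a last-octet-masked address must fall within."""
    return ipaddress.ip_network(masked.replace("x", "0") + "/24")


# Hypothetical visitor address from the 203.0.113.0/24 documentation range.
masked = mask_last_octet("203.0.113.57")
network = candidate_network(masked)

print(masked)                 # 203.0.113.x
print(network)                # 203.0.113.0/24
print(network.num_addresses)  # 256
```

Because internet service providers are typically allocated address blocks at least this large, looking up the remaining three octets in public registry data is often enough to identify the provider, and sometimes the approximate region, behind a redacted address.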
There are also any number of public ways to extract user information from GitHub, such as email addresses associated with a particular GitHub account. Ironically, some scripts hosted on GitHub are designed to automate the exfiltration of a GitHub user’s email address. Once an email address is learned, the process of requesting a subpoena for further information about a particular user may be repeated in an attempt to obtain yet more sensitive data.
Musk’s Bag of Tricks
Beyond Musk’s claimed use of watermarking methods to catch leakers, his other companies have also sought subpoenas to force service providers to reveal information about leakers. For instance, when Musk zeroed in on (and subsequently harassed) a suspected leaker who provided internal documents to a reporter about large amounts of waste being generated at Tesla’s “Gigafactory,” Tesla moved to subpoena Apple, AT&T, Dropbox, Facebook, Google, Microsoft, Open Whisper Systems (the organization formerly behind the secure messaging app Signal), and WhatsApp. The proposed subpoenas “commanded” their targets to preserve any information about the suspected leaker’s accounts, as well as all documents that the suspected leaker “has deleted from the foregoing accounts but that are still accessible by you.”
In addition to proposed subpoenas, Tesla has reportedly tried to identify leakers by reviewing surveillance footage to see who had been taking photos (the original Business Insider story that prompted the Tesla investigation mentioned that the source had provided images to corroborate their claims of waste at the factory). The company has also checked file access logs to see who had accessed data that was provided to the news outlet.
Following identification of the suspected leaker, Tesla reportedly engaged in an extensive surveillance campaign, including hacking the suspect’s phone; requesting that the suspect turn over their laptop for an “update” that was, in fact, a forensic audit; deploying a plainclothes security guard to monitor the suspect on the factory floor; and hiring private investigators to conduct further surveillance.
Takeaways for Leakers
Given service providers’ lax approach to divulging user information, coupled with the aggressive tactics companies employ to unmask sources, the takeaway for would-be leakers is clear: Do not trust service providers to protect any information they may have about you. Websites may reveal information about the leaker, intentionally or not, and whether legally obligated or of their own accord. Leakers would do well to avoid using their home internet connection or any other connection near them, and to further obscure whichever connection they do use with tools such as the Tor Browser. Additionally, it’s best to ensure that any information required to set up a particular account, such as an email address or phone number, not be traceable to the leaker.