Snowden Archive: The SIDtoday Files

DNI Collection: Keeping the Dictionaries from Filling Up

SUMMARY

The presence of keywords or other “selectors” in communications triggers NSA surveillance, and these terms are stored in databases called “dictionaries.” Agency dictionaries for filtering internet traffic recently approached their limit of 200,000 terms. To free up space, unproductive selectors were marked for purging and counterterrorism-related selectors were moved into a separate dictionary. An upgrade will raise the limit to 500,000 selectors.

DOCUMENT’S DATE

Mar 29, 2006

PUBLICLY AVAILABLE

Aug 15, 2018

DYNAMIC PAGE -- HIGHEST POSSIBLE CLASSIFICATION IS TOP SECRET // SI / TK // REL TO USA AUS CAN GBR NZL

(S//SI) DNI Collection: Keeping the Dictionaries from Filling Up

FROM: Janice Strauss, Deputy Chief, Targeting and Mission Management (S3C3) and SIGINT Communications
Run Date: 03/29/2006

(S//SI) There are endless volumes of signals traffic out there in the world, but only a small percentage of it is actually useful for SIGINT purposes. One way we filter out the "good stuff" is through the use of dictionaries (compiled by SIGINT analysts) that control what is pulled in at field collection sites. These dictionaries are made up of selectors, such as promising phone numbers or key terms, for example. What happens, however, when these dictionaries run out of space? The CSRC Office of Targeting and Mission Management (TMM) recently tackled this very problem.

(U) The "No Vacancy" Sign Was Lit

(S//SI) It is TMM's job to look at the system end-to-end to make sure that data gets to its intended customers in a timely manner. To get an idea of the volume involved, about 16,000 DNI (Digital Network Intelligence) selectors are updated on a typical day, with a total capacity standing at about 200,000 selectors. Recently, site DNI dictionaries reached the 90-98% of capacity range, resulting in a very limited capability to task new selectors without first deleting existing ones.

(U) A Short-Term Solution

(S//SI) As part of a near-term solution to this problem, TMM provided other NSA staff and the Product Lines with "no hit" reports that identified selectors that, for over three months, had not selected traffic for delivery to follow-on processors and subsequent analysis and reporting. After much negotiation, TMM was empowered by S2 Staff to advise the Product Lines that detasking of these unproductive selectors would commence by a given date to prevent saturation of the field dictionaries.

(U) ...But What About the Longer Term?

(S//SI) Recognizing this was a stop-gap solution, TMM also collaborated with SCS (the Special Collection Service) and other sites to free up space to create a Counterterrorism dictionary that was initially populated with 50,000 selectors (25% of available capacity). This resulted in the detasking of those same selectors in other dictionaries, thereby bringing capacities down to about 70%.

(S//SI) Work continues on a long-term solution to increase site dictionary capacities from 200,000 to 500,000 selectors. TMM is currently collaborating with the Information Technology Directorate and other organizations to roll out a plan that requires hardware and software upgrades at all collection sites to ensure compatibility and manageability of DNI selector tasking.

(U//FOUO) If you have questions about this issue, please contact ,

"(U//FOUO) SIDtoday articles may not be republished or reposted outside NSANet without the consent of S0121 (DL sid comms)."

DERIVED FROM: NSA/CSSM 1-52, DATED 08 JAN 2007
DECLASSIFY ON: 20320108