NSA Presentation on RTRG Analytics for Forward Users

May. 29 2019 — 3:00p.m.


TOP SECRET REL
Real-Time Regional Gateway: Cloud Analytics for Forward Users

RTRG:
...brings near real-time intelligence to the warfighter
..."grew up" supporting operations in Iraq
...RTRG is now a global architecture
...leveraging the emerging cloud architecture to answer questions we have not been able to answer before
SECRET//REL USA, FVEY

Mission Areas: Tracking high-value targets (HVT), Counter-Insurgency (COIN), Counter-IED (CIED)
Organizations Using RTRG:
• CSG* Afghanistan
• U.S. Marine Corps (USMC) 1st and 2nd Radio Battalions
• U.S. Army SIGINT analysts at BCT* level
• U.S. Air Force National Tactical Integration
• Jalalabad Fusion Cell (USMC)
• 52 TOPI
• NSA-G SWAN Counternarcotics Team
• All special operations task forces (SI/REL)
Area 82 - Bagram Air Base: Home of RTRG AF1 & gmTote
• 20 days data retention from AFPAK
• 200-600 daily users
• New upgrades include two systems in Kabul (TS//REL) (TS//SI//REL)
"RTRG is the most significant SIGINT support to the warfighter in the last decade" - General David Petraeus
"USSOCOM has enduring and critical needs for the tools and data that RT-RG provides" - Admiral William McRaven
*CST - Cryptologic Support Team
*BCT - Brigade Combat Team
*CSG - Cryptologic Support Group
TOP SECRET//SI//REL USA, FVEY

• In 2011, RTRG in Afghanistan
  - Played a key role in 90% of all SIGINT developed operations
  - Leading to 2270 capture/kill operations
  - 6534 enemies killed in action
  - 1117 detainees
TOP SECRET//SI//REL USA, FVEY

Monitoring Iranian Navy (IRIN) in Straits of Hormuz
US Navy aircraft, located by RTRG
Supporting CENTCOM Maritime (NAVCENT)
Navy Information Operations Command - Bahrain (NIOC-B) (TS//SI//REL)
Missions supported:
• Iran, Yemen, Persian Gulf
• Recent successes include monitoring of Iranian naval assets
RTRG Afloat on subsurface platform USS Georgia (SSGN-729) (TS//SI//REL)
Missions supported:
• Horn of Africa: In first week of mission, system received 31 million GSM events, leading to 10 high-value target voice IDs and 90 tactical tip-offs
TOP SECRET//SI//REL USA, FVEY

[Map of RTRG sites; labels in the order they appear] US-2 Denver • US-1 Fort Meade • Global Maritime & ELINT • Mission Assurance • IQ-2 Iraq* • AF-1,5 Bagram & Kabul • COIN, CIED • COIN, CIED • USS Georgia & USS Florida • US-3 JFCOM (Joint Forces Command) • US-5 NSA-Texas • Counternarcotics, Maritime, Mexico & SOUTHCOM Support • BH-1 NIOC** Bahrain • AFRICOM, EUCOM, PACOM • GE-3 Germany** • KO-1,2 USFK** Pangyo • PACOM & North Korea • Continuity-of-Operations
* In draw-down
ECC - NSA European Technical Center
USFK - U.S. Forces, Korea
NIOC - Navy Information Operations Command
TOP SECRET//SI//REL TO USA, FVEY

• RTRG Mission Overview
-> RTRG System: Today and Tomorrow
• Target-Centric and Network-Centric Cloud Analytics
• Future Work
UNCLASSIFIED//FOR OFFICIAL USE ONLY

Supporting Tactical Users
Applications: Goldminer Metadata Search • GeoT Geospatial Tools • Agent Logic • Sharkfinn • Alerting Tools • CIONE • Selector Manager • Report & Doc Management • Target Enrichment
Oracle relational database & dimensional data model
Ingest and Enrichment Pipeline: Flexible, high-speed architecture for parser and data processors
Forward Data Centers • Panopticon
Services Layer: Web services, authentication, auditing
Publish and Subscribe Messaging
Data Feeds: SKS, JUGGERNAUT, TALIS/MATTERHORN, AIRHANDLER, LOPERS, KL, VOICESAIL, LYCANTHROPE, RETURNSPRING ... and a growing list of others* (DNR, DNI collect, tipping and reporting)
A successful architecture for several years. Demand for more data feeds, longer retention, and data-intensive analytics has driven RTRG to seek new solutions
* Based on Afghanistan RTRG data flow
SECRET//SI//REL TO USA, FVEY

Current Challenges
• Data Storage & Retention
  - "Patterns of Life" analysis needs require 6+ months of data from world-wide collection
  - A typical system has capacity for only 4-6 weeks of regional data (~90% of user queries are within seven days of "now")
• Data Use & Computation
  - Analytic processes should make maximum use of all available data to find small signals
  - Relational databases are unsuited to sophisticated analytics such as correlation and matching
• Data & Technology Heterogeneity
  - New types of data must be added to the system continually
  - With traditional databases, schema modifications are difficult
  - Exotic data management solutions are difficult to adopt due to limited expertise
SECRET//SI//REL TO USA, FVEY

Emerging NSA Cloud Reference Architecture is well-suited for developing analytics on intelligence data
Scalable: Distributed file systems and databases are built on clusters of commodity hardware, leveraging open source projects and industrial solutions
Computable: The MapReduce programming model simplifies writing efficient parallel computations that operate over large volumes of data
Cloud technologies enable flexible schema and leverage large open-source efforts
[Slide graphics: Google; "Scalable BigTable Implementation with Security"]
UNCLASSIFIED//FOR OFFICIAL USE ONLY

Data Challenges in AFPAK
Current RTRG (AF1)
• Current database is 27 terabytes (TB)
• Retention is ~30 days
Future Cloud enabled system
• Even a modest cloud system (3 rack) for storage will be at least 125 TB of storage
• 5x increase in available space
• Actual retention improvement depends on how the space resources are allocated
Cloud supports more data feeds & more days of historical data
Analytic Challenges from Iraq & AFPAK
• Many analytics used by RTRG are based on R6 SORTINGLEAD event summaries
• Event summaries were originally created on relational databases
• Collection increased dramatically, and a MapReduce implementation was needed
• For new analytics with present day collection volumes, a practical parallel execution model is crucial
Cloud supports large-scale analytics
TOP SECRET//SI//REL TO USA, FVEY
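
As a rough check on the storage figures on this slide, 125 TB divided by 27 TB is about 4.6, which the slide rounds to 5x. The short sketch below works that ratio out and, under the hypothetical assumption that retention scales linearly with the space given to the existing feeds, shows how the ~30 day retention might change. The function names and the `share_for_existing_feeds` parameter are illustrative and do not come from the briefing.

```python
# Figures quoted on the slide.
CURRENT_TB = 27.0
CLOUD_TB = 125.0
CURRENT_RETENTION_DAYS = 30.0


def space_factor(current_tb, future_tb):
    """Ratio of future storage to current storage."""
    return future_tb / current_tb


def projected_retention_days(current_days, factor, share_for_existing_feeds=1.0):
    """Retention estimate, ASSUMING retention grows linearly with the
    fraction of the new space allotted to the feeds stored today."""
    return current_days * factor * share_for_existing_feeds


factor = space_factor(CURRENT_TB, CLOUD_TB)
print(f"space increase: {factor:.2f}x")
for share in (1.0, 0.5):
    days = projected_retention_days(CURRENT_RETENTION_DAYS, factor, share)
    print(f"share {share:.0%}: about {days:.0f} days of retention")
```

Under this linear assumption, giving all of the new space to existing feeds yields roughly 139 days, and giving half of it yields roughly 69 days, which is why the slide notes that the actual improvement depends on how the space is allocated.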

GHOSTMACHINE is a data-intensive cloud system with many fielded instances worldwide
• 5-12 racks commodity hardware, 150+ data nodes, 16 GB RAM each
• Apache Hadoop: 100s of terabytes of storage; stores 10s to 100s of billions of events
• NSA Cloudbase: 100s of nodes serving BigTable implementation; stores 100s of billions of entries
[Map; legible site and region labels in the order they appear] gmLIGHT • gmBALTIC • Utah • gmPLACE • gmPARK • ADF-C • Europe • Iraq • gmCARBON • Korea & Japan • Afghanistan • gmHALO • gmTOTE (Bagram) • S. Korea • Australia • gmMATE • (SI/REL) (Alice Springs)
Hadoop clusters have been demonstrated with 10s of petabytes and over 10,000 cores
Map does not include all GHOSTMACHINE/SiteStore systems
TOP SECRET//SI//REL USA, FVEY

[Diagram labels] Scalable BigTable Implementation with Security • Large scale data preprocessing and analysis • Applications • Services Layer • Publish and Subscribe Messaging • Massive storage and fast record lookup • Relational Database • Ingest and Enrichment • GHOSTMACHINE
RTRG and GHOSTMACHINE systems are paired with one another: MapReduce analytic results are fed back to RTRG relational database
TOP SECRET//SI//REL TO USA, FVEY

[Architecture diagram labels] User interfaces • Web Tier • Application Services • Alerting Services • Ingest: Streaming-based, MapReduce-based • Secure Data Access Services (Casport, Wavelegal, etc.)
Utility Cloud: Relational Store • Enrichment • User & Auxiliary Data • Operating System • Java VM • Hardened RH Linux • OpenStack VM • Hardware Layer
Data/Analytic Cloud: Analytics • Real time Enrichment • Cloudbase • Event Data and Analytic Results • Java VM • Operating System • Hardened RH Linux • Hardware Layer
The Cloud will bring new data-intensive capabilities, support existing missions, and align RTRG installations with emerging NSA and IC standards
TOP SECRET//SI//REL TO USA, FVEY

• RTRG Mission Overview
• RTRG System: Today and Tomorrow
-> Target-Centric and Network-Centric Cloud Analytics
• Future Work
UNCLASSIFIED//FOR OFFICIAL USE ONLY

Graph & Network • Target Development
Co-Travelers & Meetings • Graph Triage • Handset Swap • Contact Similarity • High Priority Untasked Nu • Target Enrichment & Disambiguation • Beddown • Layercake • Geo-Spatial
The data-intensive computing capabilities in the system enable a set of graph/network analytics and target development analytics
TOP SECRET//SI//REL TO USA, FVEY

Meeting Target Development Challenges
Past: Analysts manually queried multiple, independent repositories, aggregating results in Excel, taking hours of work for search and refinement
Now: RT-RG provides a streamlined, integrated workflow saving analyst effort
Workflow elements: Tips • Target Development • Metadata Search • Collaboration • Geolocation • Detainee Reports • Capture • Geospatial • Alerting • Enrichment
[Screenshots: RT-RG query form and a sample tip message dated Thu, 10 May 2012]
TOP SECRET//SI//REL TO USA, FVEY

Who is at the same UCELLID at the same time?
Manual Process
• Take your selector and query for every unique location he has been and at what time
• Query for other selectors who have been at the same places at the same times (impossible or painful)
• OR compare to another known set of selectors to find overlap (Excel / ArcGIS / JEMA) (limiting to what you know)
• Summary statistics on the matching IMSIs using Excel or ArcGIS
Cloud Process
• Pre-calculates all UCELLID overlaps between tasked selectors
• Simply query your selector in cloud-generated QFD and view summary statistics
TOP SECRET//SI//REL TO USA, FVEY

Is there a pair traveling together?
Manual Process
• You could use the same manual process from Meetings, however, this would not find co-travelers on different networks
• Manual comparison of pairs of known selectors is possible with ArcGIS or similar spatial tools
  - You must know the pairs up front
Cloud Process
• Measures miles-per-hour (MPH) between tasked selectors as they move around.
• Low average MPH = co-traveling.
• Simply query for your selector to view statistics on average MPH, days calculated, etc.
*Also known as "Sidekicks"
TOP SECRET//SI//REL TO USA, FVEY

Past: Manually query multiple repositories and build network with Analyst Notebook (ANB) - amount of labor can be prohibitive
Now: RT-RG tools exist for contact chaining for selector-to-selector & selector-to-report graphs, with more analytics and tools to come
[Figures: Selector-to-Report Graph from Enrichment; MAINWAY graph in RTRG UI]
TOP SECRET//SI//REL TO USA, FVEY

• Graph representation is natural for DNR
• Result is Furious Chainsaw Prototype on Cloudbase
  - Now supports contact chains and trends
  - Metadata matrix in Cloudbase supports fast graph traversal
  - Will support other graph analytics in the future
• Triage capability for forward users to complement Enterprise databases
• Enables chaining and other analytics, provides foundation for graph algorithms
Graph View in Renoir - but many other analytics are possible
TOP SECRET//SI//REL TO USA, FVEY

[Screenshots: Contact Neighborhood Ozone Widget • Daily Call Trends in Panopticon • Top Associates Ozone Widget • Weekly Call Trends Ozone Widget; axis label: Hour of day (Zulu)]
DNR graphs in Furious Chainsaw tables in Cloudbase support a wide range of fast queries and analytics
TOP SECRET//SI//REL TO USA, FVEY

SHARKFINN Structured Knowledge Space
• Selector extraction, normalization, and enrichment
• Entity extraction (people, organizations, times, geos)
• Flexible free-text query interface
• Keyword, faceted, and people search
• Graph, text, and spreadsheet output formats
• Document clustering
• Arabic name expansion
• Integrated data ingest using Niagara Files (NiFi)
• SharkQuery: search by selector, entity, location, and keyword
• SharkDocs: query, sharing, and collaboration on user uploaded documents
• Visualization of results in query overview, table, graph, and map
• Cloudbase and HDFS for scalable text analytics platform
TOP SECRET//SI//REL TO USA, FVEY

[Search screenshot] "OBJ SMITHERS" AND OBJECTIVES:OBJ TOPPER, Previous 30 Days
1. Perform keyword search with seed OBJ on report library
2. Finding co-occurring objectives
3. Selector co-occurrence
4. Additional Search & Filtering
5. Correlate found selectors with Octave/UTT and Panopticon
6. Export selector & Report Graph
OBJECTIVES: OBJ SMITHERS (19) • OBJ TOPPER (19) • OBJ SPRINGFIELD (12) • OBJ APU (6) • OBJ GROUNDSMAN WILLY (6) • OBJ WIGGUM (4) • OBJ MAYFIELD (3) • OBJ NEVIS (1) • OBJ TRAJAN (1)
TOP SECRET//SI//REL TO USA, FVEY

Past: Analysts manually correlated locations using map viewers or spreadsheets, aggregating data from multiple sources
Now: Analytics and alerts push target information by subscription
[Plot: First Event and Last Event UCELLIDs] RT-RG pattern of life analytic for target location cues
Analysts are notified by alerts, based on:
• Geospatial NAI*
• Tasking status
• SMS content
• Selectors, callsigns, frequencies
[Screenshot: sample alert message dated Thu, 10 May 2012 17:48:12 Zulu]
* NAI = Named Area of Interest
TOP SECRET//SI//REL TO USA, FVEY

Bed Down: Find the most consistent location of the day's first/last event
Manual Process - One Selector At a Time
• Query all events for your selector.
• Mark first and last events manually.
• OR
• Enter in a tool like CheekyMonkey to view gaps in activity.
• Slow process to do one selector at a time
Cloud Process - All Tasked Selectors
• Pre-calculates first and last events in local time for ALL selectors.
• Will calculate estimated Bed Down at query time.
• Can query multiple selectors in seconds, find common overlap.
TOP SECRET//SI//REL TO USA, FVEY

LayerCake - find geospatial overlap of a set of targets.
Manual Process
• Query all events for your selectors.
• Display events spatially on mapping software (impossible to view polygon overlaps).
• Rasterize and do "raster math" to determine max overlap (very complex and expensive task in most GIS tools).
Cloud Process
• Pre-calculates unique locations visited for ALL selectors.
• Raster heat maps drawn at query time.
TOP SECRET//SI//REL TO USA, FVEY

Using MapReduce, analytics can count and aggregate over a large number of daily events to give a multi-resolution spatial visualization of the data
Synoptic View
Analytic Data Flow [diagram labels]: Event Data • MapReduce for Binning, Counting • Cloudbase Tables for Geo Data • Google Earth
Legend: Fewer events
TOP SECRET//SI//REL USA, FVEY
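
The binning and counting step named on this slide can be illustrated with a minimal, single-process sketch of the MapReduce pattern. This is not the briefing's implementation and does not use Hadoop. It assumes events are plain (latitude, longitude) pairs and that each resolution is a square grid measured in degrees; `map_event`, `reduce_counts`, `bin_events`, and the cell sizes are hypothetical names and values chosen for illustration.

```python
from collections import defaultdict
from math import floor


def map_event(lat, lon, cell_sizes_deg):
    """Map step: emit one ((cell_size, row, col), 1) pair per resolution."""
    for size in cell_sizes_deg:
        row = floor((lat + 90.0) / size)   # grid row counted from the south pole
        col = floor((lon + 180.0) / size)  # grid column counted from 180 W
        yield (size, row, col), 1


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def bin_events(events, cell_sizes_deg=(1.0, 0.1)):
    """Run the map and reduce steps over an iterable of (lat, lon) events."""
    pairs = (
        pair
        for lat, lon in events
        for pair in map_event(lat, lon, cell_sizes_deg)
    )
    return reduce_counts(pairs)


if __name__ == "__main__":
    sample = [(10.05, 20.05), (10.07, 20.02), (10.55, 20.55)]
    for key, count in sorted(bin_events(sample).items()):
        print(key, count)
```

Because every event is emitted once per cell size, the coarse grid serves a zoomed-out view and the fine grid a zoomed-in one from the same pass over the data. In a real cluster the map and reduce functions would run in parallel over partitions of the events.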

Cloud enabled analytic allows viewing the spatial data at many levels-of-detail
"Meso-scale" View
Detail View - "Mashup" with heatmap
[Screenshot: translated message text overlaid on map]
Heatmap of activity of a set of targets (legend: Fewer calls, More calls)
Using Cybertrans translation service on SMS messages, integrated with heatmaps
TOP SECRET//SI//REL USA, FVEY

• RTRG Mission Overview
• RTRG System: Today and Tomorrow
• Target-Centric and Network-Centric Cloud Analytics
-> Future Work
UNCLASSIFIED//FOR OFFICIAL USE ONLY

• Improved DNI capabilities; convergence
• Integrating focus on active SIGINT capabilities
• Increased CT and expeditionary capabilities
• Better tools for faster analytic development
• Incorporation of content analysis and HLT capabilities
• Improved integration between target and population analytics
TOP SECRET//SI//REL TO USA, FVEY

[Map labels] gmSeminole - NSA-Georgia • gmGulf - NIOC-Bahrain • gmZilla - Kabul
TOP SECRET//SI//REL USA, FVEY

• RTRG has been a successful regional data store and exploitation system for COIN, CIED and other missions
• Moving to NSA Cloud infrastructure
  - More historical data
  - Deeper analysis using parallel programs
  - Allows for more flexible deployments to IC, DoD service installations
• Continuing to support advanced analytics for current and future operations
TOP SECRET//SI//REL USA, FVEY
