XKS Tech Extractor 2009
Jul. 1 2015 — 9:52 a.m.

TOP SECRET//COMINT//REL TO USA, AUS, CAN, GER, NZL//20291123
(aka Tech Extractor)
December 2009

The "Tech Extractor" is a way of finding
valuable intelligence based on keywords in
the content of DNI sessions, but it is a
departure from traditional "soft selection",
which tends to bring back a lot of junk.

What is soft selection?
Soft selection, aka content based selection,
is an approach to targeting traffic by looking
for keywords or phrases rather than specific
E-mail accounts.
Content based selection has suffered
because of the poor design of content
based selection engines.

Soft Selection vs Surgical
Existing selection techniques are blunt instruments.
XKEYSCORE contextual dictionaries provide an
extremely sharp knife to make accurate selection
decisions.
"That's not a knife!"

i. .. usage0.: icatio vs DN on
Selection engines in use today were based on
designs built to handle TELEX traffic.
TELEX is a highly formatted, content-rich type of
traffic that does not resemble the raw DNI seen with
Internet traffic.
Raw Internet traffic contains HTML, web pages,
raw base-64 encoded documents etc.
When [...] think of DNI "content" they are
referring more to "communication content" than
raw DNI content.

.Communi
If an analyst tasks a Boolean equation "bomb"
and "chemical" they likely want to see all
communication that mentions "bomb" and
"chemical" and not all web pages, news stories,
blog posts etc. where those two words appear.
What we need is a context-aware scanning
engine that knows where it is inside of the raw
DNI in order to properly apply analyst tasking.
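
As a rough illustration of the difference, the sketch below applies the Boolean tasking "bomb" AND "chemical" only to segments of a session that are labelled as communication content. The Segment type, the context labels, and the function names are assumptions made for this example; the slides do not describe how the real engine represents a session.

```python
from dataclasses import dataclass

# Hypothetical context labels; the slides do not define a naming scheme.
COMMUNICATION_CONTEXTS = {"email_body", "chat_body"}


@dataclass
class Segment:
    context: str  # where in the raw session this text was found
    text: str


def matches_all(text, terms):
    """Boolean AND: every term must appear in the text (case-insensitive)."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def blunt_hit(segments, terms):
    """Context-blind scan: any segment may satisfy the equation."""
    return any(matches_all(s.text, terms) for s in segments)


def context_aware_hit(segments, terms):
    """Only segments in a communication context may satisfy the equation."""
    return any(
        matches_all(s.text, terms)
        for s in segments
        if s.context in COMMUNICATION_CONTEXTS
    )


terms = ["bomb", "chemical"]
news = [Segment("web_page", "News: bomb squad called to chemical plant")]
mail = [Segment("email_body", "a message that mentions bomb and chemical")]
```

With these samples the blunt scan hits on the news page, while the context-aware scan hits only on the e-mail body.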

What is the Tech Extractor?
The Tech Extractor was [...]
first stab at context-aware scanning and it
only focuses on three contexts:
- E-mail Bodies
- Chat Bodies
- Document Bodies:
  - Microsoft Word, Excel, PowerPoint, Project, Visio
  - Adobe PDF
  - Rich Text Format (RTF)

How does the Tech Extractor work?
The Tech Extractor works by scanning a list
of keywords against those three contexts
and then tagging the results.
It's important to note that this is not "filtering
and selection" and we're not forwarding any
data home.
XKS is simply tagging sessions with
meta-data, much like we do with
appids+fingerprints.
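
To make the distinction between tagging and filtering concrete, here is a minimal sketch in which every session is kept and matching keywords are only recorded as meta-data. The session layout (a dict with per-context text and a "tags" set), the context keys, and tag_session are hypothetical names chosen for this example.

```python
import re

# Hypothetical keys for the three contexts named on the slide.
SCANNED_CONTEXTS = ("email_body", "chat_body", "document_body")


def tag_session(session, keywords):
    """Record each keyword found in a scanned context as a tag.

    The session is always returned, so nothing is filtered out.
    """
    tags = session.setdefault("tags", set())
    for context in SCANNED_CONTEXTS:
        text = session.get(context, "")
        for keyword in keywords:
            # Whole-word, case-insensitive match; re.escape keeps the keyword literal.
            if re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE):
                tags.add(keyword)
    return session


sessions = [
    {"id": 1, "email_body": "WiMAX base station fault report"},
    {"id": 2, "web_page": "WiMAX product brochure"},
]
tagged = [tag_session(s, ["wimax", "dvb"]) for s in sessions]
```

Both sessions survive the pass. Only the first one gains a tag, because the second mentions the keyword outside the three scanned contexts.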

How does the Tech Extractor work?
After the meta-data tag is applied, [...]
can then use that meta-data tag as part of a
compliant query for traffic.
It's important to note, just like
AppIDs+Fingerprints, Tech Extractor tags
aren't necessarily compliant by
themselves. You may need to add a valid
foreign IP address, MAC address or country
code before you query!

Where does XKS get its terms?
[...] provide the XKS team with lists of
terms, called "Tech Dictionaries", which can
contain multiple category names (aka "Tech
Names").
Only after the XKS team is supplied with
those terms can the system begin scanning
and tagging.
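
One way to picture a Tech Dictionary is as a mapping from Tech Names to their term lists, which a scanner can invert into a term lookup. The category names and most of the terms below are invented for illustration (only "wimax" and "dvb" appear elsewhere in this deck), and the real dictionary format is not described in the slides.

```python
# Hypothetical Tech Dictionary: category name ("Tech Name") -> list of terms.
TECH_DICTIONARY = {
    "wireless_broadband": ["wimax", "base station"],
    "satellite_tv": ["dvb", "transponder"],
}


def build_term_index(dictionary):
    """Invert the dictionary so each term maps to the Tech Names that list it."""
    index = {}
    for tech_name, terms in dictionary.items():
        for term in terms:
            index.setdefault(term.lower(), set()).add(tech_name)
    return index


def tech_names_for(text, index):
    """Return every Tech Name with at least one term present in the text."""
    lowered = text.lower()
    return {
        name
        for term, names in index.items()
        if term in lowered
        for name in names
    }


index = build_term_index(TECH_DICTIONARY)
found = tech_names_for("DVB transponder list and WiMAX notes", index)
```

Keeping the Tech Name alongside each term lets a single scan produce one tag per category rather than one per keyword.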

[Screenshot: search results with timestamps and an Attachments panel; text not recoverable]
This document would have been nearly
impossible to find without the context aware
tasking. The terms "wimax" and "dvb" are too
generic for CADENCE style tasking and the
MAC address hit on an anchorless regular
expression, impossible with current
corporate scanning engines.
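
For readers unfamiliar with the term, an anchorless regular expression is one that may match anywhere in the text rather than only at its start or end. The sketch below uses a common colon- or hyphen-separated MAC address pattern; the pattern and the sample text are assumptions for illustration, not the expression used in the tasking.

```python
import re

# Six hex octets separated by ':' or '-'; no ^ or $ anchors, so the
# pattern can match in the middle of free text.
MAC_PATTERN = re.compile(r"(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}")

# The same pattern, anchored to the whole string, for comparison.
ANCHORED_MAC = re.compile(r"^" + MAC_PATTERN.pattern + r"$")


def find_macs(text):
    """Return every MAC-like string found anywhere in the text."""
    return MAC_PATTERN.findall(text)


# Hypothetical sample text; the address is made up.
sample = "Model: X1 unit, MAC 00:1A:2B:3C:4D:5E, symptom: no signal"
hits = find_macs(sample)
anchored_hit = ANCHORED_MAC.search(sample)
```

The unanchored pattern finds the address inside the sentence, while the anchored version finds nothing because the text contains more than the address.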

[Screenshot: e-mail viewer with Subject, From and Date fields and HTML / Plain Text tabs, showing a device fault report with Model, Symptom and Comments fields; text largely illegible]

Full Foreign Language Support
Supports full foreign language tagging
and querying
[...] look for common Arabic expressions
in E-mails coming from the Pakistan
tribal regions:
[Screenshot: UIS Webmail Display of a Live Mail message; text not recoverable]

Live Demo