Advanced HTTP Activity Analysis
Jul. 1 2015 — 9:51 a.m.

Advanced HTTP Activity
Analysis
2009

Goal
The goal of this training is to get you
familiar with basic HTTP traffic and
understand how to target and exploit it
using X-KEYSCORE

Agenda

What is HTTP?
HTTP stands for Hypertext Transfer
Protocol and it's the primary protocol for
transferring data on the World Wide Web

Why are we interested in HTTP?
Because nearly everything a typical user
does on the Internet uses HTTP
[Image: logos of popular websites, including MySpace and Facebook]

Why are we interested in HTTP?
Almost all web-browsing uses HTTP:
- Internet surfing
- Webmail (Yahoo/Hotmail/Gmail/etc.)
- OSN (Facebook/MySpace/etc.)
- Internet Searching (Google/Bing/etc.)
- Online Mapping (Google Maps/Mapquest/etc.)

How does HTTP work?
- HTTP consists of requests from clients to
servers and their corresponding responses
- Many are already familiar with the
terms "client-to-server" or "server-to-client"
collection (also referred to as "client side" or
"server side" collection).

How does HTTP work?
- A "Client" usually refers to a browser
(like Firefox or IE), which is also referred to as
the "User Agent"
- The "Server" can also be referred to as the
"web-server" or "origin-server", which is the
machine that is storing the data that is being
accessed (like a web-page, a map, an inbox,
etc.)

HTTP Activity
- HTTP activity comes in two types:
- Client-to-Server "requests"
- Server-to-Client "responses"
[Diagram: a user's client sending requests to, and receiving responses from, the Website.com server]

HTTP Activity
- HTTP activity comes in two types:
- While there may be a variety of Proxies,
Gateways or Tunnels in between the client and
the server, traffic is always going in one direction
or the other.
[Diagram: a user's client and the Website.com server]

Client vs. Server Side Traffic
- How do you know which side you're looking
at?
- Client-to-Server requests are generally small
in size and are computers talking to other
computers
- They contain standard HTTP header fields like
"Host", "Accept", "Connection", etc.
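
One way to tell the two sides apart is the first line of the message: a request starts with a method and ends with the HTTP version, while a response starts with the HTTP version. The sketch below is only an illustration of that rule; the function name and sample messages are made up for this example.

```python
def classify_http_message(raw: str) -> str:
    """Return 'request', 'response', or 'unknown' based on the first line."""
    lines = raw.splitlines()
    first_line = lines[0].strip() if lines else ""
    if first_line.startswith("HTTP/"):
        # Status line, e.g. "HTTP/1.1 200 OK"
        return "response"
    parts = first_line.split(" ")
    if len(parts) == 3 and parts[2].startswith("HTTP/"):
        # Request line, e.g. "GET /home.html HTTP/1.1"
        return "request"
    return "unknown"


# Hypothetical sample messages
sample_request = "GET /home.html HTTP/1.1\r\nHost: samplewebsite.com\r\n\r\n"
sample_response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"

print(classify_http_message(sample_request))
print(classify_http_message(sample_response))
```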

HTTP Activity Examples
Client-to-Server request:
[Screenshot: a captured client-to-server GET request. Legible header names include User-Agent, Accept-Encoding, Cookie, Accept-Language, Accept-Charset, Host and Connection.]

Client vs. Server Side Traffic
- Server-to-Client responses are generally
larger in size and are what web-pages look
like on the Internet.
- When you're at a computer accessing the
Internet, you're only seeing Server-to-Client
traffic.

HTTP Activity Examples
Server-to-Client Response:
[Screenshot: a captured news web page rendered from a server-to-client response]
Bonus question: Why are the
images in this web-page missing?

HTTP Activity
- XKS HTTP Activity meta-data differs
greatly depending on which side of traffic
we're collecting
- In nearly all cases it's better to have
client-to-server traffic

HTTP Activity Client-to-Server
[Screenshot: a client-to-server request next to the meta-data extracted from it. Legible labels include Accept, Referer, Accept-Language, User-Agent, Cookie, Connection, Host, URL Path, URL Args, Browser, Search Terms, Language and Via.]

HTTP Activity Server-to-Client
[Screenshot: a server-to-client response and its meta-data, illegible in this copy]

HTTP Activity HTTP Types
- Meta-data will also tell you which side of
traffic you're looking at
- Client-to-server has two main HTTP Types: GET and POST
- Server-to-client has only one

HTTP Activity Get vs Post
- A GET is you requesting data from the
server (most web surfing)
- A POST is you sending data to the
server (signing in, filling out a form,
composing an e-mail, uploading a file,
etc.)
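
To make the difference concrete, here is a minimal sketch that builds the text of a GET and of a POST. In the GET, any values ride along in the request line; in the POST, the submitted data travels in the body after the headers. The helper names, host and form values are invented for illustration.

```python
from urllib.parse import urlencode


def build_get(host: str, path: str, params: dict) -> str:
    """Build a GET request; parameters (if any) go after a '?' in the request line."""
    query = urlencode(params)
    target = f"{path}?{query}" if query else path
    return f"GET {target} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(host: str, path: str, form: dict) -> str:
    """Build a POST request; the form data is sent in the message body."""
    body = urlencode(form)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body.encode('ascii'))}\r\n"
        "\r\n"
        f"{body}"
    )


# Hypothetical examples
get_text = build_get("samplewebsite.com", "/example.php", {"id": "42"})
post_text = build_post("samplewebsite.com", "/login", {"user": "me@example.com"})
print(get_text)
print(post_text)
```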

Let's break down the important
parts of a client-to-server request

HTTP Client-to-Server
GET /home.html
Host:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection: keep-alive
First thing to note is the Host: line, which tells
you the name of the server that the client is
requesting data from

Host Field
It's important to note that in many cases users think
they're at a single website, but behind
the scenes data is coming from a number of
different servers without the user knowing it.
[Screenshot: session list, illegible in this copy]
Bonus question: What would the impact of
this be in how you formulate your
queries using the Host
field?

HTTP Client-to-Server
GET /home.html
Host: samplewebsite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection: keep-alive
Second, the GET line tells you which files the user is
requesting from the server.
If you simply take that line and append it to the Host
line, you have the live public URL that the user is
requesting.
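
The "append the GET line to the Host line" step can be sketched as follows. This is an illustration only: the function name is made up, and the `http` scheme is an assumption, since the request itself does not state it.

```python
def url_from_request(raw_request: str, scheme: str = "http") -> str:
    """Rebuild the requested URL from the request line and the Host header."""
    lines = raw_request.splitlines()
    # Request line looks like: METHOD SP target SP version
    _method, target, _version = lines[0].split(" ", 2)
    for line in lines[1:]:
        if not line:
            break  # blank line marks the end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return f"{scheme}://{value.strip()}{target}"
    raise ValueError("request has no Host header")


sample = (
    "GET /home.html HTTP/1.1\r\n"
    "Host: samplewebsite.com\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
print(url_from_request(sample))
```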

HTTP Client-to-Server
GET
Host: samplewebsite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection: keep-alive
When the GET line has a ? mark in it, then the GET
request is also passing information to the server.
So in this case the client is requesting the file
example.php but it's also passing along a value that
could have been entered by the user.

URL Lines
When there is a ? mark in the URL line,
X-KEYSCORE breaks it up into two parts. The
first part is called the URL Path and the second part
is called the URL Argument.
[Screenshot: a request with its URL Path and URL Argument fields]
Notice all of the "arguments" (each separated by &)
in this URL.
BONUS: Any idea what the pieces of information being
passed in the URL Argument in this example are for?
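
The split described above can be sketched in a few lines: everything before the first `?` is the path, everything after it is the argument string, and the individual arguments are separated by `&`. The function name and sample target are hypothetical; how X-KEYSCORE itself performs the split is not shown in the source.

```python
from urllib.parse import parse_qsl


def split_target(target: str):
    """Split a request target into (path, argument string, list of arguments)."""
    path, _, argument = target.partition("?")
    # Each argument is a name=value pair; pairs are separated by '&'
    pairs = parse_qsl(argument, keep_blank_values=True)
    return path, argument, pairs


path, argument, pairs = split_target("/example.php?id=42&lang=en")
print(path)      # URL Path
print(argument)  # URL Argument
print(pairs)     # individual arguments
```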

HTTP Client-to-Server
Host: samplewebsite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection:
The User-Agent line gives you information on what
type of client is requesting the data. In this case,
we can see that it was a Firefox 3.0 browser from a
Windows NT 5.1 (XP) machine.
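
As a rough illustration of reading a User-Agent line, the sketch below pulls out the browser name, its version and the Windows NT version with regular expressions. It only recognizes the two browser tokens that appear in this training (Firefox and MSIE); real User-Agent strings vary far more, and the function name is made up.

```python
import re


def describe_user_agent(user_agent: str) -> dict:
    """Extract browser, version and Windows NT version from a User-Agent string."""
    info = {}
    browser = re.search(r"(Firefox|MSIE)[/ ]([\d.]+)", user_agent)
    if browser:
        info["browser"] = browser.group(1)
        info["version"] = browser.group(2)
    os_match = re.search(r"Windows NT ([\d.]+)", user_agent)
    if os_match:
        info["os"] = "Windows NT " + os_match.group(1)
    return info


firefox_ua = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) "
    "Gecko/2009042316 Firefox/3.0.10"
)
print(describe_user_agent(firefox_ua))
```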

User Agents

User Agents
The User Agent (also known as the "browser") can be
very valuable.
While it cannot be trusted to be absolutely unique, in
many cases you can use it to unwind a proxy or
multi-user environment.
It can also provide hints as to whether the
request came from a mobile device:
[Examples: User-Agent strings from mobile devices, illegible in this copy]

HTTP Client-to-Server
GET /home.html
Host: samplewebsite.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Keep-Alive: 300
Connection: keep-alive
The various "Accept" lines instruct the server on the
types of responses the client can accept back.

Let's look at a simplified version
of an HTTP request and response

What is Web (HTTP) Activity
This shows how a person logs on to a webpage:
- From Port 3434* (client): click on http://www.hotmail.com, sending a GET request to Port 80 (server)
- To Port 3434* (client): "Welcome to Hotmail", an HTTP response from Port 80 (server)
- From Port 3434* (client): EmailAddress: me@hotmail.com, Password: Admin123, sent as a POST to the web server
- To Port 3434* (client): "Welcome to your Inbox homepage", an HTTP response from Port 80 (server)
* The client's port can be any high-numbered port; 3434 is just an example

HTTP Activity
- Real traffic, however, can be a little more
complicated.
- Almost all web pages are built from
multiple files
- For example, every single image or
banner ad on a web page is a separate file
that needs to be individually requested
before the server that has the file can
respond

HTTP Activity Real World
Let's look at the "NSA Today" home page.
[Screenshot: the NSA Today home page]

HTTP Activity Real World
- It looks like one page, but each of the
images and banners are separate data
files that your browser pieces back
together
[Screenshot: the NSA Today home page with its separate images and banners marked]

HTTP Activity Real World
- In fact, to build the NSA Today home page
it takes 34 separate files from 4 different
servers
- However, most people probably don't
notice, because the entire page loads in
<300 milliseconds.
- If we had a slow internet connection, we'd
notice the images would initially be
missing.
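
To see how one page turns into many requests to several servers, the sketch below scans a page's HTML for embedded files and counts how many come from each host. It is a simplified illustration: it only looks at `img`, `script` and `link` tags, and the page, hosts and names are invented.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect the absolute URLs of files a browser would request for a page."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.resources.append(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.resources.append(urljoin(self.page_url, attrs["href"]))


def hosts_needed(page_url: str, html: str) -> Counter:
    """Count embedded files per host name."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    return Counter(urlsplit(url).hostname for url in collector.resources)


# Hypothetical page with files served from three different hosts
sample_html = """
<html><head><link rel="stylesheet" href="/style.css"></head>
<body>
<img src="/logo.png">
<img src="http://images.example.org/banner.gif">
<script src="http://ads.example.net/ad.js"></script>
</body></html>
"""
counts = hosts_needed("http://www.example.com/index.html", sample_html)
print(counts)
```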

HTTP Activity Real World
Notice that all of the images are missing.
They are all separate server-to-client
responses and therefore completely separate
"sessions" in X-KEYSCORE or PINWALE
[Screenshot: the captured news web page shown earlier, rendered without its images]

HTTP Activity Real World
- It's important to note that not all of the data
on one web-page came from the same
server.
- For example, most of the NSA Today
home page comes from home.www.nsa,
but the image of the current weather
conditions came from
wk-admiral208.corp.nsa.ic.gov

HTTP Activity Real World
- This happens all the time on the Internet.
- The cnn.com home page may have an ad
on it that came from the Google ad server,
and so on.
- And this does have an impact on our
collection!

- This is the traffic path for building the NSA
Today home page
[Diagram: the user connected to four servers, including home.www.nsa and wk-admiral208.corp.nsa.ic.gov]

- What happens if we only have collection on
one of the paths?
[Diagram: the same user and four servers, with one path highlighted]

What would that traffic look like?
GET
Host: wk-admiral208.corp.nsa.ic.gov
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection: keep-alive
If-Modified-Since: Thu, 08 Oct 2009 19:31:53 GMT
If-None-Match:
Cache-Control: max-age=0
If we only saw this one GET request and not
the other 33 required to build the NSA Today
home page, would we be able to determine
what the user was actually doing?

What exactly is that telling us?
- First off, we know what file they are
requesting.
- They want the current weather image from the
wk-admiral208.corp.nsa.ic.gov server.
- That is actually a live public URL
- Do we have any indication why they wanted
that image? Answer is yes! Look at the referer
field.

What exactly is that telling us?
- They were referred from the NSA Today home page
- The referer is, in essence, telling you what site
was "linking" to the new site.
- Warning! The referer can act in misleading
ways.

Referer Field
- The referer field is the address of the page
that links to the new GET request.
- However, this link could have been automatic
to the user.
- I.e., in the case of the current weather image,
the link was automatic and the user wasn't
even aware of the action

Referer Field
- The referer field could also indicate a user
action.
- For example, imagine we were on the NSA
Today webpage and clicked the link to the SID
Today page.
- What would that traffic look like?

Referer Field
Host: sidtoday.nsa
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1
Keep-Alive:
Connection: keep-alive
Referer: http://home.www.nsa/
Cookie:

Referer Field
- Now we're seeing a request go to host
"sidtoday.nsa" with the referer from the NSA Today home page
- How can we tell from the traffic that the first
automatic referer we saw for the current
weather was any different from the
user-generated referer we saw for the SID Today
article?

Cookies!

Cookies
- Cookies are small pieces of text-based data stored
on your machine by your web browser.
- Almost all websites have cookies enabled and they
have a variety of uses, including to help the web-site
track the activities of its users.
- Most are probably familiar with "machine
specific cookies" like the Yahoo cookie
- However, cookies are used for a variety of reasons
What can cookies be used for?
- Cookies can be used to authenticate a user.
- For example, in many cases the "active user"
for Yahoo web-mail traffic is seen encoded in
the l= part of the cookie string
[Screenshot: a Yahoo cookie string and the values decoded from it]

What can cookies be used for?
- Cookies can be used to store information
about the user that the website is interested in
- Look at how the p= value below tells the
website information about the user of this
account
[Screenshot: a cookie string and the values decoded from it, including the country "United States"]

What can cookies be used for?
- Cookies can be used to identify a single
machine from hundreds of other users on the
same proxy IP address
- The Yahoo cookie is a "machine specific
cookie"

What can cookies be used for?
- Important note: All three of those examples
are just subsets of the full Yahoo cookie string
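
Because the interesting values are subsets of one long cookie string, it helps to see how such a string breaks apart. The sketch below splits a Cookie header into named cookies and then splits one cookie's value into sub-fields. The `name=value` pairs joined by `&` inside the value are an assumption made for this illustration, and the cookie names and values are invented.

```python
def parse_cookie_header(header_value: str) -> dict:
    """Split a Cookie header value into {cookie name: cookie value}."""
    cookies = {}
    for part in header_value.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def parse_subfields(cookie_value: str) -> dict:
    """Split a cookie value of the assumed form k1=v1&k2=v2 into sub-fields."""
    fields = {}
    for piece in cookie_value.split("&"):
        key, sep, value = piece.partition("=")
        if sep:
            fields[key] = value
    return fields


# Hypothetical cookie string, not taken from real traffic
cookie_header = "B=abc123; Y=v=1&l=exampleuser&p=sample"
cookies = parse_cookie_header(cookie_header)
print(cookies)
print(parse_subfields(cookies["Y"]))
```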

How do we know what each cookie
value is used for?
- Nearly every web-site uses cookies that in
most cases they designed for their own uses,
so how do we know what they all mean?
- Protocol Exploitation can examine the traffic to
try to determine if there is any information
contained in cookie strings that we might be
interested in. For example, we'd like to know if
any part of the cookie acts like a "machine
specific cookie."

How do we know what each cookie
value is used for?
- However, there are far more cookie options
out in the wild than PE can possibly examine.
- Even if they aren't aware of a machine
specific cookie, it doesn't mean that it doesn't
exist.
- X-KEYSCORE gives you access to the full
cookie string, so if you're adventurous enough
you can do your own protocol exploitation.

Remember: Cookies are there for a reason!
- Websites put cookies on people's computers
for a reason.
- If the data is valuable for a website, it may be
valuable to us as well.

How long do cookies live for?
- Cookies, like any other file on a computer, can
be deleted by the user.
- Almost all browsers give you the option to
view, manage and delete your cookies

Cookies
You can see what cookies have been stored on your machine by going into
the "options" window of your browser and selecting "show
cookies"
[Screenshot: the browser's options window with the cookie settings]

Searches

Searching the Internet
When a user searches the Internet from one
of the many web-based search engines
(Google, Bing, etc.) what does the traffic look
like?

Searching the Internet: Client-to-Server
In most cases, the client-to-server traffic is a
GET request where the search term is
passed in the URL Arguments:
GET
Host: www.google.com
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword,
Cookie:
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1)
Connection: Keep-Alive
Cache-Control: no-cache

Searching the Internet: Client-to-Server
- Notice how the URL Path is /search and one
part of the URL argument is q=iran
- Each website can configure its search forms
differently, so while with Google the search
term is contained in the q= part of the URL, a
different search form might have it as query=
or search_term= etc.
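
A minimal sketch of that idea: try a list of argument names that different search forms might use and return the first one found. The three names come from the text above; the function name is made up, and a real list of names would be much longer.

```python
from urllib.parse import parse_qs

# Argument names mentioned in this training; other sites may use different ones
SEARCH_KEYS = ("q", "query", "search_term")


def extract_search_term(url_argument: str):
    """Return the search term from a URL Argument string, or None if not found."""
    args = parse_qs(url_argument)
    for key in SEARCH_KEYS:
        if key in args:
            return args[key][0]
    return None


print(extract_search_term("hl=en&q=iran"))
print(extract_search_term("search_term=current+weather"))
print(extract_search_term("page=2"))
```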

Searching the Internet: Client-to-Server
- X-KEYSCORE tries to account for all the
variations of search terms contained in the
URL Argument for what it extracts for the
"Search Term" column.
- However, there are always other varieties
out there that we haven't built in hooks for
yet, so anytime you see something that you
think should be extracted, please contact the
team

"Referer Searches"
- What happens when a user clicks on a
search result?
- Let's start by showing the query itself. In this
example, we're going to run a search query.

"Referer Searches"
What does that GET request look like?
GET
Host: google4.q.nsa
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive: 300
Connection: keep-alive
We know from this session that the client is
requesting the data from the host "google4.q.nsa" and
we see the search term in the URL Argument

"Referer Searches"
What happens when a user clicks on a
search result?
GET
Host:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Accept:
Accept-Language:
Accept-Encoding: gzip,deflate
Accept-Charset:
Keep-Alive:
Connection: keep-alive
Cookie:
Referer:
First, we can determine the full URL
by adding the GET line to the host

"Referer Searches"
- Secondly, we get some hints as to why the
user was requesting that page from the
Referer line:
Referer:
- Note that it was the same URL that we were
at immediately before we clicked the "result"
link

"Referer Searches"
- Let's look at that process again:
- First, a client-to-server request is
sent to google4.q.nsa that contains
the query

"Referer Searches"
- Let's look at that process again:
- Second, the google4.q.nsa server
responds back with
the search results

"Referer Searches"
- Let's look at that process again:
- Third, by clicking on one of
the results, a new GET
request is issued to retrieve
the result's home
page. In this request, the
location of the original
search is listed as the
"referer"
[Diagram: the client, the google4.q.nsa search server and the server hosting the result]

"Referer Searches"
- Let's look at that process again:
- What will happen if we
only have collection on
this link?
[Diagram: the same client and servers, with the link to the result's server highlighted]

"Referer Searches"
When X-KEYSCORE sees a search
contained in the "referer" field, we still extract
it out as meta-data into the "search terms" field,
but we mark it with (referer) to denote
where it was originally found
[Screenshot: Search Terms column showing "(referer) the legal status of the caspian sea", next to the URL Path and Referer columns]
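
The extraction described above can be sketched like this: take the Referer URL, pull the search term out of its arguments, and tag the result with (referer). The `q` argument name, the tag placement and the sample URL are assumptions for illustration; the source does not show how X-KEYSCORE implements this.

```python
from urllib.parse import parse_qs, urlsplit


def search_term_from_referer(referer: str):
    """Return '(referer) <term>' if the Referer URL carries a q= search term."""
    query = urlsplit(referer).query
    terms = parse_qs(query).get("q")
    if not terms:
        return None
    return f"(referer) {terms[0]}"


# Hypothetical Referer value
sample_referer = "http://www.google.com/search?hl=en&q=legal+status+of+the+caspian+sea"
print(search_term_from_referer(sample_referer))
```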

"Referer Searches"
GET [path partly illegible, ending in caspian_status.html]
Accept:
Host:
Referer: [search URL, partly illegible, ending in legal+status+of+the+caspian+sea]
Accept-Language: fa
Accept-Encoding: gzip, deflate
User-Agent: [partly illegible: compatible; MSIE; Windows NT 5.1; SV1; .NET CLR]
Cache-Control: max-stale=0
Connection: close
X-BlueCoat-Via:
Can we guess what happened here?

Referer searches
Another example:
[Screenshot: a captured client-to-server GET request. Legible header names include User-Agent, Accept-Encoding, Cookie, Accept-Language, Accept-Charset, Host and Connection.]

Proxy Information

Proxy Information
In a lot of cases we're going to see HTTP
Activity from behind a proxy or proxies.
What is a proxy?
- A proxy is a server that is acting as an
intermediary for HTTP requests from clients
Why do proxies exist?
- Performance: Proxy can cache responses for static pages
- Censorship: Proxy can filter traffic
- Security: Proxy can look for malware
- Access-Control: Proxy can control access to restricted content

Proxy Information
Routinely, we're going to see ISP level
proxies.
That is, instead of having each individual
user request web pages directly from the
web servers, the ISP is going to collect all of
those requests first, and then proxy them out
through a handful of proxy IP addresses.
When the response is returned, the proxy
passes it on to the appropriate user

Proxy Information
- Why would the ISP want to proxy traffic?
- In many cases the ISP won't have to supply
public IP addresses to all its users
- It can simply give them a private IP address,
and then use a handful of public IP
addresses for its proxies, which are the
machines actually requesting the traffic from
the web-servers

Proxies on the Internet
[Diagram labels: Single-user, Web-Server, Web-Servers, Short-lived connections, Long-lived, Multiple users multiplexed, to-Proxy]

Identifying a Proxy
- How do you know that the IP address that
you think is your target is really a proxy?
- First step, check NKB.
- They have services that attempt* to
automatically detect proxies
* These services are in no way 100% accurate, so this is only the first step in
checking to see if the IP address is a proxy

Identifying a Proxy: NKB
[Screenshot: NKB query results for an IP address]

Identifying a Proxy
Other things to be on the look out for:
X-Forwarded-For IP Address
- What is it?
- An X-Forwarded-For IP address is the proxy
passing on to the server what it thinks is the IP
address of the user
- Think of it as the proxy telling the server "this is
who I think this request came from"
- It's important to note that multiple proxies can be,
and often are, present, so one proxy might just
be reporting the IP address of another proxy

Identifying a Proxy
- X-Forwarded-For IP Address as seen in
traffic:
[Screenshot: a GET request containing an X-Forwarded-For header]

Some Examples of X-Forwarded-For headers:
[Eight example X-Forwarded-For header lines; the values are missing or illegible in this copy. One example is annotated "Multiple layers of proxies!"]
In general, the first IP is the one closest to the original requester
Keep in mind these can be totally fake
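
A short sketch of reading such a header: the value is a comma-separated list, with the entry closest to the original requester first. Since the values can be fake, the sketch also flags entries that are not even valid IP addresses. The function name and addresses are made up for illustration.

```python
import ipaddress


def parse_x_forwarded_for(value: str):
    """Return [(entry, is_valid_ip), ...] in header order (first = closest to requester)."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        try:
            ipaddress.ip_address(item)
            valid = True
        except ValueError:
            valid = False
        entries.append((item, valid))
    return entries


# Hypothetical header value passing through several proxies
hops = parse_x_forwarded_for("10.1.2.3, unknown, 198.51.100.7")
print(hops)
print("claimed original requester:", hops[0][0])
```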

Identifying a Proxy
- Similar to the X-Forwarded-For tag is the
Via tag
- The Via tag is the proxy identifying itself
[Screenshot: a GET request containing a Via header]

Identifying a Proxy
- The Via: tag may even contain some good
information about the proxy
- Be careful though, because this information
could be falsified:
[Screenshot: an example Via header value]

Identifying a Proxy
- Remember though that the
X-Forwarded-For and Via lines can be falsified
and don't have to be present!
- If they're not present, how can you tell the IP
address is a proxy?
- Test it in MARINA

Testing IP Addresses in MARINA
- The primary side effect of a proxy is too
many users online at the same time
- So if all else fails, try querying on the IP
address (assuming it's compliant, of
course!) in MARINA to see how many users
were active within an hour time frame
- It's not scientific, but generally it will help
Testing IP Addresses in MARINA
For example, look at these results:
[Screenshot: MARINA query results for an IP address]
There were 274 unique "Active Users" in that
hour. Think it's a proxy?

HTTP Header Fingerprint (HHFP)

What is the HHFP?
- GCHQ created the HHFP to help identify
individual users behind a single proxy IP
address
- The HHFP is a hash of multiple header fields
that can be used to identify a single user
behind a proxy

What is the HHFP?
- At least one of these values must be present:
- X-Forwarded-For IP Address
- Via
- Client IP address
- If so, the HHFP is a hash of those values
combined with the User Agent string
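
The idea of hashing header values into one short identifier can be sketched as below. This is not the actual HHFP algorithm, which the source does not describe: the choice of SHA-256, the `|` separator, the field order and the truncation to 8 hexadecimal digits are all assumptions made purely for illustration.

```python
import hashlib


def header_fingerprint(user_agent, xff=None, via=None, client_ip=None):
    """Return an 8-hex-digit fingerprint, or None if none of the three values is present."""
    if not (xff or via or client_ip):
        return None  # at least one of the three values must be present
    # Assumed recipe: join the values with the User Agent string and hash them
    material = "|".join([xff or "", via or "", client_ip or "", user_agent])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:8]


# Hypothetical header values
fp = header_fingerprint(
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    xff="10.1.2.3",
    via="1.1 proxy.example.net",
)
print(fp)
```

Two users with identical header values would get the same fingerprint under any recipe of this kind.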

EX: Here's an Iranian proxy IP
Address that has multiple HHFPs
underneath it.
NOTE: There's no guarantee that
an HHFP is identifying a single
unique user; it's entirely possible
that more than one user will have
the same HHFP
[Screenshot: one proxy IP address with a list of HHFP values underneath it]

Pros and Cons of HHFP
- On the positive side, the HHFP is a single 8 digit
value which can help identify a single user behind a
proxy
- On the negative side, it requires an XFF IP
address, Via string or Client IP Address and since
many sessions do not contain all three, they'll have
no HHFP string
- Also, even with the HHFP, all of the fields that are
used to build it are available in the XKS HTTP
Activity query, so it's not providing you with any
data you don't already have access to

HTTP Activity Search

XKS HTTP Activity Search
After that overview of how HTTP Activity
works, let's look into how to effectively
target it through XKS queries

XKS HTTP Activity Search
- HTTP Activity indexes every HTTP
session
- Client-to-server and server-to-client
- Can be queried on any of the unique
HTTP meta-data fields or any of the
"standard" DNI fields (IP Address, SIGAD,
CASENOTATION etc.).

XKS HTTP Activity Search
- Unique meta-data fields of this search
include:
[Screenshot: search form fields, with those covered in this training marked. Legible labels include URL Path, Search Terms and Attachment Filename.]

XKS HTTP Activity Search
- In addition to all of the common fields like:
[Screenshot: common search form fields. Legible labels include Application, IP Address, Country and City.]

XKS HTTP Activity Search
- Most commonly, HTTP Activity query
searches in XKS will be to enable
"persona analysis"
- Based on MARINA, TRAFFICTHIEF or
PINWALE, we'll want to query XKS to
discover all of the HTTP Activity that
occurred around the target's session of
interest

Simple HTTP Searches
In order to do a "persona analysis" type
search, all we'll need to fill in is the IP of
the target (assuming it's
compliant) and a short time range "around"
the time of the activity:
[Screenshot: search form with the Datetime field filled in]

XKS HTTP Activity Search
Another common query is to
see all traffic from a given IP
address (or IP addresses) to a specific
website.

XKS HTTP Activity Search
- For example, let's say we want to see all
traffic from IP Address 1.2.3.4 to the
website website.com
- While we can just put the IP address and
the "host" into the search form, remember
what we saw before about the various host
names for a given website

Host Field
It's important to note that in many cases users think
they're at a single website, but behind
the scenes data is coming from a number of
different servers without the user knowing it.
[Screenshot: session list, illegible in this copy]

XKS HTTP Activity Search
- In order to account for all of the possible
host names, we must front-wildcard the
host name.
- Be careful when front-wildcarding,
because beyond being resource intensive
for XKS, it can be dangerous from a
perspective

Hints for wildcarding a host name
- If you're trying to query for traffic to the
website website.com, the best way to
wildcard it is:
- *.website.com
- Notice that the . before the hostname
website is still there; that way we will
properly hit on ads.website.com and
images.website.com but avoid false
hits on unrelated host names that merely end in "website.com"
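
The effect of keeping the dot can be checked with ordinary wildcard matching. The sketch below uses Python's `fnmatch` module as a stand-in; XKS's own wildcard engine is not described in the source, and the host names are made up. Note that the dotted pattern does not match the bare host name `website.com` itself.

```python
from fnmatch import fnmatchcase


def host_matches(host: str, pattern: str) -> bool:
    """Case-insensitive wildcard match of a host name against a pattern."""
    return fnmatchcase(host.lower(), pattern.lower())


hosts = ["ads.website.com", "images.website.com", "notwebsite.com", "website.com"]

for host in hosts:
    with_dot = host_matches(host, "*.website.com")    # dot kept before the name
    without_dot = host_matches(host, "*website.com")  # dot dropped
    print(f"{host:20} with dot: {with_dot!s:6} without dot: {without_dot}")
```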

Hints for wildcarding a host name
Why are we only interested in traffic
coming from our IP of interest going to
our website of interest?

Helpful GUI Shortcuts
- Earlier we talked about how XKS broke a
GET request into the URL Path and URL
Argument (separated by a ?)
Ex: http://forum.
Gets broken out to:
[Screenshot: the Host, URL Path and URL Args fields filled in for this URL]

Helpful GUI Shortcuts
- So if we were to query for this URL we
would need to enter those fields in
separately:
[Screenshot: search form with the Host, URL Path and URL Args fields entered separately]

Helpful GUI Shortcuts
- Or we could use the "Field Builder" to
simply copy and paste the full URL and let
XKS break it into its appropriate parts:
[Screenshot: the Field Builder button next to the URL Path field]

Helpful GUI Shortcuts
Field Builder
Enter a URL that will be automatically parsed to populate the host,
path, and argument fields:
[Screenshot: the Field Builder dialog with a URL entered]
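
What a tool like the Field Builder does can be approximated with a standard URL parser: one full URL in, three fields out. This is only a sketch of the concept; the field names and the sample URL are hypothetical, and the source does not describe how the Field Builder is implemented.

```python
from urllib.parse import urlsplit


def build_fields(url: str) -> dict:
    """Break a full URL into host, URL path and URL argument fields."""
    parts = urlsplit(url)
    return {
        "host": parts.hostname,
        "url_path": parts.path,
        "url_argument": parts.query,
    }


# Hypothetical forum URL
fields = build_fields("http://forum.example.com/showthread.php?t=131435")
print(fields)
```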