Using divert sockets to log HTTP requests

One of the cool things about divert sockets is that they let you spy on local traffic with virtually zero overhead. You don't have to worry about forwarding the traffic on to its intended destination port, rewriting headers inside the TCP packet, etc.
With a divert tee you simply get a copy of the traffic matching a given rule, which you are free to use in any way without interfering with the regular traffic.


cristi:~ diciu$ sudo ipfw add tee 8080 tcp from 193.231.199.80 to any 80
00100 tee 8080 tcp from 193.231.199.80 to any dst-port 80
cristi:~ diciu$ sudo ipfw list
00100 tee 8080 tcp from 193.231.199.80 to any dst-port 80
65535 allow ip from any to any


The ipfw action we are using is "tee". The concept is similar to the Unix tee command: we get a copy of the traffic matched by the rule while the original packets continue on their way. Once we've set up a divert tee, we need a divert socket reader.
The protocol type for the divert socket is defined in netinet/in.h:


/usr/include/netinet/in.h:
#define IPPROTO_DIVERT 254 /* divert pseudo-protocol */



The code below binds a divert socket and looks for HTTP GET and POST requests in the data received from the socket:


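import socket
import re

# Divert pseudo-protocol number, from netinet/in.h
IPPROTO_DIVERT = 254

# Read buffer size; each recv() returns at most this many bytes
MSGLEN = 32768

# Groups used below: 3 = method, 4 = request URI, 5 = HTTP version, 8 = Host value
t = re.compile(br"((.*)(GET|POST)(.*)(HTTP/\d+\.\d+)(.*)(Host: )([a-zA-Z\.0-9-]*)(.*))", re.DOTALL)

sock = socket.socket(socket.AF_INET, socket.SOCK_RAW, IPPROTO_DIVERT)
# Bind to the divert port named in the ipfw tee rule
sock.bind(('127.0.0.1', 8080))
sock.setblocking(True)

while True:
    # A divert socket delivers one whole packet (IP header included) per recv(),
    # so match each packet as it arrives instead of filling a 32 KB buffer first
    msg = sock.recv(MSGLEN)
    m = t.match(msg)
    if m:
        line = m.group(3) + b" " + m.group(4) + b" " + m.group(5) + b" " + m.group(8)
        print(line.decode('ascii', 'replace'))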
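
The script above runs the regular expression over everything the socket returns, IP and TCP headers included. If you would rather match only the HTTP payload, the headers can be skipped first. The following is a minimal sketch, assuming each read yields one whole IPv4 packet starting at the IP header. The helper names (tcp_payload, http_request_line, fake_packet), the 10.0.0.1 address and the example.com host are made up for illustration, and it does not handle IP fragments or requests split across several TCP segments.

```python
import re
import struct

# Same pattern as in the script above, with a plain outer group.
# Groups: 3 = method, 4 = request URI, 5 = HTTP version, 8 = Host value.
HTTP_RE = re.compile(
    br"((.*)(GET|POST)(.*)(HTTP/\d+\.\d+)(.*)(Host: )([a-zA-Z\.0-9-]*)(.*))",
    re.DOTALL)


def tcp_payload(packet):
    """Return the TCP payload of a raw IPv4 packet, or b'' if there is none."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return b""                            # too short, or not IPv4
    ip_header_len = (packet[0] & 0x0F) * 4    # IHL is counted in 32-bit words
    if packet[9] != 6:                        # protocol field: 6 means TCP
        return b""
    if len(packet) < ip_header_len + 20:
        return b""                            # truncated TCP header
    # The TCP data offset (high nibble of byte 12) is also in 32-bit words
    tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4
    return packet[ip_header_len + tcp_header_len:]


def http_request_line(packet):
    """Return 'METHOD URI VERSION HOST' for an HTTP request packet, else None."""
    m = HTTP_RE.match(tcp_payload(packet))
    if not m:
        return None
    parts = (m.group(3), m.group(4).strip(), m.group(5), m.group(8))
    return b" ".join(parts).decode("ascii", "replace")


def fake_packet(payload):
    """Build a bare IPv4 + TCP packet (no options, zero checksums) for testing."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 0, 0, 64, 6, 0,
                     bytes([193, 231, 199, 80]), bytes([10, 0, 0, 1]))
    tcp = struct.pack("!HHIIBBHHH", 49152, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0)
    return ip + tcp + payload


request = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
print(http_request_line(fake_packet(request)))
```

In the capture loop, http_request_line(msg) would take the place of the direct t.match(msg) call. Working on the payload alone also avoids a stray match on header bytes that happen to spell GET or POST.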



To run the code, you need to run Python as root; otherwise you'll get a "permission denied" error when trying to create the raw socket:


sudo python divert.py
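
If you forget the sudo, the failure shows up as a socket error. A small check at the top of the script can give a clearer message instead. This is only a sketch: is_root and require_root are hypothetical helper names, not part of the script above.

```python
import os
import sys


def is_root(euid=None):
    """Return True when the effective user ID is 0 (root)."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def require_root(euid=None):
    """Exit with a readable message unless running as root."""
    if not is_root(euid):
        sys.exit("divert.py needs root to open a raw socket; try: sudo python divert.py")

# Call require_root() before creating the divert socket.
```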


Once you're done, you need to delete the divert tee:


sudo ipfw delete 100
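
It is easy to forget the cleanup step, so one option is to let the script add and remove the rule itself. The sketch below wraps the two ipfw commands from this post in a context manager; tee_rule_args and tee_rule are hypothetical helpers, and the run parameter exists only so the commands can be inspected without actually calling ipfw. It assumes the script already runs as root and that rule number 100 is free.

```python
import subprocess
from contextlib import contextmanager


def tee_rule_args(rule_number, divert_port, src_ip, dst_port=80):
    """Build the ipfw argument list for the tee rule used in this post."""
    return ["ipfw", "add", str(rule_number), "tee", str(divert_port),
            "tcp", "from", src_ip, "to", "any", str(dst_port)]


@contextmanager
def tee_rule(rule_number, divert_port, src_ip, dst_port=80,
             run=subprocess.check_call):
    """Install the tee rule, then delete it when the with-block exits."""
    run(tee_rule_args(rule_number, divert_port, src_ip, dst_port))
    try:
        yield
    finally:
        # Runs on normal exit, on exceptions, and on Ctrl-C
        run(["ipfw", "delete", str(rule_number)])

# Usage (as root):
#     with tee_rule(100, 8080, "193.231.199.80"):
#         ...run the capture loop here...
```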