How mitm works

A high-level overview of how a man-in-the-middle proxy works.


mitm is a TCP proxy server that is capable of intercepting requests and responses going through it.

To understand how an mitm proxy works, let’s look at a simple example using the HTTP protocol. We will first familiarize ourselves with raw HTTP communication, then with how a normal proxy functions, and finally with how an mitm proxy works.


HTTP & HTTPS

For the sake of example, imagine a client is trying to reach example.com:

import requests
requests.get("http://example.com")

Whether the client is trying to reach the domain via the requests module or their browser, both methods must generate a valid HTTP request to send to the server. In the case of the above, requests would generate the following HTTP request:

GET / HTTP/1.1
Host: example.com
User-Agent: python-requests/2.26.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

This HTTP request is sent through hundreds of miles of wires until it reaches the server, which interprets the message, and replies back with the requested page:

HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Age: 111818
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Fri, 05 Nov 2021 18:49:47 GMT
Etag: "3147526947"
Expires: Fri, 12 Nov 2021 18:49:47 GMT
Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
Server: ECS (agb/5295)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 648
[data]
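
To make the shape of these messages concrete, here is a minimal sketch that assembles a request like the one above and splits a response into its parts. The helper names build_get_request and parse_response_head are hypothetical and are not part of mitm or requests; the sketch also assumes the whole response is already in memory.

```python
def build_get_request(host: str, target: str = "/") -> bytes:
    """Assemble a minimal HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        "Accept: */*",
        "Connection: close",
        "",  # blank line that ends the header section
        "",
    ]
    # HTTP separates lines with CRLF, not a bare newline.
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw: bytes):
    """Split a raw response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them.
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body
```

Sending the bytes from build_get_request over a TCP socket connected to port 80 of the server is, in essence, what requests does on the client’s behalf.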

The server, like the client, follows the RFCs that define the HTTP protocol. In some cases, however, the client might want a more secure connection with the server. This is known as HTTPS, which stands for HTTP Secure. To use it, the client connects to the server with the https:// scheme:

import requests
requests.get("https://example.com")

When the client talks to the server directly, it opens a TCP connection to port 443 and starts the TLS handshake right away. When the connection goes through a proxy, which is the case that matters for the rest of this page, the client’s initial request is

CONNECT example.com:443 HTTP/1.0

which asks the proxy to open a tunnel to the server. Once the tunnel is open, the proxy replies to the client

HTTP/1.1 200 OK

and the client begins what is called the “TLS/SSL handshake,” which you can read more about here. During this handshake the server and the client create a secure tunnel through which they can communicate without fear of someone being able to see their communication.

A general understanding of all of the above is needed to follow how proxies work.


Proxies

A proxy sits between the client and the destination server and is typically used to conceal the IP address of the client. A normal proxy is used either by configuring it in the browser’s settings, or via the proxies parameter in requests:

import requests

proxies = {"http": "http://127.0.0.1:8888", "https": "http://127.0.0.1:8888"}
requests.get("http://example.com", proxies=proxies)

In this case requests will generate an HTTP request like the one we saw above, with the full URL in the request line, but instead of sending it to the destination server, example.com, it will send it to the proxy:

GET http://example.com/ HTTP/1.1
Host: example.com
User-Agent: python-requests/2.26.0
Accept-Encoding: gzip, deflate
Accept: */*
Connection: keep-alive

The proxy, once it receives the HTTP request, works out where the client is trying to go from either the first line of the request or the Host header. It then opens a connection with the destination server on behalf of the client, and allows the client and the server to communicate with each other through it. In other words, a proxy is a ‘man in the middle’ whose job is primarily to conceal the IP address of the client.
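
As an illustration of that lookup, the following sketch extracts the destination host and port from the head of a request. The function name resolve_destination is hypothetical, and the sketch makes simplifying assumptions: the request head is complete and well formed, and IPv6 literals are not handled.

```python
from urllib.parse import urlsplit


def resolve_destination(head: bytes) -> tuple[str, int]:
    """Return (host, port) for the server a proxied request is aimed at."""
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)

    # CONNECT carries "host:port" directly in the request line.
    if method == "CONNECT":
        host, _, port = target.rpartition(":")
        return host, int(port)

    # A full URL in the request line names the host itself.
    url = urlsplit(target)
    if url.hostname:
        default_port = 443 if url.scheme == "https" else 80
        return url.hostname, url.port or default_port

    # Otherwise fall back to the Host header.
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host, sep, port = value.strip().partition(":")
            return host, int(port) if sep else 80

    raise ValueError("request does not name a destination")
```

With the destination known, the proxy can open its own TCP connection to that host and port and forward the request.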

When a client uses HTTPS (https://), the initial request goes to the proxy, and the proxy then connects the client and the server. The difference here, however, is that once the client and server are connected they perform the TLS/SSL handshake with each other and begin a secure connection. This connection is encrypted end to end, so the proxy only relays bytes it cannot read, and the client and server can communicate without fear of being intercepted.
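
The relaying role of a normal proxy in the HTTPS case can be sketched as a blind byte copy between two sockets. This is an illustrative sketch using threads and blocking sockets, not how any particular proxy is implemented; the names pipe and relay are hypothetical, and both sockets are assumed to be already connected.

```python
import socket
import threading


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src stops sending."""
    try:
        while True:
            chunk = src.recv(4096)
            if not chunk:  # empty read: the peer finished sending
                break
            dst.sendall(chunk)
    except OSError:
        pass  # a broken connection simply ends the copy
    finally:
        try:
            # Pass the end-of-stream on to the other side.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def relay(client: socket.socket, server: socket.socket) -> None:
    """Relay traffic in both directions without inspecting it."""
    upstream = threading.Thread(target=pipe, args=(client, server), daemon=True)
    upstream.start()
    pipe(server, client)
    upstream.join()
```

Because the bytes passing through relay are TLS records, the proxy sees only ciphertext once the handshake has completed.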


Man-in-the-middle

mitm, therefore, is a proxy that purposely intercepts the requests and responses going through it. When a client connection is a standard HTTP connection, the mitm server doesn’t have to do anything special: it creates a connection to the destination server on behalf of the client and listens to the messages between both. The difficulty arises when a client tries to connect to the server via HTTPS:

import requests

proxies = {"http": "http://127.0.0.1:8888", "https": "http://127.0.0.1:8888"}
requests.get("https://example.com", proxies=proxies)

When this happens, the client sends a

CONNECT example.com:443 HTTP/1.0

mitm then acts like the destination server by responding to the client with

HTTP/1.1 200 OK

and then performs the TLS/SSL handshake in place of the destination server. Once mitm and the client are connected, mitm opens a connection with the destination server and relays their communication back and forth, sitting in the middle and listening. Note, however, that since mitm generates its own TLS/SSL certificates, a client will not trust it unless one either adds the generated certificate to their keychain (not recommended) or disables certificate verification with verify=False in requests:

import requests

proxies = {"http": "http://127.0.0.1:8888", "https": "http://127.0.0.1:8888"}
requests.get("https://example.com", proxies=proxies, verify=False)
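
The interception step can be sketched with Python’s standard ssl module. This is a simplified illustration rather than mitm’s actual implementation: the function names are hypothetical, the certificate and key files are assumed to exist already, and the CONNECT request is assumed to have been read from the client before intercept is called.

```python
import socket
import ssl


def make_client_facing_tls(certfile: str, keyfile: str) -> ssl.SSLContext:
    """TLS context the proxy uses to pose as the destination server.

    certfile and keyfile are placeholder paths to a certificate and
    private key generated for the intercepted host.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.load_cert_chain(certfile, keyfile)
    return ctx


def make_upstream_tls() -> ssl.SSLContext:
    """TLS context for the proxy's own connection to the real server."""
    # The default context verifies the server certificate and hostname.
    return ssl.create_default_context()


def intercept(client_sock: socket.socket, upstream_sock: socket.socket,
              hostname: str, client_ctx: ssl.SSLContext,
              upstream_ctx: ssl.SSLContext):
    """Answer a CONNECT and terminate TLS on both sides of the proxy."""
    # Tell the client the tunnel is ready, as a normal proxy would.
    client_sock.sendall(b"HTTP/1.1 200 OK\r\n\r\n")
    # Handshake with the client while acting as the server.
    tls_client = client_ctx.wrap_socket(client_sock, server_side=True)
    # Separate handshake with the real server, acting as a client.
    tls_upstream = upstream_ctx.wrap_socket(
        upstream_sock, server_hostname=hostname
    )
    # Data read from either socket is now plaintext to the proxy.
    return tls_client, tls_upstream
```

The two returned sockets form two independent encrypted connections, which is why the client must either trust the proxy’s certificate or skip verification for the request to succeed.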

… and that’s really it!