
How blocking on the Internet works: an overview of modern methods using a real example

Original author: Иван Сергеев

A group of Indian researchers has published an overview of the Internet blocking methods introduced by government agencies, using their own country as the example. They studied the mechanisms that Internet service providers use to restrict access to prohibited information, and assessed both their accuracy and how easily the blocks can be bypassed. Our team, which provides a residential proxy service, has gone through this research and summarizes its main points below.

Input data

Researchers from different countries have conducted many studies of the blocking methods used in countries considered to be "not free", such as China or Iran. In recent years, however, even democratic states like India have deployed large-scale infrastructure to censor the Internet.

During the study, the researchers compiled a list of 1,200 potentially blocked sites in the country. The data was collected from open sources such as Citizen Lab and Herdict. Internet access was then set up through nine of the most popular Internet service providers.

The OONI tool was the first thing used to detect whether a site was censored or blocked.

OONI vs. custom scripts for detecting blocks

The researchers originally planned to rely on OONI, a popular censorship detection tool. During the experiment, however, it produced a lot of false positives: manual verification of the results revealed many inaccuracies.

The poor detection quality may be due to outdated mechanisms in OONI. For example, to detect DNS filtering, the tool compares the IP address that Google DNS (considered uncensored) returns for a given host with the IP address that the Internet provider's resolver returns for the same host.

If the addresses do not match, OONI signals a block. On the modern Internet, however, differing IP addresses prove nothing by themselves: they may simply indicate that the site is served through a CDN.
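The comparison described above can be sketched in a few lines. This is only an illustration of the logic as the article describes it, not OONI's actual code; the function name and the addresses (taken from the reserved documentation ranges) are assumptions.

```python
def naive_dns_mismatch(control_ips, isp_ips):
    """Flag a block whenever the two resolvers return different address sets.

    control_ips: addresses returned by the resolver assumed to be uncensored.
    isp_ips: addresses returned by the provider's resolver.
    """
    return set(control_ips) != set(isp_ips)


# Hypothetical CDN-hosted site: each resolver is steered to a different,
# perfectly legitimate edge server, so the address sets differ.
control_answer = ["192.0.2.10", "192.0.2.11"]
isp_answer = ["198.51.100.7"]

# The naive check reports a block even though nothing is censored here.
print(naive_dns_mismatch(control_answer, isp_answer))  # True
```

A mismatch is therefore at most a hint that needs further verification, which is why the results had to be checked manually.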

The researchers therefore had to write their own scripts to identify blocks. Below is an overview of the popular ways to block content on the Internet and an analysis of how effective they are today.

How blocks are implemented, or what middleboxes are

The analysis showed that every type of block observed was carried out by network elements embedded in the traffic path. The researchers call them middleboxes: they intercept user traffic, analyze it, and, when they detect an attempt to connect to a prohibited site, inject special packets into the traffic.

To detect middleboxes, the researchers developed their own method, Iterative Network Tracing (INT), which builds on the principles of the traceroute utility. In essence, it sends web requests for blocked sites with increasing TTL values in the IP headers. The lowest TTL at which a blocking response comes back indicates how many hops away the middlebox is.
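The TTL search at the heart of this idea can be sketched as follows. This is a minimal illustration, not the researchers' implementation: `locate_middlebox`, `probe`, and `fake_probe` are hypothetical names, and the network path is simulated so that the example runs offline.

```python
from typing import Callable, Optional


def locate_middlebox(probe: Callable[[int], bool], max_ttl: int = 30) -> Optional[int]:
    """Return the smallest TTL at which a blocking response appears, or None.

    `probe(ttl)` is a hypothetical callable: it sends one request for a
    blocked site with the given IP TTL and returns True if a blocking
    response came back.
    """
    for ttl in range(1, max_ttl + 1):
        if probe(ttl):
            return ttl
    return None


# Simulated path for illustration: a middlebox sits at hop 4, so any request
# whose TTL is at least 4 reaches it and triggers the injected response.
def fake_probe(ttl: int, middlebox_hop: int = 4) -> bool:
    return ttl >= middlebox_hop


print(locate_middlebox(fake_probe))  # 4
```

In a real measurement the probe would set the TTL on an actual socket, for example with `sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)`, before sending the request. Keeping the probe as an injected callable makes the search logic testable without network access.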


Middlebox mechanism for data interception

DNS blocks

DNS resolution is the first step in accessing any website: the hostname in the URL entered by the user is first resolved to the associated IP address. With a DNS block, censors interfere at this step. The resolver they control returns a wrong IP address to the user, and as a result the site simply does not open (DNS poisoning).

Another blocking method is DNS injection: in this case, a middlebox between the client and the resolver intercepts the DNS request and sends its own response containing an incorrect IP address.

To identify DNS blocks by Internet service providers, the researchers used Tor with exit nodes in uncensored countries: if a site opens through Tor but not over a direct connection through the provider, the block is confirmed.

After identifying sites blocked by DNS, the researchers determined the method of blocking.

Iterative network tracing method: the client sends special requests (DNS/HTTP GET) containing a blocked site and an ever-increasing TTL

TCP/IP packet filtering

Blocking based on packet header filtering is considered a popular method of Internet censorship. Many published studies try to detect this specific way of blocking sites.

In practice, the problem is that this method is easily confused with ordinary failures that disrupt the network and reduce its throughput. Unlike with HTTP blocks, a user subjected to TCP/IP filtering receives no notification that the site is blocked; it simply does not open. Separating blocking cases from normal network failures and errors is very difficult.

Nevertheless, the researchers tried, using the TCP handshake procedure. Handshake packets were first tunneled through Tor with exit nodes in uncensored countries. For sites that could be reached through Tor, the handshake was then attempted five more times in a row, about two seconds apart. If every attempt failed, deliberate filtering was highly likely.
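The repeated-handshake check can be sketched as below. It is a minimal sketch under assumptions: the function names are hypothetical, the retries are assumed to run over the direct connection through the provider, and the Tor reachability check is assumed to have been done beforehand.

```python
import socket
import time
from typing import Callable


def tcp_handshake_ok(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Try one TCP three-way handshake; True if the connection was established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, resets, refused connections and resolution errors.
        return False


def looks_filtered(attempt: Callable[[], bool], retries: int = 5,
                   delay: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """True only if every one of `retries` handshake attempts fails."""
    for i in range(retries):
        if attempt():
            return False  # one success rules out consistent filtering
        if i < retries - 1:
            sleep(delay)  # wait about two seconds between attempts
    return True
```

For a site already known to be reachable through Tor, a call such as `looks_filtered(lambda: tcp_handshake_ok("blocked.example"))` would run the five direct attempts; `blocked.example` is a placeholder hostname. Requiring every attempt to fail is what separates a likely deliberate filter from a transient network error.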

In the end, this blocking method was not found at any of the tested Internet service providers.

HTTP filtering

HTTP filtering, on the other hand, was detected at five of the nine providers. This method involves analyzing the contents of HTTP packets, which is done by the same intermediate network elements (middleboxes).

To identify HTTP filtering, the researchers created Tor circuits ending in countries without Internet censorship. They then compared the content received in response to requests for blocked sites made from within the country with the content received through Tor.

One of the first tasks was to identify the moment at which the block is triggered. With some providers, for example, an HTTP GET request was answered by an HTTP 200 OK response that carried a blocking notification and had the TCP FIN bit set, which forces the client's browser to terminate the connection to the target site. After that, however, a packet from the site itself also arrived. In such cases, it was unclear whether the block was triggered by the client's request or by the site's response.

A simple manipulation settled the question: in the header of the HTTP GET request, the Host field name was replaced with HOST. This turned out to be enough for the blocked site to start opening, which shows that the censors check only the client's requests, not the server's responses.
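The manipulation is easy to reproduce when the request is built by hand. The sketch below is illustrative only: `build_get_request` and the hostname `blocked.example` are assumptions, and nothing is sent over the network.

```python
def build_get_request(hostname: str, path: str = "/",
                      host_header_name: str = "Host") -> bytes:
    """Build a raw HTTP/1.1 GET request with a configurable Host header spelling."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"{host_header_name}: {hostname}",
        "Connection: close",
        "",  # blank line that terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


normal = build_get_request("blocked.example")
altered = build_get_request("blocked.example", host_header_name="HOST")

print(altered.decode("ascii"))
```

HTTP header field names are case-insensitive, so a web server treats both requests identically, while a middlebox that looks for the literal string `Host` no longer recognizes the altered one.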

Conclusion: do all providers implement blocking?

Individual Internet service providers often do not block websites themselves but rely on the providers that manage "neighboring" networks to do it. In this experiment, several Internet service providers were never observed applying their own blocks, yet sites blocked in the country still could not be opened by their users.
