From Layer 4 to Layer 7: Decoding Load Balancers and Proxy Modes - Is It Possible to Track a Hacker?

Load balancers are like the silent gatekeepers of our networks. They make sure that traffic is distributed evenly across our servers, reducing overload, keeping response times low, and minimizing downtime.

In this blog post, we are going to dig into what load balancers do at each layer, the differences between L4 and L7, proxy modes, and how to work around challenges like preserving client IP addresses.

Introduction
Layer 4 Load Balancer
L4 load balancer? Fine! But I want to preserve the client IP
Layer 7 Load Balancer
How Can We Retrieve the Client’s Source IP in Layer 7?
Using the X-Forwarded-For Header
Comparison of Common Public Cloud Load Balancers
So, Can You Catch a Hacker Who's Using a Chain of Dumb Load Balancers?

Layer 4 Load Balancer

Layer 4 load balancers operate at the transport layer (Layer 4) of the OSI model, making routing decisions based on IP addresses and ports. This means they don't inspect the actual content of network packets. They simply pipe traffic blindly.

```
# Example of Layer 4 Load Balancer Configuration (HAProxy)

defaults
    mode tcp                        # Operates in TCP mode (Layer 4)

frontend tcp_front
    bind *:80                       # Listens on port 80 for incoming TCP connections
    default_backend flamingo

backend flamingo                    # The backend pool the load balancer forwards traffic to
    balance roundrobin              # Distributes connections across the servers using the round-robin algorithm
    server server_a 192.168.1.1:80 check
    server server_b 192.168.1.2:80 check
```

mode tcp means that the load balancer runs at Layer 4: it expects a TCP connection and forwards the bytes it carries without interpreting them. That's it.

All these layers and modes are just standards that a group of people agreed on and published. When people say "let's operate in TCP mode", they commit to the rules laid out in RFC 793, so both sides have clear expectations. After all, it's all code, bits, and bytes.

For example, if I were to write a Layer 4 load balancer in Python or any other programming language, I would do the following. Whenever my load balancer receives a connection:

I should create a socket that will send the data to the backend server, or to one server from a pool.
When I send or receive data, these are the expectations:
- Expectations from the receiver: since my load balancer runs in TCP mode (in other words, it is a Layer 4 load balancer), the receiver expects a TCP segment wrapped in an IP packet. The IP header carries the source and destination IP addresses, the TCP header carries the source and destination ports, and the payload carries the content.
- Expectations from the sender: since the sender knows it is talking to a Layer 4 load balancer, it sends packets whose source address is its own IP and whose destination address is the load balancer's IP, along with the TCP payload.
  However, a load balancer built this way opens its own connection to the backend, so the original source IP address is lost on the way. That's why people call them dumb load balancers.
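The steps above can be sketched as a tiny TCP relay. This is only a minimal illustration under stated assumptions, not how HAProxy is implemented: the function names `pipe`, `handle`, and `serve` are hypothetical, there is no health checking or timeout handling, and each connection gets its own threads.

```python
import itertools
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src closes; the payload is never inspected."""
    try:
        while chunk := src.recv(4096):
            dst.sendall(chunk)
    except OSError:
        pass  # one side went away; fall through to the shutdown below
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # tell the other side no more data is coming
        except OSError:
            pass


def handle(client, backend_addr):
    """Open a new TCP connection to the backend and relay bytes both ways."""
    # This second connection originates from the load balancer, so the backend
    # sees the load balancer's address as the source, not the client's.
    backend = socket.create_connection(backend_addr)
    reverse = threading.Thread(target=pipe, args=(backend, client), daemon=True)
    reverse.start()
    pipe(client, backend)
    reverse.join()
    client.close()
    backend.close()


def serve(listener, backends):
    """Accept connections forever and hand each one to the next backend (round robin)."""
    pool = itertools.cycle(backends)
    with listener:
        while True:
            client, _client_addr = listener.accept()
            threading.Thread(
                target=handle, args=(client, next(pool)), daemon=True
            ).start()
```

To try it, something like `serve(socket.create_server(("0.0.0.0", 8080)), [("192.168.1.1", 80), ("192.168.1.2", 80)])` would relay port 8080 to the two servers from the HAProxy example. Note that `_client_addr` is known to the relay but never reaches the backend.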

As we said, in TCP mode each hop only knows the sender IP and the receiver IP of its own connection. If we have a client, a load balancer, and a backend server behind it, this is the flow you should expect:

The client will know its own IP address and the load balancer's IP address.
The load balancer will know the client’s IP address and its own IP address.
The backend server will know the load balancer's IP address and its own IP address.

So as you can see, the backend server has lost the client's IP address: it only knows the load balancer's IP address.

L4 load balancer? Fine! But I want to preserve the client IP

Although L4 load balancers are called dumb, they are used for their speed and anonymity. An L4 LB does not process the traffic; it forwards it directly. The receiver only sees the LB's IP address and can't know the actual sender's IP!

But in some cases, we want to keep track of the client IP while still benefiting from L4's speed and anonymity features. What should we do?

Many people had the same issue, until Willy Tarreau from HAProxy created the PROXY protocol.

So how does the PROXY protocol solve the issue? It prepends a specially formatted header to the start of the TCP connection. The header stores the client's original IP address, allowing backend servers to identify where the original request comes from!

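If we want to enable the PROXY protocol in HAProxy, for example, we can do the following:

frontend mywebsite
    bind :80 accept-proxy
    default_backend webservers

With accept-proxy on the bind line, HAProxy expects every incoming connection to start with a PROXY protocol header, sent by the load balancer in front of it, and uses the client address from that header instead of the connection's source address. For HAProxy itself to pass the client IP on to its backend servers, add send-proxy to the corresponding server lines.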
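To make the header concrete, here is a small sketch of building and parsing the human-readable version 1 form of the PROXY protocol, which is a single line of the shape `PROXY TCP4 <source IP> <destination IP> <source port> <destination port>` terminated by CRLF. The helper names `build_proxy_v1` and `parse_proxy_v1` are made up for this post, and the sketch ignores the binary version 2 format and the `PROXY UNKNOWN` variant.

```python
import ipaddress


def build_proxy_v1(src_ip, dst_ip, src_port, dst_port):
    """Build the PROXY v1 line a load balancer sends before any client data."""
    family = "TCP6" if ipaddress.ip_address(src_ip).version == 6 else "TCP4"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data):
    """Split a v1 header off the front of a byte stream.

    Returns ((client_ip, client_port), remaining_payload).
    """
    line, sep, rest = data.partition(b"\r\n")
    if not sep or not line.startswith(b"PROXY "):
        raise ValueError("missing PROXY protocol v1 header")
    parts = line.decode("ascii").split(" ")
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("unsupported PROXY header")
    _, _family, src_ip, _dst_ip, src_port, _dst_port = parts
    ipaddress.ip_address(src_ip)  # raises ValueError if this is not an IP address
    return (src_ip, int(src_port)), rest


header = build_proxy_v1("203.0.113.7", "192.168.1.1", 51234, 80)
client, payload = parse_proxy_v1(header + b"GET / HTTP/1.1\r\n\r\n")
print(client)  # ('203.0.113.7', 51234)
```

A backend should only parse this header on connections coming from a load balancer it trusts, since anyone who can connect directly could otherwise claim any source address.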

Layer 7 Load Balancer

Unlike Layer 4, a Layer 7 load balancer operates at the application layer (Layer 7) of the OSI model, which means it has full visibility into the content of the network packets. This enables it to make more complex routing decisions based on information like HTTP headers, cookies, and URLs, rather than just IP addresses and ports. This is particularly useful for HTTP-based applications where requests can be directed to different servers based on the content within the HTTP request itself.

Also, when running an L7 load balancer, TLS is usually terminated at the load balancer rather than at the backend servers, so the traffic is decrypted and inspected, and then the LB decides where to send it based on the content of the HTTP request.

# Example of Layer 7 Load Balancer Configuration (HAProxy)

frontend http_front
    bind *:80                       # Listens on port 80 for HTTP connections
    mode http                        # Operates in HTTP mode (Layer 7)
    acl is_blog_path path_beg /blog  # Checks if the URL path begins with /blog
    use_backend blog_backend if is_blog_path
    default_backend main_backend     # Default backend for non-matching requests

backend blog_backend
    balance roundrobin               # Distributes requests based on round-robin
    server server_a 192.168.1.3:80 check
    server server_b 192.168.1.4:80 check

backend main_backend
    balance roundrobin
    server server_c 192.168.1.5:80 check
    server server_d 192.168.1.6:80 check

In this configuration:

- mode http specifies that the load balancer should operate in Layer 7, allowing it to interpret HTTP headers and URLs.
- ACLs (Access Control Lists) allow for conditions to be checked on incoming requests. Here, we use an ACL to check if the URL path begins with /blog. If it does, requests are routed to the blog_backend.
- Backend Pools like blog_backend and main_backend define groups of servers to which traffic is directed based on specified rules.
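The same decision can be sketched in a few lines of Python to show what "routing on content" means. This is an illustration of the logic only, not HAProxy's implementation: `choose_backend` and `route` are hypothetical names, and the pools simply reuse the addresses from the configuration above.

```python
import itertools

# Backend pools copied from the HAProxy example; cycle() gives round-robin order.
POOLS = {
    "blog_backend": itertools.cycle(["192.168.1.3:80", "192.168.1.4:80"]),
    "main_backend": itertools.cycle(["192.168.1.5:80", "192.168.1.6:80"]),
}


def choose_backend(path):
    """Mirror of the ACL: paths beginning with /blog go to blog_backend."""
    return "blog_backend" if path.startswith("/blog") else "main_backend"


def route(request_line):
    """Pick a server for a request line such as 'GET /blog/post-1 HTTP/1.1'."""
    _method, path, _version = request_line.split(" ", 2)
    return next(POOLS[choose_backend(path)])


print(route("GET /blog/post-1 HTTP/1.1"))  # 192.168.1.3:80
print(route("GET /about HTTP/1.1"))        # 192.168.1.5:80
```

A Layer 4 load balancer could not make this choice, because it never parses the request line in the first place.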

How Can We Retrieve the Client’s Source IP in Layer 7?

In a Layer 7 load balancer, the original source IP address from the client may be masked by the load balancer itself, which typically sends traffic to the backend servers with its own IP as the source. To address this, many Layer 7 load balancers support the X-Forwarded-For (XFF) header, a de facto standard HTTP header in which the load balancer records the client's original IP address on each request.

# example of how you might configure this in HAProxy:
frontend http_front
    bind *:80
    mode http
    option forwardfor   # Add X-Forwarded-For header for all traffic 
    default_backend app_servers

backend app_servers
    balance roundrobin
    server server_a 192.168.1.3:80 check
    server server_b 192.168.1.4:80 check

Once the forwardfor option is added, backend servers can read the X-Forwarded-For header to determine the client's original IP. For instance, in a web application, you might access it as follows:

In Python: request.headers.get('X-Forwarded-For', '').split(',')[0].strip()
In Apache/Nginx: Enable logging for X-Forwarded-For to capture the client IP.
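One caveat with taking the first entry: a client can send its own X-Forwarded-For header, and each proxy appends to whatever is already there, so the leftmost value may be forged. A more defensive sketch, assuming you know how many proxies you control sit in front of the application, counts from the right instead. The function name `client_ip_from_xff` and the `trusted_proxies` parameter are hypothetical, not part of any framework.

```python
import ipaddress


def client_ip_from_xff(header_value, peer_ip, trusted_proxies=1):
    """Return the address recorded by the outermost proxy we trust.

    Falls back to peer_ip (the TCP source address) when the header is
    missing, too short, or does not contain a valid IP address.
    """
    hops = [h.strip() for h in header_value.split(",") if h.strip()]
    # Each trusted proxy appends the address it received the connection from,
    # so the entry we want sits `trusted_proxies` positions from the right.
    if trusted_proxies < 1 or len(hops) < trusted_proxies:
        return peer_ip
    candidate = hops[-trusted_proxies]
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        return peer_ip
    return candidate


# The client forged "1.2.3.4"; the load balancer appended the real source address.
print(client_ip_from_xff("1.2.3.4, 203.0.113.7", "192.168.1.10"))  # 203.0.113.7
```

With a single load balancer in front of the application, `trusted_proxies=1` reads the value that load balancer wrote and ignores anything the client supplied.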

Comparison of Common Public Cloud Load Balancers: Layer Support and Proxy Mode Capabilities

Modern cloud platforms offer managed load balancers, so you don't need to set one up yourself. Here are some popular choices, alongside the self-hosted HAProxy and NGINX:

Load Balancer	Layer 4 Support	Layer 7 Support	Proxy Mode Support
AWS Elastic Load Balancer (ELB) - Classic Load Balancer	Yes	Yes	Yes
AWS Application Load Balancer (ALB)	No	Yes	Yes
AWS Network Load Balancer (NLB)	Yes	No	Yes
Azure Load Balancer	Yes	No	No
Azure Application Gateway	No	Yes	Yes
Google Cloud Load Balancing (TCP/UDP Load Balancer)	Yes	No	No
Google Cloud Load Balancing (HTTP(S) Load Balancer)	No	Yes	Yes
HAProxy	Yes	Yes	Yes
NGINX	Yes	Yes	Yes

Please correct me in the comments section if there is a mistake within the table!

So, Can You Catch a Hacker Who's Using a Chain of Dumb Load Balancers?

Imagine a hacker routing their traffic through a chain of "dumb" Layer 4 load balancers, each stripping away crucial source IP information as it forwards the traffic. Could you still trace their origin? How would you begin tracking them down with this kind of setup?

Let’s hear your thoughts! How do you think you could track down a hacker in this scenario? Drop your ideas in the comments below, and let’s see if we can solve this mystery together!

./Crashloop.sh