From Layer 4 to Layer 7: Decoding Load Balancers and Proxy Modes— Is It Possible to Track a Hacker?


Load balancers are like the silent gatekeepers of our networks. They make sure that traffic is distributed evenly across our servers, reducing overload, keeping response times low, and minimizing downtime.
In this blog post, we are going to dig into what load balancers do at each layer, the differences between L4 and L7, proxy modes and how to work around challenges like preserving client IP addresses.
Table of Contents
- Introduction
- Layer 4 Load Balancer
- L4 load balancer? Fine! But I want to preserve the client IP
- Layer 7 Load Balancer
- How Can We Retrieve the Client’s Source IP in Layer 7?
- Using the X-Forwarded-For Header
- Comparison of Common Public Cloud Load Balancers
- So, Can You Catch a Hacker Who's Using a Chain of Dumb Load Balancers?
Layer 4 Load Balancer
Layer 4 load balancers operate at the transport layer (Layer 4) of the OSI model, making routing decisions based on IP addresses and ports. This means they don't inspect the actual content of network packets. They simply pipe traffic blindly.

```
# Example of Layer 4 Load Balancer Configuration (HAProxy)
defaults
    mode tcp                                # Operates in TCP mode

frontend tcp_front
    bind *:80                               # Listens on port 80 for incoming TCP connections
    default_backend flamingo

backend flamingo                            # Backend pool the load balancer forwards traffic to
    balance roundrobin                      # Distributes connections across the servers using the round-robin algorithm
    server server_a 192.168.1.1:80 check
    server server_b 192.168.1.2:80 check
```
mode tcp means that the load balancer runs at Layer 4: it relays TCP connections and expects nothing more than a TCP stream. That's it.
All these layers and modes are just standards that a group of people agreed on and published. When people say "let's operate in TCP mode", everyone is expected to follow the same standard, in this case RFC 793 (since superseded by RFC 9293), so that both sides have clear expectations. After all, it's all code, bits, and bytes.
For example, if I were writing a Layer 4 load balancer in Python or any other programming language, this is what it should do whenever it receives a connection:
- Create a socket that sends the data to the backend server, or to one server from a pool.
- When sending or receiving data, meet these expectations:
  - Expectations of the receiver: because the load balancer runs in TCP mode (which is the same as saying it is a Layer 4 load balancer), the backend expects ordinary TCP/IP packets: an IP header carrying the source and destination IP addresses, a TCP header carrying the source and destination ports, and the payload.
  - Expectations of the sender: the client likewise sends ordinary TCP/IP packets, with its own IP address as the source and the load balancer's IP address as the destination, followed by the payload.
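The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how HAProxy is implemented: the backend addresses are borrowed from the HAProxy example, and the listening port 8080 and the function names (pick_backend, pipe, handle, serve) are assumptions made for this sketch.

```python
import itertools
import socket
import threading

# Hypothetical backend pool, mirroring the HAProxy example.
BACKENDS = [("192.168.1.1", 80), ("192.168.1.2", 80)]
_rotation = itertools.cycle(BACKENDS)


def pick_backend():
    """Round-robin: return the next backend in the pool."""
    return next(_rotation)


def pipe(src, dst):
    """Copy bytes from src to dst until src reaches end-of-stream.

    The bytes are never parsed or inspected: that is the Layer 4 part.
    """
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # tell the other side we are done
        except OSError:
            pass


def handle(client):
    """Open a NEW TCP connection to a backend and relay in both directions."""
    with client, socket.create_connection(pick_backend()) as backend:
        reverse = threading.Thread(target=pipe, args=(backend, client), daemon=True)
        reverse.start()
        pipe(client, backend)
        reverse.join()


def serve(listen_port=8080):
    """Accept client connections forever, one relay thread per connection."""
    with socket.create_server(("", listen_port)) as listener:
        while True:
            client, _client_addr = listener.accept()
            threading.Thread(target=handle, args=(client,), daemon=True).start()
```

Note that handle opens a brand-new connection to the backend, so the source address the backend sees belongs to the load balancer. The client address returned by accept() never leaves the load balancer.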
However, a load balancer that proxies connections this way loses the client's source IP address along the way. That's why people call them dumb load balancers.
As we said, in TCP mode each hop only knows the sender's IP and the receiver's IP. With a client, a load balancer, and a backend server behind it, this is the flow you should expect:
- The client knows its own IP address and the load balancer's IP address.
- The load balancer knows the client’s IP address and its own IP address.
- The backend server knows the load balancer's IP address and its own IP address.
So the backend server never sees the client's IP address; it only knows the load balancer's IP address.
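To see this flow on a single machine, here is a small loopback experiment in Python. It is a sketch under assumptions: everything runs on 127.0.0.1, so only the port numbers differ between the parties, and the names run_demo and seen are made up for this example.

```python
import socket
import threading


def run_demo():
    """Relay one message client -> load balancer -> backend and record who saw whom."""
    backend_srv = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    lb_srv = socket.create_server(("127.0.0.1", 0))
    seen = {}

    def backend():
        conn, peer = backend_srv.accept()
        with conn:
            seen["backend_peer"] = peer  # the address the backend believes it is talking to
            conn.recv(1024)

    def load_balancer():
        conn, peer = lb_srv.accept()
        with conn:
            seen["lb_peer"] = peer  # the real client address is known here
            with socket.create_connection(backend_srv.getsockname()) as out:
                seen["lb_outgoing"] = out.getsockname()
                out.sendall(conn.recv(1024))

    threads = [threading.Thread(target=backend), threading.Thread(target=load_balancer)]
    for t in threads:
        t.start()
    with socket.create_connection(lb_srv.getsockname()) as client:
        seen["client"] = client.getsockname()
        client.sendall(b"hello")
    for t in threads:
        t.join()
    backend_srv.close()
    lb_srv.close()
    return seen


if __name__ == "__main__":
    for name, addr in run_demo().items():
        print(name, addr)
```

The load balancer's lb_peer matches the client's own address, while the backend's backend_peer matches the load balancer's outgoing socket, not the client.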
L4 load balancer? Fine! But I want to preserve the client IP
Although L4 load balancers are called dumb, they are widely used for their speed and for the anonymity they give clients: an L4 LB does no processing on the payload and forwards the traffic directly, and the receiver sees only the LB's IP address, not the actual sender's IP.
But in some cases we want to know the client IP while still benefiting from L4 speed. What should we do?
Many people had the same issue, until Willy Tarreau from HAProxy created the PROXY protocol.
So how does the PROXY protocol solve the issue? The load balancer prepends a specially formatted header to the start of each TCP connection it opens to the backend. The header stores the client's original IP address and port, allowing backend servers to identify where the original request came from.
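To make the header concrete, here is a minimal Python sketch of version 1 of the PROXY protocol, the human-readable text format: a single line with the address family, source and destination addresses, and source and destination ports, terminated by CRLF. The helper names build_proxy_v1 and parse_proxy_v1 are made up for this example, and the sketch ignores the binary version 2 format entirely.

```python
import ipaddress


def build_proxy_v1(src_ip, src_port, dst_ip, dst_port):
    """Build the PROXY v1 line a load balancer sends before any client data."""
    family = "TCP6" if ipaddress.ip_address(src_ip).version == 6 else "TCP4"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data):
    """Split a PROXY v1 header off the front of data.

    Returns (info, payload); info is None when the sender reported UNKNOWN.
    """
    line, sep, payload = data.partition(b"\r\n")
    parts = line.decode("ascii", errors="replace").split(" ")
    if not sep or len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("missing PROXY protocol v1 header")
    if parts[1] == "UNKNOWN":
        return None, payload
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY protocol v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = parts
    info = {
        "family": family,
        "src_ip": src_ip,          # the original client address
        "src_port": int(src_port),
        "dst_ip": dst_ip,
        "dst_port": int(dst_port),
    }
    return info, payload


if __name__ == "__main__":
    header = build_proxy_v1("203.0.113.7", 51000, "192.168.1.1", 80)
    info, payload = parse_proxy_v1(header + b"GET / HTTP/1.1\r\n\r\n")
    print(header, info, payload)
```

A backend should only parse this header on connections that come from a load balancer it trusts; otherwise any client could claim an arbitrary source address.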
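If we want to enable the PROXY protocol in HAProxy, for example, we can do the following:

    frontend mywebsite
        bind :80 accept-proxy
        default_backend webservers

With accept-proxy on the bind line, HAProxy expects every incoming connection on that port to start with a PROXY protocol header, sent by whatever load balancer sits in front of it, and reads the client IP from that header. To make HAProxy itself pass the client IP on to its backend servers, add send-proxy to the server lines of the backend, for example: server server_a 192.168.1.1:80 check send-proxy. The backend server must be configured to accept the header as well.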
Layer 7 Load Balancer
Unlike Layer 4, a Layer 7 load balancer operates at the application layer (Layer 7) of the OSI model, which means it has full visibility into the content of the network packets. This enables it to make more complex routing decisions based on information like HTTP headers, cookies, and URLs, rather than just IP addresses and ports. This is particularly useful for HTTP based applications where requests can be directed to different servers based on the content within the HTTP request itself.
Also, when running an L7 load balancer, TLS is typically terminated at the load balancer and not at the backend servers, so the traffic is decrypted and inspected, and the LB then decides where to send it based on the content of the HTTP request.

    # Example of Layer 7 Load Balancer Configuration (HAProxy)
    frontend http_front
        bind *:80                           # Listens on port 80 for HTTP connections
        mode http                           # Operates in HTTP mode (Layer 7)
        acl is_blog_path path_beg /blog     # Checks if the URL path begins with /blog
        use_backend blog_backend if is_blog_path
        default_backend main_backend        # Default backend for non-matching requests

    backend blog_backend
        balance roundrobin                  # Distributes requests based on round-robin
        server server_a 192.168.1.3:80 check
        server server_b 192.168.1.4:80 check

    backend main_backend
        balance roundrobin
        server server_c 192.168.1.5:80 check
        server server_d 192.168.1.6:80 check

In this configuration:
- mode http specifies that the load balancer should operate in Layer 7, allowing it to interpret HTTP headers and URLs.
- ACLs (Access Control Lists) allow for conditions to be checked on incoming requests. Here, we use an ACL to check if the URL path begins with /blog. If it does, requests are routed to the blog_backend.
- Backend pools like blog_backend and main_backend define groups of servers to which traffic is directed based on specified rules.
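The same routing decision can be written out in Python to show what "looking inside the request" means. This is a simplified sketch rather than HAProxy's actual logic: the POOLS dictionary and the choose_backend function are hypothetical names, and only the HTTP request line is parsed.

```python
import itertools

# Hypothetical pools mirroring blog_backend and main_backend from the config.
POOLS = {
    "blog_backend": itertools.cycle(["192.168.1.3:80", "192.168.1.4:80"]),
    "main_backend": itertools.cycle(["192.168.1.5:80", "192.168.1.6:80"]),
}


def choose_backend(request_line):
    """Pick a pool from the URL path, then a server from that pool by round-robin."""
    _method, target, _version = request_line.split(" ", 2)
    path = target.split("?", 1)[0]  # drop the query string
    # Equivalent of: acl is_blog_path path_beg /blog
    pool = "blog_backend" if path.startswith("/blog") else "main_backend"
    return pool, next(POOLS[pool])


if __name__ == "__main__":
    print(choose_backend("GET /blog/hello-world HTTP/1.1"))
    print(choose_backend("GET /pricing?plan=pro HTTP/1.1"))
```

A Layer 4 load balancer cannot make this decision, because it never parses the request line in the first place.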
How Can We Retrieve the Client’s Source IP in Layer 7?
In a Layer 7 load balancer, the client's original source IP address is masked by the load balancer itself, which opens its own connection to the backend servers and therefore appears as the source. To address this, most Layer 7 load balancers support the X-Forwarded-For (XFF) header, a de facto standard HTTP header in which the load balancer records the client’s original IP address on each request.

    # Example of how you might configure this in HAProxy:
    frontend http_front
        bind *:80
        mode http
        option forwardfor                   # Add X-Forwarded-For header for all traffic
        default_backend app_servers

    backend app_servers
        balance roundrobin
        server server_a 192.168.1.3:80 check
        server server_b 192.168.1.4:80 check

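Once the forwardfor option is added, backend servers can read the X-Forwarded-For header to determine the client's original IP. For instance, in a web application, you might access it as follows:
In Python (for example, with Flask): request.headers.get('X-Forwarded-For', '').split(',')[0].strip()
In Apache/Nginx: Enable logging for X-Forwarded-For to capture the client IP.
Keep in mind that clients can send their own X-Forwarded-For header, so the first entry is only trustworthy if your load balancer overwrites or sanitizes the header.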
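Because each proxy appends the address it saw to whatever X-Forwarded-For value arrived with the request, a more defensive way to read the header is to walk it from right to left and stop at the first address that is not one of your own proxies. The Python sketch below illustrates the idea; the TRUSTED_PROXIES network and the function names are assumptions for this example, not part of any framework.

```python
import ipaddress

# Hypothetical: the network our own load balancers live in.
TRUSTED_PROXIES = [ipaddress.ip_network("192.168.1.0/24")]


def append_xff(existing, peer_ip):
    """What a proxy does: append the peer address it saw to the header value."""
    return f"{existing}, {peer_ip}" if existing else peer_ip


def is_trusted(ip):
    """True if ip parses as an address inside one of our proxy networks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXIES)


def client_ip_from_xff(xff, peer_ip):
    """Return the nearest address to us that was NOT reported by a trusted proxy.

    Start from the TCP peer; while the current candidate is one of our own
    proxies, believe the entry it appended and step one hop further left.
    """
    hops = [h.strip() for h in xff.split(",") if h.strip()]
    candidate = peer_ip
    for hop in reversed(hops):
        if not is_trusted(candidate):
            break
        candidate = hop
    return candidate


if __name__ == "__main__":
    # The client sent a forged "1.2.3.4"; our load balancer appended the real peer.
    header = append_xff("1.2.3.4", "203.0.113.7")
    print(client_ip_from_xff(header, "192.168.1.10"))
```

With this approach a forged leftmost entry is ignored, whereas taking the first element of the header would return whatever the client chose to send.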
Comparison of Common Public Cloud Load Balancers: Layer Support and Proxy Mode Capabilities
Modern cloud platforms offer managed load balancers, so you don’t need to set up one. Here are some popular choices:
Load Balancer | Layer 4 Support | Layer 7 Support | Proxy Mode Support |
---|---|---|---|
AWS Elastic Load Balancer (ELB) - Classic Load Balancer | Yes | Yes | Yes |
AWS Application Load Balancer (ALB) | No | Yes | Yes |
AWS Network Load Balancer (NLB) | Yes | No | Yes |
Azure Load Balancer | Yes | No | No |
Azure Application Gateway | No | Yes | Yes |
Google Cloud Load Balancing (TCP/UDP Load Balancer) | Yes | No | No |
Google Cloud Load Balancing (HTTP(S) Load Balancer) | No | Yes | Yes |
HAProxy | Yes | Yes | Yes |
NGINX | Yes | Yes | Yes |
Please correct me in the comments section if there is a mistake within the table!
So, Can You Catch a Hacker Who's Using a Chain of Dumb Load Balancers?
Imagine a hacker routing their traffic through a chain of “dumb” Layer 4 load balancers, each stripping away crucial source IP information as it forwards the traffic. Could you still trace their origin? How would you begin tracking them down with this kind of setup?
Let’s hear your thoughts! How do you think you could track down a hacker in this scenario? Drop your ideas in the comments below, and let’s see if we can solve this mystery together!