HAProxy 101

Arunvel Arunachalam
9 min read · Mar 15, 2020


This story is long overdue. I wanted to cover it as part of my Docker 101 & Minikube 101 stories, but due to work commitments I could not finish it in time.

If you have not read Docker 101 & Minikube 101, do go through them.

Docker 101 = https://medium.com/@csemanit2015/docker-101-6f4bb77dda8c

Minikube 101 = https://medium.com/@csemanit2015/minikube-101-kubernetes-local-cluster-deploment-service-heapster-dashboard-hpaminikube-715828ae235d

A big shout out to all the fighters and volunteers directly and indirectly involved in mitigating COVID-19. It's a tragedy, but I am sure the world will overcome it. (We are born fighters)

Agenda

  1. HAProxy Introduction
  2. Difference between Reverse Proxy & Forward Proxy
  3. HAProxy 101 Labs

HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP (e.g. database) and HTTP-based (web) applications, spreading requests across multiple servers. It is written in C and has a reputation for being fast and efficient.

Initial Release = 16/12/2001

Latest Stable Release = 21/12/2019 (2.1.2)

Q) Why HAProxy?

A) HAProxy is the default router used in OpenShift. The routes created for an application in OpenShift are served via HAProxy. (Articles on OpenShift to be published soon)

Q) What is OpenShift?

A) OpenShift (PaaS) is a family of containerization software developed by Red Hat. Its flagship product is the OpenShift Container Platform, an on-premises platform as a service built around Docker containers orchestrated and managed by Kubernetes on a foundation of Red Hat Enterprise Linux.

Q) What is a Load Balancer?

A) A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase the capacity (concurrent users) and reliability of applications.

HAProxy = The Reliable High Performance TCP/HTTP Load Balancer

HAProxy Timeline

Predominantly there are two kinds of load balancers

  1. Hardware Load Balancer
  2. Software Load Balancer

The most obvious difference between hardware and software load balancers is that hardware load balancers require proprietary, rack-and-stack hardware appliances, while software load balancers are simply installed on standard x86 servers or virtual machines.

Q) Are a reverse proxy (HAProxy, NGINX, Varnish) and a forward proxy (Squid) the same?

A) No. The main difference between the two is that a forward proxy is used by the client, such as a web browser, whereas a reverse proxy is used by the server, such as a web server. A forward proxy can reside in the same internal network as the client, or it can be on the Internet.

Forward Proxy = A forward proxy can be used by the client to bypass firewall restrictions in order to visit websites that are blocked.

Reverse Proxy = A reverse proxy is mainly used by server admins to achieve load balancing and high availability. A website may have several web servers behind the reverse proxy. The reverse proxy server takes requests from the Internet and forwards them to one of the web servers.

To summarize: HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications.

Let's get our hands dirty with some HAProxy labs.

PC Configurations

  1. Two Virtual Machines (OS = Ubuntu 16.04)
  2. System Configuration: CPU = 2 vCPU, RAM = 4 GB, HDD = 50 GB

This is my machine configuration. Anything around 1 vCPU, 1 GB RAM and 20 GB HDD space will be fine.

hostname = client (ip = 192.168.0.103/24)

hostname = server (ip = 192.168.0.105/24)

Let's install HAProxy on the server (ip = 192.168.0.105)

Q) What is software-properties-common used for?

A) This software provides an abstraction of the apt repositories in use. It allows you to easily manage your distribution and independent software vendor software sources.

Q) What is add-apt-repository?

A) add-apt-repository is a Python script that allows you to add an APT repository to either /etc/apt/sources.list or to a separate file in the /etc/apt/sources.list.d directory. The command can also be used to remove an already existing repository

Here we are adding the haproxy-1.6 repository.

haproxy version = 1.6.15
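A minimal sketch of the installation steps on the server, assuming the haproxy-1.6 packages come from the ppa:vbernat/haproxy-1.6 PPA (the exact PPA name is an assumption, so adjust it if your source differs). The commands are wrapped in a function, install_haproxy, a name made up for this example, so that nothing is installed until you call it yourself.

```shell
# Sketch only: run on the server VM (Ubuntu 16.04).
# install_haproxy is a hypothetical helper name; the PPA is an assumption.
install_haproxy() {
    sudo apt-get update
    # software-properties-common provides the add-apt-repository command
    sudo apt-get install -y software-properties-common
    # add the repository that carries the HAProxy 1.6 branch
    sudo add-apt-repository -y ppa:vbernat/haproxy-1.6
    sudo apt-get update
    sudo apt-get install -y haproxy
    # print the installed version to confirm which branch you got
    haproxy -v
}
```

Calling install_haproxy on the server runs the steps in order; haproxy -v at the end should report a 1.6.x version.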

Q) Who is Willy Tarreau?

A) HAProxy was written in 2000 by Willy Tarreau, a core contributor to the Linux kernel, who still maintains the project

An HAProxy configuration file guides the behavior of your HAProxy load balancer.

There are four essential sections in an HAProxy configuration file: global, defaults, frontend, and backend. These four sections define how the server as a whole performs, what your default settings are, and how client requests are received and routed to your backend servers.

If you compare the world of reverse proxies to a relay race, then global, defaults, frontend and backend are the star runners. Each section plays a vital role, handing the baton to the next in line.
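To make the four sections concrete, here is a minimal sketch of how they might sit together in /etc/haproxy/haproxy.cfg. The names http_in, webserver and web1, the port numbers and the timeout values are placeholders picked for illustration, not recommended settings.

```haproxy
# global: process-wide settings for the HAProxy daemon itself
global
    log /dev/log local0
    maxconn 2000
    user haproxy
    group haproxy
    daemon

# defaults: values inherited by every frontend/backend that follows
defaults
    log global
    mode http
    timeout connect 5s
    timeout client  50s
    timeout server  50s

# frontend: where client connections are accepted (placeholder name)
frontend http_in
    bind *:8080
    default_backend webserver

# backend: the pool of servers that actually answer the requests
backend webserver
    server web1 127.0.0.1:80
```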

Q) We have our Load Balancer Ready, where is our web server?

A) Ideally the load balancer and the web server should be two different physical servers or virtual machines.

In our case, we have both in one single virtual machine. (Our Ubuntu server will host both HAProxy & Nginx)

Q) What is Nginx?

A) Nginx is a web server which can also be used as a reverse proxy, load balancer, mail proxy and HTTP cache. The software was created by Igor Sysoev and first publicly released in 2004

Nginx can be used for many purposes, e.g. as a reverse proxy (Inception... haha), load balancer, mail proxy and HTTP cache.

In our case we are using Nginx only as the web server, and HAProxy as our reverse proxy server.

So our web server is ready and listening on localhost:80 or 192.168.0.105:80
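A quick way to confirm this, sketched under the assumption that nginx was installed from the Ubuntu package (for example with sudo apt-get install nginx) and still serves its default site on port 80. WEB_URL is just a variable name chosen for this example.

```shell
# Sketch only: run on the server VM after nginx has been installed.
WEB_URL="http://localhost:80"   # assumed: nginx default site on port 80

# Ask for the response headers only and look for nginx in the Server header.
if curl -sI --max-time 3 "$WEB_URL" | grep -qi '^Server: nginx'; then
    echo "nginx is answering on $WEB_URL"
else
    echo "no nginx response on $WEB_URL"
fi
```

The same check can be run from the client VM by pointing WEB_URL at http://192.168.0.105:80.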

But we want our clients to hit the HAProxy server first, and HAProxy should then forward the request to our web server (nginx).

So we make some changes in haproxy.cfg.

We bind the frontend to port 8080, so that HAProxy listens on localhost:8080 (or 192.168.0.105:8080).

The traffic that hits the frontend then has to be forwarded to the backend (the web server).
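A minimal sketch of such a frontend section. The backend name webserver matches the one in the error message discussed in this lab; the frontend name main is a placeholder chosen for this example.

```haproxy
# Accept client connections on port 8080 (all interfaces)
frontend main
    bind *:8080
    # hand every request to the backend named "webserver"
    default_backend webserver
```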

Next, we check whether the configuration file is valid.
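HAProxy can validate a configuration file without starting the proxy, using the -c (check) and -f (file) options. A small sketch, assuming the default package location of the config file; CFG is a variable name chosen for this example.

```shell
# Sketch only: validate the config without (re)starting HAProxy.
CFG=/etc/haproxy/haproxy.cfg   # assumed default location on Ubuntu

if command -v haproxy >/dev/null 2>&1 && [ -r "$CFG" ]; then
    # -c = check mode, -f = configuration file to check
    if haproxy -c -f "$CFG"; then
        echo "configuration check passed"
    else
        echo "configuration check failed, see the messages above"
    fi
else
    echo "haproxy or $CFG not found; run this on the server VM"
fi
```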

Here it gives us an error (unable to find required default_backend webserver). This is expected: the frontend points at a backend named webserver, which we have not defined yet.

Here we define our backend server, which is the address and port on which nginx is listening.
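A minimal sketch of the matching backend section. The section name webserver has to match the default_backend of the frontend; the server label nginx1 is a placeholder chosen for this example.

```haproxy
# Backend referenced by "default_backend webserver" in the frontend
backend webserver
    # nginx on the same VM, listening on port 80
    server nginx1 192.168.0.105:80
```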

Client --> 192.168.0.105:8080 (HAProxy) --> 192.168.0.105:80 (nginx)

This is the flow of the traffic: from the client it hits the HAProxy server and is then forwarded to nginx (the web server).

Kudos = The configuration file is valid.

If your eyesight is good enough (because my screenshot is not), you can make out that the client has entered localhost:8080 or 192.168.0.105:8080 and the traffic has been forwarded to our nginx web server.

The same thing can be done using a listen section.

Here we have bound to port 8080, and the traffic is then forwarded to port 80.

The only other change here is that the mode of the traffic is http.
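A listen section combines the frontend and backend parts in one block. A minimal sketch, where the section name webproxy and the server label nginx1 are placeholders chosen for this example:

```haproxy
# One section that both accepts connections and names the servers
listen webproxy
    bind *:8080
    mode http
    server nginx1 192.168.0.105:80
```

If you try this, remove the separate frontend that binds port 8080 first, so that only one section owns the port.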

If you want to check stats for your HAProxy, just add stats enable to haproxy.cfg. The stats page is then served at:

localhost:8080/haproxy?stats
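A minimal sketch of where stats enable could go, assuming it is added to the section that listens on port 8080 (the section name webproxy and the server label nginx1 are placeholders). With no stats uri set, HAProxy serves the page at its default path, /haproxy?stats.

```haproxy
listen webproxy
    bind *:8080
    mode http
    # turn on the built-in statistics page (default URI: /haproxy?stats)
    stats enable
    server nginx1 192.168.0.105:80
```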

This is the final part of this article. HAProxy is huge and you can do many things with it; I will try to cover those in later articles.

Now the problem with a reverse proxy is that the client IP is not visible to the server.

The client sends the request to HAProxy, which then forwards the request to the web server. So the web server can only see the reverse proxy's IP address.

If we want the web server to track the client's IP, we have to make some changes in the HAProxy configuration file and also in the nginx configuration file.

We have added send-proxy to the server line in the HAProxy configuration.
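A minimal sketch of that change, assuming the backend is named webserver and the server label is nginx1 (both placeholders). The send-proxy keyword makes HAProxy send a PROXY protocol header, carrying the original client address, ahead of each connection to that server.

```haproxy
backend webserver
    # send-proxy: prefix each connection with a PROXY protocol header
    server nginx1 192.168.0.105:80 send-proxy
```

The web server has to be configured to expect this header, otherwise it will reject the requests.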

The changes to be made in nginx.conf

/var/log/nginx/proxy.log is the location on the web server where we will get all the client IP addresses and header information.
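A sketch of what the nginx side could look like, assuming HAProxy sends the PROXY protocol header (send-proxy) to port 80. The log format name proxy, the fields chosen for it and the root path are assumptions for illustration; on Ubuntu the server block may live in /etc/nginx/sites-enabled/default rather than in nginx.conf itself.

```nginx
http {
    # $proxy_protocol_addr = client address taken from the PROXY header
    # $remote_addr         = address of the direct peer (HAProxy here)
    log_format proxy '$proxy_protocol_addr - $remote_addr [$time_local] '
                     '"$request" $status "$http_user_agent"';

    server {
        # accept the PROXY protocol header on this listener
        listen 80 proxy_protocol;

        # write the custom format to the log file used in this lab
        access_log /var/log/nginx/proxy.log proxy;

        root /var/www/html;   # assumed default document root on Ubuntu
    }
}
```

Once proxy_protocol is set on the listener, nginx expects the header on every connection to that port, so requests should go through HAProxy rather than directly to port 80.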

Here we can see one client is localhost and another is having an ip address of 192.168.0.104

Here we want to change the stats page uri & also add authentication to the stats page

Now the stats page will open at following url

localhost:8080/report

or

192.168.0.105:8080/report
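A minimal sketch of the stats settings that would produce the /report URL, assuming they live in the section that listens on port 8080. The section name webproxy, the server label nginx1 and the credentials admin:changeme are placeholders; pick your own user name and password.

```haproxy
listen webproxy
    bind *:8080
    mode http
    stats enable
    # serve the stats page at /report instead of the default /haproxy?stats
    stats uri /report
    # require HTTP basic authentication (placeholder credentials)
    stats auth admin:changeme
    server nginx1 192.168.0.105:80
```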

Q) Why have you used one HAProxy server & one Web Server?

A) Due to resource constraints I have used only one HAProxy server and one web server. HAProxy itself can be set up for high availability, and it can also balance across multiple web servers.

Folks, this is just the tip of the HAProxy iceberg. There are many more important concepts in HAProxy, like load balancing algorithms, ACLs, directing traffic, detecting and reacting to failures, and logging.

I will try to cover those in upcoming articles.

Summary

  1. HAProxy Basics
  2. Understood difference between Reverse Proxy & Forward Proxy
  3. HAProxy Installation and working

Well folks, I hope you enjoyed HAProxy 101.

If you were with me and followed my instructions, the summary is the expected outcome. If not, no problem, you can try once again.

Next in the pipeline are articles on Terraform, Varnish and Gogs.

For any comments and feedback, please mail me at csemanit2015@gmail.com
