HAProxy 101
This story is long overdue. I wanted to cover it as part of my Docker 101 & Minikube 101 series, but work commitments kept me from finishing it in time.
If you have not read Docker 101 & Minikube 101, do go through them.
Docker 101 = https://medium.com/@csemanit2015/docker-101-6f4bb77dda8c
Minikube 101 = https://medium.com/@csemanit2015/minikube-101-kubernetes-local-cluster-deploment-service-heapster-dashboard-hpaminikube-715828ae235d
A big shout out to all fighters and volunteers directly and indirectly involved in mitigating COVID-19. It's a tragedy, but I am sure the world will overcome it. (We are born fighters)
Agenda
- HAProxy Introduction
- Difference between Reverse Proxy & Forward Proxy
- HAProxy 101 Labs
HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP (e.g. database) and HTTP-based (web) applications, spreading requests across multiple servers. It is written in C and has a reputation for being fast and efficient.
Initial Release = 16/12/2001
Latest Stable Release = 21/12/2019 (2.1.2)
Q) Why HAProxy?
A) HAProxy is the default router used in OpenShift. The routes created for an application in OpenShift are served via HAProxy. (Articles on OpenShift to be published soon)
Q) What is OpenShift?
A) OpenShift (PaaS) is a family of containerization software developed by Red Hat. Its flagship product is the OpenShift Container Platform, an on-premises platform as a service built around Docker containers orchestrated and managed by Kubernetes on a foundation of Red Hat Enterprise Linux.
Q) What is a Load Balancer?
A) A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across a number of servers. Load balancers are used to increase the capacity (concurrent users) and reliability of applications.
HAProxy = The Reliable High Performance TCP/HTTP Load Balancer
HAProxy Timeline
Predominantly there are two kinds of load balancers:
- Hardware Load Balancer
- Software Load Balancer
The most obvious difference between hardware and software load balancers is that hardware load balancers require proprietary, rack-and-stack hardware appliances, while software load balancers are simply installed on standard x86 servers or virtual machines.
Q) Are a reverse proxy (HAProxy, NGINX & Varnish) and a forward proxy (Squid) the same?
A) No. The main difference between the two is that a forward proxy is used by the client, such as a web browser, whereas a reverse proxy is used by the server, such as a web server. A forward proxy can reside in the same internal network as the client, or it can be on the Internet.
Forward Proxy = A forward proxy can be used by the client to bypass firewall restrictions in order to visit websites that are blocked.
Reverse Proxy = A reverse proxy is mainly used by server admins to achieve load balancing and high availability. A website may have several web servers behind the reverse proxy. The reverse proxy server takes requests from the Internet and forwards these requests to one of the web servers.
Just to summarize ( HAProxy is free, open source software that provides a high availability load balancer and proxy server for TCP and HTTP-based applications)
Let's get our hands dirty with some HAProxy Labs
PC Configurations
1. Two Virtual Machines
OS = Ubuntu 16.04
2. System Configuration
CPU = 2 vCPU
RAM = 4 GB
HDD = 50 GB
This is my machine configuration. Anything around 1 vCPU, 1 GB RAM and 20 GB of HDD space will be fine.
hostname = client (ip = 192.168.0.103/24)
hostname = server (ip = 192.168.0.105/24)
Let's install HAProxy on the server (ip = 192.168.0.105)
Q) What is software-properties-common used for?
A) This software provides an abstraction of the apt repositories in use. It allows you to easily manage your distribution and independent software vendor software sources.
Q) What is add-apt-repository?
A) add-apt-repository is a Python script that allows you to add an APT repository to either /etc/apt/sources.list or to a separate file in the /etc/apt/sources.list.d directory. The command can also be used to remove an already existing repository.
Here we are adding haproxy-1.6
haproxy version = 1.6.15
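The install steps described above can be sketched roughly as follows. This is a sketch under assumptions: the PPA name `ppa:vbernat/haproxy-1.6` and the function name `install_haproxy_16` are not taken from the article, and the commands are wrapped in a function so that nothing is installed until you call it.

```shell
# Sketch only: wrapped in a function so nothing runs until it is called.
install_haproxy_16() {
    # Provides the add-apt-repository command
    sudo apt-get install -y software-properties-common
    # Assumption: the haproxy-1.6 packages come from this community PPA
    sudo add-apt-repository -y ppa:vbernat/haproxy-1.6
    # Refresh the package index so the new repository is picked up
    sudo apt-get update
    sudo apt-get install -y haproxy
    # Print the installed version (the article saw 1.6.15)
    haproxy -v
}
```

On the server VM you would then run `install_haproxy_16` once.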
Q) Who is Willy Tarreau?
A) HAProxy was written in 2000 by Willy Tarreau, a core contributor to the Linux kernel, who still maintains the project.
An HAProxy configuration file guides the behavior of your HAProxy load balancer.
There are four essential sections in an HAProxy configuration file: global, defaults, frontend, and backend. These four sections define how the server as a whole performs, what your default settings are, and how client requests are received and routed to your backend servers.
If you compare the world of reverse proxies to a relay race, then global, defaults, frontend and backend are the star runners. Each section plays a vital role, handing the baton to the next in line.
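To make the four sections concrete, here is a minimal skeleton written to a scratch file rather than to the live /etc/haproxy/haproxy.cfg. The scratch path, the names `http_front` and `nginx1`, and the timeout values are assumptions chosen for illustration; only the section keywords and the backend name `webserver` come from this article.

```shell
# Write a skeleton config to a scratch path (assumption: not the live config).
cat > /tmp/haproxy-101-skeleton.cfg <<'EOF'
global
    # Process-wide settings: logging, run-as user, daemon mode
    log /dev/log local0
    user haproxy
    group haproxy
    daemon

defaults
    # Settings inherited by every frontend and backend below
    mode http
    timeout connect 5s
    timeout client 50s
    timeout server 50s

frontend http_front
    # Where clients connect
    bind *:8080
    default_backend webserver

backend webserver
    # Where requests are sent
    server nginx1 127.0.0.1:80
EOF

# List the section headers that were written
grep -E '^(global|defaults|frontend|backend)' /tmp/haproxy-101-skeleton.cfg
```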
Q) We have our Load Balancer Ready, where is our web server?
A) Ideally the load balancer and web server should be two different physical servers or virtual machines.
In our case, we have both in one single virtual machine. (Our Ubuntu server will host both HAProxy & Nginx)
Q) What is Nginx?
A) Nginx is a web server which can also be used as a reverse proxy, load balancer, mail proxy and HTTP cache. The software was created by Igor Sysoev and first publicly released in 2004.
Yes, Nginx can itself act as a reverse proxy (Inception … HAHA …).
In our case we are only using Nginx as web server
HAProxy as our reverse proxy server
So our web server is ready and listening on localhost:80 or 192.168.0.105:80
But we want our clients to hit the HAProxy server first, and then HAProxy should forward the request to our web server (nginx).
So we make some changes in haproxy.cfg.
First we bind the frontend to the address HAProxy listens on, localhost:8080 or 192.168.0.105:8080.
Then the traffic which hits the frontend has to be forwarded to the backend (web server).
Next, we check whether the configuration file is valid.
Here it gives us an error (unable to find required default_backend webserver), which is a valid error
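The check itself can be run with HAProxy's `-c` (check) flag. A small sketch, where the helper name `check_haproxy_cfg` and the default path are assumptions; it is defined as a function so it only runs when called on a machine with HAProxy installed:

```shell
# Hypothetical helper: validate a config file without starting the proxy.
check_haproxy_cfg() {
    # Assumption: the package's default config location
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    # -c parses the file and exits; -f names the file to parse
    haproxy -c -f "$cfg"
}
```

Calling `check_haproxy_cfg` at this point should report the missing `webserver` backend, because the frontend refers to a backend that has not been defined yet.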
Here we define our backend server, pointing at the address and port on which nginx is listening.
Client -> 192.168.0.105:8080 (HAProxy) -> 192.168.0.105:80 (nginx)
This is the flow of the traffic: from the client it hits the HAProxy server and is then forwarded to nginx (the web server).
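The frontend and backend described above might look like the fragment below, which would sit under the global and defaults sections of haproxy.cfg. The names `http_front` and `nginx1` and the scratch path are assumptions; `webserver` is the backend name from the error message in this article.

```shell
# Write the frontend/backend fragment to a scratch file for review.
cat > /tmp/haproxy-101-frontend.cfg <<'EOF'
frontend http_front
    # Clients connect to HAProxy on port 8080
    bind *:8080
    # Hand every request to the backend named "webserver"
    default_backend webserver

backend webserver
    # nginx on the same machine, listening on port 80
    server nginx1 127.0.0.1:80
EOF
```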
Kudos = The configuration file is valid.
If your eyesight is good enough (because my screenshot is not), you can make out that the client has entered localhost:8080 or 192.168.0.105:8080 and the traffic has been forwarded to our nginx web server.
The same thing can be done using a listen section.
Here we have bound to port 8080, and traffic is then forwarded to port 80.
There is only one change here: the mode of the traffic is http.
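A listen section combines the frontend and backend roles in one block. A minimal sketch, assuming the section name `web`, the server name `nginx1` and the scratch path (none of which come from the article):

```shell
# Write the listen-based variant to a scratch file for review.
cat > /tmp/haproxy-101-listen.cfg <<'EOF'
listen web
    # Accept client connections on port 8080
    bind *:8080
    # Treat the traffic as HTTP rather than raw TCP
    mode http
    # Forward to nginx on port 80
    server nginx1 127.0.0.1:80
EOF
```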
If you want to check stats for your HAProxy, just add stats enable to haproxy.cfg and open the following URL:
localhost:8080/haproxy?stats
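One place `stats enable` could go is inside the listen section, as sketched here. The section and server names and the scratch path are assumptions. With no `stats uri` set, HAProxy serves the page at its default path, /haproxy?stats.

```shell
# Write a listen section with the stats page switched on.
cat > /tmp/haproxy-101-stats.cfg <<'EOF'
listen web
    bind *:8080
    mode http
    # Serve the built-in statistics page on this listener
    stats enable
    server nginx1 127.0.0.1:80
EOF
```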
This is the final part of this article. HAProxy is huge and you can do many things with it; I will try to cover those in later articles.
Now, the problem with a reverse proxy is that the client IP is not visible to the server.
The client sends the request to HAProxy, which then forwards the request to the web server. So the web server is only able to see the reverse proxy's IP address.
If we want the web server to track the client's IP, we have to make some changes in the HAProxy configuration file and also in the nginx configuration file.
We have added send-proxy to the server line in haproxy.cfg.
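As a sketch, `send-proxy` is appended to the server line so that HAProxy prefixes each connection with a PROXY protocol header carrying the original client address. The server name `nginx1` and the scratch path are assumptions.

```shell
# Write the backend with PROXY protocol enabled to a scratch file.
cat > /tmp/haproxy-101-sendproxy.cfg <<'EOF'
backend webserver
    # send-proxy: pass the client's address to nginx via the PROXY protocol
    server nginx1 127.0.0.1:80 send-proxy
EOF
```

The receiving server must be configured to expect this header, otherwise it will reject the connection as a malformed request.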
The changes to be made in nginx.conf
/var/log/nginx/proxy.log is the location on the web server where we will get all client IP addresses and header information.
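The nginx side might look roughly like this. It is a sketch, not the author's exact file: the log format name `proxy`, the fields logged, the web root and the scratch path are assumptions. The `proxy_protocol` parameter on `listen` and the `$proxy_protocol_addr` variable are standard nginx features.

```shell
# Write the nginx fragment to a scratch file; the quoted heredoc keeps $variables literal.
cat > /tmp/nginx-101-proxy.conf <<'EOF'
# Goes inside the http { } block of nginx.conf
log_format proxy '$proxy_protocol_addr - $remote_user [$time_local] '
                 '"$request" $status $body_bytes_sent "$http_user_agent"';

server {
    # Expect a PROXY protocol header on every connection to port 80
    listen 80 proxy_protocol;
    # Log the real client address taken from that header
    access_log /var/log/nginx/proxy.log proxy;
    # Assumption: Ubuntu's default web root
    root /var/www/html;
}
EOF
```

Once `proxy_protocol` is set on the listener, clients that connect to port 80 directly, without going through HAProxy, will no longer be served.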
Here we can see one client is localhost and another is having an ip address of 192.168.0.104
Here we want to change the stats page URI and also add authentication to the stats page.
Now the stats page will open at the following URL:
localhost:8080/report
or
192.168.0.105:8080/report
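A sketch of the stats settings that would produce the /report URL with a login prompt. The credentials `admin:changeme`, the section and server names and the scratch path are placeholders, not values from the article.

```shell
# Write a listen section with a custom stats URI and basic authentication.
cat > /tmp/haproxy-101-stats-auth.cfg <<'EOF'
listen web
    bind *:8080
    mode http
    stats enable
    # Serve the stats page at /report instead of /haproxy?stats
    stats uri /report
    # Require a username and password (placeholder credentials)
    stats auth admin:changeme
    server nginx1 127.0.0.1:80
EOF
```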
Q) Why have you used one HAProxy server & one web server?
A) Due to resource constraints I have used one HAProxy server and one web server. HAProxy itself can be run in a high-availability setup, and it can balance traffic across multiple web servers.
Folks, this is just the tip of the HAProxy iceberg. There are many more important concepts in HAProxy, like load balancing algorithms, ACLs, directing traffic, detecting and reacting to failure, and logging.
I will try to cover those in upcoming articles.
Summary
- HAProxy Basics
- Understood difference between Reverse Proxy & Forward Proxy
- HAProxy Installation and working
Well folks, I hope you enjoyed HAProxy-101.
If you are with me and have followed my instructions, the summary is the expected output. If not, no problem, you can try once again.
Next in the pipeline are articles on Terraform, Varnish and Gogs.
For any comments and feedback, please mail me at csemanit2015@gmail.com