Monday, January 14, 2013

Nginx Load Balancing Basics

Nginx is a powerful, high-performance web server with many features that help high-load projects cope with their traffic.
This post covers one of them: load balancing traffic across multiple servers, whether external or internal (on the same hardware).

Load balancing comes in handy when a single server can no longer handle all the incoming requests and you need to offload some of them to another server.
It is implemented by the HttpUpstreamModule, which is included in Nginx by default in most builds.

Basic Config

Let's say you have deployed your website/webapp on multiple servers in the same network with the IPs 10.0.1.1, 10.0.1.2 and 10.0.1.3 (it doesn't actually matter where the servers are: they can all be on the same physical machine, in the same network, or somewhere else on the internet).
Nginx combines all of these servers (backends) into one upstream group, which is then used as a single server in the rest of the configuration.
This is an oversimplified version of the principle behind a personal "cloud", so let's call our upstream myCloud:

upstream myCloud{
  server 10.0.1.1;
  server 10.0.1.2;
  server 10.0.1.3;
} 

You can add as many servers as you want, including localhost, LAN and online servers, addressed by hostname, IP address, port or UNIX socket:

upstream myCloud{
  server s1.domain.com;
  server 10.0.1.2;
  server unix:/tmp/backend;
  server 127.0.0.1:8080;
} 

"myCloud" is now a named upstream group that can be passed to the proxy_pass directive. Wherever the myCloud upstream is used, Nginx selects a server with the round-robin algorithm (the default) and proxies the incoming request to it.

Now you can use the new upstream link in your vhost config:

 
server {
  listen 80;
  server_name domain.com;
  access_log /var/log/nginx/proxy.log;
 
  location / {
    proxy_pass http://myCloud;
  }
}
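
With the config above, the backends see every request as coming from the proxy, and the Host header defaults to the upstream name. A minimal sketch of passing the original details along, assuming your backends want them; the header names X-Real-IP and X-Forwarded-For are common conventions, not something this setup requires:

```nginx
server {
  listen 80;
  server_name domain.com;

  location / {
    proxy_pass http://myCloud;
    # Send the original Host header instead of the upstream name "myCloud"
    proxy_set_header Host $host;
    # Client address as seen by Nginx
    proxy_set_header X-Real-IP $remote_addr;
    # Append the client address to any existing X-Forwarded-For chain
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }
}
```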


Bind clients to servers

Often a given client needs to be served by the same server every time, rather than having different servers answer each request. For this, Nginx has the ip_hash directive.
When ip_hash is turned on, Nginx picks the server based on a hash of the client's IP address (for IPv4, the first three octets), so requests from the same address go to the same server for as long as that server is available.

upstream myCloud{
  ip_hash;
  server 10.0.1.1;
  server 10.0.1.2;
  server 10.0.1.3;
} 

Exclude Servers

If for some reason you need to temporarily exclude one or more servers from the rotation, you can mark them with the "down" parameter:

upstream myCloud{
  server 10.0.1.1;
  server 10.0.1.2 down;
  server 10.0.1.3;
} 
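
The down parameter is also useful together with ip_hash. According to the Nginx documentation, marking a server as down instead of deleting its line preserves the current hashing of client addresses, so clients bound to the other servers stay where they are. A sketch reusing the addresses from above:

```nginx
upstream myCloud {
  ip_hash;
  server 10.0.1.1;
  # Temporarily out of rotation; keeping the line preserves the hash mapping
  server 10.0.1.2 down;
  server 10.0.1.3;
}
```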

Define priorities

You can define priorities by using the "weight" parameter for each server. The weight of a server roughly describes how often it will be used relative to the others.
For example, let's say the weight of the 10.0.1.1 server is 3, of 10.0.1.2 is 1 and of 10.0.1.3 is 2.
In this case, out of every 6 requests, 3 will be proxied to server 10.0.1.1, 1 to server 10.0.1.2 and 2 to server 10.0.1.3, after which the cycle starts again. The exact order within a cycle may vary by Nginx version: newer releases interleave the servers rather than sending consecutive requests to the same one.
The weight of every server is 1 by default, but you can change it as you wish.

upstream myCloud{
  server 10.0.1.1 weight=3;
  server 10.0.1.2;
  server 10.0.1.3 weight=2;
} 

Automatic Failover

If one of the upstream servers stops responding, Nginx won't be able to connect to it and will pass the request to the next available server in the "cloud".
The client won't see an error, but will experience a slow response while Nginx waits on the dead server, which is not good.
A further problem is that Nginx would keep trying the unresponsive server on every cycle, losing valuable time.
The max_fails parameter sets how many failed attempts are allowed before Nginx marks the server as unavailable and stops sending requests to it.
By default max_fails is 1, which means that after a single failed attempt Nginx stops trying that server for a certain amount of time. That time is set by the fail_timeout parameter, which defaults to 10 seconds and is also the window in which the failures are counted.

upstream myCloud{
  server 10.0.1.1 max_fails=3 fail_timeout=120;
  server 10.0.1.2;
  server 10.0.1.3;
} 
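
The slow response caused by a dead backend can also be shortened on the proxy side. A sketch, assuming a 2-second connect timeout suits your network (the value is illustrative, not a recommendation); proxy_connect_timeout and proxy_next_upstream are standard directives of the Nginx proxy module:

```nginx
server {
  listen 80;
  server_name domain.com;

  location / {
    proxy_pass http://myCloud;
    # Give up on an unreachable backend after 2s (the default is 60s)
    proxy_connect_timeout 2s;
    # Retry on the next server after connection errors and timeouts
    # (this is the default, spelled out here for clarity)
    proxy_next_upstream error timeout;
  }
}
```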

Backup Servers

Backup Servers are used only when all of the normal upstream servers stop responding to requests. They are marked with the backup parameter.

upstream myCloud{
  server 10.0.1.1;
  server 10.0.1.2;
  server 10.0.1.3;
  server 10.0.1.8 backup;
  server 10.0.1.9 backup;
} 
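
The parameters from the previous sections can be combined on the same upstream. A sketch with illustrative values, using the default round-robin method (as a commenter notes below, backup cannot be combined with ip_hash):

```nginx
upstream myCloud {
  # Gets roughly three times the traffic of 10.0.1.2
  server 10.0.1.1 weight=3 max_fails=3 fail_timeout=30s;
  server 10.0.1.2 max_fails=3 fail_timeout=30s;
  # Temporarily out of rotation
  server 10.0.1.3 down;
  # Used only when the servers above are unavailable
  server 10.0.1.8 backup;
}
```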


That's it. Please comment, like and share if you liked the post.

13 comments:

  1. It'd be great if there was some kind of UI like haproxy has to monitor which servers are up, down, failed requests etc.

    Replies
    1. Check this out: https://github.com/cep21/healthcheck_nginx_upstreams

  2. Thanks for the post. How Nginx performs backend server health check?

    Replies
    1. It has in-band health check. Fully automatic, you don't need to configure anything special.

    2. I'm not sure what Anonymous means by 'in-band' health check, nginx provides no such thing. Compile nginx with the nginx_upstream_check_module here: https://github.com/yaoweibin/nginx_upstream_check_module

    3. to Nate Smith
      "In-band health checks" is a well-known term in networking. You should use google. The module that you have mentioned just provides out-of-band health checks functionality.

  3. Thanks for the post. It is really helpful - especially because I find it hard to digest most of the nginx docs. Here is a question for you

    Say you have an app setup so two different websites are really built into one single web application (using url routing, obviously)

    127.0.0.1:8000/website1/

    127.0.0.1:8000/website2/

    Now say I want to run both this applications on a single server with one nginx load balancer and one webserver. I purchase two different domains

    www.domain1.com

    www.domain2.com

    and now I want to tell nginx to route requests from

    www.domain1.com --> 127.0.0.1:8000/website1/

    www.domain2.com --> 127.0.0.1:8000/website2/

    I cant figure out how to do this with nginx, although I assume it is possible. Thanks

    Replies
    1. HI, did you ever get an answer to this? I'm kinda stuck at the moment!

  4. Why do you refer to community wiki instead of official documentation? There's actually more load-balancing methods in nginx.

  5. NGINX simplicity is impressive. Even load balancing config above is easy to understand

  6. Nice little tutorial. You should note that the backup directive cannot be used with the directive ip_hash.

  7. How much takes nginx Packets/Second and Bytes/Second ??

  8. ip_hash upstream module is maybe not a good idea because there could be situations where a lot of different browsers are coming with the same IP address (behind proxies), use nginx sticky module https://code.google.com/p/nginx-sticky-module/
