Reverse proxy for Nodejs in production with Apache2, HAProxy and Monit

December 18th, 2012

We’ve recently finished a small website (code-named sirifacts.org) built with Node.js and Express. Our goal with this project was to set up Apache and Node.js together in a real production environment, on a Debian Linux box.

In most of our web applications, Apache2 serves PHP and static files on port 80, while Node.js serves on a different port.

The problem is that Apache and Node cannot both listen on the same port (80). And because all of our current projects are served by Apache, we didn’t have the option of turning Apache off to run only Node.js.

There are several ways to set up NodeJS and Apache together in production mode.

  • The easiest one is to set up Apache as a reverse proxy with mod_proxy_http. But Apache doesn’t handle large numbers of open, long-lasting connections the way Node (or Nginx) does. Furthermore, it fails completely to reverse-proxy WebSockets.
  • Use hosted solutions like Heroku, Rackspace, Amazon EC2, etc. Since we already have our own hardware, there was no need to invest money in something else.
  • Another solution is to install Nginx, which is very light and fast, as a proxy in front of Apache2 and Node. But we would have had to patch the source code with tcp_proxy to handle WebSockets (HTTP 1.1), which made us uncomfortable about future updates.
  • Digging around, we found HAProxy, a fast proxy for high availability and load balancing. It handles WebSockets out of the box but, unlike Nginx or Varnish, HAProxy doesn’t support caching or serving static files. For us, that isn’t an issue, at least not yet. If it becomes important to cache or serve static files outside of Node.js, we could always add Varnish behind HAProxy.
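
To make the WebSocket point above more concrete: a WebSocket connection starts as an HTTP/1.1 request carrying Upgrade and Connection headers, and the server answers with a hash of the client's key before the connection turns into a two-way stream. A proxy that drops those headers or closes the connection after one response breaks the handshake. The following is only a sketch of that handshake arithmetic as described in RFC 6455; the helper names websocketAccept and isWebSocketUpgrade are ours, not part of any library.

```javascript
var crypto = require("crypto");

// GUID fixed by RFC 6455 for the WebSocket opening handshake.
var WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Compute the Sec-WebSocket-Accept value the server must send back
// for a given Sec-WebSocket-Key request header.
function websocketAccept(key) {
  return crypto.createHash("sha1").update(key + WS_GUID).digest("base64");
}

// Hypothetical helper: true when the (lower-cased) request headers
// ask for a WebSocket upgrade.
function isWebSocketUpgrade(headers) {
  var upgrade = (headers["upgrade"] || "").toLowerCase();
  var connection = (headers["connection"] || "").toLowerCase();
  return upgrade === "websocket" && connection.indexOf("upgrade") !== -1;
}
```

In practice a library such as socket.io does this for you; the point is only that the proxy in front has to let the Upgrade exchange through untouched.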

Configuring HAProxy to run Apache2, NodeJS and Monit Web status

HAProxy has a very clear configuration file. The ability to route to the proper Node.js instance when running multiple applications on the same machine required a little bit of config gymnastics with HAProxy, but the resultant configuration file is still very readable.

Install HAProxy with the following command:

sudo apt-get install haproxy

For our configuration, we wanted to achieve 2 primary goals:

  • route traffic to the appropriate application
  • properly handle WebSocket traffic

Here is the configuration file for HAProxy that we have in use now.

Edit /etc/haproxy/haproxy.cfg:


global
   log 127.0.0.1   local0         # Enable per-instance logging of events and traffic.
   log 127.0.0.1   local1 notice  # only send important events
   maxconn 4096                   # the server will handle up to 4096 simultaneous connections.
   user haproxy
   group haproxy
   daemon                         # the server will put itself in the background when launched.
   nbproc      2                  # start 2 HAProxy processes.

defaults
    # default mode will be http (as opposed to tcp)
    mode http
    # Enable early dropping of aborted requests pending in queues
    option abortonclose          
    # Enable HTTP connection closing on the server side
    option http-server-close
    # Disable relaxed parsing of HTTP requests (invalid requests are rejected)
    no option accept-invalid-http-request
    # Disable relaxed parsing of HTTP responses (invalid responses are rejected)
    no option accept-invalid-http-response
    # Use all backup servers at once when all normal servers are down
    # (by default, only the first operational backup server gets the traffic)
    option allbackups
    # Enable insertion of the X-Forwarded-For header to requests sent to servers
    option forwardfor except 127.0.0.1 header X-Forwarded-For
    # Enable session redistribution in case of connection failure.
    option redispatch            
    # Set the number of retries to perform on a server after a connection failure
    retries 3                    
    # Enable the saving of one ACK packet during the connect sequence
    option tcp-smart-connect      
    # Fix the maximum number of concurrent connections on a frontend
    maxconn 2000                  
    # Set the maximum time to wait for a connection attempt to a server to succeed
    # (contimeout, clitimeout and srvtimeout are deprecated spellings of these three settings)
    timeout connect 5000
    # Set the maximum inactivity time on the client side
    timeout client  50000
    # Set the maximum inactivity time on the server side
    timeout server  50000

#this frontend interface receives the incoming http requests
frontend all 0.0.0.0:80
    timeout client 1h
    # use apache2 as default webserver for incoming traffic
    default_backend apache2

    acl is_nodejs hdr_end(host) -i sirifacts.org
    use_backend nodejs_backend if is_nodejs

    acl is_websocket hdr_end(host) -i socket.io.tld
    use_backend nodejs_socket if is_websocket

    acl is_monit hdr_end(host) -i monit.io.tld
    use_backend monit_backend if is_monit

#apache backend, transfer to port 82
backend apache2
    # Define the load balancing algorithm to be used in a backend
    balance roundrobin
    # Enable insertion of the X-Forwarded-For header to requests sent to servers    
    option forwardfor
    server apache2 127.0.0.1:82 weight 1 maxconn 1024 check  
    # server must be contacted within 5 seconds
    timeout connect 5s
    # all headers must arrive within 3 seconds
    timeout http-request 3s
    # server must respond within 25 seconds. should equal client timeout
    timeout server 25s

#nodejs backend, transfer to port 3000
backend nodejs_backend
    # Set the running mode or protocol of the instance { tcp|http|health }
    mode http
    timeout server 1h
    timeout connect 1s  
    # Enable passive HTTP connection closing
    option httpclose
    # Enable insertion of the X-Forwarded-For header to requests sent to servers    
    option forwardfor
    server server1 127.0.0.1:3000 weight 1 maxconn 1024 check

#websocket backend, transfer to port 9000
backend nodejs_socket
    # Set the running mode or protocol of the instance { tcp|http|health }
    mode http
    timeout server 86400000
    timeout connect 5000
    server io_test localhost:9000

#monit backend, transfer to port 2812
backend monit_backend
    # Set the running mode or protocol of the instance { tcp|http|health }
    mode http
    timeout server 1h
    timeout connect 1s  
    # Define whether haproxy will announce keepalive to the server or not
    option http-pretend-keepalive
    # Enable insertion of the X-Forwarded-For header to requests sent to servers  
    option forwardfor
    server server1 127.0.0.1:2812 weight 1 maxconn 1024 check

More information about these options is available in the HAProxy configuration manual.
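
One practical consequence of this setup: every request reaching Node now comes from 127.0.0.1, so the real client address has to be read from the X-Forwarded-For header that "option forwardfor" inserts. Here is a minimal sketch, assuming HAProxy is the only proxy you trust and that it appends the client address as the last value; clientIp is a hypothetical helper name.

```javascript
// Return the address of the client as seen by the nearest proxy.
// Node joins repeated X-Forwarded-For headers with ", ", so the value
// added by HAProxy is the last element of the list.
function clientIp(req) {
  var forwarded = req.headers["x-forwarded-for"];
  if (forwarded) {
    var parts = forwarded.split(",");
    return parts[parts.length - 1].trim();
  }
  // No proxy header: fall back to the address of the TCP peer.
  return req.socket.remoteAddress;
}
```

Earlier values in the list come from the client side and can be forged, which is why the sketch only looks at the last one. Express also has a "trust proxy" setting that may cover this case for you.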

Then edit /etc/default/haproxy and set ENABLED=1.

Start HAProxy:

sudo /etc/init.d/haproxy start

HAProxy will now handle the initial requests on port 80 and dispatch them to Node and Apache. We want requests for the following domains to be routed like this:

  • sirifacts.org to be forwarded to node,
  • socket.io.tld to be forwarded to node,
  • monit.io.tld to be forwarded to monit,
  • the rest will be forwarded to Apache.
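
The routing rules above can be modelled in a few lines, which helps when reasoning about which backend a given Host header ends up on. This is only a sketch of the logic, not of HAProxy internals: hdr_end(host) -i is a case-insensitive suffix match, the first use_backend rule that matches wins, and default_backend catches the rest. The function names are ours.

```javascript
// Rules in the order they appear in the frontend: [host suffix, backend name].
var rules = [
  ["sirifacts.org", "nodejs_backend"],
  ["socket.io.tld", "nodejs_socket"],
  ["monit.io.tld", "monit_backend"]
];

// True when `text` ends with `suffix`.
function endsWith(text, suffix) {
  return text.length >= suffix.length &&
    text.indexOf(suffix, text.length - suffix.length) !== -1;
}

// Return the backend name for a Host header value.
function pickBackend(host) {
  var name = (host || "").toLowerCase();
  for (var i = 0; i < rules.length; i++) {
    if (endsWith(name, rules[i][0])) {
      return rules[i][1];
    }
  }
  return "apache2"; // default_backend
}
```

Note that a suffix match is generous: www.sirifacts.org goes to Node as intended, but so would a host such as notsirifacts.org. The sketch also ignores a port in the Host header (for example sirifacts.org:80), which would not match the suffix.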

Change Listen port on Apache2

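For the reverse proxy to work, we first need to change the port Apache listens on, since HAProxy now owns port 80. So we changed the Apache configuration to listen on port 82. On Debian these directives are normally found in /etc/apache2/ports.conf, and any VirtualHost declared on *:80 needs the same change. Here’s how to change the Apache port: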

Change:

NameVirtualHost *:80
Listen 80

To:

NameVirtualHost *:82
Listen 82

Restart Apache:

sudo /etc/init.d/apache2 restart

Install Node.js on Linux

Deploying Node applications is kind of tricky because your app is the webserver.

Here, I’ll outline one of our presently preferred ways of setting up a Nodejs server as a service on Linux, using an init.d script and some tailoring of the server application itself.

When installing Node.js for a server application, the two things to bear in mind are that (a) you really don’t want to run any process as root if you don’t have to, and (b) you have to launch a process as root in order to bind to privileged ports like 80 and 443. Well, point (b) isn’t strictly true, as there are other ways to do this, but launching as the root user and then downgrading the process permissions to run as another (non-privileged) user after the port is bound is an easy method that will just work across a broad range of Linux variants.

So before installing Node.js, we need to create a user that will own the running server process and its data:

sudo useradd -m -d /home/nodeapps nodeapps

Now install Node.js and MongoDB:

sudo apt-get install nodejs mongodb-server

Tailor Your Node.js Application

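Firstly, your Node.js server application will have to downgrade its own permissions after it binds to all needed privileged ports. Your code should expect to launch under ownership by root, and alter its own permissions to run under the nodeapps user. Here is a trivial HTTP server in Express as an example. It binds to port 80 to illustrate the technique; behind the HAProxy configuration above, which already occupies port 80, the application would listen on port 3000 instead: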


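var express = require("express");
// express.createServer() is deprecated since Express 3; express() returns the app
var app = express();
var serverPort = 80;            // privileged port: the process must start as root
var nodeUserGid = "nodeapps";
var nodeUserUid = "nodeapps";
 
app.listen(serverPort, function() {
  // Drop root privileges once the port is bound.
  // The group goes first: after setuid the process may no longer change its gid.
  process.setgid(nodeUserGid);
  process.setuid(nodeUserUid);
});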
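
A slightly more defensive variant of the same idea is sketched below: it only drops privileges when the process really is root, so the same code also runs unchanged when the app is started as an unprivileged user on a high port such as 3000 (the port the nodejs_backend above forwards to). dropPrivileges is a hypothetical helper; the proc parameter is normally the global process object and is passed in only so the logic can be tried with a stub.

```javascript
// Drop root privileges if, and only if, the process currently runs as root.
// Returns true when privileges were dropped, false when there was nothing to do.
function dropPrivileges(proc, user, group) {
  if (typeof proc.getuid !== "function" || proc.getuid() !== 0) {
    return false; // not root (or not a POSIX platform): nothing to drop
  }
  proc.setgid(group); // group first: once the uid changes, setgid would be refused
  proc.setuid(user);
  return true;
}

// Usage inside the listen callback, assuming an Express app named `app`:
// app.listen(3000, "127.0.0.1", function() {
//   dropPrivileges(process, "nodeapps", "nodeapps");
// });
```

When the application is launched through sudo -u nodeapps, as the init script in the next section does, the process is never root and the helper simply returns false.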

Set up an init.d Script for your node app

The following script and setup instructions are good for Ubuntu or other Debian-style distributions, though you will have to change the paths to suit your application and installation details.


#!/bin/bash
# This is suitable for Ubuntu or other Debian-style distributions.
#
### BEGIN INIT INFO
# Provides:          my_application_name
# Required-Start:    $local_fs $network $syslog
# Required-Stop:     $local_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start daemon at boot time
# Description:       Enable service provided by my_application_name nodejs app.
### END INIT INFO

# Make changes according to your configuration
APP_NAME=my_application_name
APP_DIR=/home/nodeapps/$APP_NAME
NODE=/usr/bin/node
LOG_DIR=/var/log/nodeapps
USER=nodeapps
PID_DIR=/var/run

# Don't modify
APP_PID=""

test -x $NODE || exit 0

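function get_pid {
  # Read the PID recorded at start-up and keep it only if that process is still alive.
  # (Grepping the output of ps for $APP_NAME also matches this init script itself
  # when it is run as /etc/init.d/$APP_NAME.)
  APP_PID=""
  if [ -s "$PID_DIR/$APP_NAME.pid" ]; then
    APP_PID=$(cat "$PID_DIR/$APP_NAME.pid")
    kill -0 "$APP_PID" 2>/dev/null || APP_PID=""
  fi
}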

function init_log {
  if [ ! -d $LOG_DIR ]; then
    mkdir $LOG_DIR
    touch "$LOG_DIR/$APP_NAME.log"
    chown -R $USER $LOG_DIR
  fi
}

function init_pid {
  if [ ! -f "$PID_DIR/$APP_NAME.pid" ]; then
     touch "$PID_DIR/$APP_NAME.pid"
     chown $USER "$PID_DIR/$APP_NAME.pid"
  fi
}

function start {
  echo "Starting $APP_NAME node instance"
  get_pid
  if [ "$APP_PID" = "" ]; then
   
    # Create the log and pid files, making sure that the target user has access to them
    init_log
    init_pid

    # Launch the application
    cd $APP_DIR
    exec sudo -u $USER NODE_ENV=production "$NODE" "$APP_DIR/app.js" 1>>"$LOG_DIR/$APP_NAME.log" 2>&1 &
    echo $! > "$PID_DIR/$APP_NAME.pid"
    get_pid
    echo "$APP_NAME is now up and running with pid $APP_PID !"; sleep 1
  else
    echo "Instance already running at pid $APP_PID"; sleep 1
  fi
}

function restart {
  echo "Restarting $APP_NAME node instance"
  get_pid
  if [ "$APP_PID" != "" ]; then
    stop
    start
  else
    start
  fi
}

function stop {
  get_pid
  echo "Shutting down $APP_NAME node instance PID : $APP_PID "

  if [ "$APP_PID" != "" ]; then
    kill -TERM $APP_PID;
    echo "$APP_NAME stopped."; sleep 1
  else
    echo "Instance is not running"
  fi
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        restart
        ;;
    *)
        echo "Usage: $0 {start|stop|restart}"
        exit 1
        ;;
esac
exit 0

Copy your script into /etc/init.d/my_application_name, and set its permissions appropriately. You can then set it to run as a service using a tool such as update-rc.d:


sudo chmod +x /etc/init.d/my_application_name
sudo update-rc.d my_application_name defaults 22
sudo /etc/init.d/my_application_name start
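
Since the stop function of the init script sends SIGTERM, the application can use that signal to finish in-flight requests before exiting. Below is a minimal sketch of such a handler; installShutdownHandler and the graceMs timeout are our own hypothetical names, proc is normally the global process object, and server is whatever listen() returned.

```javascript
// Register a SIGTERM handler that stops accepting connections, then exits.
function installShutdownHandler(proc, server, graceMs) {
  proc.on("SIGTERM", function() {
    // Force the exit if open connections keep close() from finishing in time.
    var timer = setTimeout(function() { proc.exit(1); }, graceMs);
    if (timer.unref) { timer.unref(); }
    // close() stops accepting new connections and calls back once
    // the existing ones have ended.
    server.close(function() {
      clearTimeout(timer);
      proc.exit(0);
    });
  });
}

// Usage, assuming `server` is the value returned by app.listen(...):
// installShutdownHandler(process, server, 5000);
```

Long-lived connections such as WebSockets will usually hit the timeout rather than close by themselves, so pick a grace period you can live with on every restart.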

Also, because your application is the web server, if it crashes, your whole service goes down with it. So we need a way to monitor it, too. There are many ways to monitor a Node application, such as forever, upstart or monit. For this tutorial, we chose monit.

Installing Monit

sudo apt-get install monit

Edit /etc/monit/monitrc to set your configuration. This is mine, for example:


 set daemon 120            # check services at 2-minute intervals
 set logfile /var/log/monit.log
 set idfile /var/lib/monit/id
 set statefile /var/lib/monit/state
 
 set eventqueue
     basedir /var/lib/monit/events # set the base directory where events will be stored
     slots 100                     # optionally limit the queue size
     
 set alert <your@email.address>    # receive all alerts
 
 set httpd port 2812 and
   use address localhost  # only accept connection from localhost
   allow localhost        # allow localhost to connect to the server and
   allow <your login>:<your password>     # require this login and password
   allow @monit           # allow users of group 'monit' to connect (rw)
   allow @users readonly  # allow users of group 'users' to connect readonly

 check system localhost
   if loadavg (1min) > 4 then alert
   if loadavg (5min) > 2 then alert
   if memory usage > 75% then alert
   if swap usage > 25% then alert
   if cpu usage (user) > 70% then alert
   if cpu usage (system) > 30% then alert
   if cpu usage (wait) > 20% then alert

 check process nodeapps with pidfile "/var/run/my_application_name.pid"
   start program = "/etc/init.d/my_application_name start"
   restart program  = "/etc/init.d/my_application_name restart"
   stop program  = "/etc/init.d/my_application_name stop"
   if cpu > 60% for 2 cycles then alert
   if cpu > 80% for 5 cycles then restart
   if totalmem > 200.0 MB for 5 cycles then restart
   if children > 250 then restart

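Restart monit and test the Monit web status page at http://your_monit_status_url:2812. Since the configuration above only accepts connections from localhost, from outside the server the page is reached through HAProxy, at the host name routed to monit_backend (monit.io.tld in our example).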

It’s fairly straightforward and it just works.

Have fun !

  • Arindam Das

    I didn’t understand why the node app is running on port 80 while the node backend is transferring to 3000. What am I missing?