Service discovery and load balancing with Hashicorp’s Nomad
MUST WIN BLOG

In our last post, Navigating the Kubernetes/Mesos/Docker Swarm Jungle, we discussed a young contender, Nomad. This week we’re diving into Nomad in more detail. As with most things from Hashicorp, it’s based on the latest research and the APIs are very well designed. There is still some assembly required, and load balancing is one of the pieces left to the consumer.

There are a lot of ways to go about this. These days, smart load balancers are catching on. Nginx+ (not free), Fabio, and Traefik all integrate with Consul natively, so these can be attractive options. The latter two don’t support TCP load balancing yet, so that’s a showstopper for many applications.

This post discusses a reference implementation that runs an HAProxy instance on each node. With some clever DNS redirection, that instance intercepts traffic between services and routes it to healthy service instances running on arbitrary host:port pairs across your Nomad cluster.

The batteries Hashicorp includes

Consul

Consul is the Hashicorp service discovery tool. It’s best in class for this purpose. It’s highly available, and multi-datacenter support was a first-class design consideration, so cross-datacenter failover is greatly simplified. Nomad automatically integrates with Consul to register any jobs scheduled on the cluster.

Consul-Template

Consul-template is a tool that watches Consul for updates and re-renders a configuration file based on a template. One obvious application of this is to configure reverse proxies (which is how we use it). It also has some nice features to avoid duplicating work across the cluster.

The other OSS batteries

HAProxy

HAProxy is the self-described “Reliable, High Performance TCP/HTTP Load Balancer”. It’s been around a long time, and works well. We construct a valid HAProxy configuration with consul-template and rely on HAProxy’s live-reloading features to apply each new version.
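To make the idea concrete, here is a minimal Python sketch of the kind of output a consul-template run might produce. It is not the template from the repo: the function name `render_haproxy`, the frontend name `local_services`, and the choice to route in HTTP mode on the Host header are all assumptions made for illustration.

```python
# Hypothetical sketch: render an HAProxy config fragment from service instances.
# This imitates what a consul-template template could emit; it is not the
# repo's actual template.
from typing import Dict, List, Tuple

Instance = Tuple[str, int]  # (address, port) of one healthy instance


def render_haproxy(services: Dict[str, List[Instance]], suffix: str = "service") -> str:
    """Build one frontend plus one backend per service, routed on the Host header."""
    lines = ["frontend local_services", "    bind 127.0.0.1:80", "    mode http"]
    # One routing rule per service name, e.g. helloworld.service -> backend helloworld.
    for name in sorted(services):
        lines.append(f"    use_backend {name} if {{ hdr(host) -i {name}.{suffix} }}")
    # One backend per service, listing every known instance.
    for name in sorted(services):
        lines.append("")
        lines.append(f"backend {name}")
        lines.append("    mode http")
        lines.append("    balance roundrobin")
        for i, (addr, port) in enumerate(sorted(services[name])):
            lines.append(f"    server {name}-{i} {addr}:{port} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses and ports are made up.
    print(render_haproxy({"helloworld": [("10.0.0.1", 31000), ("10.0.0.2", 23456)]}))
```

Whenever the set of instances changes, the file is rendered again and HAProxy is reloaded, so the backends always reflect what the service catalog currently reports.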

DNSMasq

DNSMasq makes request interception easy. We can route any requests matching a given format to the local load balancer. Rather than configuring an application with a specific host:port, you can configure it with a Consul service name, a known suffix, and a consistent port, e.g. solr-search.service. DNSMasq intercepts all such DNS lookups and directs them to a local address, where HAProxy proxies the request to the host:port indicated by Consul.
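The interception rule itself is small. The Python sketch below only models the decision DNSMasq makes for each lookup; in DNSMasq this kind of rule is typically a single `address=` directive. The names `intercept`, `LOCAL_PROXY`, and `SUFFIX` are hypothetical, and the loopback address is an assumption about where HAProxy listens.

```python
# Hypothetical model of the DNS catchall: names that follow the convention
# resolve to the local proxy, everything else falls through to normal DNS.
from typing import Optional

LOCAL_PROXY = "127.0.0.1"  # assumed address of the node-local HAProxy
SUFFIX = ".service"        # naming convention used in this post


def intercept(name: str) -> Optional[str]:
    """Return the local proxy address for intercepted names, else None."""
    normalized = name.rstrip(".").lower()  # tolerate a trailing dot and mixed case
    if normalized.endswith(SUFFIX) and len(normalized) > len(SUFFIX):
        return LOCAL_PROXY
    return None  # not ours: let the upstream resolver answer


if __name__ == "__main__":
    for query in ("solr-search.service", "solr-search.service.consul", "example.com"):
        print(query, "->", intercept(query))
```

Note that ordinary Consul DNS names such as solr-search.service.consul do not end in the bare suffix, so in this sketch they are left alone and still resolve through Consul.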


The Setup

To follow along, open up the repo.

The setup is based on Hashicorp’s million container challenge repo. It uses Packer to build images and Terraform to deploy infrastructure, and it sets up a Consul cluster and a Nomad cluster. We’ve modernized it a bit to work with the latest versions of dependencies, removed the C1M-related pieces, and added the proxying configurations.

We build each of the nodes using Packer, so images can be built across cloud providers. Several node types (Consul servers, Nomad servers, Nomad clients) are described in JSON files in the packer folder. We’ll dig into the nomad_client configuration, because most of the nodes in the cluster will be of this variety.

The Packer config copies all the configuration and script files into the image, runs setup scripts that place files in the right filesystem locations, sets up Upstart, etc., and then cleans up after itself.

DNSMasq -> HAProxy interceptor

The nodes are all configured to use their local Consul agents for DNS, which is standard practice in a Consul cluster. We take this one step further by adding a DNS catchall for names matching a known convention (I used *.service) and resolving all of those names to localhost. Then, by convention, applications on the cluster just use port 80 for everything. I also added one more filter: Nomad jobs must be tagged with the “routed” tag to receive this treatment. An application does a DNS lookup, gets the address of the local proxy, and sends its traffic there; the proxy then forwards it to one of the correct backends.
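The “routed” filter can be sketched in a few lines of Python. The input dictionaries below are shaped like entries from Consul’s catalog API (ServiceName, ServiceTags, ServiceAddress, ServicePort, Address); the function name `routed_instances` and the sample data are hypothetical, and in the actual setup this filtering would live in the consul-template template rather than in Python.

```python
# Hypothetical sketch of the "routed" tag filter over Consul-style catalog entries.
from typing import Dict, Iterable, List, Tuple


def routed_instances(entries: Iterable[dict], tag: str = "routed") -> Dict[str, List[Tuple[str, int]]]:
    """Group (address, port) pairs by service name, keeping only tagged services."""
    routed: Dict[str, List[Tuple[str, int]]] = {}
    for entry in entries:
        if tag not in (entry.get("ServiceTags") or []):
            continue  # untagged services are not proxied
        # Consul leaves ServiceAddress empty when the service uses the node address.
        address = entry.get("ServiceAddress") or entry["Address"]
        routed.setdefault(entry["ServiceName"], []).append((address, entry["ServicePort"]))
    return routed


if __name__ == "__main__":
    # Sample data is made up for illustration.
    sample = [
        {"ServiceName": "helloworld", "ServiceTags": ["routed"],
         "ServiceAddress": "", "Address": "10.0.0.1", "ServicePort": 31000},
        {"ServiceName": "batch-job", "ServiceTags": [],
         "ServiceAddress": "10.0.0.2", "Address": "10.0.0.2", "ServicePort": 25000},
    ]
    print(routed_instances(sample))
```

Only services that opt in with the tag end up behind the proxy, so internal or batch jobs can keep running on the cluster without claiming a *.service name.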

Using this repo, you can easily configure and provision a new Nomad cluster on Google’s cloud, and all your services will be able to reach one another by name. There’s a simple helloworld Nomad job spec in the repository as well. Once that’s deployed, you’ll be able to curl helloworld.service and hit the service, no matter where it is running on the cluster.

Tune in for our next post about securing your cluster with Vault. Until then, happy load balancing!


Must Win is an official Hashicorp Solutions Integrator. If you’re looking for help with Hashicorp tools, reach out to we@mustwin.com and reference this post.

Mike Ihbe is a founding partner of The Must Win All Star Web & Mobile Consultancy.

Mike is an expert in a wide array of technologies and has loads of experience managing fast-moving dev teams, designing systems, and scaling large applications. When not on the job, he can be found cooking delicious meals on ski slopes in exotic locales.