The Problem
Manticore is a collection of programs that uses a network of machines to run and manage user-requested jobs. The name Manticore refers to the entire system, not to a single running program.
Manticore differs from other orchestration tools, but it builds on existing ones such as Consul and Nomad and adds extra functionality on top. Here are the programs involved and what each one does:
HashiCorp's products are meant to solve problems at the level of distributed systems and microservices. Nomad is a flexible scheduler that scales across many machines and decides which of the machines it can see should run each program. Other useful features include:
- scheduling docker containers
- the ability for a container scheduled by Nomad to communicate with the Nomad server cluster without storing any direct addresses to the servers
- the ability to stream logs that are output from a scheduled container
- a well-documented and useful HTTP API
- job files which dictate what the expected state of all the programs described should be, leaving how to achieve that state up to Nomad
- easy integration with Consul for service discovery
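As a rough illustration of a job file that declares desired state, the sketch below builds a job in the JSON form that Nomad's HTTP API accepts (`POST /v1/jobs`). The function name, job ID scheme, image name, datacenter, and resource numbers are assumptions made for the example, not Manticore's actual values.

```javascript
// Minimal sketch: build a Nomad job payload that asks for one Docker task.
// buildCoreJob, the "core-<user>" ID scheme, and the resource figures are
// hypothetical; only the field names follow Nomad's JSON job format.
function buildCoreJob(userId, image) {
  return {
    Job: {
      ID: `core-${userId}`,
      Name: `core-${userId}`,
      Type: "service",
      Datacenters: ["dc1"],
      TaskGroups: [
        {
          Name: "core-group",
          Count: 1, // desired state: exactly one running instance
          Tasks: [
            {
              Name: "sdl-core",
              Driver: "docker",
              Config: { image },
              Resources: { CPU: 500, MemoryMB: 256 },
            },
          ],
        },
      ],
    },
  };
}

// The payload only says what should be running; Nomad decides where and how.
const examplePayload = buildCoreJob("user42", "example/sdl_core:latest");
console.log(JSON.stringify(examplePayload, null, 2));
```

Submitting the payload would be a single HTTP request to a Nomad server; that step is left out so the sketch runs without a cluster.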
Nomad can schedule and run programs on any one of many machines, and a program's service could be exposed on any one of tens of thousands of possible ports. To find and reach these services, we need service discovery. Consul works with Nomad to make running containers easy to find. It also has the following useful features:
- a distributed key-value store where information can be stored and pulled from
- the ability for a container scheduled by Nomad to communicate with the Consul server cluster without storing any direct addresses to the servers
- a well-documented and useful HTTP API
- the ability to watch for changes in the key-value store and changes in running services, checking the state of containers using health checks
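To make the key-value store and the watch feature more concrete, here is a small sketch of handling a Consul KV read. Consul's KV endpoint (`/v1/kv/<key>`) returns a JSON array whose `Value` fields are Base64-encoded, and a blocking query is requested with the `index` and `wait` parameters. The key name and the stored JSON are invented for the example.

```javascript
// Decode the body of a GET /v1/kv/<key> response into plain strings.
// Consul encodes each Value as Base64; a key with no value has Value null.
function decodeKvEntries(responseBody) {
  return responseBody.map((entry) => ({
    key: entry.Key,
    value:
      entry.Value === null
        ? null
        : Buffer.from(entry.Value, "base64").toString("utf8"),
    modifyIndex: entry.ModifyIndex,
  }));
}

// Build the URL for a blocking query. The request is held open until the
// data changes past `index` or the wait time elapses. In practice the index
// is usually taken from the X-Consul-Index response header.
function watchUrl(baseUrl, key, index) {
  return `${baseUrl}/v1/kv/${key}?index=${index}&wait=30s`;
}

// Hypothetical response for a hypothetical key.
const sampleResponse = [
  {
    Key: "manticore/requests/user42",
    Value: Buffer.from('{"state":"waiting"}').toString("base64"),
    ModifyIndex: 17,
  },
];

const [firstEntry] = decodeKvEntries(sampleResponse);
console.log(firstEntry.value);
console.log(watchUrl("http://127.0.0.1:8500", firstEntry.key, firstEntry.modifyIndex));
```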
Being able to run any program in an environment that is guaranteed to be the same every time is a very comforting feature. Since the environment will not change, we can keep sdl_core already compiled in a Docker container so that it starts up almost instantly whenever it is needed. Nomad is able to schedule and run Docker containers. sdl_core and the HMI are "dockerized", as is the server that exposes an API through which clients receive sdl_core and HMI instances. Additionally, we can pass extra information into these containers so that the HMI always knows where to find sdl_core for communication.
HAProxy is the internal load balancer that routes client traffic to each client's HMI instance. HAProxy only allows connections to an HMI from users who know a randomly generated URL given only to them by the Manticore web app. Additionally, HAProxy supports HTTP and websocket connections out of the box, and its configuration is easy to change programmatically whenever the state of Manticore changes. HAProxy also opens TCP ports so that users can connect their SDL app to sdl_core.
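One way such a randomly generated address could be produced is sketched below. Using the random string as a subdomain, the 16-byte length, and the `example.com` domain are all assumptions for illustration; the source only states that the URL is random and given to a single user.

```javascript
const crypto = require("crypto");

// Generate a hard-to-guess prefix for one user's HMI address.
// 16 random bytes become 32 hexadecimal characters.
function generateAddressPrefix() {
  return crypto.randomBytes(16).toString("hex");
}

// Hypothetical layout: the prefix is used as a subdomain of the given domain.
function hmiUrl(prefix, domain) {
  return `https://${prefix}.${domain}`;
}

const prefix = generateAddressPrefix();
console.log(hmiUrl(prefix, "example.com"));
```

Because the prefix comes from a cryptographically secure source, another user cannot realistically guess it, which is what lets the address itself act as the access check.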
consul-template is a program that watches for changes in the Consul KV store and uses the stored information to render files from templates. This makes it useful for keeping HAProxy's configuration updated automatically.
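The sketch below imitates, in plain JavaScript, the kind of transformation consul-template performs: data that would live in the KV store goes in, and a fragment of HAProxy configuration comes out. It is not consul-template's template syntax, and the route fields, names, and addresses are hypothetical.

```javascript
// Render HAProxy routing rules from a list of routes.
// Each route pairs a user's random address prefix with the HMI's address.
function renderHaproxyRoutes(routes) {
  const frontend = ["frontend main", "    bind *:80"];
  const backends = [];
  for (const { prefix, address } of routes) {
    // Match requests whose Host header begins with "<prefix>."
    frontend.push(`    acl is_${prefix} hdr_beg(host) -i ${prefix}.`);
    frontend.push(`    use_backend bk_${prefix} if is_${prefix}`);
    backends.push(`backend bk_${prefix}`, `    server hmi ${address}`);
  }
  return [...frontend, ...backends].join("\n") + "\n";
}

// Hypothetical data, as it might be read from the KV store.
const exampleRoutes = [
  { prefix: "3f9a1c", address: "10.0.0.12:25001" },
  { prefix: "b7e204", address: "10.0.0.15:27342" },
];
console.log(renderHaproxyRoutes(exampleRoutes));
```

In the real system the rendering is done by consul-template each time the KV data changes, so HAProxy's configuration follows the current set of users without manual edits.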
AWS serves Manticore in several ways. In particular, the ELB finds running instances of HAProxy and routes external traffic to them. The ELB also makes it easy to use SSL certificates for encrypted HTTP and websocket traffic, with SSL termination at the ELB.
Node.js and its community-made modules are great for building servers and managing clients that access an HTTP API. The Manticore web application is the component that handles all of the logic that the other programs cannot handle on their own. This includes making calls to Consul and Nomad to start up containers, watching for service changes, managing user requests, implementing a waiting list, sending data to the KV store to be used by consul-template, and using the AWS SDK API to configure the ELB to open and close TCP ports.
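As a hedged sketch of the last item, the functions below build the request parameters for opening and closing a TCP port on the ELB. They assume a Classic Load Balancer, whose API offers the CreateLoadBalancerListeners and DeleteLoadBalancerListeners operations. The load balancer name and port number are hypothetical, and the actual SDK calls are left out so the example runs without AWS credentials.

```javascript
// Parameters for CreateLoadBalancerListeners: forward a TCP port on the
// load balancer to the same port on the registered instances.
function openPortParams(loadBalancerName, port) {
  return {
    LoadBalancerName: loadBalancerName,
    Listeners: [
      {
        Protocol: "TCP",
        LoadBalancerPort: port,
        InstanceProtocol: "TCP",
        InstancePort: port,
      },
    ],
  };
}

// Parameters for DeleteLoadBalancerListeners: stop listening on that port.
function closePortParams(loadBalancerName, port) {
  return {
    LoadBalancerName: loadBalancerName,
    LoadBalancerPorts: [port],
  };
}

// Hypothetical load balancer name and port.
console.log(JSON.stringify(openPortParams("manticore-elb", 12345), null, 2));
console.log(JSON.stringify(closePortParams("manticore-elb", 12345), null, 2));
```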
With the help of all the programs mentioned, Manticore does the following:
- Generates job files that are configurable by the user’s request
- Allows jobs that chain together dependent docker containers in stages
- Provides many users dedicated, containerized applications
- Implements a waiting list for users in case resources are scarce
- Informs users of their job’s addresses and their position in the waiting list, using WebSockets to keep them updated over long waits
- Detects when jobs fail or become unhealthy and takes appropriate actions
- Allows arbitrary scaling for both the web apps and for the network of machines running the jobs
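The waiting list behaviour described above can be sketched as a small in-memory class. This is an illustration under simplifying assumptions (a fixed capacity and a single process); the class and method names are hypothetical, and the source indicates the real system keeps its state in Consul's KV store.

```javascript
// Minimal in-memory waiting list: up to `capacity` users hold resources,
// everyone else waits in first-come, first-served order.
class WaitingList {
  constructor(capacity) {
    this.capacity = capacity;
    this.active = new Set(); // users who currently hold a job
    this.queue = []; // users waiting, oldest first
  }

  // Register a request and report where the user stands.
  request(userId) {
    if (!this.active.has(userId) && !this.queue.includes(userId)) {
      if (this.active.size < this.capacity) {
        this.active.add(userId);
      } else {
        this.queue.push(userId);
      }
    }
    return this.status(userId);
  }

  // Remove a user; if they held a job, promote the next user in line.
  release(userId) {
    if (this.active.delete(userId)) {
      if (this.queue.length > 0) {
        this.active.add(this.queue.shift());
      }
    } else {
      this.queue = this.queue.filter((id) => id !== userId);
    }
  }

  // Position 0 means the user has a job; 1 means first in line.
  status(userId) {
    if (this.active.has(userId)) return { state: "claimed", position: 0 };
    const index = this.queue.indexOf(userId);
    if (index === -1) return { state: "unknown", position: null };
    return { state: "waiting", position: index + 1 };
  }
}

const list = new WaitingList(1);
console.log(list.request("alice"));
console.log(list.request("bob"));
list.release("alice");
console.log(list.status("bob"));
```

Each call to `status` yields the kind of message that could be pushed to a user over a WebSocket while they wait.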
It's time to see how all of these technologies can be combined to form the system that is Manticore. Read here for possible designs for Manticore.