This document uses a number of terms that may be unfamiliar, so brief explanations are given below.
Celery: Celery is a Python library for implementing a distributed task queue. In this project it uses RabbitMQ as its message broker.
Django: Django is a Python web application framework that renders pages on the server. The framework follows the model-template-view (MTV) pattern, which is Django's variant of the model-view-controller (MVC) design pattern.
Docker: Docker is a tool for bundling applications into images that contain all the libraries and resources the application needs, including a preferred Linux distribution. A Dockerfile contains the instructions for building an image. Images are used to create instances called containers, which can be started, stopped and removed.
Docker Compose: Docker Compose is a tool for making the management of multiple Docker containers easier. The configuration is stored in a docker-compose.yml file.
PostgreSQL: PostgreSQL is a relational database management system.
RabbitMQ: RabbitMQ is a message broker.
The full environment is set up using Docker Compose and consists of four Docker containers:
- Web contains the Django web server (nicknamed WebMark)
- DB contains the PostgreSQL database
- RabbitMQ contains the RabbitMQ message broker
- Benchmark contains a benchmarking environment powered by Celery
The following diagram shows how the containers are connected, along with an example scenario in which a user submits an algorithm for analysis.
The web server has been nicknamed WebMark. The folder /WebMark contains the project settings and /WebCLI contains the actual application.
Architecturally, the web server consists of three parts: templates, views and models.
- Templates contain HTML documents with occasional inline JavaScript. The Django template language is used to generate some of the HTML from data provided by the view layer.
- Views contain the logic that populates the templates with data fetched through the models.
- Models represent the database schema and are used to execute database operations.
The database uses PostgreSQL, and the schema is shown in the following diagram. The User table is created automatically by Django, and most of its fields are not used. The relevant fields are username and password.
RabbitMQ is managed automatically by the Celery library and is mostly transparent to the developer. It serves as a task queue: the web server adds tasks, and Celery workers running in the benchmarking environment consume them. RabbitMQ holds received tasks until a worker is available to consume them.
The benchmarking environment is located inside the WebMark repository in the folder /BenchMark. It contains a Celery task that runs the benchmarks and returns the results. Keeping it in a separate environment reduces load on the web server and increases security by limiting direct database access. The environment is configured to start multiple workers so that it can take advantage of multiprocessing.