This document describes (or links to existing documentation of) the infrastructure on which Web Labs (such as the Cardiac Electrophysiology Web Lab) run.
Briefly, the Web Lab front-end is a Django website which communicates with a "Functional Curation" back-end via a task queue implemented in "fc-runner" (aka fcws).
The best description of the full infrastructure is currently the Ansible configuration itself: an Ansible playbook can install the Web Lab on servers or locally, and can also apply updates.
The plans for the final set-up are given in this presentation Jonathan prepared for Harmony 2018.
Project-wide issues (not tied to any specific component) go here.
To get started practically, jump down to the deployment section.
This diagram describes the general purpose Web Lab infrastructure.
The website is written in Python using Django. The current Django application is slightly cardiac-specific, and is hosted in the WebLab repository.
The Django application can access data from various sources: databases, local files, git repositories, and a metadata triple store (not implemented yet). It communicates with the "task queue" via a REST API to execute experiments.
To make the Django application respond to HTTP requests, we use the nginx web server.
Requests for static files are handled by nginx directly; other requests are passed to the Django server via uwsgi, which is a "WSGI" server (a bit like a CGI server).
The Django server serves up an HTML and JavaScript front-end, which can interact with the Django server backend via a REST API.
nginx is a web server (think equivalent of Apache but more lightweight). uwsgi is a WSGI server, which serves WSGI apps (Django in our case); think faster CGI. Requests come in to nginx, which responds directly if the request is for a static resource, and otherwise hands off to Django via uwsgi.
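To make the WSGI hand-off concrete, here is a minimal sketch of the interface a WSGI server such as uwsgi calls into. It is not taken from the WebLab code: the `application` and `call_app` functions are illustrative only, and it uses just the standard library. In a Django project the equivalent callable is the one returned by `django.core.wsgi.get_wsgi_application()`.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable: the interface a WSGI server invokes per request."""
    body = f"Requested path: {environ.get('PATH_INFO', '/')}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the app once without any server, roughly as a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    chunks = application(environ, start_response)
    return captured["status"], b"".join(chunks)
```

Calling `call_app("/experiments/")` exercises the same code path that uwsgi would for a non-static request forwarded by nginx.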
The back-end fc-runner is also served via nginx, but this time it's an old-fashioned CGI app wrapped via fcgiwrap, since nginx doesn't do old-fashioned CGI natively.
PostgreSQL. The database is created by the Ansible deployment (including setting up a DB user for access), and the schema is defined by Django.
Stored in Django's data dir, at the path defined by REPO_BASE in config/settings.
Accessed with gitpython via entities/repository.py.
Used for datasets and predictions. Again the location is defined in Django settings (DATASETS_BASE and EXPERIMENT_BASE respectively).
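As a rough illustration of how these three settings relate, the fragment below shows one possible layout. Only the setting names (REPO_BASE, DATASETS_BASE, EXPERIMENT_BASE) come from the text above; the `DATA_DIR` value, the sub-directory names, and the `entity_dir` helper are assumptions made for the example and may not match the real config/settings.

```python
from pathlib import Path

# Hypothetical data directory; the real value is set by the deployment.
DATA_DIR = Path("/tmp/weblab-data")

# Setting names are from the documentation; sub-directory names are assumed.
REPO_BASE = DATA_DIR / "repos"              # git repositories
DATASETS_BASE = DATA_DIR / "datasets"       # uploaded datasets
EXPERIMENT_BASE = DATA_DIR / "experiments"  # predictions (experiment results)


def entity_dir(base: Path, entity_id: int) -> Path:
    """Hypothetical helper: one sub-directory per entity, named by its id."""
    return base / str(entity_id)
```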
Not yet implemented; will contain a copy of metadata associated with models, datasets, etc. to support searching; also the 'metadata interface' of protocols to support determining compatibility.
Happens via a task queue, communicated with via a REST API.
The fc-runner repository contains a CGI script that can be called by the Django front-end.
It handles these tasks:
- Determining a protocol interface (i.e. the variable annotations it requires)
- Scheduling experiments
- Cancelling scheduled experiments
- Checking the protocol's syntax for correctness (results will be displayed by the front-end)
Experiments are scheduled by placing them into a Celery task queue (wiki). One or more workers then take tasks from the queue and execute them. This way, the actual running of experiments can be offloaded to a cloud of worker machines.
The workers talk to the queue via a message passing system. Instead of talking to each other directly, messaging is handled via a broker, for which Web Lab uses RabbitMQ (wiki). See also Celery docs: Using RabbitMQ.
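The queue-and-workers pattern described above can be sketched with the standard library alone. This is a stand-in, not the fc-runner implementation: `queue.Queue` plays the role of the broker-backed queue, threads play the role of worker machines, and `run_experiment` is a placeholder. With Celery, the enqueue step would instead be a call such as `delay()` or `apply_async()` on a task, with RabbitMQ carrying the message.

```python
import queue
import threading


def run_experiment(model, protocol):
    """Placeholder for the real work; returns a label instead of running anything."""
    return f"ran {protocol} on {model}"


def worker(tasks, results):
    """Take tasks from the shared queue until a None stop signal arrives."""
    while True:
        task = tasks.get()
        if task is None:
            tasks.task_done()
            break
        task_id, model, protocol = task
        results[task_id] = run_experiment(model, protocol)
        tasks.task_done()


def run_all(jobs, n_workers=2):
    """Enqueue jobs, let n_workers process them, and return results by task id."""
    tasks = queue.Queue()  # stands in for the broker-backed queue
    results = {}
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for task_id, (model, protocol) in enumerate(jobs):
        tasks.put((task_id, model, protocol))
    for _ in threads:
        tasks.put(None)  # one stop signal per worker
    for t in threads:
        t.join()
    return results
```

The producer never talks to a worker directly; it only puts messages on the queue, which is what lets workers be added or moved to other machines without changing the front-end.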
Further information is given here.
Todo.
Vagrant (wiki) is a tool for "building and maintaining virtual software development environments".
For the WebLab, we use Vagrant to create and manage VirtualBox machines, which are automatically set up for development or production using Ansible.
Users can connect to running Vagrant machines using $ vagrant ssh.
Ansible (wiki) is a tool to set up production (or development) environments.
The WebLab deployment repo contains several Ansible playbooks, each of which sets up some part of the WebLab infrastructure. Overlapping parts of playbooks are shared via roles.
See also: Using Vagrant and Ansible
Five playbooks are defined:
Stored in site.yml, this is the main playbook; it does nothing other than import the remaining four playbooks.
Stored in webservers.yml, this sets up a Django server.
At the moment, there's only a single server, but we may want to consider more in the future e.g. for load balancing.
Sets up the non-worker parts of Celery (basically just checking out the repo), and the fc-runner REST web service (configuring nginx etc.).
Sets up Celery workers, using configuration from group_vars and defaults for the roles.
Sets up RabbitMQ.