
Advanced networking setup

Milton Pividori edited this page Oct 24, 2018 · 27 revisions


For security reasons, the instructions so far were intended to make ukbREST run on a single machine. If you want to work with several team members, all of whom are authorized to access the UK Biobank data, you can set up ukbREST to work over a network. Below you will find some examples of how to do this. This is only a general guide and does not cover all scenarios; consult a networking expert to determine the right setup for your environment and to avoid unauthorized data access.

When ukbREST is set up to receive queries from several users over a network, you should enable both user authentication and SSL encryption, as described below.

Activate user authentication

User authentication in ukbREST is provided using HTTP Basic Authentication. Basic Authentication sends the user name and password encoded but not encrypted, so you should always combine it with SSL encryption, as described below.
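To see why Basic Authentication needs SSL, the following minimal sketch builds the header value a client would send and then reverses it. The helper name basic_auth_token is hypothetical, and the credentials are the example values used on this page.

```shell
# Hypothetical helper: build the token sent in an "Authorization: Basic ..." header.
# $1 = user name, $2 = password
basic_auth_token() {
  printf '%s:%s' "$1" "$2" | base64
}

token=$(basic_auth_token user1 password1)
echo "Authorization: Basic $token"

# Anyone who captures the header can recover the credentials without any key.
printf '%s' "$token" | base64 --decode
echo
```

Without SSL, this header travels over the network in a form that is trivially reversible.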

To activate it, you pass environment variables using Docker's -e parameter. First of all, you need a users file. Let's assume you have a file stored in ~/users with this content:

user1: password1
user2: password2
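Before mounting the file, it can help to check that it is well formed. The sketch below assumes the format is exactly the "user: password" lines shown above (one entry per line, a space after the colon); check_users_file is a hypothetical helper, not part of ukbREST, and the demo writes to a temporary file rather than ~/users.

```shell
# Hypothetical helper: succeed only if the file is non-empty and every
# non-blank line looks like "user: password".
check_users_file() {
  [ -s "$1" ] || return 1
  # grep -v selects lines that do NOT match the pattern; if any exist, fail.
  ! grep -Evq '^([^:[:space:]]+:[[:space:]]+[^[:space:]].*|[[:space:]]*)$' "$1"
}

# Demo with a temporary file; in practice you would check ~/users.
users_file=$(mktemp)
cat > "$users_file" <<'EOF'
user1: password1
user2: password2
EOF

check_users_file "$users_file" && echo "users file looks OK"
rm -f "$users_file"
```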

Then you run ukbREST with this command:

$ docker run --rm --net ukb -p 5000:5000 \
  -e UKBREST_SQL_CHUNKSIZE="10000" \
  -e UKBREST_DB_URI="postgresql://test:test@pg:5432/ukb" \
  -v ~/users:/etc/ukbrest_users \
  -e UKBREST_HTTP_USERS_FILE_PATH="/etc/ukbrest_users" \
  hakyimlab/ukbrest

Note that the -p parameter no longer includes 127.0.0.1, since we now want others to be able to access our ukbREST instance remotely. We have mounted our local users file inside the container (-v ~/users:...) and specified its path using the environment variable UKBREST_HTTP_USERS_FILE_PATH. ukbREST will automatically hash your passwords when the users file is read.

Make sure this is working properly by requesting data without credentials or with a wrong user/password, and then with a valid one. For instance, if you make this query without credentials, from the same machine or another one in the network, you should get an error:

$ curl -G \
  -HAccept:text/csv \
  http://REMOTE_ADDRESS:5000/ukbrest/api/v1.0/phenotype \
  --data-urlencode "columns=c50_0_0 as height" \
  --data-urlencode "columns=c21002_1_0 as weight"

[...]
Unauthorized Access

It should work if you use the -u USER:PASS parameter with a valid user/pass.
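Both cases can be compared by printing only the HTTP status code. This is a minimal sketch: ukbrest_status is a hypothetical helper, REMOTE_ADDRESS is the placeholder used above, and the expected codes (401 without valid credentials, 200 with them) assume the server follows standard HTTP Basic Authentication behavior.

```shell
# Hypothetical helper: print only the HTTP status code of a phenotype query.
# $1 = base URL (e.g. http://REMOTE_ADDRESS:5000); remaining args go to curl.
ukbrest_status() {
  base_url=$1
  shift
  curl -s -o /dev/null -w '%{http_code}' -G "$@" \
    -H 'Accept: text/csv' \
    "$base_url/ukbrest/api/v1.0/phenotype" \
    --data-urlencode 'columns=c50_0_0 as height'
}

# Example calls (replace REMOTE_ADDRESS before running):
#   ukbrest_status http://REMOTE_ADDRESS:5000                      # 401 expected
#   ukbrest_status http://REMOTE_ADDRESS:5000 -u user1:password1   # 200 expected
```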

REMEMBER that you also have to activate SSL encryption (explained below).

Activate data encryption with SSL

To use SSL you need a certificate. Generate a certificate and private key with this command (it will ask a few questions):

$ openssl req -x509 -newkey rsa:4096 -nodes -out ~/cert.pem -keyout ~/key.pem -days 365
Generating a 4096 bit RSA private key
.......................................................................++
.....................................................................................++
writing new private key to 'key.pem'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Illinois
Locality Name (eg, city) []:Chicago
Organization Name (eg, company) [Internet Widgits Pty Ltd]:Your institution
Organizational Unit Name (eg, section) []:Your Unit
Common Name (e.g. server FQDN or YOUR name) []:IP ADDRESS OR HOST
Email Address []:

This certificate will be valid for 365 days; check the OpenSSL documentation for more options. Pay special attention to the Common Name field: it must match the address (IP or host name) that clients will use to connect to the ukbREST server. Then you can distribute the certificate file cert.pem to your team members; the private key key.pem should stay on the server.
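If you prefer to skip the interactive questions, OpenSSL accepts the subject on the command line through -subj. The sketch below wraps this in two hypothetical helpers, make_selfsigned_cert and show_cert; the host name ukbrest.example.org is a placeholder, and only the Common Name is set here.

```shell
# Hypothetical helper: create a self-signed certificate without prompts.
# $1 = Common Name (host or IP that clients will use)
# $2 = output directory
# $3 = RSA key size in bits (optional; defaults to 4096 as in the command above)
make_selfsigned_cert() {
  openssl req -x509 -newkey "rsa:${3:-4096}" -nodes \
    -subj "/CN=$1" -days 365 \
    -out "$2/cert.pem" -keyout "$2/key.pem"
}

# Print the subject and the expiry date of a certificate file.
show_cert() {
  openssl x509 -in "$1" -noout -subject -enddate
}

# Example calls:
#   make_selfsigned_cert ukbrest.example.org ~
#   show_cert ~/cert.pem
```

Checking the subject with show_cert before distributing cert.pem is a quick way to confirm that the Common Name matches the address clients will use.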

To enable SSL encryption in ukbREST (in addition to HTTP Basic authentication explained above), you have to use a command like this:

$ docker run --rm --net ukb -p 5000:5000 \
  -e UKBREST_SQL_CHUNKSIZE="10000" \
  -e UKBREST_DB_URI="postgresql://test:test@pg:5432/ukb" \
  -v ~/users:/etc/ukbrest_users \
  -e UKBREST_HTTP_USERS_FILE_PATH="/etc/ukbrest_users" \
  -v ~/key.pem:/etc/ukbrest_key.pem \
  -v ~/cert.pem:/etc/ukbrest_cert.pem \
  -e GUNICORN_CMD_ARGS="--log-file=- -k eventlet -w 4 --timeout 10000 -b 0.0.0.0:5000 --certfile /etc/ukbrest_cert.pem --keyfile /etc/ukbrest_key.pem" \
  hakyimlab/ukbrest

[2018-08-06 21:35:11 +0000] [1] [INFO] Starting gunicorn 19.7.1
[2018-08-06 21:35:11 +0000] [1] [INFO] Listening at: https://0.0.0.0:5000 (1)

Query with user authentication and SSL encryption

Now you need to distribute your certificate cert.pem among your clients. A query with curl would look like this one (note that now you need to use https):

$ curl -G \
  -HAccept:text/csv \
  -u user1:password1 \
  --cacert cert.pem \
  "https://REMOTE_ADDRESS:5000/ukbrest/api/v1.0/phenotype" \
  --data-urlencode "columns=c50_0_0 as height" \
  --data-urlencode "columns=c21002_1_0 as weight"
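For repeated queries, the options above can be wrapped in a small function. This is only a sketch: ukbrest_query is a hypothetical helper, and it simply forwards each column expression as a separate columns= parameter, as in the command above.

```shell
# Hypothetical helper: run an authenticated phenotype query over HTTPS.
# $1 = base URL, $2 = user:password, $3 = path to cert.pem
# remaining args = column expressions, e.g. 'c50_0_0 as height'
ukbrest_query() {
  url=$1 creds=$2 cacert=$3
  shift 3
  # Turn each column expression into a "--data-urlencode columns=..." pair.
  n=$#
  while [ "$n" -gt 0 ]; do
    set -- "$@" --data-urlencode "columns=$1"
    shift
    n=$((n - 1))
  done
  curl -G -H 'Accept: text/csv' -u "$creds" --cacert "$cacert" \
    "$url/ukbrest/api/v1.0/phenotype" "$@"
}

# Example call (replace REMOTE_ADDRESS before running):
#   ukbrest_query https://REMOTE_ADDRESS:5000 user1:password1 cert.pem \
#     'c50_0_0 as height' 'c21002_1_0 as weight'
```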