- Go to Google cloud console
- Set it up fully with credentials billing etc. (Currently already setup)
- Click on the top left and click on compute engine
- Click on create instance.
- Choose standard options.
- 1 CPU Ubuntu 16.04 instance should work. Allow HTTP and HTTPS
- The instance should be created.
- Install git bash on your local system Git bash download
- Setup rsa on system SSH Key Setup
- On the google cloud compute system, add the public ssh key that you obtained to the system. You can do this by clicking on edit for the instance
- Login for the first time from the browser window
- If you are logged in as Admin, then fine, else create a new user
#Add User
sudo useradd Admin
# Create password etc
# Add sudo privileges to user
sudo usermod -a -G sudo Admin
#change user to Admin
su Admin
#cd into their ssh directory
cd ~/.ssh
#change the authorized keys to add your local key
nano authorized_keys
- Make a list of the ssh keys of all users you want to login to the remote system
# Add the keys to the authorized_keys file
# A typical key looks as follows
ssh-rsa AAAAB3N.....abcdef My Name <my.name@your_domain.com>
- Now log out of the webbrowser terminal
- Login to terminal from git bash. Type
ssh Admin@ip-address-of-remote
#if that doesn't work
ssh -i ~/.ssh/id_rsa Admin@ip-address-of-remote
# Now you should be logged in
- Setup the user who will be using the superset
#Add User on who's account the superset will run
sudo useradd flaskuser
# Create password etc
# Add sudo privileges to user
sudo usermod -a -G sudo flaskuser
# Change to that user and go to base directory
su flaskuser
cd ~
- Add the basic dependencies
sudo apt-get install build-essential libssl-dev libffi-dev python-dev python-pip libsasl2-dev libldap2-dev
- Install a virtualenvironment
# Install virtualenv module
pip install virtualenv
# Create venv a virtual environment named venv
virtualenv venv
# Activate venv
$ . /home/flaskuser/venv/bin/activate
- Upgrade pip etc
pip install --upgrade setuptools pip
- Now do the actual setup and see if it works
- Follow instructions from here
# Important note, don't use sudo on any of the below commands
# Install superset
pip install superset
# Create an admin user (you will be prompted to set username, first and last name before setting a password)
fabmanager create-admin --app superset
# I created my_first_name.last_name user here. note the password that you input here
# Initialize the database
superset db upgrade
# Load some data to play with
superset load_examples
# Create default roles and permissions
superset init
# Start the web server on port 80
superset runserver -p 80
-
Go the browser and type the ip of the instance. This connects by default to the http port. This should give you superset. Login and see that it works
-
Now that you have it setup, don't expect to log back into it until some other steps are done
- login to godaddy.com
- Setup Account.
- Buy a domain that you like
- Then click on your name on the top right and click on manage my domains
- If using an existing domain, then add an A Host to link to your particular server IP
- If using a new domain, then add @ forwarding to your IP. and then also add www domain forwarding to @
- Once all this is done and takes effect, you should see http://xyz.domain-name.com reflects the front page you saw earlier
- Take server down by hitting ctrl+C
- In the venv environment check your PYTHONPATH
- Find your PYTHONPATH
# Open the Python Shell.
$ python
>>> #import sys library
>>> import sys
>>> for p in sys.path:
>>> print(p)
>>> # for python 2.7 use 'print p'
>>> exit()
- choose one of the paths printed there for saving superset_config.py
- I chose /home/flaskuser/venv/local/lib/python2.7/site-packages/superset_config.py
# Create a new config file
nano /home/flaskuser/venv/local/lib/python2.7/site-packages/superset_config.py
- The base settings of the file are as follows:
#---------------------------------------------------------
# Superset specific config
#---------------------------------------------------------
ROW_LIMIT = 5000
SUPERSET_WORKERS = 4
# You will be serving site on port 8000 from gunicorn which sits in front of flask, and then send to nginx
SUPERSET_WEBSERVER_PORT = 8000
#---------------------------------------------------------
#---------------------------------------------------------
# Flask App Builder configuration
#---------------------------------------------------------
# Your App secret key
SECRET_KEY = '\2\mthisismyscretkey\1\2\a\b\y\h'
# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
SQLALCHEMY_DATABASE_URI = 'sqlite:////home/flaskuser/.superset/superset.db'
# Flask-WTF flag for CSRF
CSRF_ENABLED = True
# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''
- We will be making several changes to this file as we go about modifying some things
- ensure in venv environment
- pip install psycopg2
pip install psycopg2
- Download postgres
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install postgresql postgresql-contrib
- Create a postgres user
# By default postgres creates a user called postgres
sudo -i -u postgres
# Check postgres working
psql
>>> # Quit
>>> \q
# Get back into user with sudo access
exit
- Create password for user postgres
sudo passwd postgres
# Store somewhere safe
- Login to postgres user again
su - postgres
# If above doesn't work, try sudo -i -u postgres
# Create a user who will access the database. I am calling this user flaskuser
createuser --pwprompt --interactive
# When it asks whether user is superuser say yes
# Save password securely
# Create a database named superset
createdb superset
# Test the db.
psql superset
superset=#
#This is working
# Grant user access
superset=# GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA superset TO flaskuser;
# exit
superset=# \q
- Ensure you are still user postgres
- Edit /etc/postgresql/9.5/main/pg_hba.conf
nano /etc/postgresql/9.5/main/pg_hba.conf
- Change the file as below
# "local" is for Unix domain socket connections only
# local all all peer
# change above to
local all all md5
- Return to normal unix shell as flaskuser
# exit postgres user
exit
# Restart postgresql service
sudo service postgresql restart
su - postgres
# test the service
psql -U flaskuser -W superset
- Check that postgres is working at any time (say after restart)
htop
# if command doesn't work, then 'sudo apt-get install htop'
- Change the superset_config.py to reflect the following db changes:
# Change the line
# SQLALCHEMY_DATABASE_URI = 'sqlite:////home/flaskuser/.superset/superset.db'
# to
SQLALCHEMY_DATABASE_URI = 'postgresql+psycopg2://flaskuser:your_password@localhost/superset'
- Now do the following steps to ensure postgres working
# Login as flaskuser
sudo su flaskuser
# activate virtualenvironment
cd ~
. ./venv/bin/activate
# Install superset again. This step can be skipped also
pip install superset
# Almost everything shoudl say already installed. Make sure no errors.
# Initialize the database
superset db upgrade
# This is the important part. Ensure the shell says it's pulling the superset_config file and also running on postgresql. If not go back to previous steps and ensure everything ok
# Load the examples into new database
superset load_examples
# Create default roles and permissions in the new database
superset init
# Start the web server on port 80
superset runserver -p 80
- Ensure working from a browser then close/interrupt the server by hitting ctrl+C
- This reference - Redis Install and configure is comprehensive
- Upgrade apt-get
sudo apt-get update
sudo apt-get upgrade
- Install the redis server
sudo apt-get install redis-server
- Change the redis configuration file
sudo nano /etc/redis/redis.conf
- Add the following lines to redis.conf
# 128 MB max memory
maxmemory 128mb
# When mem overflow remove according to LRU algorithm
maxmemory-policy allkeys-lru
- Restart and enable redis on reboot
# restart redis service
sudo systemctl restart redis-server.service
# enable on reboot
sudo systemctl enable redis-server.service
# ensure redis shows up in htop
htop
#F10 to exit htop
- Monitoring live
redis-cli monitor
- Flush Cache
# Enter the command prompt
redis-cli
> # In the command prompt flushall to purge cache
> flushall
- Reference here: Redis for Python
- Ensure you are logged in as flaskuser
sudo su flaskuser
- Activate environment
# goto home
cd ~
# activate venv
. ./venv/bin/activate
- Install flask
pip install redis
-
Change superset and flask to accept connections
-
This reference: Superset Redis Issue
-
Modify the superset_config.py as per follows:
#---------------------------------------------------------
# Superset specific config
#---------------------------------------------------------
ROW_LIMIT = 5000
SUPERSET_WORKERS = 4
SUPERSET_WEBSERVER_PORT = 8000
#---------------------------------------------------------
#---------------------------------------------------------
# Flask App Builder configuration
#---------------------------------------------------------
# Your App secret key
SECRET_KEY = '\2\mthisismyscretkey\1\2\a\b\y\h'
# The SQLAlchemy connection string to your database backend
# This connection defines the path to the database that stores your
# superset metadata (slices, connections, tables, dashboards, ...).
# Note that the connection information to connect to the datasources
# you want to explore are managed directly in the web UI
SQLALCHEMY_DATABASE_URI = 'postgresql+psycopg2://flaskuser:your_password@localhost/superset'
# Flask-WTF flag for CSRF
CSRF_ENABLED = True
# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = ''
CACHE_DEFAULT_TIMEOUT = 86400
CACHE_CONFIG = {
'CACHE_TYPE': 'redis',
'CACHE_DEFAULT_TIMEOUT': 86400,
'CACHE_KEY_PREFIX': 'superset_',
'CACHE_REDIS_HOST': 'localhost',
'CACHE_REDIS_PORT': 6379,
'CACHE_REDIS_DB': 1,
'CACHE_REDIS_URL': 'redis://localhost:6379/1'
}
- This is a good time to get mapbox api key which is the default configured map on superset
- Login to mapbox.com
- Create an account
- Once logged in go to top right and click on account
- Click on API access tokens and get one
- Add this to your superset_config.py
- The should have this line added:
# Set this API key to enable Mapbox visualizations
MAPBOX_API_KEY = 'pk.eyJ1IjoicmFodWwtbWFkZHkiLCJhIjoiY2o0YTF2OTBrMHRxcTJxbzZxcm5zbHV5aiJ9.jVLtVWHVGnRJcqsMa8XiMA'
- What we are looking for on nginx is to run a reverse proxy. The actual server is gunicorn sitting in front of flask, which is the framework hosting superset
- Some references for this section: Nginx with Reverse Proxy , nginx.conf for optimized performance, Setting up postgres nginx and gunicorn , How to install nginx
- Install commands
# Install
sudo apt-get update
sudo apt-get install nginx
# Start and Reload
sudo nginx -s start
sudo nginx -s reload
# Check status
systemctl status nginx
# It should be running
- At this point if you browse to your site, it should show nginx bad gateway.
- There are two files to change, nginx.conf and the file that it includes in sites-enabled
- Feel free to change the configuration as per requirements of your server
- Open nginx.conf as follows:
sudo nano /etc/nginx/nginx.conf
- See file below
- If you want to understand what each of these mean go to Reference
user www-data;
worker_processes auto;
pid /run/nginx.pid;
#from linode nginx.config optimizer
events {
worker_connections 650;
use epoll;
multi_accept on;
}
http {
#from linode nginx.config optimizer
keepalive_requests 100000;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
client_header_timeout 3m;
client_body_timeout 3m;
send_timeout 3m;
open_file_cache max=1000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 5;
open_file_cache_errors off;
gzip on;
gzip_min_length 1000;
gzip_buffers 4 4k;
gzip_types text/html application/x-javascript text/css application/javascript text/javascript text/plain text/xml applica$
gzip_disable "MSIE [1-6]\.";
# [ debug | info | notice | warn | error | crit | alert | emerg ]
error_log /var/log/nginx.error_log warn;
log_format main '$remote_addr - $remote_user [$time_local] '
'"$request" $status $bytes_sent '
'"$http_referer" "$http_user_agent" '
'"$gzip_ratio"';
log_format download '$remote_addr - $remote_user [$time_local] '
'"$request" $status $bytes_sent '
'"$http_referer" "$http_user_agent" '
'"$http_range" "$sent_http_content_range"';
map $status $loggable {
~^[23] 0;
default 1;
}
#-------------from basic conf file-----------
include /etc/nginx/mime.types;
default_type application/octet-stream;
access_log /var/log/nginx/access.log;
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;
}
- If you notice this includes the files in sites-enabled. For now we should have no sites enabled.
- You can see these files in the sites-enabled folder
cd /etc/nginx/sites-enabled/
ls
# If there is anything you can remove as follows
rm file-to-be-removed.conf
- Now create a config file in sites-available and hard link it to sites-enabled
cd /etc/nginx/sites-available
sudo nano superset.conf
- create a really simple config file to start off. We will change this later.
server {
listen 80;
server_name your.domain.com;
location / {
proxy_buffers 16 4k;
proxy_buffer_size 2k;
proxy_pass http://127.0.0.1:8000;
}
}
- Now hardlink this superset.conf into the sites-enabled folder
# Hardlinking files
sudo ln -s /etc/nginx/sites-available/superset.conf /etc/nginx/sites-enabled
# Test for syntax
sudo nginx -t
sudo nginx -s reload
- The output should look as follows:
#Output
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
- This is a good time to test whether server works (without the SSL bit)
# Login as flaskuser
sudo su flaskuser
# activate virtualenvironment
cd ~
. ./venv/bin/activate
# Start the web server on port 8000 (Note that we have changed the server port)
superset runserver -p 8000
- Ensure working from a browser then close/interrupt the server by hitting ctrl+C
- If above is not working, go back and ensure all steps followed
- We will be using Let's encrypt for SSL.
- The references for this section: Digital Ocean: Getting letsencrypt, Server Fault answer on how to setup letsencrypt with reverse proxy , Raymii.org strong security
# Add the repo
sudo add-apt-repository ppa:certbot/certbot
# Update apt-get
sudo apt-get update
# Install certbot
sudo apt-get install certbot
-
First some prerequisites:
- We will be using the 'well-known' application to help us with the
- We are using letsencrypt which gives us the certificate for free
- But it needs to verify our site first. It does this by doing some checks (which it calls challenges) within http://sub.domain.com/.well-known
-
Create some folders for nginx to redirect /.well-known requests
# Go to below folder. If html directory doesn't exist within www, you can create the same with mkdir command as below
cd /var/www/html/
#make directory .well-known. we will be redirecting nginx requests here.
mkdir .well-known
# Go back to nginx directory
cd /etc/nginx/sites-available
- Go to the nginx config file
sudo nano /etc/nginx/sites-available/superset.conf
- Change it slightly to allow letsencrypt to certify your site
server {
listen 80;
server_name sub.domain.com www.sub.domain.com;
[…]
# Add the below
location /.well-known {
alias /var/www/html/.well-known;
}
location / {
# proxy commands go here
[…]
}
}
- Now, you can use the certbot client to request a certificate from Let's Encrypt using the webroot plugin (as root):
certbot certonly --webroot -w /var/www/sub.domain.com/ -d sub.domain.com -d www.sub.domain.com
- The certificate is now installed in /etc/letsencrypt/live/sub.domain.com/
ls /etc/letsencrypt/live/sub.domain.com/
- Now establish some strong security
#Establish a 2048 bit diffie Helman group Params
sudo openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048
- sudo nano /etc/nginx/sites-available/superset.conf
server {
listen 80 default_server;
listen [::]:80 default_server;
server_name analysis.atidiv.com;
return 301 https://$server_name$request_uri;
}
server {
listen 443 ssl;
ssl_certificate /etc/letsencrypt/live/your.domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your.domain.com/privkey.pem;
# from https://cipherli.st/
# and https://raymii.org/s/tutorials/Strong_SSL_Security_On_nginx.html
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;
ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
ssl_ecdh_curve secp384r1;
ssl_session_cache shared:SSL:10m;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;
resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;
# Disable preloading HSTS for now. You can use the commented out header line that includes
# the "preload" directive if you understand the implications.
#add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; preload";
add_header Strict-Transport-Security "max-age=63072000; includeSubdomains";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
ssl_dhparam /etc/ssl/certs/dhparam.pem;
# Uncomment this line only after testing in browsers,
# as it commits you to continuing to serve your site over HTTPS
# in future
add_header Strict-Transport-Security "max-age=31536000";
access_log /var/log/nginx/sub.log combined;
server_name your.domain.com;
location /.well-known {
alias /var/www/html/.well-known;
}
location / {
proxy_buffers 16 4k;
proxy_buffer_size 2k;
proxy_pass http://127.0.0.1:8000;
#from linode nginx optimizer
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_connect_timeout 90;
proxy_send_timeout 90;
proxy_read_timeout 90;
proxy_busy_buffers_size 4k;
proxy_temp_file_write_size 4k;
proxy_temp_path /etc/nginx/proxy_temp;
}
}
- Reload nginx after change
sudo nginx -s reload
- run the certbot renew command
certbot renew --renew-hook "service nginx reload"
- Add to crontab
# at 4:47am/pm, renew all Let's Encrypt certificates over 60 days old
47 4,16 * * * root certbot renew --quiet --renew-hook "service nginx reload"
- Dry run he renewal
certbot --dry-run renew
- Force early renewal to test
certbot renew --force-renew --renew-hook "service nginx reload"
- Now we are good to start the server proper
- For this we use the application screen. This helps in persistent sessions. ie those that don't go off when you log off from the bash running the server
- Reference Persistent SSH Sessions
- Run screen
# Run Screen
screen
>Press enter
> New terminal session
# Change user to flaskuser
sudo su flaskuser
# Go to home directory
cd ~
# Activate virtual environment
. ./venv/bin/activate
# Initialize the database
superset db upgrade
# Ensure everything goes right
# Load some data to play with
superset load_examples
# Ensure everything goes right
# Create default roles and permissions
superset init
# Ensure everything goes right
# Start the web server on port 8000
superset runserver - p 8000
# ctrl a+d to log off screen
# you can type screen -r to log back into screen running server
- Now your server is running with redis, SSL, Postgresql and Nginx!
- grep search for any the colours on this page Colours used by superset
sudo grep -rnwl '#7b0051' /home/flaskuser/venv/local/lib/python2.7/site-packages/superset
# Results as follows
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/data/__pycache__/__init__.cpython-35.pyc
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/data/__init__.py
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/data/__init__.pyc
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/static/assets/dist/explore.f6a614ef75857893cf0e.entry.js
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/static/assets/dist/dashboard.e444b819184014cf5be3.entry.js
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/static/assets/javascripts/modules/colors.copy
/home/flaskuser/venv/local/lib/python2.7/site-packages/superset/static/assets/coverage/lcov-report/javascripts/modules/colors.js.html
- You can do a sed replace after a grep search for each of the colours in the colour scheme 'bnbcolors' with your own
- reference: How to grep and replace
grep -rl matchstring somedir/ | xargs sed -i 's/string1/string2/g'