Domingos Gonçalves edited this page Dec 18, 2018 · 2 revisions

OpenConext statistics

Introduction

A SAML proxy provides an excellent opportunity to gather statistics on the login process. The OpenConext statistics module provides a way to do so. It has been built using as many off-the-shelf components as possible: Syslog and Filebeat to transport the logs, Logstash to parse them, and InfluxDB to create and store meaningful statistics. The module also integrates with the OpenConext Dashboard. If you plan on using the statistics module, it is recommended to use a separate machine or VM for it. Both Logstash and InfluxDB are memory hungry, so give them plenty.

Architecture

Engineblock logs every authentication to syslog, which results in messages like this:

Dec 12 09:11:28 t06 EBAUTH[18593]: {"channel":"authentication","level":"INFO","message":"login granted","context":{"login_stamp":"2018-12-12T09:11:28.168643+01:00","user_id":"urn:collab:person:example.com:statsdemo","sp_entity_id":"https:\/\/profile.test2.surfconext.nl\/authentication\/metadata","idp_entity_id":"http:\/\/mock-idp","key_id":null,"proxied_sp_entity_ids":[],"workflow_state":"prodaccepted"},"extra":{"session_id":"mvkSYtlQA,Muj90jcpGNDCSBvqd","request_id":"5c10c2afef844"}}

Filebeat is then used to send all messages containing "EBAUTH" to a Logstash server. Logstash parses each message, extracting the timestamp, the SP and IdP entityIDs, and the workflow state. Furthermore, it generates a hash of the user ID so you don't have to store the user ID itself. After parsing, the message is sent to an InfluxDB measurement. InfluxDB is provisioned with a number of continuous queries (see the InfluxDB documentation for more information on those) which aggregate the raw login data, making it faster to access. A Python Flask based API and frontend sit in front of the InfluxDB database. The API is accessed by both the OpenConext Dashboard and the web application frontend.
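The parsing and hashing step can be sketched in Python. This is a simplified illustration, not the actual Logstash pipeline: the field names come from the example log message above, and the keyed SHA-512 hash (with logstash_ebauth_sha_secret as the key) is an assumption about how the user ID is anonymised.

```python
import hashlib
import hmac
import json

def parse_ebauth(line: str, sha_secret: bytes) -> dict:
    """Extract the statistics fields from an EBAUTH JSON payload.

    `line` is the JSON part of the syslog message; `sha_secret`
    plays the role of logstash_ebauth_sha_secret.
    """
    msg = json.loads(line)
    ctx = msg["context"]
    return {
        "time": ctx["login_stamp"],
        "sp_entity_id": ctx["sp_entity_id"],
        "idp_entity_id": ctx["idp_entity_id"],
        "workflow_state": ctx["workflow_state"],
        # Store a keyed hash instead of the user ID itself.
        "user_id": hmac.new(
            sha_secret, ctx["user_id"].encode(), hashlib.sha512
        ).hexdigest(),
    }

payload = (
    '{"channel":"authentication","level":"INFO","message":"login granted",'
    '"context":{"login_stamp":"2018-12-12T09:11:28+01:00",'
    '"user_id":"urn:collab:person:example.com:statsdemo",'
    '"sp_entity_id":"https://profile.test2.surfconext.nl/authentication/metadata",'
    '"idp_entity_id":"http://mock-idp","workflow_state":"prodaccepted"}}'
)
fields = parse_ebauth(payload, b"secret")
print(fields["workflow_state"])  # prodaccepted
```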

The default landing page of the web application shows some statistics for public viewing purposes (to show the rest of the world how many logins you actually process). The login button gives access to the application dashboard; login uses the OpenID Connect module.

Installation

This manual describes how to install the statistics module using Ansible. It assumes that you have a working OpenConext installation and some knowledge of the Ansible deploy scripts.

Preparations

If you plan on using the statistics for production purposes, please make sure that you have a separate machine for this. Furthermore, you need a DNS name for logstash.YOURDOMAIN; Filebeat needs it to send the logs to.
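As an illustration, the Filebeat configuration deployed by the filebeat role will look roughly like the sketch below. The log path and the port (5044 is Logstash's default Beats port) are assumptions; the actual role template may differ.

```yaml
filebeat.prospectors:
  - type: log
    paths:
      - /var/log/messages        # assumed syslog file location
    include_lines: ["EBAUTH"]    # only ship the authentication log lines

output.logstash:
  hosts: ["logstash.YOURDOMAIN:5044"]
```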

You need the following additions to your group_vars:

Add a stats user to manage-api-users:

    - {  
        name: "stats",
        password: "{{ manage_stats_api_password }}", 
        scopes: ["READ"]
      }    

Some variables that are shared between the roles:

influx_stats_db: prod_logins
influx_ebauth_measurement: EBAUTH

If you use haproxy, you will need to configure an additional vhost.

Add a loadbalancing port:

loadbalancing:
  stats:
    port: 702

Add a haproxy application:

haproxy_applications:
  - name: stats
    vhost_name: stats
    ip: "{{ haproxy_sni_ip }}"
    ha_method: "GET"
    ha_url: "/health"
    port: "702"
    servers: "{{ log_servers }}"
    crt_name: "{{ tls_star_cert }}"
    hidden: false
    ipv6: "2001::"
    key_name: "{{ tls_star_cert_key }}"
    x_forwarded_port: "443"

Create a log_servers group:

log_servers:
   - { ip: "your.ip.address", label: "stats"}

Create a host_vars file for your machine and add:

apache_app_listen_address:
  stats: "ansible_default_ipv4" 

In addition to these group_vars, you'll need the following secrets:

stats_api_secret: 
stats_dashboard_api_password: 
stats_sysadmin_api_password: 
manage_stats_api_password: 
stats_oidc_client_secret: 
stats_oidc_crypto_pass:
logstash_ebauth_sha_secret: 
influxdb_admin_password: 
influxdb_stats_password: 
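These secrets are just random strings. One way to generate them (a suggestion, not a requirement):

```python
import secrets

# Print a fresh 40-character hex secret; run once per variable above.
print(secrets.token_hex(20))
```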

Haproxy

Reinstall the haproxy role to activate your new application.

Filebeat

Install the filebeat role on your PHP application servers. This role is fairly straightforward.

Logstash

Create a group in your inventory file called logstash:

[logstash]
your.machine.hostname

Then, install the role "elk" on your statistics machine. It will install logstash and the configuration needed for the statistics.

InfluxDB

Install the influxdb role.

stats role

Install the stats role. It consists of two parts:

stats

This part installs the stats application itself.

stats_backfill_cq

This role will create all the continuous queries and backfill all the data already present.
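For illustration, a continuous query in InfluxDB 1.x looks like the sketch below. The database and measurement names are taken from the group_vars above, but the query name, target measurement and grouping are made up; the actual queries created by the role will differ.

```sql
CREATE CONTINUOUS QUERY "cq_logins_per_day" ON "prod_logins"
BEGIN
  SELECT count("user_id") AS "logins"
  INTO "logins_per_day"
  FROM "EBAUTH"
  GROUP BY time(1d), "sp_entity_id", "idp_entity_id"
END
```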

After deploying, you should create an OIDC client in order to log in. Create an OIDC client with the clientID https@//stats.YOURDOMAIN. Add the secret to your secrets file under the variable stats_oidc_client_secret.