Background: We have developed two different recommendation engines, first one is User Article Recommender and second one is Market Basket Analysis. We have used the KNN Algorithm for both systems.
The source code of both Reco Engines is available in Summit’s git repository. Here is the path of the same.
As both Reco Engines are developed using KNN algorithm, we are using Python’s surprise library for the Implementation. Here is the path of the training module for both the engines.
- User Article Recommender: https://github.com/Summit-x-Tuple/summit_recommander/tree/master/uar_training
- Market Basket Analysis: https://github.com/Summit-x-Tuple/summit_recommander/tree/master/uar_training
Training data for both algorithms is exactly the same. We are considering 3 fields in training data they are:
- Summit client ID
- Article Slug
- Time on Page
To control the size of the model and keep recommendations relevant we are training our models only for the last 6 month’s data. Here is the SQL to generate training data:
WITH w1_ut AS (
SELECT
ut.summit_client_id
,ut.domain
,ut.slug
,ar.article_title
,ut.time_on_page
,ut.date_time
,cast(substr(ut.date_time, 1, 8) as bigint) date
FROM archives.tf_articles ar
JOIN google_analytics.tf_user_tracks_raw ut
ON (ar.domain = ut.domain AND ar.article_title_id = ut.slug)
)
SELECT
summit_client_
,concat(domain, '/', slug) slug
,time_on_page
FROM w1_ut
WHERE date >= 20200401 AND date <= 20200430
LIMIT 10
In this SQL change the date range and domain accordingly. Run this SQL on Athena and download the data as a CSV file. Keep this training data CSV file in the training folder of both the Reco Engines and run the training script, it should generate the models.
In this module we generate the predictions/recommendations for the requested users.
User Article Recommender takes summit_client_id as input and it returns the slug of top 10 recommended articles by the model.
On the other hand Market Basket model takes article slug as input and generates the recommended slug for the input. Here is the path of both inference modules.
- UAR inference code: https://github.com/Summit-x-Tuple/summit_recommander/blob/master/get_recommendation.py
- MBA inference code: https://github.com/Summit-x-Tuple/summit_recommander/blob/master/mba_inference.py
We are using Python Flask to write our APIs and backend code. Here is the path for code of APIs and server code which responds to all user requests
This Server is developed using the Python Flask framework. We are using Nginx as a web server and gunicorn as an application server. We are using the supervisor process to manage the gunicorn server. Here is the architecture / block diagram of ML API Server.
- Import a HTML file and watch it magically convert to Markdown
- Drag and drop images (requires your Dropbox account be linked)
You can also:
- Import and save files from GitHub, Dropbox, Google Drive and One Drive
- Drag and drop markdown and HTML files into Dillinger
- Export documents as Markdown, HTML and PDF
Markdown is a lightweight markup language based on the formatting conventions that people naturally use in email. As John Gruber writes on the Markdown site
The overriding design goal for Markdown's formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it's been marked up with tags or formatting instructions.
This text you see here is actually written in Markdown! To get a feel for Markdown's syntax, type some text into the left window and watch the results in the right.
Dillinger uses a number of open source projects to work properly:
- AngularJS - HTML enhanced for web apps!
- Ace Editor - awesome web-based text editor
- markdown-it - Markdown parser done right. Fast and easy to extend.
- Twitter Bootstrap - great UI boilerplate for modern web apps
- node.js - evented I/O for the backend
- Express - fast node.js network app framework @tjholowaychuk
- Gulp - the streaming build system
- Breakdance - HTML to Markdown converter
- jQuery - duh
And of course Dillinger itself is open source with a public repository on GitHub.
Dillinger requires Node.js v4+ to run.
Install the dependencies and devDependencies and start the server.
$ cd dillinger
$ npm install -d
$ node app
For production environments...
$ npm install --production
$ NODE_ENV=production node app
Dillinger is currently extended with the following plugins. Instructions on how to use them in your own application are linked below.
Plugin | README |
---|---|
Dropbox | plugins/dropbox/README.md |
GitHub | plugins/github/README.md |
Google Drive | plugins/googledrive/README.md |
OneDrive | plugins/onedrive/README.md |
Medium | plugins/medium/README.md |
Google Analytics | plugins/googleanalytics/README.md |
Want to contribute? Great!
Dillinger uses Gulp + Webpack for fast developing. Make a change in your file and instantaneously see your updates!
Open your favorite Terminal and run these commands.
First Tab:
$ node app
Second Tab:
$ gulp watch
(optional) Third:
$ karma test
For production release:
$ gulp build --prod
Generating pre-built zip archives for distribution:
$ gulp build dist --prod
Dillinger is very easy to install and deploy in a Docker container.
By default, the Docker will expose port 8080, so change this within the Dockerfile if necessary. When ready, simply use the Dockerfile to build the image.
cd dillinger
docker build -t joemccann/dillinger:${package.json.version} .
This will create the dillinger image and pull in the necessary dependencies. Be sure to swap out ${package.json.version}
with the actual version of Dillinger.
Once done, run the Docker image and map the port to whatever you wish on your host. In this example, we simply map port 8000 of the host to port 8080 of the Docker (or whatever port was exposed in the Dockerfile):
docker run -d -p 8000:8080 --restart="always" <youruser>/dillinger:${package.json.version}
Verify the deployment by navigating to your server address in your preferred browser.
127.0.0.1:8000
See KUBERNETES.md
- Write MORE Tests
- Add Night Mode
MIT
Free Software, Hell Yeah!