KChalk/nlpia-bot

 
 


nlpia_bot

The nlpia_bot package is both a chatbot framework and a working "reference implementation" virtual assistant that actually assists! Most bots manipulate you to make money for their corporate masters. Your bot can help protect you and amplify your intelligence.

The presentations for San Diego Python User Group are at totalgood.org/midata/talks/ and in docs/.

Skills

The current version of nlpia-bot can answer basic questions about data science for healthcare and chatbots themselves. It can also imitate the classic therapist bot "Eliza" and carry on a relatively entertaining conversation based on lines it's read from movie scripts. You can select any or all of these skills with command line args and the configuration file ~/nlpia-bot.ini in your user directory.

You can expand the questions that nlpia-bot can answer by adding Q/A pairs to the YAML text files in data/faq. Soon nlpia_bot will also be able to detect your mood and carry on more meaningful conversations, to give you encouragement and emotional support. We'll have something like this online in a couple of months:

bot: How are you doing?
YOU: not so great
bot: I'm really sorry to hear that. What do you think about doing 10 pushups to get your blood flowing?
YOU: not so much
bot: Would you like to chat about it?
YOU: sure
bot: So what are you feeling right now? How does your body feel?
...

Install

You'll want to install and use the conda package manager within Anaconda3, especially if your development environment is not an open-standard operating system like Linux.

git clone git@github.com:nlpia/nlpia-bot
cd nlpia-bot
conda env create -n nlpia -f environment.yml  # or environment-windoze.yml
conda activate nlpia
pip install --editable .

Usage

$ bot --help
usage: bot [-h] [--version] [--name STR] [-p] [-b STR] [-v] [-vv] [words [words ...]]

You can run bot just like any other command line app, giving it your statement/query as an argument.

$ bot what is an allele
bot: A variant form of a given gene, a version of a known mutation at the same place as the original unmodified gene within a chromosome.

Travis's probabilistic conversation manager is working nicely to choose a reply from the possibilities generated by the bots:

  • pattern_bots.py: regex patterns and greeting templates
  • fuzzy_search_bots.py: movie dialog fuzzy matching
  • parul_bots.py: Wikipedia searches using conventional TFIDF like a search engine
  • eliza_bots.py: A Python port of the ELIZA therapist bot
  • time_bots.py: A time and productivity tracker that parses your git logs and bash history
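
To make the idea of choosing among candidate replies concrete, here is a minimal sketch of score-weighted random selection. It is not the project's actual conversation manager: the function name `choose_reply`, the `(score, reply)` tuple format, and the scores themselves are assumptions made for illustration only.

```python
import random


def choose_reply(candidates, rng=random):
    """Pick one reply at random, weighting each by its confidence score.

    `candidates` is a list of (score, reply) tuples (a hypothetical format).
    Negative scores are treated as zero.
    """
    if not candidates:
        return None
    scores = [max(float(score), 0.0) for score, _ in candidates]
    replies = [reply for _, reply in candidates]
    if sum(scores) == 0:
        # No bot is confident, so fall back to a uniform random choice.
        return rng.choice(replies)
    return rng.choices(replies, weights=scores, k=1)[0]


# Made-up scores attached to replies like those several bots might propose.
candidates = [
    (0.90, "How do you do. Please state your problem."),
    (0.30, "hello."),
    (0.05, "hey there. tired of breathing?"),
]
reply = choose_reply(candidates, rng=random.Random(0))
print("BOT:", reply)
```

Sampling in proportion to the score, rather than always taking the top-scoring reply, is one way to get varied answers to a repeated statement while still favoring the more confident bots.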

Here's how those bots work together:

$ bot
(nlpia) hobs@Hobsons-MacBook-Air:~/code/chatbot/nlpia-bot$ bot
# 2019-11-21 12:59:05,854 WARNING:nlpia.constants:107:            <module> Starting logger in nlpia.constants...
# 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 64350/64350 [00:00<00:00, 495935.48it/s]
YOU: Hi
BOT: hey there. tired of breathing?
YOU: Hello
BOT: How do you do. Please state your problem.
YOU: What is an ELIZA chatbot?
BOT: joey never told you we went out, did he?
YOU: ^[[A
BOT: I am sorry! I dont understand you
YOU: What is an ELIZA chatbot?
BOT: Does that question interest you ?
YOU: What is a chatbot?
BOT: hello barbie is an internet-connected version of the doll that uses a chatbot provided by the company toytalk, which previously used the chatbot for a range of smartphone-based characters for children.
YOU: Hello
BOT: hello.
YOU: Hello
BOT: How do you do. Please state your problem.
YOU: bye
$

Check out the config file nlpia-bot.ini, or run $ bot -h, to see how to change the default bot personalities loaded for your own personalized bot.

Approach

This hybrid chatbot framework allows you to combine 4 approaches to give you state-of-the-art capability to answer questions and carry on a conversation:

1. search: [chatterbot](https://github.com/gunthercox/ChatterBot), [will](https://github.com/skoczen/will)
2. pattern matching and response templates: Alexa, [AIML](https://github.com/keiffster/program-y)
3. generative deep learning: [robot-bernie](https://github.com/nlpia/robot-bernie), [movie-bot](https://github.com/totalgood/nlpia/blob/master/src/nlpia/book/examples/ch10_movie_dialog_chatbot.py)
4. grounding: [snips](https://github.com/snipsco/snips-nlu)

It's all explained in detail at NLP in Action.
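
As an illustration of approach 2 (pattern matching and response templates), the sketch below pairs regular expressions with reply templates. The patterns, templates, and the name `pattern_reply` are hypothetical and are not taken from pattern_bots.py.

```python
import re

# Hypothetical (pattern, template) pairs, checked in order.
PATTERNS = [
    (re.compile(r"\bmy name is (?P<name>\w+)", re.IGNORECASE),
     "Nice to meet you, {name}."),
    (re.compile(r"^\s*(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! How can I help?"),
    (re.compile(r"^\s*(bye|goodbye)\b", re.IGNORECASE),
     "Goodbye!"),
]


def pattern_reply(statement):
    """Return the filled-in template for the first matching pattern, else None."""
    for pattern, template in PATTERNS:
        match = pattern.search(statement)
        if match:
            # Named groups captured by the regex fill the template's slots.
            return template.format(**match.groupdict())
    return None


print(pattern_reply("Hi, my name is Ada"))
print(pattern_reply("What is an allele?"))
```

Returning None when nothing matches lets a caller fall through to one of the other approaches, such as search.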

Presentations for San Diego Python User Group are in [docs/](/docs/2019-08-22--San Diego Python User Group -- How to Build a Chatbot.odp) and on the web at http://totalgood.org/midata/talks

Contributors (alphabetically)

DM @hobson if you would like to participate in the weekly Zoom collaborative-programming sessions.

  • Erturgrul: Turkish wikipedia QA bot (parul bot)
  • Hobson (@hobson): infrastructure (CI, webapp) and framework features (nltk->spacy, USE vectors)
  • Kendra (@kchalk): semantic search
  • Maria Dyshell (tangibleai.com): student and career coaching
  • Mohammed Dala (@dala85): django web application
  • Nima (@hulkgeek): question answering bot based on his state-of-the-art question classifier
  • Olesya: ElasticSearch and nboost
  • Prarit (@praritlamba): BERT
  • Travis (@travis-harper): Markov chain reply selection and other data science enhancements
  • Xavier (@spirovanni): employment counselor for workforce.org and the city of San Diego
  • YOU: What AI idea would you like to make a reality?

Crazy Ideas

Please submit your feature ideas as GitHub issues. Here are a few ideas to get you started.

  1. movie dialog in django database to hold the statement->response pairs
    1. graph schema compatible with MxGraph (draw.io) and other js libraries for editing graphs/flow charts.
    2. ubuntu dialog corpus in db
    3. mindfulness faq corpus in db
    4. famous quotes as responses to the statement "tell me something inspiring"
    5. jokes for "tell me a joke"
    6. data science faq
    7. nlpia faq
    8. psychology/self-help faq
  2. html django template so there is a web interface to the app rather than just the command line command bot
  3. use Django Rest Framework to create a basic API that returns json containing a reply to any request sent to the local host url, like http://localhost:8000/api?statement='Hello world' might return {'reply': 'Hello human!'}
  4. have the command line app use the REST API from #3 rather than the slow reloading of the csv file every time you talk to the bot
  5. use database full text search to find appropriate statements in the database that we have a response for
  6. use semantic search instead of text similarity (full text search or fuzzywuzzy text matches)
    1. add embedding vectors (300D document vectors from spacy) to each statement and response in the db
    2. create a semantic index of the document vectors using annoy so "approximate nearest neighbors" (semantic matches) can be found quickly
    3. load the annoy index of the document vectors every time the server is started and use it to find the best reply in the database.
    4. use universal sentence encodings instead of docvecs from spacy.
  7. create a UX for dialog graph creation/design:
    1. install mxgraph in the django app
    2. create a basic page based on this mxgraph example so the user can build and save dialog to the db as a graph: tutorial, example app
    3. convert the dialog graph into a set of records/rows in the nlpia-bot db so it acts
  8. tag different dialog graphs in the db so the user can turn them on/off for their bot
    1. allow the user to prioritize some dialogs/models over others
    2. allow the user to create their own weighting function to prioritize individual statements produced by the api
  9. train a character-based generative model
    1. decoder half of autoencoder to generate text based on docvecs from spacy
    2. decoder part of autoencoder to generate text based on universal sentence encodings
    3. train model to generate reply embeddings (doc vecs and/or USE vecs) using statement embeddings (dialog engine encoder-decoder using docvecs or USE vecs for the encoder half)
  10. add a therapy/mindfulness-coach feature to respond with mindfulness ideas to some queries/statements
  11. add the "translate 'this text' to spanish" feature
    1. train character-based LSTM models on english-spanish, english-french, english-german, english<->whatever
    2. add module for this to the django app/api
  12. AIML engine fallback
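
The semantic search idea in item 6 can be sketched with a brute-force cosine-similarity lookup, which is the exact computation that an approximate-nearest-neighbor index such as annoy would speed up. This sketch uses plain Python rather than spacy or annoy, and the 3-D vectors, replies, and function names are toy assumptions standing in for real 300-D document vectors and database rows.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_reply(query_vec, index):
    """Return the reply whose statement vector is most similar to `query_vec`.

    `index` is a list of (statement_vector, reply) tuples (a hypothetical layout).
    """
    _, reply = max(index, key=lambda item: cosine_similarity(query_vec, item[0]))
    return reply


# Toy vectors invented for illustration, not real embeddings.
index = [
    ([1.0, 0.0, 0.0], "Hello human!"),
    ([0.0, 1.0, 0.0], "Here is something inspiring."),
    ([0.0, 0.0, 1.0], "Here is a joke."),
]
print(best_reply([0.9, 0.1, 0.0], index))
```

The brute-force scan compares the query against every stored statement, so it grows linearly with the size of the database, which is the motivation for building an index once at server start.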

Inspiration

A lot of the patterns and ideas were gleaned from other awesome prosocial chatbots and modular open source frameworks.

Mental Health Coaches

Open Source Frameworks

  • will
    • lang: python
    • web: zeromq
    • db: redis, couchbase, flat file, user-defined
    • integrations: hipchat, rocketchat, shell, slack
  • ai-chatbot-framework
    • lang: python
    • web: flask
    • orm: flask?
    • db: mongodb
    • nice general json syntax for specifying intent/goals for conversation manager (agent)
  • rasa
    • lang: python
    • web: sanic (async)
    • orm: sqlalchemy
    • db: sqlite
    • rich, complex, mature framework
  • botpress
    • javascript (typescript)
    • meta-framework allowing you to write your own modules in javascript
  • Program-Y
    • python
    • web: flask (rest), sanic (async)
    • db: aiml flat files (XML)
    • integrations: facebook messenger, google search, kik, line, alexa, webchat, viber
