Code samples for CASCON 2017 paper & demo:
Using IBM Watson cloud services to analyze chat conversations, forum posts, and social media
Sarah Packowski
spackows@ca.ibm.com
If you have questions, feedback, or suggestions, I would love to hear from you!
See prerequisites here.
Note: You don't have to meet any of these prerequisites to have fun... The `sample-output` directory contains all the output you would have generated if you ran the code yourself, including the HTML dashboard and the clustering results in .pdf files. :)
Work through the files in the `sample-code` directory in numerical order. The file names indicate what each one does. The files start small and simple, and get more complex as you go.
- All of the NLC files (01 - 04, and 10) require you to paste a URL, username, and password for your NLC service at the top of the file. You can retrieve those details from the service details page for your NLC service in the IBM Cloud dashboard. (See the first sketch after this list.)
- All of the NLU files (05 - 10) require you to paste a username and password for your NLU service at the top of the file. You can retrieve those credentials from the service details page for your NLU service in the IBM Cloud dashboard. (See the second sketch after this list.)
- File 14 requires you to paste a host name, port number, username, and password for your Db2 Warehouse on Cloud service at the top of the file. You can retrieve those credentials from the service details page for your Db2 service in the IBM Cloud dashboard. (A sketch of the REST pattern file 14 uses appears after the R script list below.)
- The NLC files that perform classification (03, 04, and 10) require you to paste the classifier ID at the top of the file. File 01 creates the classifier using training data from the `sample-data` directory. You can retrieve the classifier ID from the output of file 01 or file 02, or from the Watson NLC toolkit, which you can launch from the service details page for your NLC service in the IBM Cloud dashboard.
- The NLU files that use a custom language model (07 - 10) require you to create a custom language model and then paste the model ID at the top of the file. Steps for creating the custom language model are here. You can retrieve the model ID from Watson Knowledge Studio after you deploy the custom language model, or from the output of file 07.
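To make the credentials-pasting concrete, here is a minimal sketch of the pattern the NLC files follow, assuming the watson_developer_cloud Python SDK (2017-era API). The training-data file name, classifier name, and sample text below are hypothetical illustrations, not values from this repo:

```python
# Sketch only: assumes the watson_developer_cloud Python SDK (2017-era API).
from watson_developer_cloud import NaturalLanguageClassifierV1

# Paste these from the service details page in the IBM Cloud dashboard:
NLC_URL      = '<your NLC URL>'
NLC_USERNAME = '<your NLC username>'
NLC_PASSWORD = '<your NLC password>'

nlc = NaturalLanguageClassifierV1(url=NLC_URL,
                                  username=NLC_USERNAME,
                                  password=NLC_PASSWORD)

# What file 01 does, roughly: create a classifier from training data
# ('training-data.csv' is a hypothetical file name).
with open('training-data.csv', 'rb') as training_data:
    classifier = nlc.create(training_data=training_data,
                            name='comments-classifier')
print(classifier['classifier_id'])  # paste this ID into files 03, 04, and 10

# What files 03, 04, and 10 do, roughly: classify text with that ID.
CLASSIFIER_ID = '<your classifier ID>'
print(nlc.classify(CLASSIFIER_ID, 'The app crashes when I log in'))
```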
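Similarly, here is a rough sketch of how the NLU files use the pasted credentials and the custom language model ID, again assuming the watson_developer_cloud Python SDK (import paths differ between SDK versions). The sample text is made up:

```python
# Sketch only: assumes watson_developer_cloud 1.x; older releases import
# the feature options differently.
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import (
    Features, EntitiesOptions)

nlu = NaturalLanguageUnderstandingV1(
    version='2017-02-27',
    username='<your NLU username>',
    password='<your NLU password>')

# Files 07 - 10 pass the custom model ID deployed from Watson Knowledge
# Studio so that entity extraction uses your model instead of the default.
CUSTOM_MODEL_ID = '<your custom language model ID>'
result = nlu.analyze(
    text='My laptop battery drains way too fast.',
    features=Features(entities=EntitiesOptions(model=CUSTOM_MODEL_ID)))
print(result)
```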
The short videos in the `demo-videos` directory show me working my way through these files and creating the custom language model, just like you would.
See the 8-step instructions here.
File 13 generates an R script. You can run that R script in multiple ways:
- Install RStudio locally. It's easy, and there's a free version, so that really is a good way to go.
- Sign up for a free trial of Data Science Experience and then paste the R from the script generated by file 13 into a notebook.
- Provision an instance of Db2 Warehouse on Cloud and then use the REST API to run R scripts on the Db2 Warehouse on Cloud server. File 14 demonstrates how to do this (see the sketch below).
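For the Db2 Warehouse on Cloud route, the general shape of what file 14 does looks something like the following sketch, using the Python requests library. The endpoint paths, port, payload fields, and script file name here are assumptions for illustration; check the Db2 Warehouse on Cloud REST API documentation (or file 14 itself) for the exact routes:

```python
# Sketch only: the endpoint paths ('/auth/tokens', '/rscript'), the port,
# and the payload fields are placeholders, not verified routes.
import requests

# Paste these from the service details page in the IBM Cloud dashboard:
DB2_HOST     = '<your Db2 host name>'
DB2_PORT     = 8443  # assumption: the console/API HTTPS port
DB2_USERNAME = '<your Db2 username>'
DB2_PASSWORD = '<your Db2 password>'

base_url = 'https://{}:{}/dbapi/v4'.format(DB2_HOST, DB2_PORT)

# Authenticate to get a bearer token (hypothetical route).
auth = requests.post(base_url + '/auth/tokens',
                     json={'userid': DB2_USERNAME, 'password': DB2_PASSWORD})
headers = {'Authorization': 'Bearer ' + auth.json()['token']}

# Ask the server to run the R script generated by file 13
# ('clustering.R' is a hypothetical file name, '/rscript' a placeholder route).
with open('clustering.R') as f:
    job = requests.post(base_url + '/rscript', headers=headers,
                        json={'script': f.read()})
print(job.json())
```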
The comments in the `sample-data` directory and the `custom-language-model/document-set` directory are not actual user comments collected from anywhere. I made them up for this sample (but I tried to write them in the same style as typical user comments I've seen). Do not infer any real-world business meaning about any company or products from the results in the dashboard or the clusters.
What's shown in these files is not "best practice" or the recommended way to do things. Instead, these files are just a fast way to step through using these services and to see the value you can get (e.g., the dashboard and the clusters). For example, there are MUCH better ways to normalize results for natural language understanding projects, but the kludge here (file 12) requires no extra tools or effort.
This is meant to be fun and to inspire you to create solutions for processing your comments, questions, and chat convos. :)