The hands-on NLTK tutorial for NLP in Python

hb20007/hands-on-nltk-tutorial

Hands-On NLTK Tutorial

The hands-on NLTK tutorial in the form of Jupyter notebooks

NLTK is one of the most popular Python packages for Natural Language Processing (NLP).

Index of Jupyter Notebooks

Notebooks
1.1 Downloading Libs and Testing That They Are Working
Getting ready to start!
1.2 Text Analysis Using nltk.text
Extracting interesting data from a given text
2.1 Deriving N-Grams from Text
Creating n-grams (for language classification)
2.2 Detecting Text Language by Counting Stop Words
A simple way to find out what language a text is written in
2.3 Language Identifier Using Word Bigrams
State-of-the-art language classifier
3.1 Bigrams, Stemming and Lemmatizing
NLTK makes bigrams, stemming and lemmatization super-easy
3.2 Finding Unusual Words in Given Language
Which words do not belong with the rest of the text?
3.3 Creating a POS Tagger
Creating a part-of-speech (POS) tagger
3.4 Parts of Speech and Meaning
Exploring awesome features offered by WordNet
4.1 Name Gender Identifier
Building a classifier that guesses the gender of a name
4.2 Classifying News Documents into Categories
Building a classifier that guesses the category of a news item
5.1 Sentiment Analysis
Is a movie review positive or negative?
5.2 Sentiment Analysis with nltk.sentiment.SentimentAnalyzer and VADER tools
More sentiment analysis!
6.1 Twitter Stream (and Cleaning Tweets)
Live-stream tweets from Twitter
6.2 Twitter Search
Search through past tweets
7.1 NLTK with the Greek Script
Using NLTK with non-Latin scripts
8.1 The langdetect and langid Libraries
Useful libraries for language identification
8.2 Word2Vec (gensim)
Google's Word2vec
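
To give a flavour of the kind of technique the notebooks cover, here is a minimal sketch of the idea behind notebook 2.2: guess a text's language by counting how many of its words are stop words in each candidate language. This is not the notebook's code. It is plain Python, and the tiny stop-word sets and the names `STOP_WORDS`, `stop_word_counts` and `guess_language` are assumptions made up for this illustration. A fuller version would more likely draw its word lists from NLTK's stopwords corpus (`nltk.corpus.stopwords`).

```python
import re

# Tiny, hand-picked stop-word sets (an assumption for this sketch only;
# a real run would use much fuller lists, e.g. NLTK's stopwords corpus).
STOP_WORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "it", "that"},
    "french": {"le", "la", "et", "est", "de", "les", "un", "une"},
    "german": {"der", "die", "und", "ist", "das", "ein", "nicht", "zu"},
}


def stop_word_counts(text):
    """Return, per language, how many tokens of `text` are stop words."""
    tokens = re.findall(r"\w+", text.lower())
    return {
        lang: sum(token in words for token in tokens)
        for lang, words in STOP_WORDS.items()
    }


def guess_language(text):
    """Pick the language with the most stop-word hits, or None if there are none."""
    counts = stop_word_counts(text)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


if __name__ == "__main__":
    print(guess_language("The cat is in the garden and it is happy"))
    print(guess_language("Le chat est dans le jardin et la maison est grande"))
```

The approach is deliberately crude: it needs no training data, but it can only tell apart the languages it has word lists for, and very short texts may contain no stop words at all (the sketch returns `None` in that case).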

Meta

H. Z. Sababa — hb20007 — hzsababa@outlook.com

Distributed under the MIT license. See LICENSE for more information.