- Data scientist -> lecturer
- Large-scale AgTech expert
- Modern dancer
- How to convert text to quantitative data
- Common processes and metrics used for text data
- A bit of Python, plus caveats about the scikit-learn package
- Extract meaning from text
- Linguistics
- Machine learning
A better map is presented by . This does not include the recent "NLP" work though.
- Working with Text in Python with Jupyter Notebooks
- Common preprocessing steps
- Quantification of text and common metrics
- Apply it on some text
- Nonstationarity: meaning changes quickly over time, e.g. #cancel
- Multiple features can be extracted from the same text
- Preprocessing steps
- Tokenization
- Lemmatization
- Quantification of text
- Bag of words
- Metrics for text
- Term Frequency-Inverse Document Frequency (TF-IDF)
- "Distance" between documents
- Cosine similarity
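
The pipeline outlined above (tokenization, bag of words, TF-IDF, cosine similarity) can be sketched in plain Python. This is a minimal illustration, not the talk's own code: the function names and the three sample documents are made up for the example, the tokenizer is a simple regex, and the TF-IDF weighting is the textbook `tf * log(N / df)` form. Lemmatization is left out because it usually relies on a library such as NLTK or spaCy.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(tokens):
    """Count how often each token occurs (word order is discarded)."""
    return Counter(tokens)


def tf_idf(docs_tokens):
    """Return one {token: weight} dict per document.

    Weight = (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each token once per document

    vectors = []
    for tokens in docs_tokens:
        counts = bag_of_words(tokens)
        total = len(tokens)
        vectors.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, invented for this sketch.
docs = [
    "Data scientist with Python experience",
    "Senior data scientist, Python and SQL",
    "Modern dance instructor",
]
tokens = [tokenize(d) for d in docs]
vectors = tf_idf(tokens)

sim_close = cosine_similarity(vectors[0], vectors[1])  # share several tokens
sim_far = cosine_similarity(vectors[0], vectors[2])    # share no tokens
print(sim_close, sim_far)
```

The first two documents share tokens, so their similarity is positive, while the third shares none with the first and scores zero. Note that scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2-normalizes each row by default, so its numbers will differ from this unsmoothed version.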
Indeed.com helps candidates find job postings