Naive context sentiment analysis

Aim

This R script addresses a problem shared by many sentiment analysis scripts: they ignore valence shifters (e.g. "hardly difficult", "not great at all"). For a detailed outline of the issue, see trinker's argument and the sentimentr package.

The sentimentr package does a remarkable job of handling valence shifters, but it requires 'good' text data that is properly punctuated, because the valence shifter weighting is done on "polarized context clusters" within sentences (i.e., you get one sentiment value per sentence).

Many text data are not suitable for that pipeline because they are

not punctuated at all (e.g., auto-generated YouTube transcripts)
badly punctuated (e.g., data from blogs where punctuation is not necessarily a given)
very brief (e.g., Twitter data, which would return only one or two sentiment values even if properly annotated for sentence boundary disambiguation)

Why "naive context sentiment analysis"

Our approach is based on the sentimentr idea of creating a "cluster" around sentiments. Within that cluster, we look for valence shifters (taken from the brilliant lexicon package), weight the original sentiment accordingly, and return a vector of sentiment values of length v (where v is the number of tokens that are not punctuation marks).

Our approach does not rely on sentences and punctuation and is therefore "naive" towards the broader structure of texts.
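
The idea described above can be sketched in a few lines. The following is a minimal illustration written in Python, not the R script itself. The toy lexicon, the shifter lists, the window sizes (before, after) and the weights (amp, deamp) are all assumptions made for this example; the script takes its valence shifters from the lexicon package.

```python
import re

# Toy data for illustration only (hypothetical values, not the lexicon package).
SENTIMENT = {"great": 1.0, "good": 0.75, "difficult": -0.5, "bad": -0.75}
NEGATORS = {"not", "never", "no"}
AMPLIFIERS = {"very", "really"}
DEAMPLIFIERS = {"hardly", "barely"}


def tokenize(text):
    """Lower-case the text and keep word tokens only, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def naive_context_sentiment(text, before=2, after=2, amp=1.5, deamp=0.5):
    """Return the tokens and one sentiment value per token."""
    tokens = tokenize(text)
    scores = [0.0] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok not in SENTIMENT:
            continue  # non-sentiment tokens keep a value of 0
        # The cluster is a fixed window of neighbouring tokens;
        # sentence boundaries play no role.
        cluster = tokens[max(0, i - before):i] + tokens[i + 1:i + 1 + after]
        weight = 1.0
        for neighbour in cluster:
            if neighbour in NEGATORS:
                weight *= -1.0   # flip the polarity
            elif neighbour in AMPLIFIERS:
                weight *= amp    # strengthen the sentiment
            elif neighbour in DEAMPLIFIERS:
                weight *= deamp  # weaken the sentiment
        scores[i] = SENTIMENT[tok] * weight
    return tokens, scores
```

With these toy values, naive_context_sentiment("not great at all") returns the tokens ['not', 'great', 'at', 'all'] and the scores [0.0, -1.0, 0.0, 0.0]: one value per non-punctuation token, with the negator flipping the polarity of "great".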

Note: We are still developing this tool.

Development wish list

speed improvements (in particular in the length standardisation, e.g. by switching to a different discrete cosine transformation or to a Fourier transformation)
multi-dimensionality implementation for other lexicon-based approaches (needed: "lexicon" as function parameter)
multi-language support (needs lexicon databases in different languages)
Python implementation
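
To make the length standardisation item more concrete: this README does not describe how the script implements that step, so the following Python sketch only illustrates one common way to do it, a low-pass discrete cosine transformation that smooths a sentiment vector and resamples it to a fixed length. The function name dct_standardise and the parameters out_len and low_pass are hypothetical. The plain loops are written for readability, not speed.

```python
import math


def dct_standardise(values, out_len=100, low_pass=5):
    """Smooth `values` with a low-pass DCT and resample to `out_len` points."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    k_max = min(low_pass, n)
    # Forward DCT-II, keeping only the first k_max (lowest-frequency) coefficients.
    coeffs = [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(k_max)
    ]
    out = []
    for j in range(out_len):
        t = (j + 0.5) / out_len  # relative position in the text, between 0 and 1
        # Inverse transform (DCT-III) evaluated at position t.
        s = coeffs[0] / 2 + sum(
            c * math.cos(math.pi * t * k) for k, c in enumerate(coeffs) if k > 0
        )
        out.append(2 * s / n)
    return out
```

Texts of different lengths can then be compared on a common axis, for example by calling dct_standardise(scores, out_len=100) on each sentiment vector.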
