Update analyzing-multilingual-text-nltk-spacy-stanza.md
Minor adjustments/typing corrections via @lachapot.
anisa-hawes authored Oct 23, 2024
1 parent f19bb9f commit 48dd547
Showing 1 changed file with 2 additions and 2 deletions.
@@ -44,7 +44,7 @@ You will need to install Python3 as well as the NLTK, spaCy, and Stanza librarie

Computational text analysis is a broad term that encompasses a wide variety of approaches, methodologies, and Python libraries, all of which can be used to handle and analyze digital texts of all scales. Harnessing computational methods allows you to quickly complete tasks that are far more difficult to perform without these methods. For example, the part-of-speech tagging method described in this lesson can be used to quickly identify all verbs and their associated subjects and objects across a corpus of texts. This could then be used to develop analyses of agency and subjectivity in the corpus (as, for example, in Dennis Tenen's article [Distributed Agency in the Novel](https://doi.org/10.1353/nlh.2022.a898333)).

- In addition to the methods we cover in this lesson, other commonly-performed tasks which simplified by computation include sentiment analysis (which provides a quantitative assessment of the sentiment of a text, generally spanning a numerical scale indicating positivity or negativity) and Named Entity Recognition (NER) (which recognizes and classifies entities in a text into categories such as place names, person names, and so on).
+ In addition to the methods we cover in this lesson, other commonly-performed tasks which are simplified by computation include sentiment analysis (which provides a quantitative assessment of the sentiment of a text, generally spanning a numerical scale indicating positivity or negativity) and Named Entity Recognition (NER) (which recognizes and classifies entities in a text into categories such as place names, person names, and so on).

For further reading on these methods, please see the _Programming Historian_ lessons [Sentiment Analysis for Exploratory Data Analysis](/en/lessons/sentiment-analysis) and [Sentiment Analysis with 'syuzhet' using R](/en/lessons/sentiment-analysis-syuzhet) for sentiment analysis, and [Finding Places in Text with the World Historical Gazetteer](/en/lessons/finding-places-world-historical-gazetteer) and [Corpus Analysis with spaCy](/en/lessons/corpus-analysis-with-spacy) for Named Entity Recognition. The lesson [Introduction to Stylometry with Python](/en/lessons/introduction-to-stylometry-with-python) may be of interest to those looking to further explore additional applications of computational text analysis.

@@ -909,7 +909,7 @@ Output:

As we can see, lemmatizing the sentences has replaced our words with their dictionary, non-inflected forms. The verb _vois_ in the French sentence, for example, was replaced with its infinitive _voir_, and the Russian _говорила_ was replaced with its infinitive _говорить_.

- This process if helpful when you want to identify all instances of a particular verb in a text: for example, if we were interested in examining themes of seeing and vision in the text, lemmatization would allow us to reliably identify every time the lemma _voir_ occurs, without worrying about all its possible conjugated forms. For this same reason, lemmatization therefore also makes counting word frequencies much easier.
+ This process is helpful when you want to identify all instances of a particular word in a text: for example, if we were interested in examining themes of seeing and vision in the text, lemmatization would allow us to reliably identify every time the lemma _voir_ occurs, without worrying about all its possible conjugated forms. For this same reason, lemmatization therefore also makes counting word frequencies much more accurate, especially for highly inflected languages.
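
The frequency-counting point can be illustrated with a minimal sketch. It assumes a small hand-written lookup table (`TOY_LEMMAS`, hypothetical) in place of a real lemmatizer such as those in spaCy or Stanza, and a made-up token list, so it only shows why counting by lemma groups inflected forms together:

```python
from collections import Counter

# Hypothetical, hand-written lemma table standing in for a real lemmatizer.
# A real pipeline would obtain lemmas from a trained language model instead.
TOY_LEMMAS = {
    "vois": "voir",
    "voit": "voir",
    "vu": "voir",
    "говорила": "говорить",
    "говорит": "говорить",
}


def lemma_counts(tokens):
    """Count tokens by lemma; tokens missing from the table count as themselves."""
    return Counter(TOY_LEMMAS.get(t.lower(), t.lower()) for t in tokens)


# Illustrative tokens only, not taken from the lesson's corpus.
tokens = ["Je", "vois", "ce", "que", "tu", "as", "vu", "et", "elle", "voit", "aussi"]
counts = lemma_counts(tokens)
print(counts["voir"])  # three inflected forms are counted under one lemma
```

Counting the raw tokens instead would split these occurrences across _vois_, _vu_, and _voit_, which is why lemma-based counts tend to be more reliable for highly inflected languages.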

## Conclusion
