This repository is a central part of the PhD thesis written by José Calvo Tello within the CLiGS project between 2015 and 2020 in Würzburg, Germany. The title of the thesis is The Novel in the Spanish Silver Age: A Digital Analysis of Genre through Machine Learning, and it will be published soon by transcript.
This repository contains three folders:
- scripts: This folder contains Python scripts (.py) with general functions that I have used in several notebooks.
- notebooks: This folder contains the Jupyter Notebook files. I created a Notebook for every chapter of my thesis that required programming. Inside each Notebook, you will find the same headers as in the prose of the thesis. The Notebooks are therefore meant as a companion to the thesis, not as autonomous files. In other words, if you are reading the thesis, the Notebooks give you more information; but you cannot read only the Notebooks and get a complete picture of the thesis.
- visualizations: This folder contains all figures shown in the dissertation and the graphs for each subgenre analyzed in the research and described in the Appendix.
For an overview of this type of document, I recommend:
- VanderPlas, J. (2016). Python Data Science Handbook: Essential Tools for Working with Data. First edition. Beijing Boston Farnham: O’Reilly.
- Quinn Dombrowski, Tassie Gniady, and David Kloster, "Introduction to Jupyter Notebooks," The Programming Historian 8 (2019), https://doi.org/10.46430/phen0087.
I have worked with the Corpus of the Novel of the Spanish Silver Age (CoNSSA). This corpus contains 358 novels (22 million tokens) by Spanish authors, published between 1880 and 1939 and encoded in TEI. A section of the corpus cannot be published due to copyright issues. I have published the majority of the text, the linguistic annotation, and the metadata from all the novels here:
- Corpus of Novels of the Spanish Silver Age (CoNSSA), by Calvo Tello, José. University of Würzburg, 2021. https://github.com/cligs/conssa, https://doi.org/10.5281/zenodo.4674257.
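Since the corpus is encoded in TEI, a single file can be inspected with the Python standard library alone. The following is only a minimal sketch, not the code used in the thesis: the sample document, its title and author, and the function name `summarize_tei` are hypothetical, and the element paths assume a conventional TEI header and body, which may differ from the actual CoNSSA files.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is standard; everything else below is illustrative.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature TEI document (not taken from CoNSSA).
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Hypothetical Novel</title>
        <author>Hypothetical Author</author>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="chapter">
        <p>Primera frase de ejemplo.</p>
        <p>Segunda frase.</p>
      </div>
    </body>
  </text>
</TEI>"""


def summarize_tei(xml_string):
    """Return title, author, paragraph count and a rough token count."""
    root = ET.fromstring(xml_string)
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS)
    author = root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=TEI_NS)
    paragraphs = root.findall(".//tei:body//tei:p", TEI_NS)
    # Join the text of all paragraphs, including text inside nested elements.
    text = " ".join("".join(p.itertext()) for p in paragraphs)
    return {
        "title": title,
        "author": author,
        "paragraphs": len(paragraphs),
        # Whitespace splitting is a crude stand-in for real tokenization.
        "tokens": len(text.split()),
    }


summary = summarize_tei(SAMPLE_TEI)
print(summary)
```

To try this on a real file from the published corpus, read the file content as a string and pass it to the function; the paths may need adjusting to the corpus's actual header structure.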
The research is not fully replicable, although that would be very cool. In 2018, I decided to use Notebooks for the PhD. However, full replicability was never part of my plans, nor was it part of the feedback from my tutors. Besides, the fact that not all the data can be published hinders full replicability. Furthermore, a PhD is a piece of research carried out over a very long time, with many sections, by a single person. Honestly, I doubt that full replicability is possible for a PhD, and I do not know what costs it would involve or how it could be implemented in practice.
Quite. Notebooks of this kind can nowadays be found in a number of publications, although they are still a minority, at least in the (Digital) Humanities. For PhDs, however, there are very few examples. In fact, I have found only one:
- Dobson, James E. 2019. Critical Digital Humanities. Topics in the Digital Humanities. Urbana, Chicago, and Springfield: University of Illinois Press.
If you know of further examples, especially from the (Digital) Humanities, please contact me.
I know that in many cases I do not follow best coding practices. I am not a programmer, nor have I studied Computer Science. In fact, when I started the PhD, I did not know any Python. I tried my best, and you can see that my coding skills evolved throughout the process. If you have constructive feedback about my code, please feel free to contact me. However, I do not plan to maintain these repositories.