Start by cloning the Git repository:
git clone https://github.com/lena-will/master-thesis.git
- The text data are articles from the German newspaper "Frankfurter Allgemeine Sonntagszeitung", starting in 2001 and running through April 2024. The data can be acquired from the "Frankfurter Allgemeine Zeitung".
- Previously dated German recessions in `recessions_germany.csv` are based on the business cycle dating of the German Council of Economic Experts (see https://www.sachverstaendigenrat-wirtschaft.de/en/topics/business-cycles-and-growth/konjunkturzyklus-datierung.html).
- The macroeconomic indicators for Germany are available via the following sources:
- Quarterly GDP
- IP Index (Produktion im produzierenden Gewerbe ohne Bau)
- Economic Sentiment Index
- CPI
- Vacancies
- Long-term Interest Rate (10 years)
- Short-term Interest Rate (3 months)
- Text pre-processing is done in Python and can be found in `preprocessing.py`.
- For computational efficiency, the LDA is run using the R package `topicmodels`, which is built in C.
- However, `LDA_gibbs_sampling_algorithm.R` offers code to run inference for Latent Dirichlet Allocation from scratch, using the Gibbs sampling algorithm introduced by Griffiths and Steyvers (2004).
- Code for any plots can be found in the `plots code` folder.
- All functions for the weekly bridge models are in the `functions` folder.
- All main files to run the weekly models, both the baseline models and those with the text-based indicators, are in the `nowcasting` folder.
- Files starting with `LDA_` hold the code for different LDA specifications and/or extensions.
- `date_week_mapping.py` takes publication dates as inputs and returns the week of the quarter of a given year.
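
To illustrate the kind of mapping described for date_week_mapping.py, here is a minimal sketch. It is not the code from the repository: the function name `week_of_quarter` is hypothetical, and it assumes that week 1 starts on the first calendar day of the quarter and that each week spans seven days. The actual script may count weeks differently (for example, by ISO calendar weeks).

```python
from datetime import date


def week_of_quarter(d: date) -> tuple:
    """Return (year, quarter, week_of_quarter) for a publication date.

    Assumption (not taken from the repository): week 1 begins on the first
    calendar day of the quarter and every week spans seven days, so the
    last week of a quarter may be shorter than seven days.
    """
    quarter = (d.month - 1) // 3 + 1                   # 1..4
    quarter_start = date(d.year, 3 * (quarter - 1) + 1, 1)
    week = (d - quarter_start).days // 7 + 1           # 1-based week index
    return d.year, quarter, week


# Example: a Sunday edition published on 14 April 2024.
print(week_of_quarter(date(2024, 4, 14)))  # (2024, 2, 2)
```

Under this convention, every article can be assigned to a week within its quarter, which is the granularity the weekly models need.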
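
For readers who want a feel for the collapsed Gibbs sampler of Griffiths and Steyvers (2004) before opening LDA_gibbs_sampling_algorithm.R, the following is a small, self-contained Python sketch. It is not a port of the R code in this repository. The function name `lda_gibbs`, its parameters, the default hyperparameters, and the toy corpus are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns (theta, phi): document-topic and topic-word distributions.
    """
    rng = random.Random(seed)
    n_dk = [[0] * n_topics for _ in docs]                # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(n_topics)]   # word counts per topic
    n_k = [0] * n_topics                                 # total tokens per topic
    z = []                                               # topic of every token

    # Random initialisation of the topic assignments.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(n_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional for each topic, up to a constant factor.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                # Draw a new topic and add the token back.
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Point estimates from the final state of the chain.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta) for w in range(vocab_size)]
        for k in range(n_topics)
    ]
    return theta, phi


# Toy corpus with a vocabulary of four word ids (purely illustrative).
toy_docs = [[0, 1, 0, 1, 1], [2, 3, 3, 2, 2], [0, 0, 1, 3, 2]]
theta, phi = lda_gibbs(toy_docs, n_topics=2, vocab_size=4)
```

A pure-Python loop like this is far slower than compiled code, which is why a package such as topicmodels is the practical choice for the full newspaper corpus.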