CSSR-Project.rmd

---
title: |
  | Drink and Thrive?
  | The economics of binge-drinking in Germany
subtitle: "CSSR Final Project"
author: "Alexander Sacharow & Torben Klausa"
header-includes:
    - \usepackage{fancyhdr}
    - \pagestyle{fancy}
    - \fancyhead[LO,LE]{Drink and Thrive? The economics of binge-drinking in Germany}
    - \fancyfoot[LO,LE]{CSSR Project}
    - \fancyfoot[RE,RO]{Torben Klausa and Alexander Sacharow}
date: "December 16, 2016"
output: 
  pdf_document:
    toc: true
    number_sections: true
bibliography:
    -  literature.bib
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)

# Clear Global environment
rm(list=ls())

# Setting Working directory
try(setwd("/home/torben/GIT/Pair_Assignment_2"), silent = TRUE)
try(setwd("D:/Eigene Datein/Dokumente/Uni/Hertie/Materials/Collaborative Social Science Data Analysis/CSSR_Project"), silent = TRUE)

source("main.R")

# Collect packages/libraries we need:
packages <- c("fontcm", "psych", "plotly")

# install packages if not installed before
for (p in packages) {
  if (p %in% installed.packages()[,1]) {
    require(p, character.only=T)
  }
  else {
    install.packages(p, repos="http://cran.rstudio.com", dependencies = TRUE)
    require(p, character.only=T)
  }
}
rm(p, packages)

# Initiate Label Names
statelist_code <- c("DE-BW", "DE-BY", "DE-BE", "DE-BB", "DE-HB", "DE-HH", 
                    "DE-HE", "DE-MV", "DE-NI", "DE-NW", "DE-RP", "DE-SL",
                    "DE-SN", "DE-ST", "DE-SH", "DE-TH")

statelist_name <- c("Baden-Württemberg", "Bayern", "Berlin", "Brandenburg", 
                    "Bremen", "Hamburg", "Hessen", "Mecklenburg-Vorpommern", 
                    "Niedersachsen", "Nordrhein-Westfalen", "Rheinland-Pfalz", 
                    "Saarland", "Sachsen", "Sachsen-Anhalt", 
                    "Schleswig-Holstein", "Thüringen")
statelist.code.short <- substr(statelist_code, 4, 5)
statelist_cities <- c("DE-BE", "DE-HB", "DE-HH")
statelist_other <- c("DE-BW", "DE-BY", "DE-BB", "DE-HE", "DE-MV", "DE-NI", 
                     "DE-NW", "DE-RP", "DE-SL", "DE-SN", "DE-ST", "DE-SH",
                     "DE-TH")
statelist_other_wo_treatment <- c("DE-BY", "DE-BB", "DE-HE", "DE-MV", "DE-NI",
                                  "DE-NW", "DE-RP", "DE-SL","DE-SN", "DE-ST", 
                                  "DE-SH", "DE-TH")
```

# Introduction

Alcohol has a history of causing both joy and pain. In our research we want to focus on the latter: Alcohol, if consumend in high amounts and/or regularly over a longer period of time, does severe damage to individuals' health [see e.g. @WHO.2014; @Miller.2007]. It is therefore a policy concern not only to identify reasons and motivation for excessive alcohol consumption but also to develop measures to reduce the public health costs associated to this problem.

It is the aim of this paper to identify factors interacting with alcohol related health problems and to assess alcohol policy interventions against it. Using the example of all German federal states and time series data for different age groups from 2000 to 2014, we will [1] analyse the possible connection between medical diagnoses of alcohol misuse and possible explanatory socioeconomic factors like regional economic performance and unemployment rates. For this, we will take both short-term and long-term medical consequences of alcohol misuse into account by examining hospital health records on acute alcohol intoxication as well as on alcoholic liver disease. Besides the socio-economic factors mentioned above, we will [2] test the effect of recent policy measures on the German state level on alcohol consumption, namely the ban on alcohol night sales introduced in 2010 in the state of Baden-Wuerttemberg.

We find no conclusive relationship between alcohol-related hospitalizations and socio-economic indicators. However, the reason for this is not necessarily that the relationship does not exist. It is also possible that the small number of observations and the limited availability of relevant indicators prevent our data from generating sufficient insights into the matter. 
Regarding the second topic under investigation, we find that the night sales ban reduces the number of excessive alcohol drinking diagnoses among German youth by about 20 percent. This finding confirms a prior study by @Marcus.2015 who found a decrease of about 7 percent. While their study used more detailed data, it focused only on the short-term effects until 2011. Our study in contrast looked at a longer time-horizon, but used more aggregated data.

# Related Literature

This paper is building on prior research looking at (I) alcohol consumption in the light of socio-economic factors and (II) the effects of policy measures to tackle alcohol misuse. The impact of socio-economic factors, e.g. economic performance, for the alcohol abuse has been investigated by @Popovici.2013 and @Ettner.1997. They looked at the change in employment status and its effect on alcohol consumption of individuals. Focusing on the United States, the two studies found inconsistent results: While one sees "a positive and significant effect of unemployment on drinking behaviors and the findings are robust to numerous sensitivity tests" [@Popovici.2013] the other argues that "non-employment significantly reduces both alcohol consumption and dependence symptoms, probably due to an income effect." [@Ettner.1997] For the German case, Henkel has been tackling the challenge in extenso [@Henkel.2016; @Henkel.2015; @Henkel.2013; @Henkel.2000], analysing not only alcohol but also a broad range of other and illegal drugs and their use amongst unemployed and people with low income. Like @Popovici.2013, he finds that people in an economically better situation are less affected by alcohol misuse and addiction [@Henkel.2015]. An extensive overview on the international literature can be found in @Henkel.2011.

In terms of the effectiveness of policy measures to tackle these issues, the present study is inspired by the work of @Marcus.2015 who analysed *"The Effect of a Ban on Late-Night Off-premise Alcohol Sales on Alcohol-Related Hospital Stays in Germany."* They find that the introduction of the alcohol night sales ban reduces alcohol-related hospitalizations among adolescents and young adults by about seven percent. In their paper on a similar policy change in the canton of Geneva, Switzerland, @Wicki.2011 come to a confirming result: The "hospitalisation rate among adolescents and young adults could be reduced by between 25 percent and 40 percent due to restricted hours of off-premise sales, and a ban on the sale of alcoholic beverages in gas stations and video stores." @Green.2015 found a similar effect in their study on the extension of pub opening hours in England and Wales, introduced in 2005: "increased alcohol availability in England and Wales led to increased consumption, heavy drinking and led to poorer physical and mental health outcomes." Looking at the effect of minimum legal drinking ages, beer taxes, and "zero tolerance" laws, @Carpenter.2007 find that "a variety of types of government intervention have been and can be effective at reducing youth alcohol use." In their broader, related literature review on alcohol-related harm, @Wilkinson.2016 develop a corresponding result stating that "reducing the hours during which on-premise alcohol outlets can sell alcohol late at night can substantially reduce rates of violence. Increasing trading hours tends to result in higher rates of harm, while restricting trading hours tends to reduce harm." 

# Data

Our research health data origins from the German [hospital diagnosis statistics](https://www.destatis.de/DE/Publikationen/Thematisch/Gesundheit/Krankenhaeuser/DiagnosedatenKrankenhaus.html) from 2000 to 2014, obtained via the [Information System of the Federal Health Monitoring (GBE)](http://www.gbe-bund.de/). The data is reported by hospitals to the statistical bureaus of the respective German states and then aggregated by the [Statistische Bundesamt *Destatis*](https://www.destatis.de/). The data contains aggregated numbers of hospital diagnoses for each German state by age group, gender and year. The diagnoses are published according to the WHO International Statistical Classification of Diseases and Related Health Problems (ICD-10). 

The data is gathered via the online-interface of the [GBE](http://www.gbe-bund.de/). Unfortunately, the data provider neither provides an API nor a web-scrapable interface. Therefore, we download the base tables manually by searching for the respective ICD-10 code, using the malleable tables to gather as much information in a single table as possible and then export them.

To identify alcohol-related health problems we use the diagnosis category *F10* (mental and behavioural disorders due to use of alcohol), more specificly: its subdivision *F10.0* (acute intoxication) and *F10.2* (dependence syndrome), and *K70* (alcoholic liver disease). The diagnosis category *F10.0* essentially captures short term effects of excessive drinking and can be used as an indicator for binge drinking.[^ICD] The categories *F10.2* and *K70* capture long-term effects of drinking, as these diagnoses are consequences of regular drinking.

The used data has been downloaded on 01 November 2016 and, due to the data platform's table size limitations, consists of three csv files, one for each diagnose category, containing the German states, the former area of Western Germany as well as Eastern Germany as columns and years (2000-2014), gender (male/female/unknown), and age groups (less than 1 year; 1 to less than 100 years in 5-year groups; older than 100 years).

The advantage of this data is that it reflects a complete survey of all civilian hospitals in Germany for the time-frame of our investigation. Furthermore, the data is not self-reported by patients in interviews but instead consists of the professional diagnoses by a third party, i.e. doctors, which eliminates potential problems of self-reporting biases. For the case study of Baden-Wuerttemberg's night sale prohibition we do also benefit from the availability of a longer time series as compared to the prior work on the issue by @Marcus.2015. Finally, the state-year aggregation level does allows us to supplement to our analysis a wide variety of other freely available data, e.g. unemployment rates etc.

However, there are several serious limitations to the data we are using. First of all, the data set can only be exploited in limits with more sophisticated methods like panel data analysis. We only have access to data which is reported annually and on the state level.[^FDZ] Hence, for a panel data analysis our scope is limited to 14 years and 16 states, which makes a total of 224 data points for each combination of diagnoses, age group and gender. We still think it is meaningfull to work with the data at hand: On the one hand we can employ simpler but still insightful methods like multiple linear regression. On the other hand, our approach can easily be extended to more fine-grained data if such data becomes available to us. A second shortcoming is the lack of further information on the patients beyond their age and gender, e.g. socio-economic characteristics. Factors like available income and medical history will matter for alcohol-related health problems, but for good reasons (data protection) they are not recorded in the respective statistics.

As already mentioned, our analysis will be supplemented by data from other sources. We are using state level indicators. They include the respective population, population density, unemployment rates of different age groups, the state GDP, and beer sales by state. We are aware of the fact that other alcoholic beverages besides beer probably play a role in the cases under scrutiny. However, with these numbers not available, the beer consumption serves as a proxy for general alcohol consumption in the state. We are collecting the supplementary data from different statistics provided by *Destatis*.

# Empirical Strategy

We want to apply a two stage analysis of short and long-term alcohol-related hospitalizations (ARH)[^ARH]. We will start with a multiple regression for a better understanding of factors correlated to high rates of hospitalizaton for alcohol-related reasons. This analysis shall show how socio-economic characteristics of states are correlated to the number of ARH. To keep it simple, the first multiple regression model only includes observations of one year:

\begin{equation}
    ARH_s = \alpha_0 + \alpha_1 \cdot GDP_s + \alpha_2 \cdot UR_s + \alpha_3 \cdot PD_s + \epsilon_s
\end{equation}

where $GDP$ stands for the GDP level, $UR$ for the unemployment rate  and $PD$ for the population density. All variables are recorded on the state level as indicated by the indices. The regression will be conducted for the aggregated number for all age groups. We will run this first regression separately for the years 2000, 2007, and 2014.[^ROB]

Based on the idea that individuals adapt to their circumstances and are more reactive to changes than to actual levels we will then analyse ARH difference over time:

\begin{equation}
   \Delta ARH_{s,t} = \beta_1 \cdot \Delta GDP_{s, t} + \beta_2 \cdot UR_{s, t} + \epsilon_{s,t}
\end{equation}

In contrast to the multiple regression on levels, the difference approach highlights how *changes* in hospitalizations are correlated to *changes* in the socio-economic factors used as independend variables. We eliminated the population density because its changes are not fine-grainedly recorded and the census 2011 led to a jump due to measurement adjustments in the data series. To check for further robustness of the results, we are comparing short-term ARH (F10.0), which should be affected by short-term changes and long-term ARH (K70/F10.2), which should not be affected.

In the second step of our analysis, we are looking at the effect of the night sales ban on alcohol in Baden-Wuerttemberg. To measure the effect of the ban we use a difference-in-difference (DD) approach. As the ban was only introduced in Baden-Wuerttemberg, this state will be the treatment group. All other states are in the control group. This distinction can be justified by the fact that most alcohol regulation is done on the federal level with the exception of sales hour regulation and campaigns.[^TCG] The ban was introduced in 2010, which is captured by the treatment dummy. We further assume there are no dynamic effects, which is reasonable as patients are registered when their treatment begins:[^REG] 

\begin{equation}
   ARH_{s,t} = \gamma_0 + \gamma_1 \cdot dBW_{s} + \gamma_2 \cdot dPOST_t + \gamma_4 \cdot dBAN_{s,t} + \epsilon_{s,t}
\end{equation}

where $dBW$ is a dummy variable to indicate the treatmeant group (here the state of Baden-Wuerttemberg), $dPOST$ a time dummy indicating the post-treatment period and $dBAN$ a dummy for the night-sale-prohibition ($= 1$ for Baden-Wuerttemberg from 2010 on). Since were are re-engineering @Marcus.2015 here, we will limit the DD application to ARH by young people (15-25). To check for further robustness of the results, we use different control group compositions: All states but Baden-Wuerttemberg and all territorial states, exclulding the city states Berlin, Bremen and Hamburg.[^CITY] 

As the DD approach crucially depends on the common trend assumption, it is advisable to include control variables to capture possible trends affecting treatment and control group differently. We control for youth unemployment ($YUR$)  and $GDP$ as a proxy variable for the economic situation.

\begin{equation}
  ARH_{s,t} = \delta_0 + \delta_1 \cdot dBW_{s} + \delta_2 \cdot dPOST_t + \delta_3 \cdot dBAN_{s,t} + \delta_4 \cdot YUR_{s,t} + \delta_4 \cdot GDP_{s,t} + \epsilon_{s,t}
\end{equation}

Finally, we are refining our DD approach further by turning it into a panel and allowing for heterogenity in the control group. This is the final version of the model, but we expect it to generate less insightful results as we (with all states seperately) have comparably few data points (due to comparably little money to invest in data...).

\begin{equation}
  ARH_{s,t} = \theta_0 + \theta_1 \cdot dBAN_{s,t} + \theta_2 \cdot YUR_{s,t} + \theta_3 \cdot GDP_{s,t} + c_s + z_t + \epsilon_{s,t}
\end{equation}

where $c_s$ now is capturing unobservable and time-invariant heterogeneity of states. In addition, we are replacing $dPOST$ with $z_t$ to capture time fixed effects affecting all states, e.g. a change in a federal law. This model can be estimated by using dummies for the time-periods and first-differencing or fixed-effects to address the unobserved heterogeneity of states.

# Data Overview

Looking at the dependend variables, especially the short-term alcohol related hospitalizations (ARH-ST) show a clear upward trend over time. The numbers of acute intoxication per 1000 people only moderately increase in city states like Berlin (BE), Bremen (HB), and Hamburg (HH). Schleswig-Holstein and other states, on the contrary, double their rates in the time span from 2000 to 2014.

The development in German states from the beginning of the present dataset to the end shows that apparently only the big city states have been able to retain their relatively low level of short-term ARH. 

```{r echo = FALSE, out.width=c('280px', '280px')} 
spplot(map.Germany.2000.F100_p1000, 
       zcol = "value", 
       col.regions = colorRampPalette(rev(brewer.pal(9, "RdBu")))(22),
       main=list(label="2000: F10.0 diagnoses\nper 1000 pp all age groups"),
       at = seq(0, 2.2, by = 0.1))
spplot(map.Germany.2014.F100_p1000, 
       zcol = "value", 
       col.regions = colorRampPalette(rev(brewer.pal(9, "RdBu")))(22),
       main=list(label="2014: F10.0 diagnoses\nper 1000 pp all age groups"),
       at = seq(0, 2.2, by = 0.1),
       colorkey=FALSE)
```

The long-term ARH values also slightly increase, but this increase is a rather moderate one. This is a reasonable finding, as already the description "long-term" suggests that any change is very unlikely to happen in a short period of time like the one captchured by our sample.

```{r echo = FALSE, out.width = "450px", out.height = "450px"}
# data frame for plot
DS0 <- TOTAL %>% filter(GENDER == "all", AGE=="all") %>% 
  select(STATE, YEAR, F100_p1000, F102_p1000, K70_p1000)

# Color Blind Palette
cbbPalette <- c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
ggplot(data=filter(DS0, STATE == "DE-DE"), aes(x = YEAR, colour = Diagnoses)) +
  scale_x_continuous(breaks = c(2000, 2005, 2010, 2014), minor_breaks = c(2000:2014)) +
  geom_line(aes(y = F100_p1000, colour = "F10.0"), size = 1) +
  geom_line(aes(y = K70_p1000, colour = "K70"), size = 1) +                              # line plot
  geom_line(aes(y = F102_p1000, colour = "F10.2"), size = 1) +
  scale_color_manual(values = cbbPalette) +
  theme_bw() +                               # bw background
  xlab("Years") +                            
  ylab("Diagnoses\nper 1000 pp") +
  ggtitle("F10.0, F10.2, and K70 diagnoses in Germany,\n2000-2014") +
  theme(legend.key.size = unit(2.0, 'lines'),
        axis.title.y=element_text(margin=margin(10,10,0,10)))
```

With regards to the different age groups, the distribution of cases is already somewhat revealing: long-term ARH numbers as an indicator for strong alcoholism show that it rather is a problem for best-agers than for younger people. The numbers begin to rise at the 35-39 years age group and start declining with the 65-69 years group -- probably due to a higher mortality (although the data does not offer such interpretation).

In Germany, the two minimum legal drinking ages are 16 years (for beer and whine) and 18 years (for brandy). This helps with the interpretation of the short-term ARH bar plot with its two peaks: The first is a very strong one at the 15-19 years age group. It appears resonable to interpret this one as a sign of younsters making first experiences with alcoholic drinks -- and overdoing it. The second "mid-life crisis" peak corresponds with the peak of long-term alcohol abuse shown before.

```{r echo = FALSE} 
# data frame for plot
PL1 <- TOTAL %>% filter(GENDER == "all", AGE!="all", STATE=="DE-DE")
# Bar diagram: F10.0 cases per 1000 per age group in DE-DE
ggplot(data=PL1, aes(x=AGE, y=F100_p1000, fill = AGE)) +
  geom_bar(stat="identity", fill = cbbPalette[2]) +
  theme_bw() +
  xlab("Age Group") +
  ylab("F10.0 cases per 1000 pp") +
  ggtitle("F10.0 cases in Germany per age group") +
  guides(fill=FALSE) +
  theme(axis.text.x  = element_text(angle=90, vjust=0.5),
        axis.title.y=element_text(margin=margin(10,10,0,10)))
```

An alarming development is illustrated by the high increase in 15-19-year-olds admitted to hospitals with acute alcohol intoxication. From 2000 to 2011, Bavaria more than tripled its quota in this regard. The vertical line is indicating the year of the night sale introduction in Baden-Wuerttemberg.

```{r echo = FALSE} 
# data frame for plot
DS2 <- TOTAL %>% filter(GENDER == "all", AGE=="15-19y", STATE!="DE-DE") %>% 
   select(STATE, YEAR, F100_p1000, F102_p1000, K70_p1000)

DS2$GROUP <- "Non-City"
DS2$GROUP[DS2$STATE == "DE-BW"] <- "DE-BW (Treatment)"
DS2$GROUP[DS2$STATE == "DE-BE" | DS2$STATE == "DE-HB" | DS2$STATE == "DE-HH"] <- "City"

# F10.0 cases per 1000 per age group
ggplot(data=DS2, aes(x = YEAR, colour = GROUP)) +
  scale_x_continuous(breaks = c(2000, 2005, 2010, 2014), minor_breaks = c(2000:2014)) +
  geom_line(data=filter(DS2, GROUP == "Non-City"), aes(y = F100_p1000, group = STATE), size = .5) +
  geom_line(data=filter(DS2, GROUP == "City"), aes(y = F100_p1000, group = STATE), size = 1) +
  geom_line(data=filter(DS2, GROUP == "DE-BW (Treatment)"), aes(y = F100_p1000, group = STATE), size = 1) +
  theme_bw() +
  scale_color_manual("State", values = c(cbbPalette[6], cbbPalette[7], cbbPalette[1])) +
  xlab("Years") +
  ylab("F10.0 diagnoses per 1000 pp\nfor 15-19-year-olds") +
  ggtitle("F10.0 diagnoses for 15-19-year-olds in German states,\n2000-2014") +
  geom_vline(xintercept = 2010) +
  theme(legend.position = "bottom", 
        legend.key = element_rect(size = 5),
        legend.key.size = unit(2.0, 'lines'),
        axis.title.y=element_text(margin=margin(10,10,0,10)))
```

# Results
The first model shown in *Table 1* is simple and its results are indicative, at most. It does reveal to which state charateristics short-term alcohol related hospitalizations (ARH-ST) are correlated.  We find that ARH-ST are higher in states with a lower GDP. This captures the basic intuition that binge dringing is more common in regions that are economically less well-off. But this view is not confirmed by our second indicator for economic well-being: the unemployment rate. To the contrary, states with higher unemployment rates have less cases of binge drinking. These results are robust over all years of our investgation. But this is not surprising as the variables we are considering are not strongly fluctuating during the period under investigation. Most of them follow more or less the same trend. As the contradicting interpretation already hints at, we are most likey not seeing what is happing below the aggregated surface.

```{r Model1, results = "asis", echo = FALSE}

mod1.labels <- c("GDP per capita", "Unemployment rate", "Population density", "(Intercept)")
stargazer::stargazer(mod1.2000, mod1.2007, mod1.2014,
                     covariate.labels = mod1.labels,
                     column.labels = c("2000", "2007", "2014"),
                     model.numbers  = FALSE,
                     dep.var.labels = "F10.0 Diagnoses per 1000 capita",
                     title = 'Regression results for Model 1 for 2000, 2007, and 2014',
                     digits = 2, type = 'latex', header = FALSE)

```

The major drawback of this regression is the low number of observations. With one observation per state the standard errors are *huge*. In the second step, we increased our data set by looking at the difference over time, which increases the number of available observations to 224 (*Table 2*).

```{r Model2, results = "asis", echo = FALSE} 
stargazer::stargazer(mod2.F100, mod2.F102, mod2.K70,
                     column.labels = c("F10.0", "F10.2", "K70"),
                     covariate.labels = c("GDP Change", "Unemployment Change", "Beer Tax Change", "(Intercept)"),
                     model.numbers  = FALSE,
                     dep.var.labels = c("Change in F10.0","Change in F10.2","Change in K70"),
                     title = 'Regression results for Model 2 with first differenced data',
                     digits = 2, type = 'latex', header = FALSE)
```

The differences show a correlation between GDP growth and short-term ARH. A superficial interpretation could be that individuals in years of strong growth tend to party more. But again, this should be treated with care. For short-term ARH the other variables are not significant.
The effects of changes on long-term ARH are unsurprisingly inconclusive. In our case each coefficient is directed exactly in the opposite direction of our two long-term ARH indicators. But a clear result would again have been implausible, as long-term ARH result from excessive drinking behaviour over years. Short-term macro developments should not have any impact on this.

The next step in refining our model (*Model 3*) takes us to analyzing a policy measure, namely the introduction of a ban on night sales of alcohol in the state of Baden-Wuerttemberg. Contrary to the prior analysis, we now take a look at the age group of 15 to 19 year olds. The results are shown in *Table 3*. While we are introducing one dummy for the state in consideration, one for the post-treatment period, and the interaction term of both, the only significant variable is the post-treatment dummy. Even more counterintuitively, the post-treatment dummy is *positively* correlated with short-term ARHs. This hints to the fact that we might witness a common trend over time, increasing short-term ARHs, that is similar for all states and that is currently captured by the time dummy. That would explain why the time dummy (which essentially splits the whole set of observations into two consecutive phases) is highly significant but neither the state dummy nor the interaction term are of any significance.

```{r Model3and4, results = "asis", echo = FALSE} 
stargazer::stargazer(mod3.age15.16, mod3.age15.13, mod4.age15, mod4.age15.13,
                     column.labels = c("(3) All states", "(3) non-city states",
                                       "(4) All states", "(4) non-city states"),
                     covariate.labels = c("BW night sale ban", "Post-ban dummy", 
                                          "GDP per capita", "Youth unemployment", 
                                          "Interaction dummy", "(Intercept)"),
                     model.numbers  = FALSE,
                     dep.var.labels = c("F10.0 cases per 1000 PP among 15-19 year olds"),                    
                     title = 'Model 3 and 4 - simple differences-in-differences without and with controls',
                     digits = 2, type = 'latex', header = FALSE,
                     omit.stat = c("f"))

```

In case such a common trend is an economic one, we can capture it and take it out of the diff-in-diff model by including two more control variables that not only allow for development over time but which also capture varying developments among states. *Model 4* therefore includes state GDP per capita and the youth unemployment rate on state level (*Table 3*).

The results do not have any effect on the significance of treatment-related dummy variables. Also, the "wrong" direction of the post-treatment dummy's effect stays. However, the two newly introduced control variables are not only highly significant. In the case of GDP per capita, the effect at first sight is also pointing into a plausible direction: Increasing GDP per capita correlates with decreasing short-term ARHs. The reasoning behind this might be that in times of economic prosperity, alcohol as remedy for despair is not needed. But the youth unemployment rate points into the opposite direction, with increasing youth unemployment rates being correlated with decreasing ARHs. An explanation for this effect might be that for young people unemployment comes with more severe financial constraints than it comes for older members of the workforce, therefore quicker closing the tap for younger people. 

But these explanations are tentatively, at best. Therefore, it is advisable to take the final step in refining the model by introducing a panel data approach which best fits the fact that we are analysing 15 years of the very same sample of German states.

```{r Model5, results = "asis", echo = FALSE} 

stargazer::stargazer(mod5, mod5.age15.16, mod5.age15.13,
                     column.labels = c("All age/states", "15-19y all states", "15-19y + non-city state"),
                     covariate.labels = c("BW night sale ban", "Youth unemployment rate", "GDP per capita"),
                     model.numbers  = FALSE,
                     dep.var.labels = c("F10.0 cases per 1000 PP"),                           
                     title = 'Model 5 - multi-period panel differences-in-differences',
                     digits = 2, type = 'latex', header = FALSE)
```

<!--
Result of the final model
-->
The final regression indicate a negative impact of the ban on ARH related short-term hospitalizations. The estimated relation of introducing a night-sales ban is a decrease of 0.06 hospitalizations per 1000 inhabilitants, which makes roughly 600 cases less per year based on the population of Baden-Wuerttemberg. This would imply a decrease of 20%, which is more than @Marcus.2015 found (7%). To illustrate the result, we calculated the hypothetical number of hospitalizations without the ban:

```{r echo = FALSE, out.width = "400px", out.height = "350px"} 
# Bar diagram: With and without ban
ggplot(data=melt(bancomparison, id.vars = "YEAR"), 
       aes(x = YEAR, y = value, fill = variable)) +
       geom_bar(stat = "identity", position=position_dodge()) +
       theme_bw() +
       xlab("Year") +
       ylab("F10.0 diagnoses per 1000 inhabiants\nin BW for 15-19-year-olds") +
       ggtitle("F10.0 cases with and without ban in BW") +
       scale_fill_manual("Sales ban", values = c(cbbPalette[2], cbbPalette[1]),
                         labels=c("With", "Without (simulation)"))
```

But again the result should be treated with care. The calculations are based on 195 observations, which are due to reduced control group to territorial states. Moreover, the exact estimate is sensitive to adjustments in the control group. However, the general direction and the magnitude are meaningful and point in the same direction as @Marcus.2015.

# Conclusion

Overall, our research is in line with the general conclusion of the literature on drinking and socio-economic factors. In particular, the most sophisticated approach (model 5) we used finds that higher youth unemployment is correlated with more alcohol-related diagnoses amoung young people and that the night sales ban on alcohol has a measurable negative effect on binge-drinking. But this does not hold true for all our estimations: The results of the multiple regressions and simple difference in difference models are in contrast not conclusive. This is due to the limitations of our data and the general limitation of these models.

As our study reengineered @Marcus.2015, the confirmation of their general result deserves particular attention. While their study used more fine-grained data, our research included a longer treatment horizon. Hence, the effect @Marcus.2015 find prevailed over the time horizon of their research according to our findings.

The research could be further improved by using more disaggregated data, in particular by using monthly numbers and the county level. This would increase the number of observations tremendously and therefore allow for more substantial results. It would also allow the inclusion of more controls as we only relied on a small number of socio-economic variables. Many more potential controls exist, but we did not include them as they are not available as a time-series and on the regional level. 

# References

[^ICD]: In contrast to @Marcus.2015 and @Wicki.2011 we do not include T51 (Intoxination due to alcohol) as its covers the consumption of pure ethanol and its number are relatively small. 
[^FDZ]: In principle, there is more fine-grained data available, even down to the individual level. However, the access to the data requires using the paid services by [forschungsdatenzentrum.de](http://www.forschungsdatenzentrum.de/) which is out of our students' budget.
[^ARH]: ARH is normalized by the state population and denoted in hospitalizations per 100,000 inhabitants.
[^TCG]: The night-sale ban in Baden-Wuerttemberg has been the most prominent alcohol policy in 2010. There have also been general changes to opening hours as during this time the competency was transfered to the states and media campaigns. As we only have annual data we are not fully able to distinguish between these treatments, but we assume that they are outweigthed by the night-sales ban.
[^REG]: Even if they were registered in consecutive reporting periods, the stay of patients for short-term ARH (*F10.0*) would be in average 2,1 days and for long-term ARH 10,7 days (*K70*) and 11,4 days (*F10.2*) in 2014, making the dynamic effect insignificantly small.
[^ROB]: We also computed all other years to check robustness. The results can be found in the appendix. 
[^CITY]: The hospitalization statistics has revealed that city states followed a slighlty different trend than the territorial states, see section data overview.