mlwdata contains a set of annotated datasets that have been collected
as part of research studies conducted at the Malawi-Liverpool-Wellcome
Trust Clinical Research Programme in Blantyre, Malawi.
The main aims are to:
- Ensure consistent use of data between and within studies over time
- Facilitate sharing of data to reduce duplication of efforts
You can install the package from GitHub with:
# install.packages("devtools")
devtools::install_github("petermacp/mlwdata")
Currently, the following datasets are available:
scale_72_clusters
An sf MULTIPOLYGON object, containing polygon boundaries and cluster IDs
for the SCALE Study, with variables:
- cluster: unique identifier for each of the 72 study clusters
- geometry: a list column, containing polygons for each cluster boundary
blantyre_clinics
An sf POINT object, containing coordinates for Blantyre clinics/hospitals,
and clinic IDs:
- clinic: unique name of each of the 18 clinics/hospitals
- geometry: a list column, containing points for each clinic/hospital
blantyre_census_2008_2018
A tibble, containing age, sex and district-stratified (Blantyre City,
Blantyre Rural) population estimates from the 2008 and 2018 Malawi
National Census. These data were provided by the Malawi National
Statistics Office in October 2019.
- District: either Blantyre City or Blantyre Rural (as classified in the Census)
- Year: census year (2008 or 2018)
- Age: age group of population estimates
- Total: total population
- Male: male population
- Female: female population
blantyre_census_by_q
A tibble, containing age, sex and district-stratified (Blantyre City,
Blantyre Rural) population estimates, with linear interpolation by
quarter, from the 2008 and 2018 Malawi National Census. These data were
provided by the Malawi National Statistics Office in October 2019.
- district: either Blantyre City or Blantyre Rural (as classified in the Census)
- year: census year (2008 or 2018)
- age: age group of population estimates
- quarter: quarter of the year
- year_q: concatenated year and quarter
- sex: male or female
- population: population estimate
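As a rough illustration of what linear interpolation by quarter between two census points can look like, the sketch below uses base R's `approx()` on made-up totals for a single stratum. The numbers, the decimal-year coding of quarters, the `year_q` label format, and the choice to hold the 2018 value constant for Q2 to Q4 2018 are all assumptions for illustration, not a description of how this dataset was actually built.

```r
# Hypothetical census totals for one district-age-sex stratum (not real data)
census <- data.frame(year = c(2008, 2018), population = c(1000, 1400))

# Quarterly time points from Q1 2008 to Q4 2018, coded as decimal years
# (assumption: Q1 = year + 0, Q2 = year + 0.25, and so on)
quarters <- expand.grid(quarter = 1:4, year = 2008:2018)
quarters$time <- quarters$year + (quarters$quarter - 1) / 4

# approx() interpolates linearly between the two census points;
# rule = 2 holds the 2018 value constant for time points after the last census
quarters$population <- approx(x = census$year, y = census$population,
                              xout = quarters$time, rule = 2)$y

# Assumed label format for the concatenated year and quarter
quarters$year_q <- paste0(quarters$year, " Q", quarters$quarter)

head(quarters)
```

With these toy inputs the stratum grows by 10 people per quarter between the two census points.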
hiv_pops_blantyre_city
A tibble, containing estimates of HIV prevalence for Blantyre City,
stratified by age, sex and quarter between 2008 and 2018. Population
estimates are from the Malawi National Census 2008 and 2018 estimates.
HIV prevalence estimates are from an HIV-prevalence survey conducted in
2014-15 in North West Blantyre. Age- and sex-specific HIV-prevalence
estimates were multiplied by population denominators to obtain numbers
of HIV-positive people per age-sex-quarter stratum. (Note: estimates are
provided for adults [16+] in Blantyre City only, not for Blantyre Rural.)
- year_q: concatenated year and quarter
- year: census year (2008 or 2018)
- quarter: quarter of the year
- sex: male or female
- age: age group of population estimates
- hiv_prev: HIV prevalence in age-sex strata
- population: number of HIV-positive people in age-sex-quarter strata
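The multiplication step described above can be sketched with a toy table. The prevalence values, the denominators, the `denominator` column name, and the assumption that `hiv_prev` is stored as a proportion rather than a percentage are all hypothetical; only `hiv_prev` and `population` are column names taken from the dataset description.

```r
# Hypothetical strata (not real estimates)
strata <- data.frame(
  sex         = c("male", "female"),
  age         = c("25-34", "25-34"),
  hiv_prev    = c(0.10, 0.15),   # assumed to be a proportion, not a percentage
  denominator = c(2000, 2400)    # hypothetical population denominator
)

# Number of HIV-positive people = prevalence x population denominator
strata$population <- strata$hiv_prev * strata$denominator

strata
```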
blantyre_tb_cases_2009_2018
A tibble, containing numbers of TB cases notified in Blantyre TB
registration centres between Q1 2009 and Q4 2018 by quarter, stratified
by active case finding area of the city (ACF vs. non-ACF) and
microbiological status of cases.
- year_q: annual quarter
- acf: area of Blantyre City (ACF = received active case finding intervention; non-ACF = didn't receive active case finding intervention)
- tbcases: classification of TB diagnosis (Smr/Xpert-positive cases = cases that were either smear or Xpert positive in testing by the routine clinic programme, or smear-positive by the research TB lab; All cases = all cases started on treatment, regardless of microbiological status)
- n: number of cases in category
blantyre_tb_cases_2009_2018
A tibble, containing anonymised individual-level TB cases notified in
Blantyre TB registration centres between Q1 2011 and Q4 2018.
- unique_id: anonymised unique case ID
- fac_code: TB registration centre
- reg_date: date on which TB case was registered for treatment
- year: year of registration for TB treatment
- quarter: quarter of registration for TB treatment
- year_q: year and quarter of TB treatment registration
- period: active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)
- sex: sex of TB case (male or female)
- age: age of TB case on day of treatment registration
- agegp: age group of TB case (1 = 0-4 years, 2 = 5-14 years, 3 = 15+ years)
- acf: whether TB case's household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City (Non-ACF)
- hiv: HIV status of the TB case at TB treatment registration
- art: whether the patient was taking antiretroviral therapy for the treatment of HIV at the start of TB treatment
- smr_clinic: sputum smear status of TB case at TB registration, from sample collected and tested by the routine health system
- smr_lab: sputum smear status of TB case at TB registration, from sample collected at treatment registration, and tested in the research TB lab at the College of Medicine, University of Malawi
- xpert_clinic: sputum Xpert status of TB case at TB registration, from sample collected and tested by the routine health system. Note that Xpert was only reliably introduced into the programme from Q2 2015 onwards
- tbtype: classified as either Pulmonary TB or Extrapulmonary TB
- cgh_dur: duration of cough (in weeks) prior to TB treatment registration
- tb_cat: category of TB patient (New, Relapse, Retreatment after default, Retreatment after failure, Other)
- smr_any: whether a positive sputum smear result was obtained from either the routine health system or the study research lab sample
- smr_xpert_any: whether a positive sputum smear result was obtained from either the routine health system or the study research lab sample, or a positive Xpert result was obtained from the routine health system
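To show how derived variables such as `agegp`, `year` and `quarter` relate to `age` and `reg_date`, here is a minimal sketch on three made-up records. It assumes the age bands are half-open intervals ([0, 5), [5, 15), 15 and over) and that quarters follow the calendar year; the package's own derivation may differ.

```r
# Hypothetical individual-level records (not real cases)
cases <- data.frame(
  reg_date = as.Date(c("2011-02-14", "2015-06-30", "2018-11-01")),
  age      = c(3, 9.5, 42)
)

# Age group coded as in the variable description:
# 1 = 0-4 years, 2 = 5-14 years, 3 = 15+ years
cases$agegp <- cut(cases$age,
                   breaks = c(0, 5, 15, Inf),
                   labels = c(1, 2, 3),
                   right  = FALSE)

# Calendar year and quarter of treatment registration
cases$year    <- as.integer(format(cases$reg_date, "%Y"))
cases$quarter <- (as.integer(format(cases$reg_date, "%m")) - 1) %/% 3 + 1

cases
```

Using `right = FALSE` keeps a non-integer age such as 4.5 in the 0-4 group, which a closed upper bound at 4 would not.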
acf_cnrs_overall
A tibble, containing TB case notification rates (all cases) per 100,000
population between Q1 2009 and Q4 2018, stratified by active case finding
intervention area.
- year_q: annual quarter
- acf: whether TB case's household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City (Non-ACF)
- cases: number of registered TB cases per category
- tbcases: classification of TB cases (microbiologically-confirmed or all cases)
- population: total population per stratum
- q_population: total population per stratum, divided by 4 (for CNR estimates)
- cnr: TB case notification rate, per 100,000 per stratum
- conf.low: lower bound of CNR 95% confidence interval
- conf.high: upper bound of CNR 95% confidence interval
- period: active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)
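The relationship between `cases`, `q_population` and `cnr` can be sketched as follows. The case counts and populations are made up, and the exact Poisson interval from `poisson.test()` is an assumption: the method actually used for `conf.low` and `conf.high` is not stated here. Dividing the population by 4 approximates the person-years at risk in one quarter, which would make the resulting rate an annualised one.

```r
# Hypothetical quarterly counts and populations (not real data)
cnr_example <- data.frame(
  year_q     = c("2009 Q1", "2009 Q1"),
  acf        = c("ACF", "Non-ACF"),
  cases      = c(150, 120),
  population = c(400000, 350000)
)

# Quarterly person-time denominator, then rate per 100,000
cnr_example$q_population <- cnr_example$population / 4
cnr_example$cnr <- cnr_example$cases / cnr_example$q_population * 1e5

# Exact Poisson 95% confidence interval for each row (assumed method)
ci <- t(sapply(seq_len(nrow(cnr_example)), function(i) {
  poisson.test(cnr_example$cases[i],
               T = cnr_example$q_population[i])$conf.int * 1e5
}))
cnr_example$conf.low  <- ci[, 1]
cnr_example$conf.high <- ci[, 2]

cnr_example
```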
acf_smrpos_cnrs_overall
A tibble, containing TB case notification rates (smear or Xpert-positive
cases) per 100,000 population between Q1 2009 and Q4 2018, stratified by
active case finding intervention area. Note that cases are classified as
Smear/Xpert positive if a positive sputum smear result was obtained from
either the routine health system or the study research lab sample, or a
positive Xpert result was obtained from the routine health system.
- year_q: annual quarter
- acf: whether TB case's household was located in the ACF intervention area of Blantyre City (ACF), or the non-ACF area of Blantyre City (Non-ACF)
- cases: number of registered TB cases per category
- tbcases: classification of TB cases (microbiologically-confirmed or all cases)
- population: total population per stratum
- q_population: total population per stratum, divided by 4 (for CNR estimates)
- cnr: TB case notification rate, per 100,000 per stratum
- conf.low: lower bound of CNR 95% confidence interval
- conf.high: upper bound of CNR 95% confidence interval
- period: active case finding intervention period (pre-ACF = before ACF implemented; ACF = during ACF intervention; post-ACF = after ACF intervention implemented)
The SCALE Study defined 72 geographical cluster boundaries in urban Blantyre using GPS waypaths. These clusters can be loaded and plotted:
library(mlwdata)
library(tidyverse)

glimpse(scale_72_clusters)
#> Observations: 72
#> Variables: 2
#> $ cluster <chr> "c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10",…
#> $ geometry <list> [35.05040, 35.05040, 35.05040, 35.05040, 35.05040, 35.05040…

ggplot(scale_72_clusters) +
  geom_sf() +
  geom_sf_text(aes(label = cluster), colour = "blue") +
  theme_minimal() +
  labs(title = "SCALE Study",
       subtitle = "72 clusters",
       x = "",
       y = "",
       caption = "Corbett, MacPherson et al.")
We can also add the locations of the clinics in Blantyre.
library(sf)
library(ggrepel)

qech <- blantyre_clinics %>%
  # filter to show QECH only, for interest
  filter(clinic == "Queen Elizabeth hospital") %>%
  # split out the x and y coordinates to allow use of ggrepel
  st_coordinates() %>%
  as_tibble() %>%
  mutate(clinic = "QECH")

ggplot() +
  geom_sf(data = scale_72_clusters) +
  geom_sf_text(data = scale_72_clusters, aes(label = cluster), colour = "blue") +
  geom_sf(data = blantyre_clinics, shape = 17, colour = "#22211d") +
  geom_label_repel(data = qech,
                   aes(x = X, y = Y, label = clinic),
                   fontface = "bold",
                   min.segment.length = 0) +
  theme_minimal() +
  labs(title = "SCALE Study",
       subtitle = "72 clusters. Clinics/hospitals labelled with triangles",
       x = "",
       y = "",
       caption = "Corbett, MacPherson et al.") +
  coord_sf()