updating template
scarpino committed Jun 27, 2024
1 parent 880174a commit cc11660
Showing 35 changed files with 17 additions and 2,577 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -127,3 +127,6 @@ dmypy.json

# Pyre type checker
.pyre/

# dstore
*.DS_Store
8 changes: 4 additions & 4 deletions LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2022 Global.health
Copyright (c) 2024 Global.health

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -12,10 +12,10 @@ furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
THE SOFTWARE AND DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
OUT OF OR IN CONNECTION WITH THE SOFTWARE/DATA OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE/DATA.
53 changes: 6 additions & 47 deletions README.md
@@ -1,63 +1,22 @@
# Ebola data Uganda outbreak 2022
# WOAH-WAHIS Influenza A

This repository contains dated records of curated Ebola cases from the 2022 outbreak in Uganda. Data are curated from openly accessible sources. We continue to experience ongoing challenges in data curation, discussed below. Line-list data may change due to ongoing data reconciliation and validation.

Our latest data set and archives can be found [here](https://3mmuwilir3.execute-api.eu-central-1.amazonaws.com/web).

UPDATE:

2023-01-20. On January 11th, 2023, the Ugandan Ministry of Health declared the end of the SVD Ebola outbreak. The final number of confirmed cases is 142. G.h counts for confirmed cases by district matches that reported in the MOH SitReps through [#90](https://www.afro.who.int/sites/default/files/2023-01/Ug_EVD_SitRep%2390.pdf) (Mubende 64; Kyegegwa 4; Kassanda 49; Kagadi 1; Masaka 1; Wakiso 4; Jinja 1; Kampala 17; Bunyangabu 1); however, SitRep [#91](https://www.afro.who.int/sites/default/files/2023-01/Ug_EVD_SitRep%2391.pdf) reassigned a KAL case to WAK without explanation. Therefore, final counts of confirmed cases by district do not match.

Also, Outcome for ID#s 149, 150, 157, 158, 161, and 162 remain unassigned for reasons outlined in our [blog post](https://globaldothealth.substack.com/p/curator-review-for-2022-reflecting).

Finally, Uganda's [MOH website](https://www.health.go.ug/ebola/) differs from final SitRep data in [#93](https://www.afro.who.int/sites/default/files/2023-01/Ug_EVD_SitRep%2393.pdf). The MOH website reports 142 cases/ 56 deaths/ 86 recoveries. SitRep data reports 142 cases/ 55 deaths/ 87 recoveries.

We recognize these discrepancies and remain limited by the detail and accuracy of information publicly released by the MOH. Questions can be addressed to info@global.health.

2022-12-14. The sum of deaths and recoveries for the G.h dataset is not in alignment with current MOH numbers. Our curation team identified count and location discrepancies in data from SitRep [66](https://www.afro.who.int/countries/uganda/publication/ebola-virus-disease-uganda-sitrep-66) and beyond that prevent us from updating the Outcome for the remaining cases. Also, SitRep [68](https://www.afro.who.int/countries/uganda/publication/ebola-virus-disease-uganda-sitrep-68) reported a reclassification of Outcome for a Mubende case; however, without further detail, we are unable to identify a specific ID# for the change. We are limited by the detail and accuracy of information publicly released by the MOH. We will continue to check for new MOH reports that may provide updated/corrected case information that could reconcile differences between our two datasets.

2022-11-23: The curation team has completed a data reconciliation exercise to update case data between SitReps (e.g. deaths, recoveries, HCW, and location information). The "Date_last_modified" column lists the date of reconciliation on 2022-11-22. Data are updated through SitRep [57](https://www.afro.who.int/countries/uganda/publication/ebola-virus-disease-uganda-sitrep-57) with a total of 141 confirmed cases, including 55 deaths, and 79 recoveries.

2022-11-04: Confirmed cases have been added to the line-list through SitRep [41](https://www.afro.who.int/sites/default/files/2022-11/Ug_EVD_SitRep%2341.pdf) @131 cases. Our curation team is working to reconcile case data between SitReps (e.g. deaths, recoveries, HCW), which will take time to complete due to changing report format and fluctuating counts that greatly complicate the review process.
This repository contains data reported to the World Organization for Animal Health related to influenza A infections in non-human animals.

## Data curation
This section is an overview of the data curation process, a discussion about limitations and assumptions.

Curation, especially early in the outbreak, is a manual, labor-intensive process. We experience many recurring challenges in building an emerging disease dataset in real-time.

The Ebola line-list is built from a collection of sources, listed here, which will be updated as new sources become available: https://github.com/globaldothealth/ebola/wiki. The original source(s) of information is provided for each line-list ID in our database. The WHO provides Situation Reports (SitReps) from the Uganda Ministry of Health; while not available from the start of the outbreak, these resources have become a primary source for information. However, we remain limited by inconsistent, aggregated, or missing case information; change in reporting format; data reconciliation; conflicting details; confusing statements; reporting delays. We frequently observe conflicting data and details between sources, and even within a single report. We have not outlined the specific challenges or discrepancies for each [SitRep](https://www.afro.who.int/countries/publications?country=879), but can discuss further as needed.

Reports from government/official sources can be enriched with supplemental information retrieved from local reporting (including media) or other sources. Metadata are added at any time, as information becomes available and our time and resources permit. After making changes, the case will be recorded as modified with the date. Multiple curators look at each datapoint and any discrepancies are resolved in conversations between them. Assumptions are made that may compromise the accuracy of the data.

Users should refer to our [data dictionary](data_dictionary.yml) for a description of each variable. Assumptions for select variables are briefly discussed below.

**Case_status**: Only confirmed and probable cases are logged at this time.

**Date_of_onset**: Information is only available for probable cases; we are unable to disaggregate Date_of_onset details for confirmed cases.

**Date_confirmation**: The report date is used when a Date_confirmation is not specified by source.

**Outcome**. Type: Death: The report date is used when a Date_death is not specified by source. If the number of deaths exceeds the number of new cases reported for that day, then deaths are logged under a previous ID with unassigned outcome and corresponding location information, when available.

**Outcome**. Type: Recovery: When a recovery is reported, the recovery is assigned to a previous ID with unassigned outcome and corresponding location information, when available.

**Healthcare_worker**: Healthcare worker information (e.g. location, date_confirmation, outcome) is not consistently provided by source. Supplemental sources are used, when available, to gain context that may help us to assign an ID. However, due to the limited availability of information, we have not been able to log every confirmed HCW case or outcome.

Data are hand-curated. The process and methods to create, organize, and maintain data have been applied with consistency; however, we’re human and mistakes happen. As stated above, line-list data may change due to ongoing data reconciliation and validation. We welcome your contributions and feedback. Get involved!
Curation, is a manual, labor-intensive process. We experience many recurring challenges in building an emerging disease dataset in real-time.

## Contributing

If you would like to request changes, [open an issue](https://github.com/globaldothealth/ebola/issues/new) on this repository and we will happily consider your request.
If you would like to request changes, [open an issue](https://github.com/globaldothealth/woah_wahis_flu_a/issues/new) on this repository and we will happily consider your request.

If requesting a fix please include steps to reproduce undesirable behaviors.

If you would like to contribute, assign an issue to yourself and/or reach out to a contributor and we will happily help you help us.

If you want to send data to us, you can use our template at [ebola-template.csv](ebola-template.csv) which makes
it easier for us to add to our list. Just open an issue and attach a CSV / XLSX file in this repository,
or email data to info@global.health. Remove any Personally Identifiable Information.

## License and attribution

This repository is published under MIT License and data exports are published under the CC BY 4.0 license.

Please cite as: "Global.health Ebola (accessed on YYYY-MM-DD)" & please add the appropriate agency, paper, and/or individual in publications and/or derivatives using these data, contact them regarding the legal use of these data, and remember to pass-forward any existing license/warranty/copyright information.
Please cite as: "Global.health WHOA-WHIS Influenza A (accessed on YYYY-MM-DD)" & please add the appropriate agency, paper, and/or individual in publications and/or derivatives using these data, contact them regarding the legal use of these data, and remember to pass-forward any existing license/warranty/copyright information.
300 changes: 0 additions & 300 deletions data_dictionary.yml
@@ -1,300 +0,0 @@
dictionary_last_modified: 2022-11-04
fields:
- name: ID
type: string
description: >
Unique ID of the case
required: true

- name: Pathogen
type: string
description: >
Constant, pre-filled value, pathogen of interest
required: true

- name: Case_status
type: confirmed | suspected | discarded | omit_error
description: >
Status of a case. Cases which are discarded were previously
suspected but have now been confirmed negative, and should
be excluded from case counts. Cases which are omit_error were
incorrectly added and should be dismissed from any data
interpretation.
required: true

- name: Location
type: string
description: Country subdivision where case was reported
examples: [Islington, Croydon]

- name: City
type: string
description: City where case was reported
examples: [London, Madrid, Sydney, Boston]

- name: Country
type: string
description: Country where case was reported
required: true
examples: [England, United States, Spain, Australia]

- name: Country_ISO3
type: string
description: ISO 3166-1 alpha-3, three letter country codes
required: true
examples: [ITA, GBR, ESP, BOL]

- name: Age
type: integer-range
description: Age of the individual, specified as a range, either open-ended (<n, >n) or as a range delimited by a hyphen (m-n) following 5-year age increments (m-n)

- name: Gender
type: male | female | other
description: Gender of the individual

- name: Occupation
type: string
description: Free response entry describing the individual's occupation

- name: Healthcare_worker
type: Y | N | NA
description: Is the individual a healthcare worker (Y=Yes, N=No, NA=Not applicable)?

- name: Symptoms
type: string
description: Comma separated list of symptoms
examples:
- rash
- vesicular rash
- skin lesions
- ulcerative lesions
- oral and genital ulcers
- fever

- name: Date_onset
type: iso8601date
description: Date of onset of symptoms

- name: Date_confirmation
type: iso8601date
description: Date when case was confirmed

- name: Confirmation_method
type: string
description: Test used to confirm diagnosis
examples: ["RT-PCR"]

- name: Previous_infection
type: Y | N | NA
description: Did the individual test positive for the infection prior to the most recent diagnosis (Y=Yes, N=No, NA=Not applicable)?

- name: Co_infection
type: string
description: If the individual tested positive for another pathogen

- name: Pre_existing_condition
type: string
description: If the individual has any pre-existing medical conditions

- name: Pregnancy_status
type: Y | N | NA
description: Is the case pregnant or post-partum?

- name: Vaccination
type: Y | N | NA
description: Has the individual received a dose of vaccine (Y=Yes, N=No, NA=Not applicable)?

- name: Vaccine_name
type: string
description: Name of the first vaccine

- name: Vaccine_date
type: iso8601date
Description: Date of first vaccination

- name: Vaccine_side_effects
type: string
description: Comma separated list of symptoms experienced after receiving the vaccine (i.e. cough, sore throat, etc.)

- name: Date_of_first_consult
type: iso8601date
Description: Date that the individual received first clinical consultation

- name: Hospitalised (Y/N/NA)
type: Y | N | NA
description: Whether individual was hospitalised (Y=Yes, N=No, NA=Not applicable)

- name: Reason for hospitalisation
type: monitoring | treatment | unknown
Description: Reason why the individual was hospitalised; can list multiple, comma separated

- name: Date_hospitalisation
type: iso8601date
Description: Date individual was hospitalised

- name: Date_discharge_hospital
type: iso8601date
Description: >
Date that the individual was discharged from the hospital. Note: there is a separate field for ICU discharge.
- name: Intensive_care (Y/N/NA)
type: Y | N | NA
description: Whether individual admitted to an intensive care unit or high dependency unit at hospital (Y=Yes, N=No, NA=Not applicable)

- name: Date_admission_ICU
type: Y | N | NA
description: Date individual entered intensive care unit

- name: Date_discharge_ICU
type: Y | N | NA
description: Date that the individual was discharged from the ICU

- name: Home_monitoring
type: Y | N | NA
description: Whether individual is being remotely monitored by health officials at home without hospital admission (Y=Yes, N=No, NA=Not applicable)

- name: Isolated (Y/N/NA)
type: Y | N | NA
description: Whether individual was isolated at home or in hospital (Y=Yes, N=No, NA=Not applicable)

- name: Date_isolation
type: iso8601date
description: Date individual entered isolation

- name: Outcome
type: recovered | death
description: Optional field that specifies outcome of the disease

- name: Date_death
type: iso8601date
description: Date of death

- name: Date_recovered
type: iso8601date
description: Date of recovery

- name: Contact_with_case
type: Y | N | NA
Description: Has the individual had contact with a confirmed/ probable/ suspected case (Y=Yes, N=No, NA=Not applicable)?

- name: Contact_ID
type: integer
description: If specified, is the case ID from which this patient contracted the virus

- name: Contact_setting
type: string
description: Setting where contact occurred that led to transmission
examples: >
Dictionary of possible entries:
HOUSE= household
WORK= workplace
SCHOOL= school/nursery
HEALTH= healthcare (including laboratory exposure)
PARTY= Sexual contact at night club/private party/sauna or similar setting
BAR= Bar/restaurant/ or other small event where there was no sexual contact
LARGE = Large event with no sexual contact (e.g., festival or sports event)
LARGECONTACT = Large event with sexual contact
OTHER
UNK= Unknown
- name: Contact_animal
description: Whether the individual has known contact with animals
examples: >
Dictionary of possible entries:
PET = Household pets excluding rodents
PETRODENTS = Rodent pets
WILD = Wild animals excluding rodent
WILDRODENTS = Wild rodents
Other (specify)
- name: Contact_comment
type: string
description: Free text describing any additional contact information.

- name: Transmission
type: string
description: Setting where contact occurred that led to transmission
examples: >
Dictionary of possible entries:
ANIMAL = Animal to human transmission
HAI = Healthcare-associated
LAB = Transmission in a laboratory due to occupational exposure
MTCT = Transmission from mother to child during pregnancy or at birth
OTHER = Other transmission
FOMITE = Contact with contaminated material (e.g bedding, clothing, objects)
PTP = Person-to-person (excluding: mother-to-child, healthcare-associated or sexual transmission)
SEX = Sexual transmission
TRANSFU = parenteral transmission including intravenous drug use and transfusion
UNK = Unknown
- name: Travel_history (Y/N/NA)
type: Y | N | NA
description: Whether individual has travel history, domestic and/or international (Y=Yes, N=No, NA=Not applicable)

- name: Travel_history_entry
type: string
description: Date when individual entered the country

- name: Travel_history_start
type: string
description: Free text describing travel

- name: Travel_history_location
type: string
description: Last known location where individual had travelled from
example: [Europe]

- name: Travel_history_country
type: string
description: Last known country where individual had travelled from
example: [Spain]

- name: Genomics_Metadata
type: string
description: Which clade the viral strain belongs to
examples:
- West African Clade
- North African Clade

- name: Accession Number
type: string
description: Accession number of the sequence uploaded to public database

- name: Source
type: url
description: URL of news story or government source where this case was confirmed
required: true

- name: Source_II
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Source_III
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Source_IV
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Source_V
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Source_VI
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Source_VII
type: url
description: URL of news story or government source where this case was confirmed (archived or secondary source)

- name: Date_entry
type: iso8601date
description: Date case was entered into line list
required: true

- name: Date_last_modified
type: iso8601date
description: Last date when case was modified in line list
required: true
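
For reference, the field types in the removed `data_dictionary.yml` (required fields, the `Case_status` values, `iso8601date`, and the `Age` integer-range) could be checked against a line-list row roughly as follows. This is a minimal sketch, not code from the repository: the function names `is_iso_date` and `check_row` are hypothetical, the row is assumed to be a plain dict of strings (for example from `csv.DictReader`), and only a handful of the date fields are covered.

```python
import re
from datetime import datetime

# Field names and allowed values copied from the removed data_dictionary.yml.
REQUIRED = ["ID", "Pathogen", "Case_status", "Country", "Country_ISO3",
            "Source", "Date_entry", "Date_last_modified"]
CASE_STATUS = {"confirmed", "suspected", "discarded", "omit_error"}
# Assumption: only a subset of the iso8601date fields is checked here.
DATE_FIELDS = ["Date_onset", "Date_confirmation", "Date_entry",
               "Date_last_modified"]
# Age is "<n", ">n", or "m-n" according to the dictionary description.
AGE_RANGE = re.compile(r"(?:[<>]\d+|\d+-\d+)")


def is_iso_date(value):
    """Return True if value parses as a YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def check_row(row):
    """Return a list of human-readable problems for one line-list row."""
    problems = []
    for name in REQUIRED:
        if not row.get(name):
            problems.append(f"missing required field: {name}")
    status = row.get("Case_status")
    if status and status not in CASE_STATUS:
        problems.append(f"unknown Case_status: {status}")
    for name in DATE_FIELDS:
        value = row.get(name)
        if value and not is_iso_date(value):
            problems.append(f"{name} is not YYYY-MM-DD: {value}")
    age = row.get("Age")
    if age and not AGE_RANGE.fullmatch(age):
        problems.append(f"Age is not a range: {age}")
    return problems
```

Calling `check_row(row)` on each row of a CSV export returns an empty list when the row passes these checks. Note that the sketch follows the dictionary's `Case_status` values, so a status of `probable`, which the old README mentions, would be flagged.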