All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[3.0.0] - 2019-10-12
- Add CLI flag to dump reports with parsing errors to disk. This also adds helper functions and tests to simplify the debugging of the dumps. #214
- The internal format was updated to allow automated data validation, prevent invalid values from being injected into the data, and handle multiple fatalities in one crash. New fields were added, such as middle name or generation. #199
- Fixed parsing issues:
[2.1.1] - 2019-08-22
- Fixed parsing of crashes involving multiple deaths. #178
[2.1.0] - 2019-07-17
- Add CLI flags to parametrize retries at runtime. #159
- Add tasks to do some profiling of the application. #163
- Improve logging ability by adding more details about a parsing failure. #167
- Fix Deceased field parsing:
- Fix Location field parsing to support additional internal formats. #169
- Fix Date field parsing to support single-digit 12-hour format like 8 p.m. #171
- Fix issues where the notes were either not parsed, or incorrectly parsed. #166
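The single-digit 12-hour format mentioned in the Date entry above can be illustrated with a minimal sketch. This is not ScrAPD's actual parser: the pattern `TIME_RE` and the function `parse_12_hour_time` are hypothetical names introduced here, and the accepted spellings of a.m./p.m. are an assumption.

```python
import datetime
import re

# Hypothetical pattern (not taken from ScrAPD): hour, optional minutes,
# then a.m./p.m. written with or without periods.
TIME_RE = re.compile(r'(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\.?', re.IGNORECASE)


def parse_12_hour_time(text):
    """Return a datetime.time for strings such as '8 p.m.' or '10:35 a.m.', else None."""
    match = TIME_RE.search(text)
    if not match:
        return None
    hour = int(match.group(1))
    minute = int(match.group(2) or 0)
    if not 1 <= hour <= 12 or minute > 59:
        return None
    # Convert to 24-hour time: 12 a.m. becomes 00:xx, 12 p.m. stays 12:xx.
    hour = hour % 12
    if match.group(3).lower() == 'p':
        hour += 12
    return datetime.time(hour, minute)
```

Making the minutes group optional is what lets a bare hour such as 8 p.m. parse the same way as a full 10:35 a.m. value.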
[2.0.0] - 2019-06-11
- Add retries around functions fetching data from remote sources to increase reliability. #116
- Refactor the functions parsing the fields to simplify their maintenance and improve the overall quality of the parsing.
- Remove LXML as a dependency. #117 #132 #136 #138
- Dates are stored internally as date objects instead of strings. The formatting is delegated to the Formatters themselves. #125
- Remove GSheet support. #133
- Fix parsing of fatalities without date of birth. #125
- Fix parsing names with generation suffixes. #137
- Skip fields with empty values. #139
- Fix parsing short ethnicities in deceased fields. #140
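The idea of storing dates as date objects and leaving the rendering to each formatter can be sketched as follows. This only illustrates the pattern and is not ScrAPD's Formatters code: the record, the function names `to_json` and `to_csv_row`, and the output formats chosen for each are assumptions.

```python
import datetime
import json

# Hypothetical record: the date is kept as a date object, not a string.
record = {'id': 'example', 'date': datetime.date(2019, 6, 11)}


def to_json(record):
    """Serialize a record to JSON, rendering dates as ISO 8601 (assumed format)."""
    def encode(value):
        if isinstance(value, datetime.date):
            return value.isoformat()
        return str(value)
    return json.dumps(record, default=encode, sort_keys=True)


def to_csv_row(record):
    """Render a record as a list of strings, using MM/DD/YYYY for dates (assumed format)."""
    return [
        value.strftime('%m/%d/%Y') if isinstance(value, datetime.date) else str(value)
        for value in record.values()
    ]
```

Because the record holds a real date object, each output format decides how to print it, and date comparisons or filtering do not depend on any string format.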
[1.5.1] - 2019-04-25
- Fix gender values with mixed character cases.
- Fix fetch_text() by adding retry ability with exponential backoff. #85
- Fix incorrect name parsing when a nickname or a middle name was specified. #84
- Fix and improve the Deceased field parsing. The parser now supports pipe- and space-delimited fields. #90
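Retrying with exponential backoff, as mentioned for fetch_text() above, can be sketched with the standard library alone. This is a generic illustration under stated assumptions, not the project's implementation: the decorator name `retry` and its parameters (`max_attempts`, `base_delay`, `exceptions`, `sleep`) are hypothetical.

```python
import functools
import time


def retry(max_attempts=3, base_delay=1.0, exceptions=(Exception,), sleep=time.sleep):
    """Retry the decorated function, doubling the wait after each failure.

    Hypothetical helper for illustration; all parameter names are assumptions.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Give up and re-raise once the last attempt has failed.
                    if attempt == max_attempts - 1:
                        raise
                    # Wait base_delay, then 2x, 4x, ... before the next attempt.
                    sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator
```

Passing `sleep` as a parameter keeps the helper easy to test, since a test can record the delays instead of actually waiting.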
[1.5.0] - 2019-03-19
- Fix inconsistent date formats. All the new date fields now follow the English format MM/DD/YYYY. #57
- Build the Docker image with the right version of ScrAPD. #61
- Fix the parsing of the Notes. #60, #64
- Remove duplicate fatality entries. #66
[1.4.2] - 2019-03-03
- Improve fatality page detection. #47
[1.4.1] - 2019-02-23
- Fix Changelog and documentation. #46
[1.4.0] - 2019-02-23
- Improved date parsing and date manipulation operations. #45
- Simplify the tool by removing the retrieve subcommand. ScrAPD is a tool to do exactly that, so there is no need for subcommands.
- Fix incorrect date filtering condition. #44
- Improve regex to detect fatality links. #44
- Improve Twitter description parsing. #44
[1.3.0] - 2019-02-16
- Add a Docker image to run ScrAPD from a container. #38
[1.2.0] - 2019-02-09
- Fix issue where scrapd was retrieving unnecessary data. #28
[1.1.0] - 2019-01-25
- Add the Notes column to the csv output. #13
- Add feature tests to be able to validate scenarios in a manner that reflects the user interaction with the software. #14
- Add CircleCI jobs to automatically publish a new release on PyPI. #23
- Fix incorrect package metadata. #20
[1.0.0] - 2019-01-21
Initial release.
This first version allows a user to retrieve traffic fatality reports for a certain period of time and export the results as csv, json or python.