Fix English and Spanish event pairing in Metro scraper #304

Merged (2 commits) on Dec 3, 2019

hancush (Collaborator) commented Dec 2, 2019

Description

Connects Metro-Records/la-metro-councilmatic#393 and opencivicdata/python-legistar-scraper#100.

The Metro scraper pairs English and Spanish Legistar events. During a partial scrape, one language's event may be included without its partner. We have logic to handle this case; however, it decides whether a scrape is partial by looking for since_datetime among the keyword arguments. That check never succeeded because we pass this value as a positional argument in the python-legistar library, so the scraper never recognized partial scrapes and never looked for missing partners.
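To illustrate the failure mode described above, here is a minimal sketch. The function name `is_partial_scrape` and its body are hypothetical stand-ins, not the scraper's actual code; the point is only that a check on keyword arguments cannot see a value passed positionally.

```python
from datetime import datetime


def is_partial_scrape(*args, **kwargs):
    """Hypothetical stand-in for a check that inspects only keyword arguments."""
    # A value passed positionally lands in args, so this lookup never finds it.
    return kwargs.get('since_datetime') is not None


since = datetime(2019, 12, 1)

# Positional call: the check misses the value and treats the scrape as full.
positional_result = is_partial_scrape(since)

# Keyword call: the check finds the value and treats the scrape as partial.
keyword_result = is_partial_scrape(since_datetime=since)
```

Under these assumptions, the positional call reports a full scrape even though a cutoff was supplied, which matches the symptom of missing partners never being looked up.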

This PR:

  • Passes since_datetime as a keyword argument to events in the scrape method. This isn't technically necessary, but I thought it would be nice for consistency. The actual repair is made in opencivicdata/python-legistar-scraper#100 ("Pass since_datetime as a keyword argument"), so these PRs should be merged and deployed together.
  • Repairs the datetime comparison to check whether an event falls after Spanish audio was introduced. This code path was not exercised before, hence we never saw an error.
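The PR text does not show the original comparison, so the following is only a sketch of one common way such a check can fail in Python: comparing a date string directly against a datetime raises a TypeError, which stays hidden until the code path actually runs. The cutoff value and the helper name `falls_after_spanish_audio` are assumptions for illustration, not values from the scraper.

```python
from datetime import datetime

# Hypothetical cutoff; the real date Spanish audio was introduced is not stated here.
SPANISH_AUDIO_START = datetime(2018, 1, 1)


def falls_after_spanish_audio(event_date):
    """Return True if the event falls after the (assumed) cutoff.

    Accepts either a datetime or an ISO 8601 string, and normalizes
    strings to datetime so both sides of the comparison share a type.
    """
    if isinstance(event_date, str):
        event_date = datetime.fromisoformat(event_date)
    return event_date > SPANISH_AUDIO_START


recent = falls_after_spanish_audio('2019-12-02T09:00:00')
older = falls_after_spanish_audio(datetime(2017, 5, 1))
```

Normalizing the type before comparing is one design choice; the actual fix in this PR may differ.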

Test instructions

  • Pull down and check out this branch.
  • Pull down and check out the patch/hec/pass-kwarg branch in python-legistar-scraper.
  • In your scraper directory, perform an editable install of your local version of python-legistar: pip install -e path/to/python-legistar.
  • Run the Metro event scrape and confirm that it completes successfully: pupa update lametro events window=0.25 --fastmode --rpm=0

@hancush changed the title from "Repair datetime comparison, use pass since_datetime as a keyword argument" to "Repair datetime comparison, pass since_datetime as a keyword argument" on Dec 2, 2019
@hancush changed the title from "Repair datetime comparison, pass since_datetime as a keyword argument" to "Fix English and Spanish event pairing in Metro scraper" on Dec 2, 2019
@hancush requested a review from fgregg on December 2, 2019 20:32
@hancush merged commit cd1877e into master on Dec 3, 2019
@hancush deleted the patch/hec/metro/fix-comparison branch on January 28, 2022 16:58