Campos dos Goytacazes-RJ spider refactor
The previous spider implementation assumed there could be at most one file_url per day for each is_extra_edition value, which does not always hold.

This refactoring collects every file published for each combination of day and is_extra_edition value.
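The grouping described above could look roughly like the sketch below. It is not the spider's actual code: the function name `group_file_urls` and the input shape (tuples of date, is_extra_edition, file_url) are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def group_file_urls(entries):
    """Group file URLs by (date, is_extra_edition).

    `entries` is a hypothetical iterable of
    (gazette_date, is_extra_edition, file_url) tuples.
    """
    grouped = defaultdict(list)
    for gazette_date, is_extra_edition, file_url in entries:
        key = (gazette_date, is_extra_edition)
        # Keep first-seen order and skip exact duplicate URLs.
        if file_url not in grouped[key]:
            grouped[key].append(file_url)
    return dict(grouped)


# Example input with two regular files and one extra-edition file on the same day.
entries = [
    (date(2022, 10, 7), False, "https://example.org/a.pdf"),
    (date(2022, 10, 7), False, "https://example.org/b.pdf"),
    (date(2022, 10, 7), True, "https://example.org/extra.pdf"),
]
grouped = group_file_urls(entries)
```

With this shape, each key would map to one gazette item carrying a list of file URLs instead of a single one.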

We now handle the text format used by Saturday gazettes so that they are flagged as is_extra_edition.

We also added start_date and end_date handling, and we set edition_number when applicable.
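As a rough illustration of start_date and end_date handling, a spider can skip any gazette whose date falls outside the requested window. The helper name `in_date_range` is hypothetical, and treating both bounds as inclusive and optional is an assumption of this sketch, not something the commit states.

```python
from datetime import date


def in_date_range(gazette_date, start_date=None, end_date=None):
    """Return True if gazette_date lies within [start_date, end_date].

    A bound set to None is treated as open (assumption for this sketch).
    """
    if start_date is not None and gazette_date < start_date:
        return False
    if end_date is not None and gazette_date > end_date:
        return False
    return True


# Keep only the dates inside the requested window.
dates = [date(2022, 9, 30), date(2022, 10, 1), date(2022, 10, 9)]
kept = [d for d in dates if in_date_range(d, date(2022, 10, 1), date(2022, 10, 9))]
```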

resolve okfn-brasil#637
Alexandre Harano committed Oct 9, 2022
1 parent 40e9c3b commit 2f22c56
Showing 2 changed files with 402 additions and 62 deletions.