Cleanup Harvesting Logic Repo #4502

Closed
1 task done
rshewitt opened this issue Oct 19, 2023 · 1 comment
Labels
H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0

rshewitt commented Oct 19, 2023

User Story

In order for the harvesting logic repo to be only a Python module for the ETVL of dcatus catalogs, datagov wants to remove all files, functionality, tests, and dependencies that don't involve the ETVL of a dcatus catalog.

Acceptance Criteria

Background

  • datagov has expressed interest in replacing the current harvester with a workflow orchestration system (e.g. Airflow or Windmill) that uses a Python module for the ETVL of dcatus catalogs.
  • One of the first steps in this effort was creating a harvesting logic repo containing general-purpose logic for the ETVL of the datasets we maintain. At the moment this repo focuses only on dcatus catalogs.
  • Over time, the harvesting logic repo has become bloated with unneeded content (e.g. Postgres interfaces, a Flask controller, an Airflow implementation, S3 loading functionality, and Docker services).

Sketch

  • Remove everything related to the unneeded content mentioned in the Background.
@rshewitt rshewitt added the H2.0/Harvest-General General Harvesting 2.0 Issues label Oct 19, 2023
@rshewitt rshewitt self-assigned this Oct 19, 2023
rshewitt commented

Remove and commit things by component so they're easier to identify in the future, and include those commits in the PR (e.g. remove s3 help module, remove sqlalchemy module, remove flask controller, etc.).
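
One way to confirm that a component is fully gone after its removal commit is to scan the remaining Python files for imports of the packages that component depended on. The following is only a rough sketch under assumptions: the package names in `REMOVED_PACKAGES` and the helpers `find_leftover_imports` and `scan_repo` are hypothetical and are not taken from the harvesting logic repo.

```python
import ast
from pathlib import Path

# Hypothetical top-level packages tied to the removed components.
# These names are assumptions for illustration, not taken from the repo.
REMOVED_PACKAGES = {"flask", "sqlalchemy", "boto3", "airflow", "psycopg2"}


def find_leftover_imports(source: str, removed=REMOVED_PACKAGES):
    """Return sorted (line, package) pairs for imports of removed packages."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from pkg.sub import x"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in removed:
                hits.append((node.lineno, top))
    return sorted(hits)


def scan_repo(root: str):
    """Map each Python file under root to its leftover imports, if any."""
    report = {}
    for path in Path(root).rglob("*.py"):
        hits = find_leftover_imports(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_repo(".")` after each removal commit and getting an empty dict for that component's packages would suggest the related dependency can also be dropped. It only catches static imports, so dynamic imports and non-Python files (e.g. Docker or requirements files) would still need a manual check.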

@btylerburton btylerburton moved this to 🏗 In Progress [8] in data.gov team board Oct 19, 2023
@rshewitt rshewitt moved this from 🏗 In Progress [8] to 👀 Needs Review [2] in data.gov team board Oct 24, 2023
@rshewitt rshewitt moved this from 👀 Needs Review [2] to ✔ Done in data.gov team board Oct 30, 2023
@hkdctol hkdctol closed this as completed Nov 9, 2023
@hkdctol hkdctol moved this from ✔ Done to 🗄 Closed in data.gov team board Nov 9, 2023
@btylerburton btylerburton added H2.0/Harvest-Runner Harvest Source Processing for Harvesting 2.0 and removed H2.0/Harvest-General General Harvesting 2.0 Issues labels Jan 10, 2024