
[4/4] Retrofit of the pg / s3 helpers, macros etc. #173

Merged
7 commits merged on Dec 14, 2023

Commits on Dec 14, 2023

  1. fix(bug) : Typo in profiles.yml

    'pass' cannot be used; the key is 'password'. I don't entirely understand
    how it could have worked so far.
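    For reference, a minimal sketch of the corrected profiles.yml block,
    assuming a standard dbt Postgres target (host, names and credentials are
    placeholders, not the repo's actual values):

    ```yaml
    default:
      target: dev
      outputs:
        dev:
          type: postgres
          host: localhost
          port: 5432
          user: dbt_user
          password: "{{ env_var('POSTGRES_PASSWORD') }}"  # was `pass:` before this fix
          dbname: datawarehouse
          schema: public
    ```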
    vperron committed Dec 14, 2023 (3fa7dfb)
  2. chore(dags) : Retrofit pg & s3 helper in import_sources

    TODOs:
    - rewrite settings.py to use a DataSource class defining HTTP
      extractors and loaders, streams, schedule intervals, etc.
      (see the sketch at the end of this message)
    - split the DAG.
    - place mediation numerique somewhere else.
    
    About the tests:
    - split the tests that should run on CI from the others.
    - testing the DAGs themselves would require:
      * a dedicated test database for the duration of the run
      * some cleanup before and after
      * etc.
      ==> "just running" pytest without orchestration has no chance of
      working, so the tests run on CI should be split from the rest.
    
    Don't forget to re-run the DAGs before and after the changes and check
    whether the sources or the data warehouse have changed...
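    For illustration, a minimal sketch of what the DataSource class from the
    TODO list above could look like; every name and field here is an
    assumption, not the repository's actual API:

    ```python
    # Hypothetical DataSource: each source declares how it is extracted, which
    # streams it exposes and how often it should run. Purely illustrative.
    from dataclasses import dataclass, field
    from typing import Callable, Optional


    @dataclass
    class DataSource:
        id: str                         # e.g. "mediation-numerique"
        schedule_interval: str          # Airflow cron or preset, e.g. "@daily"
        streams: list[str] = field(default_factory=list)
        # extractor fetches the raw payload for a given stream (HTTP call, file read, ...)
        extractor: Optional[Callable[[str], bytes]] = None
        # loader writes that raw payload to the datalake / warehouse
        loader: Optional[Callable[[str, bytes], None]] = None


    # settings.py would then only hold the list of sources, and a DAG factory
    # would iterate over it instead of hard-coding one block per source.
    SOURCES = [
        DataSource(id="example-source", schedule_interval="@daily",
                   streams=["structures"]),
    ]
    ```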
    vperron committed Dec 14, 2023 (acbbeea)
  3. df0087f
  4. chore(dags) : Use pg helpers in the other DAGs.

    Some of the DAGs (for instance all the INSEE ones)
    could maybe be combined?

    I am a little surprised that those DAGs don't have
    the datalake layer. I agree they are mostly
    "sources" in the DI sense, but technically it
    would be nice to be able to reproduce everything
    from a single S3 dump.

    Maybe that's too much? I still think I would sleep
    better at night if the Extract + Load step was always
    the same (see the sketch below).
    
    In that case we could still separate the subfolders:
    - di-sources
    - seeds
    - you name it
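    As a rough illustration of the "Extract + Load is always the same" idea,
    here is a hedged sketch where every source is first dumped raw to S3 (the
    datalake layer) and the warehouse table is reloaded from that dump only.
    It assumes a boto3 S3 client and a psycopg2 connection; none of the helper
    names come from the repo:

    ```python
    import json


    def extract_to_datalake(s3_client, bucket: str, source_id: str, payload: bytes) -> str:
        """Store the raw extraction in S3 so it can always be replayed later."""
        key = f"raw/{source_id}/latest.json"
        s3_client.put_object(Bucket=bucket, Key=key, Body=payload)  # boto3 call
        return key


    def load_from_datalake(s3_client, pg_conn, bucket: str, key: str, table: str) -> None:
        """Reload the warehouse table from the S3 dump alone, without re-extracting."""
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = json.loads(body)
        with pg_conn.cursor() as cur:  # psycopg2-style connection assumed
            cur.execute(f"TRUNCATE {table}")
            for row in rows:
                cur.execute(f"INSERT INTO {table} (data) VALUES (%s)", [json.dumps(row)])
        pg_conn.commit()
    ```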
    vperron committed Dec 14, 2023 (0b7f3d0)
  5. chore(dbt) : Retrofit the stg_source_header macro

    In the name of DRY.
    vperron committed Dec 14, 2023 (69df199)
  6. d6cf8a4
  7. b394b01