Output Parquet files as well as SQLite in PUDL ETL #3296

Merged 68 commits on Feb 6, 2024

Commits on Jan 6, 2024

  1. 8b4fed4

Commits on Jan 10, 2024

  1. b1ac18a
  2. Modify PudlSqliteIOManager to support parquet.

    Use pyarrow.parquet for writing/reading, add BaseSettings to
    configure input/output behaviors. By default, let both modes
    be disabled but allow overrides via PUDL_WRITE_TO_PARQUET and
    PUDL_READ_FROM_PARQUET env variables.
    rousik committed Jan 10, 2024
    0a78a94
  3. c9aa6f0
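
Commit 0a78a94 describes two parquet modes that are off by default and can be switched on through the PUDL_WRITE_TO_PARQUET and PUDL_READ_FROM_PARQUET environment variables. The sketch below illustrates that behavior only; it is not the code from the PR. The commit message says the real change uses BaseSettings, while this sketch swaps in a standard-library dataclass to stay dependency-free. The class name `ParquetIOSettings`, the helper `_env_flag`, and the set of accepted truthy strings are assumptions made for illustration.

```python
import os
from dataclasses import dataclass, field

# Assumption: which strings count as "enabled". The PR may parse values differently.
_TRUTHY = {"1", "true", "yes", "on"}


def _env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment, falling back to a default."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


@dataclass(frozen=True)
class ParquetIOSettings:
    """Hypothetical settings holder: both modes are disabled unless overridden."""

    # The environment is consulted when the instance is created, not at import time.
    write_to_parquet: bool = field(
        default_factory=lambda: _env_flag("PUDL_WRITE_TO_PARQUET")
    )
    read_from_parquet: bool = field(
        default_factory=lambda: _env_flag("PUDL_READ_FROM_PARQUET")
    )
```

With neither variable set, `ParquetIOSettings()` has both flags off. Exporting `PUDL_WRITE_TO_PARQUET=true` before constructing it turns on writing only, and an explicit keyword argument such as `ParquetIOSettings(read_from_parquet=True)` bypasses the environment lookup for that field.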

Commits on Jan 11, 2024

  1. Fix the export of pandas/parquet.

    Conversion to pyarrow table was necessary before writing to parquet.
    rousik committed Jan 11, 2024
    b6136c7
  2. 83492a0

Commits on Jan 16, 2024

  1. 83650d5
  2. 39b72e5
  3. 9043255

Commits on Jan 17, 2024

  1. 4ab92f2
  2. c49d129
  3. afed63d

Commits on Jan 18, 2024

  1. Rename pudl_sqlite_io_manager to pudl_io_manager

    This makes more sense as the io manager should be format independent.
    rousik committed Jan 18, 2024
    d88c1ef
  2. 4ab01d5
  3. cd98379
  4. Refactor PudlSQLiteIOManager tests to use unittest.

    This encapsulation makes it much nicer than passing around fixtures.
    rousik committed Jan 18, 2024
    a400446
  5. df5e12d
  6. c3d4bb3
  7. 9081388
  8. 273741a

Commits on Jan 24, 2024

  1. 06571ac

Commits on Jan 25, 2024

  1. cd2dcd0
  2. c7ce623
  3. ab66c2a
  4. Revert "Refactor PudlSQLiteIOManager tests to use unittest."

    This reverts commit a400446.
    zschira committed Jan 25, 2024
    013f4a5
  5. 3408333
  6. Fix docs build failure

    zschira committed Jan 25, 2024
    91b4ff9
  7. 01ae240

Commits on Jan 26, 2024

  1. Update src/pudl/io_managers.py

    Co-authored-by: Zane Selvans <zane.selvans@catalyst.coop>
    zschira and zaneselvans committed Jan 26, 2024
    21e0dbf
  2. ee2a533
  3. 2fceea6
  4. 1cf1fbd
  5. dd6e395
  6. 192a569
  7. Remove unused argument

    zschira committed Jan 26, 2024
    a27c5f1
  8. f47a9d1
  9. Merge branch 'parquet_outputs' of github.com:catalyst-cooperative/pudl into parquet_outputs

    zschira committed Jan 26, 2024
    a961480

Commits on Jan 29, 2024

  1. 46dd5a2
  2. e8ebdd7
  3. 06aa9fe
  4. 5b82d91
  5. aaa46cb

Commits on Jan 30, 2024

  1. Update src/pudl/etl/check_foreign_keys.py

    Co-authored-by: Zane Selvans <zane.selvans@catalyst.coop>
    zschira and zaneselvans committed Jan 30, 2024
    0492659
  2. Remove redundant log

    Co-authored-by: Zane Selvans <zane.selvans@catalyst.coop>
    zschira and zaneselvans committed Jan 30, 2024
    e5825fb
  3. 6fd541e
  4. cc3644a
  5. Merge branch 'parquet_outputs' of github.com:catalyst-cooperative/pudl into parquet_outputs

    zschira committed Jan 30, 2024
    7269bf7
  6. cf1d5c5
  7. f9ed62b
  8. f2c9f25
  9. 41eb6d3

Commits on Jan 31, 2024

  1. Revert "Change how dtype handling is done for parquet reads"

    This reverts commit 41eb6d3.
    zschira committed Jan 31, 2024
    27555ae
  2. 827dc8b

Commits on Feb 1, 2024

  1. 16b4585
  2. b54e915
  3. 5b1a6f5
  4. ef74271
  5. 4ed6f90
  6. 53a3fd4
  7. 40e58a6

Commits on Feb 2, 2024

  1. b922919

Commits on Feb 5, 2024

  1. d59f795
  2. 5fffc65

Commits on Feb 6, 2024

  1. 16fb36d
  2. 6c23e79
  3. 1aa55b9
  4. 6ca051b
  5. 0883831