@hua7450 hua7450 commented Dec 13, 2025

Fixes #6962

Summary

Rebalances test workloads across the parallel CI jobs to reduce total CI runtime.

Changes

Job restructuring

  • Move NY state tests from States → Baseline job (NY was a major bottleneck)
  • Move Python tests from Structural → States job
  • Split contrib/states into per-state batches to prevent memory accumulation (RI alone uses 5.2 GB)
  • Rename jobs to reflect actual content:
    • States (excl NY) & Python
    • Baseline (incl NY) & Reform
    • Structural (States) — each state runs in own subprocess
    • Structural (Other) — 7 memory-balanced batches including congress

Results

| Job | Before | After | Delta |
|---|---|---|---|
| States (excl NY) & Python | 23m 59s | ~8m 10s | -15m 49s |
| Baseline (incl NY) & Reform | 9m 22s | ~12m 43s | +3m 21s |
| Structural (States) | 15m 17s | ~15m | ~same |
| Structural (Other) | 20m 4s | ~13m | -7m |

Total CI time (slowest job): ~24m → ~15m = ~9 min faster
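
Because the jobs run in parallel, total CI time is set by the slowest job rather than by the sum. A minimal sketch of that arithmetic, using the durations from the Results table (the approximate "after" values are taken at face value; the helper name `to_seconds` is only illustrative):

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds duration to whole seconds."""
    return minutes * 60 + seconds


# Durations copied from the Results table ("after" values are approximate).
before = {
    "States (excl NY) & Python": to_seconds(23, 59),
    "Baseline (incl NY) & Reform": to_seconds(9, 22),
    "Structural (States)": to_seconds(15, 17),
    "Structural (Other)": to_seconds(20, 4),
}
after = {
    "States (excl NY) & Python": to_seconds(8, 10),
    "Baseline (incl NY) & Reform": to_seconds(12, 43),
    "Structural (States)": to_seconds(15),
    "Structural (Other)": to_seconds(13),
}

# Parallel jobs: wall-clock time is the maximum, not the sum.
wall_before = max(before.values())
wall_after = max(after.values())
saved = wall_before - wall_after

slowest_before = max(before, key=before.get)
slowest_after = max(after, key=after.get)
```

The bottleneck moves from States (excl NY) & Python to Structural (States), so further speedups would have to come from that job.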


CI Job Breakdown

1. States (excl NY) & Python

| Batch | Path | Notes |
|---|---|---|
| 1 | policyengine_us/tests/policy/baseline/gov/states/ | --exclude ny (46 states + tax/) |
| 2 | pytest policyengine_us/tests/ | All Python tests |

States included: al, ar, az, ca, co, ct, dc, de, ga, hi, ia, id, il, in, ks, ky, la, ma, md, me, mi, mn, mo, ms, mt, nc, nd, ne, nh, nj, nm, oh, ok, or, pa, ri, sc, tx, ut, va, vt, wa, wi, wv + tax/


2. Baseline (incl NY) & Reform

| Batch | Path | Notes |
|---|---|---|
| 1 | policyengine_us/tests/policy/baseline/gov/states/ny/ | NY only |
| 2 | policyengine_us/tests/policy/baseline/ | --exclude states |
| 3 | policyengine_us/tests/policy/baseline/household/ | |
| 4 | policyengine_us/tests/policy/baseline/contrib/ | |
| 5 | policyengine_us/tests/policy/reform/ | ctc_expansion.yaml, winship.yaml |

Baseline (excl states) includes: calcfunctions, contrib, gov (everything except states: aca, doe, ed, fcc, hhs, hud, irs, local, simulation, ssa, tax, territories, usda), income, parameters


3. Structural (States) — Per-State Batches

Each state folder runs in its own subprocess, with garbage collection between runs. This prevents the memory accumulation that caused CI failures whenever a new state folder was added.

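| Batch | State | Peak Memory |
|---|---|---|
| 1 | dc | 1.7 GB |
| 2 | de | 3.3 GB |
| 3 | mi | 1.3 GB |
| 4 | mn | 1.7 GB |
| 5 | mt | 3.0 GB |
| 6 | ny | 2.1 GB |
| 7 | or | 2.5 GB |
| 8 | ri | 5.2 GB |
| 9 | ut | 1.7 GB |
| 10+ | new states | auto-added |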

New state folders are automatically picked up as separate batches.
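
The per-state batching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script from this PR: the function names are hypothetical, and the test command is a placeholder (`pytest` on the folder) that would need to be swapped for whatever runner the repository actually uses for these tests. Each folder runs in a fresh child process, so its memory is returned to the OS when that process exits.

```python
import subprocess
import sys
from pathlib import Path


def discover_state_batches(root: Path) -> list[Path]:
    """Return one batch per immediate subfolder of root, sorted by name.

    Newly added state folders are picked up without any config change.
    """
    return sorted(p for p in root.iterdir() if p.is_dir())


def build_command(folder: Path) -> list[str]:
    """Build the test command for one folder.

    Placeholder (assumption): replace with the project's real test runner.
    """
    return [sys.executable, "-m", "pytest", str(folder)]


def run_batches(root: Path, runner=subprocess.run) -> dict[str, int]:
    """Run every state folder in its own subprocess.

    Returns a mapping of folder name to exit code. The runner argument
    exists so the function can be exercised without launching real tests.
    """
    results: dict[str, int] = {}
    for folder in discover_state_batches(root):
        completed = runner(build_command(folder), check=False)
        results[folder.name] = completed.returncode
    return results


def failed_batches(results: dict[str, int]) -> list[str]:
    """Names of the batches whose subprocess exited non-zero."""
    return sorted(name for name, code in results.items() if code != 0)
```

A CI step could call `run_batches(Path(...))` on the structural states folder and exit non-zero if `failed_batches` returns anything.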


4. Structural (Other) — 7 Memory-Balanced Batches

| Batch | Folders/Files | Memory |
|---|---|---|
| 1 | federal, harris, treasury | ~9.0 GB |
| 2 | ctc, snap_ea, ubi_center | ~8.6 GB |
| 3 | deductions, aca, snap | ~8.1 GB |
| 4 | tax_exempt, eitc, state_dependent_exemptions, additional_tax_bracket | ~8.0 GB |
| 5 | local, reconciliation, dc_single_joint_threshold_ratio.yaml, dc_kccatc.yaml, reported_state_income_tax.yaml + new folders | ~7.8 GB |
| 6 | crfb | ~8.9 GB |
| 7 | congress | ~6.3 GB |
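
The PR does not say how the batches above were derived. One common way to get memory-balanced groups is a greedy "largest first" assignment: sort the folders by measured peak memory and repeatedly place the next one into the batch with the smallest running total. The sketch below illustrates that technique only; the function names and the per-folder figures in the usage note are hypothetical, not measurements from this repository.

```python
import heapq


def balance_batches(memory_gb: dict[str, float], n_batches: int) -> list[list[str]]:
    """Greedily assign items to n_batches so batch totals stay close.

    Items are taken largest first (ties broken by name) and each one goes
    to the batch with the smallest current total.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")

    # Min-heap of (current total, batch index).
    heap = [(0.0, i) for i in range(n_batches)]
    heapq.heapify(heap)
    batches: list[list[str]] = [[] for _ in range(n_batches)]

    for name, mem in sorted(memory_gb.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        batches[idx].append(name)
        heapq.heappush(heap, (total + mem, idx))
    return batches


def batch_totals(memory_gb: dict[str, float], batches: list[list[str]]) -> list[float]:
    """Sum the memory of each batch, in batch order."""
    return [sum(memory_gb[name] for name in batch) for batch in batches]
```

For example, with hypothetical sizes `{"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0}` and two batches, this yields `[["a", "d", "e"], ["b", "c"]]` with totals of 8.0 and 7.0. The greedy rule is a heuristic and does not guarantee an optimal split.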

Summary

| CI Job | # Batches |
|---|---|
| States (excl NY) & Python | 2 |
| Baseline (incl NY) & Reform | 5 |
| Structural (States) | 9+ (1 per state) |
| Structural (Other) | 7 |

Test plan

  • All CI jobs pass
  • Memory-based batching verified locally
  • New state folders automatically get their own batch
  • Congress runs as its own batch (~6.3 GB)

@hua7450 hua7450 marked this pull request as ready for review December 13, 2025 04:38
@hua7450 hua7450 marked this pull request as draft December 13, 2025 06:05
hua7450 commented Dec 15, 2025

@PolicyEngine upgrade uv lock

policyengine bot commented Dec 15, 2025

I ran into an issue:

Failed to clone repository: Cloning into '/tmp/policyengine-bot-c9om2gn8/policyengine-us'...
warning: Could not find remote branch hua7450/issue6962 to clone.
fatal: Remote branch hua7450/issue6962 not found in upstream origin

@hua7450 hua7450 marked this pull request as ready for review December 16, 2025 05:17
@hua7450 hua7450 merged commit 6b5a38a into PolicyEngine:main Dec 18, 2025
8 checks passed
Development

Successfully merging this pull request may close these issues.

Experiment: Faster CI timing

2 participants