MVP for Texas groundwater rights extraction #373

Merged
ppinchuk merged 39 commits into main from sp/tx_groundwater
Feb 1, 2026

ppinchuk (Collaborator) commented Feb 1, 2026

@spodgorny9 implemented the initial groundwater rights extraction in ELM, and this is the MVP port into COMPASS. It's still rough around the edges, but Slater and I will discuss next steps with this code and massage it into a good state.

I am pushing this PR now, though, because it also includes the beginnings of the COMPASS framework generalization, which I am eager to implement ASAP.

ppinchuk added this to the New technologies milestone Feb 1, 2026
ppinchuk requested a review from castelao as a code owner February 1, 2026 00:16
Copilot AI review requested due to automatic review settings February 1, 2026 00:16
ppinchuk added labels Feb 1, 2026:
  • enhancement: Update to logic or general code improvements
  • new computation: Update that adds a new computation method
  • topic-python-llm: Issues/pull requests related to LLMs
  • p-high: Priority: high

codecov-commenter commented Feb 1, 2026

Codecov Report

❌ Patch coverage is 24.03846% with 316 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.65%. Comparing base (d0dca89) to head (8ef4019).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| compass/extraction/water/graphs.py | 11.25% | 134 Missing ⚠️ |
| compass/extraction/water/processing.py | 19.71% | 57 Missing ⚠️ |
| compass/scripts/process.py | 12.96% | 47 Missing ⚠️ |
| compass/extraction/water/parse.py | 38.66% | 46 Missing ⚠️ |
| compass/extraction/water/ordinance.py | 50.00% | 22 Missing ⚠️ |
| compass/scripts/download.py | 0.00% | 5 Missing ⚠️ |
| compass/validation/content.py | 37.50% | 5 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (24.03%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #373      +/-   ##
==========================================
- Coverage   56.23%   53.65%   -2.58%     
==========================================
  Files          45       49       +4     
  Lines        4316     4695     +379     
  Branches      395      416      +21     
==========================================
+ Hits         2427     2519      +92     
- Misses       1860     2148     +288     
+ Partials       29       28       -1     
| Flag | Coverage Δ |
|---|---|
| unittests | 53.65% <24.03%> (-2.58%) ⬇️ |
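
The figures in this report can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check, not how Codecov computes anything internally: it assumes project coverage is hits divided by total lines, and that patch coverage is covered patch lines divided by total patch lines. The patch total is not stated in the report, so it is back-solved from the missing count and the reported percentage.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Return line coverage as a percentage."""
    return 100.0 * hits / lines


# Project totals copied from the coverage diff in the report.
base = coverage_pct(hits=2427, lines=4316)
head = coverage_pct(hits=2519, lines=4695)

# Per-file "Missing" counts copied from the report table.
missing_by_file = {
    "compass/extraction/water/graphs.py": 134,
    "compass/extraction/water/processing.py": 57,
    "compass/scripts/process.py": 47,
    "compass/extraction/water/parse.py": 46,
    "compass/extraction/water/ordinance.py": 22,
    "compass/scripts/download.py": 5,
    "compass/validation/content.py": 5,
}
patch_missing = sum(missing_by_file.values())

# Assumption: patch % = covered / total, so total = missing / (1 - patch %).
reported_patch_pct = 24.03846
patch_total = round(patch_missing / (1 - reported_patch_pct / 100))
patch_covered = patch_total - patch_missing

print(f"base {base:.2f}%, head {head:.2f}%, delta {head - base:+.2f}%")
print(f"patch: {patch_covered} of {patch_total} lines covered, "
      f"{patch_missing} missing")
```

Under those assumptions the per-file counts add up to the 316 missing lines quoted in the report, and the base and head percentages match the diff.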

Copilot AI (Contributor) left a comment

Pull request overview

This PR introduces an MVP for Texas groundwater rights extraction and begins generalizing the COMPASS framework. The main changes port groundwater rights extraction logic from ELM into COMPASS while adding extensibility through hook/callback mechanisms to support different extraction workflows beyond the original wind/solar ordinances.

Changes:

  • Generalizes framework terminology (city → subdivision, LegalTextValidator → TextKindValidator) for broader applicability
  • Adds hook/callback system to TechSpec for customizing document processing and data extraction workflows
  • Implements water rights extraction module with RAG-based parsing and Texas water district jurisdictions
  • Updates date handling patterns to use more robust unpacking syntax
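
The hook/callback idea in the list above can be illustrated with a small sketch. This is not the code from `compass/utilities/nt.py`: the `TechSpec` below is a trimmed-down stand-in, and the field names `filter_docs` and `extract`, the `default_extract` fallback, and `run_pipeline` are all hypothetical names chosen for illustration. The point is only that hooks default to `None`, so existing specs keep their behavior while a new workflow can override individual steps.

```python
from typing import Callable, NamedTuple, Optional

Docs = list[str]


class TechSpec(NamedTuple):
    """Hypothetical stand-in for a tech spec with optional hooks."""

    name: str
    # Hook names below are illustrative assumptions, not the real fields.
    filter_docs: Optional[Callable[[Docs], Docs]] = None
    extract: Optional[Callable[[Docs], dict]] = None


def default_extract(docs: Docs) -> dict:
    """Fallback extraction used when a spec supplies no `extract` hook."""
    return {"n_docs": len(docs)}


def run_pipeline(spec: TechSpec, docs: Docs) -> dict:
    """Run the default steps, substituting any hooks the spec defines."""
    if spec.filter_docs is not None:
        docs = spec.filter_docs(docs)
    extract = spec.extract if spec.extract is not None else default_extract
    return {"tech": spec.name, **extract(docs)}


# A spec without hooks goes through the default path unchanged.
wind = TechSpec(name="wind")

# A spec with hooks customizes both filtering and extraction.
water = TechSpec(
    name="water rights",
    filter_docs=lambda docs: [d for d in docs if "groundwater" in d.lower()],
    extract=lambda docs: {
        "n_docs": len(docs),
        "mentions_permit": any("permit" in d.lower() for d in docs),
    },
)

docs = ["Groundwater permit rules", "Setback requirements"]
print(run_pipeline(wind, docs))
print(run_pipeline(water, docs))
```

Keeping the hooks optional means a workflow only pays for the customization it needs, which is one plausible reason to attach them to the spec rather than branching inside the processing script.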

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 11 comments.

| File | Description |
|---|---|
| pyproject.toml | Bumps nlr-elm dependency to 0.0.36 |
| compass/data/tx_water_districts.csv | Adds 98 Texas water conservation districts as new jurisdiction types |
| compass/utilities/nt.py | Extends TechSpec with 5 optional hooks for custom processing workflows |
| compass/utilities/__init__.py | Adds cost entries for new LLM models (egswaterord-gpt4.1-mini, text-embedding-ada-002) |
| compass/utilities/enums.py | Adds EMBEDDING task enum for text embedding operations |
| compass/utilities/jurisdictions.py | Changes jurisdiction loading to support multiple CSV files via registry |
| compass/utilities/parsing.py | Refactors date extraction to use robust unpacking pattern |
| compass/validation/content.py | Generalizes LegalTextValidator into abstract TextKindValidator base class |
| compass/validation/graphs.py | Renames city-specific nodes/prompts to subdivision for broader terminology |
| compass/extraction/apply.py | Inverts check_if_legal_doc logic (was is_legal_doc) for clarity |
| compass/extraction/water/__init__.py | Exports water rights extraction classes and configuration |
| compass/extraction/water/ordinance.py | Implements water rights text collectors and extractors |
| compass/extraction/water/parse.py | Implements structured water parser using decision trees and RAG |
| compass/extraction/water/graphs.py | Defines 16 decision tree graphs for water rights feature extraction |
| compass/extraction/water/processing.py | Implements corpus building, extraction, and data writing hooks |
| compass/scripts/process.py | Integrates water extraction workflow and new hook system |
| compass/scripts/download.py | Refactors content filtering, makes permitted_use_text_collector optional |
| compass/services/threaded.py | Updates date handling to match new pattern |
| tests/python/unit/validation/test_validation_graphs.py | Updates tests for city → subdivision renaming |
| tests/python/unit/validation/test_validation_content.py | Updates tests for validator generalization |
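
The validator generalization listed for `compass/validation/content.py` follows a familiar pattern: pull the shared logic into an abstract base class and leave the "what kind of text is this" decision to subclasses. The sketch below shows that pattern only. The method names `is_kind` and `keep_matching`, the `KeywordLegalTextValidator` subclass, and its keyword check are assumptions made for illustration; the keyword match is a deliberately simple stand-in for whatever check the real validator performs, and none of this mirrors the actual COMPASS class signatures.

```python
from abc import ABC, abstractmethod


class TextKindValidator(ABC):
    """Hypothetical base class: decides whether text is of a given kind."""

    @property
    @abstractmethod
    def kind(self) -> str:
        """Human-readable name of the text kind being validated."""

    @abstractmethod
    def is_kind(self, text: str) -> bool:
        """Return True if `text` looks like this validator's kind."""

    def keep_matching(self, chunks: list[str]) -> list[str]:
        """Shared logic: keep only the chunks that pass `is_kind`."""
        return [chunk for chunk in chunks if self.is_kind(chunk)]


class KeywordLegalTextValidator(TextKindValidator):
    """Illustrative subclass; keyword matching is a stand-in check."""

    kind = "legal text"
    KEYWORDS = ("ordinance", "shall", "pursuant to")

    def is_kind(self, text: str) -> bool:
        lowered = text.lower()
        return any(keyword in lowered for keyword in self.KEYWORDS)


validator = KeywordLegalTextValidator()
chunks = [
    "The district shall issue permits pursuant to Chapter 36.",
    "Join us for the annual water festival!",
]
print(validator.kind, "->", validator.keep_matching(chunks))
```

A second text kind would only need to supply `kind` and `is_kind`; the filtering logic in the base class is reused as-is.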

ppinchuk merged commit dcbafa7 into main Feb 1, 2026
26 checks passed
ppinchuk deleted the sp/tx_groundwater branch February 1, 2026 01:44