
Part 2 of plugin refactor #377

Open
ppinchuk wants to merge 21 commits into main from pp/ordinance_plugin

Conversation

@ppinchuk
Collaborator

@ppinchuk ppinchuk commented Feb 7, 2026

Generalized the plugin classes a bit and added prompt-driven plugin implementations. Also added a plugin registry.
This PR brings us about 90% of the way to the full plugin architecture. It also puts us in a position to easily implement one-shot extraction.

@ppinchuk ppinchuk self-assigned this Feb 7, 2026
Copilot AI review requested due to automatic review settings February 7, 2026 05:16
@ppinchuk ppinchuk requested a review from castelao as a code owner February 7, 2026 05:16
@ppinchuk ppinchuk added labels Feb 7, 2026: enhancement (Update to logic or general code improvements), p-critical (Priority: critical), refactor (Code improvements that do not change functionality), topic-python-general (Issues/pull requests related to python)
@codecov-commenter

Codecov Report

❌ Patch coverage is 60.43360% with 146 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.97%. Comparing base (be35823) to head (155b16e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| compass/plugin/ordinance.py | 48.87% | 105 Missing and 9 partials ⚠️ |
| compass/plugin/interface.py | 39.47% | 19 Missing and 4 partials ⚠️ |
| compass/scripts/process.py | 14.28% | 6 Missing ⚠️ |
| compass/plugin/registry.py | 78.57% | 2 Missing and 1 partial ⚠️ |

❌ Your patch status has failed because the patch coverage (60.43%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
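
As a quick sanity check on the figures above, the per-file counts can be summed and compared against the reported patch percentage. This is only a sketch: the total of 369 patch lines is inferred from the reported numbers and is not stated in the report itself.

```python
# Missing and partial line counts per file, copied from the report above
missing_and_partial = {
    "compass/plugin/ordinance.py": (105, 9),
    "compass/plugin/interface.py": (19, 4),
    "compass/scripts/process.py": (6, 0),
    "compass/plugin/registry.py": (2, 1),
}

# Lines counted as lacking coverage (missing plus partial)
uncovered = sum(m + p for m, p in missing_and_partial.values())

# Reported patch coverage, as a fraction
patch_coverage = 0.6043360

# ASSUMPTION: total patch size is inferred by inverting the percentage
patch_lines = round(uncovered / (1 - patch_coverage))
covered = patch_lines - uncovered

print(uncovered, patch_lines, covered)
print(f"{100 * covered / patch_lines:.5f}%")
```

Under that inference, the per-file counts add up to the 146 uncovered lines quoted in the summary line.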

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #377      +/-   ##
==========================================
+ Coverage   56.63%   56.97%   +0.34%     
==========================================
  Files          55       56       +1     
  Lines        4953     4953              
  Branches      431      446      +15     
==========================================
+ Hits         2805     2822      +17     
+ Misses       2120     2090      -30     
- Partials       28       41      +13     
| Flag | Coverage Δ |
|---|---|
| unittests | 56.97% <60.43%> (+0.34%) ⬆️ |


Contributor

Copilot AI left a comment


Pull request overview

This PR advances COMPASS’s plugin architecture by introducing a centralized plugin registry, refactoring ordinance plugins into more generic base classes, and enabling prompt-driven collector/extractor implementations to reduce duplicated extraction logic across technologies.

Changes:

  • Added a plugin registry (PLUGIN_REGISTRY + register_plugin) and migrated the processing runner to resolve plugins via the registry.
  • Refactored ordinance plugin framework to support prompt-chain collectors/extractors and centralized configuration/validation.
  • Updated threaded cleaned-text writing to be driven by per-plugin file output registration, and adjusted unit tests/docs accordingly.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| compass/plugin/registry.py | Introduces plugin registration and the global plugin registry used by the runner. |
| compass/plugin/ordinance.py | Adds generalized ordinance plugin base + prompt-driven collector/extractor implementations; registers cleaned output filenames. |
| compass/plugin/interface.py | Renames/splits plugin base responsibilities (filtered pipeline base separated from ordinance-specific parsing). |
| compass/plugin/base.py | Adds JURISDICTION_DATA_FP hook and a base validate_plugin_configuration entrypoint. |
| compass/plugin/__init__.py | Re-exports new plugin framework symbols and registry utilities. |
| compass/scripts/process.py | Switches from hardcoded extractor registry to PLUGIN_REGISTRY for tech→plugin resolution. |
| compass/services/threaded.py | Makes cleaned text outputs configurable via CLEANED_FP_REGISTRY keyed by plugin tech. |
| compass/validation/content.py | Extends chunk-validation callback signature to support prompt-driven validation calls. |
| compass/utilities/jurisdictions.py | Removes TX water districts CSV from default registry (now added via plugin registration). |
| compass/extraction/wind/plugin.py | Migrates wind plugin to OrdinanceExtractionPlugin and registers it. |
| compass/extraction/wind/ordinance.py | Converts wind ordinance collectors/extractors to prompt-driven implementations. |
| compass/extraction/solar/plugin.py | Migrates solar plugin to OrdinanceExtractionPlugin and registers it. |
| compass/extraction/solar/ordinance.py | Converts solar ordinance collectors/extractors to prompt-driven implementations. |
| compass/extraction/small_wind/plugin.py | Migrates small-wind plugin to OrdinanceExtractionPlugin and registers it. |
| compass/extraction/small_wind/ordinance.py | Converts small-wind ordinance collectors/extractors to prompt-driven implementations. |
| compass/extraction/water/plugin.py | Registers the TX water rights plugin and supplies its jurisdiction dataset path. |
| compass/extraction/__init__.py | Temporarily imports plugins to force registration via side effects. |
| compass/__init__.py | Temporarily imports plugins at package import time to force registration. |
| docs/source/conf.py | Updates Sphinx crossrefs and adds nitpick ignores for moved/changed symbols. |
| tests/python/unit/services/test_services_threaded.py | Updates cleaned-file writer tests for the new registry-driven output behavior. |
| tests/python/unit/plugin/test_plugin_ordinances.py | Updates ordinance plugin validation tests for the new ordinance plugin base classes. |

Comment on lines +846 to +848
self._validate_collector_prompts()
self._validate_collector_prompts()
self._register_clean_file_names()

Copilot AI Feb 7, 2026


OrdinanceExtractionPlugin.validate_plugin_configuration() calls _validate_collector_prompts() twice, and _validate_collector_prompts is also defined twice (the second definition overwrites the first). As a result, prompts for PromptBasedTextCollector subclasses are never validated, and the second call is redundant. Split these into two distinct methods (e.g., _validate_text_collector_prompts and _validate_text_extractor_prompts) and call each once from validate_plugin_configuration.

Suggested change
-        self._validate_collector_prompts()
-        self._validate_collector_prompts()
-        self._register_clean_file_names()
+        self._validate_text_collector_prompts()
+        self._validate_text_extractor_prompts()
+        self._register_clean_file_names()
+
+    def _validate_text_collector_prompts(self):
+        """Validate prompts for text collectors"""
+        self._validate_collector_prompts()
+
+    def _validate_text_extractor_prompts(self):
+        """Validate prompts for text extractors"""
+        self._validate_collector_prompts()
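
For illustration, here is a minimal sketch of what two genuinely distinct validators might look like, with each one checking its own list of classes. It is not the PR's implementation: `SketchPlugin`, `_check_prompts`, `PromptBasedTextExtractor`, and the `TEXT_EXTRACTORS` attribute are assumptions made for the example, and the exception and collector base class are simplified stand-ins for the COMPASS ones.

```python
class COMPASSPluginConfigurationError(Exception):
    """Stand-in for the COMPASS exception of the same name."""


class PromptBasedTextCollector:
    """Simplified stand-in for the prompt-driven collector base class."""

    PROMPTS = None


class PromptBasedTextExtractor:
    """ASSUMPTION: hypothetical extractor counterpart, for illustration."""

    PROMPTS = None


def _check_prompts(kind, classes, base):
    """Raise if any prompt-based class in `classes` defines no prompts."""
    for cls in classes:
        if not issubclass(cls, base):
            continue  # not prompt-driven, so there is nothing to validate
        if not cls.PROMPTS:
            msg = f"{kind} {cls.__name__} must define at least one prompt"
            raise COMPASSPluginConfigurationError(msg)


class SketchPlugin:
    """Hypothetical plugin showing one validator per class list."""

    TEXT_COLLECTORS = []
    TEXT_EXTRACTORS = []  # ASSUMPTION: attribute name chosen for the sketch

    def validate_plugin_configuration(self):
        # Each validator has its own name, so neither overwrites the other
        self._validate_text_collector_prompts()
        self._validate_text_extractor_prompts()

    def _validate_text_collector_prompts(self):
        _check_prompts(
            "Text collector", self.TEXT_COLLECTORS, PromptBasedTextCollector
        )

    def _validate_text_extractor_prompts(self):
        _check_prompts(
            "Text extractor", self.TEXT_EXTRACTORS, PromptBasedTextExtractor
        )
```

The error message here names the offending collector or extractor class rather than the plugin class, which is a choice made for the sketch.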

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +223
+    fp_names = {
+        "relevant_text": "{jurisdiction} Ordinance Original text.txt",
+        "cleaned_text_for_extraction": "{jurisdiction} Cleaned Text.txt",
+        "districts_text": "{jurisdiction} Districts.txt",
+    }
+
+    CLEANED_FP_REGISTRY["cleaned_file_test"] = fp_names
     outputs = threaded._write_cleaned_file(
-        doc, tmp_path, jurisdiction_name="Sample Jurisdiction"
+        doc,
+        tmp_path,
+        tech="cleaned_file_test",
+        jurisdiction_name="Sample Jurisdiction",

Copilot AI Feb 7, 2026


This test mutates the global CLEANED_FP_REGISTRY without restoring it, which can leak state across tests (especially with re-ordering or xdist). Use monkeypatch to set the registry entry and ensure cleanup (or delete the key at the end of the test).
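
One way to get the cleanup described here, sketched with only the standard library: `unittest.mock.patch.dict` restores the dictionary when the block exits, even if the code inside raises. The `CLEANED_FP_REGISTRY` below is a local stand-in for the real registry, and `write_with_temp_registry` is a hypothetical helper used only to show the pattern.

```python
from unittest import mock

# Stand-in for the module-level registry used by the cleaned-file writer
CLEANED_FP_REGISTRY = {}


def write_with_temp_registry(fp_names):
    """Register `fp_names` only for the duration of the `with` block."""
    entry = {"cleaned_file_test": fp_names}
    with mock.patch.dict(CLEANED_FP_REGISTRY, entry):
        # The code under test would run here and see the temporary entry
        return dict(CLEANED_FP_REGISTRY["cleaned_file_test"])


seen = write_with_temp_registry(
    {"districts_text": "{jurisdiction} Districts.txt"}
)
# The temporary key is gone again once the block has exited
print(seen, "cleaned_file_test" in CLEANED_FP_REGISTRY)
```

Inside a pytest test, `monkeypatch.setitem(CLEANED_FP_REGISTRY, "cleaned_file_test", fp_names)` should give the same automatic restore at teardown.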

Comment on lines +36 to +42
    if plugin_class.JURISDICTION_DATA_FP is not None:
        KNOWN_JURISDICTIONS_REGISTRY.add(plugin_class.JURISDICTION_DATA_FP)

    plugin_instance = plugin_class(None, None)
    plugin_instance.validate_plugin_configuration()

    PLUGIN_REGISTRY[plugin_class.IDENTIFIER.casefold()] = plugin_class

Copilot AI Feb 7, 2026


register_plugin will silently overwrite an existing entry if another plugin registers with the same IDENTIFIER (case-insensitive). Consider raising COMPASSPluginConfigurationError when a duplicate identifier is detected to avoid non-deterministic behavior based on import order.
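
A minimal sketch of the duplicate check, assuming a plain dict registry. The exception class and `PLUGIN_REGISTRY` are stand-ins for the COMPASS ones, and the jurisdiction registration and validation steps of the real `register_plugin` are omitted. Allowing the same class to register twice is a choice made for the sketch, not something the PR specifies.

```python
class COMPASSPluginConfigurationError(Exception):
    """Stand-in for the COMPASS exception of the same name."""


# Stand-in for the global registry; keys are case-folded identifiers
PLUGIN_REGISTRY = {}


def register_plugin(plugin_class):
    """Register a plugin class, rejecting conflicting identifiers."""
    key = plugin_class.IDENTIFIER.casefold()
    existing = PLUGIN_REGISTRY.get(key)

    # ASSUMPTION: re-registering the very same class is treated as a no-op
    if existing is not None and existing is not plugin_class:
        msg = (
            f"Plugin identifier {plugin_class.IDENTIFIER!r} is already "
            f"registered by {existing.__name__}"
        )
        raise COMPASSPluginConfigurationError(msg)

    PLUGIN_REGISTRY[key] = plugin_class
    return plugin_class
```

With this check in place, a second plugin whose identifier differs only by case fails at registration time instead of silently replacing the first.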

Comment on lines +6 to +13
# Temporarily import to register plugins
# Can drop once plugins register themselves
from .extraction import (
    COMPASSWindExtractor,
    COMPASSSolarExtractor,
    COMPASSSmallWindExtractor,
    TexasWaterRightsExtractor,
)

Copilot AI Feb 7, 2026


Importing all extraction plugins in compass.__init__ to force registration adds significant import-time side effects (loads pandas/LLM-related modules, runs register_plugin, mutates registries) for any consumer that imports any compass.* module. Prefer an explicit plugin registration entrypoint (e.g., compass.plugin.discover_plugins() called by the CLI) or lazy registration to keep library imports lightweight and reduce circular-import risk.
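
A rough sketch of such an explicit entrypoint, built on `importlib.import_module`. The function name `discover_plugins` comes from the suggestion above and does not exist in the PR. The default module paths are derived from the file list in this review, and the sketch assumes that importing each module is what triggers its registration.

```python
import importlib

# Module paths taken from the changed-file list; importing each one is
# ASSUMED to run its plugin registration code
DEFAULT_PLUGIN_MODULES = (
    "compass.extraction.wind.plugin",
    "compass.extraction.solar.plugin",
    "compass.extraction.small_wind.plugin",
    "compass.extraction.water.plugin",
)

# Names already imported through this entrypoint
_loaded = set()


def discover_plugins(module_names=DEFAULT_PLUGIN_MODULES):
    """Import each plugin module once and return the newly loaded names."""
    newly_loaded = []
    for name in module_names:
        if name in _loaded:
            continue  # already handled by an earlier call
        importlib.import_module(name)
        _loaded.add(name)
        newly_loaded.append(name)
    return newly_loaded
```

A CLI entrypoint could call `discover_plugins()` once at startup, so that importing `compass` as a library would no longer load every plugin.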

Comment on lines +934 to +956
        """Validate that all text collectors have prompts defined"""

        for collector in self.TEXT_COLLECTORS:
            if not issubclass(collector, PromptBasedTextCollector):
                continue
            try:
                num_prompts = len(collector.PROMPTS)
            except NotImplementedError:
                msg = (
                    f"Text collector {self.__class__.__name__} is missing "
                    "required property 'PROMPTS'"
                )
                raise COMPASSPluginConfigurationError(msg) from None

            if num_prompts == 0:
                msg = (
                    f"Text collector {self.__class__.__name__} has an empty "
                    "'PROMPTS' property! Please provide at least one prompt "
                    "dictionary."
                )
                raise COMPASSPluginConfigurationError(msg)

    def _validate_collector_prompts(self):

Copilot AI Feb 7, 2026


This assignment to '_validate_collector_prompts' is unnecessary as it is redefined before this value is used.

Suggested change
-        """Validate that all text collectors have prompts defined"""
-        for collector in self.TEXT_COLLECTORS:
-            if not issubclass(collector, PromptBasedTextCollector):
-                continue
-            try:
-                num_prompts = len(collector.PROMPTS)
-            except NotImplementedError:
-                msg = (
-                    f"Text collector {self.__class__.__name__} is missing "
-                    "required property 'PROMPTS'"
-                )
-                raise COMPASSPluginConfigurationError(msg) from None
-            if num_prompts == 0:
-                msg = (
-                    f"Text collector {self.__class__.__name__} has an empty "
-                    "'PROMPTS' property! Please provide at least one prompt "
-                    "dictionary."
-                )
-                raise COMPASSPluginConfigurationError(msg)
-    def _validate_collector_prompts(self):
