Fix AdCP schema caching: Replace time-based with ETag validation#396
Merged
Fix AdCP schema caching: Replace time-based with ETag validation#396
Conversation
## Problem 24-hour time-based cache was inappropriate for rapidly evolving AdCP spec: - Assumed schemas valid for 24 hours (wrong during active development) - Couldn't detect changes within cache window - Led to persistent stale schemas causing test failures - budget.json schema persisted in cache despite not existing in official spec ## Solution: ETag-Based HTTP Conditional GET Replaced time-based caching with server-driven validation: - Always checks with server using If-None-Match header - Server returns 304 Not Modified if unchanged (use cache) - Server returns 200 OK if changed (download new version) - Falls back to cache if server unavailable (resilient) ### Key Changes 1. **Schema Validator** (tests/e2e/adcp_schema_validator.py): - Deprecated time-based _is_cache_valid() logic - Implemented ETag-based conditional GET in download methods - Added metadata file tracking (.meta files store ETags) - Graceful fallback to cache on network errors 2. **Schema Refresh Tool** (scripts/refresh_adcp_schemas.py): - Clean slate approach: deletes all cached schemas before refresh - Handles both .json and .meta files - Verifies no outdated references remain (e.g., budget.json) 3. **Budget Format Fix** (tests/e2e/adcp_request_builder.py): - Fixed top-level budget: plain number (was object) - Fixed package budget: plain number (was object) - Matches official AdCP spec (currency from pricing_option_id) 4. **Cache Metadata** (.gitignore): - Ignore .meta files (local ETag cache metadata) ### Benefits ✅ Always fresh - detects changes immediately, no 24h delay ✅ Bandwidth efficient - only downloads when schemas actually change ✅ Resilient - falls back to cache if server unavailable ✅ Zero maintenance - automatic in online mode ### Schema Updates - Deleted obsolete budget.json (doesn't exist in official spec) - Added new schemas from official source (performance-feedback, etc.) - All schemas now validated against https://adcontextprotocol.org ## Documentation - docs/schema-caching-strategy.md - Technical implementation guide - BUDGET_FORMAT_FINDINGS.md - Budget format analysis - SCHEMA_CACHE_SOLUTION.md - Solution overview 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
BUDGET_FORMAT_FINDINGS.md and SCHEMA_CACHE_SOLUTION.md were temporary analysis documents. The technical documentation in docs/schema-caching-strategy.md is sufficient. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
…mode Fixes from python-expert review (#396): 1. Fixed metadata cleanup (Issue #1): - Delete old .meta files before saving new ones - Prevents stale ETag issues when schemas change - Applied to both index and schema downloads 2. Removed dead code (Issue #2): - Deleted deprecated _is_cache_valid() method - Was unused and confusing 3. Added --offline-schemas flag (Issue #6): - New pytest flag for offline development - Creates adcp_validator fixture with offline mode support - Useful for airplane mode / no network scenarios These changes address the must-fix issues identified in the expert review before CI completes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
E2E tests were failing because code expected budget.total but budget is now
a plain number per AdCP spec.
Fixed get_total_budget() method to handle:
- Plain numbers (new AdCP spec format)
- Dict format {total, currency} (backward compat)
- Budget object with .total attribute (legacy)
This maintains backward compatibility while supporting the spec-compliant
plain number format.
Fixes CI E2E test failure.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Line 4135: Use request_currency instead of req.budget.currency - Lines 4776-4780: Add hasattr() check before accessing .currency attribute - Lines 3571-3580: Add type annotation for start_time to satisfy mypy - Lines 4783-4786: Use currency_stmt variable name to avoid type conflict - Fixes 'float' object has no attribute 'currency' error in E2E tests Related to AdCP v2.4 budget format (plain number instead of object)
- Use inline type comment for start_time to avoid redefining parameter - Add isinstance assertion to clarify start_time is datetime when not 'asap' - Use currency_stmt variable name to avoid SQLAlchemy type conflict - Convert media_buy.currency to str to fix mypy assignment error These changes resolve mypy errors in the budget currency handling code.
Issue: Schema validation was failing because pricing-options schemas (cpm-fixed-option, cpm-auction-option, etc.) were not being downloaded. Root cause: _preload_schema_references() only downloaded direct references, not references within those references (non-recursive). Solution: Made _preload_schema_references() recursive with cycle detection: - Added _visited set to track already-preloaded schemas - Recursively preload references found in each schema - Prevents infinite loops via visited set This ensures all nested schema references are downloaded and cached before validation runs, eliminating 'could not resolve schema reference' warnings. Fixes E2E test: test_valid_get_products_response Also includes updated cached schemas from adcontextprotocol.org with latest pricing-options schemas and minor updates to existing schemas.
danf-newton
pushed a commit
to Newton-Research-Inc/salesagent
that referenced
this pull request
Nov 24, 2025
…bid#396) * Fix AdCP schema caching: Replace time-based with ETag validation ## Problem 24-hour time-based cache was inappropriate for rapidly evolving AdCP spec: - Assumed schemas valid for 24 hours (wrong during active development) - Couldn't detect changes within cache window - Led to persistent stale schemas causing test failures - budget.json schema persisted in cache despite not existing in official spec ## Solution: ETag-Based HTTP Conditional GET Replaced time-based caching with server-driven validation: - Always checks with server using If-None-Match header - Server returns 304 Not Modified if unchanged (use cache) - Server returns 200 OK if changed (download new version) - Falls back to cache if server unavailable (resilient) ### Key Changes 1. **Schema Validator** (tests/e2e/adcp_schema_validator.py): - Deprecated time-based _is_cache_valid() logic - Implemented ETag-based conditional GET in download methods - Added metadata file tracking (.meta files store ETags) - Graceful fallback to cache on network errors 2. **Schema Refresh Tool** (scripts/refresh_adcp_schemas.py): - Clean slate approach: deletes all cached schemas before refresh - Handles both .json and .meta files - Verifies no outdated references remain (e.g., budget.json) 3. **Budget Format Fix** (tests/e2e/adcp_request_builder.py): - Fixed top-level budget: plain number (was object) - Fixed package budget: plain number (was object) - Matches official AdCP spec (currency from pricing_option_id) 4. **Cache Metadata** (.gitignore): - Ignore .meta files (local ETag cache metadata) ### Benefits ✅ Always fresh - detects changes immediately, no 24h delay ✅ Bandwidth efficient - only downloads when schemas actually change ✅ Resilient - falls back to cache if server unavailable ✅ Zero maintenance - automatic in online mode ### Schema Updates - Deleted obsolete budget.json (doesn't exist in official spec) - Added new schemas from official source (performance-feedback, etc.) - All schemas now validated against https://adcontextprotocol.org ## Documentation - docs/schema-caching-strategy.md - Technical implementation guide - BUDGET_FORMAT_FINDINGS.md - Budget format analysis - SCHEMA_CACHE_SOLUTION.md - Solution overview 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove temporary analysis docs from repo root BUDGET_FORMAT_FINDINGS.md and SCHEMA_CACHE_SOLUTION.md were temporary analysis documents. The technical documentation in docs/schema-caching-strategy.md is sufficient. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Address expert review feedback: fix metadata cleanup and add offline mode Fixes from python-expert review (prebid#396): 1. Fixed metadata cleanup (Issue prebid#1): - Delete old .meta files before saving new ones - Prevents stale ETag issues when schemas change - Applied to both index and schema downloads 2. Removed dead code (Issue prebid#2): - Deleted deprecated _is_cache_valid() method - Was unused and confusing 3. Added --offline-schemas flag (Issue prebid#6): - New pytest flag for offline development - Creates adcp_validator fixture with offline mode support - Useful for airplane mode / no network scenarios These changes address the must-fix issues identified in the expert review before CI completes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix budget handling: support both number and object formats E2E tests were failing because code expected budget.total but budget is now a plain number per AdCP spec. Fixed get_total_budget() method to handle: - Plain numbers (new AdCP spec format) - Dict format {total, currency} (backward compat) - Budget object with .total attribute (legacy) This maintains backward compatibility while supporting the spec-compliant plain number format. Fixes CI E2E test failure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix budget.currency access for plain number budgets - Line 4135: Use request_currency instead of req.budget.currency - Lines 4776-4780: Add hasattr() check before accessing .currency attribute - Lines 3571-3580: Add type annotation for start_time to satisfy mypy - Lines 4783-4786: Use currency_stmt variable name to avoid type conflict - Fixes 'float' object has no attribute 'currency' error in E2E tests Related to AdCP v2.4 budget format (plain number instead of object) * Add mypy type annotations for budget currency handling - Use inline type comment for start_time to avoid redefining parameter - Add isinstance assertion to clarify start_time is datetime when not 'asap' - Use currency_stmt variable name to avoid SQLAlchemy type conflict - Convert media_buy.currency to str to fix mypy assignment error These changes resolve mypy errors in the budget currency handling code. * Fix recursive schema preloading for nested references Issue: Schema validation was failing because pricing-options schemas (cpm-fixed-option, cpm-auction-option, etc.) were not being downloaded. Root cause: _preload_schema_references() only downloaded direct references, not references within those references (non-recursive). Solution: Made _preload_schema_references() recursive with cycle detection: - Added _visited set to track already-preloaded schemas - Recursively preload references found in each schema - Prevents infinite loops via visited set This ensures all nested schema references are downloaded and cached before validation runs, eliminating 'could not resolve schema reference' warnings. Fixes E2E test: test_valid_get_products_response Also includes updated cached schemas from adcontextprotocol.org with latest pricing-options schemas and minor updates to existing schemas. --------- Co-authored-by: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
24-hour time-based cache was inappropriate for rapidly evolving AdCP spec:
budget.jsonschema persisted in cache despite not existing in official specSolution: ETag-Based HTTP Conditional GET
Replaced time-based caching with server-driven validation:
If-None-Matchheader304 Not Modifiedif unchanged (use cache)200 OKif changed (download new version)Key Changes
1. Schema Validator (
tests/e2e/adcp_schema_validator.py)_is_cache_valid()logic.metafiles store ETags)2. Schema Refresh Tool (
scripts/refresh_adcp_schemas.py).jsonand.metafilesbudget.json)3. Budget Format Fix (
tests/e2e/adcp_request_builder.py)pricing_option_id)4. Cache Metadata (
.gitignore).metafiles (local ETag cache metadata)Benefits
Schema Updates
budget.json(doesn't exist in official spec)performance-feedback, etc.)Documentation
docs/schema-caching-strategy.md- Technical implementation guideBUDGET_FORMAT_FINDINGS.md- Budget format analysisSCHEMA_CACHE_SOLUTION.md- Solution overviewTesting
Migration Notes
No action required - ETag caching is automatic and backward compatible.