fix: Use content-based hashing for schema sync to avoid metadata noise #649
Problem
When the AdCP server restarts and regenerates schemas, it creates new ETags and timestamps even though the schema content is identical. This causes 120+ files to appear as modified in git with only metadata changes - pure noise that clutters git history and creates confusion.
Root Cause
refresh_adcp_schemas.py downloads → updates .meta files with new ETags
generate_schemas.py sees different ETags → regenerates Python files
Solution
Replace ETag-based tracking with content-based hashing:
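A minimal sketch of the idea, not the code from this PR: hash the canonicalized schema content, then compare it with the `# schema_hash:` comment stored in a generated file. The function names `compute_schema_hash` and `needs_regeneration`, the SHA-256 choice, the 12-character truncation, and the directory layout are all assumptions made for illustration.

```python
import hashlib
import json
import re
from pathlib import Path


def compute_schema_hash(schema_dir: Path) -> str:
    """Hash schema content only; ETags and timestamps never enter the digest.

    Hypothetical helper: only files matching *.json are read, so sidecar
    metadata files with another suffix are skipped.
    """
    digest = hashlib.sha256()
    for path in sorted(schema_dir.rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Canonical form (sorted keys, fixed separators) so that whitespace
        # or key-order differences in a fresh download do not change the hash.
        canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
        digest.update(path.relative_to(schema_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(canonical.encode("utf-8"))
        digest.update(b"\0")
    # Assumed truncation to 12 hex characters, matching the length of the
    # example value shown in the PR description.
    return digest.hexdigest()[:12]


# Matches a header comment such as "# schema_hash: abc123def456".
HASH_LINE = re.compile(r"^# schema_hash: ([0-9a-f]+)$", re.MULTILINE)


def needs_regeneration(generated_file: Path, current_hash: str) -> bool:
    """Return True when the file is missing, unmarked, or has a stale hash."""
    if not generated_file.exists():
        return True
    match = HASH_LINE.search(generated_file.read_text(encoding="utf-8"))
    return match is None or match.group(1) != current_hash
```

With a check like this, a server restart that changes only ETags leaves the hash unchanged, so the generator can skip rewriting files and git sees no diff.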
Changes
Modified
scripts/generate_schemas.py:
add_etag_metadata_to_generated_files(): schema_hash with current content hash; # schema_hash: abc123def456
generate_schemas_from_json(): __init__.py, SCHEMA_HASH
New Schemas (from AdCP spec):
webhook-payload - Webhook payload structure for async task notifications
task-type - Valid AdCP task types enum
Benefits
Before fix:
After fix:
Testing
Tested the workflow:
Code Review