feat: Decouple asset_handlers into c2pa-codecs crate #533

Draft
wants to merge 6 commits into
base: main

Conversation

ok-nick
Contributor

@ok-nick ok-nick commented Jul 31, 2024

Changes in this pull request

Decouples asset_handler/asset_io into its own crate.

Changes

  • Separate/group encoding and decoding
    • Introduces Encode and Decode traits with the following methods:
      • Encode: write_c2pa, remove_c2pa, patch_c2pa, write_xmp, write_xmp_provenance, remove_xmp, remove_xmp_provenance
      • Decode: read_c2pa, read_xmp, read_xmp_provenance
    • Asset handlers (codecs) now take a read-only stream on construction
      • When signing, the typical workflow is read->write->read->write, so there is a lot we can cache
    • Streams are now generic parameters rather than trait objects (less overhead)
  • Decouple hashers by moving them into the parsers
    • Introduces a Hash trait with 5 methods: hash, data_hash, box_hash, bmff_hash, collection_hash
    • Parsers can choose their default hasher (usually defined by spec) with Hasher::hash whilst also implementing any other supported hashes
    • Data hash now returns a list of byte spans corresponding to the manifest rather than explicitly defining byte ranges over the entire file
  • Add file signature inference
    • Introduces a Supporter trait with 3 methods: supports_signature, supports_extension, supports_mime.
  • Add Embed trait (composed manifests)
    • Construct an Embeddable with embeddable, read it from a stream with read_embeddable and write it to a stream with write_embeddable
  • Codec is the new entry-point for asset handlers
    • It implements all of the traits and forwards each call to the corresponding codec based on the file type
    • It is an enum over all codecs, removing the need for boxing trait objects
    • For instance, Codec::from_stream(&mut stream).read_c2pa()
  • Granular parsing errors
    • It's now clear specifically where parsing went wrong, rather than "not found"
  • Codecs are individually locked behind feature flags
    • If you only want png, enable the png feature; the same applies to gif, jpeg, etc.
  • External codecs
    • The Codec struct can be created with an external codec that implements the desired traits (user-defined codecs)
  • Updated dependencies
    • Many parser dependencies have become unmaintained or deprecated
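
To make the Codec entry point described above more concrete, here is a minimal sketch of enum dispatch over generic streams with signature inference. It is not the code in this PR: the trait signatures, the `CodecError` variants, and the `PngCodec`/`GifCodec` struct names are all assumptions made for illustration, `Decode` and `Supporter` are reduced to one method each, and feature gating is left out. The PNG reader looks for the `caBX` chunk that C2PA uses for PNG; GIF is detection-only here.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical granular error type; the variants are illustrative only.
#[derive(Debug)]
pub enum CodecError {
    UnknownFormat, // no codec recognised the leading bytes
    Truncated,     // the stream ended in the middle of a chunk
    Io(io::Error), // any other I/O failure
}

impl From<io::Error> for CodecError {
    fn from(err: io::Error) -> Self {
        match err.kind() {
            io::ErrorKind::UnexpectedEof => CodecError::Truncated,
            _ => CodecError::Io(err),
        }
    }
}

/// Assumed shape of `Supporter`, reduced to the signature check.
pub trait Supporter {
    fn supports_signature(signature: &[u8]) -> bool;
}

/// Assumed shape of `Decode`, reduced to a single method.
pub trait Decode {
    fn read_c2pa(&mut self) -> Result<Option<Vec<u8>>, CodecError>;
}

/// Codecs own a generic stream instead of a boxed trait object.
pub struct PngCodec<R> { src: R }
#[allow(dead_code)] // GIF parsing is omitted from this sketch.
pub struct GifCodec<R> { src: R }

impl<R> Supporter for PngCodec<R> {
    fn supports_signature(signature: &[u8]) -> bool {
        signature.starts_with(b"\x89PNG\r\n\x1a\n")
    }
}

impl<R> Supporter for GifCodec<R> {
    fn supports_signature(signature: &[u8]) -> bool {
        signature.starts_with(b"GIF87a") || signature.starts_with(b"GIF89a")
    }
}

impl<R: Read + Seek> Decode for PngCodec<R> {
    /// Walks the PNG chunks and returns the payload of the first `caBX` chunk.
    fn read_c2pa(&mut self) -> Result<Option<Vec<u8>>, CodecError> {
        self.src.seek(SeekFrom::Start(8))?; // skip the 8-byte signature
        let mut header = [0u8; 8]; // 4-byte big-endian length, then 4-byte type
        loop {
            self.src.read_exact(&mut header)?;
            let len = u32::from_be_bytes([header[0], header[1], header[2], header[3]]);
            if &header[4..] == b"caBX" {
                // A real codec would cap `len` before allocating.
                let mut data = vec![0u8; len as usize];
                self.src.read_exact(&mut data)?;
                return Ok(Some(data));
            } else if &header[4..] == b"IEND" {
                return Ok(None);
            }
            // Skip the chunk data plus its 4-byte CRC.
            self.src.seek(SeekFrom::Current(i64::from(len) + 4))?;
        }
    }
}

/// Enum over every codec, so dispatch needs no `Box<dyn ...>`.
pub enum Codec<R> {
    Png(PngCodec<R>),
    Gif(GifCodec<R>),
}

impl<R: Read + Seek> Codec<R> {
    /// Picks a codec by sniffing up to 8 leading bytes of the stream.
    pub fn from_stream(mut src: R) -> Result<Self, CodecError> {
        let mut signature = Vec::with_capacity(8);
        src.seek(SeekFrom::Start(0))?;
        src.by_ref().take(8).read_to_end(&mut signature)?;
        if PngCodec::<R>::supports_signature(&signature) {
            Ok(Codec::Png(PngCodec { src }))
        } else if GifCodec::<R>::supports_signature(&signature) {
            Ok(Codec::Gif(GifCodec { src }))
        } else {
            Err(CodecError::UnknownFormat)
        }
    }
}

impl<R: Read + Seek> Decode for Codec<R> {
    fn read_c2pa(&mut self) -> Result<Option<Vec<u8>>, CodecError> {
        match self {
            Codec::Png(codec) => codec.read_c2pa(),
            Codec::Gif(_) => Ok(None), // placeholder: GIF parsing omitted
        }
    }
}
```

Under these assumptions the example from the list becomes `Codec::from_stream(&mut stream)?.read_c2pa()`. The extra `?` exists only because this sketch makes format detection fallible; the real signature may differ.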

Testing

  • Test the codecs separately from c2pa-rs
    • Use verified test image suites provided online, things like pngsuite, imagetestsuite, etc.
  • Test all codecs simultaneously
    • Construct Codec and test read_xmp for all file types (no longer needs to be implemented for each asset handler)
  • Fuzz the codecs
    • Individually fuzz each method of the codecs using afl or proptest
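
As a rough illustration of the "test all codecs at once" and fuzzing ideas above, the sketch below runs one table-driven check across several formats and feeds each a stream of pseudo-random input. It is a dependency-free stand-in, not afl or proptest: `sniff`, `samples`, `XorShift`, and `check_all_codecs` are hypothetical names, and `sniff` only mimics signature detection rather than calling the real codecs.

```rust
/// Hypothetical stand-in for a codec entry point: name the format from leading bytes.
pub fn sniff(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        Some("png")
    } else if bytes.starts_with(b"GIF87a") || bytes.starts_with(b"GIF89a") {
        Some("gif")
    } else if bytes.starts_with(&[0xFF, 0xD8, 0xFF]) {
        Some("jpeg")
    } else {
        None
    }
}

/// Known-good signatures paired with the format each must resolve to.
pub fn samples() -> Vec<(Vec<u8>, &'static str)> {
    vec![
        (b"\x89PNG\r\n\x1a\n".to_vec(), "png"),
        (b"GIF87a".to_vec(), "gif"),
        (b"GIF89a".to_vec(), "gif"),
        (vec![0xFF, 0xD8, 0xFF], "jpeg"),
    ]
}

/// Tiny xorshift64 generator so runs are reproducible; the seed must be non-zero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// One loop covers every format instead of one hand-written test per handler.
pub fn check_all_codecs(seed: u64, rounds: usize) -> Result<(), String> {
    let mut rng = XorShift(seed);
    for (signature, expected) in samples() {
        // Every strict prefix of a signature must be rejected.
        for cut in 0..signature.len() {
            if sniff(&signature[..cut]).is_some() {
                return Err(format!("{expected}: prefix of length {cut} was accepted"));
            }
        }
        // Arbitrary trailing bytes must not change the detected format.
        for _ in 0..rounds {
            let tail_len = (rng.next_u64() % 64) as usize;
            let mut input = signature.clone();
            input.extend((0..tail_len).map(|_| (rng.next_u64() >> 56) as u8));
            if sniff(&input) != Some(expected) {
                return Err(format!("{expected}: misdetected with a {tail_len}-byte tail"));
            }
        }
    }
    Ok(())
}
```

A coverage-guided fuzzer or a property-testing crate would replace the hand-rolled generator in practice; the point here is only the shape of a single check that runs across every codec.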

Related issues

Checklist

  • This PR represents a single feature, fix, or change.
  • All applicable changes have been documented.
  • Any TO DO items (or similar) have been entered as GitHub issues and the link to that issue has been included in a comment.

@ok-nick ok-nick added the enhancement New feature or request label Jul 31, 2024
@scouten-adobe scouten-adobe changed the title Decouple asset_handlers into c2pa-codecs crate feat: Decouple asset_handlers into c2pa-codecs crate Oct 7, 2024