Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Host test vectors on GH #214

Closed
diehuxx opened this issue Dec 21, 2023 · 8 comments
Closed

[Proposal] Host test vectors on GH #214

diehuxx opened this issue Dec 21, 2023 · 8 comments
Assignees
Labels
docs/design tbdex 1.0 testing related to new or existing tests
Milestone

Comments

@diehuxx
Copy link
Contributor

diehuxx commented Dec 21, 2023

Overview

TBDex will host test vectors on github and can be fetched and cached by implementation repos in a .gitignore'd place. Implementation repos will NOT need to git commit the vectors themselves; implementation repos will only commit a git ref of the tbdex repo containing the desired version of the test vectors.

Hosting on Github

The github tbdex repo will host all test vectors for the protocol in the directory test-vectors/protocol/ . An implementation repo can fetch the test vectors at a given git ref like so:

curl -L \                                                                                                                                                                                                                                                                                                    
  -H "Accept: application/vnd.github+json" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
"https://api.github.com/repos/TBD54566975/tbdex/contents/specs/?ref=$MY_GIT_REF"

Each vector file will follow this JSON schema:

  1. description : A human readable description of the scenario being tested.
  2. input : A stringified JSON TBDex object
  3. output : If the input contains a valid TBDex object: A JSON object of the the expected parsed input.
  4. error : boolean. True if an error is expected, false if not. No error message/detail is provided in the test vectors because different languages/implementations may handle errors quite differently. So, it's better not to be too prescriptive of implementation.

NOTE: This is nearly identical to the current test vector structure. The only difference is standardizing the type of error.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "description": {
      "type": "string",
    },
    "input": {
      "type": "string",
    },
    "output": {
      "type": "object",
      "additionalProperties": true,
    },
    "error": {
      "type": "boolean"
    }
  },
  "required": ["description", "input"],
  "oneOf": [
    {
      "required": ["output"],
      "not": {"required": ["error"]}
    },
    {
      "required": ["error"],
      "not": {"required": ["output"]}
    }
  ]
}

Benefits of Hosting on Github

Some reviewers will prefer the approach to hosting test vectors taken by Web5 — that is, copy-pasting each test vector file into an implementation repo. I really really don’t want to do that. Here’s my hot takes:

  1. Hosting on Github is convenient. If an implementer wants to test against a specific version of the test vectors, they can link to a specific git ref, using the github API.
  2. TBDex already hosts testing-related things online. tbdex-kt is fetching JSON schemas of message kinds from tbdex.dev. There is already a precedent for making network calls while running tests. (IMO we should host these on GH as well, but that’s a separate issue).
  3. If an implementer needs to run tests offline, an implementation repo can cache the test vectors locally.

Recommendations for implementation repos

An implementation repo should NOT always fetch the test vectors in tbdex ’s main branch. Each repo should specify a specific version of tbdex to get a deterministic set of vectors. This will both 1) clarify what version of the TBDex protocol an implementation supports and 2) prevent confusion and CI flakiness if tbdex main gets ahead of the implementation.

It is also useful to add a flag for running a set of test vectors locally. I'm going to write a reference implementation in tbdex-js soon.

Generating Test Vectors

  1. Create a new test vector file in test-vectors/protocol
  2. Use an implementation of TBDex that supports the desired scenario to create the input and output if applicable.
  3. Run your test vectors against an implementation repo locally or push your branch of tbdex and get the URL of your branch on GH to use the new test vector in your implementation repo.
  4. Once you are confident that your test vector works as desired, open a PR on tbdex .

Types of TBDex test vectors

Protocol

In contrast to Web5, which deals with many types of data structures and actions, TBDex protocol implementations have one simple interoperability question to answer: do we accept valid TBDex messages and reject invalid TBDex messages?

Http server and client

TODO(diehuxx): More detail or a separate issue is needed. This is not as high priority right now.

Broadly, the client should create requests and accept responses whereas the server should accept requests and create responses in accordance with the spec.

Reference Implementations

tbdex -- hosting: #215
tbdex-js -- fetching from Github and caching efficiently: TBD54566975/tbdex-js#125

@diehuxx diehuxx added docs/design testing related to new or existing tests labels Dec 21, 2023
@diehuxx diehuxx changed the title Host test vectors on GH [Proposal] Host test vectors on GH Dec 21, 2023
@nitro-neal
Copy link

nitro-neal commented Dec 21, 2023

Thanks for taking the time to go over the current approach and presenting this.

I do go back and fourth on hosting vectors on sdk-development only and then having the unit tests do network calls.

Some thoughts I have on the proposed items:

  • Adding a vector branching strategy adds complexity
  • Adding and maintaing a script to compile into large file adds complexity / less readability / harder debugging.
  • Developers will be jumping between 2 repos more often when implementing a feature (the current approach you can just stay in one and then put into sdk-develpment -whenever-)

One of the most important aspects for me is that we keep it as simple as possible so developers actually use this without slowdown and creates lots and lots of test vectors.

With the current approach developers can produce a feature, add a test vector locally, create a basically normal unit test locally and be on the way. (we can even make another process to update sdk-development automatically so they dont have to update this later)

Also, There are still a lot of cases where you will probably still be copy pasting and having a local test vector on the file system when manually creating and using these vectors when developing. So you will be doing things locally, then deleting and putting it on github.

To implement a new feature and new test vector you will have to repeditly update 2 repos, and then things can still get out of sync, you can have a situation where mainline tbdex pointing to a dev branch in sdk-development.

We have checks on if sdk-development test-vectors folder vs the other repos gets out of sync, and it will open PRs automatically if you have any concern about things getting out of sync.

I do think that web5-* and tbdex-* repos should use as the exact same approach as possible for test vectors to keep things consistent and uniform across repos if possible.

We did decide on having local with scripts to make sure things are in sync in meetings a while back, but super happy to revise if needed. If we truly want to have test vectors not be local and only in the sdk-development repo we can change a few lines in our current test vector impls:

Current local test vector impl:

  val testVectors = mapper.readValue(
    File("../test-vectors/presentation_exchange/create_presentation_from_credentials.json"),
    typeRef
  )

Remote test vector impl:

  val jsonUrl = "https://raw.githubusercontent.com/TBD54566975/sdk-development/main/web5-test-vectors/presentation_exchange/create_presentation_from_credentials.json"
  val request = Request.Builder().url(jsonUrl).build()
  val response = httpClient.newCall(request).execute()

  val testVectors = ObjectMapper().readValue(response.body?.string(), typeRef)

If the tbdex-* repo is truly that different then the web5-* then I guess we can have a split in the way we do test vectors, but I really think we should keep all test vector impls the same across all repos

@nitro-neal
Copy link

Just as a clean example on how the current process works:

new test vector: (2 files changed)
https://github.com/TBD54566975/web5-js/pull/357/files#diff-dca2114c947eccff38e8509d221b1eeae0de4510f4081ab01588d528d34a0989R421

After merge we will get a new checkmark here (no action needed from dev):
https://tbd54566975.github.io/sdk-development/

@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 3, 2024

Made a few changes to my proposal based on feedback:

  1. Made error a boolean instead of an object.
  2. Don't compile test vectors into one file. Initially I thought that would be helpful so we only need to pull one file, but it sounds like it introduces more complication than it helps.
  3. Add recommendation for implementation repos to have a flag for using test vectors locally. I'm going to add a reference implementation of such a flag in tbdex-js.

I'll be updating I have updated #215 accordingly.

@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 3, 2024

Adding a vector branching strategy adds complexity

Agreed it adds complexity, but I think it's necessary. If the hosted vectors get out of sync with the implementation, implementation repos will want to pin which version of vectors the implementation supports. The GH action y'all mentioned sounds like it checks that test vectors in the impl repo are the same as in sdk-development/master. But if sdk-development/master gets ahead, we still want to ensure that the vectors in the impl repo are consistent with themselves.

@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 4, 2024

Here are some reference implementations for how tbdex and an implementation could support this:
tbdex -- hosting: #215
tbdex-js -- fetching from Github and caching efficiently: TBD54566975/tbdex-js#125

The caching done in the tbdex-js branch makes a big difference. Fetching from github can take more than 500ms. Once it's cached, loading the test vector is a millisecond or less.

@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 5, 2024

@mistermoe Pointed out to me that TBD54566975/tbdex-js#125 is essentially reimplementing git submodules. So, I decided to try using git submodules and it's honestly way simpler than the stuff I was cooking up.

Instead of fetching test vectors dynamically from github, I'm now convinced that git submodules best allow implementation repos to have fast local access to specific versions of implementation repos. tbdex should still be the host of JSON schemas and test vectors. The change from my original proposal is in how they are consumed.

@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 7, 2024

Introducing tbdex-interop-suite. A repo which will contain everything that is currently in the hosted directory of tbdex. This dedicated repo is lighter weight for implementation repos to include as a submodule. I've updated TBD54566975/tbdex-js#125 and #215 accordingly.

Disregard. We decided in this Discord thread not to extract hosted stuff into a new repo

@mistermoe mistermoe added this to the 1.0 milestone Jan 8, 2024
diehuxx added a commit that referenced this issue Jan 9, 2024
* Compile test vectors into one file

* Dont compile test vectors into one big file

* Replace hosted/ with tbdex-interop-suite submodule

* Revert "Replace hosted/ with tbdex-interop-suite submodule"

This reverts commit c3e3327.

* Update test-vector README
@diehuxx
Copy link
Contributor Author

diehuxx commented Jan 13, 2024

Implemented here and in tbdex-js and tbdex-kt with these PRs
#215
TBD54566975/tbdex-js#129
TBD54566975/tbdex-kt#124

@diehuxx diehuxx closed this as completed Jan 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs/design tbdex 1.0 testing related to new or existing tests
Projects
Status: Done
Development

No branches or pull requests

3 participants