-
Notifications
You must be signed in to change notification settings - Fork 512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(credo-interop): various didexchange and did:peer related fixes #2748
fix(credo-interop): various didexchange and did:peer related fixes #2748
Conversation
Quality Gate passedKudos, no new issues were introduced! 0 New issues |
@genaris just FYI, with these changes, ACA-Py and AFJ can now successfully complete OOB + DIDExchange with did:peer:4 (or AFJ emitting did:peer:1 and ACA-Py emitting did:peer:4, I believe). I commented on an issue in Credo on issues with did:peer:2 and tagged you. |
Wow nice work @dbluhm ! Not only fixes in ACA-Py but also in Credo. That's what I'm talking about! 😄 Now that your Multikey support PR is merged in Credo main branch, I guess you can re-attempt using it. |
I haven't gotten this quite right yet. Checking DID Exchange backwards compatibility with earlier versions of ACA-Py, I get this error:
The updates for the DID Exchange version were inadvertently breaking. I'll have to look deeper. My first pass was not much more than a copy-paste from OOB. |
1872e86
to
cb54ea1
Compare
I've spent a fair amount of time sifting through the behavior of ACA-Py as it relates to handling messages from protocol versions different than the explicitly supported versions. What should happen, according to RFC 0003, is that if a message with a minor version greater than what is supported is received, we handle the message still and respond anyways with our lower protocol version. Through this, the response recipient knows that there is a version mismatch and that some fields may have been ignored. For example, if we get As it turns out, recent past releases of ACA-Py emit this error on receipt of a Traceback (most recent call last):
File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step
result = coro.send(None)
File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/core/dispatcher.py", line 142, in handle_message
(message, warning) = await self.make_message(
File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/core/dispatcher.py", line 293, in make_message
message_cls = registry.resolve_message_class(message_type)
File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/core/protocol_registry.py", line 243, in resolve_message_class
return ClassLoader.load_class(msg_cls)
File "/home/aries/.local/lib/python3.9/site-packages/aries_cloudagent/utils/classloader.py", line 98, in load_class
if "." in class_name:
TypeError: argument of type 'NoneType' is not iterable This essentially means that a new ACA-Py release would be incompatible with past versions of ACA-Py when using DID Exchange. I could adjust our behavior as a newer version of ACA-Py to attempt to respond with |
I’m confused. Would we not want to add support for 1.1 and handle both 1.0 and 1.1? Presumably we want to support |
The way I interpret that second point from the semver rules, it seems to suggest that we should always respond with our "most current" minor version. By doing so, we make it clear to the other party that they may experience feature degradation or minor version mismatch since the language describing sending a problem report with a warning has been removed from the protocol:
And:
|
I might be splitting hairs 😅 I think the ideal would be to only "implement" a single minor version of any major version of a protocol. Honestly wondering if DID Exchange 1.1 should have been DID Exchange 2.0 because, though did_rotate is technically optional, it will break handling of the messages still for did_doc~attach to be absent. |
Sorry — I’m not understanding the issue. Probably we need to discuss. It would be good to get a clear outline of what is happening, what could be done easily and what you recommend we do.
I think the ideal is that ACA-Py be changed to have a current minor version of 1.1 and handle both 1.0 and 1.1 requests. However, I gather that is problematic… Thanks! |
Yeah, okay, allow me to slow down and take a deep breath for a moment lol. The ideal: ACA-Py always just sends didexchange/1.1 since that reduces complexity. However, this would break backwards compatibility. Theoretically, the way DIDComm protocol versions should be handled, past versions of ACA-Py should have been able to work with 1.1; but, due to a bug in past versions, it cannot handle versions of the protocol greater than its "current minor version" (1.0). (As much as it chagrins me to have explicit different paths for the same major version of a protocol rather than being able to just ignore some attributes) the right answer here is to support both 1.0 and 1.1, responding in kind to requests. I believe it should be possible to include both My biggest concern is just managing complexity in the DID Exchange protocol implementation. It's already quite dense. I'll continue to work to find a balance between just getting the dang feature implemented and not doing things that will make me sad later lol. |
After looking at @swcurran's clarification PR in Aries RFC, I realized that the DID Exchange version was not explicitly changed on its RFC. Some time ago it was agreed to do so (see hyperledger/aries-rfcs#795 (comment)) and we reflected that into Credo implementation, but now I'm not sure if it's clear for everyone. I think it's the right thing to do although this makes things harder to implement in ACA-Py due to the aforementioned bug in previous versions. BTW, do you think if as a workaround (until unqualified dids support is completely removed) it could be possible to use discover-features to query for DID Exchange support and, based on that, send 1.1 or 1.0 requests? I guess it would be tricky, since a connection is not established yet, but maybe possible somehow? |
It looks like the version bump was made but that just the title and metadata didn't get updated. Or at least that's how I read the intent and my memory of the discussions.
We've got a few entry points to a DID Exchange protocol:
When we create an OOB invitation, we can include both didexchage/1.0 and didexchange/1.1 in the protocol list. We'll receive a request from them with one version or the other and then proceed from that point with the same version they sent. When we receive an OOB invitation, if didexchange/1.1 is in the protocol list, we'll use it. Otherwise, we'll go to 1.0. When we receive a DID Exchange request through a public DID, we will just match the version received in the request. When we send a DID Exchange request, we could potentially do a discover features protocol with the public DID (I don't think ACA-Py technically supports this right now since there's no connection yet)... But to this point, we've left it up to the controller to orchestrate the flow of multiple protocols. So continuing that pattern, I think we'd add a switch to the request through public DID Admin API endpoint to give the controller the ability to pick which version of the DID Exchange protocol to send. We can default to 1.0. |
ab08c6d
to
c832ded
Compare
A lot of progress on this PR. I think we're where we wanted to be implementation wise now. I just need to update the tests that are failing and do some regression checks against Credo to make sure that's still working, too. |
Unit tests fixed now. Might be room for some additions to the tests. I've tested this branch against Credo 0.5.0 and tests passed so looking good on that front. I'll run some tests against previous ACA-Py versions. |
Diagnosing an issue: acapy_1 | Traceback (most recent call last):
acapy_1 | File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step
acapy_1 | result = coro.send(None)
acapy_1 | File "/usr/src/app/aries_cloudagent/core/dispatcher.py", line 210, in handle_message
acapy_1 | await handler(context, responder)
acapy_1 | File "/usr/src/app/aries_cloudagent/protocols/didexchange/v1_0/handlers/complete_handler.py", line 29, in handle
acapy_1 | await mgr.accept_complete(context.message, context.message_receipt)
acapy_1 | File "/usr/src/app/aries_cloudagent/protocols/didexchange/v1_0/manager.py", line 1064, in accept_complete
acapy_1 | conn_rec.my_did = await self.long_did_peer_4_to_short(conn_rec.my_did)
acapy_1 | File "/usr/src/app/aries_cloudagent/connections/base_manager.py", line 115, in long_did_peer_4_to_short
acapy_1 | await wallet.store_did(did_info)
acapy_1 | File "/usr/src/app/aries_cloudagent/wallet/askar.py", line 279, in store_did
acapy_1 | raise WalletDuplicateError("DID already present in wallet")
acapy_1 | aries_cloudagent.wallet.error.WalletDuplicateError: DID already present in wallet |
@ianco I seem to be running amok of the reusing did:peer:4 changes. My tests work if I make sure to set the |
That was based on this comment from @swcurran : #2703 (comment) I don't have an opinion on what should be the default on the (Connection reuse won't work unless we use the same DID on each invitation ...) |
Is this happening in an integration test? Possibly aca-py should just check before adding a DID, if the DID is already present ... |
To encourage connection reuse where possible, we want an Agent to use the same DID for invitations. The recommendation is that ACA-Py should default to enabling reuse, and a controller must go out of its way to force a new connection for every invitation. Does that make sense and answer the question? |
Okay, I think I'm detecting where my assumptions have broken down. To restate, we want to encourage connection reuse by using the same DID for invitations; then, in the process of connecting, the invite creator should rotate out for a new did:peer:4 DID. Sound right? I think some of my changes are erroneously assuming that a DID associated with the connection means we don't need to create a new one. I believe this was the case before the reuse PR. But with this new condition, I'll have to adjust my logic to make sure the new DID is rotated in for the connection. |
Yes — that’s what I think we want. Use the same did:2/4 for every invitation, enabling reuse, and rotate to a new DID when connecting. And in the super edge case where an Agent wants to actively prevent connection reuse, they have that option to use a new DID on every invitation and (presumably) still rotate to a new DID when connecting. |
Accidentally broke in last commit Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
This helps running tests work better in some environments Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
Signed-off-by: Daniel Bluhm <dbluhm@pm.me>
275c3d1
to
19547fc
Compare
Quality Gate passedIssues Measures |
(Please note my updated summary of changes in the opening comment of this PR) As a result of these changes, in combination with previously merged PRs, backwards compatibility is influenced by the following factors: Past == ACA-Py 0.11.0
Are these interop constraints we're happy with or should we revisit what triggers the OOB behavior? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of the details are over my head but overall looks good to me.
w00t!!! Thanks, @dbluhm — awesome work! |
Fixes #2742 and various other issues discovered while testing ACA-Py against AFJ.
didexchange/1.1
as handshake protocol in OOBjwk
andkid
in the protected headers of signed attachments. Thekid
was redundant so I removed it.Credo does not acceptMultikey is now supported by Credo so I'll revert this change.Multikey
verification method type yet (I opened a PR though: feat: add Multikey as supported vm type credo-ts#1720 -- tests failing 😄). So I adjusted our did:peer:4 doc creation to useEd25519VerificationKey2020
for now.Updated Summary of Changes
Fixes
NoneType
exception when no session is openauthentication
andkeyAgreement
verification relationships.clean_finished_oob_record
was suppressing exceptions from some bad code. OOB Records were not getting correctly deleted because thedelete_record
call was outside of the context holding thesession
. Exceptions are now logged as a warning from that block of code.kid
from the headers of signed attachments; this was an overspecification -- eitherjwk
orkid
are expected but not both.Removals
ConnRecord.Protocol
Enum was removed in favor of a simplerSUPPORTED_PROTOCOLS
list. The logic surrounding the Enum was convoluted and barely used. I opted to simplify.aries_cloudagent/core/util.py
methods associated with version definition loading and parsing. This was an error prone piece of code (that recently broke loading protocols from plugins altogether) that was responsible for determining what the "correct" version of a protocol to respond with given the version definition for the protocol. It did this by dynamically loading thedefinition.py
module for the message package through a very brittle process of guessing at the location of the module relative to the module containing the agent message class in question. However, all this information was already known to theProtocolRegistry
; there is no need to dynamically load it. Additionally, the support for emitting warnings that it enabled is no longer recommended by the RFCs. These methods were removed and replaced with calls to theProtocolRegistry
.Refactors
Protocol Registry and Dispatcher
make_message
The Protocol Registry had seen only minor updates since it was first written years ago. I found the code to be a little brittle, depending a little too heavily on loosely defined lists of dictionaries. I formalized the data structures used to resolve a message class from a message type and introduced some helper types for parsing information from message type strings (
MessageType
,ProtocolIdentifier
,MessageVersion
, etc.).The refactors correct the issue discussed in the comments of this PR regarding past versions of ACA-Py being unable to handle minor versions greater than the current minor version within a major version of a protocol.
These refactors had some minor effects on the Dispatcher
make_message
Specifics:
registry.resolve_message_class
now returns either aType[AgentMessage]
orDeferLoad
of anAgentMessage
. I think theDeferLoad
just wasn't around when theProtocolRegistry
was first written because it was the perfect tool for the job. This results in Message Classes being cached after first load so they don't have to be dynamically imported every time. I didn't measure but I suspect this has a positive impact on performance.make_message
no longer returns a "warning" string (see removal note above).register_controllers
no longer takes aversion_definition
as a parameter (it was unused)ProtocolDefinitions
, a combination of the protocol identifier URI and the information from the version definition (definition.py
in the protocol package) are now stored in the registry. All information about currently supported versions is stored in the registry (see note about removal ofcore/util.py
version definition methods above).resolve_message_class
will return the message class for the current supported minor version for minor versions greater than what is supported.AgentMessage
Some minor updates were made to
AgentMessage
to make it easier to access and update message version information:AgentMessage._type
is now aMessageTypeStr
instance. It behaves like astr
but has additional properties forprotocol
,version
,name
, etc.AgentMessage._version
property for accessingAgentMessage._type.version
as a stringassign_version_from(msg: AgentMessage)
to set the version of the current message based on the version of another message.assign_version(version: str)
for manually setting the version to a value.@property.setter
for the_type
attribute which was unused and had dubious utility.DID Exchange Manager
Behavior of the manager is now governed by the
connection_protocol
property of theConnRecord
associated with the exchange. If the protocol isdidexchange/1.0
the original DID Exchange manager behavior is executed for backwards compatibility with previous versions of ACA-Py. If the protocol isdidexchange/1.1
, the manager will execute the new behavior, including qualified DIDs anddid_rotate~attach
. If ACA-Py's settings do not permit qualified DIDs (through omission ofemit-did-peer-X
flags on startup), the manager will default to the original unqualified DID behavior.The manager will send messages with the same versions as those it receives. E.g.
didexchange/1.0/request
received,didexchange/1.0/response
sent.didexchange/1.1/request
received,didexchange/1.1/response
sent.Using a public DID for a DID Exchange now requires
didexchange/1.1
support since thedid_rotate~attach
mechanism is used.Various refactors were made to streamline logic and make it easier to reuse chunks.
Added a
protocol
parameter to implicit DID Exchange request Admin API endpoint. This endpoint triggers a request to a public DID without previously receiving an invitation from that DID. Theprotocol
parameter selects which version of DID Exchange to use, 1.1 or 1.0. The default is 1.0 for now.Deprecated: An Admin API endpoint that was apparently broken and unused was marked as deprecated:
POST /didexchange/receive-request
. The docs for the endpoint describe that it is intended to be used to receive a request against a public DID. However, requests to a public DID should be sent directly to the messaging endpoint of the public DID. Given this and the fact that errors in the endpoint rendered it useless (until fixed in this PR), I marked the endpoint as deprecated.Out of Band Manager and Invitation message
Minor updates were made:
The manager passes the version of the handshake protocol used to the DID Exchange manager so that it can keep track of this in the
connection_protocol
attribute of theConnRecord
created.The invitation message (and relevant Admin API endpoints) now accept
didexchange/1.1
as a handshake protocol value.The
HSProto
andHSProtoSpec
were also streamlined to remove unused features. Additionally,rfc
as a method for referencing an HSProto was removed given that DID Exchange 1.0 and 1.1 are both from the same RFC.