Add functions for Extended Diagnostic Notation (RFC 8610 Appendix G) #386
Force-pushed from fad762f to f8033f8.
Thanks for opening this PR. Really amazing work! 👍
I really like this PR because it:
- implements both DN and EDN
- includes RFC 8949 Diagnostic Examples as test cases
- decouples decoding from diagnosis and avoids any performance hit
Given the size of this PR, I need to do another review but I have some initial thoughts to share.
API
Maybe we can make the diagnostic notation API more similar to the decoding API. For example, Unmarshal() uses default decoding options and DecMode.Unmarshal() uses user-defined decoding options. Maybe something like this:
type DiagMode interface {
	Diagnostics([]byte) (string, error)
}
// Readonly options for concurrent use
type diagMode struct {
	...
}
// User defined options used to create diagMode
type DiagOptions struct {
	...
}
func (opts DiagOptions) DiagMode() (DiagMode, error) {
	...
}
// Diagnostics returns extended diagnostic notation (EDN) of CBOR data items using default diagnostic mode
func Diagnostics(data []byte) (string, error) {
	return defaultDiagMode.Diagnostics(data)
}
// Diagnostics returns extended diagnostic notation (EDN) of CBOR data items using dm mode
func (dm *diagMode) Diagnostics(data []byte) (string, error)
DiagOptions
- We can include these options: MaxNestedLevels, MaxArrayElements, and MaxMapPairs.
- We can change ByteStringEncoding from string to enum to reduce user error.
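As a rough illustration of the enum suggestion, the sketch below shows how a typed option lets out-of-range values be rejected once, when the options are checked, instead of a misspelled string slipping through. The constant names, the sentinel value, and the validate helper are assumptions made for this example, not the API this PR defines.

```go
package main

import "fmt"

// ByteStringEncoding selects how byte strings are printed.
// The constant names below are hypothetical.
type ByteStringEncoding uint8

const (
	ByteStringBase16      ByteStringEncoding = iota // printed as h'...'
	ByteStringBase32                                // printed as b32'...'
	ByteStringBase64                                // printed as b64'...'
	maxByteStringEncoding                           // sentinel, not a valid value
)

// DiagOptions holds user-defined options (only one field is shown).
type DiagOptions struct {
	ByteStringEncoding ByteStringEncoding
}

// validate rejects values outside the declared enum range.
func (opts DiagOptions) validate() error {
	if opts.ByteStringEncoding >= maxByteStringEncoding {
		return fmt.Errorf("invalid ByteStringEncoding %d", opts.ByteStringEncoding)
	}
	return nil
}

func main() {
	fmt.Println(DiagOptions{ByteStringEncoding: ByteStringBase32}.validate())
	fmt.Println(DiagOptions{ByteStringEncoding: 99}.validate())
}
```

With a string option, a typo such as "bas64" can only be caught by comparing against every accepted spelling; with an enum, most mistakes fail at compile time and the rest fail the range check.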
CBOR Sequences (RFC 8742)
- We should probably avoid returning partial/incomplete EDN. For example, if the third CBOR data item in the sequence has an error, we can return the EDN only for the first two data items.
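One way to read this suggestion is that output should only ever contain complete items. The sketch below illustrates that behavior with a stand-in item diagnoser; diagnoseItem and diagnoseSequence are hypothetical names, and the stand-in only understands one-byte unsigned integers (0x00 to 0x17), so it is not a real CBOR decoder. The comma separator assumes the sequence notation described in RFC 8742.

```go
package main

import (
	"fmt"
	"strings"
)

// diagnoseItem is a stand-in for a real per-item diagnoser. It only
// understands one-byte unsigned integers (0x00..0x17).
func diagnoseItem(data []byte) (item string, rest []byte, err error) {
	if len(data) == 0 || data[0] > 0x17 {
		return "", data, fmt.Errorf("unsupported or malformed item")
	}
	return fmt.Sprintf("%d", data[0]), data[1:], nil
}

// diagnoseSequence returns notation for the complete items that precede
// the first error, together with that error. Nothing is emitted for the
// failing item itself.
func diagnoseSequence(data []byte) (string, error) {
	var items []string
	for len(data) > 0 {
		item, rest, err := diagnoseItem(data)
		if err != nil {
			return strings.Join(items, ", "), err
		}
		items = append(items, item)
		data = rest
	}
	return strings.Join(items, ", "), nil
}

func main() {
	s, err := diagnoseSequence([]byte{0x01, 0x02, 0xff})
	fmt.Printf("%q %v\n", s, err)
}
```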
Thanks again for opening this amazing PR!
@fxamacker Updated. Do you have any other suggestions?
Hey @zensh Nice work! I need to work this weekend but I can take a closer look hopefully next weekend. Thanks again for adding this really great feature!
Thanks for the updated PR!
I only had time to review part of the PR and I left some suggestions. I can resume reviewing the PR this coming weekend.
BTW, thanks again for opening this PR, I would like to include it in v2.5.0 release. 👍
Good suggestions. 👍
Thanks for updating the PR! 👍
This is a partial review and I need to continue review next weekend. Last few weeks have been unusually busy for me.
Force-pushed from eda2717 to 363f256.
Thanks again for this PR! 👍 Reviewing this PR led to opening issue #399 and PR #400 because the Valid() function checks for well-formedness instead of validity. It was documented correctly but it needed to be improved.
In this PR, a check for validity is needed in two places to avoid possible runtime errors.
Reproducers:
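// 0xc2 is tag 2 (unsigned bignum); its content here is the integer 1, not a byte string.
func TestDiagnoseC201(t *testing.T) {
	data := []byte{0xc2, 0x01}
	cbor.Diagnose(data)
}
// 0xc3 is tag 3 (negative bignum); same invalid content.
func TestDiagnoseC301(t *testing.T) {
	data := []byte{0xc3, 0x01}
	cbor.Diagnose(data)
}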
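Both reproducers are well-formed CBOR that is not valid: tags 2 and 3 require byte string content, but the content is the unsigned integer 1. The sketch below shows the kind of validity check that is missing. bignumContentValid is a hypothetical helper written for this example, not a function from the library, and it only inspects the first two bytes.

```go
package main

import "fmt"

const majorTypeByteString = 2

// bignumContentValid reports whether data starts with tag 2 or tag 3
// followed by an item whose major type is byte string.
func bignumContentValid(data []byte) bool {
	if len(data) < 2 {
		return false
	}
	if data[0] != 0xc2 && data[0] != 0xc3 {
		return false
	}
	// The major type is stored in the top three bits of the initial byte.
	return data[1]>>5 == majorTypeByteString
}

func main() {
	fmt.Println(bignumContentValid([]byte{0xc2, 0x01}))       // false: content is an unsigned integer
	fmt.Println(bignumContentValid([]byte{0xc3, 0x01}))       // false: content is an unsigned integer
	fmt.Println(bignumContentValid([]byte{0xc2, 0x41, 0x01})) // true: content is a 1-byte byte string
}
```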
Signed-off-by: Yan Qing <txr1883@gmail.com>
…2. update from valid() to wellformed(). Signed-off-by: Yan Qing <txr1883@gmail.com>
Force-pushed from 363f256 to 72e9118.
Thanks for your patience and updates! 🙏
I finished reviewing the non-test code. 😅 I had to reread sections of RFCs due to the nature of the PR and edge cases.
Some thoughts:
- Special handling is needed for indef-length byte string & text string without chunks.
- For now, we can just return error on invalid UTF-8 text strings instead of encoding to diagnostic notation. If needed, we can add option after v2.5.0 to print invalid UTF-8.
- We can add DiagnoseFirst() to match recently added UnmarshalFirst().
- We can remove CBORSequence option, because DiagnoseFirst() can be used instead.
Adding DiagnoseFirst() will make the API symmetrical with the newly added UnmarshalFirst(). And we can support diagnostic notation for CBOR data even if it has remaining bytes that are extraneous non-CBOR data.
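The first two points in the list above can be sketched as follows. RFC 8949 Section 8.1 reserves the forms ''_ and ""_ for indefinite-length strings with no chunks, which is why they need a dedicated branch. Both helper names are hypothetical, and strconv.Quote is used only as an approximation of the escaping that diagnostic notation requires.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"unicode/utf8"
)

// diagEmptyIndefString returns the notation for an indefinite-length
// string with no chunks: 0x5f 0xff (byte string) or 0x7f 0xff (text string).
// The bool result is false for any other input.
func diagEmptyIndefString(data []byte) (string, bool) {
	if len(data) != 2 || data[1] != 0xff {
		return "", false
	}
	switch data[0] {
	case 0x5f:
		return "''_", true
	case 0x7f:
		return `""_`, true
	}
	return "", false
}

// diagText quotes the payload of a text string, returning an error
// instead of notation when the payload is not valid UTF-8.
func diagText(payload []byte) (string, error) {
	if !utf8.Valid(payload) {
		return "", errors.New("invalid UTF-8 text string")
	}
	return strconv.Quote(string(payload)), nil
}

func main() {
	s, ok := diagEmptyIndefString([]byte{0x7f, 0xff})
	fmt.Println(s, ok)
	_, err := diagText([]byte{0xff})
	fmt.Println(err)
}
```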
}

if di.dm.byteStringEmbeddedCBOR {
	di2 := newDiagnose(val, di.dm.decMode, di.dm)
May need to change wellformed() to receive a bool flag for CBOR sequence, because wellformed uses di.dm.cborSequence and we need to use di.dm.byteStringEmbeddedCBOR here for embedded CBOR sequence.
Co-authored-by: Faye Amacker <33205765+fxamacker@users.noreply.github.com>
Signed-off-by: Yan Qing <txr1883@gmail.com>
@fxamacker Thanks for your review.
Signed-off-by: Yan Qing <txr1883@gmail.com>
Thanks so much! I'll take a closer look this weekend to see if we missed anything.
Yeah, you are right, I agree with 1. 👍 Although users can call We can keep
If you agree, I think I just need to review tests this weekend and hopefully merge. 😄
Agreed
Great work! Thanks again for your contributions! 👍
Really appreciate your patience with multiple delays caused by special circumstances. 🙏
Signed-off-by: Yan Qing txr1883@gmail.com
Description
PR Was Proposed and Welcomed in Currently Open Issue
Checklist (for code PR only, ignore for docs PR)
Last line of each commit message should be in this format: Signed-off-by: Firstname Lastname firstname.lastname@example.com (see next section).
Certify the Developer's Certificate of Origin 1.1