Question: For CBOR are Maps with keys of Major type 2 possible? #286

Closed
eikeon opened this issue Mar 14, 2019 · 10 comments

Comments

@eikeon

eikeon commented Mar 14, 2019

I've not been able to work out whether it's possible to encode maps with keys of major type 2 (byte string) in CBOR. I was converting my keys from []byte to string so that I could use them as keys in my map, but they are then encoded as major type 3 (text string).
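
For readers unfamiliar with the terminology: in CBOR, the major type is carried in the top three bits of an item's initial byte, with 2 meaning byte string and 3 meaning text string. The following is a minimal sketch using only the standard library; `majorType` is a hypothetical helper written for illustration, not part of the codec package.

```go
package main

import "fmt"

// majorType returns the CBOR major type, which is stored in the top
// three bits of an item's initial byte. Hypothetical helper for illustration.
func majorType(initial byte) byte {
	return initial >> 5
}

func main() {
	byteString := []byte{0x43, 0x5e, 0xa3, 0xf0} // byte string of length 3
	textString := []byte{0x61, 0x62}             // text string "b" of length 1

	fmt.Println(majorType(byteString[0])) // 2
	fmt.Println(majorType(textString[0])) // 3
}
```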

@ugorji
Owner

ugorji commented Mar 15, 2019

Thanks.

There's a quick hack fix, but I don't like it.

Instead, I want to take a second stab at how we handle the Go string type during encoding, and at how we decode it in the corresponding context.

This may cause the fix to take a week or so. I will be in touch once fixed.

@eikeon
Author

eikeon commented Mar 15, 2019

Thank you!

I will look forward to the enhancement and can test it once it’s ready for testing.

@ugorji ugorji closed this as completed in aa2c01e Mar 15, 2019
ugorji added a commit that referenced this issue Mar 15, 2019
When decoding into a nil interface{} (naked decoding),
we peek at the stream and decode into a type that matches the stream.

Previously, we treated string type as UTF-8. However, a go string is just
a sequence of bytes (an immutable view of []byte), that makes no determination
of the encoding.

To that effect, some users want to decode an un-encoded sequence of bytes
in the stream as an immutable view (a string).

We now enable this via the RawToString flag.

Now, if we peek at the stream and it is a sequence of bytes, we will decode it
into a string if RawToString=true. By default, it continues to be decoded into a []byte.

Updates #286
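
The decoding behavior described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `decodeNaked` is a hypothetical function that handles only short (under 24 bytes) CBOR byte strings and text strings, and its `rawToString` parameter stands in for the RawToString flag.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeNaked is a hypothetical stand-in for naked decoding. It peeks at the
// initial byte and picks a Go type that matches the stream.
func decodeNaked(in []byte, rawToString bool) (interface{}, error) {
	if len(in) == 0 {
		return nil, errors.New("empty input")
	}
	major, n := in[0]>>5, int(in[0]&0x1f)
	if n >= 24 || len(in) < 1+n {
		return nil, errors.New("unsupported or truncated item")
	}
	payload := in[1 : 1+n]
	switch major {
	case 2: // byte string: []byte by default, string if rawToString is set
		if rawToString {
			return string(payload), nil
		}
		return append([]byte(nil), payload...), nil
	case 3: // text string: always decoded into a Go string
		return string(payload), nil
	}
	return nil, errors.New("unsupported major type")
}

func main() {
	in := []byte{0x43, 0x5e, 0xa3, 0xf0} // byte string, not valid UTF-8
	a, _ := decodeNaked(in, false)
	b, _ := decodeNaked(in, true)
	fmt.Printf("%T %T\n", a, b) // []uint8 string
}
```
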
@ugorji
Owner

ugorji commented Mar 15, 2019

@eikeon I couldn't wait to try out my fix.

Please test out and let me know that it solves your concerns.

@eikeon
Author

eikeon commented Mar 15, 2019

Yes, with the EncodeOptions.StringToRaw = true the map keys are now turning into major type 2.

Question: if my map values are also strings, should the StringToRaw option apply to them as well? The CBOR I'm now getting is a map where the keys are encoded as type 2 and the values as type 3. In my current use case that is fine, I think, as the values are 'proper' strings. But I want to understand how the determination is made for the map values.

Seems like Go's type system is letting us down a bit here :(
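
To make the difference at the byte level concrete, here is a minimal sketch of what a flag like StringToRaw toggles. `encodeShortString` is a hypothetical helper, not the library's API, and it only handles strings shorter than 24 bytes. It takes no notice of whether the string is a map key or a map value, which is the uniform behavior asked about here.

```go
package main

import "fmt"

// encodeShortString is a hypothetical helper that encodes a short Go string
// as a CBOR text string (major type 3) or, when stringToRaw is set, as a
// CBOR byte string (major type 2). The payload bytes are identical.
func encodeShortString(s string, stringToRaw bool) []byte {
	if len(s) >= 24 {
		panic("sketch only handles strings shorter than 24 bytes")
	}
	major := byte(3)
	if stringToRaw {
		major = 2
	}
	// Initial byte: major type in the top 3 bits, length in the low 5 bits.
	out := []byte{major<<5 | byte(len(s))}
	return append(out, s...)
}

func main() {
	fmt.Printf("% x\n", encodeShortString("a", false)) // 61 61
	fmt.Printf("% x\n", encodeShortString("a", true))  // 41 61
}
```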

@ugorji
Owner

ugorji commented Mar 15, 2019

It should be working as expected. But I can only be sure with a reproducer.

Send me a reproducer (code I can run with go test or go run), and I can explain what is happening, or fix if something isn't working right.

@ugorji
Owner

ugorji commented Mar 15, 2019

@eikeon ^^

@ugorji
Owner

ugorji commented Mar 16, 2019

@eikeon it seems there is a code path that doesn't take the StringToRaw flag into consideration. Fixing now.

@ugorji ugorji reopened this Mar 16, 2019
@ugorji ugorji closed this as completed in 95c34d1 Mar 16, 2019
@eikeon
Author

eikeon commented Mar 16, 2019

@ugorji, The clarification in the doc is very helpful. I am now seeing StringToRaw being consistently applied.

I am not able to determine how I can use encoding.TextMarshaler so that my type gets encoded as a UTF-8 string. Here's my explainer:

package main

import (
	"bytes"
	"encoding/hex"
	"log"

	"github.com/ugorji/go/codec"
)

type Path string

type PathT string

func (p PathT) MarshalText() (text []byte, err error) {
	return []byte(p), nil
}

type Hash string

type Manifest map[Hash]interface{}

func (m Manifest) Bytes() []byte {
	var ch codec.CborHandle
	var b []byte
	ch.BasicHandle.EncodeOptions.Canonical = true
	ch.BasicHandle.EncodeOptions.StringToRaw = true
	enc := codec.NewEncoderBytes(&b, &ch)
	err := enc.Encode(m)
	if err != nil {
		log.Fatal(err)
	}
	return b
}

func main() {
	bufA, _ := hex.DecodeString("da8a33")
	bufB, _ := hex.DecodeString("5ea3f0")

	m := Manifest{Hash(bufB): Path("b"), Hash(bufA): Path("a")}
	mm := Manifest{Hash(bufB): PathT("b"), Hash(bufA): PathT("a")}

	if bytes.Equal(m.Bytes(), mm.Bytes()) {
		log.Println("not different")
	}
}

ugorji added a commit that referenced this issue Mar 16, 2019
@ugorji
Owner

ugorji commented Mar 16, 2019

see https://godoc.org/github.com/ugorji/go/codec#hdr-Custom_Encoding_and_Decoding

The type must implement both sides of the symmetry, i.e. both MarshalText and UnmarshalText.
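
A minimal sketch of a type that implements both sides, using only the standard library. It reuses the PathT name from the reproducer above; the pointer receiver on UnmarshalText and the compile-time assertions are illustrative choices. How the codec package then encodes such a type when StringToRaw is set is not confirmed in this thread.

```go
package main

import (
	"encoding"
	"fmt"
)

type PathT string

// MarshalText returns the path as text.
func (p PathT) MarshalText() ([]byte, error) {
	return []byte(p), nil
}

// UnmarshalText needs a pointer receiver so it can modify the value.
func (p *PathT) UnmarshalText(text []byte) error {
	*p = PathT(text)
	return nil
}

// Compile-time checks that both interfaces are satisfied.
var (
	_ encoding.TextMarshaler   = PathT("")
	_ encoding.TextUnmarshaler = (*PathT)(nil)
)

func main() {
	text, _ := PathT("a").MarshalText()
	var p PathT
	_ = p.UnmarshalText(text)
	fmt.Println(p) // a
}
```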

@ugorji
Owner

ugorji commented Mar 16, 2019

@eikeon

BTW, for performance and ergonomic reasons, I would write the code snippet as below (see https://godoc.org/github.com/ugorji/go/codec#hdr-Usage ):

type Manifest map[Hash]interface{}

var ch codec.CborHandle // handles maintain cache of things on first use - make shared, see docs above
func init() {
	ch.Canonical = true // leveraging ergonomic embedding rules of go
	ch.StringToRaw = true // leveraging ergonomic embedding rules of go
}
func (m Manifest) Bytes() (b []byte) {
	enc := codec.NewEncoderBytes(&b, &ch)
	err := enc.Encode(m)
	if err != nil {
		log.Fatal(err)
	}
	return
}
