Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix decoding of slice with type implemented UnmarshalJSON #198

Merged
merged 7 commits into from
May 1, 2021

Conversation

goccy
Copy link
Owner

@goccy goccy commented Apr 29, 2021

fix #195

In decoding UnmarshalJSON, it is assumed that the passed pointer is used as it is as a receiver. If a type that implements UnmarshalJSON is specified as an element of slice, this should be taken into consideration so that the pointer of the reused slice element is not passed as it is.

Copy link
Contributor

@IncSW IncSW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, the comment does not apply to this PR.

@goccy goccy force-pushed the feature/fix-decoding-of-slice branch from 5b7b1ac to e647daf Compare April 30, 2021 13:55
@zimmski
Copy link

zimmski commented Apr 30, 2021

I'll receive a segmentation fault with this PR:

unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x488232]

goroutine 1 [running, locked to thread]:
runtime.throw(0xf8f64a, 0x5)
        runtime/panic.go:1117 +0x72 fp=0xc001f3b468 sp=0xc001f3b438 pc=0x44c192
runtime.sigpanic()
        runtime/signal_unix.go:741 +0x268 fp=0xc001f3b4a0 sp=0xc001f3b468 pc=0x464fe8
runtime.memmove(0x7b3a226570795422, 0xc001ea3d18, 0x4)
        runtime/memmove_amd64.s:155 +0x112 fp=0xc001f3b4a8 sp=0xc001f3b4a0 pc=0x488232
runtime.slicecopy(0x7b3a226570795422, 0x4, 0xc001ea3d18, 0x4, 0x1, 0xc001f3b548)
        runtime/slice.go:277 +0x11c fp=0xc001f3b4f0 sp=0xc001f3b4a8 pc=0x46699c
encoding/json.(*RawMessage).UnmarshalJSON(0xc001eaa420, 0xc001ea3d18, 0x4, 0x4, 0xc001eaa420, 0x4)
        encoding/json/stream.go:275 +0x99 fp=0xc001f3b558 sp=0xc001f3b4f0 pc=0x67b379
github.com/goccy/go-json.(*unmarshalJSONDecoder).decode(0xc001ea5440, 0xc00013a9a0, 0x14f, 0x14f, 0x10, 0x1, 0xc001eaa420, 0xf, 0x0, 0x0)
        github.com/goccy/go-json/decode_unmarshal_json.go:67 +0x289 fp=0xc001f3b618 sp=0xc001f3b558 pc=0x93d5c9
github.com/goccy/go-json.(*mapDecoder).decode(0xc001ea0180, 0xc00013a9a0, 0x14f, 0x14f, 0x1, 0x1, 0xc001e29668, 0xc001e29668, 0x15f7680, 0xc001e82000)
        github.com/goccy/go-json/decode_map.go:153 +0x3e7 fp=0xc001f3b710 sp=0xc001f3b618 pc=0x924987
github.com/goccy/go-json.unmarshal(0xc001eaa3d8, 0x14e, 0xa80, 0xec5700, 0xc001e29668, 0xed4380, 0xc001e2f3e0)
        github.com/goccy/go-json/decode.go:44 +0x1b5 fp=0xc001f3b7b0 sp=0xc001f3b710 pc=0x908635
github.com/goccy/go-json.Unmarshal(...)
        github.com/goccy/go-json/json.go:266

Will add a reproducer in a jiffy.

@goccy goccy changed the title Fix decoding of slice type Fix decoding of slice with type implemented UnmarshalJSON Apr 30, 2021
@zimmski
Copy link

zimmski commented Apr 30, 2021

I'll receive a segmentation fault with this PR:

Sorry it takes me ages to reduce the test case to something that i can safely send out without any company violation. Just a note, i already tested your new commit in the meantime 6444a1b: did not solve the problem i am currently seeing. So am continuing to work on a reproducer.

@goccy
Copy link
Owner Author

goccy commented Apr 30, 2021

@zimmski Thank you, I wait for your report

@zimmski
Copy link

zimmski commented Apr 30, 2021

I think it has something to do with data being marked as invalid... when i add a "Valid" call to our code, i see that go-json marks the JSON raw data as invalid.

Here is an ugly reproducer... sorry i cannot post the original content.

package main

import (
	"encoding/json"
	"fmt"
	"os"

	gojson "github.com/goccy/go-json"
)

func main() {
	original := []byte(`{"ABCDEFGHIJKL":[{"MNOPQRSTUVWX":[{"YABC":{"DEFG":[{"HIJKLMNO":{"PQRS":"TUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXYABCDE","FGHIJKLM":"NOPQRSTUVW"},"XYABCDEFGH":[{"IJKLMNOP":"!=","Q":{"RSTU":"V","WXYABCDE":"FGHIJKLMNO"},"P":{"QRSTUVWX":"YAB"},"CDEFGHIJ":"KLMNOP","QRSTUVWX":"YABCDEFGHI"}],"JKLMNOPQRSTUVW":null,"XYABCDEF":"GHIJ"},{"KLMNOPQR":{"STUVWXY":{"ABCDEFGH":"IJKLMN_OPQ_RST","U":{"VWXY":"A","BCDEFGHI":"JKLMNOPQRS"},"TUVWXYAB":"CDEFG","HIJKLMNO":"PQRSTUVWXY"},"ABCDEFGH":"IJKLMNOP!Q41R8ST98U00V204W9800998XYA8427B","CDEFGHIJ":"KLMNOP","QRSTUVWX":"YABCDEFGHI"},"JKLMNOPQRS":null,"TUVWXYABCDEFGH":null,"IJKLMNOP":"QRST"}],"UVWXYABC":"DEFGH","IJKLMNOP":"QRSTUVWXY"},"ABCDEFGH":9,"IJKL":"MNOPQRST/UVWXYABCDE//FGHIJKLMNOPQRST!4UV2WXYABC7826D7659EF223GH40I91J","KLMNOPQRST":[{"UVWXYABCDEFG":null,"HIJKLMNO":0,"PQRS":"T","UVWX":{"YABC":{"DEFG":"HIJK/LMNO/PQRSTU","VWXYABCD":"EFGHIJKLM","NOPQRSTU":"VWXY"},"ABCDEFGH":"IJKLMNO","PQRSTUVW":"XYAB"},"CDEFGHIJ":"KLMNOPQR"}],"STUVWXYA":null,"BCDEFGH":null,"IJKLMN":null,"OPQRSTUVWXYABC":null,"DEFGHIJK":"LMNOPQRS"},{"TUVW":{"XYAB":[{"CDEFGHIJ":{"KLMN":"OPQRSTUV/WXYABCDEFG//HIJKLMNOPQRSTUV!4WX2YABCDE7826F7659GH223IJ40K91L","MNOPQRST":"UVWXYABCDE"},"FGHIJKLMNO":[{"PQRS":"T","UVWXYABC":"DEFGHIJKLM"}],"NOPQRSTUVWXYAB":null,"CDEFGHIJ":"KLMN"}],"OPQRSTUV":"WXYAB","CDEFGHIJ":"KLMNOPQRS"},"TUVWXYAB":9,"CDEF":"GHIJKLMN/OPQRSTUVWX//YABCDEFGHIJKLM!4NO2PQRSTU7826V7659WX223YA40B91C","DEFGHIJKLM":[{"NOPQRSTUVWXY":null,"ABCDEFGH":0,"IJKL":"M","NOPQ":{"RSTU":{"VWXY":"ABCD/EFGH/IJKLMN","OPQRSTUV":"WXYABCDEF","GHIJKLMN":"OPQR"},"STUVWXYA":"BCDEFGH","IJKLMNOP":"QRST"},"UVWXYABC":"DEFGHIJK"}],"LMNOPQRS":null,"TUVWXYA":null,"BCDEFG":null,"HIJKLMNOPQRSTU":null,"VWXYABCD":"EFGHIJKL"},{"MNOP":{"QRST":[{"UVWXYABC":0,"DEFG":["HIJK"],"LMNO":[{"PQRS":{"TUVW":"XYABCDEF/GHIJKLMNOP","QRSTUVWX":"YABCDEFGH","IJKLMNOP":"QRST"},"UVWXYABC":"DEFGHIJ","KLMNOPQR":"STUV"}],"WXYAB":[{"CDEF":{"GHIJ":{"KLMN":"OPQRSTUV/WXYABCDEFG","HIJKLMNO":"PQRSTUVWX","YABCDEFG":"HIJK"},"LMNOPQRS":"TUVWXYA","BCDEFGHI":"JKLM"},"NOPQRSTU":"VWX"}],"YABCDEFG":"HIJKLMNOPQR","STUVWXYA":"BCDEFGHIJ"},{"KLMNOPQR":"=","S":{"TUVWXYA":{"BCDE":"FGHI","JKLMNOPQ":"RSTUVWXYAB"},"CDEFGHIJ":"@KLMN/OPQR/STUVWX","YABCDEFG":"HIJKLM","NOPQRSTU":"VWXYABCDEF"},"G":{"HIJKLMNO":{"PQRS":"TUVW/XYAB/CDEFGH//IJKLMN!O41P8QR98S00T204U9800998VWX8427Y","ABCDEFGH":"IJKLMNOPQR"},"STUVWXYABC":null,"DEFGHIJKLMNOPQ":null,"RSTUVWXY":"ABCD"},"EFGHIJKL":"MNOPQR","STUVWXYA":"BCDEFGHIJK"},{"LMNOPQR":[{"STUV":"WXYA","BCDEFGHI":"JKLMNOPQRS"}],"TUVWXYAB":"CDEFGH","IJKLMNOP":"QRSTUVWXY"}],"ABCDEFGH":"IJKLM","NOPQRSTU":"VWXYABCDE"},"FGHIJKLM":37,"NOPQ":"RSTUVWXY/ABCDEFGHIJ//KLMNOPQRST!U41V8WX98Y00A204B9800998CDE8427F","GHIJKLMNOP":null,"QRSTUVWX":null,"YABCDEF":[{"GHIJKLMNOPQR":null,"STUVWXYA":0,"BCDE":"","FGHI":{"JKLM":{"NOPQ":"RSTUVWXY/ABCDEFGHIJ","KLMNOPQR":"STUVWXYAB","CDEFGHIJ":"KLMN"},"OPQRSTUV":"WXYABCD","EFGHIJKL":"MNOP"},"QRSTUVWX":"YABCDEFG"}],"HIJKLM":null,"NOPQRSTUVWXYAB":null,"CDEFGHIJ":"KLMNOPQR"}],"STUVWXYABC":null,"DEFGHIJK":[{"LMNO":{"PQRS":"TUVW/XYAB/CDEFGH","IJKLMNOP":"QRSTUVWXY","ABCDEFGH":"IJKL"},"MNOPQRST":"UVWXYAB","CDEFGHIJ":"KLMN"}],"OPQRSTUV":1,"WXYA":"BCDEFGHI/JKLMNOPQRS","TUVWXYAB":null,"CDEFGHIJKLMNOP":null,"QRSTUVWX":"YABCDE","FGHIJKLM":"NOPQ"}]}`)

	validEncodingJSON := json.Valid(original)
	validGoJSON := gojson.Valid(original)

	if validEncodingJSON != validGoJSON {
		fmt.Printf("%v vs %v\n", validEncodingJSON, validGoJSON)

		os.Exit(1)
	}
}

@zimmski
Copy link

zimmski commented Apr 30, 2021

However, i can reproduce with the master the same problem. So this is a bug that has nothing to do with this PR.

So i create a new issue and since this PR solves the original reproducer, it can be merged?

@zimmski
Copy link

zimmski commented Apr 30, 2021

I did find another case where it fails with a segmentation fault on this PR. Will first create the issue with the invalid case, since that can be reproduced with the master (will double check that). And then move on to have a reproducer for the segmentation fault for this PR. Sorry to flipflop.

Side question: is the form of reproducers in #198 (comment) ok? Because that is a no-brainer to send out.

@zimmski
Copy link

zimmski commented Apr 30, 2021

Finally. I got one that i can post (as i sidenote this happens with and without #200):

package main

import (
	"fmt"

	//gojson "encoding/json"
	gojson "github.com/goccy/go-json"
)

func main() {
	original := []byte(`{
		"Body": {
			"List": [
				{
					"Returns": [
						{
							"Value": "10",
							"nodeType": "Literal"
						}
					],
					"nodeKind": "Return",
					"nodeType": "Statement"
				}
			],
			"nodeKind": "Block",
			"nodeType": "Statement"
		},
		"nodeType": "Function"
	}`)

	var a map[string]gojson.RawMessage
	if err := gojson.Unmarshal(original, &a); err != nil {
		panic(err)
	}
	var b map[string]gojson.RawMessage
	if err := gojson.Unmarshal(a["Body"], &b); err != nil {
		panic(err)
	}
	var c []gojson.RawMessage
	if err := gojson.Unmarshal(b["List"], &c); err != nil {
		panic(err)
	}
	var d map[string]gojson.RawMessage
	if err := gojson.Unmarshal(c[0], &d); err != nil {
		panic(err)
	}
	var e []gojson.RawMessage
	if err := gojson.Unmarshal(d["Returns"], &e); err != nil {
		panic(err)
	}
	var f map[string]gojson.RawMessage
	if err := gojson.Unmarshal(e[0], &f); err != nil {
		fmt.Printf("JSON raw: %s\n", e[0])

		panic(err)
	}
}

@IncSW
Copy link
Contributor

IncSW commented Apr 30, 2021

@goccy @zimmski
problem can be solved by disabling slice pool, data corruption after copySlice

for reproduce:

func TestSliceBug(t *testing.T) {
	data := []byte(`["0-1-2-3-4-5-6-7-8-9-10(-11-12-13-14-15-16-17-18)-19-20"]`)

	var c []gojson.RawMessage
	if err := gojson.Unmarshal(data, &c); err != nil {
		panic(err)
	}

	var d []gojson.RawMessage
	if err := gojson.Unmarshal(data, &d); err != nil {
		panic(err)
	}

	t.Log(string(c[0]))
	t.Log(string(d[0]))
}

decode_slice.go Outdated Show resolved Hide resolved
@goccy
Copy link
Owner Author

goccy commented May 1, 2021

Probably, I was able to fix the problem without disabling the slice pool .

@IncSW
Copy link
Contributor

IncSW commented May 1, 2021

In my last comment on the code i suggested solution without disabling pool, but your solution looks better, all my yesterday tests are green.

@zimmski
Copy link

zimmski commented May 1, 2021

Thanks @goccy and @IncSW for taking such an interested! Looking at the made changes, i am pretty sure that i would have needed lots of time to build up the context to fix them.

I tested the changes (with and without the current version of #200) and my reproducers work now as well as another set of tests in our test suite. However, there are unfortunately still problems left:

  • Using the "Valid" function before unmarshaling finds some cases where go-json does not agree with encoding/json yet. I opened "Valid" does not behave the same way as "encoding/json.Valid" does #199 for that.
  • When i do not apply "Valid" before unmarshaling some of the JSON some of the "invalid" (encoding/json and jq agree they are valid) JSON raw data unmarshals, some not. I will try to post another reproducer for one of them.

Do you know any tooling to reduce JSON data automatically? If not, i could help out creating a reducer. Please let me know.

Since all tests passed and the code got reviewed(?), maybe this PR should be just merged and i will continue do add reproducers to new issues? What is your take on this?

@goccy
Copy link
Owner Author

goccy commented May 1, 2021

@zimmski
OK, so first of all, the purpose of this PR is to fix the problem of #195, so let's merge this.
Then, I will fix the Valid API's issue in another PR.
If you still have problems, could you create a new issue and report it ?

@zimmski
Copy link

zimmski commented May 1, 2021

@zimmski
OK, so first of all, the purpose of this PR is to fix the problem of #195, so let's merge this.

Agreed, thanks!

If you still have problems, could you create a new issue and report it ?

I am on it. Finding send-able reproducers is a pain. Do you know of any "reducer" tooling? If not, i will create one.

@zimmski
Copy link

zimmski commented May 1, 2021

Btw: I think this package also found a problem in our code base.

We have a type that looks like this

type someStruct struct {
  ...
  someField map[string]someInterface
  ...
}

And the values for "someInterface" are always "null" in that case so encoding/json automatically decodes to null, however go-json does not like that. Will report that as well, because the behavior is different to encoding/json, but i found it interesting since it is practically a bug in our code.

@goccy
Copy link
Owner Author

goccy commented May 1, 2021

Sorry, I know the tools for fuzzing, but I don't know any "reducer" tooling .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multiple "Unmarshal" calls on the same data slice lead to corrupted data that cannot be unmarshaled
3 participants