Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug when comparing size of len(msg.Pack) and msg.Len() #663

Closed
pierresouchay opened this issue Apr 2, 2018 · 6 comments
Closed

Bug when comparing size of len(msg.Pack) and msg.Len() #663

pierresouchay opened this issue Apr 2, 2018 · 6 comments

Comments

@pierresouchay
Copy link
Contributor

pierresouchay commented Apr 2, 2018

Following #657 I updated DNS lib to 1.0.5 into Consul but ran into another issue and I am still able to reproduce case where len(msg.Pack()) is not equal to msg.Len()

pierresouchay/consul@f85db7e

It crashes with the following output:

2018/04/02 15:25:33 [DEBUG] err:= %!s(<nil>), dns: DIFF between resp.Len() := 65481 and len(buf) := 65465 for ;; opcode: QUERY, status: NOERROR, id: 42437
;; flags: qr aa rd; QUERY: 1, ANSWER: 647, AUTHORITY: 0, ADDITIONAL: 647

;; QUESTION SECTION:
;redis.service.consul.	IN	 SRV

;; ANSWER SECTION:
redis.service.consul.	0	IN	SRV	1 1 8000 host-redis-1-146.test.acme.com.node.dc1.consul.
redis.service.consul.	0	IN	SRV	1 1 8000 host-redis-2-31.test.acme.com.node.dc1.consul.

[643 SRV similar Entries Skipped...]

redis.service.consul.	0	IN	SRV	1 1 8000 host-redis-3-100.test.acme.com.node.dc1.consul.
redis.service.consul.	0	IN	SRV	1 1 8000 host-redis-3-56.test.acme.com.node.dc1.consul.

;; ADDITIONAL SECTION:
host-redis-1-146.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.1.146.
host-redis-2-31.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.2.31.
host-redis-1-226.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.1.226.
host-redis-1-59.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.1.59.

[639 CNAME Entries Skipped...]

host-redis-3-108.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.3.108.
host-redis-3-85.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.3.85.
host-redis-3-100.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.3.100.
host-redis-3-56.test.acme.com.node.dc1.consul.	0	IN	CNAME	fx.168.3.56.
    2018/04/02 15:25:33 [DEBUG] http: Request PUT /v1/catalog/register (443.311µs) from=127.0.0.1:51799
    2018/04/02 15:25:33 [DEBUG] dns: request for name redis.service.consul. type SRV class IN (took 69.958983ms) from client 127.0.0.1:51798 (tcp)
panic: Bug in DNS lib

goroutine 2140 [running]:
github.com/hashicorp/consul/agent.(*DNSServer).trimTCPResponse(0xc4204266c0, 0xc42016e120, 0xc42016e1b0, 0x0)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:761 +0x638
github.com/hashicorp/consul/agent.(*DNSServer).trimDNSResponse(0xc4204266c0, 0x20e2a9f, 0x3, 0xc42016e120, 0xc42016e1b0, 0x289)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:827 +0xc0
github.com/hashicorp/consul/agent.(*DNSServer).serviceLookup(0xc4204266c0, 0x20e2a9f, 0x3, 0xc4205058ec, 0x3, 0xc420af2100, 0x5, 0x0, 0x0, 0xc42016e120, ...)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:911 +0x54f
github.com/hashicorp/consul/agent.(*DNSServer).dispatch(0xc4204266c0, 0x20e2a9f, 0x3, 0xc42016e120, 0xc42016e1b0)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:423 +0xd9a
github.com/hashicorp/consul/agent.(*DNSServer).handleQuery(0xc4204266c0, 0x2a295c0, 0xc420cfc780, 0xc42016e120)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:273 +0x4d4
github.com/hashicorp/consul/agent.(*DNSServer).(github.com/hashicorp/consul/agent.handleQuery)-fm(0x2a295c0, 0xc420cfc780, 0xc42016e120)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/agent/dns.go:117 +0x48
github.com/hashicorp/consul/vendor/github.com/miekg/dns.HandlerFunc.ServeDNS(0xc420182100, 0x2a295c0, 0xc420cfc780, 0xc42016e120)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/vendor/github.com/miekg/dns/server.go:84 +0x44
github.com/hashicorp/consul/vendor/github.com/miekg/dns.(*ServeMux).ServeDNS(0xc4201820c0, 0x2a295c0, 0xc420cfc780, 0xc42016e120)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/vendor/github.com/miekg/dns/server.go:210 +0x65
github.com/hashicorp/consul/vendor/github.com/miekg/dns.(*Server).serve(0xc4203be000, 0x2a1dfe0, 0xc420708030, 0x2a15ea0, 0xc4201820c0, 0xc42083e1c0, 0x3d, 0x3d, 0x0, 0x0, ...)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/vendor/github.com/miekg/dns/server.go:567 +0x2c3
github.com/hashicorp/consul/vendor/github.com/miekg/dns.(*Server).serveTCP.func1(0xc420182200, 0x2a29740, 0xc42022e0e0, 0x77359400, 0xc4203be000, 0xc420182220)
	/Users/p.souchay/go/src/github.com/hashicorp/consul/vendor/github.com/miekg/dns/server.go:481 +0x141
created by github.com/hashicorp/consul/vendor/github.com/miekg/dns.(*Server).serveTCP
	/Users/p.souchay/go/src/github.com/hashicorp/consul/vendor/github.com/miekg/dns/server.go:475 +0x2e3

I am gonna try to provide a test case

@pierresouchay
Copy link
Contributor Author

Here is a failing Test Case:

func TestMsgCompressLengthLargeRecordsAllValues(t *testing.T) {
	msg := new(Msg)
	msg.Compress = true
	msg.SetQuestion("redis.service.consul.", TypeSRV)
	for i := 0; i < 900; i++ {
		target := fmt.Sprintf("host-redis-%d-%d.test.acme.com.node.dc1.consul.", i/256, i%256)
		msg.Answer = append(msg.Answer, &SRV{Hdr: RR_Header{Name: "redis.service.consul.", Class: 1, Rrtype: TypeSRV, Ttl: 0x3c}, Port: 0x4c57, Target: target})
		msg.Extra = append(msg.Extra, &CNAME{Hdr: RR_Header{Name: target, Class: 1, Rrtype: TypeCNAME, Ttl: 0x3c}, Target: fmt.Sprintf("fx.168.%d.%d.", i/256, i%256)})
		predicted := msg.Len()
		buf, err := msg.Pack()
		if err != nil {
			t.Error(err)
		}
		if predicted != len(buf) {
			t.Fatalf("predicted compressed length is wrong for %d records: predicted %s (len=%d) %d, actual %d", i, msg.Question[0].Name, len(msg.Answer), predicted, len(buf))
		}
	}
}

It fails with the following:

--- FAIL: TestMsgCompressLengthLargeRecordsAllValues (0.10s)
	/Users/p.souchay/go/src/github.com/miekg/dns/length_test.go:170: predicted compressed length is wrong for 249 records: predicted redis.service.consul. (len=250) 22835, actual 22818
FAIL
FAIL	github.com/miekg/dns	0.115s
Error: Tests failed.

@miekg
Copy link
Owner

miekg commented Apr 6, 2018 via email

@pierresouchay
Copy link
Contributor Author

@miekg I tried to find the issue... it seems it is related to the compression of data of Extra... I was not exactly able to find what is the issue, but this is definitly related to tests such as len < maxCompressionOffset...

I'll try to find some time to dig a bit more

@miekg
Copy link
Owner

miekg commented Apr 9, 2018 via email

@pierresouchay
Copy link
Contributor Author

It took me a while to figure out the issue (a few hours), it was too simple :-)

Since len was added before trying to add label to compression and since it is legal to have a start offset before 14 bits, we just had to more the increase of len at the end of loop :-)

@pierresouchay
Copy link
Contributor Author

@miekg Ok, the patch I made seems to work fine now.

I added lots of tests cases, including TestMsgCompressLengthLargeRecordsWithPaddingPermutation which test all permutations with padding around 14bits, so it should work well in all cases now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants