Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When Compress is enable AND size if large, msg.Len() != len(msg.Pack()) #657

Closed
pierresouchay opened this issue Mar 28, 2018 · 12 comments
Closed

Comments

@pierresouchay
Copy link
Contributor

While working on a patch for Consul hashicorp/consul#3948 I found a bug in this library because when compression is activated, msg.Len() is not the same as len(msg.Pack())

Since Consul was using a very old version of the library, I thought it had been fixed, so Consul upgraded its version to 1.0.4 after this issue hashicorp/consul#3977

But still, the issue still exists with version 1.0.4.

I created a test case that shows it into action: (to paste into dns_test.go ):

func TestMsgCompressLengthLargeRecords(t *testing.T) {
	msg := new(Msg)
	msg.Compress = true
	msg.SetQuestion(Fqdn("my.service.acme."), TypeSRV)
	numEntries := 0
	for j := 0; j < 10; j++ {
		for i := 0; i < 255; i++ {
			numEntries++
			target := fmt.Sprintf("host-redis-%d-%d.test.acme.com.node.dc1.consul.", j, i)
			msg.Answer = append(msg.Answer, &SRV{Hdr: RR_Header{Name: "redis.service.consul.", Rrtype: 0x21, Class: 0x1, Ttl: 0x3c}, Port: 0x4c57, Target: target})
			msg.Extra = append(msg.Extra, &CNAME{Hdr: RR_Header{Name: target, Rrtype: 0x1, Class: 0x1, Ttl: 0x3c}, Target: fmt.Sprintf("fx.168.%d.%d.", j, i)})
			predicted := msg.Len()
			buf, err := msg.Pack()
			if err != nil {
				t.Error(err)
			}
			if predicted != len(buf) {
				t.Errorf("predicted compressed length is wrong: predicted %s (len=%d) %d, actual %d with %d entries",
					msg.Question[0].Name, len(msg.Answer), predicted, len(buf), numEntries)
			}
		}
	}
}

This test fails this way:

go test github.com/miekg/dns -run ^TestMsgCompressLengthLargeRecords$
--- FAIL: TestMsgCompressLengthLargeRecords (9.82s)
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=250) 22833, actual 22850 with 250 entries
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=251) 22925, actual 22959 with 251 entries
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=252) 23017, actual 23068 with 252 entries
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=253) 23109, actual 23177 with 253 entries
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=254) 23201, actual 23286 with 254 entries
	dns_test.go:309: predicted compressed length is wrong: predicted my.service.acme. (len=255) 23293, actual 23395 with 255 entries

Which is exactly what I did see while patching Consul.

In the patch I am working on, since Consul might create very large SRV records (several thousands), in order to have a good response, we are trying to fill DNS at its maximum, so we compute the size compressed, see if it is more than 64K, if not send the response, otherwise, remove records and retry.

So, basically, the size uncompressed is supposed to be far bigger than 64k while I am trying to improve the patch to send 64k MAX compressed.

Note that this test passes without any issue with compression disabled.

@andrewtj
Copy link
Collaborator

In trying to minimise this test case I ran in to the following:

func TestMsgCompress(t *testing.T) {
	msg := new(Msg)
	msg.Compress = true
	msg.Answer = append(msg.Answer, &SRV{Hdr: RR_Header{Name: ".", Rrtype: TypeSRV, Class: ClassINET, Ttl: 60}, Port: 0, Target: "."})
	predicted := msg.Len()
	buf, err := msg.Pack()
	if err != nil {
		t.Error(err)
	}
	if predicted != len(buf) {
		t.Errorf("predicted length is wrong: %d, actual %d", predicted, len(buf))
	}
}
--- FAIL: TestMsgCompress (0.00s)
	dns_test.go:504: predicted length is wrong: 32, actual 30
FAIL
exit status 1
FAIL	github.com/miekg/dns	0.027s

@pierresouchay
Copy link
Contributor Author

@andrewtj yes, seems linked to SRV records as my patch did not had issues with A/AAAA records

@andrewtj
Copy link
Collaborator

@pierresouchay the same behaviour occurs (with my example) using a CNAME instead of an SRV. I'm not sure we're exercising the same bug, but I'm not up to speed with the current compression and len implementation so I'm just hazarding a guess.

If you have a bit of time to put into this but don't want to dig in to the implementation itself (understandable), it might be interesting to see if your example shows up after a particularly offset as only names appearing at an offset that fits in 14-bits can be used as compression targets (RFC3597 might come in useful for padding).

Aside: SRV shouldn't be compressed anyway, but that's #521.

@miekg
Copy link
Owner

miekg commented Mar 30, 2018

oh yeah, the 14 bit limit that's not implemented in Len() (forgot about that one).

@miekg
Copy link
Owner

miekg commented Mar 30, 2018

Note the CNAME header type in the test above is set to 0x1 which is dns.TypeA, it should be dns.TypeCNAME

Clearing up the test to the minimum shows it happens (in case) with 250 records, so we keep missing 17 octets, which Pack seems to save, but Len() doesn't. Some hackish println-ing shows that all headers that can be compressed are compressed. Now wondering where 17 comes from and why...

--- FAIL: TestMsgCompressLengthLargeRecords (0.53s)
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=250) 22833, actual 22850 with 250 entries
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=251) 22925, actual 22959 with 251 entries
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=252) 23017, actual 23068 with 252 entries
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=253) 23109, actual 23177 with 253 entries
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=254) 23201, actual 23286 with 254 entries
	length_test.go:171: predicted compressed length is wrong: predicted my.service.acme. (len=255) 23293, actual 23395 with 255 entries

@miekg
Copy link
Owner

miekg commented Mar 30, 2018

Pack() thinks a test.acme.com.node.dc1.consul can get compressed, which is actually a bit weird.
The first number is where the pointer points to, the second is where we are in the packet.

COMPRESSING host-redis-1-248.test.acme.com.node.dc1.consul. 16329 22781
COMPRESSING test.acme.com.node.dc1.consul. 86 22824

@miekg
Copy link
Owner

miekg commented Mar 30, 2018

Pushed code that has my current code: #658

@pierresouchay
Copy link
Contributor Author

@miekg yes, i did copy the code of another test, just to reproduce the issue on Consul's codebase, this was just to prove my claims

@miekg
Copy link
Owner

miekg commented Mar 31, 2018

Ah,
adding roBs[begin:] to compression set host-redis-1-248.test.acme.com.node.dc1.consul.
Is the last named added to set of names that can still be used for compression, so for any new we can only compress test.acme.con.node.dc.consul using an earlier pointer. The bytes left are exactly 17 bytes. So Pack() is right here (pfew). Len() is wrong

@miekg
Copy link
Owner

miekg commented Mar 31, 2018

#658

Fixes Len() to take maxCompressionOffset into account. It doesn't fail on the test in length_test.go and to make sure I also used to above test code as well (which also works).

@miekg
Copy link
Owner

miekg commented Apr 1, 2018

closed via #658

@pierresouchay
Copy link
Contributor Author

I found another issue #663 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants