HTTP /v1/catalog/register endpoint closes connection when decoding invalid Service.Connect field #8529

Closed
blake opened this issue Aug 18, 2020 · 1 comment · Fixed by #8537
Labels
theme/api Relating to the HTTP API interface type/bug Feature does not function as expected

Comments

@blake
Member

blake commented Aug 18, 2020

Overview of the Issue

PR #6680, released in Consul 1.6.2, changed the way JSON request bodies are decoded. From that release through 1.8.3, Consul fails to decode a catalog registration request that contains the field "Connect": null (Service.Connect). Instead of responding with an error, it abruptly closes the TCP connection.

This field is sent by consul-k8s's catalog sync in version 0.7.0 and earlier.

Reproduction Steps

Steps to reproduce this issue:

  1. Create a JSON file (e.g., payload.json) with the following contents.

    {
        "ID": "",
        "Node": "k8s-sync",
        "Address": "127.0.0.1",
        "TaggedAddresses": null,
        "NodeMeta": {
            "external-source": "kubernetes"
        },
        "Datacenter": "",
        "Service": {
            "Kind": "",
            "ID": "test-service-f8fd5f0f4e6c",
            "Service": "test-service",
            "Tags": [
                "k8s"
            ],
            "Meta": {
                "external-k8s-ns": "",
                "external-source": "kubernetes",
                "port-stats": "18080"
            },
            "Port": 8080,
            "Address": "192.0.2.10",
            "EnableTagOverride": false,
            "CreateIndex": 0,
            "ModifyIndex": 0,
            "ProxyDestination": "",
            "Connect": null
        },
        "Check": null,
        "SkipNodeUpdate": true
    }
  2. Attempt to register the service with /v1/catalog/register using curl.

    $ curl --verbose \
        --insecure \
        --http1.1 \
        --request PUT \
        --header "Content-Type: application/json" \
        --data @payload.json \
        $CONSUL_HTTP_ADDR/v1/catalog/register
    *   Trying 10.0.0.201...
    * TCP_NODELAY set
    * Connected to 10.0.0.201 (10.0.0.201) port 443 (#0)
    * ALPN, offering http/1.1
    * successfully set certificate verify locations:
    *   CAfile: /etc/ssl/cert.pem
    CApath: none
    * TLSv1.2 (OUT), TLS handshake, Client hello (1):
    * TLSv1.2 (IN), TLS handshake, Server hello (2):
    * TLSv1.2 (IN), TLS handshake, Certificate (11):
    * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
    * TLSv1.2 (IN), TLS handshake, Server finished (14):
    * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
    * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
    * TLSv1.2 (OUT), TLS handshake, Finished (20):
    * TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
    * TLSv1.2 (IN), TLS handshake, Finished (20):
    * SSL connection using TLSv1.2 / ECDHE-ECDSA-AES256-GCM-SHA384
    * ALPN, server accepted to use http/1.1
    * Server certificate:
    *  subject: CN=server.dc2.consul
    *  start date: Jun 18 22:28:05 2020 GMT
    *  expire date: Jun 18 22:28:05 2022 GMT
    *  issuer: C=US; ST=CA; L=San Francisco; street=101 Second Street; postalCode=94105; O=HashiCorp Inc.; CN=Consul Agent CA 276763587613377718708777645654418674711
    *  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
    > PUT /v1/catalog/register HTTP/1.1
    > Host: 10.0.0.201
    > User-Agent: curl/7.64.1
    > Accept: */*
    > Content-Type: application/json
    > Content-Length: 711
    >
    * upload completely sent off: 711 out of 711 bytes
    * TLSv1.2 (IN), TLS alert, close notify (256):
    * Empty reply from server
    * Connection #0 to host 10.0.0.201 left intact
    curl: (52) Empty reply from server
    * Closing connection 0
    
  3. The server abruptly closes the connection without returning an HTTP response. No error is logged, even when Consul runs with a --log-level of DEBUG or TRACE. However, the following error appears when Consul runs under a debugger (stack trace from Consul 1.6.2).

    Failed to continue - runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation]
    Unable to propogate EXC_BAD_ACCESS signal to target process and panic (see https://github.com/go-delve/delve/issues/852)
    Last known immediate stacktrace (goroutine id 511):
        /Users/blake/Documents/p/hashicorp/consul/agent/structs/structs.go:901
            github.com/hashicorp/consul/agent/structs.(*ServiceConnect).UnmarshalJSON
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:860
            encoding/json.(*decodeState).literalStore
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:384
            encoding/json.(*decodeState).value
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:765
            encoding/json.(*decodeState).object
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:370
            encoding/json.(*decodeState).value
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:765
            encoding/json.(*decodeState).object
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:370
            encoding/json.(*decodeState).value
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/decode.go:180
            encoding/json.(*decodeState).unmarshal
        /usr/local/Cellar/go/1.14.6/libexec/src/encoding/json/stream.go:73
            encoding/json.(*Decoder).Decode
        /Users/blake/Documents/p/hashicorp/consul/agent/http.go:580
            github.com/hashicorp/consul/agent.decodeBody
        /Users/blake/Documents/p/hashicorp/consul/agent/catalog_endpoint.go:18
            github.com/hashicorp/consul/agent.(*HTTPServer).CatalogRegister
        /Users/blake/Documents/p/hashicorp/consul/agent/http.go:298
            github.com/hashicorp/consul/agent.(*HTTPServer).handler.func3
        /Users/blake/Documents/p/hashicorp/consul/agent/http.go:490
            github.com/hashicorp/consul/agent.(*HTTPServer).wrap.func1
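
The trace ends inside a custom UnmarshalJSON method that encoding/json invoked with the literal null. The following is a minimal sketch of how that class of panic can arise, and of one possible guard. It is not Consul's actual code: Sidecar, UnguardedConnect, GuardedConnect, and decode are hypothetical names, and the auxiliary-struct pattern is only assumed to resemble what ServiceConnect does.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Sidecar, UnguardedConnect, and GuardedConnect are hypothetical stand-ins
// for illustration; they are not Consul's real types.
type Sidecar struct {
	Name string
}

type UnguardedConnect struct {
	Native         bool
	SidecarService *Sidecar
}

// UnmarshalJSON accepts "sidecar_service" as an alternative key by decoding
// into an auxiliary struct that embeds an alias of the receiver.
func (c *UnguardedConnect) UnmarshalJSON(data []byte) error {
	type Alias UnguardedConnect // Alias has no methods, which avoids recursion
	aux := &struct {
		SidecarServiceSnake *Sidecar `json:"sidecar_service"`
		*Alias
	}{Alias: (*Alias)(c)}
	// For the input `null`, decoding into &aux sets the aux pointer to nil.
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	if c.SidecarService == nil {
		c.SidecarService = aux.SidecarServiceSnake // panics when aux is nil
	}
	return nil
}

type GuardedConnect struct {
	Native         bool
	SidecarService *Sidecar
}

func (c *GuardedConnect) UnmarshalJSON(data []byte) error {
	// Treat JSON null as a no-op, following the json.Unmarshaler convention.
	if bytes.Equal(data, []byte("null")) {
		return nil
	}
	type Alias GuardedConnect
	aux := &struct {
		SidecarServiceSnake *Sidecar `json:"sidecar_service"`
		*Alias
	}{Alias: (*Alias)(c)}
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	if c.SidecarService == nil {
		c.SidecarService = aux.SidecarServiceSnake
	}
	return nil
}

// decode unmarshals payload into v and converts a panic into an error so
// that both variants can be compared in one run.
func decode(payload string, v interface{}) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic while decoding: %v", r)
		}
	}()
	return json.Unmarshal([]byte(payload), v)
}

func main() {
	payload := `{"Connect": null}`
	var a struct{ Connect UnguardedConnect }
	fmt.Println("unguarded:", decode(payload, &a))
	var b struct{ Connect GuardedConnect }
	fmt.Println("guarded:", decode(payload, &b))
}
```

Because the field in this sketch is a struct value rather than a pointer, encoding/json still calls UnmarshalJSON for null, so the method itself has to tolerate that input.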
    

Consul info for the Server

Server info
agent:
	check_monitors = 0
	check_ttls = 0
	checks = 0
	services = 0
build:
	prerelease =
	revision = ba7d9435
	version = 1.8.2
consul:
	acl = enabled
	bootstrap = false
	known_datacenters = 1
	leader = false
	leader_addr = 10.42.1.12:8300
	server = true
raft:
	applied_index = 7685433
	commit_index = 7685433
	fsm_pending = 0
	last_contact = 50.271917ms
	last_log_index = 7685433
	last_log_term = 39744
	last_snapshot_index = 7679999
	last_snapshot_term = 39744
	latest_configuration = [{Suffrage:Voter ID:8467be19-2998-c825-c0b0-d876111a6969 Address:10.42.0.55:8300} {Suffrage:Voter ID:465cf9d9-dc76-1a35-56d0-b444522f92cd Address:10.42.1.12:8300} {Suffrage:Voter ID:4c555447-ecd9-e67d-6f09-4fc07a725ead Address:10.42.4.72:8300}]
	latest_configuration_index = 0
	num_peers = 2
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Follower
	term = 39744
runtime:
	arch = arm64
	cpu_count = 4
	goroutines = 181
	max_procs = 4
	os = linux
	version = go1.14.6
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 112
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 8430
	members = 6
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 3039
	members = 3
	query_queue = 0
	query_time = 1

The expectation is that Consul should return a proper HTTP error response to the requestor instead of closing the connection.

@blake blake added type/bug Feature does not function as expected theme/api Relating to the HTTP API interface labels Aug 18, 2020
@dnephin
Contributor

dnephin commented Aug 18, 2020

#8537 will fix the panic, but that request would still fail with a 400 because ProxyDestination was removed in 65be587 and a704ebe made unknown fields fail validation.

If we backport this fix to 1.6.x, that request should work.
