Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New edgeAgent Direct Method GetModuleLogs returns empty payload when management/workload URIs are http-based #3795

Closed
ryayl opened this issue Oct 23, 2020 · 3 comments · Fixed by #3908

Comments

@ryayl
Copy link

ryayl commented Oct 23, 2020

Expected Behavior

The new direct method GetModuleLogs (provided by edgeAgent 1.0.10) is working correctly on a RaspberryPi based system, where the iotedge config file has the following settings for both connect and listen:

management_uri: "unix:///var/run/iotedge/mgmt.sock"
workload_uri: "unix:///var/run/iotedge/workload.sock"

After sending GetModuleLogs command from the Azure CLI, I get a response including the payload containing the logs. Meanwhile I can observe the following activity in the edgeAgent and edgeDaemon logs:

  • edgeAgent receives direct method and generates http request, getting the response:
<7> 2020-10-23 21:47:49.140 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Making a Http call to unix:///var/run/iotedge/mgmt.sock to Get logs for CameraApp
<7> 2020-10-23 21:47:49.141 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connecting socket /var/run/iotedge/mgmt.sock
<7> 2020-10-23 21:47:49.142 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Connected socket /var/run/iotedge/mgmt.sock
<7> 2020-10-23 21:47:49.142 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Sending request http://mgmt.sock/modules/CameraApp/logs?api-version=2019-11-05&follow=false&tail=100
<7> 2020-10-23 21:47:49.158 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Util.Uds.HttpUdsMessageHandler] - Response received OK
<7> 2020-10-23 21:47:49.158 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Received a valid Http response from unix:///var/run/iotedge/mgmt.sock for Get logs for CameraApp
  • edgeDaemon receives the http request and sends the response:
Oct 23 22:55:57 rypi iotedged[1099]: 2020-10-23T21:55:57Z [INFO] - Getting logs for module CameraApp...
Oct 23 22:55:57 rypi iotedged[1099]: 2020-10-23T21:55:57Z [INFO] - Successfully got logs for module CameraApp
Oct 23 22:55:57 rypi iotedged[1099]: 2020-10-23T21:55:57Z [INFO] - [mgmt] - - - [2020-10-23 21:55:57.584500064 UTC] "GET /modules/CameraApp/logs?api-version=2019-11-05&follow=false&tail=100 HTTP/1.1" 200 OK - "-" "-" auth_id(-)

Current Behavior

On our production hardware (not RPi) we have a different setup which runs the edgeDaemon inside a container, therefore the management and workload URIs are set for http using the addr of the local docker bridge network:

management_uri: "http://172.17.0.1:8080"
workload_uri: "http://172.17.0.1:8081"

This network configuration seems to work for our application's needs, but GetModuleLogs always returns an empty response. I see the following activity in the logs:

  • edgeAgent receives direct method, generates http request and gets response:
<7> 2020-10-23 22:07:59.341 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Making a Http call to http://172.17.0.1:8080/ to Get logs for CameraApp
<7> 2020-10-23 22:07:59.344 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Received a valid Http response from http://172.17.0.1:8080/ for Get logs for CameraApp
  • edgeDaemon receives http request, fails
<6>2020-10-23T22:11:06Z [INFO] - [mgmt] - - - [2020-10-23 22:11:06.693025976 UTC] "GET //modules/CameraApp/logs?api-version=2019-11-05&follow=false&tail=100 HTTP/1.1" 404 Not Found - "-" "-" auth_id(-)

Note that the http GET request echoed by edgeDaemon differs from the working example - it has one extra leading slash

Is the edgeAgent not handling http-based management_uri strings correctly?

Steps to Reproduce

Provide a detailed set of steps to reproduce the bug.

  1. Set the management_uri in the iotedge config.yaml to http://:8080
  2. Ensure edgeAgent and iotedge security daemon still communicate normally with that setting, syncing modules, etc
  3. Send the GetModuleLogs direct method using az iot hub invoke-module-method
  4. Receive a response but payload is empty, even when log exists

Context (Environment)

Output of iotedge check

Click here

Configuration checks
--------------------
√ config.yaml is well-formed - OK
√ config.yaml has well-formed connection string - OK
× container engine is installed and functional - Error
    Could not communicate with container engine at unix:///var/run/docker.sock.
    Please check your moby-engine installation and ensure the service is running.
√ config.yaml has correct hostname - OK
‼ latest security daemon - Warning
    Installed IoT Edge daemon has version 1.0.9.4 but 1.0.10 is the latest stable version available.
    Please see https://aka.ms/iotedge-update-runtime for update instructions.
√ host time is close to real time - OK
‼ DNS server - Warning
    Container engine is not configured with DNS server setting, which may impact connectivity to IoT Hub.
    Please see https://aka.ms/iotedge-prod-checklist-dns for best practices.
    You can ignore this warning if you are setting DNS server per module in the Edge deployment.
‼ production readiness: certificates - Warning
    The Edge device is using self-signed automatically-generated development certificates.
    They will expire in 40 days (at 2020-12-02 22:42:07 UTC) causing module-to-module and downstream device communication to fail on an active deployment.
    After the certs have expired, restarting the IoT Edge daemon will trigger it to generate new development certs.
    Please consider using production certificates instead. See https://aka.ms/iotedge-prod-checklist-certs for best practices.
‼ production readiness: logs policy - Warning
    Container engine is not configured to rotate module logs which may cause it run out of disk space.
    Please see https://aka.ms/iotedge-prod-checklist-logs for best practices.
    You can ignore this warning if you are setting log policy per module in the Edge deployment.

Connectivity checks
-------------------
√ host can connect to and perform TLS handshake with IoT Hub AMQP port - OK
√ host can connect to and perform TLS handshake with IoT Hub HTTPS / WebSockets port - OK
√ host can connect to and perform TLS handshake with IoT Hub MQTT port - OK

7 check(s) succeeded.
4 check(s) raised warnings. Re-run with --verbose for more details.
1 check(s) raised errors. Re-run with --verbose for more details.
11 check(s) were skipped due to errors from other checks. Re-run with --verbose for more details.

  • Note for iotedge check, communication with docker.sock usually appears to fail because iotedge daemon is running inside a container that mounts the /var/run/ path as a volume. When I start a bash shell inside that running container in order to run iotedge check, I can see that /var/run/docker.sock exists, but I can't execute other docker commands while inside the container because the docker daemon is running on the host OS. This configuration may be related to the problem, but hasn't prevented iotedge security daemon from properly managing the other deployed module containers so far.

Device Information

  • Host OS [e.g. Ubuntu 16.04, Ubuntu 18.04, Windows IoT Core]: Torizon (Yocto/OpenEmbedded based)
  • Architecture [e.g. amd64, arm32, arm64]: arm32
  • Container OS [e.g. Linux containers, Windows containers]: iotedged runs in Debian container, other application containers running Ubuntu 16.04

Runtime Versions

  • iotedged [run iotedge version]: 1.0.9.4 (on both working and non-working systems)
  • Edge Agent [image tag (e.g. 1.0.0)]: 1.0.10 (on both working and non-working systems)
  • Edge Hub [image tag (e.g. 1.0.0)]: 1.0.10 (on both working and non-working systems)
  • Docker/Moby [run docker version]: (on non-working system)
Client:
 Version:           19.03.12-ce
 API version:       1.40
 Go version:        go1.14
 Git commit:        48a66213fe
 Built:             Sun Jul 26 07:21:29 2020
 OS/Arch:           linux/arm
 Experimental:      false

Server:
 Engine:
  Version:          19.03.12-ce
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.14
  Git commit:       48a66213fe1747e8873f849862ff3fb981899fc6
  Built:            Thu Jan  1 00:00:00 1970
  OS/Arch:          linux/arm
  Experimental:     false
 containerd:
  Version:          v1.2.7-11-gfd103cb7.m
  GitCommit:        fd103cb716352c7e19768e4fed057f71d68902a0.m
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f-dirty
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683-dirty

Note: when using Windows containers on Windows, run docker -H npipe:////./pipe/iotedge_moby_engine version instead

Logs

iotedged logs
<6>2020-10-23T22:11:06Z [INFO] - [mgmt] - - - [2020-10-23 22:11:06.693025976 UTC] "GET //modules/CameraApp/logs?api-version=2019-11-05&follow=false&tail=100 HTTP/1.1" 404 Not Found - "-" "-" auth_id(-)
edge-agent logs

<7> 2020-10-23 22:07:59.341 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Making a Http call to http://172.17.0.1:8080/ to Get logs for CameraApp
<7> 2020-10-23 22:07:59.344 +00:00 [DBG] [Microsoft.Azure.Devices.Edge.Agent.Edgelet.ModuleManagementHttpClient] - Received a valid Http response from http://172.17.0.1:8080/ for Get logs for CameraApp

edge-hub logs

n/a

Additional Information

Please provide any additional information that may be helpful in understanding the issue.

@veyalla
Copy link
Contributor

veyalla commented Oct 23, 2020

Thanks for the report, we'll take a look.

@ryayl
Copy link
Author

ryayl commented Dec 7, 2020

It looks like this fix made it into release 1.0.10.3, but so far I'm getting the same error. Does the edge security daemon also need to be updated, or only edgeAgent?

@veyalla
Copy link
Contributor

veyalla commented Dec 7, 2020

The fix is checked-in but not released. It will be part of our next release currently planned for Jan 2021.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants