Skip to content
This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

cmd/log-parser: Add containerID to list of standard fields #453

Closed
jodh-intel opened this issue Jun 28, 2018 · 5 comments
Closed

cmd/log-parser: Add containerID to list of standard fields #453

jodh-intel opened this issue Jun 28, 2018 · 5 comments
Assignees

Comments

@jodh-intel
Copy link
Contributor

Once kata-containers/runtime#453 lands, we should be able to add containerID as a standard field in the kata-log-parser tool.

The complication will be ensuring it handles the proxy which will not provide the containerID for all log entries (namely, the logs coming direct from the proxy as opposed to the agent output it also handles).

@jodh-intel jodh-intel self-assigned this Jun 28, 2018
jodh-intel added a commit to jodh-intel/tests-1 that referenced this issue Jun 29, 2018
Extract the container ID from the structured logs and add it to the
standard `LogEntry` type. This will allow (most of) the log entries to
be associated with individual containers, which will make debugging and
analysis easier.

Fixes kata-containers#453.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
@jodh-intel
Copy link
Contributor Author

Hi @bergwolf, @sboeuf, @grahamwhaley - I've got a branch that captures the containerID [1], but I'm wondering whether the log parser should also capture the sandboxID. If it did, we would be closer to being able to associate a group of log entries with an individual container (as the group would contain all the container-specific log entries + log entries associated with the containers "parent" sandbox).

There are of course quite a few log entries across the system that are not associated with any container/sandbox. But looking through https://github.com/kata-containers/agent/blob/master/protocols/grpc/agent.proto I noticed a few points:

  • All container-related APIs specify a container_id field (which is useful :)
  • None (well neither strictly :) of the sandbox APIs specify a corresponding sandbox_id field
  • None of the networking/CPU/Memory API calls specify a sandbox_id field.

To summarise, the following gRPC services are not associated with a container or a sandbox (from the perspective of the logs):

// networking
rpc AddInterface(AddInterfaceRequest) returns(Interface);
rpc UpdateInterface(UpdateInterfaceRequest) returns (Interface);
rpc RemoveInterface(RemoveInterfaceRequest) returns (Interface);
rpc UpdateRoutes(UpdateRoutesRequest) returns (Routes);

// misc (TODO: some rpcs can be replaced by hyperstart-exec)
rpc CreateSandbox(CreateSandboxRequest) returns (google.protobuf.Empty);
rpc DestroySandbox(DestroySandboxRequest) returns (google.protobuf.Empty);
rpc OnlineCPUMem(OznlineCPUMemRequest) returns (google.protobuf.Empty);

Clearly, they don't strictly need to be (as the system works :). However, for consistency and to allow us to improve our tracing, I wonder if we should consider adding a sandbox_id to each of the above?


[1] - It also picks out the container ID from the agent gRPC trace generated by https://github.com/kata-containers/agent/blob/master/agent.go#L585.

jodh-intel added a commit to jodh-intel/tests-1 that referenced this issue Jul 2, 2018
Extract the container ID from the structured logs and add it to the
standard `LogEntry` type. This will allow (most of) the log entries to
be associated with individual containers, which will make debugging and
analysis easier.

Fixes kata-containers#453.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
jodh-intel added a commit to jodh-intel/tests-1 that referenced this issue Jul 3, 2018
Extract the container ID from the structured logs and add it to the
standard `LogEntry` type. This will allow (most of) the log entries to
be associated with individual containers, which will make debugging and
analysis easier.

Fixes kata-containers#453.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
@grahamwhaley
Copy link
Contributor

+1 from me. I think being able to associate actions and commands in the logs with the relevant container and/or sandbox is pretty critical to our debug - particularly when we start to run in to multi-pod and parallelism bugs.

@sboeuf
Copy link

sboeuf commented Jul 3, 2018

+1 from me !

@jodh-intel
Copy link
Contributor Author

I've just realised that we don't need to pass the sandbox ID over the wire as we can tell the agent the sandboxID at agent startup by adding something like a kata-sandbox=$sandbox kernel command-line option.

@jodh-intel
Copy link
Contributor Author

jodh-intel added a commit to jodh-intel/tests-1 that referenced this issue Jul 4, 2018
Extract the container ID from the structured logs and add it to the
standard `LogEntry` type. This will allow (most of) the log entries to
be associated with individual containers, which will make debugging and
analysis easier.

Fixes kata-containers#453.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants