Add Traceflow User Guide #972

Merged
merged 1 commit into from
Aug 4, 2020
26 changes: 14 additions & 12 deletions docs/octant-plugin-installation.md
@@ -10,13 +10,13 @@ There are two ways to deploy Octant and antrea-octant-plugin.


### Prerequisites
antrea-octant-plugin depends on the Antrea monitoring CRDs, AntreaControllerInfo and AntreaAgentInfo.
antrea-octant-plugin depends on the Antrea monitoring CRDs (AntreaControllerInfo and AntreaAgentInfo) and Traceflow CRD (Traceflow).

To run Octant together with antrea-octant-plugin, please make sure you have these two CRDs defined in your K8s cluster.
To run Octant together with antrea-octant-plugin, please make sure you have these CRDs defined in your K8s cluster.

If Antrea is deployed before antrea-octant-plugin starts by using the standard deployment yaml, Antrea monitoring
If Antrea is deployed before antrea-octant-plugin starts by using the standard deployment yaml, these
CRDs should already be added. If not, please refer to [antrea.yaml](/build/yamls/antrea.yml) to
create these two CRDs first.
create these CRDs first.

### Deploy Octant and antrea-octant-plugin as a Pod

@@ -55,11 +55,16 @@ downloaded which may be due to network issues, you can run command `make
octant-antrea-ubuntu` to build the image locally. If it is the case, you need
to make sure that the image exists on all the K8s Nodes since the antrea-octant
Pod may run on any of them.
2. If the Pod is running without any explicit issue but you can not access the
2. In Antrea v0.8.2, the Traceflow UI is a separate Octant plugin called antrea-traceflow-plugin,
but it has been merged into antrea-octant-plugin on the master branch since then.
To get the latest version of Traceflow UI, please build image
antrea/octant-antrea-ubuntu via command `make octant-antrea-ubuntu` and use the image antrea/octant-antrea-ubuntu:latest
for deploying UI as a Pod.
3. If the Pod is running without any explicit issue but you can not access the
URL, please take a further look at the network configurations in your
environment. It may be due to the network policies or other security rules
configured on your hosts.
3. To deploy a released version of the plugin, you can download
4. To deploy a released version of the plugin, you can download
`https://github.com/vmware-tanzu/antrea/releases/download/<TAG>/antrea-octant.yml`,
where `<TAG>` (e.g. `v0.3.0`) is the desired version (should match the version
of Antrea you are using). After making the necessary edits, you can apply the
@@ -97,7 +102,7 @@ based on your environment and move the binary to OCTANT_PLUGIN_PATH.
For example, you can get antrea-octant-plugin-linux-x86_64 if it matches your operating system and architecture.

```bash
wget -O antrea-octant-plugin https://github.com/vmware-tanzu/antrea/releases/download/v0.8.1/antrea-octant-plugin-linux-x86_64
wget -O antrea-octant-plugin https://github.com/vmware-tanzu/antrea/releases/download/<TAG>/antrea-octant-plugin-linux-x86_64
# Make sure antrea-octant-plugin is executable, otherwise Octant cannot find it.
chmod a+x antrea-octant-plugin
# If you did not change OCTANT_PLUGIN_PATH, the default folder should be $HOME/.config/octant/plugins.
@@ -114,15 +119,12 @@ based on your environment and move the binary to OCTANT_PLUGIN_PATH.
Now, you are supposed to see Octant is running together with antrea-octant-plugin via URL http://(IP or $HOSTNAME):80.

Note:
1. In Antrea v0.8.1, the Traceflow UI is a separate Octant plugin called antrea-traceflow-plugin.
Starting with v0.9.0, the Traceflow UI will be merged into antrea-octant-plugin. When deploying Octant as a Pod using
image antrea/octant-antrea-ubuntu:v0.8.1, you already have access to the alpha version of the Traceflow UI.
2. If you deploy Octant and the Antrea UI as a process, you cannot access the Traceflow UI for now when following the
1. If you deploy Octant and the Antrea UI as a process, you cannot access the Traceflow UI for now when following the
steps listed above (at least until the v0.9.0 release). However, you can still build the binary yourself with
the command below, with the remaining steps being almost the same as the ones above.

```bash
# You will find the compiled binary under folder antrea/plugins/octant/bin.
cd plugins/octant
make antrea-traceflow-plugin
make antrea-octant-plugin
```
107 changes: 107 additions & 0 deletions docs/traceflow-guide.md
@@ -0,0 +1,107 @@
# Traceflow User Guide

Antrea supports using Traceflow for network diagnosis: it generates tracing requests for traffic going through the
Antrea-managed Pod network. Creating a new Traceflow CRD triggers the Traceflow module to inject a packet into OVS,
provide observation points along the packet's path, and populate these observations into the status field of
the Traceflow CRD. Users can start a new trace from either kubectl or antrea-octant-plugin and view the Traceflow
result via the CRD or a UI graph. We will also provide a corresponding antctl command to start a new trace in the near future.

## Table of Contents

- [Prerequisites](#prerequisites)
- [Start a New Trace](#start-a-new-trace)
  - [Using kubectl and YAML file](#using-kubectl-and-yaml-file)
  - [Using Octant with antrea-octant-plugin](#using-octant-with-antrea-octant-plugin)
- [View Traceflow Result and Graph](#view-traceflow-result-and-graph)
- [View Traceflow CRDs](#view-traceflow-crds)

## Prerequisites
You need to enable the `Traceflow` feature gate under `featureGates` in antrea.yml for both the Controller and the Agent.
```yaml
  antrea-controller.conf: |
    featureGates:
    # Enable traceflow which provides packet tracing feature to diagnose network issue.
      Traceflow: true
  antrea-agent.conf: |
    featureGates:
    # Enable traceflow which provides packet tracing feature to diagnose network issue.
      Traceflow: true
```
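As a quick sanity check before applying the manifest, the sketch below counts how many uncommented `Traceflow: true` lines appear in a local copy of antrea.yml. The helper name `count_traceflow_gates` and the example path are hypothetical, and the check does not verify which configuration section each match belongs to.

```bash
# Hypothetical helper: count uncommented "Traceflow: true" lines in a manifest.
count_traceflow_gates() {
  # grep -c prints the number of matching lines; "|| true" keeps a zero count
  # from being reported as a command failure.
  grep -c -E '^[[:space:]]*Traceflow:[[:space:]]*true[[:space:]]*$' "$1" || true
}

# Example usage (the path is an assumption about where your copy lives):
# count_traceflow_gates build/yamls/antrea.yml
```

A count of 2 suggests that the gate is enabled in both the Controller and the Agent configuration.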
For antrea-octant-plugin installation, please refer to [antrea-octant-installation](/docs/octant-plugin-installation.md).

## Start a New Trace
You can start a new trace either with kubectl and a YAML file or from the Octant UI.
If you use kubectl, you can provide the following information, which is used to build the trace packet:
* source Pod
* destination Pod or destination IP address
* transport protocol (TCP/UDP/ICMP)
* transport ports

If you use the UI to start a new trace, only Pods are currently supported as the destination; support for
destination IPs and Service names will be added soon.

### Using kubectl and YAML file
You can start a new trace by creating a Traceflow CRD with kubectl from a YAML file that contains the essential
Traceflow configuration. An example YAML file might look like this:
```yaml
apiVersion: ops.antrea.tanzu.vmware.com/v1alpha1
kind: Traceflow
metadata:
  name: tf-test
spec:
  source:
    namespace: default
    pod: tcp-sts-0
  destination:
    namespace: default
    pod: tcp-sts-2
    # ip: IP can also be marked as destination, but namespace/pod and ip are mutually exclusive.
  packet:
    ipHeader:
      protocol: 6 # Protocol here can be 6 (TCP), 17 (UDP) or 1 (ICMP), default value is 1 (ICMP)
    transportHeader:
      tcp:
        srcPort: 10000 # Source port needs to be set when Protocol is TCP/UDP.
        dstPort: 80 # Destination port needs to be set when Protocol is TCP/UDP.
```
The CRD above starts a new trace from port 10000 of source Pod named `tcp-sts-0` to port 80
of destination Pod named `tcp-sts-2` using TCP protocol.

Review comment (Member), on the `protocol` line:
Not related to this PR. This reminds me we probably should use string "TCP", "UDP", "ICMP" for protocols like NetworkPolicy does to make it more friendly, otherwise people like me won't be able to set correct protocol without googling once every time. @jianjuns @antoninbas @gran-vmv what do you think?
https://github.com/kubernetes/kubernetes/blob/master/pkg/apis/core/types.go#L619

Reply (Contributor):
+1 - but is breaking backwards-compatibility an issue? maybe open an issue and we will make that change when we switch the API to beta, along with any other change that may come up until then?
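
As a sketch of the kubectl workflow in this section, the helper below prints a TCP Traceflow manifest that can be piped to kubectl. The function name `render_traceflow` and its argument order are hypothetical; the fields mirror the example manifest, and the `kubectl apply` step is left as a comment because it needs a cluster where the Traceflow CRD is installed.

```bash
# Hypothetical helper: print a TCP Traceflow manifest to stdout.
# Arguments: name, source namespace, source Pod, destination namespace,
# destination Pod, source port, destination port.
render_traceflow() {
  cat <<EOF
apiVersion: ops.antrea.tanzu.vmware.com/v1alpha1
kind: Traceflow
metadata:
  name: $1
spec:
  source:
    namespace: $2
    pod: $3
  destination:
    namespace: $4
    pod: $5
  packet:
    ipHeader:
      protocol: 6 # TCP
    transportHeader:
      tcp:
        srcPort: $6
        dstPort: $7
EOF
}

# Example usage against a cluster:
# render_traceflow tf-test default tcp-sts-0 default tcp-sts-2 10000 80 | kubectl apply -f -
```

Generating the manifest on stdout keeps the helper free of side effects, so the output can be inspected or saved to a file before anything is applied.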

### Using Octant with antrea-octant-plugin

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_create.png" width="600" alt="Start a New Trace">

From the Octant dashboard, click "Antrea" in the left navigation bar and then choose the "Traceflow" category
to open the Traceflow UI, which is displayed on the right side.

You can then start a new trace by clicking the "Start New Trace" button and submitting the form with the trace details.
This creates a Traceflow CRD and generates the corresponding Traceflow graph.

## View Traceflow Result and Graph

You can always view the Traceflow result directly in the Traceflow CRD status to see whether the packet was successfully
delivered or dropped at a particular packet-processing stage. Antrea also provides a more user-friendly view by showing the
Traceflow result as a trace graph in the UI.
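
For the CRD route, a minimal sketch is to read the status field with kubectl. The wrapper name `traceflow_status` is hypothetical, and the resource name `traceflow` passed to `kubectl get` is assumed to resolve to the Traceflow CRD in your cluster.

```bash
# Hypothetical wrapper: print the status field of a Traceflow CRD by name.
traceflow_status() {
  kubectl get traceflow "$1" -o jsonpath='{.status}'
}

# Example usage against a cluster:
# traceflow_status tf-test
```

Replacing the `jsonpath` output with `-o yaml` prints the whole object, which can be easier to read when the status holds many observations.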

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_graph_success.png" width="600" alt="Show Successful Trace">

From the graph above, we can see that the inter-node traffic between two Pods has been successfully delivered.
When traffic is not delivered, a trace graph like the one below makes it easy to identify where the traffic
is dropped.

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_graph_failure.png" width="600" alt="Show Failing Trace">

You can also generate a historical trace graph by providing a specific Traceflow CRD name (assuming the CRD has not been deleted yet)
as shown below.

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_historical_graph.png" width="600" alt="Generate Historical Trace">

## View Traceflow CRDs

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_overview.png" width="600" alt="Antrea Overview">

As shown above, you can check the existing Traceflow CRDs in the "Traceflow Info" table of the Antrea Overview web page
in the Octant UI. You can generate a trace graph for any of these CRDs, as explained in the previous section.
You can also view all the Traceflow CRDs from the Traceflow page by clicking the "Traceflow Info" tab on the right, as shown below.

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/tf_table.png" width="600" alt="Traceflow CRDs">