Add network metrics collection #50

Merged on Nov 26, 2018 (2 commits)

lzang
Collaborator

@lzang lzang commented Oct 25, 2018

Add network metrics collection logic for the following metrics:

  • conntrack entries: conntrack table usage.
  • conntrack_error_count (insert_failed and drop): requested by TSE because of the issue described here:
    https://www.weave.works/blog/racy-conntrack-and-dns-lookup-timeouts
  • num_inuse_sockets (tcp, udp): helps tell whether the node is under some sort of DDoS attack, or whether a particular node has an unusually high number of connections.
  • num_tw_sockets: indicates whether there is a large number of short-lived sockets, which is usually undesirable.
  • socket_memory: the memory used by all sockets. When it hits the limit, the network becomes unusable. There is an existing request to add this metric ("kubelet should track tcp_mem stats also along with cpu/ram/disk", kubernetes/kubernetes#62334).

All of these metrics can be obtained by reading proc files directly (i.e. O(1) cost).

@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

@googlebot

CLAs look good, thanks!

@bowei bowei self-assigned this Nov 6, 2018
@bowei
Member

bowei commented Nov 16, 2018

Seems like my comment got eaten:

type conntrackStats struct {
  insertFailed int
  drop int
}

// merge adds the counters from other into c.
func (c *conntrackStats) merge(other *conntrackStats) {
  c.insertFailed += other.insertFailed
  c.drop += other.drop
}

// conntrackIndices records the column positions of the fields we extract.
type conntrackIndices struct {
  numFields int
  insertFailed int
  drop int
}

// parseHeader parses the conntrack header line, returning the
// indices of the fields we wish to extract.
func parseHeader(line string) (conntrackIndices, error) {
  var indices conntrackIndices
  nameParts := strings.Split(line, " ")
  indices.numFields = len(nameParts)
  for i, v := range nameParts {
    switch v {
    case "insert_failed":
      indices.insertFailed = i
    case "drop":
      indices.drop = i
    }
  }
  return indices, nil
}

// parseConntrackData parses ...
func parseConntrackData(line string, indices conntrackIndices) (*conntrackStats, error) {
  stats := &conntrackStats{}
  valueParts := strings.Split(line, " ")
  if len(valueParts) != indices.numFields { ... }

  // Values are parsed as hexadecimal; ParseUint returns uint64, so convert to int.
  if v, err := strconv.ParseUint(valueParts[indices.insertFailed], 16, 32); err == nil {
    stats.insertFailed = int(v)
  } else { ... }

  if v, err := strconv.ParseUint(valueParts[indices.drop], 16, 32); err == nil {
    stats.drop = int(v)
  } else { ... }

  return stats, nil
}

// parseConntrackFile parses the conntrack file contents read in from `r` and returns a merged set of stats.
func parseConntrackFile(r io.Reader) (conntrackStats, error) {
  ...
  if !scanner.Scan() {
    // error no header line
  }
  indices, err := parseHeader(scanner.Text())
  if err != nil { ... }

  var accStats conntrackStats
  for scanner.Scan() {
    stats, err := parseConntrackData(scanner.Text(), indices)
    if err != nil { ... }
    accStats.merge(stats)
  }
  ...
}

Member

@bowei bowei left a comment

Run golint and go vet.

minor comments

@bowei bowei merged commit e77b8fc into GoogleCloudPlatform:master Nov 26, 2018
3 participants