Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add containerPool container histogram metric #5222

Merged
merged 8 commits into from
May 27, 2022

Conversation

ningyougang
Copy link
Contributor

@ningyougang ningyougang commented Apr 24, 2022

There has container start relative metric in FunctionFullContainerPool,
It is better to provide container size histogram metric as well

  • Add containerPool container size histogram metric
  • Add pod creation consume time metric already existed using trans.finish
  • Add docker container creation consume time metric already existed using trans.finish

Description

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Scheduler
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

val t1 = System.currentTimeMillis()
MetricEmitter.emitHistogramMetric(
LogMarkerToken("kubeapi", "create", "duration", Some("create"), Map("cmd" -> "create"))(
MeasurementUnit.time.milliseconds),
Copy link
Contributor Author

@ningyougang ningyougang Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, just add kubeapi for create, maybe we can add kubeapi for delete in future.

Copy link
Member

@upgle upgle May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The related transaction id already exists. And it is better to use transid than to calculate time directly.

    val start = transid.started(
      this,
      LoggingMarkers.INVOKER_KUBEAPI_CMD("create"),
      s"launching pod $name (image:$image, mem: ${memory.toMB}) (timeout: ${config.timeouts.run.toSeconds}s)",
      logLevel = akka.event.Logging.InfoLevel)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transid.finished method already sends Histogram metric here. What's the difference?

MetricEmitter.emitHistogramMetric(endMarker, deltaToEnd)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@upgle ,good found, has no any difference, i already removed the duplicated one, just keep transid.finish to send the metric.

MetricEmitter.emitHistogramMetric(LoggingMarkers.INVOKER_CONTAINERPOOL_CONTAINER("prewarmed"), prewarmedSize)
MetricEmitter.emitHistogramMetric(LoggingMarkers.INVOKER_CONTAINERPOOL_CONTAINER("busy"), busySize)
MetricEmitter.emitHistogramMetric(LoggingMarkers.INVOKER_CONTAINERPOOL_CONTAINER("warmed"), warmedSize)
MetricEmitter.emitHistogramMetric(LoggingMarkers.INVOKER_CONTAINERPOOL_CONTAINER("all"), allSize)
Copy link
Contributor Author

@ningyougang ningyougang Apr 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As admin, he may want to know the prewarmed/busy/warmed container in one invoker in metric show page

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep this a great metric addition and needed. Should we include namespace and action as tags here as well for the warmed containers? If needed in a separate metric. I think it's useful as well to know what invokers a namespace and actions are running on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that would apply for the memory metric as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bdoyle0182 for warmed container size metric, i have added namespace and action tags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding I guess that would apply for the memory metric as well
Seems a little complex to apply tag: namespace + action to memory metric.
Currently, i just apply the tag(namespace + action) to container metric.

@codecov-commenter
Copy link

codecov-commenter commented Apr 24, 2022

Codecov Report

Merging #5222 (4b595e0) into master (4b595e0) will not change coverage.
The diff coverage is n/a.

❗ Current head 4b595e0 differs from pull request most recent head 53cbe7f. Consider uploading reports for the commit 53cbe7f to get more accurate results

@@           Coverage Diff           @@
##           master    #5222   +/-   ##
=======================================
  Coverage   44.87%   44.87%           
=======================================
  Files         238      238           
  Lines       13996    13996           
  Branches      561      561           
=======================================
  Hits         6281     6281           
  Misses       7715     7715           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b595e0...53cbe7f. Read the comment docs.

@bdoyle0182
Copy link
Contributor

LGTM other than my one comment

MetricEmitter.emitHistogramMetric(
LogMarkerToken("docker", "runCmd", "duration", Some(args.head), Map("cmd" -> args.head))(
MeasurementUnit.time.milliseconds),
Instant.now.toEpochMilli - transid.meta.start.toEpochMilli)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the usual way to send metric.
Is there any reason not to use transid start, finish ?

Copy link
Contributor Author

@ningyougang ningyougang May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already changed to use transid.finish to send the metric and delete the repeated one.

Copy link
Member

@upgle upgle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the transid.finished method is already sending histogram metric. I wonder what the difference is.

val t1 = System.currentTimeMillis()
MetricEmitter.emitHistogramMetric(
LogMarkerToken("kubeapi", "create", "duration", Some("create"), Map("cmd" -> "create"))(
MeasurementUnit.time.milliseconds),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transid.finished method already sends Histogram metric here. What's the difference?

MetricEmitter.emitHistogramMetric(endMarker, deltaToEnd)

@ningyougang ningyougang force-pushed the add-container-count-metric branch 2 times, most recently from 18b611d to 910d7de Compare May 10, 2022 08:56
def INVOKER_RUNC_CMD(cmd: String) =
LogMarkerToken(invoker, "runc", start, Some(cmd), Map("cmd" -> cmd))(MeasurementUnit.time.milliseconds)
def INVOKER_KUBEAPI_CMD(cmd: String) =
LogMarkerToken(invoker, "kubeapi", start, Some(cmd), Map("cmd" -> cmd))(MeasurementUnit.none)
LogMarkerToken(invoker, "kubeapi", start, Some(cmd), Map("cmd" -> cmd))(MeasurementUnit.time.milliseconds)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obviously, the unit should be MeasurementUnit.time.milliseconds

@@ -207,7 +207,6 @@ class DockerClient(dockerHost: Option[String] = None,
case Success(_) => transid.finished(this, start)
case Failure(pte: ProcessTimeoutException) =>
transid.failed(this, start, pte.getMessage, ErrorLevel)
MetricEmitter.emitCounterMetric(LoggingMarkers.INVOKER_DOCKER_CMD_TIMEOUT(args.head))
Copy link
Contributor Author

@ningyougang ningyougang May 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transid.failed already send the metric, so delete the repeated one and deleted the INVOKER_DOCKER_CMD_TIMEOUT method

@bdoyle0182
Copy link
Contributor

@ningyougang can we get this merged in? This is a useful metric to have available along with the container pool memory that's already there

@ningyougang
Copy link
Contributor Author

@ningyougang can we get this merged in? This is a useful metric to have available along with the container pool memory that's already there

Oh, ok

@ningyougang ningyougang reopened this May 26, 2022
@ningyougang ningyougang force-pushed the add-container-count-metric branch 2 times, most recently from 4b2b1ba to 45c1d32 Compare May 26, 2022 06:20
}
val warmedPoolResultMap: mutable.Map[(String, String), Int] = mutable.Map.empty[(String, String), Int]
warmedPoolGroupMap.foreach { data =>
warmedPoolResultMap += (data._1 -> data._2.size)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an incompatibility problem in travis test.
image

So don't use mapValues here due to scala2.13's return value is MapView and scala2.12's return type is Map

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't it just remove the type annotation?

    val warmedPoolMap = warmedPool groupBy {
      case (_, warmedData) => (warmedData.invocationNamespace, warmedData.action.toString)
    } mapValues (_.size)
    for((data, size) <- warmedPoolMap) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good solution.

map += (k -> v)
}
}
LogMarkerToken(invoker, "containerPoolContainer", counter, Some(state), map)(MeasurementUnit.none)
Copy link
Contributor

@jiangpengcheng jiangpengcheng May 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can use tags.map(Map("state" -> state) ++: _).getOrElse(Map("state" -> state)) here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated accordingly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorrry, this Map("state" -> state) ++: tags.getOrElse(Map.empty) should be better

Copy link
Contributor

@jiangpengcheng jiangpengcheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some minor suggestions

@ningyougang ningyougang merged commit 1a6c99d into apache:master May 27, 2022
JesseStutler pushed a commit to JesseStutler/openwhisk that referenced this pull request Jul 13, 2022
* Add containerPool container histogram metric

* Add pod creation consume time metric

* Add docker container creation consume time metric

* Use transaction to calculate

* Add namespace and action tags for container metric

* Avoid send metric repeatedly

* Fix scala 2.13 compilation error

* Update according to review comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants