Diagnostics: Add CPU history #1602

Merged
j82w merged 5 commits into master from users/jawilley/diagnostics/cpu_usage on Jun 11, 2020

Conversation

j82w
Contributor

@j82w j82w commented Jun 5, 2020

Pull Request Template

Description

This adds the CPU load history to the diagnostics of every request. The information helps troubleshoot latency and other problems caused by high CPU usage. Collection is best effort: if an exception is hit while gathering it, the CPU history is left out of the diagnostics and the user's request is not blocked.
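The best-effort behavior described above can be illustrated with a minimal sketch. This is not the SDK's implementation (the SDK is .NET); the names `add_cpu_history` and `read_cpu_history` are hypothetical and exist only for illustration. The only details taken from this PR are the "SystemInfo" entry and its "CpuHistory" field.

```python
def add_cpu_history(context, read_cpu_history):
    """Append a SystemInfo entry to `context` on a best-effort basis.

    `context` is a list of diagnostics entries (dicts).
    `read_cpu_history` is a hypothetical callable that returns the CPU
    history string and may raise.
    Returns True if the entry was added, False if it was skipped.
    """
    try:
        history = read_cpu_history()
    except Exception:
        # Best effort: drop the CPU history instead of failing the request.
        return False
    context.append({"Id": "SystemInfo", "CpuHistory": history})
    return True
```

The caller gets its diagnostics either way; a failure to read the CPU load only means the "SystemInfo" entry is missing.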

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update
{
    "DiagnosticVersion": "2",
    "Summary": {
        "StartUtc": "2020-06-05T14:56:53.9601828Z",
        "TotalElapsedTimeInMs": 145.179,
        "UserAgent": "cosmos-netstandard-sdk/3.9.1|3.10.0|03|X64|Microsoft Windows 10.0.19041 |.NET Core 4.6.28801.04|",
        "TotalRequestCount": 1,
        "FailedRequestCount": 0
    },
    "Context": [
        {
            "Id": "ItemSerialize",
            "ElapsedTimeInMs": 0.101
        },
        {
            "Id": "ExtractPkValue",
            "ElapsedTimeInMs": 28.4009
        },
        {
            "Id": "AggregatedClientSideRequestStatistics",
            "ContactedReplicas": [
                {
                    "Count": 12,
                    "Uri": "rntbd://127.0.0.1:10253/apps/DocDbApp/services/DocDbServer19/partitions/a4cb495f-38c8-11e6-8106-8cdcd42c33be/replicas/1p/"
                }
            ],
            "RegionsContacted": [
                "https://127.0.0.1:8081/"
            ],
            "FailedReplicas": []
        },
        {
            "Id": "Microsoft.Azure.Cosmos.Handlers.DiagnosticsHandler",
            "HandlerElapsedTimeInMs": 116.54690000000001
        },
        {
            "Id": "SystemInfo",
            "CpuHistory": "(2020-06-05T14:56:40.6078897Z 50.000), (2020-06-05T14:56:50.6084254Z 3.458)"
        },
        {
            "Id": "Microsoft.Azure.Cosmos.Handlers.RetryHandler",
            "HandlerElapsedTimeInMs": 116.54350000000001
        },
        {
            "Id": "Microsoft.Azure.Cosmos.Handlers.RouterHandler",
            "HandlerElapsedTimeInMs": 116.5233
        },
        {
            "Id": "Microsoft.Azure.Cosmos.Handlers.TransportHandler",
            "HandlerElapsedTimeInMs": 1.4924000000000002
        },
        {
            "Id": "Microsoft.Azure.Documents.ServerStoreModel",
            "ElapsedTimeInMs": 116.46820000000001
        },
        {
            "Id": "AddressResolutionStatistics",
            "StartTimeUtc": "2020-06-05T14:56:54.0522007Z",
            "EndTimeUtc": "2020-06-05T14:56:54.0712330Z",
            "ElapsedTimeInMs": 19.0323,
            "TargetEndpoint": "https://127.0.0.1:8081//addresses/?$resolveFor=dbs%2fkdAZAA%3d%3d%2fcolls%2fkdAZAPI0Aag%3d%2fdocs&$filter=protocol eq rntbd&$partitionKeyRangeIds=0"
        },
        {
            "Id": "StoreResponseStatistics",
            "StartTimeUtc": "2020-06-05T14:56:53.9887790Z",
            "ResponseTimeUtc": "2020-06-05T14:56:54.1050021Z",
            "ElapsedTimeInMs": 116.2231,
            "ResourceType": "Document",
            "OperationType": "Create",
            "LocationEndpoint": "https://127.0.0.1:8081/",
            "ActivityId": "2d891c88-7907-4dde-85f0-66cab771df98",
            "StoreResult": "StorePhysicalAddress: rntbd://127.0.0.1:10253/apps/DocDbApp/services/DocDbServer19/partitions/a4cb495f-38c8-11e6-8106-8cdcd42c33be/replicas/1p/, LSN: 14, GlobalCommittedLsn: -1, PartitionKeyRangeId: 0, IsValid: True, StatusCode: 201, SubStatusCode: 0, RequestCharge: 11.05, ItemLSN: -1, SessionToken: -1#14, UsingLocalLSN: False, TransportException: null"
        }
    ]
}
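
For anyone consuming this output, here is a hedged sketch of how the "CpuHistory" string could be parsed. It assumes the format stays exactly as in the sample above, a comma-separated list of "(UTC timestamp, space, load percent)" pairs, which this PR does not guarantee. The function names `parse_cpu_history` and `_parse_utc` are illustrative and not part of the SDK.

```python
import json
import re
from datetime import datetime, timezone

# Matches one "(<timestamp> <load>)" pair as it appears in the sample output.
_PAIR = re.compile(r"\((\S+)\s+([0-9.]+)\)")


def _parse_utc(text):
    # The sample timestamps carry 7 fractional digits; Python's %f accepts
    # at most 6, so the fraction is truncated to microseconds.
    head, _, frac = text.rstrip("Z").partition(".")
    stamp = f"{head}.{frac[:6] or '0'}"
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%f").replace(tzinfo=timezone.utc)


def parse_cpu_history(diagnostics_json):
    """Return [(timestamp, load_percent), ...] from the SystemInfo entry.

    Returns an empty list when the entry is absent, which can happen
    because the CPU history is recorded on a best-effort basis.
    """
    doc = json.loads(diagnostics_json)
    for entry in doc.get("Context", []):
        if entry.get("Id") == "SystemInfo":
            pairs = _PAIR.findall(entry.get("CpuHistory", ""))
            return [(_parse_utc(ts), float(load)) for ts, load in pairs]
    return []
```

Treating a missing "SystemInfo" entry as an empty history keeps callers from having to special-case requests where collection failed.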

@j82w j82w added the Diagnostics Issues around diagnostics and troubleshooting label Jun 5, 2020
@j82w j82w self-assigned this Jun 5, 2020
Member

@FabianMeiswinkel FabianMeiswinkel left a comment

The only comment I am blocking on is the "ProcessInfo" JSON name. It is exposed publicly and should be "SystemInfo", "ProcessorInfo", "CPUInfo", or similar. "ProcessInfo" is confusing because we measure the system CPU usage, not the usage of the process.

@kirankumarkolli
Member

Please run it through with Fabian once.

@j82w j82w merged commit 180ef23 into master Jun 11, 2020
@j82w j82w deleted the users/jawilley/diagnostics/cpu_usage branch June 11, 2020 14:16