Record stats by service #966

Closed
paulmelnikow opened this issue Apr 26, 2017 · 14 comments
Labels

  • blocker: PRs and epics which block other work
  • core: Server, BaseService, GitHub auth, Shared helpers
  • operations: Hosting, monitoring, and reliability for the production badge servers

Comments

@paulmelnikow
Member

Logging stats by service would allow us to identify which badges are used the most often. This would be useful for prioritizing fixes and feature requests. It would also help with questions like the one I raised here:

A standard API key with https://libraries.io/ covers 60 requests per minute. Will that be enough for the public server?

Currently we're recording stats by format. Those stats were used to drive the choice of the default format. Since that's now been established, there's no need to keep doing that.

In the process of doing that, it'd be great to move the stats code into separate modules, and get some tests around that code.

We could consider removing support for Redis, which is not used on the production servers.

(Much of the above comes from an offline conversation with @espadrine)

@paulmelnikow paulmelnikow added the core Server, BaseService, GitHub auth, Shared helpers label Apr 26, 2017
@paulmelnikow
Member Author

Started by moving the code into a module and adding a test of the analytics endpoint: #970.

@paulmelnikow paulmelnikow added operations Hosting, monitoring, and reliability for the production badge servers developer-experience Dev tooling, test framework, and CI labels Mar 4, 2018
@techtonik
Contributor

Just be careful not to overanalyze it. There is still a "No tracking" promise on http://shields.io =)

@paulmelnikow paulmelnikow removed developer-experience Dev tooling, test framework, and CI core Server, BaseService, GitHub auth, Shared helpers labels Nov 6, 2018
@paulmelnikow paulmelnikow added core Server, BaseService, GitHub auth, Shared helpers blocker PRs and epics which block other work labels Jan 4, 2019
paulmelnikow added a commit that referenced this issue Feb 24, 2019
paulmelnikow added a commit that referenced this issue Feb 27, 2019
This picks up #2068 by adding per-badge stats as discussed in #966.

It ensures every service has a unique `name` property. By default this comes from the class name, and is overridden in all the various places where the class names are duplicated. (Some of those don't seem that useful, like the various download interval services, though those need to be refactored down into a single service anyway.) Tests enforce the names are unique. These are the names used by the service-test runner, so it's a good idea to make them unique anyway. (It was sort of strange before that you had to specify `nuget` instead of e.g. `resharper`.)
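
The uniqueness rule described above can be sketched roughly as follows. This is a hypothetical Python illustration, not the project's actual code or tests: `BaseService`, `service_name`, `find_duplicate_names`, and the example service classes are all assumed names chosen for the sketch.

```python
from collections import Counter


class BaseService:
    """Hypothetical base class: `name` defaults to the class name."""

    name = None  # a subclass may override this to resolve a clash

    @classmethod
    def service_name(cls):
        return cls.name or cls.__name__


class NugetVersion(BaseService):
    pass


class ResharperVersion(BaseService):
    pass


# Services defined in different modules can end up sharing a class name.
# The second one overrides `name` so the two no longer collide.
NpmDownloads = type("Downloads", (BaseService,), {})
GemDownloads = type("Downloads", (BaseService,), {"name": "GemDownloads"})


def find_duplicate_names(services):
    """Return the sorted names that are used by more than one service."""
    counts = Counter(service.service_name() for service in services)
    return sorted(name for name, count in counts.items() if count > 1)
```

A test in this style would assert that `find_duplicate_names` returns an empty list for the full set of loaded services.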

I've added validation to `deprecatedService` and `redirector`, and required that every `route` has a `base`, even if it's an empty string.

The name is used to generate unique metric labels, generating metrics like these:

```
service_requests_total{category="activity",family="eclipse-marketplace",service="eclipse_marketplace_update"} 2
service_requests_total{category="activity",family="npm",service="npm_collaborators"} 3
service_requests_total{category="activity",family="steam",service="steam_file_release_date"} 2
service_requests_total{category="analysis",family="ansible",service="ansible_galaxy_content_quality_score"} 2
service_requests_total{category="analysis",family="cii-best-practices",service="cii_best_practices_service"} 4
service_requests_total{category="analysis",family="cocoapods",service="cocoapods_docs"} 2
service_requests_total{category="analysis",family="codacy",service="codacy_grade"} 3
service_requests_total{category="analysis",family="coverity",service="coverity_scan"} 2
service_requests_total{category="analysis",family="coverity",service="deprecated_coverity_ondemand"} 2
service_requests_total{category="analysis",family="dependabot",service="dependabot_semver_compatibility"} 3
service_requests_total{category="analysis",family="lgtm",service="lgtm_alerts"} 2
service_requests_total{category="analysis",family="lgtm",service="lgtm_grade"} 3
service_requests_total{category="analysis",family="snyk",service="snyk_vulnerability_git_hub"} 4
service_requests_total{category="analysis",family="snyk",service="snyk_vulnerability_npm"} 5
service_requests_total{category="analysis",family="symfony",service="sensiolabs_i_redirector"} 1
service_requests_total{category="analysis",family="symfony",service="symfony_insight_grade"} 1
service_requests_total{category="build",family="appveyor",service="app_veyor_ci"} 3
service_requests_total{category="build",family="appveyor",service="app_veyor_tests"} 6
service_requests_total{category="build",family="azure-devops",service="azure_dev_ops_build"} 6
service_requests_total{category="build",family="azure-devops",service="azure_dev_ops_release"} 5
service_requests_total{category="build",family="azure-devops",service="azure_dev_ops_tests"} 6
service_requests_total{category="build",family="azure-devops",service="vso_build_redirector"} 2
service_requests_total{category="build",family="azure-devops",service="vso_release_redirector"} 1
service_requests_total{category="build",family="bitbucket",service="bitbucket_pipelines"} 5
service_requests_total{category="build",family="circleci",service="circle_ci"} 5
```
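
As a rough illustration of how a service name could turn into labels like the ones above, here is a minimal Python sketch. It assumes the labels are a snake_case form of the class name, which is inferred from the output and not confirmed here; `to_metric_label`, `ServiceRequestCounter`, and the class names passed in (for example `AppVeyorCi`) are hypothetical.

```python
import re
from collections import Counter


def to_metric_label(class_name):
    """Insert "_" before every capital except the first, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


class ServiceRequestCounter:
    """Hypothetical in-memory counter keyed by (category, family, service)."""

    def __init__(self):
        self._counts = Counter()

    def inc(self, category, family, service_class_name):
        key = (category, family, to_metric_label(service_class_name))
        self._counts[key] += 1

    def render(self):
        """Render the counts as lines in the style of the output above."""
        lines = []
        for (category, family, service), count in sorted(self._counts.items()):
            lines.append(
                f'service_requests_total{{category="{category}",'
                f'family="{family}",service="{service}"}} {count}'
            )
        return "\n".join(lines)


counter = ServiceRequestCounter()
counter.inc("build", "appveyor", "AppVeyorCi")
counter.inc("build", "appveyor", "AppVeyorCi")
counter.inc("analysis", "snyk", "SnykVulnerabilityGitHub")
rendered = counter.render()
```

Splitting on every capital letter is what would produce labels such as `app_veyor_ci` and `snyk_vulnerability_git_hub`, so unique names map to unique labels as long as the names differ by more than letter case.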

This is predicated on being able to use Prometheus's [`rate()`](https://prometheus.io/docs/prometheus/latest/querying/functions/#rate) function to visualize a counter's rate of change, as mentioned at #2068 (comment). Otherwise the stats will be disrupted every time a server restarts.
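
To make the restart concern concrete, here is a simplified Python sketch of the idea behind a reset-aware rate. It is not Prometheus's implementation (the real `rate()` also extrapolates to the edges of the query window); `counter_increase` and `simple_rate` are hypothetical names, and the sample values are made up for illustration.

```python
def counter_increase(samples):
    """Total increase of a counter across (timestamp_seconds, value) samples.

    A drop in value is treated as a restart: the counter is assumed to have
    started again from zero, so the whole new value counts as increase.
    """
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            total += current  # reset detected
    return total


def simple_rate(samples):
    """Average per-second increase between the first and last sample."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    return counter_increase(samples) / elapsed


# The server restarts between t=60 and t=120, so the counter drops to 3.
samples = [(0, 10), (60, 20), (120, 3), (180, 9)]
increase = counter_increase(samples)
per_second = simple_rate(samples)
```

Because the drop is counted as a fresh start, the restart does not show up as a negative or missing stretch in the computed rate.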

The metrics only appear on new-style services.
@paulmelnikow
Member Author

It's really fascinating to see all this data! What people are really using has always been such a mystery, and I have to say, it's really fun to see the numbers climb! Some badges which I personally don't notice that often are more popular than I might have expected! It's interesting to see. 🍬

@platan I tried adding some dashboards but I'm finding the process a bit confusing!

These are some of the ideas I had:

  • A pie chart showing badges in the last 7 days by category
  • A table showing the 50 most popular individual services over the last 7 days
  • A stacked graph, like the one in Cloudflare (below), showing total requests per day per server
  • A table of all the service families, showing each family's percentage of the total badges over the last 7 days

[Image: screenshot of the Cloudflare stacked graph, 2019-02-27]
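
The aggregations behind the category, family, and top-services ideas above could be sketched in Python like this. It is only an offline illustration that works on raw counter values copied from the metrics output earlier in this thread; a real dashboard would query increases over the chosen time range. `parse_samples`, `percentage_by`, and `top_services` are hypothetical helper names.

```python
import re
from collections import defaultdict

SAMPLE_LINE = re.compile(
    r'^service_requests_total\{category="([^"]*)",family="([^"]*)",'
    r'service="([^"]*)"\} (\d+)$'
)

SAMPLE_TEXT = """\
service_requests_total{category="analysis",family="snyk",service="snyk_vulnerability_git_hub"} 4
service_requests_total{category="analysis",family="snyk",service="snyk_vulnerability_npm"} 5
service_requests_total{category="build",family="appveyor",service="app_veyor_ci"} 3
service_requests_total{category="build",family="appveyor",service="app_veyor_tests"} 6
service_requests_total{category="build",family="circleci",service="circle_ci"} 5
"""


def parse_samples(text):
    """Parse metric lines into dicts, skipping lines that do not match."""
    samples = []
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if match:
            category, family, service, value = match.groups()
            samples.append(
                {"category": category, "family": family,
                 "service": service, "count": int(value)}
            )
    return samples


def percentage_by(samples, label):
    """Share of total requests per value of `label`, in percent."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[label]] += sample["count"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: 100.0 * value / grand_total for key, value in totals.items()}


def top_services(samples, n):
    """The n busiest services, ties broken alphabetically by service name."""
    ranked = sorted(samples, key=lambda s: (-s["count"], s["service"]))
    return [(s["service"], s["count"]) for s in ranked[:n]]


samples = parse_samples(SAMPLE_TEXT)
by_category = percentage_by(samples, "category")
by_family = percentage_by(samples, "family")
busiest = top_services(samples, 2)
```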

@chris48s
Member

Yeah, well done on getting this shipped. It's amazing to finally get some understanding of which integrations are popular and what sort of numbers are involved. There are some quite surprising results in there.
The numbers are probably a bit skewed, since we likely log less traffic for badges that are usually served via a proxy than for ones that aren't, but it's great to have these stats.

@platan
Member

platan commented Mar 1, 2019

@paulmelnikow thanks for the ideas for graphs. It would be a pleasure for me to add them to Grafana. I can do it at the beginning of next week. This week I don't have access to my PC.

@paulmelnikow
Member Author

@platan That sounds awesome!

@platan
Member

platan commented Mar 5, 2019

I've prepared a first version of a dashboard with service stats: https://metrics.shields.io/d/aESRBSjmz/services?orgId=2. Feel free to update it (especially the names of the graphs/tables). All graphs except "Total requests per day per server" show data for the selected time range (you can select a time range using the control shown in the screenshot: screenshot_2019-03-05 grafana - services).
Do you have ideas for more graphs?

@paulmelnikow
Member Author

paulmelnikow commented Mar 6, 2019

It is so fascinating to see this information!

I will think about some more graphs :)

paulmelnikow added a commit that referenced this issue Mar 6, 2019
Right now they're showing up in "other," though I expect they make up
most of that category.

#966 (comment)
calebcartwright pushed a commit that referenced this issue Mar 6, 2019
Right now they're showing up in "other," though I expect they make up
most of that category.

#966 (comment)
@paulmelnikow
Member Author

Seems like this is done! Thanks @platan! Let's open a new issue for any follow-on work.


4 participants