chore(docs): fix grammatical errors #617

Merged 1 commit on Sep 16, 2023
10 changes: 5 additions & 5 deletions MOTIVATION.md
@@ -1,7 +1,7 @@
## Promxy, a prometheus scaling story

### The beginning
Prometheus is an all-in-one metrics and alerting system. The fact that everything is built in is quite convenient when doing initial setup and testing. Throw grafana in front of that and we are cooking with gas! At this scale-- there where no concerns, only snappy metrics and pretty graphs.
Prometheus is an all-in-one metrics and alerting system. The fact that everything is built-in is quite convenient when doing initial setup and testing. Throw grafana in front of that and we are cooking with gas! At this scale -- there were no concerns, only snappy metrics and pretty graphs.

### Redundancy
After installing the first prometheus host you realize that you need redundancy. To do this you stand up a second prometheus host with the same scrape config. At this point you have 2 nodes with the data, but grafana is only pointing at one. Quickly you put a load balancer in front and grafana load balances between the 2 nodes -- problem solved! Then some time in the future you have to reboot a prometheus host. After the host reboots you notice that you have holes in the graphs 50% of the time. With prometheus itself there is no short- or long-term solution for this, as there is no cross-host merging and prometheus' datastore doesn't support backfilling data.
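
To make that concrete (a minimal, hypothetical sketch; the job name, ports, and targets below are invented, not taken from this document), both hosts run an identical scrape config and sit behind the load balancer:

```
# prometheus.yml -- deployed unchanged on both prometheus hosts (hypothetical example)
global:
  scrape_interval: 15s

scrape_configs:
  # Same job list on host A and host B, so each node independently scrapes every target.
  - job_name: 'node'
    static_configs:
      - targets: ['app01:9100', 'app02:9100']
```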
@@ -10,10 +10,10 @@ After installing the first prometheus host you realize that you need redundancy.
As you continue to scale (adding more machines and metrics) you quickly realize that
all of your metrics cannot fit on a single host anymore. No problem, we'll shard the
scrape config! The suggested way in the prometheus community is to split your metrics
based on application-- so you dutifully do so. Now you have a cluster per-app for
based on application -- so you dutifully do so. Now you have a cluster per-app for
metrics separate. Soon after though you realize that there are lots of servers, so
its not even feasibly for all the app metrics in a single shard -- you need to split
them. You do this by region/az/etc. but now grafana is littered with so many
it's not even feasible for all the app metrics to be in a single shard -- you need to
split them. You do this by region/az/etc. but now grafana is littered with so many
prometheus datasources! No problem you say to yourself, as you switch from using
a single source to using mixed sources -- and adding each region/az/etc. with the
same promql statements.
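
As a hypothetical illustration of where that leaves you (the app and region names are invented), each app/region shard ends up with its own scrape config, its own prometheus cluster, and yet another grafana datasource:

```
# prometheus-webapp-us-east.yml -- one shard: the web app's metrics in us-east (hypothetical)
scrape_configs:
  - job_name: 'webapp'
    static_configs:
      - targets: ['web-us-east-01:9100', 'web-us-east-02:9100']
---
# prometheus-webapp-eu-west.yml -- a second shard, a second prometheus cluster,
# and another grafana datasource to add to the mixed-source dashboards (hypothetical)
scrape_configs:
  - job_name: 'webapp'
    static_configs:
      - targets: ['web-eu-west-01:9100', 'web-eu-west-02:9100']
```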
@@ -39,7 +39,7 @@ At this point you consider your situation:
- You have metrics
- You have redundancy (with the occasional hole on a restart or node failure)
- You have **many** prometheus data sources in grafana (which is confusing to all your grafana users -- as well as yourself!)
- You have Aggregation set up -- which (1) accounts for the majority of load on the other promethes hosts (2) is at a lower granularity than you'd like, and (3) now means that you have to maintain separate alerting rules for the **aggregation** layers from the rest of the prometheus hosts.
- You have aggregation set up -- which (1) accounts for the majority of load on the other prometheus hosts, (2) is at a lower granularity than you'd like, and (3) now means that you have to maintain separate alerting rules for the **aggregation** layers from the rest of the prometheus hosts.

And you tell yourself, this seems too complicated; there must be a better way!

8 changes: 4 additions & 4 deletions README.md
@@ -25,7 +25,7 @@ can have a single source and you can have globally aggregated promql queries.
## Quickstart
Release binaries are available on the [releases](https://github.com/jacksontj/promxy/releases) page.

If you are interested in hacking on promxy (or just running your own build), you can clone and build`:
If you are interested in hacking on promxy (or just running your own build), you can clone and build:

```
git clone git@github.com:jacksontj/promxy.git
@@ -60,7 +60,7 @@ Promxy is currently using a fork based on prometheus 2.24. This version isn't su
but it is relevant for promql features (e.g. subqueries) and sd config options.

### What changes are required to my prometheus infra for promxy?
None. Promxy is simply an aggregating proxy that sends requests to prometheus-- meaning
None. Promxy is simply an aggregating proxy that sends requests to prometheus -- meaning
it requires no changes to your existing prometheus install.
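
As a sketch of what that looks like in practice (the hostnames below are hypothetical, and the option names follow promxy's example config layout, so verify them against your promxy version), promxy is simply pointed at the prometheus hosts you already run:

```
# Minimal promxy config: point it at an existing HA pair of prometheus hosts.
# Nothing changes on the prometheus side (hostnames are hypothetical).
promxy:
  server_groups:
    - static_configs:
        - targets:
            - prometheus-a:9090
            - prometheus-b:9090
```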

### Can I have promxy as a downstream of promxy?
@@ -73,7 +73,7 @@ Promxy's goal is to be the same performance as the slowest prometheus server it
has to talk to. If you have a query that is significantly slower through promxy
than on prometheus direct please open up an issue so we can get that taken care of.

**Note**: if you are running prometheus <2.2 you may notice "slow" performance when running queries that access large amounts of data. This is due to inefficient json marshaling in prometheus. You can workaround this by configuring promxy to use the [remote_read](https://github.com/jacksontj/promxy/blob/master/pkg/servergroup/config.go#L27) API
**Note**: if you are running prometheus <2.2 you may notice "slow" performance when running queries that access large amounts of data. This is due to inefficient json marshaling in prometheus. You can work around this by configuring promxy to use the [remote_read](https://github.com/jacksontj/promxy/blob/master/pkg/servergroup/config.go#L27) API.
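
For reference, that workaround is a per-servergroup setting; a hedged sketch follows, assuming the option names match pkg/servergroup/config.go (verify them for your promxy version):

```
# Server group configured to use the remote_read API instead of the default
# HTTP/json query path (option names assumed from pkg/servergroup/config.go).
promxy:
  server_groups:
    - static_configs:
        - targets:
            - prometheus-a:9090
      remote_read: true
      remote_read_path: api/v1/read
```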

### How does Promxy know what prometheus server to route to?
Promxy currently does a complete scatter-gather to all configured server groups.
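
In other words, every query fans out to each configured server group and the results are merged. A hypothetical sketch (hostnames, label values, and the labels option here are assumptions; check the servergroup config for your version):

```
# Two server groups; promxy scatter-gathers every query across both and merges
# the results (hostnames and labels are hypothetical).
promxy:
  server_groups:
    - static_configs:
        - targets: ['prom-us-east-a:9090', 'prom-us-east-b:9090']
      labels:
        region: us-east
    - static_configs:
        - targets: ['prom-eu-west-a:9090', 'prom-eu-west-b:9090']
      labels:
        region: eu-west
```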
@@ -92,7 +92,7 @@ endpoint must be defined in the promxy config (which is where it will send those

### What happens when an entire ServerGroup is unavailable?
The default behavior in the event of a servergroup being down is to return an error. If all nodes in a servergroup
are down the resulting data can be inaccurate (missing data, etc.) -- so we'd rather by default return an error rather
are down the resulting data can be inaccurate (missing data, etc.) -- so we'd rather by default return an error
than an inaccurate value (since alerting etc. might rely on it, we don't want to hide a problem).
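
The per-group override described in the next paragraph can relax that strict default; a hedged sketch follows, assuming the option is named ignore_error as in the servergroup config (verify for your promxy version):

```
promxy:
  server_groups:
    # Strict (default): if this whole group is unreachable, queries through promxy error out.
    - static_configs:
        - targets: ['prom-critical-a:9090', 'prom-critical-b:9090']
    # "Optional" group: failures here are ignored instead of failing the whole query
    # (the ignore_error name is an assumption -- check pkg/servergroup/config.go).
    - static_configs:
        - targets: ['prom-besteffort-a:9090']
      ignore_error: true
```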

Now with that said if you'd like to make some or all servergroups "optional" (meaning the errors will