Skip to content
This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[meta] Grafana persistence has been broken for most of a year, and so has the process of PR review #10306

Closed
AaronFriel opened this issue Dec 30, 2018 · 14 comments
Assignees
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@AaronFriel
Copy link
Contributor

AaronFriel commented Dec 30, 2018

Persistence in Grafana has been broken for months (#4156), a PR was made to fix the volume permissions (#5576), this was incorrectly nitpicked twice by the person it was assigned to @unguiculus, then there was some bikeshedding about whether the change that caused it broke semver and how that should be dealt with in the future.

Then #5576 was never merged and unceremoniously closed.

In the meantime, in #5635 the advice is the chart is broken and wait on #5576 which of course was closed.

Nevermind that persistence was disabled by default and it's pretty obvious more effort went into making sure that chart template was done correctly, but since it left a bunch of values (persistence size, access modes) unspecified, people ran into other issues which are unresolved in #5102, #8857, #9660, #10116, and #10276.

Those are just the results I found recently when searching for "grafana persistence" in issues.

Here's what I propose:

Fix the chart by adding an initContainer, as in #5576. If the assignee @unguiculus would just fix up the PR and merge it, at least half a dozen people would be happy(er), and those are just the people who reported issues. As it is, even if a user sets all of the persistence properties correctly the container won't boot. That's not a user issue, that's an issue with the chart. I fix this in #10307 by simply following the instructions other people laid out in the issues above. Here's a smattering of the issues (that I could find) fixed by my PR:

@AaronFriel AaronFriel changed the title [stable/grafana] Persistence is Still Broken, So Is The Process [meta] Persistence is Still Broken, So Is The Process Jan 3, 2019
@AaronFriel AaronFriel changed the title [meta] Persistence is Still Broken, So Is The Process [meta] Grafana persistence has been broken for most of a year, and so has the process of PR review Jan 3, 2019
@mattfarina
Copy link
Contributor

mattfarina commented Jan 3, 2019

@AaronFriel First, let me say that you are right the process around this is broken in some ways and you can see that with this particular problem. Thank you for highlighting that.

I would like to talk about a few things to help fix this situation along with highlight a couple reasons this happened to add context.

First, how did we get here? Why did this happen?

  1. To be honest, the maintainers of the charts repository have become overwhelmed and the people working on the charts repository are no longer scaling with the load. This is why we created the hub to try and distribute the load and unblock people.
  2. We want to rely on charts maintainers who maintain a chart. The maintainers of the Grafana have become somewhat inactive and have not helped on this issue. We could like to have had them active and then given them the ability to review and merge things themselves. Alas, that did not happen. This has left us with a gap.

Because of this the problem has been poorly addressed.

Second, what can we do about this.

  1. We will take a look at [stable/grafana] Fix persistence by chowning /var/lib/grafana #10307, as you propose to start the process of fixing it here.
  2. If someone wants to host a Grafana chart elsewhere, in another repo, we would be happy to list that repo on the hub. It's ok to take this repo out of the loop all together.
  3. We can find new or more maintainers for the chart as it exists here.

I'm happy to help get this resolved and sorry this happened.

@AaronFriel
Copy link
Contributor Author

Hey Matt, thanks for getting back to me. I admit, there are a lot of ways to try to resolve this, but I don't see any that don't have pretty obvious, apparent downsides. I'm sure you're doing your best and trying to grow and maintain a community is not easy; I greatly appreciate your response here.

In an ideal world, charts for third party projects would be fully maintained by a team of dedicated, eager volunteers from the project who respond quickly to every issue and write perfect code.

We're just human though, and I'm no stellar example of a responsive maintainer myself, so some sort of process might be nice when charts are just plain broken, because that affects anyone who tries to install the image and hurts the reputation of everyone involved. I know better, but I could see how if helm install stable/foo-vana didn't work, someone could walk away thinking Kubernetes is a waste of time, Helm is broken, and foo-vana is more like bar-gatory.

My only suggestion would be to have some cross-chart maintainers for charts in this repo (maybe call them something like curators?) who can volunteer to review PRs. Their focus would be break/fix only, when a chart just doesn't work, and punt on reorganization and feature development to chart maintainers where the project would probably have an opinionated stance on changes to what resources are deployed, how they're configured, etc.

I would have liked to be able to ping someone who, if they had only a few minutes on their own Docker machine could run helm install stable/grafana --set persistence.enabled=true, see that the container goes into a crash loop, and say, "Ok, yeah, this needs to be fixed."

@mattfarina
Copy link
Contributor

We have meetings on Tuesdays and I've added this topic, including a link to this issue, to the agenda. The meeting schedule, including an ics feed, is available at https://github.com/helm/community/ for anyone interested.

The next step is for folks to talk about this broader topic.

@sstarcher
Copy link
Collaborator

The hub is a nice start, but one benefit of this repo is the ability to have all of the auto testing and not having to host your own index repo.

@jkodroff
Copy link
Collaborator

jkodroff commented Jan 4, 2019

@mattfarina I think it might be helpful to add some copy somewhere prominent about both the hub and that the current maintainers are overwhelmed in this repo. As a chart developer, I found the lack of response time and dependence on the core maintainers frustrating, but more because I had the expectation of a speedy response time.

I don't fault the maintainers for being a victim of Helm's own popularity, but a heads up that folks are swamped might go a long way in calming some frustrations.

@jkodroff
Copy link
Collaborator

jkodroff commented Jan 4, 2019

@mattfarina Is there a way to allow the people in OWNERS to completely manage the PR process within a specific chart, or does that create too much potential liability for the core group of maintainers?

(I kinda expected that this was how the OWNERS file works, but it does not appear to actually work this way.)

@unguiculus
Copy link
Member

While I can understand your sentiments @AaronFriel if find your critique a little bit harsh. I like to do thorough reviews, sometimes questioning things and enforcing our best practices. I'm always open for discussion, though, and have frequently changed my mind when folks convinced me about their changes. I always try to be helpful and you can always ping us on Slack in the charts channel as well. As @mattfarina mentioned, Grafana maintainers were kind of inactive. It was kind of hard for us charts maintainers to follow allong and see which of several PRs that tried to fix the same thing should go in or not. I was hoping for the chart's maintainers to jump in which should have deeper knowledge about their charts as us charts maintainers.

@unguiculus
Copy link
Member

@sstarcher
Copy link
Collaborator

sstarcher commented Jan 9, 2019

@unguiculus excellent that is a part that I was missing how do I become a Be invited (and accept your invite) as a read-only collaborator on this repo. This is required for @k8s-ci-robot PR comment interaction.

@AaronFriel
Copy link
Contributor Author

@unguiculus Being thorough needs to be balanced with "making charts work". A chart that is broken for most of a year is a bad signal to users of Helm and to the users commenting on the repo. Accepting a PR that fixes a chart, then making your own PR to fix your nits is a perfectly acceptable way of doing things that makes everyone happy. You never responded to @sylr's reply to your requested changes, blocking it from being merged indefinitely.

@sylr
Copy link
Contributor

sylr commented Jan 11, 2019

@AaronFriel @mattfarina #1714

@maorfr
Copy link
Member

maorfr commented Jan 13, 2019

Hey,

It is important to note that until now, the grafana chart had no OWNERS file, so the chart maintainers were limited in what they could do. See this PR for example, where one of the maintainers tried to lgtm, but failed to do so:
#10439 (comment)

This is now resolved as an OWNERS file was added, and a new maintainer (myself) was added to balance the load.

I hope that things will be better from now on ;)

@stale
Copy link

stale bot commented Feb 12, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 12, 2019
@stale
Copy link

stale bot commented Feb 26, 2019

This issue is being automatically closed due to inactivity.

@stale stale bot closed this as completed Feb 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

7 participants