Non-Team Repo Activity - Meltano/MeltanoLabs #479

Closed
pnadolny13 opened this issue Nov 16, 2022 · 6 comments · Fixed by #547
@pnadolny13
Contributor

Related to discussion in https://github.com/meltano/internal-marketing/issues/716

We have a pipeline that posts a daily summary of Singer ecosystem activity in the singer-ecosystem-activity Slack channel. This request is to increase the cadence of notifications for meltano/MeltanoLabs repos.

  • Should this replace the current daily summary or be in addition to it?
  • If we change to one message per action (i.e. similar to the meltano-repo-activity channel), should it only be for meltano/MeltanoLabs, or for everything at that point? After a few weeks of daily updates, the whole-ecosystem updates aren't terribly noisy, but that could get worse over time.
  • Should it go into the same Slack channel? IIRC these Slack API keys are 1-1 with a Slack channel, so there's a bit more configuration overhead to have a new channel.
  • GitHub API rate limits might become a challenge. We are already teetering on maxing them out. I think they reset hourly though, so running this every 1-2 hours is probably fine. Another option is to use a list of GitHub tokens like the tap accepts. I'm not sure how that works though; we might need multiple service accounts, because rate limits are at the account level.
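
A minimal sketch of the token-list option from the last bullet, in Python. It assumes each token belongs to a separate account and that the caller has already recorded each token's quota from GitHub's `x-ratelimit-limit`, `x-ratelimit-remaining`, and `x-ratelimit-reset` response headers. The `TokenQuota` class, the `pick_token` helper, and the token labels are hypothetical names for illustration; this is not how the tap itself rotates tokens.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TokenQuota:
    label: str        # hypothetical name of the service account behind the token
    limit: int        # requests allowed per window (x-ratelimit-limit)
    remaining: int    # requests left in the current window (x-ratelimit-remaining)
    reset_epoch: int  # window reset time in epoch seconds (x-ratelimit-reset)


def effective_remaining(quota: TokenQuota, now: int) -> int:
    """Budget available right now: the full limit once the window has reset."""
    return quota.limit if quota.reset_epoch <= now else quota.remaining


def pick_token(quotas: List[TokenQuota], needed: int, now: int) -> Optional[TokenQuota]:
    """Return the token with the largest budget that covers `needed` calls, or None."""
    candidates = [q for q in quotas if effective_remaining(q, now) >= needed]
    if not candidates:
        return None
    return max(candidates, key=lambda q: effective_remaining(q, now))


def seconds_until_any_reset(quotas: List[TokenQuota], now: int) -> int:
    """How long to wait until at least one token's window resets."""
    return max(0, min(q.reset_epoch for q in quotas) - now)
```

If `pick_token` returns `None`, the run could either be skipped or delayed by `seconds_until_any_reset`, which fits the idea of running only every 1-2 hours.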

cc @tayloramurphy @DouweM @afolson

@pnadolny13 pnadolny13 self-assigned this Nov 16, 2022
@pnadolny13 pnadolny13 moved this to Needs Refinement in Data Team Nov 16, 2022
@tayloramurphy
Contributor

@pnadolny13 Having a separate post for MeltanoLabs activity makes sense to me. I don't think it needs to be like meltano-repo-activity right now, but I think we do want to distinguish MeltanoLabs vs the rest of the singer ecosystem.

I'd like to see these in the same Slack channel.

Maybe we can test doing it ~3x per day to start and see how it goes. Looking at the existing feed, it doesn't seem like we'd need it hourly. I also don't have a good mental model of how many API calls are actually happening on a regular basis.

@afolson

afolson commented Nov 16, 2022

@pnadolny13 I think it's fine for these to go into the same channel for now.

I also agree that we don't need hourly. Could we do one post in Sven's AM, one in my AM, and one in my late afternoon? Ideally these would all be separate messages in Slack so we could check them off but for a first iteration a digest is totally fine.

@pnadolny13
Contributor Author

I updated the title to not be Singer-specific. After re-reading https://github.com/meltano/internal-marketing/issues/716, this should include all non-team activity, not just Singer activity.

@tayloramurphy @afolson 3x per day seems reasonable. The current summary includes everything, MeltanoLabs or not. Would we want to exclude any meltano/MeltanoLabs activity from that summary and put it in its own 3x per day messages?

My opinion on a path forward is that we keep the current summary the way it is. For the first iteration, we add a new pipeline that posts summary messages of meltano/MeltanoLabs non-team activity to the same channel 3x per day. Long term, we move those messages out of the singer-ecosystem-activity channel and into something else, like support or something new, potentially doing a final iteration to split the summary into individual messages like meltano-repo-activity.

What do you think about that?

@pnadolny13 pnadolny13 changed the title Singer Activity - MeltanoLabs Feed Non-Team Repo Activity - Meltano/MeltanoLabs Nov 16, 2022
@afolson

afolson commented Nov 17, 2022

@pnadolny13 I think what you suggested is perfect. Let's see how it goes.

@pnadolny13 pnadolny13 moved this from Needs Refinement to Planned in Data Team Nov 17, 2022
@pnadolny13
Contributor Author

pnadolny13 commented Nov 17, 2022

I think we'd need to spin off an implementation issue(s) for this then:

  1. Create a new dbt model (or append to the existing slack_alerts model) for compiling and formatting non-team + public + non-bot activity across meltano/MeltanoLabs repos
  2. Add a job/schedule for running the GitHub meltano EL + dbt staging + dbt slack alerts models on a 3x per day cadence
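
The filter in step 1 would live in dbt SQL, but its logic can be sketched in Python to make the three conditions concrete. Everything here is an assumption for illustration: the `Activity` fields, the `TEAM_LOGINS` set, and the digest wording are hypothetical and are not taken from the existing slack_alerts model.

```python
from dataclasses import dataclass
from typing import Iterable, List

ORGS = {"meltano", "meltanolabs"}      # org names, compared case-insensitively
TEAM_LOGINS = {"example-team-member"}  # hypothetical list of team GitHub logins


@dataclass(frozen=True)
class Activity:
    org: str
    repo: str
    author: str
    author_is_bot: bool
    repo_is_private: bool
    title: str
    url: str


def non_team_activity(rows: Iterable[Activity]) -> List[Activity]:
    """Keep only non-team, public, non-bot activity in the two orgs."""
    return [
        r for r in rows
        if r.org.lower() in ORGS
        and not r.repo_is_private
        and not r.author_is_bot
        and r.author.lower() not in TEAM_LOGINS
    ]


def format_digest(rows: List[Activity]) -> str:
    """Render one summary message with a line per activity item."""
    lines = [f"{len(rows)} non-team update(s) in meltano/MeltanoLabs repos:"]
    for r in rows:
        lines.append(f"- {r.org}/{r.repo}: {r.title} (by {r.author}) {r.url}")
    return "\n".join(lines)
```

The 3x per day cadence from step 2 is left to the job schedule; the sketch only covers which rows end up in a digest and how they are rendered.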

I'm actually realizing that this shouldn't be too taxing on our GitHub rate limits, since the GitHub meltano EL job is much more restrictive and only looks at our orgs, whereas the search EL job queries the whole GitHub universe.

I also think that meltano/sdk#1200 or meltano/sdk#161 might be needed to allow us to dedupe and avoid double-sending messages if they haven't changed since the last sync. Otherwise, if there's no new activity since the last sync, we'll end up resending. There are some alternative solutions using SQL filter logic that we can try, but they'd be hacky and hard to maintain. Might be fine for the short term though.
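
One possible shape for the short-term filter, sketched in Python rather than SQL. This is a plain high-water-mark check, not either of the SDK features referenced above: it assumes each activity row carries an `updated_at` timestamp and that the timestamp of the last item sent can be stored between runs. The `select_unsent` helper and the row layout are hypothetical.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple

Row = Tuple[str, datetime]  # (activity id, updated_at), hypothetical layout


def select_unsent(
    rows: List[Row], last_sent_at: Optional[datetime]
) -> Tuple[List[Row], Optional[datetime]]:
    """Return rows newer than the stored watermark, plus the new watermark.

    With no stored watermark (first run) every row counts as unsent. If
    nothing is new, the watermark is returned unchanged and the caller can
    skip posting to Slack.
    """
    if last_sent_at is None:
        new_rows = list(rows)
    else:
        new_rows = [r for r in rows if r[1] > last_sent_at]
    if not new_rows:
        return [], last_sent_at
    return new_rows, max(r[1] for r in new_rows)
```

A run that finds an empty list would send nothing, which avoids repeating the previous digest when there has been no new activity. The cost is an extra piece of state to persist, which is part of why this kind of filter can be awkward to maintain.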

cc @tayloramurphy

@pnadolny13
Contributor Author

The new channel is called meltano-org-contribution-activity

Projects
Status: Planned

3 participants