MVP User Stories for Issue/PR Velocity Analytics via GitHub API #3

Open
1 of 7 tasks
Sihemgourou opened this issue Jul 14, 2021 · 19 comments

Comments

@Sihemgourou

Overview

As a PM on the HfLA website team, I need to regularly check delivery velocity so that I can adapt the roadmap and make sure that we stay on track.

Action Items

  • Brainstorm with the dev team about the analytics that they would be interested in

Here are the metrics the product team needs:

  • Over recent months, how long does it take to close an issue, based on its sizing label and role label?
  • Over recent months, how long does it take for an issue to move from the prioritized backlog to in progress, and then from in progress to closed (based on its sizing label and role label)?
  • Over recent months, how many issues have been closed per week and per month?
  • How long does it take to review a PR, based on the size of the issue?
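
As a rough illustration of the "closed per week and per month" metric above, here is a minimal sketch. It assumes the close dates have already been pulled out of the issue data; the function name `closed_per_period` and the sample dates are hypothetical, not part of the project.

```python
from collections import Counter
from datetime import date


def closed_per_period(closed_dates):
    """Count closed issues per ISO week and per calendar month."""
    per_week = Counter()
    per_month = Counter()
    for d in closed_dates:
        iso = d.isocalendar()
        per_week[(iso[0], iso[1])] += 1    # key: (ISO year, ISO week number)
        per_month[(d.year, d.month)] += 1  # key: (year, month)
    return per_week, per_month


# Made-up close dates, for illustration only.
sample = [date(2021, 5, 3), date(2021, 5, 5), date(2021, 5, 31), date(2021, 6, 1)]
weeks, months = closed_per_period(sample)
```

Keying weeks by ISO year and week number avoids mixing up weeks that straddle a year boundary.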

Here are the metrics the dev team needs:

  • Number of pull requests done per month.
  • Time it takes for contributors to pick up new issues after an issue is done and merged

Resources/Instructions

Don't hesitate to ask the product team any questions regarding this issue
Please put the metrics in a spreadsheet format
Link to issue data
Link to pull request data

@Sihemgourou Sihemgourou self-assigned this Jul 14, 2021
@Aveline-art
Member

From the meeting last Thursday, July 15:

  • Number of pull requests done per month.
  • Time it takes for contributors to pick up new issues after an issue is done and merged
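
One possible way to measure the pick-up time, sketched under assumptions: each issue is reduced to an `(assignee, assigned_at, closed_at)` tuple, and "done and merged" is approximated by the issue's close date. The function name `pickup_gaps` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def pickup_gaps(issues):
    """Return {assignee: [timedelta, ...]} between closing one issue
    and being assigned the next one.

    issues: iterable of (assignee, assigned_at, closed_at) tuples;
    closed_at may be None for issues that are still open.
    """
    by_assignee = defaultdict(list)
    for assignee, assigned_at, closed_at in issues:
        by_assignee[assignee].append((assigned_at, closed_at))

    gaps = {}
    for assignee, spans in by_assignee.items():
        spans.sort(key=lambda span: span[0])  # order by assignment date
        gaps[assignee] = [
            next_assigned - prev_closed
            for (_, prev_closed), (next_assigned, _) in zip(spans, spans[1:])
            # Skip open issues and overlapping work (assumption: those are
            # not "pick-ups" in the sense of the metric).
            if prev_closed is not None and next_assigned >= prev_closed
        ]
    return gaps


# Made-up data, for illustration only.
sample = [
    ("a", datetime(2021, 7, 1), datetime(2021, 7, 10)),
    ("a", datetime(2021, 7, 13), datetime(2021, 7, 20)),
    ("b", datetime(2021, 7, 2), None),
]
gaps = pickup_gaps(sample)
```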

@Sihemgourou Sihemgourou removed their assignment Jul 19, 2021
@Aveline-art Aveline-art self-assigned this Jul 30, 2021
@Aveline-art
Member

  • Progress: Just picked up the issue.
  • Blockers: Will need to make decisions on how to organize the data that can be retrieved from GitHub's API. Leaning toward Jupyter, but unsure which methods are best for analyzing the data.
  • Availability: Random time here and there.
  • ETA: Uncertain as of now. 1 month is a good estimate.

@Aveline-art
Member

Aveline-art commented Aug 2, 2021

Ava's notes to self

Question

What is the velocity of the HackForLA Website Developer team?

Background

As the HackForLA team expands, we are ready to take on bigger and bigger projects. In order to track our projects effectively, we need a way to measure the velocity of our team. For this issue, we will gather various metrics that allow us to assess not only the velocity of the team, but also the points where velocity can be improved. This allows us to better communicate with stakeholders, as well as team members, on our progress with the Website's various projects.

Metrics

  • Time it takes for contributors to pick up new issues after an issue is done and merged
    • Independent variables: Assignee
    • Dependent variables: Time for an assignee to assign themselves to a new issue after their last issue is done

  • Over recent months, how long does it take for an issue to move from the prioritized backlog to in progress, and then from in progress to closed (based on its sizing label and role label)?
    • Independent variables: Sizing labels, role labels
    • Dependent variables: Average time for issue to move backlog --> in progress on a weekly/monthly basis

  • Over recent months, how long does it take between an issue being assigned and a pull request being created, based on size and role labels?
    • Independent variables: Sizing labels, role labels
    • Dependent variables: Average time for issue to move to in progress --> a PR is made for the issue on a weekly/monthly basis

  • Over recent months, how long does it take between the creation of a pull request and the issue closing, based on size and role labels?
    • Independent variables: Sizing labels, role labels
    • Dependent variables: Average time for a pull request to be made --> the linked issue is closed on a weekly/monthly basis

  • Over recent months, how long does it take for an issue to move from in progress to closed (based on its sizing label and role label)?
    • Independent variables: Sizing labels, role labels
    • Dependent variables: Average time for issue to move in progress --> closed status on a weekly/monthly basis

  • Over recent months, how many issues have been closed per week and per month?
    • Independent variables: monthly time, weekly time
    • Dependent variables: Number of issues closed

  • Number of pull requests done per month
    • Independent variables: month
    • Dependent variables: number of pull requests closed

  • What is the spread of completion time based on size of the issue
    • Independent variables: size
    • Dependent variables: time
    • Display: Bar distribution
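
To make the independent/dependent variable pairs above concrete, here is a minimal sketch of one of them: mean time from in progress to closed, grouped by a label family and by month. The dictionary keys (`assigned`, `closed`, `labels`) and the label strings such as `size: 1pt` are assumptions for illustration and may not match the project's real label names.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_days_by_label_and_month(issues, prefix):
    """Mean days from 'assigned' to 'closed', keyed by (label, 'YYYY-MM').

    Only labels starting with `prefix` (e.g. "size:" or "role:") are used.
    Issues missing either date are skipped.
    """
    groups = defaultdict(list)
    for issue in issues:
        if issue["assigned"] is None or issue["closed"] is None:
            continue
        days = (issue["closed"] - issue["assigned"]).total_seconds() / 86400
        month = issue["closed"].strftime("%Y-%m")  # bucket by closing month
        for label in issue["labels"]:
            if label.startswith(prefix):
                groups[(label, month)].append(days)
    return {key: mean(values) for key, values in groups.items()}


# Made-up issues, for illustration only.
sample = [
    {"assigned": datetime(2021, 5, 1), "closed": datetime(2021, 5, 4),
     "labels": ["size: 1pt", "role: front end"]},
    {"assigned": datetime(2021, 5, 10), "closed": datetime(2021, 5, 15),
     "labels": ["size: 1pt", "role: back end"]},
    {"assigned": datetime(2021, 5, 20), "closed": datetime(2021, 6, 3),
     "labels": ["size: 3pt", "role: front end"]},
]
by_size = mean_days_by_label_and_month(sample, "size:")
```

Calling the same function with `"role:"` as the prefix would give the role breakdown; the other transitions (backlog to in progress, assigned to PR made) would swap in different date fields.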

Mockup of Table to House Raw Data

Definitions

assignee: id of the issue's assignee
created: date when the issue/pr is created
in progress: date when the issue is assigned
issue closed: date when the issue is closed
issue number: the number of the issue/pr
linked issue/PR: the number of the linked issue/pr
pr closed: date when the pr is closed

Issues

| Issue Number* | Linked PR* | Assignee* | Created | Assigned | PR Made | Issue Closed | Labels |
| --- | --- | --- | --- | --- | --- | --- | --- |
| number | number | number | date | date | date | date | [str] |

Pull Requests

| Issue Number* | Linked Issue* | Created | PR Closed |
| --- | --- | --- | --- |
| number | number | date | date |
*potential identifying information
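
The two mockup tables could be mirrored in code roughly as follows. This is only a sketch: the class names `IssueRecord` and `PullRequestRecord` are hypothetical, and treating the later columns as optional is an assumption (an issue may not yet be assigned, linked, or closed).

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class IssueRecord:
    """One row of the Issues table."""
    issue_number: int
    linked_pr: Optional[int]       # number of the linked PR, if any
    assignee: Optional[int]        # id of the issue's assignee
    created: datetime
    assigned: Optional[datetime] = None      # the "in progress" date
    pr_made: Optional[datetime] = None
    issue_closed: Optional[datetime] = None
    labels: List[str] = field(default_factory=list)


@dataclass
class PullRequestRecord:
    """One row of the Pull Requests table."""
    issue_number: int              # the PR's own number
    linked_issue: Optional[int]
    created: datetime
    pr_closed: Optional[datetime] = None


# Made-up record, for illustration only.
example = IssueRecord(issue_number=1, linked_pr=None, assignee=None,
                      created=datetime(2021, 8, 2))
```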

@Aveline-art
Member

Aveline-art commented Aug 3, 2021

Questions for @Sihemgourou

  1. The issue refers to "recent months". Does that mean we are interested in a timeline from week to week or month to month, or in an average over the last X months? If the latter, what is X?
  2. What is the ideal representation of the analytics data? Are we interested only in the central values (mean, median, mode), or do we want to look at the distribution of results, as well as changes to the distribution over time?
  3. The metrics posed in the issue do not really imply that statistical analyses (best fit, significance tests, etc.) are needed for this project. But would we want that? Should this be discussed further once we have the data we need?
  4. There are some potential edge cases in the data (for example, how do we consider an issue that was assigned multiple times due to a missing assignee?). Is it okay to not include them in the data and make a note of them, or should we include them?

@Sihemgourou Sihemgourou self-assigned this Aug 3, 2021
@Sihemgourou
Author

@Aveline-art,

  1. I think you confirmed that it is possible to retrieve data on past events on the project board. I said "recent months", but maybe we can say from the beginning of May to now, which best represents our average velocity. For all the metrics, it is best to have them per month.
  2. For now, only the mean is needed. It would be nice to have the maximum and minimum, but this is optional. Also, we need to be able to assess the stats each month to make sure we stay on track.
  3. I don't think a statistical analysis of the data is worthwhile for now. We are well aware that these data are only a proxy for velocity.
  4. Good point. What are the risks of not excluding the edge cases? Let's discuss that live!
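
The per-month mean, with the optional minimum and maximum, could be computed along these lines. This is a minimal sketch assuming durations have already been reduced to `(month, days)` pairs; the function name `monthly_summary` and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def monthly_summary(durations):
    """Summarize durations per month.

    durations: iterable of (month_key, days) pairs, e.g. ("2021-05", 2.0).
    Returns {month_key: {"mean", "min", "max", "count"}} in month order.
    """
    by_month = defaultdict(list)
    for month, days in durations:
        by_month[month].append(days)
    return {
        month: {
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
            "count": len(values),  # helps judge how much weight a mean deserves
        }
        for month, values in sorted(by_month.items())
    }


# Made-up durations in days, for illustration only.
summary = monthly_summary([("2021-05", 2.0), ("2021-05", 6.0), ("2021-06", 10.0)])
```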

@Aveline-art
Member

Aveline-art commented Aug 4, 2021

Plan to move forward:

Outliers/edge cases (for example, issues that have been reassigned multiple times) will need to be examined on an individual basis before deciding how to merge them into the data.

Nice data to have concerning edge cases:

  • What type of issue do people tend to drop? (For example, does it have to do with size? Role?)
  • What is the velocity of issues that involve multiple collaborators? Or back-to-back collaborators (such as a designer working on an issue before it is passed on to a developer)?
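
One way to set edge cases aside for individual review, rather than silently dropping them, is sketched below. It assumes each issue's timeline has been reduced to a list of event names, with `"assigned"` and `"unassigned"` mirroring the event names in GitHub's timeline data; the function name `split_edge_cases` and the threshold are hypothetical.

```python
def split_edge_cases(timelines, max_assignments=1):
    """Separate issues that were assigned more often than expected.

    timelines: {issue_number: [event_name, ...]}
    Returns (clean, flagged); flagged issues are kept for manual review.
    """
    clean, flagged = {}, {}
    for number, events in timelines.items():
        assignments = sum(1 for event in events if event == "assigned")
        target = flagged if assignments > max_assignments else clean
        target[number] = events
    return clean, flagged


# Made-up timelines, for illustration only.
sample = {
    101: ["assigned", "closed"],
    102: ["assigned", "unassigned", "assigned", "closed"],
}
clean, flagged = split_edge_cases(sample)
```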

@github-actions

This comment was marked as resolved.

@Aveline-art
Member

  • Progress: Completed a first analysis, so I know the code written is robust.
  • Blockers: Need to create a filter function to handle outliers better, and also to fill in gaps in our data.
  • Availability: 8 hours
  • ETA: Sometime next week

@github-actions

This comment was marked as resolved.

@Aveline-art
Member

  • Progress: Finished examining all the data. Will need to start subdividing by size and role.
  • Blockers: Will need to do research on data science libraries to find the best way to format the data.
  • Availability: 5 hours
  • ETA: 14 days

@github-actions

This comment was marked as resolved.

@Aveline-art
Member

Aveline-art commented Aug 28, 2021

  • Progress: Managed to make great progress after a busy series of weeks. I have extracted all the relevant data, improved the analysis tool, fixed a bug, and made charts. The next step is to filter out all product issues, since our long-standing lack of PMs has created outliers. Then, the data cleaner needs to be smarter about what data to exclude to avoid over-excluding.
  • Blockers: None. I am close to being done. I hope I don't get suddenly swamped with work!
  • Availability: 5 hours
  • ETA: By next week! Cross my heart!
It's beautiful(ish)!

@Aveline-art
Member

Notes on Definitions

All the definitions below are made under the assumption that the analytics are meant to record the amount of work done.

Opened: Always take the first opened. This signifies when work first begins. The time before an issue is assigned is the queue time before this work can proceed.

Assigned: Always take the first assigned. This signifies when the issue starts being actively worked on. There might have been work before, in the form of dependencies or discussion, which the timeline cannot account for.

Pr_made: Always take the first pr_made after first assigned. This signifies that review work has begun. If an issue has multiple PRs, it means that a lot of reviews are needed, and review work has been happening alongside dev work for this issue. The assumption to make is that pr_made means that both review and dev work are happening.

Closed: Always take the last closed after the last open or reopened. The final close means that all work is done on this. If an issue is reopened, whether for discussion, for rework, or by accident, the time it takes to close again represents the work that must be done before close happens again.
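
The four definitions above can be sketched as a single pass over an issue's events. This is a minimal illustration, not the code actually used for the analysis: it assumes each issue's history has been flattened into `(timestamp, kind)` pairs with the hypothetical kinds `"opened"`, `"reopened"`, `"assigned"`, `"pr_made"`, and `"closed"`.

```python
from datetime import datetime


def extract_milestones(events):
    """Pick the opened/assigned/pr_made/closed timestamps for one issue."""
    events = sorted(events)  # chronological order

    def first(kind, after=None):
        # First event of this kind, optionally at or after a cutoff.
        for ts, k in events:
            if k == kind and (after is None or ts >= after):
                return ts
        return None

    opened = first("opened")
    assigned = first("assigned")
    # First PR made after the first assignment; earlier PRs are ignored.
    pr_made = first("pr_made", after=assigned) if assigned else None
    # Last close that follows the last open/reopen; None if still open.
    last_open = max((ts for ts, k in events if k in ("opened", "reopened")),
                    default=None)
    closed = max((ts for ts, k in events
                  if k == "closed" and (last_open is None or ts >= last_open)),
                 default=None)
    return {"opened": opened, "assigned": assigned,
            "pr_made": pr_made, "closed": closed}


# Made-up timeline, for illustration only.
timeline = [
    (datetime(2021, 7, 1), "opened"),
    (datetime(2021, 7, 2), "pr_made"),    # before assignment: skipped
    (datetime(2021, 7, 3), "assigned"),
    (datetime(2021, 7, 8), "pr_made"),
    (datetime(2021, 7, 10), "closed"),
    (datetime(2021, 7, 12), "reopened"),
    (datetime(2021, 7, 15), "closed"),
]
milestones = extract_milestones(timeline)
```

An issue that was reopened and never closed again yields `closed` as `None`, which keeps it out of any "time to close" average.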

@Aveline-art
Member

Aveline-art commented Sep 7, 2021

Update:

@github-actions

This comment was marked as resolved.

@Aveline-art
Member

Aveline-art commented Sep 12, 2021

Progress: Made good progress. Started analyzing the PR data and talked to stakeholders about how to use that data.
Blockers: No blockers so far, but I will need to do research on presenting Jupyter data.
Availability: 4 hours
ETA: Unsure. The project as it stands is done, but sign-off is required before it can move onto the next stage at the team-analytics repo.

@github-actions

This comment was marked as duplicate.

@Aveline-art
Member

I am closing this issue as it has moved to @hackforla/team-analyitics

@mayankt153 mayankt153 moved this to New Issue Approval in P: HfLA Dashboards: Project Board Aug 25, 2024
@ExperimentsInHonesty ExperimentsInHonesty changed the title Analytics on Github project board MVP User Stories for Issue/PR Velocity Analytics via GitHub API Sep 8, 2024
@Samhitha444
Member

Samhitha444 commented Sep 12, 2024

Projects
Status: New Issue Approval
Development

No branches or pull requests

5 participants