Renaming Bus Factor #632
Comments
In my experience, academia (at least the part that's aware of open source software) uses bus factor. In academia, most people who don't know it, after having it explained, generally either get it and laugh, or still don't understand it (and don't really understand open source either). I know I've also heard at least one other term that was more pleasant and still worked, but I can't quite remember it for the minute. (I've also heard truck factor, but I guess that's not really much of an improvement.) |
If we're to rename the metric, now is the time before we standardize
anything with a standards body.
Bus Factor and Lottery Factor both describe an external event that would
directly impact a project member and thus the project. There are a host of
other events with the same potential impact: becoming a parent, getting
laid off, moving to a new city, having to take care of family members,
meeting a significant other, ... - the metric could be named after any of
these events and still always require explanation.
Maybe we can choose a name that is directly descriptive of the problem or
threat: concentration of knowledge, distribution of effort, ...
|
When I talk about this issue, I generally frame it as a discussion of "Contributor Sustainability", but it's probably only one of a number of things that impact contributor sustainability. I still think the metric should be named something that's already established within our community and the literature, which might make Pony Factor a better choice. I like Lottery Factor, but it's definitely not as well known. |
There are two things I think don't work well about "Bus Factor"
I find skeptical looks and confusion when I'm explaining that a "bus factor" of a project is 3, and that's a bad thing. I feel that if we had some language that better indicated what the bus factor is, intuitively, that would be more persuasive and useful. I know "bus" is what we're moving away from, but... When I think of projects making progress, I usually think of forms of transportation. Trains, planes, boats, cars, etc. They move many people around and need critical pieces to keep them moving. We could borrow some of these ideas, and swap "factor" for "count" like:
I'm also happy to turn away from "bus" like things, maybe options from nature without conflicting with git?
That's if we're willing to get creative though. I think if we plan on changing the name to something, Bus Factor is certainly the most well known -- so we should update it to something more intuitive and descriptive. Out of the options I gave, I like Host, Pilot, Captain, and Monarch. Excited to have a discussion about it. The confusing wordplay also exists for "elephant factor" -- but that should probably be a separate discussion :) |
I would vote against Pony Factor, as it's based on an in-joke that is just confusing to people who aren't in the group |
@justaugustus I'm curious if the Inclusive Naming Initiative have had any discussions about this or related terminology? |
I usually use lottery as well, pony doesn't make a lot of sense to me. On count vs factor, what about something like frequent contributors count, which parallels with 'inactive contributors' and 'new contributors'. Bus factor assumes something about the impact of these specific people leaving which might or might not be true depending on additional context. Just calling it a count of people who undertake a certain level of activity leaves it more neutral and more clearly as just part of the fuller picture. |
I usually use lottery factor as well |
I like to use names/terms that are easy to read and do not have implied meaning or metaphor. For something like this in other kinds of projects or organizations, it is sometimes called 'key person risk' (or key people/member risk). I'll toss 'key maintainer risk' in here for consideration, since when I read things like 'lottery factor' or 'pony factor' I have to go look up what that means in this context (and it may be even harder for those who don't have English as a first language); 'key maintainer risk' seems closer to describing exactly what is being measured. |
I would love to not use "bus factor" and usually use "lottery factor" instead. And I always explain in a few words what that means, as I'd explain the naming for any of our metrics. In my opinion, nothing is intuitive to everyone. Even something like "event location inclusivity" requires a few words of explanation by what we mean by that.
What about "Key Maintainer Count" or "Core Maintainer Count"? |
@ElizabethN I would be concerned with using the term "maintainer" as for many projects that has a very specific meaning. Maybe "Key Contributor Count" |
KCC has a nice ring to it, and it's definitely more clear than an analogy. |
Oooh, I like Key Contributor Count. |
For some reason, I thought we had already addressed this one. Thanks for bringing this up @geekygirldawn. The name is definitely problematic, and I agree pony factor isn't a good option either. We could use them as keywords, though, and link them to the new name. I like Key Contributor Count... Or Key Contributor Risk. |
...or Core Contributor Risk. We have defined Occasional Contributors (which was previously problematic as "Drive-by Contributors"). However, we have not defined key or core contributors. Academic literature usually uses core but key may be more descriptive of contribution importance. |
I like Risk better than Count, as it has the same sense (of urgency/danger) that Bus Factor has. It also feels less like something people would try to game |
At risk of sounding like a typical tech exec.... Wouldn't "gaming" this metric be a good thing? More people contributing to oss at a level to constitute bus factor seems like a good thing... I'll put in that I think "risk" being a number runs the same risk (ha) as using a word like "factor". Without explanation, "my key contributor risk is 3" is a nonsensical phrase. |
I like Key Contributor Count.
My concern with "key" is that it adds a value judgement. Also following
Elizabeth's comment: not everyone needs to have a key to the project to be
included.
Returning to the definition of the metric, we're counting the smallest
number of contributors that made 50% of all contributions during the
analysis time window.
How about: Majority Contributor Count
"Majority" because people know that concept from voting and other contexts.
We could also emphasize that a larger count is good and go with something
like:
- Majority Contributor Spread
- Majority Contributor Concentration
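
A minimal sketch of the definition above, in case it helps the naming discussion: given per-contributor tallies for the analysis window, find the smallest group of contributors that covers 50% of the total. The function name, the `threshold` parameter, the ">= 50%" reading, and the sample tallies are all assumptions for illustration, not the CHAOSS reference implementation.

```python
def majority_contributor_count(contributions, threshold=0.5):
    """Smallest number of contributors whose combined contributions
    reach `threshold` of the total (assumed reading: >= 50%)."""
    total = sum(contributions.values())
    if total == 0:
        return 0
    running = 0
    # Walk contributors from most to least active, accumulating their share.
    ranked = sorted(contributions.values(), reverse=True)
    for rank, count in enumerate(ranked, start=1):
        running += count
        if running >= threshold * total:
            return rank
    return len(ranked)


# Hypothetical tallies for one analysis window.
commits = {"ana": 40, "ben": 30, "chi": 20, "dee": 10}
print(majority_contributor_count(commits))  # 2 (ana + ben cover 70 of 100)
```

A larger result means the work is spread over more people, which is why the "larger is better" reading keeps coming up in this thread.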
|
In truth, without explanation, any name we choose is likely to be nonsensical. Some are more descriptive than others though. The metric should describe what we are trying to measure - which is the risk associated with key contributors abandoning a project (I think). I wouldn't get hung up on a number. |
@klumb I get ya. The measurement is absolutely indicating the risk. I would agree more with "risk" if the metric itself wasn't a count/number. It's descriptive of what we're measuring to name the measurement. If we're renaming it anyways, why be vague? We could keep metaphorical names like "pilot risk" etc. but that hardly solves the #2 problem I mentioned above, where I regularly have to explain what the metric actually is for people to buy into why it's useful. Just my experience, though. I like @GeorgLink 's observation. Majority Contributor Count is ultra-succinct and descriptive. I didn't even think about how "key" could mean like "having a key". Majority is much more specific. |
Also, I think a value judgement is going to be necessary for this metric. What the value is, is the question. Is it related to ownership/authorship of a percentage of the codebase? |
If it is about percentage of codebase, rather than contributor, maybe we need it to be about contribution authorship. For example, Majority Contribution Authorship, Majority Contribution Spread, or Majority Contribution Maintainership? Contribution Maintenance Risk? Majority Contribution Count? Just throwing some more out there. ;) |
I really like majority, I think it removes the value judgement of 'key'. This count in the chaoss metric reads to me as just a naive count of how many contributions people make as a percent of the total number of contributions, it says nothing about the value of those contributions in terms of code quantity or quality, which I think argues for keeping the metric as more of a single neutral data point. The metric says it wants to answer "how many contributors can we lose before a project stalls?" but that seems packed with assumptions to me. |
Sorry, but I have no idea what majority means in this context. And given that not all open source contributions are captured in a repository, how would it be measured? |
It totally is! That's part of the magic IMO 😄 There's some stake-in-grounding happening here. What kind of assumptions need to get made to actually measure something, ya kno?
While it's perfectly possible this isn't a perfectly accurate representation, I believe that the metric and its implementations are usually disjointed. I believe most of the time, contributions are "counted" here as "commits" or "pull request open/close" or "issue open/close". That's just a function of using the GH API / history to make measurements... More tools could definitely get built to measure more though 😃
"Majority" here meaning who is making the majority of contributions. Majority Contributor Count = count of contributors who make the majority of contributions in the project for some time window. |
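
To make the "counted as commits or PR/issue open/close, for some time window" part concrete, here is a rough sketch of the tallying step. The event tuples, the kind labels in `COUNTED_KINDS`, and the half-open window are hypothetical choices for illustration; real data would come from whatever the GH API or git history provides.

```python
from collections import Counter
from datetime import datetime

# Hypothetical labels for the kinds of activity that count as a contribution.
COUNTED_KINDS = {"commit", "pr_opened", "pr_closed", "issue_opened", "issue_closed"}


def tally_contributions(events, start, end, kinds=COUNTED_KINDS):
    """Count events per author for the counted kinds with start <= timestamp < end."""
    tally = Counter()
    for author, kind, when in events:
        if kind in kinds and start <= when < end:
            tally[author] += 1
    return tally


events = [
    ("ana", "commit", datetime(2024, 1, 10)),
    ("ana", "pr_opened", datetime(2024, 2, 1)),
    ("ben", "commit", datetime(2024, 2, 15)),
    ("ben", "review_comment", datetime(2024, 2, 16)),  # not a counted kind
    ("chi", "commit", datetime(2023, 12, 31)),  # outside the window
]
window = tally_contributions(events, datetime(2024, 1, 1), datetime(2024, 4, 1))
print(dict(window))  # {'ana': 2, 'ben': 1}
```

Anything that never shows up as one of these events (docs elsewhere, community work, reviews in this example) is invisible to the tally, which is the limitation raised earlier in the thread.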
I like what the bus factor means, in that a project is one disaster away from being completely abandoned or unmaintained. Some maintainers might keep maintaining if they win the lottery :) I think the seriousness should be retained because that seriousness is what gets people (leaders, people with influence) to act (majority contributor count IMHO, not so much), but agree bus factor is morbid. Propose then something more like 'disaster factor' because it has meaning immediately. |
Adoption may be better for Disaster Factor because it is close enough to the previous problematic name. It also signals the risk part. I think that could work. |
Disaster Factor = The risk associated with a count of contributors, who authored a majority of contributions in the project for some time window, abandoning a project. It is probably a good idea to review the description and objective of this metric as well. |
This is such an interesting discussion! At the risk of making this more complicated, is there a rubric that is used to come up with this number, so the name of the metric might be less of a concern (its explanation would be the rubric)? I like the words 'adoption', 'risk', 'sustainability', and 'resilience' because they are less problematic (for me) than 'bus' or 'disaster' - |
We could use the GitHub poll capability in the Discussions area to create a poll from the names that have been suggested here and solicit votes from the community. Let me know if you'd like me to put that into a Discussion thread (I'm still relatively new to this community - sorry if that's been rehashed or if there is a different norm for this sort of thing!) |
I was a big fan of Lottery Factor, but with the comment from @RichardLitt, now I'm not so sure. Another option that @decause-gov proposed on the poll itself is "Nebraska Factor", which I think is something else to consider:
The benefit of Nebraska Factor is that it doesn't imply a reason (disaster, bus, lottery), because as mentioned above, a person can leave for positive reasons, too. |
I'll vote against Nebraska Factor, as an inside joke that's likely off-putting to those not in on it. It also doesn't have any inherent meaning, so you have to know what it is to know what it is; the name doesn't help you understand it (unless you already know the cartoon) |
Agree. Nebraska factor seems off-putting. When I used lottery factor it was largely in response to the negativity of bus factor, not because I particularly liked it. I ended up voting for the |
How about: Critical contributor index? |
Keep in mind, if we use key/core/critical contributor, we will also need to address this in our ongoing discussions about the boundaries between - core/regular contributors, occasional contributors, conversion rate, and 2nd contributors. What is a key/core/critical contributor? I think Bus Factor addresses the risk associated with 'people' leaving but I don't believe it defines who they are or why they are important. |
I'm thinking maybe we keep it simple. We don't need to make the judgement about whether someone is "core", "key", or "critical" in this metric, so maybe just "Contributor Risk"? |
I like "The Lottery Bus Factory". 🚌 💰 The Lottery Factor works as well. |
Hey All, just wanted to pop in here to let you all know that I came across an 'industry standard' (in quotes because I'm not sure it actually is) here in Australia wherein Dr. Jennifer Beckett has a measure for the bus factor but she referred to it more informally as the 'Moses effect.' She "firmly believes" (this is in quotes because I'm pretty sure it was a joke) that Bus factor is a MUCH better term, but we are probably going to be using bus factor as a primary measurement for internet toxicity because it specifically acts to measure the Bridging Capital of individual influence from inside of a community, to the outside public. More importantly it connects the three social capitals (bonding, linking and bridging) together in one singular user journey so we can easily understand the risk of that person propping everything up. Reversing the metric also allows us to gauge the level of reputation and bridging capital that someone will have coming INTO the community (reputation being a social currency metric, and bridging capital being a social capital metric). If you'd like I can have a longer conversation with her about this and it may give us a new insight into the way that Bus Factor impacts the socio-cultural stability of online communities. Worth a look I think! |
In a separate comment on this, I also wonder if we should be basing the name of this off of what is actually happening when you graph the bus factor on a network diagram - not just on contributors to an open source project, but actually place it within the context of social capital and currency theories IN GENERAL. In reality the Bus factor occurs when a 'node' (member in a community) who has garnered a lot of linking capital (they are connected to a lot of people) and bonding capital (they have close ties and have grown in reputation so their voice is recognized) shows a potential threat or likelihood of leaving a community, and their linked members are only connected to them in the project, so those nodes are at risk of disappearing from the community network diagram. In other words, that individual has garnered the linking capital and bonding capital to prop up a community BUT if they were to leave, the community would be at risk. Within the context of an open source project that is the likelihood that their leaving would put the project at risk, but even for an entertainment community this can be measured in the amount of engagements that a member of import has caused, in comparison to the amount that surrounding nodes have caused. We also see this example in real life with malls (🤮): American malls were architected such that there was always a food court in the middle for people to connect with as a 3rd space. Then at either end of a mall (usually a line in 2 or 3 directions) there would be an 'anchor store' that people would go to for low-dollar, but high-value items such as grocery, or mid-market stores. The smaller novelty stores and specialty services such as asian-import stores, mini-golf outlets, video game labs, and whatnot relied on the big-box anchors to force people to go between the communal space, and the larger store. Larger stores relied on the people being there for those specific interests to keep them in the mall for long periods of time.
What caused malls to die, and also what causes bigger contributors who commit frequently to leave, is usually that the work they have to commit becomes too much for the perceived reward. (I've talked about the burden of contribution in CHAOSS meetings before). So there could be something involved in the generalized issues for bus factor that we could use to rename it. This might be 'lost link likelihood' or 'at-risk supporter' or something along those lines? |
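
A rough sketch of the network-diagram framing above, assuming the community can be represented as a list of undirected links between members: it lists the members whose only connection is to the person who might leave. The function name and the sample links are hypothetical, and this covers only the "linked members are only connected to them" part of the idea, not bonding or bridging capital.

```python
def stranded_if_leaves(links, member):
    """Members whose only link in the network is to `member`."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # A neighbor is stranded when `member` is its one and only connection.
    return sorted(
        other for other in neighbors.get(member, set())
        if neighbors[other] == {member}
    )


# Hypothetical community: "hub" links a, b and c; c also links d.
links = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
print(stranded_if_leaves(links, "hub"))  # ['a', 'b']
print(stranded_if_leaves(links, "c"))    # ['d']
```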
In the metrics meeting, we discussed renaming this to "Contributor Risk" to keep it simple and descriptive, similar to our other metrics names. |
This might be the least worst option 😄 |
Another idea: kujenga factor. This is the Swahili word meaning "to build", and it is the basis for the popular game Jenga™. As you may not have played it, this game involves taking wooden blocks out of a tower, until the tower eventually falls down, at which point the person who removed the last block loses. https://en.wikipedia.org/wiki/Jenga I believe that Jenga is trademarked. Kujenga, however, isn't. It's relatively easy to explain. Also, it looks like: |
Are we thinking about projects or people? |
Or ecosystems. |
Continuing to iterate on the name:
@RichardLitt and @danielskatz - you can learn more about how we've defined it here: https://chaoss.community/kb/metric-bus-factor/ - it's focused on the question: How high is the risk to a project should the most active people leave? So far, we aren't talking about re-defining it, just renaming what we already have defined. |
@alice-sowerby mentioned "Contributor Absence Risk" - CAR factor :) I'm a fan of it, not solely for the irony. |
I've recently been learning about "strategic ambiguity," and think "Contributor Risk" is still better than being overly specific :) |
@danielskatz any thoughts on how "risk" being a higher number in that case is a good thing? Another pt from the OSPO community call: "risk" is usually something folks instinctively prefer to be lower. |
Maybe we could define risk as 1/n, where n is the number of key/essential/important contributors, so that 1 would be the highest risk, 0.5 would be better, 0.33 would be better, 0.25 ... |
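
The 1/n idea from this comment, written out as a small sketch so the direction of the number is easy to see. This is only the proposal as stated here, not the existing metric definition; the function name is hypothetical.

```python
def contributor_risk(n):
    """Risk as 1/n, where n is the number of key contributors (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 / n


for n in (1, 2, 3, 4):
    print(n, round(contributor_risk(n), 2))
# 1 1.0
# 2 0.5
# 3 0.33
# 4 0.25
```

With this transformation a lower number is better, which matches how people usually read "risk", at the cost of no longer being a simple count.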
As a reminder, we're trying to avoid redefining the metric, since it's already very widely used. Renaming it is going to be disruptive enough :) Going back to Gary's point, the suggestion was to use “count” instead of risk, since the word “risk” at the end tends to imply that lower is better, and in bus factor larger is better. Or maybe we simplify it to not use risk OR count? I think this might be more consistent with how we've historically named our metrics:
The plan is to make a final decision in the next Metrics Development meeting on Thursday, June 6. See https://chaoss.community/chaoss-calendar/ for your time zone and meeting logistics. |
What about Contributor Factor then? Or Contributor ... Factor? |
I appreciate the sentiment that goes into the new suggestions. I'll reflect
on these.
When I read:
- Contributor Abandonment
- Contributor Departure
- Contributor Absence
I expect a metric by these names counts how many contributors have
abandoned, departed, or become absent. It's backwards looking.
The current Bus Factor is defined as the count of contributors that are
currently doing 50% of the work.
Adding "Risk" modifies the meaning to be forward looking but then we have
the unintuitive discrepancy with interpreting a high number as low risk.
... I am thinking back to other names we discussed. The only one that I
remember without looking into the history:
- Key Contributor Count
Maybe the test "what metric name do you remember" also tests for "what name
is intuitive". My answer may not be your answer and what is intuitive to me
may not be intuitive for others.
One of the reasons the "Bus Factor" may have been successful was the name's
ability to create an image in people's minds that stuck.
|
I worry that we are becoming overly concerned with finding a perfect name, focusing too much on minor details, and maybe starting to go in circles. I think we can agree that a perfect name hasn't been proposed and we are unlikely to find it. That said, I believe there are at least 3 or 4 names in the discussion that are acceptable and descriptive of what we are trying to measure. Regarding the use of risk in the name: risk is descriptive of what we are trying to measure and yes, we want risk to be low. I don't think it is too difficult to understand that a high number of something can relate to lower risk. |
OK, in the Metrics Development WG Meeting, we decided on ... 🥁 🎉 Contributor Absence Factor 🎉 |
Thank you so much to everyone for such a lively discussion over the past couple of months! |
…/community#632. To minimize disruption while we transition to the new name, this first PR simply changes the text of the metric without changing any filenames. Signed-off-by: Dawn M. Foster <dawn@dawnfoster.com>
@geekygirldawn are we good to close this issue since we have a resolution? 🙏 |
We're still working on implementing this across our various repos / website, but the renaming is done, so let's close this issue 🎉 |
I know that renaming what is probably our most widely used metric is going to be painful, but I think it's time to rename Bus Factor to something else.
The number of people I've had express pretty severe dislike of the name Bus Factor is quite high, and I often try to avoid calling it Bus Factor.
I often call it "Lottery Factor" because it's easy to understand: how likely is your project to survive if someone suddenly won the lottery, retired on a beach, and never looked at your project again?
Pony factor is more widely used, because it's been adopted by the Apache Software Foundation, but I find that it's harder for people to understand outside of the ASF. There isn't an easy narrative around it like what I have above for Lottery Factor.
I'd be really curious about the opinions from folks involved in inclusive naming initiatives and whether they've seen a commonly suggested substitute for Bus Factor.
I'm also curious about what the academic folks have seen. Is there a particular term that is more widely used in Academia / research?
cc-ing a few folks that I think would be interested in this discussion: @GeorgLink @germonprez @sgoggins @ElizabethN @klumb @dicortazar
I welcome any Chaotics to jump in with opinions.