[Discuss] Provide community guidance and deeper support for some communities #297

Closed
2 of 3 tasks
choldgraf opened this issue Nov 10, 2021 · 13 comments
Labels
Community Engaging and cultivating communities that we currently serve. Discussion A discussion without a specific action to take.

Comments

@choldgraf
Member

choldgraf commented Nov 10, 2021

Description

We are defining a support steward role (#187) to handle all support requests for our hubs. This roughly breaks down to:

  • A community representative is the main connection with a hub's users. They discuss and triage support questions, and then escalate to support@2i2c.org as needed.
  • A 2i2c support team monitors support@2i2c.org and communicates with the community representative as support issues are resolved.

Recent conversations have made it clear that some communities might need more support resources than the "community representative" model described above. In particular, what should we do if a community does not have the resources to define a "Community Representative" that can be the "middle layer" for support requests?

Value / benefit

This would allow us to interact with larger or more complex communities, and also offer more clear guidance about what we do and do not offer as a part of the hub service. It might also be an opportunity for us to offer more value to those communities than we currently offer.

Implementation details

We discussed a few ideas for this. Here are some that came to mind:

  • Define a dedicated support team for a community. In this case, one or more engineers could have part of their time devoted to a particular community. However, it's unclear how this would intersect with communities that don't have this dedicated time (e.g., would they still use the same support channels?)
  • Offer a "Community Representative as a service". In this case, we'd offer a person to "join" the community that we're working with, and interact with them in a more heavy-touch relationship. They'd need to understand more about the workflows in that community and help guide the work people do, and would also be the main point of contact to the 2i2c SRE team.

In both cases, we'd need to define a sustainability mechanism for these roles. Both would require more dedicated resources, and would thus be more expensive than the typical hub offerings we have described so far.

Finally, we already have two communities like this:

  • Pangeo
  • University of Toronto (though they seem OK with defining their own community representatives)

Perhaps we can use these communities as testing grounds for what a support process could look like. In particular, Pangeo has already awarded us with funds that are beyond what we'd typically charge for managing JupyterHub infrastructure. While part of those funds are meant to go towards development, perhaps a more high-touch support role can be a part of those funds as well.

Tasks to complete

  • Discuss what options we have here, and what is worth pursuing
  • Add next steps as we decide on something...
  • Prototype a process with the Pangeo community, and see how this goes

Updates

@rabernat

Thanks for opening the issue Chris. It's an important discussion.

I just want to clarify a little bit on this point...

what should we do if a community does not have the resources to define a "Community Representative"

With Pangeo, it's really not the case that we don't have resources. We have significant grant funding and a large community of contributors (both volunteer and paid by various projects) to draw upon. For me, it's more of a question of "what is the boundary between Pangeo and 2i2c"? When we initially conceived this collaboration, I imagined that Pangeo and 2i2c would become closely enmeshed with each other. As things have evolved, what has emerged is more of a client relationship, with me as the de facto liaison as the PI on the grants that are funding this hub.

A perfectly fine outcome for me would be for us to designate a community representative outside of 2i2c. There are many people who could fit this role. Recognizing how overextended everyone at 2i2c currently is, that kind of sounds to me like the best choice.

@choldgraf
Member Author

choldgraf commented Nov 10, 2021

Just a quick follow-up to @rabernat's point - something I am trying to figure out here is where to draw the line between development and support. In my mind, the "enmeshing" that Ryan describes above would be some combination of:

  • Interacting closely with the Pangeo community to understand opportunities for new development
  • Doing that development in collaboration with Pangeo community members
  • Using this experience to make the necessary improvements to Pangeo's hub to meet the community's needs

When I describe it like that, it does sound to me like a "Community Representative as a service", but I hadn't considered this role in the context of support (e.g., responding to outages and minor requests to update packages and such). To me it feels like there's a large benefit to having a team of people acting as support for Pangeo's hubs, since we avoid single points of failure and can spread the load a bit. However, it's unclear what that team's role should be in the context of this more "Community Representative" style team member...

@choldgraf changed the title from "Support process for larger or more complex communities like Pangeo" to "Alpha support process for larger or more complex communities like Pangeo" on Nov 11, 2021
@choldgraf
Member Author

Notes from meeting

@sgibson91 and I had a conversation about this today, and we brainstormed a few ideas around the best way to serve a community like Pangeo. It seems like we are missing a role in the story of working with communities like Pangeo. This role would be responsible for things like:

  • Keeping in regular connection with the Pangeo community and understanding its workflows (e.g., by attending Pangeo meetings and keeping watch over community spaces for conversation)
  • Serving as eyes and ears to understand where new development could help the Pangeo community
  • Helping the Pangeo community best-leverage the infrastructure that 2i2c provides (e.g., by improving documentation, providing advice online, running workshops, etc)
  • Helping the engineering team plan and prioritize new development (e.g., by keeping up with the backlog, refining items and guiding discussion, leading planning and discussion meetings, etc)

and at a more meta-level (AKA, for 2i2c in general, not just Pangeo):

  • Work across communities, and help understand the common pain points across them to drive new development and improvements in the infrastructure.
  • Help 2i2c demonstrate impact by communicating the experiences of the communities we serve to the outside world
  • Help 2i2c reach prospective communities by helping them understand the value that 2i2c brings (e.g., as a part of the sales process)

Notes from conversation with @KirstieJane

I also had a conversation with Kirstie Whitaker. She agreed that the kind of role I've described above is quite important. She referred me to a new job title at the Turing Institute that might be relevant here, called the Research Application Manager.

Here's the job description of a recent RAM posting: https://cezanneondemand.intervieweb.it/uploads/153/annunci/ResearchApplicationManagerJD.pdf

The basic idea is that this position is a combination of "Product Manager" and "Stakeholder Manager". Their job is to ensure that the research work going on at Turing is having impact through applications that are carried out by stakeholders in the ecosystem.

One possibility is to define something similar for open source infrastructure that we build (e.g. an Open Source Application Manager). The goal of the position would be to ensure that 2i2c is maximizing the impact of its open infrastructure, by bringing best-practices into communities that we serve and sharing knowledge about how to best use the tools that 2i2c provides, and by bringing experiences from the communities back into our development cycles and workflows. The position is sort-of like a translator that bridges the SRE/Development world and the communities we serve.

One option: a team-based approach to Pangeo's collaboration

With all of that in mind, one potential path forward could be for us to define a team approach to the Pangeo collaboration, rather than relying on a single individual who does all of this work themselves. For example, we could imagine a breakdown of roles like this:

  • Support and operations: Performed by the 2i2c engineering team, by standard 2i2c practices. This focuses on Site Reliability Engineering and handling cloud-related support.
  • Infrastructure Engineering: Performed by a specific 2i2c engineer for particular projects, potentially shifting between team members depending on the needs and amount of development. This focuses on creating and improving technology to extend the use-cases of those in the Pangeo community. They work in partnership with the Infrastructure Application Manager.
  • Infrastructure Application Management: Performed by an Infrastructure Application Manager. This role must understand the Pangeo use-cases by working closely with stakeholders in that community, and also works closely with the 2i2c Engineering team to guide development efforts in the most important directions.

note: I know the above is a lot of raw data! None of this is final, it is meant to spur discussion and brainstorming

@rabernat

In recent discussions with Chris, I realized that "Pangeo" is probably a pretty anomalous model to use as a "community." The reason is that Pangeo the community does not actually have any money. The entities that hold money that could potentially go to 2i2c are:

  • Grant-funded research projects (NSF, NASA, NIH, etc.)
  • Individual PIs with startup / discretionary funds
  • Academic departments with instructional budgets

The current Pangeo Hub is not actually funded by "the community" at large. It is funded by a specific NSF award to Columbia University, for which I am the PI.

More generally, the concept of an open science community like Pangeo is really pretty new. It would be wise to base our short-term strategy around the existing reality of how science is organized / funded today, rather than some vision for how science should be organized in the hypothetical future.

So now I will offer a different perspective, as a PI of another large new NSF-funded project...


Columbia was recently awarded $25M from NSF for a project called Learning the Earth with AI and Physics. My role is director of data and computation. We need a cloud-based Pangeo-style JupyterHub for both research and education. Is this "part of Pangeo"? In my head, yes. But on a more concrete level, it is funded by a different grant and has a different set of users and stakeholders.

I proposed the idea of working with 2i2c for our cloud hub using the alpha pricing model Chris shared. They were generally supportive, but the main question people asked was whether 2i2c provides training / onboarding for researchers / educators to learn how to use the hub. This would not be language specific (we have both Python and R users), but would be more about cloud-specific stuff, recognizing that most researchers have not used cloud computing before. Stuff like:

  • Logging on / off
  • Selecting environments (profile-list options)
  • Using nbgitpuller
  • Navigating jupyterlab
  • Dealing with files / data in the cloud

We would not need any training in how to do data science. If 2i2c could offer this sort of minimal but important training / onboarding, it would be much more attractive to this group.

@sgibson91
Member

  • Logging on / off
  • Selecting environments (profile-list options)
  • Using nbgitpuller
  • Navigating jupyterlab

I think there are enough resources on all of the above things out there already to collate a short intro guide.

  • Dealing with files / data in the cloud

This worries me. My mind explodes every time I watch you do a demo, so I feel under-qualified to onboard / train anyone in that.

Which leads to the other question: if 2i2c agrees it can offer this level of onboarding / training, who will give it? Is it the engineering team, or is it some other as-yet-undefined (and likely unstaffed!) role/team?

@rabernat

rabernat commented Nov 23, 2021

We could remove cloud object storage from the scope here if that makes things easier.

if 2i2c agrees it can offer this level of onboarding / training, who will give it?

In our case, it may be enough to develop the training modules, and then hand them off to a TA or other support role within Columbia. Videos and self-paced training could also work.

@choldgraf changed the title from "Alpha support process for larger or more complex communities like Pangeo" to "Provide community guidance and deeper support for some communities" on Nov 23, 2021
@choldgraf
Member Author

I've had a few more conversations about this, and I'm coming around to the idea that this issue is also related to https://github.com/2i2c-org/meta/issues/256. I spoke with a few people in product management roles, and they often described their work as primarily understanding the needs of the user community around a tool in order to guide development etc. Many of them accomplished this through much more high-touch connections with those communities, such as:

  • Having one-on-one conversations with stakeholders to understand what could be better, changed, etc
  • Running workshops and tutorials to teach people how to use the tools effectively and to get feedback from them
  • Participating in community events and communicating in community spaces to share information and learn what people are thinking
  • Writing documentation that clarifies questions community members are having

On the development side these people would then help represent the community's interests in ideas for new features, prioritization, etc.

This takes me back to the "Infrastructure Application Manager"-style role described in this comment.

Proposal for next steps

  • Define a role for community engagement and advocacy. This role would complement the SRE team as well as the developer(s) to help guide the community in using the tools, and to guide the developers in building the right things. It would be some part product manager, stakeholder/community manager, instructor, documenter, etc. This could be a service that 2i2c offers as a more high-touch relationship with some communities, but could also be more generally useful for our team to help guide development.
  • Write a job description/title for a person that could fill this role among others. Depending on the outcome of #296 (Budget analysis / projection to understand the resources we have), we could resolve this issue and https://github.com/2i2c-org/meta/issues/256 by bringing on a new team member. They would be a generalist, like myself, rather than tightly scoped to a skillset, and their focus would be more at the meta-level rather than diving deep on engineering. One of the roles that this person could fulfil is the one defined in the bullet point above. Beyond that, they could also help with more of the high-level things like backlog and team stewardship, organizational stuff, fundraising, etc.

I don't know exactly what those two bullet points would look like, but does that seem like a reasonable plan to pursue? If so then I will put this on my plate for our next cycle to try and clarify things further...

@sgibson91
Member

I am +1 on this plan @choldgraf

@choldgraf changed the title from "Provide community guidance and deeper support for some communities" to "[Discuss] Provide community guidance and deeper support for some communities" on Nov 29, 2021
@choldgraf
Member Author

I had another conversation with @damianavila about this today, and wanted to write down a quick idea while it was fresh in my head.

We discussed that a high-touch collaboration like Pangeo is really a combination of four services:

  • Operating cloud infrastructure so that it is reliable, and so that necessary changes are made for the community
  • Supporting issues that arise on Pangeo's infrastructure, as surfaced through our support channels
  • Developing new technology that supports Pangeo's use-case
  • Engaging with the Pangeo community to build a high-bandwidth information channel between 2i2c and the community

I thought this was a nice way to disentangle a few different services that we're providing, and to identify where we can use a team, where we can use individuals, and where there might be different skillsets needed to do things most effectively. For example, I think the final bullet point is more akin to the "Product / impact manager" that is described above.

@rabernat

rabernat commented Dec 1, 2021

I like this enumeration of different services. However, I feel the need to keep pointing out that "Pangeo" / "Pangeo community" is not really an entity that can ship money to 2i2c. It's a very loosely organized and heterogeneous collection of individuals from different institutions and projects. Pangeo literally does not exist from a legal or financial point of view. So from a sustainability / scalability point of view, it doesn't make sense to target "Pangeo" as a customer or client.

The 2i2c customer or client is a discrete funded project, academic department, or research lab with money to spend. In this context, "Pangeo" represents a set of configurations, tools, and practices that the client wants to use. In many cases, the client may even want 2i2c to teach them "how to Pangeo." Right now I am trying to funnel many different projects towards 2i2c as a "Pangeo provider" (perhaps as an alternative to using the MS Planetary Computer or deploying their own stuff via QHub).

I just think it's important that we keep this distinction clear to avoid setting up a business model that caters to a non-existent customer.

@damianavila
Contributor

This is a really important distinction... what would the business model be for 2i2c to interact with and get funds from collaborations like Pangeo? Should 2i2c try to get funds directly from consumers of that collaboration? That would mean we need a composite model where we are not only serving communities but also individuals?

@rabernat

rabernat commented Dec 2, 2021

Should 2i2c try to get funds directly from consumers of that collaboration?

In general, 2i2c should try to get funds from the people who have money to spend. If we are targeting academic research, that means "PIs": the people who write the grants and make decisions about how awarded funds are spent. For education, the decision makers may be different: individual instructors, department chairs, university IT managers, etc.

I'm not sure I would characterize these people as "consumers" of Pangeo. Many of them are aware of Pangeo and are asking, "can I get Pangeo for my project / lab / class / department?" In past years I would get emails from these people saying, "can I pay you to run a Pangeo hub for us?" I had to say no because I had no way to provide such services. An explicit goal of Pangeo partnering with 2i2c was to develop a turnkey, scalable model that could be responsive to such inquiries.

That would mean we need a composite model where we are not only serving communities but also individuals?

We need to distinguish between the 2i2c "customer" - the person who decides to spend money with 2i2c (i.e. the PI) - and the 2i2c user, who actually logs in to the hubs. In some cases, the PI may never even log in to the hub. They just care about their users getting the resources they need to do their work. The PI will want detailed reports on usage and cost breakdowns. They will also want training / onboarding to make sure the money spent on the hub will have its maximum impact.

The "community" concept is amorphous and without clear boundaries. Open Science Communities are something we are trying to will into existence--they don't really exist yet. To allow communities to grow spontaneously, I think it's very important that membership in a "community" not be tied to access to a particular hub. Access to hubs is determined by who pays for the hub and whether the user is formally affiliated with that project. Participation in a community should be open to anyone. That's why I think it's best for our hubs to be very generic, such that workflows can be run from any hub.

@choldgraf choldgraf added Community Engaging and cultivating communities that we currently serve. and removed 🏷️ support labels Aug 1, 2022
@choldgraf
Member Author

I'm going to close this one, as it led to the creation of these issues:

@choldgraf choldgraf added Discussion A discussion without a specific action to take. and removed Enhancement An improvement to something or creating something new. labels Sep 16, 2022

4 participants