[Planning] Invite orgs to contribute pkg list approved for GxP Use #52

Open
1 of 5 tasks
Tracked by #43
aclark02-arcus opened this issue Jul 2, 2024 · 12 comments

@aclark02-arcus

aclark02-arcus commented Jul 2, 2024

This issue has two prongs to it, with the ultimate goal of increasing our rate of participation overall:

  • A. invite a few orgs into participating so we can learn how to best frame the "ask"
  • B. compose a "call to participate" in a public sphere, manifesting in three comms channels:
    • website
    • Posit Conf
    • Revisiting Org Use Case interviews (looping in @anujadas185)

For part A, here is my initial distribution list:

  • Preetham @ Merck
    • Would love to contribute anonymously... checking if there are objections internally
  • James @ Roche
    • open-sourced their list on github.io
  • Eric @ Biogen
    • said anonymous may work
  • Andy @ GSK
    • Checking to see if they can do something similar to Roche, but will likely be able to share anonymously without much difficulty from legal side
  • Sam Parmar & Narayanan Iyer @ Pfizer
    • Checking with Mike Smith and waiting for approval
  • Nick Masel @ JnJ
    • Would love to contribute... checking if there are objections internally
Initial email request, sent 7/1

Hi Eric,

I think we talked about this at a past R Validation Hub community meeting and I think you said you could provide a list for Biogen, but please forgive me if this is your first time hearing about it. The {riskassessment} app has undergone steady development and is now ready to accept CSVs from pharma orgs who want to openly share what R pkgs they've approved for use within their GxP environment(s). This data will be integrated within a special deployment of the {riskassessment} app so that any user can look up a package of interest, like {stringr} for example, and see which orgs also allow {stringr} for GxP use. This will be a big step towards gaining an industry consensus on which subset(s) of pkgs are generally accepted by pharma orgs. Obviously, app users can choose what they want to do with this info and we make no promises / guarantees about these pkg collections.

Initially, we hope to gather this info from the orgs who participate regularly with the R Validation Hub (like Merck, Pfizer, Biogen, GSK, etc) to get us started. In the meantime, we'll post an announcement on the pharmar.org site soliciting contributions from any/all orgs who are interested in participating. I'm working on composing that post right now. The hope is that we'll have data gathered for 4 - 5 pharma orgs before our Posit Conf Presentation on Aug 11. But in order to prepare, we're hoping to get Biogen's list sometime this week, perhaps before Tues, July 9 if that works for you?

If you have any questions, please feel free to pose the question here! But we're asking for a CSV with the following fields:

  • package name
  • package version
  • assessment date
  • risk decision (could be "low risk", "medium risk", "high risk" or some alternative, like "approved" for example)
  • additional considerations (optional)

For now, we just plan to collect this info once, but we'll see how well it's received by the community and if warranted, consider gathering an update every year, or perhaps every 6 months.

Thanks!

Aaron Clark

@aclark02-arcus aclark02-arcus changed the title Invite pharma orgs to contribute pkg list approved for GxP Use [Planning] Invite pharma orgs to contribute pkg list approved for GxP Use Jul 2, 2024
@aclark02-arcus aclark02-arcus changed the title [Planning] Invite pharma orgs to contribute pkg list approved for GxP Use [Planning] Invite orgs to contribute pkg list approved for GxP Use Jul 2, 2024
@aclark02-arcus aclark02-arcus added this to the Posit Conf 2024 milestone Jul 2, 2024
@aclark02-arcus
Author

aclark02-arcus commented Jul 2, 2024

Hi @jthompson-arcus, @dgkf, & @emilliman5:

Per our conversations today, I wanted to reframe the messages that I sent out yesterday, so thought I'd run my follow-up past you all to make sure we're aligned. Expand the "Initial email request, sent 7/1" section above to read the initial ask, and then how I want to re-frame things given our conversation today.

But first, an update: James Black has already gotten back to me. He has received the green light from legal and plans to have a repo published very soon... as early as the end of this week, but perhaps next week. I'm thankful James is attacking it so quickly so that it could be considered as a model for other orgs to follow, if they so choose.

So, here is my planned follow-up email. One question I have in particular is about # 2 below. Is that the scope we want to go with?

Follow up

Hi Preetham,

I wanted to follow up on my request from yesterday after having conversations with Doug & other parties interested in contributing to the initiative. We've strategized a few options / pathways forward that we think will ultimately help increase the rate of participation in this project from other orgs in the space.

First, there are three big take-aways:

  1. We are not going to publish this data in the {riskassessment} app. Why? We don't want to suggest that this data is somehow actionable to an org qualifying a package for GxP use. That is, we don't want orgs to conclude that a package is somehow qualified just because "Merck" or some other org has qualified it for some unknown use case. Instead, we plan to only analyze the results internally and do two things with it:
    • Observe & analyze the data, then follow up with a blog post to our site (pharmar.org) that summarizes what we found, in aggregate. It's important to note that orgs can choose to remain anonymous or be named in the publication. Either way, we'll make sure we have sufficient participation from the entire industry first, and summarize by org size (small, medium, large).
    • Share the data with our Regulatory R Repo workstream, since knowing which pkgs are generally qualified will greatly help them identify useful benchmarks / threshold cutoffs when building consensus on measurable quality metrics.
  2. We want to narrow the scope of the ask: previously, my request was very broad, but we want to tighten things up and instead ask, "what packages have you qualified for late stage analysis?"
  3. We want you to know that you have options when it comes to sharing this data. Namely, you could:
    • Choose to be completely anonymous if needed. In this case, your org's name would never be attached to the data you share, nor in any publication. However, you could also elect to remain anonymous and still let the R Validation Hub post the data in one of our public repos.
    • Alternatively, you could choose to publish the data to a GitHub Repo owned by your organization. That way you can maintain a license & disclaimers in the README, clearing your org of any potential or perceived liability. Roche has already taken this approach and serves as an excellent exemplar to follow for other orgs interested in this path.

If you have any follow-up questions, please feel free to reach out and I'll do my best to provide guidance. Thank you for being a major contributor to the R Validation Hub & R Consortium!

Regards,
Aaron Clark

@aclark02-arcus
Author

FYI - James got Roche’s validated list of packages open-sourced on Friday. See link below as a good model for other orgs to follow, if they feel so inclined.

https://insightsengineering.github.io/rvalidationhub-packages/

@aclark02-arcus
Author

aclark02-arcus commented Jul 9, 2024

On 7/9, @dgkf suggested we spin up a template subpage explaining how orgs can contribute their data, with the ability to opt out of certain elements as needed.

@aclark02-arcus
Author

aclark02-arcus commented Jul 18, 2024

FYI, still waiting to hear back from several pharmas. @pharmaR/ws-communications, here is the new and approved script for requesting this info and inviting orgs to take the opportunity to share a Case Studies update:

Hi Nick...

I saw you presented on behalf of JnJ so I thought I'd reach out to see if you'd be interested in participating again. Is it okay if I put you in touch with the team leading the initiative so they can share more info?

Something different this time around is that we are hoping to gather a list of the R pkgs pharma orgs have approved for use on late stage analysis within their GxP environment(s). Initially, we hope to gather this info from the orgs who participate regularly with the R Validation Hub (like JnJ, Roche, Novartis, Merck, Pfizer, Biogen, GSK, etc) to get us started. In the meantime, we'll post an announcement on the pharmar.org site soliciting contributions from any/all orgs who are interested in participating. I'm working on composing that post right now. The hope is that we'll have data gathered for 4 - 5 pharma orgs before our Posit Conf Presentation on Aug 11.

At the end of the day, we hope to:

  • analyze the data, then follow up with a blog post to our site (pharmar.org) that summarizes what we found, in aggregate. It's important to note that orgs can choose to remain anonymous or be named in the publication. Either way, we'll make sure we have sufficient participation from the entire industry first, and summarize by org size (small, medium, large).
  • share the data with our Regulatory R Repo workstream, since knowing which pkgs are generally qualified will greatly help them identify useful benchmarks / threshold cutoffs when building consensus on measurable quality metrics.

Last, if you are interested in participating, we want you to know that you have options when it comes to sharing this info. Namely, you could:

  • Choose to be completely anonymous. In this case, your org's name would never be attached to the data you share, nor in any publication. However, you could also elect to remain anonymous and still let the R Validation Hub post the data in one of our public repos.
  • Alternatively, you could choose to publish the data to a GitHub Repo owned by your organization. That way you can maintain a license & disclaimers in the README, clearing your org of any potential or perceived liability. Roche has already taken this approach and serves as an excellent exemplar to follow for other orgs interested in this path.

If you have any questions, please feel free to pose the question here! But we're asking for a CSV with the following fields:

  • package name
  • package version
  • assessment date
  • risk decision (could be "low risk", "medium risk", "high risk" or some alternative, like "approved" for example)
  • additional considerations (optional)

For now, we just plan to collect this info once, but we'll see how well it's received by the community and if warranted, consider gathering an update every year, or perhaps every 6 months.

Regards,

I will close this issue once I've heard a timeline from each of these orgs, and specifically, whether they can share their list before Posit Conf arrives.
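
As a rough illustration of the CSV requested in the script above, here is a minimal sketch of how a submitted file could be sanity-checked before analysis. The column names (`package_name`, `package_version`, etc.), the `check_submission` helper, and the sample rows are all assumptions made up for this example; the script only lists the fields in prose and does not prescribe exact headers.

```python
import csv
import io

# Hypothetical header names: the email lists the fields in prose only.
REQUIRED = ["package_name", "package_version", "assessment_date", "risk_decision"]
OPTIONAL = ["additional_considerations"]  # may be left blank or omitted


def check_submission(csv_text):
    """Return a list of problems found in a submitted CSV (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED if c not in header]
    if problems:
        return problems
    # Data rows start on line 2, right after the header line.
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED:
            if not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
    return problems


# Invented sample data, for illustration only.
sample = (
    "package_name,package_version,assessment_date,risk_decision,additional_considerations\n"
    "stringr,1.5.1,2024-06-30,low risk,\n"
    "dplyr,,2024-06-30,approved,\n"
)
print(check_submission(sample))
```

The risk decision is deliberately left as free text here, since the script allows alternatives such as "approved" alongside the low/medium/high labels.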

@antalmartinecz

antalmartinecz commented Jul 18, 2024 via email

@aclark02-arcus
Author

I talked with my manager about it and most probably we (Certara) would also be very happy to provide a list of packages we use.

Hi @antalmartinecz, that's great. Does Certara hope to do this like Roche (open source) or more anonymously? Either way, we are thrilled your org is willing to contribute!

@aclark02-arcus
Author

FYI, sent out reminder messages today to the remaining pharma orgs that showed interest.

@DrLynTaylor
Collaborator

Hi @aclark02-arcus and R Validation Hub, I was talking with the PHUSE CAMIS co-leads this week and we realized that by creating a repo of Comparing Analysis Methods in Software (SAS vs R vs Python https://psiaims.github.io/CAMIS/), we have inadvertently created a renv.lock file with a list of packages most commonly used in pharma for stats analysis. In the cases where these packages have had case study datasets run through both R & SAS and we have documented a match in results, we have essentially taken a step towards considering these "Trusted packages"! See our repo https://github.com/PSIAIMS/CAMIS. Is this lock file of any use to your group? We'd need to ensure we remove any packages we don't trust based on our comparison findings (like epibasix https://psiaims.github.io/CAMIS/Comp/r-sas_mcnemar.html), but you are welcome to use it to add to your central package list. Want to discuss?

@dgkf
Collaborator

dgkf commented Sep 5, 2024

we have inadvertently created a renv.lock file with a list of packages most commonly used in pharma for stats analysis. In the cases where these packages have had case study datasets run through both R & SAS and we have documented a match in results, we have essentially taken a step towards considering these "Trusted packages"!

This is super cool, @DrLynTaylor! I hadn't connected that idea, but it's a really amazing way to tie the collective knowledge of CAMIS back to the R Validation Hub. @aclark02-arcus - I think we could surface this list similar to an organization's list of packages.

@DrLynTaylor
Collaborator

I have to thank Christina Fillmore (GSK) for the idea. She's a co-lead of CAMIS and drives our repo tech (the renv file, etc.), so we can bring her into discussions about what we need to pass on to you.

@DrLynTaylor
Collaborator

The renv.lock file is here if you want to include it in your package compilation. The only package we have found so far that should not be used (as we cannot replicate the results) is epibasix, so I'd recommend taking that one out of your list. https://github.com/PSIAIMS/CAMIS/tree/main
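
For anyone wanting to pull a package list out of a lockfile like this, here is a minimal sketch. It relies on renv.lock being a JSON file with a top-level `Packages` object keyed by package name. The inline lockfile text, its version numbers, and the `trusted_packages` helper are placeholders invented for this example and are not the contents of the CAMIS lockfile.

```python
import json

# Tiny stand-in for an renv.lock file; versions are placeholders,
# not taken from the real CAMIS lockfile.
LOCKFILE_TEXT = """
{
  "R": {"Version": "4.4.1"},
  "Packages": {
    "epibasix": {"Package": "epibasix", "Version": "1.5", "Source": "Repository"},
    "survival": {"Package": "survival", "Version": "3.7-0", "Source": "Repository"}
  }
}
"""

# Packages to drop, per the recommendation in the comment above.
EXCLUDED = {"epibasix"}


def trusted_packages(lock_text, excluded=EXCLUDED):
    """Return sorted (name, version) pairs from a lockfile, minus exclusions."""
    lock = json.loads(lock_text)
    packages = lock.get("Packages", {})
    return sorted(
        (name, info.get("Version", ""))
        for name, info in packages.items()
        if name not in excluded
    )


print(trusted_packages(LOCKFILE_TEXT))
```

Keeping the exclusion list as a separate set makes it easy to extend if later comparisons turn up other packages whose results cannot be replicated.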

@aclark02-arcus
Author

aclark02-arcus commented Oct 1, 2024

The renv.lock file is here if you want to include it in your package compilation. The only package we have found so far that should not be used (as we cannot replicate the results) is epibasix, so I'd recommend taking that one out of your list. https://github.com/PSIAIMS/CAMIS/tree/main

Thank you @DrLynTaylor! And apologies for replying 3 weeks late. We're working on consolidating this work into a central repository, so I'll be sure to include it in the list!
