The future of deep review #810
Comments
I think that if we have a committed group of maintainers there is the opportunity to do something new here in the way of a living scientific manuscript that stays up to date with the field. However, we probably need more than me and @agitter to make it sustainable. Does anybody else have an interest in helping to contribute at the maintainer level? |
One quick thought. We should probably be talking more about what we do after v1.0, which I imagine would be the accepted version at the journal. At this point I feel like we should push to that finish line. 😄 |
Agreed. I think we should only take pull requests on obvious typos until v1.0. |
Agreed - hit the (immediate) finish line and then worry about the future. And when we get to that future, a few discussion points or ideas:
|
I could be interested in this. It could help to define expectations for maintainer roles, but those obviously depend on a lot of other variables. Similarly to @agapow , I felt that keeping up with the notifications was sometimes like drinking water from a fire hose. I think this was due to the intermixing of notifications related to content (i.e. tracking new references and writing) and infrastructure (e.g. administrative, repository code, formatting). I also wonder if github is the best place for this sort of thing, or if such a platform exists. Lastly, given the size of the paper, does shifting to a different format - one that is designed to organize information on a grander scale (e.g. a book) - make more sense long-term? |
@cgreene @agitter I realize I'm a bit late to the conversation but fwiw, would definitely be interested in helping in a maintainer function or role. @evancofer To your pt. about inundation, was the Projects feature (or something similar with GH integration like Trello) used to track progress? I wonder if that might be one way of making the different workstreams a little more manageable and organizing issues based on the topic or sub-topic. |
@stephenra AFAIK there weren't any project management tools (e.g. Trello, Asana, GH Projects) used. I would guess that @cgreene lab had some internal tracking of general project status, but that probably isn't too useful for our purposes. Enabling contributors to easily subscribe to notifications for one or a few sections/topics could be useful. |
@stephenra we used milestones within GitHub and labels (usually, not always) for some form of organization. We also ended up having a few master issues to track progress and link to related issues at various project stages (e.g. #188 and #678 ). @cgreene and I didn't really have any formal internal tracking beyond that, and I'd be open to better organization if other maintainers join in to keep this going. |
@evancofer @agitter Thanks for clarifying! I'm tool-agnostic but under the working assumption that this continues to grow in scope, it may be helpful to adopt one (my past experience with Trello and GH Projects has been positive overall, but admittedly Kanban board-style project management tools aren't for everyone). Is there an estimate of roughly how many maintainers would be needed to keep the project going? |
@stephenra I'd say the number of maintainers depends on what exactly we want to sustain. Is it an up-to-date manuscript or book? A curated list of papers, tools, and data? Something else? |
@agitter Makes sense. And apologies, I realized I'm getting ahead of the conversation given the immediate focus on v1.0. |
@stephenra This is actually a good time to have the conversation while we still have contributors' attention after the recent round of revisions. Let us know if you have more thoughts about what form the future product or project should have. |
I agree that now is a good time to figure this out. In terms of tooling, our lab has used waffle.io for other projects and found it useful. I think the same things that it has helped us organize could aid the maintainers in planning what to include. I also think we'd be breaking new ground on authorship, but I like the idea of a "release" occurring either every 6 or 12 months (from our own experience, I think 12 months is more reasonable). If there were project participants who would like to lead each of those releases, I think the authorship categories could accommodate a reshuffling of the author list on each release (we could stick "maintainers of previous versions" in a category that doesn't shuffle to the last positions - those could be "maintainers of the current version"). Maybe JRSI would like to publish an annual update for a few years, or maybe we could talk with other journals about future releases (imagine a Jan 2019 release date for the next one...). If any journals are interested, feel free to chime in :) Anyway, these are just some thoughts. |
If you want to move on to another collaborative paper in deep learning for medicine, try: “DiversityNet: a collaborative benchmark for generative AI models in chemistry” |
@mostafachatillon : thanks for raising that. It might be more appropriate to raise this as a new issue since your point doesn't relate directly to the future of this paper. Also, note that your blog post has an inaccuracy. You say:
That is related to GitHub's native system for displaying markdown. Deep-review doesn't use that. It may also be the case that manubot, the build system for deep-review, doesn't yet support formulas. However, if that's the case you should correct the inaccurate link in your blog post. |
@agitter @cgreene Apologies for the lapse in response.
Agreed on the tooling. I've heard good things about waffle.io and had some success with Asana and Trello, which both integrate with GitHub as well. I'm not particularly opinionated on this so I would imagine whichever platform most contributors feel comfortable with or offers the lowest bar to access is the best way to go. I'd be happy to set up a survey, if that helps. Apart from GitHub issues, I've found it helpful and more easily manageable in tracking todos and PRs to batch issues by categories (rather than just tags). I'm not sure if the lab(s) adopted this approach but, for example, the different application domains/sub-domains in the paper could be a natural way to think of structuring these categories (e.g. gene expression vs. protein secondary and tertiary structure, etc.).
I favor the idea of 12 month release as well. It gives time to account for difficulties in scheduling and coordination for contributors and, given the speed of the field, it also provides time to understand a broader range of contributions and distinguish what might be meaningful work vs. flag-planting. |
@cgreene @agitter @stephenra I have used Asana and Trello in the past, and I am comfortable with using both. Tentatively, I would lean towards Asana because it seemed to be (at least to me) more flexible and feature-rich than Trello. However, I am not particularly familiar with integrating either of them with GitHub. Is there a way to use any of these project management tools in an "open" manner that allows people to view the current project status without necessarily signing up for an Asana/Trello/Whatever account and so on? At least with respect to content reviews and discussion, it is probably important to maintain this project's transparency. Obviously, the immediate goal is to finish the initial publication. The next step is to identify and enumerate the specific maintenance tasks, especially those that the current team needs the most help with. With regards to planning for long-term progress, it would also be useful to list any goals/problems that have come up but were too ambitious or not pressing enough for the initial publication. Thoughts? |
@evancofer I believe you can make Asana projects 'public' but this only makes the project viewable to others who are part of your organization but not necessarily a team member (as opposed to anyone, in general). On the other hand, Trello you can make publicly viewable to anyone and the project page will be indexed on Google. I do agree on the pt. about transparency -- to this end, I've worked on or seen some projects that use some combination of GitHub, Trello, and Gitter. The code/repo is on GitHub, the (public) project management is handled by Trello, and the community and chat is on Gitter. If that's too much added complexity, perhaps GitHub and Trello might be best. |
@stephenra Trello and GitHub seem like a good solution without too much added complexity. I'm thinking we could use Trello to track maintenance etc. and keep discussion on GitHub (and continue to use issue labels and other features to track and organize). |
@evancofer That sounds reasonable to me. 👍 |
If you have not played around with http://waffle.io I would encourage you to give it a shot. I made a deep-review waffle. It is an overlay on github issues, so it's convenient to work with in this framework: At this stage, I think we really need 2-3 committed maintainers to develop a new plan, update the README with the plan, and then start to take over the project with the goal of publishing a new release at some point in 2019. |
I went through all issues up to 100 and I closed them if we had referenced the paper or if it was a discussion that had concluded. |
@stephenra @cgreene The waffle.io view on the project should work fine. Like Casey said, we should probably find some more committed maintainers interested in long-term work if this is going to be successful. Contributors were obviously a good place to start, but I am unsure where to search next? I'll get working on an update to the README and submit a PR sometime this evening. This will probably include a status update and a new section about the future of the project. |
It might be nice to think about an authorship model where people "decay" towards the middle after a release. The current set of authors would be the "middlest set" of the next release (unless they contribute) and new authors would bookend them. I'd imagine maintainers at the end with the other author classes on the front. If people understand how these items will be handled, it might help to draw in new contributors. I'm also happy to promote the work towards a 2019 release, and I'll even commit to a bit of writing (though at this time I'd prefer not to be a maintainer 😄 ). It sounds like @evancofer and @stephenra might be interested. Maybe you could snag a third so that votes are resolved via tiebreak, although @agitter and I did survive the pairing. |
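One possible reading of the "decay towards the middle" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an agreed policy: the function name `next_release_order` and its three inputs are hypothetical, and it assumes that contributors to the new release go first, previous authors who did not contribute again sit in the middle, and the current maintainers close the list.

```python
def next_release_order(previous_authors, new_contributors, maintainers):
    """Order authors for the next release (hypothetical sketch).

    previous_authors: author list of the last release, in its published order
    new_contributors: people who contributed to this release, in desired order
    maintainers: maintainers of the current version, in desired order
    """
    maintainer_set = set(maintainers)
    contributor_set = set(new_contributors)

    # Contributors to this release lead the list (maintainers are handled below).
    front = [a for a in new_contributors if a not in maintainer_set]

    # Previous authors who did not contribute again "decay" to the middle,
    # keeping the relative order they had in the last release.
    middle = [
        a for a in previous_authors
        if a not in contributor_set and a not in maintainer_set
    ]

    # Maintainers of the current version take the final positions.
    return front + middle + list(maintainers)
```

With placeholder names, `next_release_order(["a", "b", "c"], ["d", "b"], ["e"])` gives `["d", "b", "a", "c", "e"]`: `b` moves forward by contributing again, while `a` and `c` settle in the middle.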
It does seem prudent to get a third person. Most of the people that come to mind are in my current lab or department, so - out of fairness - I am somewhat hesitant to recommend any of them. It may be best (in terms of ethics and effort) to, as you say, append them in a semi-randomized order. Perhaps we could do this at the end of every month (or some other period of time)? I imagine this could incentivize repeat contributions. Perhaps it would be useful to use a semi-random hierarchical grouping again? Was manually determining author hierarchies time consuming or maintainable? |
I agree that it is important to think about authorship, how new contributors will be recognized and incentivized, and what will happen to the existing contributors in a However, if you don't find a third maintainer, I'd be willing to help with tie-breaking in special circumstances.
We only did this twice, so it wasn't too onerous. We also kept the categories broad to help. It did require considerable manual effort because we reviewed commits as well as contributors' discussion in issues and pull requests. I was initially working toward fully automating the author randomization but stopped once Manubot became a separate project. The deep review author ordering was too specific to this collaborative process. A fully automated ordering for Manubot should probably take some unordered author |
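For illustration, the randomization step on its own is small once the categories have been assigned by hand. The sketch below is an assumption about how such a step could look, not the procedure actually used for deep review or anything provided by Manubot: `randomized_author_order`, the `(name, authors)` input shape, and the seed handling are all hypothetical.

```python
import random


def randomized_author_order(categories, seed):
    """Shuffle authors within each category, keeping category order fixed.

    categories: list of (category_name, [author, ...]) pairs, already in the
        order the categories should appear in the author list (hypothetical
        input format).
    seed: any value accepted by random.Random, published alongside the list
        so that the ordering can be reproduced.
    """
    rng = random.Random(seed)
    ordered = []
    for _name, authors in categories:
        # Sort first so the result depends only on the seed,
        # not on the order in which authors were entered.
        group = sorted(authors)
        rng.shuffle(group)
        ordered.extend(group)
    return ordered
```

Deciding which category each contributor belongs to is the part that needed manual review of commits and discussion; the code only covers the shuffle that follows.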
Interesting discussion! The guidelines for the "Living Data" paper on the Global Carbon Budget might be useful: http://www.globalcarbonproject.org/carbonbudget/governance.htm#gov2.5 The dataset and paper with Carbon Emissions are updated yearly, but the paper (and data) stay partly the same. The practices with regards to authorship might be very different there due to nature of the project and different fields, but the comments on citation recommendation and "self-plagiarism" seem worth considering ... |
Interesting parallel @rgieseke Going forward, we could also bring more organization to the issues. Would adopting |
@agitter that seems like a good idea. I feel like summarizing or discussing any of the articles with an issue might make good first issues. Also, revisions seem like the central focus of the second release, so creating issues for said revisions (e.g. #847) is probably a good way to elicit meaningful contributions. Perhaps we could do this for each section we felt needed work? Some of the existing issues (e.g. #598) could also be broken down by subsection into more manageable tasks. I'm not sure about the best way to do this however, and it could just result in too many issues. |
@agitter @evancofer Yes, I think those would be useful labels to have as well. I've gone ahead and created labels for I like the idea of creating issues for each section. I think in terms of management, that structure lends itself to being more easily identifiable/accessible for would-be contributors. From a gut feeling, I think breaking down into different subsections might slowly lead to issue creep as you pointed out. |
@evancofer @stephenra If you still need a third maintainer, I would be happy to help. This has been great work, and I would be happy to contribute to future versions. |
@nafizh Yes, your help would be greatly appreciated! |
@evancofer Is there an explanation for the labels? I understand most of them are self-explanatory, but I am confused about some of them, for example, paper, treat, study or next. |
@nafizh some of the labels come from https://waffle.io/greenelab/deep-review We may want to feature that more prominently in the readme so that it isn't buried in this thread.
The new maintainers should feel welcome to change the label organization. |
I agree with @agitter that a reexamination of the labels would be in order. I'll note only that |
I agree with @agitter that we should delete the I also think we should assume that, unless otherwise marked, an issue corresponds to a paper. Some divisions that come to mind are: community discussion/feedback and project updates, build/orchestration issues, and content revisions? To revisit our earlier discussion of issue prioritization, I think some good labels might be: |
Thanks @evancofer. Agree with @agitter @cgreene as well. I'm OK deleting those labels ( |
I'd recommend using the waffle labels for state where provided - it'll mean that things will look nice on the waffle regardless of how those labels get assigned - so sounds like that one doesn't need to be added |
Thanks @cgreene. For priority labels, I'd prefer having at least three (e.g. |
@stephenra Yes, that is a better syntax. |
@evancofer @nafizh Great. The priority labels are now all added. |
What can we do to drum up more discussion in the issues? Aggregating new articles on deep learning is essential, but I get the feeling that we will lose momentum without continued and consistent contributions in the form of discussion and writing as well. Perhaps we should set some very easily attainable goals for activity/discussion/editing contributions? I am currently reviewing and drafting some bits on the various genomics sections (e.g. splicing, variant calling, sequencing), but I realize that there are many other sections of the review that may need attention. |
From my experience the path is to start writing + get some small wins in (adjustments to specific sections, etc). Then we can start tweeting about those to build more momentum. If the community is active and the topic remains of interest (as I suspect this one does), I think that's what it'll take. Right now it's unclear if the project is really alive or not, which may make it hard to draw in contributors. |
I completely agree with @cgreene. If momentum is restored, it could also help to use issues to recruit contributors to work on specific small sections that need updates. (Though I tried this with #847 and it didn't go anywhere.) A minor idea would be to rebrand the review. We could adopt the style used in database papers (e.g. The UCSC Genome Browser database: 2018 update) and add |
Major +1 for adding |
I have been following this for a while and I'd like to contribute but I'm not sure what the best way to do so is. I'm more than happy to link my papers (or those I stumble upon) and discuss them. However, I'm not sure what the goal of discussion in the issues is. Is it to talk about our thoughts / critiques on the methods or to discuss how to best incorporate it into the paper? |
@jmschrei both are goals of the issues. @evancofer has some recent examples (e.g. #886) of discussing and critiquing methods. The intent is that this helps us decide what we want to say about a paper if/when we add it to the review. I'm also proposing that using issues could help restart the writing effort by opening a discussion topic, discussing what should be written, and then making a pull request. For example, there have been several new methods about autoencoders for single-cell RNA-seq data. I could open an issue that notes we only reference two of these in the current review. Then we could reassess the state of the subarea with an updated assessment of what has been done well and what challenges remain. Ideally other contributors would help provide relevant papers and form a consensus opinion. We haven't had many issues of this type yet, but I'm hoping it could help re-engage our contributors (past and future). |
@agitter I agree that this is probably the optimal next step. |
I think you've already made your plans but just checking that you are aware of http://www.livecomsjournal.org/, the Living Journal of Computational Molecular Science (c.f. @davidlmobley). |
Thanks for the pointer @baoilleach. We did see the Living Journal of Computational Molecular Science and reference it in our manuscript on the Manubot system for collaborative writing https://greenelab.github.io/meta-review/ We'd be happy to discuss that platform versus Manubot more with you and @davidlmobley, but I suggest taking that conversation to a new issue in https://github.com/greenelab/meta-review |
Just in case: the 2019 edition shouldn't be published without mentioning AlphaFold for structure prediction by Google/Alphabet. They simply crushed everyone else as a first-time entrant to the CASP protein structure prediction competition. |
@bachev it's definitely relevant. Please open a new issue in this repo if you'd like to discuss AlphaFold. It may be hard for us to write too much about it until they have a complete description of the method. For now, this blog post and comments from Jinbo are the most informative. |
Hi all, A phenomenal read, and thanks for all the amazing work. Is it still possible to enrich or enhance this work? If so, can we include some of the deep learning publications from 2019 in the single-cell section? Some papers that I did not see there (and that could be added) have already contributed new information on DL and single cell:
Some of my personal notes are here, but I have not developed them over the past few months; they cover the single-cell space and some general ML/DL info: For now, the above are what came to mind to enrich the single-cell coverage in the current review, if possible. I did see the mention of autoencoders by @agitter as well. I will be happy to contribute if possible and if all agree with what I have proposed for now. (PS: let me know if the above information fits in the scope here). Kind regards, Vivek |
Thanks for the suggestions @vd4mmind. Those topics are certainly in scope, and there has been a lot of recent work in the area that is not covered in the current version of the review. However, it's unclear how much actual updating there will be to this review. We've found that we need to have good editors/reviewers lined up if we're going to make major additions or revisions so that pull requests don't languish. My latest thoughts are that we should drop the "2019 update" part of the title and say something instead about this being the living version or post-publication update. Ideally we would also have a better way to show a rich diff of what has changed since publication (manubot/rootstock#54), a more dynamic way of adding authors frequently (#959), and a preferred way for readers to refer to the specific version they read or cited. Finally, to help keep track of papers, we've been opening one issue per paper. The format in #940 shows an example with the title in the issue title, abstract quoted, and DOI link. |
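The one-issue-per-paper convention described above (paper title as the issue title, quoted abstract, DOI link) can be sketched as a small helper. This is a hypothetical illustration only: `format_paper_issue` is an invented name, the exact layout of #940 is not reproduced here, and the DOI in the usage note is a placeholder.

```python
def format_paper_issue(title, abstract, doi):
    """Return (issue_title, issue_body) for a one-issue-per-paper entry.

    Hypothetical helper: the paper title becomes the issue title, and the
    body holds a DOI link followed by the abstract as a markdown blockquote.
    """
    # Prefix every abstract line with ">" so it renders as a quote.
    quoted = "\n".join(
        "> " + line if line else ">"
        for line in abstract.strip().splitlines()
    )
    body = f"https://doi.org/{doi}\n\n{quoted}\n"
    return title.strip(), body
```

For example, `format_paper_issue("Some paper title", "First line.\nSecond line.", "10.1234/placeholder")` returns the title unchanged and a body that starts with the DOI link and then quotes both abstract lines.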
We resubmitted version 0.10 to the Journal of the Royal Society Interface and bioRxiv. Thanks everyone for all the great work on the revisions!
I'd like input on where we want to go from here. Should we continue to accept edits to the manuscript even after a version is accepted at a journal? Should we accept only errata corrections but lock the main content?
I don't want to dissolve this amazing group of authors. However, there isn't much precedent for living manuscripts that continue to change after publication, and realistically we are all very busy with other projects. The activity dropped off considerably between the first submission and the reviewer feedback.