deblur 2021.09 #3141
qiita_pet/support_files/doc/source/processingdata/deblur_2021.09.rst
general rule of thumb, as a first analytical pass for meta-analysis for 16S data, we use
5,000 sequences per sample and we prefer 150 base pair trimming. Thus, we directly
contacted all study owners that would recover more than 5% of the samples in their study
(total 24).
Would it be OK to recommend to users that they reach out to qiita-help if they are concerned about how this affected them?
Co-authored-by: Yoshiki Vázquez Baeza <yoshiki@ucsd.edu>
Sample counts implications
--------------------------

At the time of writing Qiita had 978,052 16S deblurred private or public samples.
unclear if you refer to the time of writing the bogus parser or this text
this text, I can add more text to make it clearer ...
The figure below shows, at different trimming lengths, how many samples we recover
based on the minimum number of sequences per sample - this is an important consideration
as we normally need to remove samples below a given threshold for beta diversity
calculations (via rarefaction) or differential abundance testing.
This only applies if rarefaction was based on Faith's PD; all other metrics should not have been affected, provided you did not additionally filter the deblur tables provided by Qiita down to those features contained in the insertion tree file (which comes together with the biom file in Qiita).
Good point, but this is true only for the full/all table, not for the reference-hit table, because that table needs to be filtered to match what's in the tree, which is done automatically via the meta-analysis construction in Qiita. Do you have any specific text suggestions to cover this?
I tried to add some more information about this at the end of the intro paragraph, please let me know if that works.
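As a rough illustration of the sample-recovery counts discussed in this thread, the sketch below tallies how many samples meet a minimum sequences-per-sample threshold at each trimming length. The function name `samples_recovered`, the trim lengths, and the per-sample depths are all hypothetical; none of them come from Qiita or from the figure in the document.

```python
from typing import Dict, Iterable


def samples_recovered(depths: Iterable[int],
                      thresholds: Iterable[int]) -> Dict[int, int]:
    """Count samples whose sequence depth is at least each threshold."""
    depths = list(depths)
    return {t: sum(1 for d in depths if d >= t) for t in thresholds}


# Hypothetical per-sample sequence counts, keyed by trim length (bp).
depths_by_trim = {
    90: [12000, 7000, 5200, 4800, 900],
    150: [11000, 6500, 4900, 4100, 300],
}

# Hypothetical minimum-depth thresholds to compare.
thresholds = [1000, 5000, 10000]

recovered = {
    trim: samples_recovered(depths, thresholds)
    for trim, depths in depths_by_trim.items()
}

for trim, counts in recovered.items():
    for threshold, n in counts.items():
        print(f"trim={trim} min_depth={threshold} samples={n}")
```

With these made-up depths, fewer samples clear the 5,000-sequence threshold at the longer trim, which is the kind of trade-off the figure is meant to show.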
- 96.6% of preparations had 0-10% of features lost
- 12.6% had 10-20% of the features lost
- 9.7% 30-40%
- 6.9% 40-50%
I think it's important to give the full list, all the way to 90%.
- 6.9% 40-50%

Remember that the percentages reported above are inclusive at the next level; for example, the studies with
40-50% lost are also accounted for at lower levels.
Maybe also include a comment that we did not find any strong patterns among the studies that were most greatly affected, whether they were from a specific sample type (according to empo category) or target 16S variable region.
All requests are in suggested changes so if good should be quick to merge
Thank you @wasade! Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>
Thanks @antgonza!
Thank you all for the feedback.