CG CHG and CHH with --comprehensive flag using bismark workflow #388
Comments
Hi MMJ, I am not from the nf-core/methylseq team. Out of curiosity I dug into your issue and thought I would share my findings while you are still awaiting a response from the team.
Assuming you are on the main GitHub page (https://github.com/nf-core/methylseq), click on Code --> conf --> modules.config and go to line 209, which looks like this:
params.comprehensive ? ' --comprehensive --merge_non_CpG' : '',
If this line is changed to:
params.comprehensive ? ' --comprehensive' : '',
then using the flag <--comprehensive> in your command should, I believe, give you separate results for CHG and CHH. If you want the combined result, use <--comprehensive --merge_non_CpG> instead.
To implement this suggestion, you will have to find the file <modules.config> in your local setup. I am not sure whether this is the only file that needs to be changed to achieve your objective or whether other files need to be modified as well. Hope this helps
Hi @MMJBerger, I agree that it is a bit confusing to have a flag with the same name as Bismark's that does more than advertised. However, I'm a bit wary of changing it, as it would be a breaking change that could affect existing users who upgrade. Any suggestions on new flags that we could add progressively to get what you want? Phil
@deep-buddingcoder you're on the right track for how to customise pipeline behaviour. But note that you should never need to edit the pipeline source code to adjust configuration: you can leave that alone and override it with your own local configs. See the docs here: https://nf-co.re/docs/usage/configuration#custom-configuration-files
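For readers following along, a local override of the kind described above might look like the minimal sketch below. It assumes the process is named BISMARK_METHYLATIONEXTRACTOR, as it is referred to elsewhere in this thread, and that the file is saved under the hypothetical name custom.config. Note that setting ext.args this way replaces the whole argument string the pipeline would otherwise build for that process, so any other extractor options you rely on have to be added back by hand.

```groovy
// custom.config (hypothetical file name)
// Sketch only: overrides the arguments passed to the Bismark methylation
// extractor so that --merge_non_CpG is no longer appended.
process {
    withName: 'BISMARK_METHYLATIONEXTRACTOR' {
        // Assumption: only --comprehensive is wanted here. Other options the
        // pipeline normally adds to this process are dropped by this override.
        ext.args = '--comprehensive'
    }
}
```

The file can then be passed to Nextflow with the -c option (for example, -c custom.config) alongside the usual pipeline parameters, leaving the pipeline source untouched.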
Hi @deep-buddingcoder, hi @ewels! Thank you for your answers. We actually managed to get the CG, CHG and CHH methylation_extractor files independently by simply removing the flag "--merge_non_CpG" in the BISMARK_METHYLATIONEXTRACTOR section (modules.config), and it worked very well!
However, @ewels, I have another question: we could not obtain separate files for the rest of the pipeline. We tried to use the --CX flag to get independent cytosine reports and coverage files for each context, without success. We tried modifying the BISMARK_COVERAGE2CYTOSINE step by adding the '--CX' flag to "ext.args" (modules.config), but it didn't change anything: we get the files, but not separately for each sequence context.
results/bismark/methylation_calls/methylation_calls
results/bismark/methylation_calls/methylation_coverage
results/bismark/coverage2cytosine/reports
results/bismark/coverage2cytosine/coverage
Is there any way to act on this and get CG, CHG and CHH files for each sample in ../methylation_calls/methylation_coverage and ../coverage2cytosine/reports?
@ewels I would suggest adding a flag, or maybe a setting, that would allow us to get all of this information separately if wanted (and if it's possible). For plant analyses we need to consider methylation in each context independently, as each context and its related changes are informative and associated with specific types of chromatin regulation. It would really help a lot to get these outputs directly from the pipeline instead of running another script to extract the information afterwards.
Thank you again for your help,
Hi Phil, with respect to your message:
Thanks a lot for the suggestion about "your own local configs". It helps me better understand how the source code works and how custom changes should be implemented. Best
Hi Margot, I am glad to know that the suggestion worked for you. As Phil mentioned above, it may not be the best strategy for implementing a custom requirement. You may want to look into using a custom configuration file, as Phil suggested, to avoid fiddling with the source code. Best
Hi Margot, it is true that Bismark is written primarily with mammalian samples in mind, and the typical procedure only runs a single coverage/bedGraph conversion (limited to the CpG context by default). The nf-core/methylseq workflow even goes one step further by merging CHG and CHH contexts into a single non-CG context (somewhat counter-intuitive to what you would expect from the option). As a plant person, you would essentially have to run
If you would also like the genome-wide reports, this would then have to be followed by running
While
So in short:
Hi @FelixKrueger! Thank you for your answer. Indeed, we used to run things as you explained above in our analyses to get coverage files and reports separately. However, the possibility of having everything done automatically on all samples with the nf-core/methylseq workflow was too appealing not to ask whether it could be implemented 'easily' (by that I mean without any deep code modifications) in the pipeline. Thank you all for your help, and yes, if that's possible, we plant folks (and others, I am sure) would be very grateful if such an option were implemented (--Ent 🌱?)
Hi @MMJBerger, would you be interested in working with me on integrating this option?
Hi @sateeshperi, I am a newbie in bioinformatics and coding, but if you think I can contribute or help in any way, I would be glad to work with you on integrating this option!
Hi @MMJBerger, this could be your first contribution to methylseq! If you have time, I would be happy to guide/co-work/test this feature with you to get it integrated for all you Ents out there. Could you reach out to me on the nf-core #methylseq channel so we can coordinate better? Thanks
Description of feature
Hello,
Would it be possible not to instruct the Bismark methylation extractor to use both the --comprehensive and --merge_non_CpG flags when specifying --comprehensive to the pipeline?
This is not a big issue, since it is possible to act on it (thank you for your answer on issue #372). However, in plants the three methylation contexts are (almost) always considered independently in methylome analysis, and I found it a little confusing to get 'only' "CpG" and "non_CpG" results from the Bismark workflow when using the same flag as the one Bismark itself uses to access the CHG and CHH contexts independently.
I think an easier way to choose directly in the pipeline whether 'non_CpG' should be merged or not when using the Bismark workflow would be a plus for plant biologists!
Best regards,