Is it possible to use isOnclust on multiple samples? #22

Open
ghost opened this issue Aug 19, 2022 · 4 comments

Comments

@ghost

ghost commented Aug 19, 2022

Hi!
I am working with nanopore samples. I successfully ran isONclust on a single sample, but I need to compare several samples. Is it possible to use isONclust on multiple samples?

@ksahlin
Owner

ksahlin commented Aug 19, 2022

It depends on what your input data is and what the desired output of the analysis is.

Are all the reads within one file but with sample-specific barcodes? Then maybe perform an initial barcode clustering/trimming with e.g. pychopper, and then run isONclust on each of the sample-specific files.

Otherwise please describe the data and the desired outcome in more detail.

@ghost
Author

ghost commented Aug 22, 2022

Hi, thank you very much for your prompt answer.
I have 10 separate FASTQ files, one per sample.
I would like to use isONclust, but then I would like to compare one sample against another. The point is that if I run isONclust on every single sample, I do not know how to compare the clusters across the samples. Briefly, I would like a single "OTU table" covering all 10 samples.
I hope I have been clear.

Thanks a lot

@ksahlin
Owner

ksahlin commented Aug 24, 2022

Not sure it is possible to create an OTU table without some scripting. One idea:

  1. Label all your reads in the headers (sample 1 gets a _1 appended, etc) for uniqueness.
  2. Combine all reads into one big file.
  3. Cluster the big file with isONclust.
  4. Parse the isONclust csv output file. The file contains one line per read, giving the 'cluster representative' that read belongs to. Summing the reads per sample from this file gives you the abundance of each sample in each cluster.
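
A minimal Python sketch of steps 1 and 4 above, in case it helps with the scripting. It is not part of isONclust. The function names `label_fastq` and `otu_table` are made up for illustration. It assumes plain 4-line FASTQ records, that the cluster file has the cluster ID in the first column and the read ID in the second, that the columns are tab-separated (pass another `delimiter` if your file differs), and that the read IDs come out of the clustering unchanged, so the `_<sample>` suffix is still the last part of the ID.

```python
import csv
import io
from collections import Counter, defaultdict


def label_fastq(in_handle, out_handle, sample):
    """Step 1: append _<sample> to the read ID on every FASTQ header line.

    Assumes strict 4-line records, so headers are lines 0, 4, 8, ...
    (this avoids mistaking a quality line starting with '@' for a header).
    """
    for i, line in enumerate(in_handle):
        if i % 4 == 0:
            # Header looks like '@read_id optional description'.
            parts = line.rstrip("\n").split(maxsplit=1)
            parts[0] = f"{parts[0]}_{sample}"
            line = " ".join(parts) + "\n"
        out_handle.write(line)


def otu_table(cluster_handle, delimiter="\t"):
    """Step 4: count reads per (cluster, sample).

    Assumed row layout (hypothetical, check your own file):
    cluster_id <delimiter> read_id
    """
    counts = defaultdict(Counter)
    for row in csv.reader(cluster_handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        cluster_id, read_id = row[0], row[1]
        # The sample label is whatever follows the last '_' in the read ID.
        sample = read_id.split()[0].rsplit("_", 1)[-1]
        counts[cluster_id][sample] += 1
    return counts
```

Steps 2 and 3 stay as described: write all labelled files into one combined FASTQ and cluster that. The result of `otu_table` maps each cluster ID to per-sample read counts, so `counts["0"]["3"]` would be the number of reads from sample 3 in cluster 0, and writing it out as a clusters-by-samples table is a short loop.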

@ghost
Author

ghost commented Aug 24, 2022

It sounds good.
Thanks for the suggestion!
