Cluster labels missing, and where to find annotations for 68K PBMC? #3

Open
lynnyi opened this issue Aug 8, 2018 · 7 comments

Comments

@lynnyi

lynnyi commented Aug 8, 2018

Hi,

I'm looking for annotations for the 68K PBMC dataset that corresponds to Fig 3 in Zheng et al.

I downloaded the k-means clustering labels for the 68K PBMCs from this site (https://support.10xgenomics.com/single-cell-gene-expression/datasets/1.1.0/fresh_68k_pbmc_donor_a), but the 10-cluster .csv file only had cluster labels for 40,000 cells, not the full dataset.

Furthermore, I notice that the numbering scheme for these clusters does not match the numbering scheme for Figure 3 in Zheng et al., making it difficult to assign cell types to the cluster numbers. Could you also provide the annotations for all 68K cells?

Thank you,
Lynn

@gokceneraslan

It'd be really nice to save labels as tsv in this repo, I fully agree.

But for now, one option is to use the file in the scanpy tutorial here. The other option is to rerun this script in this repo, which will generate cluster labels by correlating single-cell expression with purified samples.
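
As a rough illustration of what "correlating single cell expression with purified samples" can look like, here is a minimal pure-Python sketch. The reference profiles, the expression values, and the helper names `pearson` and `assign_label` are hypothetical, and Pearson correlation is an assumption made for simplicity; the script in the repo may use a different correlation measure and different preprocessing.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant vector; treat as 0
    return cov / sqrt(vx * vy)


def assign_label(cell, references):
    """Return the name of the reference profile that correlates best with `cell`."""
    return max(references, key=lambda name: pearson(cell, references[name]))


# Toy data (made up): mean expression of 4 genes in two purified populations.
references = {
    "CD19+ B": [9.0, 1.0, 0.5, 0.2],
    "CD8+ Cytotoxic T": [0.3, 0.8, 7.0, 6.0],
}
cell = [5.0, 0.0, 1.0, 0.0]  # one cell's expression over the same 4 genes

print(assign_label(cell, references))
```

With real data, `references` would hold one averaged profile per purified population and `assign_label` would be applied to every cell in the matrix.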

@lynnyi
Author

lynnyi commented Aug 9, 2018

Thanks! I'm guessing the scanpy labels are the result of a de novo clustering analysis different from the 10x one, since the scanpy labels don't seem to match the 10x labels.

For example, the first 5 10x labels:
Barcode,Cluster
AAACATACACCCAA-1,2
AAACATACCCCTCA-1,3
AAACATACTAACCG-1,6
AAACATACTCTTCA-1,3
AAACATACTGTCTT-1,2

The 1st and 5th cells should be in the same cluster, but the first 5 scanpy labels are:
CD8+ Cytotoxic T
CD8+/CD45RA+ Naive Cytotoxic
CD4+/CD25 T Reg
CD19+ B
CD4+/CD25 T Reg

I'll take a look at the script and the solution that Magnus mentioned on Twitter.

@gokceneraslan

gokceneraslan commented Aug 10, 2018

That's because the cell order is different:

image

Here is the full file with barcodes and labels as tsv: zheng17-cell-labels.txt

@gokceneraslan

gokceneraslan commented Aug 10, 2018

FYI, the barcode order in the scanpy file follows the barcode order in the http://cf.10xgenomics.com/samples/cell-exp/1.1.0/fresh_68k_pbmc_donor_a/fresh_68k_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz file.

The labels are not from de novo clustering; they are based on correlation with 11 purified bulk samples, same as the R script.
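
Since the two files list cells in different orders, labels need to be matched by barcode rather than by row position. The following is a minimal sketch of such a join using only the Python standard library. The inline snippets are made-up stand-ins for the 10x cluster file and a barcode/label file: the pairing of barcodes with cell types is invented for illustration, and the two-column tab-separated layout of the label file is an assumption.

```python
import csv
import io
from collections import Counter, defaultdict

# Stand-in for the 10x k-means file (same "Barcode,Cluster" header as above).
clusters_csv = """Barcode,Cluster
AAACATACACCCAA-1,2
AAACATACCCCTCA-1,3
AAACATACTAACCG-1,6
"""

# Stand-in for a barcode/label TSV, deliberately in a different row order.
# Hypothetical content: the barcode-to-label pairing here is invented.
labels_tsv = """AAACATACTAACCG-1\tCD19+ B
AAACATACACCCAA-1\tCD8+ Cytotoxic T
AAACATACCCCTCA-1\tCD4+/CD25 T Reg
AAACATACTGTCTT-1\tCD19+ B
"""

cluster_of = {row["Barcode"]: row["Cluster"]
              for row in csv.DictReader(io.StringIO(clusters_csv))}
label_of = dict(csv.reader(io.StringIO(labels_tsv), delimiter="\t"))

# Join on barcode, then count cell-type labels within each cluster.
counts = defaultdict(Counter)
for barcode, cluster in cluster_of.items():
    if barcode in label_of:
        counts[cluster][label_of[barcode]] += 1

# Barcodes that have a label but no cluster assignment (for example, when the
# cluster file covers fewer cells than the label file).
unclustered = [b for b in label_of if b not in cluster_of]

for cluster in sorted(counts):
    label, n = counts[cluster].most_common(1)[0]
    print(cluster, label, n)
print("labelled but unclustered:", unclustered)
```

The most common label per cluster gives a rough way to map cluster numbers to cell types, and `unclustered` shows which cells are covered by only one of the two files.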

@Khalid-Usman

@gokceneraslan I have different barcodes for PBMC 2700. Could you please share the file for it? Thanks.

@namratabhattacharya

@gokceneraslan Could you please help with cell type annotations for the 3K PBMC dataset? Kindly share the file for it.

@zhiiiyang

@gokceneraslan, thank you for sharing the annotation for 68k PBMC. Is that ground truth or manual annotation from unsupervised clustering?
