CellTypist wiki

Materials and scripts for building the cell type encyclopedia table.

Generate newest json file from latest models

Make sure the latest models have been uploaded to s3://celltypist/models/*/. Put all shareable models (a subset of the models on S3) in a local folder, then run the following:

python src/generate_json_from_latest_models.py /path/to/local_model_folder

(All settings are in the 'setting here' section of the script; in most cases they do not need to be changed.)

The new JSON file is written to json/models.json. Upload it to s3://celltypist/models/.
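Before uploading, it can help to confirm that the local folder holds the expected model files and that the generated file parses as JSON. The following is a hypothetical pre-flight check, not part of the repository. It assumes models are stored as .pkl files, and the helper names list_model_files and load_models_json are illustrative only.

```python
import json
from pathlib import Path


def list_model_files(folder, suffix=".pkl"):
    """Return the sorted names of model files directly inside `folder`.

    The ".pkl" suffix is an assumption about how the models are stored.
    """
    return sorted(
        p.name for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == suffix
    )


def load_models_json(path="json/models.json"):
    """Parse the generated file; raises json.JSONDecodeError if it is malformed."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling list_model_files("/path/to/local_model_folder") before running the script, and load_models_json() afterwards, gives a quick check on both ends of the step.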

Generate encyclopedia database with tissue, dataset, and marker information

Run the following command with the atlas name (e.g. Pan_Immune_CellTypist) and version (e.g. v1).

python src/generate_encyclopedia.py an_atlas_name a_version

All settings can be found in the configuration file specific to each atlas (atlases/an_atlas_name/a_version/config/Encyclo.config), including:

filter_out: cell types with fewer than filter_out cells in a tissue-dataset combination are removed from that combination (the cell type is treated as absent from the given tissue and dataset).
model: the model from which top marker genes are extracted. Make sure the model of interest is exported in CellTypist (or use a local model).
no_celltypes: number of cell types, cross-checked against the meta csv file and the model.
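The filter_out rule can be illustrated as follows. This is a minimal sketch, not the script's actual implementation. It assumes each metadata row is a (cell type, tissue, dataset) triple, as in celltypist_meta.csv; the function name kept_combinations and the toy labels are hypothetical.

```python
from collections import Counter


def kept_combinations(rows, filter_out):
    """Count cells per (cell type, tissue, dataset) and drop rare combinations.

    `rows` is an iterable of (cell_type, tissue, dataset) tuples, one per cell.
    Combinations with fewer than `filter_out` cells are removed.
    """
    counts = Counter(tuple(row) for row in rows)
    return {combo: n for combo, n in counts.items() if n >= filter_out}


# Toy data: 12 B cells in one tissue-dataset combination, 3 in another.
rows = [("B cells", "Spleen", "D1")] * 12 + [("B cells", "Liver", "D1")] * 3
kept = kept_combinations(rows, filter_out=10)
# Only the Spleen/D1 combination has at least 10 cells, so only it is kept.
```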

The four atlas-specific tables used during execution are described in the Images and Other tables sections below.

The resulting table is written to atlases/an_atlas_name/a_version/encyclopedia/encyclopedia_table.xlsx, and the database to atlases/an_atlas_name/a_version/encyclopedia/encyclopedia.db. Upload the database to s3://celltypist/atlases/an_atlas_name/a_version/.
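The per-atlas layout described above can be summarised in a small helper. This is only a sketch of the path convention stated in this section; the function name atlas_paths and the root parameter are hypothetical and do not exist in the repository.

```python
from pathlib import Path


def atlas_paths(atlas_name, version, root="."):
    """Build the config and output paths for one atlas and version."""
    base = Path(root) / "atlases" / atlas_name / version
    return {
        "config": base / "config" / "Encyclo.config",
        "table": base / "encyclopedia" / "encyclopedia_table.xlsx",
        "database": base / "encyclopedia" / "encyclopedia.db",
    }


paths = atlas_paths("Pan_Immune_CellTypist", "v1")
# paths["database"] is the file to upload after the script has run.
```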

Generate average and percent expression for gene expression heat map

Run the following command with the atlas name (e.g. Pan_Immune_CellTypist) and version (e.g. v1).

python src/generate_Heatmap_data.py an_atlas_name a_version

All settings can be found in the configuration file specific to each atlas (atlases/an_atlas_name/a_version/config/Heatmap.config), including:

adata_path: path to the AnnData.
tissue_column: cell metadata column specifying tissue/organ information.
celltype_column: cell metadata column specifying cell type information.
use_raw: whether to take the expression matrix from the .raw attribute of the AnnData.
filter_out: tissue-celltype combinations with filter_out or fewer cells are treated as non-existent (shown as black grid cells in the heat map).
do_normalize: whether to log-normalise the data (to 1e4); enable this if the AnnData is provided as raw counts.
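To make the normalisation and the two heat map quantities concrete, here is a minimal pure-Python sketch. It is not the script's implementation. It assumes that "to 1e4" means scaling each cell's total counts to 10,000 before a log(1 + x) transform, that percent expression is the share of cells with non-zero expression on a 0 to 100 scale, and that both quantities are computed per group of cells. The function names are hypothetical.

```python
import math


def log_normalize(cell_counts, target_sum=1e4):
    """Scale one cell's counts to sum to `target_sum`, then apply log(1 + x)."""
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]
    return [math.log1p(c * target_sum / total) for c in cell_counts]


def mean_and_percent(values):
    """Return (average expression, percent of cells with non-zero expression)."""
    n = len(values)
    mean = sum(values) / n
    percent = 100.0 * sum(1 for v in values if v > 0) / n
    return mean, percent


# Toy raw counts: two cells (rows) by three genes (columns).
cells = [[5, 0, 5], [0, 0, 10]]
normalized = [log_normalize(c) for c in cells]
gene0 = [cell[0] for cell in normalized]
mean0, pct0 = mean_and_percent(gene0)
```

In this toy group, the first gene is detected in one of the two cells, so pct0 is 50.0.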

Tissue and cell type orders are defined in atlases/an_atlas_name/a_version/Heatmap_data/tissue_order.txt and atlases/an_atlas_name/a_version/Heatmap_data/celltype_order.txt, respectively.
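As an illustration of how such order files might be applied, the sketch below reads an order list and sorts labels by it. The file format is an assumption (one name per line), and read_order and sort_by_order are hypothetical helpers rather than functions from the repository.

```python
def read_order(lines):
    """Return the non-empty, stripped entries from an order file's lines."""
    return [line.strip() for line in lines if line.strip()]


def sort_by_order(labels, order):
    """Sort labels by their position in `order`; unlisted labels go last."""
    rank = {name: i for i, name in enumerate(order)}
    return sorted(labels, key=lambda name: rank.get(name, len(order)))


# Toy example with made-up tissue names.
order = read_order(["Blood\n", "\n", "Spleen\n"])
ordered = sort_by_order(["Spleen", "Liver", "Blood"], order)
```

With a real file, the lines would come from open(path) instead of the inline list.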

The heat map data is written to atlases/an_atlas_name/a_version/Heatmap_data/exp_pct_celltypist.pkl. Upload it to s3://celltypist/atlases/an_atlas_name/a_version/.

Images

Images are stored as images/*.png, with a white background and a size of 842 x 736 pixels.
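The image size can be checked without extra dependencies by reading the PNG header. The sketch below relies on the standard PNG layout (an 8-byte signature followed by the IHDR chunk, whose first two fields are width and height). Reading "842 x 736" as width by height is an assumption, and png_size is a hypothetical helper.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian unsigned 32-bit integers.
    return struct.unpack(">II", data[16:24])


# Usage with a real file (the path is a placeholder):
#     with open("images/some_image.png", "rb") as handle:
#         width, height = png_size(handle.read(24))
```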

The correspondence between cell type names and images for a given atlas is given in atlases/an_atlas_name/a_version/tables/celltype_to_image.csv (no header row).

Other tables

atlases/an_atlas_name/a_version/tables/Basic_celltype_information.xlsx: free-text basic information on each cell type. The headers must be High-hierarchy cell types, Low-hierarchy cell types, Description, Cell Ontology ID and Curated markers.

atlases/an_atlas_name/a_version/tables/celltypist_meta.csv: cell metadata from which the tissue and dataset information is derived (e.g. adata.obs[['CellType', 'Tissue', 'Dataset']].to_csv('celltypist_meta.csv', header=True, index=False)). Header names are arbitrary, but the columns must be in this order: cell type, tissue, dataset.

atlases/an_atlas_name/a_version/tables/dataset_to_PMID.csv: the paper (PMID) or link for each dataset. No header row. Datasets without an available PMID can list a URL instead.
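One way to handle the mix of PMIDs and URLs in dataset_to_PMID.csv is sketched below. This is an assumption-laden illustration, not the repository's code: it assumes two columns (dataset name, then PMID or URL), treats any all-digit value as a PMID, and links it to PubMed. The helper names are hypothetical.

```python
import csv


def dataset_link(value):
    """Turn a PMID into a PubMed URL; pass URLs through unchanged."""
    value = value.strip()
    if value.isdigit():
        return f"https://pubmed.ncbi.nlm.nih.gov/{value}/"
    return value


def read_dataset_links(lines):
    """Map dataset name to link from headerless 'dataset,PMID-or-URL' lines."""
    return {
        row[0]: dataset_link(row[1])
        for row in csv.reader(lines)
        if len(row) >= 2
    }


# Toy rows with a made-up PMID and URL.
links = read_dataset_links(["D1,12345678", "D2,https://example.org/data"])
```

With a real file, pass the open file handle (opened with newline="") in place of the inline list.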

Teichlab/celltypist_wiki
