The JUMP-Target
set of perturbations comprises three lists of perturbations
JUMP-Target-Compound
– a list of compounds with diverse targets. See also JUMP-MOA for a similar compound set.JUMP-Target-ORF
– a list of ORF sequences corresponding to genes that are targets of compounds fromJUMP-Target-Compound
.JUMP-Target-CRISPR
– a list of sgRNA sequences corresponding to genes that are targets of compounds fromJUMP-Target-Compound
.
Both JUMP-Target-ORF
and JUMP-Target-CRISPR
have sgRNAs and ORFs corresponding to a set of 160 genes, each of which are targets of compounds in JUMP-Target-Compound
. JUMP-Target-ORF
has perturbations for an additional set of 15 genes that serve as negative controls (below).
Each of the 3 lists fit on a single 384-well plate; the suggested plate layouts are provided (below).
The target annotations were obtained from https://clue.io/repurposing.
This resource was created through the JUMP-Cell Painting Consortium.
Versioning
- There's a single version for the
JUMP-Target-CRISPR
andJUMP-Target-ORF
plates (JUMP-Target-1-CRISPR
andJUMP-Target-1-ORF
respectively) - There are two versions of
JUMP-Target-Compound
(JUMP-Target-1-Compound
andJUMP-Target-2-Compound
). Both have the same set of compounds, but have differentbroad_sample
IDs and different layouts.JUMP-Target-2-Compound
is used in the production of the JUMP dataset. We recommend using this layout because it will remove layout as a potential confounder when matching against the JUMP dataset. While thebroad_sample
IDs differ between versions, the compounds can be matched between versions using theirInChIKey
values, which remain identical for the same compound.
There are 301 compounds in total, of which 46 are included as controls serving different goals
poscon_orf
: compounds that correlate strongly with (overexpressed) genes in previous Cell Painting experiments. There are 6 such compounds, and their corresponding 3 genes are included inJUMP-Target-ORF
andJUMP-Target-CRISPR
poscon_cp
: compounds with a strong association with the genes that they target, according to ChemicalProbes.org. There are 26 such compounds, and their corresponding 13 genes are included inJUMP-Target-ORF
andJUMP-Target-CRISPR
.poscon_diverse
: pairs of compounds that are strongly correlated to each other, and weakly correlated to otherposcon_diverse
pairs, in previous Cell Painting experiments. There are 7 such pairs, so 14 compounds in total.negcon
: DMSO is the negative control.
The recommended concentration for these compounds is 5 uM
.
There are
- n=2 replicate wells for each of 14
poscon_diverse
compounds - n=64 DMSO wells
- n=1 well for the remaining compounds
The company Specs has assembled the compounds for purchase; for info contact tamara.baptist@specs.net
JUMP-Target-1_compound_metadata.tsv
JUMP-Target-2_compound_metadata.tsv
Column | Description |
---|---|
pert_iname | Compound name |
pubchem_cid | PubChem ID e.g. 72716071 |
target_list | List of genes targeted by the compound |
pert_type | Perturbation type – trt : treatment, control : one of the 3 controls |
control_type | Control type – poscon_orf , poscon_cp , or poscon_diverse |
smiles | Simplified molecular-input line-entry system (SMILES) string |
InChI | International Chemical Identifier corresponding to smiles |
InChIKey | InChIKey generated from InChI |
smiles
was standardized using JUMP-CP Standardizer.
See #34 for details about the standardization.
JUMP-Target-1_compound_platemap.tsv
JUMP-Target-2_compound_platemap.tsv
Column | Description |
---|---|
well_position | Well position |
broad_sample | Compound ID in Broad Institute's compound management database |
solvent | Solvent (always DMSO) |
- 301 compounds
JUMP-Target-1_Compound
:- Compound Request Number: CR-12417 - placed by Niranj Chandrasekaran - $10,404
- 5 milliMolar 13.75 uL total volume (5 uL dead volume)
JUMP-Target-2_Compound
:- Compound Request Number: CR-12908 - placed by Niranj Chandrasekaran - $???
- 5 milliMolar 13.75 uL total volume (5 uL dead volume)
- This plate was created because CDoT was unable to create plates with the same plate map as
JUMP-Target-1
for the production experiments. Thus, the compounds inJUMP-Target-2
are identical to the compounds inJUMP-Target-1
but the plate maps are different. Thebroad sample
IDs of some compounds are also different.- To map the compounds in
JUMP-Target-1
toJUMP-Target-2
useInChIKey
for "joining".
- To map the compounds in
There are 335 sgRNAs (corresponding to 175 genes) in total, of which 88 sgRNAs are included as controls serving different goals
poscon_orf
: the corresponding genes targets of theposcon_orf
compounds. Total: 3 genes, where 2 genes have 2 sgRNAs each and 1 gene has 1 sgRNA = 5 sgRNAs total.poscon_cp
: the corresponding genes targets of theposcon_cp
compounds. Total: 13 genes, where 12 genes have 2 sgRNAs each and 1 gene has 1 sgRNA = 25 sgRNAs total.poscon_diverse
: the corresponding genes targets of theposcon_diverse
compounds. Total: 14 genes x 2 sgRNAs per gene = 28 sgRNAs.negcon
: 30 unique sgRNAs that serve as negative controls because they are either non-targeting (NO_SITE
) or target intergenic region (ONE_INTERGENIC_SITE
).negcon
sgRNAs have an additional annotation e.g.NO_SITE (5 zeros)
. The number of zeros indicates how deep into the "threat matrix" a particular sgRNA makes it before a single match is found. An sgRNAnegcon
with higher number of zeros is less likely to have off-target effects and therefore may be a better negative control.
For the remaining treatment genes (non-controls):
- 117 genes have 2 sgRNAs each
- 13 genes have 1 sgRNA each
There are
- n=2 replicate wells for each of
negcon
s, as well as for 1poscon_cp
sgRNA, 1poscon_orf
sgRNA, and 13 other sgRNAs - n=1 well for each remaining sgRNAs
- n=4 empty wells
JUMP-Target-1_crispr_metadata.tsv
Column | Description |
---|---|
broad_sample | sgRNA id in Broad Institute's Genetic Perturbation Platform database |
gene | Gene targeted by the sgRNA |
pert_type | Perturbation type – trt : treatment, control : one of the 4 controls |
control_type | Control type – negcon , poscon_orf , poscon_cp , or poscon_diverse |
target_sequence | sgRNA sequence |
negcon_control_type | Negative control type – NO_SITE or ONE_INTERGENIC_SITE |
JUMP-Target-1_crispr_platemap.tsv
Column | Description |
---|---|
well_position | Well position |
broad_sample | sgRNA id in Broad Institute's Genetic Perturbation Platform database |
There are 175 ORFs (corresponding to 175 genes) in total, of which 45 ORFs (corresponding to 45 genes) are included as controls serving different goals and 130 genes are represented in duplicate on the plate.
All genes have a single corresponding ORF
poscon_orf
: the corresponding genes targets of theposcon_orf
compounds. Total: 3 genes.poscon_cp
: the corresponding genes targets of theposcon_cp
compounds. Total: 13 genes.poscon_diverse
: the corresponding genes targets of theposcon_diverse
compounds. Total: 14 genes.negcon
: genes that have had weak signatures in previous Cell Painting overexpression experiments. Total: 15 genes.
There are
- n=2 replicate wells for each of the 30 poscons
- n=4 replicate wells for each of the 15 negcons
- n=2 wells for each remaining ORF (130 of them)
- n=4 empty wells
- For a total of 384 wells
JUMP-Target-1_orf_metadata.tsv
Column | Description |
---|---|
broad_sample | ORF id in Broad Institute's Genetic Perturbation Platform database |
gene | Gene overexpressed by the ORF |
pert_type | Perturbation type – trt : treatment, control : one of the 4 controls |
control_type | Control type – negcon , poscon_orf , poscon_cp , or poscon_diverse |
JUMP-Target-1_orf_platemap.tsv
Column | Description |
---|---|
well_position | Well position |
broad_sample | ORF id in Broad Institute's Genetic Perturbation Platform database |
The list of compounds were derived from Broad's Drug Repurposing Hub dataset, a curated and annotated collection of FDA-approved drugs, clinical trial drugs, and pre-clinical tool compounds. The genes perturbed by genetic perturbations were chosen because they are the annotated targets of the compounds. The Repurposing Hub compounds were filtered using the following criteria:
- The compounds should target genes that belong to diverse gene families. This is because the ideal methods would work well for many different biological pathways, not just a few that are well-characterized and/or easy to predict.
- Each gene should be targeted by at least two compounds so that gene-compound matching and compound-compound matching can both be performed using the dataset.
- The constraint that each compound should target only a single gene, was considered. However, this criterion is difficult to achieve due to polypharmacology, which is the property for compounds to bind and impact many different gene products in the cell; this is especially common for protein kinase inhibitors in the dataset. Instead, only the so-called “historical compounds” listed in the Chemical Probes Portal, comprising compounds that are known to be quite non-selective (or not sufficiently potent) compared with other available chemical probes, were filtered out.
- Three types of positive control compounds and the genes that they target were selected. (Compound positive controls, CRISPR positive controls, and ORF positive controls).
- To ensure that the compounds and genes selected were available for performing experiments, compounds that were unavailable for purchase from at least one of four compound vendors (Sigma, SelleckChem, Tocris, and MedChemEx) and genes for which genetic reagents were unavailable from Broad's GPP portal were filtered out.
- Compounds that belong to DEA's list of controlled substances or OPCW's list of chemical weapons precursors were also excluded.
Based on our experiments with the Target-1 compounds plates, we have identified a list of eight compounds (and also a subset of four compounds) with maximally diverse phenotypes.
Compound names in bold is the subset of four compounds.
Our recommendation is to have at least four replicates of the eight (or four) compounds spread across the plate. If such a layout is not possible, then we recommend the following for a 384-well plate.
pert_iname | well_position |
---|---|
AMG900 | B1, J1, F24, N24 |
NVS-PAK1-1 | F1, N1, B24, J24 |
dexamethasone | C1, K1, G24, O24 |
LY2109761 | E1, M1, A24, I24 |
FK-866 | D1, L1, H24, P24 |
quinidine | G1, O1, C24, K24 |
TC-S-7004 | H1, P1, D24, L24 |
aloxistatin | A1, I1, E24, M24 |
For a 1536-well plate, the layout is similar, but in four quadrants.
ID | pair | target_gene | InChIKey | broad_compound_id | broad_orf_id | broad_crispr_id_1 | broad_crispr_id_2 |
---|---|---|---|---|---|---|---|
poscon_diverse-pair0-1 | 0 | RET | XKFTZKGMDDZMJI | BRD-K07881437-001-03-8 | ccsbBroad304_14827 | BRDN0001058343 | BRDN0001058011 |
poscon_diverse-pair0-2 | 0 | TUBB | MTJHLONVHHPNSI | BRD-K23363278-001-02-1 | ccsbBroad304_05206 | BRDN0001489809 | BRDN0001488241 |
poscon_diverse-pair1-1 | 1 | HSP90AA1 | RVAQIUULWULRNW | BRD-K38852836-001-04-9 | ccsbBroad304_06412 | BRDN0001488886 | BRDN0001483257 |
poscon_diverse-pair1-2 | 1 | PIK3CG | CWHUFRVAEUJCEF | BRD-K42191735-001-08-7 | ccsbBroad304_14762 | BRDN0001065162 | BRDN0001065152 |
poscon_diverse-pair2-1 | 2 | NAMPT | KPBNHDGDUADAGP | BRD-K58550667-001-08-7 | ccsbBroad304_07557 | BRDN0001484730 | BRDN0001481327 |
poscon_diverse-pair2-2 | 2 | AKT1 | AFJRDFWMXUECEW | BRD-K25412176-001-01-9 | ccsbBroad304_14538 | BRDN0001054985 | BRDN0000562836 |
poscon_diverse-pair3-1 | 3 | KRAS | DHMTURDWPRKSOA | BRD-K41599323-001-02-3 | ccsbBroad304_16173 | BRDN0001054815 | BRDN0000563627 |
poscon_diverse-pair3-2 | 3 | PAK4 | AYCPARAPKDAOEN | BRD-K37764012-001-03-3 | ccsbBroad304_02392 | BRDN0001145608 | BRDN0001147629 |
poscon_diverse-pair4-1 | 4 | DNMT3A | NMUSYJAQQFHJEW | BRD-K03406345-001-21-1 | ccsbBroad304_00454 | BRDN0001066838 | BRDN0001066751 |
poscon_diverse-pair4-2 | 4 | IMPDH1 | WYWHKKSPHMUBEB | BRD-K49350383-001-13-7 | ccsbBroad304_06451 | BRDN0001489452 | BRDN0001488805 |
poscon_diverse-pair5-1 | 5 | CDK7 | HUXYBQXJVXOMKX | BRD-K64800655-001-09-0 | ccsbBroad304_00280 | BRDN0001162216 | BRDN0001147450 |
poscon_diverse-pair5-2 | 5 | PLK1 | XQVVPGYIWAGRNI | BRD-K64890080-001-02-1 | ccsbBroad304_14770 | BRDN0001144995 | BRDN0001054037 |
poscon_diverse-pair6-1 | 6 | CHEK2 | IAYGCINLNONXHY | BRD-K86525559-001-07-8 | ccsbBroad304_14989 | BRDN0000585854 | BRDN0001145610 |
poscon_diverse-pair6-2 | 6 | GABRB2 | ALBKMJDFBZVHAK | BRD-K33882852-003-02-8 | ccsbBroad304_00607 | BRDN0001486784 | BRDN0001482741 |
ID | target_gene | broad_compound_id_1 | broad_compound_id_2 | broad_orf_id | broad_crispr_id_1 | broad_crispr_id_2 |
---|---|---|---|---|---|---|
poscon_cp-00 | AURKB | BRD-K21728777-001-02-3 | BRD-K55966568-001-09-6 | ccsbBroad304_14932 | BRDN0001054845 | BRDN0000562944 |
poscon_cp-01 | BRD4 | BRD-K13094524-001-04-2 | BRD-K12502280-001-11-4 | ccsbBroad304_11738 | BRDN0000733514 | BRDN0001146786 |
poscon_cp-02 | CLK1 | BRD-K66430217-001-03-8 | BRD-K97072811-001-11-4 | ccsbBroad304_00326 | BRDN0001147209 | BRDN0001146512 |
poscon_cp-03 | DYRK1B | BRD-K89517477-001-01-4 | BRD-K80935598-001-01-1 | ccsbBroad304_14931 | BRDN0001149482 | BRDN0001149418 |
poscon_cp-04 | ERBB2 | BRD-K76908866-001-07-6 | BRD-K61642990-001-01-0 | ccsbBroad304_14631 | BRDN0001148320 | BRDN0000579738 |
poscon_cp-05 | EZH2 | BRD-K91535048-001-01-2 | BRD-K26989966-001-04-3 | ccsbBroad304_00526 | BRDN0001057455 | |
poscon_cp-06 | FLT3 | BRD-K90747162-001-01-4 | BRD-K91283740-003-01-6 | ccsbBroad304_14644 | BRDN0001145324 | BRDN0001149376 |
poscon_cp-07 | HDAC3 | BRD-K61688984-001-02-9 | BRD-K61397605-001-01-8 | ccsbBroad304_07311 | BRDN0000582721 | BRDN0000733096 |
poscon_cp-08 | IGF1R | BRD-K15179513-001-04-2 | BRD-K24696047-001-02-3 | ccsbBroad304_14671 | BRDN0001065819 | BRDN0001065728 |
poscon_cp-09 | JAK1 | BRD-K16803204-001-01-6 | BRD-K53972329-001-01-3 | ccsbBroad304_14679 | BRDN0001065674 | BRDN0001065858 |
poscon_cp-10 | MET | BRD-K73319509-001-08-0 | BRD-K19477839-001-07-6 | ccsbBroad304_14696 | BRDN0000734703 | BRDN0001065558 |
poscon_cp-11 | PAK1 | BRD-K19333160-001-01-3 | BRD-K28132190-001-02-0 | ccsbBroad304_01144 | BRDN0001145958 | BRDN0001147867 |
poscon_cp-12 | USP1 | BRD-K93942811-001-01-3 | BRD-K81197548-003-01-4 | ccsbBroad304_01760 | BRDN0001487607 | BRDN0001484218 |
poscon_orf-00 | BRAF | BRD-K73789395-001-09-9 | BRD-K62810658-001-11-4 | ccsbBroad304_16175 | BRDN0001054801 | BRDN0000562882 |
poscon_orf-01 | CDK2 | BRD-K50836978-001-03-3 | BRD-K07762753-001-05-1 | ccsbBroad304_14572 | BRDN0001147786 | BRDN0001148950 |
poscon_orf-02 | MAPK14 | BRD-K54330070-001-19-6 | BRD-A37704979-001-12-3 | ccsbBroad304_00371 | BRDN0001148417 |