Data Loading: What You Need To Change
This page contains an overview to help you transition to the new file formats. As this should be a one-time operation for all users, this page is only temporary.
- (recommended) Update your genes table; see issues #799 and #805 for how to do this.
  - Reason: in the past, cBioPortal accidentally imported the wrong column as the HUGO symbol. This will cause many warnings about invalid genes during the validation process.
- Be aware: column header names are now strictly validated for all data files that have Entrez ID and Hugo gene symbol columns. The column names have to be `Entrez_Gene_Id` and `Hugo_Symbol`. This can be a change if you were relying on the position of the column rather than its name. The columns should still be placed before any of the sample columns, though (i.e. only the columns after the `Entrez_Gene_Id` and `Hugo_Symbol` columns are considered sample columns). The new validator will warn you when your file does not have at least the `Entrez_Gene_Id` column (which is the recommended column to use for gene identifiers).
- Be aware: `datatype` in the meta files is now strictly validated. This is also documented in the updated File formats page (and in the table below).
- Other changes: check the following table for your data types:
| Data type | What you have to do |
|---|---|
| Cancer Study | (optionally) Add `add_global_case_list` |
| Cancer Type | Create the meta file |
| Discrete Copy Number Data | Update meta file: |
| Copy Number Data | Update meta file: |
| Segmented Data | Update meta file: |
| Expression Data | Update meta file: |
| Mutation Data | Update meta file: |
| Fusion Data (TODO) | Update meta file: |
| Methylation Data | Update meta file: |
| RPPA Data | Update meta file: |
| Clinical Data | |
| Case Lists | - |
| Timeline Data | Update meta file(s): |
| Gistic Data | Create the meta file |
| MutSig Data | Create the meta file |
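
Before running the validator, it can help to check your data file headers against the gene column rule described above. The following is a minimal sketch, not the validator's actual implementation. The function name `check_gene_header` is hypothetical, and the sketch assumes that data files are tab-separated and that lines starting with `#` are comments to be skipped.

```python
# Hypothetical pre-check for the gene column header rule; not part of cBioPortal.
GENE_COLUMNS = ("Hugo_Symbol", "Entrez_Gene_Id")


def check_gene_header(lines):
    """Return (sample_columns, messages) for the header of a data file.

    `lines` is any iterable of text lines, e.g. an open file object.
    Assumption: the file is tab-separated and '#' lines are comments.
    """
    header = None
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines before the header
        header = line.rstrip("\r\n").split("\t")
        break

    if header is None:
        return [], ["ERROR: no header line found"]

    messages = []

    # Names are matched exactly; flag columns that differ only in letter case.
    for name in header:
        for expected in GENE_COLUMNS:
            if name != expected and name.lower() == expected.lower():
                messages.append(f"ERROR: column '{name}' should be named '{expected}'")

    if "Entrez_Gene_Id" not in header:
        messages.append("WARNING: no Entrez_Gene_Id column (recommended gene identifier)")

    gene_positions = [i for i, name in enumerate(header) if name in GENE_COLUMNS]
    if not gene_positions:
        messages.append("ERROR: neither Hugo_Symbol nor Entrez_Gene_Id found")
        return [], messages

    # Only the columns after the gene columns are treated as sample columns.
    samples = header[max(gene_positions) + 1:]
    return samples, messages
```

To use it on a file, pass an open file object, for example `check_gene_header(open(path, encoding="utf-8"))`, where `path` points to one of your data files, and inspect the returned messages.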