Add support for extracting the full sample/portion/analyte hierarchy from GDC #6

gaurav · 2021-05-31T02:04:34Z

We currently use a single GDC query that only provides information on the case, along with lists of identifiers for its samples, portions, analytes, aliquots and slides. To actually retrieve the data associated with these identifiers, we need to make additional queries. We should include this information in the downloaded data and demonstrate how to convert it into CRDC-H instance data (or, more usefully, just build a transformation library that can retrieve all of this GDC data and export it as CRDC-H instance data).
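As a rough sketch of what such a retrieval could look like: the GDC API's `cases` endpoint accepts an `expand` parameter that asks for nested entities to be returned inline, which may avoid issuing one query per identifier. The expand group names and the `submitter_id` filter field below are assumptions that should be checked against the GDC API documentation, and `build_case_query` and `fetch_case` are hypothetical helper names, not part of any existing code in this repository.

```python
import json
import urllib.parse
import urllib.request

GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"

# Assumed expand groups covering the sample/portion/analyte hierarchy.
HIERARCHY_EXPAND = [
    "samples",
    "samples.portions",
    "samples.portions.slides",
    "samples.portions.analytes",
    "samples.portions.analytes.aliquots",
]


def build_case_query(submitter_id):
    """Build query parameters requesting one case with its nested hierarchy."""
    filters = {
        "op": "in",
        "content": {"field": "submitter_id", "value": [submitter_id]},
    }
    return {
        "filters": json.dumps(filters),
        "expand": ",".join(HIERARCHY_EXPAND),
        "format": "json",
        "size": "1",
    }


def fetch_case(submitter_id):
    """Send the query to GDC and return the list of matching case records."""
    query = urllib.parse.urlencode(build_case_query(submitter_id))
    with urllib.request.urlopen(f"{GDC_CASES_ENDPOINT}?{query}") as response:
        payload = json.load(response)
    # GDC wraps results as {"data": {"hits": [...]}}.
    return payload["data"]["hits"]
```

Calling `fetch_case("TCGA-CV-7261")` would need network access; `build_case_query` can be inspected offline to see exactly what would be sent.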

As an example, GDC case TCGA-HNSC / TCGA-CV-7261 reports the following items in the data we have obtained from their service:

'aliquot_ids': ['8f695cd3-01dd-4601-8b17-37cf40514422', 'f0e325f8-297c-41e3-913d-a70e35ab5096', 'b6063ecd-bd1b-4cff-bf17-33357c17573b', 'e7d6f3dd-9de4-47e8-86f8-6f2a1b8b716e', '10c3c9e6-3bb2-4082-913a-57e80018cb45', '9fa7bc79-d05b-41da-8bcc-8d5ad4451b0c', '1f730300-dafe-4b06-9da3-9b9d2855cbac', '81fa5865-d03a-401c-a9b7-19a38a36ec33', 'c9731404-75c5-4b04-82b3-aec9cbd0290e', 'af270fca-56e4-493a-9735-e51964cac713', 'b34ad9ae-e439-49c1-9512-daedfc15ed13', '9760e44d-1227-40b0-8469-dea55bd02b5d', '70d41532-dd46-4e3f-9417-8a7306ef4117', '2376520b-1f75-4e34-b3fb-92fa131d938a', '01f28aef-6802-4d54-a644-448887298280', '78ca1f34-d401-4056-a821-ce8cf947c669', '43406d9f-8734-4b0b-8b2d-6b617575607a', 'cc61d260-3898-4328-984a-7ddee700d6a8']
'analyte_ids': ['a72f2de7-eb40-4818-a104-edb508d5517b', 'e8120e5b-79a0-46ca-b603-88e2d6745657', 'cc4e73d3-e4f0-42a0-97cb-ef99336bdad8', '66dc8914-c32e-46d6-9769-278e90dcc062', '8c80c204-c894-4939-af90-07988a86bd02', 'd10874e0-fae9-43a6-8b47-2366cd929960', '147eac8b-ec1f-4dd3-a760-22e02c4b7098', '34f70218-2f5d-49e6-b73b-043e465a4c6b']
'portion_ids': ['177fa10b-0135-468d-b5a3-6f30cc3cd390', 'f51d76a7-77af-4513-b7eb-6fbd05aeeff9', '1a6628a1-09e5-4fca-9917-f577d7ca08fe', '806efd93-d80d-4a4b-83f0-ee6362022052']
...
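
The flat ID lists above do not say which aliquot belongs to which analyte, portion or sample. A minimal sketch of how a transformation library might recover those parent links from a nested case record is shown here. It assumes the record nests children under `samples`, `portions`, `analytes`, `aliquots` and `slides` keys, each with a `<type>_id` field; the `CHILD_KEYS` map and the function names are hypothetical, and the IDs in the toy record are placeholders, not real GDC identifiers.

```python
# Assumed parent -> (child list key, child entity type) relationships.
CHILD_KEYS = {
    "case": [("samples", "sample")],
    "sample": [("portions", "portion")],
    "portion": [("analytes", "analyte"), ("slides", "slide")],
    "analyte": [("aliquots", "aliquot")],
}


def walk_hierarchy(entity, entity_type="case", parent_id=None):
    """Yield (entity_type, entity_id, parent_id) for an entity and its descendants."""
    entity_id = entity.get(f"{entity_type}_id")
    yield (entity_type, entity_id, parent_id)
    for key, child_type in CHILD_KEYS.get(entity_type, []):
        for child in entity.get(key, []):
            yield from walk_hierarchy(child, child_type, entity_id)


def ids_by_type(case):
    """Group every ID found in the nested record by its entity type."""
    grouped = {}
    for entity_type, entity_id, _parent in walk_hierarchy(case):
        grouped.setdefault(entity_type, set()).add(entity_id)
    return grouped


# Toy record with placeholder IDs, only to show the shape being assumed.
toy_case = {
    "case_id": "case-1",
    "samples": [{
        "sample_id": "sample-1",
        "portions": [{
            "portion_id": "portion-1",
            "analytes": [{
                "analyte_id": "analyte-1",
                "aliquots": [{"aliquot_id": "aliquot-1"}],
            }],
            "slides": [{"slide_id": "slide-1"}],
        }],
    }],
}

for row in walk_hierarchy(toy_case):
    print(row)
```

Comparing `ids_by_type(case)["portion"]` against the case's flat `portion_ids` list would be one way to check that nothing in the hierarchy was missed.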

gaurav added this to the Build a complete Example Data Workflow milestone Jul 28, 2021

gaurav mentioned this issue Jul 28, 2021 in cancerDHC/tools#36: Develop a complete Example Data Workflow for CRDCH (Open, 4 tasks)

gaurav added the "use case" and "good first issue" labels Sep 3, 2021

gaurav mentioned this issue Oct 11, 2021 in #24: Figure out how to add demographic information to GDC head-and-mouth data (Closed)

gaurav modified the milestones: Build a complete Example Data workflow (Phase 3, Quarter 4, 2021), Demonstrate harmonization of multiple CRDCH entities (Phase 3, Quarter 4, 2021) Nov 4, 2021
