The data stories and queries in this repository demonstrate working with genomic data via Google BigQuery. All examples are built upon public datasets.
Have other data stories you would like to see here? Have any data stories you would like to share? Have corrections to the biology covered in this material? Have query simplifications or speed improvements? Let us know by filing an issue or contacting us directly.
Start here: getting-started-bigquery.
The Google Genomics API spec includes an import method that loads VCF and CGI masterVar files directly from Google Cloud Storage.
For other types of data, such as variant annotations, see Preparing Data for BigQuery and, for more detail, BigQuery in Practice: Loading Data Sets That are Terabytes and Beyond.
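As a rough illustration of what preparing annotation data can look like, the sketch below converts a tab-separated annotation table into newline-delimited JSON, one of the file formats BigQuery accepts for loading. The function name `tsv_to_ndjson`, the column names (`reference_name`, `start`, `end`, `gene`) and the sample rows are hypothetical and chosen only for illustration; they are not taken from this repository or from any particular annotation source.

```python
import csv
import io
import json


def tsv_to_ndjson(tsv_text, int_fields=("start", "end")):
    """Convert tab-separated rows (with a header line) to newline-delimited JSON.

    The column names in int_fields are assumptions for this sketch; adjust them
    to match the schema of the table you intend to load.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    lines = []
    for row in reader:
        record = {}
        for key, value in row.items():
            if key is None:
                # Extra fields beyond the header are ignored in this sketch.
                continue
            if value is None or value == "":
                record[key] = None  # missing or empty cell becomes JSON null
            elif key in int_fields:
                record[key] = int(value)  # coordinates are stored as integers
            else:
                record[key] = value
        lines.append(json.dumps(record, sort_keys=True))
    # One JSON object per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Made-up sample rows, not real annotations.
    sample = (
        "reference_name\tstart\tend\tgene\n"
        "chr1\t100\t200\tGENE_A\n"
        "chr2\t300\t400\t\n"
    )
    print(tsv_to_ndjson(sample), end="")
```

The resulting file can then be staged in Google Cloud Storage and loaded into a table whose schema matches the JSON keys. Writing empty cells as JSON null is one possible design choice; it only works if the corresponding columns are nullable in the target schema.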
The Google Genomics Discuss mailing list is a good way to sync up with other people who use googlegenomics, including the core developers. You can subscribe by sending an email to google-genomics-discuss+subscribe@googlegroups.com or just post using the web forum page.