Questions for homework 3 #21
Is the code for 1a and 1b supposed to all go in the same code block below 1b? Also, for 1a, printing out the first five unique TCGA IDs: is that supposed to be drawn from sorting the overlapping segments? Thank you! |
@LABrumage Sorry about leaving out a code chunk! Please add one for 1a. @gavinha, can you speak to this question:
|
We've (@LABrumage + @sbest0128) been trying to get the |
Hi @rcsegura Please refer to the R Markdown file from the lecture: tfcb_2019/lectures/lecture08/Lecture8_GenomicData.Rmd Lines 165 to 166 in b4828fb
Let me know if you still have any questions. Best, |
Yes, after finding the overlapping ranges for |
I received the following question via email:
The commas are thousands separators within a single number, not separators between numbers; the coordinates here are greater than 128 million. |
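To illustrate the point about commas, here is a minimal base R sketch that turns a comma-formatted coordinate string into plain numbers. The region string is the one from the corrected 4f wording in this thread; the variable names are only for illustration and are not from the homework or lecture.

```r
# Region as written in the corrected 4f wording (commas are thousands separators).
region <- "chr8:129,127,000-129,128,000"

# Strip the commas, then split on ':' and '-' into chromosome, start, and end.
clean <- gsub(",", "", region)
parts <- strsplit(clean, "[:-]")[[1]]

chrom <- parts[1]
start <- as.numeric(parts[2])
end   <- as.numeric(parts[3])

# Width of the region in base pairs (end minus start).
end - start
```

Each coordinate is a single nine-digit position, so the region spans 1,000 bp.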
Another question for @gavinha, regarding 1a: after creating a |
@gavinha is this on the right track for 1b (where object overlap1b is the GRanges object containing the overlaps I found)? |
After using
Let me know if you have any further questions. |
Those specific lines of code look correct. If Also, make sure you are using the correct function to assign |
Hi class, In Problem 4d, if you used tfcb_2019/lectures/lecture08/Lecture8_GenomicData.Rmd Lines 244 to 246 in b4828fb
you may notice that the R Markdown itself shows something like this:
Don't worry! Once you Knit, the output will be fine. Hope this helps. |
I've talked to a few folks who are struggling to conceptualize homework 3. If you haven't worked with genomic data and/or haven't spent a lot of time working in R, it's possible this homework will be particularly challenging. Here are a few things that may help to keep in mind:
A few other more specific answers to questions I've received about genomic data/analyses:
|
Thanks @k8hertweck |
What VCF file are we supposed to load for problem #4? 'GIAB_highconf_v.3.3.2.vcf.gz' from lecture 7? |
For Problem 4d where we have to combine the genotype info into a table, all of the necessary columns appear to be present but the entries in the table are <chr [1]>, <int[1]>, and so forth. Is this what the entries are supposed to read as? I used the vcf object where I loaded info for chromosome 8 (defined back in 4a). |
When I run segs.gr[ind.segs.overlap.tile2] after using findOverlaps, I get no results:
|
For 2d, I found two windows that have the highest count of deletion segments. Should we be looking for only one window as the answer? |
Hi @LABrumage Please refer to my message 3 days ago about question 4d. |
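For anyone wondering about entries such as `<chr [1]>` and `<int [1]>` reported for 4d above: that is how a tibble displays a list-column whose cells each hold a length-one vector. The sketch below uses made-up data and placeholder column names (`REF`, `DP`), not the homework's actual table, and assumes the `tibble` package is installed.

```r
library(tibble)

# Toy data; REF and DP are placeholder column names, not the homework's.
geno <- tibble(
  REF = list("A", "C", "G"),
  DP  = list(31L, 28L, 40L)
)
geno  # list-column entries display as <chr [1]> and <int [1]>

# Flatten each list-column into an ordinary vector (one value per row).
flat <- tibble(REF = unlist(geno$REF), DP = unlist(geno$DP))
flat
```

Flattening like this is optional and only works cleanly when every cell holds exactly one value.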
Hi @Amandakr713 You can return both or just one of them. |
Hi @alexgal8 The variable that you use to index tfcb_2019/lectures/lecture08/Lecture8_GenomicData.Rmd Lines 142 to 150 in fce205c
|
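As a rough sketch of the indexing pattern being discussed, assuming the `GenomicRanges` package: `findOverlaps()` returns pairs of (query, subject) indices, and the object you subset has to be indexed with the matching side of those pairs. The segments, region, and `Segment_Mean` values below are made up for illustration and are not the lecture's code.

```r
library(GenomicRanges)

# Toy segments; the coordinates and Segment_Mean values are made up.
segs.gr <- GRanges(
  seqnames = c("1", "1", "2"),
  ranges = IRanges(start = c(100, 5000, 300), end = c(900, 6000, 800)),
  Segment_Mean = c(-0.5, 0.1, -0.4)
)

# A single hypothetical region of interest on chromosome 1.
region <- GRanges("1", IRanges(start = 500, end = 5500))

# findOverlaps() returns a Hits object of (query, subject) index pairs.
hits <- findOverlaps(query = region, subject = segs.gr)

# segs.gr was passed as the subject, so index it with subjectHits().
segs.overlap <- segs.gr[subjectHits(hits)]
segs.overlap
```

If a subset like this comes back empty, two things worth checking are whether the index came from the other side of the pair (`queryHits()` instead of `subjectHits()`) and whether the two objects use the same chromosome naming (for example "1" versus "chr1").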
In 4e, where do the DNAString and DNAStringSet objects exist? |
Hi @Amandakr713 In the lecture notes: tfcb_2019/lectures/lecture08/Lecture8_GenomicData.Rmd Lines 220 to 223 in fce205c
Find the column referring to the reference base |
Yes, I've got the overlaps but now I'm trying to query the correct rows from the original 'segs.gr'. |
Thank you @gavinha ! |
Hi @Amandakr713 Thank you for bringing up this concern. I just realized that this question is an older version that is much more challenging. I had come up with a slightly different version but did not manage to upload the changes. I will talk to Kate about this, but all of 4f will likely be a question only for bonus marks. For your reference, the more straightforward version of the question is simply a change in the region of interest to Reporting the answer with these 5 SNPs in I am not sure if other students actually come to GitHub to read these comments, so Kate will send an announcement about this. Thanks for bringing this to my attention and sorry for the inconvenience. |
@gavinha In regard to problem 2 and finding copy number alterations, can you direct me to the part in the lecture that addresses this? I remember you going over it but now I can't find the section in the lecture materials. Thanks! |
Assuming you used something like |
@alexgal8
You need to make use of basic operations/functionality of |
Please see this corrected language for question 4f: f. Find the phased SNPs for both haplotypes within the region chr8: 129,127,000-129,128,000. (4 points) Given the lateness of this change, and difficulty in communicating updates, all answers to question 4f will be counted as extra credit, and all reasonable genomic regions will be acceptable for receiving at least partial extra credit for this question. |
For 2c-e, are we using the same tiles in chromosome X that we made in 2b or should we be using separate tiling over chromosomes 1-22 + X instead? |
@zyaffe The question was intended for all chromosomes, but we'll accept either interpretation. |
For 2d, I have the list of the overlaps and segment means but I am stuck on what to do next. How do I group ranges by tile? How do I sort by < -0.3? (I am still trying to learn R basics and am struggling to find everything I need to include with sort() to get it to work with these data). |
During the lecture, I showed an example (not in the notes) on how to select segments from a
Once you have this, then perform the overlap. I am at Fred Hutch today until 5pm and available for office hours tomorrow 9:30-11am. My office is M1-B869 in the Arnold Building. |
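One possible way to put those two steps together (select the deletion segments first, then overlap with the tiles) is sketched below. It assumes the `GenomicRanges` package and a metadata column named `Segment_Mean`; the chromosome lengths, tile width, and segments are toy values, and this is not necessarily the example shown in lecture.

```r
library(GenomicRanges)

# Toy chromosome lengths and segments; all values are made up.
chr.len <- c("1" = 3000, "X" = 2000)
segs.gr <- GRanges(
  seqnames = c("1", "1", "1", "1", "X"),
  ranges = IRanges(start = c(100, 1200, 1500, 1600, 50),
                   end   = c(800, 1900, 2500, 1800, 900)),
  Segment_Mean = c(-0.5, -0.4, 0.2, -0.7, -0.6)
)

# 1000 bp tiles across every chromosome.
tiles <- tileGenome(chr.len, tilewidth = 1000, cut.last.tile.in.chrom = TRUE)

# Keep only deletion segments. Note the space in "< -0.3":
# without it, R would parse "<-" as assignment.
dels <- segs.gr[segs.gr$Segment_Mean < -0.3]

# Count deletion segments overlapping each tile, then keep the top tile(s).
tiles$n.del <- countOverlaps(tiles, dels)
top.tiles <- tiles[tiles$n.del == max(tiles$n.del)]
top.tiles
```

No `sort()` is needed here: the logical comparison does the filtering, and `countOverlaps()` does the grouping by tile. If several tiles tie for the highest count, the last line returns all of them.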
Hi @gavinha and @k8hertweck, for the bonus question (2 points): provide code to determine the two haplotype sequences, can you elaborate on what the question is looking for? How does it differ from question 4f? |
Hi @yliu234 The question asks you to print out the sequence of alleles for each of the two haplotypes. You can simply read it off the screen and type it out manually. Hope this makes sense. |
Include questions about homework 3 here! @gavinha and I will be happy to help answer them.