When asked about this today, I realized that it is not well documented yet. I don't have the time to write proper documentation, so this can serve as a starting point. Besides, actual documentation should be tested, and running the whole process would take a few days because of the large data sets.
How do you build the databases?

"Data Release Tarball"? Where do I get that from?
There is a [...]

About these files shifting location ...
Yes, we feel for you. We have not implemented this, but for other cases we have something called [...]

Is that all?
Yes, mostly. In the case of an upgrade to GRCh38, one would probably have to touch the VarFish code here and there are a number of places where [...] Most occurrences are tests, and we should also write some tests for GRCh37 (no need to replicate everything, though). We would need additional sites for variant QC; for example, we could get them from peddy. I would also, for now, recommend fixing each case to either GRCh37 or GRCh38. Further, one would have to implement lift-over if one wants to move a GRCh37 case to GRCh38, but such things could come later.
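To make the "fix the genome build per case" recommendation a bit more concrete, here is a minimal sketch in Python. It is not VarFish code: `Case`, `Variant`, `check_variant`, and `SUPPORTED_RELEASES` are hypothetical names made up for illustration. The idea is only that a case records its build once, and anything on a different build is rejected rather than silently lifted over.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this is not VarFish's actual data model.
SUPPORTED_RELEASES = ("GRCh37", "GRCh38")


@dataclass(frozen=True)
class Variant:
    release: str
    chromosome: str
    position: int
    reference: str
    alternative: str


@dataclass(frozen=True)
class Case:
    name: str
    release: str  # fixed once, when the case is created

    def __post_init__(self):
        # Reject anything that is not one of the two known builds.
        if self.release not in SUPPORTED_RELEASES:
            raise ValueError(f"unsupported genome release: {self.release!r}")


def check_variant(case: Case, variant: Variant) -> Variant:
    """Return the variant if its release matches the case, else raise."""
    if variant.release != case.release:
        # No lift-over is attempted here; mixed builds are simply refused.
        raise ValueError(
            f"variant is on {variant.release} but case {case.name!r} "
            f"is fixed to {case.release}"
        )
    return variant
```

With this in place, `check_variant(Case("example", "GRCh37"), Variant("GRCh37", "1", 12345, "A", "G"))` passes the variant through, while a GRCh38 variant for the same case raises a `ValueError`. Lift-over would later be a separate, explicit step that produces a new case on the other build.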
The description is top-down because one needs to know a bit about context and perspective to understand the process.
There are two databases. "Two" you ask?
Maybe even three.
The varfish-annotator tool has an H2 (embedded Java) database with the information needed to annotate files. These data are stored in tables, one for each dataset.
The varfish-annotator tool needs a RefSeq and Ensembl .ser file for the current genome build, compatible with the Jannovar library embedded in varfish-annotator.
[...] varfish-server.

How do you build the databases?