feat: remove pseudogenes, read through genes, and ones with only predicted functions #510
Comments
Looks good! May I ask how these genes were identified? I double-checked some of the read-through genes, and those are clear. As for the novel proteins and the genes with only predicted functions, I am not sure how to double-check them.
very good question! Now the number of genes is 2292 in
They have no annotated function in UniProt, but they can be checked on Ensembl (e.g., ENSG00000011052) and NCBI (654364).
as the only gene for |
As a follow-up, replace ENSG00000137700 with ENSG00000281500: both refer to the same gene, SLC37A4, but the latter has been reviewed and carries proper annotation and substrates.
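A minimal sketch of how such an ID swap could be applied to gene rules stored as plain strings. The function name `replace_gene_id`, the rule format, and the placeholder IDs `GENE_A`, `GENE_B`, and `GENE_C` are assumptions for illustration only; only the two SLC37A4 identifiers come from this thread, and this is not necessarily how the change was made in the repository.

```python
import re


def replace_gene_id(rules, old_id, new_id):
    """Return a copy of `rules` with `old_id` replaced by `new_id`.

    The ID is matched as a whole token, so an ID that merely contains
    `old_id` as a prefix or suffix is left untouched.
    """
    pattern = re.compile(r"\b" + re.escape(old_id) + r"\b")
    return [pattern.sub(new_id, rule) for rule in rules]


# Hypothetical rules: GENE_A, GENE_B and GENE_C are placeholders,
# not identifiers taken from the model.
example_rules = [
    "ENSG00000137700",
    "ENSG00000137700 or GENE_A",
    "GENE_B and GENE_C",
]

updated_rules = replace_gene_id(
    example_rules, "ENSG00000137700", "ENSG00000281500"
)
```

Matching on word boundaries rather than using a plain string replace avoids accidentally rewriting a longer identifier that happens to start with the same digits.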
Five genes still remain unmapped to UniProt; they were manually checked and associated as below:
Fixed in #537.
Description of the issue:
It is proposed to remove the following genes because they are either pseudogenes, genes from read-through transcripts (spanning two adjacent genes on the same strand with the same orientation), or genes with only predicted functions.
ENSG00000137700 with only predicted function