Genomic Data Retrieval with R

@HajkD HajkD released this 04 Oct 14:35
· 42 commits to master since this release

Package generalization

Over 5000 lines have been edited, most of them removed (#100), to generalize the package and make future
development safer. This work is still ongoing.

  • @Roleren is joining as package author and new core developer of biomartr.

New features

  • Ensembl Genomes is no longer treated as a separate database from Ensembl in biomartr, since this split is artificial.
    It is advised to use only "ensembl" as db from now on, but "ensemblgenomes" will still work.
  • Annotation retrieval used to mean GFF only, but it should cover both GFF and GTF, selected by a format specification. This is now fixed and generalized.
  • Added a new kingdom for Ensembl: protists are now supported, with correct collection getters.
  • The retrieval from the UniProt database is now updated to the new API/FTP path system. Now users
    can retrieve proteomes using the functions getProteome(db = "uniprot", ...) and getProteomeSet(db = "uniprot", ...) (see #82)
  • New function getBioSet(): generic bio data set extractor
  • New function getBio(): a wrapper for all bio getters, selected with the 'type' argument
  • New function getUniProtSTATS(): retrieve the UniProt Database Information File (STATS)
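
As a rough illustration of the UniProt retrieval mentioned above, the following sketch wraps getProteome() and getProteomeSet() with db = "uniprot". The helper names, the example organisms, and the download paths are assumptions made for illustration and are not part of the package. The helpers need an installed biomartr and a network connection when they are actually called.

```r
# Sketch only: assumes biomartr is installed and a network connection is
# available when the helpers are actually called.

# Hypothetical helper: download one UniProt proteome.
# The default path is an illustrative choice.
fetch_uniprot_proteome <- function(organism,
                                   path = file.path("_uniprot_downloads", "proteomes")) {
  biomartr::getProteome(db = "uniprot", organism = organism, path = path)
}

# Hypothetical helper: download UniProt proteomes for several organisms.
fetch_uniprot_proteome_set <- function(organisms, path = "set_proteomes") {
  biomartr::getProteomeSet(db = "uniprot", organisms = organisms, path = path)
}

# Example calls (commented out because they download data):
# fetch_uniprot_proteome("Saccharomyces cerevisiae")
# fetch_uniprot_proteome_set(c("Saccharomyces cerevisiae", "Arabidopsis thaliana"))
```

Qualifying the calls with biomartr:: keeps the helpers definable without attaching the package, so nothing is downloaded until one of them is called.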

Power user cache

The package now supports caching of back-end files, which used to be saved to the /tmp folder (i.e. lost on computer restart).
This makes repeated retrievals faster for power users. For more info, see the function ?cachedir_set
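
A minimal sketch of how the cache might be pointed at a persistent folder follows. The helper name and the example directory are hypothetical. The sketch assumes that cachedir_set() takes the cache path as its first argument, so check ?cachedir_set for the exact usage before relying on it.

```r
# Hypothetical helper: point biomartr's back-end cache at a persistent folder
# instead of a temporary one. The default directory is an illustrative choice.
use_persistent_cache <- function(cache_path = "~/Bio_data/biomartr_cache") {
  cache_path <- path.expand(cache_path)
  # Create the folder up front; this is harmless if it already exists.
  dir.create(cache_path, recursive = TRUE, showWarnings = FALSE)
  # Assumption: cachedir_set() accepts the path as its first argument.
  biomartr::cachedir_set(cache_path)
  invisible(cache_path)
}

# Example call (commented out because it changes package settings):
# use_persistent_cache("~/Bio_data/biomartr_cache")
```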

Bug fixes

  • Fixed many wrong URLs and non-working functions; more tests have been added to make sure they work.
  • Fixed fungi collection accessor for ensembl