New Enterobase URL #31

Open
dfornika opened this issue Mar 26, 2018 · 2 comments

Comments

@dfornika
Contributor

Enterobase now provides the following URL for downloading schemes:

http://enterobase.warwick.ac.uk/schemes/

@crarlus

crarlus commented Jun 25, 2018

My solution is to use the following Python 2 script, adapted from the Enterobase API How-To (https://bitbucket.org/enterobase/enterobase-web/wiki/api_download_schemes):

import os
import urllib2
import json
import base64
import sys
from urllib2 import HTTPError
import logging

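# You must have a valid API token, read here from the ENTEROBASE_API_TOKEN environment variable
API_TOKEN = os.getenv('ENTEROBASE_API_TOKEN', None)
if not API_TOKEN:
    sys.exit('ENTEROBASE_API_TOKEN is not set')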
SERVER_ADDRESS = 'http://enterobase.warwick.ac.uk'
DATABASE = 'senterica'
SCHEME = 'cgMLST_v2'
LIMIT = 10000
TARGET_DIR = 'alleles_cgMLST_v2'

def __create_request(request_str):
    # HTTP Basic auth: the API token is the user name, the password is empty
    request = urllib2.Request(request_str)
    base64string = base64.b64encode('%s:' % API_TOKEN)
    request.add_header("Authorization", "Basic %s" % base64string)
    return request


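# Create the output directory if it does not exist yet
if not os.path.exists(TARGET_DIR):
    os.mkdir(TARGET_DIR)
# Build the loci-listing URL for the chosen database and scheme
address = SERVER_ADDRESS + '/api/v2.0/%s/%s/loci?limit=%d&scheme=%s' \
    % (DATABASE, SCHEME, LIMIT, SCHEME)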
print("Fetching scheme loci list from " + address)
try:
    response = urllib2.urlopen(__create_request(address))
    data = json.load(response)
    print("Download of json complete")

    for record in data['loci']:
        record_locus = record['locus']
        record_link = record['download_alleles_link']
        print("Downloading alleles for locus " + record_locus)
        response = urllib2.urlopen(__create_request(record_link))
        # Each locus is saved as a gzipped FASTA file named after the locus
        with open(os.path.join(TARGET_DIR, '%s.fasta.gz' % record_locus), 'wb') as out_file:
            out_file.write(response.read())
except HTTPError as Response_error:
    logging.error('%d %s. <%s>\n Reason: %s' %(Response_error.code,
                                              Response_error.msg,
                                              Response_error.geturl(),
                                              Response_error.read()))

The script requires a valid Enterobase API token, which it reads from the ENTEROBASE_API_TOKEN environment variable.
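
For anyone on Python 3, where `urllib2` no longer exists, a minimal sketch of the same approach might look like the following. The function names (`build_loci_url`, `create_request`, `download_scheme`) are placeholders chosen for illustration. The endpoint layout and the JSON keys (`loci`, `locus`, `download_alleles_link`) are carried over from the script above and are assumed to be unchanged; they have not been re-verified against the current Enterobase API.

```python
import base64
import json
import logging
import os
import urllib.request
from urllib.error import HTTPError
from urllib.parse import urlencode

SERVER_ADDRESS = 'http://enterobase.warwick.ac.uk'


def build_loci_url(database, scheme, limit=10000, server=SERVER_ADDRESS):
    """Build the loci-listing URL (same layout as in the Python 2 script)."""
    query = urlencode({'limit': limit, 'scheme': scheme})
    return '%s/api/v2.0/%s/%s/loci?%s' % (server, database, scheme, query)


def create_request(url, api_token):
    """Attach HTTP Basic auth: token as user name, empty password."""
    credentials = ('%s:' % api_token).encode('utf-8')
    encoded = base64.b64encode(credentials).decode('ascii')
    request = urllib.request.Request(url)
    request.add_header('Authorization', 'Basic %s' % encoded)
    return request


def download_scheme(database, scheme, target_dir, api_token):
    """Download one gzipped FASTA file per locus into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    try:
        loci_request = create_request(build_loci_url(database, scheme), api_token)
        with urllib.request.urlopen(loci_request) as response:
            data = json.load(response)
        for record in data['loci']:  # key names assumed from the script above
            out_path = os.path.join(target_dir, '%s.fasta.gz' % record['locus'])
            allele_request = create_request(record['download_alleles_link'], api_token)
            with urllib.request.urlopen(allele_request) as response, \
                    open(out_path, 'wb') as out_file:
                out_file.write(response.read())
    except HTTPError as error:
        logging.error('%d %s. <%s>\n Reason: %s',
                      error.code, error.reason, error.geturl(), error.read())


# Example call (performs network requests, so it is left commented out):
# download_scheme('senterica', 'cgMLST_v2', 'alleles_cgMLST_v2',
#                 os.environ['ENTEROBASE_API_TOKEN'])
```

Splitting URL construction and authentication into small functions keeps them checkable without network access; only `download_scheme` talks to the server.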

@dfornika
Contributor Author

Hi @crarlus, thanks for this. We should be able to use it as inspiration to improve the MentaLiST Enterobase download methods. We currently don't support downloads that require an API token, but we may be able to add that if there are protected datasets that we want access to.
