Enterobase Downloads Failing #64

Closed
klgray25 opened this issue Jun 11, 2018 · 2 comments

Comments

@klgray25

Downloading the Salmonella wgMLST from Enterobase was unsuccessful.

I followed the link being used by the script to access the scheme (http://enterobase.warwick.ac.uk/download_data?species=SALwgMLST&scheme=wgMLSTv1&allele=SC0831) and received the following message from the Enterobase site:

Please log in first as anonymous user does not have access to rMLST data. Please not also that administrator has only access to this section, Please visit pubmlst.org/rmlst/ for more information or contact keith.jolley_at_zoo.ox.ac.uk. Copyright 2010-2016, University of Oxford

It seems that Enterobase has changed access to the data, which causes MentaLiST to fail when running download_enterobase.

@dfornika
Contributor

@crarlus

crarlus commented Jun 25, 2018

My solution is to use the following Python script, adapted from the Enterobase API How-To (https://bitbucket.org/enterobase/enterobase-web/wiki/api_download_schemes):

import os
import urllib2
import json
import base64
import sys
from urllib2 import HTTPError
import logging

# You must have a valid API token; it is read from the environment.
API_TOKEN = os.getenv('ENTEROBASE_API_TOKEN', None)
SERVER_ADDRESS = 'http://enterobase.warwick.ac.uk'
DATABASE = 'senterica'
SCHEME = 'cgMLST_v2'
LIMIT = 10000
TARGET_DIR = 'alleles_cgMLST_v2'

def __create_request(request_str):
    # The token is sent as the user name of an HTTP Basic auth header,
    # with an empty password.
    request = urllib2.Request(request_str)
    base64string = base64.encodestring('%s:%s' % (API_TOKEN, '')).replace('\n', '')
    request.add_header("Authorization", "Basic %s" % base64string)
    return request


# Stop early instead of sending the literal string "None" as the token.
if API_TOKEN is None:
    sys.exit("ENTEROBASE_API_TOKEN is not set")
if not os.path.exists(TARGET_DIR):
    os.mkdir(TARGET_DIR)
# List all loci of the scheme; each record carries a download link for its alleles.
address = SERVER_ADDRESS + '/api/v2.0/%s/%s/loci?limit=%d&scheme=%s' \
    % (DATABASE, SCHEME, LIMIT, SCHEME)
print("Fetching scheme loci list from " + address)
try:
    response = urllib2.urlopen(__create_request(address))
    data = json.load(response)
    print("Download of json complete")

    # Save one gzipped FASTA file per locus in TARGET_DIR.
    for record in data['loci']:
        record_locus = record['locus']
        record_link = record['download_alleles_link']
        print("Downloading alleles for locus " + record_locus)
        response = urllib2.urlopen(__create_request(record_link))
        with open(os.path.join(TARGET_DIR, '%s.fasta.gz' % record_locus), 'wb') as out_file:
            out_file.write(response.read())
except HTTPError as Response_error:
    logging.error('%d %s. <%s>\n Reason: %s' % (Response_error.code,
                                               Response_error.msg,
                                               Response_error.geturl(),
                                               Response_error.read()))

It requires a valid Enterobase API token, which the script reads from the ENTEROBASE_API_TOKEN environment variable.
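
The script above is written for Python 2: urllib2 was split into urllib.request and urllib.error in Python 3, and base64.encodestring was removed in Python 3.9. A minimal Python 3 sketch of the two pieces that change, building the loci URL and attaching the token, might look like the following. The helper names build_loci_url and create_request are hypothetical, and the endpoint layout and token-as-user-name Basic auth are carried over from the script above rather than checked against the current Enterobase API.

```python
import base64
import urllib.parse
import urllib.request

SERVER_ADDRESS = 'http://enterobase.warwick.ac.uk'


def build_loci_url(database, scheme, limit=10000, server=SERVER_ADDRESS):
    """Hypothetical helper: build the loci-list URL used by the script above."""
    query = urllib.parse.urlencode({'limit': limit, 'scheme': scheme})
    return '%s/api/v2.0/%s/%s/loci?%s' % (server, database, scheme, query)


def create_request(url, api_token):
    """Hypothetical helper: attach the token as the Basic auth user name."""
    # b64encode works on bytes and, unlike the old encodestring,
    # does not insert newlines into the result.
    credentials = ('%s:' % api_token).encode('utf-8')
    encoded = base64.b64encode(credentials).decode('ascii')
    request = urllib.request.Request(url)
    request.add_header('Authorization', 'Basic %s' % encoded)
    return request


if __name__ == '__main__':
    url = build_loci_url('senterica', 'cgMLST_v2')
    # 'dummy-token' is a placeholder, not a real Enterobase token.
    request = create_request(url, 'dummy-token')
    print(url)
    print(request.get_header('Authorization'))
```

The rest of the download loop should carry over largely unchanged, with urllib.request.urlopen(request) in place of urllib2.urlopen and HTTPError imported from urllib.error.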
