bio2bel_hgnc populate sqlAlchemy #11

Open
EidrianGM opened this issue Oct 14, 2022 · 1 comment

The problem seems to come from a badly formatted query generated for SQLAlchemy. I am using SQLAlchemy==1.3.24. The traceback below contains two errors:

  1. ValueError: invalid literal for int() with base 10: '' (raised by mgdid = int(mgd.split(':')[-1]))
  2. UNIQUE constraint failed: pyhgnc_hgnc.identifier

I had no problems populating bio2bel_chebi, which makes me think this is caused by unexpected changes in the HGNC data that bio2bel does not handle.
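
The first error can be reproduced in isolation. This is a minimal sketch, not pyhgnc code: `parse_mgd_id` is a hypothetical helper that copies the failing expression, and the input values are assumptions, since the traceback does not show which HGNC entry triggered it. It only shows that any value whose last `:`-separated part is empty produces this exact message.

```python
# Hypothetical helper mirroring the failing line in pyhgnc's get_mgds.
def parse_mgd_id(mgd):
    return int(mgd.split(':')[-1])

# A well-formed value parses without trouble.
print(parse_mgd_id('MGI:12345'))

# Assumed malformed values: an empty string or a prefix with nothing after it.
for bad in ('', 'MGI:'):
    try:
        parse_mgd_id(bad)
    except ValueError as exc:
        print(repr(bad), '->', exc)
```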

$ python3 -m bio2bel_hgnc populate

2022-10-14 14:28:20,894 - pyhgnc - INFO - low_memory set to False
 15%|████████████████████████▍    | 6311/43233 [00:05<00:29, 1262.21it/s]
 
Traceback (most recent call last):
  File "venv/lib/python3.8/site-packages/bio2bel/manager/abstract_manager.py", line 38, in populate_wrapped
    cls._populate_original(self, *populate_args, **populate_kwargs)
  File "venv/lib/python3.8/site-packages/bio2bel_hgnc/manager.py", line 85, in populate
    self.insert_hgnc(hgnc_dict=json_data, silent=silent, low_memory=low_memory)
  File "venv/lib/python3.8/site-packages/pyhgnc/manager/database.py", line 337, in insert_hgnc
    'mgds': self.get_mgds(hgnc_data),
  File "venv/lib/python3.8/site-packages/pyhgnc/manager/database.py", line 195, in get_mgds
    mgdid = int(mgd.split(':')[-1])
ValueError: invalid literal for int() with base 10: ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: pyhgnc_hgnc.identifier

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "venv/lib/python3.8/site-packages/bio2bel_hgnc/__main__.py", line 8, in <module>
    main()
  File "venv/lib/python3.8/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "venv/lib/python3.8/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "venv/lib/python3.8/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "venv/lib/python3.8/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "venv/lib/python3.8/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "venv/lib/python3.8/site-packages/click/decorators.py", line 27, in new_func
    return f(get_current_context().obj, *args, **kwargs)
  File "venv/lib/python3.8/site-packages/bio2bel_hgnc/manager.py", line 640, in populate
    manager.populate(use_hcop=(not skip_hcop))
  File "venv/lib/python3.8/site-packages/bio2bel/manager/abstract_manager.py", line 40, in populate_wrapped
    self._store_populate_failed()
  File "venv/lib/python3.8/site-packages/bio2bel/manager/connection_manager.py", line 93, in _store_populate_failed
    Action.store_populate_failed(self.module_name, session=self.session)
  File "venv/lib/python3.8/site-packages/bio2bel/models.py", line 95, in store_populate_failed
    _store_helper(action, session=session)
  File "venv/lib/python3.8/site-packages/bio2bel/models.py", line 140, in _store_helper
    session.commit()
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/scoping.py", line 163, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1046, in commit
    self.transaction.commit()
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 504, in commit
    self._prepare_impl()
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 483, in _prepare_impl
    self.session.flush()
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2540, in flush
    self._flush(objects)
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2682, in _flush
    transaction.rollback(_capture_exception=True)
  File "venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.raise_(
  File "venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2642, in _flush
    flush_context.execute()
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute
    persistence.save_obj(
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 239, in save_obj
    _emit_insert_statements(
  File "venv/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1135, in _emit_insert_statements
    result = cached_connections[connection].execute(
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "venv/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: pyhgnc_hgnc.identifier

[SQL: INSERT INTO pyhgnc_hgnc (name, symbol, identifier, status, uuid, orphanet, locus_group, locus_type, date_name_changed, date_modified, date_symbol_changed, date_approved_reserved, ensembl_gene, horde, vega, lncrnadb, entrez, mirbase, iuphar, ucsc, snornabase, pseudogeneorg, bioparadigmsslc, locationsortable, merops, location, cosmic, imgt) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: ('alpha-1-B glycoprotein', 'A1BG', 5, 'Approved', '24787fd0-c018-4336-8833-eb1c27cc9727', None, 'protein-coding gene', 'gene with protein product', None, '2020-09-17', None, '1989-06-30', 'ENSG00000121410', None, 'OTTHUMG00000183507', None, '1', None, None, 'uc002qsd.5', None, None, None, '19q13.43', 'I43.950', '19q13.43', None, None)]
(Background on this error at: http://sqlalche.me/e/13/gkpj)
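
The second error is what SQLite reports when an INSERT repeats a value in a UNIQUE column. One possible reading, which is an assumption and not confirmed by the traceback, is that a row with identifier 5 was already present (for example from an earlier partial populate) when the pending objects were flushed during `session.commit()`. The sketch below uses only the standard `sqlite3` module and a simplified stand-in table (`hgnc_demo` is hypothetical; the real `pyhgnc_hgnc` table has many more columns) to show the same message.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Simplified, hypothetical stand-in for pyhgnc_hgnc with a UNIQUE identifier column.
conn.execute(
    'CREATE TABLE hgnc_demo ('
    'id INTEGER PRIMARY KEY, symbol TEXT, identifier INTEGER UNIQUE)'
)
conn.execute("INSERT INTO hgnc_demo (symbol, identifier) VALUES ('A1BG', 5)")


def insert_again(connection):
    """Insert the same identifier a second time and return the error text, if any."""
    try:
        connection.execute(
            "INSERT INTO hgnc_demo (symbol, identifier) VALUES ('A1BG', 5)"
        )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return None


message = insert_again(conn)
print(message)
```

If that reading holds, starting from an empty database before populating again would be the thing to try, but that is a guess based on the traceback alone.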

EidrianGM commented Oct 14, 2022

A quick hack (a "ñapa") for the first error is to change this function in pyhgnc/manager/database.py. I think the mgd_id values are just not being captured properly. I am not interested in them anyway.

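    def get_mgds(self, hgnc):
        """Return the MGD objects for one HGNC entry, caching them in self.mgds."""
        mgds = []
        if 'mgd_id' in hgnc:
            for mgd in hgnc['mgd_id']:
                if mgd not in self.mgds:
                    try:
                        # Keep the part after the last ':' as a string; the
                        # original int() cast failed when that part was empty.
                        mgdid = mgd.split(':')[-1]
                        self.mgds[mgd] = models.MGD(mgdid=mgdid)
                    except Exception:
                        # Fall back to the raw value if it cannot be split.
                        self.mgds[mgd] = models.MGD(mgdid=mgd)
                mgds.append(self.mgds[mgd])
        return mgds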
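
An alternative to storing the raw string would be to skip entries that have no numeric part. This is a rough sketch under assumptions: `parse_mgd_number` and `valid_mgd_numbers` are hypothetical helpers, not part of pyhgnc, and the sample values are made up. It assumes the `mgdid` field is meant to hold an integer, which the original `int()` cast suggests but which is not verified against the model definition.

```python
def parse_mgd_number(mgd):
    """Return the integer after the last ':' in mgd, or None if there is none."""
    if not isinstance(mgd, str):
        return None
    suffix = mgd.rsplit(':', 1)[-1].strip()
    try:
        return int(suffix)
    except ValueError:
        # Empty or non-numeric suffix: treat the entry as unusable.
        return None


def valid_mgd_numbers(values):
    """Keep only the entries that carry a usable numeric identifier."""
    parsed = (parse_mgd_number(value) for value in values)
    return [number for number in parsed if number is not None]


# Made-up sample values, including the two malformed shapes that break int().
print(valid_mgd_numbers(['MGI:12345', 'MGI:', '', 'MGI:678']))
```

Inside `get_mgds`, the same check could decide whether to create an `MGD` object at all, so that malformed entries are dropped instead of stored.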
