Features/fix mvp 114 #967

Open
wants to merge 213 commits into
base: release/mvp
213 commits
3778645
Merge pull request #802 from Ensembl/fix/hive-deps-110
marcoooo Jun 9, 2023
b2b617f
Added a failure if gzip fails to complete properly (#804)
dpopleton Jun 13, 2023
5ea62af
Added Gzip check for array (#808)
dpopleton Jun 19, 2023
ab54cdf
fix data_files path removig the vertebrates folder
Jul 18, 2023
9e80a93
Merge pull request #812 from pblins/release/110
vinay-ebi Jul 18, 2023
fc8f643
point symlink to the analysis_type folder and not to the release folder
Jul 18, 2023
4d007ad
Merge pull request #814 from pblins/release/110
vinay-ebi Jul 26, 2023
b4ffc18
Merge branch 'main' of github.com:Ensembl/ensembl-production
marcoooo Sep 13, 2023
a1ef53d
Merge branch 'main' of github.com:Ensembl/ensembl-production into rel…
marcoooo Sep 24, 2023
7d11118
Merge pull request #825 from Ensembl/release/merge-111
marcoooo Sep 25, 2023
ac36066
Updating test dbs to latest schema
marcoooo Sep 25, 2023
5a31c85
Merge pull request #826 from Ensembl/feature/test_db_patch_112
marcoooo Sep 25, 2023
aea0b73
Add gencode primary tag to GFF3 files
Oct 2, 2023
881d271
Updated download URL for miRBase miRNA.dat file
jmgonzmart Oct 2, 2023
50e7bc4
Merge pull request #827 from TamaraNaboulsi/new/gencode_primary
marcoooo Oct 4, 2023
9a5508a
Merge pull request #829 from jmgonzmart/main
marcoooo Oct 4, 2023
45f88d3
Merge pull request #828 from jmgonzmart/main
marcoooo Oct 6, 2023
a3f251c
Updated .gitignore + patch DQ forgotten when initially branching
marcoooo Oct 16, 2023
e7ca268
Merge remote-tracking branch 'origin/main'
marcoooo Oct 16, 2023
aab833e
Merge pull request #830 from Ensembl/release-112/patch-sql
marcoooo Oct 16, 2023
19a9198
Merge remote-tracking branch 'origin/release/110' into bugfixes/merge…
marcoooo Oct 16, 2023
b0bd310
Merge pull request #833 from Ensembl/bugfixes/merge-fixes-110
marcoooo Oct 16, 2023
3318e16
Stop skipping lines when reaching sequence data (#837)
TamaraNaboulsi Oct 19, 2023
cb28396
Updated Core stats resources.
dpopleton Oct 31, 2023
5ebab63
Merge pull request #839 from Ensembl/update/112_slurm
dpopleton Oct 31, 2023
97e0fb3
Fixed open/close TSV file while writing to it
sgiorgetti Nov 6, 2023
47b7e3d
Fixed typo
sgiorgetti Nov 6, 2023
3236a1f
Merge pull request #840 from Ensembl/hotfix/Xref_TSV-dump
marcoooo Nov 6, 2023
94b4ca7
Merge pull request #841 from Ensembl/main
marcoooo Nov 7, 2023
2de0888
Added table cleanup. Added SLURM resource specs. Removed 4GB_8CPU's a…
dpopleton Nov 8, 2023
b0df7a2
Fix for "scalar on reference"
EbiArnie Nov 13, 2023
0e1c0fe
Moved to KyotoCabinet as DB
EbiArnie Nov 15, 2023
7b25f08
Added KyotoCabinet dependency
EbiArnie Nov 15, 2023
a3dd857
Merge pull request #845 from EbiArnie/main
marcoooo Nov 15, 2023
4393c3c
update port for blat files
vinay-ebi Nov 17, 2023
5d456ac
Merge pull request #851 from Ensembl/hotfix/blat_port
vinay-ebi Nov 18, 2023
f2e0f7d
Merge pull request #844 from Ensembl/update/112_slurm
dpopleton Nov 21, 2023
6773a1f
Better AlphaFold mapping, Travis
EbiArnie Nov 21, 2023
8c0eb84
Fix Travis cpanm
EbiArnie Nov 21, 2023
b4a1481
Merge pull request #846 from EbiArnie/feature/alphafold_kyotodb
marcoooo Nov 22, 2023
53d3045
Merge pull request #847 from Ensembl/main
marcoooo Nov 22, 2023
d2f2123
Merge branch 'release/112' into update/112_slurm
dpopleton Nov 22, 2023
9352e9c
Added additional resource classes: 50GB, 100GB, 200GB
dpopleton Nov 22, 2023
2c32ef0
Altered resource classes for XrefDownload_conf.pm
dpopleton Nov 22, 2023
803fb76
Added memory step increase
dpopleton Nov 23, 2023
d01b2f3
Integrate GIFTS
EbiArnie Nov 23, 2023
a7f556d
Merge pull request #854 from Ensembl/update/112_slurm
dpopleton Nov 23, 2023
5986520
removed analysis step copytopublicftp
vinay-ebi Nov 24, 2023
0e664ba
Merge pull request #856 from Ensembl/bugfix/earlydumps
vinay-ebi Nov 26, 2023
07f00f1
Merge pull request #855 from EbiArnie/feature/integrate_gifts
marcoooo Nov 27, 2023
b397e55
Merge pull request #857 from Ensembl/main
dpopleton Nov 27, 2023
bfbec1e
update DumpOrtholog_eg_conf.pm
JAlvarezJarreta Nov 28, 2023
fd52bf1
Merge pull request #858 from JAlvarezJarreta/patch-1
marcoooo Nov 28, 2023
2ec70d8
Fixes for pipe config and cleanup
EbiArnie Nov 29, 2023
2db8a49
Merge pull request #859 from EbiArnie/feature/integrate_gifts
dpopleton Dec 1, 2023
72d1903
Merge pull request #860 from Ensembl/main
dpopleton Dec 1, 2023
ae3aa21
Update regulation_ftp_symlinks.py
marcoooo Dec 15, 2023
a896738
fix signals symlink in rf dir
pblins Dec 18, 2023
0190c65
fix delete option
pblins Dec 18, 2023
128ac03
Merge pull request #863 from Ensembl/bugfixes/typo-log-regul-symlink
marcoooo Dec 18, 2023
d6d4572
Merge pull request #864 from pblins/bugfixes/signal-in-rf-dir
marcoooo Dec 18, 2023
c84c5c7
Altered resource classes for XrefDownload_conf.pm
dpopleton Jan 24, 2024
87454c7
Altered resource classes for Xref
dpopleton Jan 24, 2024
f0b3925
updated copyright
dpopleton Jan 24, 2024
e5d196a
Merge pull request #865 from Ensembl/update/112_slurm
dpopleton Jan 24, 2024
be5f319
Fixed array bug
dpopleton Jan 25, 2024
73e7801
Fixed typo
dpopleton Jan 25, 2024
b012cb5
Update update_copyrights.sh
marcoooo Jan 30, 2024
fe59fe5
2024 copyright update
marcoooo Jan 30, 2024
4422b56
Merge pull request #869 from Ensembl/bau/copyright-2024
vinay-ebi Jan 30, 2024
c8606fa
Add check for unvalid readdir output
Jan 30, 2024
6a8c1c4
Rework alphafold pipeline cleanup logic
EbiArnie Feb 1, 2024
46ced20
Merge pull request #871 from EbiArnie/feature/integrate_gifts
dpopleton Feb 1, 2024
78acaca
Merge pull request #870 from TamaraNaboulsi/fix_xref/perl_readdir
marcoooo Feb 5, 2024
19a9c31
Fix uniprot gene name + cleanup bugs
Feb 6, 2024
4e8e235
Merge pull request #875 from TamaraNaboulsi/xref_fix/uniprot_bugs
marcoooo Feb 6, 2024
38e1efe
Merge pull request #867 from Ensembl/bugfix/base_conf
marcoooo Feb 7, 2024
79b1f2e
Flag outdated taxon_ids in TaxonomyInfoCore pipeline
twalsh-ebi Feb 15, 2024
2f50996
Merge pull request #876 from twalsh-ebi/feature/flag_outdated_taxon_ids
marcoooo Feb 21, 2024
610d7a9
Load InterPro Family data only for canonicals
twalsh-ebi Feb 26, 2024
c0bbe2b
Merge pull request #878 from twalsh-ebi/feature/canonical_only_members
marcoooo Mar 4, 2024
a2596d4
Update ProteinFeatures_conf.pm
dpopleton Mar 15, 2024
d376dea
Reduced field lengths and optimised index on stable_id_lookup
sgiorgetti Mar 16, 2024
009cb6d
Fixed dependencies conflict in requirements.in
marcoooo Mar 18, 2024
35ff742
Merge pull request #888 from Ensembl/update/Protein_features_comment
marcoooo Mar 18, 2024
5b4122b
Merge pull request #889 from Ensembl/fix112/stable_ids_db
marcoooo Mar 18, 2024
30f409f
Merge pull request #890 from Ensembl/buxfixes/112-dependencies
marcoooo Mar 18, 2024
c617b54
New xref download pipeline using python and nextflow
Mar 28, 2024
93dfccd
Add recent changes (112)
Mar 28, 2024
d13bd11
Added handling of new Transcript attribute 'gencode_primary'
sgiorgetti Apr 3, 2024
20f4e5a
Merge pull request #908 from Ensembl/feature/gencode_primary
marcoooo Apr 4, 2024
6765b85
Using core model from ensembl-py
Apr 22, 2024
68347b0
Patched gencode_primary tag for harmonisation purposes
sgiorgetti Apr 22, 2024
e1d6f8b
Fixed gencode_primary string
sgiorgetti Apr 22, 2024
bccac73
Moving all db models to ensembl-py
Apr 23, 2024
5fa2a0f
Adding tags to download and cleanup jobs
Apr 23, 2024
6c9ec7a
Merge pull request #915 from Ensembl/fix/gencode_primary_tag
dpopleton Apr 23, 2024
ae58398
Merge pull request #914 from Ensembl/fix112/gencode_primary_tag
dpopleton Apr 23, 2024
80296b0
Merge pull request #905 from TamaraNaboulsi/xref/new_python_pipeline
dpopleton Apr 23, 2024
11d6bb3
Merge pull request #907 from Ensembl/release/112
dpopleton Apr 23, 2024
792c5f3
Updated patches for preperation of 113
dpopleton Apr 23, 2024
3907412
Merge pull request #916 from Ensembl/update/patches
dpopleton Apr 23, 2024
127ad09
Updating test dbs to latest schema
Apr 23, 2024
17d5c1a
Merge pull request #917 from Ensembl/feature/test_db_patch_113
dpopleton Apr 23, 2024
d924e50
Changed GENCODE Basic tag to 'gencode_basic' as per ENSINT-1885
sgiorgetti Apr 26, 2024
f09b9b7
Changed GENCODE Basic tag to 'gencode_basic' as per ENSINT-1885
sgiorgetti Apr 26, 2024
a0b629a
Minor fixes
Apr 29, 2024
ec58c92
Missing semicolon
Apr 29, 2024
6b97c44
Change glob parameters
Apr 29, 2024
5d99a70
Remove debugging
Apr 29, 2024
577c6ba
Fixes to prevent warnings
Apr 29, 2024
a29d373
Fix file paths
Apr 30, 2024
7fb4c04
Keep original files if no species file
May 2, 2024
e72a12d
Merge pull request #919 from Ensembl/fix/gencode_basic
dpopleton May 2, 2024
c05a7d4
Merge pull request #920 from TamaraNaboulsi/xref/new_python_pipeline
dpopleton May 2, 2024
74296b6
Updated HGNC custom download URL
jmgonzmart May 10, 2024
66f0ff4
Merge pull request #923 from jmgonzmart/release/113
vinay-ebi May 10, 2024
ca259d0
remove partition from slurm command
MatBarba May 14, 2024
0c7c41f
moved ensembl/xrefs to ensembl/production/xrefs
vinay-ebi May 15, 2024
0393357
Update xref.config
vinay-ebi May 15, 2024
4391762
base load changed to ensembl.production.xrefs
vinay-ebi May 16, 2024
fde8b7b
Merge pull request #925 from Ensembl/feature/nf_xref
vinay-ebi May 16, 2024
6f63ae1
Update requirements.txt
JAlvarezJarreta May 16, 2024
15b4ead
Updated default resources from 100mb to 1gb
dpopleton May 20, 2024
d9ac953
Merge pull request #928 from Ensembl/update/increase_rc
dpopleton May 20, 2024
2c33d61
Merge pull request #921 from Ensembl/main
vinay-ebi May 20, 2024
7859b31
Merge pull request #926 from JAlvarezJarreta/patch-2
vinay-ebi May 20, 2024
2021302
Updated JSON remodeler to stop Experimental push on scalar is now for…
dpopleton May 23, 2024
2e0063c
Merge pull request #924 from MatBarba/mbarba/slurm_partition
dpopleton May 24, 2024
d1f6dbf
Merge branch 'release/113' into merge_conflicts
vinay-ebi May 24, 2024
cdce7ed
Merge pull request #930 from Ensembl/merge_conflicts
dpopleton May 24, 2024
3290f21
Update xref_all_sources.json for RGD
vinay-ebi May 24, 2024
808b346
Merge pull request #931 from Ensembl/xref_resource_update
vinay-ebi May 24, 2024
975fa4a
Update tag names and info relating to gencode genesets
nwillhoft May 28, 2024
95c1320
Bugfix for files not being overwritten
May 29, 2024
ba155dc
Merge pull request #929 from Ensembl/update/file_dump_perl_compatibility
dpopleton May 29, 2024
1c15ef9
Merge pull request #933 from TamaraNaboulsi/xref/bugfix
vinay-ebi May 29, 2024
3ad6a02
Fix for when no species file is found
Jun 4, 2024
700c96f
Merge pull request #934 from TamaraNaboulsi/xref/bugfix
vinay-ebi Jun 4, 2024
9586a89
Update ProteinFeatures analysis
vinay-ebi Jun 4, 2024
f065884
Update ProteinFeatures_conf.pm
vinay-ebi Jun 4, 2024
2b7fded
Merge pull request #935 from Ensembl/bugifx/proteinfeatures
vinay-ebi Jun 5, 2024
d283cf0
Update xref_sources.json
vinay-ebi Jun 19, 2024
d74f87e
Update xref_all_sources.json
vinay-ebi Jun 19, 2024
4c8fcf0
Merge pull request #938 from Ensembl/bugfix/update_xenbase
vinay-ebi Jun 19, 2024
b2da23c
Fixed as per ENSPROD-9493
sgiorgetti Jun 20, 2024
7463aca
Merge pull request #939 from Ensembl/fix113/alphafold-displaylabel
vinay-ebi Jun 21, 2024
7e4aca5
Fix use of keys on a scalar
jgtate Jun 24, 2024
a3ff2b4
Fixes for 113 issues
Jun 26, 2024
39c3a57
Merge pull request #941 from TamaraNaboulsi/xref/fixes
dpopleton Jun 27, 2024
6549a72
Merge pull request #940 from Ensembl/fix-keys-on-scalar
jgtate Jul 1, 2024
ee5da99
Update SourceFactory.pm
vinay-ebi Jul 5, 2024
2976823
Merge pull request #942 from Ensembl/bug/experimenta_scalar
vinay-ebi Jul 5, 2024
d1ce293
Updated Base class with slurm default resource 1GB
vinay-ebi Jul 19, 2024
772caf1
Update Typo GB
vinay-ebi Jul 19, 2024
ce884ca
Merge pull request #918 from Ensembl/fix113/gencode_basic
vinay-ebi Jul 22, 2024
14500c9
Merge pull request #932 from nwillhoft/fix113/update_tag_info
vinay-ebi Jul 22, 2024
2b01517
decompress upidump.lis.gz file before load to hive db
vinay-ebi Jul 22, 2024
434e865
delete the upidump file after loading into hive db
vinay-ebi Jul 22, 2024
693c899
Update modules/Bio/EnsEMBL/Production/Pipeline/ProteinFeatures/LoadUn…
vinay-ebi Jul 22, 2024
51cc0c3
Update modules/Bio/EnsEMBL/Production/Pipeline/ProteinFeatures/LoadUn…
vinay-ebi Jul 22, 2024
7cbce1e
Merge pull request #943 from Ensembl/feature/slurm_resource
vinay-ebi Jul 22, 2024
515063a
Merge pull request #946 from Ensembl/bugfix/pf_uniparc_gunzip
vinay-ebi Jul 23, 2024
0eae3f6
Include human and mouse symlinks
pblins Jul 25, 2024
3542ca1
include Mouse and Human symlinks
Jul 25, 2024
ac6c741
include Mouse and Human symlinks
Jul 25, 2024
8380bd1
Merge pull request #947 from pblins/include-human-and-mouse
vinay-ebi Jul 25, 2024
bbc1362
Update ProteinFeatures_conf.pm
vinay-ebi Aug 13, 2024
d26c79e
Merge pull request #949 from Ensembl/bugfix/pf_analysis_lc
vinay-ebi Aug 13, 2024
3045fbb
Merge pull request #951 from Ensembl/release/113
vinay-ebi Sep 3, 2024
897ec8e
Changes for xref for release 114
Sep 9, 2024
be04bb1
Update XrefProcess_conf.pm
TamaraNaboulsi Sep 10, 2024
5f1bd53
patch 114
vinay-ebi Sep 10, 2024
e715bf6
Updating test dbs to latest schema
Sep 10, 2024
ec8ca60
Merge pull request #956 from Ensembl/fix_patch_updates_114
vinay-ebi Sep 10, 2024
4a1d025
Merge pull request #957 from Ensembl/feature/test_db_patch_114
vinay-ebi Sep 10, 2024
9c5e81a
Update python version to 3.10
vinay-ebi Sep 20, 2024
6f1c54c
Fix Specified key was too long; max key length is 1000 bytes
vinay-ebi Sep 22, 2024
8fe57dc
Fix Specified key was too long; max key length is 1000 bytes
vinay-ebi Sep 22, 2024
963b1f1
Fix Specified key was too long; max key length is 1000 bytes
vinay-ebi Sep 22, 2024
153c7d6
set travis dist to bionic to revert mysql version to 5.7 similar to p…
vinay-ebi Sep 22, 2024
8cd9826
set travis dist to default to revert mysql version to 5.7 similar to …
vinay-ebi Sep 22, 2024
a108cfc
Merge pull request #954 from TamaraNaboulsi/xref/changes
vinay-ebi Sep 24, 2024
928c0de
Fixed analysis name
jmgonzmart Sep 24, 2024
6725416
Added new resource class as it was required by the XrefDownload pipeline
jmgonzmart Sep 24, 2024
7f0a5f1
Updated file name in the URL for the UniParc data download
jmgonzmart Sep 24, 2024
f6154b6
Merge pull request #959 from Ensembl/feature/py3_10
vinay-ebi Sep 25, 2024
681fc92
Merge pull request #960 from jmgonzmart/release/114
vinay-ebi Sep 25, 2024
1fd5982
fix delete group param in dbcopy pipeline
vinay-ebi Oct 14, 2024
e7955f9
remove exiting param delete group assigned to group
vinay-ebi Oct 14, 2024
c5981b5
Merge pull request #965 from Ensembl/bugfix/dbcopy_delete_group
vinay-ebi Oct 15, 2024
a9e758f
Merge branch 'refs/heads/release/mvp' into features/fix_mvp_114
vinay-ebi Oct 15, 2024
62fc1cb
fix travis failures
vinay-ebi Oct 15, 2024
80ff508
update travis to run bot the perl and python tests
vinay-ebi Oct 15, 2024
a4797ab
fix missing mysql service in python jobs for travis
vinay-ebi Oct 15, 2024
59cd63e
fix missing mysql service in python jobs for travis
vinay-ebi Oct 15, 2024
5372422
fix missing mysql service in python jobs for travis
vinay-ebi Oct 15, 2024
7fe4248
fix missing mysql service in python jobs for travis
vinay-ebi Oct 15, 2024
088ed3f
Merge remote-tracking branch 'origin/features/fix_mvp_114' into featu…
vinay-ebi Oct 15, 2024
ac5e7ab
fix missing mysql service in python jobs for travis
vinay-ebi Oct 15, 2024
a444f32
fix missing mysql service in python jobs for travis
vinay-ebi Oct 16, 2024
3d4a9df
set perl version to 5.26.2
vinay-ebi Oct 16, 2024
f253f28
revert the travis yml file and remove the perl version 5.14
vinay-ebi Oct 16, 2024
c69590e
check perl test are running
vinay-ebi Oct 16, 2024
a61bfe6
Revert "fix missing mysql service in python jobs for travis"
vinay-ebi Oct 16, 2024
5781de6
check perl test are running
vinay-ebi Oct 16, 2024
b82e08a
update hive version to 2.7 in core stats pipeline
vinay-ebi Oct 16, 2024
68cc8f1
update hive version to 2.7 in production pipeline
vinay-ebi Oct 17, 2024
c42b788
update metadata-api to latest version
vinay-ebi Oct 22, 2024
efa637c
update metadata-api to latest version 3.3.0a1
vinay-ebi Oct 22, 2024
86da542
update MetadataUpdaterHiveCore to insert job details into result table
vinay-ebi Nov 5, 2024
fb31eff
test job_id
vinay-ebi Nov 5, 2024
6e4dfe7
set job id in output flow
vinay-ebi Nov 6, 2024
124 changes: 68 additions & 56 deletions .travis.yml
@@ -1,69 +1,81 @@
language: perl
perl:
- "5.14"
- "5.26.2"
services:
- mysql
env:
- COVERALLS=true DB=mysql
addons:
apt:
update: true
packages:
- unzip
- sendmail
- graphviz
- emboss
before_install:
- git clone --depth 1 https://github.com/Ensembl/ensembl-git-tools.git
- export PATH=$PATH:$PWD/ensembl-git-tools/bin
- export ENSEMBL_BRANCH=master
- export SECONDARY_BRANCH=main
- echo "TRAVIS_BRANCH=$TRAVIS_BRANCH"
- if [[ $TRAVIS_BRANCH =~ ^release\/[0-9]+$ ]]; then export ENSEMBL_BRANCH=$TRAVIS_BRANCH; export SECONDARY_BRANCH=$TRAVIS_BRANCH; fi
- echo "ENSEMBL_BRANCH=$ENSEMBL_BRANCH"
- echo "SECONDARY_BRANCH=$SECONDARY_BRANCH"
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-test
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-compara
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-datacheck
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-variation
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-metadata
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-funcgen
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-hive
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-orm
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-taxonomy
- git clone --branch 1.9 --depth 1 https://github.com/samtools/htslib.git
- git clone --branch release-1-6-924 --depth 1 https://github.com/bioperl/bioperl-live.git
- cd htslib
- make
- export HTSLIB_DIR=$(pwd -P)
- cd ..
install:
- cpanm --sudo -v --installdeps --with-recommends --notest --cpanfile ensembl/cpanfile .
- cpanm --sudo -v --installdeps --notest --cpanfile ensembl-hive/cpanfile .
- cpanm --sudo -v --installdeps --notest --cpanfile ensembl-datacheck/cpanfile .
- export PERL5LIB=$PERL5LIB:$PWD/bioperl-live
- cpanm --sudo -v --installdeps --notest .
- cpanm --sudo -n Devel::Cover::Report::Coveralls
- cp travisci/MultiTestDB.conf.travisci modules/t/MultiTestDB.conf
- mysql -u root -h localhost -e 'GRANT ALL PRIVILEGES ON *.* TO "travis"@"%"'
script:
- ./travisci/harness.sh
os: linux
jobs:
include:
- language: python
python: 3.8
- name: "Perl Job"
perl: "5.26.2"
services:
- mysql
env:
- COVERALLS=true DB=mysql
addons:
apt:
update: true
packages:
- unzip
- sendmail
- graphviz
- emboss
- libkyotocabinet-dev
before_install:
- git clone --depth 1 https://github.com/Ensembl/ensembl-git-tools.git
- export PATH=$PATH:$PWD/ensembl-git-tools/bin
- export ENSEMBL_BRANCH=master
- export SECONDARY_BRANCH=main
- echo "TRAVIS_BRANCH=$TRAVIS_BRANCH"
- if [[ $TRAVIS_BRANCH =~ ^release\/[0-9]+$ ]]; then export ENSEMBL_BRANCH=$TRAVIS_BRANCH; export SECONDARY_BRANCH=$TRAVIS_BRANCH; fi
- echo "ENSEMBL_BRANCH=$ENSEMBL_BRANCH"
- echo "SECONDARY_BRANCH=$SECONDARY_BRANCH"
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-test
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-compara
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-datacheck
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-variation
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-metadata
- git-ensembl --clone --branch $ENSEMBL_BRANCH --secondary_branch $SECONDARY_BRANCH --depth 1 ensembl-funcgen
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-hive
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-orm
- git-ensembl --clone --branch master --secondary_branch main --depth 1 ensembl-taxonomy
- git clone --branch 1.9 --depth 1 https://github.com/samtools/htslib.git
- git clone --branch release-1-6-924 --depth 1 https://github.com/bioperl/bioperl-live.git
- cd htslib
- make
- export HTSLIB_DIR=$(pwd -P)
- mysql -e "SET GLOBAL local_infile=1;"
- cd ..
install:
- cpanm --sudo -v --installdeps --with-recommends --notest --cpanfile ensembl/cpanfile .
- cpanm --sudo -v --installdeps --notest --cpanfile ensembl-hive/cpanfile .
- cpanm --sudo -v --installdeps --notest --cpanfile ensembl-datacheck/cpanfile .
- export PERL5LIB=$PERL5LIB:$PWD/bioperl-live
- cpanm travisci/kyotocabinet-perl-1.20.tar.gz
- cpanm --sudo -v --installdeps --notest .
- cpanm --sudo -n Devel::Cover::Report::Coveralls
- cp travisci/MultiTestDB.conf.travisci modules/t/MultiTestDB.conf
- mysql -u root -h localhost -e 'GRANT ALL PRIVILEGES ON *.* TO "travis"@"%"'
script:
- ./travisci/harness.sh

- name: "Python Job"
language: python
python:
- "3.10"
- "3.11"
services:
- mysql
env:
- COVERALLS=true DB=mysql
install:
- pip install -e .
- pip install -r requirements-test.txt
- pip install -e .
before_script:
- mysql -e "SET GLOBAL local_infile=1;"
script:
- pytest src/python/test


notifications:
email:
on_success: always
on_failure: always
slack:
secure: BkrSPAkOM5aTOpeyO9vZnHdZ0LF1PLk0r2HtcXN2eTMyiHoGXkl6VUjdAL8EkzI4gunW2GProdSIjHpf60WdiEmKAulMdJRI+xyUbuxnY31mwiikS9HYwqmPBbMTf0Mh2pMBngZRFs+gaFZDUMTfLfp+8MQfU1R54yb6hPuVt5I=
secure: BkrSPAkOM5aTOpeyO9vZnHdZ0LF1PLk0r2HtcXN2eTMyiHoGXkl6VUjdAL8EkzI4gunW2GProdSIjHpf60WdiEmKAulMdJRI+xyUbuxnY31mwiikS9HYwqmPBbMTf0Mh2pMBngZRFs+gaFZDUMTfLfp+8MQfU1R54yb6hPuVt5I=
2 changes: 1 addition & 1 deletion cpanfile
@@ -14,7 +14,7 @@ requires 'File::Slurp';
requires 'Log::Log4perl';
requires 'XML::Simple';
requires 'Time::Duration';
requires 'Tie::LevelDB';
requires 'IO::Zlib';
requires 'File::Temp';
requires 'Fcntl';
requires 'KyotoCabinet';
49 changes: 29 additions & 20 deletions modules/Bio/EnsEMBL/Production/Pipeline/AlphaFold/CreateAlphaDB.pm
@@ -33,12 +33,12 @@

This module prepares a DB with a mapping from Uniprot accession to related
Alphafold data (Alphafold accession, protein start, end). The DB is created on
disk in LevelDB format.
disk in KyotoCabinet format.

=head1 DESCRIPTION

- We expect the file accession_ids.csv to be available
- We go through the file and build a LevelDB mapping the Uniprot accession to the Alphafold data
- We go through the file and build a DB mapping the Uniprot accession to the Alphafold data

=cut

@@ -49,7 +49,7 @@ use strict;

use parent 'Bio::EnsEMBL::Production::Pipeline::Common::Base';
use Bio::EnsEMBL::Utils::Exception qw(throw info);
use Tie::LevelDB;
use KyotoCabinet;
use File::Temp 'tempdir';


@@ -66,7 +66,7 @@ sub run {

throw ("Data file not found: '$map_file' on host " . `hostname`) unless -f $map_file;

my $idx_dir = $self->param_required('alphafold_db_dir') . '/uniprot-to-alpha.leveldb';
my $idx_dir = $self->param_required('alphafold_db_dir') . '/uniprot-to-alphafold';
if (-d $idx_dir) {
system(qw(rm -rf), $idx_dir);
}
@@ -78,33 +78,42 @@ sub run {
$copy_to = $idx_dir;
$idx_dir = tempdir(DIR => '/dev/shm/');
}

tie(my %idx, 'Tie::LevelDB', $idx_dir)
or die "Error trying to tie Tie::LevelDB $idx_dir: $!";

my $db = new KyotoCabinet::DB;

# Set 4 GB mmap size
my $mapsize_gb = 4 << 30;

# Open the DB
# Open as the exclusive writer, truncate if it exists, otherwise create the DB
# Open the database as a file hash DB, 600M buckets, 4GB mmap, linear option for
# hash collision handling. These are tuned for write speed and for approx. 300M entries.
# As with a regular Perl hash, a duplicate entry will overwrite the previous
# value.
$db->open("$idx_dir/uniprot-to-alphafold.kch#bnum=600000000#msiz=$mapsize_gb#opts=l",
$db->OWRITER | $db->OCREATE | $db->OTRUNCATE
) or die "Error opening DB: " . $db->error();

my $map;
open($map, '<', $map_file) or die "Opening map file $map_file failed: $!";

# A line from accession_ids.csv looks like this:
# Uniprot accession, hit start, hit end, Alphafold accession, Alphafold version
# A0A2I1PIX0,1,200,AF-A0A2I1PIX0-F1,4
# Currently, all entries in this file have a unique uniprot accession and
# have a hit starting at 1

while (my $line = <$map>) {
chomp $line;
# A line from accession_ids.csv looks like this:
# Uniprot accession, hit start, hit end, Alphafold accession, Alphafold version
# A0A2I1PIX0,1,200,AF-A0A2I1PIX0-F1,4
# Currently, all entries in this file have a unique uniprot accession and
# have a hit starting at 1
unless ($line =~ /^\w+,\d+,\d+,[\w_-]+,\d+$/) {
chomp $line;
warn "Data error. Line is not what we expect: '$line'";
next;
die "Data error. Line is not what we expect: '$line'";
}
my @x = split(",", $line, 2);

# This is the DB write operation. Tie::LevelDB will croak on errors (e.g. disk full)
$idx{$x[0]} = $x[1];
# This is the DB write operation.
$db->set($x[0], $x[1]) or die "Error inserting data: " . $db->error();
}

close($map);
untie %idx;
$db->close() or die "Error closing DB: " . $db->error();

if ($copy_back) {
system (qw(cp -r), $idx_dir, $copy_to);
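
The line check and key/value split in the loop above can be tried in isolation. The following is a minimal sketch, not code from this PR: `parse_accession_line` is a hypothetical helper name, and the sample line is the one quoted in the module's comments. It shows that `split` with a limit of 2 cuts only at the first comma, so the stored value keeps the remaining fields as one string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the PR): validate one accession_ids.csv
# line and split it into a key and the remaining fields.
sub parse_accession_line {
    my ($line) = @_;
    chomp $line;
    # Same pattern as in the diff:
    # Uniprot accession, hit start, hit end, Alphafold accession, version
    die "Data error. Line is not what we expect: '$line'"
        unless $line =~ /^\w+,\d+,\d+,[\w_-]+,\d+$/;
    # A limit of 2 splits only at the first comma, so the value keeps
    # "start,end,Alphafold accession,version" as a single string.
    my ($key, $value) = split(",", $line, 2);
    return ($key, $value);
}

my ($key, $value) = parse_accession_line("A0A2I1PIX0,1,200,AF-A0A2I1PIX0-F1,4\n");
print "$key => $value\n";   # A0A2I1PIX0 => 1,200,AF-A0A2I1PIX0-F1,4
```

Because the check now dies instead of warning and skipping, a single malformed line stops the whole load; the sketch follows that behaviour.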
Original file line number Diff line number Diff line change
@@ -32,12 +32,12 @@
=head1 SYNOPSIS

This module prepares a DB with a mapping from Uniparc accession to Uniprot
accession. The DB is created on disk in LevelDB format.
accession. The DB is created on disk in KyotoCabinet format.

=head1 DESCRIPTION

- We expect the file idmapping_selected.tab.gz to be available
- We go through the file and build a LevelDB mapping the Uniparc accessions to Uniprot accessions
- We go through the file and build a DB mapping the Uniparc accessions to Uniprot accessions

=cut

@@ -49,7 +49,7 @@ use strict;
use parent 'Bio::EnsEMBL::Production::Pipeline::Common::Base';

use Bio::EnsEMBL::Utils::Exception qw(throw info);
use Tie::LevelDB;
use KyotoCabinet;
use IO::Zlib;
use File::Temp 'tempdir';

@@ -66,7 +66,7 @@ sub run {

throw ("Data file not found: '$map_file' on host " . `hostname`) unless -f $map_file;

my $idx_dir = $self->param_required('uniparc_db_dir') . '/uniparc-to-uniprot.leveldb';
my $idx_dir = $self->param_required('uniparc_db_dir') . '/uniparc-to-uniprot';
if (-d $idx_dir) {
system(qw(rm -rf), $idx_dir);
}
@@ -79,8 +79,21 @@ sub run {
$idx_dir = tempdir(DIR => '/dev/shm/');
}

tie(my %idx, 'Tie::LevelDB', $idx_dir)
or die "Error trying to tie Tie::LevelDB $idx_dir: $!";
my $db = new KyotoCabinet::DB;

# Set 4 GB mmap size
my $mapsize_gb = 4 << 30;

# Open the DB
# Open as the exclusive writer, truncate if it exists, otherwise create the DB
# Open the database as a file hash DB, 600M buckets, 4GB mmap, linear option for
# hash collision handling. These are tuned for write speed and for approx. 300M entries.
# Uniparc has 251M entries at the moment.
# As with a regular Perl hash, a duplicate entry will overwrite the previous
# value.
$db->open("$idx_dir/uniparc-to-uniprot.kch#bnum=600000000#msiz=$mapsize_gb#opts=l",
$db->OWRITER | $db->OCREATE | $db->OTRUNCATE
) or die "Error opening DB: " . $db->error();

my $map = new IO::Zlib;
$map->open($map_file, 'rb') or die "Opening map file $map_file with IO::Zlib failed: $!";
@@ -90,22 +103,27 @@ sub run {
# We pick out the Uniparc accession and Uniprot accession
# index[10] (Uniparc): UPI00003B0FD4; index[0] (Uniprot): Q6GZX4
my $line;

while ($line = <$map>) {
chomp $line;
unless ($line =~ /^\w+\t[[:print:]\t]+$/) {
warn "Data error: Line is not what we expect: '$line'";
next;
die "Data error: Uniparc accession is not what we expect: '$line'";
}
my @x = split("\t", $line, 12);
unless ($x[10] and $x[10] =~ /^UPI\w+$/) {
warn "Data error: Uniparc accession is not what we expect: '$line'";
next;
die "Data error: Uniparc accession is not what we expect: '$line'";
}
# This is the DB write operation.
my $oldval;
if ($oldval = $db->get($x[10])) {
$db->set($x[10], "$oldval\t" . $x[0]) or die "Error inserting data: " . $db->error();
} else {
$db->set($x[10], $x[0]) or die "Error inserting data: " . $db->error();
}
# This is the DB write operation. Tie::LevelDB will croak on errors (e.g. disk full)
$idx{$x[10]} = $x[0];
}

$map->close;
untie %idx;
$db->close() or die "Error closing DB: " . $db->error();

if ($copy_back) {
system (qw(cp -r), $idx_dir, $copy_to);
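
The get-then-set logic in the loop above means one Uniparc accession can end up with several tab-separated Uniprot accessions. A rough sketch of that behaviour follows. It makes two assumptions that are not in the PR: a plain Perl hash stands in for the KyotoCabinet DB so the example runs without that module, and `add_mapping` is a hypothetical helper. The second accession is made up.

```perl
use strict;
use warnings;

# A plain hash stands in for the KyotoCabinet DB here (an assumption, so
# that the sketch runs without the KyotoCabinet module installed).
my %db;

# Hypothetical helper: store $uniprot under $uniparc, appending with a tab
# if the key already has a value.
sub add_mapping {
    my ($db, $uniparc, $uniprot) = @_;
    my $oldval = $db->{$uniparc};
    # Tested with defined() rather than truthiness, so a stored value such
    # as "0" or "" would still count as present.
    $db->{$uniparc} = defined $oldval ? "$oldval\t$uniprot" : $uniprot;
    return;
}

add_mapping(\%db, 'UPI00003B0FD4', 'Q6GZX4');
add_mapping(\%db, 'UPI00003B0FD4', 'P12345');   # made-up second accession
print join(',', split(/\t/, $db{'UPI00003B0FD4'})), "\n";   # Q6GZX4,P12345
```

The diff tests the fetched value for truthiness instead of `defined`; for real Uniprot accessions the two should behave the same, so the `defined` check is only a defensive variation.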
Original file line number Diff line number Diff line change
@@ -164,7 +164,7 @@ sub run {
-db => 'alphafold',
-db_version => $alpha_version,
-db_file => $self->param('db_dir') . '/accession_ids.csv',
-display_label => 'AlphaFold DB import',
-display_label => 'AFDB-ENSP mapping',
-displayable => '1',
-description => 'Protein features based on AlphaFold predictions, mapped with GIFTS or UniParc'
);
Original file line number Diff line number Diff line change
@@ -266,7 +266,8 @@ sub Bio::EnsEMBL::Transcript::summary_as_hash {
$summary{'transcript_support_level'} = $self->tsl if $self->tsl;

my @tags;
push(@tags, 'basic') if $self->gencode_basic();
push(@tags, 'gencode_basic') if $self->gencode_basic();
push(@tags, 'gencode_primary') if $self->gencode_primary();
push(@tags, 'Ensembl_canonical') if $self->is_canonical();

# A transcript can have different types of MANE-related attributes (MANE_Select, MANE_Plus_Clinical)
3 changes: 2 additions & 1 deletion modules/Bio/EnsEMBL/Production/Pipeline/GFF3/DumpFile.pm
@@ -279,7 +279,8 @@ sub Bio::EnsEMBL::Transcript::summary_as_hash {
$summary{'transcript_support_level'} = $self->tsl if $self->tsl;

my @tags;
push(@tags, 'basic') if $self->gencode_basic();
push(@tags, 'gencode_basic') if $self->gencode_basic();
push(@tags, 'gencode_primary') if $self->gencode_primary();
push(@tags, 'Ensembl_canonical') if $self->is_canonical();

# A transcript can have different types of MANE-related attributes (MANE_Select, MANE_Plus_Clinical)
3 changes: 2 additions & 1 deletion modules/Bio/EnsEMBL/Production/Pipeline/GTF/DumpFile.pm
@@ -383,7 +383,8 @@ feature for the position of this on the genome
- cds_start_NF: the coding region start could not be confirmed
- mRNA_end_NF: the mRNA end could not be confirmed
- mRNA_start_NF: the mRNA start could not be confirmed.
- basic: the transcript is part of the gencode basic geneset
- gencode_basic: the transcript is part of the gencode basic geneset
- gencode_primary: the transcript is part of the gencode primary geneset

Comments

Original file line number Diff line number Diff line change
@@ -218,7 +218,7 @@ sub all_hashes {
} ## end foreach my $slice (@slices)

for my $seq_type (keys %$batch) {
for my $attrib_table (keys $batch->{$seq_type}) {
for my $attrib_table (keys %{$batch->{$seq_type}}) {
$attribute_adaptor->store_batch_on_Object($attrib_table, $batch->{$seq_type}->{$attrib_table}, 1000);
}
}
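
For context on the `keys` change above: calling `keys` directly on a hash reference was an experimental feature from Perl 5.14 and, per the Perl release notes, was removed in 5.24, which fits the move of the CI job from 5.14 to 5.26.2. The sketch below uses made-up data shaped like `$batch` (sequence type, then attribute table) and shows the explicit dereference that works on both old and new Perls.

```perl
use strict;
use warnings;

# Made-up data shaped like the $batch structure in the diff:
# sequence type => attribute table => payload.
my $batch = {
    gene       => { gene_attrib       => [1, 2] },
    transcript => { transcript_attrib => [3] },
};

my @seen;
for my $seq_type (sort keys %$batch) {
    # $batch->{$seq_type} is a hash reference, so it is dereferenced
    # with %{ ... } before keys iterates over it.
    for my $attrib_table (sort keys %{ $batch->{$seq_type} }) {
        push @seen, "$seq_type/$attrib_table";
    }
}
print join("\n", @seen), "\n";
```

The `sort` calls are only there to make the output order stable; the diff itself iterates in hash order.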
Original file line number Diff line number Diff line change
@@ -292,7 +292,10 @@ sub merge_xrefs {
$obj->{$dbname} = [];
}
for my $ann ( @{ $subobj->{$dbname} } ) {
push $obj->{$dbname}, $self->copy_hash($ann);
if (ref($obj->{$dbname}) ne 'ARRAY') {
$obj->{$dbname} = [];
}
push @{ $obj->{$dbname} }, $self->copy_hash($ann);
}
}
}
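
The same removal affects `push` on an array reference, which is why the new code dereferences with `@{ ... }` and first makes sure the slot really holds an array. A small sketch of that guarded push follows. `append_annotation` is a hypothetical helper, the GO identifiers are sample values, and a shallow hash copy stands in for `$self->copy_hash`, whose implementation is not shown in this diff.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the guarded push in the diff.
sub append_annotation {
    my ($obj, $dbname, $ann) = @_;
    # Reset the slot if it is missing or holds something other than an
    # array reference.
    $obj->{$dbname} = [] if ref($obj->{$dbname}) ne 'ARRAY';
    # A shallow copy stands in for $self->copy_hash (an assumption).
    push @{ $obj->{$dbname} }, { %$ann };
    return $obj;
}

my $obj = { GO => 'not an array' };
append_annotation($obj, 'GO', { primary_id => 'GO:0005515' });
append_annotation($obj, 'GO', { primary_id => 'GO:0008150' });
print scalar @{ $obj->{GO} }, "\n";   # 2
```

Note that the guard silently discards whatever non-array value was stored under the key before; whether that is the intended outcome for `merge_xrefs` is a question for the PR authors.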