Spam deletion script broken in production #60

Closed
jvendetti opened this issue Dec 12, 2022 · 2 comments

jvendetti commented Dec 12, 2022

The spam deletion script is executing nightly as scheduled, but doesn't appear to be deleting spam anymore.

I added a new account to the list of spam users in this commit, then ran the spam deletion script manually. The script output shows that the user ("buyadderallonline") and the ontology they uploaded (acronym ADDERALL) weren't deleted as expected:

[ncbo-deployer@ncbo-prd-app-31 ncbo_cron]$ bin/ncbo_spam_deletion
(LD) >> Using rdf store ncboprod-4store1:8080/sparql/
(LD) >> Using term search server at http://ncbo-prd-sol-01.stanford.edu:8983/solr/term_search_core1
(LD) >> Using property search server at http://ncbo-prd-sol-01.stanford.edu:8983/solr/prop_search_core1
(LD) >> Using HTTP Redis instance at ncbo-prd-rds-02.sunet:6380
(LD) >> Using Goo Redis instance at ncbo-prd-rds-03.sunet:6381
(AN) >> Using ANN Redis instance at ncbo-prd-rds-01.sunet:6379
(CNFG) >> OntologyRecommender not available, cannot load config
(CNFG) >> OntologiesAPI not available, cannot load config
(CR) >> Using Redis instance at localhost:6379
Processing details are logged to STDOUT
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1440    0  1440    0     0   5978      0 --:--:-- --:--:-- --:--:--  6000
I, [2022-12-12T14:39:11.544077 #14145]  INFO -- : No users/projects/notes/reviews/ontologies/provisional classes found
Completed removing SPAM
I, [2022-12-12T14:39:11.544146 #14145]  INFO -- : Completed removing SPAM

A BioPortal user reported the spam ontology, so I manually deleted it (and also the user account). However, there appear to be a couple of newer spam entries on the Projects page that could be used for testing, e.g. /projects/PAGOMU.

The scheduler-spam-deletion.log file shows no errors.


mdorf commented Apr 6, 2023

It looks like our current GitHub authorization token is invalid.


mdorf commented Apr 6, 2023

I added some additional handling to the script so that it fails with a corresponding error if anything other than a successful fetch of the SPAM user list occurs.
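
For illustration only, the kind of check described above might look like the following minimal sketch. It is not the code that was committed. The `SpamListFetchError` class, the `parse_spam_user_list` helper, and the one-username-per-line list format are all assumptions made for this example.

```ruby
# Hypothetical error type; not taken from the ncbo_cron code base.
class SpamListFetchError < StandardError; end

# Turn a fetched response into a list of usernames, or fail loudly.
# Assumption for this sketch: the list is plain text with one username
# per line, and lines starting with '#' are comments.
def parse_spam_user_list(status, body)
  # Anything other than HTTP 200 is treated as a failed fetch.
  unless status.to_i == 200
    raise SpamListFetchError, "SPAM user list fetch failed with HTTP status #{status}"
  end

  users = body.to_s.each_line.map(&:strip).reject do |line|
    line.empty? || line.start_with?("#")
  end

  # An empty list is treated as an error instead of "nothing to delete".
  raise SpamListFetchError, "SPAM user list is empty" if users.empty?

  users
end
```

Assuming an invalid token produces a non-200 response, a check along these lines would stop the run with an error instead of logging "No users/projects/notes/reviews/ontologies/provisional classes found".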

@alexskr alexskr closed this as completed May 2, 2023
syphax-bouazzouni referenced this issue in ontoportal-lirmm/ncbo_cron Dec 27, 2023
…its, and the Most visited pages in the month (#17)

* remove forgot variables

* fix for #61

- create contact instance if it doesn't exist
- changed --from-api to --from-apikey
- minor linting

* Restore branch specifier to develop

* Optimization - remove repeated query

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* Gemfile had references to develop branch

* implemented #64 - ability to generate labels independently of RDF processing (and vice versa)

* Gemfile.lock update

* fixed a bug in #64

* Relocate docker-compose file and update default configs

* Add GH workflow for publishing docker images

* use ruby native method for listing files instead of a git function

Resolves warning messages when we exclude .git directory from docker image

* remove comment

* capitalize argument in order to be consistent with other scripts

* add arm/64 platform

* additional error handling for SPAM deletion script, #60

* additional error handling for SPAM deletion script, #60

* implemented #67 - improved corrupt data and error handling

* Gemfile.lock update

* exclude test/data/dictionary.txt from git commits

* update version of solr-ut

* Gemfile.lock update

* Restore branch specifier to master

* fixed configuration for the analytics module

* Gemfile.lock update

* implemented #69 - scheduled annotator dictionary file generation should be a configurable option instead of the default

* Gemfile.lock update

* gem update

* create new rake task for updating purls for all ontologies

moved from ontologies_api/fix_purls.rb

* initial implementation of #70 - Google Analytics v4 Update Compatibility Issue

* added the /data folder to ignore

* update gems

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* use patched version of agraph v7.3.1

* unpin faraday gem

* A change to reference Analytics Redis from LinkedData block

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* use assert_operator instead of assert

minitest style guide adherence.
encountered an intermittent unit test failure so assert_operator will provide
better failure feedback than assert

* use local solr to pass the tests

* fixed ncbo_ontology_archive_old_submissions error output

* Gemfile.lock update

* Gemfile.lock update

* Gemfile update

* Gemfile update

* fixes to the analytics script and a new script to generate UA analytics for documentation

* Gemfile.lock update

* Gemfile.lock update

* implemented the first pass at bmir-radx/radx-project#37

* implemented the first pass at bmir-radx/radx-project#37

* set bundler version to be compatible with ruby 2.7

+ AG v8

* refactor ontologies analytics job to handle the new google analytics migration

* add user analytics fetching the monthly user visits count

* add page visits analytics  fetching  last month most visited pages

* extract google analytics UA import code to a script to make current code clean of it

* add option to force submission archiving even if already archived

---------

Co-authored-by: Alex Skrenchuk <alexskr@stanford.edu>
Co-authored-by: mdorf <mdorf@stanford.edu>
Co-authored-by: Jennifer Vendetti <vendetti@stanford.edu>
syphax-bouazzouni referenced this issue in ontoportal-lirmm/ncbo_cron Dec 28, 2023
…its, and the Most visited pages in the month (#17)

syphax-bouazzouni referenced this issue in ontoportal-lirmm/ncbo_cron Dec 28, 2023
…its, and the Most visited pages in the month (#17)

syphax-bouazzouni referenced this issue in ontoportal/ncbo_cron Jan 16, 2024
…onward (#2)

* add a script to eradicate (delete data+ files) submissions of an ontology

* Auto stash before merge of "development" and "master"

* omit logs link file

* update the eradicator to support the eradication of not archived submissions if wanted

* fix the delete submission files to not let behind empty directories

* do not remove the submission directory because it's already done by the submission.delete

* Update Gemfile.lock

* Reset branch specifier to develop

* extract do_ontology_pull function

* some simple code refactor in the ontology_pull

* simple code refactor of test_ontology_pull

* add a script to do a ontology pull on an ontology on demand

* set the name of the new script in $0

* extract new_file_exists? method from do_ontology_pull

* save the submission in the RemoteFileException

* some automatic code refactor/lint

* use the new do_ontology_pull in the old  do_remote_ontology_pull

* fixed an API call mentioned by @syphax-bouazzouni in ncbo/bioportal-project#254

* fixed an API call mentioned by @syphax-bouazzouni in ncbo/bioportal-project#254

* Gemfile.lock update

* bump up version of actions/checkout from v2->v3

* Gemfile.lock update

* Merge branch 'develop'

* remove forgot variables

* GH Actions unit test workflow refactor

- add ruby versioning via docker-compose.yml file
- bump up ruby v2.6 -> v2.7
- add AllegroGraph backend
- add code coverage

* Remove extra space

* fix for #61

- create contact instance if it doesn't exist
- changed --from-api to --from-apikey
- minor linting

* Restore branch specifier to develop

* Optimization - remove repeated query

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* Gemfile had references to develop branch

* implemented #64 - ability to generate labels independently of RDF processing (and vice versa)

* Gemfile.lock update

* fixed a bug in #64

* Relocate docker-compose file and update default configs

* Add GH workflow for publishing docker images

* use ruby native method for listing files instead of a git function

Resolves warning messages when we exclude .git directory from docker image

* remove comment

* capitalize argument in order to be consistent with other scripts

* add arm/64 platform

* additional error handling for SPAM deletion script, #60

* additional error handling for SPAM deletion script, #60

* implemented #67 - improved corrupt data and error handling

* Gemfile.lock update

* exclude test/data/dictionary.txt from git commits

* update version of solr-ut

* Gemfile.lock update

* Restore branch specifier to master

* fixed configuration for the analytics module

* Gemfile.lock update

* implemented #69 - scheduled annotator dictionary file generation should be a configurable option instead of the default

* Gemfile.lock update

* gem update

* create new rake task for updating purls for all ontologies

moved from ontologies_api/fix_purls.rb

* initial implementation of #70 - Google Analytics v4 Update Compatibility Issue

* added the /data folder to ignore

* update gems

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* use patched version of agraph v7.3.1

* unpin faraday gem

* A change to reference Analytics Redis from LinkedData block

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* Gemfile.lock update

* use assert_operator instead of assert

minitest style guide adherence.
encountered an intermittent unit test failure so assert_operator will provide
better failure feedback than assert

* fixed ncbo_ontology_archive_old_submissions error output

* Gemfile.lock update

* Gemfile.lock update

* Gemfile update

* Gemfile update

* fixes to the analytics script and a new script to generate UA analytics for documentation

* Gemfile.lock update

* Gemfile.lock update

* implemented the first pass at bmir-radx/radx-project#37

* implemented the first pass at bmir-radx/radx-project#37

* set bundler version to be compatible with ruby 2.7

+ AG v8

* Gemfile.lock update

* Gemfile.lock update

---------

Co-authored-by: Jennifer Vendetti <vendetti@stanford.edu>
Co-authored-by: mdorf <mdorf@stanford.edu>
Co-authored-by: Alex Skrenchuk <alexskr@stanford.edu>