- Codespelled some obvious typos etc #100 (@yarikoptic)
- Add beforeCommitChangelog hook for updating datalad_crawler/version.py #98 (@jwodder)
- Use Markdown README as-is on PyPI #99 (@jwodder)
- John T. Wodder II (@jwodder)
- Yaroslav Halchenko (@yarikoptic)
- Michael Hanke (@mih)
- Set up workflow with auto for releasing & PyPI uploads #86 (@jwodder @yarikoptic)
- John T. Wodder II (@jwodder)
- Yaroslav Halchenko (@yarikoptic)
- RF: Replace custom SafeConfigParserWithIncludes with standard ConfigParser
- BF: gh - handle situation when cloned repo is still empty
- Fix up use of protected datalad's interface for auth to github. Boosted DataLad version dependency to 0.13.6
- Making compatible with recent DataLad by using new WitlessRunner and not older unused features.
- RF: stop using `_{git,annex}_custom_command` to allow DataLad core to progress forward without "breaking" the crawler
- ENH: fix enabling special remotes when working with recent (as of 202006) git-annex
- NF: gh (for github) and xnat crawler pipelines
- DataLad 0.12 is now minimal version. Codebase is now compatible with current DataLad 0.12.2-293-gd5fcb4833
- uses less of GitPython functionality
- OpenfMRI pipeline tests "relaxed" (no commit counts etc)
- s3 node - be robust in case of no previous version-id known
- ENH: `s3_simple` pipeline got additional option `drop_immediately` to drop files immediately upon having them annexed
- RF: `mock` is explicitly listed as a dependency for testing since DataLad 0.12.x will be PY3 only and could use built-in `unittest.mock`
- MNT: More changes for compatibility with developmental DataLad (#62)
- BF: Prevent sorting error on missing attribute (#45)
- BF: enclose "if else" into () since it has lower precedence than + (#43)
- MNT: Adjust imports for compatibility with developmental DataLad (#53)
- MNT: Update save() call for compatibility with new save (#42)
- Compatibility layer with 0.12 series of DataLad changing API (no backend option for `create`)
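
To illustrate the precedence fix from #43 listed above: in Python, a conditional expression binds more loosely than `+`, so an unparenthesized `if else` takes the entire concatenation as its "true" branch. A minimal sketch with hypothetical variable names (this is not the crawler's actual code):

```python
# Hypothetical values, chosen only to show the parsing difference
prefix = "data"
suffix = None

# Without parentheses this parses as: (prefix + "-x") if suffix else ""
# so the prefix is lost whenever suffix is falsy
unparenthesized = prefix + "-x" if suffix else ""

# With parentheses only the optional part is conditional,
# and the prefix is always kept
parenthesized = prefix + ("-x" if suffix else "")

print(repr(unparenthesized))
print(repr(parenthesized))
```

With a falsy `suffix`, the first form yields an empty string while the second keeps `prefix`, which is the kind of difference the added parentheses are meant to guard against.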
Primarily a variety of fixes and small enhancements. The only notable change is stripping away testing/support of git-annex direct mode.
- do not depend on a release candidate of DataLad, since pip would then allow RCs of any later releases to be installed
- `simple_with_archives`
  - issue warning if incoming_pipeline has Annexificator but no `annex` is given
- `crcns`
  - skip (but warn if relevant) records without xml
- do not crash while saving updated crawler's URL db to the file which is annexed.
Primarily a variety of fixes
- `crcns` crawler now uses new datacite interface
- `openfmri` crawler uses legacy.openfmri.org
- `simple_with_archives`
  - by default now also match pure .gz files to be downloaded
  - `archives_re` option provides regex for archives files (so `.gz` could be added if needed)
  - will now run with `tarballs=False`
  - `add_annex_to_incoming_pipeline` to state whether to add `annex` to the incoming pipeline
- new `stanford_lib` pipeline
- aggregation of metadata explicitly invokes incremental mode
- tests
  - variety of tests lost their `@known_failure_v6` and are now tolerant to upcoming datalad 0.11.2
- All non-master branches in the pipelines will now initiate from the master branch, not detached. That should allow them to inherit the .gitattributes settings of the entire dataset
- First release as a DataLad extension. Functionality remains identical to DataLad 0.10.0.rc2