The cayenne base codebase. Implements useful metadata transforms, ID handling, a resource API, OAI metadata download and ingest / indexing.
Make sure you have these dependencies installed within your development environment:
- Java 7
- Leiningen
- Docker
The tests require Docker to spin up service dependencies on development machine (SOLR and MongoDB). Download Docker for Mac from https://www.docker.com/docker-mac and confirm that it's installed with docker-compose --version
.
Before running cayenne, building an uberjar or building a docker docker image, CSL resources must be pulled into the local repository via git submodules. Manifest files must then be created for CSL style files and locale files.
Update git submodules to bring in CSL style and locale files:
$ git submodule update --init
Refresh resources/styles.edn
and resources/locales.edn
:
$ lein csl
Call lein repl
, (begin)
then try a few commands, for example, parse some XML downloaded from
OAI-PMH:
$ lein repl
> (begin)
> (action/get-oai-records (conf/get-param [:oai :crossref-journals]) "2012-01-01" "2012-01-02" action/dump-plain-docs)
Run as a production service with some profiles:
$ lein with-profiles prod run :api :index :update-members :update-journals :update-funders
- :api - Run the resource HTTP API.
- :index - Run an OAI download and index once daily.
- :update-members - Collect member records and update with metadata coverage stats (once a day).
- :update-journals - Collect journal records and update with metadata coverage stats (once a day).
- :update-funders - Load new funder registry RDF when available. Checks for new RDF once an hour.
- :graph - Enables a connection to the graph database backend, datomic.
- :graph-api - Must be specified along with :api and :graph. Enables the graph API. Requires datomic leiningen profile.
- :feed-api - Must be specified along with :api. Enables the feed API for real-time metadata ingest.
- :process-feed-files - Run async processing of incoming feed files. Should be enabled with :feed-api.
- :solr-inserts - Run solr inserts. Should be enabled with :feed-api or instances perform OAI-PMH harvesting.
Run as a daemonized production service with lein-daemon:
$ lein with-profiles prod daemon start cayenne :api :index :update-members
Accepts the same arguments as lein run. Also available are:
$ lein with-profiles prod daemon stop cayenne
$ lein with-profiles prod daemon check cayenne
In daemonized mode, feature logging will still go to log/log.txt
, however,
standard out from the daemonized process's start up phase will be sent to
cayenne.log
rather than standard out.
When running as a daemon it is sometimes useful to start an nrepl server to later connect a repl:
$ lein with-profiles prod daemon start cayenne :api :nrepl
Create a docker image:
$ lein uberimage
Running with lein test
should take care of creating any required infrastructure, typically MongoDB and Solr.
The Solr instance will be created using docker image crossref/cayenne-solr
, this docker image is available in docker hub but
can also be created locally by cloning https://github.com/crossref/cayenne-solr
and running docker image build ./ -t crossref/cayenne-solr
, building
the image locally is useful if you want to make changes to the Solr schema.
In order for the tests to pass there must be a specific set of feed files present in the feed input directory, these feed files are not currently in this repository because of distribution issues but this will be addressed. For now, if the expected number of feed files is not present an exception will be thrown:
actual: java.lang.Exception: The number of feed input files is not as expected. Expected to find 174 files in /home/markwoodhall/src/crossref/cayenne/dev-resources/feeds/source
Note. Occasionally HTTP Kit will hold onto port 3000 after starting the API, this can sometimes cause problems with multiple
test runs, running a subset, e.g. lein test cayenne.works-test
is more reliable.
Running tests from the REPL will also work.
References are displayed in API output. The visiblity level of those references
can be set as open
, limited
or closed
, where each setting will display
more references than the last. This setting can be configured directly by
the internal config variable [:service :api :references]
or via the ENV VAR
REFERENCES
.