This repository has been archived by the owner on Apr 15, 2021. It is now read-only.

7. Setting up CKAN: Running a Harvest

Jessica Good edited this page Aug 13, 2014 · 1 revision

This page describes how to run a harvest in a development environment. In the production environment, harvesting is automated with Upstart.

For more directions, see https://github.com/ckan/ckanext-harvest and https://github.com/ngds/ckanext-ngds/issues/226

Start Solr

$ cd ~/solr-4.5.1/example/
$ java -jar start.jar
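
Before moving on, it can help to confirm that Solr is answering requests. The following is a minimal sketch, assuming the example server listens on Solr's default port 8983 on localhost and that curl is installed. `solr_is_up` is a hypothetical helper name, not part of Solr or CKAN.

```shell
# Hypothetical helper: succeeds if the URL answers without an HTTP error status.
# Assumption: the Solr example server is reachable at http://localhost:8983/solr/
solr_is_up() {
  url="${1:-http://localhost:8983/solr/}"
  # --fail makes curl exit non-zero on HTTP errors (status >= 400);
  # --max-time stops curl from waiting indefinitely on a hung server.
  curl --silent --fail --max-time 5 --output /dev/null "$url"
}
```

From another terminal, something like `solr_is_up && echo "Solr is running"` would then report whether the server is reachable.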

Start central.usgin.org

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster serve central.ini

Start RabbitMQ

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ sudo apt-get install rabbitmq-server
(ckanenv) $ sudo service rabbitmq-server start

Username: developer, password: django
If the server started, there should be an entry in /var/log/rabbitmq/startup_log.
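
The log check can be scripted. This is a minimal sketch that only tests whether the startup log exists and is non-empty. `rabbitmq_log_present` is a hypothetical helper name, and reading the log directory may require elevated permissions on some systems.

```shell
# Hypothetical helper: succeeds if the RabbitMQ startup log exists and is non-empty.
# The default path is the one mentioned on this page.
rabbitmq_log_present() {
  log="${1:-/var/log/rabbitmq/startup_log}"
  # -s is true only for an existing file with a size greater than zero
  [ -s "$log" ]
}
```

For example, `rabbitmq_log_present && echo "startup log found"`. Running `sudo rabbitmqctl status` is another way to ask the broker itself whether it is up.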

Create a new harvest in the UI

If the harvest was created previously and failed, click the Clear button and then Reharvest.

List harvest jobs

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester jobs --config=central.ini

A NEW harvest job should appear in the list.
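
The harvester commands on this page differ only in the subcommand (jobs, gather_consumer, fetch_consumer, run), so a small wrapper can cut down on retyping. This is a sketch under stated assumptions: `harvester_cmd`, `CKAN_DIR`, `CONFIG`, and `HARVEST_DRY_RUN` are hypothetical names introduced here, with defaults taken from the paths used on this page. By default it only prints the command.

```shell
# Hypothetical wrapper around the repeated paster invocation.
# Defaults mirror the paths used on this page; override them as needed.
CKAN_DIR="${CKAN_DIR:-$HOME/ckanenv/src/ckan}"
CONFIG="${CONFIG:-central.ini}"

harvester_cmd() {
  # $1 is the harvester subcommand: jobs, gather_consumer, fetch_consumer, or run
  cmd="paster --plugin=ckanext-harvest harvester $1 --config=$CONFIG"
  if [ "${HARVEST_DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default): show what would be executed
    echo "$cmd"
  else
    # $cmd is left unquoted on purpose so it splits into separate arguments
    (cd "$CKAN_DIR" && $cmd)
  fi
}
```

With the virtualenv activated, `HARVEST_DRY_RUN=0 harvester_cmd jobs` would run the same command as the listing step above.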

Start the consumer for the gathering queue

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester gather_consumer --config=central.ini

Once the terminal prints "Gather queue consumer registered", proceed to the next step.

Start the consumer for the fetching queue

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer --config=central.ini

Once the terminal prints "Fetch queue consumer registered", proceed to the next step.
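
Waiting for the "consumer registered" messages can be scripted if the consumers write to log files instead of a terminal. The sketch below assumes each consumer's output has been redirected to a file (for example, by appending `> gather.log 2>&1 &` to the command) and that the message reaches that file. `wait_for_message` and the log file names are hypothetical.

```shell
# Hypothetical helper: poll a file until it contains a fixed string.
# Arguments: FILE MESSAGE [TRIES], with one try per second (default 60 tries).
wait_for_message() {
  file="$1"
  message="$2"
  tries="${3:-60}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -F matches the message literally, -q suppresses grep's output
    if grep -q -F "$message" "$file" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  # The message never appeared within the allowed number of tries
  return 1
}
```

For example, `wait_for_message gather.log "Gather queue consumer registered"` returns successfully once the gather consumer is ready, and fails after about a minute otherwise.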

Run harvest jobs

In a new terminal:

$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester run --config=central.ini

The terminal running the gather consumer should now start processing. When it finishes, the terminal running the fetch consumer will start.

If the process appears to hang at any point, press Ctrl-C to cancel.
