7. Setting up CKAN: Running a Harvest
This page describes how to run a harvest in a development environment. In the production environment this is automated with Upstart.
More directions at https://github.com/ckan/ckanext-harvest and https://github.com/ngds/ckanext-ngds/issues/226
$ cd ~/solr-4.5.1/example/
$ java -jar start.jar
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster serve central.ini
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ sudo apt-get install rabbitmq-server
(ckanenv) $ sudo service rabbitmq-server start
username: developer password: django
If the server started, there should be an entry in /var/log/rabbitmq/startup_log.
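The startup check can also be scripted. The following is a minimal sketch, not part of the original setup: the helper names `log_has_entry` and `port_open` are hypothetical, and it assumes RabbitMQ listens on its default AMQP port 5672 on localhost.

```shell
#!/usr/bin/env bash
# Sketch: confirm RabbitMQ started before launching the harvest consumers.
# Assumption: the broker uses the default AMQP port 5672 on localhost.

# log_has_entry FILE: succeeds if FILE exists and is not empty.
log_has_entry() {
    [ -s "$1" ]
}

# port_open HOST PORT: succeeds if a TCP connection can be opened within 2 seconds.
# Uses bash's /dev/tcp redirection inside a child shell so the socket closes on exit.
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

if log_has_entry /var/log/rabbitmq/startup_log; then
    echo "startup_log has an entry"
else
    echo "startup_log is missing or empty"
fi

if port_open 127.0.0.1 5672; then
    echo "something is listening on port 5672"
else
    echo "nothing is listening on port 5672"
fi
```

If either check fails, `sudo service rabbitmq-server status` is a reasonable next thing to look at.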
- Go to http://127.0.0.1:5000/
- Log in as admin with the password admin
- Click the Content button
- Click Add Harvest Source
- Fill in the URL and title
- Some CSW URLs we have used are http://gdr.openei.org/csw, http://geothermal.smu.edu/geoportal/csw, and http://catalog.usgin.org/geothermal/csw
- Save
- Go back into the harvest job and click the Reharvest button
If the harvest was created previously and failed, click the Clear button and then Reharvest.
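Before adding a source, it can save time to confirm that the CSW endpoint answers at all. The sketch below is an optional check, not something the harvester requires: the function names are hypothetical, and it assumes the endpoint speaks CSW version 2.0.2 and that `curl` is installed.

```shell
#!/usr/bin/env bash
# Sketch: check that a CSW endpoint responds to a GetCapabilities request.
# Assumption: the server supports CSW 2.0.2; change the version if yours differs.

# capabilities_url BASE: append the GetCapabilities query string to an endpoint URL.
capabilities_url() {
    printf '%s?service=CSW&version=2.0.2&request=GetCapabilities\n' "$1"
}

# csw_responds BASE: succeeds if the endpoint returns an HTTP success status.
csw_responds() {
    curl --silent --fail --max-time 30 --output /dev/null "$(capabilities_url "$1")"
}

# Show the URL that would be requested for one of the sources listed above.
capabilities_url http://gdr.openei.org/csw
```

Running `csw_responds http://gdr.openei.org/csw && echo ok` then reports whether the server answered. A success status only shows the endpoint is reachable, not that every record will harvest cleanly.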
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester jobs --config=central.ini
The output should show a NEW harvest job running.
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester gather_consumer --config=central.ini
Once the output says "Gather queue consumer registered", proceed to the next step.
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester fetch_consumer --config=central.ini
Once the output says "Fetch queue consumer registered", proceed to the next step.
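Watching for the "registered" messages can be automated if a consumer's output is redirected to a file. The sketch below is one possible way to do that: `wait_for_line` and the log file name `gather.log` are hypothetical, and it assumes the consumer writes the message to the redirected output without long buffering delays.

```shell
#!/usr/bin/env bash
# Sketch: block until a log file contains a given message, or give up after a timeout.

# wait_for_line FILE TEXT [TIMEOUT_SECONDS]: poll FILE once per second for TEXT.
# Returns 0 as soon as TEXT is found, 1 if the timeout (default 60) runs out first.
wait_for_line() {
    local file="$1" text="$2" remaining="${3:-60}"
    while :; do
        # -F treats TEXT as a fixed string; a missing file counts as "not yet".
        grep -qF -- "$text" "$file" 2>/dev/null && return 0
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# Example use (hypothetical log file name):
#   paster --plugin=ckanext-harvest harvester gather_consumer --config=central.ini > gather.log 2>&1 &
#   wait_for_line gather.log "Gather queue consumer registered" 120 && echo "gather consumer ready"
```

The same call with "Fetch queue consumer registered" covers the fetch consumer.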
In a new terminal:
$ source ~/ckanenv/bin/activate
(ckanenv) $ cd ~/ckanenv/src/ckan
(ckanenv) $ paster --plugin=ckanext-harvest harvester run --config=central.ini
The terminal with the gather queue should now start to run, and when that is finished, the terminal with the fetch queue will run.
If at any time it seems like the process is hanging, press Ctrl-C to cancel.
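As a recap, the four harvester commands above differ only in their subcommand. The sketch below prints them in the order this page runs them; `harvest_cmd` is a hypothetical helper, and it only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: print the harvester commands from this page in the order they are run.

# harvest_cmd SUBCOMMAND [CONFIG]: print the paster invocation for one subcommand.
# CONFIG defaults to central.ini, the config file used throughout this page.
harvest_cmd() {
    printf 'paster --plugin=ckanext-harvest harvester %s --config=%s\n' "$1" "${2:-central.ini}"
}

# Each command runs in its own terminal, from ~/ckanenv/src/ckan with ckanenv activated.
for sub in jobs gather_consumer fetch_consumer run; do
    harvest_cmd "$sub"
done
```

The two consumers keep running in their terminals, so they must be started and registered before `run` is issued.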