DDD-Downloader

The DDD downloader is a set of scripts for downloading the Databank Digitale Dagbladen (DDD, the digitized newspaper database) from the National Library of the Netherlands (Koninklijke Bibliotheek, KB). It was created in 2011 by Daan Odijk at ILPS (University of Amsterdam).

The scripts and resulting dataset have been used in these publications:

If you use these scripts or the retrieved dataset for your own research, please include a reference to one of these articles.

Documentation for the scripts is currently very limited because the code was developed for internal use.

The code is released under the GNU Lesser General Public License (LGPL); see the License section below. If you have any questions, contact Daan.

Usage

  • Update the collection via OAI-PMH using update.sh. This generates a set of XML files from the OAI-PMH server.
  • Use multi_store_kb.py to download all OCR'ed text from the KB servers.
  • Run count_kb.py to get an overview of the number of files downloaded. The script is also a good starting point if you want to process the files yourself.
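
The first step relies on OAI-PMH, a protocol that returns large result sets in pages, each page carrying a resumption token that points to the next one. As a rough illustration of what handling one such page involves, here is a minimal Python sketch. It is not taken from update.sh: the function names, the sample identifiers, and the example.org URL are made up for this example, and only the OAI-PMH namespace and element names come from the protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_identifiers(xml_text):
    """Return (identifiers, resumption_token) for one ListIdentifiers page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text.strip()
        for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
        if el.text
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # A missing or empty token means this was the last page.
    has_token = token_el is not None and token_el.text
    token = token_el.text.strip() if has_token else None
    return identifiers, token


def next_request_url(base_url, token):
    """Build the request for the next page (hypothetical helper)."""
    # With a resumption token, the protocol allows only these two arguments.
    query = urlencode({"verb": "ListIdentifiers", "resumptionToken": token})
    return base_url + "?" + query


# Hand-written sample page; the identifiers are placeholders, not real DDD ids.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>example:0001</identifier></header>
    <header><identifier>example:0002</identifier></header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>"""

if __name__ == "__main__":
    ids, token = parse_list_identifiers(SAMPLE_PAGE)
    print(ids, token)
    if token is not None:
        # example.org stands in for the real OAI-PMH endpoint.
        print(next_request_url("https://example.org/oai", token))
```

A harvesting loop would repeat these two steps, fetching each follow-up URL until no token is returned. The actual endpoint, set name, and metadata format used for the DDD collection are not documented here, so check update.sh for those.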

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this program. If not, see http://www.gnu.org/licenses/.
