Internet Archive
The Internet Archive is a free online digital library. Anyone with a free account can upload materials to the Internet Archive via a drag/drop interface or an API. It supports all kinds of digital materials, so it can accept a wide range of file formats, but its metadata model is not tailored to books.
The Internet Archive also hosts the Wayback Machine, which archives snapshots of web pages. Anyone can submit URLs to be archived, whether or not they have an account. Archived web pages are stored in the WARC (Web ARChive) format and can then be publicly accessed as HTML.
- Preferred metadata format: Custom form
- Other supported formats: N/A
- File transfer: browser upload
- Content files: PDF (and many others)
- Chapter support: No
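The summary above covers browser upload; the API route mentioned earlier can be sketched roughly as follows. This is a minimal illustration, not a definitive client. It assumes the Archive's S3-like upload interface, where metadata fields travel as `x-archive-meta-*` request headers on an HTTP PUT. The endpoint, the header names, the item identifier, and the helper `build_upload_request` are assumptions for illustration and should be checked against the Archive's own API documentation. The code only builds the request and sends nothing.

```python
from urllib.parse import quote

# Assumed S3-like upload endpoint; verify against the Archive's API documentation.
IAS3_ENDPOINT = "https://s3.us.archive.org"


def build_upload_request(identifier, filename, metadata, access_key, secret_key):
    """Return (url, headers) for a PUT of one file to an item. Nothing is sent."""
    url = f"{IAS3_ENDPOINT}/{quote(identifier)}/{quote(filename)}"
    headers = {
        # Assumed authorization scheme using the account's S3-style keys.
        "authorization": f"LOW {access_key}:{secret_key}",
        # Assumed flag asking the service to create the item if it does not exist.
        "x-amz-auto-make-bucket": "1",
    }
    # Each metadata field becomes one header (ASCII values only in this sketch).
    for key, value in metadata.items():
        headers[f"x-archive-meta-{key}"] = str(value)
    return url, headers


url, headers = build_upload_request(
    identifier="example-press-sample-book",  # hypothetical item identifier
    filename="sample-book.pdf",              # hypothetical content file
    metadata={"title": "Sample Book", "mediatype": "texts"},
    access_key="ACCESS",                     # placeholder credentials
    secret_key="SECRET",
)
print(url)
```

Because the metadata is a flat set of key/value fields, book-specific structure such as chapters has no natural place in it, which is consistent with the summary above.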
The Internet Archive, a 501(c)(3) non-profit, is a digital library of internet sites and other cultural artifacts in digital form, to which it provides free access. It contains 475 billion web pages, 28 million books and texts, 14 million audio recordings (including 220,000 live concerts), 6 million videos (including 2 million Television News programs), 3.5 million images, and 580,000 software programs. It has also recently launched the Internet Archive Scholar search engine. For books, see Archive-It.
When a dynamic page contains forms, JavaScript, or other elements that require interaction with the originating host, the archived version in the Wayback Machine will not reproduce the original site’s functionality.
Wayback Machine crawls come from various sources: some are imported from third parties and others are generated internally by the Archive. How often snapshots are captured varies from website to website.
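Because capture frequency varies, it can help to check which snapshot of a page exists before linking to it. The sketch below shows one way to build the relevant URLs in Python. It assumes the public `web.archive.org/save/` and `web.archive.org/web/<timestamp>/` URL patterns and the `archive.org/wayback/available` JSON endpoint. The function names are illustrative, and only `closest_snapshot` makes a network request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def save_url(page_url):
    """URL that asks the Wayback Machine to capture page_url (assumed pattern)."""
    return "https://web.archive.org/save/" + page_url


def snapshot_url(page_url, timestamp):
    """URL of the capture nearest to timestamp (YYYYMMDD or YYYYMMDDhhmmss)."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def availability_query(page_url, timestamp=None):
    """Query URL for the availability endpoint; timestamp is optional."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def closest_snapshot(page_url, timestamp=None):
    """Return the URL of the closest available capture, or None (network call)."""
    with urlopen(availability_query(page_url, timestamp), timeout=30) as resp:
        data = json.load(resp)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Building the URLs needs no network access.
print(snapshot_url("https://example.org/book", "20200101"))
print(availability_query("https://example.org/book", "20200101"))
```

Calling `closest_snapshot("https://example.org/book")` would return the address of the nearest capture if one exists. A `None` result means no capture was reported for that URL.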
The Internet Archive has datacentres in three Californian cities: San Francisco, Redwood City, and Richmond. To guard against data loss in the event of, for example, a natural disaster, the Archive attempts to create copies of (parts of) the collection at more distant locations, currently including the Bibliotheca Alexandrina in Egypt and a facility in Amsterdam.
The Thoth Wiki has been developed in the context of the COPIM (Community-led Open Publication Infrastructures for Monographs) project. Individual contributions to the wiki have been made by Tim Elfenbein, Rupert Gatti, Ross Higman, Brendan O'Connell, Vincent W.J. van Gerven Oei, Tobias Steiner and Hannah Hillen under the general editorship of Van Gerven Oei. All data are available under a CC-BY 4.0 license.