Internet Archive

Rupert Gatti edited this page Jan 28, 2021 · 13 revisions

Issue 73

The Internet Archive is a free online digital library. Anyone with an account can upload materials to the Internet Archive via a drag/drop interface.

  • Preferred metadata format: Custom form
  • Other supported formats: N/A
  • File transfer: browser upload
  • Content files: PDF (and many others)
  • Chapter support: No

Internet Archive Drag/Drop Interface

Summary:

The Internet Archive, a 501(c)(3) non-profit, is a digital library of internet sites and other cultural artifacts in digital form, all freely accessible. It contains 475 billion web pages, 28 million books and texts, 14 million audio recordings (including 220,000 live concerts), 6 million videos (including 2 million television news programs), 3.5 million images, and 580,000 software programs. It has recently launched the Internet Archive Scholar search engine. For books, see Archive-It.

Format types:

Users submit URLs, which are captured and stored in WARC (Web ARChive) format.

Third-party content support:

When a dynamic page contains forms, JavaScript, or other elements that require interaction with the originating host, the archived copy will not reproduce the original site’s functionality.

Features:

Crawls come from various sources: some are imported from third parties and others are generated internally by the Archive. The frequency of snapshot captures varies per website. The Archive's datacentres are in three Californian cities: San Francisco, Redwood City, and Richmond. To guard against data loss from, for example, a natural disaster, the Archive tries to keep copies of (parts of) the collection at more distant locations, currently including the Bibliotheca Alexandrina in Egypt and a facility in Amsterdam.
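
Because capture frequency differs from site to site, it can be useful to check which snapshot exists closest to a given date. The following is a minimal sketch, assuming the public Wayback Machine availability endpoint (`https://archive.org/wayback/available`) and its JSON response shape; the function names are illustrative and not part of any workflow described on this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the public Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return (timestamp, snapshot_url) from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def fetch_closest(page_url, timestamp=None):
    """Network call: query the endpoint and parse the JSON reply."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

Calling `fetch_closest("example.com", "20210128")` would return the timestamp and URL of the capture nearest that date, or `None` if the page has never been captured.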

Costs:

There is no charge: anyone with a free account can upload media to the Internet Archive.
