    Repositories list

    • pandas4

      Public
      Web archive workflow system
      Java
      Apache License 2.0
      23161 Updated Nov 7, 2024
    • heritrix3

      Public
      Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
      Java
      Other
      763000 Updated Nov 7, 2024
    • Discovery application for the National Library of Australia's catalogue
      Ruby
      Other
      1002 Updated Nov 6, 2024
    • Common functionality for Blacklight and ArcLight applications
      Ruby
      Other
      0002 Updated Nov 6, 2024
    • Custom implementation of ArcLight for The National Library of Australia.
      Ruby
      Other
      0002 Updated Nov 4, 2024
    • AI audio proof of concept #2 - read TEI transcripts, build SOLR index with nomic embeddings, exploratory search and delivery web interface
      JavaScript
      0006 Updated Oct 22, 2024
    • Web archive index server based on RocksDB
      Java
      Apache License 2.0
      2032180 Updated Oct 17, 2024
    • Simple website to capture evaluation of different ways to search images.
      EJS
      0107 Updated Oct 10, 2024
    • AI pictures proof of concept - crawl blacklight, build SOLR index with CLIP embeddings, exploratory web interface
      HTML
      1107 Updated Oct 10, 2024
    • doss-dash

      Public
      A dashboard for doss with pretty graphs
      JavaScript
      0001 Updated Oct 3, 2024
    • bamboo

      Public
      Web archive collection manager
      Java
      Apache License 2.0
      4891 Updated Oct 3, 2024
    • AI newspaper search proof of concept - all 3.1m CT articles 1926-94, build SOLR index with nomic embeddings, exploratory web interface with LLM summaries
      JavaScript
      0003 Updated Sep 20, 2024
    • pywb

      Public
      Core Python Web Archiving Toolkit for replay and recording of web archives
      JavaScript
      GNU General Public License v3.0
      217103 Updated Aug 16, 2024
    • Callslip / pickslip request viewer
      Java
      0000 Updated Aug 13, 2024
    • Converts HTTrack crawls to WARC files
      Java
      Apache License 2.0
      63020 Updated Aug 6, 2024
    • heimdall

      Public
      A Selenium-based web crawler (and archiver) that attempts to capture all resources of JS-heavy pages by recursively clicking applicable DOM elements and responding to DOM modifications.
      Java
      0000 Updated Jul 12, 2024
    • dnn-cli

      Public
      A command-line interface for training DNN classifiers using deeplearning4j.
      Java
      0000 Updated Jul 11, 2024
    • odin

      Public
      Web archiving domain harvest statistics web application prototype.
      Java
      0000 Updated Jul 4, 2024
    • An abstraction/normalization layer for querying and displaying results for external search engines, in Ruby on Rails.
      Ruby
      MIT License
      13001 Updated Jun 25, 2024
    • loki

      Public
      A lightweight framework for running GWT-based applications with DOM-style UI control.
      Java
      0000 Updated Jun 11, 2024
    • thor

      Public
      A simple library for server-side utilities. Provides a mechanism to store and retrieve Java objects on the file system without the use of a database, and a mechanism to run tasks through a thread-safety assurance service.
      Java
      0000 Updated Jun 11, 2024
    • Range facet/limit/profile plugin for Blacklight
      Ruby
      Other
      41000 Updated Jun 5, 2024
    • ArchivesSpace plugin for spreadsheet import
      Ruby
      1001 Updated May 30, 2024
    • Prototype for displaying statistics from web archiving harvests.
      0000 Updated May 23, 2024
    • marcgrep

      Public
      A slow-moving search across MARC data
      Clojure
      Eclipse Public License 1.0
      3000 Updated May 10, 2024
    • nla-pywb

      Public
      pywb config overlay for the Australian Web Archive
      HTML
      0210 Updated May 3, 2024
    • flint

      Public
      A modular and extendible file/format validation framework
      Java
      Apache License 2.0
      3200 Updated Apr 10, 2024
    • jvmctl

      Public
      Java app deployment tool
      Python
      MIT License
      11101 Updated Mar 22, 2024
    • wombat

      Public
      Wombat.js client-side rewriting library
      JavaScript
      GNU Affero General Public License v3.0
      30000 Updated Feb 23, 2024
    • dl-models

      Public
      Java
      1001 Updated Dec 17, 2023