    Repositories list

    • Double-checked Gold Standard Data for Training and Testing OCR Engines
      HTML
      141300 Updated Jun 15, 2017
    • 0800AH
      Public archive
      Texts from the 8th hijri century
      Roff
      2200 Updated Apr 22, 2017
    • 1000AH
      Public
      Texts from the 10th hijri century
      Makefile
      0000 Updated Apr 21, 2017
    • The project is being relocated to https://github.com/OpenITI; this one will shut down shortly.
      3800 Updated Apr 20, 2017
    • 0400AH
      Public
      Texts from the 4th hijri century
      Makefile
      4110 Updated Apr 20, 2017
    • 1100AH
      Public
      Texts from the 11th hijri century
      Makefile
      0100 Updated Apr 6, 2017
    • 0900AH
      Public
      Texts from the 9th hijri century
      Roff
      1000 Updated Apr 5, 2017
    • 0700AH
      Public
      Texts from the 7th hijri century
      HTML
      4100 Updated Apr 3, 2017
    • 1300AH
      Public
      Texts from the 13th hijri century
      Makefile
      0000 Updated Apr 3, 2017
    • 0300AH
      Public
      Texts from the 3rd hijri century
      Makefile
      13000 Updated Mar 31, 2017
    • 0600AH
      Public
      Texts from the 6th hijri century
      Roff
      3000 Updated Mar 30, 2017
    • 1500AH
      Public
      Texts from the 15th hijri century
      Makefile
      0000 Updated Mar 19, 2017
    • 1400AH
      Public
      Texts from the 14th hijri century
      Makefile
      1100 Updated Mar 19, 2017
    • 1200AH
      Public
      Texts from the 12th hijri century
      Makefile
      0000 Updated Mar 19, 2017
    • 0500AH
      Public
      Texts from the 5th hijri century
      Makefile
      2000 Updated Mar 19, 2017
    • 0200AH
      Public
      Texts from the 2nd hijri century
      Makefile
      6000 Updated Mar 19, 2017
    • 0100AH
      Public
      Texts from the 1st hijri century
      Makefile
      11000 Updated Mar 19, 2017
    • uris
      Public
      Metadata file for assigning URIs
      Python
      0000 Updated Mar 18, 2017
    • Word frequencies for all the texts (~10,600) in `as is` and `normalized` versions
      1400 Updated Mar 15, 2017
    • OCR Training Data --- Minimum large files
      HTML
      0000 Updated Oct 1, 2016
    • OCR Training Data --- Raw Scans
      Python
      1100 Updated Sep 23, 2016