@macocu

MaCoCu

MaCoCu focuses on collecting monolingual and parallel data from the Internet, especially for under-resourced languages and DSI-specific data.

Popular repositories

  1. MaCoCu-crawler Public

    Python 6

  2. LanguageModels Public

    Tools for training LMs

    Python 5 1

  3. prevert Public

    Iterator for the prevert format

    Python 2

  4. American-British-variety-classifier Public

    Jupyter Notebook 1

  5. BCMS-variant-classifier Public

    A classification tool for discriminating between Bosnian, Croatian, Montenegrin, and Serbian

    1

  6. Manual-Checking-Web-Corpora-Guidelines Public

    Forked from TajaKuzman/GINCO-Genre-Annotation-Guidelines

    The Guidelines for Manual Checking of Web Corpora

    JavaScript

Repositories

Showing 10 of 10 repositories
  • LanguageModels Public

    Tools for training LMs

    Python 5 1 0 0 Updated Jun 6, 2023
  • documentation Public
    0 Apache-2.0 0 0 0 Updated Jun 1, 2023
  • Monolingual-Curation Public

    The Repository for the Curation of Monolingual Data work package

    Jupyter Notebook 0 0 0 0 Updated May 31, 2023
  • BCMS-variant-classifier Public

    A classification tool for discriminating between Bosnian, Croatian, Montenegrin, and Serbian

    1 0 0 0 Updated May 23, 2023
  • prevert Public

    Iterator for the prevert format

    Python 2 Apache-2.0 0 0 0 Updated Apr 12, 2023
  • American-British-variety-classifier Public
    Jupyter Notebook 1 0 0 0 Updated Feb 7, 2023
  • HT-vs-MT Public Forked from tobiasvanderwerff/HT-vs-MT

    Source code for EAMT 2022 paper "Automatic Discrimination of Human and Neural Machine Translation: A Study with Multiple Pre-Trained Models and Longer Context".

    Shell 0 MIT 1 0 0 Updated Jun 22, 2022
  • MaCoCu-crawler Public
    Python 6 GPL-3.0 0 0 0 Updated May 4, 2022
  • Manual-Checking-Web-Corpora-Guidelines Public Forked from TajaKuzman/GINCO-Genre-Annotation-Guidelines

    The Guidelines for Manual Checking of Web Corpora

    JavaScript 0 57 0 0 Updated Mar 29, 2022
  • DSI Public Forked from RikVN/DSI

    Code for the DSI experiments in the MaCoCu project

    Python 0 1 0 0 Updated Jan 24, 2022
