CITATION.cff
cff-version: 1.2.0
title: 'MuLD: The Multitask Long Document Benchmark'
message: >-
  If you use this dataset, please cite it using the
  metadata from this file.
type: dataset
authors:
  - given-names: G Thomas
    family-names: Hudson
    email: g.t.hudson@durham.ac.uk
    affiliation: Durham University
    orcid: 'https://orcid.org/0000-0003-3562-3593'
  - given-names: Noura
    name-particle: Al
    family-names: Moubayed
    orcid: 'https://orcid.org/0000-0001-8942-355X'
    affiliation: Durham University
identifiers:
  - type: url
    value: 'https://aclanthology.org/2022.lrec-1.392'
abstract: >-
  The impressive progress in NLP techniques has been
  driven by the development of multi-task benchmarks
  such as GLUE and SuperGLUE. While these benchmarks
  focus on tasks for one or two input sentences,
  there has been exciting work in designing efficient
  techniques for processing much longer inputs. In
  this paper, we present MuLD: a new long document
  benchmark consisting of only documents over 10,000
  tokens. By modifying existing NLP tasks, we create
  a diverse benchmark which requires models to
  successfully model long-term dependencies in the
  text. We evaluate how existing models perform, and
  find that our benchmark is much more challenging
  than their ‘short document’ equivalents.
  Furthermore, by evaluating both regular and
  efficient transformers, we show that models with
  increased context length are better able to solve
  the tasks presented, suggesting that future
  improvements in these models are vital for solving
  similar long document problems. We release the data
  and code for baselines to encourage further
  research on efficient NLP models.
keywords:
  - Long Documents
  - Benchmark
  - Multitask learning
  - NLP
license: CC-BY-NC-4.0
preferred-citation:
  authors:
    - given-names: G Thomas
      family-names: Hudson
      email: g.t.hudson@durham.ac.uk
      affiliation: Durham University
      orcid: 'https://orcid.org/0000-0003-3562-3593'
    - given-names: Noura
      name-particle: Al
      family-names: Moubayed
      orcid: 'https://orcid.org/0000-0001-8942-355X'
      affiliation: Durham University
  title: "MuLD: The Multitask Long Document Benchmark"
  type: conference-paper
  collection-title: Proceedings of the Language Resources and Evaluation Conference
  conference:
    name: Language Resources and Evaluation Conference
    date-start: 2022-06-21
    date-end: 2022-06-23
    address: Marseille, France
    location:
      name: Marseille, France
  start: 3675
  end: 3685
  publisher:
    name: European Language Resources Association
  url: https://aclanthology.org/2022.lrec-1.392
  abstract: >-
    The impressive progress in NLP techniques has been
    driven by the development of multi-task benchmarks
    such as GLUE and SuperGLUE. While these benchmarks
    focus on tasks for one or two input sentences,
    there has been exciting work in designing efficient
    techniques for processing much longer inputs. In
    this paper, we present MuLD: a new long document
    benchmark consisting of only documents over 10,000
    tokens. By modifying existing NLP tasks, we create
    a diverse benchmark which requires models to
    successfully model long-term dependencies in the
    text. We evaluate how existing models perform, and
    find that our benchmark is much more challenging
    than their ‘short document’ equivalents.
    Furthermore, by evaluating both regular and
    efficient transformers, we show that models with
    increased context length are better able to solve
    the tasks presented, suggesting that future
    improvements in these models are vital for solving
    similar long document problems. We release the data
    and code for baselines to encourage further
    research on efficient NLP models.
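The `preferred-citation` block above can be rendered as a BibTeX entry for use in a paper. A minimal stdlib-only Python sketch follows; all field values are copied from the metadata above, while the function name `to_bibtex`, the helper `fmt_author`, and the citation key `hudson2022muld` are illustrative choices, not part of the CFF file:

```python
# Render the preferred-citation metadata from CITATION.cff as a
# BibTeX @inproceedings entry. Field values are taken verbatim from
# the file above; names and the citation key are illustrative.

def fmt_author(author: dict) -> str:
    """Format one CFF author as 'Family, Given', honouring name-particle."""
    family = " ".join(
        filter(None, [author.get("name-particle"), author["family-names"]])
    )
    return f"{family}, {author['given-names']}"

def to_bibtex(meta: dict) -> str:
    """Build a BibTeX entry string from CFF-style conference-paper metadata."""
    fields = {
        "title": meta["title"],
        "author": " and ".join(fmt_author(a) for a in meta["authors"]),
        "booktitle": meta["collection-title"],
        "year": meta["year"],
        "pages": f"{meta['start']}--{meta['end']}",
        "publisher": meta["publisher"],
        "url": meta["url"],
    }
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items())
    return f"@inproceedings{{hudson2022muld,\n{body}\n}}"

# Metadata copied from the preferred-citation block above.
muld = {
    "title": "MuLD: The Multitask Long Document Benchmark",
    "authors": [
        {"given-names": "G Thomas", "family-names": "Hudson"},
        {"given-names": "Noura", "name-particle": "Al",
         "family-names": "Moubayed"},
    ],
    "collection-title": "Proceedings of the Language Resources"
                        " and Evaluation Conference",
    "year": 2022,
    "start": 3675,
    "end": 3685,
    "publisher": "European Language Resources Association",
    "url": "https://aclanthology.org/2022.lrec-1.392",
}

print(to_bibtex(muld))
```

The same conversion can be done automatically with tools such as `cffconvert`, which read the CFF file directly; the sketch only shows what the mapping looks like.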