
[athena] Information Retrieval SE Datasets

Nathan Cooper edited this page Jun 1, 2021 · 3 revisions

This page catalogs the information retrieval datasets to be used in the Athena project. For each dataset, it will document details such as the fields (columns) of the dataframe that represents it. Datasets can be stored in whatever format works best (e.g., JSON, JSONL, or CSV), but all processing of the data will be done through HuggingFace's datasets library.
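Since the storage format is left open while processing goes through the datasets library, a small loader can hide the format difference. The following is a minimal sketch, not the project's actual code: `BUILDERS`, `infer_builder`, and `load_any` are hypothetical names introduced here for illustration. It assumes the `datasets` package is installed and relies on its `json` builder (which also reads JSON Lines) and its `csv` builder.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the name of a `datasets`
# builder. The "json" builder also reads JSON Lines files.
BUILDERS = {".json": "json", ".jsonl": "json", ".csv": "csv"}


def infer_builder(path):
    """Return the `datasets` builder name for a data file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in BUILDERS:
        raise ValueError(f"Unsupported dataset format: {suffix!r}")
    return BUILDERS[suffix]


def load_any(path):
    """Load a single JSON, JSONL, or CSV file as a `datasets.Dataset`."""
    # Imported lazily so that infer_builder works without `datasets` installed.
    from datasets import load_dataset

    # With only `data_files` given, the data lands in the default "train" split.
    return load_dataset(infer_builder(path), data_files=str(path), split="train")
```

For example, `load_any("some_dataset.jsonl").column_names` would list the fields of that dataset, where `some_dataset.jsonl` is a placeholder file name rather than a file from this catalog.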

Information for each dataset will follow this template format: https://github.com/huggingface/datasets/blob/9d8bf36fdb861d9b2922d7c782fb58f9f542997c/templates/README.md

Traceability Datasets

Please see this link.

Coupling Dataset

Searching Datasets