PyTorch code for "Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets"
Causalogue:
A text dataset of 1,638 dialogue samples, each labeled with full causal relationships.
For the representations, we recommend the pre-trained RoBERTa model (https://github.com/facebookresearch/fairseq/tree/main/examples/roberta)
Causaction:
A video dataset of 1,118 video samples, each labeled with full causal relationships between two segments.
For the representations, we recommend the pre-extracted I3D features (https://zenodo.org/records/3625992#.Xiv9jGhKhPY)
Requirements:
torch
transformers
scikit-learn
wandb
Download for Causalogue:
Pre-trained model for representation extraction on Causalogue: https://huggingface.co/docs/transformers/model_doc/roberta
You can also load it online from Hugging Face:
from transformers import pipeline
unmasker = pipeline('fill-mask', model='roberta-base')
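Beyond the fill-mask pipeline, a common way to obtain utterance-level representations from RoBERTa is to take hidden states from the base model. The sketch below is our illustration, not the repo's extraction code; in particular, pooling with the first (`<s>`) token is an assumption, and downloading `roberta-base` requires an internet connection on first use.

```python
# Sketch: utterance representations from roberta-base via transformers.
# CLS-token pooling is an assumed choice, not necessarily what main.py does.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")
model.eval()

utterances = ["Why are you late?", "The bus broke down."]
batch = tokenizer(utterances, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, seq_len, 768)
reps = hidden[:, 0, :]  # take the <s> token as the utterance representation
print(reps.shape)  # torch.Size([2, 768])
```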
Download for Causaction:
Pre-trained representations for Causaction: https://zenodo.org/records/3625992#.Xiv9jGhKhPY
Storage path: data/causaction/pretrain_representation
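Once downloaded, the features can be loaded as NumPy arrays. This is only a sketch: the actual file format and layout under data/causaction/pretrain_representation may differ, and the shape used here (2 segments of 1024-dim I3D features) is illustrative.

```python
# Sketch: round-trip a segment-level feature array the way you would load
# a downloaded file. Shape (2 segments x 1024-dim I3D features) is assumed.
import os
import tempfile
import numpy as np

features = np.random.randn(2, 1024).astype(np.float32)
path = os.path.join(tempfile.gettempdir(), "segment_features.npy")
np.save(path, features)

loaded = np.load(path)
print(loaded.shape)  # (2, 1024)
```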
In this repo, you can run the baseline model on the two new datasets with:
python main.py --datasetname [causalogue/causaction]
To inspect the new datasets, see:
data/causaction/breakfast2.json
data/causalogue/all_data_small.json
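Since the datasets ship as JSON, they can be inspected with the standard library alone. The entry below is a hypothetical illustration of a Causalogue-style sample; the real field names and label encoding in all_data_small.json may differ, so check the file itself.

```python
# Sketch: reading a dataset-style JSON entry. Field names and the causal-pair
# encoding are assumptions for illustration, not the repo's actual schema.
import json

sample = {
    "dialogue_id": 0,
    "utterances": ["Why are you late?", "The bus broke down."],
    "causal_pairs": [[1, 0]],  # assumed: utterance 1 causes utterance 0
}

# Round-trip through JSON, as you would when reading the dataset file
loaded = json.loads(json.dumps(sample))
print(len(loaded["utterances"]))  # 2
```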
To load these two new datasets, you can:
from data_loader import *
train_loader = build_train_data(datasetname=args.dataset, fold_id=fold_id, batch_size=wandb.config.batch_size, data_type='train', args=args, config=dataset_config)
valid_loader = build_inference_data(datasetname=args.dataset, fold_id=fold_id, batch_size=wandb.config.batch_size, data_type='valid', args=args, config=dataset_config)
test_loader = build_inference_data(datasetname=args.dataset, fold_id=fold_id, batch_size=wandb.config.batch_size, data_type='test', args=args, config=dataset_config)
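Since the loaders take a `fold_id`, they are presumably built once per cross-validation fold. The self-contained sketch below shows that loop shape with placeholder builders; the fold count and the placeholder function stand in for the repo's `build_train_data` / `build_inference_data` and are assumptions, not its actual API.

```python
# Sketch of a per-fold loop around the loaders. build_loader is a placeholder
# for the repo's builders; 10 folds is an assumed, illustrative count.
def build_loader(fold_id, split):
    # Stand-in that returns one "batch" per split for demonstration
    return [f"{split}-batch-{fold_id}"]

results = []
for fold_id in range(10):
    train_loader = build_loader(fold_id, "train")
    valid_loader = build_loader(fold_id, "valid")
    test_loader = build_loader(fold_id, "test")
    # ... train on train_loader, select on valid_loader, evaluate on test_loader
    results.append((fold_id, len(test_loader)))

print(len(results))  # 10
```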
Citation:
@misc{chen2023causal,
      title={Towards Causal Representation Learning and Deconfounding from Indefinite Data},
      author={Hang Chen and Xinyu Yang and Qing Yang},
      year={2023},
      eprint={2305.02640},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}