- Code for the paper "Subjective Bias in Abstractive Summarization".
- We examined the influence of subjective style bias in large-scale abstractive summarization datasets and introduced a Graph Convolutional Network method to capture and embed writing styles. Results demonstrate that style-clustered datasets enhance model convergence, abstraction, and generalization.
- params.py: hyperparameters
- get_datasets.py: select the top-k Oracle sentences in each article, then parse them
- process_dataset.py: convert the parsed files into DGL graph triplets
- model.py: the self-supervised GCN model for extracting subjective style embeddings
- train.py: training
- infer.py: run inference over the whole training set to obtain subjective style embeddings for clustering
- Negative samples for the Oracle sentences are sampled uniformly by Jaccard similarity.
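
The negative sampling note above can be read in more than one way. The sketch below shows one possible reading, not the repository's actual code: candidate sentences are bucketed by their Jaccard similarity to an Oracle sentence, and negatives are drawn so that the similarity buckets are hit uniformly. The function names `jaccard` and `sample_negatives`, the `n_bins` parameter, and the whitespace tokenization are all assumptions made for illustration.

```python
import random


def jaccard(a, b):
    """Jaccard similarity between two token sequences (treated as sets)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def sample_negatives(oracle, candidates, k, n_bins=5, seed=0):
    """Draw up to k negatives, spreading them uniformly over similarity bins.

    Hypothetical helper: the binning scheme is an assumption, not taken
    from the repository.
    """
    rng = random.Random(seed)
    bins = [[] for _ in range(n_bins)]
    for sent in candidates:
        sim = jaccard(oracle, sent)
        if sim == 1.0:
            # Same token set as the Oracle sentence: not a useful negative.
            continue
        bins[min(int(sim * n_bins), n_bins - 1)].append(sent)

    negatives = []
    non_empty = [b for b in bins if b]
    while len(negatives) < k and non_empty:
        # Pick a bin uniformly, then a sentence uniformly within that bin.
        chosen = rng.choice(non_empty)
        negatives.append(chosen.pop(rng.randrange(len(chosen))))
        non_empty = [b for b in non_empty if b]
    return negatives


oracle = "the cat sat on the mat".split()
candidates = [
    "the cat sat on the mat".split(),
    "the cat lay on the rug".split(),
    "stocks fell sharply on monday".split(),
    "a dog sat on the mat".split(),
]
negatives = sample_negatives(oracle, candidates, k=2)
```

Sampling per bin rather than per sentence keeps hard negatives (high overlap) and easy negatives (low overlap) equally likely even when one group dominates the candidate pool. If the intended meaning is a plain uniform draw over all candidates, the binning step can be dropped.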