This is a Python tool for evaluating the alignment and uniformity of sentence embeddings, as in the SimCSE paper.
The SimCSE paper explains alignment and uniformity as follows:
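Given a distribution of positive pairs p_pos, alignment calculates the expected distance between the embeddings of the paired instances (assuming representations are already normalized):
$$\ell_{\mathrm{align}} \triangleq \mathbb{E}_{(x, x^{+}) \sim p_{\mathrm{pos}}} \left\| f(x) - f(x^{+}) \right\|^{2}$$
On the other hand, uniformity measures how well the embeddings are uniformly distributed:
$$\ell_{\mathrm{uniform}} \triangleq \log \mathbb{E}_{x, y \overset{\mathrm{i.i.d.}}{\sim} p_{\mathrm{data}}} e^{-2 \left\| f(x) - f(y) \right\|^{2}}$$
where p_data denotes the data distribution.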
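To make the two definitions concrete, here is a minimal NumPy sketch. It assumes the squared Euclidean distance for alignment and the Gaussian potential with t = 2 for uniformity, as in the SimCSE paper, and it averages uniformity over all distinct pairs of rows. The function names `l2_normalize`, `alignment`, and `uniformity` are illustrative and are not taken from this library, whose internals may differ.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm (eps guards against zero rows)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def alignment(x, y):
    """Mean squared Euclidean distance between paired rows of x and y."""
    return float(np.mean(np.sum((x - y) ** 2, axis=1)))


def uniformity(x, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs of rows of x."""
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    upper = np.triu_indices(x.shape[0], k=1)  # each unordered pair once
    pair_sq_dists = np.maximum(sq_dists[upper], 0.0)  # clip rounding noise
    return float(np.log(np.mean(np.exp(-t * pair_sq_dists))))


# Toy embeddings, made up for illustration only.
left = l2_normalize(np.array([[1.0, 0.0], [0.0, 2.0]]))
right = l2_normalize(np.array([[0.0, 3.0], [0.0, 1.0]]))

print(alignment(left, right))  # lower means positive pairs are closer
print(uniformity(left))        # lower means embeddings are more spread out
```

Lower is better for both values: a small alignment means positive pairs sit close together, and a small (more negative) uniformity means the embeddings are spread more evenly over the unit hypersphere.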
by pip
pip install alignuniformeval
by source
pip install git+https://github.com/akiFQC/AlignUnformEval
You can easily evaluate alignment and uniformity with this library.
This is a minimal example that evaluates alignment and uniformity on the STS Benchmark.
from alignunformeval import STSBEval
evaluator = STSBEval(sentence_encoder)
# sentence_encoder is a callable from List[str] to numpy.array. The output numpy.array must be [dimension_of_sentence_vector].
result = evaluator.eval_summary()
# result = {"alignment": value_of_alignment, "uniformity": value_of_uniformity}
STSBEval takes a callable whose input is a list of str and whose output is an n-dimensional numpy.array.
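For experimenting without a trained model, a stand-in encoder can be sketched as below. This is a hypothetical example, not part of this library: `toy_sentence_encoder` hashes whitespace-separated tokens into a fixed-size bag-of-words vector and L2-normalizes it. It assumes that a batch of sentences should be returned as one array with one row per sentence, which the description above does not spell out, so check the tests in the repository before relying on that shape.

```python
import hashlib
from typing import List

import numpy as np


def toy_sentence_encoder(sentences: List[str], dim: int = 16) -> np.ndarray:
    """Hypothetical stand-in encoder: hashed bag-of-words, L2-normalized.

    Returns an array of shape (len(sentences), dim). The row-per-sentence
    layout is an assumption made for this sketch.
    """
    vectors = np.zeros((len(sentences), dim), dtype=np.float64)
    for row, sentence in enumerate(sentences):
        for token in sentence.lower().split():
            # md5 gives a hash that is stable across interpreter runs
            digest = hashlib.md5(token.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "little") % dim
            vectors[row, index] += 1.0
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Empty sentences stay as zero vectors instead of dividing by zero
    return vectors / np.maximum(norms, 1e-12)


embeddings = toy_sentence_encoder(["A cat sits on the mat.", "a cat sits on the mat."])
print(embeddings.shape)
```

A real encoder would wrap a trained model behind the same list-of-strings interface; the toy version only helps to check that the evaluation pipeline runs end to end.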
The STS Benchmark dataset (specifically, sts-dev.csv) was used in the SimCSE paper. In the paper, the similarity-score threshold was set at 4.0; sentence pairs whose similarity score is higher than 4.0 are used to evaluate alignment. You can set a different threshold, as in the following example.
from alignunformeval import STSBEval
# sentence_encoder : some function List[str] to np.array[dimension_of_sentence_vector]
evaluator = STSBEval(sentence_encoder, threshold=3.0) # set threshold at 3.0
result = evaluator.eval_summary()
Please see test/test_stsb.py if you want more details.
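The effect of the threshold can be illustrated with a small standalone sketch. It assumes that a pair counts as positive when its similarity score is strictly higher than the threshold, as described above. The helper `select_positive_pairs` and the scored pairs are made up for illustration; they are neither this library's code nor real STS Benchmark rows.

```python
from typing import List, Tuple

ScoredPair = Tuple[str, str, float]


def select_positive_pairs(
    pairs: List[ScoredPair], threshold: float = 4.0
) -> List[Tuple[str, str]]:
    """Keep the sentence pairs whose similarity score exceeds the threshold."""
    return [(first, second) for first, second, score in pairs if score > threshold]


# Invented examples on the 0 to 5 STS similarity scale.
toy_pairs: List[ScoredPair] = [
    ("A man is playing a guitar.", "A man plays the guitar.", 5.0),
    ("A dog runs in the park.", "A dog is running outside.", 4.0),
    ("A woman is cooking.", "A woman is preparing food.", 3.2),
    ("A plane is landing.", "A child is reading.", 1.0),
]

print(len(select_positive_pairs(toy_pairs)))                 # default threshold 4.0
print(len(select_positive_pairs(toy_pairs, threshold=3.0)))  # looser threshold
```

A lower threshold admits more pairs into the alignment computation but also looser paraphrases, so alignment values obtained with different thresholds are not directly comparable.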
Tokyo Metropolitan University Paraphrase Corpus (TMUP) is a Japanese paraphrase dataset.
from alignunformeval import TMUPEval
# sentence_encoder : some function List[str] to np.array[dimension_of_sentence_vector]
evaluator = TMUPEval(sentence_encoder)
result = evaluator.eval_summary()
The license of this tool follows the license of each dataset. Please read the documentation of the datasets you use.