
Why is my F1 so low? #2

@Arima-Kisho

I'm new to this and am having trouble running your code, so I'm hoping you can help.
With Adaptive Retrieval and TinyLlama, I can't reproduce the F1 reported in the article:

{'data_source': 'retrievalqa', 'total_data_count': 2785, 'retrieval_frequency': 1137, 'retrieval_rate': 40.8, 'match_score': 54.0, 'f1_score': 12.5, 'em_score': 0.1, 'accuracy_score': 27.6, 'match_total': 1503, 'f1_total': 348.9463775190812, 'em_total': 3.0, 'accuracy_total': 769.0, 'total_q_tokens': 35779, 'total_context_tokens': 715530, 'total_no_retrieval_tokens': 35779, 'total_always_retrieval_tokens': 715530, 'estimate_no_retrieval_cost': 0.017889500000000003, 'estimate_always_retrieval_cost': 0.3756545, 'saved_cost_rate': 0.9523777833088649, 'args': {'openai_config_path': './openai_config.txt', 'data_source': 'retrievalqa', 'retrieval_mode': 'adaptive_retrieval', 'input_data_path': './data/retrievalqa.jsonl', 'output_score_path': './results/adaptive_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0/score_retrievalqa_seed20.json', 'output_prediction_path': './results/adaptive_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0/predict_retrievalqa_seed20.jsonl', 'model_name': 'TinyLlama/TinyLlama-1.1B-Chat-v1.0', 'max_tokens': 100, 'batch_size': 1, 'doc_top_n': 5, 'limit_input': 0, 'prompt_method': 'vanilla', 'seed': 20, 'temperature': 0.0, 'top_p': 1.0, 'world_size': 1}}
./results/adaptive_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0
./results/adaptive_retrieval/TinyLlama/TinyLlama-1.1B-Chat-v1.0/m=vanilla/t=0.0
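
As a sanity check on the output above, the percentage fields appear to be the matching `*_total` (or `retrieval_frequency`) values divided by `total_data_count` and scaled to 100. The sketch below only redoes that arithmetic with the numbers copied from the log; the assumption that the script computes the scores this way is inferred from the numbers, not taken from the repository code.

```python
# Values copied from the score output in this issue.
total_data_count = 2785
totals = {
    "retrieval_rate": 1137,            # retrieval_frequency
    "match_score": 1503,               # match_total
    "f1_score": 348.9463775190812,     # f1_total
    "em_score": 3.0,                   # em_total
    "accuracy_score": 769.0,           # accuracy_total
}

# Assumption: each reported score is 100 * total / total_data_count,
# rounded to one decimal place.
scores = {
    name: round(100 * value / total_data_count, 1)
    for name, value in totals.items()
}

for name, value in scores.items():
    print(f"{name}: {value}")
```

Under that assumption the recomputed values agree with the reported ones (40.8, 54.0, 12.5, 0.1, 27.6), so the low F1 reflects the per-example F1 values themselves rather than an aggregation mistake.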

My F1 is only 12.5, but Table 10 in the article reports 48.5. Is something wrong with my settings?
Which fields in the output correspond to Retrieval Acc, Precision, and Recall? I can't find those scores.
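
One possible reason for a high `match_score` (54.0) alongside a low `f1_score` (12.5) is answer length: with `max_tokens` set to 100, a long generation can contain the gold answer while still scoring poorly on token overlap. The sketch below assumes a SQuAD-style token-level F1 and a substring-based match; both are assumptions for illustration, not the repository's confirmed metric code, and the example question and answers are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(prediction, gold):
    """Token-overlap F1 between a prediction and one gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def substring_match(prediction, gold):
    """True if the normalized gold answer appears inside the prediction."""
    return normalize(gold) in normalize(prediction)


# Hypothetical example: the gold answer is a single word.
gold = "Paris"
short_answer = "Paris"
long_answer = "The capital of France is Paris, which is a large city in Europe."

short_f1 = token_f1(short_answer, gold)
long_f1 = token_f1(long_answer, gold)
long_match = substring_match(long_answer, gold)

print(f"short answer F1: {short_f1:.3f}")
print(f"long answer F1:  {long_f1:.3f}, match: {long_match}")
```

Here the long answer still counts as a match, but its precision is diluted by the extra tokens, so its F1 is far lower than that of the short answer. If this is what is happening, a lower `max_tokens` or a prompt that asks for short answers would be the settings to check.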
