Feature/results to df #94

humaira-rf · 2025-11-13T22:18:28Z

Changes

Updated notebooks to display a DataFrame built from the results dict
Added other experimentation knobs to the results dict

Testing

Tested on all three notebooks with downsampled data and just 2 configs

Screenshots

a. Hybrid - gsm8k:

b. Fully OpenAI - scifact (couldn't screenshot the full df):

c. Fully local - fiqa:

Note

Adds pipeline metadata to final metrics with consistent ordering and updates tutorials to display results as a DataFrame.

Evals Controller (rapidfireai/evals/scheduling/controller.py):
- Final metrics pipeline: _compute_final_metrics_for_pipelines now accepts optional pipeline_id_to_info and injects pipeline metadata (e.g., model_name, search_type, rag_k, top_n, chunk_size, chunk_overlap, sampling_params, prompt_manager_k, model_config).
- Ordering and output: Reorders cumulative metrics to run_id, model_name, hyperparams, Samples Processed, then remaining metrics; returns ordered_metrics and uses it for progress display.
- Integration: Builds pipeline_id_to_info from pipeline_info and passes it to final-metrics computation.
Tutorial Notebooks:
- Replace sample printouts with conversion of results into a pandas DataFrame (results_df) in rf-tutorial-gsm8k-fewshot.ipynb, rf-tutorial-rag-fiqa.ipynb, and rf-tutorial-scifact-full-evaluation.ipynb.
- Minor notebook metadata/formatting tweaks (ids, kernelspec).
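
The ordering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in controller.py: `order_metrics` and `HYPERPARAM_KEYS` are hypothetical names, and the sketch assumes both the metrics and the pipeline info are flat dicts, with the metadata keys taken from the list in the summary.

```python
from typing import Any, Dict, Optional

# Assumption: these are the hyperparameter keys injected as pipeline metadata,
# copied from the summary above. The real controller may use a different set.
HYPERPARAM_KEYS = [
    "search_type", "rag_k", "top_n", "chunk_size", "chunk_overlap",
    "sampling_params", "prompt_manager_k", "model_config",
]


def order_metrics(
    metrics: Dict[str, Any],
    info: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge optional pipeline info into metrics and put key columns first."""
    # Metrics win on key collisions; info only fills in metadata.
    merged = {**(info or {}), **metrics}
    leading = ["run_id", "model_name", *HYPERPARAM_KEYS, "Samples Processed"]
    # Python dicts preserve insertion order, so this fixes the column order.
    ordered = {key: merged[key] for key in leading if key in merged}
    # Append every remaining metric in its original relative order.
    ordered.update({k: v for k, v in merged.items() if k not in ordered})
    return ordered
```

Because the info argument is optional, callers that do not pass pipeline metadata still get the same leading-column order for whatever keys are present.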

Written by Cursor Bugbot for commit efdc979. This will update automatically on new commits.

arun-rfai

I'd suggest leaving the dict return type of run_evals as is. Convert the dict to a DataFrame in the notebook cell itself, limiting it to only the metrics columns and the config knobs, akin to the second table printed in run_evals.

Also, pipeline_id -> run_id.
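
For illustration, a notebook cell along these lines could do the conversion. It is a sketch only: the shape of the dict returned by run_evals is assumed here to be a mapping from run id to a flat dict of knobs and metrics, and the sample values, model names, and column list are made up.

```python
import pandas as pd

# Hypothetical results dict; the real structure returned by run_evals may differ.
results = {
    1: {"model_name": "model-a", "rag_k": 5, "Samples Processed": 100, "Accuracy": 0.61},
    2: {"model_name": "model-b", "rag_k": 10, "Samples Processed": 100, "Accuracy": 0.64},
}

# One row per run; the dict keys become a run_id column.
results_df = (
    pd.DataFrame.from_dict(results, orient="index")
    .rename_axis("run_id")
    .reset_index()
)

# Keep only the config knobs and metric columns, skipping any that are absent.
wanted = ["run_id", "model_name", "rag_k", "Samples Processed", "Accuracy"]
results_df = results_df[[col for col in wanted if col in results_df.columns]]
results_df
```

Keeping the conversion in the notebook leaves the dict return type of run_evals unchanged for other callers.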

arun-rfai

LGTM

humaira-rf added 2 commits November 13, 2025 21:14

results to df

8322e30

dict to df testing

647ad97

humaira-rf requested a review from arun-rfai November 13, 2025 22:18

arun-rfai requested changes Nov 13, 2025

retaining dict

efdc979

humaira-rf requested a review from arun-rfai November 14, 2025 00:49

arun-rfai approved these changes Nov 14, 2025

arun-rfai merged commit 3570b6d into main Nov 14, 2025
1 check passed

arun-rfai deleted the feature/results_to_df branch November 14, 2025 00:54
