
new simulation (and visualization) for finding funniest caption #44

Open
wants to merge 2 commits into
base: master

Conversation

cwagaman

This new simulation (and its corresponding visualization) attempts to show how quickly a "best caption" rises to the top of the rankings. It produces two graphs.

  1. The graph titled "# Captions within 95% CI of Current Funniest" shows how soon a caption (not necessarily the true funniest caption) can plausibly be identified as the funniest. First, the average user-provided rating is computed for each caption. Then a 95% CI is computed for each of these averages, using the normal approximation from the central limit theorem. The graph displays the number of captions whose 95% CI intersects the 95% CI of the caption with the highest average user-provided rating.
  2. The graph titled "# Captions with Simulated Rating Higher than True Funniest" shows how quickly the funniest caption can be correctly identified. Recall that we have access to the ground truth for which caption is funniest. This graph displays how many captions, after a given number of queries, have received an average user-provided rating higher than that of the true funniest caption.
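
For reviewers, here is a minimal sketch of how the two counts described above could be computed. It is not the code in this PR. The function names, the `ratings_by_caption` dict layout, the 1.96 normal quantile, and the choice to exclude the reference caption from each count are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev

Z_95 = 1.96  # assumed normal quantile for a two-sided 95% interval


def mean_ci(ratings, z=Z_95):
    """Return (mean, lower, upper) for one caption's ratings (CLT-based)."""
    m = mean(ratings)
    if len(ratings) > 1:
        half = z * stdev(ratings) / math.sqrt(len(ratings))
    else:
        # A single rating gives no spread estimate; treat the CI as unbounded.
        half = math.inf
    return m, m - half, m + half


def n_within_ci_of_leader(ratings_by_caption):
    """Count captions whose 95% CI intersects the CI of the current leader."""
    cis = {c: mean_ci(r) for c, r in ratings_by_caption.items() if r}
    leader = max(cis, key=lambda c: cis[c][0])
    _, lead_lo, lead_hi = cis[leader]
    # Assumption: the leader itself is not counted.
    return sum(
        1
        for c, (_, lo, hi) in cis.items()
        if c != leader and lo <= lead_hi and hi >= lead_lo
    )


def n_rated_above_truth(ratings_by_caption, true_funniest):
    """Count captions whose average rating exceeds the true funniest's."""
    truth_mean = mean(ratings_by_caption[true_funniest])
    return sum(
        1
        for c, r in ratings_by_caption.items()
        if c != true_funniest and r and mean(r) > truth_mean
    )
```

Both functions take a snapshot of the ratings collected so far, so calling them after each query gives one point on each curve.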

Each visualization is performed for three different learning strategies.

  1. "Random" randomly selects captions for users to rate.
  2. "Active" adaptively chooses captions for users to rate according to the upper confidence bound strategy described in https://arxiv.org/abs/1312.7308.
  3. "lil_KLUCB" adaptively chooses captions for users to rate according to the upper confidence bound strategy described in https://arxiv.org/abs/1709.03570.
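
To make the contrast between random and adaptive selection concrete, here is a rough sketch. The adaptive rule is a textbook UCB1-style index, swapped in for illustration only. It is not the bound used in either linked paper or in this PR. The function names and the `counts`/`sums` bookkeeping are hypothetical, and the bonus term assumes ratings scaled to [0, 1].

```python
import math
import random


def choose_random(counts, rng=random):
    """'Random' style: pick any caption index uniformly."""
    return rng.randrange(len(counts))


def choose_ucb(counts, sums, t):
    """Adaptive style: pick the caption with the largest mean plus bonus.

    counts[i] is how many ratings caption i has, sums[i] is their total,
    and t is the number of queries issued so far (t >= 1).
    """
    # Rate every caption once before trusting any bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(
        range(len(counts)),
        key=lambda i: sums[i] / counts[i]
        + math.sqrt(2 * math.log(t) / counts[i]),
    )
```

The bonus shrinks as a caption collects ratings, so queries drift toward captions that are either highly rated or still under-sampled.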

The line on each graph is the mean, taken over 10 samples. The shaded region around each line shows the standard deviation across those samples.
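
As a sketch of that aggregation (not the PR's plotting code), the per-query mean and standard deviation can be computed as below. The `summarize_runs` name and the list-of-lists input are hypothetical, and using the population standard deviation is an assumption, since the description does not say which estimator is used.

```python
from statistics import mean, pstdev


def summarize_runs(runs):
    """Collapse repeated runs into a mean curve and a std curve.

    runs is a list of equal-length lists, one metric curve per run.
    """
    per_query = list(zip(*runs))  # regroup values by query index
    means = [mean(values) for values in per_query]
    # Assumption: population standard deviation across runs.
    stds = [pstdev(values) for values in per_query]
    return means, stds
```

The plotted line would then be `means`, with the shaded band spanning `means[i] - stds[i]` to `means[i] + stds[i]` at each query index.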

cwagaman added 2 commits June 14, 2022 14:38
