News Locality Calibration #103
Conversation
Looks good!
```python
)
return ArticleSet(articles=[candidate_articles.articles[int(idx)] for idx in article_indices])

def calibration(self, relevance_scores, articles, preferences, theta, topk) -> list[Article]:
```
This method looks pretty much the same as the corresponding method on the `TopicCalibrator`. Wonder if it could move into the base class?
Yeah, that's certainly possible. I was a bit hesitant before since the `calibration` function will call the overridden `add_article_to_categories()` function, but maybe that's OK as well. I'll try to factor `calibration()` and `normalized_categories_with_candidate()` into the parent class and see how it goes!
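A rough sketch of what that factoring could look like; apart from the method names discussed in this thread, the class names and bodies are assumptions for illustration, not the actual poprox-recommender code:

```python
# Hypothetical sketch of the shared base class; real signatures may differ.
class Calibrator:
    def calibration(self, relevance_scores, articles, preferences, theta, topk):
        # The shared greedy re-ranking loop would live here; it calls the hook
        # below, which each subclass overrides to map an article to categories.
        ...

    def add_article_to_categories(self, rec_categories, article):
        raise NotImplementedError

    def normalized_categories_with_candidate(self, rec_categories, article):
        # Shared normalization logic (discussed further below) could also live here.
        ...


class TopicCalibrator(Calibrator):
    def add_article_to_categories(self, rec_categories, article):
        ...  # count the candidate article's topic labels


class LocalityCalibrator(Calibrator):
    def add_article_to_categories(self, rec_categories, article):
        ...  # count the candidate article's locality labels
```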
```diff
@@ -32,7 +36,7 @@ def select_articles(
     pipeline state whose ``default`` is the final list of recommendations.
     """
     available_pipelines = recommendation_pipelines(device=default_device())
-    pipeline = available_pipelines["nrms"]
+    pipeline = available_pipelines["cali"]
```
This will change the default when no recommender is specified in the request from plain NRMS to NRMS+Locality calibration. That's probably okay if you plan to run this experiment from a separate endpoint, but maybe not something we want to do in the main/default recommender endpoint
This raises a good point. Are we expected to fork the repo and run the fork?
Is POPROX planning to have a release cycle so we can uptake changes to the main POPROX?
Current plan is to have people fork the repo and deploy from the fork. No current plans for a defined release cycle since it hasn't become an issue yet.
Certainly! So based on the conversation, it looks like it's OK to keep the pipeline as `cali` instead of `nrms` if we will only run this with allocated participants?
I'm actually also thinking about the possibility of using some experiment flag to determine which pipeline we want to choose here:
```python
if interest_profile.experiment_id == LOCALITY_EXPT_ID:
    pipeline = available_pipelines["cali"]
else:
    pipeline = available_pipelines["nrms"]
```
But I realized that might not be the way we want to allocate users based on the experiment manifest?
By the time the endpoint defined in this repo receives a request, it will already be for the appropriate recommender for each user according to the group assignments. The section that defines the experiment recommenders in the manifest will look something like this:
```toml
[recommenders.calibration]
endpoint = "https://n07svs3qeh.execute-api.us-east-1.amazonaws.com/?pipeline=calibration"
```
The platform side sends requests for users assigned to groups that use that recommender to that URL, and the query parameter at the end of the URL gets used by this code to select the appropriate pipeline.
(It's set up that way so that the recommender code doesn't need to know anything about experiments, groups, phases, etc. in order to keep it simple for experimenters.)
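As a rough illustration of that dispatch, a sketch along these lines is plausible; the function shape and parameter handling are assumptions rather than the actual endpoint code, and imports for `recommendation_pipelines` and `default_device` (this repo's existing helpers) are omitted:

```python
# Hypothetical sketch: pick a pipeline by the ?pipeline=... query parameter,
# falling back to the plain NRMS pipeline when none is given.
def choose_pipeline(query_params: dict[str, str]):
    available_pipelines = recommendation_pipelines(device=default_device())
    name = query_params.get("pipeline", "nrms")
    return available_pipelines[name]
```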
```python
return recommendations

def normalized_categories_with_candidate(self, rec_categories, article):
```
This looks very similar to the method on the other class too and might also be able to move to the base class
The only major point we should talk about is calibrating on the user's locality preferences vs. a fixed locality distribution.
Otherwise, everything else is primarily clarifying questions.
```diff
@@ -43,6 +76,19 @@ def user_topic_preference(past_articles: list[Article], click_history: list[Clic
     return topic_count_dict


+def user_locality_preference(past_articles: list[Article], click_history: list[Click]) -> dict[str, int]:
```
Do we want to calibrate on the user's locality preference? I thought we might want to calibrate on the distribution of today's articles or a fixed distribution from our prior investigation.
Notably, I see an issue at the cold start of our experiment: users might have a very focused click history due to the focus of the baseline POPROX algorithm, so their locality preference will be very biased.
Could you elaborate on what we might want to calibrate on in this function? I was thinking of it as just a function to compute the "ground truth" of the user's click history with the system, and the actual calibration step happens after we get their preference (in the new `LocalityCalibration` class).
The cold-start issue might exist, but if we have a one-month window to observe users and collect their preferences, could that be a feasible solution?
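For concreteness, a minimal sketch of what this click-history count might look like; the `article_id` and `localities` attributes are assumptions about the data model, not the actual poprox code:

```python
from collections import defaultdict

def user_locality_preference(past_articles, click_history):
    # Count how often each locality appears among the articles the user clicked.
    clicked_ids = {click.article_id for click in click_history}  # assumed attribute
    locality_counts = defaultdict(int)
    for article in past_articles:
        if article.article_id in clicked_ids:  # assumed attribute
            for locality in getattr(article, "localities", []):  # assumed attribute
                locality_counts[locality] += 1
    return dict(locality_counts)
```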
```python
    best_candidate_score = adjusted_candidate_score
    candidate = article_idx

if candidate is not None:
```
Is the only case when candidate would be None if we've already appended the entire list of article_idx to recommendations?
Asking because if a downstream class expects there to always be 10 recommendations this could be an ideal spot to throw an error or at least a warning (not sure how much logging POPROX has or wants to have)
Our intent is to make the downstream code robust in the sense that the expected number of recommendations will always be generated if you construct your recommendation pipeline to do so. That doesn't imply that this code needs to always produce 10 recommendations though, because you can (for example) use the `Fill` component downstream to fill in however many slots are left with a second source of recommendations (like the `UniformSampler`, which chooses at random).
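If a warning at this spot is still wanted, a lightweight option is a standard-library `logging` call like the sketch below; whether and how POPROX configures logging is an assumption:

```python
import logging

logger = logging.getLogger(__name__)

# After the greedy selection loop: note when we ran out of candidates early,
# leaving a downstream component (e.g. Fill) to top up the remaining slots.
if len(recommendations) < topk:
    logger.warning(
        "calibration selected %d of %d requested articles",
        len(recommendations),
        topk,
    )
```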
```python
# R is all candidates (not selected yet)

recommendations = []  # final recommendation (topk index)
rec_categories = defaultdict(int)  # frequency distribution of categories of S
```
Not sure if I understand this comment. Maybe it's just copied from topic_calibration?
I think the `S` vs `R` nomenclature comes from the MMR paper, which then got carried into our calibrator implementation. `R` is the set of available candidates Remaining; `S` is the set of items Selected so far.
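In code terms, the pattern being described is roughly this greedy loop; this is a generic sketch of the S/R selection, not the exact calibrator implementation, and `adjusted_score` stands in for whatever combination of relevance and calibration the component uses:

```python
def greedy_select(num_candidates, topk, adjusted_score):
    # R = candidates Remaining (not selected yet); S = items Selected so far.
    S = []
    R = set(range(num_candidates))
    while R and len(S) < topk:
        # Pick the remaining candidate whose adjusted score is highest,
        # given what has already been selected into S.
        best = max(R, key=lambda idx: adjusted_score(idx, S))
        S.append(best)
        R.remove(best)
    return S
```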
```python
)
return ArticleSet(articles=[candidate_articles.articles[int(idx)] for idx in article_indices])

def calibration(self, relevance_scores, articles, preferences, theta, topk) -> list[Article]:
```
Request: This function could also return a list of flags for articles that have been swapped from the original relevance_scores list for easy manipulation of those articles' slugs in a later stage.
Relevance scores are sorted in a sense, but not in descending order by score. The order matches the list of candidate articles, so that the same position in the list of articles and in the list of scores correspond to each other.
In theory, this calibration component could compute the top-k list and do some kind of comparison to generate the kind of flags you're talking about, but it currently generates its own list of recs directly from the scores rather than operating on the top-k list and swapping things around.
If you want to identify which positions of the list are different from the plain top-k list, I'd suggest using this component, the `TopkRanker` component, and comparing the output of the two in a new `GenerateExplanations` component.
Thanks for pointing us to the `TopkRanker` component; indeed, there should be a new `GenerateExplanations` component where we'll apply the LLM logic. We'll discuss that with the team.
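A rough sketch of the comparison such a `GenerateExplanations` component could perform; the `article_id` attribute and the membership-based notion of "swapped" are assumptions, not existing poprox code:

```python
def swapped_flags(topk_articles, calibrated_articles):
    # True at position i if the calibrated list's i-th article does not appear
    # anywhere in the plain top-k list, i.e. it was swapped in by calibration.
    topk_ids = {a.article_id for a in topk_articles}  # assumed attribute
    return [a.article_id not in topk_ids for a in calibrated_articles]
```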
(Branch force-pushed from 89a7e83 to 2766467, then from 2766467 to 50e9c76.)
I'd give feedback if I had any, but this LGTM. I like how the …
Apply some refactoring by creating a `Calibrator` object that both `TopicCalibration` and the new `LocalityCalibration` inherit from. Updated the current `test_calibration` test suites.
Some TODOs we need to figure out for the next step: