Locality calibration + text generation #129
Open
imrecommender wants to merge 104 commits into CCRI-POPROX:xinyili/development from zentavious:xinyili/development
Conversation
This copies embeddings from a candidate set to a selected set of articles, so that the embeddings can be used in downstream analysis for metrics like ILD (intra-list diversity).
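A minimal sketch of the idea, assuming hypothetical article-set objects that carry an `articles` list and an `embeddings` mapping (these names are illustrative, not the project's actual API):

```python
# Illustrative sketch, not the project's actual API: article sets are assumed
# to carry an `embeddings` dict keyed by article id.

def copy_embeddings(candidates, selected):
    """Copy each selected article's embedding over from the candidate set,
    so downstream metrics (e.g. ILD) can use them."""
    selected.embeddings = {
        article.article_id: candidates.embeddings[article.article_id]
        for article in selected.articles
    }
    return selected
```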
Since we're using the pipeline implementation from the `lkpipeline` module now, it's confusing to have this earlier implementation still hanging around in the code base. Removing for clarity.
…-POPROX#124) This grabs the candidate embeddings from each pipeline execution and builds up a cache of embeddings over the course of the eval run. Once the eval run finishes, it writes them to a two-column Parquet file where the first column is the article id and the second is the corresponding embedding.
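A sketch of that output format with `pyarrow` (the cache structure and function name here are assumptions for illustration, not the code from this PR):

```python
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical cache accumulated over the eval run: article_id -> vector.
embedding_cache: dict[str, np.ndarray] = {}

def write_embeddings(path: str) -> None:
    """Write the cache as a two-column Parquet file: article_id, embedding."""
    ids = list(embedding_cache.keys())
    vectors = [embedding_cache[i].tolist() for i in ids]
    table = pa.table({
        "article_id": pa.array(ids, type=pa.string()),
        "embedding": pa.array(vectors, type=pa.list_(pa.float32())),
    })
    pq.write_table(table, path)
```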
…RI-POPROX#127) This brings us **parallel generation** of recommendations for offline evaluation, and uses it to regenerate our offline recommendation outputs (and update our reported timings). This is a big PR, unfortunately, for a couple of reasons:

- Parallelism requires quite a bit of refactoring to encapsulate the parallel operation in a worker function.
- `ipyparallel` doesn't work well with code defined in the script being run, so most of the code (in particular the class definitions it uses, but also functions) is refactored out into imported modules so the `ipyparallel` workers can find it correctly.

Highlights:

- Parallelize generation with the `ipyparallel` package. I have found it to work well, without some of the intermittent bugs I've encountered with multiprocessing or `ProcessPoolExecutor`, and it also makes it easy to run initialization and finalization tasks on all workers to set up outputs or finish writing them.
- Refactor `generate` into a package (with a `__main__.py`, so we can still run it with `python -m poprox_recommender.evaluation.generate`), with various modules implementing its pieces.
- `recommendations` is now a directory instead of a single Parquet file; software that reads Parquet (including Pandas `read_parquet` and DuckDB) generally accepts directories of Parquet files and concatenates the files they contain, to support sharding. We use this to shard the output by worker process: each worker writes its own Parquet file. Otherwise, sending data back to the parent to collect and write becomes a bottleneck that severely impairs parallelism (I have learned this through long experience with LensKit).
- Since each worker has its own output, we can no longer de-duplicate embeddings while we write them. Instead, each worker de-duplicates embeddings within its own output and writes the result to a shared temporary directory; the parent process then collects all those embeddings, de-duplicates them once more, and writes the result to a single Parquet file.
- To support the per-worker writing, and to avoid slurping all outputs into memory, this adds a batched Parquet writer that accumulates batches of rows and writes them directly to the Parquet file (see the sketch after this list).
- We can still run in a single process by passing `-j 1` to the evaluation script. The default is to use the `POPROX_NUM_CPUS` environment variable if set, and otherwise the minimum of the number of CPUs and 4.
- This adds `recommend-mind-subset` and `measure-mind-subset` tasks that generate and measure a small subset (the first 1K rows) of MIND, to facilitate quick testing of the offline eval code.
- The `poprox_recommender.evaluation.generate.outputs` module defines the layout of the outputs from recommendation generation, so the layout is defined in one place instead of requiring the different pieces of the generator to be kept consistent.
- **Renames** the “MRR” user metric to “RR”: the individual per-list metric is the reciprocal rank, and the mean of those values is the mean reciprocal rank.
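A minimal sketch of such a batched Parquet writer using `pyarrow` (the class and its interface are illustrative, not the code from this PR):

```python
import pyarrow as pa
import pyarrow.parquet as pq

class BatchedParquetWriter:
    """Accumulate rows and flush them to a Parquet file in batches,
    so the full output never has to be held in memory."""

    def __init__(self, path: str, schema: pa.Schema, batch_size: int = 5000):
        self.schema = schema
        self.batch_size = batch_size
        self.rows: list[dict] = []
        self.writer = pq.ParquetWriter(path, schema)

    def write_row(self, row: dict) -> None:
        self.rows.append(row)
        if len(self.rows) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if self.rows:
            table = pa.Table.from_pylist(self.rows, schema=self.schema)
            self.writer.write_table(table)
            self.rows = []

    def close(self) -> None:
        self.flush()
        self.writer.close()
```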
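And a sketch of the worker-count default and of reading the sharded `recommendations` directory, under the same caveat (the helper name and output path are assumptions):

```python
import os

import pandas as pd

def default_worker_count() -> int:
    """POPROX_NUM_CPUS wins if set; otherwise use at most 4 workers."""
    env = os.environ.get("POPROX_NUM_CPUS")
    if env is not None:
        return int(env)
    return min(os.cpu_count() or 1, 4)

# Readers that support sharded Parquet accept the directory directly and
# concatenate the per-worker files (directory path assumed for illustration):
recs = pd.read_parquet("outputs/recommendations")
```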
locality + text generation
updated pixi lock
Add generic poprox export support to generate.py
Several relatively small improvements to our Docker config:

- configure Serverless to build for `linux/amd64` explicitly
- update deploy.sh to run Serverless properly with npx, and only pull the necessary DVC data
- bump the Pixi version in the Docker images
- clean up the Docker image a bit, removing DNF caches (saves 60MB in the final image)
- add `outputs/` and `.pixi/` to `.dockerignore`, speeding up Docker build startup a lot
Fix paths to the safetensors files
Remove stray reference to SentenceTransformers
Adding support for date and pipeline splitting to generate
Unify Offline Evals
Bake the MiniLM model into the Docker container for deployment
Aggregate metrics by recommender, locality theta, and topic theta for the locality-calibration pipeline (unifying evals)
Fix a `locality_theta` parameter-passing bug
The main logic in the code includes:
I tested different config combinations using the uploaded basic request JSON. There don’t seem to be any bugs at this point, but the details of the output still need to be checked.