
improved regressor memory usage by 60% #745

Open
poonai wants to merge 5 commits into PriorLabs:main from poonai:poonai/optimize_regressor

Conversation

@poonai
Contributor

@poonai poonai commented Jan 22, 2026

Issue

Fixes #354

Motivation and Context

The GitHub issue suggests removing the low-probability borders to reduce the regressor's memory usage. Instead, I refactored the existing code to achieve the same result: the proposed change incrementally accumulates the sum of estimator outputs, rather than collecting every estimator output first and summing them at the end.

existing code:

# existing: every estimator output is kept in a list until the end
outputs = []
for output in iter_estimator_output:
    outputs.append(output)

avg_mean = sum(outputs) / n_estimators

proposed code:

# proposed: keep only a running total, so a single accumulator is held at a time
total = 0
for output in iter_estimator_output:
    total += output

avg_mean = total / n_estimators
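
For concreteness, here is a minimal, self-contained sketch of the same pattern with torch tensors. The names `iter_estimator_output` and `n_estimators` mirror the pseudocode above; everything else is illustrative, not the actual TabPFN code. Only the accumulator and the current output are alive at any point, instead of a list of all per-estimator tensors.

import torch

def average_outputs(iter_estimator_output, n_estimators: int) -> torch.Tensor:
    """Average per-estimator outputs without materializing the full stack."""
    running_sum = None
    for output in iter_estimator_output:
        # Keep a single accumulator tensor instead of appending to a list.
        running_sum = output.clone() if running_sum is None else running_sum + output
    if running_sum is None:
        raise ValueError("No estimator outputs were produced (n_estimators == 0?).")
    return running_sum / n_estimators

# Example: 8 estimators, each producing a (batch, n_borders) logits tensor.
outputs = (torch.randn(4, 1000) for _ in range(8))
avg_mean = average_outputs(outputs, n_estimators=8)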

Public API Changes

  • No Public API changes
  • Yes, Public API changes (Details below)

How Has This Been Tested?

I've tested the memory usage by tweaking the fine-tuning example.

# Imports and constants needed to run this standalone; the constant values here
# are placeholders, use the ones from the fine-tuning example.
import sklearn.datasets
import torch
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNRegressor

RANDOM_STATE = 42
NUM_ESTIMATORS_FINAL_INFERENCE = 8


def main() -> None:
    data = sklearn.datasets.fetch_california_housing(as_frame=True)
    X_all = data.data
    y_all = data.target

    X_train, X_test, y_train, y_test = train_test_split(
        X_all, y_all, test_size=0.1, random_state=RANDOM_STATE
    )

    print(
        f"Loaded {len(X_train):,} samples for training and "
        f"{len(X_test):,} samples for testing."
    )

    # Fit once, then run predict repeatedly and report GPU memory usage.
    base_reg = TabPFNRegressor(
        device=["cuda:0"],
        n_estimators=NUM_ESTIMATORS_FINAL_INFERENCE,
        ignore_pretraining_limits=True,
        inference_config={"SUBSAMPLE_SAMPLES": 50_000},
    )
    base_reg.fit(X_train, y_train)

    for _ in range(5):
        base_pred = base_reg.predict(X_test)

    gib = 1024 ** 3
    print(f"torch.cuda.memory_allocated: {torch.cuda.memory_allocated(0) / gib:f}GB")
    print(f"torch.cuda.memory_reserved: {torch.cuda.memory_reserved(0) / gib:f}GB")
    print(f"torch.cuda.max_memory_reserved: {torch.cuda.max_memory_reserved(0) / gib:f}GB")


if __name__ == "__main__":
    if not torch.cuda.is_available():
        raise RuntimeError(
            "CUDA is not available. Please run the script on a CUDA-enabled GPU."
        )
    main()

previous allocation:

torch.cuda.memory_allocated: 0.046939GB
torch.cuda.memory_reserved: 1.638672GB
torch.cuda.max_memory_reserved: 1.638672GB

current allocation:

torch.cuda.memory_allocated: 0.046939GB
torch.cuda.memory_reserved: 1.021484GB
torch.cuda.max_memory_reserved: 1.021484GB

gain

1.021 / 1.638 * 100 = 62.33, i.e. peak reserved memory drops to about 62% of its previous value (roughly a 38% reduction).

memory_reserved reports the memory currently held by PyTorch's caching allocator (and here it equals max_memory_reserved, i.e. the peak over the run), while memory_allocated reports the memory occupied by live tensors. I assume the tensors accumulated during prediction are released back to the allocator once predict returns, which is why memory_allocated is unchanged. I think it's fair to use the reduction in memory_reserved to argue the benefit of this code change. Please correct me if something is wrong.
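
For what it's worth, a slightly tighter way to read these counters is to reset the peak statistics just before the predict loop, so max_memory_reserved reflects only inference rather than anything allocated during fit. A sketch, reusing base_reg and X_test from the script above:

import torch

torch.cuda.reset_peak_memory_stats(0)  # clear the peak counters for device 0
for _ in range(5):
    base_pred = base_reg.predict(X_test)

gib = 1024 ** 3
print(f"memory_allocated (live tensors):        {torch.cuda.memory_allocated(0) / gib:.3f} GB")
print(f"memory_reserved (caching allocator):    {torch.cuda.memory_reserved(0) / gib:.3f} GB")
print(f"max_memory_reserved (peak since reset): {torch.cuda.max_memory_reserved(0) / gib:.3f} GB")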


Checklist

  • The changes have been tested locally.
  • Documentation has been updated (if the public API or usage changes).
  • A changelog entry has been added (see changelog/README.md), or "no changelog needed" label requested.
  • The code follows the project's style guidelines.
  • I have considered the impact of these changes on the public API.

@poonai poonai requested a review from a team as a code owner January 22, 2026 08:51
@poonai poonai requested review from oscarkey and removed request for a team January 22, 2026 08:51
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request successfully refactors the predict method in TabPFNRegressor to incrementally calculate the sum of estimator outputs, significantly reducing memory usage as demonstrated by the memory_reserved metric. This is a valuable improvement for efficiency. The change involves converting the previous forward method into an iterator _iter_forward_executor and then consuming this iterator in predict for memory-efficient aggregation. The original forward method is also updated to use the new iterator, but it still collects all outputs into lists, which might be intentional for fine-tuning purposes.
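
As a rough illustration of the pattern the review describes (a generator that yields one result per ensemble member, with the old list-based forward built on top of it), here is a self-contained sketch; the class, attributes, and helper below are hypothetical, not the actual TabPFN internals:

from collections.abc import Iterator

import numpy as np
import torch

class RegressorSketch:
    def __init__(self, estimators):
        # Hypothetical: each estimator is a callable returning (borders, logits).
        self.estimators = estimators

    def _iter_forward_executor(self, X) -> Iterator[tuple[np.ndarray, torch.Tensor]]:
        # Yield one (borders, logits) pair at a time so callers can aggregate
        # incrementally instead of stacking every estimator's output.
        for estimator in self.estimators:
            yield estimator(X)

    def forward(self, X):
        # Backward-compatible path (e.g. for fine-tuning): still collects all
        # outputs into lists, now built on top of the iterator.
        borders, logits = [], []
        for b, l in self._iter_forward_executor(X):
            borders.append(b)
            logits.append(l)
        return borders, logits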

     return logit_to_output(output_type=output_type)

-def forward(
+def _iter_forward_executor(
Contributor


medium

There's a typo in the return type hint for _iter_forward_executor. np.ndaarray should be np.ndarray.

    ) -> Iterator[tuple[np.ndarray, torch.Tensor]]:

Contributor Author

@poonai poonai Jan 22, 2026


resolved

@poonai
Contributor Author

poonai commented Jan 29, 2026

@oscarkey friendly ping to review this PR.

This is a small change and shouldn't take long to review. I know you may be busy with other work, so no rush at all. Reviews like this encourage me to make more open-source contributions.

@oscarkey
Contributor

hey @poonai , sorry for the delay. I'll get to both your prs on monday :)

Signed-off-by: balaji <rbalajis25@gmail.com>
- Use _iter_forward_executor directly instead of forward method
- Transform probabilities across borders inside the loop
- Average ensemble outputs on-the-fly instead of accumulating all outputs

This reduces memory usage by avoiding storage of all intermediate outputs, especially beneficial for large n_estimators.

Co-Authored-By: glm4.5
Signed-off-by: balaji <rbalajis25@gmail.com>
@poonai poonai force-pushed the poonai/optimize_regressor branch from 6b78dfd to 962b2a3 Compare February 4, 2026 08:02
@poonai
Contributor Author

poonai commented Feb 4, 2026

@gemini-code-assist review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively reduces memory usage in the regressor's predict method by processing estimator outputs iteratively instead of accumulating them in memory. The refactoring is well-executed, extracting the iterative logic into a new _iter_forward_executor method while maintaining the previous forward method's behavior for backward compatibility. I've identified one potential issue regarding the handling of cases where n_estimators is zero, which could lead to a runtime error. My feedback includes a suggestion to make the code more robust against this scenario.

Comment on lines 940 to +943

     if self.average_before_softmax:
-        logits = stacked_logits.log().mean(dim=0).softmax(dim=-1)
+        logits = (averaged_logits / n_estimators).softmax(dim=-1)  # type: ignore
     else:
-        logits = stacked_logits.mean(dim=0)
+        logits = averaged_logits / n_estimators  # type: ignore
Contributor


high

If _iter_forward_executor yields no items (e.g., if n_estimators=0), averaged_logits will remain None. This will cause a TypeError when it's used in the division on lines 941 or 943. The existing type: ignore comments suppress this potential runtime error.

To make the method more robust, I suggest adding a check to ensure averaged_logits is not None before proceeding with the calculation. This will provide a clearer error message if no estimators were run and allows for the removal of the type: ignore comments.

Suggested change

-    if self.average_before_softmax:
-        logits = stacked_logits.log().mean(dim=0).softmax(dim=-1)
-        logits = (averaged_logits / n_estimators).softmax(dim=-1)  # type: ignore
-    else:
-        logits = stacked_logits.mean(dim=0)
-        logits = averaged_logits / n_estimators  # type: ignore
+    if averaged_logits is None:
+        raise ValueError("Cannot make predictions, possibly due to `n_estimators=0`.")
+    elif self.average_before_softmax:
+        logits = (averaged_logits / n_estimators).softmax(dim=-1)
+    else:
+        logits = averaged_logits / n_estimators



Development

Successfully merging this pull request may close these issues.

Regressor memory increase with each estimator iteration

2 participants