TransTab: A flexible transferable tabular learning framework [arxiv]

Documentation is available at https://transtab.readthedocs.io/en/latest/index.html.

Paper is available at https://arxiv.org/pdf/2205.09328.pdf.

A 5-minute blog post introducing TransTab is available at realsunlab.medium.com!

News!

[05/04/23] Check out version 0.0.5 of TransTab!
[01/04/23] Check out version 0.0.3 of TransTab!
[12/03/22] Check out our blog for a quick understanding of TransTab!
[08/31/22] Version 0.0.2 supports encoding tabular inputs into embeddings directly. An example is provided here. Several bugs are fixed.

TODO

Table embedding.
Add support for directly processing tables with missing values.
Add regression support.

Features

This repository provides the Python package transtab for building flexible tabular prediction models. Basic usage takes only a few lines:

import transtab

# load dataset by specifying dataset name
allset, trainset, valset, testset, cat_cols, num_cols, bin_cols \
     = transtab.load_data('credit-g')

# build classifier
model = transtab.build_classifier(cat_cols, num_cols, bin_cols)

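# start training; training_arguments is a dict of keyword arguments
# for transtab.train and must be defined before this call
transtab.train(model, trainset, valset, **training_arguments)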

# make predictions, df_x is a pd.DataFrame with shape (n, d)
# return the predictions ypred with shape (n, 1) if binary classification;
# (n, n_class) if multiclass classification.
ypred = transtab.predict(model, df_x)

It's easy, isn't it?
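
The predictions come back as scores with shape (n, 1) for binary tasks and (n, n_class) for multiclass tasks, so a small post-processing step is usually needed to obtain class labels. The sketch below shows one way to do that with NumPy. The helper name to_labels, the 0.5 threshold, and the assumption that binary scores are positive-class probabilities are illustrative choices, not part of transtab.

```python
import numpy as np


def to_labels(ypred, threshold=0.5):
    """Convert predicted scores into integer class labels (hypothetical helper)."""
    ypred = np.asarray(ypred)
    if ypred.ndim == 1 or ypred.shape[1] == 1:
        # Binary case: one score per row, compared against the threshold.
        # Assumes the score is the probability of the positive class.
        return (ypred.reshape(-1) >= threshold).astype(int)
    # Multiclass case: pick the column with the highest score in each row.
    return ypred.argmax(axis=1)


# Toy arrays standing in for the output of transtab.predict
binary_scores = np.array([[0.2], [0.7], [0.5]])
multiclass_scores = np.array([[0.1, 0.6, 0.3], [0.8, 0.1, 0.1]])

binary_labels = to_labels(binary_scores)
multiclass_labels = to_labels(multiclass_scores)
print(binary_labels, multiclass_labels)
```

The threshold is worth tuning on the validation set when the classes are imbalanced.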

How to install

First, install the PyTorch version that matches your system by following the guide at https://pytorch.org/get-started/locally/.

Then install transtab directly from PyPI:

pip install transtab

or

pip install git+https://github.com/RyanWangZf/transtab.git

Please refer to the documentation for more guidance on installation and troubleshooting.

Transfer learning across tables

A novel feature of transtab is its ability to learn from multiple distinct tables. A pretrained model can be fine-tuned on a new table as follows:

# load the pretrained transtab model
model = transtab.build_classifier(checkpoint='./ckpt')

# load a new tabular dataset
allset, trainset, valset, testset, cat_cols, num_cols, bin_cols \
     = transtab.load_data('credit-approval')

# update categorical/numerical/binary column map of the loaded model
model.update({'cat':cat_cols,'num':num_cols,'bin':bin_cols})

# then we just trigger the training on the new data
transtab.train(model, trainset, valset, **training_arguments)
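
The column map passed to model.update groups column names under the keys 'cat', 'num', and 'bin'. For a table that does not come with these lists, one way to build such a map from a pandas DataFrame is sketched below. The helper infer_column_types and its rule of treating any column with exactly two distinct values as binary are assumptions made for illustration and are not part of transtab; check the result by hand before using it.

```python
import pandas as pd


def infer_column_types(df):
    """Split columns into categorical, numerical, and binary (hypothetical helper)."""
    cat_cols, num_cols, bin_cols = [], [], []
    for col in df.columns:
        n_unique = df[col].nunique(dropna=True)
        if n_unique == 2:
            # Assumption: exactly two distinct values means a binary column.
            bin_cols.append(col)
        elif pd.api.types.is_numeric_dtype(df[col]):
            num_cols.append(col)
        else:
            cat_cols.append(col)
    return cat_cols, num_cols, bin_cols


# Toy table used only to demonstrate the helper
df = pd.DataFrame({
    "age": [23, 45, 31, 52],
    "job": ["clerk", "nurse", "clerk", "pilot"],
    "own_home": [0, 1, 1, 0],
})

cat_cols, num_cols, bin_cols = infer_column_types(df)
column_map = {"cat": cat_cols, "num": num_cols, "bin": bin_cols}
print(column_map)
```

A heuristic like this can misclassify integer-coded categories as numerical, so the lists are best treated as a starting point.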

Contrastive pretraining on multiple tables

Contrastive pretraining can also be run on multiple distinct tables:

# load from multiple tabular datasets
dataname_list = ['credit-g', 'credit-approval']
allset, trainset, valset, testset, cat_cols, num_cols, bin_cols \
     = transtab.load_data(dataname_list)

# build contrastive learner, set supervised=True for supervised VPCL
model, collate_fn = transtab.build_contrastive_learner(
    cat_cols, num_cols, bin_cols, supervised=True)

# start contrastive pretraining
transtab.train(model, trainset, valset, collate_fn=collate_fn, **training_arguments)

Citation

If you find this package useful, please consider citing the following paper:

@inproceedings{wang2022transtab,
  title={TransTab: Learning Transferable Tabular Transformers Across Tables},
  author={Wang, Zifeng and Sun, Jimeng},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}
