MCTensor is a general-purpose, fast, and high-precision deep learning library built upon pytorch and compliant on PyTorch programming paradigm on building deep learning modules and training codes. MCTensor follows the multiple-component floating-point format (MCF) using an unevaluated sum of multiple ordinary floating-point numbers (e.g. float16, float32, float64) to represent a high-precision floating point number.
The paper is presented in Hardware Aware Efficient Training workshop (HAET) in ICML'22 and its pdf version can be found in https://arxiv.org/pdf/2207.08867.pdf.
Python >= 3.6 PyTorch >= 1.10.0 CUDA >= 10.1 on linux
- clone this repo locally
- run
python build.py install
.
The folder src/MCTensor
contains source code for the library, specifically,
MCTensor.py
stores theMCTensor
class definition and encapsulated high-level operators.MCOpBasic.py
stores the MCF algorithms such asTwo_Sum
,Renormalize
, andTwo_Prod
and middle-level callers as_AddMCN
,_MultMCN
, and_DivMCN
operators. They will be called fromMCTensor
MCOpMatrix.py
stores the MCF algorithms for vector and matrix level operators such as_Dot_MCN
,_MV_MC_T
, and_MM_MC_T
.MCOptim.py
stores the MC-optimizers used for trainingMCModule
and native pytorch modules.MCModule.py
stores basicMCModule
definition such asMCLinear
andMCEmbedding
.
An MCTensor
x
contains a few important attributes, namely,
- nc: the number of components, which can be derived with
x.size_nc(-1)
. - tensor, the underlying data with all components, whose last dimension is the component dimension, i.e.,
x.tensor[...,i]
will retrive thei
-th component. - fc is a view of the first component, retrived with
x.fc
, which is used mainly for tracking gradient graph so as to be consistent with PyTorch autograd mechanism.
Basic operators are implemented first for MCTensor from basic MCF algorithms described in the paper (e.g., Two_Sum
, _Simple_renormalize
). For example, add
, sub
, div
and mul
. Matrix operators are then developed, including common ones adopted in PyTorch with same semantics, e.g., dot
, mv
, mm
, bmm
, matmul
, except for matmul
where we only support at most 4-d tensors matmul at this moment.
We provide some sample codes for better illustration. MCTensor
overrides PyTorch operators for MCTensor
-MCTensor
, and MCTensor
-Tensor
arithmetic with decorator hooks. Such hook works for torch.FUNC
or mc_tensor.FUNC
calls.
torch.add(mc_tensor, tensor)
workstorch.add(tensor, mc_tensor)
worksmc_tensor + tensor
worksmc_tensor.add(tensor)
workstensor + mc_tensor
works
The following are some sample MCTensor definition and arithmetics codes.
>>> MC_A = MCTensor((2, 2), nc=2)
>>> MC_A
MCTensor(Size=torch.Size([2, 2]), number of components=2, requires_grad=False)
>>> MC_A.tensor[..., 0]
tensor([[-1.8906, 0.3968],
[ 0.8522, -1.0379]])
>>> (MC_A + MC_A).fc
tensor([[[-3.7812],
[ 0.7937]],
[[ 1.7044],
[-2.0758]]])
>>> B = torch.ones(2, 2)
>>> MC_B = MCTensor((2, 2), val=B, nc=2)
>>> MC_C = MC_A + MC_B
>>> MC_C.fc
tensor([[[-0.8906],
[ 1.3968]],
[[ 1.8522],
[-0.0379]]])
>>> MC_A_cuda = MC_A.cuda()
>>> MC_AB = MC_A * MC_B
tensor([[[-1.8906],
[ 0.3968]],
[[ 0.8522],
[-1.0379]]])
>>> MC_A[0]
MCTensor(Size=torch.Size([2]), number of components=2, requires_grad=False)
>>> torch.dot(MC_A[0], B[0])
tensor(-1.4938)
>>> MC_A_requires_grad = MCTensor(2, 2, nc=2, val=MC_A, requires_grad=True)
>>> MC_A_requires_grad.sum().backward()
>>> MC_A_requires_grad.grad
tensor([[1., 1.],
[1., 1.]])
MCModule
is the basic MC Module definition block, similar to nn.Module
. MCModule
uses MCTensor
whose requires_grad=True
for trainable parameters, which means ANY fields in MCModule with requires_grad=True
will be passed as a parameter to optimizer with mc_module.parameters()
call. Currently, buffer
is not supported in MCModule
.
Some example layers including
MCLinear
, inherited fromMCModule
class, follows the implementation ofnn.Linear
in PyTorch.MCEmbedding
, inherited fromMCModule
class, follows the implementation ofnn.Embedding
in PyTorch.MCSequential
,MCModuleList
...
Just as the PyTorch case, users can develop advanced models with MCModule
, for example,
class MCMLP(MCModule):
def __init__(self, input_dim, hidden1, hidden2, nc=2, dtype=d16, device=device):
super(MCMLP, self).__init__()
self.fc1 = MCLinear(input_dim, hidden1, nc=nc, bias=False, dtype=dtype, device=device)
self.fc2 = MCLinear(hidden1, hidden2, nc=nc, bias=False, dtype=dtype, device=device)
self.fc3 = MCLinear(hidden2, 1, nc=nc, bias=False, dtype=dtype, device=device)
def forward(self, x):
x = F.relu(self.fc1(x))
x = x.tensor.sum(-1) # transform x to standard tensor for efficiency
x = F.relu(self.fc2(x))
x = x.tensor.sum(-1)
x = self.fc3(x)
x = torch.sigmoid(x)
x = x.tensor.sum(-1)
return x
MCSGD
and MCAdam
are implemented as counterparts of PyTorch SGD
and Adam
optimizers, which both inherit from MCOptimizer
class. Just as the usage of optimizers in PyTorch, MCOptimizer
can be used in the same way during training as
model = MCMLP(input_dim, hidden1, hidden2, nc=nc,
device=torch.device('cuda:0'), dtype=torch.float16)
criterion = torch.nn.BCELoss()
mc_optimizer = MCOptim.MCSGD(model.parameters(), lr)
for X, Y in trainloader:
mc_optimizer.zero_grad()
X, Y = X.cuda(), Y.cuda()
Y_hat = model(X)
loss = criterion(Y_hat, Y)
mc_outputs.backward()
mc_optimizer.step()
We provide codes for experiments in the paper in the applications
folder, including basic_examples
and poincare_embedding
.
MCTensor uses implements
decorator as defined in MCTensor.py
to override PyTorch operators. For example, we can override PyTorch's torch.cat
as
@implements(torch.cat)
def cat(mctensors, *args, **kw):
print("this is my mctensor cat")
>>> cat(MCTensor(1, 2, nc=2))
this is my MCTensor cat
- Tao Yu, tyu@cs.cornell.edu
- Wentao Guo, wg247@cornell.edu
- Jianan Canal Li, jl3789@cornell.edu
- Tiancheng Yuan, ty373@cornell.edu
- Christopher De Sa, cdesa@cs.cornell.edu
This work is supported by NSF IIS-2008102. MCTensor endeavors to follow the semantics and speed of PyTorch. As it is still under development and mainly implemented in Python level, it may not achieve the same speed as native PyTorch did, and sometimes their semantics are not fully equivalent. Please perform a correctness and performance test before the deployment and feel free to leave a issue or contact us.
MCTensor uses Apache-2 license in the LICENSE file.
If you find MCTensor library helpful in your research, please consider citing us:
@misc{https://doi.org/10.48550/arxiv.2207.08867,
doi = {10.48550/ARXIV.2207.08867},
url = {https://arxiv.org/abs/2207.08867},
author = {Yu, Tao and Guo, Wentao and Li, Jianan Canal and Yuan, Tiancheng and De Sa, Christopher},
title = {MCTensor: A High-Precision Deep Learning Library with Multi-Component Floating-Point},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}