Allstate is currently developing automated methods of predicting the cost, and hence severity, of claims.
Each row in this dataset represents an insurance claim. You must predict the value for the 'loss' column. Variables prefaced with 'cat' are categorical, while those prefaced with 'cont' are continuous.
{
['id', 'cat1', 'cat2', 'cat3', 'cat4', 'cat5', 'cat6', 'cat7',
'cat8', 'cat9', 'cat10', 'cat11', 'cat12', 'cat13', 'cat14',
'cat15', 'cat16', 'cat17', 'cat18', 'cat19', 'cat20', 'cat21',
'cat22', 'cat23', 'cat24', 'cat25', 'cat26', 'cat27', 'cat28',
'cat29', 'cat30', 'cat31', 'cat32', 'cat33', 'cat34', 'cat35',
'cat36', 'cat37', 'cat38', 'cat39', 'cat40', 'cat41', 'cat42',
'cat43', 'cat44', 'cat45', 'cat46', 'cat47', 'cat48', 'cat49',
'cat50', 'cat51', 'cat52', 'cat53', 'cat54', 'cat55', 'cat56',
'cat57', 'cat58', 'cat59', 'cat60', 'cat61', 'cat62', 'cat63',
'cat64', 'cat65', 'cat66', 'cat67', 'cat68', 'cat69', 'cat70',
'cat71', 'cat72', 'cat73', 'cat74', 'cat75', 'cat76', 'cat77',
'cat78', 'cat79', 'cat80', 'cat81', 'cat82', 'cat83', 'cat84',
'cat85', 'cat86', 'cat87', 'cat88', 'cat89', 'cat90', 'cat91',
'cat92', 'cat93', 'cat94', 'cat95', 'cat96', 'cat97', 'cat98',
'cat99', 'cat100', 'cat101', 'cat102', 'cat103', 'cat104', 'cat105',
'cat106', 'cat107', 'cat108', 'cat109', 'cat110', 'cat111',
'cat112', 'cat113', 'cat114', 'cat115', 'cat116', 'cont1', 'cont2',
'cont3', 'cont4', 'cont5', 'cont6', 'cont7', 'cont8', 'cont9',
'cont10', 'cont11', 'cont12', 'cont13', 'cont14', 'loss']
}
After encoding the categorical features, the total number of features are 1153