For additional information on model cards, see the Model Card paper.
The model used was scikit-learn's gradient boosting classifier.
Hyperparameter tuning was carried out using GridSearchCV. The parameters used, including those left at their defaults, are:
- learning_rate: 0.1
- max_depth: 5
- n_estimators: 100
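As a rough illustration of how such a search might be set up, here is a minimal sketch. The synthetic data, the candidate grid, the `f1` scoring choice, the 3-fold cross-validation, and the `random_state` values are all assumptions made for the example; the card only reports the final parameter values, not the grid that was searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption); the real project uses census features.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Hypothetical search grid: the card lists only the final values,
# so the alternative max_depth candidate here is illustrative.
param_grid = {
    "learning_rate": [0.1],
    "max_depth": [3, 5],
    "n_estimators": [100],
}

search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    scoring="f1",  # assumed scoring metric
    cv=3,          # assumed number of folds
)
search.fit(X, y)

# Best combination found on the synthetic data.
best_params = search.best_params_
print(best_params)
```

The fitted `search.best_estimator_` could then be evaluated on held-out data.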
The model is only intended for the demonstration of deploying and serving a machine learning model.
80% of the data was used for training.
20% of the data was used for testing/evaluation.
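An 80/20 split like the one described above could be produced with scikit-learn's `train_test_split`. This is only a sketch: the synthetic data, the `random_state`, and the use of `stratify` are assumptions, since the card does not say how the split was made.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption); 1000 rows make the split sizes easy to read.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# Hold out 20% for testing/evaluation. Stratifying on the label and fixing
# the seed are illustrative choices, not something the card states.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y
)

print(len(X_train), len(X_test))
```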
The metrics used to evaluate the model are:
- F1 score: 0.72
- precision: 0.80
- recall: 0.65
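For reference, the three metrics are related: F1 is the harmonic mean of precision and recall. The sketch below computes all three from confusion-matrix counts. The counts are hypothetical, chosen only so that precision and recall match the reported values; they are not the model's actual confusion matrix.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (assumption) that reproduce precision 0.80 and recall 0.65.
precision, recall, f1 = precision_recall_f1(tp=52, fp=13, fn=28)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

With these counts the F1 score rounds to 0.72, which is consistent with the reported precision and recall.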
The dataset dates from 1994, so it is outdated and provides no insight into current income with respect to demographics. The demographic groups within the dataset are also not represented evenly enough. See this bias report for more details.