data-prep-kit / data-prep-kit

Open source project for data preparation for LLM application builders

python data spark malware code-quality data-preprocessing ray data-preparation deduplication data-prep finetuning data-preprocessing-pipelines datacuration large-language-models llm llmapps large-scale-data-processing datarecipes

Updated Apr 16, 2025
HTML

danielhanchen / sciblox

sciblox - Easier Data Science and Machine Learning

python data-science machine-learning data-mining sklearn data-visualization imputation data-analysis data-preprocessing boosting

Updated Jul 28, 2017
HTML

sharmaroshan / Numpy-and-Pandas

NumPy and Pandas are among the most important building blocks for getting started in Data Science, Analytics, Machine Learning, Business Intelligence, and Business Analytics. This tutorial helps beginners learn the core concepts of NumPy and Pandas and get started with Machine Learning and Data Science.

data-science machine-learning numpy pandas feature-extraction data-analysis data-preprocessing aggregation feature-engineering dataframe pandas-profiling

Updated Apr 12, 2020
HTML

karamolegkos / EverAnalyzer

EverAnalyzer is my thesis in the Department of Digital Systems of the University of Piraeus. EverAnalyzer is a platform for collecting, preprocessing, processing and analyzing Big Data from the Twitter platform.

java data-science big-data spark mongodb hadoop jsp mahout data-analytics data-collection data-preprocessing data-processing hadoop-mapreduce sparkmllib everanalyzer

Updated Sep 2, 2022
HTML

MezbanS / Real-Estate.

This project creates a statistical model to predict demand for loans in each region of the USA based on monthly family income and rental costs. The results are displayed on a dashboard updated periodically with data retrieval.

exploratory-data-analysis data-preprocessing data-modeling data-reporting

Updated Oct 21, 2023
HTML

aenni0409 / DataScienceJob

My side project about Data Scientist

python text-mining plotly data-wrangling data-preprocessing

Updated Jan 28, 2019
HTML

UzoigweC / EasyVisa

Model for easy facilitation of visa processing and approvals

eda xgboost adaboost data-preprocessing gradient-boosting gridsearchcv business-insights stacking-classifier bagging-and-random-forest

Updated Mar 10, 2024
HTML

moreirab / finding-donors

Machine Learning Engineer Nanodegree, Supervised Learning, Finding Donors for CharityML

udacity supervised-learning data-preprocessing udacity-nanodegree evaluation-metrics udacity-machine-learning-nanodegree finding-donors charityml

Updated Dec 27, 2018
HTML

Ginga1402 / Google_App_Store_Rating

EDA & Data Preprocessing on Google App Store Rating Dataset.

data-preprocessing data-cleaning college-project model-preparation

Updated Jun 29, 2023
HTML

Omaewayoshiekinoroyo / LeafNet_Final

This is the final project repository for the LeafNet application, created by Team Ampera from the Asimo class, Group 3, in the Artificial Intelligence Mastery Program organized by Orbit Future Academy.

javascript css python training html php data-science machine-learning front-end computer-vision back-end python3 artificial-intelligence transfer-learning data-preprocessing performance-evaluation model-deployment

Updated Jun 5, 2023
HTML

ashva7 / finding_donors

Finding Donor for CharityML - Machine Learning Nanodegree from Udacity

udacity supervised-learning ensemble-learning logistic-regression data-preprocessing grid-search udacity-nanodegree gradient-boosting-classifier stochastic-gradient-descent gradient-boosting udacity-machine-learning-nanodegree supervised-machine-learning ensemble-classifier supervised-learning-algorithms gridsearchcv finding-donors charityml layman-s-terms

Updated Aug 26, 2018
HTML

rogerchenfz / WISER-CLUB

Drawing on the strong econometrics and statistics background and the rich data science resources of the School of Economics (SOE) and the Wang Yanan Institute for Studies in Economics (WISE) at Xiamen University, WISER CLUB is a mutual-aid data science learning organization jointly run by SOE and WISE graduate and undergraduate students.

data-mining prediction pandas xgboost adaboost data-preprocessing sklearn-classify business-analytics random-forest-classifier

Updated Apr 16, 2020
HTML

HariprasadManimozhi / data-preparation

Data preparation on raw data using Python

python data-preprocessing

Updated May 13, 2020
HTML

UzoigweC / INNHotels

Analyze the data of INN Hotels to find which factors have a high influence on booking cancellations, build a predictive model that can predict which booking is going to be canceled in advance, and help in formulating profitable policies for cancellations and refunds

exploratory-data-analysis logistic-regression pruning data-preprocessing decision-tree multicollinearity auc-roc-curve

Updated Mar 10, 2024
HTML

dattatele / Inspiring-Music-Analysis

natural-language-processing text-mining modeling text-analysis wordcloud nltk data-analysis data-preprocessing multinomial-naive-bayes

Updated Sep 27, 2017
HTML

andrewsunsanto / DSBAProject3

3rd Project for the Post Graduate Programme in Data Science and Business Analytics at the University of Texas at Austin - Linear Regression & Data Preprocessing

linear-regression data-wrangling data-preprocessing missing-values

Updated May 18, 2021
HTML

rohancodestack / Flask-RealEstate-Estimator

In this project, I developed a basic model to predict house prices based on a single feature: the area of the property. Due to the simplicity of the dataset, I chose a linear regression algorithm for prediction. After training the model, I saved the parameters in a pickle file and deployed the model to a web application using Flask.

linear-regression data-preprocessing flask-api front-end-development

Updated Nov 13, 2024
HTML

digital-mila / WineQuality

Final project for DSCI 100: Developed a KNN classification model in R to predict wine quality using physicochemical properties. Conducted data preprocessing, feature selection, and cross-validation to evaluate model performance.

data-science machine-learning r cross-validation feature-selection classification data-analysis academic-project data-preprocessing wine-quality knn-model physicochemical-analysis

Updated Jan 21, 2025
HTML

EricaYanoshak / AI-Purchase-Behavior-Project

This project focuses on predicting customer purchase behavior using machine learning models, with an emphasis on feature importance.

machine-learning logistic-regression data-preprocessing predictive-modeling decision-tree-classifier binary-classification smote random-forest-classifier customer-segmentation marketing-analytics stacking-ensemble model-optimization model-implimentation

Updated Dec 25, 2024
HTML

Mohansharmila / Office-furniture-analysis-using-Rstudio

This project uses two clustering approaches: hierarchical clustering and K-means clustering. Clustering is an unsupervised learning technique that groups related data points showing the same behaviour in the dataset, regardless of the outcome.

html algorithms rstudio data-visualization data-analysis data-preprocessing clustering-algorithm k-means-algorithm

Updated Oct 29, 2024
HTML

data-preprocessing

Here are 29 public repositories matching this topic...
