What is Data Dive?

DataDive is a web-based data visualization and predictive modeling playground project. Resources such as statistical approaches, supervised machine learning and Python are used to transform and process the data to get a specified result. The resulting visual representation of the data makes it easier to identify trends, outliers, and new insights about the information it contains.

How to use the site?

• Datasets
- Fruit: Solving a classification problem with Python using the fruit data with colours. The fruits dataset was created by Dr. Iain Murray from the University of Edinburgh. He bought a few dozen oranges, lemons and apples of different varieties, and recorded their measurements in a table. We then formatted the fruits data slightly. The dataset comprises 150 rows and 7 features. The Python library and the dataset are open for learning purposes.
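As a rough illustration of how a table of fruit measurements can be prepared for classification, here is a minimal sketch. The column names (`fruit_name`, `mass`, `width`, `height`) and every value in the table are placeholders invented for this example; they are not taken from the real dataset or from the DataDive code.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Placeholder rows invented for illustration only; NOT the real fruit data.
fruits = pd.DataFrame({
    "fruit_name": ["apple", "apple", "orange", "orange",
                   "lemon", "lemon", "apple", "orange"],
    "mass": [180, 172, 160, 154, 118, 116, 176, 158],
    "width": [8.0, 7.4, 7.1, 7.2, 6.1, 6.0, 7.6, 7.0],
    "height": [6.8, 7.2, 7.6, 7.2, 8.1, 8.5, 7.0, 7.5],
})

# Features are the numeric measurements; the target is the fruit name.
X = fruits[["mass", "width", "height"]]
y = fruits["fruit_name"]

# Hold out a quarter of the rows to evaluate a classifier later.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)
```

With the real dataset, the DataFrame would be loaded from the project's data file instead of being typed in by hand.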

• Algorithms
- SVM: The objective of the support vector machine algorithm is to find a hyperplane (a decision boundary that helps classify the data points) in an N-dimensional space, where N is the number of features, that distinctly classifies the data points. Data points falling on either side of the hyperplane can be attributed to different classes. Among all separating hyperplanes, we choose the one with the maximum margin, i.e. the maximum distance between the hyperplane and the nearest data points of both classes. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence. Support vectors are the data points closest to the hyperplane; they determine its position and orientation, and using them we maximize the margin of the classifier. Deleting the support vectors will change the position of the hyperplane. These are the points that help to build the SVM. We use the Support Vector Machine to lower the misclassification rate (how often a model misclassifies the data).
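The ideas above (hyperplane, margin, support vectors) can be sketched with scikit-learn's `SVC`. This is a minimal example on a tiny synthetic 2-D dataset made up for illustration; it is not the DataDive code and does not use the fruit data.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, well-separated clusters (synthetic points for illustration).
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel gives a hyperplane of the form w . x + b = 0.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# For a linear SVM the margin width is 2 / ||w||.
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin_width)

# Points on either side of the hyperplane get different class labels.
predictions = clf.predict([[2.0, 2.0], [5.0, 4.5]])
print("predictions:", predictions)
```

Only the points listed in `support_vectors_` determine the fitted hyperplane; refitting without them would generally move it.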

• Hyperparameters
- Gaussian Radial Basis Function (RBF): One of the most widely used kernel functions in SVM. It is usually chosen for non-linear data, and it helps to achieve a proper separation when there is no prior knowledge of the data.

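- C Regularization hyperparameter: The parameter that controls the trade-off between achieving a low training error and a low testing error, that is, the ability of the classifier to generalize to unseen data. A larger C penalizes misclassified training points more heavily, while a smaller C tolerates more of them in exchange for a wider margin.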
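To see how the RBF kernel and C interact, the sketch below fits an RBF-kernel `SVC` for a few values of C on scikit-learn's synthetic `make_moons` data. The dataset, the chosen C values and the `gamma="scale"` setting are assumptions made for illustration, not the settings used by DataDive.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, non-linearly separable data (two interleaving half circles).
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for C in (0.01, 1.0, 100.0):
    # kernel="rbf" selects the Gaussian radial basis function kernel.
    clf = SVC(kernel="rbf", C=C, gamma="scale").fit(X_train, y_train)
    results[C] = {
        "train_acc": clf.score(X_train, y_train),
        "test_acc": clf.score(X_test, y_test),
        "n_support": len(clf.support_),
    }
    print(C, results[C])
```

Comparing training and test accuracy across the C values gives a rough picture of the trade-off; in practice C is usually tuned with cross-validation.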

What Libraries Are We Using?

Pickle
Urllib
Base64
Numpy
Pandas
Matplotlib
Seaborn
GraphViz
Sklearn

MujassimJamal/datadive
