dwipam/code

ReadMe for this Branch.

A collection of code used in data mining and data science, developed by me (Data Science, Indiana University):

Artificial-Intelligence: This folder contains Python programs in which I implemented KNN, Neural Nets, BFS, DFS, A*, Naive Bayes, HMM Viterbi and MCMC Gibbs Sampling. A description of each program is written at the top of the program itself. The file to run for each program is listed below:

  1. Image Classifier -
    File to run - orient.py
    Models used - Neural Nets, KNN
    Train_data - train-data.txt
    test_data - test-data.txt

  2. Maps -
    File to run - route.py
    City Data - road-segments.txt
    A* data - city-gps.txt

  3. Parts Of Speech tagger -
    File to run - pos_solver.py
    Train_data - bc.train
    Test_data - bc.test

  4. Zacate_Auto_Player -
    File to run - zacate.py

  5. Solver_16 -
    File to run - solver16.py
    input_matrix_data - input

Algorithms:

  1. Selection Sort - selectionsort.java
  2. Quick sort - quicksort.java
  3. Merge Sort - mergersort.java
  4. Longest Common Subsequence - LCS.java
  5. Huffman coding - Huffman.py
  6. Heap Sort - HeapSort.java
  7. Dijkstra path finding - dijkstra.py
  8. DFS - dfs.py (recursion)
  9. Binary Search Tree - BinarySearchTree.java
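
As an illustration of the kind of routine listed above, here is a minimal Python sketch of the longest common subsequence problem solved with dynamic programming. It is not taken from LCS.java; the function name `lcs` and its string-in, string-out interface are assumptions made for this example.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (hypothetical helper)."""
    m, n = len(a), len(b)
    # dp[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])

    # Walk back from the bottom-right corner to recover one subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))
```

The table costs O(len(a) * len(b)) time and memory, which is fine for short inputs; when several subsequences tie for the longest length, the walk-back returns just one of them.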

Data Mining:

  1. Kmeans - kmean_test.R (Implementation of the K-means algorithm, taking the number of clusters (k), tow and l, where l is the number of points the data is to be allocated to.)
    Data - http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/
  2. K-L distance - kl.R (Calculates the KL distance)
  3. Data_mining/BUS_decoders/BUS_decoders/Code - contains all the code for the project, including data cleaning and merging.
    Please check Readme_Data.txt, Readme_code.txt and Report.pdf
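
For readers unfamiliar with the K-L distance mentioned above, the following is a minimal Python sketch of the Kullback-Leibler divergence between two discrete distributions. It is not a port of kl.R, whose inputs and conventions may differ; the function name `kl_divergence` and the list-based interface are assumptions for this example.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions on the same support.

    Hypothetical helper: p and q are sequences of probabilities that each
    sum to 1 and are aligned index by index.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if qi == 0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total
```

Note that the measure is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, so it is not a distance in the strict metric sense.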

Machine Learning(Self Implementations):

  1. Linear Regression -
    ml_assign_1.py
  2. Ridge regression -
    self_implement/rig_regression.py
  3. Lasso regression -
    lass.py
  4. Time series -
    predict_18april_2may.R
  5. Bagging and Boosting(Adaboost) -
    mytree.py
  6. Decision Tree -
    mytree.py

Practice folder is for the coding that I do in my spare time.

Exploratory Data Analysis :- In-depth analysis done before building a predictive model. To view a rendered report, click on the .html file and insert http://htmlpreview.github.com/? before the URL, for example http://htmlpreview.github.com/?https://github.com/dwipam/code/blob/master/EDA/s670-04.html
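
The URL rewrite described above can be done with a one-line helper. This is only a convenience sketch; the function name `preview_url` is an assumption and is not part of the repository.

```python
def preview_url(github_url: str) -> str:
    """Prefix a GitHub .html file URL with the htmlpreview service URL."""
    return "http://htmlpreview.github.com/?" + github_url


if __name__ == "__main__":
    print(preview_url("https://github.com/dwipam/code/blob/master/EDA/s670-04.html"))
```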

Bayesian A/B test :- Farm and multi-armed bandit problem simulation
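
To give a flavour of what a Bayesian multi-armed bandit simulation can look like, here is a minimal Python sketch using Thompson sampling with Beta posteriors over Bernoulli rewards. The README does not say which strategy the folder uses, so Thompson sampling, the function name `thompson_sampling`, and the conversion rates in the usage lines are all assumptions made for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a Bernoulli bandit; return (wins, losses) per arm.

    Hypothetical sketch: true_rates are the hidden success probabilities,
    known only to the simulator, never to the sampling policy.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    wins = [0] * k
    losses = [0] * k
    for _ in range(rounds):
        # Draw one plausible rate per arm from its Beta(1 + wins, 1 + losses)
        # posterior (uniform Beta(1, 1) prior), then play the best draw.
        samples = [rng.betavariate(1 + wins[i], 1 + losses[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return wins, losses


if __name__ == "__main__":
    wins, losses = thompson_sampling([0.05, 0.5], rounds=2000, seed=1)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because each arm is chosen in proportion to the posterior probability that it is the best, traffic shifts toward the stronger variant as evidence accumulates, unlike a fixed 50/50 A/B split.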

Distribution by Technologies:-
Python - See the Artificial-Intelligence folder, dijkstra.py, dfs.py and the practice folder
R - See the Data Mining folder
Java - See the Algorithms folder, Data Mining (BetterCode.java) and the practice folder

Challenges:-
Noctober - Check model.ipnyb within the Noctober folder. Placed 3rd in the AnalyticsVidhya competition.
Telstra - Check Telstra.ipnyb within Telstra challenge.
Attribution - http://htmlpreview.github.io/?https://github.com/dwipam/code/blob/master/AttributionChallenge/Model.html

If anything in this README is unclear, write to:
ddkatari@iu.edu
dwipam.katariya@gmail.com