Mecca-of-Learning-Pioneer/ML-DL-Study

ML-DL-Study

📘An ML and DL study run by MLP.📘

Algorithm

Machine Learning(ML) - Structured Data

Supervised Learning

  • Classification - Loss function: Cross-Entropy, Evaluation metrics: Accuracy (many others exist)

    • k-NN
      • To predict a sample, look at its k nearest neighbors and take the majority class among them as the answer
    • Logistic Regression (L2 Regularization by default)
      • Binary Classification
        • Learn the linear equation that best represents the features for the positive class -> a new sample fed into the equation returns a value on the line (z) -> the sigmoid function converts z into the positive-class probability (1 minus that is the negative-class probability)
      • Multiclass Classification
        • Learn one such linear equation per class -> a new sample returns a value z per class -> the softmax function converts the z values into per-class probabilities
    • Decision Tree (no normalization needed)
      • Learns by chaining yes/no questions to reach the answer -> the most common class in the leaf node is the predicted class
    • Ensemble Learning (mostly Decision Tree based)
      • Trains several models to produce better predictions
      • Random Forest, Extra Trees, Gradient Boosting, Histogram-based Gradient Boosting, (XGBoost) (Gradient Boosting methods use Gradient Descent)
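As a concrete instance of the classification algorithms above, the k-NN majority-vote rule can be sketched in plain Python. This is a minimal sketch, not a library implementation; the toy 2-D dataset, the labels "A"/"B", and k=3 are made up for illustration:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    neighbors = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# toy 2-D dataset: (features, class label)
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]

print(knn_classify(train, (2, 2)))   # the 3 nearest neighbors are all "A"
```

Note that the distances are only meaningful if the features share a scale, which is why normalization matters for k-NN.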
  • Regression - Loss function: Mean Squared Error (MSE), Evaluation metrics: R-squared (many others exist)

    • k-NN
      • To predict a sample, average the target values of its k nearest neighbors and use that as the answer
    • Linear Regression
      • Learn the linear equation that best represents the features -> a new sample fed into the equation returns the value on the line (the prediction)
      • Ridge Regression
        • Linear Regression + L2 Regularization
      • Lasso Regression
        • Linear Regression + L1 Regularization
    • Decision Tree (no normalization needed)
      • Learns by chaining yes/no questions to reach the answer -> the mean target of the samples reaching the leaf node is the prediction
    • Ensemble Learning (mostly Decision Tree based)
      • Trains several models to produce better predictions
      • Random Forest, Extra Trees, Gradient Boosting, Histogram-based Gradient Boosting, (XGBoost) (Gradient Boosting methods use Gradient Descent)
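The Linear Regression idea above (fit a line, then read predictions off it) has a closed form for one feature. A minimal sketch, with a made-up toy dataset that lies exactly on y = 2x + 1:

```python
def fit_line(xs, ys):
    """Ordinary least squares for the model y = w*x + b (one feature)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # slope: covariance(x, y) / variance(x)
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x          # line passes through the mean point
    return w, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]                    # exactly y = 2x + 1
w, b = fit_line(xs, ys)
print(w, b)                          # -> 2.0 1.0
```

Ridge and Lasso change only the objective (adding an L2 or L1 penalty on w), not this prediction step.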

Unsupervised Learning

  • Clustering

    • k-Means
      • Pick k cluster means (cluster centers/centroids) at random -> [assign each sample to its nearest centroid to form clusters -> recompute the centroids] -> repeat the bracketed steps until the centroids stop changing -> the samples end up grouped into k clusters
      • The elbow method can be used to find a good value of k
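The bracketed loop above (Lloyd's algorithm) can be sketched in plain Python. For determinism, this sketch assumes hand-picked rather than random initial centroids, and the toy 2-D point set is made up:

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, recompute the centroids,
    and repeat until they stop moving."""
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # new centroid = coordinate-wise mean of its cluster (keep old if empty)
        new = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl else cen
               for cl, cen in zip(clusters, centroids)]
        if new == centroids:                  # converged: centroids unchanged
            return new, clusters
        centroids = new
    return centroids, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (5, 5)])
```

The elbow method would rerun this for several k values and plot the total within-cluster distance against k.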
  • Dimensionality Reduction

    • Principal Component Analysis = PCA - Principal Component: a vector that represents the data's features well
      • Keep only some of the principal components to reduce the data's dimension (number of features)
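For 2-D data the first principal component can even be computed analytically: it is the top eigenvector of the 2x2 covariance matrix. A minimal sketch (the toy points, spread along the line y = x, are made up for illustration):

```python
import math

def first_principal_component(points):
    """First principal component of 2-D data: the eigenvector of the 2x2
    covariance matrix belonging to its largest eigenvalue."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    if abs(sxy) < 1e-12:                      # axis-aligned data
        return (1.0, 0.0) if sxx >= syy else (0.0, 1.0)
    # largest eigenvalue of [[sxx, sxy], [sxy, syy]]
    half_tr = (sxx + syy) / 2
    lam = half_tr + math.sqrt(half_tr ** 2 - (sxx * syy - sxy ** 2))
    vx, vy = sxy, lam - sxx                   # corresponding eigenvector
    norm = math.hypot(vx, vy)
    return vx / norm, vy / norm

# data spread along y = x -> component close to (0.707, 0.707)
pc = first_principal_component([(0, 0), (1, 1), (2, 2), (-1, -1)])
```

Projecting each centered point onto this vector reduces the data from 2 features to 1.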

Reinforcement Learning

  • Covered in the RL Study

Tip!

  • Be sure to normalize before applying regularization
  • Model parameters: the parameters a model learns from the features
  • Hyperparameters: the parameters a person sets
  • Incremental learning: training that keeps moving in the direction that reduces the loss function value
    • Gradient Descent
      • Stochastic Gradient Descent
      • Minibatch Gradient Descent
      • Batch Gradient Descent
  • Validation Set: a data set split off again from the train set so that the test set is not used when evaluating the model for hyperparameter tuning
  • Use the Test Set only once, after the model is finished (using it repeatedly fits the model to the test set)
    • k-fold Cross Validation: split the train set into k folds; one fold plays the role of the validation set while the remaining folds train the model -> repeat k times and average all the validation scores
  • Hyperparameter tuning with AutoML
    • Grid Search: hyperparameter search (pass a list of values) + cross validation
    • Random Search: hyperparameter search (pass a range of values) + cross validation
  • Dimensionality Reduction can shrink the data's dimension (number of features) before another ML algorithm is trained on it
  • Reducing the data to 3 or fewer dimensions makes visualization easy (can be drawn in 3D or 2D)
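The Minibatch Gradient Descent variant listed above can be sketched for the one-feature MSE model y = w*x + b. This is a minimal sketch: the learning rate, epoch count, batch size, and toy data (which lies exactly on y = 2x + 1) are all made up:

```python
import random

def minibatch_gd(xs, ys, lr=0.05, epochs=300, batch_size=2, seed=0):
    """Minibatch gradient descent on MSE for the model y = w*x + b."""
    rng = random.Random(seed)
    w = b = 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)                      # stochastic: new order each epoch
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # gradients of mean((w*x + b - y)^2) over the batch
            gw = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * gw                       # step against the gradient
            b -= lr * gb
    return w, b

w, b = minibatch_gd([0, 1, 2, 3], [1, 3, 5, 7])   # data from y = 2x + 1
```

Batch size 1 gives Stochastic Gradient Descent; batch size equal to the whole train set gives Batch Gradient Descent.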

Deep Learning(DL) - Structured Data

  • Goal of DL: find the minimum of the loss function = Optimization
  • ANN: Artificial Neural Network
  • DNN: Deep Neural Network
  • SLP: Single Layer Perceptron
  • MLP: Multi Layer Perceptron
  • Node: every unit (feature) that makes up an ANN is called a node
  • ANN ⊃ DNN
  • ANN = { SLP, {DNN} }
  • DNN = { MLP, CNN, RNN }
  • Input Layer
  • Hidden Layer ⊃ FC (Fully Connected) Layer = Dense Layer
  • Output Layer
  • Activation Function: a function applied to the result of each node's linear-equation computation
    • sigmoid function
    • softmax function
    • ReLU
    • tanh (hyperbolic tangent)
  • Optimizer: an optimization algorithm that searches for the minimum loss
    • SGD
    • Optimizers using an Adaptive Learning Rate - can lower the learning rate as the model approaches the optimum
      • Adagrad
      • RMSprop
      • Adam: Momentum optimization + RMSprop
  • Dropout: training with some hidden-layer nodes turned off - prevents overfitting
  • Ordinary data: data where order carries no meaning, e.g. images
  • Sequential data: data where order carries meaning, e.g. text, time series (data recorded at regular time intervals)
    • sequence: one sample
    • Text Data
      • token: a separated word
      • vocabulary: a list of the unique words extracted from the train set
      • Each word is assigned a unique integer to turn text into numeric data (0: padding, 1: start of sentence, 2: token not in the vocabulary)
      • The integer values carry no relationship to each other - use One-hot encoding or Word embedding
  • CNN (Convolutional Neural Network) - specialized for processing image data
    • an NN that contains convolution layers
    • filter = kernel => produces feature maps in the convolution layer
    • same padding: keeps the output the same size as the input after convolution
    • stride: the step size by which the filter moves during convolution
    • pooling: shrinks the width and height of the feature maps produced by a convolution layer
      • max pooling: take the largest value in each region the filter covers
      • average pooling: compute the mean of each region the filter covers
  • RNN (Recurrent Neural Network) - specialized for processing sequential data
    • an NN that contains recurrent layers
    • recurrent layer = cell
    • cell output = hidden state
    • timestep: one step in processing a sample; for text data, one token is one timestep
    • Used in NLP (Natural Language Processing) - speech recognition, machine translation, sentiment analysis, etc.
    • Key models
      • LSTM (Long Short-Term Memory)
      • GRU (Gated Recurrent Unit)
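The max pooling operation described above is simple enough to sketch in plain Python. A minimal sketch assuming a 2x2 window with stride 2 and a made-up 4x4 feature map:

```python
def max_pool(feature_map, size=2, stride=2):
    """Max pooling: keep the largest value in each window the filter covers."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[max(feature_map[r + dr][c + dc]
                 for dr in range(size) for dc in range(size))
             for c in range(0, cols - size + 1, stride)]
            for r in range(0, rows - size + 1, stride)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 3, 2],
        [6, 0, 2, 9]]
print(max_pool(fmap))   # -> [[4, 5], [6, 9]]
```

Average pooling would replace `max(...)` with the mean of the same window; either way the 4x4 map shrinks to 2x2.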

Tip!

  • Image pixels are integers in the range 0-255 -> divide by 255 to normalize them to values between 0 and 1
  • In DL, a separate validation set is split off and used instead of cross validation
  • Every hidden layer and the output layer have a bias and an activation function (except that for Regression the output layer has no activation function)
  • For Binary Classification, use the sigmoid function in the output layer
  • For Multiclass Classification, use the softmax function in the output layer
  • A convolution layer and a pooling layer are almost always used together
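The two output-layer rules above can be sketched in plain Python (no framework; the z values are made up for illustration):

```python
import math

def sigmoid(z):
    """Binary classification output: maps one z value to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-z))

def softmax(zs):
    """Multiclass output: converts the per-class z values into probabilities
    that sum to 1."""
    m = max(zs)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

p = sigmoid(0.0)                           # -> 0.5
probs = softmax([2.0, 1.0, 0.1])           # largest z gets the largest probability
```

These are the same functions used by Logistic Regression in the ML section; in a network they just sit on the output layer.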
