303 class 5.2
SichangHe committed Apr 16, 2024
1 parent ad9354a commit f894b3d
Showing 1 changed file with 24 additions and 0 deletions.
24 changes: 24 additions & 0 deletions src/notes/class_notes/stats303.md
@@ -327,3 +327,27 @@ p(x_{-i}^*)=p(x_{-i})\\
\frac{p(x_i^*|x_{-i})p(x_{-i})p(x_i|x_{-i})}
{p(x_i|x_{-i})p(x_{-i})p(x_i^*|x_{-i})}=1
$$

## entropy

- randomness, impurity, how hard the outcome is to determine
- equal to expected surprise
$$
H=\mathbb E(\sup)=\sum_xp(x)\sup(x)=-\sum_xp(x)\ln p(x)
$$
- [cross entropy loss](notes/cs/machine_learning.html#cross-entropy-loss-for-binary-classification)
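
The formula above can be checked numerically. A minimal sketch, assuming a discrete distribution given as a list of probabilities; the names `surprise`, `entropy` and `empirical_entropy` are made up for illustration and are not from the notes:

```python
import math
from collections import Counter


def surprise(p):
    """Surprise of an outcome with probability p, in nats: ln(1/p)."""
    return math.log(1 / p)


def entropy(probs):
    """Expected surprise: sum of p * ln(1/p) over the outcomes."""
    # Outcomes with p = 0 are skipped, using the convention 0 * ln(1/0) = 0.
    return sum(p * surprise(p) for p in probs if p > 0)


def empirical_entropy(labels):
    """Entropy of the observed label frequencies (assumed helper)."""
    n = len(labels)
    return entropy(c / n for c in Counter(labels).values())


fair = entropy([0.5, 0.5])      # ln 2: the most uncertain two-outcome case
biased = entropy([0.9, 0.1])    # smaller: the outcome is easier to guess
certain = entropy([1.0])        # 0: nothing left to determine
```

The fair coin gives the largest value and the certain outcome gives zero, which matches the reading of entropy as impurity.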

### surprise

$$
\sup=\ln\frac{1}{p(x)}=-\ln p(x)
$$
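
A few values worked from this definition (the probabilities are chosen only for illustration) show that rarer outcomes are more surprising:

$$
p(x)=1\Rightarrow\sup=\ln 1=0\\
p(x)=\tfrac{1}{2}\Rightarrow\sup=\ln 2\approx 0.693\\
p(x)=0.01\Rightarrow\sup=\ln 100\approx 4.605
$$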

### decision tree based on entropy

- information gain
$$
I(Y;x_i)=H(Y)-H(Y|x_i)\\
\text{where}\quad H(Y|x_i)=\sum_{x}p(x_i=x)H(Y|x_i=x)
$$
- on each split, pick the feature that maximizes information gain
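
The split rule above can be sketched for categorical features as follows. This is only an illustration under assumptions: the function names and the toy `outlook`/`windy`/`play` data are hypothetical, and a real decision tree would apply this step recursively to each child node.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy (in nats) of the observed label frequencies."""
    n = len(labels)
    return sum((c / n) * math.log(n / c) for c in Counter(labels).values())


def information_gain(feature, labels):
    """H(Y) - H(Y | x_i) for one categorical feature column."""
    # Group the labels by the value the feature takes.
    groups = defaultdict(list)
    for x, y in zip(feature, labels):
        groups[x].append(y)
    n = len(labels)
    # H(Y | x_i): entropy within each group, weighted by p(x_i = x).
    conditional = sum(len(ys) / n * entropy(ys) for ys in groups.values())
    return entropy(labels) - conditional


def best_split(columns, labels):
    """Name of the feature with the largest information gain."""
    return max(columns, key=lambda name: information_gain(columns[name], labels))


# Toy data, made up for this example.
columns = {
    "outlook": ["sunny", "sunny", "rainy", "rainy"],
    "windy": ["yes", "no", "yes", "no"],
}
play = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(columns["outlook"], play)  # ln 2: splits perfectly
gain_windy = information_gain(columns["windy"], play)      # 0: tells nothing about play
chosen = best_split(columns, play)
```

Here `outlook` separates the labels completely, so it removes all of the entropy of `play`, while `windy` leaves it unchanged.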
