The factor of (1/2) is often included for mathematical convenience: when you differentiate the cost function to perform gradient descent (a common optimization algorithm), the power rule brings down a factor of 2 that cancels the (1/2), simplifying the resulting gradient expressions.
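To see the cancellation explicitly, here is the differentiation for a single training example, assuming the linear model $\hat{y} = wx + b$ (the model form is my assumption, for illustration):

$$\frac{\partial}{\partial w} \, \frac{1}{2} \left( \hat{y} - y \right)^2 = \frac{2}{2} \left( \hat{y} - y \right) \cdot \frac{\partial \hat{y}}{\partial w} = \left( \hat{y} - y \right) \cdot x$$

The 2 from the power rule cancels the (1/2); the factor changes only the scale of the gradient, not the location of the minimum.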
Also, I believe this cost function is written for a single instance; they have not yet taken the mean over the training set at this point:
Equation for Mean Squared Error with the (1/2) factor included for convenience: $$\frac{1}{2n} \sum_{i=1}^{n} \left( \hat{y}^{(i)} - y^{(i)} \right)^2$$
And its derivative with respect to the weight: $$\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}^{(i)} - y^{(i)} \right) \cdot x^{(i)}$$
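To make the relationship between these two formulas concrete, here is a minimal numerical sketch, assuming a one-parameter model $\hat{y} = wx$ with synthetic data (both assumptions are mine, not from the cheatsheet). It checks the analytic gradient above against a finite-difference derivative of the half-MSE cost:

```python
import numpy as np

# Synthetic data for a one-parameter linear model y_hat = w * x.
# (Names and data here are illustrative, not from the cheatsheet.)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)
w = 1.5  # arbitrary current weight

def cost(w):
    """Half-MSE: (1/2n) * sum((y_hat - y)^2)."""
    return 0.5 * np.mean((w * x - y) ** 2)

def grad(w):
    """Analytic gradient: (1/n) * sum((y_hat - y) * x)."""
    return np.mean((w * x - y) * x)

# Central finite-difference check: since the 1/2 cancels during
# differentiation, the analytic gradient should match the numerical
# derivative of the half-MSE cost.
eps = 1e-6
numerical = (cost(w + eps) - cost(w - eps)) / (2 * eps)
print(grad(w), numerical)  # the two values should agree closely
```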
Wikipedia defines the cost function MSE as follows: $$\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$$
Yet, the ML cheatsheet uses the following formulas in the backpropagation section.
This is particularly confusing since the Linear Regression and Gradient Descent section defines it correctly: