MLSS Africa Resources

Ferenc's lecture notes:

References and Reading List


Let's consider two models of data:

$$f_1(x) = w_1 x + w_2$$

with initial values $w_1=1$, $w_2=2$, and

$$f_2(x) = 10\cdot w_3 x + w_4$$

with initial values $w_3=0.1$, $w_4=2$.

It is easy to verify that, at these initial values, the two functions are in fact mathematically the same, and that the two models describe the same set of linear 1D functions.
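As a quick sanity check, this equality can be confirmed numerically (a minimal sketch; the function names `f1` and `f2` are just local conveniences, not from the original notes):

```python
# Both models, at their initial parameter values.
def f1(x, w1=1.0, w2=2.0):
    return w1 * x + w2

def f2(x, w3=0.1, w4=2.0):
    return 10 * w3 * x + w4

# At the initial values, 10*w3 = 1 = w1 and w4 = 2 = w2,
# so the two functions agree everywhere.
for x in [-3.0, 0.0, 7.0]:
    assert abs(f1(x) - f2(x)) < 1e-12
```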

Now let's consider observing a new datapoint $x = 7, y = 10$. We will update each model by taking a single gradient step trying to reduce the mean-squared error on this single datapoint, with learning rate $0.1$.

Calculate how $f_1$ and $f_2$ change as a result of a single update step. Relatively speaking, do the slope and bias parameters change similarly in the two models?
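The single gradient step described above can be sketched in a few lines of Python (the squared-error loss on one datapoint is $(f(x) - y)^2$, so its gradient with respect to each parameter is $2r$ times that parameter's coefficient, where $r$ is the residual):

```python
# One SGD step on the squared error for each parameterization.
x, y = 7.0, 10.0
lr = 0.1

# Model 1: f1(x) = w1*x + w2, initial values w1=1, w2=2
w1, w2 = 1.0, 2.0
r1 = (w1 * x + w2) - y       # residual f1(x) - y
w1 -= lr * 2 * r1 * x        # d/dw1 of r^2 is 2*r*x
w2 -= lr * 2 * r1            # d/dw2 of r^2 is 2*r

# Model 2: f2(x) = 10*w3*x + w4, initial values w3=0.1, w4=2
w3, w4 = 0.1, 2.0
r2 = (10 * w3 * x + w4) - y  # same residual at the initial point
w3 -= lr * 2 * r2 * 10 * x   # d/dw3 of r^2 is 2*r*10*x
w4 -= lr * 2 * r2

print("f1 after one step: slope =", w1, ", bias =", w2)
print("f2 after one step: effective slope =", 10 * w3, ", bias =", w4)
```

Note how the factor of $10$ in $f_2$ rescales the gradient with respect to $w_3$, so the effective slope $10\,w_3$ moves very differently from $w_1$ even though both models start from the same function.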

If you have a solution, maybe nice plots, feel free to send them to me in an email: