# MLSS Africa Resources

## Ferenc's lecture notes:

- Why Deep Learning Generalises
- Stochastic Gradient Descent
- Approximation with Deep Networks
- Google Colab on Deep Linear Networks

## Related blog posts:

- Some Intuition on Neural Tangent Kernels
- Implicit Regularisation of SGD
- Information Theoretic Bounds for SGD

## References and Reading List

- Bad Global Minima Exist and SGD Can Reach Them
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization
- Understanding Deep Learning Requires Rethinking Generalisation
- A closer look at memorisation in Deep Networks
- Reconciling modern machine learning practice and the bias-variance trade-off
- Deep Double Descent: Where Bigger Models and More Data Hurt
- Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
- Git Re-Basin: Merging Models modulo Permutation Symmetries
- Deep learning generalizes because the parameter-function map is biased towards simple functions
- Benign overfitting in linear regression
- Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
- Triple descent and the Two Kinds of Overfitting: Where & Why do they Appear?
- On the Origin of Implicit Regularization in Stochastic Gradient Descent
- Stochastic Training is Not Necessary for Generalization
- Implicit Regularization in Deep Matrix Factorization
- Implicit Bias of Gradient Descent on Linear Convolutional Networks
- A nice presentation by Suriya Gunasekhar
- Finite Versus Infinite Neural Networks: an Empirical Study
- Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
- Implicit bias of gradient descent for mean squared error regression with wide neural networks
- Deep Equals Shallow for ReLU Networks in Kernel Regimes
- Sharpness-Aware Minimization for Efficiently Improving Generalization & Towards Understanding Sharpness-Aware Minimization
- Asymmetric Valleys: Beyond Sharp and Flat Local Minima
- The lottery ticket hypothesis: Finding sparse, trainable neural networks
- Proving the lottery ticket hypothesis: Pruning is all you need

## Homework

Let's consider two models of data:

$$
f_1(x) = w_1 x + w_2
$$

with initial values $w_1=1$, $w_2=2$, and

$$
f_2(x) = 10\cdot w_3 x + w_4
$$

with initial values $w_3=0.1$, $w_4=2$.

It is easy to verify that, at these initial values, the two functions are in fact mathematically identical, and that the two models describe the same set of linear 1D functions.

Now let's consider observing a new datapoint $x = 7, y = 10$. We will update each model by taking a single gradient step trying to reduce the mean-squared error on this single datapoint, with learning rate $0.1$.

Calculate how $f_1$ and $f_2$ will change as a result of a single update step. Relatively speaking, do the slope and bias parameters change similarly in the two models?
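If you want to check your hand calculation, the update can be sketched in a few lines of plain Python. This is one possible reading of the exercise: the loss on the single datapoint is taken to be the squared error $(f(x) - y)^2$ (the mean over one point), and the learning rate and initial values follow the problem statement above.

```python
# One gradient step of squared-error loss on the datapoint (x, y) = (7, 10),
# comparing the two parameterisations of the same linear function:
#   f1(x) = w1*x + w2        with w1 = 1,   w2 = 2
#   f2(x) = 10*w3*x + w4     with w3 = 0.1, w4 = 2
x, y = 7.0, 10.0
lr = 0.1

# Model 1: gradient of (w1*x + w2 - y)^2 w.r.t. (w1, w2)
w1, w2 = 1.0, 2.0
err1 = (w1 * x + w2) - y        # prediction error: 9 - 10 = -1
w1 -= lr * 2 * err1 * x         # d/dw1 = 2 * err * x
w2 -= lr * 2 * err1             # d/dw2 = 2 * err

# Model 2: gradient of (10*w3*x + w4 - y)^2 w.r.t. (w3, w4)
w3, w4 = 0.1, 2.0
err2 = (10 * w3 * x + w4) - y   # same prediction, same error: -1
w3 -= lr * 2 * err2 * 10 * x    # d/dw3 = 2 * err * 10 * x  (extra factor 10)
w4 -= lr * 2 * err2             # d/dw4 = 2 * err

print("f1 slope, bias:", w1, w2)          # slope ≈ 2.4,  bias ≈ 2.2
print("f2 slope, bias:", 10 * w3, w4)     # slope ≈ 141,  bias ≈ 2.2
```

Note how the effective slope of $f_2$, i.e. $10 w_3$, moves much further than the slope of $f_1$, even though the two models started out as the same function: gradient descent is not invariant to reparameterisation.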

If you have a solution, perhaps with some nice plots, feel free to send it to me by email: ferenc.huszar@gmail.com