
# Restricted Regression Problem | ISI MStat 2017 PSB Problem 7

This is a regression problem in which we use ordinary least squares to estimate a parameter under a restricted (misspecified) model. It is ISI MStat 2017 PSB Problem 7.

## Problem

Consider independent observations $$\{\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\}$$ from the regression model
$$y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, i=1, \ldots, n$$ where $$x_{1 i}$$ and $$x_{2 i}$$ are scalar covariates, $$\beta_{1}$$ and $$\beta_{2}$$ are unknown scalar
coefficients, and $$\epsilon_{i}$$ are uncorrelated errors with mean 0 and variance $$\sigma^{2}>0$$. Instead of using the correct model, we obtain an estimate $$\hat{\beta_{1}}$$ of $$\beta_{1}$$ by minimizing
$$\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}$$ Find the bias and mean squared error of $$\hat{\beta}_{1}$$.

## Solution

This can be viewed as a restricted regression problem: we fit the model as if $$\beta_2 = 0$$, perhaps because a preliminary test failed to reject that hypothesis. We are therefore estimating $$\beta_1$$ under the restriction $$\beta_2 = 0$$, and the question is what this restriction does to the estimate of $$\beta_1$$ when the true $$\beta_2$$ may be nonzero.

Throughout, for any two sequences $$a_i$$ and $$b_i$$ we write $$s_{a,b} = \sum_{i=1}^{n} a_{i} b_{i}$$.

Let's minimize $$L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}$$ by differentiating w.r.t $$\beta_1$$ and equating to 0.

$$\frac{d}{d\beta_1}\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2} = 0$$

$$\Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\beta_{1} x_{1 i}\right) = 0$$

$$\Rightarrow \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}$$
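As a quick numerical check of this closed form, here is a minimal sketch in Python with NumPy. The data values and the helper name `restricted_ols_slope` are made up for illustration and are not part of the problem; the result is compared with NumPy's general least-squares solver applied to the same one-column design.

```python
import numpy as np

def restricted_ols_slope(x1, y):
    """No-intercept least-squares slope of y on x1: s_{x1,y} / s_{x1,x1}."""
    x1 = np.asarray(x1, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.dot(x1, y) / np.dot(x1, x1)

# Hypothetical illustrative data (not from the problem statement).
x1 = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

beta1_hat = restricted_ols_slope(x1, y)

# Cross-check: solve the same least-squares problem with x1 as the only column.
beta1_lstsq = np.linalg.lstsq(x1.reshape(-1, 1), y, rcond=None)[0][0]

print(beta1_hat, beta1_lstsq)
```

Both numbers should agree up to floating-point error, since each one minimizes the same sum of squares.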

From the given conditions, $$E(y_{i})=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}$$.

$$\Rightarrow E(s_{x_{1},y}) = \beta_{1}s_{x_{1},x_{1}} +\beta_{2} s_{x_{1},x_{2}}$$.

Since the $$x$$'s are non-random constants, $$E(\hat{\beta_1}) = \frac{E(s_{x_{1},y})}{s_{x_{1},x_{1}}} = \beta_{1} +\beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}$$.

$$\text{Bias}(\hat{\beta_1}) = E(\hat{\beta_1}) - \beta_1 = \beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}$$.

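Thus the bias is proportional to $$\beta_2$$: the closer $$\beta_2$$ is to 0, the closer the bias is to 0. The bias also vanishes when $$s_{x_{1},x_{2}} = 0$$, that is, when the two covariates are orthogonal.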
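Because the estimator is linear in the $$y_i$$, its expectation is just the estimator applied to the mean responses $$E(y_i)$$. The sketch below uses that fact to check the bias formula on a tiny example. The covariate values, the parameter values, and the helper name `expected_beta1_hat` are all assumptions chosen for illustration. The last lines also try a second covariate orthogonal to the first, for which the bias should be zero.

```python
import numpy as np

def expected_beta1_hat(x1, x2, beta1, beta2):
    """E(beta1_hat), computed by applying the estimator to E(y) = beta1*x1 + beta2*x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    mean_y = beta1 * x1 + beta2 * x2
    return np.dot(x1, mean_y) / np.dot(x1, x1)

# Hypothetical covariates and coefficients (illustration only).
x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.0, 1.0])
beta1, beta2 = 2.0, 3.0

# Bias computed directly versus the closed form beta2 * s_{x1,x2} / s_{x1,x1}.
bias_direct = expected_beta1_hat(x1, x2, beta1, beta2) - beta1
bias_formula = beta2 * np.dot(x1, x2) / np.dot(x1, x1)

# A covariate orthogonal to x1 (1*3 + 2*0 + 3*(-1) = 0) gives zero bias.
x2_orth = np.array([3.0, 0.0, -1.0])
bias_orth = expected_beta1_hat(x1, x2_orth, beta1, beta2) - beta1

print(bias_direct, bias_formula, bias_orth)
```

Here $$s_{x_{1},x_{1}} = 14$$ and $$s_{x_{1},x_{2}} = 4$$, so both bias computations give $$3 \cdot 4/14 = 6/7$$.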

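From the given conditions, the errors $$\epsilon_i = y_{i} - \beta_{1} x_{1 i} - \beta_{2} x_{2 i}$$ are uncorrelated with mean 0 and variance $$\sigma^2$$. No particular distribution is assumed, so only the first two moments are available.

Likewise, $$\hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}$$ has mean $$E(\hat{\beta_{1}})$$ and variance $$Var(\hat{\beta_1})$$. Since the $$y_i$$ are uncorrelated, the variance of the linear combination is the sum of the individual variances:

$$Var(\hat{\beta_1}) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(y_{i})}{s_{x_1, x_1}^2} = \frac{\sigma^2 s_{x_1, x_1}}{s_{x_1, x_1}^2} = \frac{\sigma^2}{s_{x_1, x_1}}$$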

$$MSE(\hat{\beta_1}) = \text{Variance} + \text{Bias}^2 = \frac{\sigma^2}{s_{x_1, x_1}} + \beta_{2}^2\left(\frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}\right)^2$$

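Observe that, for fixed covariates and fixed $$\sigma^2$$, the MSE is smallest when $$\beta_2 = 0$$ (or when $$s_{x_{1},x_{2}} = 0$$), in which case it reduces to the variance $$\frac{\sigma^2}{s_{x_{1},x_{1}}}$$.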
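The bias and MSE expressions can also be checked by simulation. The following is a rough Monte Carlo sketch, not part of the original solution: the sample size, the way the covariates are generated, the parameter values, and the use of normal errors are all assumptions (the problem only requires uncorrelated errors with mean 0 and variance $$\sigma^2$$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed design and parameters (illustration only).
n = 50
x1 = rng.uniform(1.0, 2.0, size=n)
x2 = 0.5 * x1 + rng.uniform(0.0, 1.0, size=n)  # deliberately correlated with x1
beta1, beta2, sigma = 2.0, 3.0, 1.0

s11 = np.dot(x1, x1)
s12 = np.dot(x1, x2)

# Theoretical bias and MSE from the formulas derived above.
bias_theory = beta2 * s12 / s11
mse_theory = sigma**2 / s11 + bias_theory**2

# Simulate many data sets from the true two-covariate model,
# keeping the covariates fixed and redrawing only the errors.
reps = 20000
eps = rng.normal(0.0, sigma, size=(reps, n))  # normal errors are an assumption
y = beta1 * x1 + beta2 * x2 + eps             # shape (reps, n)

# Misspecified one-covariate estimate for each replication.
beta1_hats = (y @ x1) / s11

bias_mc = beta1_hats.mean() - beta1
mse_mc = np.mean((beta1_hats - beta1) ** 2)

print("bias:", bias_mc, "theory:", bias_theory)
print("mse: ", mse_mc, "theory:", mse_theory)
```

With this many replications the simulated bias and MSE should sit close to the theoretical values; rerunning with `beta2 = 0.0` should bring the simulated bias close to zero and the MSE close to $$\frac{\sigma^2}{s_{x_{1},x_{1}}}$$.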