# Restricted Regression Problem | ISI MStat 2017 PSB Problem 7

This is a regression problem in which we use ordinary least squares to estimate a parameter under a restricted model that omits one of the covariates. It is Problem 7 from ISI MStat 2017 PSB.

## Problem

Consider independent observations $\{\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\}$ from the regression model
$$y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, i=1, \ldots, n$$ where $x_{1 i}$ and $x_{2 i}$ are scalar covariates, $\beta_{1}$ and $\beta_{2}$ are unknown scalar
coefficients, and $\epsilon_{i}$ are uncorrelated errors with mean 0 and variance $\sigma^{2}>0$. Instead of using the correct model, we obtain an estimate $\hat{\beta_{1}}$ of $\beta_{1}$ by minimizing
$$\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}$$ Find the bias and mean squared error of $\hat{\beta}_{1}$.

## Solution

This can be viewed as a restricted regression problem: we fit the model under the restriction $\beta_2 = 0$, perhaps because a preliminary test did not reject that hypothesis. We are therefore interested in the estimate of $\beta_1$ obtained under $\beta_2 = 0$. This is the statistical point of the problem, and we will see how the omitted term shows up in the estimate of $\beta_1$.

Throughout, we use the notation $s_{a,b} = \sum_{i=1}^{n} a_{i} b_{i}$.

Let's minimize $L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}$ by differentiating w.r.t $\beta_1$ and equating to 0.

$\frac{dL(\beta_1)}{d\beta_1} = -2\sum_{i=1}^{n} x_{1 i}\left(y_{i}-\beta_{1} x_{1 i}\right) = 0$

$\Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\beta_{1} x_{1 i}\right) = 0$

$\Rightarrow \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}$
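As a quick numerical check of this closed form, here is a minimal Python sketch. The helper names `s` and `beta1_hat` and the data values are made up for illustration and are not part of the original problem. It computes $\hat{\beta_1}$ and confirms that the residuals are orthogonal to $x_1$, which is exactly what the normal equation states.

```python
def s(a, b):
    """Return s_{a,b} = sum_i a_i * b_i (the notation used in this post)."""
    return sum(ai * bi for ai, bi in zip(a, b))


def beta1_hat(x1, y):
    """Least-squares slope through the origin: s_{x1,y} / s_{x1,x1}."""
    return s(x1, y) / s(x1, x1)


def loss(b, x1, y):
    """L(b) = sum_i (y_i - b * x_{1i})^2."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x1, y))


# Illustrative data (hypothetical values, chosen only for this sketch).
x1 = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

b = beta1_hat(x1, y)
residuals = [yi - b * xi for xi, yi in zip(x1, y)]

# Normal equation: sum_i x_{1i} * (y_i - b * x_{1i}) should be (numerically) zero.
print("beta1_hat:", b)
print("s(x1, residuals):", s(x1, residuals))
```

Since $L$ is a quadratic in $\beta_1$ with positive leading coefficient $s_{x_1,x_1}$, the stationary point is indeed a minimum; evaluating `loss` slightly to either side of `b` gives a larger value.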

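From the given conditions, $E(y_{i})=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}$.

$\Rightarrow E(s_{x_{1},y}) = \beta_{1}s_{x_{1},x_{1}} +\beta_{2} s_{x_{1},x_{2}}$.

Since the $x$'s are constants, $E(\hat{\beta_1}) = \frac{E(s_{x_{1},y})}{s_{x_{1},x_{1}}} = \beta_{1} +\beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}$.

$\text{Bias}(\hat{\beta_1}) = E(\hat{\beta_1}) - \beta_1 = \beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}$.

Thus, observe that the bias is proportional to $\beta_2$: the closer $\beta_2$ is to 0, the closer the bias is to 0. The bias also vanishes when $s_{x_{1},x_{2}} = 0$, that is, when the two covariates are orthogonal.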
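The bias formula above can be checked without any simulation, because $\hat{\beta_1}$ is linear in the $y_i$: applying the estimator to the mean responses $E(y_i)$ gives $E(\hat{\beta_1})$ exactly. The sketch below does this for two hypothetical choices of $x_2$, one correlated with $x_1$ and one orthogonal to it. All numbers and function names are illustrative assumptions.

```python
def s(a, b):
    """Return s_{a,b} = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def expected_beta1_hat(x1, x2, beta1, beta2):
    """E(beta1_hat), obtained by applying the estimator to E(y_i)."""
    mean_y = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]
    return s(x1, mean_y) / s(x1, x1)


def bias_formula(x1, x2, beta2):
    """Bias(beta1_hat) = beta2 * s_{x1,x2} / s_{x1,x1}."""
    return beta2 * s(x1, x2) / s(x1, x1)


# Hypothetical design and coefficients, chosen only for illustration.
x1 = [1.0, 2.0, 3.0]
beta1, beta2 = 2.0, 3.0
x2_correlated = [1.0, 0.0, 1.0]   # s_{x1,x2} = 4
x2_orthogonal = [1.0, -2.0, 1.0]  # s_{x1,x2} = 0

for label, x2 in (("correlated", x2_correlated), ("orthogonal", x2_orthogonal)):
    direct = expected_beta1_hat(x1, x2, beta1, beta2) - beta1
    print(label, "direct bias:", direct, "formula:", bias_formula(x1, x2, beta2))
```

In the orthogonal case the omitted covariate causes no bias at all, even though $\beta_2 \neq 0$.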

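From the given conditions, the errors

$y_{i} - \beta_{1} x_{1 i} - \beta_{2} x_{2 i}$ are uncorrelated, with mean 0 and variance $\sigma^2$ (no particular distribution is assumed).

Likewise, $\hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}$ has some distribution with mean $E(\hat{\beta_{1}})$ and variance $Var(\hat{\beta_1})$; only these two moments are needed for the MSE.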

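Since the $y_i$ are uncorrelated with $Var(y_i) = \sigma^2$ and the $x$'s are constants,

$Var(\hat{\beta_1}) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(y_{i})}{s_{x_1, x_1}^2} = \frac{\sigma^2 s_{x_1, x_1}}{s_{x_1, x_1}^2} = \frac{\sigma^2}{s_{x_1, x_1}}$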

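$MSE(\hat{\beta_1}) = \text{Variance} + \text{Bias}^2 = \frac{\sigma^2}{s_{x_1, x_1}} + \beta_{2}^2\left(\frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}\right)^2$

Observe that, for a fixed design, the MSE is also minimized when $\beta_2 = 0$ (or when $s_{x_{1},x_{2}} = 0$), in which case it reduces to the variance $\frac{\sigma^2}{s_{x_1, x_1}}$.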
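To see the MSE formula in action, here is a small Monte Carlo sketch. It assumes Gaussian errors purely for convenience (the problem only requires uncorrelated errors with mean 0 and variance $\sigma^2$), and the design, coefficients, seed, and function names are all hypothetical choices made for this illustration.

```python
import random


def s(a, b):
    """Return s_{a,b} = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def theoretical_mse(x1, x2, beta2, sigma):
    """MSE = sigma^2 / s_{x1,x1} + (beta2 * s_{x1,x2} / s_{x1,x1})^2."""
    s11 = s(x1, x1)
    return sigma ** 2 / s11 + (beta2 * s(x1, x2) / s11) ** 2


def simulated_mse(x1, x2, beta1, beta2, sigma, reps=20000, seed=0):
    """Average of (beta1_hat - beta1)^2 over simulated data sets."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    s11 = s(x1, x1)
    total = 0.0
    for _ in range(reps):
        # Data come from the full (correct) model; Gaussian errors are an assumption.
        y = [beta1 * a + beta2 * b + rng.gauss(0.0, sigma) for a, b in zip(x1, x2)]
        estimate = s(x1, y) / s11  # estimator from the restricted model
        total += (estimate - beta1) ** 2
    return total / reps


# Hypothetical design and parameters.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 1.0, 0.0]
beta1, beta2, sigma = 2.0, 3.0, 1.0

theory = theoretical_mse(x1, x2, beta2, sigma)
empirical = simulated_mse(x1, x2, beta1, beta2, sigma)
print("theoretical MSE:", theory)
print("simulated MSE:  ", empirical)
```

The two numbers should agree up to Monte Carlo error. Setting `beta2 = 0.0` in the same sketch leaves only the variance term $\sigma^2 / s_{x_1,x_1}$.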