This is a regression problem in which we use the ordinary least squares method to estimate a parameter under a restricted (misspecified) model. This is ISI MStat 2017 PSB Problem 7.

## Problem

Consider independent observations \(\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\) from the regression model

$$
y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, \quad i=1, \ldots, n,
$$

where \(x_{1 i}\) and \(x_{2 i}\) are scalar covariates, \(\beta_{1}\) and \(\beta_{2}\) are unknown scalar coefficients, and \(\epsilon_{i}\) are uncorrelated errors with mean 0 and variance \(\sigma^{2}>0\). Instead of using the correct model, we obtain an estimate \(\hat{\beta}_{1}\) of \(\beta_{1}\) by minimizing

$$
\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}.
$$

Find the bias and mean squared error of \(\hat{\beta}_{1}\).

### Prerequisites

- Ordinary Least Square Method
- Minimizing the Square Loss Error Function
- Multiple Regression
- Mean Square Error = \(\text{Variance} + \text{Bias}^2\).
- Bias
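
The bias-variance decomposition listed above is worth deriving once, since the solution leans on it. For any estimator \(\hat{\theta}\) of \(\theta\) with finite variance,

$$
E\left(\hat{\theta}-\theta\right)^{2}
= E\left[\left(\hat{\theta}-E\hat{\theta}\right)+\left(E\hat{\theta}-\theta\right)\right]^{2}
= \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^{2},
$$

because the cross term \(2\left(E\hat{\theta}-\theta\right)E\left(\hat{\theta}-E\hat{\theta}\right)\) vanishes (\(E\hat{\theta}-\theta\) is a constant and \(E(\hat{\theta}-E\hat{\theta})=0\)).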

## Solution

This is a sort of restricted regression problem: perhaps we have tested and accepted the hypothesis \(\beta_2 = 0\), so we estimate \(\beta_1\) as if \(\beta_2\) were 0. The statistical interest of the problem is to see exactly how this restriction shows up in the bias and mean squared error of \(\hat{\beta}_1\).

Let’s minimize \( L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}\) by differentiating w.r.t. \(\beta_1\) and equating to 0.

\( \frac{dL(\beta_1)}{d\beta_1} = -2\sum_{i=1}^{n} x_{1 i}\left(y_{i}-\beta_{1} x_{1 i}\right) = 0\)

\( \Rightarrow \sum_{i=1}^{n} x_{1 i} y_{i} = \beta_{1} \sum_{i=1}^{n} x_{1 i}^{2} \)

\( \Rightarrow \hat{\beta}_{1} = \frac{\sum_{i=1}^{n} x_{1 i} y_{i}}{\sum_{i=1}^{n} x_{1 i}^{2}} \)

(Since \( \frac{d^{2}L}{d\beta_{1}^{2}} = 2\sum_{i=1}^{n} x_{1 i}^{2} > 0 \), this critical point is indeed the minimizer.)

From the given conditions, \( E(y_{i})=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}\). Since the \(x\)'s are constants,

\( E(\hat{\beta}_{1}) = \frac{\sum_{i=1}^{n} x_{1 i} E(y_{i})}{\sum_{i=1}^{n} x_{1 i}^{2}} = \beta_{1} + \beta_{2} \frac{\sum_{i=1}^{n} x_{1 i} x_{2 i}}{\sum_{i=1}^{n} x_{1 i}^{2}} \).

\( \operatorname{Bias}(\hat{\beta}_{1}) = E(\hat{\beta}_{1}) - \beta_{1} = \beta_{2} \frac{\sum_{i=1}^{n} x_{1 i} x_{2 i}}{\sum_{i=1}^{n} x_{1 i}^{2}} \).

Thus, observe that the closer \( \beta_2 \) is to 0 (or the closer the covariates are to orthogonality, \( \sum_{i} x_{1 i} x_{2 i} = 0 \)), the closer the bias is to 0.

For the variance, note that \(\hat{\beta}_{1}\) is a linear function of the \(y_{i}\), which are uncorrelated with common variance \(\sigma^{2}\). Hence,

\( \operatorname{Var}(\hat{\beta}_{1}) = \frac{\sum_{i=1}^{n} x_{1 i}^{2} \operatorname{Var}(y_{i})}{\left(\sum_{i=1}^{n} x_{1 i}^{2}\right)^{2}} = \frac{\sigma^{2}}{\sum_{i=1}^{n} x_{1 i}^{2}} \).

Therefore,

\( MSE(\hat{\beta}_{1}) = E\left(\left(\hat{\beta}_{1} - \beta_{1}\right)^{2}\right) = \text{Variance} + \text{Bias}^2 = \frac{\sigma^{2}}{\sum_{i=1}^{n} x_{1 i}^{2}} + \beta_{2}^{2}\left(\frac{\sum_{i=1}^{n} x_{1 i} x_{2 i}}{\sum_{i=1}^{n} x_{1 i}^{2}}\right)^{2} \)

Observe that the MSE, too, is smallest when \(\beta_2 = 0\) (or when the covariates are orthogonal): the estimator is then unbiased, and its MSE reduces to \( \frac{\sigma^{2}}{\sum_{i=1}^{n} x_{1 i}^{2}} \).
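
As a sanity check (not part of the original solution), here is a short Monte Carlo sketch in Python. The design points, coefficients, and Gaussian errors are illustrative assumptions only; the problem merely requires uncorrelated errors with mean 0 and variance \(\sigma^2\). It repeatedly fits the misspecified model, computes \(\hat{\beta}_1 = \sum_i x_{1i} y_i / \sum_i x_{1i}^2\), and compares the empirical bias and MSE against the closed-form expressions.

```python
import random
import statistics

random.seed(42)

# Fixed design and parameters (illustrative assumptions, not from the problem)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
beta1, beta2, sigma = 2.0, 0.5, 1.0

s11 = sum(a * a for a in x1)               # sum of x1i^2
s12 = sum(a * b for a, b in zip(x1, x2))   # sum of x1i * x2i

# Closed-form bias and MSE from the derivation above
bias_theory = beta2 * s12 / s11
mse_theory = sigma**2 / s11 + bias_theory**2

# Monte Carlo: generate data from the true model, fit only beta1 * x1
reps = 200_000
estimates = []
for _ in range(reps):
    y = [beta1 * a + beta2 * b + random.gauss(0, sigma) for a, b in zip(x1, x2)]
    b1_hat = sum(a * yi for a, yi in zip(x1, y)) / s11
    estimates.append(b1_hat)

bias_mc = statistics.fmean(estimates) - beta1
mse_mc = statistics.fmean((e - beta1) ** 2 for e in estimates)

print(f"bias: theory={bias_theory:.4f}, simulated={bias_mc:.4f}")
print(f"MSE : theory={mse_theory:.4f}, simulated={mse_mc:.4f}")
```

With \(\beta_2 = 0.5\) and these covariates, the simulated bias and MSE should agree with the formulas to within Monte Carlo error; setting `beta2 = 0.0` makes the empirical bias collapse to roughly zero, matching the observation above.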
