This is a regression problem in which we use the ordinary least squares method to estimate a parameter under a restricted model. It is ISI MStat 2017 PSB Problem 7.

Consider independent observations \(\left\{\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\right\}\) from the regression model

$$
y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, \quad i=1, \ldots, n,
$$

where \(x_{1 i}\) and \(x_{2 i}\) are scalar covariates, \(\beta_{1}\) and \(\beta_{2}\) are unknown scalar coefficients, and \(\epsilon_{i}\) are uncorrelated errors with mean 0 and variance \(\sigma^{2}>0\). Instead of using the correct model, we obtain an estimate \(\hat{\beta}_{1}\) of \(\beta_{1}\) by minimizing

$$
\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}.
$$

Find the bias and mean squared error of \(\hat{\beta}_{1}\).

- Ordinary Least Squares Method
- Minimizing the Squared-Error Loss Function
- Multiple Regression
- Mean Squared Error = \(\text{Variance} + \text{Bias}^2\).
- Bias

This is a restricted regression problem: we fit the model as though \(\beta_2 = 0\), perhaps because a preliminary test did not reject \(\beta_2 = 0\). We therefore estimate \(\beta_1\) under the restriction \(\beta_2 = 0\), while the data actually come from the full model. This is the statistical point of the problem, and we will see how it shows up in the estimate of \(\beta_1\).

Let's start with some notation. For any two sequences \(a_1, \ldots, a_n\) and \(b_1, \ldots, b_n\), write

\( s_{a,b} = \sum_{i=1}^{n} a_{i} b_{i} \)

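Let's minimize \( L(\beta_1) = \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}\) by differentiating with respect to \(\beta_1\) and equating the derivative to 0.

\( \frac{dL(\beta_1)}{d\beta_1} = -2\sum_{i=1}^{n} x_{1 i}\left(y_{i}-\beta_{1} x_{1 i}\right) = 0\)

\( \Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\hat{\beta}_{1} x_{1 i}\right) = 0 \)

\( \Rightarrow \hat{\beta}_1 = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}} \)

Since \( \frac{d^2L(\beta_1)}{d\beta_1^2} = 2 s_{x_{1},x_{1}} > 0 \) (assuming the \(x_{1i}\) are not all zero), this stationary point is indeed the minimum.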
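As a quick numerical sanity check of the closed form above, here is a minimal sketch in plain Python. The data values and the helper names `s` and `loss` are made up for illustration; they are not part of the original problem.

```python
def s(a, b):
    """Sum of products: s_{a,b} = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def loss(beta1, x1, y):
    """Squared-error loss L(beta1) = sum_i (y_i - beta1 * x_{1i})^2."""
    return sum((yi - beta1 * xi) ** 2 for xi, yi in zip(x1, y))


# Hypothetical data, chosen only for illustration.
x1 = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

# Closed-form minimizer from the derivation.
beta1_hat = s(x1, y) / s(x1, x1)

# Normal equation residual: sum_i x_{1i} * (y_i - beta1_hat * x_{1i}).
normal_eq = sum(xi * (yi - beta1_hat * xi) for xi, yi in zip(x1, y))

# L is a convex quadratic in beta1, so stepping away from beta1_hat
# in either direction should not decrease the loss.
eps = 1e-3
is_minimum = (
    loss(beta1_hat, x1, y) <= loss(beta1_hat - eps, x1, y)
    and loss(beta1_hat, x1, y) <= loss(beta1_hat + eps, x1, y)
)

print(beta1_hat, normal_eq, is_minimum)
```

The normal equation residual should be zero up to floating-point error, which is just the first-order condition evaluated at \(\hat{\beta}_1\).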

From the given conditions, \( E(y_{i})=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}\).

\( \Rightarrow E(s_{x_{1},y}) = \beta_{1}s_{x_{1},x_{1}} +\beta_{2} s_{x_{1},x_{2}} \).

Since the \(x\)'s are constants, \( E(\hat{\beta}_1) = \frac{E(s_{x_{1},y})}{s_{x_{1},x_{1}}} = \beta_{1} +\beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}} \).

\( \text{Bias}(\hat{\beta}_1) = E(\hat{\beta}_1) - \beta_1 = \beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}} \).

Thus the bias is proportional to \( \beta_2 \): the closer \( \beta_2 \) is to 0, the closer the bias is to 0. The bias also vanishes when \( s_{x_{1},x_{2}} = 0 \), that is, when the two covariates are orthogonal.
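The bias formula can be checked without any simulation: because \(\hat{\beta}_1\) is linear in the \(y_i\), its expectation is the estimator applied to the mean responses \(E(y_i)\). The sketch below does this for made-up covariates and coefficients (all numeric values and the helper `s` are illustrative assumptions).

```python
def s(a, b):
    """Sum of products: s_{a,b} = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical true coefficients and fixed covariates.
beta1, beta2 = 2.0, 0.5
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0]

# Mean response under the true two-covariate model.
mean_y = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]

# By linearity, E(beta1_hat) is the estimator applied to E(y).
expected_beta1_hat = s(x1, mean_y) / s(x1, x1)
bias_direct = expected_beta1_hat - beta1

# Bias from the derived formula.
bias_formula = beta2 * s(x1, x2) / s(x1, x1)

# With a covariate orthogonal to x1 (s_{x1,x2} = 0) the bias is zero.
x2_orth = [2.0, -1.0, 0.0, 0.0]
bias_orth = beta2 * s(x1, x2_orth) / s(x1, x1)

print(bias_direct, bias_formula, bias_orth)
```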

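From the given conditions, the errors \( \epsilon_i = y_{i} - \beta_{1} x_{1 i} - \beta_{2} x_{2 i}\) are uncorrelated with mean 0 and variance \(\sigma^2\). No particular distribution is assumed, and none is needed: only the mean and variance of \( \hat{\beta}_1 = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}\) enter the MSE.

Since the \(x\)'s are constants, \(Var(y_i) = \sigma^2\) and the \(y_i\) are uncorrelated, so

\(Var(\hat{\beta}_1) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(y_{i})}{s_{x_1, x_1}^2} = \frac{\sigma^2 s_{x_1, x_1}}{s_{x_1, x_1}^2} = \frac{\sigma^2}{s_{x_1, x_1}} \)

\( MSE(\hat{\beta}_1) = \text{Variance} + \text{Bias}^2 = \frac{\sigma^2}{s_{x_1, x_1}} + \beta_{2}^2\left(\frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}\right)^2\)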

Observe that, for fixed covariates, the MSE is also smallest when \(\beta_2 = 0\), where it reduces to the variance \( \frac{\sigma^2}{s_{x_1, x_1}} \).
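The MSE formula can be compared with a Monte Carlo estimate. The sketch below is only illustrative: the problem specifies just the mean and variance of the errors, so drawing them from a normal distribution is an assumption made here for convenience, and the covariates, coefficients, seed and the helper names `s` and `simulate_mse` are all made up.

```python
import random


def s(a, b):
    """Sum of products: s_{a,b} = sum_i a_i * b_i."""
    return sum(ai * bi for ai, bi in zip(a, b))


def simulate_mse(beta1, beta2, sigma, x1, x2, reps=20000, seed=0):
    """Monte Carlo estimate of E[(beta1_hat - beta1)^2].

    Errors are drawn as independent N(0, sigma^2); normality is an
    assumption of this sketch, not of the original problem.
    """
    rng = random.Random(seed)
    s11 = s(x1, x1)
    total = 0.0
    for _ in range(reps):
        # Generate responses from the true two-covariate model.
        y = [beta1 * a + beta2 * b + rng.gauss(0.0, sigma)
             for a, b in zip(x1, x2)]
        # Fit the misspecified one-covariate model.
        beta1_hat = s(x1, y) / s11
        total += (beta1_hat - beta1) ** 2
    return total / reps


# Hypothetical parameters and fixed covariates.
beta1, beta2, sigma = 2.0, 0.5, 1.0
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0]

# Variance + Bias^2 from the derivation.
mse_formula = sigma ** 2 / s(x1, x1) + (beta2 * s(x1, x2) / s(x1, x1)) ** 2
mse_simulated = simulate_mse(beta1, beta2, sigma, x1, x2)

print(mse_formula, mse_simulated)
```

The two numbers should agree up to Monte Carlo error, which shrinks as `reps` grows.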

I guess the estimate of Beta 1 here is slightly incorrect .

beta 1 estimate is wrong dude !!!

Mistake in taking derivative..Whole solution gone wrong..Yare yare daze

Yes srijit da!!

Estimate value of β1 is not true