
Restricted Regression Problem | ISI MStat 2017 PSB Problem 7

This is a regression problem in which we use ordinary least squares to estimate a parameter under a restricted (misspecified) model. This is ISI MStat 2017 PSB Problem 7.

Problem

Consider independent observations \{\left(y_{i}, x_{1 i}, x_{2 i}\right): 1 \leq i \leq n\} from the regression model

    \[y_{i}=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}+\epsilon_{i}, i=1, \ldots, n\]

where x_{1 i} and x_{2 i} are scalar covariates, \beta_{1} and \beta_{2} are unknown scalar
coefficients, and \epsilon_{i} are uncorrelated errors with mean 0 and variance \sigma^{2}>0. Instead of using the correct model, we obtain an estimate \hat{\beta_{1}} of \beta_{1} by minimizing

    \[\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2}\]

Find the bias and mean squared error of \hat{\beta}_{1}.

Prerequisites

    Ordinary least squares for simple linear regression (through the origin)
    Expectation and variance of linear combinations of uncorrelated random variables
    Bias, variance, and the decomposition MSE = Variance + Bias^2

Solution

It is a restricted regression problem: perhaps we have already tested, and failed to reject, the hypothesis that \beta_2 = 0, so we estimate \beta_1 as if \beta_2 = 0. The statistical interest of this problem lies in how that restriction affects the estimate of \beta_1, and we will see how it shows up in the bias and mean squared error.

Let's start by fixing some notation: for sequences a and b of length n, write
s_{a,b} = \sum_{i=1}^{n} a_{i} b_{i}

Let's minimize L(\beta_1) =  \sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2} by differentiating with respect to \beta_1 and equating to 0.

\frac{dL(\beta_1)}{d\beta_1} = \frac{d}{d\beta_1}\sum_{i=1}^{n}\left(y_{i}-\beta_{1} x_{1 i}\right)^{2} = 0

\Rightarrow \sum_{i=1}^{n} x_{1 i} \left(y_{i}-\beta_{1} x_{1 i}\right) = 0

\Rightarrow \hat{\beta_1} = \frac{s_{x_{1},y}}{s_{x_{1},x_{1}}}
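
As a quick sanity check (not part of the exam solution), here is a minimal Python sketch with made-up data and hypothetical parameter values. It computes \hat{\beta_1} from this closed form and verifies it against a direct least squares fit through the origin.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)               # scalar covariate x_{1i}
x2 = rng.normal(size=n)               # scalar covariate x_{2i}
beta1, beta2, sigma = 2.0, 0.5, 1.0   # hypothetical true parameter values
y = beta1 * x1 + beta2 * x2 + sigma * rng.normal(size=n)

# Closed-form minimizer of sum_i (y_i - beta1 * x_{1i})^2:
# beta1_hat = s_{x1,y} / s_{x1,x1}
beta1_hat = np.dot(x1, y) / np.dot(x1, x1)

# Cross-check: least squares through the origin on x1 alone gives the same value
beta1_lstsq, *_ = np.linalg.lstsq(x1.reshape(-1, 1), y, rcond=None)
assert np.isclose(beta1_hat, beta1_lstsq[0])
print(beta1_hat)
```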

From the given model, E(Y_{i})=\beta_{1} x_{1 i}+\beta_{2} x_{2 i}.

\Rightarrow E(s_{x_{1},Y}) = \beta_{1}s_{x_{1},x_{1}} +\beta_{2} s_{x_{1},x_{2}}.

Since the x's are constants, E(\hat{\beta_1}) = \beta_{1} +\beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}.

\Rightarrow Bias(\hat{\beta_1}) = E(\hat{\beta_1}) - \beta_1 = \beta_{2} \frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}.

Thus, the closer \beta_2 is to 0, the closer the bias is to 0. In particular, if the covariates are orthogonal (s_{x_{1},x_{2}} = 0), then \hat{\beta_1} is unbiased even when \beta_2 \neq 0.

From the given conditions, \epsilon_{i} = Y_{i} - \beta_{1} x_{1 i} - \beta_{2} x_{2 i} has mean 0 and variance \sigma^2, and the \epsilon_{i}'s are uncorrelated; no particular distribution is assumed. Hence Var(Y_{i}) = \sigma^2 and the Y_{i}'s are uncorrelated.

Since \hat{\beta_1} = \frac{s_{x_{1},Y}}{s_{x_{1},x_{1}}} is a linear combination of the Y_{i}'s,

Var(\hat{\beta_1}) = \frac{\sum_{i=1}^{n} x_{1i}^2 Var(Y_{i})}{s_{x_1, x_1}^2} = \frac{\sigma^2 s_{x_1, x_1}}{s_{x_1, x_1}^2} = \frac{\sigma^2}{s_{x_1, x_1}}

MSE(\hat{\beta_1}) = Var(\hat{\beta_1}) + Bias(\hat{\beta_1})^2 = \frac{\sigma^2}{s_{x_1, x_1}} + \beta_{2}^2\left(\frac{s_{x_{1},x_{2}}}{s_{x_{1},x_{1}}}\right)^2

Observe that, as a function of \beta_2, the MSE is also minimized when \beta_2 = 0.
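
To see these formulas in action, here is a small Monte Carlo sketch (again with hypothetical parameter values, not from the original solution) comparing the empirical bias and MSE of \hat{\beta_1} with the theoretical expressions derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 30, 200_000
x1 = rng.normal(size=n)               # fixed covariates, constant across replications
x2 = rng.normal(size=n)
beta1, beta2, sigma = 2.0, 0.5, 1.0   # assumed true parameter values

s11 = np.dot(x1, x1)                  # s_{x1,x1}
s12 = np.dot(x1, x2)                  # s_{x1,x2}

# reps independent samples of (Y_1, ..., Y_n) from the true two-covariate model
eps = sigma * rng.normal(size=(reps, n))
y = beta1 * x1 + beta2 * x2 + eps

# restricted OLS estimate beta1_hat = s_{x1,Y} / s_{x1,x1} in each replication
beta1_hat = y @ x1 / s11

print("empirical bias:  ", beta1_hat.mean() - beta1)
print("theoretical bias:", beta2 * s12 / s11)
print("empirical MSE:   ", np.mean((beta1_hat - beta1) ** 2))
print("theoretical MSE: ", sigma**2 / s11 + (beta2 * s12 / s11) ** 2)
```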

