## ISI MStat Entrance 2020 Problems and Solutions

This post contains Indian Statistical Institute, ISI MStat Entrance 2020 problems and solutions. Try solving them.

## Subjective Paper – ISI MStat Entrance 2020 Problems and Solutions

• Let $f(x)=x^{2}-2 x+2$. Let $L_{1}$ and $L_{2}$ be the tangents to its graph at $x=0$ and $x=2$ respectively. Find the area of the region enclosed by the graph of $f$ and the two lines $L_{1}$ and $L_{2}$.

• Find the number of $3 \times 3$ matrices $A$ such that the entries of $A$ belong to the set $\mathbb{Z}$ of all integers, and such that the trace of $A^{t} A$ is 6 . $\left(A^{t}\right.$ denotes the transpose of the matrix $\left.A\right)$.

• Consider $n$ independent and identically distributed positive random variables $X_{1}, X_{2}, \ldots, X_{n}$. Suppose $S$ is a fixed subset of $\{1,2, \ldots, n\}$ consisting of $k$ distinct elements, where $1 \leq k<n$.
(a) Compute $\mathbb{E}\left[\frac{\sum_{i \in S} X_{i}}{\sum_{i=1}^{n} X_{i}}\right]$

(b) Assume that the $X_{i}$'s have mean $\mu$ and variance $\sigma^{2}$, $0<\sigma^{2}<\infty$. If $j \notin S$, show that the correlation between $\left(\sum_{i \in S} X_{i}\right) X_{j}$ and $\sum_{i \in S} X_{i}$ lies between $-\frac{1}{\sqrt{k+1}}$ and $\frac{1}{\sqrt{k+1}}$.

• Let $X_{1}, X_{2}, \ldots, X_{n}$ be independent and identically distributed random variables. Let $S_{n}=X_{1}+\cdots+X_{n}$. For each of the following statements, determine whether they are true or false. Give reasons in each case.

(a) If $S_{n} \sim \text{Exp}$ with mean $n$, then each $X_{i} \sim \text{Exp}$ with mean 1.

(b) If $S_{n} \sim \text{Bin}(nk, p)$, then each $X_{i} \sim \text{Bin}(k, p)$.

• Let $U_{1}, U_{2}, \ldots, U_{n}$ be independent and identically distributed random variables, each having a uniform distribution on $(0,1)$. Let $X=\min \{U_{1}, U_{2}, \ldots, U_{n}\}$ and $Y=\max \{U_{1}, U_{2}, \ldots, U_{n}\}$.

Evaluate $\mathbb{E}[X \mid Y=y]$ and $\mathbb{E}[Y \mid X=x]$.

• Suppose individuals are classified into three categories $C_{1}, C_{2}$ and $C_{3}$. Let $p^{2},(1-p)^{2}$ and $2 p(1-p)$ be the respective population proportions, where $p \in(0,1)$. A random sample of $N$ individuals is selected from the population and the category of each selected individual is recorded.

For $i=1,2,3,$ let $X_{i}$ denote the number of individuals in the sample belonging to category $C_{i} .$ Define $U=X_{1}+\frac{X_{3}}{2}$

(a) Is $U$ sufficient for $p ?$ Justify your answer.

(b) Show that the mean squared error of $\frac{U}{N}$ is $\frac{p(1-p)}{2 N}$

• Consider the following model: $y_{i}=\beta x_{i}+\varepsilon_{i} x_{i}, \quad i=1,2, \ldots, n$, where $y_{i}, i=1,2, \ldots, n$ are observed; $x_{i}, i=1,2, \ldots, n$ are known positive constants and $\beta$ is an unknown parameter. The errors $\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{n}$ are independent and identically distributed random variables having the probability density function $f(u)=\frac{1}{2 \lambda} \exp \left(-\frac{|u|}{\lambda}\right), \quad-\infty<u<\infty$ and $\lambda$ is an unknown parameter.

(a) Find the least squares estimator of $\beta$.

(b) Find the maximum likelihood estimator of $\beta$.

• Assume that $X_{1}, \ldots, X_{n}$ is a random sample from $N(\mu, 1)$, with $\mu \in \mathbb{R}$. We want to test $H_{0}: \mu=0$ against $H_{1}: \mu=1$. For a fixed integer $m \in\{1, \ldots, n\}$, the following statistics are defined:

$$\begin{aligned}
T_{1} &= \frac{X_{1}+\ldots+X_{m}}{m} \\
T_{2} &= \frac{X_{2}+\ldots+X_{m+1}}{m} \\
&\ \ \vdots \\
T_{n-m+1} &= \frac{X_{n-m+1}+\ldots+X_{n}}{m}
\end{aligned}$$

Fix $\alpha \in(0,1)$. Consider the test:

Reject $H_{0}$ if $\max \{T_{i}: 1 \leq i \leq n-m+1\}>c_{m, \alpha}$

Find a choice of $c_{m, \alpha} \in \mathbb{R}$ in terms of the standard normal distribution function $\Phi$ that ensures that the size of the test is at most $\alpha$.

• A finite population has $N$ units, with $x_{i}$ being the value associated with the $i$th unit, $i=1,2, \ldots, N$. Let $\bar{x}_{N}$ be the population mean. A statistician carries out the following experiment.

Step 1: Draw an SRSWOR of size $n$ $(1<n<N)$ from the population. Call this sample $S_{1}$, and denote its sample mean by $\bar{X}_{n}$.

Step 2: Draw an SRSWR of size $m$ from $S_{1}$. The $x$-values of the sampled units are denoted by $\{Y_{1}, \ldots, Y_{m}\}$.

An estimator of the population mean is defined as,

$\widehat{T}_{m}=\frac{1}{m} \sum_{i=1}^{m} Y_{i}$

(a) Show that $\widehat{T}_{m}$ is an unbiased estimator of the population mean.

(b) Which of the following has lower variance: $\widehat{T}_{m}$ or $\bar{X}_{n}$?


## Objective Paper

 1. C 2. D 3. A 4. B 5. A 6. B 7. C 8. A 9. C 10. A 11. C 12. D 13. C 14. B 15. B 16. C 17. D 18. B 19. B 20. C 21. C 22. D 23. A 24. B 25. D 26. B 27. D 28. D 29. B 30. C



## Testing of Hypothesis | ISI MStat 2016 PSB Problem 9

This is a problem from the ISI MStat Entrance Examination, 2016. It involves the basic idea of Type 1 error in testing of hypothesis, but focuses on the fundamental relationship between the Exponential distribution and the Geometric distribution.

## The Problem:

Suppose $X_{1}, X_{2}, \ldots, X_{n}$ is a random sample from an exponential distribution with mean $\lambda$.

Assume that the observed data is available on $\left[X_{1}\right], \ldots,\left[X_{n}\right]$, instead of $X_{1}, \ldots, X_{n},$ where $[x]$ denotes the largest integer less than or equal to $x$.

Consider a test for $H_{0}: \lambda=1$ vs $H_{1}: \lambda>1$ which rejects $H_{0}$ when $\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .$

Given $\alpha \in(0,1),$ obtain values of $c_{n}$ such that the size of the test converges to $\alpha$ as $n \rightarrow \infty$.

## Prerequisites:

(a) Testing of Hypothesis

(b) Type 1 Error

(c) Exponential Distribution

(d) Relationship of Exponential Distribution and Geometric Distribution

(e) Central Limit Theorem

## Solution:

• If $X \sim \text{Exponential}(\lambda)$, then $Y = \left[\frac{X}{a}\right] \sim \text{Geom}(p)$, where $p = 1-e^{-\lambda a} \in(0,1)$.

Proof:

$Y$ is clearly discrete, taking values in the set of non-negative integers due to the flooring. Then, for any integer $n \geq 0$, we have
$P(Y=n)=P(X \in[a n, a(n+1)))=\int_{a n}^{a(n+1)} \lambda \mathrm{e}^{-\lambda x}\, dx=(1-p)^{n} p$
where $p=1-e^{-\lambda a} \in(0,1),$ as $\lambda>0$ and $a>0$.

• If $X_i \sim \text{Geom}(p)$ independently, then $\sum_{i = 1}^{n} X_i \sim \text{NBinom}(n,p)$.
• If $X_i \sim \text{Exponential}(\lambda)$, then $S_n = \sum_{i=1}^{n}\left[X_{i}\right] \sim \text{NBinom}(n,p)$, where $p = 1-e^{-\lambda} \in(0,1)$ (the case $a=1$ above).
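These distributional facts are easy to sanity-check by simulation. The following sketch (my own illustration, not part of the original argument) draws Exponential variables with rate 1, the $H_0$ case, and compares the empirical pmf of the floor with the $\text{Geom}(p)$ pmf:

```python
import math
import random

random.seed(0)

# Monte Carlo check: the floor of an Exponential(1) variable should follow
# the Geom(p) pmf P(Y = n) = (1 - p)^n * p with p = 1 - e^{-1}.
p = 1 - math.exp(-1.0)
n_sims = 200_000
floors = [int(random.expovariate(1.0)) for _ in range(n_sims)]

freq0 = sum(1 for y in floors if y == 0) / n_sims
freq1 = sum(1 for y in floors if y == 1) / n_sims
print(freq0, p)            # empirical vs. theoretical P(Y = 0)
print(freq1, (1 - p) * p)  # empirical vs. theoretical P(Y = 1)
```

Since the samples are positive, `int()` is exactly the floor here.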

#### Testing of Hypothesis

$H_{0}: \lambda=1$ vs $H_{1}: \lambda>1$

We reject $H_{0}$ when $\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .$

Here, the size of the test, i.e. the Type 1 error (for a simple null hypothesis), is $\alpha_n = P(S_n > c_{n} \mid \lambda=1)$.

We want to select $c_n$ such that $\alpha_n \to \alpha$.

$S_n$ ~ NBinom($n,p$), where $p = 1-e^{-1}$ under $H_0$.

Now, a $\text{Geom}(p)$ variable supported on $\{0,1,2,\ldots\}$ (as derived above) has mean $\frac{1-p}{p}$ and variance $\frac{1-p}{p^2}$, so by the Central Limit Theorem,

$\frac{\sqrt{n}\left(\frac{S_n}{n} - \frac{1-p}{p}\right)}{\sqrt{\frac{1-p}{p^2}}} \rightarrow Z \sim N(0,1)$.

Observe that thus, $\alpha_n = P(S_n > c_{n} \mid \lambda=1) \rightarrow P\left(Z > \frac{\sqrt{n}\left(\frac{c_n}{n} - \frac{1-p}{p}\right)}{\sqrt{\frac{1-p}{p^2}}}\right) = \alpha$.

Thus, $\frac{\sqrt{n}\left(\frac{c_n}{n} - \frac{1-p}{p}\right)}{\sqrt{\frac{1-p}{p^2}}} = z_{\alpha}$, the upper $\alpha$-point of $N(0,1)$.

Solving, $c_n = \frac{n(1-p)}{p} + z_{\alpha}\,\frac{\sqrt{n(1-p)}}{p}$, where $p = 1-e^{-1}$.
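The cutoff can be checked empirically. The sketch below (an illustration with my own choices $n=400$, $\alpha=0.05$; the quantile $z_{0.05} \approx 1.6449$ is hard-coded) estimates the size of the test by Monte Carlo under $H_0$:

```python
import math
import random

random.seed(1)

# Under H0 the rate is 1, so [X_i] ~ Geom(p) on {0,1,2,...} with
# p = 1 - e^{-1}; S_n then has mean n(1-p)/p and variance n(1-p)/p^2.
alpha = 0.05
z_alpha = 1.6449        # upper 5% point of N(0,1)
p = 1 - math.exp(-1.0)
n = 400
c_n = n * (1 - p) / p + z_alpha * math.sqrt(n * (1 - p)) / p

n_sims = 20_000
rejections = 0
for _ in range(n_sims):
    s_n = sum(int(random.expovariate(1.0)) for _ in range(n))
    if s_n > c_n:
        rejections += 1
size = rejections / n_sims
print(size)  # should be near alpha = 0.05
```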

## Food for Thought

If $X \sim \text{Exponential}(\lambda)$, then what is the distribution of $\{X\}$, the fractional part of $X$? This question is crucial in getting back the Exponential distribution from the Geometric distribution.

Rather, the food for thought asks: how do we get the Exponential distribution from the Geometric distribution?

Stay Tuned. Stay Blessed! See you in the next post.


## ISI MStat PSB 2006 Problem 8 | Bernoullian Beauty

This is a very beautiful sample problem from ISI MStat PSB 2006 Problem 8. It is based on the basic idea of Maximum Likelihood Estimators, but with a bit of thinking. Give it a thought!

## Problem– ISI MStat PSB 2006 Problem 8

Let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be a random sample from the discrete distribution with joint probability mass function

$f_{X,Y}(x,y) = \begin{cases} \frac{\theta}{4} & (x,y)=(0,0) \text{ or } (1,1) \\ \frac{2-\theta}{4} & (x,y)=(0,1) \text{ or } (1,0) \end{cases}$

with $0 \le \theta \le 2$. Find the maximum likelihood estimator of $\theta$.

### Prerequisites

Maximum Likelihood Estimators

Indicator Random Variables

Bernoulli Trials

## Solution :

This is a very beautiful problem; not very difficult, but its beauty is hidden in its simplicity. Let's explore!

Observe that the given pmf, as stated, does not take us anywhere directly, so we should think out of the box. But before going out of the box, let's collect what's in the box!

So, from the given pmf we get $P(\text{getting a pair of the form } (1,1) \text{ or } (0,0))=2\times \frac{\theta}{4}=\frac{\theta}{2}$.

Similarly, $P(\text{getting a pair of the form } (0,1) \text{ or } (1,0))=2\times \frac{2-\theta}{4}=\frac{2-\theta}{2}=1-P(\text{getting a pair of the form } (1,1) \text{ or } (0,0))$.

So, clearly this is pushing us towards Bernoulli trials, isn't it!

So, let's treat the pairs that match, i.e. $x=y$, as our success, and the other possibilities as failure; then our success probability is $\frac{\theta}{2}$, where $0\le \theta \le 2$. So, if $S$ is the number of successful pairs in our given sample of size $n$, then it is evident that $S \sim Binomial(n, \frac{\theta}{2})$.

So, now it is simplified by all means, and we know that the MLE of the population proportion in a binomial is the proportion of successes in the sample.

Hence, $\frac{\hat{\theta_{MLE}}}{2}= \frac{s}{n}$, where $s$ is the number of those pairs in our sample where $X_i=Y_i$.

So, $\hat{\theta}_{MLE}=\frac{2\,(\text{number of pairs in the sample of the form } (0,0) \text{ or } (1,1))}{n}$.

Hence, we are done !!
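A quick simulation sketch (my own illustration, with an arbitrarily chosen true $\theta$) corroborates this: draw pairs from the joint pmf and check that doubling the match proportion recovers $\theta$.

```python
import random

random.seed(2)

# Draw (X, Y) pairs from the joint pmf and apply
# theta_hat = 2 * (#pairs with x == y) / n.
theta = 1.2   # arbitrary true value in [0, 2]
n = 100_000
pairs = [(0, 0), (1, 1), (0, 1), (1, 0)]
weights = [theta / 4, theta / 4, (2 - theta) / 4, (2 - theta) / 4]

sample = random.choices(pairs, weights=weights, k=n)
matches = sum(1 for x, y in sample if x == y)
theta_hat = 2 * matches / n
print(theta_hat)  # close to 1.2
```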

## Food For Thought

Say, $X$ and $Y$ are two independent exponential random variable with means $\mu$ and $\lambda$ respectively. But you observe two other variables, $Z$ and $W$, such that $Z=min(X,Y)$ and $W$ takes the value $1$ when $Z=X$ and $0$ otherwise. Can you find the MLEs of the parameters ?

Give it a try !!


## ISI MStat PSB 2009 Problem 8 | How big is the Mean?

This is a simple and regular sample problem from ISI MStat PSB 2009 Problem 8. It is based on testing the nature of the mean of the Exponential distribution. Give it a try!

## Problem– ISI MStat PSB 2009 Problem 8

Let $X_1,\ldots,X_n$ be i.i.d. observations from the density

$f(x)=\frac{1}{\mu}\exp\left(-\frac{x}{\mu}\right), \ x>0,$

where $\mu >0$ is an unknown parameter.

Consider the problem of testing the hypothesis $H_o : \mu \le \mu_o$ against $H_1 : \mu > \mu_o$.

(a) Show that the test with critical region $\left[\bar{X} \ge \mu_o {\chi^2}_{2n,1-\alpha}/2n\right]$, where ${\chi^2}_{2n,1-\alpha}$ is the $(1-\alpha)$th quantile of the ${\chi^2}_{2n}$ distribution, has size $\alpha$.

(b) Give an expression of the power in terms of the c.d.f. of the ${\chi^2}_{2n}$ distribution.

### Prerequisites

Likelihood Ratio Test

Exponential Distribution

Chi-squared Distribution

## Solution :

This problem is quite regular and simple. From the given form of the hypotheses, it is almost clear that using Neyman-Pearson directly can land you in trouble, since the hypotheses are composite. So let's go for something more general, that is, Likelihood Ratio Testing.

Hence, the likelihood function of $\mu$ for the given sample is

$L(\mu | \vec{X})=\left(\frac{1}{\mu}\right)^n \exp\left(-\frac{\sum_{i=1}^n X_i}{\mu}\right), \ \mu>0$; also observe that the sample mean $\bar{X}$ is the MLE of $\mu$.

So, the Likelihood Ratio statistic is,

$\lambda(\vec{x})=\frac{\sup_{\mu \le \mu_o}L(\mu |\vec{x})}{\sup_\mu L(\mu |\vec{x})} \\ =\begin{cases} 1 & \mu_o \ge \bar{X} \\ \frac{L(\mu_o|\vec{x})}{L(\bar{X}|\vec{x})} & \mu_o < \bar{X} \end{cases}$

So, our test function is ,

$\phi(\vec{x})=\begin{cases} 1 & \lambda(\vec{x})<k \\ 0 & otherwise \end{cases}$.

We reject $H_o$ at size $\alpha$ when $\phi(\vec{x})=1$, for some $k$ such that $E_{H_o}(\phi) \le \alpha$.

Hence, $\lambda(\vec{x}) < k \Rightarrow L(\mu_o|\vec{x})<k\,L(\bar{X}|\vec{x}) \Rightarrow -n\ln \mu_o-\frac{n\bar{X}}{\mu_o} < \ln k -n \ln \bar{X} -n \Rightarrow n \ln \bar{X}-\frac{n\bar{X}}{\mu_o} < K^*$,

for some constant $K^*$.

Let $g(\bar{x})=n\ln \bar{x} -\frac{n\bar{x}}{\mu_o}$, and observe that $g$ is a decreasing function of $\bar{x}$ for $\bar{x} \ge \mu_o$, since $g'(\bar{x})=\frac{n}{\bar{x}}-\frac{n}{\mu_o}<0$ when $\bar{x} > \mu_o$.

Hence, there exists a $c$ such that for $\bar{x} \ge c$ we have $g(\bar{x}) < K^*$.

So, the critical region of the test is of form $\bar{X} \ge c$, for some $c$ such that,

$P_{H_o}(\bar{X} \ge c)=\alpha$, where $\alpha \in (0,1)$ is the size of the test.

Now, our task is to find $c$, and for that observe that if $X \sim Exponential(\theta)$ (mean $\theta$), then $\frac{2X}{\theta} \sim {\chi^2}_2$.

Hence, in this problem, since the $X_i$'s follow $Exponential(\mu)$, we have $\frac{2n\bar{X}}{\mu} \sim {\chi^2}_{2n}$, so that

$P_{H_o}(\bar{X} \ge c)=\alpha \Rightarrow P_{H_o}\left(\frac{2n\bar{X}}{\mu_o} \ge \frac{2nc}{\mu_o}\right)=\alpha \Rightarrow P\left({\chi^2}_{2n} \ge \frac{2nc}{\mu_o}\right)=\alpha$,

which gives $c=\frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}$,

Hence, the rejection region is indeed $\left[\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}\right]$.

Hence Proved !

(b) Now, we know that the power of the test is,

$\beta(\mu)= E_{\mu}(\phi) = P_{\mu}\left(\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}\right) = P_{\mu}\left({\chi^2}_{2n} \ge \frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}\right)$.

Hence, the power is $\beta(\mu)=1-F_{{\chi^2}_{2n}}\left(\frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}\right)$, an expression in terms of the cdf $F_{{\chi^2}_{2n}}$ of the ${\chi^2}_{2n}$ distribution.
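For a concrete sanity check, the sketch below simulates the test at $\mu = \mu_o$ (with hypothetical values $n=10$, $\mu_o=2$; the $\chi^2_{20}$ quantile is hard-coded from standard tables) and confirms the size is close to $\alpha = 0.05$:

```python
import random

random.seed(3)

# Reject when xbar >= mu0 * chi2_{2n,1-alpha} / (2n); at mu = mu0 the
# rejection probability should equal alpha.
n = 10
mu0 = 2.0
chi2_20_095 = 31.410   # 0.95 quantile of chi^2 with 20 degrees of freedom
c = mu0 * chi2_20_095 / (2 * n)

n_sims = 100_000
rejections = 0
for _ in range(n_sims):
    xbar = sum(random.expovariate(1 / mu0) for _ in range(n)) / n
    if xbar >= c:
        rejections += 1
size = rejections / n_sims
print(size)  # should be close to 0.05
```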

## Food For Thought

Can you use any other testing procedure to conduct this test ?


## ISI MStat PSB 2009 Problem 6 | abNormal MLE of Normal

This is a very beautiful sample problem from ISI MStat PSB 2009 Problem 6. It is based on the idea of Restricted Maximum Likelihood Estimators and Mean Squared Errors. Give it a try!

## Problem-ISI MStat PSB 2009 Problem 6

Suppose $X_1,\ldots,X_n$ are i.i.d. $N(\theta,1)$, $\theta_o \le \theta \le \theta_1$, where $\theta_o < \theta_1$ are two specified numbers. Find the MLE of $\theta$ and show that it is better than the sample mean $\bar{X}$ in the sense of having smaller mean squared error.

### Prerequisites

Maximum Likelihood Estimators

Normal Distribution

Mean Squared Error

## Solution :

This is a very interesting problem! We all know that if the condition "$\theta_o \le \theta \le \theta_1$, for some specified numbers $\theta_o < \theta_1$" had not been given, then the MLE would simply have been $\bar{X}=\frac{1}{n}\sum_{k=1}^n X_k$, the sample mean of the given sample. But due to the restriction over $\theta$, things get interestingly complicated.

So, to simplify a bit, let's write the likelihood function of $\theta$ given this sample, $\vec{X}=(X_1,\ldots,X_n)'$:

$L(\theta |\vec{X})=\left(\frac{1}{\sqrt{2\pi}}\right)^n \exp\left(-\frac{1}{2}\sum_{k=1}^n(X_k-\theta)^2\right)$, where $\theta_o \le \theta \le \theta_1$. Now, taking the natural log on both sides and differentiating, we find that

$\frac{d\ln L(\theta|\vec{X})}{d\theta}= \sum_{k=1}^n (X_k-\theta)$.

Now, verify that if $\bar{X} < \theta_o$, then $L(\theta |\vec{X})$ is a decreasing function of $\theta$ on $[\theta_o, \theta_1]$, hence the maximum likelihood is attained at $\theta_o$ itself. Similarly, when $\theta_o \le \bar{X} \le \theta_1$, the maximum is attained at $\bar{X}$. Lastly, when $\bar{X} > \theta_1$, the likelihood function is increasing on $[\theta_o, \theta_1]$, hence the maximum is attained at $\theta_1$.

Hence, the Restricted Maximum Likelihood Estimator of $\theta$, say

$\hat{\theta_{RML}} = \begin{cases} \theta_o & \bar{X} < \theta_o \\ \bar{X} & \theta_o\le \bar{X} \le \theta_1 \\ \theta_1 & \bar{X} > \theta_1 \end{cases}$

Now, let us check that $\hat{\theta}_{RML}$ is a better estimator than $\bar{X}$ in terms of Mean Squared Error (MSE).

Now, writing $f_{\bar{X}}$ for the density of $\bar{X}$,

$MSE_{\theta}(\bar{X})=E_{\theta}(\bar{X}-\theta)^2=\int_{-\infty}^{\infty} (\bar{x}-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}$

$=\int_{-\infty}^{\theta_o} (\bar{x}-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}+\int_{\theta_o}^{\theta_1} (\bar{x}-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}+\int_{\theta_1}^{\infty} (\bar{x}-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}$

$\ge \int_{-\infty}^{\theta_o} (\theta_o-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}+\int_{\theta_o}^{\theta_1} (\bar{x}-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}+\int_{\theta_1}^{\infty} (\theta_1-\theta)^2 f_{\bar{X}}(\bar{x})\,d\bar{x}$

(the inequality holds because $|\bar{x}-\theta| \ge |\theta_o-\theta|$ when $\bar{x} \le \theta_o \le \theta$, and similarly on the right tail)

$=E_{\theta}(\hat{\theta}_{RML}-\theta)^2=MSE_{\theta}(\hat{\theta}_{RML})$.

Hence proved !!
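The comparison can also be seen by simulation. The following sketch (all values are my own illustrative choices) clamps the sample mean to $[\theta_o,\theta_1]$, which is the restricted MLE derived above, and compares empirical MSEs; the effect is clearest when $\theta$ is near a boundary:

```python
import random

random.seed(4)

# Compare MSE of the plain sample mean vs. the clamped (restricted) MLE.
theta0, theta1 = 0.0, 1.0
theta = 0.1     # true value, chosen near the lower boundary
n = 10
n_sims = 50_000

se_mean = se_rml = 0.0
for _ in range(n_sims):
    xbar = sum(random.gauss(theta, 1) for _ in range(n)) / n
    rml = min(max(xbar, theta0), theta1)
    se_mean += (xbar - theta) ** 2
    se_rml += (rml - theta) ** 2

mse_mean, mse_rml = se_mean / n_sims, se_rml / n_sims
print(mse_mean, mse_rml)  # mse_rml should be the smaller one
```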

## Food For Thought

Now, can you find an unbiased estimator for $\theta^2$? Okay, now it's quite easy, right! But is the estimator you are thinking about the best unbiased estimator? Calculate the variance and also check whether the variance attains the Cramer-Rao Lower Bound.

Give it a try !! You may need the help of Stein’s Identity.


## ISI MStat PSB 2018 Problem 9 | Regression Analysis

This is a very simple sample problem from ISI MStat PSB 2018 Problem 9. It is mainly based on ordinary least squares estimates and likelihood estimates of regression parameters. Try it!

## Problem – ISI MStat PSB 2018 Problem 9

Suppose $(y_i,x_i)$ satisfies the regression model,

$y_i= \alpha + \beta x_i + \epsilon_i$ for $i=1,2,….,n.$

where $\{ x_i : 1 \le i \le n \}$ are fixed constants and $\{ \epsilon_i : 1 \le i \le n \}$ are i.i.d. $N(0, \sigma^2)$ errors, where $\alpha, \beta$ and $\sigma^2 (>0)$ are unknown parameters.

(a) Let $\tilde{\alpha}$ denote the least squares estimate of $\alpha$ obtained assuming $\beta=5$. Find the mean squared error (MSE) of $\tilde{\alpha}$ in terms of model parameters.

(b) Obtain the maximum likelihood estimator of this MSE.

### Prerequisites

Normal Distribution

Ordinary Least Square Estimates

Maximum Likelihood Estimates

## Solution :

This problem is simple enough.

for the given model, $y_i= \alpha + \beta x_i + \epsilon_i$ for $i=1,….,n$.

The scenario is even simpler here since it is given that $\beta=5$, so our model reduces to

$y_i= \alpha + 5x_i + \epsilon_i$, where $\epsilon_i \sim N(0, \sigma^2)$ and $\epsilon_i$’s are i.i.d.

Now we know that the Ordinary Least Squares (OLS) estimate of $\alpha$ is

$\tilde{\alpha} = \bar{y} - \tilde{\beta}\bar{x}$ (how?), where $\tilde{\beta}$ is, in general, the OLS estimate of $\beta$; but here $\beta=5$ is known, so

$\tilde{\alpha}= \bar{y} - 5\bar{x}$. Again,

$E(\tilde{\alpha})=E( \bar{y}-5\bar{x})=\alpha+(\beta-5)\bar{x}$, hence $\tilde{\alpha}$ is a biased estimator of $\alpha$ with $Bias_{\alpha}(\tilde{\alpha})= (\beta-5)\bar{x}$.

So, the Mean Squared Error, MSE of $\tilde{\alpha}$ is,

$MSE_{\alpha}(\tilde{\alpha})= E(\tilde{\alpha} - \alpha)^2=Var(\tilde{\alpha}) + {Bias_{\alpha}}^2(\tilde{\alpha})$

$= \frac{\sigma^2}{n}+ \bar{x}^2(\beta-5)^2$

[as follows clearly from the model, $y_i \sim N( \alpha +\beta x_i , \sigma^2)$, and the $x_i$'s are non-stochastic].

(b) The last part follows directly from the note at the end of part (a): $y_i \sim N( \alpha + \beta x_i , \sigma^2 )$. We have to find the maximum likelihood estimators of $\sigma^2$ and $\beta$, and then use the invariance property of the MLE in the MSE obtained in (a). I leave it as an exercise; finish it yourself!
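Part (a)'s formula is easy to verify by Monte Carlo. The sketch below (all parameter values and design points are my own arbitrary choices) compares the simulated MSE of $\tilde{\alpha}$ with $\frac{\sigma^2}{n}+\bar{x}^2(\beta-5)^2$:

```python
import random

random.seed(5)

# With beta != 5 the estimator alpha_tilde = ybar - 5*xbar is biased; its
# MSE should match sigma^2/n + xbar^2 * (beta - 5)^2.
alpha, beta, sigma = 2.0, 6.0, 1.0
x = [0.5, 1.0, 1.5, 2.0, 2.5]   # fixed known constants
n = len(x)
xbar = sum(x) / n

n_sims = 100_000
se = 0.0
for _ in range(n_sims):
    y = [alpha + beta * xi + random.gauss(0, sigma) for xi in x]
    alpha_tilde = sum(y) / n - 5 * xbar
    se += (alpha_tilde - alpha) ** 2
mse_sim = se / n_sims
mse_formula = sigma**2 / n + xbar**2 * (beta - 5) ** 2
print(mse_sim, mse_formula)  # should agree closely
```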

## Food For Thought

Suppose you don’t know the value of $\beta$ even, What will be the MSE of $\tilde{\alpha}$ in that case ?

Also, find the OLS estimate of $\beta$ (you have already done it for $\alpha$), and now find the MLEs of both $\alpha$ and $\beta$. Are the OLS estimates identical to the MLEs you obtained? Which assumption induces this coincidence? What do you think!


## Restricted Maximum Likelihood Estimator |ISI MStat PSB 2012 Problem 9

This is a very beautiful sample problem from ISI MStat PSB 2012 Problem 9. It's about restricted MLEs and how they differ from unrestricted ones; if you miss the delicacies, you may miss the differences too. Try it! But be careful.

## Problem– ISI MStat PSB 2012 Problem 9

Suppose $X_1$ and $X_2$ are i.i.d. Bernoulli random variables with parameter $p$, where it is known that $\frac{1}{3} \le p \le \frac{2}{3}$. Find the maximum likelihood estimator $\hat{p}$ of $p$ based on $X_1$ and $X_2$.

### Prerequisites

Bernoulli trials

Restricted Maximum Likelihood Estimators

Real Analysis

## Solution :

This problem seems quite simple, and it is, if and only if one observes the subtle details. Let's think about the unrestricted MLE of $p$.

Let the unrestricted MLE of $p$ (i.e. when $0\le p \le 1$) based on $X_1$ and $X_2$ be $p_{MLE}$; then $p_{MLE}=\frac{X_1+X_2}{2}$ (how?).

Now let's see the contradictions which may occur if we don't modify $p_{MLE}$ to $\hat{p}$ (as is asked).

See that if our sample comes out such that $X_1=X_2=0$ or $X_1=X_2=1$, then $p_{MLE}$ will be 0 or 1 respectively, whereas the actual parameter $p$ can take neither the value 0 nor 1! So, $p_{MLE}$ needs serious improvement!

To modify $p_{MLE}$, let's observe the log-likelihood function of the Bernoulli based on the two observations:

$\log L(p|x_1,x_2)=(x_1+x_2)\log p +(2-x_1-x_2)\log (1-p)$

Now, make two observations. When $X_1=X_2=0$ (i.e. $p_{MLE}=0$), then $\log L(p|x_1,x_2)=2\log (1-p)$; see that $\log L(p|x_1,x_2)$ decreases as $p$ increases, hence under the given restriction the log-likelihood is maximum when $p$ is least, i.e. $\hat{p}=\frac{1}{3}$.

Similarly, when $p_{MLE}=1$ (i.e. when $X_1=X_2=1$), for the log-likelihood function to be maximum, $p$ has to be maximum, i.e. $\hat{p}=\frac{2}{3}$.

So, to modify $p_{MLE}$ to $\hat{p}$, we develop a linear relationship between $p_{MLE}$ and $\hat{p}$ (linear because the relationship between $p$ and $p_{MLE}$ is linear). So, $(p_{MLE}, \hat{p})$ lies on the line joining the points $(0,\frac{1}{3})$ (when $p_{MLE}= 0$, $\hat{p}=\frac{1}{3}$) and $(1,\frac{2}{3})$. Hence the line is

$\frac{\hat{p}-\frac{1}{3}}{p_{MLE}-0}=\frac{\frac{2}{3}-\frac{1}{3}}{1-0}$

$\hat{p}=\frac{1}{3}+\frac{p_{MLE}}{3}=\frac{2+X_1+X_2}{6}$ is the required restricted MLE.

Hence the solution concludes.
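As a sanity check (a sketch, not a proof), one can maximise the likelihood over a fine grid of $p \in [\frac{1}{3},\frac{2}{3}]$ for each possible sample; the maximiser agrees with $\hat{p}=\frac{1}{3}+\frac{p_{MLE}}{3}=\frac{2+x_1+x_2}{6}$, which is the same as clamping $p_{MLE}$ to $[\frac{1}{3},\frac{2}{3}]$ here:

```python
# Grid search over the restricted parameter space for each possible sample.
def likelihood(p, x1, x2):
    s = x1 + x2
    return p**s * (1 - p) ** (2 - s)

grid = [1 / 3 + i / 30_000 for i in range(10_001)]  # covers [1/3, 2/3]
results = {}
for x1 in (0, 1):
    for x2 in (0, 1):
        best = max(grid, key=lambda p: likelihood(p, x1, x2))
        results[(x1, x2)] = best
        print((x1, x2), round(best, 4), (2 + x1 + x2) / 6)
```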

## Food For Thought

Can you find the conditions under which Maximum Likelihood Estimators are also unbiased estimators of the parameter? For which distributions do you think this condition holds? Are they also Minimum Variance Unbiased Estimators?

Can you give some examples where the MLEs are not unbiased? Even if they are not unbiased, are they sufficient?


## ISI MStat PSB 2010 Problem 10 | Uniform Modified

This is a very elegant sample problem from ISI MStat PSB 2010 Problem 10. It's mostly based on properties of the uniform distribution and its behaviour when modified. Try it!

## Problem– ISI MStat PSB 2010 Problem 10

Let $X$ be a random variable uniformly distributed over $(0,2\theta )$, $\theta>0$, and $Y=max(X,2\theta -X)$.

(a) Find $\mu =E(Y)$.

(b) Let $X_1,X_2,\ldots,X_n$ be a random sample from the above distribution with unknown $\theta$. Find two distinct unbiased estimators of $\mu$, as defined in (a), based on the entire sample.

### Prerequisites

Uniform Distribution

Law of Total Expectation

Unbiased Estimators

## Solution :

Well, this is a very straightforward problem, where we just need to be aware of the way $Y$ is defined.

As we need $E(Y)$, by the definition of $Y$ we clearly see that $Y$ depends on $X$, where $X \sim Unif( 0, 2\theta)$.

So, using Law of Total Expectation,

$E(Y)= E(X\mid X>2\theta-X)P(X>2\theta-X)+E(2\theta-X\mid X \le 2\theta-X)P(X \le 2\theta-X)$. Observe that $X > 2\theta-X$ is the same event as $X > \theta$, and $P(X \le \theta)=\frac{1}{2}$ (why?).

Also, conditional pdf of $X|X>\theta$ is,

$f_{X\mid X>\theta}(x)=\frac{f_X(x)}{P(X>\theta)}=\frac{1}{\theta}, \quad \theta< x \le 2\theta$ [where $f_X$ is the pdf of $X$].

The other conditional pdf is the same due to symmetry (verify!).

So, $E(Y)=\frac{1}{2}E(X\mid X\sim Unif(\theta,2\theta))+\frac{1}{2}E(2\theta-X\mid X\sim Unif(0,\theta))=\frac{1}{2}\left(\frac{3\theta}{2}+2\theta-\frac{\theta}{2}\right)=\frac{3\theta}{2}$.

hence, $\mu=\frac{3\theta}{2}$.

Now, for the next part, since $E(X_i)=\theta$, one trivial unbiased estimator of $\theta$ is $T_n=\frac{1}{n}\sum_{i=1}^n X_i$ (based on the given sample). So,

$\frac{3T_n}{2}=\frac{3}{2n}\sum_{i=1}^n X_i$ is an obvious unbiased estimator of $\mu$.

For another, we need to look beyond the conventional route, at the order statistics, since we know that $X_{(n)}$ is sufficient for $\theta$ (don't know? Look up the Factorization Theorem).

So, verify that $E(X_{(n)})=\frac{2n}{n+1}\theta$.

Hence, $\frac{n+1}{2n}X_{(n)}$ is an unbiased estimator of $\theta$, so $\frac{3(n+1)}{4n}X_{(n)}$ is another unbiased estimator of $\mu$ as defined in (a).

Hence the solution concludes.
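Both estimators are easy to check by simulation; the sketch below (with my own arbitrary choices of $\theta$ and $n$) verifies that each averages out to $\mu = \frac{3\theta}{2}$:

```python
import random

random.seed(6)

# Both estimators above should have expectation mu = 3*theta/2.
theta = 2.0
n = 20
n_sims = 50_000

est1 = est2 = 0.0
for _ in range(n_sims):
    sample = [random.uniform(0, 2 * theta) for _ in range(n)]
    est1 += 1.5 * sum(sample) / n                 # (3/2) * sample mean
    est2 += 3 * (n + 1) / (4 * n) * max(sample)   # based on X_(n)
mean1, mean2 = est1 / n_sims, est2 / n_sims
print(mean1, mean2)  # both near 3*theta/2 = 3.0
```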

## Food For Thought

Let us think about an unpopular but very beautiful relationship between discrete random variables, besides the universality of the uniform. Let $X$ be a discrete random variable with cdf $F_X(x)$, and define the random variable $Y=F_X(X)$.

Can you verify that $Y$ is stochastically greater than a $uniform(0,1)$ random variable $U$? i.e.

$P(Y>y) \ge P(U>y)=1-y$ for all $y$, $0<y<1$,

$P(Y>y) > P(U>y) =1-y$, for some $y$, $0<y<1$.

Hint: Draw a typical picture of a discrete cdf, and observe the jump points ! you may jump to the solution!! Think it over.


## ISI MStat PSB 2012 Problem 10 | MVUE Revisited

This is a very simple sample problem from ISI MStat PSB 2012 Problem 10. It’s a very basic problem but very important and regular problem for statistics students, using one of the most beautiful theorem in Point Estimation. Try it!

## Problem– ISI MStat PSB 2012 Problem 10

Let $X_1,X_2,\ldots,X_{10}$ be i.i.d. Poisson random variables with unknown parameter $\lambda >0$. Find the minimum variance unbiased estimator of $\exp\{-2\lambda\}$.

### Prerequisites

Poisson Distribution

Minimum Variance Unbiased Estimators

Lehmann-Scheffé's Theorem

Completeness and Sufficiency

## Solution :

Well, this is a very straightforward problem, where we just need to verify certain conditions of sufficiency and completeness.

If one is aware of the nature of the Poisson distribution, one knows that for a given sample $X_1,X_2,\ldots,X_{10}$, the sufficient statistic for the unknown parameter $\lambda>0$ is $\sum_{i=1}^{10} X_i$; moreover, $\sum_{i}X_i$ is also complete for $\lambda$ (how?).

So, now first let us construct an unbiased estimator of $e^{-2\lambda}$. Here, we need to observe patterns as usual. Let us define an Indicator Random variable,

$I_X(x) = \begin{cases} 1 & X_1=0 \text{ and } X_2=0 \\ 0 & \text{otherwise} \end{cases}$

So, $E(I_X(x))=P(X_1=0, X_2=0)=e^{-2\lambda}$; hence $I_X(x)$ is an unbiased estimator of $e^{-2\lambda}$. But is it of minimum variance?

Well, Lehmann-Scheffé answers that: since $\sum X_i$ is complete and sufficient for $\lambda$, by the Lehmann-Scheffé theorem,

$E(I_X(x)\mid\sum X_i=t)$ is the minimum variance unbiased estimator of $e^{-2\lambda}$. So, we need to find the following:

$E\left(I_X(x)\,\Big|\sum_{i=1}^{10}X_i=t\right)= \frac{P\left(X_1=0,X_2=0,\ \sum_{i=3}^{10}X_i=t\right)}{P\left(\sum_{i=1}^{10}X_i=t\right)}=\frac{e^{-2\lambda}\,e^{-8\lambda}\frac{(8\lambda)^t}{t!}}{e^{-10\lambda}\frac{(10\lambda)^t}{t!}}=\left(\frac{8}{10}\right)^t$.

So, the Minimum Variance Unbiased Estimator of $\exp\{-2\lambda\}$ is $\left(\frac{8}{10}\right)^{\sum_{i=1}^{10}X_i}$.

Now, can you generalize this for a sample of size $n$? Again, what if I defined $I_X(x)$ as

$I_X(x) = \begin{cases} 1 & X_i=0 \text{ and } X_j=0 \\ 0 & \text{otherwise} \end{cases}$, for some $i \neq j$:

would it affect the end result? What do you think?
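Unbiasedness of $\left(\frac{8}{10}\right)^{\sum X_i}$ for $e^{-2\lambda}$ can be checked by simulation. The sketch below uses a hand-rolled Poisson sampler and an arbitrary choice of $\lambda$:

```python
import math
import random

random.seed(7)

# (8/10)^{sum of 10 Poisson(lambda) observations} should average out to
# e^{-2*lambda}.
lam = 0.7   # arbitrary true parameter

def poisson(rate):
    # simple inverse-cdf Poisson sampler, fine for small rates
    u, k = random.random(), 0
    p = math.exp(-rate)
    cum = p
    while u > cum:
        k += 1
        p *= rate / k
        cum += p
    return k

n_sims = 200_000
total = 0.0
for _ in range(n_sims):
    s10 = sum(poisson(lam) for _ in range(10))
    total += 0.8 ** s10
estimate = total / n_sims
print(estimate, math.exp(-2 * lam))  # should be close
```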

## Food For Thought

Let's not end our concern for the Poisson; think further. For the given sample, if the sample mean is $\bar{X}$ and the sample variance is $S^2$, can you show that $E(S^2\mid\bar{X})=\bar{X}$? Further, can you extend your deductions to show $Var(S^2) > Var(\bar{X})$?

Finally can you generalize the above result ?? Give some thoughts to deepen your insights on MVUE.


## ISI MStat PSB 2006 Problem 9 | Consistency and MVUE

This is a very simple sample problem from ISI MStat PSB 2006 Problem 9. It's based on point estimation: finding a consistent estimator and a minimum variance unbiased estimator, and recognizing the subtle relation between the two. Go for it!

## Problem– ISI MStat PSB 2006 Problem 9

Let $X_1,X_2,\ldots$ be i.i.d. random variables with density $f_{\theta}(x), \ x \in \mathbb{R}$, $\theta \in (0,1)$ being the unknown parameter. Suppose that there exists an unbiased estimator $T$ of $\theta$ based on sample size 1, i.e. $E_{\theta}(T(X_1))=\theta$. Assume that $Var(T(X_1))< \infty$.

(a) Find an estimator $V_n$ for $\theta$ based on $X_1,X_2,\ldots,X_n$ such that $V_n$ is consistent for $\theta$.

(b) Let $S_n$ be the MVUE (minimum variance unbiased estimator) of $\theta$ based on $X_1,X_2,\ldots,X_n$. Show that $\lim_{n\to\infty}Var(S_n)=0$.

### Prerequisites

Consistent estimators

Minimum Variance Unbiased Estimators

Rao-Blackwell Theorem

## Solution :

Often, problems on estimation seem a bit complicated and we feel directionless, but in most cases it is beneficial to go with the flow.

Here, it is given that $T$ is an unbiased estimator of $\theta$ based on one observation, and we are to find a consistent estimator for $\theta$ based on a sample of size $n$. First, what are the requirements for an estimator to be consistent?

• The estimator $V_n$ has to be asymptotically unbiased for $\theta$, i.e. $\lim_{n \uparrow \infty} E_{\theta}(V_n)=\theta$.
• The variance of the would-be consistent estimator must converge to 0 as $n$ grows large, i.e. $\lim_{n \uparrow \infty}Var_{\theta}(V_n)=0$.

First things first, let us fulfill the unbiasedness criterion for $V_n$. From each observation in the sample $X_1,X_2,\ldots,X_n$ of size $n$, we get a set of $n$ unbiased estimators of $\theta$: $T(X_1), T(X_2), \ldots, T(X_n)$. So, can we write $V_n=\frac{1}{n} \sum_{i=1}^n(T(X_i)+a)$, where $a$ is a constant (kept for generality; unbiasedness forces $a=0$)? Can you verify that $V_n$ satisfies the first requirement of being a consistent estimator?

Now, proceeding towards the final requirement, that the variance of $V_n$ converges to 0 as $n \uparrow \infty$: since we have defined $V_n$ in terms of $T$, and it is given that $Var(T(X_i))$ exists for $i \in \mathbb{N}$, and $X_1,X_2,\ldots,X_n$ are i.i.d. (a very important realization here), we get

$Var(V_n)= \frac{Var(T(X_1))}{n}$ (why?). So, clearly, $Var(V_n) \downarrow 0$ as $n \uparrow \infty$, fulfilling both required conditions. So, $V_n=\frac{1}{n}\sum_{i=1}^n T(X_i)$ is a consistent estimator for $\theta$.

(b) For this part one may also use the Rao-Blackwell theorem, but I prefer using as few formulas and theorems as possible, and here we can finish using the previous part. Since $S_n$ is the MVUE for $\theta$ and $V_n$ is an unbiased estimator of $\theta$, by the nature of the MVUE,

$Var(S_n) \le Var(V_n)$, so as $n$ gets bigger, $\lim_{ n \to \infty} Var(S_n) \le \lim_{n \to \infty} Var(V_n) \Rightarrow \lim_{n \to \infty}Var(S_n) \le 0$.

Again, $Var(S_n) \ge 0$, so $\lim_{n \to \infty }Var(S_n)= 0$. Hence, we conclude.

## Food For Thought

Let's extend this problem a little bit, just to increase the fun!

Let $X_1,\ldots,X_n$ be independent but not identically distributed, with $T(X_1),T(X_2),\ldots,T(X_n)$ still unbiased for $\theta$, $Var(T(X_i))= {\sigma_i}^2$, and

$Cov(T(X_i),T(X_j))=0$ if $i \neq j$.

Can you show that among all estimators of the form $\sum a_iT(X_i)$, where the $a_i$'s are constants with $E_{\theta}(\sum a_i T(X_i))=\theta$, the estimator

$T^*= \frac{\sum \frac{T(X_i)}{{\sigma_i}^2}}{\sum\frac{1}{{\sigma_i}^2}}$ has minimum variance?

Can you find the variance? Think it over!
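For a quick numeric illustration of why inverse-variance weights are the right guess (a check of one comparison, not a proof), note that $Var(\sum a_i T(X_i)) = \sum a_i^2 {\sigma_i}^2$ for uncorrelated terms; the sketch below (with arbitrary ${\sigma_i}^2$ values) compares them with equal weights:

```python
# Compare the variance of the inverse-variance weighted combination with
# the equal-weight combination; both weight vectors sum to 1, so both
# combinations are unbiased.
sigma2 = [1.0, 4.0, 9.0]   # Var(T(X_i)), chosen arbitrarily

w = [1 / s for s in sigma2]
a_star = [wi / sum(w) for wi in w]          # inverse-variance weights
a_equal = [1 / len(sigma2)] * len(sigma2)   # equal weights

var_star = sum(a * a * s for a, s in zip(a_star, sigma2))
var_equal = sum(a * a * s for a, s in zip(a_equal, sigma2))
print(var_star, var_equal)  # var_star is the smaller one
```

Note that $Var(T^*)$ collapses to $\left(\sum \frac{1}{{\sigma_i}^2}\right)^{-1}$, which the numbers above reproduce.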