Testing of Hypothesis | ISI MStat 2016 PSB Problem 9

This is a problem from the ISI MStat Entrance Examination, 2016 involving the basic idea of Type 1 error of Testing of Hypothesis but focussing on the fundamental relationship of Exponential Distribution and the Geometric Distribution.

The Problem:

Suppose \(X_{1}, X_{2}, \ldots, X_{n}\) is a random sample from an exponential distribution with mean \(\lambda\).

Assume that the observed data is available on \(\left[X_{1}\right], \ldots,\left[X_{n}\right]\), instead of \(X_{1}, \ldots, X_{n},\) where \([x]\) denotes the largest integer less than or equal to \(x\).

Consider a test for \(H_{0}: \lambda=1\) vs \(H_{1}: \lambda>1\) which rejects \(H_{0}\) when \(\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .\)

Given \(\alpha \in(0,1),\) obtain values of \(c_{n}\) such that the size of the test converges to \(\alpha\) as \(n \rightarrow \infty\).

Prerequisites:

(a) Testing of Hypothesis

(b) Type 1 Error

(c) Exponential Distribution

(d) Relationship of Exponential Distribution and Geometric Distribution

(e) Central Limit Theorem

Solution:

  • If X ~ Exponential with rate \(\lambda\) (density \(\lambda e^{-\lambda x}\), \(x>0\)) and \(a>0\), then \(Y = [\frac{X}{a}]\) ~ Geom(\(p\)) on \(\{0,1,2,\ldots\}\), where \( p = 1-e^{-\lambda a} \in(0,1) \)

Proof:

\(Y\) is clearly discrete, taking values in the set of non-negative integers because of the flooring. Then, for any integer \(n \geq 0\) we have
\(
\begin{array}{c}
P(Y=n)=P(X \in[a n, a(n+1))) \\
=\int_{a n}^{a(n+1)} \lambda \mathrm{e}^{-\lambda x} d x=e^{-\lambda a n}\left(1-e^{-\lambda a}\right)=(1-p)^{n} p
\end{array}
\)
where \(p=1-e^{-\lambda a} \in(0,1),\) as \(\lambda>0\) and \(a>0\).
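A quick simulation can make the flooring result believable. The sketch below is only an illustration: the rate `1.5`, the width `a = 0.4`, the seed and the helper name `floor_exponential_pmf` are all assumptions chosen for the demo, not part of the problem.

```python
import math
import random


def floor_exponential_pmf(rate, a, trials, rng):
    """Empirical pmf of floor(X / a) where X ~ Exponential with the given rate."""
    counts = {}
    for _ in range(trials):
        # random.expovariate takes the rate (1 / mean) as its argument
        y = math.floor(rng.expovariate(rate) / a)
        counts[y] = counts.get(y, 0) + 1
    return {k: v / trials for k, v in counts.items()}


rate, a = 1.5, 0.4                      # illustrative values only
p = 1 - math.exp(-rate * a)             # success probability from the proof
empirical = floor_exponential_pmf(rate, a, 200_000, random.Random(0))

for n in range(4):
    theoretical = (1 - p) ** n * p      # Geom(p) pmf on {0, 1, 2, ...}
    print(n, round(empirical.get(n, 0.0), 4), round(theoretical, 4))
```

The two columns should agree up to simulation noise.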

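  • If \(Y_1, \ldots, Y_n\) are i.i.d. Geom(\(p\)), then \(\sum_{i = 1}^{n} Y_i\) ~ NBinom(\(n,p\))
  • If \(X_1, \ldots, X_n\) are i.i.d. exponential with mean \(\lambda\) (that is, rate \(\frac{1}{\lambda}\)), then \(S_n = \sum_{i=1}^{n}\left[X_{i}\right]\) ~ NBinom(\(n,p\)), where \( p = 1-e^{-1/\lambda} \in(0,1) \). Under \(\lambda = 1\) this gives \(p = 1-e^{-1}\).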

Testing of Hypothesis

\(H_{0}: \lambda=1\) vs \(H_{1}: \lambda>1\)

We reject \(H_{0}\) when \(\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .\)

Here, the size of the test, i.e. the probability of Type 1 error (since \(H_0\) is a simple hypothesis), is \( \alpha_n\) = \( P(S_n > c_{n} | \lambda=1)\).

We want to select \(c_n\) such that \(\alpha_n \to \alpha\).

\(S_n\) ~ NBinom(\(n,p\)), where \( p = 1-e^{-1} \) under \(H_0\).

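Each \([X_i]\) is Geom(\(p\)) on \(\{0,1,2,\ldots\}\), so it has mean \(\frac{1-p}{p}\) and variance \(\frac{1-p}{p^2}\).

Now, \(\frac{\sqrt{n}(\frac{S_n}{n} - \frac{1-p}{p})}{\sqrt{\frac{1-p}{p^2}}} \rightarrow Z \sim N(0,1)\) in distribution, by the Central Limit Theorem.

Thus, if we choose \(c_n\) such that \( \frac{\sqrt{n}(\frac{c_n}{n} - \frac{1-p}{p})}{\sqrt{\frac{1-p}{p^2}}} = Z_{\alpha} \), where \(Z_{\alpha}\) is the upper \(\alpha\) point of \(N(0,1)\), then \( \alpha_n = P(S_n > c_{n} | \lambda=1) \rightarrow P(Z > Z_{\alpha}) = \alpha\).

Solving this for \(c_n\) gives \(c_n = \frac{n(1-p)}{p} + Z_{\alpha}\frac{\sqrt{n(1-p)}}{p}\). With \( p = 1-e^{-1} \), this is \(c_n = \frac{n}{e-1} + Z_{\alpha}\frac{\sqrt{n e}}{e-1}\).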
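To see how well such a choice of \(c_n\) works for finite \(n\), here is a minimal sketch. It assumes the geometric variable on \(\{0,1,2,\ldots\}\) has mean \(\frac{1-p}{p}\) and variance \(\frac{1-p}{p^2}\), so that \(c_n = \frac{n}{e-1} + Z_{\alpha}\frac{\sqrt{ne}}{e-1}\) under \(H_0\). The function names `critical_value` and `exact_size` are made up for this illustration.

```python
import math
from statistics import NormalDist


def critical_value(n, alpha):
    """c_n = n/(e-1) + z_alpha * sqrt(n e)/(e-1), with z_alpha the upper alpha point."""
    z = NormalDist().inv_cdf(1 - alpha)
    return n / (math.e - 1) + z * math.sqrt(n * math.e) / (math.e - 1)


def exact_size(n, c, p=1 - math.exp(-1)):
    """P(S_n > c) for S_n ~ NBinom(n, p), counting failures before the n-th success."""
    def log_pmf(k):
        # log of C(n + k - 1, k) * p^n * (1 - p)^k
        return (math.lgamma(n + k) - math.lgamma(k + 1) - math.lgamma(n)
                + n * math.log(p) + k * math.log(1 - p))
    cdf = sum(math.exp(log_pmf(k)) for k in range(math.floor(c) + 1))
    return 1 - cdf


alpha = 0.05
for n in (50, 500, 5000):
    c = critical_value(n, alpha)
    print(n, round(c, 2), round(exact_size(n, c), 4))
```

The exact size should drift towards \(\alpha\) as \(n\) grows; for small \(n\) the discreteness and skewness of the negative binomial are still visible.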

Food for Thought

If X ~ Exponential(\(\lambda\)), then what is the distribution of {X} [the fractional part of X]? This question is crucial in getting back the Exponential Distribution from the Geometric Distribution.

In other words, the food for thought asks you how we get the Exponential Distribution from the Geometric Distribution.

Stay Tuned. Stay Blessed! See you in the next post.

ISI MStat PSB 2006 Problem 8 | Bernoullian Beauty

This is a very beautiful sample problem from ISI MStat PSB 2006 Problem 8. It is based on basic idea of Maximum Likelihood Estimators, but with a bit of thinking. Give it a thought !

Problem- ISI MStat PSB 2006 Problem 8


Let \((X_1,Y_1),......,(X_n,Y_n)\) be a random sample from the discrete distributions with joint probability

\(f_{X,Y}(x,y) = \begin{cases} \frac{\theta}{4} & (x,y)=(0,0) \ and \ (1,1) \\ \frac{2-\theta}{4} & (x,y)=(0,1) \ and \ (1,0) \end{cases}\)

with \(0 \le \theta \le 2\). Find the maximum likelihood estimator of \(\theta\).

Prerequisites


Maximum Likelihood Estimators

Indicator Random Variables

Bernoulli Trials

Solution :

This is a very beautiful problem, not very difficult, but her beauty is hidden in her simplicity. Let's explore!!

Observe that the given pmf, as it stands, does not take us anywhere, so we should think out of the box. But before going out of the box, let's collect what's in the box!

So, from the given pmf we get, \(P(\text{pair of the form } (1,1) \text{ or } (0,0))=2\times \frac{\theta}{4}=\frac{\theta}{2}\),

Similarly, \(P(\text{pair of the form } (0,1) \text{ or } (1,0))=2\times \frac{2-\theta}{4}=\frac{2-\theta}{2}=1-P(\text{pair of the form } (1,1) \text{ or } (0,0))\)

So, clearly it is giving us a push towards involving Bernoulli trials, isn't it!!

So, let's treat the pairs with a match, i.e. \(x=y\), as our success, and the other possibilities as failure. Then our success probability is \(\frac{\theta}{2}\), where \(0\le \theta \le 2\). So, if \(S\) is the number of successful pairs in our given sample of size \(n\), then it is evident that \(S \sim Binomial(n, \frac{\theta}{2})\).

So, now it's simplified by all means, and we know that the MLE of the population proportion in a binomial is the proportion of successes in the sample.

Hence, \(\frac{\hat{\theta}_{MLE}}{2}= \frac{s}{n}\), where \(s\) is the number of those pairs in our sample where \(X_i=Y_i\).

So, \(\hat{\theta}_{MLE}=\frac{2 \times (\text{number of pairs in the sample of the form } (0,0) \text{ or } (1,1))}{n}\).
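The estimator is easy to check numerically. The sketch below computes it for a small made-up sample and compares it with a brute-force grid maximisation of the log-likelihood \(s \ln\frac{\theta}{4} + (n-s)\ln\frac{2-\theta}{4}\). The sample `pairs` and the function names are hypothetical, chosen only for illustration.

```python
import math


def theta_mle(pairs):
    """Closed-form MLE: twice the fraction of pairs with x == y."""
    matches = sum(1 for x, y in pairs if x == y)
    return 2 * matches / len(pairs)


def log_likelihood(theta, pairs):
    """Log-likelihood of theta in (0, 2) under the given joint pmf."""
    matches = sum(1 for x, y in pairs if x == y)
    mismatches = len(pairs) - matches
    return matches * math.log(theta / 4) + mismatches * math.log((2 - theta) / 4)


# A made-up sample of 8 pairs, 5 of which match.
pairs = [(0, 0), (1, 1), (0, 1), (1, 0), (1, 1), (0, 0), (1, 1), (0, 1)]

grid = [i / 1000 for i in range(1, 2000)]          # theta values in (0, 2)
best_on_grid = max(grid, key=lambda t: log_likelihood(t, pairs))

print(theta_mle(pairs), best_on_grid)
```

Both numbers should coincide (up to the grid resolution), which is a useful sanity check on the Bernoulli reduction.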

Hence, we are done !!


Food For Thought

Say, \(X\) and \(Y\) are two independent exponential random variables with means \(\mu\) and \(\lambda\) respectively. But you observe two other variables, \(Z\) and \(W\), such that \(Z=min(X,Y)\) and \(W\) takes the value \(1\) when \(Z=X\) and \(0\) otherwise. Can you find the MLEs of the parameters?

Give it a try !!


ISI MStat PSB 2008 Problem 10
Outstanding Statistics Program with Applications

Subscribe to Cheenta at Youtube


How to roll a Dice by tossing a Coin ? Cheenta Statistics Department

How can you roll a dice by tossing a coin? Can you use your probability knowledge? Use your conditioning skills.

Suppose, you have gone to a picnic with your friends. You have planned to play the physical version of the Snake and Ladder game. You found out that you have lost your dice.

The shit just became real!

Now, you have an unbiased coin in your wallet / purse. You know Probability.

Aapna Time Aayega

starts playing in the background. :p

Can you simulate the dice from the coin?

Of course, you know chances better than others. :3

Take a coin.

Toss it 3 times. Record the outcomes.

HHH = Number 1

HHT = Number 2

HTH = Number 3

HTT = Number 4

THH = Number 5

THT = Number 6

TTH = Reject it, don't count the toss and toss again

TTT = Reject it, don't count the toss and toss again

Voila done!

What is the probability of HHH in this experiment?

Let X be the outcome in the restricted experiment as shown, and let Y be the outcome of three tosses with no rejection.

How is this experiment different from the actual experiment?

This experiment is conditioning on the event A = {HHH, HHT, HTH, HTT, THH, THT}.

\(P( X = HHH) = P (Y = HHH | Y \in A ) = \frac{P (Y = HHH)}{P (Y \in A)} = \frac{1/8}{6/8} = \frac{1}{6}\)
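The procedure above is a small instance of rejection sampling, and it can be sketched in a few lines. The code below is one possible implementation, not the only one: it encodes H as 0 and T as 1, reads the three tosses as a binary number, and rejects the values corresponding to TTH and TTT. The function name `roll_die_with_coin` and the seed are assumptions for the demo.

```python
import random
from collections import Counter


def roll_die_with_coin(rng):
    """Simulate one fair die roll from fair coin tosses, retrying on TTH / TTT."""
    while True:
        value = 0
        for _ in range(3):
            value = 2 * value + rng.randint(0, 1)   # 0 = Heads, 1 = Tails
        # HHH..THT map to 0..5; TTH and TTT map to 6 and 7 and are rejected.
        if value < 6:
            return value + 1


rng = random.Random(2024)
rolls = Counter(roll_die_with_coin(rng) for _ in range(60_000))
for face in range(1, 7):
    print(face, rolls[face] / 60_000)
```

Each relative frequency should be close to \(\frac{1}{6}\). On average the loop needs \(\frac{8}{6}\) rounds of three tosses per roll, since a round is accepted with probability \(\frac{6}{8}\).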


Beautiful right?

Can you generalize this idea?

Food for thought

  • Give an algorithm to simulate any conditional probability.
  • Give an algorithm to simulate any event with probability \(\frac{m}{2^k}\), where \( m \leq 2^k \).
  • Give an algorithm to simulate any event with probability \(\frac{m}{2^k}\), where \( n \leq 2^k \).
  • Give an algorithm to simulate any event with probability \(\frac{m}{n}\), where \( m \leq n \leq 2^k \) using conditional probability.


ISI MStat PSB 2009 Problem 8 | How big is the Mean?

This is a very simple and regular sample problem from ISI MStat PSB 2009 Problem 8. It is based on testing the nature of the mean of the Exponential distribution. Give it a try!

Problem- ISI MStat PSB 2009 Problem 8


Let \(X_1,.....,X_n\) be i.i.d. observation from the density,

\(f(x)=\frac{1}{\mu}exp(-\frac{x}{\mu}) , x>0\)

where \(\mu >0\) is an unknown parameter.

Consider the problem of testing the hypothesis \(H_o : \mu \le \mu_o\) against \(H_1 : \mu > \mu_o\).

(a) Show that the test with critical region \([\bar{X} \ge \mu_o {\chi_{2n,1-\alpha}}^2/2n]\), where \( {\chi^2}_{2n,1-\alpha} \) is the \((1-\alpha)\)th quantile of the \({\chi^2}_{2n}\) distribution, has size \(\alpha\).

(b) Give an expression of the power in terms of the c.d.f. of the \({\chi^2}_{2n}\) distribution.

Prerequisites


Likelihood Ratio Test

Exponential Distribution

Chi-squared Distribution

Solution :

This problem is quite regular and simple. From the given form of the hypotheses, it is almost clear that using Neyman-Pearson directly can land you in trouble, since both hypotheses are composite. So, let's go for something more general, that is, Likelihood Ratio Testing.

Hence, the Likelihood function of the \(\mu\) for the given sample is ,

\(L(\mu | \vec{X})=(\frac{1}{\mu})^n exp(-\frac{\sum_{i=1}^n X_i}{\mu}) , \mu>0\), also observe that the sample mean \(\bar{X}\) is the MLE of \(\mu\).

So, the Likelihood Ratio statistic is,

\(\lambda(\vec{x})=\frac{\sup_{\mu \le \mu_o}L(\mu |\vec{x})}{\sup_\mu L(\mu |\vec{x})} \\ =\begin{cases} 1 & \mu_o \ge \bar{X} \\ \frac{L(\mu_o|\vec{x})}{L(\bar{X}|\vec{x})} & \mu_o < \bar{X} \end{cases} \)

So, our test function is ,

\(\phi(\vec{x})=\begin{cases} 1 & \lambda(\vec{x})<k \\ 0 & otherwise \end{cases}\).

We reject \(H_o\) at size \(\alpha\) when \(\phi(\vec{x})=1\), where \(k\) is chosen such that \(E_{H_o}(\phi) \le \alpha\).

Now, for \(\bar{X} > \mu_o\), \(\lambda(\vec{x}) < k \\ \Rightarrow L(\mu_o|\vec{x})<kL(\bar{X}|\vec{x}) \\ \Rightarrow -n\ln \mu_o -\frac{1}{\mu_o}\sum_{i=1}^n X_i < \ln k -n \ln \bar{X} -n \\ \Rightarrow n \ln \bar{X}-\frac{n\bar{X}}{\mu_o} < K^* \),

for some constant \(K^* = \ln k - n + n\ln \mu_o\). Here, \(K^*, \mu_o\) are fixed quantities.

Let \(g(\bar{x})=n\ln \bar{x} -\frac{n\bar{x}}{\mu_o}\), and observe that \(g'(\bar{x}) = \frac{n}{\bar{x}} - \frac{n}{\mu_o} \le 0\) for \(\bar{x} \ge \mu_o\), so \(g\) is a decreasing function of \(\bar{x}\) for \(\bar{x} \ge \mu_o\).

Hence, there exists a \(c\) such that for \(\bar{x} \ge c \) we have \(g(\bar{x}) < K^*\).

So, the critical region of the test is of form \(\bar{X} \ge c\), for some \(c\) such that,

\(P_{H_o}(\bar{X} \ge c)=\alpha \), for some \(0 \le \alpha \le 1\), where \(\alpha\) is the size of the test.

Now, our task is to find \(c\), and for that observe, if \(X \sim Exponential(\theta)\), then \(\frac{2X}{\theta} \sim {\chi^2}_2\),

Hence, in this problem, since the \(X_i\)'s follows \(Exponential(\mu)\), hence, \(\frac{2n\bar{X}}{\mu} \sim {\chi^2}_{2n}\), we have,

\(P_{H_o}(\bar{X} \ge c)=\alpha \\ P_{H_o}(\frac{2n\bar{X}}{\mu_o} \ge \frac{2nc}{\mu_o})=\alpha \\ P_{H_o}({\chi^2}_{2n} \ge \frac{2nc}{\mu_o})=\alpha \),

which gives \(c=\frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}\),

Hence, the rejection region is indeed \([\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}]\).

Hence Proved !

(b) Now, we know that the power of the test is,

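\(\beta(\mu)= E_{\mu}(\phi) \\ = P_{\mu}(\lambda(\vec{x})<k)=P_{\mu}(\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}) \\ = P_{\mu}(\frac{2n\bar{X}}{\mu} \ge \frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}) = 1 - F_{2n}(\frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}) \),

where \(F_{2n}\) is the c.d.f. of the \({\chi^2}_{2n}\) distribution, since \(\frac{2n\bar{X}}{\mu} \sim {\chi^2}_{2n}\) when the true mean is \(\mu\).

Hence, the power of the test is expressed through the cdf of the chi-squared distribution.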
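The power expression can be evaluated without any statistical library, because a chi-squared variable with an even number of degrees of freedom has a closed-form tail: \(P({\chi^2}_{2n} > x) = e^{-x/2}\sum_{k=0}^{n-1}\frac{(x/2)^k}{k!}\). The sketch below uses that identity and a simple bisection for the quantile. All function names and the values \(n = 10\), \(\mu_o = 1\), \(\alpha = 0.05\) are assumptions made for the illustration.

```python
import math


def chi2_sf_even(x, n):
    """P(chi-square with 2n degrees of freedom > x), via the Poisson-sum identity."""
    if x <= 0:
        return 1.0
    term = total = 1.0                      # k = 0 term of the sum
    for k in range(1, n):
        term *= (x / 2) / k
        total += term
    return math.exp(-x / 2) * total


def chi2_upper_point(alpha, n):
    """The (1 - alpha)-th quantile of chi-square(2n), found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, n) > alpha:      # expand until the tail is below alpha
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, n) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def power(mu, mu0, alpha, n):
    """beta(mu) = P(chi-square(2n) >= (mu0 / mu) * quantile)."""
    return chi2_sf_even(mu0 / mu * chi2_upper_point(alpha, n), n)


for mu in (0.8, 1.0, 1.5, 2.0):
    print(mu, round(power(mu, mu0=1.0, alpha=0.05, n=10), 4))
```

At \(\mu = \mu_o\) the power equals \(\alpha\), and it increases with \(\mu\), which is also why the size over the whole null \(\mu \le \mu_o\) is attained at \(\mu_o\).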


Food For Thought

Can you use any other testing procedure to conduct this test ?

Think about it !!




ISI MStat PSB 2009 Problem 4 | Polarized to Normal

This is a very beautiful sample problem from ISI MStat PSB 2009 Problem 4. It is based on the idea of Polar Transformations, but needs a good deal of observation to realize that. Give it a try!

Problem- ISI MStat PSB 2009 Problem 4


Let \(R\) and \(\theta\) be independent and non-negative random variables such that \(R^2 \sim {\chi_2}^2 \) and \(\theta \sim U(0,2\pi)\). Fix \(\theta_o \in (0,2\pi)\). Find the distribution of \(R\sin(\theta+\theta_o)\).

Prerequisites


Convolution

Polar Transformation

Normal Distribution

Solution :

This problem may get nasty if one tries to find the required distribution by the so-called CDF method. It's better to observe a bit before moving forward!! Recall how we derive the probability distribution of the sample variance of a sample from a normal population?

Yes, you are thinking right, we need to use the Polar Transformation!!

But before transforming, let's make some modifications to reduce future complications,

Given, \(\theta \sim U(0,2\pi)\) and \(\theta_o \) is some fixed number in \((0,2\pi)\), so, let \(Z=\theta+\theta_o \sim U(\theta_o,2\pi +\theta_o)\).

Hence, we need to find the distribution of \(R\sin Z\). Since \(R^2 \sim {\chi_2}^2\), \(R\) has density \(r \, exp(-\frac{r^2}{2})\) for \(r>0\), and \(Z\) has density \(\frac{1}{2\pi}\) on its range. So, by independence, the joint pdf of \(R\) and \(Z\) is,

\(f_{R,Z}(r,z)=\frac{r}{2\pi}exp(-\frac{r^2}{2}) \ \ r>0, \theta_o \le z \le 2\pi +\theta_o \)

Now, let the transformation be \((R,Z) \to (X,Y)\),

\(X=R\cos Z \\ Y=R\sin Z\), Also, here \(X,Y \in \mathbb{R}\)

Hence, \(R^2=X^2+Y^2 \\ Z= \tan^{-1} (\frac{Y}{X}) \)

Hence, verify the Jacobian of the transformation \(J(\frac{r,z}{x,y})=\frac{1}{r}\).

Hence, the joint pdf of \(X\) and \(Y\) is,

\(f_{X,Y}(x,y)=f_{R,Z}(\sqrt{x^2+y^2}, \tan^{-1}(\frac{y}{x})) J(\frac{r,z}{x,y}) \\ =\frac{\sqrt{x^2+y^2}}{2\pi}exp(-\frac{x^2+y^2}{2}) \cdot \frac{1}{\sqrt{x^2+y^2}} \\ =\frac{1}{2\pi}exp(-\frac{x^2+y^2}{2})\) , \(x,y \in \mathbb{R}\).

Yeah, Now it is looking familiar right !!

Since we need the distribution of \(Y=R\sin Z=R\sin(\theta+\theta_o)\), we integrate \(f_{X,Y}\) w.r.t. \(x\) over the real line. The joint pdf factorizes into two standard normal densities, so we end up with the conclusion that,

\(R\sin(\theta+\theta_o) \sim N(0,1)\). Hence, We are done !!
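A Monte Carlo check is a cheap way to gain confidence in this conclusion. The sketch below draws \(R^2\) as an exponential variable with mean 2 (which is the \({\chi_2}^2\) distribution) and compares the sample mean, the sample variance and one cdf value with those of \(N(0,1)\). The value \(\theta_o = 0.7\), the seed and the name `sample_r_sin` are arbitrary choices for the demo.

```python
import math
import random
import statistics


def sample_r_sin(theta0, rng):
    """One draw of R * sin(theta + theta0) with R^2 ~ chi-square(2), theta ~ U(0, 2*pi)."""
    r = math.sqrt(rng.expovariate(0.5))     # rate 0.5 means mean 2, i.e. chi-square(2)
    theta = rng.uniform(0.0, 2 * math.pi)
    return r * math.sin(theta + theta0)


rng = random.Random(1)
draws = [sample_r_sin(0.7, rng) for _ in range(100_000)]

mean = statistics.fmean(draws)
variance = statistics.pvariance(draws)
frac_below_one = sum(d <= 1.0 for d in draws) / len(draws)

print(round(mean, 3), round(variance, 3), round(frac_below_one, 3))
print(round(statistics.NormalDist().cdf(1.0), 3))   # standard normal cdf at 1
```

The first line should be close to 0, 1 and the value printed on the second line, whatever fixed \(\theta_o\) is used.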


Food For Thought

From the above solution, the distribution of \(R\cos(\theta+\theta_o)\) is also determinable right !! Can you go further investigating the occurrence pattern of \(\tan(\theta+\theta_o)\) ?? \(R\) and \(\theta\) are the same variables as defined in the question.

Give it a try !!




ISI MStat PSB 2008 Problem 7 | Finding the Distribution of a Random Variable

This is a very beautiful sample problem from ISI MStat PSB 2008 Problem 7 based on finding the distribution of a random variable . Let's give it a try !!

Problem- ISI MStat PSB 2008 Problem 7


Let \( X\) and \( Y\) be exponential random variables with parameters 1 and 2 respectively. Another random variable \( Z\) is defined as follows.

A coin, with probability p of Heads (and probability 1-p of Tails) is
tossed. Define \( Z\) by \( Z=\begin{cases} X & , \text { if the coin turns Heads } \\ Y & , \text { if the coin turns Tails } \end{cases} \)
Find \( P(1 \leq Z \leq 2)\)

Prerequisites


Cumulative Distribution Function

Exponential Distribution

Solution :

Let \( F_{i} \) be the CDF of \(i\), for \(i = X, Y, Z\). Assuming the coin is tossed independently of \(X\) and \(Y\), we have,

\( F_{Z}(z) = P(Z \le z) = P( Z \le z | \text{coin turns Heads} )P(\text{coin turns Heads}) + P( Z \le z | \text{coin turns Tails} ) P(\text{coin turns Tails}) \)

=\( P( X \le z)p + P(Y \le z ) (1-p) \) = \( F_{X}(z)p+F_{Y}(z) (1-p) \)

Therefore the pdf of Z is given by \( f_{Z}(z)= pf_{X}(z)+(1-p)f_{Y}(z) \), where \( f_{X} \) and \( f_{Y} \) are the pdfs of X and Y respectively.

So , \( P(1 \leq Z \leq 2) = \int_{1}^{2} \{pe^{-z} + (1-p) 2e^{-2z}\} dz = p \frac{e-1}{e^2} +(1-p) \frac{e^2-1}{e^4} \)
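The closed form can be compared against a direct simulation of the coin-and-exponential mechanism. The sketch below reads "parameters 1 and 2" as rates, which matches the densities \(e^{-z}\) and \(2e^{-2z}\) used in the integral. The value \(p = 0.3\), the seed and the function names are illustrative assumptions.

```python
import math
import random


def prob_closed_form(p):
    """p (e - 1)/e^2 + (1 - p) (e^2 - 1)/e^4, the value derived above."""
    return p * (math.e - 1) / math.e ** 2 + (1 - p) * (math.e ** 2 - 1) / math.e ** 4


def prob_monte_carlo(p, trials, rng):
    """Estimate P(1 <= Z <= 2) by simulating the coin toss and the chosen exponential."""
    hits = 0
    for _ in range(trials):
        heads = rng.random() < p
        z = rng.expovariate(1.0) if heads else rng.expovariate(2.0)
        hits += 1 <= z <= 2
    return hits / trials


p = 0.3                                     # illustrative value of P(Heads)
exact = prob_closed_form(p)
estimate = prob_monte_carlo(p, 200_000, random.Random(7))
print(round(exact, 4), round(estimate, 4))
```

The two printed numbers should agree up to simulation noise for any \(p\) in \([0,1]\).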

Food For Thought

Find the distribution function of \( K=\frac{X}{Y} \) and then find \( \lim_{K \to \infty} P(K >1 ) \)




ISI MStat PSB 2012 Problem 5 | Application of Central Limit Theorem

This is a very beautiful sample problem from ISI MStat PSB 2012 Problem 5 based on central limit theorem . Let's give it a try !!

Problem- ISI MStat PSB 2012 Problem 5


Let \( X_{1}, X_{2}, \ldots, X_{j} \ldots \) be i.i.d. \(N(0,1)\) random variables. Show that for any \(a>0\)

\( \lim_{n \rightarrow \infty} P(\sum_{i=1}^{n} {X_i}^2 \leq a) = 0 \)

Prerequisites


Limit

Central Limit Theorem

Normal Distribution

Chi- Square Distribution

Solution :

\( X_{1}, X_{2}, \ldots, X_{j} \ldots \) are i.i.d. \(N(0,1)\) random variables .

Let \( S_n = \sum_{i=1}^{n} {X_i}^2 \) , then \( S_n \sim \chi^{2}(n) \) , where \( \chi^{2}(n) \) is Chi-Square distribution with n degrees of freedom .

Therefore , \( E(S_n)= n \) and \( Var(S_n)=2n \) .

In this type of problem, the obvious thing that comes to mind is to apply the Central Limit Theorem, right! Let's try to apply it.

Now, by the Lindeberg-Levy Central Limit Theorem, we can say \( \frac{S_n-E(S_n)}{\sqrt{Var(S_n)}} \) = \( \frac{S_n-n}{\sqrt{2n}} {\to }^{d} N(0,1) \), as n approaches infinity. Since \( \Phi \) is continuous, the distribution functions converge to \( \Phi \) uniformly (Polya's theorem), so the limit may be evaluated along the moving point \( \frac{a-n}{\sqrt{2n}} \).

So, \( \lim_{n \rightarrow \infty} P(\sum_{i=1}^{n} {X_i}^2 \leq a) \)

= \( \lim_{n \rightarrow \infty} P( \frac{S_n-n}{\sqrt{2n}} \le \frac{a-n}{\sqrt{2n}} ) \)

= \( \lim_{n \rightarrow \infty} \Phi(\frac{a-n}{\sqrt{2n}}) \)

= \( \lim_{n \rightarrow \infty} \Phi(\frac{a}{\sqrt{2n}}- \sqrt{\frac{n}{2}}) = 0 \), since \( \frac{a}{\sqrt{2n}}- \sqrt{\frac{n}{2}} \to -\infty \) and \( \Phi(x) \to 0 \) as \( x \to -\infty \).

Hence Proved .
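The limit can also be watched numerically. For an even number of degrees of freedom \(n = 2m\), the chi-squared cdf is an exact Poisson tail: \(P(\chi^2(2m) \le a) = P(\text{Poisson}(a/2) \ge m)\). The sketch below uses this identity with \(a = 10\). The function name `chi2_cdf_even` and the chosen values of \(m\) are assumptions for the illustration, and only even \(n\) is covered.

```python
import math


def chi2_cdf_even(a, m):
    """P(chi-square with 2m degrees of freedom <= a) = P(Poisson(a/2) >= m)."""
    lam = a / 2
    term = math.exp(-lam)                   # Poisson pmf at k = 0
    for k in range(1, m + 1):
        term *= lam / k                     # after the loop: pmf at k = m
    total, k = 0.0, m
    # Add pmf terms from k = m upwards until they are negligible.
    while k <= lam or term > 1e-17 * total:
        total += term
        k += 1
        term *= lam / k
    return total


a = 10.0
values = [chi2_cdf_even(a, m) for m in (1, 2, 5, 10, 20, 40)]
for m, v in zip((1, 2, 5, 10, 20, 40), values):
    print(2 * m, v)
```

The printed probabilities should shrink rapidly towards 0 as the degrees of freedom grow, in line with the proof.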


Food For Thought

Let \( \{X_{i}: i \geq 1 \}\) be a sequence of independent random variables each having a normal distribution with mean 2 and variance 5. Then \( (\frac{1}{n} \sum_{i=1}^{n} X_{i})^{2} \) converges in probability to ?




ISI MStat PSA 2019 Problem 18 | Probability and Digits

This problem is a very easy and cute problem of probability from ISI MStat PSA 2019 Problem 18.

Probability and Digits - ISI MStat Year 2019 PSA Problem 18


Draw one observation \(N\) at random from the set \(\{1,2, \ldots, 100\}\). What is the probability that the last digit of \(N^{2}\) is \(1\)?

  • \(\frac{1}{20}\)
  • \(\frac{1}{50}\)
  • \(\frac{1}{10}\)
  • \(\frac{1}{5}\)

Prerequisites


Last Digit of Natural Numbers

Basic Probability Theory

Combinatorics

Check the Answer


Answer: \(\frac{1}{5}\)

ISI MStat 2019 PSA Problem Number 18

A First Course in Probability by Sheldon Ross

Try with Hints


Try to formulate the sample space. Observe that the sample space is not dependent on the number itself rather only on the last digits of the number \(N\).

Also, observe that the integers in \(\{1,2, \ldots, 100\}\) are uniformly distributed over the last digits: each of the digits \(0,1, \ldots, 9\) occurs as a last digit exactly \(10\) times. So the sample space can be taken as \(\{0,1,2, \ldots, 9\}\). So, the number of elements in the sample space is \(10\).

See the Food for Thought!

This step is easy.

Find out the cases for which \(N^2\) gives 1 as the last digit. Use the reduced last digit sample space.

  • 1 x 1
  • 3 x 7 (rejected: \(N^2 = N \times N\), so both factors must have the same last digit)
  • 7 x 3 (rejected for the same reason)
  • 9 x 9

So, there are 2 favourable last digits (1 and 9) out of 10.

Therefore the probability = \( \frac{2}{10} = \frac{1}{5}\).
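Since the set has only 100 elements, the answer can also be confirmed by brute force. The short sketch below counts the favourable values of \(N\) directly; the variable names are of course arbitrary.

```python
from fractions import Fraction

# All N in {1, ..., 100} whose square ends in the digit 1.
favourable = [n for n in range(1, 101) if (n * n) % 10 == 1]

probability = Fraction(len(favourable), 100)
last_digits = sorted({n % 10 for n in favourable})

print(len(favourable), probability, last_digits)
```

The favourable numbers are exactly those ending in 1 or 9, which agrees with the reduced sample space argument.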

Food for Thought

  • Observe that there is a little bit of handwaving in the First Step. Please make it more precise using the ideas of Probability that it is okay to use the sample space as the reduced version rather than \(\{1,2, \ldots, 100\}\).
  • Generalize the problem for \(\{1,2, \ldots, n\}\).
  • Generalize the problems for \(N^k\) for selecting an observation from \(\{1,2, \ldots, n\}\).
  • Generalize the problems for \(N^k\) for selecting an observation from \(\{1,2, \ldots, n\}\) for each of the digits from \(\{0,1,2, \ldots, 9\}\).