# ISI MStat PSB 2014 Problem 9 | Hypothesis Testing

This is another beautiful sample problem, ISI MStat PSB 2014 Problem 9. It is based on testing a simple hypothesis, but it reveals and uses a very cute property of the Geometric distribution, which I like to call a sister of the loss-of-memory property. Give it a try!

## Problem- ISI MStat PSB 2014 Problem 9

Let $$X_1 \sim Geo(p_1)$$ and $$X_2 \sim Geo(p_2)$$ be independent random variables, where Geo(p) refers to the Geometric distribution whose p.m.f. f is given by,

$$f(k)=p(1-p)^k, \quad k=0,1,2,\ldots$$

We are interested in testing the null hypothesis $$H_o : p_1=p_2$$ against the alternative $$H_1: p_1<p_2$$. Intuitively, it is clear that we should reject if $$X_1$$ is large, but unfortunately, we cannot compute the cut-off because the distribution of $$X_1$$ under $$H_o$$ depends on the unknown (common) value of $$p_1$$ and $$p_2$$.

(a) Let $$Y= X_1 +X_2$$. Find the conditional distribution of $$X_1|Y=y$$ when $$p_1=p_2$$.

(b) Based on the result obtained in (a), derive a level 0.05 test for $$H_o$$ against $$H_1$$ that rejects $$H_o$$ when $$X_1$$ is large.

### Prerequisites

Geometric Distribution.

Negative binomial distribution.

Discrete Uniform distribution.

Conditional Distribution.

Simple Hypothesis Testing.

## Solution

Well, part (a) is quite easy, but interesting and elegant, so I'm leaving it as an exercise for you to have the fun. Hint: check whether the required distribution is Discrete Uniform or not! Once you are done, proceed.

Now, part (b) is even more interesting, because here we will not analyze the distributions of $$X_1$$ and $$X_2$$ in the conventional way; instead, we will concentrate on the conditional distribution of $$X_1 | Y=y$$! But why?

There are two reasons for this change of strategy. One is already given in the question itself: the distribution of $$X_1$$ under $$H_o$$ depends on the unknown common parameter. The other is more interesting to observe: if you are done with (a), then by now you have found that the conditional distribution of $$X_1|Y=y$$ is free of any parameter (i.e. $$X_1$$ loses all the information about the parameter $$p_1$$ once we condition on Y=y; this requires $$p_1=p_2$$), and this parameter-free conditional distribution is nothing but the Discrete Uniform on {0,1,....,y}, where y is the observed sum of $$X_1$$ and $$X_2$$.
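For readers who want to compare with their own answer to (a), here is one way the computation can go. The shorthand p for the common value $$p_1=p_2$$ and the indices k and j are just notation for this sketch. For k=0,1,....,y,

$$P(X_1=k | Y=y)=\frac{P(X_1=k)P(X_2=y-k)}{\sum_{j=0}^{y}P(X_1=j)P(X_2=y-j)}=\frac{p^2(1-p)^y}{(y+1)p^2(1-p)^y}=\frac{1}{y+1},$$

which does not involve p at all.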

So, under $$H_o: p_1=p_2$$, the distribution of $$X_1|Y=y$$ does not depend on the common value of $$p_1$$ and $$p_2$$. And, as stated in the problem itself, it is intuitively clear that a large value of $$X_1$$ is evidence against $$H_o$$: if a large value of $$X_1$$ is realized, success does not come very often, i.e. $$p_1$$ is small.
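Both claims can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper names `geo_pmf` and `conditional_pmf` and the particular parameter values are made up for illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def geo_pmf(k, p):
    """P(X = k) for X ~ Geo(p) with support k = 0, 1, 2, ..."""
    return p * (1 - p) ** k


def conditional_pmf(y, p1, p2):
    """Exact P(X1 = k | X1 + X2 = y) for k = 0, ..., y."""
    joint = [geo_pmf(k, p1) * geo_pmf(y - k, p2) for k in range(y + 1)]
    total = sum(joint)
    return [w / total for w in joint]


# Under H_o (p1 == p2) the conditional law is uniform on {0, ..., y},
# whatever the common value is (checked here for y = 4).
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    assert conditional_pmf(4, p, p) == [Fraction(1, 5)] * 5

# Under H_1 (p1 < p2) the conditional mass shifts toward large values of X1.
alt = conditional_pmf(4, Fraction(1, 5), Fraction(1, 2))
print(alt)
```

With $$p_1<p_2$$ the printed probabilities increase in k, which matches the intuition that large values of $$X_1$$ point toward $$H_1$$.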

So, there is strong evidence against $$H_o$$ if $$X_1 > c$$ for some constant $$c \le y$$, where y is the given value of the sum $$X_1+X_2$$.

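So, for a level 0.05 test, we reject $$H_o$$ for large values of $$X_1$$, with the cut-off c chosen such that,

$$P_{H_o}( X_1 > c| Y=y)=0.05 \Rightarrow \frac{y-c}{y+1} = 0.05 \Rightarrow c= 0.95 y - 0.05 .$$

So, we reject $$H_o$$ at level 0.05 when we observe $$X_1 > 0.95y - 0.05$$, where it is given that $$X_1+X_2=y$$. Since $$X_1$$ takes only integer values, the size is exactly 0.05 only when $$0.95y-0.05$$ is an integer; otherwise, take c to be the next integer above it, so that the size stays below 0.05. That's it!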
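Because $$X_1$$ is integer-valued, putting the rule into code forces a choice about rounding. The sketch below assumes we take the smallest integer c with $$\frac{y-c}{y+1} \le 0.05$$, so the conditional size never exceeds the level. The function names `cutoff`, `size` and `reject` are hypothetical, and `alpha` is expected to be a `Fraction` so the arithmetic stays exact.

```python
import math
from fractions import Fraction


def cutoff(y, alpha=Fraction(1, 20)):
    """Smallest integer c with (y - c)/(y + 1) <= alpha."""
    # (y - c)/(y + 1) <= alpha  <=>  c >= y - alpha*(y + 1)
    bound = y - alpha * (y + 1)
    return math.ceil(bound)


def size(y, c):
    """Exact P(X1 > c | Y = y) under H_o, using the uniform law on {0, ..., y}."""
    return Fraction(y - c, y + 1)


def reject(x1, x2, alpha=Fraction(1, 20)):
    """Decision of the conditional test given observations x1 and x2."""
    y = x1 + x2
    return x1 > cutoff(y, alpha)


for y in (10, 19, 20, 39):
    c = cutoff(y)
    print(y, c, size(y, c))
```

Under this rounding choice the test never rejects when y < 19, since even the single point $$X_1=y$$ has conditional probability $$\frac{1}{y+1} > 0.05$$. Attaining size exactly 0.05 for every y would need a randomized test, which this sketch does not attempt.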

## Food For Thought

Can you show that for this same $$X_1$$ and $$X_2$$ ,

$$P(X_1 \le n)- P( X_1+X_2 \le n)= \frac{1-p}{p}P(X_1+X_2= n)$$

taking $$p_1=p_2=p$$, where n=0,1,....? What about the converse? Does it hold? Find out!
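Before attempting the proof, it may help to see the identity hold numerically. This is only a sanity check under the assumption $$p_1=p_2=p$$, not a proof, and the helper names (`cdf_x1`, `pmf_sum`, `cdf_sum`, `identity_holds`) are invented for this sketch.

```python
from fractions import Fraction


def cdf_x1(n, p):
    """P(X1 <= n) for X1 ~ Geo(p) on {0, 1, 2, ...}."""
    return sum(p * (1 - p) ** k for k in range(n + 1))


def pmf_sum(n, p):
    """P(X1 + X2 = n) by direct convolution of two independent Geo(p) p.m.f.s."""
    return sum(p * (1 - p) ** k * p * (1 - p) ** (n - k) for k in range(n + 1))


def cdf_sum(n, p):
    """P(X1 + X2 <= n)."""
    return sum(pmf_sum(m, p) for m in range(n + 1))


def identity_holds(n, p):
    """Compare both sides of the identity exactly, using fractions."""
    lhs = cdf_x1(n, p) - cdf_sum(n, p)
    rhs = (1 - p) / p * pmf_sum(n, p)
    return lhs == rhs


# Check a small grid of n and p values.
print(all(identity_holds(n, Fraction(a, 7)) for n in range(8) for a in range(1, 7)))
```

A check on a finite grid says nothing about the converse, so that part of the question is still yours to settle.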

But avoid losing memory; its beauty is exclusively for the Geometric (and Exponential) distribution!!
