ISI MStat PSB 2014 Problem 9 | Hypothesis Testing

This is another beautiful sample problem, ISI MStat PSB 2014 Problem 9. It is based on testing a simple hypothesis, but it reveals and uses a very cute property of the Geometric distribution, which I like to call a sister of the loss-of-memory property. Give it a try!

Problem- ISI MStat PSB 2014 Problem 9


Let \( X_1 \sim Geo(p_1)\) and \( X_2 \sim Geo(p_2)\) be independent random variables, where Geo(p) refers to the Geometric distribution whose p.m.f. f is given by,

\(f(k)=p(1-p)^k, \quad k=0,1,\ldots\)

We are interested in testing the null hypothesis \(H_o : p_1=p_2\) against the alternative \( H_1: p_1<p_2\). Intuitively it is clear that we should reject if \(X_1\) is large, but unfortunately, we cannot compute the cut-off because the distribution of \(X_1\) under \(H_o\) depends on the unknown common value of \(p_1\) and \(p_2\).

(a) Let \(Y= X_1 +X_2\). Find the conditional distribution of \( X_1|Y=y\) when \(p_1=p_2\).

(b) Based on the result obtained in (a), derive a level 0.05 test for \(H_o\) against \(H_1\) that rejects \(H_o\) when \(X_1\) is large.

Prerequisites


Geometric Distribution.

Negative Binomial Distribution.

Discrete Uniform Distribution.

Conditional Distribution.

Simple Hypothesis Testing.

Solution :

Part (a) is quite easy, but interesting and elegant, so I'm leaving it as an exercise for you to enjoy. Hint: check whether the required distribution is Discrete Uniform or not! Once you are done, proceed.

Part (b) is even more interesting, because here we do not take the conventional route of analyzing the distributions of \(X_1\) and \( X_2\) separately. Instead, we concentrate on the conditional distribution of \( X_1 | Y=y\). But why?

There are two reasons for this change of strategy. One is already given in the question itself: under \(H_o\), the distribution of \(X_1\) depends on the unknown common value of \(p_1\) and \(p_2\). The other is more interesting: if you are done with (a), you have found that the conditional distribution of \(X_1|Y=y\) is free of any parameter. That is, once we condition on \(Y=y\), the distribution of \(X_1\) loses all information about \(p_1\) (this requires \(p_1=p_2\)). This parameter-free conditional distribution is nothing but the Discrete Uniform distribution on \(\{0,1,\ldots,y\}\), where y is the observed value of \(X_1+X_2\).
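If you would like to check your answer to (a) before moving on, the following is a minimal sketch that computes the conditional p.m.f. exactly with rational arithmetic. The helper names `geo_pmf` and `conditional_pmf` and the particular values of p are chosen here for illustration only; they are not part of the problem.

```python
from fractions import Fraction


def geo_pmf(k, p):
    """P(X = k) = p (1 - p)^k for k = 0, 1, 2, ..."""
    return p * (1 - p) ** k


def conditional_pmf(y, p1, p2):
    """Exact P(X1 = k | X1 + X2 = y) for k = 0, ..., y."""
    # Joint probability of (X1 = k, X2 = y - k) for each admissible k.
    joint = [geo_pmf(k, p1) * geo_pmf(y - k, p2) for k in range(y + 1)]
    total = sum(joint)  # this is P(X1 + X2 = y)
    return [w / total for w in joint]


# Equal parameters: every value 0, ..., y gets the same probability.
equal = conditional_pmf(4, Fraction(3, 10), Fraction(3, 10))
print(equal)

# Unequal parameters (p1 < p2): the probabilities are no longer equal.
unequal = conditional_pmf(4, Fraction(1, 5), Fraction(1, 2))
print(unequal)
```

Using `Fraction` instead of floats keeps the comparison exact, so the equal-parameter case returns exactly \(\frac{1}{y+1}\) for each value rather than something merely close to it.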

So, under \(H_o: p_1=p_2\), the distribution of \(X_1|Y=y\) does not depend on the common value of \(p_1 \) and \(p_2\). And, as stated in the problem itself, it is intuitively clear that a large value of \(X_1\) is evidence against \(H_o\): a large value of \(X_1\) means that success does not come very often, i.e. \(p_1\) is small.
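To make this intuition concrete, one can look at the same conditional distribution when \(p_1\) and \(p_2\) are allowed to differ. Assuming \(0<p_1,p_2<1\) and writing \(r=\frac{1-p_1}{1-p_2}\) (a symbol introduced here, not in the problem), a direct computation gives

\( P(X_1=k \mid Y=y) = \frac{p_1(1-p_1)^k \, p_2(1-p_2)^{y-k}}{\sum_{j=0}^{y} p_1(1-p_1)^j \, p_2(1-p_2)^{y-j}} = \frac{r^k}{\sum_{j=0}^{y} r^j}, \quad k=0,1,\ldots,y. \)

Under \(H_o\) we have \(r=1\) and every k gets probability \(\frac{1}{y+1}\). Under \(H_1: p_1<p_2\) we have \(r>1\), so the conditional p.m.f. is increasing in k and the mass shifts towards large values of \(X_1\).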

So, there is strong evidence against \(H_o\) if \(X_1 > c\) for some constant \(c \le y\), where y is the observed value of \(X_1+X_2\).

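So, for a level 0.05 test, we reject \(H_o\) for large values of \(X_1\), with c chosen such that,

\( P_{H_o}( X_1 > c| Y=y)=0.05 \Rightarrow \frac{y-c}{y+1} = 0.05 \Rightarrow c= 0.95 y - 0.05 .\)

Since \(X_1\) takes only integer values, the expression \(\frac{y-c}{y+1}\) for the tail probability holds for integer c, so equality is attained only when \(0.95y-0.05\) is an integer. Otherwise, round c up to the next integer, so that the conditional size does not exceed 0.05.

So, we reject \(H_o\) at level 0.05 when we observe \( X_1 > \lceil 0.95y - 0.05 \rceil \), where \(X_1+X_2=y\) is given. That's it!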
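Because \(X_1\) is integer-valued, a practical version of this rule needs an integer cut-off. The following is a minimal sketch, assuming we want a non-randomized test whose conditional size never exceeds 0.05. The function names `cutoff`, `conditional_size` and `reject` are hypothetical helpers introduced here, and integer arithmetic is used to avoid floating-point rounding at the boundary.

```python
import math
from fractions import Fraction


def cutoff(y, alpha=Fraction(1, 20)):
    """Smallest integer c with P(X1 > c | Y = y) <= alpha under H0.

    Under H0, X1 | Y = y is uniform on {0, ..., y}, so
    P(X1 > c | Y = y) = (y - c) / (y + 1) for integer 0 <= c <= y.
    """
    # Largest number of upper-tail points whose total mass stays within alpha.
    tail_points = math.floor(alpha * (y + 1))
    return y - tail_points


def conditional_size(y, c):
    """Exact P(X1 > c | Y = y) under H0, for integer 0 <= c <= y."""
    return Fraction(y - c, y + 1)


def reject(x1, x2, alpha=Fraction(1, 20)):
    """True if H0 is rejected at level alpha for the observed (x1, x2)."""
    return x1 > cutoff(x1 + x2, alpha)


print(cutoff(19), conditional_size(19, cutoff(19)))  # 18 1/20
print(cutoff(10), conditional_size(10, cutoff(10)))  # 10 0
print(reject(19, 0), reject(18, 1))                  # True False
```

Note that for y < 19 the cut-off equals y, so this non-randomized version never rejects: a single point of the uniform distribution on \(\{0,1,\ldots,y\}\) already carries probability \(\frac{1}{y+1} > 0.05\).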


Food For Thought

Can you show that, for the same \(X_1 \) and \( X_2\),

\(P(X_1 \le n)- P( X_1+X_2 \le n)= \frac{1-p}{p}P(X_1+X_2= n) \)

when \(p_1=p_2=p\), where \(n=0,1,\ldots\)? What about the converse? Does it hold? Find out!
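Before attempting the proof, it can be reassuring to check the identity numerically. The sketch below does so exactly with rational arithmetic for one value of p. The helper names and the choice \(p=\frac{3}{10}\) are assumptions made here for illustration, and a finite check like this is of course not a proof.

```python
from fractions import Fraction


def geo_cdf(n, p):
    """P(X1 <= n) for X1 ~ Geo(p) with p.m.f. p (1 - p)^k, k = 0, 1, ..."""
    return sum(p * (1 - p) ** k for k in range(n + 1))


def sum_pmf(n, p):
    """P(X1 + X2 = n) for independent X1, X2 ~ Geo(p), by convolution."""
    return sum(p * (1 - p) ** k * p * (1 - p) ** (n - k) for k in range(n + 1))


def sum_cdf(n, p):
    """P(X1 + X2 <= n)."""
    return sum(sum_pmf(m, p) for m in range(n + 1))


def identity_holds(n, p):
    """Check P(X1 <= n) - P(X1 + X2 <= n) == (1 - p) / p * P(X1 + X2 = n)."""
    lhs = geo_cdf(n, p) - sum_cdf(n, p)
    rhs = (1 - p) / p * sum_pmf(n, p)
    return lhs == rhs


p = Fraction(3, 10)
print(all(identity_holds(n, p) for n in range(15)))  # True
```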

But avoid losing memory yourself; its beauty is exclusive to the Geometric (and Exponential) distributions!

