ISI MStat PSB 2005 Problem 5 | Uniformity of Uniform

This is a simple and elegant sample problem from ISI MStat PSB 2005 Problem 5. It is based on a mixture of discrete and continuous uniform distributions. The simplicity of the problem can fool us into missing the subtle points, so think carefully!

Problem- ISI MStat PSB 2005 Problem 5


Suppose \(X\) and \(U\) are independent random variables with

\(P(X=k)=\frac{1}{N+1}\), \(k=0,1,2,\ldots,N\),

and \(U\) having a uniform distribution on [0,1]. Let \(Y=X+U\).

(a) For \( y \in \mathbb{R} \), find \( P(Y \le y)\).

(b) Find the correlation coefficient between \(X\) and \(Y\).

Prerequisites


Uniform Distribution

Law of Total Probability

Conditional Distribution

Solution :

This problem is fairly straightforward, so we simply do what we are asked.

Here, we need to find the CDF of \(Y\), where \(Y=X+U\), and \(X\) and \(U\) are defined as above.

So, \(P(Y\le y)= P(X+U \le y)=\sum_{k=0}^N P(U \le y-k \mid X=k)P(X=k) = \frac{1}{N+1} \sum_{k=0}^N P(U \le y-k) \), [since \(U\) and \(X\) are independent].

Now, here is where we often get fooled by the simplicity of the problem. The key observation is that, in the above expression, if \(y-k<0\) then \(P(U\le y-k)=0\), and if \(y-k>1\) then \(P(U\le y-k)=1\) (why?).

So, for \( k^* \le y \le k^*+1 \), where \(k^* \in \{0,1,\ldots,N\}\), we have \(P(U \le y-k^*)=y-k^*\), while \(P(U\le y-k)=0\) for \(k>k^*\) and \(P(U \le y-k)=1\) for \(k<k^*\).

So, the required CDF depends on the interval that \(y\) belongs to. In the case illustrated above, i.e. \(k^* \le y\le k^*+1\), the summation contains \(k^*\) ones (from \(k=0,\ldots,k^*-1\)), the single term \(y-k^*\), and \(N-k^*\) zeros (from \(k=k^*+1,\ldots,N\)), so \(P(Y\le y)\) reduces to,

\(P(Y\le y)= \frac{1}{N+1} ( k^*+y-k^*)=\frac{y}{N+1}\), \(0<y<N+1\), [since \(k^*\) may be any of \(0,1,\ldots,N\), the union of these adjacent sub-intervals gives the stated interval]. Outside this interval, \(P(Y\le y)=0\) for \(y\le 0\) and \(P(Y\le y)=1\) for \(y\ge N+1\).
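The reduction above can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the function names `cdf_u` and `cdf_y` are illustrative and not part of the original problem. It evaluates the total-probability sum term by term and compares it with \(\frac{y}{N+1}\).

```python
from fractions import Fraction


def cdf_u(t):
    """CDF of U ~ Uniform[0, 1] evaluated at t."""
    if t <= 0:
        return Fraction(0)
    if t >= 1:
        return Fraction(1)
    return Fraction(t)


def cdf_y(y, N):
    """P(Y <= y) computed from the sum over k = 0, ..., N of P(U <= y - k)."""
    total = sum(cdf_u(Fraction(y) - k) for k in range(N + 1))
    return total / (N + 1)


# Compare the sum with y / (N + 1) at a few points inside (0, N + 1).
N = 4
for y in [Fraction(1, 3), Fraction(5, 2), Fraction(3), Fraction(19, 4)]:
    print(y, cdf_y(y, N), y / (N + 1))
```

Using `Fraction` avoids floating-point round-off, so the two printed values can be compared for exact equality.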

Hence the required CDF. But can you find this CDF by a direct argument, without the algebraic labor? What distribution is it? Does the uniform retain its uniformity?

I leave part (b) as an exercise; it's quite straightforward.
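If you want to check your answer to part (b), here is a minimal sketch, assuming \(N \ge 1\) and using only two standard facts: \(Var(U)=\frac{1}{12}\) for \(U\) uniform on [0,1], and \(Cov(X, X+U)=Var(X)\) when \(X\) and \(U\) are independent. The function name `corr_squared` is illustrative. It returns the square of the correlation coefficient as an exact fraction, so take the positive square root to compare with your own result.

```python
from fractions import Fraction
from math import sqrt


def corr_squared(N):
    """Squared correlation between X and Y = X + U, for N >= 1."""
    xs = range(N + 1)
    mean_x = Fraction(sum(xs), N + 1)
    var_x = Fraction(sum(k * k for k in xs), N + 1) - mean_x ** 2
    var_u = Fraction(1, 12)   # variance of Uniform[0, 1]
    var_y = var_x + var_u     # X and U are independent
    cov_xy = var_x            # Cov(X, X + U) = Var(X)
    return cov_xy ** 2 / (var_x * var_y)


for N in [1, 3, 10]:
    r2 = corr_squared(N)
    print(N, r2, sqrt(r2))
```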


Food For Thought

How do we deal with random triangles?

Suppose you are given a stick of length \(N+1\) units. You break the stick into 3 pieces, with the two breaking points chosen at random. What is the chance that the 3 pieces can serve as the sides of a triangle?

Now suppose you first break the stick into two pieces at random, and then break the longer piece again into two parts (at random). How does the chance of making a triangle change? Does the chance increase? What do you think?

Lastly, does the length of the stick matter at all? Give it a thought!
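Once you have made your guesses, a simulation is a quick way to test them. The following is a minimal Monte Carlo sketch, assuming that "randomly" means every breaking point is uniformly distributed along the piece being broken; the function names are illustrative. Try it with different stick lengths too.

```python
import random


def can_form_triangle(a, b, c):
    """Three lengths form a triangle iff the longest is shorter than the sum of the other two."""
    longest = max(a, b, c)
    return longest < (a + b + c) - longest


def two_random_cuts(length, rng):
    """Scheme 1: two independent uniform breaking points on the whole stick."""
    u, v = sorted(rng.uniform(0, length) for _ in range(2))
    return u, v - u, length - v


def cut_longer_piece_again(length, rng):
    """Scheme 2: one uniform break, then a uniform break of the longer piece."""
    first = rng.uniform(0, length)
    short, longer = min(first, length - first), max(first, length - first)
    second = rng.uniform(0, longer)
    return short, second, longer - second


def estimate(scheme, length, trials=200_000, seed=0):
    """Fraction of simulated breaks whose three pieces form a triangle."""
    rng = random.Random(seed)
    hits = sum(can_form_triangle(*scheme(length, rng)) for _ in range(trials))
    return hits / trials


for length in [1.0, 5.0]:
    print(length,
          estimate(two_random_cuts, length, trials=20_000),
          estimate(cut_longer_piece_again, length, trials=20_000))
```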


ISI MStat PSB 2008 Problem 10
Outstanding Statistics Program with Applications



ISI MStat PSB 2014 Problem 9 | Hypothesis Testing

This is another beautiful sample problem from ISI MStat PSB 2014 Problem 9. It is based on testing a simple hypothesis, but it reveals and uses a very cute property of the Geometric distribution, which I like to call the sister of the loss-of-memory property. Give it a try!

Problem- ISI MStat PSB 2014 Problem 9


Let \( X_1 \sim Geo(p_1)\) and \( X_2 \sim Geo(p_2)\) be independent random variables, where Geo(p) refers to the Geometric distribution whose p.m.f. f is given by,

\(f(k)=p(1-p)^k, k=0,1,\ldots\)

We are interested in testing the null hypothesis \(H_o : p_1=p_2\) against the alternative \( H_1: p_1<p_2\). Intuitively, it is clear that we should reject if \(X_1\) is large, but unfortunately we cannot compute the cut-off, because the distribution of \(X_1\) under \(H_o\) depends on the unknown common value of \(p_1\) and \(p_2\).

(a) Let \(Y= X_1 +X_2\). Find the conditional distribution of \( X_1|Y=y\) when \(p_1=p_2\).

(b) Based on the result obtained in (a), derive a level 0.05 test for \(H_o\) against \(H_1\) that rejects \(H_o\) when \(X_1\) is large.

Prerequisites


Geometric Distribution

Negative Binomial Distribution

Discrete Uniform Distribution

Conditional Distribution

Simple Hypothesis Testing

Solution :

Well, part (a) is quite easy, but interesting and elegant, so I'm leaving it as an exercise for you to have the fun. Hint: check whether the required distribution is Discrete Uniform or not! Once you are done, proceed.
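To test the hint before proving it, here is a minimal sketch, assuming \(p_1=p_2=p\) and exact rational arithmetic; the function names `geo_pmf` and `conditional_pmf` are illustrative. It computes \(P(X_1=k \mid Y=y)\) directly from the definition of conditional probability for a couple of values of \(p\).

```python
from fractions import Fraction


def geo_pmf(k, p):
    """p.m.f. of Geo(p) on k = 0, 1, 2, ...: f(k) = p (1 - p)^k."""
    return p * (1 - p) ** k


def conditional_pmf(y, p):
    """List of P(X1 = k | X1 + X2 = y) for k = 0, ..., y, when p1 = p2 = p."""
    joint = [geo_pmf(k, p) * geo_pmf(y - k, p) for k in range(y + 1)]
    total = sum(joint)  # this is P(X1 + X2 = y)
    return [j / total for j in joint]


for p in [Fraction(1, 3), Fraction(7, 10)]:
    print(p, conditional_pmf(4, p))
```

Notice whether the printed probabilities change when \(p\) changes.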

Now, part (b) is even more interesting, because here we will not analyze the distributions of \(X_1\) and \( X_2\) in the conventional way; instead, we will concentrate on the conditional distribution of \( X_1 | Y=y\)! But why?

There are two reasons for this change of strategy. One is already given in the question itself. The other is more interesting: if you are done with (a), then by now you have found that the conditional distribution of \(X_1|Y=y\) is free of any parameter (i.e. the distribution of \(X_1\) loses all information about the parameter \(p_1\) when conditioned on \(Y=y\); \(p_1=p_2\) is a necessary condition), and this parameter-free conditional distribution is nothing but the Discrete Uniform on \(\{0,1,\ldots,y\}\), where \(y\) is the sum of \(X_1\) and \(X_2\).

So, under \(H_o: p_1=p_2\), the distribution of \(X_1|Y=y\) does not depend on the common value of \(p_1\) and \(p_2\). And, as stated in the problem itself, it is intuitively clear that a large value of \(X_1\) is evidence against \(H_o\): a large observed value of \(X_1\) means that success does not come very often, i.e. \(p_1\) is small.

So, there will be strong evidence against \(H_o\) if \(X_1 > c\) for some constant \(c \le y\), where \(y\) is the observed value of the sum \(X_1+X_2\).

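So, for a level 0.05 test, we reject \(H_o\) for large values of \(X_1\), with the cut-off \(c\) chosen such that,

\( P_{H_o}( X_1 > c| Y=y)=0.05 \Rightarrow \frac{y-c}{y+1} = 0.05 \Rightarrow c= 0.95 y - 0.05 .\)

Since \(X_1\) takes only integer values, this \(c\) is usually not an integer; to keep the conditional size at most 0.05 without randomization, round it up to the next integer. So, we reject \(H_o\) at level 0.05 when we observe \( X_1 > \lceil 0.95y - 0.05 \rceil \), where it is given that \(X_1+X_2=y\). That's it!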
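Because \(X_1\) is integer-valued, the cut-off used in practice has to be an integer. The following is a minimal sketch, assuming we want a non-randomized test whose conditional size under \(H_o\) is at most 0.05; the function names are illustrative. It uses the Discrete Uniform conditional distribution on {0,1,....,y} and integer arithmetic, since \(\frac{y-c}{y+1} \le \frac{1}{20}\) is the same as \(c \ge \frac{19y-1}{20}\).

```python
from fractions import Fraction


def cutoff(y):
    """Smallest integer c with (y - c) / (y + 1) <= 1/20, i.e. ceil((19y - 1) / 20)."""
    return (19 * y + 18) // 20


def conditional_size(y):
    """P(X1 > cutoff(y) | Y = y) under H_o, where X1 | Y = y is uniform on {0, ..., y}."""
    return Fraction(y - cutoff(y), y + 1)


def reject(x1, x2):
    """Decision rule: reject H_o when X1 exceeds the cut-off for the observed sum."""
    return x1 > cutoff(x1 + x2)


for y in [10, 19, 39, 100]:
    print(y, cutoff(y), conditional_size(y))
```

For small values of \(y\) the cut-off equals \(y\) itself, so this non-randomized rule can never reject; attaining size exactly 0.05 for every \(y\) would need a randomized test.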


Food For Thought

Can you show that for this same \(X_1 \) and \( X_2\) ,

\(P(X_1 \le n)- P( X_1+X_2 \le n)= \frac{1-p}{p}P(X_1+X_2= n) \)

taking \(p_1=p_2=p\), where \(n=0,1,\ldots\)? What about the converse? Does it hold? Find out!

But avoid losing memory; its beauty is exclusive to the Geometric (and Exponential)!
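Before attempting a proof, the identity can be checked for particular values. The following is a minimal sketch, assuming \(p_1=p_2=p\) and exact rational arithmetic; the function names are illustrative. It computes the difference between the two sides of the identity, which should be zero if the identity holds.

```python
from fractions import Fraction


def geo_pmf(k, p):
    """p.m.f. of Geo(p) on k = 0, 1, 2, ...: f(k) = p (1 - p)^k."""
    return p * (1 - p) ** k


def cdf_x1(n, p):
    """P(X1 <= n)."""
    return sum(geo_pmf(k, p) for k in range(n + 1))


def pmf_sum(n, p):
    """P(X1 + X2 = n), by convolving the two Geometric p.m.f.s."""
    return sum(geo_pmf(k, p) * geo_pmf(n - k, p) for k in range(n + 1))


def cdf_sum(n, p):
    """P(X1 + X2 <= n)."""
    return sum(pmf_sum(m, p) for m in range(n + 1))


def identity_gap(n, p):
    """Left side minus right side of the identity."""
    return cdf_x1(n, p) - cdf_sum(n, p) - (1 - p) / p * pmf_sum(n, p)


p = Fraction(2, 7)
print([identity_gap(n, p) for n in range(6)])
```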

