
ISI MStat PSB 2009 Problem 4 | Polarized to Normal

This is a very beautiful sample problem from ISI MStat PSB 2009 Problem 4. It is based on the idea of polar transformations, but it needs a good deal of observation to realize that. Give it a try !

Problem– ISI MStat PSB 2009 Problem 4


Let \(R\) and \(\theta\) be independent and non-negative random variables such that \(R^2 \sim {\chi_2}^2 \) and \(\theta \sim U(0,2\pi)\). Fix \(\theta_o \in (0,2\pi)\). Find the distribution of \(R\sin(\theta+\theta_o)\).

Prerequisites


Convolution

Polar Transformation

Normal Distribution

Solution :

This problem may get nasty if one tries to find the required distribution by the so-called CDF method. It's better to observe a bit before moving forward !! Recall how we derive the probability distribution of the sample variance of a sample from a normal population ??

Yes, you are thinking right: we need to use the polar transformation !!

But before transforming, let's make some modifications to reduce future complications.

Given \(\theta \sim U(0,2\pi)\) and \(\theta_o \) a fixed number in \((0,2\pi)\), let \(Z=\theta+\theta_o \sim U(\theta_o,2\pi +\theta_o)\).

Hence, we need to find the distribution of \(R\sin Z\). Now, from the given and modified information, the joint pdf of \(R\) and \(Z\) is,

\(f_{R,Z}(r,z)=\frac{r}{2\pi}e^{-\frac{r^2}{2}}, \ \ r>0, \ \theta_o \le z \le 2\pi +\theta_o \)

(since \(R^2 \sim {\chi_2}^2\), a change of variable gives \(R\) the Rayleigh density \(re^{-\frac{r^2}{2}}\), and \(Z\) is an independent uniform over an interval of length \(2\pi\)).

Now, let the transformation be \((R,Z) \to (X,Y)\):

\(X=R\cos Z \\ Y=R\sin Z\), where \(X,Y \in \mathbb{R}\).

Hence, \(R=\sqrt{X^2+Y^2} \\ Z= \tan^{-1} (\frac{Y}{X}) \)

Hence, verify that the Jacobian of the transformation is \(J(\frac{r,z}{x,y})=\frac{1}{r}\).

Hence, the joint pdf of \(X\) and \(Y\) is,

\(f_{X,Y}(x,y)=f_{R,Z}(\sqrt{x^2+y^2}, \tan^{-1}(\frac{y}{x}))\, J(\frac{r,z}{x,y}) =\frac{1}{2\pi}e^{-\frac{x^2+y^2}{2}}\), \(x,y \in \mathbb{R}\).

Yeah, now it looks familiar, right !!

Since we need the distribution of \(Y=R\sin Z=R\sin(\theta+\theta_o)\), we integrate \(f_{X,Y}\) with respect to \(x\) over the real line, and we end up with the conclusion that

\(R\sin(\theta+\theta_o) \sim N(0,1)\). Hence, we are done !!
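To double-check this numerically, here is a minimal Monte Carlo sketch (Python with numpy; the seed, sample size and the particular \(\theta_o\) are arbitrary illustrative choices, not part of the problem):

```python
import numpy as np

# Sketch: R^2 ~ chi-squared with 2 d.f., theta ~ U(0, 2*pi), independent.
# Then R*sin(theta + theta_0) should be N(0,1) for any fixed theta_0.
rng = np.random.default_rng(42)
n = 1_000_000
theta_0 = 1.3                              # any fixed angle in (0, 2*pi)

r = np.sqrt(rng.chisquare(df=2, size=n))   # R = sqrt(chi^2_2), i.e. Rayleigh
theta = rng.uniform(0, 2 * np.pi, size=n)
y = r * np.sin(theta + theta_0)

print(y.mean(), y.var())                   # should be close to 0 and 1
```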


Food For Thought

From the above solution, the distribution of \(R\cos(\theta+\theta_o)\) is also determinable, right !! Can you go further and investigate the distribution of \(\tan(\theta+\theta_o)\) ?? Here \(R\) and \(\theta\) are the same variables as defined in the question.

Give it a try !!



ISI MStat PSB 2014 Problem 4 | The Machine’s Failure

This is a very simple sample problem from ISI MStat PSB 2014 Problem 4. It is based on order statistics, but one's general ignorance of order statistics makes one miss the subtleties. Be careful !

Problem– ISI MStat PSB 2014 Problem 4


Consider a machine with three components whose times to failure are independently distributed as exponential random variables with mean \(\lambda\). The machine continues to work as long as at least two components work. Find the expected time to failure of the machine.

Prerequisites


Exponential Distribution

Order statistics

Basic counting

Solution :

As the problem states, let the 3 components of the machine be A, B and C respectively, where \(X_A, X_B\) and \(X_C\) are the survival times of the respective parts. It is also told that \(X_A, X_B\) and \(X_C\) follow \(exponential(\lambda) \), and clearly these random variables are i.i.d.

Now, here comes the trick ! It is told that the machine stops when two or all parts of the machine stop working. Here we sometimes get confused and start thinking combinatorially, but we forget that the basic counting of combinatorics lies in ordering ! Suppose we order the lifetimes of the individual components, i.e. among \(X_A, X_B\) and \(X_C\) there exists an ordering, and if we write it in order, we have \(X_{(1)} \le X_{(2)} \le X_{(3)} \).

Now observe that after \(X_{(2)}\) units of time, the machine will stop !! (Are you sure ?? Think it over.)

So, the expected time till the machine stops is just \(E(X_{(2)})\), but to find this we need to know the distribution of \(X_{(2)}\).

We have the pdf of \(X_{(2)}\) as, \(f_{(2)}(x)= \frac{3!}{(2-1)!(3-2)!} [P(X \le x)]^{2-1}[P(X>x)]^{3-2}f_X(x) \).

where \(f_X(x)\) is the pdf of the exponential with mean \(\lambda\).

So, \(E(X_{(2)})= \int^{\infty}_0 x f_{(2)}(x)dx \), which turns out to be \(\frac{5\lambda}{6}\); I leave the verification to the reader, hence concluding my solution.
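A quick simulation sketch (Python with numpy; the mean \(\lambda\) and sample size are illustrative assumptions) confirming that the machine's mean failure time is \(\frac{5\lambda}{6}\):

```python
import numpy as np

# Sketch: three i.i.d. exponential(mean = lam) component lifetimes;
# the machine fails at the second order statistic X_(2).
rng = np.random.default_rng(0)
lam = 2.0                                  # illustrative mean lifetime
n = 1_000_000

times = rng.exponential(scale=lam, size=(n, 3))
failure = np.sort(times, axis=1)[:, 1]     # X_(2), the middle lifetime

print(failure.mean(), 5 * lam / 6)         # both should be close
```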


Food For Thought

Now suppose you want to install an alarm system which will notify you some time before the machine wears out !! So, what do you think your strategy should be ? Given that you have a strategy, you now replace the worn-out part of the machine within the time period between the alarm ringing and the machine stopping, so as to continue uninterrupted working. What is the expected time within which you must act ?

Keep the machine running !!



ISI MStat PSB 2013 Problem 7 | Bernoulli interferes Normally

This is a very simple and beautiful sample problem from ISI MStat PSB 2013 Problem 7. It is mainly based on simple hypothesis testing for normal variables, just modified with a Bernoulli random variable. Try it!

Problem– ISI MStat PSB 2013 Problem 7


Suppose \(X_1\) and \(X_2\) are two independent and identically distributed random variables with \(N(\theta, 1)\). Further consider a Bernoulli random variable \(V\) with \(P(V=1)=\frac{1}{4}\) which is independent of \(X_1\) and \(X_2\) . Define \(X_3\) as,

\(X_3 = \begin{cases} X_1 & \text{if } V=0 \\ X_2 & \text{if } V=1 \end{cases}\)

For testing \(H_o: \theta= 0\) against \(H_1: \theta=1\), consider the test:

Reject \(H_o\) if \(\frac{X_1+X_2+X_3}{3} >c\).

Find \(c\) such that the test has size \(0.05\).

Prerequisites


Normal Distribution

Simple Hypothesis Testing

Bernoulli Trials

Solution :

This problem is simple enough; the only trick is to observe that the test rule is based on 3 random variables, \(X_1, X_2\) and \(X_3\), but \(X_3\) in turn depends on the Bernoulli variable \(V\).

So, we reject \(H_o\) if \(\frac{X_1+X_2+X_3}{3}> c\), where for the test to have size \(0.05\), \(c\) must satisfy,

\(P_{H_o}(\frac{X_1+X_2+X_3}{3}>c)=0.05\)

So, using the law of total probability (conditioning on \(V\), since \(X_3\) depends on it),

\(P_{H_o}(X_1+X_2+X_3>3c|V=0)P(V=0)+P_{H_o}(X_1+X_2+X_3>3c|V=1)P(V=1)=0.05\)

\(\Rightarrow P_{H_o}(2X_1+X_2>3c)\frac{3}{4}+P_{H_o}(X_1+2X_2>3c)\frac{1}{4}=0.05 \) [remember, \(X_1\) and \(X_2\) are independent of \(V\)].

Now, under \(H_o\), \(2X_1+X_2 \sim N(0,5)\) and \(X_1+2X_2 \sim N(0,5)\),

so the rest is quite obvious and easy to figure out, which I leave as an exercise (see the sketch below) !!
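For completeness, a short sketch (Python with scipy) finishing the computation; since both mixture components are \(N(0,5)\) under \(H_o\), the size condition collapses to \(P(N(0,5)>3c)=0.05\):

```python
import numpy as np
from scipy.stats import norm

# Under H_0, both 2*X1 + X2 and X1 + 2*X2 are N(0,5), so the size condition
# reduces to P(N(0,5) > 3c) = 0.05, i.e. 3c = sqrt(5) * z_{0.95}.
z = norm.ppf(0.95)            # upper 5% point of N(0,1), about 1.645
c = np.sqrt(5) * z / 3
print(c)                      # roughly 1.226
```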


Food For Thought

Let's end this discussion with something exponential.

Suppose \(X_1,X_2,\ldots,X_n\) is a random sample from \(exponential(\theta)\) and \(Y_1,Y_2,\ldots,Y_m\) is another random sample from an \(exponential(\mu)\) population. Now you are to test \(H_o: \theta=\mu\) against \(H_1: \theta \neq \mu \).

Can you show that the test can be based on a statistic \(T\) such that \(T= \frac{\sum X_i}{\sum X_i +\sum Y_i}\) ?

What distribution do you think \(T\) should follow under the null hypothesis ? Think it over !!



ISI MStat PSB 2013 Problem 9 | Envelope Collector’s Expenditure

This is a very simple and beautiful sample problem from ISI MStat PSB 2013 Problem 9. It is mainly based on the geometric distribution and its expectation. Try it!

Problem– ISI MStat PSB 2013 Problem 9


Envelopes are on sale for Rs. 30 each. Each envelope contains exactly one coupon, which can be one of four types with equal probability. Suppose you keep on buying envelopes and stop when you have collected all four types of coupons. What will be your expected expenditure ?

Prerequisites


Geometric Distribution

Expectation of geometric distribution

Basic counting

Solution :

This problem seems quite simple, and it is. Often one may argue that we can take a single random variable which denotes the number of trials till the fourth success (or is it the third !!) and calculate its expectation. But I differ here, because I find it a lot easier to work with a sum of geometric random variables than with a negative binomial (a negative binomial is actually a sum of finitely many independent geometrics !!).

So, here is what we will do: define 4 random variables, \(X_i\) : # trials to get a type of coupon different from all the \((i-1)\) types of coupons drawn earlier, \(i=1,2,3,4\).

Now, since each type of coupon has an equal probability of coming, i.e. success probability \(\frac{1}{4}\), a common mistake people commit is assuming that all of \(X_1,X_2,X_3,X_4\) are i.i.d. Geometric(\(\frac{1}{4}\)), and this turns out to be a disaster !! So be aware and observe keenly that on the first draw any of the four types will come with probability 1, and thereafter we just need each of the remaining 3 types to appear at least once. So here \(X_1=1\) always, and \(X_2 \sim Geo(\frac{3}{4})\) (because after the first trial a success is getting any of the 3 types not yet drawn, making the success probability \(\frac{3}{4}\)); similarly, \(X_3 \sim Geo(\frac{1}{2})\) and \(X_4 \sim Geo(\frac{1}{4})\).

Now, at the given rate of Rs. 30 per envelope, the expected expenditure is Rs. \(30 E(X_1+X_2+X_3+X_4)\).

Now, we know that \(E(X_2)=\frac{4}{3}\), \(E(X_3)=2\) and \(E(X_4)=4\) (why??)

So, \(E(X_1+X_2+X_3+X_4)=1+\frac{4}{3}+2+4=\frac{25}{3}\).

So, our required expectation is Rs. 250. Hence we are done !
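Here is a simulation sketch of the whole purchase process (Python with numpy; seed and replication count are arbitrary choices) to corroborate the Rs. 250 answer:

```python
import numpy as np

# Sketch: coupon collector with 4 equally likely types; buy envelopes at
# Rs. 30 each until all four types have been collected.
rng = np.random.default_rng(7)

def envelopes_needed():
    seen = set()
    count = 0
    while len(seen) < 4:
        seen.add(int(rng.integers(4)))   # a uniformly random coupon type
        count += 1
    return count

costs = [30 * envelopes_needed() for _ in range(100_000)]
print(np.mean(costs))                    # should be close to 250
```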


Food For Thought

Suppose each of the envelopes you collected carries a unique 5-digit number. You pick an envelope randomly; what is the chance that the number that shows up has its digits in non-decreasing order ?

How would the chances vary if you found a k-digit number on the envelope ? Think it over !!



ISI MStat PSB 2013 Problem 10 | Balls-go-round

This is a very beautiful sample problem from ISI MStat PSB 2013 Problem 10. It’s based mainly on counting and following the norms stated in the problem itself. Be careful while thinking !

Problem– ISI MStat PSB 2013 Problem 10


There are 10 empty boxes numbered 1, 2, …, 10 placed sequentially on a circle as shown in the figure.

We perform 100 independent trials. At each trial one box is selected with probability \(\frac{1}{10}\) and a ball is placed in each of the two neighboring boxes of the selected one.

Define \(X_k\) be the number of balls in the \(k^{th}\) box at the end of 100 trials.

(a) Find \(E(X_k)\) for \( 1 \le k \le 10\).

(b) Find \(Cov(X_k, X_5)\) for \(1 \le k \le 10 \).

Prerequisites


Counting principles

Binomial Distribution

Independence of Events

Solution :

At first this problem may seem a bit complex, but once one gets to see the pattern it starts unfolding. For these types of problems, I find a useful technique is to follow the picture; in this case the picture is provided (if not, draw it yourself !!).

Given, \(X_k\) : # balls in the \(k^{th}\) box at the end of 100 trials.

So, the possible values of \(X_k\) are 0, 1, 2, …, 100, and the probability that at the \(j^{th}\) trial a ball is added to the \(k^{th}\) box is \(\frac{2}{10}=\frac{1}{5}\), since box \(k\) receives a ball exactly when one of its two neighbours is selected (why ??).

Now, \(P(X_k=x)= {100 \choose x} (\frac{1}{5})^{x} (\frac{4}{5})^{100-x}, \ \ x=0,1,\ldots,100\).

Clearly, \( X_k \sim binomial( 100, \frac{1}{5})\), and from here one can easily find out the expectation of \(X_k\). But have you thought of it like this ??

Now, notice that after every selection of a box, 2 balls are added into the system, so at the end of the 100th trial there will be 200 balls distributed in the system.

So, \(X_1 +X_2 +\cdots+ X_{10} =200 \), which implies \( \sum_{k} E(X_k)=200\); by symmetry \(E(X_k)=E(X_l)\) for all \(k \neq l \), so \(E(X_k)=20\).

(b) Now this part is the cream of the problem. First notice that box \(k\) receives a ball exactly when box \(k-1\) or box \(k+1\) is selected, so \(X_k\) and \(X_l\) share a common source of balls precisely when \(|k-l|=2\) (the box between them). But be careful: the covariance does not vanish for the other pairs, because all the boxes compete for the same 100 trials, which couples them negatively, as the calculation below shows.

So, for \(Cov(X_k, X_5)\) the special cases are \(k=3,7\) (a shared neighbour) and \(k=5\), where \(Cov(X_5,X_5)=Var(X_5)\); we will compute these and then the common value for the remaining \(k\).

It is sufficient to find just one of \(Cov(X_3,X_5)\) and \(Cov(X_7,X_5)\), as the two situations are symmetric and identical to each other. To find, say, \(Cov(X_3,X_5)\), let's look at what happens in each trial more closely.

Let \(X_k= i_{k_1} +i_{k_2}+\cdots+i_{k_{100}} \), where \( i_{k_j} = \begin{cases} 1 & \text{if a ball is added to the } k^{th} \text{ box at the } j^{th} \text{ trial} \\ 0 & \text{otherwise} \end{cases}\)

So, clearly, \(P(i_{k_j}=1)=\frac{1}{5} \); \(j=1,2,\ldots,100\).

So, \(Cov(X_3,X_5)=Cov( i_{3_1}+i_{3_2}+\cdots+i_{3_{100}},\ i_{5_1}+i_{5_2}+\cdots+i_{5_{100}})=\sum_{j=1}^{100} Cov(i_{3_j},i_{5_j})\), since \(Cov(i_{3_j},i_{5_{j'}})=0\) for all \(j\neq j'\) (why ?? the trials are independent).

So, \(Cov(X_3,X_5)= 100\, Cov(i_{3_1},i_{5_1})=100( E(i_{3_1}i_{5_1})-E(i_{3_1})E(i_{5_1}))=100(P(i_{3_1}=1, i_{5_1}=1)-P(i_{3_1}=1)P(i_{5_1}=1))=100(\frac{1}{10}- \frac{1}{5}\cdot\frac{1}{5})=6\), where \(P(i_{3_1}=1, i_{5_1}=1)=\frac{1}{10}\) because boxes 3 and 5 both receive a ball exactly when box 4 is selected.

Similarly, \(Cov(X_7,X_5)=6\). Also, since \(X_5 \sim binomial(100,\frac{1}{5})\), \(Var(X_5)=100\cdot\frac{1}{5}\cdot\frac{4}{5}=16\). For all other \(k\), the neighbour sets are disjoint, so \(i_{k_1}\) and \(i_{5_1}\) cannot both equal 1, giving \(Cov(i_{k_1},i_{5_1})=0-\frac{1}{5}\cdot\frac{1}{5}=-\frac{1}{25}\) and hence \(Cov(X_k,X_5)=-4\). So, \(Cov(X_k,X_5)= \begin{cases} 6 & k=3,7 \\ 16 & k=5 \\ -4 & \text{otherwise} \end{cases}\). (Sanity check: these sum to \(16+2\cdot 6+7\cdot(-4)=0\) over \(k\), as they must, since \(\sum_k X_k=200\) is constant.) Hence we are done !!
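A simulation sketch (Python with numpy; seed and replication count are arbitrary choices) checking the mean, the covariance for \(|k-5|=2\), and the negative covariance for the remaining boxes:

```python
import numpy as np

# Sketch: 100 trials; each selects one of 10 boxes uniformly and drops a
# ball into each of its two neighbours on the circle.
rng = np.random.default_rng(1)
reps, trials, boxes = 100_000, 100, 10

sel = rng.integers(boxes, size=(reps, trials))      # boxes coded 0..9
counts = np.zeros((reps, boxes))
for k in range(boxes):
    # box k gets a ball whenever box k-1 or k+1 (mod 10) is selected
    counts[:, k] = ((sel == (k - 1) % boxes) |
                    (sel == (k + 1) % boxes)).sum(axis=1)

print(counts[:, 4].mean())                          # E(X_5)       ~ 20
print(np.cov(counts[:, 2], counts[:, 4])[0, 1])     # Cov(X_3,X_5) ~ 6
print(np.cov(counts[:, 0], counts[:, 4])[0, 1])     # Cov(X_1,X_5) ~ -4
```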


Food For Thought

Wait, let's imagine these boxes are interchanged in such a way that the \(h^{th}\) box is swapped with the \(k^{th}\) (\(\neq h\)) box, possibly for several disjoint pairs \((h,k)\) simultaneously. Can you show that there are precisely \( 10!\sum_{i+2j=10}(i!j!2^j)^{-1}\) such arrangements possible ?

Now imagine a game in which there is 1 ball in each box (the boxes are arranged identically as shown in the question). You pick up the ball from the first box and put it into the 2nd one; now, you can't take a ball out of a box in which you just put one, so you pick the ball from the 3rd box and put it into the 4th, and you go on like this, taking the balls from the \(i^{th}\) box and putting them into the \((i+1)^{th}\) box, never emptying a box you just filled. After the first round you remove the empty boxes and do the same thing again and again, till all the balls are accumulated in a single box. Which box do you think will contain all the balls after you run this process finitely many times ?? If you are in this lottery and have to choose a box before the game begins, which box should you choose ??

If the coordinator of the game starts with an arbitrary \(i^{th}\) box, how should your strategy change ?? Give it a thought !!

For help, look up the Josephus Problem; you may be moved by its beauty !



ISI MStat PSB 2005 Problem 5 | Uniformity of Uniform

This is a simple and elegant sample problem from ISI MStat PSB 2005 Problem 5. It's based on a mixture of discrete and continuous uniform distributions; the simplicity of the problem actually fools us, and we miss subtle happenings. Be careful while thinking !

Problem– ISI MStat PSB 2005 Problem 5


Suppose \(X\) and \(U\) are independent random variables with

\(P(X=k)=\frac{1}{N+1} \) , \( k=0,1,2,\ldots,N\) ,

and \(U\) having a uniform distribution on [0,1]. Let \(Y=X+U\).

(a) For \( y \in \mathbb{R} \), find \( P(Y \le y)\).

(b) Find the correlation coefficient between \(X\) and \(Y\).

Prerequisites


Uniform Distribution

Law of Total Probability

Conditional Distribution

Solution :

This problem is quite straightforward, and we do what we are told to.

Here, we need to find the CDF of \(Y\), where \(Y=X+U \), with \(X\) and \(U\) defined as above.

So, \(P(Y\le y)= P(X+U \le y)=\sum_{k=0}^N P(U \le y-k|X=k)P(X=k) = \frac{1}{N+1} \sum_{k=0}^N P(U \le y-k) \), since \(U\) and \(X\) are independent.

Now, here is where we often get fooled by the simplicity of the problem. The beauty is to observe that in the above expression, if \(y-k <0\) then \(P(U\le y-k)=0\), and if \(y-k>1\) then \(P(U\le y-k)=1\) (why ??).

So, for \( k^* \le y \le k^*+1 \) we have \(P(U \le y-k^*)=y-k^*\); thus when \(k^* \le y \le k^*+1\), \(P(U\le y-k)=0\) for \(k>k^*\) and \(P(U \le y-k)=1\) for \(k<k^*\).

So, the required CDF depends on the interval to which \(y\) belongs. For the case illustrated above, i.e. \(k^* \le y\le k^*+1\), there are \(k^*\) ones and \(N-k^*\) zeros in the above summation, so \(P(Y\le y)\) reduces to,

\(P(Y\le y)= \frac{1}{N+1} ( k^*+y-k^*)=\frac{y}{N+1}\), \(0<y<N+1\), since \(k^*\) may vary over \(0,1,\ldots,N\), and the union of all these sub-intervals gives the mentioned interval.

Hence the required CDF. But can you find this CDF argumentatively, without this algebraic labour ?? What distribution is it ?? Does the uniform retain its uniformity ?
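It is \(U(0,N+1)\), and a small simulation sketch (Python with numpy; \(N\) and the evaluation point are illustrative assumptions) makes this easy to check:

```python
import numpy as np

# Sketch: X uniform on {0, 1, ..., N}, U ~ U(0,1) independent;
# Y = X + U should then be uniform on (0, N+1).
rng = np.random.default_rng(3)
N = 9                                     # illustrative choice
n = 1_000_000

x = rng.integers(0, N + 1, size=n)        # uniform on {0, ..., N}
u = rng.uniform(size=n)
y = x + u

print(y.mean(), (N + 1) / 2)              # U(0, N+1) has mean (N+1)/2
print(np.mean(y <= 2.5), 2.5 / (N + 1))   # empirical vs theoretical CDF
```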

I leave part (b) as an exercise; it's quite straightforward.


Food For Thought

How do we deal with some random triangles ??

Suppose you are given a stick of length \(N+1\) units. You are to break the stick into 3 pieces, with the breaking points chosen randomly. What is the chance of constructing a triangle with those 3 broken parts as its sides ??

Now, if you first break the stick into two pieces randomly, and then break the longer piece again into two parts (randomly), how does the chance of making a triangle change ?? Does the chance increase, what do you think ??

Lastly, does the length of the stick matter at all ? Give it a thought !!



Bayes comes to rescue | ISI MStat PSB 2007 Problem 7

This is a very beautiful sample problem from ISI MStat PSB 2007 Problem 7. It's a very simple problem which relies very much on conditioning; if you don't take it seriously, you will make things complicated. Fun to think about, go for it !!

Problem– ISI MStat PSB 2007 Problem 7


Let \(X\) and \(Y\) be i.i.d. exponentially distributed random variables with mean \(\lambda >0 \). Define \(Z\) by :

\( Z = \begin{cases} 1 & \text{if } X <Y \\ 0 & \text{otherwise} \end{cases} \)

Find the conditional mean, \( E(X|Z=1) \).

Prerequisites


Conditional Distribution

Bayes Theorem

Exponential Distribution

Solution :

This is a very simple but elegant problem describing a unique and efficient technique for solving a class of problems which may seem analytically difficult.

Here, for \(X\), \(Y\) and \(Z\) as defined in the question, let's first find out what we need.

Sometimes, breaking a seemingly complex problem into simpler sub-problems makes our way towards the final solution easier. In this problem, the simpler sub-problems which I think would help us are: "What is the value of \(P(X<Y)\) (or equivalently \(P(Z=1)\)) ?", "What is the pdf of \(X|X<Y\) (or equivalently \(X|Z=1\)) ?" and finally "What is the conditional mean \(E(X|Z=1)\) ?". We will address these questions one by one.

For the very first question, the answer is relatively simple, and I leave it as an exercise !! The probability value which one will find, if done correctly, is \( \frac{1}{2}\). Verify it; only then move forward !!

The 2nd question is the most vital and beautiful part of the problem. We generally do this kind of problem using the general definition of conditional probability, which you can obviously try, but you will face some difficulties, which can easily be avoided by using the continuous form of Bayes' rule, which we are not often encouraged to use !! I don't really know why, though !

Let's find the conditional CDF of \(X|Z=1\),

\( P(X \le x|Z=1) = \int^x_0 f_{X|X<Y}(t)\, dt, \quad x>0, \)

where \( f_{X|X<Y}(\cdot)\) is the conditional pdf which we are interested in. So now we can use Bayes' rule on \(f_{X|X<Y}(x)\); we have,

\( f_{X|X<Y}(x) = \frac{P(Z=1|X=x)f_X(x)}{P(Z=1)} = \frac{P(Y>x)f_X(x)}{P(X<Y)} = \frac{e^{-\frac{x}{\lambda}} \cdot \frac{1}{\lambda}e^{-\frac{x}{\lambda}}}{\frac{1}{2}} = \frac{2}{\lambda}e^{-\frac{2x}{\lambda}} \)

Plugging this into the CDF form, we can easily verify that \(X|Z=1 \sim expo(\frac{\lambda}{2}) \), i.e. exponential with mean \(\frac{\lambda}{2}\). (We can't say this directly from the pdf because pdfs are not unique; can you give such an example ? Think about it !)

So, now that we have successfully answered the first 2 questions, it is easy to answer the last and final one: since \(X|Z=1 \sim expo(\frac{\lambda}{2}) \), its mean is

\(E(X|Z=1)=\frac{\lambda}{2}.\)

Hence the solution concludes.
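Before moving on, a short simulation sketch (Python with numpy; the mean \(\lambda\) is an illustrative assumption) to verify the conditional mean:

```python
import numpy as np

# Sketch: X, Y i.i.d. exponential with mean lam; conditional on X < Y,
# the mean of X should be lam / 2.
rng = np.random.default_rng(5)
lam = 3.0                                 # illustrative mean
n = 2_000_000

x = rng.exponential(scale=lam, size=n)
y = rng.exponential(scale=lam, size=n)

print(x[x < y].mean(), lam / 2)           # both should be close
```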


Food For Thought

Let's pose an interesting problem before concluding.

There are \(k+1\) machines in a shop, all engaged in the mass production of an item. The \(i^{th}\) machine produces defectives with probability \(\frac{i}{k}\), \(i=0,1,2,\ldots,k\). A machine is selected at random and then the items it produces are repeatedly sampled. If the first \(n\) products are all defective, show that the conditional probability that the \((n+1)^{th}\) sampled product will also be defective is approximately equal to \( \frac{n+1}{n+2}\) when \(k\) is large.

Can you show it? Give it a try !!



ISI MStat PSB 2008 Problem 8 | Bivariate Normal Distribution

This is a very beautiful sample problem from ISI MStat PSB 2008 Problem 8. It's a very simple problem based on the bivariate normal distribution, which again teaches us that observing the right thing makes a seemingly laborious problem beautiful. Fun to think about, go for it !!

Problem– ISI MStat PSB 2008 Problem 8


Let \( \vec{Y} = (Y_1,Y_2)’ \) have the bivariate normal distribution, \( N_2( \vec{0}, \sum ) \),

where, \(\sum\)= \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_2\sigma_1 & \sigma^2 \end{pmatrix} ;

Obtain the mean ad variance of \( U= \vec{Y’} {\sum}^{-1}\vec{Y} – \frac{Y_1^2}{\sigma^2} \) .

Prerequisites


Bivariate Normal

Conditional Distribution of Normal

Chi-Squared Distribution

Solution :

This is a very simple and cute problem; all the labour reduces once you see what you need to see !

Remember the pdf of \(N_2( \vec{0}, \Sigma)\) ?

Isn't \( \vec{Y}'\Sigma^{-1}\vec{Y}\) (up to the factor \(-\frac{1}{2}\)) the exponent of \(e\) in the pdf of the bivariate normal ?

So, we can say \(\vec{Y}'\Sigma^{-1}\vec{Y} \sim {\chi_2}^2 \). Can we ?? Verify it !!

Also, clearly \( \frac{Y_1^2}{\sigma_1^2} \sim {\chi_1}^2 \), since \(Y_1\) follows a univariate normal with variance \(\sigma_1^2\).

So, the expectation is easy to find by accumulating the above deductions; I'm leaving it as an exercise.

Calculating the variance may seem a laborious job at first, but now let's imagine the pdf of the conditional distribution of \( Y_2 |Y_1=y_1 \): what is the exponent of \(e\) in this pdf ?? Exactly \(-\frac{1}{2}\big(\vec{y}'{\Sigma}^{-1}\vec{y} - \frac{y_1^2}{\sigma_1^2}\big)\), i.e. \(-\frac{1}{2}U\) evaluated at the observed point, right !!

And also, \( U \sim \chi_1^2 \). Now for the last piece of subtle deduction: claim that \(U\) and \( \frac{Y_1^2}{\sigma_1^2} \) are independently distributed. Can you argue why ?? Go ahead. So, \( U+ \frac{Y_1^2}{\sigma_1^2} \sim \chi_2^2 \).

So, \( Var( U + \frac{Y_1^2}{\sigma_1^2})= Var( U) + Var( \frac{Y_1^2}{\sigma_1^2}) \)

\( \Rightarrow Var(U)= 4-2=2 \), since the variance of a random variable following \(\chi_n^2\) is \(2n\).

Hence the solution concludes.
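A simulation sketch (Python with numpy; the parameters \(\sigma_1,\sigma_2,\rho\) are illustrative assumptions) verifying that \(U\) has mean 1 and variance 2:

```python
import numpy as np

# Sketch: Y ~ N_2(0, Sigma); U = Y' Sigma^{-1} Y - Y_1^2 / sigma_1^2
# should be chi-squared with 1 d.f., hence mean 1 and variance 2.
rng = np.random.default_rng(11)
s1, s2, rho = 1.5, 2.0, 0.6               # illustrative parameters
Sigma = np.array([[s1**2, rho * s1 * s2],
                  [rho * s1 * s2, s2**2]])
Sinv = np.linalg.inv(Sigma)

Y = rng.multivariate_normal([0, 0], Sigma, size=1_000_000)
U = np.einsum('ij,jk,ik->i', Y, Sinv, Y) - Y[:, 0]**2 / s1**2

print(U.mean(), U.var())                  # should be close to 1 and 2
```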


Food For Thought

Before leaving, let's broaden our minds and deal with the multivariate normal !

Let \(\vec{X}\) be a 4×1 random vector such that \( \vec{X} \sim N_4(\vec{\mu}, \Sigma ) \), where \(\Sigma\) is a positive definite matrix. Can you show that,

\( P( f_{\vec{X}}(\vec{x}) \ge c) = \begin{cases} 0 & c \ge \frac{1}{4\pi^2\sqrt{|\Sigma|}} \\ 1-(\frac{k+2}{2})e^{-\frac{k}{2}} & c < \frac{1}{4\pi^2\sqrt{|\Sigma|}} \end{cases}\)

where \( k=-2\ln(4\pi^2 c \sqrt{|\Sigma|}) \) ?

Keep you thoughts alive !!



ISI MStat PSB 2013 Problem 5 | Simple Random Sampling

This is a sample problem from ISI MStat PSB 2013 Problem 5. It is based on the simple random sampling model and finding unbiased estimates of the population size. But think over the "Food for Thought"; any kind of discussion will be appreciated. Give it a try!

Problem– ISI MStat PSB 2013 Problem 5


A box has an unknown number of tickets serially numbered 1, 2, …, N. Two tickets are drawn using simple random sampling without replacement (SRSWOR) from the box. If X and Y are the numbers on the tickets and Z = max(X, Y), show that

(a) Z is not unbiased for N.

(b) \( aX+ bY+ c \) is unbiased for N if and only if \(a+b=2 \) and \( c=-1 \).

Prerequisites


Naive Probability

Counting principles

Unbiased estimators

Simple random sampling

Solution :

For this problem, first let us find the pmf of Z, where we will need some counting techniques.

Since we are drawing tickets at random and not replacing a drawn ticket after each draw (SRSWOR), this is clearly about choosing two numbers from the set of N elements {1, …, N}. So, the number of possible samples of size 2 that can be drawn from the population of N units is \( {N \choose 2}\).

Now, Z is defined here as the maximum of the two chosen numbers, so the possible values of Z are 2, 3, …, N.

Now let's assume that Z = k; we just need to find the possible pairs such that k is the maximum of the two. In other words, if k is the maximum of the drawn numbers, what are the possible values the other number can take ? Well, it's simple: the other ticket can carry any number less than k, and there are k-1 such numbers. So there are (k-1) pairs where the maximum numbered ticket is k (we are not concerned with the ordering of the two observations).

So, the pmf of Z = max(X, Y) is \( P(Z=k) = \begin{cases} \frac{k-1}{{N \choose 2}} & k=2,3,\ldots,N \\ 0 & \text{otherwise} \end{cases} \)

(a) So, to check whether Z is unbiased for N, we need to compute E(Z),

\(E(Z)= \sum_{k=2}^N{k}{\frac{k-1}{{N \choose 2}}} =\frac{1}{{N \choose 2}}\sum_{k=2}^N{k(k-1)}=(\frac{2}{3}) (N+1) \).

so, \( E(Z)=\frac{2}{3} (N+1) \neq N \). Hence Z is not unbiased for the population size N.

(b) Similarly, we find the expectation of T=aX+bY+c,

\( E(T)=aE(X)+bE(Y)+c= a \sum_{i=1}^N i P(X=i) + b \sum_{j=1}^N j P(Y=j) + c,\)

now here \( P(X=i)=P(Y=i)= \frac{1}{N} \), so, \( E(T) = a \frac{N+1}{2}+ b\frac{N+1}{2}+c = (a+b) \frac{N+1}{2} +c,\)

Clearly, E(T) = N for every N, i.e. T will be unbiased for N, if and only if a+b=2 and c=-1 (compare the coefficient of N and the constant term).

Hence we are done !
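A simulation sketch (Python with numpy; \(N\) and the sample size are illustrative assumptions) checking both parts at once:

```python
import numpy as np

# Sketch: SRSWOR of size 2 from {1, ..., N}, realised by drawing two
# independent uniforms and discarding ties. E(max) should be 2(N+1)/3,
# while X + Y - 1 (a = b = 1, c = -1) should be unbiased for N.
rng = np.random.default_rng(13)
N = 50
n = 1_000_000

x = rng.integers(1, N + 1, size=n)
y = rng.integers(1, N + 1, size=n)
keep = x != y                             # conditioning on distinct draws
x, y = x[keep], y[keep]

print(np.maximum(x, y).mean(), 2 * (N + 1) / 3)   # Z is biased for N
print((x + y - 1).mean(), N)                      # unbiased estimator
```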


Food For Thought

Now suppose that the numbers on the tickets are arbitrary positive integers (like, say, 220 or 284), but thankfully you know the total number of tickets, i.e. N is known. You are collecting tickets for yourself and k-1 of your friends; the number c is lucky for you and you wish to keep it in your collection, so you select the remaining k-1 tickets out of the other N-1 tickets, and you calculate a sample mean (of the collected numbers) \( \bar{y'}\). Can I claim that \(c+ (N-1)\bar{y'}\) is an unbiased estimator of the population total ? Do you know this estimator shows less variance than the conventional unbiased estimator of the population total ? Can you show that too ?? Why do you think the variance is reduced ??

By the way, do you know that in mathematics 220 and 284 are quite special ?? They are the first "amicable numbers": each is the sum of the proper divisors of the other !! So, to become amicable one needs to increase the size of one's mind and heart !! Keep increasing both !! Till then… bye.

