IIT JAM Stat Mock Test Toppers

We are really happy with the performance of our students, so we have started naming the toppers of the IIT JAM Stat Mock Tests. The toppers are listed on this leaderboard according to their performance in each mock test.

So, here goes the list:

Mock Test name	Topper's name and their scores
IIT JAM Mock Test 1 (Full)	1. Somyadipta Ghosh - 88.5% 2. Mainack Paul - 83.7% 3. Abhradiptaa Ghosh - 78.7% 4. Prabirkumar Das - 71.2% 5. Debepsita Mukherjee - 68%
IIT JAM Mock Test 2 (Full)	1. Somyadipta Ghosh - 74.2% 2. Mainack Paul - 68.2% 3. Prabirkumar Das - 58.6% 4. Saikat Kar - 57.6% 5. Debepsita Mukherjee - 49.7%
IIT JAM Mathematics Mock Test 1	1. Bidisha Ghosh - 51.4% 2. Mainack Paul - 51% 3. Somyadipta Ghosh - 50.3%
IIT JAM Mathematics Mock Test 2	1. Abhradiptaa Ghosh - 57.8% 2. Debepsita Mukherjee - 54.7% 3. Srija Mukherjee - 52.5%
IIT JAM Statistics Mock Test 1	1. Somyadipta Ghosh - 68% 2. Mainack Paul - 64% 3. Debepsita Mukherjee - 56% 4. Srija Mukherjee - 52% 5. Abhradiptaa Ghosh - 52%
IIT JAM Statistics Mock Test 2	1. Somyadipta Ghosh - 56.7% 2. Mainack Paul - 56.7%
IIT JAM Probability Mock Test 1	1. Mainack Paul - 80% 2. Anis Pakrashi - 76.7% 3. Somyadipta Ghosh - 76.7% 4. Prabirkumar Das - 73.3%
IIT JAM Probability Mock Test 2	1. Abhradiptaa Ghosh - 80% 2. Mainack Paul - 76% 3. Srija Mukherjee - 76% 4. Anis Pakrashi - 68% 5. Prabirkumar Das - 68%

These Mock Tests are part of our Cheenta Statistics Bronze Learning Path. You can learn more about it here.

Some Useful Links:

ISI MStat Entrance 2020 Problems and Solutions PSA & PSB

This post contains ISI MStat Entrance PSA and PSB 2020 Problems and Solutions that can be very helpful and resourceful for your ISI MStat Preparation.

ISI MStat Entrance 2020 Problems and Solutions - Subjective Paper

ISI MStat 2020 Problem 1

Let $f(x)=x^{2}-2x+2$. Let $L_{1}$ and $L_{2}$ be the tangents to its graph at $x=0$ and $x=2$ respectively. Find the area of the region enclosed by the graph of $f$ and the two lines $L_{1}$ and $L_{2}$.

Solution

ISI MStat 2020 Problem 2

Find the number of $3 \times 3$ matrices $A$ such that the entries of $A$ belong to the set $\mathbb{Z}$ of all integers, and such that the trace of $A^{t}A$ is $6$. ($A^{t}$ denotes the transpose of the matrix $A$.)

Solution

ISI MStat 2020 Problem 3

Consider $n$ independent and identically distributed positive random variables $X_{1}, X_{2}, \ldots, X_{n}$. Suppose $S$ is a fixed subset of $\{1,2, \ldots, n\}$ consisting of $k$ distinct elements, where $1 \leq k<n$.
(a) Compute
$$
\mathrm{E}\left[\frac{\sum_{i \in S} X_{i}}{\sum_{i=1}^{n} X_{i}}\right]
$$
(b) Assume that the $X_{i}$'s have mean $\mu$ and variance $\sigma^{2}$, $0<\sigma^{2}<\infty$. If $j \notin S$, show that the correlation between $\left(\sum_{i \in S} X_{i}\right) X_{j}$ and $\sum_{i \in S} X_{i}$ lies between $-\frac{1}{\sqrt{k+1}}$ and $\frac{1}{\sqrt{k+1}}$.

Solution

ISI MStat 2020 Problem 4

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables. Let $S_n=X_1+\cdots+X_n$. For each of the following statements, determine whether they are true or false. Give reasons in each case.

(a) If $S_n \sim Exp$ with mean $n$, then each $X_i \sim Exp$ with mean $1$.

(b) If $S_n \sim Bin(nk,p)$, then each $X_i \sim Bin(k,p)$.

Solution

ISI MStat 2020 Problem 5

Let $U_1, U_2, \ldots, U_n$ be independent and identically distributed random variables each having a uniform distribution on $(0,1)$. Let $X=\min\{U_1,U_2,\ldots,U_n\}$, $Y=\max\{U_1,U_2,\ldots,U_n\}$.

Evaluate $E[X \mid Y=y]$ and $E[Y \mid X=x]$.

Solution

ISI MStat 2020 Problem 6

Suppose individuals are classified into three categories $C_1$, $C_2$ and $C_3$. Let $p^2$, $(1-p)^2$ and $2p(1-p)$ be the respective population proportions, where $p\in(0,1)$. A random sample of $N$ individuals is selected from the population and the category of each selected individual recorded.

For $i=1,2,3$, let $X_i$ denote the number of individuals in the sample belonging to category $C_i$. Define $U=X_1+\frac{X_3}{2}$.

(a) Is $U$ sufficient for $p$? Justify your answer.

(b) Show that the mean squared error of $\frac{U}{N}$ is $\frac{p(1-p)}{2N}$.

Solution

ISI MStat 2020 Problem 7

Consider the following model:
$$
y_{i}=\beta x_{i}+\varepsilon_{i} x_{i}, \quad i=1,2, \ldots, n
$$
where $y_{i}, i=1,2, \ldots, n$ are observed; $x_{i}, i=1,2, \ldots, n$ are known positive constants and $\beta$ is an unknown parameter. The errors $\varepsilon_{1}, \varepsilon_{2}, \ldots, \varepsilon_{n}$ are independent and identically distributed random variables having the
probability density function
$$
f(u)=\frac{1}{2 \lambda} \exp \left(-\frac{|u|}{\lambda}\right),-\infty<u<\infty
$$
and $\lambda$ is an unknown parameter.
(a) Find the least squares estimator of $\beta$.
(b) Find the maximum likelihood estimator of $\beta$.

Solution

ISI MStat 2020 Problem 8

Assume that $X_{1}, \ldots, X_{n}$ is a random sample from $N(\mu, 1)$, with $\mu \in \mathbb{R}$. We want to test $H_{0}: \mu=0$ against $H_{1}: \mu=1$. For a fixed integer $m \in\{1, \ldots, n\}$, the following statistics are defined:

$$
\begin{aligned}
T_{1} &=\left(X_{1}+\ldots+X_{m}\right) / m \\
T_{2} &=\left(X_{2}+\ldots+X_{m+1}\right) / m \\
\vdots &=\vdots \\
T_{n-m+1} &=\left(X_{n-m+1}+\ldots+X_{n}\right) / m .
\end{aligned}
$$

Fix $\alpha \in(0,1)$. Consider the test

reject $H_{0}$ if $\max \{T_{i}: 1 \leq i \leq n-m+1\}>c_{m, \alpha}$.

Find a choice of $c_{m, \alpha} \in \mathbb{R}$ in terms of the standard normal distribution function $\Phi$ that ensures that the size of the test is at most $\alpha$.

Solution

ISI MStat 2020 Problem 9

A finite population has $N$ units, with $x_i$ being the value associated with the $i$-th unit, $i=1,2,\ldots,N$. Let $\bar{x}_N$ be the population mean. A statistician carries out the following experiment.

Step 1: Draw an SRSWOR of size $n$ $(1<n<N)$, call the sample $S_1$, and denote the sample mean by $\bar{X}_n$.

Step 2: Draw an SRSWR of size $m$ from $S_1$. The $x$-values of the sampled units are denoted by $\{Y_1,\ldots,Y_m\}$.

An estimator of the population mean is defined as,

$\hat{T}_m=\frac{1}{m}\sum_{i=1}^{m}Y_i$

(a) Show that $\hat{T}_m$ is an unbiased estimator of the population mean.

(b) Which of the following has lower variance: $\hat{T}_m$ or $\bar{X}_n$?

Solution

ISI MStat 2020 - Objective Paper


ISI MStat 2020 PSA Answer Key

Click on the links to learn about the detailed solution.

1. C	2. D	3. A	4. B	5. A
6. B	7. C	8. A	9. C	10. A
11. C	12. D	13. C	14. B	15. B
16. C	17. D	18. B	19. B	20. C
21. C	22. D	23. A	24. B	25. D
26. B	27. D	28. D	29. B	30. C

Please suggest changes in the comment section.

ISI MStat 2020 Probability Problems Discussion [Recorded Class]

Cheenta Statistics Department
ISI MStat and IIT JAM Training Program

Testing of Hypothesis | ISI MStat 2016 PSB Problem 9

This is a problem from the ISI MStat Entrance Examination, 2016 involving the basic idea of Type 1 error of Testing of Hypothesis but focussing on the fundamental relationship of Exponential Distribution and the Geometric Distribution.

The Problem:

Suppose $X_{1}, X_{2}, \ldots, X_{n}$ is a random sample from an exponential distribution with mean $\lambda$.

Assume that the observed data is available on $\left[X_{1}\right], \ldots,\left[X_{n}\right]$, instead of $X_{1}, \ldots, X_{n},$ where $[x]$ denotes the largest integer less than or equal to $x$.

Consider a test for $H_{0}: \lambda=1$ vs $H_{1}: \lambda>1$ which rejects $H_{0}$ when $\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .$

Given $\alpha \in(0,1),$ obtain values of $c_{n}$ such that the size of the test converges to $\alpha$ as $n \rightarrow \infty$.

Prerequisites:

(a) Testing of Hypothesis

(b) Type 1 Error

(c) Relationship of Exponential Distribution and Geometric Distribution

(d) Central Limit Theorem

Solution:

Claim: If $X$ is Exponential with rate $\lambda$ and $a>0$, then $Y=\left[\frac{X}{a}\right]$ follows a Geometric distribution with success probability $p=1-e^{-\lambda a}$.

Proof:

$Y$ is clearly discrete, taking values in the set of non-negative integers, due to the flooring. Then, for any integer $n \geq 0$ we have
$$
\begin{array}{c}
P(Y=n)=P(X \in[a n, a(n+1))) \\
=\int_{a n}^{a(n+1)} \lambda \mathrm{e}^{-\lambda x} d x=(1-p)^{n} p
\end{array}
$$
where $p=1-e^{-\lambda a} \in(0,1),$ as $\lambda>0$ and $a>0$.
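The relationship between the floored Exponential and the Geometric distribution can be checked numerically. The sketch below is only an illustration: the values `rate = 1.0` and `a = 1.0`, the function names, and the sample size are assumptions chosen for the demo, not part of the original solution.

```python
import math
import random


def floor_exponential_pmf(rate, a, n):
    """Exact P([X/a] = n) for X ~ Exponential(rate), i.e. the integral over [an, a(n+1))."""
    return math.exp(-rate * a * n) - math.exp(-rate * a * (n + 1))


def geometric_pmf(p, n):
    """P(Y = n) for a Geometric variable on {0, 1, 2, ...} with success probability p."""
    return (1 - p) ** n * p


rate, a = 1.0, 1.0                 # illustrative values (assumed)
p = 1 - math.exp(-rate * a)

# Empirical frequencies of the floored exponential draws.
rng = random.Random(0)
draws = [math.floor(rng.expovariate(rate) / a) for _ in range(100_000)]
empirical = [draws.count(n) / len(draws) for n in range(5)]

for n in range(5):
    print(n, round(empirical[n], 4), round(geometric_pmf(p, n), 4))
```

The two printed columns should agree up to simulation noise, while the exact formula matches the Geometric pmf to machine precision.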

Testing of Hypothesis

$H_{0}: \lambda=1$ vs $H_{1}: \lambda>1$

We reject $H_{0}$ when $\sum_{i=1}^{n}\left[X_{i}\right]>c_{n} .$

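Write $S_n=\sum_{i=1}^{n}\left[X_{i}\right]$. The size of the test, i.e. the probability of Type 1 error (for a simple null hypothesis), is $ \alpha_n = P(S_n > c_{n} \mid \lambda=1)$.

We want to select $c_n$ such that $\alpha_n \to \alpha$.

Under $H_0$, each $[X_i]$ is Geometric on $\{0,1,2,\ldots\}$ with $ p = 1-e^{-1} $, so it has mean $\frac{1-p}{p}$ and variance $\frac{1-p}{p^2}$, and $S_n \sim$ NBinom($n,p$).

Now, $\frac{\sqrt{n}(\frac{S_n}{n} - \frac{1-p}{p})}{\sqrt{\frac{1-p}{p^2}}} \rightarrow Z \sim N(0,1)$ in distribution, by the Central Limit Theorem.

Thus, for large $n$, $ \alpha_n = P(S_n > c_{n} \mid \lambda=1) \approx P(Z > \frac{\sqrt{n}(\frac{c_n}{n} - \frac{1-p}{p})}{\sqrt{\frac{1-p}{p^2}}})$, and we want this to equal $\alpha$.

So we set $ \frac{\sqrt{n}(\frac{c_n}{n} - \frac{1-p}{p})}{\sqrt{\frac{1-p}{p^2}}} = Z_{\alpha} $, where $Z_{\alpha}$ is the upper $\alpha$ point of $N(0,1)$.

Solving this gives $c_n = \frac{n(1-p)}{p} + Z_{\alpha}\frac{\sqrt{n(1-p)}}{p}$, where $ p = 1-e^{-1} $.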
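As a rough sanity check of the asymptotic argument, one can simulate the test under $H_0$ and compare the rejection frequency with $\alpha$. The sketch below assumes the Geometric variable is supported on $\{0,1,2,\ldots\}$, so that each $[X_i]$ has mean $\frac{1-p}{p}$ and variance $\frac{1-p}{p^2}$ with $p=1-e^{-1}$; the function names, sample size and number of replications are hypothetical choices for illustration.

```python
import math
import random
from statistics import NormalDist


def critical_value(n, alpha):
    """Normal-approximation cutoff c_n for the sum of n floored Exp(1) variables."""
    p = 1 - math.exp(-1)
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-alpha point of N(0, 1)
    return n * (1 - p) / p + z_alpha * math.sqrt(n * (1 - p)) / p


def simulated_size(n, alpha, reps=5000, seed=0):
    """Monte Carlo estimate of P(S_n > c_n) when lambda = 1."""
    rng = random.Random(seed)
    c_n = critical_value(n, alpha)
    rejections = 0
    for _ in range(reps):
        s_n = sum(math.floor(rng.expovariate(1.0)) for _ in range(n))
        rejections += s_n > c_n
    return rejections / reps


print(critical_value(200, 0.05))
print(simulated_size(200, 0.05))
```

For finite $n$ the simulated size is only approximately $\alpha$, since $S_n$ is discrete and slightly skewed; the gap shrinks as $n$ grows.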

Food for Thought

If $X \sim$ Exponential($\lambda$), then what is the distribution of $\{X\}$ (the fractional part of $X$)? This question is crucial in getting back the Exponential Distribution from the Geometric Distribution.

Rather, the food for thought, asks you how do we get Exponential Distribution from Geometric Distribution.

Stay Tuned. Stay Blessed! See you in the next post.

ISI MStat PSB 2006 Problem 8 | Bernoullian Beauty

This is a very beautiful sample problem from ISI MStat PSB 2006 Problem 8. It is based on the basic idea of Maximum Likelihood Estimators, but with a bit of thinking. Give it a thought!

Problem- ISI MStat PSB 2006 Problem 8

Let $(X_1,Y_1),......,(X_n,Y_n)$ be a random sample from the discrete distributions with joint probability

$f_{X,Y}(x,y) = \begin{cases} \frac{\theta}{4} & (x,y)=(0,0) \ and \ (1,1) \\ \frac{2-\theta}{4} & (x,y)=(0,1) \ and \ (1,0) \end{cases}$

with $0 \le \theta \le 2$. Find the maximum likelihood estimator of $\theta$.

Prerequisites

Maximum Likelihood Estimators

Indicator Random Variables

Bernoulli Trials

Solution :

This is a very beautiful problem, not very difficult, but her beauty is hidden in her simplicity. Let's explore!!

Observe that the given pmf, as it stands, does not take us anywhere, so we should think out of the box. But before going out of the box, let's collect what's in the box!

So, from the given pmf we get, $P( \ of\ getting\ pairs \ of\ form \ (1,1) \ or \ (0,0))=2\times \frac{\theta}{4}=\frac{\theta}{2}$,

Similarly, $P( \ of\ getting\ pairs \ of\ form \ (0,1) \ or \ (1,0))=2\times \frac{2-\theta}{4}=\frac{2-\theta}{2}=1-P( \ of\ getting\ pairs \ of\ form \ (1,1) \ or \ (0,0))$

So, clearly it is giving us a push towards involving Bernoulli trials, isn't it!!

So, let's treat the pairs with a match, i.e. $x=y$, as our success, and the other possibilities as failure. Then our success probability is $\frac{\theta}{2}$, where $0\le \theta \le 2$. So, if $S$ is the number of successful pairs in our given sample of size $n$, then it is evident that $S \sim Binomial(n, \frac{\theta}{2})$.

So, now it's simplified by all means, and we know the MLE of the population proportion in a binomial is the proportion of successes in the sample,

Hence, $\frac{\hat{\theta_{MLE}}}{2}= \frac{s}{n}$, where $s$ is the number of those pairs in our sample where $X_i=Y_i$.

So, $\hat{\theta_{MLE}}=\frac{2(number\ of \ pairs \ in\ the\ sample\ of \ form\ (0,0)\ or \ (1,1))}{n}$.

Hence, we are done !!
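The estimator is easy to try out on simulated data. The sketch below draws pairs from the given pmf and computes $\hat{\theta}_{MLE}$ as twice the proportion of matching pairs; the function names, the value $\theta = 0.8$ and the sample size are assumptions made only for this illustration.

```python
import random


def sample_pairs(theta, n, rng):
    """Draw n pairs (x, y) with P(0,0)=P(1,1)=theta/4 and P(0,1)=P(1,0)=(2-theta)/4."""
    pairs = []
    for _ in range(n):
        match = rng.random() < theta / 2   # success: the pair has x == y
        x = rng.randint(0, 1)              # within each group both cells are equally likely
        y = x if match else 1 - x
        pairs.append((x, y))
    return pairs


def theta_mle(pairs):
    """MLE of theta: twice the sample proportion of pairs with x == y."""
    matches = sum(1 for x, y in pairs if x == y)
    return 2 * matches / len(pairs)


rng = random.Random(1)
pairs = sample_pairs(theta=0.8, n=20_000, rng=rng)
print(theta_mle(pairs))
```

With a large sample the printed estimate should land close to the value of $\theta$ used to generate the data.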

Food For Thought

Say, $X$ and $Y$ are two independent exponential random variables with means $\mu$ and $\lambda$ respectively. But you observe two other variables, $Z$ and $W$, such that $Z=\min(X,Y)$ and $W$ takes the value $1$ when $Z=X$ and $0$ otherwise. Can you find the MLEs of the parameters?

Give it a try !!

Outstanding Statistics Program with Applications

How to roll a Dice by tossing a Coin ? Cheenta Statistics Department

How can you roll a dice by tossing a coin? Can you use your probability knowledge? Use your conditioning skills.

Suppose, you have gone to a picnic with your friends. You have planned to play the physical version of the Snake and Ladder game. You found out that you have lost your dice.

The shit just became real!

Now, you have an unbiased coin in your wallet / purse. You know Probability.

Aapna Time Aayega

starts playing in the background. :p

Can you simulate the dice from the coin?

Of course, you know chances better than others. :3

Take a coin.

Toss it 3 times. Record the outcomes.

HHH = Number 1

HHT = Number 2

HTH = Number 3

HTT = Number 4

THH = Number 5

THT = Number 6

TTH = Reject it, don't count the toss and toss again

TTT = Reject it, don't count the toss and toss again

Voila done!

What is the probability of HHH in this experiment?

Let X be the outcome in the restricted experiment as shown.

How is this experiment different from the actual experiment?

This experiment is conditioning on the event A = {HHH, HHT, HTH, HTT, THH, THT}.

$P( X = HHH) = P (Y = HHH \mid Y \in A ) = \frac{P (Y = HHH)}{P (Y \in A)} = \frac{1/8}{6/8} = \frac{1}{6}$, where $Y$ denotes the outcome of three unrestricted tosses.
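The scheme above is a small rejection sampler, and it can be sketched in a few lines. The mapping of toss sequences to faces follows the table in this post; the function name `roll_die_with_coin`, the seed and the number of simulated rolls are assumptions for the demo.

```python
import random
from collections import Counter

# Mapping from accepted three-toss outcomes to die faces, as listed in the post.
FACES = {"HHH": 1, "HHT": 2, "HTH": 3, "HTT": 4, "THH": 5, "THT": 6}


def roll_die_with_coin(rng):
    """Toss a fair coin three times; retry on TTH or TTT, otherwise return the face."""
    while True:
        tosses = "".join(rng.choice("HT") for _ in range(3))
        if tosses in FACES:
            return FACES[tosses]


rng = random.Random(42)
counts = Counter(roll_die_with_coin(rng) for _ in range(60_000))
print(sorted(counts.items()))
```

Each face should appear in roughly one sixth of the rolls. On average the loop needs $\frac{8}{6}$ rounds of three tosses per roll, which is the price of the rejection step.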

Beautiful right?

Can you generalize this idea?

Food for thought


Some Useful Links:

Books for ISI MStat Entrance Exam

How to Prepare for ISI MStat Entrance Exam

ISI MStat and IIT JAM Stat Problems and Solutions

Cheenta Statistics Program for ISI MStat and IIT JAM Stat

Simple Linear Regression - Playlist on YouTube

Bayes' in-sanity || Cheenta Probability Series

One of the most controversial approaches to statistics, this post mainly deals with the fundamental objections to Bayesian methods and Bayesian school of thinking. Turning to the Bayesian crank, Fisher put forward a vehement objection towards Bayesian Inference, describing it as "fallacious rubbish".

However, ironically enough, it’s interesting to note that Fisher’s greatest statistical failure, fiducialism, was essentially an attempt to “enjoy the Bayesian omelette without breaking any Bayesian eggs" !

Ronald Fisher - Objections to Bayesian theory — Ronald Fisher

Inductive Logic

An inductive logic is a logic of evidential support. In a deductive logic, the premises of a valid deductive argument logically entail the conclusion, where logical entailment means that every logically possible state of affairs that makes the premises true must make the conclusion true as well. Thus, the premises of a valid deductive argument provide total support for the conclusion. An inductive logic extends this idea to weaker arguments. In a good inductive argument, the truth of the premises provides some degree of support for the truth of the conclusion, where this degree-of-support might be measured via some numerical scale.

If a logic of good inductive arguments is to be of any real value, the measure of support it articulates should be up to the task. Presumably, the logic should at least satisfy the following condition:

Criterion of Adequacy (CoA):
The logic should make it likely (as a matter of logic) that as evidence accumulates, the total body of true evidence claims will eventually come to indicate, via the logic’s measure of support, that false hypotheses are probably false and that true hypotheses are probably true.

One practical example of an easy inductive inference is the following:

" Every bird in a random sample of 3200 birds is black. This strongly supports the following conclusion: All birds are black. "

This kind of argument is often called an induction by enumeration. It is closely related to the technique of statistical estimation.

Critique of Inductive Logic

Non-trivial calculi of inductive inference are shown to be incomplete. That is, it is impossible for a calculus of inductive inference to capture all inductive truths in some domain, no matter how large, without resorting to inductive content drawn from outside that domain. Hence inductive inference cannot be characterized merely as inference that conforms with some specified calculus.
A probabilistic logic of induction is unable to separate cleanly neutral support from disfavoring evidence (or ignorance from disbelief). Thus, the use of probabilistic representations may introduce spurious results stemming from its expressive inadequacy. That such spurious results arise in the Bayesian "doomsday argument" is shown by a re-analysis that employs fragments of inductive logic able to represent evidential neutrality. Further, the improper introduction of inductive probabilities is illustrated with the "self-sampling assumption."

Objections to Bayesian Statistics

While Bayesian analysis has enjoyed notable success with many particular problems of inductive inference, it is not the one true and universal logic of induction. Some of the reasons arise at the global level through the existence of competing systems of inductive logic. Others emerge through an examination of the individual assumptions that, when combined, form the Bayesian system: that there is a real valued magnitude that expresses evidential support, that it is additive and that its treatment of logical conjunction is such that Bayes' theorem ensues.

The fundamental objections to Bayesian methods are twofold: on one hand, Bayesian methods are presented as an automatic inference engine, and this raises suspicion in anyone with applied experience. The second objection to Bayes' comes from the opposite direction and addresses the subjective strand of Bayesian inference.

Andrew Gelman, a staunch Bayesian, pens down an interesting criticism of the Bayesian ideology in the voice of a hypothetical anti-Bayesian statistician.

Here is the list of objections from a hypothetical or paradigmatic non-Bayesian; and I quote:

"Bayesian inference is a coherent mathematical theory but I don’t trust it in scientific applications. Subjective prior distributions don’t transfer well from person to person, and there’s no good objective principle for choosing a non-informative prior (even if that concept were mathematically defined, which it’s not). Where do prior distributions come from, anyway? I don’t trust them and I see no reason to recommend that other people do, just so that I can have the warm feeling of philosophical coherence. To put it another way, why should I believe your subjective prior? If I really believed it, then I could just feed you some data and ask you for your subjective posterior. That would save me a lot of effort!"

In 1986, a statistician as prominent as Brad Efron restated these concerns mathematically:

"I like unbiased estimates and I like confidence intervals that really have their advertised confidence coverage. I know that these aren’t always going to be possible, but I think the right way forward is to get as close to these goals as possible and to develop robust methods that work with minimal assumptions. The Bayesian approach—to give up even trying to approximate unbiasedness and to instead rely on stronger and stronger assumptions—that seems like the wrong way to go. When the priors I see in practice are typically just convenient conjugate forms. What a coincidence that, of all the infinite variety of priors that could be chosen, it always seems to be the normal, gamma, beta, etc., that turn out to be the right choices?"

Well that really sums up every frequentist's rant about Bayes' 😀 !

And the torrent of complaints never ceases....

Some frequentists believe that in the old days, Bayesian methods at least had the virtue of being mathematically clean. Nowadays, they all seem to be computed using Markov chain Monte Carlo, which means that, not only can you not realistically evaluate the statistical properties of the method, you can’t even be sure it’s converged, just adding one more item to the list of unverifiable (and unverified) assumptions in Bayesian belief.

As the applied statistician Andrew Ehrenberg wrote :

" Bayesianism assumes:

(a) Either a weak or uniform prior, in which case why bother?,

(b) Or a strong prior, in which case why collect new data?,

(c) Or more realistically, something in between,in which case Bayesianism always seems to duck the issue."

Many are skeptical about the newfound empirical approach of Bayesians, which always seems to rely on the assumption of "exchangeability", something that is almost impossible to obtain in practical scenarios.

Finally Peace!!!

No doubt, some of these are strong arguments worthy enough to be taken seriously.

There is an extensive literature, which sometimes seems to overwhelm that of Bayesian inference itself, on the advantages and disadvantages of Bayesian approaches. Bayesians’ contributions to this discussion have included defense (explaining how our methods reduce to classical methods as special cases, so that we can be as inoffensive as anybody if needed).

Obviously, Bayesian methods have filled many loopholes in classical statistical theory.

And always remember that you are subjected to mass-criticism only when you have done something truly remarkable walking against the tide of popular opinion.

Hence : "All Hail the iconoclasts of Statistical Theory:the Bayesians"

N.B. The above quote is mine XD

Wait for our next dose of Bayesian glorification!

Till then ,

Stay safe and cheers!

References

1."Critique of Bayesianism"- John D Norton

2."Bayesian Informal Logic and Fallacy" - Kevin Korb

3."Bayesian Analysis"- Gelman

4."Statistical Re-thinking"- Richard McElreath

Some Important Links:

ISI MStat PSB 2009 Problem 8 | How big is the Mean?

This is a very simple and regular sample problem from ISI MStat PSB 2009 Problem 8. It is based on testing the nature of the mean of the Exponential distribution. Give it a try!

Problem- ISI MStat PSB 2009 Problem 8

Let $X_1,.....,X_n$ be i.i.d. observation from the density,

$f(x)=\frac{1}{\mu}exp(-\frac{x}{\mu}) , x>0$

where $\mu >0$ is an unknown parameter.

Consider the problem of testing the hypothesis $H_o : \mu \le \mu_o$ against $H_1 : \mu > \mu_o$.

(a) Show that the test with critical region $[\bar{X} \ge \mu_o {\chi_{2n,1-\alpha}}^2/2n]$, where $ {\chi^2}_{2n,1-\alpha} $ is the $(1-\alpha)$th quantile of the ${\chi^2}_{2n}$ distribution, has size $\alpha$.

(b) Give an expression of the power in terms of the c.d.f. of the ${\chi^2}_{2n}$ distribution.

Prerequisites

Likelihood Ratio Test

Exponential Distribution

Chi-squared Distribution

Solution :

This problem is quite regular and simple. From the given form of the hypotheses, it is almost clear that using Neyman-Pearson can land you in trouble. So, let's go for something more general, that is, Likelihood Ratio Testing.

Hence, the Likelihood function of $\mu$ for the given sample is,

$L(\mu | \vec{X})=(\frac{1}{\mu})^n exp(-\frac{\sum_{i=1}^n X_i}{\mu}) , \mu>0$, also observe that the sample mean $\bar{X}$ is the MLE of $\mu$.

So, the Likelihood Ratio statistic is,

$\lambda(\vec{x})=\frac{\sup_{\mu \le \mu_o}L(\mu |\vec{x})}{\sup_\mu L(\mu |\vec{x})} \\ =\begin{cases} 1 & \mu_o \ge \bar{X} \\ \frac{L(\mu_o|\vec{x})}{L(\bar{X}|\vec{x})} & \mu_o < \bar{X} \end{cases} $

So, our test function is ,

$\phi(\vec{x})=\begin{cases} 1 & \lambda(\vec{x})<k \\ 0 & otherwise \end{cases}$.

We reject $H_o$ at size $\alpha$ when $\phi(\vec{x})=1$, for some $k$ such that $E_{H_o}(\phi) \le \alpha$.

Hence, for $\bar{X} > \mu_o$, $\lambda(\vec{x}) < k \\ \Rightarrow L(\mu_o|\vec{x})<kL(\bar{X}|\vec{x}) \\ \Rightarrow -n\ln \mu_o -\frac{1}{\mu_o}\sum_{i=1}^n X_i < \ln k -n \ln \bar{X} -n \\ \Rightarrow n \ln \bar{X}-\frac{n\bar{X}}{\mu_o} < K^* $,

for some constant $K^*$.

Let $g(\bar{x})=n\ln \bar{x} -\frac{n\bar{x}}{\mu_o}$, and observe that $g$ is a decreasing function of $\bar{x}$ for $\bar{x} \ge \mu_o$.

Hence, there exists a $c$ such that for $\bar{x} \ge c $ we have $g(\bar{x}) < K^*$. See the figure.

So, the critical region of the test is of form $\bar{X} \ge c$, for some $c$ such that,

$P_{H_o}(\bar{X} \ge c)=\alpha $, for some $0 \le \alpha \le 1$, where $\alpha$ is the size of the test.

Now, our task is to find $c$, and for that observe that if $X \sim Exponential(\theta)$ with mean $\theta$, then $\frac{2X}{\theta} \sim {\chi^2}_2$.

Hence, in this problem, since the $X_i$'s are i.i.d. $Exponential(\mu)$, we get $\frac{2n\bar{X}}{\mu} \sim {\chi^2}_{2n}$, and at $\mu=\mu_o$ we have,

$P_{H_o}(\bar{X} \ge c)=\alpha \\ \Rightarrow P_{H_o}(\frac{2n\bar{X}}{\mu_o} \ge \frac{2nc}{\mu_o})=\alpha \\ \Rightarrow P({\chi^2}_{2n} \ge \frac{2nc}{\mu_o})=\alpha $,

which gives $c=\frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}$.

Hence, the rejection region is indeed $[\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}]$.

Hence Proved !

(b) Now, we know that the power of the test is,

$\beta(\mu)= E_{\mu}(\phi) \\ = P_{\mu}(\lambda(\vec{x})<k)=P_{\mu}(\bar{X} \ge \frac{\mu_o {\chi^2}_{2n;1-\alpha}}{2n}) \\ = P({\chi^2}_{2n} \ge \frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}) = 1-F_{2n}(\frac{\mu_o}{\mu}{\chi^2}_{2n;1-\alpha}) $,

where $F_{2n}$ is the c.d.f. of the ${\chi^2}_{2n}$ distribution.

Hence, the power of the test is expressed through the c.d.f. of a chi-squared distribution.
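The size and power expressions can be evaluated numerically without any special library, because for even degrees of freedom $2n$ the chi-squared tail has the closed form $P({\chi^2}_{2n} \ge x)=e^{-x/2}\sum_{j=0}^{n-1}\frac{(x/2)^j}{j!}$. The sketch below uses that identity together with a simple bisection for the quantile; the function names, the bisection bracket and the values $n=10$, $\alpha=0.05$, $\mu_o=1$ are assumptions for illustration, and the bracket is only meant for moderate $n$.

```python
import math


def chi2_sf_even(x, n):
    """P(chi^2_{2n} >= x), using the finite Poisson-sum form valid for even df 2n."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= half / j
        total += term
    return math.exp(-half) * total


def chi2_quantile_even(prob, n):
    """Value q with P(chi^2_{2n} <= q) = prob, found by bisection (moderate n assumed)."""
    lo, hi = 0.0, 2 * n + 20 * math.sqrt(n) + 100
    for _ in range(200):
        mid = (lo + hi) / 2
        if 1 - chi2_sf_even(mid, n) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def power(mu, mu0, n, alpha):
    """Power at mu of the test that rejects when X-bar >= mu0 * q / (2n)."""
    q = chi2_quantile_even(1 - alpha, n)
    return chi2_sf_even(mu0 / mu * q, n)


for mu in (1.0, 1.5, 2.0, 3.0):
    print(mu, power(mu, mu0=1.0, n=10, alpha=0.05))
```

At $\mu=\mu_o$ the function returns $\alpha$, and it increases with $\mu$, as expected for a one-sided test of this form.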

Food For Thought

Can you use any other testing procedure to conduct this test ?

Think about it !!


ISI MStat PSB 2009 Problem 4 | Polarized to Normal

This is a very beautiful sample problem from ISI MStat PSB 2009 Problem 4. It is based on the idea of Polar Transformations, but needs a good deal of observation to realize that. Give it a try!

Problem- ISI MStat PSB 2009 Problem 4

Let $R$ and $\theta$ be independent and non-negative random variables such that $R^2 \sim {\chi_2}^2 $ and $\theta \sim U(0,2\pi)$. Fix $\theta_o \in (0,2\pi)$. Find the distribution of $R\sin(\theta+\theta_o)$.

Prerequisites

Convolution

Polar Transformation

Normal Distribution

Solution :

This problem may get nasty if one tries to find the required distribution by the so-called CDF method. It's better to observe a bit before moving forward!! Recall how we derive the probability distribution of the sample variance of a sample from a normal population?

Yes, you are thinking right, we need to use the Polar Transformation!!

But, before transforming, let's make some modifications to reduce future complications.

Given, $\theta \sim U(0,2\pi)$ and $\theta_o $ is some fixed number in $(0,2\pi)$, so, let $Z=\theta+\theta_o \sim U(\theta_o,2\pi +\theta_o)$.

Hence, we need to find the distribution of $R\sin Z$. Now, since $R^2 \sim {\chi_2}^2$ gives $R$ the density $r \, exp(-\frac{r^2}{2})$ on $r>0$, and $R$ and $Z$ are independent, the joint pdf of $R$ and $Z$ is,

$f_{R,Z}(r,z)=\frac{r}{2\pi}exp(-\frac{r^2}{2}) \ \ r>0, \theta_o \le z \le 2\pi +\theta_o $

Now, let the transformation be $(R,Z) \to (X,Y)$,

$X=R\cos Z \\ Y=R\sin Z$, Also, here $X,Y \in \mathbb{R}$

Hence, $R^2=X^2+Y^2 \\ Z= \tan^{-1} (\frac{Y}{X}) $

Hence, verify the Jacobian of the transformation $J(\frac{r,z}{x,y})=\frac{1}{r}$.

Hence, the joint pdf of $X$ and $Y$ is,

$f_{X,Y}(x,y)=f_{R,Z}(\sqrt{x^2+y^2}, \tan^{-1}(\frac{y}{x})) J(\frac{r,z}{x,y}) \\ =\frac{1}{2\pi}exp(-\frac{x^2+y^2}{2})$ , $x,y \in \mathbb{R}$.

Yeah, Now it is looking familiar right !!

Since we need the distribution of $Y=R\sin Z=R\sin(\theta+\theta_o)$, we integrate $f_{X,Y}$ with respect to $x$ over the real line, and we end up with the conclusion that,

$R\sin(\theta+\theta_o) \sim N(0,1)$. Hence, We are done !!
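A quick simulation supports this conclusion. The sketch below uses the fact that a ${\chi_2}^2$ variable is exponential with mean $2$; the function name, the choice $\theta_o = 1$, the seed and the sample size are assumptions for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance


def sample_r_sin(theta0, rng):
    """One draw of R*sin(theta + theta0) with R^2 ~ chi^2_2 and theta ~ U(0, 2*pi)."""
    r = math.sqrt(rng.expovariate(0.5))    # chi^2_2 is exponential with mean 2 (rate 1/2)
    theta = rng.uniform(0.0, 2 * math.pi)
    return r * math.sin(theta + theta0)


rng = random.Random(7)
theta0 = 1.0                               # any fixed angle in (0, 2*pi); assumed for the demo
draws = [sample_r_sin(theta0, rng) for _ in range(100_000)]

# Compare the first two moments and one CDF value with those of N(0, 1).
print(fmean(draws), pvariance(draws))
empirical_cdf_at_1 = sum(d <= 1.0 for d in draws) / len(draws)
print(empirical_cdf_at_1, NormalDist().cdf(1.0))
```

The sample mean and variance should be near $0$ and $1$, and the empirical CDF at $1$ should be near $\Phi(1)$, whatever fixed $\theta_o$ is used.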

Food For Thought

From the above solution, the distribution of $R\cos(\theta+\theta_o)$ is also determinable right !! Can you go further investigating the occurrence pattern of $\tan(\theta+\theta_o)$ ?? $R$ and $\theta$ are the same variables as defined in the question.

Give it a try !!


ISI MStat PSB 2008 Problem 7 | Finding the Distribution of a Random Variable

This is a very beautiful sample problem from ISI MStat PSB 2008 Problem 7 based on finding the distribution of a random variable . Let's give it a try !!

Problem- ISI MStat PSB 2008 Problem 7

Let $ X$ and $ Y$ be exponential random variables with parameters 1 and 2 respectively. Another random variable $ Z$ is defined as follows.

A coin, with probability $p$ of Heads (and probability $1-p$ of Tails), is tossed. Define $ Z$ by $ Z=\begin{cases} X & , \text { if the coin turns Heads } \\ Y & , \text { if the coin turns Tails } \end{cases} $
Find $ P(1 \leq Z \leq 2)$.

Prerequisites

Cumulative Distribution Function

Exponential Distribution

Solution :

Let $ F_{i} $ be the CDF of $i$, for $i=X,Y,Z$. Then we have,

$ F_{Z}(z) = P(Z \le z) = P( Z \le z \mid \text{coin turns Heads} )P(\text{coin turns Heads}) + P( Z \le z \mid \text{coin turns Tails} ) P( \text{coin turns Tails}) $

$ = P( X \le z)p + P(Y \le z ) (1-p) = F_{X}(z)p+F_{Y}(z) (1-p) $

Therefore, the pdf of $Z$ is given by $ f_{Z}(z)= pf_{X}(z)+(1-p)f_{Y}(z) $, where $ f_{X}$ and $f_{Y} $ are the pdfs of $X$ and $Y$ respectively.

So , $ P(1 \leq Z \leq 2) = \int_{1}^{2} \{pe^{-z} + (1-p) 2e^{-2z}\} dz = p \frac{e-1}{e^2} +(1-p) \frac{e^2-1}{e^4} $
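The closed form can be compared with a direct simulation of the coin-and-exponential mechanism. The sketch below reads "parameters 1 and 2" as rates, which matches the densities $e^{-z}$ and $2e^{-2z}$ used in the integral; the function names, the value $p = 0.3$, the seed and the number of replications are assumptions for the demo.

```python
import math
import random


def prob_between_1_and_2(p):
    """Closed form of P(1 <= Z <= 2) for the mixture p*Exp(rate 1) + (1-p)*Exp(rate 2)."""
    return p * (math.e - 1) / math.e**2 + (1 - p) * (math.e**2 - 1) / math.e**4


def simulate(p, reps=200_000, seed=3):
    """Monte Carlo estimate: toss the coin, then draw from X (Heads) or Y (Tails)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        z = rng.expovariate(1.0) if rng.random() < p else rng.expovariate(2.0)
        hits += 1.0 <= z <= 2.0
    return hits / reps


p = 0.3
print(prob_between_1_and_2(p), simulate(p))
```

The two printed numbers should agree to about two decimal places, and setting $p=1$ or $p=0$ recovers the pure exponential probabilities $e^{-1}-e^{-2}$ and $e^{-2}-e^{-4}$.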

Food For Thought

Find the distribution function of $ K=\frac{X}{Y} $ and then find $ \lim_{K \to \infty} P(K >1 ) $


ISI MStat PSB 2008 Problem 2 | Definite integral as the limit of the Riemann sum

This is a very beautiful sample problem from ISI MStat PSB 2008 Problem 2 based on definite integral as the limit of the Riemann sum . Let's give it a try !!

Problem- ISI MStat PSB 2008 Problem 2

For $ k \geq 1,$ let $ a_{k}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{m=1}^{kn} \exp \left(-\frac{1}{2} \frac{m^{2}}{n^{2}}\right) $

Find $ \lim_{k \rightarrow \infty} a_{k} $ .

Prerequisites

Integration

Gamma function

Definite integral as the limit of the Riemann sum

Solution :

$ a_{k}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{m=1}^{kn} \exp \left(-\frac{1}{2} \frac{m^{2}}{n^{2}}\right) = \int_{0}^{k} e^{\frac{-y^2}{2}} dy $ (for details, see Definite integral as the limit of the Riemann sum).

Therefore, $ \lim_{k \to \infty} a_{k}= \int_{0}^{ \infty} e^{\frac{-y^2}{2}} dy $ ----(1). Let $ \frac{y^2}{2}=z \Rightarrow dy= \frac{dz}{\sqrt{2z}} $.

Substituting, we get $ \int_{0}^{ \infty} z^{\frac{1}{2} -1} e^{-z} \frac{1}{\sqrt{2}} dz =\frac{ \Gamma(\frac{1}{2}) }{\sqrt{2}} = \sqrt{\frac{\pi}{2}} $.
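The limit can also be checked numerically by evaluating the Riemann sum for a large but finite $n$ and a few values of $k$. This is only a sketch: the function name `a_k_partial` and the choice $n = 2000$ are assumptions, and a finite $n$ gives an approximation to $a_k$, not its exact value.

```python
import math


def a_k_partial(k, n):
    """Finite-n Riemann sum (1/n) * sum_{m=1}^{kn} exp(-m^2 / (2 n^2))."""
    return sum(math.exp(-0.5 * (m / n) ** 2) for m in range(1, k * n + 1)) / n


for k in (1, 2, 4, 8):
    print(k, a_k_partial(k, 2000))

print(math.sqrt(math.pi / 2))   # the claimed limit of a_k as k grows
```

The values increase with $k$ and settle very quickly, since the integrand $e^{-y^2/2}$ is negligible beyond $y \approx 6$.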

Statistical Insight

Let $ X \sim N(0,1) $ i.e X is a standard normal random variable then,

$ Y=|X| $, called the folded Normal, has pdf $ f_{Y}(y)= \begin{cases} \frac{2}{\sqrt{2 \pi }} e^{\frac{-y^2}{2}} & , y>0 \\ 0 &, otherwise \end{cases} $ . (Verify!)

So, from (1) we can say that $ \int_{0}^{ \infty} e^{\frac{-y^2}{2}} dy = \frac{\sqrt{2 \pi }}{2} \int_{0}^{ \infty} f_{Y}(y) dy $

$ =\frac{\sqrt{2 \pi }}{2} \times 1 = \sqrt{\frac{\pi}{2}} $ (as $f_{Y}$ is the pdf of the folded Normal distribution, it integrates to $1$).

Food For Thought

Find the same when $ a_{k}=\lim_{n \rightarrow \infty} \frac{1}{n} \sum_{m=1}^{kn} {(\frac{m}{n})}^{5} \exp \left(-\frac{1}{2} \frac{m}{n}\right) $.