
This is a problem from the ISI MStat Entrance Examination, 2019, involving the MLE of the population size and investigating its unbiasedness.

Suppose an SRSWOR of size \(n\) has been drawn from a population labelled \(1,2,3,...,N \), where the population size \(N\) is unknown.

(a) Find the maximum likelihood estimator \( \hat{N} \) of \(N\).

(b) Find the probability mass function of \( \hat{N} \).

(c) Show that \( \frac{n+1}{n}\hat{N} -1\) is an unbiased estimator of \(N\).

(a) Simple random sampling (SRSWR/SRSWOR)

(b) Maximum likelihood estimator and how to find it.

(c) Unbiasedness of an estimator.

(d) Identities involving binomial coefficients. (For this, you may refer to any standard text on combinatorics, such as those by R. A. Brualdi or Miklos Bona.)

(a) Let \(X_1,X_2,...,X_n \) be the sample to be selected. In the SRSWOR scheme, the selection probability of a sample of size \(n\) is given by \(P(s)=\frac{1}{{N \choose n}} \).

Since \(X_1,...,X_n \in \{1,2,...,N \} \), their maximum, that is the \( n \)th order statistic \(X_{(n)} \), is always at most \(N\).

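Now, \( {N \choose n} \) is an increasing function of \(N\) for \(N \ge n\), so the likelihood \(P(s)=\frac{1}{{N \choose n}}\) is decreasing in \(N\), while any value \(N < X_{(n)}\) is impossible given the sample (the likelihood is zero there). Since \( {X_{(n)} \choose n} \le {N \choose n } \) for every admissible \(N\), on reciprocating we have \(P(s) \le \frac{1}{ {X_{(n)} \choose n}} \), with equality exactly when \(N = X_{(n)}\). Hence the maximum likelihood estimator of \(N\), i.e. \( \hat{N} \), is \( X_{(n)} \).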
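The argument in part (a) can be checked numerically. The following is a minimal sketch, not part of the original solution: the function names, the sample values and the search bound `search_upto` are all hypothetical choices made for illustration. It evaluates the likelihood over a grid of candidate values of \(N\) and picks the maximiser.

```python
from math import comb

def likelihood(N, sample):
    """Likelihood of N under SRSWOR: 1/C(N, n) if N >= max(sample), else 0."""
    n = len(sample)
    if N < max(sample):
        return 0.0  # the sample could not have come from {1, ..., N}
    return 1.0 / comb(N, n)

def mle_by_grid_search(sample, search_upto=200):
    """Brute-force maximiser over N = 1, ..., search_upto (assumed bound)."""
    candidates = range(1, search_upto + 1)
    return max(candidates, key=lambda N: likelihood(N, sample))

sample = [3, 11, 7, 2]  # hypothetical sample of size n = 4
print(mle_by_grid_search(sample), max(sample))
```

Under these assumptions the grid search returns the sample maximum, in line with \( \hat{N} = X_{(n)} \).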

(b) We need to find the pmf of \( \hat{N} \).

See that \(P(\hat{N}=m) = \frac{ {m \choose n} - {m-1 \choose n } }{ {N \choose n }} \) , where \(m=n,n+1,...,N \).

Can you convince yourself why?

(c) We use a well-known identity, **Pascal's Identity**, to rewrite the distribution of \( \hat{N}=X_{(n)} \) more compactly:

Since \( {m \choose n} - {m-1 \choose n} = {m-1 \choose n-1} \), we write \( P(\hat{N}=m) = \frac{ {m-1 \choose n-1}}{ {N \choose n} } \) whenever \(m=n,n+1,...,N \).
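As a sanity check on the pmf, one can enumerate every possible sample for small \(N\) and \(n\) and compare the empirical distribution of the maximum with both closed forms. This is only an illustrative sketch; the function names and the values \(N=8\), \(n=3\) are assumptions, not part of the original solution.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

def pmf_by_enumeration(N, n):
    """Exact pmf of the sample maximum, listing all C(N, n) samples."""
    counts = Counter(max(s) for s in combinations(range(1, N + 1), n))
    total = comb(N, n)
    return {m: Fraction(c, total) for m, c in counts.items()}

def pmf_difference_form(N, n):
    """(C(m, n) - C(m-1, n)) / C(N, n) for m = n, ..., N."""
    return {m: Fraction(comb(m, n) - comb(m - 1, n), comb(N, n))
            for m in range(n, N + 1)}

def pmf_pascal_form(N, n):
    """C(m-1, n-1) / C(N, n) for m = n, ..., N."""
    return {m: Fraction(comb(m - 1, n - 1), comb(N, n))
            for m in range(n, N + 1)}

N, n = 8, 3  # small hypothetical values so enumeration stays cheap
print(pmf_by_enumeration(N, n) == pmf_difference_form(N, n) == pmf_pascal_form(N, n))
```

Exact rational arithmetic via `Fraction` avoids floating-point noise when comparing the three dictionaries.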

Thus, we have:

\( \begin{align}
E(\hat{N}) &= \sum_{m=n}^N m P(\hat{N}=m) \\
&= \frac{n}{\binom{N}{n}}\sum_{m=n}^N \frac{m}{n}\binom{m-1}{n-1} \\
&= \frac{n}{\binom{N}{n}}\sum_{m=n}^N \binom{m}{n}
\end{align} \)

Also, use the **Hockey Stick Identity** to see that \( \sum_{m=n}^{N} {m \choose n} = {N+1 \choose n+1} \)
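The identity is easy to spot-check numerically before looking at a proof. A minimal sketch (the function name and the tested ranges are arbitrary choices for illustration):

```python
from math import comb

def hockey_stick_holds(n, N):
    """Check sum_{m=n}^{N} C(m, n) == C(N+1, n+1) for one pair (n, N)."""
    return sum(comb(m, n) for m in range(n, N + 1)) == comb(N + 1, n + 1)

# Spot-check a small grid of (n, N) pairs with n <= N.
print(all(hockey_stick_holds(n, N) for n in range(1, 8) for N in range(n, 30)))
```

A numerical check of this kind is not a proof, but it helps catch an off-by-one error in the summation limits.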

So, we have \( E(\hat{N})=\frac{n}{ {N \choose n}} {N+1 \choose n+1}=\frac{n(N+1)}{n+1} \).

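Thus, we get \( E\left( \frac{n+1}{n}\hat{N} -1 \right) = \frac{n+1}{n} \cdot \frac{n(N+1)}{n+1} - 1 = N \), so \( \frac{n+1}{n}\hat{N} -1 \) is an unbiased estimator of \(N\).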
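The unbiasedness result can also be checked in two ways: exactly, by summing over the pmf, and approximately, by simulation. The sketch below is illustrative only; the function names, the values \(N=50\), \(n=5\), the number of replications and the seed are all assumptions.

```python
import random
from fractions import Fraction
from math import comb

def exact_mean_of_max(N, n):
    """E(X_(n)) computed exactly from P(max = m) = C(m-1, n-1) / C(N, n)."""
    return sum(Fraction(m * comb(m - 1, n - 1), comb(N, n))
               for m in range(n, N + 1))

def simulate_corrected_estimator(N, n, reps=20000, seed=0):
    """Monte Carlo average of (n+1)/n * max - 1 over SRSWOR draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x_max = max(rng.sample(range(1, N + 1), n))  # sampling without replacement
        total += (n + 1) / n * x_max - 1
    return total / reps

N, n = 50, 5  # hypothetical population and sample sizes
mean_max = exact_mean_of_max(N, n)
print(mean_max == Fraction(n * (N + 1), n + 1))   # matches n(N+1)/(n+1)
print(Fraction(n + 1, n) * mean_max - 1 == N)     # corrected estimator has mean N
print(simulate_corrected_estimator(N, n))         # should be close to N
```

The exact check confirms the algebra for the chosen values, while the simulation shows that \( \hat{N} \) alone underestimates \(N\) on average and the correction removes that bias.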

Look up the many proofs of the **Hockey Stick Identity**. Make sure you learn at least the combinatorial proof and an alternative proof that visualizes the identity on Pascal's triangle.
