This is a problem from the ISI MStat Entrance Examination, 2019. It involves finding the MLE of a population size and investigating its unbiasedness.

## The Problem:

Suppose an SRSWOR of size \(n\) has been drawn from a population labelled \(1,2,3,\ldots,N \), where the population size \(N\) is unknown.

(a) Find the maximum likelihood estimator \( \hat{N} \) of \(N\).

(b) Find the probability mass function of \( \hat{N} \).

(c) Show that \( \frac{n+1}{n}\hat{N} -1\) is an unbiased estimator of \(N\).

## Prerequisites:

(a) Simple random sampling (SRSWR/SRSWOR)

(b) Maximum likelihood estimator and how to find it.

(c) Unbiasedness of an estimator.

(d) Identities involving binomial coefficients. (For this, you may refer to any standard text on combinatorics, such as those by R. A. Brualdi or Miklós Bóna.)

## Solution:

(a) Let \(X_1,X_2,\ldots,X_n \) be the sample drawn. Under the SRSWOR scheme, every sample \(s\) of size \(n\) has selection probability \(P(s)=\frac{1}{{N \choose n}} \).

Since \(X_1,\ldots,X_n \in \{1,2,\ldots,N \} \), the maximum among them, that is, the \(n\)th order statistic \(X_{(n)} \), is always less than or equal to \(N\). So, viewed as a function of \(N\), the likelihood equals \( \frac{1}{{N \choose n}} \) when \(N \ge X_{(n)} \) and \(0\) otherwise.

Now, \( {N \choose n} \) is an increasing function of \(N\) for \(N \ge n\). So \( {X_{(n)} \choose n} \le {N \choose n } \), and on taking reciprocals we have \(P(s) \le \frac{1}{ {X_{(n)} \choose n}} \), with equality exactly when \(N = X_{(n)} \). The likelihood is therefore maximized at the smallest admissible value of \(N\), and the maximum likelihood estimator of \(N\) is \( \hat{N} = X_{(n)} \).
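The argument in (a) can be checked numerically. The following is a minimal sketch, not part of the original solution: the sample `[3, 8, 5]` and the scan range for candidate values of \(N\) are arbitrary choices made only for illustration.

```python
from math import comb

def likelihood(N, sample):
    """Probability of drawing `sample` (distinct labels) by SRSWOR from 1..N."""
    n = len(sample)
    if N < max(sample):
        # A population smaller than the largest observed label is impossible.
        return 0.0
    return 1 / comb(N, n)

sample = [3, 8, 5]           # hypothetical sample with n = 3
candidates = range(1, 31)    # candidate values of N to scan (assumed range)

# The likelihood is 0 below max(sample) and strictly decreasing above it,
# so the maximizer is the sample maximum.
mle = max(candidates, key=lambda N: likelihood(N, sample))
print(mle, max(sample))
```

Scanning a finite range is enough here because the likelihood only decreases as \(N\) grows beyond the sample maximum.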

(b) We need to find the pmf of \( \hat{N} \).

See that \(P(\hat{N}=m) = \frac{ {m \choose n} - {m-1 \choose n } }{ {N \choose n }} \), where \(m=n,n+1,\ldots,N \).

Can you convince yourself why?
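One way to see it: the samples whose maximum is at most \(m\) are exactly the \( {m \choose n} \) subsets of \( \{1,\ldots,m\} \), so subtracting those with maximum at most \(m-1\) leaves the samples with maximum exactly \(m\). The sketch below compares the formula with a brute-force enumeration of all samples. The function names and the values \(N=7\), \(n=3\) are assumptions chosen for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb

def pmf_by_enumeration(N, n):
    """Exact pmf of the sample maximum, found by listing every SRSWOR sample."""
    counts = Counter(max(s) for s in combinations(range(1, N + 1), n))
    total = comb(N, n)
    return {m: Fraction(c, total) for m, c in counts.items()}

def pmf_by_formula(N, n):
    """pmf from the formula [C(m, n) - C(m-1, n)] / C(N, n), m = n, ..., N."""
    total = comb(N, n)
    return {m: Fraction(comb(m, n) - comb(m - 1, n), total)
            for m in range(n, N + 1)}

N, n = 7, 3  # small illustrative values
enumerated = pmf_by_enumeration(N, n)
formula = pmf_by_formula(N, n)
print(enumerated == formula)
```

Using `Fraction` keeps every probability exact, so the two dictionaries can be compared with `==` without worrying about rounding.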

(c) We use a well-known identity, **Pascal’s Identity**, \( {m \choose n} - {m-1 \choose n} = {m-1 \choose n-1} \), to rewrite the distribution of \( \hat{N}=X_{(n)} \) more compactly:

We write \( P(\hat{N}=m) = \frac{ {m-1 \choose n-1}}{ {N \choose n} } \), for \(m=n,n+1,\ldots,N \).

Thus, we have :

\( \begin{align}
E(\hat{N})&=\sum_{m=n}^N m P(\hat{N}=m) \\
&=\frac{n}{\binom{N}{n}}\sum_{m=n}^N \frac{m}{n}\binom{m-1}{n-1} \\
&=\frac{n}{\binom{N}{n}}\sum_{m=n}^N \binom{m}{n}
\end{align} \)

Also, use the **Hockey Stick Identity** to see that \( \sum_{m=n}^{N} {m \choose n} = {N+1 \choose n+1} \)
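A quick numerical sanity check of the identity is sketched below. It is not a proof, and the helper name `hockey_stick_holds` and the upper limit of 40 are arbitrary choices for illustration.

```python
from math import comb

def hockey_stick_holds(n, N):
    """Check sum_{m=n}^{N} C(m, n) == C(N+1, n+1) for one pair (n, N)."""
    return sum(comb(m, n) for m in range(n, N + 1)) == comb(N + 1, n + 1)

# Test every pair with 1 <= n <= N < 40 (range chosen arbitrarily).
all_ok = all(hockey_stick_holds(n, N)
             for N in range(1, 40)
             for n in range(1, N + 1))
print(all_ok)
```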

So, we have \( E(\hat{N})=\frac{n}{ {N \choose n}} {N+1 \choose n+1}=\frac{n(N+1)}{n+1} \).

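Thus, we get \( E\left( \frac{n+1}{n}\hat{N} -1\right) = \frac{n+1}{n}\cdot\frac{n(N+1)}{n+1} - 1 = N \), so \( \frac{n+1}{n}\hat{N} -1 \) is an unbiased estimator of \(N\).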
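The result can also be confirmed with exact arithmetic for particular values. The following sketch assumes the pmf \( P(\hat{N}=m) = {m-1 \choose n-1} / {N \choose n} \) derived above. The function names and the choice \(N=10\), \(n=3\) are illustrative only.

```python
from fractions import Fraction
from math import comb

def expected_mle(N, n):
    """E(N_hat) = sum over m of m * C(m-1, n-1) / C(N, n), computed exactly."""
    total = comb(N, n)
    return sum(Fraction(m * comb(m - 1, n - 1), total)
               for m in range(n, N + 1))

def expected_corrected(N, n):
    """Expectation of the corrected estimator (n+1)/n * N_hat - 1."""
    return Fraction(n + 1, n) * expected_mle(N, n) - 1

N, n = 10, 3  # illustrative values
print(expected_mle(N, n))        # equals n(N+1)/(n+1), so the MLE is biased downward
print(expected_corrected(N, n))  # equals N
```

The first value falls short of \(N\), which shows that \( \hat{N} = X_{(n)} \) itself underestimates \(N\) on average; the correction removes exactly that bias.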

## Useful Exercise:

Look up the many proofs of the **Hockey Stick Identity**. Make sure you learn at least the combinatorial proof and an alternative proof that visualizes the identity on Pascal’s Triangle.
