Mean Square Error | ISI MStat 2019 PSB Problem 5

This post gives a detailed solution to ISI M.Stat 2019 PSB Problem 5, which is based on the calculation of Mean Square Error, with a tinge of simulation and code.

Problem

Suppose \(X_{1}, X_{2}, \ldots, X_{n}\) are independent random variables such that \(
\mathrm{P}\left(X_{i}=1\right)=p_{i}=1-\mathrm{P}\left(X_{i}=0\right)
\)
where \(p_{1}, p_{2}, \ldots, p_{n} \in(0,1)\) are all distinct and unknown. Consider \(X=\sum_{i=1}^{n} X_{i}\) and another random variable \(Y\) which is distributed as Binomial \((n, \bar{p}),\) where \(\bar{p}=\frac{1}{n} \sum_{i=1}^{n} p_{i} .\) Between \(X\) and \(Y,\) which is a better estimator of \(\sum_{i=1}^{n} p_{i}\) in terms of their respective mean squared errors?

Solution

Unbiasedness

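Since \(E(X_{i}) = p_{i}\), linearity of expectation gives \(E(X) = E(\sum_{i=1}^{n} X_{i}) = \sum_{i=1}^{n} p_{i}\).

\(Y \sim\) Binomial \((n, \bar{p})\), so

\( E(Y) = n\bar{p} = \sum_{i=1}^{n} p_{i}\).

Hence both \(X\) and \(Y\) are unbiased for \(\sum_{i=1}^{n} p_{i}\).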

Mean Square Error

If \(T\) is unbiased for \(\theta\), then \(MSE(T) = Var(T)\).
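
To see why, here is a short sketch of the standard decomposition, writing \(Bias(T) = E(T) - \theta\) (a notation assumed here, not used elsewhere in the post):

\( MSE(T) = E(T - \theta)^2 = E(T - E(T))^2 + (E(T) - \theta)^2 = Var(T) + Bias(T)^2 \)

The cross term \(2(E(T) - \theta)E(T - E(T))\) vanishes because \(E(T - E(T)) = 0\). When \(T\) is unbiased, \(Bias(T) = 0\), so only the variance remains.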

\( MSE(\sum_{i=1}^{n} X_{i}) = Var(\sum_{i=1}^{n} X_{i}) \overset{X_{1}, X_{2}, \ldots, X_{n} \text{ are independent}}{=} \sum_{i=1}^{n} Var(X_{i}) = \sum_{i=1}^{n} p_i(1-p_i) = \sum_{i=1}^{n} p_i - \sum_{i=1}^{n} p_i^2 \)

\( MSE(Y) = Var(Y) = n\bar{p}(1 - \bar{p}) = \sum_{i=1}^{n} p_i - \frac{(\sum_{i=1}^{n} p_i)^2}{n} \)

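Observe that \( (\sum_{i=1}^{n} p_i^2)n = (\sum_{i=1}^{n} p_i^2)(\sum_{i=1}^{n} 1) \overset{\text{Cauchy-Schwarz Inequality}}{\geq} (\sum_{i=1}^{n} p_i)^2 \), with equality only when all the \(p_i\) are equal. Since the \(p_i\) are all distinct (and \(n \geq 2\)), the inequality is in fact strict.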

It follows that \(MSE(\sum_{i=1}^{n} X_{i}) \leq MSE(Y)\).

Therefore, \(\sum_{i=1}^{n} X_{i}\) is a better estimator than \(Y\) with respect to Mean Square Error.
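
As a quick sanity check, the gap between the two MSEs can be written out explicitly. This is a worked sketch using only the two variance formulas above:

\( MSE(Y) - MSE(\sum_{i=1}^{n} X_{i}) = \sum_{i=1}^{n} p_i^2 - \frac{(\sum_{i=1}^{n} p_i)^2}{n} = \sum_{i=1}^{n} p_i^2 - n\bar{p}^2 = \sum_{i=1}^{n} (p_i - \bar{p})^2 \)

For example, with the illustrative values \(n = 2\), \(p_1 = 0.2\), \(p_2 = 0.8\) (chosen here only for the arithmetic), \(MSE(\sum_{i=1}^{n} X_{i}) = 0.16 + 0.16 = 0.32\) and \(MSE(Y) = 2(0.5)(0.5) = 0.5\). The gap is \(0.18 = (0.2 - 0.5)^2 + (0.8 - 0.5)^2\), so the more spread out the \(p_i\) are, the more \(Y\) loses.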

Let’s verify this as usual by simulation.

Computation and Simulation

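# Base R only: rbinom(N, 1, p) draws the N Bernoulli variables in one call,
# so the statip (rbern) and Metrics (mse) packages are not needed.
N = 10                 # number of Bernoulli variables
R = 1000               # number of simulation runs
p = runif(N, 0, 1)     # the "unknown" success probabilities p_1, ..., p_N
vX = numeric(R)        # simulated values of X = sum of Xi
vY = numeric(R)        # simulated values of Y
for (j in 1:R)
{
  vX[j] = sum(rbinom(N, 1, p))    # sum of Xi random variables
  vY[j] = rbinom(1, N, mean(p))   # Y random variable
}
k = sum(p)             # the quantity being estimated
# The values below are from one run; they change with p and the random draws.
mean((vX - k)^2) #MSE of Sum Xi #1.57966
mean((vY - k)^2) #MSE of Y  #2.272519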

In this run, the simulated MSE of \(\sum_{i=1}^{n} X_{i}\) (about 1.58) is smaller than that of \(Y\) (about 2.27), which agrees with the theory. I hope it helps.
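
As a cross-check on any simulated values, the exact MSEs can also be computed directly from the two variance formulas derived in the solution. The following is a minimal sketch in base R; the helper name `exact_mse` and the probabilities in `p` are assumptions made for illustration and are not part of the original code.

```r
# Exact MSEs from the closed-form variances (both estimators are unbiased).
# exact_mse is a hypothetical helper name introduced for this sketch.
exact_mse <- function(p) {
  n <- length(p)
  pbar <- mean(p)
  c(X = sum(p * (1 - p)),          # Var of the sum of independent Bernoulli(p_i)
    Y = n * pbar * (1 - pbar))     # Var of Binomial(n, pbar)
}

p <- c(0.1, 0.5, 0.9)   # illustrative probabilities, not from the post
m <- exact_mse(p)
print(m)

# The gap MSE(Y) - MSE(X) should match the spread of the p_i around their mean
gap <- unname(m["Y"] - m["X"])
print(gap)
print(sum((p - mean(p))^2))
```

For these illustrative probabilities the formulas give 0.43 for \(\sum_{i=1}^{n} X_{i}\) and 0.75 for \(Y\). Passing the simulated `p` to the same helper gives the values that the simulated MSEs should approach as the number of runs grows.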

By Srijit Mukherjee
