Select Page

This post has a two-faced problem, which gives you both an analytical and statistical insight into ISI MStat 2013 PSB Problem 1.

## Problem

If $a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}, b_{1}<b_{2}<\ldots \ldots < b_{n}$ and also $\sum_{i=1}^{m}|a_i – x|=\sum_{j=1}^{n}|b_j – x|$, where
$x$ is any real number then prove that $a_i=b_i$ for all $i$ and $n=m$.

### Prerquisites

• Non – Differentiability of $|x – a|$ at $x = a$.
• $f(x)$ is non differentiable on set $A$ and $g(x)$ is non differentiable on set $B$, then $f(x) + g(x)$ is non differentiable on $A \cup B$.
• Minimization of $\sum_{i=1}^{m}|a_i – x|$ at $x = median$. In other words, the Median minimizes the Sum of Absolute Deviation.

## Solution I (Calculus)

$f(x) = \sum_{i=1}^{m}|a_i – x|$ has non – differentiability at $a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$, since each of $|a_i – x|$ is non differentiable at $a_i$. The result follows from the first two results in the pre-requisites.

Now, since, $\sum_{i=1}^{m}|a_i – x|=\sum_{j=1}^{n}|b_j – x|$, the set of non differentiability points of $f(x) = \sum_{i=1}^{m}|a_i – x|$ is $a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$ while the set of non differentiability points of $g(x) = \sum_{i=1}^{n}|b_j – x|$ is $b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$. $f(x) = g(x)$.

Thus, {$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$} = {$b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$}.

Therfore, $a_i=b_i$ for all $i$ and $n=m$.

## Solution II ( Statistical Perspective )

The Median Minimizes the Sum of Absolute Deviation has really elegant proof here.

We will see the idea from a different perspective with respect to the diagrams and use the idea to prove the actual problem.

For the set { 1 < 2 < 3 < 4 < 5 }, the median is 3. Hence, the $f(x) = \sum_{i=1}^{5}|i – x|$ is minized at $x = 3$.

But for the set { 1 < 2 < 3 < 4 < 5 < 6}, the median is $\frac{3+4}{2}$. Rather the $f(x) = \sum_{i=1}^{6}|i – x|$ is minized at $3 \leq x \leq 4$. Let’s for the time being call $\{3,4\}$ as the median of { 1 < 2 < 3 < 4 < 5 < 6}.

Now, the next idea is simple yet elegant.

$A_0$ = {$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$}; $B_0$ = {$b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$}.

$A$ = $A_0$; $B$ = $B_0$.

$f_A(x) = \sum_{a \in A}|a – x|$ = $g_B(x) = \sum_{b \in B}|b – x|$.

Step 1 : Take the minimum of $f(x)$ and $g(x)$. They must be equal.

Step 2 : Therefore, median of $A$ = median o f $B$.

Step 3 : $A = A -$ median of $A$; $B = B –$ median of $B$. This means, we delete the equal elements in both $A$ and $B$.

Step 4 : Go to Step 1. Stop if $A = \phi = B$

This will converge only if $A_0 = B_0$.

Hence, $a_i=b_i$ for all $i$ and $n=m$.

Thus, we gave two different views of the same problem.

Hope you enjoyed it.

Stay Tuned!