This post has a two-faced problem, which gives you both an analytical and statistical insight into ISI MStat 2013 PSB Problem 1.

## Problem

If \(a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}, b_{1}<b_{2}<\ldots \ldots < b_{n}\) and also \(\sum_{i=1}^{m}|a_i – x|=\sum_{j=1}^{n}|b_j – x|\), where

\(x\) is any real number then prove that \(a_i=b_i\) for all \(i\) and \(n=m\).

### Prerquisites

- Non – Differentiability of \(|x – a| \) at \( x = a\).
- \( f(x)\) is non differentiable on set \(A\) and \(g(x)\) is non differentiable on set \(B\), then \( f(x) + g(x) \) is non differentiable on \( A \cup B\).
- Minimization of \(\sum_{i=1}^{m}|a_i – x|\) at \( x = median \). In other words, the
**Median minimizes the Sum of Absolute Deviation**.

## Solution I (Calculus)

\( f(x) = \sum_{i=1}^{m}|a_i – x|\) has non – differentiability at \(a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}\), since each of \(|a_i – x|\) is non differentiable at \( a_i\). The result follows from the first two results in the pre-requisites.

Now, since, \(\sum_{i=1}^{m}|a_i – x|=\sum_{j=1}^{n}|b_j – x|\), the set of non differentiability points of \( f(x) = \sum_{i=1}^{m}|a_i – x|\) is \(a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}\) while the set of non differentiability points of \( g(x) = \sum_{i=1}^{n}|b_j – x|\) is \(b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}\). \( f(x) = g(x)\).

Thus, {\(a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}\)} = {\(b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}\)}.

Therfore, \(a_i=b_i\) for all \(i\) and \(n=m\).

## Solution II ( Statistical Perspective )

The Median Minimizes the Sum of Absolute Deviation has really elegant proof here.

We will see the idea from a different perspective with respect to the diagrams and use the idea to prove the actual problem.

For the set { 1 < 2 < 3 < 4 < 5 }, the median is 3. Hence, the \( f(x) = \sum_{i=1}^{5}|i – x|\) is minized at \( x = 3 \).

But for the set { 1 < 2 < 3 < 4 < 5 < 6}, the median is \( \frac{3+4}{2}\). Rather the \( f(x) = \sum_{i=1}^{6}|i – x|\) is minized at \( 3 \leq x \leq 4 \). Let’s for the time being call \(\{3,4\}\) as the median of { 1 < 2 < 3 < 4 < 5 < 6}.

Now, the next idea is simple yet elegant.

\(A_0\) = {\(a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}\)}; \(B_0\) = {\(b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}\)}.

\(A\) = \(A_0\); \(B\) = \(B_0\).

\( f_A(x) = \sum_{a \in A}|a – x|\) = \( g_B(x) = \sum_{b \in B}|b – x|\).

**Step 1** : Take the minimum of \(f(x)\) and \(g(x)\). They must be equal.

**Step 2** : Therefore, median of \(A\) = median o f \(B\).

**Step 3** : \(A = A -\) median of \(A \); \(B = B – \) median of \(B\). This means, we delete the equal elements in both \(A\) and \(B\).

**Step 4** : Go to **Step 1**. Stop if \( A = \phi = B\)

This will converge only if \(A_0 = B_0 \).

Hence, \(a_i=b_i\) for all \(i\) and \(n=m\).

Thus, we gave two different views of the same problem.

Hope you enjoyed it.

Stay Tuned!

Google