Get inspired by the success stories of our students in IIT JAM MS, ISI  MStat, CMI MSc DS.  Learn More

# Two-Faced Problem | ISI MStat 2013 PSB Problem 1

This post has a two-faced problem, which gives you both an analytical and statistical insight into ISI MStat 2013 PSB Problem 1.

## Problem

If $$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}, b_{1}<b_{2}<\ldots \ldots < b_{n}$$ and also $$\sum_{i=1}^{m}|a_i - x|=\sum_{j=1}^{n}|b_j - x|$$, where
$$x$$ is any real number then prove that $$a_i=b_i$$ for all $$i$$ and $$n=m$$.

### Prerquisites

• Non - Differentiability of $$|x - a|$$ at $$x = a$$.
• $$f(x)$$ is non differentiable on set $$A$$ and $$g(x)$$ is non differentiable on set $$B$$, then $$f(x) + g(x)$$ is non differentiable on $$A \cup B$$.
• Minimization of $$\sum_{i=1}^{m}|a_i - x|$$ at $$x = median$$. In other words, the Median minimizes the Sum of Absolute Deviation.

## Solution I (Calculus)

$$f(x) = \sum_{i=1}^{m}|a_i - x|$$ has non - differentiability at $$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$$, since each of $$|a_i - x|$$ is non differentiable at $$a_i$$. The result follows from the first two results in the pre-requisites.

Now, since, $$\sum_{i=1}^{m}|a_i - x|=\sum_{j=1}^{n}|b_j - x|$$, the set of non differentiability points of $$f(x) = \sum_{i=1}^{m}|a_i - x|$$ is $$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$$ while the set of non differentiability points of $$g(x) = \sum_{i=1}^{n}|b_j - x|$$ is $$b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$$. $$f(x) = g(x)$$.

Thus, {$$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$$} = {$$b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$$}.

Therfore, $$a_i=b_i$$ for all $$i$$ and $$n=m$$.

## Solution II ( Statistical Perspective )

The Median Minimizes the Sum of Absolute Deviation has really elegant proof here.

We will see the idea from a different perspective with respect to the diagrams and use the idea to prove the actual problem.

For the set { 1 < 2 < 3 < 4 < 5 }, the median is 3. Hence, the $$f(x) = \sum_{i=1}^{5}|i - x|$$ is minized at $$x = 3$$.

But for the set { 1 < 2 < 3 < 4 < 5 < 6}, the median is $$\frac{3+4}{2}$$. Rather the $$f(x) = \sum_{i=1}^{6}|i - x|$$ is minized at $$3 \leq x \leq 4$$. Let's for the time being call $$\{3,4\}$$ as the median of { 1 < 2 < 3 < 4 < 5 < 6}.

Now, the next idea is simple yet elegant.

$$A_0$$ = {$$a_{1}<a_{2}<\ldots \ldots \ldots<a_{m}$$}; $$B_0$$ = {$$b_{1}< b_{2}<\ldots \ldots \ldots<b_{n}$$}.

$$A$$ = $$A_0$$; $$B$$ = $$B_0$$.

$$f_A(x) = \sum_{a \in A}|a - x|$$ = $$g_B(x) = \sum_{b \in B}|b - x|$$.

Step 1 : Take the minimum of $$f(x)$$ and $$g(x)$$. They must be equal.

Step 2 : Therefore, median of $$A$$ = median o f $$B$$.

Step 3 : $$A = A -$$ median of $$A$$; $$B = B -$$ median of $$B$$. This means, we delete the equal elements in both $$A$$ and $$B$$.

Step 4 : Go to Step 1. Stop if $$A = \phi = B$$

This will converge only if $$A_0 = B_0$$.

Hence, $$a_i=b_i$$ for all $$i$$ and $$n=m$$.

Thus, we gave two different views of the same problem.

Hope you enjoyed it.

Stay Tuned!