How Cheenta works to ensure student success?
Explore the Back-Story

# A code to find Primes - Sieve of Eratoshenes

To some extent the beauty of number theory seems to be related to the contradiction between the simplicity of the integers and the complicated structure of the primes, their building blocks. This has always attracted people.

A. Knauf from "Number theory, dynamical systems and statistical mechanics"

This quote is indeed true. If you just think about the simplicity of integers and complexity of primes, it's really horrifying. It's almost like how easy and intuitive it is to analyse big "classical objects" but extremely un-intuitive to discuss about "quatum objects".

Today's blog will be a bit special. Today we will discuss about a simple algorithm which is one of the most easiest algorithm which find primes in a certain range. This is called The Sieve of Eratosthenes.

##### Main Idea of Eratosthenes

The main idea behind this algorithm is very simple. Let's try to understand this using an example.

Suppose, you want to find all primes within $20$.

• First you write down all numbers from $2$ to $20$. The list will be $[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]$
• Then, you take $2$ and eliminate all it's multiple from the list. This will remove all evens. The remaining list will be $[2,3,5,7,9,11,13,15,17,19]$.
• After this, you take $3$ and start eliminating all it's multiple. After this step the new list is $[2,3,5,7,11,13,17,19]$
• If we go on again and again by taking the smallest untouched number of the remaining list (in this case 5), then we will filter out all possible composite numbers from the list and finally we will have all primes.

This is basically our algorithm. Below you can find the algorithm for $30$.

##### Code to perform this

The code to perform this is quite easy, specially in python, because of the filter function. Let's see the code;

n = 20
all_list = [i for i in range(2,n+1)]
factor_2_remove = filter(lambda m: m %2 !=0, all_list)
factor_2_remove = list(factor_2_remove)
factor_2_remove.append(2)
print(factor_2_remove)
#output:[3, 5, 7, 9, 11, 13, 15, 17, 19, 2]

As you can clearly see, first we have created a list containing a list from $2$ to $n$, then we use filter. This function filter out the numbers which doesn't satisfy the given. Here the condition is $m \not\equiv 0$ mod $2$. This is written as $m%2 !=0$ in python. After using this we again have to add $2$ as it also doesn't satisfy the condition so, it also is removed.

So, to find all primes within any range, we simply do this for all number, not just $2$. You can use a simple for loop for this.

n = 100
all_list = [i for i in range(2,n+1)]
for i in len(all_list):
first_ele = all_list[0]
all_list = filter(lambda m: m %first_ele !=0, all_list)
all_list = list(all_list)
all_list.append(first_ele)
print(all_list)
#Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

See!, how easy it is.. But we can do even better. Notice, here we have to run the loop $\pi(n)$ times. But we don't have to do that.

Note: $\pi(n)$ is a function, which represent number of primes in between $2$ to $n$.

Remember: If an integer $a>1$ is not divisible by any prime $p\leq \sqrt{a}$, then $a$ is of necessity a prime.

Using this we can increase the efficiency of the program. Let's try to understand this using an example.

Suppose, we want to find primes up to $100$. Then, we create a list up to $100$, i.e., $2,3,4,5,6,7,\cdots ,98,99,100$. Now, we take $2$ and remove all multiples of $2$ from the list, except $2$ itself. We again do the same for $3$, $5$,$\cdots$. But notice $\sqrt{100}=10$, so we just do this up to $10$, i.e., up to $7$. And this will generate all primes within $100$.

If we do this, now the we only have to run the loop for $\pi(n)$ times. Let's see how much the speed has increased. I will now make two functions, one which runs the loop $\pi(n)$ times and one which runs $\pi(\sqrt{n})$ times.

import math as mp
def with_just_sqrtn(n):#uses sqrt n
all_list = [i for i in range(2,n+1)]
root_n = mp.ceil(mp.sqrt(n))
for i in range(n):
first_ele = all_list[0]
all_list = filter(lambda m: m %first_ele !=0, all_list)
all_list = list(all_list)
all_list.append(first_ele)
if first_ele == root_n:
break
return all_list
print(with_just_sqrtn(100))
#output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

See, the output is same as before.

The previous one can also be written as a function.

def with_all(n):
all_list = [i for i in range(2,n+1)]
for i in range(n):
first_ele = all_list[0]
all_list = filter(lambda m: m %first_ele !=0, all_list)
all_list = list(all_list)
all_list.append(first_ele)
return all_list

Now, let's find which one takes much time. We will use $n=1,00,000$

n = 1_00_000
import time
start_t = time.time()
with_all(n)
print("Time = %s seconds"%(time.time()-start_t))
#Output: Time = 131.59189105033875 seconds

Just close to $2$ min $11$ sec. Now, let's run the one with small loop number.

start_t = time.time()
with_just_sqrtn(n)
print("Time = %s seconds"%(time.time()-start_t))
#Output: Time = 0.1107027530670166 seconds

What the !!, just 0.11 sec?

So, i hope now you can see, how much faster it is compared to the previous one.

This is all for today, I hope you have learnt something new.