Prime Number Algorithms with R.
I've recently been read up on prime number theory, which has inspired me to investigate some prime number algorithms myself.
First of all, the most obvious way to solve for prime numbers is the brute force method. I.e if you are evaluating whether a number (n) is prime, check if it can be factorized by integers from 2 to n-1. If not, then it is a prime. Needless to say, this is the most inefficient method of finding prime numbers.
A more efficient algorithm for finding primes is the Sieve of Eratosthenes. The way that it works is very intuitive. You basically only evaluate primality on numbers less than the square root of n, which allows you to rule out all multiples of those prime numbers. After you have done this, what you are left with is only the prime numbers up to the number n.
So why only up to the square root of n? How can you be sure that integers above the square root of n are prime numbers if you don't check them individually?
The explanation is quite intuitive:
Suppose you are evaluating primes up to 100. The square root of 100 is 10, so you would only need to rule out multiples of numbers up to 10. Say the number "37" is left over, and you are skeptical of whether it is indeed a prime. If 37 wasn't a prime, then it would be a multiple of 2 or more prime numbers, of which one has to be a prime less than 10 (Since two numbers greater than 10 multiplied would be > 100). However, you have already ruled out multiples of primes less than ten, so "37" must be a prime!
Here is an animation of the sieve in action:
As you can see, it is much more efficient than the brute force method which requires all numbers to be evaluated individually for primality.
Here is the code that I wrote in R to replicate the Sieve of Eratosthenes (O(sqrt(n)).
In contrast, here is code that I wrote using the brute force method. O(n), much less efficient
Ill update this post further as I learn more prime algorithms!
First of all, the most obvious way to solve for prime numbers is the brute force method. I.e if you are evaluating whether a number (n) is prime, check if it can be factorized by integers from 2 to n-1. If not, then it is a prime. Needless to say, this is the most inefficient method of finding prime numbers.
A more efficient algorithm for finding primes is the Sieve of Eratosthenes. The way that it works is very intuitive. You basically only evaluate primality on numbers less than the square root of n, which allows you to rule out all multiples of those prime numbers. After you have done this, what you are left with is only the prime numbers up to the number n.
So why only up to the square root of n? How can you be sure that integers above the square root of n are prime numbers if you don't check them individually?
The explanation is quite intuitive:
Suppose you are evaluating primes up to 100. The square root of 100 is 10, so you would only need to rule out multiples of numbers up to 10. Say the number "37" is left over, and you are skeptical of whether it is indeed a prime. If 37 wasn't a prime, then it would be a multiple of 2 or more prime numbers, of which one has to be a prime less than 10 (Since two numbers greater than 10 multiplied would be > 100). However, you have already ruled out multiples of primes less than ten, so "37" must be a prime!
Here is an animation of the sieve in action:
As you can see, it is much more efficient than the brute force method which requires all numbers to be evaluated individually for primality.
Here is the code that I wrote in R to replicate the Sieve of Eratosthenes (O(sqrt(n)).
test <- function(x) { primes <- matrix(c(rep(1,x),c(1:x)),ncol=2) # generate a table with a column of 1's and a column of consecutive integers primes[1,1]<-0 # mark the number 1 as not being prime for (n in primes[seq(1,floor(sqrt(x)),1),2]) { # for the values from 1 to x... if (primes[n,1]==1) { # if the number hasn't been marked as prime yet... primes[seq.int(2*n,x,n),1]<-0 # mark all multiples of that number as not prime } } return(primes) # all numbers left over are prime }
In contrast, here is code that I wrote using the brute force method. O(n), much less efficient
findprime <- function(x) { k = 0 y = c(1:x) #define a set of integers from 1 to x for (n in y) { if (x %% n == 0) k<-k+1 #if the remainder is not 0, then add 1 to k } if (k==2) return(c("yes",k)) #if prime, then only 2 numbers should have remained 0 (1 and x), otherwise the number is not prime else if (k!=2) return(c("no",k)) }
Ill update this post further as I learn more prime algorithms!
Comments
Post a Comment