By Vishnu Vinjamuri
Abstract:
Introduction:
The prime counting function is denoted by p(x) and has several approximations.
The most popular and widely used such approximation is
- Equation (1)
Where x is any given real number greater than 2 and p(x) is the number of primes less than or equal to n.
It is well known that the above approximation runs asymptotic to the actual total number of primes and is a lower bound approximation.
Many other approximations using this format are of the form:
Where B is a constant, whose value is 1.08366 as proposed first by Legendre and later changed to 1.0. From Wikipedia (source unknown), we see further modifications to these constants between different values of n, for better approximation and these constants seem to be working well.
Pierre Dusart suggested formulas using similar and more complex constants, with lower and upper bounds.
In 2024, Timothy Ganesan proposed the below formula for prime numbers which is a better approximation than 1.1.
- Equation (3)
The various approximations coming under the family of Equation (2), provide a very good estimate of the number of primes, but the constants need to be adjusted as the number n increases. Several other equations exist which are complex in nature.
These equations and varying constants make the computation more complex.
In this article, let us approximate the number of primes by looking at the very Equation (1) and closely examining the distribution of primes in large intervals, aiming to keep the solution simple.
From now onward, let us call Equation (1) the Prime Counting Function.
All logarithms in this article are natural logarithms.
Distribution of Primes:
For a large number n, the Prime Counting Function is
For 2n, Equation (1) becomes
This means, for very large
values of n, the number of primes less than n gets nearly doubled as n gets
doubled.
For a very large n, equation (4) can also be re-written as:
- Equation (5)
which means the number of primes under n is twice the numbers of primes under n/2, for a large n.
But, as we know, log n progresses at a snail's pace compared to the increase in n. Does this really work?
Let us do some calculations.
For n=106, we get
For n=103, we get
Below is the chart showing the ratio of p (2n)/ p (n) as n increases, and its asymptotic behavior with respect to 2.0.
Rearranging Prime Counting Function:
Let us rewrite a simple and wild equation for counting the number of primes as below: - Equation (6)
Sounds crazy, doesn’t it?!
From Equation (5), we get the below approximation for a large n:
Substituting various terms, Equation (6) can be rewritten as:
Which can be further rewritten as:
Which can be further simplified as:
- Equation (7)
Let us call this equation “Modified Approximation” further in this article.
Results:
In the results section, let us first present the results for the Prime Counting Function which are available, and how they compare against the actual number of primes known. The actual number of primes are taken from web sources.
The findings are presented in the below tables. The fractions are ignored as they do not make any sense when counting the number of integers.
Table 1 – No. of primes using Prime Counting Function (Known values)
|
n |
π(n) actual |
π(n) from
Prime Counting Function |
% error |
|
1.00E+01 |
4 |
4 |
0.00% |
|
1.00E+02 |
25 |
21 |
-16.00% |
|
1.00E+03 |
168 |
144 |
-14.29% |
|
1.00E+04 |
1229 |
1085 |
-11.72% |
|
1.00E+05 |
9592 |
8685 |
-9.46% |
|
1.00E+06 |
78498 |
72382 |
-7.79% |
|
1.00E+07 |
664579 |
620420 |
-6.64% |
|
1.00E+08 |
5761455 |
5428681 |
-5.78% |
|
1.00E+09 |
50847534 |
48254942 |
-5.10% |
|
1.00E+10 |
455052511 |
434294481 |
-4.56% |
|
1.00E+11 |
4118054813 |
3948131653 |
-4.13% |
|
1.00E+12 |
37607912018 |
36191206825 |
-3.77% |
|
1.00E+13 |
346065536839 |
334072678387 |
-3.47% |
|
1.00E+14 |
3204941750802 |
3102103442166 |
-3.21% |
|
1.00E+15 |
29844570422669 |
28952965460216 |
-2.99% |
|
1.00E+16 |
279238341033925 |
271434051189532 |
-2.79% |
|
1.00E+17 |
2623557157654233 |
2554673422960304 |
-2.63% |
|
1.00E+18 |
24739954287740860 |
24127471216847323 |
-2.48% |
|
1.00E+19 |
234057667276344607 |
228576043106974646 |
-2.34% |
|
1.00E+20 |
2220819602560918840 |
2171472409516259138 |
-2.22% |
|
1.00E+21 |
21127269486018731928 |
20680689614440563221 |
-2.11% |
|
1.00E+22 |
201467286689315906290 |
197406582683296285295 |
-2.02% |
|
1.00E+23 |
1925320391606803968923 |
1888236877840225337613 |
-1.93% |
|
1.00E+24 |
18435599767349200867866 |
18095603412635492818797 |
-1.84% |
|
1.00E+25 |
176846309399143769411680 |
173717792761300731060451 |
-1.77% |
|
1.00E+26 |
1699246750872437141327603 |
1670363391935583952504341 |
-1.70% |
|
1.00E+27 |
16352460426841680446427399 |
16084980811231549172264034 |
-1.64% |
|
1.00E+28 |
157589269275973410412739598 |
155105172108304224161117471 |
-1.58% |
|
1.00E+29 |
1520698109714272166094258063 |
1497567178976730440176306616 |
-1.52% |
Table 2 – No. of primes using Modified Approximation (Presented in this article)
|
n |
π(n) actual |
π(n) from Modified Approximation |
% error |
|
1.00E+01 |
4 |
7 |
75.00% |
|
1.00E+02 |
25 |
27 |
8.00% |
|
1.00E+03 |
168 |
168 |
0.00% |
|
1.00E+04 |
1229 |
1218 |
-0.90% |
|
1.00E+05 |
9592 |
9520 |
-0.76% |
|
1.00E+06 |
78498 |
78117 |
-0.49% |
|
1.00E+07 |
664579 |
662240 |
-0.35% |
|
1.00E+08 |
5761455 |
5747073 |
-0.25% |
|
1.00E+09 |
50847534 |
50759753 |
-0.17% |
|
1.00E+10 |
455052511 |
454513484 |
-0.12% |
|
1.00E+11 |
4118054813 |
4114760691 |
-0.08% |
|
1.00E+12 |
37607912018 |
37588078365 |
-0.05% |
|
1.00E+13 |
346065536839 |
345951503625 |
-0.03% |
|
1.00E+14 |
3204941750802 |
3204354872614 |
-0.02% |
|
1.00E+15 |
29844570422669 |
29842386052038 |
-0.01% |
|
1.00E+16 |
279238341033925 |
279241231332249 |
0.001% |
|
1.00E+17 |
2623557157654233 |
2623752533459659 |
0.007% |
|
1.00E+18 |
24739954287740860 |
24743023355249531 |
0.012% |
|
1.00E+19 |
234057667276344607 |
234095725093423118 |
0.016% |
|
1.00E+20 |
2220819602560918840 |
2221247473634630597 |
0.019% |
|
1.00E+21 |
21127269486018731928 |
21131835823913237324 |
0.022% |
|
1.00E+22 |
201467286689315906290 |
201514517487389703395 |
0.023% |
|
1.00E+23 |
1925320391606803968923 |
1925799106811863614750 |
0.025% |
|
1.00E+24 |
18435599767349200867866 |
18440385448469505739705 |
0.026% |
|
1.00E+25 |
176846309399143769411680 |
176893690363053571874257 |
0.027% |
|
1.00E+26 |
1699246750872437141327603 |
1699712592577485318689677 |
0.027% |
|
1.00E+27 |
16352460426841680446427399 |
16357017243895836441592453 |
0.028% |
|
1.00E+28 |
157589269275973410412739598 |
157633676378313981721209496 |
0.028% |
|
1.00E+29 |
1520698109714272166094258063 |
1521129660905129306326196776 |
0.028% |
Inferences:
General Inferences:
1. The error percentage from Table 2 is much smaller than that from Table 1, except for very small values of n. See the chart below.
2. The error from the Modified Approximation changes sign between 1015 and 1016. See below chart.
4. The approximated value of number of primes under 1000 is 168.984 from the Modified Approximation which magically matches with the number of primes under 1000 when the fraction is eliminated.
Inferences on Probability:
1. The probability of a number being a prime, from the Prime Counting Function is,
- Equation (8)- Equation (9)
It can be easily observed that Equation (9) leads to better probability.
Now let us test this against the one-millionth prime number 15485863.
The probability of a randomly selected number less than or equal to the one-millionth prime number from the Prime Counting Function is, 1/log (15485863) = 6.04%.
The probability of a randomly selected number less than or equal to the one-millionth prime number from the Modified Approximation is,
Which is 6.56% higher than the probability coming from the Prime Counting Function.
The actual probability is 1000000/15584863=6.46%, which is very close to what we have estimated using the Modified Approximation.
From this, it can be observed that for smaller values of n, where the density of prime numbers is still high, the Modified Approximation provides a far better estimate of the probability of a number being prime, than the Prime Counting Function.
2. Semiprimes are the numbers which are products of two primes. There is no specific term when a semiprime is equal to twice a prime. Let us call them even semiprimes.
If x is the number of primes less than number n, they will have x number of corresponding even semiprimes less than 2n.
For a large number n, the no. of even semiprimes less than n is approximately equal to the no. of primes less than n/2.
The Prime Counting Function and its variants and modifications can be thus extended to approximate the number of semiprimes also.
From this, it can be inferred that the probability of a randomly selected even semiprime also gets influenced by the prime counting function chosen. The Modified Approximation in this article provides a better probability of a number being a prime, than the Prime Counting Function.
3. It can be calculated and found that the Modified Approximation provides a confidence of more than 95% at a value of n as low as 101, whereas for the Prime Counting Function the confidence level crosses 95% between 109 and 1010.
Conclusions:
1. A unique approximation is given for the Prime Counting Function in the form of
which gives much better approximation than the Prime Counting Function.
2.The probability of a randomly selected number n being a prime from the proposed approximation is
which provides more confidence than the probability from the Prime Counting Function.
3. The term “Self-correcting” used in the title of this article comes from the fact that the proposed equation for the number of primes less the number n uses the Prime Counting Function for n/2.
4. Compared to several other formulas, this formula is computationally less intensive as it uses just the log function a couple of times, and no constant is changing as n increases.
This is such an erudite piece of research work in pure mathematics which can be put in serious business usage.This approach can be suitably adapted in making business forecast through a probabilistic approach when less data points are available to make a regression analysis.
ReplyDeleteThanks Mr. Prakash. It's so kind of you.
DeleteBeautiful Mr Prakash . Thanks Sir
Delete