Sunday, May 4, 2025

Prime Counting - A Self-correcting Approximation

                                                                                                                           By Vishnu Vinjamuri

Abstract:

The famous prime-counting function, known as p(x) = x/log x  is widely used for most practical purposes but underestimates the number of primes. Several other formulas are available that provide more accurate estimates of the number of primes less than a given positive integer. In this article, a unique approximation is presented using the formula  p(x) = x/log x as the base, which results in a very high degree of accuracy while remaining computationally simple.

Introduction:

The prime counting function is denoted by   p(x)  and has several approximations.

The most popular and widely used such approximation is

      

- Equation (1)


Where x is any given real number greater than 2 and p(x)  is the number of primes less than or equal to n.

It is well known that the above approximation runs asymptotic to the actual total number of primes and is a lower bound approximation.

Many other approximations using this format are of the form:


  - Equation (2)


Where B is a constant, whose value is 1.08366 as proposed first by Legendre and later changed to 1.0. From Wikipedia (source unknown), we see further modifications to these constants between different values of n, for better approximation and these constants seem to be working well.

Pierre Dusart suggested formulas using similar and more complex constants, with lower and upper bounds.

In 2024, Timothy Ganesan proposed the below formula for prime numbers which is a better approximation than 1.1.

 

- Equation (3)


The various approximations coming under the family of Equation (2), provide a very good estimate of the number of primes, but the constants need to be adjusted as the number n increases. Several other equations exist which are complex in nature.

These equations and varying constants make the computation more complex.

In this article, let us approximate the number of primes by looking at the very Equation (1) and closely examining the distribution of primes in large intervals, aiming to keep the solution simple.

From now onward, let us call Equation (1) the Prime Counting Function.

All logarithms in this article are natural logarithms.

Distribution of Primes:

For a large number n, the Prime Counting Function is

For 2n, Equation (1) becomes







Which means,


- Equation (4)

This means, for very large values of n, the number of primes less than n gets nearly doubled as n gets doubled.

For a very large n, equation (4) can also be re-written as:


- Equation (5)




which means the number of primes under n is twice the numbers of primes under n/2, for a large n.

But, as we know, log n progresses at a snail's pace compared to the increase in n. Does this really work?

Let us do some calculations.

For n=106, we get 




For n=103, we get 


This shows that the ratio is tending toward 2.0 as n increases.

Below is the chart showing the ratio of p (2n)/ p (n) as increases, and its asymptotic behavior with respect to 2.0.


Rearranging Prime Counting Function:

Let us rewrite a simple and wild equation for counting the number of primes as below: - Equation (6)

Sounds crazy, doesn’t it?!

From Equation (5), we get the below approximation for a large n:

Substituting various terms, Equation (6) can be rewritten as: 

 
Which can be further rewritten as:

 
Which can be further simplified as:

 

- Equation (7)




Let us call this equation “Modified Approximation” further in this article.

Results:

In the results section, let us first present the results for the Prime Counting Function which are available, and how they compare against the actual number of primes known. The actual number of primes are taken from web sources.

The findings are presented in the below tables. The fractions are ignored as they do not make any sense when counting the number of integers.

Table 1 – No. of primes using Prime Counting Function (Known values)

n

π(n) actual

π(n) from Prime Counting Function

 % error

1.00E+01

4

4

0.00%

1.00E+02

25

21

    -16.00%

1.00E+03

168

144

-14.29%

1.00E+04

1229

1085

-11.72%

1.00E+05

9592

8685

-9.46%

1.00E+06

78498

72382

-7.79%

1.00E+07

664579

620420

-6.64%

1.00E+08

5761455

5428681

-5.78%

1.00E+09

50847534

48254942

-5.10%

1.00E+10

455052511

434294481

-4.56%

1.00E+11

4118054813

3948131653

-4.13%

1.00E+12

37607912018

36191206825

-3.77%

1.00E+13

346065536839

334072678387

-3.47%

1.00E+14

3204941750802

3102103442166

-3.21%

1.00E+15

29844570422669

28952965460216

-2.99%

1.00E+16

279238341033925

271434051189532

-2.79%

1.00E+17

2623557157654233

2554673422960304

-2.63%

1.00E+18

24739954287740860

24127471216847323

-2.48%

1.00E+19

234057667276344607

228576043106974646

-2.34%

1.00E+20

2220819602560918840

2171472409516259138

-2.22%

1.00E+21

21127269486018731928

20680689614440563221

-2.11%

1.00E+22

201467286689315906290

197406582683296285295

-2.02%

1.00E+23

       1925320391606803968923 

1888236877840225337613

-1.93%

1.00E+24

18435599767349200867866

18095603412635492818797

-1.84%

1.00E+25

176846309399143769411680

173717792761300731060451

-1.77%

1.00E+26

1699246750872437141327603

1670363391935583952504341

-1.70%

1.00E+27

16352460426841680446427399

16084980811231549172264034

-1.64%

1.00E+28

157589269275973410412739598

155105172108304224161117471

-1.58%

1.00E+29

1520698109714272166094258063

1497567178976730440176306616

-1.52%


Table 2 – No. of primes using Modified Approximation (Presented in this article)

n

π(n) actual

π(n) from Modified Approximation

 % error

1.00E+01

4

7

75.00%

1.00E+02

25

27

8.00%

1.00E+03

168

168

0.00%

1.00E+04

1229

1218

-0.90%

1.00E+05

9592

9520

-0.76%

1.00E+06

78498

78117

-0.49%

1.00E+07

664579

662240

-0.35%

1.00E+08

5761455

5747073

-0.25%

1.00E+09

50847534

50759753

-0.17%

1.00E+10

455052511

454513484

-0.12%

1.00E+11

4118054813

4114760691

-0.08%

1.00E+12

37607912018

37588078365

-0.05%

1.00E+13

346065536839

345951503625

-0.03%

1.00E+14

3204941750802

3204354872614

-0.02%

1.00E+15

29844570422669

29842386052038

-0.01%

1.00E+16

279238341033925

279241231332249

0.001%

1.00E+17

2623557157654233

2623752533459659

0.007%

1.00E+18

24739954287740860

24743023355249531

0.012%

1.00E+19

234057667276344607

234095725093423118

0.016%

1.00E+20

2220819602560918840

2221247473634630597

0.019%

1.00E+21

21127269486018731928

21131835823913237324

0.022%

1.00E+22

201467286689315906290

201514517487389703395

0.023%

1.00E+23

       1925320391606803968923 

1925799106811863614750

0.025%

1.00E+24

18435599767349200867866

18440385448469505739705

0.026%

1.00E+25

176846309399143769411680

176893690363053571874257

0.027%

1.00E+26

1699246750872437141327603

1699712592577485318689677

0.027%

1.00E+27

16352460426841680446427399

16357017243895836441592453

0.028%

1.00E+28

157589269275973410412739598

157633676378313981721209496

0.028%

1.00E+29

1520698109714272166094258063

1521129660905129306326196776

0.028%



Inferences:

General Inferences:

1. The error percentage from Table 2 is much smaller than that from Table 1, except for very small values of n. See the chart below.


 2. The error from the Modified Approximation changes sign between 1015 and 1016.  See below chart.


3. The error on the positive side for the Modified Approximation seems to be mildly increasing with n, but the error is still 0.03% (0.002838) to be precise. As the approximation is derived from fundamentals and uses the Prime Counting Function as its base, the error is not expected to be much bigger for much higher values, but one can investigate.

4. The approximated value of number of primes under 1000 is 168.984 from the Modified Approximation which magically matches with the number of primes under 1000 when the fraction is eliminated.

Inferences on Probability:

1. The probability of a number being a prime, from the Prime Counting Function is, 


Equation (8)



From the Modified Approximation, the probability is,



Equation (9)



It can be easily observed that Equation (9) leads to better probability.

Now let us test this against the one-millionth prime number 15485863.

The probability of a randomly selected number less than or equal to the one-millionth prime number from the Prime Counting Function is, 1/log (15485863) = 6.04%.

The probability of a randomly selected number less than or equal to the one-millionth prime number from the Modified Approximation is, 


 Which is 6.56% higher than the probability coming from the Prime Counting Function.

The actual probability is 1000000/15584863=6.46%, which is very close to what we have estimated using the Modified Approximation.

From this, it can be observed that for smaller values of n, where the density of prime numbers is still high, the Modified Approximation provides a far better estimate of the probability of a number being prime, than the Prime Counting Function. 

2.  Semiprimes are the numbers which are products of two primes. There is no specific term when a semiprime is equal to twice a prime. Let us call them even semiprimes.

If x is the number of primes less than number n, they will have x number of corresponding even semiprimes less than 2n.

For a large number n, the no. of even semiprimes less than n is approximately equal to the no. of primes less than n/2.

The Prime Counting Function and its variants and modifications can be thus extended to approximate the number of semiprimes also.

From this, it can be inferred that the probability of a randomly selected even semiprime also gets influenced by the prime counting function chosen. The Modified Approximation in this article provides a better probability of a number being a prime, than the Prime Counting Function.

3. It can be calculated and found that the Modified Approximation provides a confidence of more than 95% at a value of n as low as 101, whereas for the Prime Counting Function the confidence level crosses 95% between 109 and 1010.

Conclusions:

1. A unique approximation is given for the Prime Counting Function in the form of 


 which gives much better approximation than the Prime Counting Function.

2.The probability of a randomly selected number n being a prime from the proposed approximation is


which provides more confidence than the probability from the Prime Counting Function.

3. The term “Self-correcting” used in the title of this article comes from the fact that the proposed equation for the number of primes less the number n uses the Prime Counting Function for n/2.

4. Compared to several other formulas, this formula is computationally less intensive as it uses just the log function a couple of times, and no constant is changing as n increases.



3 comments:

  1. This is such an erudite piece of research work in pure mathematics which can be put in serious business usage.This approach can be suitably adapted in making business forecast through a probabilistic approach when less data points are available to make a regression analysis.

    ReplyDelete
    Replies
    1. Thanks Mr. Prakash. It's so kind of you.

      Delete
    2. Beautiful Mr Prakash . Thanks Sir

      Delete