By Vishnu Vinjamuri

Abstract:

The famous prime-counting function, known as p(x) = x/log x is widely used for most practical purposes but underestimates the number of primes. Several other formulas are available that provide more accurate estimates of the number of primes less than a given positive integer. In this article, a unique approximation is presented using the formula p(x) = x/log x as the base, which results in a very high degree of accuracy while remaining computationally simple.

Introduction:

The prime counting function is denoted by p(x) and has several approximations.

The most popular and widely used such approximation is

- Equation (1)

Where x is any given real number greater than 2 and p(x) is the number of primes less than or equal to n.

It is well known that the above approximation runs asymptotic to the actual total number of primes and is a lower bound approximation.

Many other approximations using this format are of the form:

- Equation (2)

Where B is a constant, whose value is 1.08366 as proposed first by Legendre and later changed to 1.0. From Wikipedia (source unknown), we see further modifications to these constants between different values of n, for better approximation and these constants seem to be working well.

Pierre Dusart suggested formulas using similar and more complex constants, with lower and upper bounds.

In 2024, Timothy Ganesan proposed the below formula for prime numbers which is a better approximation than 1.1.

- Equation (3)

The various approximations coming under the family of Equation (2), provide a very good estimate of the number of primes, but the constants need to be adjusted as the number n increases. Several other equations exist which are complex in nature.

These equations and varying constants make the computation more complex.

In this article, let us approximate the number of primes by looking at the very Equation (1) and closely examining the distribution of primes in large intervals, aiming to keep the solution simple.

From now onward, let us call Equation (1) the Prime Counting Function.

All logarithms in this article are natural logarithms.

Distribution of Primes:

For a large number n, the Prime Counting Function is

For 2n, Equation (1) becomes

Which means,

- Equation (4)

This means, for very large values of n, the number of primes less than n gets nearly doubled as n gets doubled.

For a very large n, equation (4) can also be re-written as:

- Equation (5)

which means the number of primes under n is twice the numbers of primes under n/2, for a large n.

But, as we know, log n progresses at a snail's pace compared to the increase in n. Does this really work?

Let us do some calculations.

For n=10⁶, we get

For n=10³, we get

This shows that the ratio is tending toward 2.0 as n increases.

Below is the chart showing the ratio of p (2n)/ p (n) as n increases, and its asymptotic behavior with respect to 2.0.

Rearranging Prime Counting Function:

Let us rewrite a simple and wild equation for counting the number of primes as below: - Equation (6)

Sounds crazy, doesn’t it?!

From Equation (5), we get the below approximation for a large n:

Substituting various terms, Equation (6) can be rewritten as:

Which can be further rewritten as:

Which can be further simplified as:

- Equation (7)

Let us call this equation “Modified Approximation” further in this article.

Results:

In the results section, let us first present the results for the Prime Counting Function which are available, and how they compare against the actual number of primes known. The actual number of primes are taken from web sources.

The findings are presented in the below tables. The fractions are ignored as they do not make any sense when counting the number of integers.

Table 1 – No. of primes using Prime Counting Function (Known values)

n	π(n) actual	π(n) from Prime Counting Function	% error
1.00E+01	4	4	0.00%
1.00E+02	25	21	-16.00%
1.00E+03	168	144	-14.29%
1.00E+04	1229	1085	-11.72%
1.00E+05	9592	8685	-9.46%
1.00E+06	78498	72382	-7.79%
1.00E+07	664579	620420	-6.64%
1.00E+08	5761455	5428681	-5.78%
1.00E+09	50847534	48254942	-5.10%
1.00E+10	455052511	434294481	-4.56%
1.00E+11	4118054813	3948131653	-4.13%
1.00E+12	37607912018	36191206825	-3.77%
1.00E+13	346065536839	334072678387	-3.47%
1.00E+14	3204941750802	3102103442166	-3.21%
1.00E+15	29844570422669	28952965460216	-2.99%
1.00E+16	279238341033925	271434051189532	-2.79%
1.00E+17	2623557157654233	2554673422960304	-2.63%
1.00E+18	24739954287740860	24127471216847323	-2.48%
1.00E+19	234057667276344607	228576043106974646	-2.34%
1.00E+20	2220819602560918840	2171472409516259138	-2.22%
1.00E+21	21127269486018731928	20680689614440563221	-2.11%
1.00E+22	201467286689315906290	197406582683296285295	-2.02%
1.00E+23	1925320391606803968923	1888236877840225337613	-1.93%
1.00E+24	18435599767349200867866	18095603412635492818797	-1.84%
1.00E+25	176846309399143769411680	173717792761300731060451	-1.77%
1.00E+26	1699246750872437141327603	1670363391935583952504341	-1.70%
1.00E+27	16352460426841680446427399	16084980811231549172264034	-1.64%
1.00E+28	157589269275973410412739598	155105172108304224161117471	-1.58%
1.00E+29	1520698109714272166094258063	1497567178976730440176306616	-1.52%

Table 2 – No. of primes using Modified Approximation (Presented in this article)

n	π(n) actual	π(n) from Modified Approximation	% error
1.00E+01	4	7	75.00%
1.00E+02	25	27	8.00%
1.00E+03	168	168	0.00%
1.00E+04	1229	1218	-0.90%
1.00E+05	9592	9520	-0.76%
1.00E+06	78498	78117	-0.49%
1.00E+07	664579	662240	-0.35%
1.00E+08	5761455	5747073	-0.25%
1.00E+09	50847534	50759753	-0.17%
1.00E+10	455052511	454513484	-0.12%
1.00E+11	4118054813	4114760691	-0.08%
1.00E+12	37607912018	37588078365	-0.05%
1.00E+13	346065536839	345951503625	-0.03%
1.00E+14	3204941750802	3204354872614	-0.02%
1.00E+15	29844570422669	29842386052038	-0.01%
1.00E+16	279238341033925	279241231332249	0.001%
1.00E+17	2623557157654233	2623752533459659	0.007%
1.00E+18	24739954287740860	24743023355249531	0.012%
1.00E+19	234057667276344607	234095725093423118	0.016%
1.00E+20	2220819602560918840	2221247473634630597	0.019%
1.00E+21	21127269486018731928	21131835823913237324	0.022%
1.00E+22	201467286689315906290	201514517487389703395	0.023%
1.00E+23	1925320391606803968923	1925799106811863614750	0.025%
1.00E+24	18435599767349200867866	18440385448469505739705	0.026%
1.00E+25	176846309399143769411680	176893690363053571874257	0.027%
1.00E+26	1699246750872437141327603	1699712592577485318689677	0.027%
1.00E+27	16352460426841680446427399	16357017243895836441592453	0.028%
1.00E+28	157589269275973410412739598	157633676378313981721209496	0.028%
1.00E+29	1520698109714272166094258063	1521129660905129306326196776	0.028%

Inferences:

General Inferences:

1. The error percentage from Table 2 is much smaller than that from Table 1, except for very small values of n. See the chart below.

2. The error from the Modified Approximation changes sign between 10¹⁵ and 10¹⁶. See below chart.

3. The error on the positive side for the Modified Approximation seems to be mildly increasing with n, but the error is still 0.03% (0.002838) to be precise. As the approximation is derived from fundamentals and uses the Prime Counting Function as its base, the error is not expected to be much bigger for much higher values, but one can investigate.

4. The approximated value of number of primes under 1000 is 168.984 from the Modified Approximation which magically matches with the number of primes under 1000 when the fraction is eliminated.

Inferences on Probability:

1. The probability of a number being a prime, from the Prime Counting Function is,

- Equation (8)

From the Modified Approximation, the probability is,

- Equation (9)

It can be easily observed that Equation (9) leads to better probability.

Now let us test this against the one-millionth prime number 15485863.

The probability of a randomly selected number less than or equal to the one-millionth prime number from the Prime Counting Function is, 1/log (15485863) = 6.04%.

The probability of a randomly selected number less than or equal to the one-millionth prime number from the Modified Approximation is,

Which is 6.56% higher than the probability coming from the Prime Counting Function.

The actual probability is 1000000/15584863=6.46%, which is very close to what we have estimated using the Modified Approximation.

From this, it can be observed that for smaller values of n, where the density of prime numbers is still high, the Modified Approximation provides a far better estimate of the probability of a number being prime, than the Prime Counting Function.

2. Semiprimes are the numbers which are products of two primes. There is no specific term when a semiprime is equal to twice a prime. Let us call them even semiprimes.

If x is the number of primes less than number n, they will have x number of corresponding even semiprimes less than 2n.

For a large number n, the no. of even semiprimes less than n is approximately equal to the no. of primes less than n/2.

The Prime Counting Function and its variants and modifications can be thus extended to approximate the number of semiprimes also.

From this, it can be inferred that the probability of a randomly selected even semiprime also gets influenced by the prime counting function chosen. The Modified Approximation in this article provides a better probability of a number being a prime, than the Prime Counting Function.

3. It can be calculated and found that the Modified Approximation provides a confidence of more than 95% at a value of n as low as 101, whereas for the Prime Counting Function the confidence level crosses 95% between 10⁹and 10¹⁰.

Conclusions:

1. A unique approximation is given for the Prime Counting Function in the form of

which gives much better approximation than the Prime Counting Function.

2.The probability of a randomly selected number n being a prime from the proposed approximation is

which provides more confidence than the probability from the Prime Counting Function.

3. The term “Self-correcting” used in the title of this article comes from the fact that the proposed equation for the number of primes less the number n uses the Prime Counting Function for n/2.

4. Compared to several other formulas, this formula is computationally less intensive as it uses just the log function a couple of times, and no constant is changing as n increases.

3 comments:

Prakash KumarMay 4, 2025 at 7:14 AM
This is such an erudite piece of research work in pure mathematics which can be put in serious business usage.This approach can be suitably adapted in making business forecast through a probabilistic approach when less data points are available to make a regression analysis.

Engineering musings......

Labels

Sunday, May 4, 2025

Prime Counting - A Self-correcting Approximation