And now, if you don't mind a certain amount of personal history, I want to try to capture a bit of my experience here with the algorithm that we evolved.
Go ahead.
So the basic identity used here to count primes, $p_k(n,j)$, is one I discovered independently in October of 2004. But prior to that, in early September of that year, I had successfully implemented a messier, closely related combinatorial prime counting algorithm. And that, in turn, was based on fundamental combinatorial ideas about prime counting that I had worked through back in the summer of 2002. So really, the algorithm, which I was thinking about much more purely algorithmically from the beginning, preceded the identity $p_k(n,j)$. It resembled the approach I describe here, but used somewhat different divisor summatory functions. And in fact, the inaugural implementation went quite a lot further, determining the counts of all possible prime signatures for numbers up to $n$ rather than just primes, and doing so in something like $O(n^{1+\epsilon})$ time and $O(n^{\epsilon})$ space with pretty good constant factors. I discuss my more recent perspective on this broader approach in this series of discussions, and I intend to write up a much more detailed recounting of that history here (that account is not complete or public at the time of writing, though).
Okay.
Anyway, after settling on that specific identity for producing weighted counts of prime numbers as the approach I would focus on, I began really evolving this technique in earnest, first building an implementation free of precision issues and then optimizing it.
I don't have much of a public paper trail about my process from that time. But you can see a sci.math newsgroup posting I made about this style of counting from October 13, 2004. I was still trying to hold my cards close to my vest then, so to speak, so I didn't exactly announce outright what I thought I had done.
Which does seem to indicate something of the state of your thoughts at the time, even if it isn't especially explicit.
The specific approach that I detailed in our discussion here, I largely worked through between January and March of 2005. I've included a screenshot of my various code backups from that window of time, to give at least a sense of how the work progressed. By the end of that process, I had code that looked really quite similar to the fastest JavaScript implementation we looked at previously.
Okay, so by the end of March 2005, you'd run out of ideas about evolving this specific approach. But you never got around to publishing anything about it?
Well... no. Or, well, not formally, I should say. At that point in time, I didn't know the first thing about math papers, or math articles, or math publishing, or even how one might write up math documents on a computer. I wasn't in meaningful contact with any math professionals either. I've since read a lot more math history and been reassured, I suppose, that these are not unusual problems for amateurs, whether or not they have something novel to say.
True. But what do you mean by not formally?
At least in terms of paper trails, I did dump a giant log of some of my work unceremoniously in a text file on the sci.math newsgroup on February 20, 2006, and that post does include a C++ implementation of the algorithm, after a preliminary attempt at describing the math behind it. But - and this shouldn't be a surprise - that was not really a successful way to draw attention to my ideas. LaTeX exists for a reason, as does the cultivation of long-term relationships with other people and of personal reputations. I have found that drive-by, relatively anonymous math result dumps are not, in the grand scheme of things, an effective method of communication, and boy have I tried.
Fair enough. But I guess it does serve as a reasonable signpost of where you were, anyway.
That it does.
Anyway, one final detail about the history of my experience here might be worth noting. I had tried to contact professional mathematicians a few times during this period about my results, and a few of them responded politely but without any engagement. I'm not saying that as any kind of criticism, by the way, as I don't have a lot of energy to respond to random e-mail from strangers myself. I actually really appreciate that any of them responded at all.
Certainly.
But this means that the entire time I was working here, between October of 2004 and September of 2006, I had no idea whether the core identity I had discovered, the one I was using to count primes, was even known at all. But finally, in the fall of 2006, Professor Daniel Goldston of San Jose State University was kind enough to respond to me and point me to a reference in Iwaniec and Kowalski's "Analytic Number Theory", showing me that the core identity I was using was a variant of an identity published by the Soviet mathematician Yuri Linnik back in 1961. Which was disappointing, of course, but it also amounted to a kind of closure, which isn't the worst thing in the world.
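For what it's worth, the form of Linnik's identity that's usually quoted, and I'm paraphrasing from memory here rather than from Iwaniec and Kowalski directly, is

$$\frac{\Lambda(n)}{\ln n} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} \, \tau_k'(n) \qquad (n > 1),$$

where $\Lambda$ is the von Mangoldt function and $\tau_k'(n)$ counts the ordered factorizations of $n$ into exactly $k$ factors, each greater than $1$. Summing that over $n \le x$ and then applying Möbius inversion over the $j$ in $x^{1/j}$ to strip out the higher prime powers gives exactly the kind of double sum over $j$ and $k$ that you can see in the countPrimes routine below.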
So that's a good synopsis of the early history of this approach, I think. I've included below some working C++ source code that is roughly equivalent to the code I uploaded to that newsgroup back in 2006.
#include "stdio.h"
#include "stdlib.h"
#include "math.h"
#include "time.h"
// Fudge factor for the floating-point root and log computations below.
const double EPSILON = 1e-11;

// Wheel over the first 8 primes: the cycle period is 2*3*5*7*11*13*17*19 = 9699690,
// of which 1658880 residues are coprime to those primes; 23 is the first prime past the wheel.
const int wheelCyclePeriod = 9699690, wheelFirstPrime = 23, wheelCycleEntries = 1658880, wheelDiscardedPrimes = 8;

// wheelTranslation[m] = count of wheel-coprime integers in [1, m-1]; sized +1
// because MakeWheel also writes the entry at index wheelCyclePeriod.
// wheelOffsets[e] = gap from the e-th wheel-coprime value (1 is the 0th) to the next.
int wheelTranslation[wheelCyclePeriod + 1], wheelOffsets[wheelCycleEntries];
// Build the translation and offset tables for the mod-9699690 wheel.
void MakeWheel() {
    // Only primes[1..wheelDiscardedPrimes-1] are tested below, since i is always odd.
    int entry = 0, offset = 1, total = 1, primes[] = { 2,3,5,7,11,13,17,19,23,29,31,37 };
    wheelTranslation[0] = wheelTranslation[1] = 0;
    wheelTranslation[2] = 1; // 1 is the first wheel-coprime value
    for (int i = 3; i <= wheelCyclePeriod; i += 2) {
        bool inUse = true;
        for (int j = 1; j < wheelDiscardedPrimes && inUse; j++) if (!(i % primes[j])) inUse = false;
        if (inUse) {
            // Record the gap from the previous wheel-coprime value to this one.
            wheelOffsets[entry] = offset + 1;
            offset = 0;
            entry++;
        }
        else offset++;
        wheelTranslation[i] = total;
        if (inUse) total++;
        offset++; // also step past the even number i+1
        wheelTranslation[i + 1] = total;
    }
    wheelOffsets[entry] = 2; // wrap-around gap from period-1 to period+1
}
// Advance a wheel-offset index, wrapping at the end of the cycle.
inline void IncrementWheelEntry(int& offset) {
    if (++offset >= wheelCycleEntries) offset = 0;
}
// Integer floor of the y-th root of x, computed via doubles with epsilon nudges.
inline long long InversePower(long long x, long y) {
    return (long long)(pow((double)x + EPSILON, 1.0 / (double)y) + EPSILON);
}
// For a wheel-coprime n, the index into wheelOffsets of the gap from n to the next coprime value.
inline int NumToWheelEntry(long long n) {
    return (n < wheelCyclePeriod) ? wheelTranslation[n] : wheelTranslation[n % wheelCyclePeriod];
}
// Count wheel-coprime integers in the inclusive range [rangeStart, rangeEnd].
inline long long wheelCount(long long rangeStart, long long rangeEnd) {
    if (++rangeEnd < wheelCyclePeriod) return wheelTranslation[rangeEnd] - wheelTranslation[rangeStart];
    int b = rangeEnd % wheelCyclePeriod;
    if (rangeStart < wheelCyclePeriod) return (rangeEnd - b) / wheelCyclePeriod * wheelCycleEntries + wheelTranslation[b] - wheelTranslation[rangeStart];
    int a = rangeStart % wheelCyclePeriod;
    return (rangeEnd - b - rangeStart + a) / wheelCyclePeriod * wheelCycleEntries + wheelTranslation[b] - wheelTranslation[a];
}
// k = 2 base case of the recursion: sums hyperbola counts wheelCount(a, n/a) as a
// walks the wheel-coprime values up to sqrt(n). The parameters p and r carry the
// permutation-count weighting for runs of repeated factors.
long long D_2(long long n, long long a, long long p, int r) {
    long long nroot = (long long)(sqrt((double)n) + EPSILON);
    // Terms involving the incoming value of a, with the repeated-factor corrections.
    long long t = (wheelCount(a, n / a) - 1) * (p / r) + (wheelCount(a, nroot) - 1) * (p / 2) + p / (r * (r + 1));
    int wheelEntry = NumToWheelEntry(a);
    a += wheelOffsets[wheelEntry];
    IncrementWheelEntry(wheelEntry);
    long long u = 0;
    // Remaining values of a up to sqrt(n), each weighted by the full p.
    while (a <= nroot) {
        u += wheelCount(a, n / a) - 1;
        a += wheelOffsets[wheelEntry];
        IncrementWheelEntry(wheelEntry);
    }
    return t + u * p;
}
// General k >= 2: recurse over the smallest remaining factor a, which runs
// through wheel-coprime values up to n^(1/k).
long long D_k(long long n, long long a, long long p, int r, int k) {
    if (k == 2) return D_2(n, a, p, r);
    long long t = 0;
    // Either the next factor repeats the current a (r tracks the run length)...
    t += D_k(n / a, a, p / r, r + 1, k - 1);
    int wheelEntry = NumToWheelEntry(a);
    a += wheelOffsets[wheelEntry];
    IncrementWheelEntry(wheelEntry);
    // ...or it moves on to a strictly larger wheel-coprime value.
    while (a <= InversePower(n, k)) {
        t += D_k(n / a, a, p, 2, k - 1);
        a += wheelOffsets[wheelEntry];
        IncrementWheelEntry(wheelEntry);
    }
    return t;
}
// D(n, k): weighted count of ordered factorizations into k wheel-coprime factors,
// each >= wheelFirstPrime, with product <= n; the recursion is seeded with p = k!.
long long D(long long n, int k) {
    if (k == 1) return wheelCount(wheelFirstPrime, n);
    if (k == 2) return D_2(n, wheelFirstPrime, 2, 1);
    long long factorial[] = { 0, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800, 87178291200,
        1307674368000, 20922789888000, 355687428096000, 6402373705728000, 121645100408832000, 2432902008176640000
    }; // factorial[k] = k!; index 0 is an unused placeholder
    return D_k(n, wheelFirstPrime, factorial[k], 1, k);
}
// pi(n) via the Linnik-style identity: a Moebius-weighted double sum of
// alternating divisor-summatory terms D(n^(1/j), k), plus the wheel primes.
long long countPrimes(long long n) {
    if (n < wheelFirstPrime) {
        // Direct pi(n) lookup below the first non-wheel prime.
        static const int lookup[] = { 0,0,1,2,2,3,3,4,4,4,4,5,5,6,6,6,6,7,7,8,8,8,8,9 };
        return lookup[n];
    }
    int mu[] = { 0, 1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0, -1, 1, 1, 0, -1, 0, -1, 0, 1, 1, -1, 0, 0, 1, 0, 0 }; // Moebius function
    long long totals[32] = { 0 };
    // j runs while n^(1/j) >= wheelFirstPrime; k while k factors >= wheelFirstPrime fit in n^(1/j).
    for (int j = 1; j < (int)(log((double)n + EPSILON) / log((double)wheelFirstPrime + EPSILON) + EPSILON) + 1; j++) {
        if (!mu[j]) continue;
        for (int k = 1; k < (int)(log((double)n + EPSILON) / j / log((double)wheelFirstPrime + EPSILON) + EPSILON) + 1; k++) {
            // Integer sign (-1)^(k+1); keeping everything in integer arithmetic
            // avoids double-precision loss once the counts exceed 2^53.
            totals[j * k] += D(InversePower(n, j), k) * ((k % 2) ? 1 : -1) * mu[j];
        }
    }
    long long t = wheelDiscardedPrimes;
    // The terms with j*k = i combine so that totals[i] is divisible by i
    // (an orbit-counting cancellation), making this integer division exact.
    for (int i = 1; i < 32; i++) t += totals[i] / i;
    return t;
}
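// Prints a header fragment after padding spaces so it lines up with the timing columns below.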
void printSpaces(const char* str) {
printf(" %s", str);
}
int main(){
MakeWheel();
const int scaleNum = 10;
int oldClock = (int)clock(), lastDif = 0;
printSpaces("Time\n");
printSpaces("Increase\n");
printSpaces("");
printf("for x%d\n", scaleNum);
printf(" __ Input Number __ __ Output Number __ _ MSec _ _ Sec _ Input\n\n");
    // Time pi(10^m) for m = 1..18, reporting the time growth factor per tenfold input increase.
    for (long long i = scaleNum; i <= 1000000000000000000LL; i *= scaleNum) {
        printf("%17lld(10^%4.1f): ", i, log((double)i) / log(10.0));
        long long total = countPrimes(i);
        int newClock = (int)clock();
        // clock() deltas are in ticks; the tick column reads as milliseconds
        // only where CLOCKS_PER_SEC is 1000.
        printf(" %20lld %8d : %4d: x%f\n", total, newClock - oldClock, (int)((newClock - oldClock) / CLOCKS_PER_SEC),
            (lastDif) ? (double)(newClock - oldClock) / (double)lastDif : 0.0);
        lastDif = newClock - oldClock;
        oldClock = newClock;
        if (i == 1000000000000000000LL) break; // avoid overflowing i past 10^18
    }
return 0;
}