So we just saw one way the weighted count of prime powers relates to the logarithmic integral, through smoothing.
Well, two ways, if we count Riemann's formula, right?
Sure, but I don't think I really did any justice to his very powerful, very complicated ideas. They are substantially more nuanced than the smoothing form I introduced, and you really need to grapple with some big ideas from complex analysis to broach them properly.
Okay, fair enough.
At any rate, I want to cover another really fascinating way to connect the weighted count of prime powers to the logarithmic integral.
Again using the structure of our general recursive prime formula?
Yes. But this time, we won't have two parameterized versions of one function. Rather, we will be dealing with the difference more directly. And this approach will be a bit more involved, but I hope you stick with me. I think this one is especially fascinating.
Okay. How does it work?
Well, just watch. As always, let's go back to our basic recursive formula...
$$f_k(n,j)=\rec{\frac{1}{k}-f_{k+1}(\frac{n}{j},2) + f_k (n, j+1)}$$And as always, $f_1(n, 2)$ looks like this:
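To make the experiments that follow concrete, here is a minimal Python sketch of this recursion. Note the assumptions baked in: the bracket notation is read as bottoming out at $0$ once $j > n$, and the trailing $f_k(n, j+1)$ term is unrolled into a loop.

```python
def f(k, n, j=2):
    """Sketch of f_k(n, j); assumes f_k(n, j) = 0 once j > n."""
    total = 0.0
    while j <= n:  # unrolls the trailing f_k(n, j+1) term
        total += 1.0 / k - f(k + 1, n / j)
        j += 1
    return total

# f(1, n) is the weighted count of prime powers up to n: each p^k <= n
# contributes 1/k. For n = 10 that is 4 primes + (1/2)*2 prime squares
# + (1/3)*1 prime cube = 16/3.
print(f(1, 10))  # ≈ 5.3333
```
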
Now, unlike the smoothing version we just explored, we will go back to our old patterns and introduce a new helper function and explore its impact. Then we'll progressively generalize it in more complicated ways and end up somewhere fascinating.
Okay.
So to start with, the function we'll use is $(-1)^{j+1}$ - for now, nothing fancy.
Okay, simple enough.
With that, our new recursive function will look like this:
$$g_k(n,j)=\rec{\rr{(-1)^{j+1} \cdot }(\frac{1}{k}-g_{k+1}(\frac{n}{j},2)) + g_k (n, j+1)}$$Any speculation about what this might do?
Well, in other contexts, introducing $(-1)^{j+1}$ to series changes them into alternating series. And those usually behave in, all things considered, reasonably predictable ways. So I would guess the behavior change here ought to be relatively simple as well.
That seems like reasonable intuition. Okay, well, let's actually take a look at a graph of $g_1(n,2)$. And let's directly graph it against our unaltered baseline, $f_1(n,2)$.
What in the world?
So what do you see? Is that what you were expecting?
Huh. No, not really. It looks like $g_1(n,2)$ also counts primes, mostly. But then it has these huge jumps downward every so often.
True. And where are the jumps?
Well, it looks like they happen pretty infrequently. They appear to happen at powers of 2, in fact. And the jumps just seem to get bigger and bigger.
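We can pin those jumps down numerically with a quick sketch (assuming, as before, that the recursion bottoms out at $0$ once $j > n$):

```python
def g(k, n, j=2):
    """Alternating variant: each term weighted by (-1)**(j+1).
    Assumes the recursion bottoms out at 0 once j > n."""
    total = 0.0
    while j <= n:
        total += (-1) ** (j + 1) * (1.0 / k - g(k + 1, n / j))
        j += 1
    return total

# g(1, n) tracks the prime count away from powers of 2, but drops
# sharply as n crosses each power of 2 (note the dip from n=3 to n=4):
for n in [3, 4, 7, 8, 15, 16]:
    print(n, g(1, n))
```
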
A simple experiment will help us get more specific about those jumps. To do that, let's try subtracting our new function $g_1(n,2)$ from the original $f_1(n,2)$. If we do, we'll see this:
That's just strange. As I zoom out, that graph seems roughly self-similar, visually, almost like a fractal. Well, at least on powers of $2$, anyway.
True.
There is actually a simple formula for describing the difference here. If you were to puzzle it out, you would find that $g_1(n,2)$ can be expressed in terms of $f_1(n,2)$ as
$$ g_1(n,2) = f_1(n,2) \rr{ - \sum_{1 \le k \le \log_2 n} \frac{2^k}{k}} $$That's surprisingly simple.
I suppose it is. You can see it at work in the following animation.
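The identity is also easy to spot-check numerically. Here's a sketch, with the same assumed base case for both recursions, and the correction sum taken over $1 \le k \le \log_2 n$:

```python
def f(k, n, j=2):
    # baseline recursion; assumes f_k(n, j) = 0 once j > n
    total = 0.0
    while j <= n:
        total += 1.0 / k - f(k + 1, n / j)
        j += 1
    return total

def g(k, n, j=2):
    # alternating variant, weighted by (-1)**(j+1)
    total = 0.0
    while j <= n:
        total += (-1) ** (j + 1) * (1.0 / k - g(k + 1, n / j))
        j += 1
    return total

def correction(n, m=2):
    # sum of m**k / k over 1 <= k <= log_m(n)
    total, mk, k = 0.0, m, 1
    while mk <= n:
        total += mk / k
        mk *= m
        k += 1
    return total

for n in [3, 4, 10]:
    print(n, g(1, n), f(1, n) - correction(n))  # the two columns agree
```
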
Does the sum on the right of that equation, $\displaystyle \sum_{1 \le k \le \log_2 n} \frac{2^k}{k} $, look at all familiar to you?
Oh... Isn't that similar to the Taylor series for something?
It is. In fact, the Taylor series for logarithms is
$$\log(\frac{1}{1 - t}) = \sum_{k = 1}^{\infty} \frac{t^k}{k}$$Exactly. But that series only converges when $t$ is less than $1$ and greater than or equal to $-1$. That's drilled into your head in early calculus discussions. So it certainly won't converge when $t=2$.
That's a fine observation, although one I want to revisit. But restating what you just said, given $t=1$, you have
$$1 +\frac{1}{2} +\frac{1}{3} +\frac{1}{4} +... = \infty$$Which is another way of saying the harmonic series doesn't converge.
Correct. Meanwhile, check $t=-1$ and, after negating both sides, we will indeed find the well-known identity
$$1 -\frac{1}{2} +\frac{1}{3} -\frac{1}{4} +...=\log 2$$So, here, the non-alternating series doesn't converge, while the alternating variant does.
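A quick numerical sanity check of that identity in Python. For an alternating series, the partial-sum error is bounded by the first omitted term, so a million terms pins the value down to about six digits:

```python
from math import log

# partial sum of the alternating harmonic series 1 - 1/2 + 1/3 - ...
N = 10**6
s = sum((-1) ** (k + 1) / k for k in range(1, N + 1))
print(s, log(2))  # the two agree to within about 1/(N+1)
```
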
Right. Does it seem like that has a parallel to our functions $f_1(n,2)$ and $g_1(n,2)$, above? Like, in a purely visual and hand-wavy sense?
Well, going purely from the visual impressions, $f_1(n,2)$ seems to grow to larger and larger values without bound. And meanwhile $g_1(n,2)$, the alternating version, obviously doesn't converge either. But it does seem to stay centered around the line $y=0$, in a sense.
And what does that suggest? What does it mean that $g_1(n,2)$ seems to stay centered around $y=0$, in a very rough sense?
Well, I guess, if $g_1(n,2)$ really does stay centered around $y=0$ in the long run, which you'd need to prove... If that were true, it must mean that $\displaystyle \sum_{1 \le k \le \log_2 n} \frac{2^k}{k}$ is something like an extremely rough approximation of the count of primes, right?
Well, that's an interesting guess. Let's try to generalize our approach here and see if we can't gather more information from some computational experiments...
So, how do you suppose we ought to generalize what we're doing here?
Should we generalize $(-1)^{j+1}$ by adding some real scaling parameter $s$ to it, perhaps, as $(-1)^{s(j+1)}$? Or maybe generalize the $-1$ to something like $m^{j+1}$?
Those are of course perfectly sensible guesses. But in practice, if you go down that road, the resulting functions stop counting primes.
Oh.
Fortunately, there is another way to go that isn't too tricky. Did you find it at all odd that the alternating series added up to $\log \rr{2}$?
Well, I suppose...
And what do you notice about the difference between $f_1(n,2)$ and $g_1(n,2)$, which we found to be $\displaystyle \sum_{1 \le k \le \log_{\rr{2}} n} \frac{\rr{2}^k}{k}$?
Well, the number $\rr{2}$ shows up again, oddly.
Well, it's not so odd with a bit of thought. In fact, that original alternating series above can be rewritten, with a bit of work, as $$\frac{1}{1}-\frac{1}{2}+\frac{1}{3}- \frac{1}{4}+ \dots + \frac{(-1)^{n+1}}{n} = (\frac{1}{1}+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+ \dots + \frac{1}{n}) - (0+\frac{1}{1}+0+\frac{1}{2}+ \dots + \frac{1}{\frac{n}{\rr{2}}})$$
And the infinite series can in turn be rewritten as
$$\log \rr{2} = \lim_{n \to \infty} H_n - H_{\lfloor \frac{n}{\rr{2}} \rfloor }$$Ah!
Right. This suggests an immediate approach to generalization here. Take our case of $\rr{2}$, and replace it with other, larger whole numbers $\rr{m}$. If you do, what do you find?
Well, the case of $m=2$ gave us coefficients $$1,-1,1,-1,1,...$$ The case of $m=3$ seems to give us coefficients $$1,1,-2,1,1,-2,1...$$ $m=4$ gives us coefficients $$1,1,1,-3,1,1,1,-3,1...$$ And so on.
Good. In fact, one way you can write this idea out generally is as $$a_m(n)=1-m \cdot (\lfloor \frac{n}{\rr{m}} \rfloor - \lfloor \frac{n-1}{\rr{m}} \rfloor)$$
Oh.
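A quick check that this formula reproduces the coefficient patterns we just listed (a small Python sketch):

```python
def a(m, n):
    # a_m(n) = 1 - m * (floor(n/m) - floor((n-1)/m)); the floor difference
    # is 1 exactly when m divides n, and 0 otherwise
    return 1 - m * (n // m - (n - 1) // m)

print([a(2, n) for n in range(1, 9)])  # [1, -1, 1, -1, 1, -1, 1, -1]
print([a(3, n) for n in range(1, 9)])  # [1, 1, -2, 1, 1, -2, 1, 1]
print([a(4, n) for n in range(1, 9)])  # [1, 1, 1, -3, 1, 1, 1, -3]
```
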
You can use that to immediately say that
$$\sum_{j=1}^\infty \frac{a_{\rr{m}}(j)}{j} = \log \rr{m}$$Or, which is basically the same,
$$\log \rr{m} = \lim_{n \to \infty} H_n - H_{\lfloor \frac{n}{\rr{m}} \rfloor }$$Got it. That makes sense.
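Both forms of that identity are easy to sanity-check with partial sums. A sketch in Python; the error shrinks roughly like $m/N$, so $N = 10^5$ is plenty for a few digits:

```python
from math import log

def a(m, n):
    # the generalized sign pattern from above
    return 1 - m * (n // m - (n - 1) // m)

N = 10**5
for m in [2, 3, 5]:
    partial = sum(a(m, j) / j for j in range(1, N + 1))
    # H_N - H_{floor(N/m)}: the shared prefix of the two harmonic sums cancels
    harmonic_diff = sum(1 / j for j in range(N // m + 1, N + 1))
    print(m, partial, harmonic_diff, log(m))  # all three columns agree
```
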
And so, given all this preamble, the alternating sign $(-1)^{n+1}$ can in fact be rewritten as $$(-1)^{n+1} = 1-\rr{2} \cdot (\lfloor \frac{n}{\rr{2}} \rfloor - \lfloor \frac{n-1}{\rr{2}} \rfloor)$$ So that will be our generalization, at least for now.
So now let's see the impact of this generalization in our recursive function $g_1(n,2)$ by exploring an example.
Sure, sounds good.
Okay. Our generalization looks like this...
$$a_m(n)=1-m \cdot (\lfloor \frac{n}{m} \rfloor - \lfloor \frac{n-1}{m} \rfloor)$$...and graphing the example of $m=3$ in turn looks like this, of course. Nothing too surprising.
So now let's apply this to our recursive function, like so:
$$g_k(n,j)=\rec{\rr{a_m(j) \cdot }(\frac{1}{k}-g_{k+1}(\frac{n}{j},2)) + g_k (n, j+1)}$$And then graph it.
Now, what do we see?
Well, it seems very similar to our original case. But now the jumps are at powers of $3$, rather than powers of $2$.
Correct.
But more generally, the function still counts primes otherwise, and it still seems, at least from the graphs here, to stay centered around $y=0$.
Right. In fact, if you check the difference between $f_1(n,2)$ and $g_1(n,2)$, you'll find it to be $$\displaystyle \sum_{1 \le k \le \log_{\rr{3}} n} \frac{\rr{3}^k}{k}$$ And that looks like this:
And so just like before, zooming in and out looks almost fractal here.
Right. As before, you can see these ideas in action in the following animation.
Hmm. And I suppose that, because the count of primes is staying centered on $y=0$ here, it suggests $\displaystyle \sum_{1 \le k \le \log_{\rr{3}} n} \frac{\rr{3}^k}{k}$ is also an approximation of the count of primes.
Indeed, though seemingly an even worse one. Still, if you were to work this all out more generally, you would find that for any whole number $m \ge 2$, the difference between $f_1(n,2)$ and $g_1(n,2)$ is $$\displaystyle \sum_{1 \le k \le \log_{\rr{m}} n} \frac{\rr{m}^k}{k}$$ And they all, in a hand-wavy sense, appear to stay centered on $y=0$.
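If you'd like to check that general claim computationally, here is a sketch putting the pieces together, with the same assumed base case for the recursions as before and $a_m$ as defined earlier:

```python
def a(m, n):
    # the generalized sign pattern: 1 everywhere, except -(m-1) at multiples of m
    return 1 - m * (n // m - (n - 1) // m)

def f(k, n, j=2):
    # baseline recursion; assumes f_k(n, j) = 0 once j > n
    total = 0.0
    while j <= n:
        total += 1.0 / k - f(k + 1, n / j)
        j += 1
    return total

def g(m, k, n, j=2):
    # generalized recursion with the a_m(j) weights
    total = 0.0
    while j <= n:
        total += a(m, j) * (1.0 / k - g(m, k + 1, n / j))
        j += 1
    return total

def correction(m, n):
    # sum of m**k / k over 1 <= k <= log_m(n)
    total, mk, k = 0.0, m, 1
    while mk <= n:
        total += mk / k
        mk *= m
        k += 1
    return total

for m in [2, 3, 4, 5]:
    for n in [4, 10]:
        print(m, n, g(m, 1, n), f(1, n) - correction(m, n))  # columns agree
```
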
Hmm. That's... curious. And perhaps a bit frustrating.
How so?
Well, on the one hand, it's interesting that $\displaystyle \sum_{1 \le k \le \log_{\rr{m}} n} \frac{\rr{m}^k}{k}$ seems like it must be an approximation of the count of primes.
Sure.
But on the other hand, it's frustrating that even the best case, $m=2$, still isn't a very good approximation at all.
It's not, is it? Well, hold on, because if you can tolerate more complexity, we can go quite a bit further here, and it will get more interesting and curious still.