Integrating the Logarithm

(It’s not actually that difficult.)

So far, our antiderivatives of standard functions have been relatively straightforward to figure out. All we’ve had to do is think about what would differentiate to give our integrand. $x^2$ gives $2x$, $\sin(x)$ gives $\cos(x)$, $e^x$ gives $e^x$. But now comes a bridge that can’t be crossed while seeing what’s on the other side.

$$\int \ln{x} \, dx$$

Simple, but devastating. Where do we start?

A surprisingly common mistake is to say it’s $\frac{1}{x}$ – but remember this is the derivative of $\ln(x)$. What we want to find is its antiderivative. That being said, the derivative of $\ln(x)$ will be very handy to know, so keep it in mind.

It may not be immediately apparent how you might go about this. Before we try integrating the bare log, let’s explore some other integrals involving $\ln(x)$ to get a feel for it.

Dojo Training

If you’re not familiar with integrals involving $\ln(x)$, some of these may look scary to you. Take it easy, see how you can rewrite things to make it easier to spot things, and above all remember that the derivative of $\ln(x)$ is $\frac{1}{x}$.

Divided by $x$

$$\int \frac{\ln{x}}{x} \, dx$$

Dividing a function by $x$? Why would this be any easier than $\ln(x)$? Upon seeing this integral, your mind could jump to many different thoughts. Substitution? Layer cake? Maybe we can use parts, somehow?

Indeed, most functions divided by $x$ can’t be integrated to give an answer in terms of elementary functions, but $\ln(x)$ is one of the few that can. It’s all down to its derivative.

When dividing terms, it can sometimes be helpful to write them separately to help you see what’s going on. In this case, let’s pull out the $x$ as $\frac{1}{x}$:

$$\int \frac{1}{x} \ln{x} \, dx$$

Now, what’s the relationship between $\frac{1}{x}$ and $\ln(x)$? Why of course, $\frac{1}{x}$ is the derivative of $\ln(x)$! This means we can use the inverse chain rule, or, carrying out the full solution with substitution:

$$\begin{align*} \ln{x} &= t \\ \frac{1}{x} \, dx &= dt \end{align*}$$

$$\begin{align*} \Rightarrow \int \frac{1}{x} \ln{x} \, dx &= \int t \, dt \\ &= \frac{1}{2} t^2 \\ &= \frac{1}{2} \left(\ln{x}\right)^2 + c \end{align*}$$

So, there the trick was to spot the $\frac{1}{x}$ derivative on the outside.
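If you'd like some reassurance that the substitution landed in the right place, one option is to differentiate the answer numerically and compare it with the integrand. The snippet below is only a rough sketch: the helper `central_difference`, the step size and the sample points are all illustrative choices, not part of the derivation.

```python
import math

def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)

def antiderivative(x):
    # Candidate answer from the substitution: (1/2) * (ln x)^2
    return 0.5 * math.log(x) ** 2

def integrand(x):
    # The function we set out to integrate: ln(x) / x
    return math.log(x) / x

# Hypothetical sample points, all in the domain x > 0.
sample_points = [0.5, 1.0, 2.0, 10.0]

# Largest gap between the numerical derivative and the integrand.
max_error = max(
    abs(central_difference(antiderivative, x) - integrand(x))
    for x in sample_points
)
print(max_error)
```

A tiny `max_error` suggests the antiderivative is right at those points. It is a sanity check, not a proof.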

Full Division

$$\int \frac{1}{x \ln{x}} \, dx$$

After the previous integral, this one should be a lot more obvious. Notice once again we can split the fraction:

$$\int \frac{1}{x \ln{x}} \, dx = \int \frac{1}{x} \cdot \frac{1}{\ln{x}} \, dx$$

And we have $\frac{1}{x}$, the derivative of $\ln(x)$, on the outside. This sets us up for another $\ln(x) = t$ substitution, giving

$$= \int \frac{1}{t} \, dt$$

And hey, we know this integrates to $\ln(t)$ – so our answer is

$$\begin{align*} &= \ln{t} \\ &= \ln(\ln{x}) + c \end{align*}$$

Looking back, a way to jump straight to this with straight-up layer cake is to elevate the $\frac{1}{x}$ into the numerator:

$$\int \frac{1}{x \ln{x}} \, dx = \int \frac{\frac{1}{x}}{\ln{x}} \, dx$$

A little more fiddly, but still a nice way to think about it.
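A quick numerical cross-check, if you want one: integrate $\frac{1}{x \ln{x}}$ over a concrete interval and compare with $\ln(\ln{x})$ evaluated at the endpoints. This is a sketch under assumptions. The interval $[e, e^2]$ is an arbitrary choice that keeps $\ln{x}$ positive, and `simpson` is a hand-rolled composite Simpson's rule rather than anything from a library.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        # Odd interior points get weight 4, even interior points weight 2.
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def integrand(x):
    return 1.0 / (x * math.log(x))

def antiderivative(x):
    # Answer from the text, valid here because ln(x) > 0 on the interval.
    return math.log(math.log(x))

# Illustrative interval: from e to e^2, so ln(x) runs from 1 to 2.
a, b = math.e, math.e ** 2
numeric = simpson(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)  # ln(2) - ln(1) = ln(2)
print(numeric, exact)
```

On this interval the two numbers should agree closely, and both should sit near $\ln{2}$.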

Multiplying by $x$

$$\int x \ln{x} \, dx$$

Ok, this time there’s no $\frac{1}{x}$ on the outside for us to leverage. So what do we do here?

When we have $x$ multiplied by a function, we usually go for parts, since the $x$ vanishes to $1$ when differentiated. But in this case, we would then have to integrate $\ln(x)$ – which we don’t know how to do yet – and we’d also have to integrate that again after.

But, we do know the derivative of $\ln(x)$, and as it happens, it just gives us another power of $x$. In other words, we’d just be left with powers of $x$, which is no problem at all.

So let’s do integration by parts, integrating $x$ and differentiating $\ln(x)$.

$$\begin{align*} f &= \ln{x} & g &= \frac{1}{2} x^2 \\ f' &= \frac{1}{x} & g' &= x \end{align*}$$

Then

$$\begin{align*} fg - \int f'g \, dx &= \ln(x) \cdot \frac{1}{2} x^2 - \int \frac{1}{x} \cdot \frac{1}{2} x^2 \, dx \\ &= \frac{1}{2} x^2 \ln{x} - \frac{1}{2} \int x \, dx \\ &= \frac{1}{2} x^2 \ln{x} - \frac{1}{4} x^2 + c \end{align*}$$
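To see the parts formula doing its job with actual numbers, here is a small sketch that evaluates both sides on the interval $[1, e]$. The interval, the function names and the `simpson` helper are all illustrative assumptions. The choices of $f$ and $g$ are the ones made above.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

# Choices from the text: f = ln(x), g = x^2 / 2, so f' = 1/x and g' = x.
def f(x):
    return math.log(x)

def g(x):
    return 0.5 * x * x

def f_prime(x):
    return 1.0 / x

def g_prime(x):
    return x

a, b = 1.0, math.e  # illustrative limits

# Left side: the original integral of f * g' = x ln(x).
lhs = simpson(lambda x: f(x) * g_prime(x), a, b)

# Right side: boundary term [f g] minus the integral of f' * g.
boundary = f(b) * g(b) - f(a) * g(a)
rhs = boundary - simpson(lambda x: f_prime(x) * g(x), a, b)

def closed_form(x):
    # Final line of the derivation: (1/2) x^2 ln(x) - (1/4) x^2
    return 0.5 * x * x * math.log(x) - 0.25 * x * x

exact = closed_form(b) - closed_form(a)  # works out to (e^2 + 1) / 4
print(lhs, rhs, exact)
```

All three printed values should match to many decimal places, which is what we’d expect if both the parts step and the final answer are right.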

Entering the Battlefield

What these previous examples have done is help us build up an intuition for how $\ln(x)$ works. In particular, you may have noticed its derivative is really quite nice to work with, because it just turns into a vanilla power of $x$. No $\ln(x)$, $e^x$ or other fanciness involved.

So looking at our original integral

$$\int \ln{x} \, dx$$

How could we manipulate it such that we could differentiate $\ln(x)$?

Well, earlier we differentiated it when using integration by parts. But we only have 1 term here, not a product of 2 terms, so we can’t exactly use parts.

…or can we?

$$\int \ln{x} \, dx = \int 1 \cdot \ln{x} \, dx$$

We can rewrite anything however we want, so long as the result is still mathematically equivalent. And anything multiplied by $1$ is itself.

Now we can carry out parts just fine. Sure, the $1$ will integrate to $x$, which is an increase in complexity, but this is a non-issue since the important thing is getting rid of the $\ln(x)$.

$$\begin{align*} f &= \ln{x} & g &= x \\ f' &= \frac{1}{x} & g' &= 1 \end{align*}$$

$$fg - \int f'g \, dx = \ln(x) \cdot x - \int \frac{1}{x} \cdot x \, dx$$

In fact, the division even cancels out nicely to just give us

$$\begin{align*} &= x \ln x - \int dx \\ &= x \ln x - x + c \end{align*}$$

There. QED.
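If "QED" feels a little quick, the result can be spot-checked by differentiating $x \ln x - x$ numerically and seeing whether $\ln x$ comes back out. As before, this is just a sketch: the finite-difference helper, step size and sample points are assumptions made for illustration.

```python
import math

def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)

def antiderivative(x):
    # Result of the parts calculation, with the constant c set to 0.
    return x * math.log(x) - x

# Hypothetical sample points in the domain x > 0.
sample_points = [0.5, 1.0, 2.0, 10.0]

# The derivative of the answer should reproduce ln(x) at every point.
max_error = max(
    abs(central_difference(antiderivative, x) - math.log(x))
    for x in sample_points
)
print(max_error)
```

The constant $c$ is left out because it vanishes on differentiation, so any choice would give the same check.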

Postmortem

$\int \ln{x} \ dx$ is often quoted as a bit of a ‘magical’ integral since it’s got this trick of multiplying by $1$. But funnily enough, you don’t actually need the trick at all.

The other thing that makes $\ln(x)$ so nice is its intimate relationship with $e^x$ – and $e^x$ is certainly one of the nicest terms to work with in integration. Knowing this, we shouldn’t be afraid to substitute in our integral, because here’s what happens…

$$\begin{align*} \ln{x} &= t \\ x &= e^t \\ dx &= e^t \, dt \end{align*}$$

$$\Rightarrow \int \ln{x} \, dx = \int t e^t \, dt$$

And that is the archetypical integration by parts. So we end up with

$$\begin{align*} &= te^t - e^t \\ &= x\ln{x} - x + c \end{align*}$$

Which after re-substituting gives us exactly the same as before.
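The substitution can also be checked with definite integrals: if $x$ runs from $1$ to $e$, then $t = \ln{x}$ runs from $0$ to $1$, and the two integrals should give the same number. The sketch below assumes those illustrative limits and uses a hand-written `simpson` helper.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def t_times_exp(t):
    # Integrand after substituting ln(x) = t, dx = e^t dt.
    return t * math.exp(t)

# Original integral in x over [1, e].
in_x = simpson(math.log, 1.0, math.e)

# Same integral in t; the limits become t = ln(1) = 0 and t = ln(e) = 1.
in_t = simpson(t_times_exp, 0.0, 1.0)

def closed_form(x):
    # x ln(x) - x, the antiderivative found both ways in the text.
    return x * math.log(x) - x

exact = closed_form(math.e) - closed_form(1.0)  # (e - e) - (0 - 1) = 1
print(in_x, in_t, exact)
```

Both numerical integrals should land very close to $1$, the value the closed form gives on this interval.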

No trick, just plain substitution.[1] Not actually that difficult to spot, right?


  1. This helps illustrate why substituting $\ln(x)$ or $e^x$ is a common and generally ‘safe’ strategy in integration – both have really nice derivatives, and many terms will just end up turning into one of the two.