Mobius Inversion

If you've ever taken some lessons on competitive programming, chances are that you have already heard about one of the most famous formula: the Mobius inversion. This article is aimed to provide some basic insight on what is the Mobius inversion, as well as how to apply it in various programming tasks.


If you are not familiar with the linear sieve and multiplicative functions, I recommend that you read about them first from the previous essay.

I will introduce some frequently used notations and lemmas first.


The following functions are all multiplicative functions, where \( p \) is a prime number and \( k \) is a positive integer.


I have some my unofficial names for these frequently used conclusions. If you happen to know any more commonly used name for them, you are more than welcome to tell me.

What is the Mobius Inversion?

According to Wikipedia, the Mobius inversion states that:

If \( g(n)=\sum_{d|n}f(d) \) for every positive integer \( n \), then \( f(n)=\sum_{d|n}g(d)\mu(\frac{n}{d}) \), where \( \mu(x) \) is the Mobius function, which is multiplicative and satisfies \( f(1)=1, f(p)=-1, f(p^k)=0 \) for any prime number \( p \) and any integer \( k \geq 2 \). {\it It is worth noting that you can pre-compute all values of the Mobius function from \( 1 \) to \( n \) using the linear sieve.}

However, the definition alone does not mean much (unless the problem statement explicitly gives you something like \( g(n)=\sum_{d|n}f(d) \). In that case, well...). There is one important property that is probably more useful than the definition:

\[ \displaystyle \sum_{d|n}\mu(d)=\epsilon(n)=[n=1] \]

In order to prove this property, we have to use the transitive property of multiplicative functions to show that \( f(n)=\sum_{d|n}\mu(d) \) is multiplicative. After that, we can see \( f(1)=1 \) and \( f(p^k)=0 \), which means \( f(x)=\epsilon(x) \). Now you may ask: why is this property important? I think it is best to show the reasons in some basic examples below.

Example 1

Find out the number of co-prime pairs of integers \( (x,y) \) in range \( [1,n] \).

Solution 1

Notice that the question is the same as asking you the value of \[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^n[gcd(i,j)=1] \]

Now apply the Mobius inversion on \( [gcd(i,j)]=1 \), we have

\[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^n\sum_{d|gcd(i,j)}\mu(d) \]

which is the same as

\[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^n\sum_{d=1}^n[d|gcd(i,j)]\mu(d) \]

Notice that \( [d|gcd(i,j)]=[d|i][d|j] \). Therefore

\[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^n\sum_{d=1}^n[d|i][d|j]\mu(d) \]

We can change the order of summing things up, so

\[ \displaystyle f(n)=\sum_{d=1}^n\mu(d)\left(\sum_{i=1}^n[d|i]\right)\left(\sum_{j=1}^n[d|j]\right) \]

We know that \( \sum_{i=1}^n[d|i]=\sum_{j=1}^n[d|j]=\lfloor \frac{n}{d} \rfloor \). Thus

\[ \displaystyle f(n)=\sum_{d=1}^n\mu(d)\lfloor \frac{n}{d} \rfloor ^2 \]

Then we can simply loop through \( 1 \) to \( n \) to compute the formula. While we can optimize this loop to \( O(\sqrt{n}) \) using the harmonic lemma, we still have to use the linear sieve to pre-compute the values of \( \mu(d) \). Therefore, the overall complexity of the algorithm is \( O(n) \).

Example 2

Find out the sum of \( gcd(x,y) \) for every pair of integers \( (x,y) \) in range \( [1,n] \) (\( gcd(x,y) \) means the greatest common divisor of \( x \) and \( y \)).

Solution 2

Similar as above, notice that the question is the same as asking you the value of \[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^ngcd(i,j) \]

Let \( k=gcd(i,j) \). We can then loop for \( k \) first.

\[ \displaystyle f(n)=\sum_{k=1}^nk\sum_{i=1}^n\sum_{j=1}^n[gcd(i,j)=k] \]

Now assume \( i=ak,j=bk \), and then

\[ \displaystyle f(n)=\sum_{k=1}^nk\sum_{a=1}^{\lfloor\frac{n}{k}\rfloor}\sum_{b=1}^{\lfloor\frac{n}{k}\rfloor}[gcd(a,b)=1] \]

It can be observed that the last part of the formula is the same as the one in example 1. Applying the same procedure, we have

\[ \displaystyle f(n)=\sum_{k=1}^nk\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}\mu(d)\lfloor \frac{n}{kd} \rfloor ^2 \]

According to the harmonic lemma, \( O(\sum_{k=1}^n\lfloor\frac{n}{k}\rfloor)=O(n \log n) \). Therefore, if we compute the sum above with brute force, the overall complexity will be \( O(n \log n) \).

This is, however, not the best complexity we can achieve with the Mobius inversion. Now let \( l=kd \), and we can loop for \( l \) first.

\[ \displaystyle f(n)=\sum_{l=1}^n\lfloor \frac{n}{l} \rfloor^2\sum_{k|l}k\mu(\frac{l}{k}) \]

Applying the transitive property of multiplicative functions, we can see that \( g(l)=\sum_{k|l}k\mu(\frac{l}{k}) \) is multiplicative. Moreover, we have \( g(p^k)=p^k-p^{k-1} \). If you have somewhat studied about the number theory, you may figure out that \( g(l) \) is actually the Euler's totient function. Regardless of whether you know about the coincidence or not, \( g(l) \) can be pre-computed with the linear sieve, and \( f(n)=\sum_{l=1}^n\lfloor \frac{n}{l} \rfloor^2g(l) \), which can be simply computed in \( O(n) \) complexity.

Example 3

Find out the sum of \( lcm(x,y) \) for every pair of integers \( (x,y) \) in range \( [1,n] \) (\( lcm(x,y) \) means the least common multiple of \( x \) and \( y \)).

Solution 3

Well, we should be pretty familiar with the techniques by now. First of all, the answer should be \[ \displaystyle f(n)=\sum_{i=1}^n\sum_{j=1}^nlcm(i,j) \]

Since \( lcm(i,j)=\frac{ij}{gcd(i,j)} \), let \( k=gcd(i,j) \). We can then loop for \( k \) first.

\[ \displaystyle f(n)=\sum_{k=1}^n\sum_{i=1}^n\sum_{j=1}^n[gcd(i,j)=k]\frac{ij}{k} \]

Now assume \( i=ak,j=bk \), and then

\[ \begin{align*} \displaystyle f(n)&=\sum_{k=1}^n\sum_{a=1}^{\lfloor\frac{n}{k}\rfloor}\sum_{b=1}^{\lfloor\frac{n}{k}\rfloor}[gcd(a,b)=1]abk\\ &=\sum_{k=1}^nk\sum_{a=1}^{\lfloor\frac{n}{k}\rfloor}a\sum_{b=1}^{\lfloor\frac{n}{k}\rfloor}b[gcd(a,b)=1]\\ &=\sum_{k=1}^nk\sum_{a=1}^{\lfloor\frac{n}{k}\rfloor}a\sum_{b=1}^{\lfloor\frac{n}{k}\rfloor}b\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}[d|a][d|b]\mu(d)\\ &=\sum_{k=1}^nk\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}\mu(d)\left(\sum_{a=1}^{\lfloor\frac{n}{k}\rfloor}[d|a]a\right)\left(\sum_{b=1}^{\lfloor\frac{n}{k}\rfloor}[d|b]b\right) \end{align*} \]

Now let's proceed to assume \( a=dp,b=dq \), then

\[ \begin{align*} \displaystyle f(n)&=\sum_{k=1}^nk\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}\mu(d)\left(\sum_{p=1}^{\lfloor\frac{n}{kd}\rfloor}dp\right)\left(\sum_{q=1}^{\lfloor\frac{n}{kd}\rfloor}dq\right)\\ &=\sum_{k=1}^nk\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}\mu(d)d^2\left(\sum_{p=1}^{\lfloor\frac{n}{kd}\rfloor}p\right)\left(\sum_{q=1}^{\lfloor\frac{n}{kd}\rfloor}q\right)\\ \end{align*} \]

Now notice \( \sum_{p=1}^{\lfloor\frac{n}{kd}\rfloor} p=\frac{(1+\lfloor\frac{n}{kd}\rfloor)(\lfloor\frac{n}{kd}\rfloor)}{2} \), and we have

\[ \displaystyle f(n)=\sum_{k=1}^nk\sum_{d=1}^{\lfloor\frac{n}{k}\rfloor}\mu(d)d^2\left(\frac{(1+\lfloor\frac{n}{kd}\rfloor)(\lfloor\frac{n}{kd}\rfloor)}{2}\right)^2 \]

Now following the example above, let \( l=kd \), and then

\[ \displaystyle f(n)=\sum_{l=1}^n\left(\frac{(1+\lfloor\frac{n}{l}\rfloor)(\lfloor\frac{n}{l}\rfloor)}{2}\right)^2\sum_{d|l}\mu(d)ld \]

Notice that \( g(l)=\sum_{d|l}\mu(d)ld \) is also multiplicative, and \( g(p^k)=p^k-p^{k+1} \). We can pre-compute the values of \( g(l) \) via the linear sieve, and \( f(n)=\sum_{l=1}^n\left(\frac{(1+\lfloor\frac{n}{l}\rfloor)(\lfloor\frac{n}{l}\rfloor)}{2}\right)^2g(l) \). The overall complexity will be \( O(n) \).

Example 4

Find out the sum of \( lcm(A[i],A[j]) \) for every pair of integers \( (A[i],A[j]) \) with an array \( A \) of length \( N \) (\( 1\leq i,j\leq N \)), with the constraint of \( 1\leq N\leq 200,000 \) and \( 1\leq A[i] \leq 1,000,000 \).

Solution 4

This is a common variation of the example above. Denote \[ \displaystyle f(n)=\sum_{d}\sum_{a\in A\land d|a}\sum_{b\in A\land d|b}[gcd(a,b)=d]\frac{ab}{d} \]

We simply have

\[ \begin{align*} \displaystyle f(n)&=\sum_{d}\sum_{a\in A\land d|a}\sum_{b\in A\land d|b}[gcd(\frac{a}{d},\frac{b}{d})=1]\frac{ab}{d}\\ &=\sum_{d}\sum_{a\in A\land d|a}\sum_{b\in A\land d|b}\sum_{l}[l|\frac{a}{d}][l|\frac{b}{d}]\mu(l)\frac{ab}{d}\\ &=(\sum_{d}\frac{1}{d}\sum_{l}\mu(l))(\sum_{a\in A}\sum_{b\in A}[dl|a][dl|b]ab) \end{align*} \]

Let \( t=dl \), then

\[ \displaystyle f(n)=(\sum_{t}\sum_{l|t}\frac{l}{t}\mu(l))(\sum_{a\in A\land t|a}a)^2 \]

Now, similar to the example above \( \displaystyle g(t)=\sum_{l|t}\frac{l}{t}\mu(l) \) is multiplicative with \( g(p^k)=p^{-k}-p^{-k+1} \), and thus can be computed with \( O(V) \), where \( V \) is the range span of \( A \). The latter part of the formula \( \displaystyle (\sum_{a\in A\land t|a}a)^2 \) is also fairly easy to figure out if you keep track of the number of the elements equal to \( v \) as \( cnt(v) \), since

\[ \displaystyle \sum_{a\in A\land t|a}a=\sum_{t|v}cnt(v)v\\ =\sum_{i=1}^{\frac{V}{t}}cnt(it)it \]

and with the harmonic lemma, the above computation can be done in \( O(V\log V) \). Hence the overall complexity is \( O(N+V\log V) \).


It is known that all examples above can be further optimized to \( O(n^{\frac{2}{3}}) \). The exact technique will be introduced in the Dirichlet Convolution essay.