No frills proof of higher-order Cheeger inequality

Following some recent applications by Mamoru Tanaka and Laurent Miclo, I was asked where there is a short, no-frills, self-contained, and (possibly) quantitatively non-optimal proof of the higher-order Cheeger inequalities from our paper with Shayan Oveis-Gharan and Luca Trevisan. I thought I would post it here. (If you’re hungering for something new, see this recently posted preprint of my coauthors relating higher eigenvalues to graph expansion.)

[Update: The application of Miclo can also be done using the higher-order Cheeger inequalities of Louis, Raghavendra, Tetali, and Vempala.]

The main simplification comes from doing the random partitioning non-optimally with axis-parallel cubes. For ease of notation, we will deal only with regular graphs, but there will be no quantitative dependence on the degree and this assumption can be removed (see the full paper).

Suppose that {G=(V,E)} is a connected, {n} -vertex, {d} -regular graph. Define the Laplacian by {\mathcal L = I - (1/d)A} , where {A} is the adjacency matrix of {G} . We will think of {\mathcal L} as acting on {\ell^2(V)} , the space of functions {f : V \rightarrow \mathbb R} equipped with the {\ell^2} norm. {\mathcal L} is positive semi-definite with spectrum

\displaystyle  0 = \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_{|V|} \leq 2\,.

We we define the Rayleigh quotient of a function {f \in \ell^2(V)} by

\displaystyle  \mathcal R(f) = \frac{\sum_{u \sim v} (f(u)-f(v))^2}{d \sum_{u \in V} f(u)^2}\,,

where the numerator is summed over edges of {G} . By the variational principle for eigenvalues, we have

\displaystyle  \lambda_k = \min_{\stackrel{W \subseteq \ell^2(V)}{\dim(W)=k}} \max_{0 \neq f \in W} \mathcal R(f)\,. \ \ \ \ \ (1)

For a subset {S \subseteq V} , we define the expansion of {S} by {\phi(S) = \mathcal R(\mathbf{1}_S)} , where {\mathbf{1}_S} is the indicator function of {S} .

Finally, for {k \in \{1,2,\ldots,n\}} , we define the {k} -way expansion constant of {G} by

\displaystyle  \rho_G(k) = \min \left\{ \max_i \phi(S_i) : S_1, S_2, \ldots, S_k \subseteq V \right\}\,,

where the minimum is over collections of {k} disjoint, non-empty subsets of {V} .

The classical discrete Cheeger inequality asserts that

\displaystyle  \frac{\lambda_2}{2} \leq \rho_G(2) \leq \sqrt{2\lambda_2}\,.

We will now prove the following generalization. See the full paper for a discussion of the surrounding issues and better quantitative bounds.

Theorem 1 For every {k \in \{1,2,\ldots,n\}} ,

\displaystyle  \frac{\lambda_k}{2} \leq \rho_G(k) \leq 30 k^{3.5} \sqrt{\lambda_k}\,. \ \ \ \ \ (2)

First, let’s prove the (easy) LHS of (2). Suppose we have {S_1, S_2, \ldots, S_k \subseteq V} which are disjoint and non-empty and which satisfy {\phi(S_i) = \mathcal R(\mathbf{1}_{S_i}) \leq \rho} . Then certainly {W = \mathrm{span}(\mathbf{1}_{S_1}, \ldots, \mathbf{1}_{S_k})} is a {k} -dimensional subspace of {\ell^2(V)} . On the other hand, every {f \in W} satisfies {\mathcal R(f) \leq 2 \rho} because if {f = \alpha_1 \mathbf{1}_{S_1} + \cdots + \alpha_k \mathbf{1}_{S_k}} , then

\displaystyle  \sum_{u \sim v} (f(u)-f(v))^2 \leq 2 \sum_{i=1}^k \alpha_i^2 \sum_{u \sim v} |\mathbf{1}_{S_i}(u)-\mathbf{1}_{S_i}(v)|^2\,,

where we have used the fact that if {u \in S_i} and {v \in S_j} , then

\begin{array}{rcl} |\alpha_i \mathbf{1}_{S_i}(u) - \alpha_j \mathbf{1}_{S_j}(u)|^2 &\leq & 2(\alpha_i^2 + \alpha_j^2) \\ &=& 2\alpha_i^2 |\mathbf{1}_{S_i}(u) - \mathbf{1}_{S_i}(v)|^2 + 2\alpha_j^2 |\mathbf{1}_{S_j}(u)-\mathbf{1}_{S_j}(v)|^2\,.\end{array}

But now using (1), the subspace {W} witnesses the fact that {\lambda_k \leq 2\rho} .

To prove the more difficult RHS of (2), we will use the following discrete Cheeger inequality with boundary conditions.

Lemma 2 For any {f : V\rightarrow \mathbb R} , there is a subset {U \subseteq \{ v \in V : f(v) \neq 0\}} such that

\displaystyle  \phi(U) \leq \sqrt{2 \mathcal R(f)}\,.

Proof: Let {U_t = \{ v \in V : f(v)^2 \geq t \}} . Observe that for each {t > 0} , one has {U_t \subseteq \{ v \in V : f(v) \neq 0\}} . For {S \subseteq V} , let {E(S,\bar S)} denote the edges of {G} with exactly one endpoint in {S} . Then we have

\displaystyle  \begin{array}{rcl}  \int_0^{\infty} |E(U_t, \bar U_t)|\,dt &=& \sum_{\{u,v\} \in E} \left|f(u)^2 - f(v)^2\right| \\ &= & \sum_{\{u,v\} \in E} |f(u) + f(v)| |f(u)-f(v)| \\ &\leq & \sqrt{\sum_{\{u,v\} \in E} (|f(u)|+|f(v)|)^2} \sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2} \\ &\leq & \sqrt{2 d \sum_{u \in V} f(u)^2}\sqrt{\sum_{\{u,v\} \in E} |f(u)-f(v)|^2}. \end{array}

On the other hand, {\int_0^{\infty} d|U_t|\,dt = d \sum_{u \in V} |f(u)|^2,} thus

\displaystyle  \int_0^{\infty} |E(U_t, \bar U_t)|\,dt \leq \sqrt{2 \mathcal R(f)} \int_0^{\infty} d|U_t|\,dt\,,

implying there exists a {t > 0} such that {\phi(U_t) = \frac{|E(U_t,\bar U_t)|}{d |U_t|} \leq \sqrt{2 \mathcal R(f)}} . \Box

In light of the preceding lemma, to prove the RHS of (2), it suffices to find {k} disjointly supported functions {\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R} such that {\mathcal R(\psi_i) \leq (30)^2 k^6 \lambda_k} for each {i=1,2,\ldots,k} . Then Lemma 2 guarantees the existence of disjoint subsets of vertices satisfying our desired conclusion.

Toward this end, let {f_1, f_2, \ldots, f_k : V \rightarrow \mathbb R} denote {k} orthonormal functions satisfying {\mathcal L f_i = \lambda_i f_i} for each {i=1,2,\ldots,k} , i.e. consider the first {k} eigenfunctions of the Laplacian. Define the map {F : V \rightarrow \mathbb R^k} via

\displaystyle  F(v) = \left(f_1(v), f_2(v), \ldots, f_k(v)\right)\,.

We also put {\hat F(v) = F(v)/\|F(v)\|} . (Since {\lambda_1=0} , the function {f_1} takes value {1/\sqrt{n}} everywhere, hence {F} is never zero and this is well-defined.)

The next lemma shows that, in order to find disjointly supported functions, it suffices to find “separated sets.”

Lemma 3 Suppose that for some {\varepsilon > 0} and {0 < \delta \leq 1} , there exist {k} subsets {U_1, U_2, \ldots, U_k \subseteq V} satisfying the conditions:

  1. For every {i=1,2,\ldots,k} , we have {\sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon} , and
  2. For every {i \neq j} , if {u \in U_i} and {v \in U_j} , then

    \displaystyle  \|\hat F(u)-\hat F(v)\| \geq \delta

Then there are disjointly supported functions {\psi_1, \psi_2, \ldots, \psi_k : V \rightarrow \mathbb R} such that {\mathcal R(\psi_i) \leq 25k\lambda_k/(\varepsilon\delta^2)} .

Proof: For each {i=1,2,\ldots,k} , we define maps {\theta_i : V \rightarrow \mathbb R} by

\displaystyle  \theta_i(v) = \max\left(0, 1 - \frac{2}{\delta} \min_{u \in U_i} \|\hat F(u)-\hat F(v)\|\right)\,.

Observe that, by the triangle inequality, for {u,v \in V} , we have

\displaystyle  |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|\hat F(u)-\hat F(v)\|\,. \ \ \ \ \ (3)

Next, we define

\displaystyle  \psi_i(v) = \theta_i(v) \|F(v)\|\,.

Observe that since {\theta_i} is identically {1} on {U_i} , we have

\displaystyle  \sum_{v \in V} \psi_i(v)^2 \geq \sum_{v \in U_i} \|F(v)\|^2 \geq \varepsilon \ \ \ \ \ (4)

by condition (i).

Next, we argue that the functions {\{\theta_i\}} are disjointly supported. This immediately implies that the {\{\psi_i\}} are disjointly supported. If {u \in \mathrm{supp}(\theta_i) \cap \mathrm{supp}(\theta_j)} for some {i \neq j} , then there are points {v \in U_i} and {v' \in U_j} such that {\|\hat F(u)-\hat F(v)\| < \delta/2} and {\|\hat F(u)-\hat F(v')\| < \delta/2} . But then by the triangle inequality, {\|\hat F(v)-\hat F(v')\| < \delta} , violating condition (ii).

Finally, we bound {\mathcal R(\psi_i)} . For any {u,v \in V} and {i=1,2,\ldots,k} , we have

\displaystyle  |\psi_i(u)-\psi_i(v)| \leq \theta_i(u) \left|\vphantom{\bigoplus}\|F(u)\|-\|F(v)\|\right| + \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)|\,. \ \ \ \ \ (5)

Now using (3),

\displaystyle  \|F(v)\| \cdot |\theta_i(u)-\theta_i(v)| \leq \frac{2}{\delta} \|F(v)\| \cdot \|\hat F(u)-\hat F(v)\| \leq \frac{4}{\delta} \|F(u)-F(v)\|\,, \ \ \ \ \ (6)

where we have used the fact that for any non-zero vectors {x,y \in \mathbb R^k} , we have

\displaystyle  \|x\| \left\|\frac{x}{\|x\|} - \frac{y}{\|y\|}\right\| = \left\|x - \frac{\|x\|}{\|y\|} y\right\| \leq \|x-y\| + \left\|y - \frac{\|x\|}{\|y\|} y\right\| \leq 2 \,\|x-y\|\,.

Using (6) and the fact that {\theta_i(u) \leq 1} in (5) yields

\displaystyle  |\psi_i(u)-\psi_i(v)| \leq \left(1+\frac{4}{\delta}\right) \|F(u)-F(v)\|\,. \ \ \ \ \ (7)

Finally, observe that

\displaystyle  \sum_{u \sim v} \|F(u)-F(v)\|^2 = \sum_{i=1}^k \sum_{u \sim v} |f_i(u)-f_i(v)|^2 = d (\lambda_1 + \cdots + \lambda_k) \leq dk\lambda_k\,.

Combining this with (7) and (4) shows that {\mathcal R(\psi_i) \leq \lambda_k \frac{k}{\varepsilon}(1+\frac{4}{\delta})^2 \leq 25k\lambda_k/(\varepsilon \delta^2).} \Box

We will first make the task of finding separated sets slightly easier.

Lemma 4 Suppose that for some {0 < \delta \leq 1} and {m \geq 1} , there are subsets {T_1, T_2, \ldots, T_m \subseteq V} which satisfy:

  1. {\displaystyle \sum_{i=1}^m \sum_{v \in T_i} \|F(v)\|^2 \geq k - \frac14} .
  2. For every {i \neq j} , if {u \in T_i} and {v \in T_j} , then { \|\hat F(u)-\hat F(v)\| \geq \delta }
  3. For every {i=1,2,\ldots, m} ,

    \displaystyle  \sum_{v \in T_i} \|F(v)\|^2 \leq 1 + \frac{1}{2k}\,.

Then there are {k} sets {U_1, U_2, \ldots, U_k \subseteq V} satisfying the assumption of Lemma 3 with {\varepsilon = \frac{1}{4}} .

Proof: We can form the desired sets {\{U_i\}} by taking disjoint unions of the sets {\{T_i\}} . Begin with the family {\{T_1, T_2, \ldots, T_m\}} and repeatedly replace two sets {T} and {T'} by their union if they each satisfy {\sum_{v \in T} \|F(v)\|^2 < \frac12} .

When we finish, we are left with a collection of sets each member of which satisfies {\sum_{v \in T} \|F(v)\|^2 \in [\frac12, 1+\frac{1}{2k}]} , and possibly one set for which {\sum_{v \in T} \|F(v)\|^2 < 1/2} . By (i), we must end with at least {k} sets which each satisfy {\sum_{v \in T} \|F(v)\|^2 \geq 1/4} . \Box

Our final lemma, which finishes the proof of Theorem 1, simply asserts that such sets exist. We will no longer need the fact that {F} comes from eigenfunctions.

Lemma 5 Suppose that {g_1, g_2, \ldots, g_k : V \rightarrow \mathbb R} are orthornormal in {\ell^2(V)} . Let

\displaystyle G(v) = (g_1(v), g_2(v), \ldots, g_k(v))

and {\hat G(v) = G(v)/\|G(v)\|} . Then there is an {m \geq 1} and subsets {T_1, T_2, \ldots, T_m \subseteq V} such that

  1. {\displaystyle\sum_{i=1}^m \sum_{v \in T_i} \|G(v)\|^2 \geq k - \frac14} .
  2. For every {i \neq j} , if {u \in T_i} and {v \in T_j} , then

    \displaystyle \|\hat G(u)-\hat G(v)\| \geq \frac{1}{2\sqrt{2} k^{3}}\,.

  3. For every {i = 1,2,\ldots,m} ,

    \displaystyle  \sum_{v \in T_i} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,.

Proof: Consider the {n \times k} matrix {A} which has columns {g_1, g_2, \ldots, g_k} . For any {x \in \mathbb R^k} , we have

\displaystyle  \sum_{v \in V} \langle x, G(v)\rangle^2 = \|Ax\|^2 = x^T A^T A x = x^T x = \|x\|^2\,, \ \ \ \ \ (8)

since the columns of {A} are orthornormal

For a subset {X \subseteq S^{k-1}} of the unit sphere in {\mathbb R^k} , we put {V(X) = \{ v \in V : \hat G(v) \in X \}} . Fix some {x \in X} and let {\mathsf{diam}(X)} denote the diameter of {X} , then by (8), we have

\displaystyle  1 = \sum_{v \in V(X)} \langle x, G(v)\rangle^2 = \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\|\hat G(v)-x\|^2}{2}\right)^2 \geq \sum_{v \in V(X)} \|G(v)\|^2 \left(1-\frac{\mathrm{diam}(X)^2}{2}\right)^2\,.

We conclude that if {\mathrm{diam}(X) \leq 1/\sqrt{2k}} , then

\displaystyle  \sum_{v \in V(X)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}\,. \ \ \ \ \ (9)

Now, let {\mathcal{P}} be a partition of {\mathbb R^k} into axis-parallel squares of side length {L = \frac{1}{k \sqrt{2}}} . For any such square {Q \in \mathcal P} , we let {\tilde Q \subseteq Q} denote the set of points which are at Euclidean distance at least {\frac{L}{4k^2}} from every side of {Q} . Observe that

\displaystyle  \mathrm{vol}(\tilde Q) \geq \left(1-\frac{1}{4k^2}\right)^k \mathrm{vol}(Q) \geq \left(1-\frac{1}{4k}\right) \mathrm{vol}(Q)\,. \ \ \ \ \ (10)

Consider now the collection of sets {\{V(\tilde Q) : Q \in \mathcal P\}} . Since {\mathrm{diam}(\tilde Q) \leq L \sqrt{k} = \frac{1}{\sqrt{2k}}} , (9) implies that {\sum_{v \in V(\tilde Q)} \|G(v)\|^2 \leq 1 + \frac{1}{2k}} . Furthermore, by construction, for any {u \in V(\tilde Q)} and {v \in V(\tilde Q')} with {Q \neq Q'} , we have

\displaystyle  \|\hat G(u)-\hat G(v)\| \geq 2 \frac{L}{4k^2} \geq \frac{1}{2\sqrt{2} k^{3}}\,.

Thus the collection of sets {\{V(\tilde Q) : Q \in \mathcal P\}} satisfy both conditions (ii) and (iii) of the lemma. We are left to verify (i).

Note that {\sum_{v \in V} \|G(v)\|^2 = \sum_{i=1}^k \sum_{v \in V} g_i(v)^2 = k} . If we choose a uniformly random axis-parallel translation {\mathcal P'} of the partition {\mathcal P} , then (10) implies

\displaystyle  \mathbb{E} \sum_{Q \in \mathcal P'} \sum_{v \in V(\tilde Q)} \|G(v)\|^2 \geq \left(1-\frac{1}{4k}\right) \sum_{v \in V(Q)} \|G(v)\|^2 \geq k - \frac14\,.

In particular, there exists some fixed partition that achieves this bound. For this partition {\mathcal P} , the sets {\{V(\tilde Q) : Q \in \mathcal P\}} satisfy all three conditions of the lemma, completing the proof.

7 thoughts on “No frills proof of higher-order Cheeger inequality

  1. I do not understand the formula (3), please board can make more explicit. thank you.

  2. I agree, there are a number of steps contained in that inequality. First, observe that if S is a set in a metric space and d(x,S) is the distance from x to the closest point in S, then |d(x,S)-d(y,S)| \leq d(x,y). In other words, the map f(x)=d(x,S) is 1-Lipschitz. Now observe that scaling a 1-Lipschitz map by C, it becomes C-Lipschitz. Adding a constant, it remains C-Lipschitz. Finally, if f is C-Lipschitz, then the map g(x)=max(0,f(x)) is also C-Lipschitz. Combine all these together for the distance d(u,v) = \|\hat F(u)-\hat F(v)\| and you get (3).

  3. Thanks to Professor Lee very much. I am very interested in this problem, because I am learning, learn about it in earnest.

  4. Dear Mr. Lee.
    I think, there is a typographical error in Lemma 3 does not.
    $\mathcal{R}(\psi_i) \leq 25k / (\varepsilon \delta^2) $
    to be $\mathcal{R}(\psi_i) \leq 25k \lambda_k / (\varepsilon \delta^2) $.
    Because at the end of the line when we demonstrated that the combination of the three that still \lambda_k.

  5. and if we choose $$ \delta = \frac{1}{2 \sqrt{2} k^{5/2}} $$ as in 2. of Lemma 5 it is:
    $$ \mathcal{R}(\psi_i) \leq 800 k^6 \lambda_k $$
    better than $$ \mathcal{R}(\psi_i) \leq 30^2 k^6 \lambda_k $$
    this true?
    Thank you much.

  6. Dear Mr. Lee.
    I think, there is a typographical error in Lemma 3 does not.
    \mathcal{R}(\psi_i) \leq 25k / (\varepsilon \delta^2)
    to be \mathcal{R}(\psi_i) \leq 25k \lambda_k / (\varepsilon \delta^2) .
    Because at the end of the line when we demonstrated that the combination of the three that still \lambda_k .

    [Thanks. I fixed it. –j]

  7. and if we choose \delta = \frac{1}{2 \sqrt{2} k^{5/2}} as in 2. of Lemma 5 it is:
    \mathcal{R}(\psi_i) \leq 800 k^6 \lambda_k
    better than \mathcal{R}(\psi_i) \leq 30^2 k^6 \lambda_k
    this true?
    Thank you much.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s