In the last post, I recalled the problem of dimension reduction for finite subsets of $\ell_1$. I thought I should mention the main obstacle to reducing the dimension below $n$ for $n$-point subsets: it can't be done with linear mappings.
All the general results mentioned in that post use a linear mapping. In fact, they are all of the form:
- Change of density, i.e. preprocess the points/subspace so that no point has too much weight on any one coordinate.
- Choose a subset of the coordinates, possibly multiplying the chosen coordinates by non-negative weights. (Note that the Newman-Rabinovich result, based on Batson, Spielman, and Srivastava, is deterministic, while in the other bounds, the sampling is random.)
(The dimension reduction here is non-linear, but it only applies to special subsets of $\ell_1$, like the Brinkman-Charikar point set.)
The next theorem shows that linear dimension reduction mappings cannot do better than $\Omega(n)$ dimensions.
Theorem: For every $1 \leq p < 2$, there are arbitrarily large $n$-point subsets of $\ell_p$ on which any linear embedding into $\ell_p^d$ incurs distortion at least $c_p \sqrt{n/d}$, where $c_p > 0$ is a constant depending only on $p$.
This theorem immediately implies that there are $n$-point subsets on which any linear embedding requires dimension $\Omega(n/D^2)$ for a $D$-distortion embedding. The case $p = 1$ of the preceding theorem was proved by Charikar and Sahai. A simpler proof, which extends to all $1 \leq p < 2$, is given in Lemma 3.1 of a paper by myself, Mendel, and Naor.
Since I’ve been interacting a lot with the theory group at MSR Redmond (see the UW-MSR Theory Center), I’ve been asked occasionally to propose problems in the geometry of finite metric spaces that might be amenable to probabilistic tools. Here’s a fundamental problem that’s wide open. Let $d_D(n)$ be the smallest number such that every $n$-point subset of $\ell_1$ embeds into $\ell_1^{d_D(n)}$ with distortion at most $D$. Here’s what’s known.
- Talagrand (following work of Bourgain-Lindenstrauss-Milman and Schechtman) proved that for every $\epsilon > 0$, every $d$-dimensional subspace of $L_1$ admits a $(1+\epsilon)$-distortion embedding into $\ell_1^{d'}$ with $d' = O_\epsilon(d \log d)$. In particular, this gives $d_{1+\epsilon}(n) = O_\epsilon(n \log n)$.
- Brinkman and Charikar showed that $d_D(n) \geq n^{\Omega(1/D^2)}$ for every $D \geq 1$. A significantly simpler proof was later given by Assaf Naor and myself. (With Brinkman and Karagiozova, we have also shown that this bound is tight for the Brinkman-Charikar examples and their generalizations.)
- Recently, Newman and Rabinovich showed that one can take $d_{1+\epsilon}(n) = O(n/\epsilon^2)$ for any $\epsilon > 0$. Their paper relies heavily on the beautiful spectral sparsification method of Batson, Spielman, and Srivastava. In fact, it is shown that one can use only $O(n/\epsilon^2)$ weighted cuts (see the paper for details). This also hints at a limitation of their technique, since it is easy to see that the path metric on $\{1, 2, \ldots, n\}$ requires $\Omega(n)$ cuts for a constant-distortion embedding (and obviously only one dimension).
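As a concrete aside on the cut representation mentioned above: an embedding into $\ell_1$ corresponds to a nonnegative combination of cut semimetrics, and for the path metric on $\{1, \ldots, n\}$ the $n-1$ "threshold" cuts reproduce the metric exactly. A small brute-force sketch (the code and all names are ours, purely illustrative):

```python
# The path metric on {1,...,n} (the real line restricted to n points)
# decomposes exactly into the n-1 threshold cut semimetrics {1..k} vs {k+1..n}.
n = 6
points = range(1, n + 1)

def cut_metric(S, u, v):
    """Cut semimetric: distance 1 if exactly one of u, v lies in S, else 0."""
    return 1 if ((u in S) != (v in S)) else 0

# The n-1 threshold cuts {1}, {1,2}, ..., {1,...,n-1}.
cuts = [set(range(1, k + 1)) for k in range(1, n)]

# Summing the threshold cuts recovers |u - v| for every pair.
for u in points:
    for v in points:
        assert sum(cut_metric(S, u, v) for S in cuts) == abs(u - v)
```

Since the pair $\{1, n\}$ is at distance $n-1$ and each cut contributes at most $1$ per pair, no representation with fewer cuts can even reproduce this metric up to constant distortion without scaling up nearby distances.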
The open problem is to get better bounds. For instance, for any fixed distortion $D$, we only know that $n^{\Omega(1/D^2)} \lesssim d_D(n) \lesssim n$.
There is evidence that $\sqrt{n}$ might be the right order of magnitude. In the large distortion regime, when $D = \Omega(\sqrt{\log n} \cdot \log \log n)$, results of Arora, myself, and Naor show that $d_D(n) = O(\log n)$.
My student Punya Biswal just completed this great survey on hypercontractivity and its applications in computer science. There is a PDF version on his home page, along with accompanying slides.
Hypercontractive inequalities are a useful tool in dealing with extremal questions in the geometry of high-dimensional discrete and continuous spaces. In this survey we trace a few connections between different manifestations of hypercontractivity, and also present some relatively recent applications of these techniques in computer science.
1. Preliminaries and notation
Fourier analysis on the hypercube. We work with real-valued functions on the discrete cube $\{-1,1\}^n$, and define the inner product $\langle f, g \rangle = \mathbb{E}[f(x) g(x)]$, where the expectation is taken over the uniform measure on $\{-1,1\}^n$. The multilinear polynomials $\chi_S(x) = \prod_{i \in S} x_i$ (where $S$ ranges over subsets of $\{1, \ldots, n\}$) form an orthonormal basis under this inner product; they are called the Fourier basis. Thus, for any function $f : \{-1,1\}^n \to \mathbb{R}$, we have $f = \sum_S \hat{f}(S) \chi_S$, where the Fourier coefficients $\hat{f}(S) = \langle f, \chi_S \rangle$ obey Plancherel's relation $\sum_S \hat{f}(S)^2 = \langle f, f \rangle$. It is easy to verify that $\hat{f}(\emptyset) = \mathbb{E}[f]$ and $\langle f, g \rangle = \sum_S \hat{f}(S) \hat{g}(S)$.
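These definitions are easy to check by brute force on a small cube. The following sketch (all function names are ours, not the survey's) expands the 3-bit majority function in the Fourier basis and verifies Plancherel's relation:

```python
# Brute-force Fourier analysis on {-1,1}^n, following the definitions above.
from itertools import product

n = 3
cube = list(product([-1, 1], repeat=n))

def chi(S, x):
    """Character chi_S(x) = prod_{i in S} x_i."""
    out = 1
    for i in S:
        out *= x[i]
    return out

def fourier_coefficient(f, S):
    """hat{f}(S) = <f, chi_S> = E_x[f(x) chi_S(x)] under the uniform measure."""
    return sum(f(x) * chi(S, x) for x in cube) / len(cube)

subsets = [tuple(i for i in range(n) if (m >> i) & 1) for m in range(2 ** n)]

f = lambda x: 1 if sum(x) > 0 else -1  # majority on 3 bits (sum is never 0)

# Fourier inversion: f(x) = sum_S hat{f}(S) chi_S(x).
for x in cube:
    expansion = sum(fourier_coefficient(f, S) * chi(S, x) for S in subsets)
    assert abs(expansion - f(x)) < 1e-9

# Plancherel: sum_S hat{f}(S)^2 = E[f^2], which equals 1 for a +/-1-valued f.
plancherel = sum(fourier_coefficient(f, S) ** 2 for S in subsets)
assert abs(plancherel - 1.0) < 1e-9
```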
Norms. For $p \geq 1$, define the norm $\|f\|_p = \left(\mathbb{E}|f(x)|^p\right)^{1/p}$. These norms are monotone in $p$: for every function $f$, $p \leq q$ implies $\|f\|_p \leq \|f\|_q$. For a linear operator $T$ carrying functions $f : \{-1,1\}^n \to \mathbb{R}$ to functions $Tf : \{-1,1\}^n \to \mathbb{R}$, we define the $p$-to-$q$ operator norm $\|T\|_{p \to q} = \sup_{f \neq 0} \|Tf\|_q / \|f\|_p$. $T$ is said to be a contraction from $L_p$ to $L_q$ when $\|T\|_{p \to q} \leq 1$. Because of the monotonicity of norms, a contraction from $L_p$ to $L_q$ is automatically a contraction from $L_p$ to $L_{q'}$ for any $q' \leq q$. When $q > p$ and $\|T\|_{p \to q} \leq 1$, then $T$ is said to be hypercontractive.
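The monotonicity claim follows from the power-mean (Jensen) inequality, and is easy to sanity-check numerically with the expectation-normalized norms above (illustrative code; names are ours):

```python
# Sanity check of norm monotonicity on the cube: p <= q implies
# ||f||_p <= ||f||_q when norms are defined via expectations.
from itertools import product

n = 3
cube = list(product([-1, 1], repeat=n))

def norm(f, p):
    """||f||_p = (E_x |f(x)|^p)^(1/p) over the uniform measure on {-1,1}^n."""
    return (sum(abs(f(x)) ** p for x in cube) / len(cube)) ** (1.0 / p)

f = lambda x: x[0] + 2 * x[1] - x[2]  # an arbitrary test function

ps = [1, 1.5, 2, 3, 4]
norms = [norm(f, p) for p in ps]
assert all(a <= b + 1e-12 for a, b in zip(norms, norms[1:]))
```

Note that the expectation (rather than counting-measure) normalization is what makes the norms increasing in $p$; with plain sums the inequality would reverse.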
Convolution operators. Letting $xy$ represent the coordinatewise product of $x, y \in \{-1,1\}^n$, we define the convolution of two functions $f, g : \{-1,1\}^n \to \mathbb{R}$ by $(f * g)(x) = \mathbb{E}_y[f(xy) g(y)]$, and note that $f \mapsto f * g$ is a linear operator for every fixed $g$. Convolution is commutative and associative, and the Fourier coefficients of a convolution satisfy the useful property $\widehat{f * g}(S) = \hat{f}(S) \hat{g}(S)$. We shall be particularly interested in the convolution properties of the following functions
- The Dirac delta $\delta$, given by $\delta(1, \ldots, 1) = 2^n$ and $\delta(x) = 0$ otherwise. It is the identity for convolution and has $\hat{\delta}(S) = 1$ for all $S$.
- The edge functions $e_i = \frac{1}{2}(\delta - \delta \circ \sigma_i)$, given by
$e_i(x) = 2^{n-1}$ if $x = (1, \ldots, 1)$, $e_i(x) = -2^{n-1}$ if $x = \sigma_i(1, \ldots, 1)$, and $e_i(x) = 0$ otherwise, where $\sigma_i x$ is obtained from $x$ by flipping just the $i$th bit. The Fourier coefficient $\hat{e_i}(S)$ is $1$ or $0$ according as $S$ contains or does not contain $i$, respectively. For any function $f$, $(e_i * f)(x) = \frac{1}{2}\left(f(x) - f(\sigma_i x)\right)$. Convolution with $e_i$ acts as an orthogonal projection (as we can easily see in the Fourier domain), so for any functions $f, g$, we have $\langle e_i * f, g \rangle = \langle f, e_i * g \rangle = \langle e_i * f, e_i * g \rangle$.
- The Bonami-Gross-Beckner noise functions $h_\rho$ for $\rho \in [-1, 1]$, where $h_\rho(x) = \prod_{i=1}^{n} (1 + \rho x_i)$, and the associated noise operator $T_\rho$, acting on functions on the discrete cube by $T_\rho f = h_\rho * f$. These operators form a semigroup, because $h_\rho * h_{\rho'} = h_{\rho \rho'}$ and hence $T_\rho T_{\rho'} = T_{\rho \rho'}$. Note that $\hat{h}_\rho(S) = \rho^{|S|}$. In combinatorial terms, $T_\rho f(x)$ is the expected value of $f(y)$, where $y$ is obtained from $x$ by independently flipping each bit of $x$ with probability $(1-\rho)/2$.
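To make the last identification concrete, here is a brute-force check (illustrative code; all names are ours, and the product form of $h_\rho$ follows the reconstruction above) that convolution with $h_\rho$ agrees with the bit-flip description of $T_\rho$ on the 3-dimensional cube:

```python
# Check that convolution with h_rho(x) = prod_i (1 + rho*x_i) implements the
# combinatorial noise operator: T_rho f(x) = E[f(y)], where y flips each bit
# of x independently with probability (1 - rho)/2.
from itertools import product

n = 3
rho = 0.6
cube = list(product([-1, 1], repeat=n))

def mul(x, y):
    """Group operation on {-1,1}^n: coordinatewise product."""
    return tuple(a * b for a, b in zip(x, y))

def convolve(f, g):
    """(f*g)(x) = E_y[f(xy) g(y)] over the uniform measure."""
    return lambda x: sum(f(mul(x, y)) * g(y) for y in cube) / len(cube)

def make_h(rho):
    """The noise function h_rho(x) = prod_i (1 + rho * x_i)."""
    def h(x):
        out = 1.0
        for xi in x:
            out *= 1 + rho * xi
        return out
    return h

def noise(f, rho):
    """Exact expectation of f over independent bit flips with prob (1-rho)/2."""
    def Tf(x):
        total = 0.0
        for y in cube:
            p = 1.0
            for xi, yi in zip(x, y):
                p *= (1 + rho) / 2 if xi == yi else (1 - rho) / 2
            total += p * f(y)
        return total
    return Tf

f = lambda x: 1 if sum(x) > 0 else -1  # majority on 3 bits
Tf_conv = convolve(f, make_h(rho))
Tf_comb = noise(f, rho)
for x in cube:
    assert abs(Tf_conv(x) - Tf_comb(x)) < 1e-9
```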
Proof: This is easy in the Fourier basis: $\widehat{h_\rho * h_{\rho'}}(S) = \hat{h}_\rho(S) \, \hat{h}_{\rho'}(S) = \rho^{|S|} (\rho')^{|S|} = (\rho \rho')^{|S|} = \hat{h}_{\rho \rho'}(S)$, so $h_\rho * h_{\rho'} = h_{\rho \rho'}$.
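The semigroup law can also be confirmed numerically straight from the combinatorial bit-flip description, without any Fourier analysis (again an illustrative sketch with our own names):

```python
# Numerically verify the semigroup law T_rho' T_rho = T_{rho * rho'} using the
# combinatorial description of the noise operator.
from itertools import product

n = 3
cube = list(product([-1, 1], repeat=n))

def noise(f, rho):
    """T_rho f(x): flip each bit of x independently with probability (1-rho)/2."""
    def Tf(x):
        total = 0.0
        for y in cube:
            p = 1.0
            for xi, yi in zip(x, y):
                p *= (1 + rho) / 2 if xi == yi else (1 - rho) / 2
            total += p * f(y)
        return total
    return Tf

f = lambda x: x[0] + 2 * x[0] * x[1] - x[2]  # an arbitrary test function
rho1, rho2 = 0.7, 0.4

lhs = noise(noise(f, rho1), rho2)  # apply T_{rho1}, then T_{rho2}
rhs = noise(f, rho1 * rho2)        # apply T_{rho1 * rho2} directly
for x in cube:
    assert abs(lhs(x) - rhs(x)) < 1e-9
```

Intuitively, flipping each bit with probability $(1-\rho_1)/2$ and then $(1-\rho_2)/2$ yields a net flip probability of $(1 - \rho_1 \rho_2)/2$, which is exactly the semigroup statement.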