Representation theory, for the impatient

Let {G} be a finite group. The group algebra {\mathbf{C} G} is the complex algebra with basis {G}, with multiplication defined by linearly extending the group law of {G}. The description of {\mathbf{C} G} as an algebra is called representation theory. Relatedly, we want to understand the structure of {\mathbf{C} G}-modules. We often call {\mathbf{C} G}-modules (complex) representations of {G}, and simple {\mathbf{C} G}-modules irreducible representations.
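For concreteness, here is a minimal computational sketch of the group algebra {\mathbf{C} S_3} (the encoding of algebra elements as coefficient vectors, and the names below, are ours):

```python
import itertools
import numpy as np

# A minimal sketch (encoding ours) of the group algebra C[S3]: elements of S3
# are permutation tuples, and an algebra element is its coefficient vector.
G = list(itertools.permutations(range(3)))     # the 6 elements of S3
idx = {g: i for i, g in enumerate(G)}

def compose(g, h):
    """Composition of permutations: (g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))

def mult(x, y):
    """Multiplication in C[S3]: linearly extend the group law."""
    z = np.zeros(len(G), dtype=complex)
    for i, g in enumerate(G):
        for j, h in enumerate(G):
            z[idx[compose(g, h)]] += x[i] * y[j]
    return z

one = np.zeros(len(G), dtype=complex)
one[idx[(0, 1, 2)]] = 1                        # the unit 1 of C[S3]
```

The unit is the identity permutation, and multiplication is associative because the group law is.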

Lemma 1 (Schur’s lemma) If {V} and {V'} are simple {\mathbf{C} G}-modules, then every nonzero homomorphism {V\rightarrow V'} is an isomorphism. Moreover every homomorphism {V\rightarrow V} is a scalar multiple of the identity.

Proof: Suppose that {f:V\rightarrow V'} is a nonzero homomorphism. Then {\ker f} is a submodule, so {\ker f = 0}, and similarly {f(V)} is a nonzero submodule of {V'}, so {f(V) = V'}, so {f} is an isomorphism. Now suppose {V=V'}. Since {\mathbf{C}} is algebraically closed, {f} has an eigenvalue {\lambda}, and every eigenspace of {f} is a submodule, so by simplicity the {\lambda}-eigenspace, being nonzero, must be all of {V}, i.e., {f} is a scalar multiple of the identity. \Box

It turns out that we get a unitary structure on {\mathbf{C} G}-modules for free.

Lemma 2 (Weyl’s averaging trick) Every {\mathbf{C} G}-module {V} has a {G}-invariant inner product. Moreover if {V} is simple then this inner product is unique up to scaling.

Proof: Let {V} be a {\mathbf{C} G}-module and let {(,)} be any inner product on {V}. Define a new inner product {\langle,\rangle} by

\displaystyle \langle u, v\rangle = \frac{1}{|G|} \sum_{g\in G} (g\cdot u, g\cdot v).

Clearly {\langle,\rangle} has the desired properties. Now suppose that {V} is simple, and that {\langle,\rangle'} is another {G}-invariant inner product. Let {f:V\rightarrow V} be the adjoint of the formal identity

\displaystyle (V,\langle,\rangle)\rightarrow(V,\langle,\rangle').

In other words let {f} be the unique function {V\rightarrow V} satisfying

\displaystyle \langle u,v\rangle' = \langle u, f(v)\rangle

for all {u,v\in V}. Then {f} is a homomorphism, so by Schur’s lemma it must be a multiple of the identity, so {\langle,\rangle'} must be a multiple of {\langle,\rangle}. \Box
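A numerical illustration of the averaging trick (the representation below, a non-unitary conjugate of a rotation of order {3}, is our example, not from the text):

```python
import numpy as np

# Our example: a representation of Z/3 on C^2 that is NOT unitary for the
# standard inner product (a conjugated rotation), made unitary by averaging.
rng = np.random.default_rng(1)
theta = 2 * np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])          # rotation of order 3
S = np.eye(2) * 3 + rng.standard_normal((2, 2))          # generic invertible S
A = S @ R @ np.linalg.inv(S)                             # A^3 = I, A not unitary

# Averaged Gram matrix: <u, v> := u^* M v, with
# M = (1/3) sum_k (A^k)^* (A^k), invariant by construction.
M = sum(np.linalg.matrix_power(A, k).conj().T @ np.linalg.matrix_power(A, k)
        for k in range(3)) / 3
```

Invariance here means {A^* M A = M}, which is exactly the statement that {\langle Au, Av\rangle = \langle u, v\rangle} for the averaged inner product.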

Lemma 3 Let {U} be a finite-dimensional {\mathbf{C} G}-module with {G}-invariant inner product {\langle,\rangle}, and for each simple {\mathbf{C} G}-module {V} let {U_V} be the sum of all submodules of {U} isomorphic to {V} (the isotypic component of {U} corresponding to {V}). Then {U_V\cong V^{\oplus m}} for some {m}, and {U} is the orthogonal direct sum of the submodules {U_V}.

Proof: Since the orthogonal complement of a submodule is a submodule, it follows by induction on dimension that {U} is an orthogonal direct sum of simple submodules, so {U = \bigoplus U'_V}, where {V} runs over simple {\mathbf{C} G}-modules and each {U'_V \cong V^{\oplus m}} for some {m\geq 0}. Moreover by Schur’s lemma any two nonisomorphic simple submodules must be orthogonal, as the orthogonal projection from one to the other is a homomorphism, so every submodule of {U} isomorphic to {V} must be contained in {U'_V}, so {U'_V=U_V}. \Box

The above lemma is particularly interesting when applied to the {\mathbf{C} G}-module {\mathbf{C} G} itself, the regular representation. We give {\mathbf{C} G} a {G}-invariant inner product by declaring the basis {G} to be orthonormal. From the above lemma we know then that {\mathbf{C} G = \bigoplus \mathbf{C} G_V}, where {V} runs over simple {\mathbf{C} G}-modules, and {\mathbf{C} G_V\cong V^{\oplus m}} for some {m\geq 0}. Note then by Schur’s lemma that {\dim\textup{Hom}(\mathbf{C} G,V)=m}. On the other hand every homomorphism {\mathbf{C} G\rightarrow V} is determined uniquely by the image of the unit {1} in {V}, so {\dim\textup{Hom}(\mathbf{C} G,V)=\dim V}. We deduce that as {\mathbf{C} G}-modules

\displaystyle \mathbf{C} G \cong \bigoplus V^{\oplus \dim V}.

In particular there are only finitely many simple {\mathbf{C} G}-modules, and their dimensions obey

\displaystyle |G| = \sum (\dim V)^2.

One can also prove {\mathbf{C} G_V\cong V^{\oplus\dim V}} in a more informative way, as follows. Fix an invariant inner product {\langle,\rangle_V} on {V}, and consider any homomorphism {f:V\rightarrow\mathbf{C} G}. The adjoint {f^*:\mathbf{C} G\rightarrow V} is also a homomorphism, so

\displaystyle  f(v) = \sum_{g\in G}\langle f(v),g\rangle_{\mathbf{C} G} g = \sum_{g\in G}\langle v, f^*(g)\rangle_V g = \sum_{g\in G}\langle v,g f^*(1)\rangle_V g.

Conversely for any {u\in V} we may define a homomorphism {V\rightarrow\mathbf{C} G} by

\displaystyle  f_{V,u}(v) = \sum_{g\in G} \langle v,g\cdot u\rangle_V g.

We deduce therefore that {\mathbf{C} G_V} is the subspace of {\mathbf{C} G} spanned by the elements {f_{V,u}(v)}, where {u,v\in V}. Moreover using Schur’s lemma one can show that the images of {f_{V,u}} and {f_{V,u'}} are orthogonal whenever {u} and {u'} are orthogonal in {V}, so by letting {u} range over a basis of {V} we thus see that {\mathbf{C} G_V} is the orthogonal direct sum of {\dim V} copies of {V}.
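Here is a numerical sketch of the maps {f_{V,u}} (the realisation is ours) for the {2}-dimensional standard representation of {S_3}, the sum-zero subspace of {\mathbf{C}^3} with permutation action; one can check that orthogonal {u, u'} give orthogonal images, as claimed:

```python
import itertools
import numpy as np

# Our realisation of the standard representation V of S3: the sum-zero
# subspace of C^3, with S3 permuting coordinates; the ambient standard
# inner product restricts to a G-invariant inner product on V.
G = list(itertools.permutations(range(3)))

def perm_matrix(g):
    P = np.zeros((3, 3))
    for i in range(3):
        P[g[i], i] = 1
    return P

u1 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)   # orthonormal basis of V
u2 = np.array([1.0, 1.0, -2.0]) / np.sqrt(6)

def f(u, v):
    """f_{V,u}(v) = sum_g <v, g.u> g, as a coefficient vector indexed by G."""
    return np.array([np.vdot(v, perm_matrix(g) @ u) for g in G])
```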

In any case we now understand the structure of {\mathbf{C} G} as a {\mathbf{C} G}-module, and we are only a short step away from understanding its structure as an algebra. Consider the obvious map

\displaystyle \mathbf{C} G \longrightarrow \bigoplus \textup{End}(V),

where as always the sum runs over a complete set of irreducible representations up to isomorphism. We claim this map is an isomorphism. Since we already know the dimensions agree, it suffices to prove injectivity. Thus suppose {x\in\mathbf{C} G} maps to zero, i.e., that {x} acts as zero on each simple {\mathbf{C} G}-module {V}. Then {x} acts as zero on {\mathbf{C} G}, so {x = x1 = 0}. Thus we have proved the following theorem.

Theorem 4 As complex algebras, {\mathbf{C} G \cong \bigoplus \textup{End}(V)}.
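For {G=S_3} the theorem can be checked by brute force (the explicit model of the three irreducibles below, trivial, sign and standard, is our construction): the map {\mathbf{C} S_3\rightarrow \mathbf{C}\oplus\mathbf{C}\oplus M_2(\mathbf{C})} is a linear bijection, in accordance with {6 = 1^2 + 1^2 + 2^2}.

```python
import itertools
import numpy as np

# Our explicit model of the three irreducibles of S3: trivial, sign, and the
# 2-dimensional standard representation on the sum-zero subspace of R^3.
G = list(itertools.permutations(range(3)))

def compose(g, h):
    return tuple(g[h[i]] for i in range(3))

def perm_matrix(g):
    P = np.zeros((3, 3))
    for i in range(3):
        P[g[i], i] = 1
    return P

def sign(g):
    return (-1) ** sum(g[i] > g[j] for i in range(3) for j in range(i + 1, 3))

B = np.column_stack([np.array([1.0, -1.0, 0.0]) / np.sqrt(2),
                     np.array([1.0, 1.0, -2.0]) / np.sqrt(6)])

def std(g):
    return B.T @ perm_matrix(g) @ B          # the standard rep in the basis B

def rho(g):
    """Image of g under CG -> End(triv) + End(sign) + End(std), flattened."""
    return np.concatenate([[1.0], [sign(g)], std(g).ravel()])

T = np.array([rho(g) for g in G])            # 6x6 matrix of the algebra map
```

Injectivity shows up numerically as {T} having full rank {6}.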

Finally, it is useful to understand how to project onto isotypic components. Given {g\in G}, we can compute the trace of {g} as an operator on {\mathbf{C} G} in two different ways. On the one hand, by looking at the basis {G},

\displaystyle  \textup{tr}_{\mathbf{C} G} (g) = \begin{cases} |G| & \text{if }g=1,\\ 0 &\text{if }g\neq 1.\end{cases}

On the other hand from the decomposition {\mathbf{C} G\cong\bigoplus V^{\oplus\dim V}} we have

\displaystyle  \textup{tr}_{\mathbf{C} G} (g) = \sum_V (\dim V) \textup{tr}_V(g).

As a consequence, for every {x\in \mathbf{C} G} we have

\displaystyle  x = \sum_V P_V x,

where {P_V : \mathbf{C} G \rightarrow \mathbf{C} G} is the operator defined by

\displaystyle  P_V x = \frac{\dim V}{|G|} \sum_{g\in G} \textup{tr}_V(xg^{-1}) g = \frac{\dim V}{|G|} \sum_{g\in G} \textup{tr}_V(g^{-1}) gx.

These identities are most easily verified first for {x\in G}, then extended to all of {\mathbf{C} G} by linearity. Now if {x \in \mathbf{C} G_U} for {U\neq V} then {x} acts as zero on {V}, so {\textup{tr}_V(x g^{-1}) = 0}, so {P_V x = 0}. On the other hand one can verify directly that {P_V} is a homomorphism, so by Schur’s lemma the image of {P_V} must be contained in {\mathbf{C} G_V}. We deduce therefore from {x = \sum_V P_V x} that {P_V} is the orthogonal projection onto {\mathbf{C} G_V}.

The function {\chi_V(g) = \textup{tr}_V(g)} is usually called the character of {V}. From the relations {P_V^2 = P_V} and {P_U P_V = 0} for {U\neq V} one can deduce the well known orthogonality relations for characters. In fact the distinction between {P_V} and {\chi_V} is hardly more than notational. Often we identify functions {f:G\rightarrow \mathbf{C}} with elements {\sum_{g\in G} f(g) g \in \mathbf{C} G}, in which case the operation of convolution corresponds to multiplication in the group algebra. The operator {P_V} is then just convolution with the function {g\mapsto \frac{\dim V}{|G|}\chi_V(g^{-1})}. So, in brief, to project onto the {V}-isotypic component you convolve with the (inverted) character of {V} and multiply by {\dim V/|G|}.
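Here is a numerical sanity check of these projections in {\mathbf{C} S_3}, using the standard facts that the characters of {S_3} are the trivial character, the sign, and the permutation character minus one (the code and naming are our sketch):

```python
import itertools
import numpy as np

# Numerical check in C[S3]: the characters of S3 are the trivial character,
# the sign, and chi_std(g) = #fixed points(g) - 1.
G = list(itertools.permutations(range(3)))
idx = {g: i for i, g in enumerate(G)}

def compose(g, h):
    return tuple(g[h[i]] for i in range(3))

def inverse(g):
    inv = [0, 0, 0]
    for i in range(3):
        inv[g[i]] = i
    return tuple(inv)

def mult(x, y):
    z = np.zeros(len(G), dtype=complex)
    for i, g in enumerate(G):
        for j, h in enumerate(G):
            z[idx[compose(g, h)]] += x[i] * y[j]
    return z

def sign(g):
    return (-1) ** sum(g[i] > g[j] for i in range(3) for j in range(i + 1, 3))

chars = {'triv': (1, lambda g: 1),
         'sign': (1, sign),
         'std':  (2, lambda g: sum(g[i] == i for i in range(3)) - 1)}

def idempotent(name):
    """e_V = (dim V/|G|) sum_g chi_V(g^{-1}) g, so that P_V x = e_V x."""
    d, chi = chars[name]
    return np.array([d * chi(inverse(g)) / len(G) for g in G], dtype=complex)
```

The tests {e_V^2 = e_V}, {e_U e_V = 0} and {\sum_V e_V = 1} are exactly the relations {P_V^2 = P_V}, {P_U P_V = 0} and {x = \sum_V P_V x}.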

We have kept to almost the bare minimum in the above discussion: the complex numbers {\mathbf{C}} and finite groups {G}. There are a number of directions we could try to move in. We could replace {\mathbf{C}} with a different field, say one which is not algebraically closed, or one which has positive characteristic. Alternatively we could replace {G} with an infinite group, say with a locally compact topology. We mention two such generalisations.

Theorem 5 (Artin–Wedderburn) Every semisimple ring is isomorphic to a product {\prod_{i=1}^k M_{n_i}(D_i)} of matrix rings, where the {n_i} are positive integers and the {D_i} are division rings. In particular every finite-dimensional semisimple {\mathbf{C}}-algebra is isomorphic to a product {\prod_{i=1}^k M_{n_i}(\mathbf{C})}.

When defining unitary representations for compact groups {G} we demand that the map {G\rightarrow U(V)} be continuous, where {U(V)} is given the strong operator topology.

Theorem 6 (Peter–Weyl) Let {G} be a compact group and {\mu} its normalised Haar measure. Let {\widehat{G}} be the set of all irreducible unitary representations of {G} up to isomorphism. Then {\widehat{G}} is countable, every {V\in\widehat{G}} is finite-dimensional, and the algebra {L^2(G)} of square-integrable functions with the operation of convolution decomposes as a Hilbert algebra as

\displaystyle L^2(G) \cong \bigoplus_{V\in\widehat{G}} (\dim V) \cdot \textup{HS}(V),

where {\textup{HS}(V)} is the space {\textup{End}(V)} together with the Hilbert–Schmidt inner product.

Leon Green’s theorem

The fundamental objects of study in higher-order Fourier analysis are nilmanifolds, or in other words spaces given as a quotient {G/\Gamma} of a connected nilpotent Lie group {G} by a discrete cocompact subgroup {\Gamma}. Starting with Furstenberg’s work on Szemerédi’s theorem and the multiple recurrence theorem, work by Host and Kra, Green and Tao, and several others has gradually established that nilmanifolds control higher-order linear configurations in the same way that the circle, as in the Hardy–Littlewood circle method, controls first-order linear configurations.

Of basic importance in the study of nilmanifolds is equidistribution: one needs to know when the sequence {g^n x} equidistributes and when it is trapped inside a subnilmanifold. It turns out that this problem was already studied by Leon Green in the 60s. To describe the theorem note first that the abelianisation map {G\rightarrow G/G_2} induces a map from {G/\Gamma} to a torus {G/(G_2\Gamma)} which respects the action of {G}, and recall that equidistribution on tori is well understood by Weyl’s criterion. Leon Green’s beautiful theorem then states that {g^n x} equidistributes in the nilmanifold if and only if its image in the torus {G/(G_2\Gamma)} equidistributes.
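To get a feel for the objects involved, consider the Heisenberg group of upper unitriangular {3\times 3} matrices. The closed form for {g^n} below is standard, but the code and parameters are our sketch: the abelianisation coordinates of {g^n} move linearly, while the central coordinate picks up a quadratic term.

```python
import numpy as np

# Sketch (parameters ours): in the Heisenberg group, g^n projects to the
# linear orbit (n*a, n*b) on the abelianisation torus, while the top-right
# ("central") coordinate grows quadratically.
def heis(a, b, c):
    return np.array([[1.0, a, c],
                     [0.0, 1.0, b],
                     [0.0, 0.0, 1.0]])

def heis_power(a, b, c, n):
    """Closed form g^n = heis(n*a, n*b, n*c + binom(n,2)*a*b)."""
    return heis(n * a, n * b, n * c + n * (n - 1) / 2 * a * b)
```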

Today at our miniseminar, Aled Walker showed us Parry’s nice proof of this theorem, which is more elementary than Green’s original proof. During the talk there was some discussion about the importance of various hypotheses such as “simply connected” and “Lie”. It turns out that the proof works rather generally for connected locally compact nilpotent groups, so I thought I would record the proof here with minimal hypotheses. The meat of the argument is exactly as in Aled’s talk and, presumably, Parry’s paper.

Let {G} be an arbitrary locally compact connected nilpotent group, say with lower central series

\displaystyle G=G_1\geq G_2 \geq\cdots\geq G_s\geq G_{s+1}=1,

and let {\Gamma\leq G} be a closed cocompact subgroup. Under these conditions the Haar measure {\mu_G} of {G} induces a {G}-invariant probability measure {\mu_{G/\Gamma}} on {G/\Gamma}. We say that {x_n\in G/\Gamma} is equidistributed if for every {f\in C(G/\Gamma)} we have

\displaystyle \frac{1}{N}\sum_{n=0}^{N-1} f(x_n) \rightarrow \int f \,d\mu_{G/\Gamma}.
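For the simplest case {G=\mathbf{R}}, {\Gamma=\mathbf{Z}}, this definition can be tested numerically (a toy sketch; the choices of {\alpha} and {f} are ours):

```python
import numpy as np

# Toy check of the definition on the circle R/Z: for irrational alpha the
# sequence x_n = n*alpha equidistributes, so the Birkhoff averages of the
# character f(x) = exp(2 pi i x) tend to its integral, namely 0.
alpha = np.sqrt(2)
N = 200000
avg = np.mean(np.exp(2j * np.pi * alpha * np.arange(N)))   # (1/N) sum f(x_n)
```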

We fix our attention on the sequence

\displaystyle x_n = g^n x

for some {g\in G} and {x\in G/\Gamma}. As before we have an abelianisation map

\displaystyle \pi: G/\Gamma\rightarrow G/G_2\Gamma

from the {G}-space {G/\Gamma} to the compact abelian group {G/G_2\Gamma}. We define equidistribution on {G/G_2\Gamma} similarly. The theorem is then the following.

Theorem 1 (Leon Green’s theorem) For {g\in G} and {x\in G/\Gamma} the following are equivalent.

  1. The sequence {g^n x} is equidistributed in {G/\Gamma}.
  2. The sequence {\pi(g^n x)} is equidistributed in {G/G_2\Gamma}.
  3. The orbit of {\pi(g)} is dense in {G/G_2\Gamma}.
  4. {\chi(\pi(g))\neq 0} for every nontrivial character {\chi:G/G_2\Gamma\rightarrow\mathbf{R}/\mathbf{Z}}.

Item 1 above trivially implies every other item. The implication 4{\implies}3 (a generalised Kronecker theorem) follows by pulling back any nontrivial character of {(G/G_2\Gamma)/\overline{\langle\pi(g)\rangle}}. The implication 3{\implies}2 (a generalised Weyl theorem) follows from the observation that every weak* limit point of the sequence of measures

\displaystyle  \frac{1}{N}\sum_{n=0}^{N-1} \delta_{\pi(g^n x)}

must be shift-invariant and thus equal to the Haar measure. So the interesting content of the theorem is 2{\implies}1.

A word about the relation to ergodicity: By the ergodic theorem the left shift {\tau_g:x\mapsto gx} is ergodic if and only if for almost every {x} the sequence {g^n x} equidistributes; on the other hand {\tau_g} is uniquely ergodic, i.e., the only {\tau_g}-invariant measure is the given one, if and only if for every {x} the sequence {g^n x} equidistributes. Thus to prove the theorem above we must not only prove that {\tau_g} is ergodic but that it is uniquely ergodic. Fortunately one can prove these two properties are equivalent in this case.

Lemma 2 If {\tau_g:G/\Gamma\rightarrow G/\Gamma} is ergodic then it’s uniquely ergodic.

The following proof is due to Furstenberg.

Proof: By the ergodic theorem the set {A} of {\mu_{G/\Gamma}}-generic points, in other words points {x} for which

\displaystyle \frac{1}{N}\sum_{n=0}^{N-1} f(g^n x) \rightarrow \int f \,d\mu_{G/\Gamma}

for every {f\in C(G/\Gamma)}, has {\mu_{G/\Gamma}}-measure {1}, and clearly if {x\in A} and {c\in G_s} then {cx\in A} (translation by the central element {c} commutes with {\tau_g} and preserves {\mu_{G/\Gamma}}), so {A = p^{-1}(p(A))}, where {p} is the projection of {G/\Gamma} onto {G/G_s\Gamma}.

Now let {\mu'} be any {\tau_g}-invariant ergodic measure. By induction we may assume that {\tau_g:G/G_s\Gamma\rightarrow G/G_s\Gamma} is uniquely ergodic, so we must have {p_*\mu' = p_*\mu_{G/\Gamma}}, so

\displaystyle \mu'(A) = p_*\mu'(p(A)) = p_*\mu_{G/\Gamma}(p(A)) = \mu_{G/\Gamma}(A) = 1.

But by the ergodic theorem the set of {\mu'}-generic points must also have {\mu'}-measure {1}, so there must be some point which is both {\mu_{G/\Gamma}}-generic and {\mu'}-generic, and this implies that {\mu'=\mu_{G/\Gamma}}. \Box

We need one more preliminary lemma about topological groups before we really get started on the proof.

Lemma 3 If {H} and {K} are connected subgroups of some ambient topological group then {[H,K]} is also connected.

Proof: Since {(h,k)\mapsto [h,k]=h^{-1}k^{-1}hk} is continuous certainly {C = \{[h,k]:h\in H,k\in K\}} is connected, so {C^n = CC\cdots C} is also connected, so because {1\in C^n} for all {n} we see that {[H,K]=\bigcup_{n=1}^\infty C^n} is connected. \Box

Thus if {G} is connected then every term {G_1,G_2,G_3,\dots} in the lower central series of {G} is connected.

We can now prove Theorem 1. As noted it suffices to prove that {\tau_g} acts ergodically on {G/\Gamma} whenever it acts ergodically on {G/G_2\Gamma}. By induction we may assume that {\tau_g} acts ergodically on {G/G_s\Gamma}. So suppose that {f\in L^2(G/\Gamma)} is {\tau_g}-invariant. By decomposing {L^2(G/\Gamma)} as a {\overline{G_s\Gamma}/\Gamma}-space we may assume that {f} obeys

\displaystyle f(cx)=\gamma(c)f(x)\quad(c\in G_s, x\in G/\Gamma)

for some character {\gamma:G_s\Gamma/\Gamma\rightarrow S^1}. In particular {|f|} is both {G_s}-invariant and {\tau_g}-invariant, so it factors through a {\tau_g}-invariant function {G/G_s\Gamma\rightarrow\mathbf{R}}, so it must be constant, say {1}. Moreover for every {b\in G_{s-1}} the function

\displaystyle \Delta_bf(x) = f(bx)\overline{f(x)}

is {G_s}-invariant, and also a {\tau_g} eigenvector:

\displaystyle \Delta_bf(gx) = \gamma([b,g])\Delta_bf(x).

By integrating this equation we find that either {\gamma([b,g])=1}, so {\Delta_bf} is constant, or {\int\Delta_bf \,d\mu_{G/\Gamma}= 0}, so either way we have

\displaystyle \int \Delta_bf\,d\mu_{G/\Gamma}\in \{0\}\cup S^1.

But since {\int\Delta_bf\,d\mu_{G/\Gamma}} is a continuous function of {b} and equal to {1} when {b=1}, it is nonzero for all sufficiently small {b}, which forces {\gamma([b,g])=1} for all such {b}, and thus for all {b} by connectedness of {G_{s-1}} and the identity

\displaystyle [b_1b_2,g]=[b_1,g][b_2,g].

Thus setting {\gamma(b)=\Delta_bf} extends {\gamma} to a homomorphism {G_{s-1}\rightarrow S^1}. In fact we can extend {\gamma} still further to a function {G\rightarrow D_1}, where {D_1} is the unit disc in {\mathbf{C}}, by setting

\displaystyle \gamma(a) = \int \Delta_af\,d\mu_{G/\Gamma}.

Now if {a\in G} and {b\in G_{s-1}} then

\displaystyle \gamma(ba) = \int f(bax) \overline{f(x)}\,d\mu_{G/\Gamma} = \int \gamma(b)f(ax)\overline{f(x)}\,d\mu_{G/\Gamma}=\gamma(b)\gamma(a),


\displaystyle \gamma(ab) = \int f(abx)\overline{f(x)}\,d\mu_{G/\Gamma} = \int f(ax) \overline{f(b^{-1}x)}\,d\mu_{G/\Gamma} = \int f(ax) \overline{\gamma(b^{-1})}\overline{f(x)}\,d\mu_{G/\Gamma} = \gamma(b)\gamma(a),


\displaystyle \gamma(b)\gamma(a)=\gamma(ab)=\gamma(ba[a,b]) = \gamma(ba)\gamma([a,b])=\gamma(b)\gamma(a)\gamma([a,b]).

Since {|\gamma(b)|=1} we can cancel {\gamma(b)}, so

\displaystyle \gamma(a)(\gamma([a,b])-1) = 0.

Finally observe that {\gamma(a)} is a continuous function of {a}, and {\gamma(1)=1}, so we must have {\gamma([a,b])=1} for all sufficiently small {a}, and thus by connectedness of {G} and the identity

\displaystyle [a_1a_2,b]=[a_1,b][a_2,b]

we must have {\gamma([a,b])=1} identically. But this implies that {\gamma} is identically {1} on all {s}-fold commutators and thus on all of {G_s}, so in fact {f} is {G_s}-invariant and factors through {G/G_s\Gamma}, so it must be constant. This finishes the proof.
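The bilinear commutator identities used twice in the proof hold precisely because the relevant commutators are central. A quick sanity check in the Heisenberg group (the example and numbers are ours):

```python
import numpy as np

# Check of [a1*a2, b] = [a1, b] * [a2, b] in the Heisenberg group, where it
# holds because all commutators land in the center.
def heis(a, b, c):
    return np.array([[1.0, a, c],
                     [0.0, 1.0, b],
                     [0.0, 0.0, 1.0]])

def comm(x, y):
    """[x, y] = x^{-1} y^{-1} x y."""
    return np.linalg.inv(x) @ np.linalg.inv(y) @ x @ y

a1 = heis(0.5, 0.2, 0.0)
a2 = heis(-1.0, 0.4, 0.3)
b = heis(0.7, -0.6, 0.1)
```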

A remark is in order about the possibility that some of the groups {G_i} and {G_i\Gamma} are not closed. This should not matter. One could either read the above proof as it is written, noting carefully that I never said groups should be Hausdorff, or, what amounts to much the same thing, modify it so that whenever you quotient by a group {H} you instead quotient by its closure {\overline{H}}.

Embarrassingly, it’s difficult to come up with a non-Lie group to which this generalised Leon Green’s theorem applies. It seems that many natural candidates have the property that {G} is not connected but {G/\Gamma} is: for example consider

\displaystyle  \left(\begin{array}{ccc}1&\mathbf{R}\times\mathbf{Q}_2&\mathbf{R}\times\mathbf{Q}_2\\0&1&\mathbf{R}\times\mathbf{Q}_2\\0&0&1\end{array}\right)/\left(\begin{array}{ccc}1&\mathbf{Z}[1/2]&\mathbf{Z}[1/2]\\0&1&\mathbf{Z}[1/2]\\0&0&1\end{array}\right).

So it would be interesting to know whether the theorem extends to such a case. Or perhaps there are no interesting non-Lie groups for this theorem, which would be a bit of a let down.