Processing math: 55%
Skip to main content
Library homepage
 

Text Color

Text Size

 

Margin Size

 

Font Type

Enable Dyslexic Font
Mathematics LibreTexts

Background on Symmetric Functions

( \newcommand{\kernel}{\mathrm{null}\,}\)

General Background

Here we will be giving a general background on the ring of symmetric functions. We start by letting n be an integer. A partition λ of n, which is written as λn is a weakly decreasing sequence with values in Z0 whose sum is n. A weak composition α of n is a sequence with values in Z0 whose sum is n. We will let xα=xα11xα22 and we will consider the formal power series Λ=Q[[x1,x2,]] which are invariant under S+, the group of permutations of {1,2,3,}, where σS+ acts by σxi=xσ(i). The ring (algebra) Λ is known as the ring (algebra) of symmetric functions. We can consider this as the direct limit of polynomials in k variables that are invariant under Sk, the symmetric group on k letters, where σSk acts by σxi=xσ(i).

Partitions can be represented as diagrams where we have λi boxes left-aligned where the top row corresponds to λ1. This diagram is called a Young diagram or Ferrers' diagram in English convention. Another notation often seen is for the bottom row to correspond to λ1 and this is known as French convention. Partitions have a natural partial ordering given by containment of the Young diagrams, or more formally, we have λμ if λiμi for all i. However the partial ordering we will be predominantly using is known as dominance order, where λμ if ij=1λjij=1μj for all i.

We also need to define a few more things for partitions. We denote |λ|=n as the size of λ and (λ) is number of non-zero entries of λ which we call the length. Let mi(λ) be the number of parts of i in λ, and if the partition is clear, we will simply write this as mi.

Monomial basis

It is straightforward to see that the following is a basis for Λ: mλ=αxα where we sum over all distinct permutations α of λ. Now there is a natural grading on polynomials given by the degree, and since mλ is homogeneous of degree n=|λ|. Therefore this is a graded basis of Λ, which means that for mλΛn and mμΛm, we have mλmμΛn+m.

Elementary basis

We now define a new basis, the elementary basis, by ek=i1<i2<<ikxi1xi2xikeλ=eλ1eλ2. We note that ek=m1k. From this definition, it is not clear that this forms a basis. However we can show that the transition map is upper unitriangular with respect to dominance order, and so this really is a basis.

(Complete) Homogeneous basis

Our next basis, the (complete) homogeneous basis, is defined similar to the elementary basis: hk=i1i2ikxi1xi2xikhλ=hλ1hλ2. We note that hk=μkmμ. However the transition map to the monomial symmetric functions is not as nice as above, so we have to define an algebra morphism ω:ΛΛ by ω(ek)=hk. Using generating functions for the elementary and homogeneous basis, one can show that ω(hk)=ek and hence ω is an involution. Therefore, since the elementary symmetric functions are a basis, so are the homogeneous symmetric functions.

Powersum Basis

Our last natural choice of basis is the powersum basis and is given by

pk=ixkipλ=pλ1pλ2.

We note that pk=mk. While the transition map is upper triangular with respect to dominance order, it is not unitriangular. We can use the powersum basis to define an inner product on Λ by

pλ,pμ=δλμzλ

where zλ=m1!1λ1m2!2λ2 (which is the size of the conjugacy class of cycle type λ in Sk). So this does form a basis, but in a way, it is not a particularly nice basis.

A side remark, the inner product defined above is known as the Hall inner product and implies that

mλ,hμ=δλμ.

Schur functions

Since we will be working in an infinite number of variables, we will be using the definition using semistandard Young tableaux. We start by considering a partition λ, and now we will fill λ with entries in \(\{1, 2, \ldots\}\) such that the following conditions hold:

  1. it is weakly increasing across rows,
  2. it is strictly increasing down columns.

Such a filling T is called a semistandard Young tableau. We define the weight of T as the weak composition α=wt(T) by αi is the number of boxes filled with an i. We denote the set of all semistandard (Young) tableaux of shape λ by SSYT(λ) and for a fixed weight by SSYT(λ,α).

Next we will denote xT=xwt(T)=xα11xα22, and so we can define the Schur function by sλ=TSSYT(λ)xT. We want to show that this is a symmetric function, and we must extend the action of S+ to semistandard tableaux. We do so by defining the action of the simple generator si by finding all partial rows that looks like (i,,i,i+1,,i+1) where any i does not have an i+1 below it and the any i+1 does not have an i above it. Let r,s denote the number of i,i+1 respectively, and si acts by interchanging the number of i,i+1 occurring in each partial row. This is a valid action of S+ on SSYT(λ), and so the Schur functions are symmetric functions.

Schur functions exhibit many remarkable properties. The first is that this is a basis, which can be seen by sλ=μ|λ|Kλμmμ where Kλμ=|SSYT(λ,μ)| and is known as a Kostka number. It is easy to see that Kλμ=0 if λμ and K_{\lambda \lambda} = 1. The next is that from the RSK (a bijection from SSYT to integer matrices) and dual RSK algorithms, which hold many interesting properties within themselves, one can deduce that \langle s_{\lambda}, s_{\mu} \rangle = \delta_{\lambda \mu} and \omega s_{\lambda} = s_{\lambda^t} where \lambda^t is the transpose partition (reflect the Young diagram along the main diagonal).

Schur functions​

There is another basis, called the schur functions, with strong connections to combinatorics and representation theory. We give several equivalent definitions:

Tableaux
We first give a purely combinatorial definition

s_{\lambda} = \displaystyle \sum_{T} x^{cont(T)}

where the sum is over all Semi-Standard Young Tableaux of shape \lambda and cont(T) is the monomial representing its content. An example in 3 variables is

s_{2,1} = x_1^2x_2 + x_1^2x_3 + x_1x_2x_3+x_1x_2x_3 +x_1x_2^2 + x_1x_3^2 + x_2^2x_3 + x_2x_3^2

corresponding to the tableaux

[11,2], [11,3], [12,3], [13,2], [12,2], [13,3], [22,3], [23,3]


Here [ab,c] its just a one line notation for the more familiar

a b

c

Note in the example that the schur function is indeed symmetric, a fact which is not direct from its definition. In general, the symmetry of schur functions requires a proof, not hard, and once we know it, we can write

s_\lambda= \sum_\mu K_{\lambda\mu}m_\mu.\

with the following natural interpretation K_{\lambda\mu} is the number of Semi-Standard Young Tableaux (SSYT for short) with shape \lambda and content \mu. A careful analysis reveals that K_{\lambda\mu} is nonzero if and only if \lambda \geq \mu. Even more K_{\lambda\lambda} = 1 , with the only tableaux being the one with all 1's in the first row, 2's in the second, and so on. So in fact we have

s_\lambda= \sum_{\lambda\geq\mu} K_{\lambda\mu}m_\mu.\

Which means that the transition matrix will be lower unitriangular with respect to any order extending the dominance order. This shows that the schur functions are an integral basis for the ring of symmetric functions.

Remark: This is stronger that triangularity since being smaller in the linear order doesn't guarantees being smaller in dominance order. It is triangular, but several other positions are forced to be zero too.

One advantage of the combinatorial definition is that it makes sense also to define schur functions indexed by skew shapes.

Ratio of determinants

There is also a purely algebraic definition of s_{\lambda}

For any integer vector (l_1,l_2,\cdots, l_k) define \(a_{(l_1,l_2,\cdots, l_k) } (x_1, x_2, \dots , x_k) \) as the determinant of​

\left[ \begin{matrix} x_1^{l_1} & x_2^{l_1} & \dots & x_k^{l_1} \\ x_1^{l_2} & x_2^{l_2} & \dots & x_k^{l_2} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{l_k} & x_2^{l_k} & \dots & x_k^{l_k} \end{matrix} \right]

Set \delta = (n-1,n-2,\cdots,1,0) and note that a_{\delta} is the Vandermonde determinant \v(x). We have the alternative expression for the schur functions

s_{\lambda} = \dfrac{a_{\lambda+\delta}}{a_{\delta}}

Remark: This makes sense even if \lambda is not a partition. In its useful to define s_{\upsilon} this way for any integer vector. If nonzero, s_{\upsilon} will be equal (up to sign) to s_{\lambda} for a unique partition.

One common application, that we will use later, is the following: to find the coefficient of s_{\lambda} in a symmetric function f(x) we can just compute the coefficient of x^{\lambda+\delta} in v(x)f(x) . The hook length formula can be proven this way (See http://en.wikipedia.org/wiki/Hook_length_formula)

More determinants

From the first definition we were able to expand schur in terms of the monomial basis. What about the other bases? The answer for the homogeneous (and elementary) basis is given by the Jacobi-Trudi determinant

s_{\lambda} = \det(h_{\lambda_i-i+j})_{i,j=1}^n

Applying the involution \omega we get

s_{\lambda} = \det(e_{\lambda'_i-i+j})_{i,j=1}^n

Relation with the representation theory of the symmetric group

The remarkable connection between schur functions and the characters of the irreducible representations of the symmetric group is given by the following magical formula

\chi^{\lambda}(w)=\langle s_{\lambda}, p_{\tau(w)}\rangle

where \chi^{\lambda}(w)= is the value of the character of the irreducible representation V_{\lambda} in the element w and p_{\tau(w)} is the power symmetric function of the partition \tau(w) associated to the cycle decomposition of w . For example, if w=(154)(238)(6)(79) then \tau(w) = (3,3,2,1) .

Since e=(1)(2)\cdots(n) in cycle notation, \tau(e) = 1+1+\cdots+1=1^{(n)} . Then the formula says

\dim V_\lambda =\chi^\lambda(e) = \langle s_\lambda, p_{1^{(n)}}\rangle

Considering the expansion of schur functions in terms on monomial symmetric functions using the Kostka numbers

s_\lambda= \sum_\mu K_{\lambda\mu}m_\mu.\

the inner product with p_{1^{(n)}} = h_{1^{(n)}} is K_{\lambda 1^{(n)}} , because \langle m_{\mu},h_{\nu} \rangle = \delta_{\mu\nu} . Note that K_{\lambda 1^{(n)}} is equal to f^{\lambda} , the number of Standard Young Tableaux of shape \lambda . Hence

\dim V_\lambda = f^{\lambda}

and

\langle s_\lambda, p_{1^{(n)}}\rangle = f^{\lambda}

which will be useful later

The magical formula is equivalent to

s_{\lambda} = \dfrac{1}{n!} \sum_{w\in S_n} \chi^{\lambda}(w)p_{\tau(w)}

This gives a conceptual proof of the identity \omega(s_{\lambda}) = s_{\lambda'} , by comparing the coefficients and taking into account that \chi^{\lambda}(w) = \textrm{sgn}(w) \chi^{\lambda'}(w) because tensoring by the sign representation gives the irreducible representation for the conjugate partition.

Frobenius Series

The expansion in terms of the power symmetric functions suggest we define the following map The Frobenius Characteristic map F takes class functions on the symmetric group to symmetric function by sending \chi^{\lambda} \to s_{\lambda} and extending by linearity. An important fact is that F is an isometry with respect to the inner products.

Remark F does not commute with multiplication

The formula

s_\lambda= \sum_\mu K_{\lambda\mu}m_\mu.\

is equivalent to

h_\mu= \sum_\mu K_{\lambda\mu}s_\lambda.\

And this also comes from representation theory. There is a module M^{\mu} decomposition in V^{\lambda} is given by the multiplicities K_{\lambda\mu} , and the above equation is simply the Frobenius translation. (See Sagan)

Let A = \bigoplus_r A_r be a graded S_n module, with each A_r finite dimensional, define the Frobenius Series of A as

F_A(z;t) = \sum_r t^r F\textrm{char}(A_r)

where F\textrm{char}(A_r) is the image of \textrm{char} (A_r) under the Frobenius map.

And similarly if A = \bigoplus_{r,s} A_{r,s} is a doubly graded S_n module, with each A_{r,s} finite dimensional, define the Frobenius Series of A as

F_A(z;q,t) = \sum_r t^rq^s F\textrm{char}(A_r)

It is clear that the Frobenius series expand positively in terms of schur functions, because the coefficients of schur come from the multiplicities (obviously nonnegative) of the irreducibles on each graded piece. The proof of the positivity conjecture for Macdonald Polynomials consist of finding a module whose Frobenius series is the desired symmetric function.

Characters of the General Linear group

Another way of thinking about schur functions is as the characters of the irreducible representations of GL(n) . Lets go through a simple example

The first nontrivial representation of GL(2) that comes into mind is itself, which comes from the natural action on k^2 , call this representation V . The character is just the trace, which is, as function of the eigenvalues, equal to x_1+x_2 . What happens if we tensor V\otimes V ? The character gets raised by two and we have the identity
(x_1+x_2)^2 = x_1^2+x_2^2 + 2x_1x_2 = x_1^2+x_2^2 + x_1x_2 + x_1x_2 = s_{2}(x_1,x_2) + s_{1,1}(x_1,x_2)
Since decomposing the characters gives the information to decompose the representation, the above identity says that V\otimes V decomposes into two irreducibles one corresponding to the partition (2) and other to the partition (1,1) . This are the symmetric and antisymmetric part, respectively.

More generally, consider GL(n) and its defining representation V given by the natural action on k^n . If we want to decompose V^{\otimes m} into irreducibles we will need to write (x_1+x_2+\cdots+x_n)^m in term of schur. The remarkable formula, in the crossroads between symmetric functions, representation theory and combinatorics is

(x_1+x_2+\cdots+x_k)^n = \sum_{\lambda\vdash n} s_\lambda f^\lambda.

Which is the expansion of p_{1^{(n)}} in terms of schur functions using the coefficients given by the inner product, because \langle s_{\mu},s_{\nu}\rangle = \delta_{\mu\nu} and \langle s_\lambda, p_{1^{(n)}}\rangle = f^{\lambda} . The above equality can be proven also checking the coefficients of each monomial at both sides and using the Robinson–Schensted–Knuth correspondence. For a more detailed analysis of the decomposition of V^{\otimes m} see Schur–Weyl duality.

In this context we can express the schur functions by using Weyl's Character Formula

s_{\lambda}(x) = \sum_{w\in S_n} w\left(\dfrac{x^{\lambda}}{\Pi_{i<j}(1-x_j/x_i)}\right)

which is equivalent to the ratio of determinants.

Raising operators

Check the Plethystic notation section. For a set of variables x = (x_1,x_2,\cdots) define

\Omega(X) = \prod_{i}\dfrac{1}{1-x_i}

Now define the ''Bernstein Operators'' S^0_m on symmetric functions as

S^0_m(f) = [u^m]f(X-u^{-1})\Omega(uX)

In words this means we have some variables x, and we add one more variable u, so uX = (ux_1,ux_2,\cdots) , and f(X-u^{-1}) can be thought as adding the variable -u^{-1} to the set. So f(X-u^{-1})\Omega(uX) is an expression containing the variables x and u. [u^m] takes coefficient of u^m in the big mess

The following theorem is fundamental, not by itself, but because a lot of the theory is developed by deforming this operator, and while the proofs are different for the other cases, there is a pattern in the proofs. so lets describe this one to get a flavor of what's going on

Theorem The Bernstein operators add a part to the indexing of the Schur function, that is, for m \geq \lambda1 we have S^0_m s_\lambda = s_{m,\lambda}

Sketch of proof

We need the following ingredients. First by partial fraction expansion

\Omega(uX) = \prod_{i}\dfrac{1}{1-ux_i} = \sum_i \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}

The second ingredient is Weyl's character formula

s_{\lambda}(x) = \sum_{w\in S_n} w\left(\dfrac{x^{\lambda}}{\prod_{i<j}(1-x_j/x_i)}\right)

By expanding its easy to check that for a polynomial f

[u^m]f(u^{-1})\dfrac{1}{1-uz} = z^mf(z)

Now we're ready to stir the elements. First lets mix the first with definition, so for any f

[u^m]f(X-u^{-1})\Omega(uX) = [u^m]f(X-u^{-1})\sum_i \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}

= \sum_i [u^m]f(x-u^{-1}) \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}

because [u^m] is an operator, it distributes over sum. Furthermore, considering f(x-u^{-1}) as a polynomial in u^{-1} we can use the third ingredient, with x_i playing the role of z to get


\sum_i x_i^mf(X-x_i)\prod_{j\neq i} \dfrac{1}{1-x_j/x_i} = \sum_i x_i^m\prod_{j\neq i} \dfrac{f(X-x_i)}{1-x_j/x_i}

By the virtues of the plethystic substitution f(X-x_i) is the same thing as evaluating in the other variables, i.e., taking x_i out. Now specialize to the schur function and consider the second ingredient, Weyl's formula, then it follows easily by induction, because note the factor

x_i^m\prod_{j\neq i} \dfrac{s_{\lambda}(X-x_i)}{1-x_j/x_i}

is the same as for the terms in the formula for s_{(m,\lambda)} when the permutation sends i to the first position.

This gives our final definition of schur functions

s_\lambda = S^0_{\lambda_1}S^0_{\lambda_2}\cdots S^0_{\lambda_l}(1)

Final Remarks

The schur functions are a basis for the symmetric functions with the following properties

1. Lower unitriangularity with respect to monomials s_\lambda= \sum_\mu K_{\lambda\mu}m_\mu.\

2. Orthogonality \langle s_{\mu},s_{\nu}\rangle = \delta_{\mu\nu}

The Kostka numbers K_{\lambda\mu} have two interpretations, a combinatorial and an algebraic one. These properties are important to keep in mind while generalizing with one or two parameters.

Hall-Littlewood symmetric functions

Now that we've looked at the classical basis for \Lambda, we are going to look at a t-deformation which specializes to the monomial and the Schur bases (plus two additional bases we haven't discussed here). We do this deformation considering a different inner product \langle p_{\lambda}, p_{\mu} \rangle_{(t)} = \delta_{\lambda \mu} z_{\lambda} \prod_{i=1}^{\ell(\lambda)} (1 - t^{\lambda_i})^{-1}. Now we define the Hall-Littlewood symmetric functions by the following two properties:

  1. (Triangularity) P_{\lambda}(t) = m_{\lambda} + LOT where LOT stands for lower order terms with respect to dominance order.
  2. (Orthogonality) \langle P_{\lambda}(t), P_{\mu}(t) \rangle_{(t)} = 0 if \lambda \neq \mu.

There is a well-known formula for the Hall-Littlewood polynomials for n variables and partitions such that |\lambda| \leq n by

P_{\lambda}(x_1, \ldots, x_n; t) = v_{\lambda}^{-1}(t) \sum_{w \in S_n} w\left( x^{\lambda} \prod_{i < j} \frac{x_i - t x_j}{x_i - x_j} \right)

where

v_{\lambda}(t) = \prod_{i \geq 0} \prod_{j=1}^{m_i} \frac{1 - t^j}{1 -t}

is just to normalize the coefficient of x^{\lambda} to 1.

Now if we take t = 0, we get P_{\lambda}(0) = s_{\lambda} by the inner product specializing to the Hall inner product, and if we take t = 1, we can see that

P_{\lambda}(1) = \sum_{w \in S_n} w( x^{\lambda} ) = m_{\lambda}.

As above, we take the limit as n \to \infty to get an element in \Lambda.

Contributors

  • Travis Scrimshaw (UC Davis)
  • Colombian Power

Background on Symmetric Functions is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by LibreTexts.

Support Center

How can we help?