Background on Symmetric Functions
General Background
Here we will give a general background on the ring of symmetric functions. We start by letting n be a non-negative integer. A partition \lambda of n, written \lambda \vdash n, is a weakly decreasing sequence with values in \mathbb{Z}_{\geq 0} whose sum is n. A weak composition \alpha of n is a sequence with values in \mathbb{Z}_{\geq 0} whose sum is n. We write x^{\alpha} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots, and we consider the ring \Lambda of formal power series in \mathbb{Q}[[x_1, x_2, \ldots]] of bounded degree which are invariant under S_{+\infty}, the group of permutations of \{1, 2, 3, \ldots\}, where \sigma \in S_{+\infty} acts by \sigma x_i = x_{\sigma(i)}. The ring (algebra) \Lambda is known as the ring (algebra) of symmetric functions. We can also think of \Lambda as the limit, as k \to \infty, of the rings of polynomials in k variables that are invariant under S_k, the symmetric group on k letters, acting by \sigma x_i = x_{\sigma(i)}.
Partitions can be represented as diagrams in which row i consists of \lambda_i left-aligned boxes, with the top row corresponding to \lambda_1. This diagram is called a Young diagram or Ferrers diagram in English convention. In the other common convention, where the bottom row corresponds to \lambda_1, it is known as French convention. Partitions have a natural partial order given by containment of Young diagrams; more formally, \lambda \subseteq \mu if \lambda_i \leq \mu_i for all i. However, the partial order we will predominantly use is dominance order, where \lambda \leq \mu if \sum_{j=1}^{i} \lambda_j \leq \sum_{j=1}^{i} \mu_j for all i.
We also need a few more definitions for partitions. We write |\lambda| = n for the size of \lambda, and \ell(\lambda) for the number of non-zero entries of \lambda, which we call the length. Let m_i(\lambda) be the number of parts of \lambda equal to i; when the partition is clear, we will simply write m_i.
Monomial basis
It is straightforward to see that the following is a basis for \Lambda:
m_\lambda = \sum_{\alpha} x^{\alpha}
where the sum is over all distinct permutations \alpha of \lambda. There is a natural grading on polynomials given by degree, and m_\lambda is homogeneous of degree n = |\lambda|. Therefore this is a graded basis of \Lambda, which means that for m_\lambda \in \Lambda^n and m_\mu \in \Lambda^m, we have m_\lambda m_\mu \in \Lambda^{n+m}.
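To make these definitions concrete, here is a small brute-force Python sketch (our own illustrative code; the helper names partitions, dominates and monomial are not from any standard library) that lists partitions, tests dominance order, and evaluates m_\lambda in finitely many variables.

```python
from itertools import permutations
from math import prod

def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dominates(lam, mu):
    """True if lam >= mu in dominance order (partitions of the same n)."""
    ps_l = ps_m = 0
    for i in range(max(len(lam), len(mu))):
        ps_l += lam[i] if i < len(lam) else 0
        ps_m += mu[i] if i < len(mu) else 0
        if ps_l < ps_m:
            return False
    return True

def monomial(lam, x):
    """m_lambda evaluated at a finite list of variables x (needs len(x) >= len(lam))."""
    alpha = tuple(lam) + (0,) * (len(x) - len(lam))   # pad lam with zeros
    exponents = set(permutations(alpha))              # distinct rearrangements
    return sum(prod(xi ** a for xi, a in zip(x, e)) for e in exponents)

print(list(partitions(4)))            # [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
print(dominates((2, 2), (2, 1, 1)))   # True
print(monomial((2, 1), [1, 2, 3]))    # 48
```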
Elementary basis
We now define a new basis, the elementary basis, by
e_k = \sum_{i_1 < i_2 < \cdots < i_k} x_{i_1} x_{i_2} \cdots x_{i_k}, \qquad e_\lambda = e_{\lambda_1} e_{\lambda_2} \cdots.
We note that e_k = m_{(1^k)}. From this definition it is not clear that the e_\lambda form a basis. However, one can show that the transition matrix to the monomial symmetric functions is unitriangular with respect to dominance order (e_\lambda = m_{\lambda^t} + \text{lower terms}, where \lambda^t denotes the conjugate partition), and so this really is a basis.
(Complete) Homogeneous basis
Our next basis, the (complete) homogeneous basis, is defined similarly to the elementary basis:
h_k = \sum_{i_1 \leq i_2 \leq \cdots \leq i_k} x_{i_1} x_{i_2} \cdots x_{i_k}, \qquad h_\lambda = h_{\lambda_1} h_{\lambda_2} \cdots.
We note that h_k = \sum_{\mu \vdash k} m_\mu. However, the transition matrix to the monomial symmetric functions is not as nice as above, so instead we define an algebra morphism \omega : \Lambda \to \Lambda by \omega(e_k) = h_k. Using the generating functions for the elementary and homogeneous bases, one can show that \omega(h_k) = e_k, and hence \omega is an involution. Therefore, since the elementary symmetric functions form a basis, so do the homogeneous symmetric functions.
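As with the monomial basis, e_k and h_k are easy to evaluate by brute force in finitely many variables. The following short Python sketch (our own illustrative code, with our own helper names) checks the identities e_k = m_{(1^k)} and h_k = \sum_{\mu \vdash k} m_\mu in a small example.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

def e(k, x):
    """Elementary symmetric polynomial e_k(x_1,...,x_n)."""
    return sum(prod(c) for c in combinations(x, k))

def h(k, x):
    """Complete homogeneous symmetric polynomial h_k(x_1,...,x_n)."""
    return sum(prod(c) for c in combinations_with_replacement(x, k))

x = [2, 3, 5]
print(e(2, x))   # 31 = 2*3 + 2*5 + 3*5, which also equals m_{(1,1)}(2,3,5)
print(h(2, x))   # 69 = 4 + 9 + 25 + 6 + 10 + 15
# sanity check of h_k = sum over partitions mu of k of m_mu, for k = 2:
# m_{(2)}(2,3,5) = 4 + 9 + 25 = 38 and m_{(1,1)}(2,3,5) = 31, and 38 + 31 = 69
```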
Powersum Basis
Our last natural choice of basis is the powersum basis and is given by
p_k = \sum_i x_i^k, \qquad p_\lambda = p_{\lambda_1} p_{\lambda_2} \cdots.
We note that p_k = m_{(k)}. While the transition matrix is upper triangular with respect to dominance order, it is not unitriangular. We can use the powersum basis to define an inner product on \Lambda by
\langle p_\lambda, p_\mu \rangle = \delta_{\lambda\mu} z_\lambda
where z_\lambda = 1^{m_1} m_1! \, 2^{m_2} m_2! \cdots = \prod_{i \geq 1} i^{m_i} m_i! (which is the order of the centralizer of a permutation of cycle type \lambda in S_n; the conjugacy class itself has size n!/z_\lambda). So the p_\lambda do form a basis, but in a way it is not a particularly nice basis.
As a side remark, the inner product defined above is known as the Hall inner product, and it implies that
\langle m_\lambda, h_\mu \rangle = \delta_{\lambda\mu}.
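As a sanity check on the definition of z_\lambda, the following illustrative Python sketch (our own code) computes z_\lambda and verifies that the conjugacy class sizes n!/z_\lambda over all \lambda \vdash n add up to n!.

```python
from math import factorial, prod

def z(lam):
    """z_lambda = prod_i i^{m_i} * m_i!, the order of the centralizer of a
    permutation of cycle type lambda in S_n (n = |lambda|)."""
    mult = {}
    for part in lam:
        mult[part] = mult.get(part, 0) + 1
    return prod(i ** m * factorial(m) for i, m in mult.items())

def partitions(n, max_part=None):
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

# the conjugacy class of cycle type lambda has n!/z_lambda elements,
# so the class sizes over all partitions of n should add up to n!:
n = 5
print(sum(factorial(n) // z(lam) for lam in partitions(n)) == factorial(n))   # True
```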
Schur functions
Since we will be working with an infinite number of variables, we will use the definition in terms of semistandard Young tableaux. We start with a partition \lambda and fill its boxes with entries in \{1, 2, \ldots\} such that the following conditions hold:
- it is weakly increasing across rows,
- it is strictly increasing down columns.
Such a filling T is called a semistandard Young tableau. We define the weight of T to be the weak composition \alpha = \mathrm{wt}(T) where \alpha_i is the number of boxes filled with an i. We denote the set of all semistandard (Young) tableaux of shape \lambda by \mathrm{SSYT}(\lambda), and of fixed weight \alpha by \mathrm{SSYT}(\lambda, \alpha).
Next we write x^T = x^{\mathrm{wt}(T)} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots, so that we can define the Schur function by
s_\lambda = \sum_{T \in \mathrm{SSYT}(\lambda)} x^T.
We want to show that this is a symmetric function, so we must extend the action of S_{+\infty} to semistandard tableaux. We define the action of a simple generator s_i as follows. In each row, consider the entries equal to i or i+1 that are free, meaning the i does not have an i+1 directly below it and the i+1 does not have an i directly above it; within a row the free entries form a consecutive block (i, \ldots, i, i+1, \ldots, i+1) with, say, r copies of i and s copies of i+1, and s_i replaces this block by s copies of i followed by r copies of i+1 (this is the Bender-Knuth involution). This gives a valid action of S_{+\infty} on \mathrm{SSYT}(\lambda), and so the Schur functions are symmetric functions.
Schur functions exhibit many remarkable properties. The first is that they form a basis, which can be seen from
s_\lambda = \sum_{\mu \vdash |\lambda|} K_{\lambda\mu} m_\mu,
where K_{\lambda\mu} = |\mathrm{SSYT}(\lambda, \mu)| is known as a Kostka number. It is easy to see that K_{\lambda\mu} = 0 unless \lambda \geq \mu in dominance order, and that K_{\lambda \lambda} = 1. Next, from the RSK algorithm (a bijection between non-negative integer matrices and pairs of SSYT of the same shape) and the dual RSK algorithm, which have many interesting properties in their own right, one can deduce that \langle s_{\lambda}, s_{\mu} \rangle = \delta_{\lambda \mu} and \omega s_{\lambda} = s_{\lambda^t}, where \lambda^t is the transpose (conjugate) partition obtained by reflecting the Young diagram along the main diagonal.
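Since the tableau definition is completely explicit, small cases are easy to verify by brute force. The following Python sketch (our own illustrative code; the names ssyt, schur and kostka are ours) enumerates SSYT with bounded entries, evaluates Schur polynomials in finitely many variables, and computes Kostka numbers.

```python
from math import prod

def ssyt(shape, N):
    """Yield all semistandard Young tableaux of the given partition shape with
    entries in {1,...,N}, as tuples of rows (brute force)."""
    def fill(r, tableau):
        if r == len(shape):
            yield tuple(tuple(row) for row in tableau)
            return
        def fill_row(c, row):
            if c == shape[r]:
                yield from fill(r + 1, tableau + [row])
                return
            lo = row[c - 1] if c > 0 else 1             # weakly increasing along the row
            if r > 0:
                lo = max(lo, tableau[r - 1][c] + 1)     # strictly increasing down the column
            for v in range(lo, N + 1):
                yield from fill_row(c + 1, row + [v])
        yield from fill_row(0, [])
    yield from fill(0, [])

def schur(shape, x):
    """s_shape(x_1,...,x_N) as a sum over SSYT with entries at most N = len(x)."""
    return sum(prod(x[v - 1] for row in T for v in row) for T in ssyt(shape, len(x)))

def kostka(lam, mu):
    """K_{lambda,mu}: the number of SSYT of shape lambda and weight mu."""
    count = 0
    for T in ssyt(lam, len(mu)):
        weight = [0] * len(mu)
        for row in T:
            for v in row:
                weight[v - 1] += 1
        count += weight == list(mu)
    return count

print(len(list(ssyt((2, 1), 3))))    # 8 tableaux with entries at most 3
print(schur((2, 1), [1, 1, 1]))      # 8 = s_{2,1}(1,1,1)
print(kostka((2, 1), (1, 1, 1)))     # 2, the number of standard tableaux of shape (2,1)
```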
Schur functions, revisited
The Schur functions introduced above have strong connections to combinatorics and representation theory. Here we give several equivalent definitions:
Tableaux
We first give a purely combinatorial definition
s_{\lambda} = \displaystyle \sum_{T} x^{\mathrm{cont}(T)}
where the sum is over all semistandard Young tableaux of shape \lambda and x^{\mathrm{cont}(T)} is the monomial recording the content of T (the same as the weight \mathrm{wt}(T) above). An example in 3 variables is
s_{2,1} = x_1^2x_2 + x_1^2x_3 + x_1x_2x_3+x_1x_2x_3 +x_1x_2^2 + x_1x_3^2 + x_2^2x_3 + x_2x_3^2
corresponding to the tableaux
[11,2], [11,3], [12,3], [13,2], [12,2], [13,3], [22,3], [23,3]
Here [ab,c] is just one-line notation for the more familiar
a b
c
Note that in the example the Schur function is indeed symmetric, a fact which is not immediate from the definition. In general, the symmetry of Schur functions requires a (short) proof, and once we know it, we can write
s_\lambda = \sum_\mu K_{\lambda\mu} m_\mu
with the following natural interpretation: K_{\lambda\mu} is the number of semistandard Young tableaux (SSYT for short) of shape \lambda and content \mu. A careful analysis reveals that K_{\lambda\mu} is nonzero if and only if \lambda \geq \mu in dominance order. Even more, K_{\lambda\lambda} = 1, the only tableau being the one with all 1's in the first row, all 2's in the second, and so on. So in fact we have
s_\lambda = \sum_{\lambda\geq\mu} K_{\lambda\mu} m_\mu.
This means that the transition matrix is lower unitriangular with respect to any linear order extending dominance order, which shows that the Schur functions are an integral basis for the ring of symmetric functions.
Remark: This is stronger than triangularity with respect to the linear order alone, since being smaller in the linear order does not guarantee being smaller in dominance order; the matrix is triangular, but several other entries are forced to be zero as well.
One advantage of the combinatorial definition is that it also makes sense for skew shapes, giving the skew Schur functions.
Ratio of determinants
There is also a purely algebraic definition of s_{\lambda}
For any integer vector (l_1, l_2, \ldots, l_k), define a_{(l_1, l_2, \ldots, l_k)}(x_1, x_2, \ldots, x_k) as the determinant of
\left[ \begin{matrix} x_1^{l_1} & x_2^{l_1} & \dots & x_k^{l_1} \\ x_1^{l_2} & x_2^{l_2} & \dots & x_k^{l_2} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{l_k} & x_2^{l_k} & \dots & x_k^{l_k} \end{matrix} \right]
Set \delta = (k-1, k-2, \ldots, 1, 0) and note that a_{\delta} is the Vandermonde determinant v(x) = \prod_{i<j}(x_i - x_j). We have the following alternative expression for the Schur functions
s_{\lambda} = \dfrac{a_{\lambda+\delta}}{a_{\delta}}
Remark: This makes sense even if \lambda is not a partition. In fact it is useful to define s_{\upsilon} this way for any integer vector \upsilon. If nonzero, s_{\upsilon} will be equal (up to sign) to s_{\lambda} for a unique partition \lambda.
One common application, which we will use later, is the following: to find the coefficient of s_{\lambda} in a symmetric function f(x), we can simply compute the coefficient of x^{\lambda+\delta} in v(x)f(x). The hook length formula can be proven this way (see http://en.wikipedia.org/wiki/Hook_length_formula).
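The bialternant formula is easy to test numerically at rational points. The sketch below (our own illustrative code, with a naive Laplace-expansion determinant that is fine for small matrices) evaluates a_{\lambda+\delta}/a_{\delta} for \lambda = (2,1) in three variables and agrees with the tableau definition.

```python
from fractions import Fraction

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def a(l, x):
    """a_l(x) = det(x_j^{l_i}), rows indexed by the exponent vector l."""
    return det([[xj ** li for xj in x] for li in l])

def schur_bialternant(lam, x):
    """s_lambda(x_1,...,x_k) = a_{lambda+delta}(x) / a_{delta}(x)."""
    k = len(x)
    lam = tuple(lam) + (0,) * (k - len(lam))
    delta = list(range(k - 1, -1, -1))                 # (k-1, k-2, ..., 1, 0)
    return a([li + d for li, d in zip(lam, delta)], x) / a(delta, x)

x = [Fraction(v) for v in (2, 3, 5)]
print(schur_bialternant((2, 1), x))   # 280, the same value the SSYT sum gives for s_{2,1}(2,3,5)
```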
More determinants
From the first definition we were able to expand the Schur functions in terms of the monomial basis. What about the other bases? The answer for the homogeneous (and elementary) basis is given by the Jacobi-Trudi determinant
s_{\lambda} = \det(h_{\lambda_i-i+j})_{i,j=1}^{\ell(\lambda)} \qquad \text{(with the convention } h_0 = 1 \text{ and } h_m = 0 \text{ for } m < 0\text{)}
Applying the involution \omega we get
s_{\lambda} = \det(e_{\lambda'_i-i+j})_{i,j=1}^{\ell(\lambda')}
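Here is a quick numerical check of the Jacobi-Trudi formula (our own illustrative Python sketch, again using a naive determinant): for \lambda = (2,1) and x = (2,3,5) it reproduces the value 280 obtained from the other formulas.

```python
from itertools import combinations_with_replacement
from math import prod

def h(k, x):
    """Complete homogeneous symmetric polynomial h_k(x); h_0 = 1, h_k = 0 for k < 0."""
    if k < 0:
        return 0
    return sum(prod(c) for c in combinations_with_replacement(x, k))

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def schur_jacobi_trudi(lam, x):
    """s_lambda(x) as det(h_{lambda_i - i + j}) of size l(lambda)."""
    n = len(lam)
    return det([[h(lam[i] - i + j, x) for j in range(n)] for i in range(n)])

print(schur_jacobi_trudi((2, 1), [2, 3, 5]))   # 280, agreeing with the SSYT and bialternant values
```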
Relation with the representation theory of the symmetric group
The remarkable connection between Schur functions and the characters of the irreducible representations of the symmetric group is given by the following magical formula
\chi^{\lambda}(w)=\langle s_{\lambda}, p_{\tau(w)}\rangle
where \chi^{\lambda}(w) is the value of the character of the irreducible representation V_{\lambda} at the element w, and p_{\tau(w)} is the powersum symmetric function indexed by the partition \tau(w) recording the cycle type of w. For example, if w=(154)(238)(6)(79) then \tau(w) = (3,3,2,1).
Since e=(1)(2)\cdots(n) in cycle notation, \tau(e) = (1,1,\ldots,1) = 1^{(n)}. Then the formula says
\dim V_\lambda =\chi^\lambda(e) = \langle s_\lambda, p_{1^{(n)}}\rangle
Considering the expansion of Schur functions in terms of monomial symmetric functions using the Kostka numbers
s_\lambda = \sum_\mu K_{\lambda\mu} m_\mu,
the inner product with p_{1^{(n)}} = h_{1^{(n)}} is K_{\lambda 1^{(n)}} , because \langle m_{\mu},h_{\nu} \rangle = \delta_{\mu\nu} . Note that K_{\lambda 1^{(n)}} is equal to f^{\lambda} , the number of Standard Young Tableaux of shape \lambda . Hence
\dim V_\lambda = f^{\lambda}
and
\langle s_\lambda, p_{1^{(n)}}\rangle = f^{\lambda}
which will be useful later.
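The numbers f^{\lambda} are easy to compute; the sketch below (our own illustrative Python code) uses the hook length formula mentioned earlier and checks the classical identity \sum_{\lambda \vdash n} (f^{\lambda})^2 = n!, which reflects the fact that the squares of the dimensions of the irreducibles add up to |S_n|.

```python
from math import factorial, prod

def conjugate(lam):
    """The conjugate (transpose) partition."""
    return [sum(1 for part in lam if part > i) for i in range(lam[0])] if lam else []

def f(lam):
    """f^lambda, the number of standard Young tableaux of shape lambda,
    via the hook length formula."""
    conj = conjugate(lam)
    hooks = prod(lam[i] - j + conj[j] - i - 1
                 for i in range(len(lam)) for j in range(lam[i]))
    return factorial(sum(lam)) // hooks

def partitions(n, max_part=None):
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

n = 6
print(f((2, 1)))                                                   # 2
print(sum(f(lam) ** 2 for lam in partitions(n)) == factorial(n))   # True: sum of (dim V_lambda)^2 = n!
```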
The magical formula is equivalent to
s_{\lambda} = \dfrac{1}{n!} \sum_{w\in S_n} \chi^{\lambda}(w)p_{\tau(w)}
This gives a conceptual proof of the identity \omega(s_{\lambda}) = s_{\lambda'}: since \omega(p_{\tau(w)}) = \textrm{sgn}(w)\, p_{\tau(w)}, comparing coefficients and using that \chi^{\lambda}(w) = \textrm{sgn}(w) \chi^{\lambda'}(w) (because tensoring with the sign representation gives the irreducible representation of the conjugate partition) yields the result.
Frobenius Series
The expansion in terms of the powersum symmetric functions suggests we define the following map. The Frobenius characteristic map F takes class functions on the symmetric group to symmetric functions by sending \chi^{\lambda} \mapsto s_{\lambda} and extending by linearity. An important fact is that F is an isometry with respect to the inner products.
Remark: F does not commute with multiplication.
The formula
s_\lambda = \sum_\mu K_{\lambda\mu} m_\mu
is equivalent to
h_\mu = \sum_\lambda K_{\lambda\mu} s_\lambda.
And this also comes from representation theory: there is a module M^{\mu} (the permutation module) whose decomposition into the irreducibles V^{\lambda} is given by the multiplicities K_{\lambda\mu}, and the above equation is simply the translation of this fact under the Frobenius map. (See Sagan.)
Let A = \bigoplus_r A_r be a graded S_n-module with each A_r finite dimensional, and define the Frobenius series of A as
F_A(z;t) = \sum_r t^r F\textrm{char}(A_r)
where F\textrm{char}(A_r) is the image of \textrm{char} (A_r) under the Frobenius map.
Similarly, if A = \bigoplus_{r,s} A_{r,s} is a doubly graded S_n-module with each A_{r,s} finite dimensional, define the Frobenius series of A as
F_A(z;q,t) = \sum_{r,s} t^r q^s F\textrm{char}(A_{r,s})
It is clear that Frobenius series expand positively in terms of Schur functions, because the coefficients of the Schur functions are the multiplicities (obviously nonnegative) of the irreducibles in each graded piece. The proof of the positivity conjecture for Macdonald polynomials consists of finding a module whose Frobenius series is the desired symmetric function.
Characters of the General Linear group
Another way of thinking about Schur functions is as the characters of the irreducible polynomial representations of GL(n). Let's go through a simple example.
The first nontrivial representation of GL(2) that comes to mind is GL(2) itself, acting naturally on k^2; call this representation V. The character is just the trace which, as a function of the eigenvalues, equals x_1+x_2. What happens if we tensor V\otimes V? The character gets squared and we have the identity
(x_1+x_2)^2 = x_1^2+x_2^2 + 2x_1x_2 = x_1^2+x_2^2 + x_1x_2 + x_1x_2 = s_{2}(x_1,x_2) + s_{1,1}(x_1,x_2)
Since decomposing characters gives the information needed to decompose the representation, the above identity says that V\otimes V decomposes into two irreducibles, one corresponding to the partition (2) and the other to the partition (1,1). These are the symmetric and antisymmetric parts, respectively.
More generally, consider GL(n) and its defining representation V given by the natural action on k^n. If we want to decompose V^{\otimes m} into irreducibles, we need to write (x_1+x_2+\cdots+x_n)^m in terms of Schur functions. The remarkable formula, at the crossroads between symmetric functions, representation theory and combinatorics, is
(x_1+x_2+\cdots+x_n)^m = \sum_{\lambda\vdash m} f^\lambda s_\lambda.
This is the expansion of p_{1^{(m)}} = p_1^m in terms of Schur functions, with coefficients given by the inner product, because \langle s_{\mu},s_{\nu}\rangle = \delta_{\mu\nu} and \langle s_\lambda, p_{1^{(m)}}\rangle = f^{\lambda} (the identity above, with m in place of n). The equality can also be proven by checking the coefficient of each monomial on both sides and using the Robinson–Schensted–Knuth correspondence. For a more detailed analysis of the decomposition of V^{\otimes m}, see Schur–Weyl duality.
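The identity is easy to verify numerically for small m. The sketch below (our own illustrative Python code; it evaluates s_\lambda by the bialternant formula and counts standard tableaux by brute force) checks (x_1+x_2+x_3)^3 = \sum_{\lambda \vdash 3} f^\lambda s_\lambda at rational values of the variables.

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def schur_det(lam, x):
    """s_lambda(x) via the bialternant formula; x should be distinct Fractions."""
    k = len(x)
    lam = tuple(lam) + (0,) * (k - len(lam))
    delta = list(range(k - 1, -1, -1))
    a = lambda l: det([[xj ** li for xj in x] for li in l])
    return a([li + d for li, d in zip(lam, delta)]) / a(delta)

def syt_count(lam):
    """f^lambda by brute force: bijective fillings with 1..n increasing
    along rows and down columns."""
    n = sum(lam)
    cells = [(i, j) for i in range(len(lam)) for j in range(lam[i])]
    count = 0
    for perm in permutations(range(1, n + 1)):
        T = dict(zip(cells, perm))
        ok_rows = all(T[i, j] < T[i, j + 1] for (i, j) in cells if (i, j + 1) in T)
        ok_cols = all(T[i, j] < T[i + 1, j] for (i, j) in cells if (i + 1, j) in T)
        count += ok_rows and ok_cols
    return count

def partitions(n, max_part=None):
    max_part = n if max_part is None else max_part
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

x = [Fraction(v) for v in (2, 3, 5)]      # n = 3 variables
m = 3
lhs = sum(x) ** m
# s_lambda vanishes in n variables when l(lambda) > n, so those terms are skipped
rhs = sum(syt_count(lam) * schur_det(lam, x)
          for lam in partitions(m) if len(lam) <= len(x))
print(lhs == rhs)                          # True
```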
In this context we can express the Schur functions using Weyl's character formula
s_{\lambda}(x) = \sum_{w\in S_n} w\left(\dfrac{x^{\lambda}}{\prod_{i<j}(1-x_j/x_i)}\right)
which is equivalent to the ratio of determinants.
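Weyl's character formula can likewise be evaluated directly at rational points, as in the following illustrative Python sketch (our own code); symmetrizing x^{\lambda} / \prod_{i<j}(1-x_j/x_i) for \lambda = (2,1) reproduces the same value as the ratio of determinants.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def schur_weyl_formula(lam, x):
    """s_lambda(x_1,...,x_n) by symmetrizing x^lambda / prod_{i<j}(1 - x_j/x_i);
    the x's must be distinct nonzero Fractions so no denominator vanishes."""
    n = len(x)
    lam = tuple(lam) + (0,) * (n - len(lam))
    total = Fraction(0)
    for w in permutations(range(n)):                     # w permutes the variables
        y = [x[w[i]] for i in range(n)]
        numerator = prod(y[i] ** lam[i] for i in range(n))
        denominator = prod(1 - y[j] / y[i]
                           for i in range(n) for j in range(i + 1, n))
        total += numerator / denominator
    return total

x = [Fraction(v) for v in (2, 3, 5)]
print(schur_weyl_formula((2, 1), x))    # 280, matching the bialternant and Jacobi-Trudi values
```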
Raising operators
Check the Plethystic notation section. For a set of variables X = (x_1,x_2,\cdots), define
\Omega(X) = \prod_{i}\dfrac{1}{1-x_i}
Now define the Bernstein operators S^0_m on symmetric functions as
S^0_m(f) = [u^m]f(X-u^{-1})\Omega(uX)
In words: we have the variables X, and we add one more variable u, so uX = (ux_1,ux_2,\cdots), while f(X-u^{-1}) can be thought of as adding the variable -u^{-1} to the set in the plethystic sense. So f(X-u^{-1})\Omega(uX) is an expression in the variables X and u, and [u^m] takes the coefficient of u^m in the resulting expression.
The following theorem is fundamental, not so much by itself, but because a lot of the theory is developed by deforming this operator; while the proofs differ in the other cases, they follow a common pattern. So let's describe this one to get a flavor of what's going on.
Theorem. The Bernstein operators add a part to the index of the Schur function: for m \geq \lambda_1 we have S^0_m s_\lambda = s_{(m,\lambda)}.
Sketch of proof
We need the following ingredients. First by partial fraction expansion
\Omega(uX) = \prod_{i}\dfrac{1}{1-ux_i} = \sum_i \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}
The second ingredient is Weyl's character formula
s_{\lambda}(x) = \sum_{w\in S_n} w\left(\dfrac{x^{\lambda}}{\prod_{i<j}(1-x_j/x_i)}\right)
By expanding, it is easy to check that for a polynomial f
[u^m]f(u^{-1})\dfrac{1}{1-uz} = z^mf(z)
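This coefficient extraction is easy to sanity check with a concrete polynomial, as in the short illustrative Python sketch below (our own code and names).

```python
from fractions import Fraction

def u_coefficient(f_coeffs, z, m):
    """[u^m] of f(1/u) * 1/(1 - u*z): expanding 1/(1-uz) = sum_{j>=0} z^j u^j,
    the term u^{-k} from f(1/u) meets u^j exactly when j = m + k."""
    return sum(a * z ** (m + k) for k, a in enumerate(f_coeffs) if m + k >= 0)

# try it on f(t) = 3 + 2t + 7t^2 with z = 4/3 and m = 2
f_coeffs = [Fraction(c) for c in (3, 2, 7)]
z, m = Fraction(4, 3), 2
f_at_z = sum(a * z ** k for k, a in enumerate(f_coeffs))
print(u_coefficient(f_coeffs, z, m) == z ** m * f_at_z)   # True
```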
Now we're ready to stir the ingredients. First let's combine the first ingredient with the definition, so for any f
[u^m]f(X-u^{-1})\Omega(uX) = [u^m]f(X-u^{-1})\sum_i \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}
= \sum_i [u^m]f(X-u^{-1}) \dfrac{1}{1-ux_i}\prod_{j\neq i} \dfrac{1}{1-x_j/x_i}
because [u^m] is a linear operator, so it distributes over the sum. Furthermore, considering f(X-u^{-1}) as a polynomial in u^{-1}, we can use the third ingredient, with x_i playing the role of z, to get
\sum_i x_i^mf(X-x_i)\prod_{j\neq i} \dfrac{1}{1-x_j/x_i} = \sum_i x_i^m\prod_{j\neq i} \dfrac{f(X-x_i)}{1-x_j/x_i}
By the virtues of plethystic substitution, f(X-x_i) is the same thing as evaluating f with the variable x_i removed from the set. Now specialize f to the Schur function s_\lambda and consider the second ingredient, Weyl's formula; the theorem then follows by matching terms, because the factor
x_i^m\prod_{j\neq i} \dfrac{s_{\lambda}(X-x_i)}{1-x_j/x_i}
is exactly the sum of the terms in Weyl's formula for s_{(m,\lambda)} over the permutations that send i to the first position.
This gives our final definition of Schur functions:
s_\lambda = S^0_{\lambda_1}S^0_{\lambda_2}\cdots S^0_{\lambda_{\ell(\lambda)}}(1)
Final Remarks
The Schur functions are a basis of the ring of symmetric functions with the following properties:
1. Lower unitriangularity with respect to the monomial basis: s_\lambda = \sum_\mu K_{\lambda\mu} m_\mu.
2. Orthogonality: \langle s_{\mu},s_{\nu}\rangle = \delta_{\mu\nu}.
The Kostka numbers K_{\lambda\mu} have two interpretations, a combinatorial and an algebraic one. These properties are important to keep in mind while generalizing with one or two parameters.
Hall-Littlewood symmetric functions
Now that we've looked at the classical bases for \Lambda, we are going to look at a t-deformation which specializes to both the monomial and the Schur bases (plus two additional bases we haven't discussed here). We do this deformation by considering a different inner product
\langle p_{\lambda}, p_{\mu} \rangle_{(t)} = \delta_{\lambda \mu} z_{\lambda} \prod_{i=1}^{\ell(\lambda)} (1 - t^{\lambda_i})^{-1}.
Now we define the Hall-Littlewood symmetric functions P_{\lambda}(t) by the following two properties:
- (Triangularity) P_{\lambda}(t) = m_{\lambda} + LOT where LOT stands for lower order terms with respect to dominance order.
- (Orthogonality) \langle P_{\lambda}(t), P_{\mu}(t) \rangle_{(t)} = 0 if \lambda \neq \mu.
There is a well-known formula for the Hall-Littlewood polynomials in n variables, for partitions with \ell(\lambda) \leq n, given by
P_{\lambda}(x_1, \ldots, x_n; t) = v_{\lambda}^{-1}(t) \sum_{w \in S_n} w\left( x^{\lambda} \prod_{i < j} \frac{x_i - t x_j}{x_i - x_j} \right)
where
v_{\lambda}(t) = \prod_{i \geq 0} \prod_{j=1}^{m_i} \frac{1 - t^j}{1 - t}, \qquad m_0 = n - \ell(\lambda),
is just the normalization that makes the coefficient of x^{\lambda} equal to 1.
Now if we take t = 0, we get P_{\lambda}(0) = s_{\lambda}, since the inner product specializes to the Hall inner product, and if we take t = 1 we can see that
P_{\lambda}(1) = v_{\lambda}^{-1}(1) \sum_{w \in S_n} w( x^{\lambda} ) = m_{\lambda},
since v_{\lambda}(1) = \prod_{i \geq 0} m_i! is exactly the number of times each distinct monomial is repeated in the sum.
As above, we take the limit as n \to \infty to get an element in \Lambda.
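The symmetrization formula can be evaluated directly at rational values of the x_i and t, which makes the two specializations above easy to check in examples. The following Python sketch (our own illustrative code; it uses t-integers [j]_t = 1 + t + \cdots + t^{j-1} so that t = 1 causes no division by zero) computes P_{(2,1)}(x_1,x_2,x_3;t) and recovers the Schur and monomial values at t = 0 and t = 1.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def t_int(j, t):
    """The t-integer [j]_t = 1 + t + ... + t^{j-1} = (1-t^j)/(1-t), valid even at t = 1."""
    return sum(t ** i for i in range(j))

def v(lam, n, t):
    """The normalization v_lambda(t) for n variables, with m_0 = n - l(lambda)."""
    mult = {0: n - len(lam)}
    for part in lam:
        mult[part] = mult.get(part, 0) + 1
    return prod(prod(t_int(j, t) for j in range(1, m + 1)) for m in mult.values())

def hall_littlewood_P(lam, x, t):
    """P_lambda(x_1,...,x_n; t) via the symmetrization formula; needs l(lambda) <= n
    and distinct x values so the denominators do not vanish."""
    n = len(x)
    lam = tuple(lam) + (0,) * (n - len(lam))
    total = Fraction(0)
    for w in permutations(range(n)):
        y = [x[w[i]] for i in range(n)]
        term = prod(y[i] ** lam[i] for i in range(n))
        term *= prod((y[i] - t * y[j]) / (y[i] - y[j])
                     for i in range(n) for j in range(i + 1, n))
        total += term
    return total / v(lam, n, t)

x = [Fraction(c) for c in (2, 3, 5)]
print(hall_littlewood_P((2, 1), x, Fraction(0)))   # 280 = s_{2,1}(2,3,5)
print(hall_littlewood_P((2, 1), x, Fraction(1)))   # 48  = m_{2,1}(2,3,5)
```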
Contributors
- Travis Scrimshaw (UC Davis)
- Colombian Power