
# 13 Characteristic functions


## Definition and basic properties

The main tool in our proof of the central limit theorem will be characteristic functions. The basic idea will be to show that

$\expec\left[ g\left( \frac{S_n-n\mu}{\sqrt{n}\sigma}\right)\right] \xrightarrow[n\to\infty]{} \expec g(N(0,1))$

for a sufficiently large family of functions $$g$$. It turns out that the family of functions of the form

$g_t(x) = e^{itx}, \qquad (t\in\R),$

is ideally suited for this purpose. (Here and throughout, $$i=\sqrt{-1}$$).

Definition

The characteristic function of a r.v.\ $$X$$, denoted $$\varphi_X$$, is defined by

$\varphi_X(t) = \expec\left(e^{i t X}\right) = \expec(\cos(t X)) + i \expec(\sin(t X)), \qquad (t\in\R).$

Note that we are taking the expectation of a complex-valued random variable (which is a kind of two-dimensional random vector, really). However, the main properties of the expectation operator (linearity, the triangle inequality etc.) that hold for real-valued random variables also hold for complex-valued ones, so this will not pose too much of a problem.

Here are some simple properties of characteristic functions. For simplicity we denote $$\varphi = \varphi_X$$ where there is no risk of confusion.

1. $$\varphi(0) = \expec e^{i\cdot 0\cdot X} = 1$$.
2. $$\varphi(-t) = \expec e^{-i t X} = \expec \left(\overline{e^{it X}}\right) = \overline{\varphi(t)}$$ (where $$\overline{z}$$ denotes the complex conjugate of a complex number $$z$$).
3. $$|\varphi(t)| \le \expec\left|e^{i t X}\right| = 1$$ by the triangle inequality.
4. $$|\varphi(t)-\varphi(s)| \le \expec \left| e^{i t X} - e^{isX}\right| = \expec \left|e^{i s X} \left(e^{i(t-s)X}-1\right)\right| = \expec \left| e^{i(t-s)X}-1\right|$$. Note also that $$\expec \left|e^{i u X}-1\right|\to 0$$ as $$u \downarrow 0$$ by the bounded convergence theorem. It follows that $$\varphi$$ is a uniformly continuous function on $$\R$$.
5. $$\varphi_{a X}(t) = \expec e^{i a t X} = \varphi_X(a t)$$, $$(a\in\R)$$.
6. $$\varphi_{X+b}(t) = \expec e^{i t (X+b)} = e^{i b t} \varphi_X(t)$$, $$(b\in\R)$$.
7. Important: If $$X,Y$$ are independent then $\varphi_{X+Y}(t) = \expec\left(e^{it(X+Y)}\right) = \expec\left(e^{itX} e^{itY}\right) = \expec\left(e^{itX}\right) \expec \left(e^{itY}\right) = \varphi_X(t) \varphi_Y(t).$ Note that this is the main reason why characteristic functions are such a useful tool for studying the distribution of a sum of independent random variables.
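Properties 1, 3 and 7 are easy to see numerically. Here is a minimal Monte Carlo sketch (not from the text; the distributions, sample size and tolerances are illustrative choices) that estimates characteristic functions by averaging $$e^{itX}$$ over simulated samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.exponential(1.0, n)   # X ~ Exp(1)
Y = rng.normal(0.0, 1.0, n)   # Y ~ N(0,1), independent of X

def phi_hat(sample, t):
    """Empirical characteristic function: the average of e^{itX} over the sample."""
    return np.mean(np.exp(1j * t * sample))

t = 0.7
# Property 7: for independent X, Y, phi_{X+Y}(t) = phi_X(t) * phi_Y(t).
lhs = phi_hat(X + Y, t)
rhs = phi_hat(X, t) * phi_hat(Y, t)
assert abs(lhs - rhs) < 0.02          # agreement up to Monte Carlo error
# Properties 1 and 3: phi(0) = 1 and |phi(t)| <= 1.
assert abs(phi_hat(X, 0.0) - 1.0) < 1e-12
assert abs(phi_hat(X, t)) <= 1.0
```

The same empirical estimator is all one needs to experiment with properties 5 and 6 as well.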

A note on terminology: If $$X$$ has a density function $$f$$, then the characteristic function can be computed as

$\varphi_X(t) = \int_{-\infty}^\infty f_X(x) e^{itx}\,dx.$

In all other branches of mathematics, this would be called the Fourier transform of $$f$$. Well, more or less -- it is really the inverse Fourier transform; but it will be the Fourier transform if we replace $$t$$ by $$-t$$, so that is almost the same thing. So the concept of a characteristic function generalizes the Fourier transform. If $$\mu$$ is the distribution measure of $$X$$, some authors write

$\varphi_X(t) = \int_{-\infty}^\infty e^{itx} d\mu(x)$

(which is an example of a Lebesgue-Stieltjes integral) and call this the Fourier-Stieltjes transform (or just the Fourier transform) of the measure $$\mu$$.

## Examples

No study of characteristic functions is complete without "dirtying your hands" a little to compute the characteristic function for some important cases. The following exercise is highly recommended.

Exercise

Compute the characteristic functions for the following distributions.

1. Coin flips: Compute $$\varphi_X$$ when $$\prob(X=-1)=\prob(X=1)=1/2$$ (this comes out slightly more symmetrical than the usual Bernoulli r.v. for which $$\prob(X=0)=\prob(X=1)=1/2$$).
2. Symmetric random walk: Compute $$\varphi_{S_n}$$ where $$S_n=\sum_{k=1}^n X_k$$ is the sum of $$n$$ i.i.d. copies of the coin flip distribution above.
3. Poisson distribution: $$X\sim \textrm{Poisson}(\lambda)$$.
4. Uniform distribution: $$X \sim U[a,b]$$, and in particular $$X\sim U[-1,1]$$, which is especially symmetric and useful in applications.
5. Exponential distribution: $$X\sim \textrm{Exp}(\lambda)$$.
6. Symmetrized exponential: A r.v. $$Z$$ with density function $$f_Z(x) = \frac12 e^{-|x|}$$. Note that this is the distribution of the exponential distribution after being "symmetrized" in either of two ways: (i) we showed that if $$X,Y\sim \textrm{Exp}(1)$$ are independent then $$X-Y$$ has density $$\frac12 e^{-|x|}$$; (ii) alternatively, it is the distribution of an exponential variable with a "random sign", namely $$\varepsilon\cdot X$$ where $$X\sim \textrm{Exp}(1)$$ and $$\varepsilon$$ is a random sign (same as the coin flip distribution mentioned above) that is independent of $$X$$.
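For readers who want to check their pen-and-paper answers, here is a Monte Carlo sketch comparing empirical characteristic functions against the standard closed forms for parts 1, 3, 5 and 6 (the value of $$t$$, sample size and tolerances are ad hoc choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000
t = 0.9

coin = rng.choice([-1.0, 1.0], size=n)           # part 1: fair +/-1 coin flip
pois = rng.poisson(2.0, size=n)                  # part 3: Poisson(lambda = 2)
expo = rng.exponential(1.0, size=n)              # part 5: Exp(1)
sym = expo * rng.choice([-1.0, 1.0], size=n)     # part 6: Exp(1) with a random sign

def phi_hat(sample, t):
    """Empirical characteristic function."""
    return np.mean(np.exp(1j * t * sample))

assert abs(phi_hat(coin, t) - np.cos(t)) < 0.01                    # cos(t)
assert abs(phi_hat(pois, t) - np.exp(2.0 * (np.exp(1j * t) - 1))) < 0.01
assert abs(phi_hat(expo, t) - 1 / (1 - 1j * t)) < 0.01             # 1/(1 - it)
assert abs(phi_hat(sym, t) - 1 / (1 + t ** 2)) < 0.01              # 1/(1 + t^2)
```

By property 7 of the previous section, the answer to part 2 is then $$\cos^n(t)$$.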

The normal distribution has the nice property that its characteristic function is equal, up to a constant, to its density function.

Lemma

If $$Z\sim N(0,1)$$ then

$\varphi_Z(t) = e^{-t^2/2}.$

Proof

\begin{eqnarray*} \varphi_Z(t) &=& \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{itx} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}
\int_{-\infty}^\infty e^{-t^2/2} e^{-(x-it)^2/2}\,dx \\
&=& e^{-t^2/2} \left( \frac{1}{\sqrt{2\pi}}\int_{-\infty}^\infty e^{-(x-it)^2/2}\,dx \right).
\end{eqnarray*}

As Durrett suggests in his "physics proof" (p. 92 in [Dur2010], 91 in [Dur2004]), the expression in parentheses is $$1$$, since it is the integral of a normal density with mean $$it$$ and variance $$1$$. This is a nonsensical argument, of course ($$it$$ being an imaginary number), but the claim is true, easy, and proved in any complex analysis course using contour integration.

Alternatively, let $$S_n=\sum_{k=1}^n X_k$$ where $$X_1,X_2,\ldots$$ are i.i.d.\ coin flips with $$\prob(X_k=-1)=\prob(X_k=1)=1/2$$. We know from the de Moivre-Laplace theorem (Theorem~\ref{thm-demoivre-laplace}) that

$S_n/\sqrt{n} \implies N(0,1),$

so that

$\varphi_{S_n/\sqrt{n}}(t) = \expec\left( e^{it S_n/\sqrt{n}} \right) \xrightarrow[n\to\infty]{} \varphi_Z(t), \qquad (t\in\R),$

since the function $$x\to e^{i t x}$$ is bounded and continuous. On the other hand, from the exercise above it is easy to compute that $$\varphi_{S_n}(t) = \cos^n(t)$$, which implies that

$\varphi_{S_n/\sqrt{n}}(t) = \cos^n\left(\frac{t}{\sqrt{n}}\right)= \left(1-\frac{t^2}{2n} + O\left(\frac{t^4}{n^2}\right)\right)^n \xrightarrow[n\to\infty]{} e^{-t^2/2}.$
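The convergence in the last display is easy to watch numerically (a small illustration; the values of $$t$$ and $$n$$ are arbitrary):

```python
import numpy as np

t = 1.3
target = np.exp(-t ** 2 / 2)
# cos^n(t / sqrt(n)) for increasing n; the error should shrink toward 0.
errs = [abs(np.cos(t / np.sqrt(n)) ** n - target) for n in (10, 100, 10_000)]
assert errs[0] > errs[1] > errs[2]
assert errs[2] < 1e-4
```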

As a consequence, let $$X\sim N(0,\sigma_1^2)$$ and $$Y\sim N(0,\sigma_2^2)$$ be independent, and let $$Z=X+Y$$. Then

$\varphi_X(t) = e^{-\sigma_1^2 t^2/2}, \qquad \varphi_Y(t) = e^{-\sigma_2^2 t^2/2},$

so $$\varphi_Z(t) = e^{-(\sigma_1^2+\sigma_2^2)t^2/2}$$. This is the same as $$\varphi_W(t)$$, where $$W\sim N(0,\sigma_1^2+\sigma_2^2)$$. It would be nice if we could deduce from this that $$Z \sim N(0,\sigma_1^2+\sigma_2^2)$$ (we already proved this fact in a homework exercise, but it's always nice to have several proofs of a result, especially an important one like this one). This naturally leads us to an important question about characteristic functions, which we consider in the next section.

## The inversion formula

A fundamental question about characteristic functions is whether they contain all the information about a distribution, or in other words whether knowing the characteristic function determines the distribution uniquely. This question is answered (affirmatively) by the following theorem, which is a close cousin of the standard inversion formula from analysis for the Fourier transform.

Theorem: The Inversion Formula

If $$X$$ is a r.v.\ with distribution $$\mu_X$$, then for any $$a<b$$ we have

\begin{eqnarray*} \lim_{T\to\infty} \frac{1}{2\pi} \int_{-T}^T \frac{e^{-iat}-e^{-ibt}}{it}\varphi_X(t)\,dt &=& \mu_X((a,b)) + \frac12
\mu_X(\{a,b\}) \\ &=& \prob(a<X<b) + \frac12 \prob(X=a)+\frac12 \prob(X=b).
\end{eqnarray*}
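As a quick sanity check (an illustration, not a proof; the truncation and grid are ad hoc choices), one can evaluate the left-hand side numerically for $$Z\sim N(0,1)$$, $$a=-1$$, $$b=1$$. Here $$\varphi_Z(t)=e^{-t^2/2}$$ and $$(e^{-iat}-e^{-ibt})/(it) = 2\sin(t)/t$$, and since the normal distribution has no atoms the answer should be $$\prob(-1<Z<1)\approx 0.6827$$:

```python
import numpy as np

dt = 1e-4
t = np.arange(-30.0, 30.0, dt) + dt / 2    # midpoint grid; e^{-t^2/2} ~ 0 beyond |t| ~ 10
# The integrand (e^{it} - e^{-it})/(it) * e^{-t^2/2} = (2 sin t / t) e^{-t^2/2}.
integrand = 2 * np.sinc(t / np.pi) * np.exp(-t ** 2 / 2)   # np.sinc(x) = sin(pi x)/(pi x)
lhs = integrand.sum() * dt / (2 * np.pi)
assert abs(lhs - 0.6826894921) < 1e-6      # P(-1 < Z < 1) = 2*Phi(1) - 1
```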

Corollary: Uniqueness of Characteristic Functions
If $$X,Y$$ are r.v.s such that $$\varphi_X(t)\equiv \varphi_Y(t)$$ for all $$t\in\R$$ then $$X \eqdist Y$$.
Exercise
Explain why Corollary~\ref{cor:charfun-uniqueness} follows from the inversion formula.
Proof: Inversion Theorem

Throughout the proof, denote $$\varphi(t)=\varphi_X(t)$$ and $$\mu=\mu_X$$. For convenience, we use the notation of Lebesgue-Stieltjes integration with respect to the measure $$\mu$$, remembering that this really means taking the expectation of some function of the r.v.\ $$X$$. Denote

\begin{equation}\label{eq:inversionproof}
I_T = \int_{-T}^T \frac{e^{-i a t}-e^{-i b t}}{it} \varphi(t)\,dt
= \int_{-T}^T \int_{-\infty}^\infty \frac{e^{-i a t}-e^{-i b t}}{it} e^{itx}d\mu(x)\,dt.
\end{equation}

Since $$\frac{e^{-i a t}-e^{-i b t}}{it} = \int_a^b e^{-i t y}\,dy$$ is a bounded function of $$t$$ (it is bounded in absolute value by $$b-a$$), it follows by Fubini's theorem that we can change the order of integration, so

\begin{eqnarray*}I_T&=& \int_{-\infty}^\infty \int_{-T}^T \frac{e^{-i a t}-e^{-i b t}}{it} e^{itx} dt\,d\mu(x) \\
&=&
\int_{-\infty}^\infty \left[ \int_{-T}^T \frac{\sin(t(x-a))}{t}\,dt-\int_{-T}^T \frac{\sin(t(x-b))}{t}\,dt \right]d\mu(x)
\\ &=& \int_{-\infty}^\infty \left( R(x-a,T)-R(x-b,T) \right) d\mu(x),
\end{eqnarray*}

where we denote $$R(\theta, T) = \int_{-T}^T \sin(\theta t)/t\,dt$$. Note that in the notation of expectations this can be written as $$I_T = \expec\left( R(X-a,T)-R(X-b,T) \right)$$.

This can be simplified somewhat; in fact, observe also that

$R(\theta,T) = 2\,\textrm{sgn}(\theta) \int_0^{|\theta|T} \frac{\sin x}{x}\,dx = 2\, \textrm{sgn}(\theta) S(|\theta| T),$

where we denote $$S(x) = \int_0^x \frac{\sin(u)}{u}\,du$$ and $$\textrm{sgn}(\theta)$$ is $$1$$ if $$\theta>0$$, $$-1$$ if $$\theta<0$$ and $$0$$ if $$\theta=0$$. By a standard convergence test for integrals, the improper integral $$\int_0^\infty \frac{\sin u}{u}\,du = \lim_{x\to\infty}S(x)$$ converges; denote its value by $$C/4$$. Thus, we have shown that $$R(\theta,T)\to \frac12\textrm{sgn}(\theta) C$$ as $$T\to \infty$$, hence that

$R(x-a,T)-R(x-b,T) \xrightarrow[T\to\infty]{} \begin{cases} C & a<x<b, \\ C/2 & x=a\textrm{ or }x=b, \\ 0 & x<a\textrm{ or }x>b. \end{cases}$

Furthermore, the function $$R(x-a,T)-R(x-b,T)$$ is bounded in absolute value by $$4\sup_{x\ge 0} S(x)$$. It follows that we can apply the bounded convergence theorem in \eqref{eq:inversionproof} to get that

$I_T \xrightarrow[T\to\infty]{} C \expec(\ind_{\{a<X<b\}}) + (C/2) \expec(\ind_{\{X=a\}} + \ind_{\{X=b\}}) = C \mu((a,b)) + (C/2)\mu(\{a,b\}).$

This is just what we claimed, except for the identification of the constant $$C=2\pi$$. This is a well-known integral evaluation from complex analysis. We can also deduce it in a self-contained manner, by applying what we proved to a specific measure $$\mu$$ and specific values of $$a$$ and $$b$$ for which we can evaluate the limit in \eqref{eq:inversionproof} directly. This is not entirely easy to do, but one possibility, involving an additional limiting argument, is outlined in the next exercise; see also Exercise 1.7.5 on p.\ 35 in [Dur2010] (Exercise 6.6, p.\ 470 in Appendix A.6 of [Dur2004]) for a different approach to finding the value of $$C$$.

Exercise: Recommended for aspiring analysts

For each $$\sigma>0$$, let $$X_\sigma$$ be a r.v. with distribution $$N(0,\sigma^2)$$, and therefore with density $$f_{X_\sigma}(x)=(\sqrt{2\pi}\sigma)^{-1} e^{-x^2/2\sigma^2}$$ and characteristic function $$\varphi_{X_\sigma}(t) = e^{-\sigma^2 t^2/2}$$. For fixed $$\sigma$$, apply Theorem~\ref{thm-inversion} in the weak form proved above (that is, with the constant $$C$$ in place of $$2\pi$$), with parameters $$X=X_\sigma$$, $$a=-1$$ and $$b=1$$, to deduce the identity

$\frac{C}{\sqrt{2\pi}\sigma} \int_{-1}^1 e^{-x^2/2\sigma^2}\,dx = \int_{-\infty}^\infty \frac{2\sin t}{t} e^{-\sigma^2 t^2/2}\,dt.$

Now multiply both sides by $$\sigma$$ and take the limit as $$\sigma\to\infty$$. For the left-hand side this should give in the limit (why?) the value $$(2C)/\sqrt{2\pi}$$. For the right-hand side this should give $$2\sqrt{2\pi}$$. Justify these claims and compare the two numbers to deduce that $$C=2\pi$$.
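The limiting step in this exercise can itself be sanity-checked numerically (an illustration only; the values of $$\sigma$$ and the integration grid are ad hoc choices): multiplying the right-hand side by $$\sigma$$ should produce values approaching $$2\sqrt{2\pi}\approx 5.013$$.

```python
import numpy as np

target = 2 * np.sqrt(2 * np.pi)
vals = []
for sigma in (2.0, 8.0, 32.0):
    L = 20.0 / sigma                       # the Gaussian factor kills |t| > L
    dt = L / 200_000
    t = np.arange(-L, L, dt) + dt / 2      # midpoint grid
    # sigma * integral of (2 sin t / t) e^{-sigma^2 t^2 / 2} dt
    rhs = np.sum(2 * np.sinc(t / np.pi) * np.exp(-sigma ** 2 * t ** 2 / 2)) * dt
    vals.append(sigma * rhs)
errs = [abs(v - target) for v in vals]
assert errs[0] > errs[1] > errs[2]         # convergence as sigma grows
assert errs[2] < 0.01
```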

The following theorem shows that the inversion formula can be written as a simpler connection between the characteristic function and the density function of a random variable, in the case when the characteristic function is integrable.

Theorem

If $$\int_{-\infty}^\infty |\varphi_X(t)|\,dt < \infty$$, then $$X$$ has a bounded and continuous density function $$f_X$$, and the density and characteristic function are related by

\begin{eqnarray*}
\varphi_X(t) &=& \int_{-\infty}^\infty f_X(x) e^{itx}\,dx, \\
f_X(x) &=& \frac{1}{2\pi} \int_{-\infty}^\infty \varphi_X(t) e^{-itx}\,dt.
\end{eqnarray*}

In the lingo of Fourier analysis, this is known as the inversion formula for Fourier transforms.

Proof
This is a straightforward corollary of Theorem~\ref{thm-inversion}. See p.~95 in either [Dur2010] or [Dur2004].
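To see the theorem in action (a sketch with ad hoc grid and cutoff, not part of the proof), one can recover the $$N(0,1)$$ density from $$\varphi_Z(t)=e^{-t^2/2}$$, which is integrable, via the second formula:

```python
import numpy as np

dt = 1e-3
t = np.arange(-20.0, 20.0, dt) + dt / 2    # midpoint grid; phi ~ 0 beyond |t| ~ 10
phi = np.exp(-t ** 2 / 2)                  # characteristic function of N(0,1)

def f_rec(x):
    """f(x) = (1/2pi) * integral of phi(t) e^{-itx} dt, discretized."""
    return float(np.real(np.sum(phi * np.exp(-1j * t * x)) * dt / (2 * np.pi)))

for x in (0.0, 0.5, 1.7):
    exact = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
    assert abs(f_rec(x) - exact) < 1e-5
```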

## The continuity theorem

Theorem: The Continuity Theorem
Let $$(X_n)_{n=1}^\infty$$ be r.v.'s. Then:
1. If $$X_n \implies X$$ for some r.v. $$X$$, then $$\varphi_{X_n}(t)\to \varphi_X(t)$$ for all $$t\in\R$$.
2. If the limit $$\varphi(t) = \lim_{n\to\infty} \varphi_{X_n}(t)$$ exists for all $$t\in\R$$, and $$\varphi$$ is continuous at 0, then $$\varphi\equiv \varphi_X$$ for some r.v. $$X$$, and $$X_n \implies X$$.
Proof

Part (i) follows immediately from the fact that convergence in distribution implies that $$\expec g(X_n)\to \expec g(X)$$ for any bounded continuous function. It remains to prove the less trivial claim in part (ii). Assume that $$\varphi_{X_n}(t)\to \varphi(t)$$ for all $$t\in\R$$ and that $$\varphi$$ is continuous at $$0$$. First, we show that the sequence $$(X_n)_{n=1}^\infty$$ is tight. Fixing an $$M>0$$, we can bound the probability $$\prob(|X_n|>M)$$, as follows:

\begin{eqnarray*} \prob(|X_n|>M) &=& \expec\left( \ind_{\{|X_n|>M\}}\right) \le
\expec \left[ 2 \left( 1-\frac{M}{2|X_n|} \right) \ind_{\{|X_n|>M\}}\right]
\\ &\le&
\expec \left[ 2 \left( 1-\frac{\sin(2X_n/M)}{2X_n/M} \right) \ind_{\{|X_n|>M\}}\right].
\end{eqnarray*}

But this last expression can be related to the behavior of the characteristic function near $$0$$. Denote $$\delta=2/M$$. Reverting again to the Lebesgue-Stieltjes integral notation, we have:

$\expec \left[ 2 \left( 1-\frac{\sin(2X_n/M)}{2X_n/M} \right) \ind_{\{|X_n|>M\}}\right] \ \ =\ \ 2\int_{|x|>2/\delta} \left(1-\frac{\sin (\delta x)}{\delta x}\right) d\mu_{X_n}(x)$

\begin{eqnarray*}
&\le& 2\int_{-\infty}^\infty \left(1-\frac{\sin (\delta x)}{\delta x}\right) d\mu_{X_n}(x)
= \int_{-\infty}^\infty \frac{1}{\delta}\left(\int_{-\delta}^\delta (1-e^{itx})\,dt\right)\,d\mu_{X_n}(x).
\end{eqnarray*}

Now use Fubini's theorem to get that this bound can be written as

$\frac{1}{\delta} \int_{-\delta}^\delta \int_{-\infty}^\infty (1-e^{itx})\,d\mu_{X_n}(x)\,dt =\frac{1}{\delta} \int_{-\delta}^\delta (1-\varphi_{X_n}(t))\,dt \xrightarrow[n\to\infty]{} \frac{1}{\delta} \int_{-\delta}^\delta (1-\varphi(t))\,dt$

(the convergence follows from the bounded convergence theorem).
So we have shown that

$\limsup_{n\to\infty} \prob(|X_n|>M) \le \frac{1}{\delta} \int_{-\delta}^\delta (1-\varphi(t))\,dt.$

But, because of the assumption that $$\varphi(t)\to \varphi(0)=1$$ as $$t\to 0$$, for any $$\epsilon>0$$, if $$\delta$$ is sufficiently small then $$\delta^{-1} \int_{-\delta}^\delta (1-\varphi(t))\,dt < \epsilon$$; this establishes the tightness claim.
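The inequality derived along the way, $$\prob(|X|>M) \le \delta^{-1}\int_{-\delta}^{\delta}(1-\varphi_X(t))\,dt$$ with $$\delta=2/M$$, holds for any single r.v., and can be checked concretely (the choice of $$X\sim\textrm{Exp}(1)$$, whose characteristic function is $$1/(1-it)$$, and the value of $$M$$ are ad hoc):

```python
import numpy as np

M = 10.0
delta = 2.0 / M
dt = 1e-6
t = np.arange(-delta, delta, dt) + dt / 2               # midpoint grid on [-delta, delta]
integral = np.real(np.sum(1 - 1 / (1 - 1j * t)) * dt)   # imaginary parts cancel by symmetry
bound = integral / delta
tail = np.exp(-M)                                       # P(|X| > M) = P(X > M) = e^{-M}
assert tail <= bound
assert abs(bound - 0.0260) < 0.001                      # here the bound is ~0.026
```

The bound is far from tight here, but tightness only needs it to be small for large $$M$$, which it is.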

Finally, to finish the proof, let $$(n_k)_{k=1}^\infty$$ be a subsequence (guaranteed to exist by tightness) such that $$X_{n_k}\implies Y$$ for some r.v.\ $$Y$$. Then $$\varphi_{X_{n_k}}(t)\to \varphi_Y(t)=\varphi(t)$$ as $$k\to\infty$$ for all $$t\in\R$$, so $$\varphi\equiv\varphi_Y$$. This determines the distribution of $$Y$$, which means that the limit in distribution is the same no matter which subsequence of $$(X_n)_n$$ converging in distribution we take. But this implies that $$X_n\implies Y$$ (why? the reader is invited to verify this last claim; it is best to use the definition of convergence in distribution in terms of expectations of bounded continuous functions).

## Moments

The final step in our lengthy preparation for the proof of the central limit theorem will be to tie the behavior of the characteristic function $$\varphi_X(t)$$ near $$t=0$$ to the moments of $$X$$. Note that, computing formally without regard for rigor, we can write

$\varphi_X(t) = \expec (e^{itX}) = \expec\left[\sum_{n=0}^\infty \frac{i^n t^n X^n}{n!} \right] = \sum_{n=0}^\infty \frac{i^n \expec X^n}{n!} t^n.$

So it appears that the moments of $$X$$ appear as (roughly) the coefficients in the Taylor expansion of $$\varphi_X$$ around $$t=0$$. However, for the CLT we don't want to assume anything beyond the existence of the second moment, so a (slightly) more delicate estimate is required.

Lemma: Taylor Estimate

$\left|e^{ix}-\sum_{m=0}^n \frac{(ix)^m}{m!}\right| \le \min\left(\frac{|x|^{n+1}}{(n+1)!},\frac{2|x|^n}{n!}\right).$

Proof

$R_n(x) := e^{ix}-\sum_{m=0}^n \frac{(ix)^m}{m!} = \frac{i^{n+1}}{n!}\int_0^x(x-s)^n e^{is}\,ds,$

which follows from Lemma~\ref{lem-prevlemma} that we used in the proof of Stirling's formula. Taking the absolute value and using the fact that $$|e^{is}|=1$$ gives

$|R_n(x)| \le \frac{1}{n!} \left|\int_0^x |x-s|^n\,ds\right| = \frac{|x|^{n+1}}{(n+1)!}.$

To get a bound that is better-behaved for large $$x$$, note that

\begin{eqnarray*} R_n(x) &=& R_{n-1}(x) - \frac{(ix)^n}{n!} = R_{n-1}(x) - \frac{i^n}{(n-1)!}\int_0^x (x-s)^{n-1}\,ds
\\ &=& \frac{i^n}{(n-1)!}\int_0^x (x-s)^{n-1}(e^{is}-1)\,ds.
\end{eqnarray*}

So, since $$|e^{is}-1|\le 2$$, we get that

$|R_n(x)| \le \frac{2}{(n-1)!} \left|\int_0^x |x-s|^{n-1}\,ds\right| = \frac{2|x|^{n}}{n!}.$

Combining the two bounds gives the claim.
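The lemma's bound is easy to test numerically (a spot check with arbitrary values of $$x$$ and $$n$$):

```python
import cmath
import math

def remainder(x, n):
    """|e^{ix} - sum_{m=0}^{n} (ix)^m / m!|."""
    partial = sum((1j * x) ** m / math.factorial(m) for m in range(n + 1))
    return abs(cmath.exp(1j * x) - partial)

def bound(x, n):
    """min(|x|^{n+1}/(n+1)!, 2|x|^n/n!), as in the lemma."""
    return min(abs(x) ** (n + 1) / math.factorial(n + 1),
               2 * abs(x) ** n / math.factorial(n))

# The first bound wins for small |x|, the second for large |x|.
for x in (-7.0, -0.3, 0.1, 2.5, 40.0):
    for n in (1, 2, 5):
        assert remainder(x, n) <= bound(x, n) + 1e-9
```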

Now let $$X$$ be a r.v.\ with $$\expec|X|^n < \infty$$. Letting $$x=tX$$ in Lemma~\ref{lem-taylor-estimate}, taking expectations and using the triangle inequality, we get that

\begin{equation}
\label{eq:taylor-expec}
\left|\varphi_X(t)-\sum_{m=0}^n \frac{i^m \expec X^m}{m!}t^m \right| \le \expec\left[ \min\left(
\frac{|t|^{n+1} |X|^{n+1}}{(n+1)!}, \frac{2 |t|^n |X|^n}{n!} \right) \right].
\end{equation}

Note that in this minimum of two terms, when $$t$$ is very small the first term gives a better bound, but when taking expectations we need the second term to ensure that the expectation is finite if $$X$$ is only assumed to have a finite $$n$$-th moment.

Theorem: Second Moments of Characteristic Functions

If $$X$$ is a r.v.\ with mean $$\mu=\expec X$$ and $$\var(X)<\infty$$ then

$\varphi_X(t) = 1 + i \mu t - \frac{\expec X^2}{2} t^2 + o(t^2) \qquad \textrm{as }t\to 0.$

Proof

By \eqref{eq:taylor-expec} above, we have

$\frac{1}{t^2}\left|\varphi_X(t) - \left(1 + i \mu t - \frac{\expec X^2}{2} t^2\right)\right| \le \expec \left[ \min\left(|t| \cdot |X|^3/6, X^2 \right) \right].$

As $$t\to 0$$, the right-hand side converges to 0 by the dominated convergence theorem.
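As a concrete illustration of the theorem (the choice of $$X\sim\textrm{Exp}(1)$$, for which $$\mu=1$$, $$\expec X^2 = 2$$ and $$\varphi_X(t)=1/(1-it)$$ exactly, is ad hoc), the normalized error $$t^{-2}\left|\varphi_X(t)-(1+i\mu t - t^2)\right|$$ should vanish as $$t\to 0$$:

```python
errs = []
for t in (0.1, 0.01, 0.001):
    phi = 1 / (1 - 1j * t)            # exact characteristic function of Exp(1)
    approx = 1 + 1j * t - t ** 2      # 1 + i*mu*t - (E X^2 / 2) t^2 with mu = 1, E X^2 = 2
    errs.append(abs(phi - approx) / t ** 2)
assert errs[0] > errs[1] > errs[2]    # the o(t^2) term really is o(t^2): here it is O(t^3)
assert errs[2] < 0.01
```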