18.6: Mean-Field Approximation on Scale-Free Networks

Last updated
Save as PDF

Page ID: 7883

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

What if the network topology is highly heterogeneous, like in scale-free networks, so that the random network assumption is no longer applicable? A natural way to reconcile such heterogeneous topology and mean-field approximation is to adopt a specific degree distribution \(P(k)\). It is still a non-spatial summary of connectivities within the network, but you can capture some heterogeneous aspects of the topology in \(P(k)\).

One additional complication the degree distribution brings in is that, because the nodes are now different from each other regarding their degrees, they can also be different from each other regarding their state distributions too. In other words, it is no longer reasonable to assume that we can represent the global state of the network by a single “mean field” \(q\). Instead, we will need to represent \(q\) as a function of degree \(k\) (i.e., a bunch of mean fields, each for a specific \(k\)), because heavily connected nodes may become infected more often than poorly connected nodes do. So, here is the summary of the new quantities that we need to include in the approximation:

• \(P(k)\): Probability of nodes with degree \(k \)

• \(q(k)\): Probability for a node with degree \(k\) to be infected

Let’s consider how to revise Table 18.5 .1 using \(P(k)\) and \(q(k)\). It is obvious that all the \(q\)’s should be replaced by \(q(k)\). It is also apparent that the third and fourth row (probabilities of transitions for currently infected states) won’t change, because they are simply based on the recovery probability \(p_r\). And the second row can be easily obtained once the first row is obtained. So, we can just focus on calculating the probability in the first row: What is the probability for a susceptible node with degree \(k\) to remain susceptible in the next time step?

We can use the same strategy as what we did for the random network case. That is, we calculate the probability for the node to get infected from another node, and then calculate one minus that probability raised to the power of the number of neighbors, to obtain the probability for the node to avoid any infection. For random networks, all other nodes were potential neighbors, so we had to raise (\(1−p_{e}qp_{i}\)) to the power of \(n−1\). But this is no longer necessary, because we are now calculating the probability for a node with the specific degree \(k\). So, the probability that goes to the first row of the table should look like:

\[1-(q(k))(1- something)^{k} \label{(18.29)} \]

Here, “something” is the joint probability of two events, that the neighbor node is infected by the disease and that the disease is actually transmitted to the node in question. The latter is still \(p_i\), but the former is no longer represented by a single mean field. We need to consider all possible degrees that the neighbor might have, and then aggregate them all to obtain an overall average probability for the neighbor to be in the infected state. So, here is a more elaborated probability:

\[(1-q(k))(1- \sum){k'} P_{n}(k'|k)q(k')p_i)^{k} \label{(18.30)} \]

Here, \(k'\) is the neighbor’s degree, and the summation is to be conducted for all possible values of \(k'\). \(P_{n}(k'|k)\) is the conditional probability for a neighbor of a node with degree \(k\) to have degree \(k'\). This could be any probability distribution, which could represent assortative or disassortative network topology. But if we assume that the network is neither assortative nor disassortative, then \(P_{n}(k'|k)\)) doesn’t depend on \(k\) at all, so it becomes just \(P_{n}(k')\): the neighbor degree distribution.

Wait a minute. Do we really need such a special distribution for neighbors’ degrees? The neighbors are just ordinary nodes, after all, so can’t we use the original degree distribution \(P(k')\) instead of \(P_{n}(k')\)? As strange as it may sound, the answer is an astonishing NO. This is one of the most puzzling phenomena on networks, but your neighbors are not quite ordinary people. The average degree of neighbors is actually higher than the average degree of all nodes, which is often phrased as “your friends have more friends than you do.” As brieﬂy discussed in Section 16.2, this is called the friendship paradox, first reported by sociologist Scott Feld in the early 1990s [68].

We can analytically obtain the neighbor degree distribution for non-assortative networks. Imagine that you randomly pick one edge from a network, trace it to one of its end nodes, and measure its degree. If you repeat this many times, then the distribution you get is the neighbor degree distribution. This operation is essentially to randomly choose one “hand,” i.e, half of an edge, from the entire network. The total number of hands attached to the nodes with degree \(k'\) is given by \(k'nP(k')\), and if you sum this over all \(k'\), you will obtain the total number of hands in the network. Therefore, if the sampling is purely random, the probability for a neighbor (i.e., a node attached to a randomly chosen hand) to have degree \(k'\) is given by

\[P_{n}(k') =\frac{k'nP(k')}{\sum_{k}k'mP(k')} =\frac{k'P(k')}{\sum_{k'}k'P(k')}=\frac{k'}{\langle{k} \rangle}P(k'), \label{(18.31)} \]

where \(\langle{k} \rangle\) is the average degree. As you can clearly see in this result, the neighbor degree distribution is a modified degree distribution so that it is proportional to degree \(k'\). This shows that higher-degree nodes have a greater chance to appear as neighbors. If we calculate the average neighbor degree \(\langle{k_n} \rangle\), we obtain

\[\langle{k_n} \rangle =\sum_{k'}{k'P_{n}(k')}= \sum_{k'}\frac{k'^{2}P(k')}{\langle{k} \rangle} =\frac{\langle{k^{2}} \rangle}{\langle{k} \rangle} =\frac{\langle{k} \rangle^{2}+\sigma{(k)^{2}}}{\langle{k} \rangle} = \langle{k} \rangle +\frac{\sigma (k)^{2}}{\langle{k} \rangle}, \label{(18.32)} \]

where \(σ(k)^{2}\) is the variance of the degree distribution. This mathematically shows that, if there is any variance in the degree distribution (which is true for virtually all real-world networks), your friends have more friends than you do, on average. This may be a bit depressing, but true. Just face it.

Okay, enough on the friendship paradox. Let’s plug Equation \ref{(18.31)} back into Equation \ref{(18.30)}, which gives

\[(1-q(k))(1- \sum_{k'}\frac{k'}{\langle{k} \rangle} P(k')q(k')p_{i})^{k}. \label{(18.33)} \]

With this, we can finally complete the state transition probability table as shown in Table 18.6.1.

Table \(\PageIndex{1}\): Possible scenarios of state transitions in the network SIS model, with degree distribution \(P(k)\) and degree-dependent infection probability \(q(k)\).

Current state	Next state	Probability of this transition
0 (susceptible)	0 (susceptible)	\[(1-q(k))(1- \sum_{k'}\frac{k'}{\langle{k} \rangle} P(k')q(k')p_{i})^{k} \nonumber \]
0 (susceptible)	1 (infected)	\[(1-q(k))(1- (1- \sum_{k'}\frac{k'}{\langle{k} \rangle} P(k')q(k')p_{i})^{k}) \nonumber \]
1 (infected)	0 (susceptible)	\[q(k)p_{r} \nonumber \]
1 (infected)	1 (infected)	\[q(k)()1-p_{r} \nonumber \]

We can add probabilities of state transitions that turn the next state into 1, to obtain a difference equation for \(q_{t}(k)\) (again, the subscripts are omitted on the right hand side), i.e.,

\[\begin{align} q_{t+1}(k)=(1-q(k))(1-q(k))(1- (1- \sum_{k'}\frac{k'}{\langle{k} \rangle} P(k')q(k')p_{i})^{k}) +q(k)(1-p_{r}) \label{(18.34)} \\ = (1-q(k))(1-(1-q_{n}p_{i})^{k})+q(k)(1-p_{r}) \end{align} \label{(18.35)} \]

with

\[q_{n} =\frac{\sum_{k'}k'P(k')q(k')}{\langle{k}\rangle}. \label{(18.36)} \]

\(q_n\) is the probability that a neighbor node is infected by the disease. Thus \(q_np_i\) corresponds to the “something” part in Equation \ref{(18.29)}. It can take any value between 0 and 1, but for our purpose of studying the epidemic threshold of infection probability, we can assume that \(q_np_i\) is small. Therefore, using the approximation \((1−x)^{k} ≈ 1−kx\) for small \(x\) again, the iterative map above becomes

\[\begin{align} q_{t+1}=(1-q(k)) (1-(1-kq_{n}p_{i})) +q(k)(1-p_{r}) \label{(18.37)} \\ = (1-q(k)) kq_{n}p_{i}+p(k)-q(k)p_{r} =f(q(k)). \label{(18.38)} \end{align} \]

Then we can calculate the equilibrium state of the nodes with degree \(k\) as follows:

\[q_{eq}(k)=(q- q_{eq}(k))kq_{n}p_{i}+ q_{eq}(k)-q_{eq}(k)p_{r} \label{(18.39)} \]

\[q_{eq}p_{r} =kq_{n}p_{i}-kq_{n}p_{i}q_{eq}(k)\label{(18.40)} \]

\[q_{eq} (k)=\frac{kq_{n}p_{i}}{kq_{n}p_{i}+p_{r}} \label{(18.41)} \]

We can apply this back to the definition of \(q_n\) to find the actual equilibrium state:

\[q_{n} =\frac{1}{\langle{k} \rangle}\sum_{k'}k'P(k') \frac{kq_{n}p_{i}}{kq_{n}p_{i}+p_{r}} \label{(18.42)} \]

Clearly, \(q_n = 0\) (i.e., \(q_{eq}(k) = 0\) for all \(k\)) satisfies this equation, so the extinction of the disease is still an equilibrium state of this system. But what we are interested in is the equilibrium with \(q_n \neq{ 0 }\)(i.e., \(q_{eq}(k) > 0\) for some \(k\)), where the disease continues to exist, and we want to know whether such a state is stable. To solve Equation \ref{(18.42)}, we will need to assume a certain degree distribution \(P(k)\).

For example, if we assume that the network is large and random, we can use the following crude approximate degree distribution:

\[P(k) \approx \begin{cases} 1 \text{ if } k =\langle{k}\rangle \label{(18.43)} \\ 0 \text{otherwise} \end{cases} \]

We can use this approximation because, as the network size increases, the degree distribution of a random graph becomes more and more concentrated at the average degree (relative to the network size), due to the law of large numbers well known in statistics. Then Equation \ref{(18.42)} is solved as follows:

\[q_{n}=\frac{\langle{k}\rangle {q_{n}}p_{i}}{\langle{k} \rangle {q_{n}}p_{i}+p_{r}} \label{(18.44)} \]

\[\langle{k} \rangle {q_{n}p_{i}+p_{r}} = \langle{k} \rangle{p_{i}} \label{(18.45)} \]

\[q_{n}=1-\frac{p_{r}}{ \langle{k} \rangle{p_{i}}} \label{(18.46)} \]

We can check the stability of this equilibrium state by differentiating Equation \ref{(18.38)} by \(q(k)\) and then applying the results above. In so doing, we should keep in mind that \(q_n\) contains \(q(k)\) within it (see Equation \ref{(18.36)}). So the result should look like this:

\[\begin {align}\frac{df(q(k))}{dq(k)} =-kq_{n}p_{i} +(1-q(k))kp_{i}\frac{dq_{n}}{dq(k)} +1-p_{r} \label{(18.47)} \\ =-kq_{n}p_{i}+(1-q(k))\frac{k^{2}P(k)p_{i}}{\langle{k}\rangle} +1-p_{r} \label{(18.48)} \end{align} \]

At the equilibrium state Equation \ref{(18.41)}, this becomes

\[\frac{df(q(k))}{dq(k)}|_{q(k)=\frac{kq_{n}p_{i}}{kq_{n}p_{i} +p_{r}}} =-kq_{n}p_{i} +\frac{p_{r}}{kq_{n}p_{i}+p_{r}} \frac{k^{2}P(k)p_{i}}{\langle{k}\rangle}+1-p_{r}=r(k). \label{(18.49)} \]

Then, by applying \(P(k)\) and \(q_n\) with a constraint \(k → \langle{k}\rangle\) for large random networks, we obtain

\[\begin{align} r(\langle{k}\rangle) =-\langle{k}\rangle(1-\frac{p_{r}}{\langle{k}\rangle{p_{i}}})p_{i}+\frac{p_{r}}{\langle{k}\rangle(1-\frac{p_{r}}{\langle{k}\rangle{p_{i}}})p_{i}+p_{r}} \frac{\langle{k}\rangle^{2}p_{i}}{\langle{k}\rangle} + -p_{r} \label{(18.50)} \\ =- \langle{k}\rangle{p_{i}} +p_{r}+p_{r}+1-p_{r} \label{(18.51)} \\ =-\langle{k}\rangle{p_{i}}+p_{r}+1. \label{(18.52)} \end{align} \]

In order for the non-zero equilibrium state to be stable:
\[-1<- \langle{k}\rangle {p_{i}}+p_{r}+1<1 \label{(18.53)} \]

\[\frac{p_{r}}{\langle{k}\rangle}< p_{i} < \frac{p_{r}+2}{\langle{k}\rangle} \label{(18.54)} \]

Note that the lower bound indicated above is the same as the epidemic threshold we obtained on random networks before (Eq. (18.5.18)). So, it seems our analysis is consistent so far. And for your information, the upper bound indicated above is another critical threshold at which this non-zero equilibrium state loses stability and the system begins to oscillate.

What happens if the network is scale-free? Here, let’s assume that the network is a Barab´asi-Albert scale-free network, whose degree distribution is known to be \(P(k) = 2m^{2}k^{−3}\) where \(m\) is the number of edges by which each newcomer node is attached to the network, and thus \(\langle{k}\rangle = 2m\) and \(k_{min} = m\) [57]. Then, from Equation \ref{(18.42)}¹, we obtain

\[q_{n} =\frac{1}{2m} \sum_{k'=m}^{\infty} k' \cdot 2m^{2}k'^{-3}\frac{k'q_{n}p_{i}}{k'q_{n}p_{i}+p_{r}} \label{(18.55)} \]

\[1=mp_{i} \sum_{k'=m}^{\infty} \frac{1}{k'(k'q_{n}p_{i}+p_{r})}.\label{(18.56)} \]

The summation can be approximated by a continuous integral, which results in

\[\begin{align} 1 \approx mp_{i} \int^{\infty}_{m} \frac{dk'}{k'(k'q_{n}p_{i}+p_{r})} \label{(18.57)} \\ = \frac{mp_{i}}{p_{r}} \int^{\infty}_{m} (\frac{1}{k'}-\frac{1}{k'+ p_{r}/(q_{n}p_{i})})dk' \label{(18.58)} \\ =\frac{mp_{i}}{p_{r}} [\log{(\frac{k'}{k'+p_{r/(q_{n}p_{i})}})}]^{\infty}_{m} \\ = \frac{mp_{i}}{p_{r}} \log{(1+\frac{p_{r}}{mq_{n}p_{i}})}, \label{(18.60)} \\ q_{n} \approx \frac{p_{r}}{(e^{\frac{p_{r}}{mp_{i}}} -1)mp_{i}}. \label {(18.61)} \end{align} \]

We can apply this result (together with \(P(k) = 2m^{2}k^{−3})\) to \(r(k)\) in Equation \ref{(18.49)} to check the stability of this non-zero equilibrium state:

\[\begin{align} r(k)=-k \frac{p_{r}}{(e^{\frac{p_{r}}{mp_{i}}} -1 mp_{i})}p_{t}+\frac{p_{r}}{k \frac{p_{r}}{(e^{\frac{p_{r}}{mp_{i}}} -1)mp_{i}}p_{i}+p_{r}} \\ = - \frac { k p _ { r } } { \left( e ^ { \frac { p _ { r } } { m p _ { i } } } - 1 \right) m } + \frac { m p _ { i } } { \left( e ^ { \frac { p r } { m _ { i } } } - 1 \right) m } + 1 - p _ { r } \label{(18.63)}\end{align} \]

This is rather complex, and it would be difficult to solve it in terms of \(p_i\). But still, we can show something very interesting using this formula. If we lower the infection probability \(p_i\) down to zero, this occurs asymptotically:

\[\begin{align} \lim _ { p _ { i } \rightarrow 0 } r ( k ) = - \frac { k p _ { r } } { ( \left[ e ^ { \frac { p _ { r } } { m p _ { i } } } \rightarrow \infty \right] m } + \frac { m \left[ p _ { i } \rightarrow 0 \right] } { ( \left[ e ^ { \frac { p _ { r } } { m p _ { i } } \rightarrow \infty } \right] - k } + 1 - p _ { r } \label{(18.64)} \\ 1-p_{r} \end{align} \]

This shows \(0 < lim_{pi→0} r(k) < 1\), i.e., the non-zero equilibrium state remains stable even if \(p_i\) approaches to zero. In other words, there is no epidemic threshold if the network is scale-free! This profound result was discovered by statistical physicists Romualdo Pastor Satorras and Alessandro Vespignani in the early 2000s[79], which illustrates an important fact that, on networks whose topologies are scale-free, diseases can linger around indefinitely, no matter how weak their infectivity is. This is a great example of how complex network topologies can fundamentally change the dynamics of a system, and it also illustrates how misleading the predictions made using random models can be sometimes. This finding and other related theories have lots of real-world applications, such as understanding, modeling, and prevention of epidemics of contagious diseases in society as well as those of computer viruses on the Internet.

Exercise \(\PageIndex{1}\)

Simulate the dynamics of the network SIS model on Barab´asi Albert scale-free networks to check if there is indeed no epidemic threshold as the theory predicts. In simulations on any finite-sized networks, there is always the possibility of accidental extinction of diseases, so you will need to make the network size as large as possible to minimize this “finite size” effect. You should compare the results with those obtained from control experiments that use random networks.

Finally, I should mention that all the analytical methods discussed above are still quite limited, because we didn’t consider any degree assortativity or disassortativity, any possible state correlations across edges, or any coevolutionary dynamics that couple state changes and topological changes. Real-world networks often involve such higher-order complexities. In order to better capture them, there are more advanced analytical techniques available, such as pair approximation and moment closure. If you want to learn more about these techniques, there are some more detailed technical references available [80, 81, 82, 83].

Exercise \(\PageIndex{2}\)

Consider a network with extremely strong assortativity, so that

\[P \left( k ^ { \prime } | k \right) \approx \left\{ \begin{array} { l l } { 1 } & { \text { if } k ^ { \prime } = k } \\ { 0 } & { \text { otherwise } } \end{array} \right. \label{(18.66)} \]

Use this new definition of \(P(k'|k)\) in Equation \ref{(18.30)} to conduct a mean-field approximation, and determine whether this strongly assortative network has an epidemic threshold or not.

¹Here we assume that Barab´asi-Albert networks are non-assortative.