# Giant component Information

https://en.wikipedia.org/wiki/Giant_component
An Erdős–Rényi–Gilbert random graph with 1000 vertices at the critical edge probability ${\displaystyle p=1/(n-1)}$, showing a large component and many small ones. At this edge probability, the large component is not yet a giant component: it contains only a sublinear number of vertices.

In network theory, a giant component is a connected component of a given random graph that contains a finite fraction of the entire graph's vertices.

## Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of n vertices is present, independently of the other edges, with probability p. In this model, if ${\displaystyle p\leq {\frac {1-\epsilon }{n}}}$ for any constant ${\displaystyle \epsilon >0}$, then with high probability all connected components of the graph have size O(log n), and there is no giant component. However, for ${\displaystyle p\geq {\frac {1+\epsilon }{n}}}$ there is with high probability a single giant component, with all other components having size O(log n). For ${\displaystyle p=p_{c}={\frac {1}{n}}}$, intermediate between these two possibilities, the number of vertices in the largest component of the graph, ${\displaystyle P_{\inf }}$ is with high probability proportional to ${\displaystyle n^{2/3}}$. [1]

Giant component is also important in percolation theory. [1] [2] When a fraction of nodes, ${\displaystyle q=1-p}$, is removed randomly from an ER network of degree ${\displaystyle \langle k\rangle }$, there exists a critical threshold, ${\displaystyle p_{c}={\frac {1}{\langle k\rangle }}}$. Above ${\displaystyle p_{c}}$ there exists a giant component (largest cluster) of size, ${\displaystyle P_{\inf }}$. ${\displaystyle P_{\inf }}$ fulfills, ${\displaystyle P_{\inf }=p(1-\exp(-\langle k\rangle P_{\inf }))}$. For ${\displaystyle p the solution of this equation is ${\displaystyle P_{\inf }=0}$, i.e., there is no giant component.

At ${\displaystyle p_{c}}$, the distribution of cluster sizes behaves as a power law, ${\displaystyle n(s)}$~${\displaystyle s^{-5/2}}$ which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately ${\displaystyle n/2}$ edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when t edges have been added, for values of t close to but larger than ${\displaystyle n/2}$, the size of the giant component is approximately ${\displaystyle 4t-2n}$. [1] However, according to the coupon collector's problem, ${\displaystyle \Theta (n\log n)}$ edges are needed in order to have high probability that the whole random graph is connected.

## Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions. The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex has degree ${\displaystyle k}$, then the giant component exists [3] if and only if

${\displaystyle \mathbb {E} [k^{2}]-2\mathbb {E} [k]>0.}$
${\displaystyle \mathbb {E} [k]}$ which is also written in the form of ${\displaystyle {\langle k\rangle }}$ is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional. [4] There are three types of connected components in directed graphs. For a randomly chosen vertex:

1. out-component is a set of vertices that can be reached by recursively following all out-edges forward;
2. in-component is a set of vertices that can be reached by recursively following all in-edges backward;
3. weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

## Criteria for giant component existence in directed and undirected configuration graphs

Let a randomly chosen vertex has ${\displaystyle k_{\text{in}}}$ in-edges and ${\displaystyle k_{\text{out}}}$ out edges. By definition, the average number of in- and out-edges coincides so that ${\displaystyle c=\mathbb {E} [k_{\text{in}}]=\mathbb {E} [k_{\text{out}}]}$. If ${\displaystyle G_{0}(x)=\textstyle \sum _{k}\displaystyle P(k)x^{k}}$ is the generating function of the degree distribution ${\displaystyle P(k)}$ for an undirected network, then ${\displaystyle G_{1}(x)}$ can be defined as ${\displaystyle G_{1}(x)=\textstyle \sum _{k}\displaystyle {\frac {k}{\langle k\rangle }}P(k)x^{k-1}}$. For directed networks, generating function assigned to the joint probability distribution ${\displaystyle P(k_{in},k_{out})}$ can be written with two valuables ${\displaystyle x}$ and ${\displaystyle y}$ as: ${\displaystyle {\mathcal {G}}(x,y)=\sum _{k_{in},k_{out}}\displaystyle P({k_{in},k_{out}})x^{k_{in}}y^{k_{out}}}$, then one can define ${\displaystyle g(x)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial x}\vert _{y=1}}$ and ${\displaystyle f(y)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial y}\vert _{x=1}}$. The criteria for giant component existence in directed and undirected random graphs are given in the table below:

Type Criteria
undirected: giant component ${\displaystyle \mathbb {E} [k^{2}]-2\mathbb {E} [k]>0}$, [3] or ${\displaystyle G'_{1}(1)=1}$ [4]
directed: giant in/out-component ${\displaystyle \mathbb {E} [k_{\text{in}}k_{out}]-\mathbb {E} [k_{\text{in}}]>0}$, [4] or ${\displaystyle g'_{1}(1)=f'_{1}(1)=1}$ [4]
directed: giant weak component ${\displaystyle 2\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}k_{\text{out}}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}^{2}]+\mathbb {E} [k_{\text{in}}^{2}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}k_{\text{out}}]^{2}>0}$ [5]