Published

Saturday, August 3, 2024

Generative Models

Generate graphs with specific properties.

Erdős–Rényi Model

  • Random graph model:
    • Start with \(n\) nodes.
    • For each pair of nodes, add an edge with probability \(p\).
    • Every edge is independent.
    • No obvious pattern.
    • Not a good model for real networks.
  • \(A_{ij} \sim \text{Bernoulli}(p)\)
  • Degree distribution is binomial:
    • Probability of having degree \(k\): \[ p_k = \binom{n}{k} p^k (1-p)^{n-k} \]
    • Can be approximated as Poisson.
  • Diameter: \[ \text{Diameter} \approx \frac{\log(n)}{\log(z)} \] where \(z\) is the average degree.
    • Grows slowly with \(n\) but in real data, we observe a small constant or even shrinking diameter.
  • Clustering coefficient:
    • Needs specification in this model.

Planted Partition Model

  • Two communities:
    • Each node has a community.
    • \(p_{\text{intra}}\): Probability of edge within the community.
    • \(p_{\text{inter}}\): Probability of edge between communities.
  • Latent variable model: It’s not iid - we condition on two community labels. \[ P(A_{ij} = 1 \mid z_i, z_j) = \begin{cases} p_{\text{intra}} & \text{if } z_i = z_j \\ p_{\text{inter}} & \text{if } z_i \neq z_j \end{cases} \]
  • Can generalize to weighted and directed edges.

Stochastic Block Model

  • \(k\) communities:
    • Generalizes PPM to graphs with \(k\) communities and varying edge densities.
    • \(z_i\): Node \(i\) belongs to community \(z_i \in \{1, ..., k\}\).
  • Model parameters:
    • \(p_i\): Community proportion.
    • \(p_{ij}\): Edge probability between communities \(i\) and \(j\).
    • \(P(z_i = i) = p_i\)
    • \(P(A_{ij} = 1 \mid z_i, z_j) = p_{ij}\)
    • \(\eta\): Transition matrix.

Transition Matrix
  • Doesn’t capture all patterns of real networks:
    • Degree-corrected SBM.
    • Real networks have more triangles: Geometric SBM.
    • Real communities are overlapping: Community affiliation graph model.

Preferential Attachment Model

  • EP/PPM/SMB assume that edges are formed independently.
  • Start with a small number of nodes and let it grow by adding nodes and edges.
  • Preferential attachment: New nodes prefer to connect to nodes with high degree (rich get richer - power law degree distribution).

Initial Attractiveness: - Can realize power law with exponent in \([2, \infty]\).

Algorithm: 1. Start with \(m_0\) nodes. 2. Add a new node \(w\). 3. Add \(m\) directed edges \((u, v)\). 4. Probability that the endpoint of an edge \((u, v)\) corresponds to \(v\) is proportional to current degree + IA: \[ A_v = A + \text{in-degree}(v) \] where \(A\) is the initial attractiveness (same for all nodes). 5. Not important where the edge starts; they can all start from \(w\) or existing nodes. 6. Repeat from step 2.

Parameters: - \(m, A\) - Degree distribution: \[ \gamma = 2 + \frac{A}{m} \] - Matches many real networks. - Diameter grows logarithmically with \(n\). - Average degree remains constant.

Configuration Model

  • Given a degree sequence, generate a graph:
    • Assign a degree to each node represented as a stub (a half-edge).
    • Randomly pair the stubs.

Deep Generative Models

  • Learn models from data:
    • All previous models are handcrafted.
    • More flexible.
  • Methods: VAE, GAN, Normalizing Flows, Score-based models, Denoising Diffusion (chronological order of development).
  • Example: NetGAN - latent space interpolations produce smoothly changing graphs.
    • Sample from noise.
    • Manipulate the latent space to generate graphs with specific properties.