Network Graphs using Python

Network graphs are powerful visual representations that illustrate relationships between entities across various domains. From social connections to biological systems, Python offers robust tools for network analysis and visualization. This guide explores essential concepts, libraries, and techniques for creating insightful network graphs with Python.

Understanding Graph Theory Fundamentals

Graph theory provides the mathematical foundation for network analysis. A graph consists of nodes (vertices) connected by edges (links) that represent relationships.

Types of Graphs:

Undirected graphs have symmetric relationships (like Facebook friendships)
Directed graphs (digraphs) have one-way relationships, drawn as arrows (like Twitter follows)
Weighted graphs assign values to edges (distances, costs, strengths)
Unweighted graphs simply show presence/absence of connections

Key network properties include density (ratio of actual to possible connections), connectivity (how well nodes connect), centrality (node importance), and communities (node clusters with dense internal connections).
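
To make the density definition concrete, here is a minimal dependency-free sketch for a simple undirected graph. The helper name `density` is made up for this illustration; in practice NetworkX's `nx.density(G)` reports the same ratio for a graph like this one.

```python
def density(num_nodes, edges):
    """Density of a simple undirected graph: actual edges / possible edges."""
    # An undirected graph on n nodes can have at most n * (n - 1) / 2 edges
    possible = num_nodes * (num_nodes - 1) / 2
    # Ignore self-loops and count (u, v) and (v, u) as the same edge
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    return len(unique_edges) / possible if possible else 0.0


edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
print(density(4, edges))  # 4 of 6 possible edges, about 0.667
```

A density near 1 means almost every pair of nodes is connected; most real-world networks sit much closer to 0.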

Mathematically, networks are commonly represented using adjacency matrices, where element (i,j) indicates whether nodes i and j are connected.
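
As a rough illustration of that representation, the sketch below builds an adjacency matrix from an edge list using plain Python lists. The function name `adjacency_matrix` and the node ordering are assumptions made for this example, not part of any library.

```python
def adjacency_matrix(nodes, edges):
    """Build a 0/1 adjacency matrix for an undirected graph."""
    # Map each node to a row/column index in the order given
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        i, j = index[u], index[v]
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: the matrix is symmetric
    return matrix


nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
matrix = adjacency_matrix(nodes, edges)
for row in matrix:
    print(row)
```

For a directed graph, the mirrored assignment would be dropped, so element (i,j) would record only an edge from node i to node j.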

Essential Python Libraries for Network Graphs

Several specialized Python libraries facilitate network analysis and visualization:

NetworkX: The cornerstone library providing comprehensive functionality for creation, manipulation, and study of complex networks.

import networkx as nx
# Create a simple graph
G = nx.Graph()
G.add_edges_from([(1, 2), (1, 3), (2, 3), (3, 4)])

Matplotlib: While not strictly a network library, it integrates seamlessly with NetworkX for basic visualization:

import matplotlib.pyplot as plt
nx.draw(G, with_labels=True)
plt.show()

Pyvis: Built on the vis.js JavaScript library, Pyvis creates interactive network visualizations:

from pyvis.network import Network
net = Network()
net.from_nx(G)
# On recent Pyvis releases, outside Jupyter, use net.show("network.html", notebook=False)
net.show("network.html")

Graph-tool: Written in C++ with Python bindings, it offers significantly faster performance for large-scale network analysis.

Getting Started with NetworkX

NetworkX provides an intuitive API with extensive documentation for network analysis.

Installation and Setup:

pip install networkx matplotlib

Creating Graph Objects:
NetworkX offers several graph classes:

# Undirected graph
G = nx.Graph()
# Directed graph
D = nx.DiGraph()
# Undirected graph that allows parallel edges
M = nx.MultiGraph()

Adding Nodes and Edges:

# Add nodes with attributes
G.add_node(1, role='server')
G.add_nodes_from([2, 3, 4])

# Add weighted edges
G.add_edge(1, 2, weight=0.5)
G.add_edges_from([(1, 3), (2, 3), (3, 4)])

Basic Manipulation:

# Access nodes and edges
print(G.nodes())
print(G.edges())

# Access attributes
print(G.nodes[1]['role'])
print(G.edges[1, 2]['weight'])

Graph Properties:

# Basic metrics
print(nx.number_of_nodes(G))
print(nx.number_of_edges(G))
print(G.degree[1])
print(nx.is_connected(G))

# Find shortest path
print(nx.shortest_path(G, 1, 4))

Building Basic Network Graphs

Let’s create a practical friendship network:

# Create social network
friendship_network = nx.Graph()

# Add people as nodes
people = ["Meilana", "Nadia", "Maria", "David", "Ulfa"]
friendship_network.add_nodes_from(people)

# Add connections
friendships = [
    ("Meilana", "Nadia"), ("Meilana", "Maria"), 
    ("Nadia", "David"), ("Maria", "David"), 
    ("David", "Ulfa")
]
friendship_network.add_edges_from(friendships)

# Visualize
plt.figure(figsize=(8, 6))
nx.draw_networkx(friendship_network, 
                 node_color='lightblue', 
                 node_size=500)
plt.axis('off')
plt.show()

Converting Data Structures to Graphs:

# From dictionary of lists
connections = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A', 'D']
}
G = nx.Graph(connections)

# From pandas DataFrame
import pandas as pd
edges_df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C'],
    'target': ['B', 'C', 'D', 'D'],
    'weight': [0.5, 0.8, 1.2, 0.7]
})
G = nx.from_pandas_edgelist(edges_df, 'source', 'target', 'weight')

Customizing Network Visualization

Customization makes network graphs more informative and appealing.

Modifying Node Appearance:

G = nx.karate_club_graph()

# Size nodes by degree
node_sizes = [v * 100 for v in dict(G.degree()).values()]

# Color nodes by attribute
node_colors = ['red' if G.nodes[n]['club'] == 'Mr. Hi' 
               else 'green' for n in G.nodes()]

nx.draw_networkx(G, 
                node_size=node_sizes,
                node_color=node_colors,
                with_labels=True)

Customizing Edge Properties:

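# Edge width based on weight (edges without a weight default to 1)
edge_weights = [w for _, _, w in G.edges(data='weight', default=1)]
normalized_weights = [1 + 2 * (w / max(edge_weights)) 
                     for w in edge_weights]

# Compute node positions once so nodes and edges line up
pos = nx.spring_layout(G, seed=42)

# Draw with custom edges
nx.draw_networkx_edges(G, pos, 
                      width=normalized_weights,
                      edge_color='gray',
                      alpha=0.7)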

Layout Algorithms:
NetworkX offers various layout algorithms that significantly impact visualization:

layouts = {
    "Spring": nx.spring_layout,
    "Circular": nx.circular_layout,
    "Random": nx.random_layout,
    "Shell": nx.shell_layout,
    "Spectral": nx.spectral_layout
}

# Compare layouts side by side (five layouts in a 2x3 grid)
plt.figure(figsize=(15, 10))
for i, (name, layout) in enumerate(layouts.items(), 1):
    plt.subplot(2, 3, i)
    plt.title(name)
    pos = layout(G)
    nx.draw_networkx(G, pos, node_size=100, font_size=8)
    plt.axis('off')
plt.tight_layout()
plt.show()

Spring layout works well for general networks, circular layouts highlight cycles, and spectral layouts often reveal community structures.

Data Import and Graph Construction

Real-world network analysis typically starts with importing external data.

From CSV Files:

import pandas as pd

# Load edge data
edges_df = pd.read_csv('network_edges.csv')
G = nx.from_pandas_edgelist(
    edges_df,
    source='source',
    target='target',
    edge_attr='weight'
)

# Load node attributes
nodes_df = pd.read_csv('network_nodes.csv')
node_attrs = nodes_df.set_index('id').to_dict('index')
nx.set_node_attributes(G, node_attrs)

From Adjacency Matrix:

import numpy as np

# Adjacency matrix
adj_matrix = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0]
])

G = nx.from_numpy_array(adj_matrix)

Data Cleaning:

# Remove rows with a missing endpoint (do this before building the graph)
edges_df = edges_df.dropna(subset=['source', 'target'])

# Fill missing weights
edges_df['weight'] = edges_df['weight'].fillna(1.0)

# Remove self-loops and isolated nodes (after building the graph)
G.remove_edges_from(list(nx.selfloop_edges(G)))
G.remove_nodes_from(list(nx.isolates(G)))

Graph Analysis Techniques

NetworkX provides powerful algorithms for network analysis.

Centrality Measures:

# Calculate various centrality metrics
degree_cent = nx.degree_centrality(G)
betweenness_cent = nx.betweenness_centrality(G)
closeness_cent = nx.closeness_centrality(G)
eigenvector_cent = nx.eigenvector_centrality(G)

# Visualize with node size based on centrality
plt.figure(figsize=(10, 8))
pos = nx.spring_layout(G, seed=42)
node_sizes = [v * 3000 for v in betweenness_cent.values()]
nx.draw_networkx(G, pos, node_size=node_sizes)

Each centrality measure highlights different aspects of importance:

Degree centrality: Number of connections
Betweenness centrality: Control over information flow
Closeness centrality: How quickly a node can reach others
Eigenvector centrality: Connection to other important nodes
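
To show what two of these measures compute, here is a small dependency-free sketch of degree and closeness centrality. It assumes a connected, undirected graph with at least two nodes; the helper names are invented for this illustration and the normalization follows the usual textbook definitions.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a node -> neighbors mapping."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def closeness_centrality(adj, source):
    """(n - 1) divided by the sum of shortest-path distances from source."""
    # Breadth-first search gives hop distances in an unweighted graph
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adj[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(adj) - 1) / total if total else 0.0


adj = build_adjacency([(1, 2), (1, 3), (2, 3), (3, 4)])
print(degree_centrality(adj))
print(closeness_centrality(adj, 4))
```

In this toy graph node 3 touches every other node, so it scores 1.0 on both measures, while node 4 hangs off the edge of the network and scores lowest.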

Community Detection:

# Requires the python-louvain package (pip install python-louvain)
import community as community_louvain

# Detect communities using Louvain method
partition = community_louvain.best_partition(G)

# Visualize communities, one color per community id
pos = nx.spring_layout(G, seed=42)
colors = [partition[node] for node in G.nodes()]
nx.draw_networkx(G, pos, node_color=colors, cmap=plt.cm.rainbow)

Path Finding:

# Find shortest path by hops (assumes G contains nodes 1 and 6)
shortest_path = nx.shortest_path(G, source=1, target=6)

# Find shortest path by edge weight (uses the 'weight' edge attribute)
weighted_path = nx.dijkstra_path(G, source=1, target=6, weight='weight')
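
The two calls above can return different routes. The dependency-free sketch below illustrates why, using a breadth-first search for hop count and a textbook Dijkstra search for total weight. The toy graph and the helper names are assumptions made for this example, not NetworkX internals.

```python
import heapq
from collections import deque


def walk_back(parents, target):
    """Rebuild a path by following parent links back from the target."""
    if target not in parents:
        return None  # target was never reached
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


def fewest_hops_path(adj, source, target):
    """Breadth-first search: minimizes the number of edges."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in adj[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return walk_back(parents, target)


def lowest_weight_path(adj, source, target):
    """Dijkstra's algorithm: minimizes the sum of non-negative edge weights."""
    dist = {source: 0}
    parents = {source: None}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, weight in adj[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                parents[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return walk_back(parents, target)


# Toy weighted graph: the direct A-C edge is short in hops but expensive
adj = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
print(fewest_hops_path(adj, "A", "C"))
print(lowest_weight_path(adj, "A", "C"))
```

Here the hop-based search takes the direct A-C edge, while the weighted search detours through B because 1 + 1 is cheaper than 5.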

Interactive Network Visualization with Pyvis

Interactive visualizations allow users to explore complex networks dynamically.

from pyvis.network import Network

# Create interactive network
net = Network(height="700px", width="100%", bgcolor="#222222", font_color="white")

# Set physics options
net.barnes_hut(gravity=-80000, central_gravity=0.3, spring_length=250)

# Convert from NetworkX
net.from_nx(G)

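# Customize nodes
for node in net.nodes:
    node["title"] = f"Node {node['id']}"
    node["size"] = 10 + G.degree[node['id']] * 2
    # Assumes nodes carry a "type" attribute; anything other than "A" gets the second color
    node["color"] = "#00ffff" if G.nodes[node['id']].get("type") == "A" else "#ff00ff"

# Save and display
# On recent Pyvis releases, outside Jupyter, use net.show("interactive_network.html", notebook=False)
net.show("interactive_network.html")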

Rich Tooltips:

# Attach a detailed HTML tooltip to each node already loaded with from_nx()
for node in net.nodes:
    node_id = node["id"]
    node_type = G.nodes[node_id].get("type", "unknown")
    node["title"] = f"""
<div style='padding: 10px; background-color: #f5f5f5; border-radius: 5px'>
<h3>Node {node_id}</h3>
<p><b>Type:</b> {node_type}</p>
<p><b>Connections:</b> {G.degree(node_id)}</p>
</div>
"""

Advanced Network Visualization

For complex or large networks, advanced techniques improve visualization clarity.

Edge Bundling:

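import numpy as np

# Simple edge-curving helper (a lightweight stand-in for true edge bundling)
def curve_edges(G, pos, dist_ratio=0.2):
    curved_edges = []
    for u, v in G.edges():
        # Endpoints and midpoint of the straight edge
        source_pos = np.array(pos[u], dtype=float)
        target_pos = np.array(pos[v], dtype=float)
        midpoint = (source_pos + target_pos) / 2

        # Offset the midpoint perpendicular to the edge direction
        direction = target_pos - source_pos
        length = np.linalg.norm(direction)
        if length == 0:
            normal = np.zeros(2)  # self-loop or coincident nodes: no offset
        else:
            normal = np.array([-direction[1], direction[0]]) / length * dist_ratio

        # Control points: start, bent midpoint, end
        path = [source_pos, midpoint + normal, target_pos]
        curved_edges.append(path)
    return curved_edges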

Large Network Visualization:

# For large networks, filter to show only important elements
pagerank = nx.pagerank(G)
threshold = 0.003  # PageRank scores sum to 1, so tune this to the network size
important_nodes = [n for n, r in pagerank.items() if r > threshold]
subgraph = G.subgraph(important_nodes)

# Lay out only the nodes that will be drawn
pos = nx.spring_layout(subgraph, seed=42)

# Size nodes by importance
node_sizes = [pagerank[n] * 30000 for n in subgraph.nodes()]
nx.draw_networkx(subgraph, pos, node_size=node_sizes, alpha=0.8)

Practical Applications and Case Studies

Network graphs apply to diverse domains:

Social Network Analysis:

# Detect communities
communities = nx.algorithms.community.greedy_modularity_communities(G)

# Analyze influence
betweenness = nx.betweenness_centrality(G)
influencers = sorted(betweenness.items(), key=lambda x: x[1], reverse=True)[:5]
print(f"Top influencers: {influencers}")

Biological Networks:

# Create protein interaction network
G = nx.Graph()
G.add_nodes_from([
    ("P1", {"type": "Receptor"}),
    ("P2", {"type": "Enzyme"}),
    ("P3", {"type": "Transcription Factor"})
])

G.add_edges_from([
    ("P1", "P2", {"effect": "Activation"}),
    ("P2", "P3", {"effect": "Inhibition"})
])

# Color by protein type
node_colors = ["red" if G.nodes[n]["type"] == "Receptor" else
              "blue" if G.nodes[n]["type"] == "Enzyme" else
              "green" for n in G.nodes()]

Transportation Networks:

# Find bottlenecks in transport network
edge_betweenness = nx.edge_betweenness_centrality(G)
critical_connections = sorted(edge_betweenness.items(), 
                             key=lambda x: x[1], reverse=True)[:5]

Working with Directed and Undirected Graphs

Different graph types require specific handling:

# Create directed graph
D = nx.DiGraph()
D.add_edges_from([('A', 'B'), ('B', 'C'), ('A', 'C')])

# Convert to undirected
G = D.to_undirected()

# Visualize directed graph
plt.figure(figsize=(8, 6))
pos = nx.spring_layout(D)
nx.draw_networkx(D, pos, 
                arrowsize=15, 
                arrowstyle='-|>', 
                node_color='lightblue')

Directed graphs use in_degree and out_degree for connection analysis:

# Analyze influence (outgoing edges) and popularity (incoming edges)
influence = dict(D.out_degree())
popularity = dict(D.in_degree())

Best Practices and Optimization

For effective network visualizations:

Performance: For large networks, use Graph-tool or filter to show only important nodes
Layout: Choose appropriate layouts (spring for general, circular for cycles)
Color: Use meaningful color schemes that highlight important attributes
Size: Size nodes and edges based on relevant metrics
Interactivity: Use Pyvis for interactive exploration of complex networks
Simplification: Consider edge bundling or aggregation for dense networks
Labels: Only label important nodes to reduce visual clutter
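
As one way to apply the labeling advice, the sketch below keeps labels only for the highest-scoring nodes. The helper `top_k_labels` is a hypothetical name for this example, and the degree values are those of the friendship network built earlier.

```python
def top_k_labels(scores, k):
    """Return a {node: label} dict for the k highest-scoring nodes."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {node: str(node) for node in ranked[:k]}


# Degrees from the friendship network example
degrees = {"Meilana": 2, "Nadia": 2, "Maria": 2, "David": 3, "Ulfa": 1}
labels = top_k_labels(degrees, 1)
print(labels)
```

The resulting dictionary can be passed as `labels=labels` to `nx.draw_networkx_labels(G, pos, labels=labels)` after drawing the graph with `with_labels=False`, so only the selected nodes are labeled.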
