NetworkX

NetworkX is a Python package for dealing with complex networks (graphs). It provides Graph classes, graph algorithms, and visualization tools.

You may need to

(pycourse) conda install networkx
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
nx.__version__
'2.5'

We’ll cover some basics in NetworkX. You can find additional information in the NetworkX documentation, which includes a tutorial.

First, let’s create a simple graph:

G = nx.Graph()

# add nodes
G.add_node(0, name="dog")
G.add_node(1)
G.add_nodes_from(range(3)) # adds nodes 0, 1

# add edge from node 0 to node 1
G.add_edge(0,1)

# draws the graph to pyplot axes
nx.draw(G, with_labels=True, font_weight='bold')
plt.show()
../_images/networkx_4_0.png

Graph nodes can be any hashable object, which covers many things you might use

G = nx.Graph()

# add nodes
G.add_node("dog")
G.add_node("cat")

# add edge
G.add_edge("cat","dog")

# draws the graph to pyplot axes
nx.draw(G, with_labels=True)
plt.show()
../_images/networkx_6_0.png

You can also associate arbitrary data with each node and edge by passing in keyword arguments

G = nx.Graph()

# add nodes
G.add_node("dog", name="Rover", weight=10.0)
G.add_node("cat", name="Felix", height=5)

# add edge
G.add_edge("cat","dog", time=2.0)

# draws the graph to pyplot axes
nx.draw(G, with_labels=True, labels=nx.get_node_attributes(G, "name"))
plt.show()
../_images/networkx_8_0.png
G.get_edge_data("cat", "dog")
{'time': 2.0}
nx.get_edge_attributes(G, "time")
{('dog', 'cat'): 2.0}
names = nx.get_node_attributes(G, "name")
names
{'dog': 'Rover', 'cat': 'Felix'}
weights = nx.get_node_attributes(G, "weight")
weights
{'dog': 10.0}

Generating Graphs

NetworkX provides Graph generators to generate a variety of random graphs.

Random Graphs

Erdos-Renyi Graphs

An Erdos-Renyi random graph \(G_{n,p}\) is a graph on \(n\) nodes, where the probability of an edge \((i,j)\) existing is \(p\). In NetworkX, this is called a gnp graph.

n = 50
p = 5 / (n-1) # 5 is expected number of neighbors of a single vertex
G = nx.gnp_random_graph(n, p)
nx.draw(G, with_labels=False)
plt.title(r'$G({},{})$'.format(n,p))
plt.show()
../_images/networkx_14_0.png

Stochastic Block Models

A Graph generated from a stochastic block model (SBM) has \(k\) clusters, and a \(k\times k\) (symmetric) matrix \(P\) of probabilities, where \(P_{i,j}\) is the probability that a pair of nodes \((a,b)\) will be joined by an edge if \(a\) is in cluster \(i\) and \(b\) is in cluster \(j\).

ns = [25, 25, 25] # size of clusters
ps = [[0.3, 0.01, 0.01], [0.01, 0.3, 0.01], [0.01, 0.01, 0.3]] # probability of edge
G = nx.stochastic_block_model(ns, ps)
nx.draw(G, with_labels=False)
plt.show()
../_images/networkx_16_0.png

Visualization

Visualizing graphs is often a great way to see and understand information.

Unless there is some “ground truth” way to lay out nodes (meaning each node has an associated coordinate), we must choose some way to place nodes in our visualization. There are a variety of methods for this.

Layouts

There are several layout methods you might use, which are good for certain applications.

Spectral Embeddings are good for partitioning the graph into clusters.

nx.draw_spectral(G)
plt.show()
../_images/networkx_18_0.png

Spring layouts tend to do a good job of separating nodes, which can be nice for visualization

nx.draw_spring(G)
plt.show()
../_images/networkx_20_0.png

the Kamada-Kawai algorithm also gives visually pleasing results.

nx.draw_kamada_kawai(G)
plt.show()
../_images/networkx_22_0.png
pos = nx.kamada_kawai_layout(G)
pos
{0: array([0.55848456, 0.28631111]),
 1: array([0.2723952 , 0.06022474]),
 2: array([0.29600044, 0.42521603]),
 3: array([0.14204467, 0.50618183]),
 4: array([ 0.13569054, -0.00391481]),
 5: array([0.36586802, 0.14155925]),
 6: array([0.58235413, 0.40263966]),
 7: array([0.22701682, 0.4735505 ]),
 8: array([0.20471613, 0.2449802 ]),
 9: array([0.04167401, 0.45013743]),
 10: array([-0.02230495,  0.36050668]),
 11: array([0.26310646, 0.55601801]),
 12: array([0.09507921, 0.2511831 ]),
 13: array([-0.01296102,  0.15589922]),
 14: array([0.29740876, 0.25122146]),
 15: array([0.30338164, 0.3425249 ]),
 16: array([0.46148538, 0.34601535]),
 17: array([-0.04543286,  0.24141499]),
 18: array([0.04895396, 0.03792357]),
 19: array([0.48120167, 0.46254991]),
 20: array([0.24102299, 0.16557828]),
 21: array([0.10568134, 0.16036921]),
 22: array([0.16803906, 0.37395204]),
 23: array([-0.12459958,  0.24998575]),
 24: array([0.52080826, 0.12520194]),
 25: array([-0.83509014,  0.29632483]),
 26: array([-0.7970804 ,  0.09993388]),
 27: array([-0.3340208 , -0.40362148]),
 28: array([-0.85562945,  0.43181013]),
 29: array([-0.60377777,  0.20106874]),
 30: array([-0.59737155, -0.05315006]),
 31: array([-0.65431893,  0.06491301]),
 32: array([-0.46842143,  0.53144003]),
 33: array([-0.59851362,  0.64011876]),
 34: array([-0.65732526,  0.22843393]),
 35: array([-0.36024348,  0.23178491]),
 36: array([-0.30253631,  0.02616656]),
 37: array([-0.49436719,  0.444842  ]),
 38: array([-0.2693406 ,  0.58999807]),
 39: array([-0.63188034,  0.35842707]),
 40: array([-0.73433804, -0.15805037]),
 41: array([-0.76823287, -0.0704386 ]),
 42: array([-0.42971896, -0.13705016]),
 43: array([-1.        ,  0.22030478]),
 44: array([-0.89866356,  0.00399537]),
 45: array([-0.59349463, -0.25297497]),
 46: array([-0.46461129,  0.13242792]),
 47: array([-0.54012952,  0.31147695]),
 48: array([-0.51027701,  0.01206457]),
 49: array([-0.63818229,  0.44121927]),
 50: array([ 0.25449128, -0.6329606 ]),
 51: array([ 0.41765881, -0.74847725]),
 52: array([ 0.4427453 , -0.20906612]),
 53: array([ 0.20804549, -0.55597445]),
 54: array([ 0.04027806, -0.64605528]),
 55: array([ 0.37076363, -0.41763146]),
 56: array([ 0.57373344, -0.6693159 ]),
 57: array([ 0.58391881, -0.31600168]),
 58: array([ 0.37674834, -0.48387407]),
 59: array([ 0.47358989, -0.40345153]),
 60: array([ 0.34410606, -0.59696046]),
 61: array([ 0.15019332, -0.47281503]),
 62: array([ 0.71101042, -0.29324512]),
 63: array([ 0.23374494, -0.2948483 ]),
 64: array([ 0.6000465 , -0.42802762]),
 65: array([ 0.17297465, -0.2675781 ]),
 66: array([ 0.45073499, -0.61345329]),
 67: array([-0.01417495, -0.47971352]),
 68: array([ 0.354409  , -0.24604809]),
 69: array([ 0.5229518 , -0.36106347]),
 70: array([ 0.23368975, -0.10406808]),
 71: array([ 0.481575  , -0.08849365]),
 72: array([ 0.61958812, -0.60681014]),
 73: array([ 0.72421881, -0.56604564]),
 74: array([ 0.10340916, -0.75671666])}
nx.draw_circular(G)
../_images/networkx_24_0.png

Algorithms

NetworkX implements a variety of graph algorithms.

Shortest Paths

A path between nodes \(x\) and \(y\) is a sequence of edges where the target of an edge is the source of the next edge. \begin{equation} (x, v_0), (v_0, v_1), \dots, (v_k, y) \end{equation} The length of the path is the number of edges in the sequence. A shortest path is a path with the shortest length over all possible paths.

We’ll illustrate on a 2D grid:

G = nx.grid_2d_graph(10,10)
pos = nx.kamada_kawai_layout(G)
nx.draw(G, pos, with_labels=True)
../_images/networkx_26_0.png
pos = {(i,j): np.array([i,j]) for i in range(10) for j in range(10)}
nx.draw(G, pos, with_labels=True)
../_images/networkx_27_0.png

The nodes of this 2D grid are indexed by coordinates \((i,j)\).

p = nx.shortest_path(G, (0,0), (9,9))
elist = [[p[i], p[i+1]] for i in range(len(p)-1)]
nx.draw(G, pos)
nx.draw_networkx_edges(G, pos, elist, edge_color='r', width=3.0)
plt.show()
../_images/networkx_29_0.png

The length of the shortest path between any two nodes in a graph \(G\) defines a metric on the vertex set (this is analagous to the geodesic metric on a manifold).

dists = nx.shortest_path_length(G)

This creates an iterator over shortest path lengths for each node

Exercise

Write a function which returns a distance matrix for the shortest path length.

## Your code here

Spanning Trees

A tree is a connected graph with no cycles. If the graph has \(n\) nodes, it must have \(m = n-1\) edges to be a tree.

A spanning tree of a graph \(G\) is a sub-graph with the same set of nodes, and a subset of edges that forms a tree. A minimum spanning tree (MST) is a spanning tree with minimum weight (you can weight edges using a 'weight' attribute).

You can compute a spanning tree of a graph using nx.tree.minimum_spanning_tree, or nx.tree.minimum_spanning_edges. Other tree algorithms are provided as well.

T = nx.tree.minimum_spanning_tree(G)

nx.draw(G, pos)
nx.draw(T, pos, edge_color='r', width=3.0)
plt.show()
../_images/networkx_36_0.png
# set random weights
for i,j in G.edges():
    G[i][j]['weight'] = np.random.rand()
    
    
T = nx.tree.minimum_spanning_tree(G)

# nx.draw(G, pos, )
nx.draw(T, pos, edge_color='r', width=3.0)
plt.show()
../_images/networkx_37_0.png

Exercise

Create a function that generates a maze on a 2-dimensional \(m \times n\) grid using nx.grid_2d_graph and a randomly weighted MST. Place the start and end points at \((0,0)\) and \((m-1, n-1)\) respectively.

Create a function that solves a maze by computing the shortest path between the start and end points.

## Your code here

Bipartite Graphs

A bipartite graph is a graph where nodes are separated into two sets \(U, V\), and where edges are of the form \((u,v)\) with \(u\in U\), and \(v\in V\) (there are no edges between two points in \(U\) or two points in \(V\)).

Bipartite graphs are often used to encode matching problems. For example, dating websites might match romantic partners, and ride apps such as Uber and Lyft might match riders with drivers.

NetworkX functionality for biparite graphs is in nx.algorithms.bipartite

from networkx.algorithms import bipartite

B = bipartite.random_graph(5, 7, 0.5)

nx.draw(B)
../_images/networkx_41_0.png
B.nodes(data=True)
NodeDataView({0: {'bipartite': 0}, 1: {'bipartite': 0}, 2: {'bipartite': 0}, 3: {'bipartite': 0}, 4: {'bipartite': 0}, 5: {'bipartite': 1}, 6: {'bipartite': 1}, 7: {'bipartite': 1}, 8: {'bipartite': 1}, 9: {'bipartite': 1}, 10: {'bipartite': 1}, 11: {'bipartite': 1}})

Let’s explicitly define positions to visualize the bipartite graph

pos = {}
cts = [0, 0]

for i, data in B.nodes(data=True):
    group = data['bipartite']
    pos[i] = np.array([group, cts[group]])
    cts[group] = cts[group] + 1
    
nx.draw(B, pos)
../_images/networkx_44_0.png
# set random weights
for i,j in B.edges():
    B[i][j]['weight'] = np.random.rand()
    
weights = [B[u][v]['weight'] for u,v in B.edges()]
    
nx.draw(B, pos, width=weights)
../_images/networkx_45_0.png
# compute maximum matching
match = bipartite.maximum_matching(B)
match
{0: 7, 1: 5, 2: 6, 3: 8, 4: 11, 5: 1, 6: 2, 7: 0, 8: 3, 11: 4}
elist = [[k, v] for k, v in match.items()]

nx.draw(B, pos, width=weights)
nx.draw_networkx_edges(B, pos, elist, edge_color='r', width=1.0)
plt.show()
../_images/networkx_47_0.png

Other Graph Algorithms

There are many interesting applications of graphs and many graph algorithms that you might consider using. See the NetworkX algorithms documentation to see what some of the possibilities are.

Reading Graphs

In scientific computing, you’ll typically get a graph from some sort of data. Often these graphs are referred to as “complex networks”. One good source of data is the Stanford Large Network Dataset Collection

Graphs can be stored in a variety of formats. You can find documentation for NetworkX’s read/write capabilities here.

There are also some built-in example graphs in NetworkX. One example graph is the Zachary Karate Club Graph, which encodes friendships between individuals in a Karate club. There are some other example social network graphs

G = nx.karate_club_graph()
nx.draw_kamada_kawai(G)
../_images/networkx_50_0.png

Here’s an example of reading and writing a graph as an edgelist:

B = bipartite.random_graph(5, 7, 0.5)
nx.write_edgelist(B, "bipartite.graph")
B2 = nx.read_edgelist("bipartite.graph")