Betweenness Centrality

In the prior chapter, we examined the idea of “centrality” for nodes and “centralization” for graphs, focusing attention on the conceptual aspects of what it means to be central as well as the ways in which we can operationalize the concept using degree centrality and closeness centrality. In this chapter we want to continue thinking about what it means to be central, but through a different conceptual lens: betweenness. Betweenness centrality serves as a powerful indicator of actors who bridge otherwise disconnected parts of a network, offering unique control over the flow of information and resources across network boundaries. Unlike degree centrality, which emphasizes direct ties, or closeness centrality, which identifies actors closest to others, betweenness centrality highlights those who can influence inter-group dynamics. A compelling example is seen in a study by Calderoni (2014), which examined the Italian Mafia network, specifically the Ndrangheta organization. Calderoni found that individuals with high betweenness centrality played pivotal roles in connecting local criminal cells to broader, often international networks. These “connectors” were essential for coordinating illicit activities across regions and facilitating transactions that would otherwise remain confined to isolated groups. For law enforcement, targeting these high-betweenness actors provided a strategic leverage point to disrupt cross-regional collaboration, making them critical targets in efforts to dismantle organized crime networks. By understanding the role of betweenness, crime analysts can better assess how key individuals maintain structural cohesion within social networks.

By the end of this chapter, you should be able to:

Explain the conceptualization of “centrality” as it pertains to betweenness centrality.
Calculate betweenness centrality scores for a set of nodes.
Calculate a betweenness centralization score for a graph.

Case Study: Money Laundering and Betweenness

In a case study examining the role of betweenness centrality in criminal networks, Malm and Bichler (2011) analyzed money laundering to understand how different actors influenced network cohesion and resource flow. Analyzing data from intelligence reports, the authors found that the majority of money laundering in this drug market was self-laundering (where individuals laundered their own illicit earnings), rather than employing specialized professionals. Using betweenness centrality to assess how well-positioned launderers were within the network, Malm and Bichler (2011) found that self-launderers generally exhibited higher betweenness centrality, showing they often acted as key brokers facilitating information flow and transactions across different parts of the network. The findings suggested that targeting individuals with high betweenness centrality, particularly those involved in self-laundering, may be most effective for disrupting drug networks.

Betweenness Centrality Definition

Betweenness centrality is based on the number of shortest paths between two nodes, j and k, that node i resides on. Recall that a path is an edge or sequence of edges that connect two nodes and a geodesic is the shortest path between two nodes.

Undirected Graphs

Betweenness Centrality

For an undirected binary graph, betweenness centrality is:

\[C_B(n_i) = \sum\limits_{j<k} g_{jk}(n_i) / g_{jk}\]

This equation is a bit different from those for degree and closeness centrality. Let’s unpack it:

The denominator, \(g_{jk}\), is the number of geodesics linking j to k.
The numerator, \(g_{jk}(n_i)\), is the number of geodesics linking j and k that contain i.
Betweenness centrality is the ratio of the geodesics between j and k that contain i.

In words, if j has to go through i to reach k, \(j-i-k\), then i will have high betweenness because i is between j and k.

Let’s consider an example. Take a look at this graph:

What are the paths between Jen and Jim?

There are two:

Jen to Tom to Bob to Jim
Jen to Tom to Bob to Leaf to Jim

We can see this visually:

Note that Bob is on both of those paths. He is between Jen and Jim on both paths. Note, however, that only one of the paths is a geodesic path (the Jen->Tom->Bob->Jim path). That is because it is shorter.

Now, let’s calculate the betweenness centrality for Bob. The first thing we need to find are the geodesic paths. Then, figure out how many of those geodesic paths contain Bob. We do this by creating a table of geodesic proportions for Bob.

Geodesic Proportions for Bob

	Jen	Tom	Leaf	Jim
Jen				/?
Tom
Leaf
Jim

How many geodesics from Jen to Jim? There is 1 geodesic from Jen to Jim. So, we would just add that to our table:

Geodesic Proportions for Bob

	Jen	Tom	Leaf	Jim
Jen				/1
Tom
Leaf
Jim

Now, how many geodesics from Jen to Jim include Bob? Bob is on the only geodesic from Jen to Jim. So, we would just add that to our table:

Geodesic Proportions for Bob

	Jen	Tom	Leaf	Jim
Jen				1/1
Tom
Leaf
Jim

What about Jen to Tom?

Geodesic Proportions for Bob

	Tom	Jim
Jen	?/?	1/1
Tom
Leaf
Jim

There is 1 geodesic from Jen to Tom, and Bob is not on it. So, that is a 0 for the numerator. We can update the table accordingly:

Geodesic Proportions for Bob

	Tom	Jim
Jen	0/1	1/1
Tom
Leaf
Jim

Finally, what about Jen to Leaf?

Geodesic Proportions for Bob

	Tom	Leaf	Jim
Jen	0/1	?/?	1/1
Tom
Leaf
Jim

Bob is on the geodesic from Jen to Leaf:

Geodesic Proportions for Bob

	Tom	Leaf	Jim
Jen	0/1	1/1	1/1
Tom
Leaf
Jim

Of the geodesics between \(Jen,Tom\), \(Jen,Leaf\), and \(Jen,Tom\), how many include Bob? We just sum the ratios across the row. It gives us 2.

Geodesic Proportions for Bob

	Tom	Leaf	Jim	SUM
Jen	0/1	1/1	1/1	2
Tom
Leaf
Jim

Now, to finish our calculation for Bob, we need to calculate the geodesics for the rest of the matrix:

Geodesic Proportions for Bob

	Tom	Leaf	Jim
Jen	0/1	1/1	1/1
Tom		?/?	?/?
Leaf			?/?
Jim

To aid in this, think about the highlighted paths below:

The sum of all these ratios is Bob’s betweenness centrality score:

\[C_B(Bob) = \sum\limits_{j<k} g_{jk}(Bob) / g_{jk}\]

Geodesic Proportions for Bob

	Tom	Leaf	Jim	SUM
Jen	0/1	1/1	1/1	2
Tom		1/1	1/1	2
Leaf			0/1	0
Jim
TOTAL				4

That sum is 4. Bob’s betweenness centrality score is 4. What does this mean? A score of 4 indicates that Bob occupies a position between two nodes on 4 of the geodesics in the graph.

What about Leaf? We can do the same thing by creating a table of geodesic proportions for Leaf.

Geodesic Proportions for Leaf

	Tom	Bob	Jim	SUM
Jen	?/?	?/?	?/?	?
Tom		?/?	?/?	?
Bob			?/?	?
Jim
TOTAL				?

Based on the plot, we can visually see that Leaf’s betweenness is going to be zero as he is not on any paths between two nodes. We can show this mathematically in the table:

Geodesic Proportions for Leaf

	Tom	Bob	Jim	SUM
Jen	0/1	0/1	0/1	0
Tom		0/1	0/1	0
Bob			0/1	0
Jim
TOTAL				0

Based on the visual inspection of the plot, can you find two other nodes who will have a betweenness centrality score of 0?

The complete scores are:

Node	Betweenness
Jen	0
Tom	3
Bob	4
Leaf	0
Jim	0

Standardization

As discussed in the prior chapters, centrality measures are sensitive to the size of the graph, g. In the case of betweenness centrality, having to sum over more nodes will make scores from large networks bigger than scores from smaller networks (because you are just summing over more nodes). This means that we can’t compare betweenness centrality scores across graphs of different sizes. Solution?

Standardize! If we want to compare nodes in different sized graphs, then we just take into account the number of nodes and the maximum possible nodes to which i could be connected. For betweenness centrality, we have to go a step further. We need to account for the number of pairs of nodes that do not include i. In an undirected graph, this is \((g-1))(g-2)/2\).

Thus, we can calculate a standardized betweenness centrality score for an undirected graph as:

\[C'_B(n_i) = \frac{\sum\limits_{j<k} g_{jk}(n_i) / g_{jk}}{[(g-1)(g-2)/2]} = \frac{C_B(n_i)}{[(g-1)(g-2)/2]}\]

So, all we are doing is adjusting each betweenness score by \((g-1))(g-2)/2\).

For this graph, the denominator, \((g-1))(g-2)/2=(5-1))(5-2)/2 = 6\). Plugging that in we get: \[C'_B(n_i) = \frac{C_B(n_i)}{6}\].

When we calculate the raw betweenness scores, we get the following table:

Node	Betweenness	Standardized Betweenness
Jen	0	0 / 6 = 0.000
Tom	3	3 / 6 = 0.500
Bob	4	4 / 6 = 0.667
Leaf	0	0 / 6 = 0.000
Jim	0	0 / 6 = 0.000

Mean Betweenness

We can examine the summary statistics for betweenness centrality by inspecting the mean. The average betweenness is an important property of a network. Why? What does a network with a high average betweenness look like? A low average closeness?

The mean betweenness for an undirected network is:

\[\bar C_B = \dfrac{\sum\limits_{i=1}^g C_B(n_i)}{g}\]

This equation, \(\dfrac{\sum\limits_{i=1}^g C_B(n_i)}{g}\), just says to sum up the betweeness scores for each node and then divide by the number of nodes, g.

What is the mean betweenness for our example network? We can pull these from the table above:

Node	Betweenness	Standardized Betweenness
Jen	0	0 / 6 = 0.000
Tom	3	3 / 6 = 0.500
Bob	4	4 / 6 = 0.667
Leaf	0	0 / 6 = 0.000
Jim	0	0 / 6 = 0.000

\[\bar C_B = \dfrac{\sum\limits_{i=1}^g C_B(n_i)}{g} = \dfrac{0 + 3 + 4 + 0 + 0}{5} = \dfrac{7}{5} = 1.4\] The average betweenness for this graph is 1.4, which tells us that the average node occupies just over 1 path which other nodes have to go through.

Betweenness Centralization

As we did with degree centralization and closeness centralization, we can calculate the betweenness centralization of the graph. Recall that centralization measures the extent to which the nodes in a social network differ from one another in their individual centrality scores. Put differently, how much variation is there in the distribution of centrality scores? As with closeness centralization, we use the standardized betweenness score (as opposed to the raw score).

We can calculate betweenness centralization as:

\[C_B = \frac{\sum\limits_{i=1}^g[C'_B(n^*)-C'_B(n_i)]}{(g-1)} \]

If I lost you in some of the math, don’t worry. Let’s calculate the betweenness centralization score for our example graph. Recall our table of standardized betweenness scores:

Node	Betweenness	Standardized Betweenness
Jen	0	0 / 6 = 0.000
Tom	3	3 / 6 = 0.500
Bob	4	4 / 6 = 0.667
Leaf	0	0 / 6 = 0.000
Jim	0	0 / 6 = 0.000

What is the largest standardized betweenness score? It is 0.667 for Bob. Also, we have \(g-1 = 5 - 1 = 4\). Now, we just plug these into our equation:

\[C_B = \frac{\sum\limits_{i=1}^g[0.667-C'_B(n_i)]}{4} \]

Node	Betweenness	Standardized Betweenness	Deviations of Betweenness Closeness, \(0.667-C'_B(n_i)\)
Jen	0	0 / 6 = 0.000	0.667 - 0.000 = 0.667
Tom	3	3 / 6 = 0.500	0.667 - 0.500 = 0.167
Bob	4	4 / 6 = 0.667	0.667 - 0.667 = 0.000
Leaf	0	0 / 6 = 0.000	0.667 - 0.000 = 0.667
Jim	0	0 / 6 = 0.000	0.667 - 0.000 = 0.667

If we total all of the values in the last column we get 2.168. Plugging this in as our numerator, we get:

\[C_B = \frac{2.168}{4} = 0.542\]

So, the betweenness centralization score for our example graph is 0.542. What does this mean?

When betweenness centrality is evenly dispersed, meaning that all nodes have the same betweenness score, then the numerator in the equation will be zero and the quotient will be close to 0. When there is considerable inequality in the betweenness centrality scores between nodes, the quotient will be closer to 1. Thus, closer to 1 indicates that the graph is hierarchically structured and closer to 0 means that the graph is more decentralized.

We can see this by examining two additional undirected networks:

The betweenness centralization score for the figure on the left is 1, whereas the betweenness centralization score for the figure on the right is 0. This makes sense. In the figure on the left, every node has to go between the node in the middle. The node in the middle is on every geodesic in the graph. In contrast, for the graph on the right, every node has a betweenness centrality score of 1, so there is no variation in betweenness centrality scores.

Directed Graphs

As we have seen, when we have a directed graph, we have to consider directionality when evaluating a centrality measure. If we are interested in the betweenness score for Bob, we would do the same thing: create a geodesic proportions table for Bob. The difference is that we want to consider the directionality.

Let’s take the example that we have been working with for directed graphs and change the layout of the edges slightly (the sociomatrix is the same, just curving the edges in the plot). Is Bob on the geodesic between Jen and Jim? Jen and Leaf? Jen and Tom? Let’s take a look.

Is Bob on the geodesic between Jen and Jim? Yes! Bob is on the geodesic between Jen and Jim. Is Bob on the geodesic between Jen and Leaf? Yes. Jen and Tom? No.

As before, we can put this information in a table of geodesic proportions:

Geodesic Proportions for Bob

	Tom	Leaf	Jim
Jen	0/1	1/1	1/1
Tom		1/1	1/1
Leaf			0/1
Jim

Are we done? Not quite yet. Since the graph is directed we have to total all the rows. The reasons we did not do this for undirected graph is because the adjacency matrix is symmetric. Since the matrix for an undirected graph is not symmetric, we have to think of all the ties. Recall from our discussion about adjacency matrices for directed graphs, the rows tell us about outgoing ties or sending whereas the columns tell us about incoming ties or receiving. So, we will need information for the complete table.

Why? Well, think about the table below:

Geodesic Proportions for Bob

	Tom	Leaf	Jim	SUM
Jen	0/1	1/1	1/1	2
Tom		1/1	1/1	2
Leaf			0/1	0
Jim
TOTAL				4

If we sum this, we get 4. But, note that this does not account for all of the edges in the network. We need to complete the matrix:

Geodesic Proportions for Bob

	Jen	Tom	Leaf	Jim	SUM
Jen		0/1	1/1	1/1	2
Tom	?/?		1/1	1/1	2
Leaf	?/?	?/?		0/1	0
Jim	?/?	?/?	?/?
TOTAL					4

To better see this, let’s highlight the other edges that we did not focus on above:

We can now complete our table:

Geodesic Proportions for Bob

	Jen	Tom	Leaf	Jim	SUM
Jen		0/1	1/1	1/1	2
Tom	0/0		1/1	1/1	2
Leaf	0/0	0/0		0/1	0
Jim	0/0	0/0	0/1
TOTAL					4

Why are there no geodesics in the lower portion of the table? Take a look at the plot. Tom does not send a tie to Jen, so there is not a geodesic from Tom to Jen. This is the same for Leaf to Jen. There is a geodesic from Jim to Leaf, but Bob is not on it. As a consequence, the total is 4.

Go ahead and compute the betweenness scores for the rest of the nodes in the network. You should get this:

Node	Betweenness
Jen	0
Tom	3
Bob	4
Leaf	0
Jim	0

Standardization

We can calculate a standardized betweenness centrality score for a directed graph as:

\[C'_B(n_i) = \frac{\sum\limits_{j<k} g_{jk}(n_i) / g_{jk}}{[(g-1)(g-2)]} = \frac{C_B(n_i)}{[(g-1)(g-2)]}\]

So, all we are doing is adjusting each betweenness score by \((g-1))(g-2)\).

For this graph, the denominator, \((g-1))(g-2)/2=(5-1))(5-2) = 12\). Plugging that in we get: \[C'_B(n_i) = \frac{C_B(n_i)}{12}\].

When we calculate the raw betweenness scores, we get the following table:

Node	Betweenness	Standardized Betweenness
Jen	0	0 / 12 = 0.000
Tom	3	3 / 12 = 0.250
Bob	4	4 / 12 = 0.333
Leaf	0	0 / 12 = 0.000
Jim	0	0 / 12 = 0.000

Betweenness Centralization

For a directed graph, we can use the following to calculate betweenness centralization (it is the same equation as above for the undirected graph):

\[C_B = \frac{\sum\limits_{i=1}^g[C'_B(n^*)-C'_B(n_i)]}{(g-1)} \]

If I lost you in some of the math, don’t worry. Let’s calculate the betweenness centralization score for our example graph. Recall our table of standardized betweenness scores:

Node	Betweenness	Standardized Betweenness
Jen	0	0 / 12 = 0.000
Tom	3	3 / 12 = 0.250
Bob	4	4 / 12 = 0.333
Leaf	0	0 / 12 = 0.000
Jim	0	0 / 12 = 0.000

What is the largest standardized betweenness score? It is 0.333 for Bob. Also, we have \(g-1 = 5 - 1 = 4\). Now, we just plug these into our equation:

\[C_B = \frac{\sum\limits_{i=1}^g[0.333-C'_B(n_i)]}{4} \]

Node	Betweenness	Standardized Betweenness	Deviations of Betweenness Closeness, \(0.667-C'_B(n_i)\)
Jen	0	0 / 12 = 0.000	0.333 - 0.000 = 0.333
Tom	3	3 / 12 = 0.250	0.333 - 0.250 = 0.083
Bob	4	4 / 12 = 0.333	0.333 - 0.333 = 0.000
Leaf	0	0 / 12 = 0.000	0.333 - 0.000 = 0.333
Jim	0	0 / 12 = 0.000	0.333 - 0.000 = 0.333

If we total all of the values in the last column we get 1.082. Plugging this in as our numerator, we get:

\[C_B = \frac{1.082}{4} = 0.271\]

Centralization Comparison

We can now compare four different centralization scores for this directed graph: indegree, outdegree, closeness, and betweenness. The scores are:

	Centralization
Indegree	0.438
Outdegree	0.125
Closeness	0.555
Betweenness	0.271

What can we say about the closeness centralization scores for each type of centrality? (take a look at the graph if that helps)

Test your Knowledge

What distinguishes betweenness centrality from other types of centrality like degree and closeness?
How does betweenness centrality help identify key actors in criminal networks, such as those in organized crime?
Why might law enforcement focus on individuals with high betweenness centrality when disrupting criminal networks?
What are the steps to calculate betweenness centrality for a specific node within an undirected network?
Why is it necessary to standardize betweenness centrality when comparing scores across networks of different sizes?
How does the calculation of betweenness centrality change when working with directed graphs compared to undirected graphs?
Why must you account for directionality when evaluating centrality measures in directed graphs?
What does a high betweenness centralization score indicate about the structure of a network? A low score?

Summary

This chapter explored the concept of betweenness centrality as a measure of centrality within social networks, emphasizing its role in identifying actors who bridge disconnected network segments and influence information flow. Unlike degree or closeness centrality, betweenness highlights actors who control connectivity between groups, making it particularly valuable in analyzing criminal networks, where key figures often facilitate collaboration. We covered how to calculate betweenness centrality for individual nodes, standardize these scores for comparison across network sizes, and compute a network-wide betweenness centralization score, which reveals the structural cohesion of a network. Understanding these concepts equips analysts to assess both individual influence and the overall structure of networks.

We have now covered three different ways of conceptualizing centrality: degree, closeness, and betweenness. These are all valuable tools for your crime analyst toolkit!