# Who are the most connected economists? Insights from 30 years of data in two academic journals

*For this blog post, I applied techniques that I learned in the Network Analysis Skill Track on DataCamp.*

My motivation behind this blog post was opportunistic: I wanted to apply my newly acquired skills in network analysis to a real-life question. As an economics major in college, I thought it would be fun to analyze networks of economists.

# The Data

The goal of this exercise is to **construct and visualize a network of economists based on their collaboration on published papers**. I define a collaboration as the **co-authorship of a paper.** I used two well-known, general-interest economics journals to build my dataset. Both are considered among the top 5 academic journals in the field of economics:

- The American Economic Review, published by the American Economic Association, is a monthly peer-reviewed journal. Digitally available issues date back to 1999, so **I extracted data for the past 20 years** (2000–2020).
- The Quarterly Journal of Economics, published by the Oxford University Press. While the journal has made information digitally available dating back to 1890, I **only extracted data for the past 30 years** (1990–2020).

When I say “extracted data”, I mean that **I scraped paper titles and author names** from the two websites linked above. Since this is not research about who has the most published papers, **I discard any paper that has only one author**. If a paper has at least two authors, **then these authors are added to my dataset.**

I then construct all possible pairs of authors for a given paper. For example, if *Paper X* was co-authored by economists A, B, C and D (4 co-authors), I construct every pair that can be formed from A, B, C and D; that is: A-B, A-C, A-D, B-C, B-D and C-D.

My final dataset is simple and contains three columns:

- Name 1
- Name 2
- Weight of the relationship between Name 1 and Name 2: how many times authors 1 and 2 have collaborated on a published article (of course only within these 2 journals).

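To illustrate what the dataset looks like, consider *Paper X* above (co-authored by A, B, C, D), and *Paper Y* (co-authored by A, C, F). The dataset for each journal then looks like this:

| Name 1 | Name 2 | Weight |
|---|---|---|
| A | B | 1 |
| A | C | 2 |
| A | D | 1 |
| A | F | 1 |
| B | C | 1 |
| B | D | 1 |
| C | D | 1 |
| C | F | 1 |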
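The pair-building and weighting steps described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the code used for this post: the `papers` dictionary and the function name `build_edge_weights` are hypothetical, and the input is the *Paper X* / *Paper Y* example from the text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: one list of author names per paper (the example from the text).
papers = {
    "Paper X": ["A", "B", "C", "D"],
    "Paper Y": ["A", "C", "F"],
}


def build_edge_weights(papers):
    """Count how many papers each unordered pair of authors has co-written."""
    weights = Counter()
    for authors in papers.values():
        unique_authors = sorted(set(authors))
        if len(unique_authors) < 2:
            continue  # single-author papers are discarded
        # Sorting first means A-C and C-A always end up under the same key.
        for name_1, name_2 in combinations(unique_authors, 2):
            weights[(name_1, name_2)] += 1
    return weights


edge_weights = build_edge_weights(papers)
for (name_1, name_2), weight in sorted(edge_weights.items()):
    print(name_1, name_2, weight)
```

Each printed row corresponds to one row of the three-column dataset (Name 1, Name 2, Weight). The pair A-C gets a weight of 2 because A and C appear together on both papers.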

Once my data is ready, I use the igraph package in R to build most of my analyses and visualizations.

# Visualizing the Networks

In the network graphs below, every point, called a “node”, represents a single author, and every link between two nodes, called an “edge”, means that the two nodes (the two authors) have co-authored a paper.

*A. The AER Network*

The figure below shows all authors and their collaborations in the AER between 2000 and 2020. The network has 2,410 authors and 2,904 relationships between pairs of authors. **That is about 1.2 relationships per author (2,904/2,410). Because each relationship involves two authors, the average degree, which is the average number of co-authors any given author has within the network, is about 2.4 (2 × 2,904/2,410).**
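As a quick check on the arithmetic, here is a small Python sketch using the AER counts reported above. The function names are hypothetical. It separates two quantities that are easy to mix up: edges per node, and the mean degree as it is usually defined for an undirected network, where every edge adds one to the degree of both of its endpoints.

```python
def edges_per_node(num_nodes, num_edges):
    """Ratio of relationships to authors."""
    return num_edges / num_nodes


def mean_degree(num_nodes, num_edges):
    """Average number of co-authors per author in an undirected network."""
    # Each edge is counted once for each of its two endpoints.
    return 2 * num_edges / num_nodes


aer_nodes, aer_edges = 2410, 2904
print(round(edges_per_node(aer_nodes, aer_edges), 2))
print(round(mean_degree(aer_nodes, aer_edges), 2))
```

With these counts the first ratio is about 1.2 and the second about 2.4.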

What jumps out in the graph below is that **there is a big cluster, colored in yellow, which forms a network of uninterrupted collaboration.** It does not mean that every author in this cluster has worked with every other author, only that **we can trace a path between every pair of authors in this cluster**. Outside this big cluster there is an even larger set of authors, colored in blue, who are not connected to it. There are **1,117 economists in the yellow network, and 1,293 economists in the blue group**. That means that 46% of the authors in the network belong to a single uninterrupted sub-network.
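The "yellow cluster" is what network analysis calls the largest connected component. Below is a minimal Python sketch of how such a component can be found with a breadth-first search. The toy edge list and the function name are hypothetical, and this is not the igraph routine used to produce the graphs in this post.

```python
from collections import defaultdict, deque

# Hypothetical toy edge list: one chain of four authors and two isolated pairs.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F"), ("G", "H")]


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    seen = set()
    components = []
    for start in neighbours:
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


components = connected_components(edges)
largest = components[0]
total_authors = sum(len(c) for c in components)
share = len(largest) / total_authors
print(sorted(largest), share)
```

In this toy example the largest component holds 4 of the 8 authors; the same ratio on the AER data is the 46% quoted above.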

To understand which economists are central to the AER network, we look at two centrality measures:

- **Degree:** This is how many other authors a given author is directly connected to.
- **Betweenness:** This is, roughly, how many pairs of authors have to go through a given economist to reach each other by the shortest path.
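To make the degree measure concrete, here is a short Python sketch that counts unique co-authors from rows shaped like the three-column dataset (Name 1, Name 2, Weight) and ranks the top 3. The rows and names are made up for illustration. Note that the weight is ignored on purpose: degree counts distinct co-authors, not the number of joint papers.

```python
from collections import defaultdict

# Hypothetical (name_1, name_2, weight) rows.
rows = [
    ("A", "B", 1),
    ("A", "C", 2),
    ("A", "D", 1),
    ("B", "C", 1),
    ("C", "F", 1),
]


def degree_by_author(rows):
    """Number of distinct co-authors for every author."""
    neighbours = defaultdict(set)
    for name_1, name_2, _weight in rows:
        neighbours[name_1].add(name_2)
        neighbours[name_2].add(name_1)
    return {author: len(others) for author, others in neighbours.items()}


degrees = degree_by_author(rows)
# Highest degree first; ties are broken alphabetically.
top_3 = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))[:3]
print(top_3)
```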

In the AER network, the top 3 economists with respect to each measure are:

- **Degree:** University of Zurich’s **Ernst Fehr** *(connected to 21 authors)*, MIT’s **Daron Acemoglu** *(connected to 17 authors)*, and MIT’s **Amy Finkelstein** *(connected to 16 authors)*.
- **Betweenness:** Harvard’s **Philippe Aghion** *(87K pairs of authors go through him)*, MIT’s **Daron Acemoglu** *(62K pairs of authors go through him)*, and Stanford’s **Peter Klenow** *(56K pairs of authors go through him)*.

*B. The QJE Network*

The QJE network (1990–2020) has 1,565 authors and 1,994 relationships between pairs of authors. **That is about 1.27 relationships per author (1,994/1,565). Because each relationship involves two authors, the average degree, which is the average number of co-authors any given author has within the network, is about 2.5 (2 × 1,994/1,565).**

Here there appear to be two clusters in the middle of the graph. The yellow one is what the algorithm identified as the largest connected group, and the blue nodes belong to other, smaller groups within the network. It might look like there is a bit of blue inside the yellow cluster, but these blue nodes do not connect with the yellow ones.

In the yellow cluster, **there are 657 authors, or about 42% of total authors in the network (657/1,565).**

In the QJE network, the top 3 economists with respect to each measure are:

- **Degree:** Harvard’s **Philippe Aghion** *(connected to 22 authors)*, UC Berkeley’s **Emmanuel Saez** *(connected to 20 authors)*, and MIT’s **Esther Duflo** *(connected to 19 authors)*.
- **Betweenness:** Harvard’s **Philippe Aghion** *(57K pairs of authors go through him)*, MIT’s **John Van Reenen** *(41K pairs of authors go through him)*, and Harvard’s **Andrei Shleifer** *(39K pairs of authors go through him)*.

**Looks like Philippe Aghion is very connected in both networks!**

# Putting both networks together

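The combined network contains 3,407 unique authors. This is less than the sum of the two journals' author counts (2,410 + 1,565 = 3,975) because some economists appear in both.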

Visualizing both networks together does not give us much insight, since there is a huge number of nodes and connections. So let’s look at **the centrality measures to understand the most connected economists across both networks.**

**A. Connectedness measure #1: Degree**

As a reminder, in network analysis the **degree is the number of other unique nodes a given node is connected to.** In this instance, if an economist has a degree of 5, that means that they have collaborated with 5 unique authors on various papers across both the AER and the QJE.
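One way to picture the combined network is as the two journals' edge lists merged into one, with weights added up for pairs that appear in both. The Python sketch below shows this on made-up data; the variable names and the toy edges are hypothetical, and this is not the R code used for this post. Because degree counts unique co-authors, a pair that collaborated in both journals still adds only one to each author's degree.

```python
from collections import Counter, defaultdict

# Hypothetical per-journal edge weights; note the pair A-C appears in both,
# written in a different order in the second journal.
aer_edges = Counter({("A", "B"): 1, ("A", "C"): 2})
qje_edges = Counter({("C", "A"): 1, ("C", "E"): 1})


def normalise(edge_weights):
    """Store each pair in alphabetical order so A-C and C-A are the same edge."""
    result = Counter()
    for (u, v), weight in edge_weights.items():
        result[tuple(sorted((u, v)))] += weight
    return result


# Adding two Counters sums the weights of edges present in both journals.
combined = normalise(aer_edges) + normalise(qje_edges)

neighbours = defaultdict(set)
for u, v in combined:
    neighbours[u].add(v)
    neighbours[v].add(u)
combined_degree = {author: len(others) for author, others in neighbours.items()}

print(dict(combined))
print(combined_degree)
```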

We can see from the graph on the left that the **vast majority of authors have a degree of 1 or 2**, meaning that they’ve only collaborated with 1 or 2 authors in these two networks. As the degree grows, the number of authors falls. The largest degree here is 31, and only one author has such a large degree.

This second graph shows the names of authors who have a degree of 15 or more. As we suspected above, **Philippe Aghion is the economist with the most collaborations (31 connected economists), followed by Daron Acemoglu (26).** On the list are many notable economists such as Esther Duflo, Michael Bloom and Marianne Bertrand.

**B. Connectedness measure #2: Betweenness**

As a reminder, the betweenness measures **how many pairs of authors have to go through a given node within the network.** This measures how central this node is to the entire network.
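For readers curious how this number can be computed, below is a compact Python sketch of Brandes' algorithm for an unweighted, undirected network. It is an illustration only and not the igraph implementation used for this post; the function name and the toy path A-B-C-D are hypothetical. Strictly speaking, when several shortest paths connect a pair of authors, each node on them receives a fractional share for that pair rather than a full count.

```python
from collections import defaultdict, deque


def betweenness(edges):
    """Unweighted, undirected betweenness via Brandes' algorithm."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    nodes = list(neighbours)
    score = dict.fromkeys(nodes, 0.0)

    for source in nodes:
        # Breadth-first search from `source`, counting shortest paths (sigma).
        sigma = dict.fromkeys(nodes, 0)
        sigma[source] = 1
        dist = {source: 0}
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, accumulating each node's share.
        delta = dict.fromkeys(nodes, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]

    # Every unordered pair was visited from both ends, so halve the totals.
    return {node: total / 2 for node, total in score.items()}


path = [("A", "B"), ("B", "C"), ("C", "D")]
scores = betweenness(path)
print(scores)
```

On the path A-B-C-D, the two middle authors each sit between two pairs (B between A-C and A-D, C between A-D and B-D), while the two end authors sit between none.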

In the combined network, the minimum betweenness measure is 0, and the maximum is 285,890.

The density graph on the left shows that about 40% of authors in both networks have a betweenness measure of close to 10,000. This is in line with the number of authors in the larger clusters we observed in the visualizations above, **and confirms that the economics academic family is quite connected.**

There is also a **minority of economists who have a betweenness measure above 100,000.** Let’s take a look at who they are.

This is a list of exceptionally well-connected economists, those who have more than 100,000 pairs of authors go through them within the network. Once again, **Philippe Aghion tops the list.** I think it is now safe to proclaim him the most well-connected economist :)

# Visualizing Philippe Aghion’s network

Now that we’ve identified our most connected economist in these two journals, let’s see what Aghion’s network looks like.

**A. First degree of separation from Philippe Aghion**

The red node in the middle of this graph represents Philippe Aghion. **The immediate nodes connected to him, or his first degree of separation, are represented by the grey nodes**. If you count these nodes, you will find 31 nodes, which is what we had identified as his degree in the above graphs. Let’s now look at Aghion’s larger network.

**B. Second degree of separation from Philippe Aghion**

As soon as we branch out to the second degree of separation (those nodes who are connected to the nodes that Philippe Aghion is initially connected to), the graph becomes so much larger. **This is because every node that belongs to Aghion’s first-degree network has its own network.** Let’s now look at a larger degree of separation: fifth degree.

**C. Fifth degree of separation from Philippe Aghion**

One final graph to close this post! This is the fifth degree of separation from Philippe Aghion! You can see that he is very well connected. As the degrees of separation get larger, this graph will only get bigger!
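The first-, second- and fifth-degree neighbourhoods shown in this section can be thought of as "everyone reachable within k steps" of a starting author. Here is a minimal Python sketch of that idea using a depth-limited breadth-first search. The toy edge list, the node name `Hub` and the function name are hypothetical, and this is not the igraph code behind the graphs in this post.

```python
from collections import defaultdict, deque

# Hypothetical toy network: "Hub" has two direct co-authors and a longer chain.
edges = [("Hub", "B"), ("Hub", "C"), ("B", "D"), ("D", "E"), ("E", "F")]


def within_k_steps(edges, source, k):
    """Map every author within k steps of `source` to their distance from it."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond the k-th degree of separation
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


first_degree = within_k_steps(edges, "Hub", 1)
second_degree = within_k_steps(edges, "Hub", 2)
print(sorted(first_degree), sorted(second_degree))
```

The number of first-degree neighbours (excluding the starting author) is the author's degree. The neighbourhood stops growing once k is large enough to cover the whole connected cluster the author belongs to.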

# Final Thoughts

The conclusions presented here are limited to the data from the two journals that I used. There are many other well-respected journals. Adding data from them might change the conclusions here, but I hope that the AER and the QJE are representative of the economics profession in academia.

I am by no means an expert in economics nor in network analysis. In fact, you can call me an amateur in both. So I welcome all thoughts and feedback.

If you have comments, ideas on how to make this better or if you want access to my data, please email me at kenza.bouhaj@gmail.com

As always, thank you for reading all the way!