In this blog post, I explore ~70,000 VC transactions between 2010 and 2020, focusing on co-investments. Specifically, I explain which VCs co-invest the most with each other, which pairs of VCs have the largest amount of co-investments, and the fraction of successful investments for a given pair of VC co-investors. To understand the data source and process, feel free to read the first section. Otherwise, skip over to subsequent sections to see the uncovered insights. I would like to thank Hiba Habbat, student at African Leadership Academy, for her valuable help in retrieving and cleaning the data.
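As a rough illustration of what counting co-investments between pairs of VCs can look like, here is a minimal sketch. It is not the code used for the post: the `deals` sample, the investor names, and the `count_co_investments` helper are all hypothetical, and it assumes each transaction can be reduced to the set of investors that took part in it.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample: each deal maps to the set of VCs that took part in it.
deals = {
    "deal_1": {"VC A", "VC B", "VC C"},
    "deal_2": {"VC A", "VC B"},
    "deal_3": {"VC B", "VC C"},
    "deal_4": {"VC A"},
}


def count_co_investments(deals):
    """Count how many deals each unordered pair of investors shares."""
    pair_counts = Counter()
    for investors in deals.values():
        # Sort names so that (A, B) and (B, A) collapse to the same key.
        for pair in combinations(sorted(investors), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_co_investments(deals)
for pair, n_deals in pair_counts.most_common():
    print(pair, n_deals)
```

A deal with a single investor contributes no pairs, so solo investments drop out of the count.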

Understanding the Data

The data source

I first download all Venture Capital transactions data from Crunchbase between 2010 and 2020. This includes all continents and industries. (I use a Crunchbase Pro subscription, which allows me to retrieve detailed information about each…


In this blog post, I teamed up with Eleno Castro to tell the story of this eventful year in 20 graphs. Most of the graphs in this post are borrowed from other sources (all properly cited), and a few are constructed based on available data. While we did not include any graph on the global coronavirus caseload or the death toll, we recognize the immense pain associated with the loss of every human life.

1. Zoom became a staple for every internet user; its stock experienced explosive growth

Source: Google Stocks: Google Finance


The Zoom stock started the year at USD 68.72 on January 2, 2020, and slowly began its meteoric ascent until it reached its peak on October 19 at USD 568.34, more than 8 times its starting price (a 727% increase, to be precise). Since then, the stock has stumbled and settled back to its late-summer level, at USD 375.17 …
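For readers who want to check the arithmetic, the short sketch below shows how the price multiple and the percentage increase relate to each other. The two prices are the ones quoted above; the `percent_change` helper is a name introduced here for illustration only.

```python
def percent_change(start, end):
    """Return the change from start to end as a percentage of start."""
    return (end - start) / start * 100


start_price = 68.72   # USD, January 2, 2020 (quoted above)
peak_price = 568.34   # USD, October 19, 2020 (quoted above)

multiple = peak_price / start_price
increase = percent_change(start_price, peak_price)

print(f"Peak is {multiple:.2f} times the starting price")
print(f"That is a {increase:.0f}% increase")
```

The multiple is always the percentage increase divided by 100, plus one, which is why the two figures differ.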


Using 1.7 Million data points in FIFA (the PlayStation game) datasets to “clone” FC Barcelona’s 2014/2015 squad.

In this blog post, I use the widely available FIFA datasets, a simple data science algorithm, and a tribe of football “experts” (really, just my friends and me) to construct the closest team to the FC Barcelona team of 2014/2015 with available players today (in 2020).

As a reminder, the 2014/2015 squad, while not as legendary as the one in 2008/2009, won everything there was to win (6 trophies to be precise, including La Liga and the Champions League). …


Following up on my earlier post on EdTech investments in 2020, and digging deeper into venture capital in EdTech, this article explores the space’s trends, players and insights between 2015 and 2020.

This article contains a small part of the insights that I was able to generate with this dataset. To stay up-to-date on my future posts on the topic, or to suggest additional analysis you’d be interested in seeing, please fill out this form. I will email you the articles when they come out.

Important Note: The analysis below depends entirely on the accuracy and completeness of the Crunchbase data.


On an otherwise boring Friday night, I decided to launch a poll on my Instagram to ask my friends if they had tested positive for COVID-19. This post recounts the story.

As the days of 2020 went by, I started counting the number of people I knew who had tested positive for COVID-19. By mid-October, that number had reached 16 people. These are friends, acquaintances, or family members who had told me that they had tested positive for COVID-19. My immediate reaction was the following: if the official statistics are to be trusted, the world incidence rate of COVID-19 is 0.5%. If I knew 16 people who had been infected, does that mean that I personally know 3,200 people? A quick glance at my Facebook friends list confirmed that no, I do not know that many people. …
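The back-of-the-envelope reasoning behind the 3,200 figure can be sketched as follows. The `implied_network_size` helper is a hypothetical name, and the calculation assumes that the people I know are infected at the same rate as the world population.

```python
def implied_network_size(known_cases, incidence_rate):
    """Estimate how many people one would need to know to expect
    `known_cases` infections at the given incidence rate."""
    if not 0 < incidence_rate <= 1:
        raise ValueError("incidence_rate must be a fraction in (0, 1]")
    return known_cases / incidence_rate


known_cases = 16        # people I knew who had tested positive
incidence_rate = 0.005  # 0.5% world incidence rate

size = implied_network_size(known_cases, incidence_rate)
print(round(size))
```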


I thank my friend Luca Sartorio for giving me access to his data on the stringency of COVID-related lockdowns, as part of his research at the Torcuato di Tella University in Buenos Aires.

COVID-19 has upended the entire world in 2020. The economic news from Q2 of 2020 is bleak: most countries in the world witnessed a sharp contraction of their GDP growth as a result of the crisis. In this blog post, I correlate this decline in GDP growth with characteristics of the pandemic (e.g., incidence and death rates in countries) and the macroeconomic conditions (e.g., foreign dependence, stringency of the lockdowns) in select countries. Due to the unavailability of comprehensive data on Q2 2020 growth, I was only able to gather data on 70 countries. Some regions are thus over-represented in my dataset (e.g., OECD countries) and others are under-represented (e.g., Africa). …


Descriptive statistics of previous Supreme Court appointments by US Presidents from 1945 to present.

“My most fervent wish is that I will not be replaced until a new President is installed”

That was the dying wish of Justice Ruth Bader Ginsburg, who passed away yesterday (September 18, 2020) due to complications from cancer.

The news of the passing of Supreme Court Justice Ruth Bader Ginsburg reverberated across the US and the world. Millions are mourning and grieving a feminist icon who worked tirelessly to ensure gender equality through the Law.

Even before her passing could be properly mourned, it has ignited what promises to be a fierce political battle between the Republicans and the Democrats. The two Parties will be fighting over whether the current President should nominate her replacement so close to the general election on November 3rd. Supreme Court appointments are critical because they are lifetime appointments (the Justice leaves the Court when she retires or when she dies), and each President typically appoints a Justice who is ideologically close to their party. With only three liberal Justices left on the Court and five conservatives, the next Justice appointment might tilt the ideological balance of the Court towards Conservatism for an entire generation.


COVID-19, for all its disastrous effects on countless lives, has accelerated a positive trend that we’ve all been witnessing over the past few years: the move towards digital education. As schools and universities were shut down worldwide in March 2020, parents, educators and students alike scrambled to find a close substitute to something we think of as irreplaceable: in-person education.

As a response to these unusual circumstances, education-themed companies have proliferated across the world. Investor interest in these companies has increased, even as economic downturns reverberated across the world. To understand and quantify the (increased) value that society has put on this industry, I look at the investments made in the Education Technology (EdTech) field so far in 2020 (from January 1st to September 1st). I sourced my data from the Crunchbase database of funds raised by startups. A further, original step in this blog post is my categorization of EdTech companies into 14 sub-fields (e.g., K-12, Higher Education, Tutoring Marketplace, Workforce Development). The latter will help us understand the relative importance of these fields in 2020 vs. …


In this blog post, I take a step back and reflect on my 3-month data science & Medium journey.

Facing a bright, blank screen, I wondered if I should even start typing.

My thoughts serenaded me with a hundred questions: What am I getting myself into? Do I have the time, stamina and creativity to do this? Will I consistently find interesting things to write about? Will people even read my stuff?

What saved me from all that self-doubt was a familiar voice. The voice that urges us to take that new job, to jump from an airplane (skydive!), to move to that foreign country, or to try a new adventure.


Using text mining and natural language processing (NLP) to derive insights from last week’s Big Tech Congress Hearing (July 2020), featuring the CEOs of Amazon, Apple, Facebook, and Google.

On July 29, 2020, the CEOs of four of the largest and most powerful US technology companies (virtually) convened to address the Antitrust, Commercial and Administrative Law Sub-Committee of the Judiciary Committee of the United States Congress. At this marathon hearing, which lasted more than 5 hours, Republicans and Democrats came together to grill the CEOs on their perceived anti-competitive practices and other market powers that they are accused of abusing.

Since most of us could only catch snippets of the historic hearing (due to its length!), or would get our summaries from secondary accounts (which might be biased!), I thought it would be worthwhile to do a text mining and NLP exercise to see what kind of insights data science can help us uncover.

About

Kenza Bouhaj

Curious. Passionate about storytelling through data. Interested in Work, Skills and EdTech. Twitter: @KenzaBouhaj
