Data Science to the rescue: re-creating FC Barcelona’s glory team with new blood and fresh feet

Using 1.7 Million data points in FIFA (the PlayStation game) datasets to “clone” FC Barcelona’s 2014/2015 squad.

In this blog post, I use the widely available FIFA datasets, a simple data science algorithm, and a tribe of football “experts” (really, just my friends and me) to construct the closest team to the FC Barcelona team of 2014/2015 from the players available today (in 2020).

As a reminder, the 2014/2015 squad, while not as legendary as the one in 2008/2009, won everything there was to win that season: the treble of La Liga, the Copa del Rey and the Champions League. That same squad also included the unforgettable MSN: Messi, Suarez and Neymar.

Trying to “clone” Barca’s 2014/2015 squad purely based on data is therefore an ambitious and imperfect task. This blog post recounts the process and the result.

Goal: Re-constructing the closest team to Barca’s 2014/2015 with today’s players

A large part of FC Barcelona’s success can be traced back to the 2008/2009 season, when an inexperienced Guardiola took charge of an old squad and turned it into one of the most unforgettable teams in history. I did try to replicate that team, but couldn’t find the right data, so I had to settle for the second-best: the 2014/2015 season.


The idea: Using FCB players’ 2015 attributes and current players’ 2020 attributes to find the closest “match”

  • Step 2: Gather all possible statistics about each player in the 2014/2015 squad. Thankfully, the FIFA datasets are available for every year and contain ~ 50 attributes for each player. Below is a list of available attributes for each player:
List of Attributes in FIFA Datasets

To extract 2014/2015 FC Barcelona’s player attributes, I used the FIFA 2016 dataset. The latest available FIFA dataset is FIFA 2020, which reflects statistics of the 2018/2019 season. Each dataset contains information about more than 17,000 players, each having about 50 attributes. Combining the two datasets (2 datasets × 17,000 players × 50 attributes), I had access to, and used, close to 1.7 million data points.

I then make the 2015 Barca players my target players, and all players in 2020 (within a given playing position) the comparison group. Below is an illustration with 5 attributes (each player has a score that measures his ability in each attribute):

Illustrative: Comparing attributes between a target (Player 1) and a Comparison Group (Player 2)
  • Step 3: Use a simple data science technique — Euclidean Distance — to calculate similarity between players within their field positions. Below is an illustration of the process:
Illustrative: Measuring similarity between Players using Euclidean Distance

In our example, let Sergio Busquets (in 2015) be our target player. Busquets plays in a Central Defensive Midfielder position. So, in order to find the best 2020 match for Busquets, I calculate the similarity between 2015 Busquets and each of the Central Defensive Midfielders in the 2020 dataset. The player with the lowest “distance” is the most similar to Busquets, and thus his best match. Important: To keep the “blood new”, I exclude players who are 30 years old or older.
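To make the matching step concrete, here is a minimal sketch of how such a ranking could be computed. It is not the code used for this post: the attribute names, the player names and every score below are made up for illustration, and the helper `best_matches` is hypothetical. It assumes each player is a dictionary with a position, an age and a numeric score per attribute.

```python
import math

# Hypothetical attribute names; the real FIFA datasets have about 50 of them.
ATTRIBUTES = ["passing", "dribbling", "defending", "pace", "shooting"]

# Made-up scores for a 2015 target player (illustration only).
target = {"name": "Target CDM (2015)", "position": "CDM",
          "passing": 80, "dribbling": 75, "defending": 85, "pace": 55, "shooting": 60}

# Made-up 2020 comparison group (illustration only).
candidates_2020 = [
    {"name": "Player A", "position": "CDM", "age": 24,
     "passing": 78, "dribbling": 74, "defending": 84, "pace": 60, "shooting": 62},
    {"name": "Player B", "position": "CDM", "age": 31,  # excluded: 30 or older
     "passing": 80, "dribbling": 75, "defending": 85, "pace": 55, "shooting": 60},
    {"name": "Player C", "position": "ST", "age": 22,   # excluded: other position
     "passing": 80, "dribbling": 75, "defending": 85, "pace": 55, "shooting": 60},
    {"name": "Player D", "position": "CDM", "age": 27,
     "passing": 65, "dribbling": 70, "defending": 80, "pace": 70, "shooting": 50},
]


def attribute_vector(player):
    """Return the player's scores as a list, in a fixed attribute order."""
    return [player[a] for a in ATTRIBUTES]


def best_matches(target, candidates, max_age=30, top_n=10):
    """Rank same-position candidates younger than max_age by Euclidean distance."""
    pool = [c for c in candidates
            if c["position"] == target["position"] and c["age"] < max_age]
    scored = [(math.dist(attribute_vector(target), attribute_vector(c)), c["name"])
              for c in pool]
    return sorted(scored)[:top_n]  # smallest distance first = most similar


matches = best_matches(target, candidates_2020)
for distance, name in matches:
    print(f"{name}: {distance:.2f}")
```

In this toy example, Player B and Player C have exactly the target’s scores but are filtered out (one by age, one by position), so Player A comes out as the closest match, ahead of Player D.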

Below is an example of the resulting table, for Sergio Busquets:

Top 2020 “Best Matches” for 2015 S. Busquets

The table above includes the players’ release clauses (in millions of euros) and their weekly wages (in thousands of euros).

  • Step 4: Gather friends (football “experts”) over Zoom and debate the best replacement for each player from the top 10 “best matches”. This is important because the data and/or the algorithm might have been inadequate. Therefore, I’m ultimately giving humans the power to make decisions, but all based on data!

Many thanks to Alfredo Campos (@acampostams), Ana Rocio Castillo, Hammaad Adam, Joshua Baissana, Khalid Attala, Otman Guessous and Yazid Heddane, for being the football experts in this work.

Results: The Ideal “Replacement” to Barca 2015

Source: FIFA Website

I realize that this team might raise some eyebrows, especially if it is compared to the legendary 2014/2015 Barca team. This is a good time to list the limitations of this approach:

  • We considered each player’s best replacement within his current listed position. Therefore, we could have missed players who play in multiple positions, or whose best position is not their current one. For instance, we thought Thiago Alcantara would have been a better replacement for Iniesta than Neves. But Thiago today plays in Busquets’ position, so he did not show up in potential replacements for Iniesta.
  • We looked at the best possible match for each individual player, and so we might have missed the “chemistry” or “cohesion” between team players. For instance, an improvement would be to look at the best replacement of the Pique-Mascherano partnership, instead of looking at their individual replacements.
  • This team might be on the older side. An improvement could be to filter out players older than 27, instead of the 30-year-old cutoff that we used.
  • Some of the best matches for the 2014/2015 team are within the current squad: De Jong, Umtiti (at his peak) and others in today’s squad are remarkable players. However, per the rules of the game, we decided to look elsewhere.
  • The underlying datasets might not be the most reliable source of data. Indeed, FIFA sometimes has attributes that are off the mark. An improvement could be to look at a similarly comprehensive but more accurate dataset.
  • The data science technique is not perfect! Perhaps we should have weighted certain attributes more than others when comparing players. For example, the “shooting” attribute is more important for strikers than it is for defenders. An improvement could be to refine the technique and add weights.
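As a rough idea of what the weighting improvement in the last point could look like, here is a small sketch of a weighted Euclidean distance. This was not part of the analysis above: the weights and the scores are invented for illustration, and `weighted_distance` is a hypothetical helper.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum of w_i * (a_i - b_i)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


# Attribute order: [shooting, defending]. All numbers are made up.
striker_weights = [3.0, 0.5]    # shooting matters more when matching a striker
defender_weights = [0.5, 3.0]   # defending matters more when matching a defender

target = [90, 40]
candidate_1 = [88, 30]  # close on shooting, further away on defending
candidate_2 = [80, 40]  # further away on shooting, identical on defending

d1_striker = weighted_distance(target, candidate_1, striker_weights)
d2_striker = weighted_distance(target, candidate_2, striker_weights)
d1_defender = weighted_distance(target, candidate_1, defender_weights)
d2_defender = weighted_distance(target, candidate_2, defender_weights)

print(f"striker weights:  candidate 1 = {d1_striker:.2f}, candidate 2 = {d2_striker:.2f}")
print(f"defender weights: candidate 1 = {d1_defender:.2f}, candidate 2 = {d2_defender:.2f}")
```

With the striker weights, candidate 1 is the closer match; with the defender weights, the ranking flips and candidate 2 comes out ahead. How to choose the weights for each position is an open question, and one where the football “experts” could weigh in.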

Below are the details for each player.

The Back Line: Best Matches for Jordi Alba, Mascherano, Pique and Alves

1. Jordi Alba

Top 2020 “Best Matches” for 2015 Jordi Alba

2. Mascherano

Top 2020 “Best Matches” for 2015 Mascherano

3. Pique

Top 2020 “Best Matches” for 2015 Pique

4. Dani Alves

Top 2020 “Best Matches” for 2015 Dani Alves

Midfielders: Replacement for Iniesta, Busquets and Rakitic

1. Iniesta

Top 2020 “Best Matches” for 2015 A. Iniesta

2. Busquets

Top 2020 “Best Matches” for 2015 S. Busquets

3. Rakitic

Top 2020 “Best Matches” for 2015 I. Rakitic

Attackers: Neymar, Suarez and Messi

1. Neymar

Top 2020 “Best Matches” for 2015 Neymar Jr.

2. Suarez

Top 2020 “Best Matches” for 2015 L. Suarez

3. Messi

Top 2020 “Best Matches” for 2015 L. Messi

Goalkeeper: Best Match for Bravo

Top 2020 “Best Matches” for 2015 Bravo

A few last words…

If you’ve enjoyed this post, I ask you to leave a “clap” (or 50 :p) on Medium, share it on your social media, or send it to someone who might enjoy it!

Also, feel free to comment if you think our picks are completely off, and how you would change them.

I am on a mission to somehow make this post reach FC Barcelona, and I would love your help in achieving that!

Once again, many thanks to Alfredo Campos, Ana Rocio Castillo, Hammaad Adam, Joshua Baissana, Khalid Attala, Otman Guessous and Yazid Heddane, for being the football experts in this work.

Curious. Passionate about storytelling through data. Interested in Work, Skills and EdTech. Twitter: @KenzaBouhaj
