Data Science to the rescue: re-creating FC Barcelona’s glory team with new blood and fresh feet
Using 1.7 Million data points in FIFA (the PlayStation game) datasets to “clone” FC Barcelona’s 2014/2015 squad.
In this blog post, I use the widely available FIFA datasets, a simple data science algorithm, and a tribe of football “experts” (really, just my friends and me) to construct the closest team to the FC Barcelona team of 2014/2015 with players available today (in 2020).
As a reminder, the 2014/2015 squad, while not as legendary as the one in 2008/2009, won everything there was to win that season (the treble: La Liga, the Copa del Rey and the Champions League). That same squad also included the unforgettable MSN: Messi, Suarez and Neymar.
Trying to “clone” Barca’s 2014/2015 squad purely based on data is therefore an ambitious and imperfect task. This blog post recounts the process and the result.
Goal: Re-constructing the closest team to Barca’s 2014/2015 with today’s players
Over the past decade, FC Barcelona has moved hearts across oceans and continents: inspiring a generation aspiring for greatness and igniting the most fervent passions.
A large part of FC Barcelona’s success can be traced back to the 2008/2009 season, when an inexperienced Guardiola took charge of an old squad and turned it into one of the most unforgettable teams in history. I did try to replicate that team, but couldn’t find the right data, so I had to settle for the second best: the 2014/2015 season.
The idea: Using FCB player attributes in 2015 and current player 2020 attributes to find the closest “match”
- Step 1: Forget about today’s squad. While the team has some exceptional (and even legendary) players, its performance this year has been dismal: the worst start to the League since 1991. So we’ll temporarily forget about the current options.
- Step 2: Gather all possible statistics about each player in the 2014/2015 squad. Thankfully, the FIFA datasets are available for every year and contain ~ 50 attributes for each player. Below is a list of available attributes for each player:
To extract 2014/2015 FC Barcelona’s player attributes, I used the FIFA 2016 dataset. The latest available FIFA dataset is FIFA 2020, which reflects statistics from the 2018/2019 season. Each dataset contains information about more than 17,000 players, each with about 50 attributes. Combining the two datasets (2 × 17,000 × 50), I had access to, and used, close to 1.7 million data points.
I then made the 2015 Barca players my target players, and all 2020 players (within a given playing position) the comparison group. Below is an illustration with 5 attributes (each player has a score that measures his ability in each attribute):
- Step 3: Use a simple data science technique — Euclidean Distance — to calculate similarity between players within their field positions. Below is an illustration of the process:
In our example, let Sergio Busquets (in 2015) be our target player. Busquets plays in a Central Defensive Midfielder position. So, in order to find the best 2020 match for Busquets, I calculate the similarity between 2015 Busquets and each of the Central Defensive Midfielders in the 2020 dataset. The player with the lowest “distance” is the most similar to Busquets, and thus his best match. Important: To keep the “blood new”, I exclude players who are 30 years old or older.
Below is an example of the resulting table, for Sergio Busquets:
The table above includes the players’ release clause (in Million Euros) and their weekly wages (in thousand Euros)
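The matching in Step 3 can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for this post: the player names, attribute names and scores below are made up, and the real datasets have about 50 attributes per player rather than three. The function names (`distance`, `best_matches`) are also just placeholders.

```python
import math

# Hypothetical attribute names and scores, for illustration only.
ATTRIBUTES = ["passing", "defending", "pace"]

target = {"name": "Target CDM (2015)", "position": "CDM",
          "passing": 80, "defending": 82, "pace": 55}

players_2020 = [
    {"name": "Candidate A", "position": "CDM", "age": 26,
     "passing": 78, "defending": 84, "pace": 60},
    {"name": "Candidate B", "position": "CDM", "age": 23,
     "passing": 70, "defending": 75, "pace": 70},
    {"name": "Candidate C", "position": "CDM", "age": 31,  # too old: excluded
     "passing": 80, "defending": 82, "pace": 55},
    {"name": "Candidate D", "position": "ST", "age": 24,   # wrong position: excluded
     "passing": 79, "defending": 81, "pace": 56},
]


def distance(a, b, attributes=ATTRIBUTES):
    """Euclidean distance between two players over the given attributes."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in attributes))


def best_matches(target, candidates, max_age=30, top_n=10):
    """Rank same-position candidates younger than max_age by distance to target."""
    pool = [p for p in candidates
            if p["position"] == target["position"] and p["age"] < max_age]
    ranked = sorted(pool, key=lambda p: distance(target, p))
    return [(p["name"], round(distance(target, p), 2)) for p in ranked[:top_n]]


shortlist = best_matches(target, players_2020)
for name, dist in shortlist:
    print(f"{name}: {dist}")
```

Note that Candidate C has exactly the same scores as the target but never shows up, because of the age cutoff. The smaller the distance, the closer the match, so the first row of the shortlist is the “best match” on paper.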
- Step 4: Gather friends (our football “experts”) over Zoom and debate the best replacement for each player from the top 10 “best matches”. This is important because the data and/or the algorithm might be inadequate. So I’m ultimately giving humans the power to make the decisions, but all based on data!
Many thanks to Alfredo Campos (@acampostams), Ana Rocio Castillo, Hammaad Adam, Joshua Baissana, Khalid Attala, Otman Guessous and Yazid Heddane, for being the football experts in this work.
Results: The Ideal “Replacement” to Barca 2015
After much data crunching and expert deliberations, below is the result that we came up with.
I realize that this team might raise some eyebrows, especially if it is compared to the legendary 2014/2015 Barça team. And here would be a good time to list out the limitations of this approach:
- We considered each player’s best replacement within his current listed position. Therefore, we could have missed players who play in multiple positions, or whose best position is not their current one. For instance, we thought Thiago Alcantara would have been a better replacement for Iniesta than Neves. But Thiago today plays in Busquets’ position, so he did not show up in the potential replacements for Iniesta.
- We looked at the best possible match for each individual player, and so we might have missed the “chemistry” or “cohesion” between team players. For instance, an improvement would be to look at the best replacement of the Pique-Mascherano partnership, instead of looking at their individual replacements.
- This team might be on the older side. An improvement could be to filter out players older than 27, instead of the 30-year-old cutoff that we used.
- Some of the best matches for the 2014/2015 team are within the current squad: De Jong, Umtiti (at his peak) and others in today’s squad are remarkable players. However, per the rules of the game, we decided to look elsewhere.
- The underlying datasets might not be the most reliable source of data. Indeed, FIFA sometimes has attributes that are off the mark. An improvement could be to look at a similarly comprehensive but more accurate dataset.
- The data science technique is not perfect! Perhaps we should have weighted certain attributes more than others when comparing players. For example, the “shooting” attribute is more important for strikers than it is for defenders. An improvement could be to refine the technique and add weights.
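The weighting idea in the last point could look something like the sketch below. Again, this is an assumption about how one might do it, not something we actually ran: the weights, attribute names and scores are hypothetical and picked by hand only to show how weights can change who comes out as the closest match.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance where each attribute's squared gap is scaled by a weight."""
    return math.sqrt(sum(w * (a[k] - b[k]) ** 2 for k, w in weights.items()))


# Hypothetical weights: for strikers, shooting counts four times as much.
EQUAL_WEIGHTS = {"shooting": 1.0, "defending": 1.0}
STRIKER_WEIGHTS = {"shooting": 4.0, "defending": 1.0}

# Made-up scores for a target striker and two candidates.
target = {"shooting": 90, "defending": 40}
sharp_shooter = {"shooting": 88, "defending": 30}   # close on shooting, far on defending
solid_defender = {"shooting": 81, "defending": 41}  # far on shooting, close on defending

for label, weights in [("equal", EQUAL_WEIGHTS), ("striker", STRIKER_WEIGHTS)]:
    d_shooter = weighted_distance(target, sharp_shooter, weights)
    d_defender = weighted_distance(target, solid_defender, weights)
    closest = "sharp_shooter" if d_shooter < d_defender else "solid_defender"
    print(f"{label} weights -> closest: {closest}")
```

With equal weights, the candidate who defends like the target wins; once shooting is weighted more heavily, the one who shoots like the target does. Choosing good weights for each position is the hard part, and is exactly where the “experts” could help.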
Below are the details for each player.
The Back Line: Best Matches for Jordi Alba, Mascherano, Pique and Alves
1. Jordi Alba
Below is the top 10 for Jordi Alba. After some debate, we hesitated between Alaba and Digne (bring back Digne!!), but we eventually agreed that Alaba was a better match for Alba.
2. Mascherano
This list initially looked weird for Mascherano, mainly because he played as a midfielder for much of his career before being moved into the backline. But because we had to pick from this list, we thought Laporte was the closest in quality to Mascherano.
3. Pique
Pique ignited a heated debate, because he is not only a controversial and divisive figure, but also a captain of sorts, a leader. This is the trait we thought about the most when we picked Marquinhos, who exhibits similar leadership attributes. We thought that Stones and then Sule would be the first and second back-ups.
4. Dani Alves
Dani Alves is one of a kind: an icon for all Barcelona fans! Replacing him was very hard. From this list, we went with Liverpool’s Alexander-Arnold, due to his young age and promise. As a back-up, we picked Trippier.
Midfielders: Replacement for Iniesta, Busquets and Rakitic
As soon as we saw the list for Iniesta, we all thought it was off the mark. But again, following the rules of the game, we picked Ruben Neves. If it were up to us (and not the data!), we would have picked Thiago Alcantara as the best replacement for Iniesta.
Within the list for Busquets, we thought that Fabinho was the closest in quality. As a back-up, we picked Ndidi.
As a replacement for Rakitic, there was a bit of debate between the eventual pick, Tielemans, and K. De Bruyne. We settled on Tielemans because we thought that while Rakitic had great skills, KDB was much better. Tielemans seemed like a more natural match for Rakitic.
Attackers: Neymar, Suarez and Messi
Neymar’s replacement sparked quite the debate. It’s hard to imagine a better Neymar than the 2014/2015 one. The pace, the scoring, the harmony with Suarez & Messi: how could we ever get that glory back? After some discussion, we settled on Liverpool’s Mane, mainly because he is a scoring machine and would fit nicely into Barca’s style. The back-ups? We thought of Pulisic (considering his age, he has shown some promise!), Hazard (well, Chelsea’s Hazard, not Real Madrid’s) and we even considered Sterling.
Suarez is also a fighting legend who scores, assists and can do anything to win. After some debate between Immobile, Kane and Moussa Dembele, we settled on Immobile. He has the closest playing style to Suarez, but is perhaps not as consistent.
For Messi, we immediately agreed that he is irreplaceable, and out of sheer respect, we did not proceed to replace him. However, the list below is a nice one for anyone looking to compare the GOAT to current players.
Goalkeeper: Best Match for Bravo
We thought that Ter Stegen is a great goalkeeper, and that Barca lucked out by buying him early in his career. However, per the rules of the game, we decided to pick Oblak as Bravo’s best match.
A few last words…
This project was a lot of fun (and a lot of work!).
If you’ve enjoyed this post, I ask you to leave a “clap” (or 50 :p) on Medium, share it on your social media, or send it to someone who might enjoy it!
Also, feel free to comment if you think our picks are completely off, and how you would change them.
I am on a mission to somehow make this post reach FC Barcelona, and I would love your help in achieving that!
Again, I would like to reiterate my thanks to Alfredo Campos, Ana Rocio Castillo, Hammaad Adam, Joshua Baissana, Khalid Attala, Otman Guessous and Yazid Heddane, for being the football experts in this work.