From inspiration to execution: seven steps to publish analytical posts on Medium
In this blog post, I take a step back and reflect on my 3-month data science & Medium journey.
Facing a bright, blank screen, I wondered if I should even start typing.
My thoughts serenaded me with a hundred questions: What am I getting myself into? Do I have the time, stamina and creativity to do this? Will I consistently find interesting things to write about? Will people even read my stuff?
What saved me from all that self-doubt was a familiar voice. The voice that urges us to take that new job, to jump from an airplane (skydive!), to move to that foreign country, or to try a new adventure. In words immortalized by Nike, that voice screamed: Just Do It.
Since that evening in May, I have been writing consistently on Medium, applying my recently acquired data science skills to real-life questions. I’ve written on a variety of topics and have had incredible support from my network.
Following some questions from friends who regularly read my posts, I thought I’d summarize my process and what I’ve learnt so far. This post will hopefully serve as an inspiration or a guide for you to start your own writing journey.
Step 1: If there’s no data, there’s no analysis.
I write analytical blog posts. My insights and conclusions thus have to be backed up with real-life data. My first step is always to look for data, even before thinking of what I am trying to uncover.
Now that I have some experience looking for data, I am mind-blown by how much data exists on the internet for us to use, free of charge. In some posts, I have been able to create my own data sets (more on that later), but for the most part, these websites are great places to find data:
- Kaggle open datasets. Kaggle is a Google-owned company that hosts data science competitions and fosters community among data scientists. Users and administrators at Kaggle make large amounts of data available to the community. You can see which datasets are popular (and thus likely most useful) by the number of “upvotes” that they have. The underlying data for my post on skills needed for data jobs was from Kaggle.
- The World Bank Microdata Library. Although the World Bank is well-known for macroeconomic data, this repository contains a wealth of microeconomic data (data on individuals) at the global and/or country level, collected for development-related research. Most datasets are accessible to the public. My posts about the “unbanked” used data from this source (here and here).
- Data.world Open Data. Data.world is a company that helps you organize your data projects. It also provides free access to ~700 data sets, categorized and easily searchable.
- [On COVID-19] There is a wealth of open datasets on COVID-19. For my COVID-19 post, I used data from Oxford’s Our World in Data, another great resource for free access to data.
- [On music data] genius.com is an amazing resource to get information about the music industry: artists, albums, songs, lyrics, etc. I used the geniusr package in R to access data for my blog post on Beyonce.
- There are many other great websites to find open data. This Medium post summarizes the top 10.
Sometimes, I know the data exists on the internet, but it is not arranged or accessible in a way that facilitates analysis. That’s why learning web scraping was a game changer. Web scraping, although in an ethical gray area, is a powerful tool to get the data that you need. In my case, I tried to use it responsibly: scraping economics journal titles for my post on social networks, scraping the Moroccan stock exchange website for my post on the Moroccan stock market, or Transfermarkt for my post on Champions League predictions (soccer). Another useful skill that goes with web scraping is text mining: the ability to clean large amounts of text in order to draw meaning from them. In my experience, web scraping rarely pays off unless it is coupled with text mining. Learning how to do both has been extremely useful.
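To make the pairing of scraping and text mining more tangible, here is a minimal Python sketch. It is not the code behind any of my posts: the HTML snippet, the `title` class name, the tiny stop-word list and the function names are all made up for illustration, and it parses a string rather than a live website so that it runs without a network connection. It only uses Python's standard library.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical page snippet standing in for a downloaded journal listing.
SAMPLE_HTML = """
<ul>
  <li class="title">Social Networks and Economic Development</li>
  <li class="title">Networks, Trade &amp; Development: A Survey</li>
  <li class="other">Advertisement</li>
  <li class="title">The Economics of Social Media</li>
</ul>
"""

# Deliberately tiny stop-word list, for illustration only.
STOPWORDS = {"a", "and", "of", "the"}


class TitleParser(HTMLParser):
    """Collect the text of every <li class="title"> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and dict(attrs).get("class") == "title":
            self.in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.titles[-1] += data


def scrape_titles(html):
    """Scraping step: pull the raw titles out of the HTML."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return [title.strip() for title in parser.titles]


def clean_words(text):
    """Text-mining step: lowercase, keep letters only, drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in STOPWORDS]


def word_counts(titles):
    """Count how often each cleaned word appears across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(clean_words(title))
    return counts


if __name__ == "__main__":
    titles = scrape_titles(SAMPLE_HTML)
    print(word_counts(titles).most_common(3))
```

The first half gets the text off the page and the second half turns it into something you can count, which is the point where insights start to show up. On a real website you would also want to check the site's terms of use and robots.txt before scraping.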
Step 2: Read up on what’s been done on the topic before you dive in
After you select a dataset, it’s a good idea to do a quick Google search to see what has already been done on the topic. This accomplishes two purposes. First, it inspires you and gets you acquainted with the kinds of questions you can answer around the data. Second, it helps you be more original and avoid repeating other people’s work (although I would agree that there is value in replicating others’ work, from a learning perspective).
To make things more concrete: when I wrote about Airbnb data in Mexico City, I typed “Airbnb data insights” into Google and explored some articles written on the topic, such as this report or this article. I also typed “social concerns about Airbnb” to see what social themes emerge around Airbnb. That’s how I found an academic paper that explored the effect of Airbnb prices on rent prices in the US. That is where the idea of exploring Airbnb and gentrification came from.
In short, a little homework does pay off!
Step 3: Formulate clear questions and structure your writing accordingly
Now that you have been inspired by what’s out there, and you have real data to work with, it is very important that you clearly formulate your question (first in your head, and later, to your audience). This not only helps you organize your thoughts and keeps you in check, it also makes your analysis more straightforward.
For example, when I got my hands on the stock exchange data in Morocco, I wanted to answer these specific questions:
- What is the total value of the market?
- Is this value concentrated in a few companies, or distributed?
- Which specific holding companies own these top companies?
- What is the place of institutional investors (e.g., pension funds) in the stock market?
These questions guided my choice of analytical tools, and later became the main sections of my post.
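To show how a clear question translates almost directly into analysis, here is a small Python sketch for the first two questions (total value and concentration). The company names and figures are invented placeholders, not actual Moroccan stock market data, and the function names are my own for this illustration.

```python
# Hypothetical market capitalisations in illustrative units, not real data.
MARKET_CAPS = {
    "Company A": 50.0,
    "Company B": 30.0,
    "Company C": 10.0,
    "Company D": 6.0,
    "Company E": 4.0,
}


def total_value(caps):
    """Question 1: what is the total value of the market?"""
    return sum(caps.values())


def top_n_share(caps, n):
    """Question 2: what fraction of total value sits in the n largest companies?"""
    largest = sorted(caps.values(), reverse=True)[:n]
    return sum(largest) / total_value(caps)


if __name__ == "__main__":
    print(f"Total market value: {total_value(MARKET_CAPS):.1f}")
    print(f"Share held by top 2 companies: {top_n_share(MARKET_CAPS, 2):.0%}")
```

Each question maps to one small function, and the same structure can carry over to the outline of the post: one question, one computation, one section.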
Step 4: Consult friends to test your thinking
At times, I found myself unfamiliar with a particular context, or unsure how to use an analytical tool or how to explain a concept. At such times, I turned to my friends, who pointed me to resources or offered to help me understand things better. For example, a friend who once worked as a Congress staffer explained to me how US Congress hearings work (for this blog post). For my post on data skills, a friend who is building a data analytics start-up shared his view of the most in-demand data skills, and a data engineer friend helped me classify technologies by what they accomplish. For my post on Airbnb in Mexico City, two Mexican friends (and geo-spatial data experts) explained to me where I can get open data sets about Mexico City and what software I can use to visualize maps. For my post on Beyonce, the resident DJ in my friend group (and also an upcoming data science PhD) gave me tips about what I can do to decipher hidden meaning in songs.
In conclusion, make sure to bounce ideas off of people in your network. Friends can help you overcome obstacles and can bring new ideas to light. Of course, always give them credit in your posts! So here’s a shout-out to everyone I’ve ever bothered about my Medium publishing. Thank you, thank you, thank you!
Step 5: Visualizations are important. Make sure to produce enough of them
Visualizing data is extremely important when writing about complex, technical concepts. The human brain is more likely to remember a striking visualization than a bunch of text stitched together. Visualizations, when done right, can help your reader understand your insights and conclusions, and can even help them derive their own insights!
Visualization is an area that I definitely need to work on (my color coordination is often off), but I’m constantly improving. For example, I’ve found useful online resources that provide guidelines.
I’ve also tried to vary the type of visualizations I show in my posts, and I try as much as I can to move away from classic bar charts. I’ve experimented with network maps, box plots, geographic maps and faceted graphs.
Step 6: Write as if you’re explaining stuff to your grandma, or to a child
Once you’re done with your analysis, you need to actually start writing. What I’ve found useful is to write in direct language. I avoid long, convoluted sentences and stay away from complicated vocabulary. My rule of thumb is that if my teenage brother can understand my language, then it is probably good to go.
It is sometimes difficult to accomplish this when writing about technical concepts such as statistical inference, logistic regression or random forests, but clear communication is a crucial skill that any professional should strive for. After all, you can do the most elaborate and amazing analyses, but if nobody can understand you, your work won’t have much impact.
Step 7: Unleash your creativity and let your personality shine through your writing
One great perk of blogging, as opposed to writing an academic paper, or even writing for a publication such as a magazine, is that you have absolute freedom to make your work your own. Those who know me well and have been following my writing can tell you it does reflect who I am: I wrote about my country (Morocco), my academic interests (economic development and economics’ networks), my professional interests (jobs, skills), my hobbies (listening to music, watching the Champions League) and my general interests (tech, politics, etc.).
I’ve even embedded music in my posts (Cheb Khaled, Beyonce or Selena Quintanilla) and managed to tell the whole internet how much I dislike Real Madrid (oops, looks like I did it again). Even in this post, you’ll notice that I casually put various pictures of artisanal products from Morocco, and that’s because: (1) they are visualizations, (2) they are beautiful and (3) I want to promote them for your post-COVID consumption :)
Why does unleashing one’s creativity matter? It matters because when you write about stuff you care about, the process is fun, and the end result is satisfying.
I hope you walk away from this post with some inspiration on how to start your own writing journey. I also hope that you will one day… Visit Morocco.
Thank you for reading, and please leave a comment if you’d like to engage further.