I remember coming across an image on reddit a month ago. It was 50 companies made into boxes and split up into different industries like Finance, Pharmaceuticals, and Retail. When I saw it I didn't fully realize the goliath Walmart($482 B) was, when I compared it to Apple($235 B). But this was just revenue, basically "who is ringing up the most dough". And since Walmart pays the cost of having a store 50 miles from every American, I wanted to see who could do more with less. I wanted to rank them by their profit.
When I scrolled down to the comments I learned that I wasn't alone in this sentiment. Some people wanted even more. There was a discussion of whether assets or market cap are better measure for the title "Biggest Companies". So, I got to work, found my data source, the Forbes Global 2000. It had all the metrics people wanted to see. After scraping the list in Python, I decided to coalesce some of the industries, by using broader terminology. For example, I labeled "Major Banks", "Diversified Insurance" and "Investment Services" as an all-encompassing and more understandable "Finance and Insurance" industry.
Then I created my first graph. It was a treemap, just like the image on reddit, but with profit instead of revenue. I was immediately displeased with the result. I had a graph of 50 companies, but I had no way of comparing them to the revenue chart from reddit. I knew I had to put them both on the same chart. Also, it is poor form to manipulate two free variables, height and width, to show one dependent value, profit.
But interestingly from this chart we can see that 40% of the profit of the top 50 companies are finance companies. I think the metric shows a bias towards them because finance companies make money form money. They have relatively low overheads to make gobs of money. Most probably would never think of under taking large risky ventures that would push them into a loss for a year.
Next I made stacked bar chart, where you can see that the height or the bar represents revenue, and the lighted part is profit. This is much more accurate way showing that we have 1-d data. And this shows profit in relation to the total revenue.
I love this chart, it accurately portrays profit and revenue. This is what I would want to see if I was showing the results of an experiment. Yet, I felt like it loses the focus on profit, and instead gets overwhelmed by the big blue bars. At the end of the day, this chart is missing the art in the data visualization.
Which lead to the final graph. Bubbles to show market size. They are accurate in that only one free variable, radius, shows our one dependent value, profit or revenue. It's also beautiful in that the size of the bubble shows size of the revenue.
Final Interactive Result
So, happy with shape of the graph I set out make it interactive. The first thing I did is show country of origin, market cap, and asset value in a tip when a company is hovered over. Also, I made it so companies in the same industry are highlighted when their label is hovered over. And as a cheeky final touch, I made it so the company names rotate when you hover over them. Sometimes the stay rotating, just to keep the graph looking alive.
I have to admit that after spending so much time with the bubbles and knowing exactly what the actual revenue each of the bubbles represented, that I over looked that humans are really bad at judging areas. If I was to go back I would definitely label some of the bubble with that information to give context in the static graph. I am starting to be convinced that bubbles weren’t the right visualization of the data, because it kind of falls into the same problem of the treemap.
Also, looking back I wish I had included some annotations so readers would already be pointed to a few interesting insights.