As technology advances and business management techniques continue to be refined, there’s an increased focus on enabling data-based decision making. These days it’s hard to find a company that’s not talking about how to turn analytics and business intelligence software into a competitive advantage.
In fact, a Harvard Business Review article from a few years back named “data scientist” as the “sexiest job of the 21st century.”
Sales, operations, human resources, and finance managers alike are being called upon to dive deeper into the data to justify business decisions. Reporting has always been a part of management. But the ability to effectively interpret and present data has arguably never been a more important skill set for the career-minded manager than it is now.
The practical upshot is that reports or presentations with an unending stream of bar graphs just aren’t going to cut it anymore.
In order for data visualizations to be effective, they need to:
Selecting the right data visualization for each of your data sets is the first and most important step to accomplishing each of these goals.
Just making the simple shift from the rectangular bar to the circular pie is an easy way to bring visual variety to data presentations. And, if you need a fancier reason than that to roll with the circle, art history offers one.
As the story goes, when he was seeking a commission to paint for St. Peter’s Basilica, the Italian painter and architect Giotto drew a perfect circle in a single freehand stroke. Giotto submitted his flawless rendering of the shape as evidence of his qualification for the appointment. Pope Boniface VIII promptly awarded him the contract.
So what gives? Of course, Giotto’s subtle demonstration of skill flattered the Pope by acknowledging his ability to recognize it. But it also played on a deeper truth. Circles convey a symbolic meaning of perfection and completeness.
If that sounds a bit esoteric for the task of adding some visuals to the next TPS report, no need to worry. There’s a practical application. Because they are circular, pie charts offer an ideal vehicle for communicating the proportions of items within a greater whole, which is a frequent need in the visualization of business data.
If the values in a data set add up to 100% of a whole, you could do a lot worse than choosing the humble pie chart.
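To make the "parts of a whole" idea concrete, here is a minimal sketch of the arithmetic behind a pie chart: each value becomes a percentage of the total and a matching slice angle. The `pie_slices` function and the regional revenue figures are made up for illustration.

```python
def pie_slices(values):
    """Return (label, percent_of_whole, slice_angle_in_degrees) for each category."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    slices = []
    for label, value in values.items():
        share = value / total
        slices.append((label, share * 100, share * 360))
    return slices


# Hypothetical revenue by region, in thousands of dollars.
revenue = {"North": 50, "South": 25, "East": 15, "West": 10}

for label, percent, angle in pie_slices(revenue):
    print(f"{label}: {percent:.1f}% of the whole, {angle:.1f} degree slice")
```

If the percentages would not add up to 100 because the categories overlap or leave part of the whole out, that is usually a sign the data belongs in a different chart.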
Line charts are far from a radical data visualization. In fact, they’re almost as common as bar graphs.
But line charts are not interchangeable with bar graphs. Here’s the main difference: In a bar chart, the position of data only needs to be mathematically significant along one axis. In a line chart, the distance of a data point from the origin (the point where the x and y axes intersect) is significant in both the vertical and horizontal dimensions.
Line graphs provide an ideal format for mapping data values at specific intervals. Because the points on line graphs are connected together, they also reflect progression between measurements. The resulting slope between plotted points displays how a particular quality of the data increased or decreased as it moved toward the next interval point.
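As a rough illustration of what the slope between plotted points tells you, the sketch below computes the rise or fall between each pair of neighboring points. The `segment_slopes` helper and the monthly figures are hypothetical, and the points are assumed to be sorted by x with no repeated x values.

```python
def segment_slopes(points):
    """Given (x, y) points sorted by x, return the slope of each connecting segment."""
    slopes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        slopes.append((y1 - y0) / (x1 - x0))
    return slopes


# Hypothetical monthly sales: (month number, units sold).
monthly_units = [(1, 100), (2, 130), (3, 120), (4, 120)]

# Positive means growth, negative means decline, zero means flat.
print(segment_slopes(monthly_units))  # [30.0, -10.0, 0.0]
```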
Bar graphs, on the other hand, work better for displaying a quantitative attribute of an item, group, or category that can’t necessarily be meaningfully defined by a number. For instance, bar graphs would be a better choice for showing the average price of, say, bananas, apples, and oranges. A line chart that connected the data points for bananas, apples, and oranges with lines would be nonsensical. After all, you wouldn’t ever have a half-banana/half-apple fruit, which is what an x-coordinate halfway between the two fruits on a line-connected graph would imply.
Line graphs make excellent visualizations for time series, as time progresses at fixed intervals. They also offer another advantage. Lines simply take up less visual space than bars. As a result, when you need to include multiple data series, line charts provide a better, more visually comprehensible option.
Scatter plots share many of the advantages line charts hold over bar graphs. Namely, scatter plots allow for the display of a larger number of data points than a bar graph could legibly display.
Unlike line charts, though, scatter plots map data as single points that are not connected by lines.
Line charts use lines between plotted points to illustrate trends within a specific data series. Therefore, while multiple data points in a single series on a line chart might share a value on the axis used for measurement (usually the y-axis), each series can have one and only one data point at any given position on the axis used for categorization (usually the x-axis).
The advantage of scatter plots is that they can plot multiple data points at the same position along the categorization/interval axis. Consequently, they are ideal for displaying data distributions where a single series of data might have multiple entries at the same x-coordinate (provided it is the x-coordinate that determines item categorization).
For instance, a scatter plot might display the dollar size of a sale as the unit of measure and the date of the sale as the unit of categorization. There might be five sales on one day, four on the next, and none for the next five days. If the pattern repeated and the days with sales corresponded with weekends, the distribution pattern would be easily recognizable. The scatter plot could also help show which day tended to yield the larger dollar-figure sales.
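A quick way to check whether a data set calls for a scatter plot rather than a line chart is to look for repeated x values. The sketch below groups some invented sales records by day; the record list and both helper functions are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records: (date of sale, dollar amount).
sales = [
    (date(2024, 6, 1), 120.0),
    (date(2024, 6, 1), 80.0),
    (date(2024, 6, 1), 310.0),
    (date(2024, 6, 2), 95.0),
    (date(2024, 6, 2), 450.0),
    (date(2024, 6, 8), 200.0),
]


def group_by_day(records):
    """Collect every sale amount recorded on the same day."""
    by_day = defaultdict(list)
    for day, amount in records:
        by_day[day].append(amount)
    return dict(by_day)


def fits_line_chart(records):
    """A single line series allows at most one value per x position."""
    return all(len(amounts) == 1 for amounts in group_by_day(records).values())


print("One value per day?", fits_line_chart(sales))
for day, amounts in sorted(group_by_day(sales).items()):
    print(day.isoformat(), "sales:", len(amounts), "largest:", max(amounts))
```

Because several days have more than one sale, a single connected line could not represent this data faithfully, while a scatter plot can show every sale as its own point.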
Bubble charts are scatter plots. But they are a specific type of scatter plot that can overcome one of the limitations of the basic form.
The problem with the plain old scatter plot is the opportunity for occlusion. Occlusion is what happens when the display of some amount of data in a single visualization obscures the display of other data.
Looking back at the scatter plot shown above, what would’ve happened if two products from Family 1 had precisely the same number of “units sold” and “revenue in millions”? Since the two data points would’ve had exactly the same coordinates, one would inevitably obscure the other.
Bubble charts introduce a third data dimension to overcome this issue. By converting the scatter plot above to a bubble chart, data points could be drawn with a proportionally larger radius to reflect the number of products at that particular x,y coordinate.
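Here is one way the conversion described above might look in code: identical coordinates are collapsed into a single bubble whose radius grows with the number of points it stands for. The `to_bubbles` function, the `base_radius` parameter, and the product figures are all assumptions for the sake of the sketch.

```python
from collections import Counter


def to_bubbles(points, base_radius=4.0):
    """Collapse identical (x, y) points into (x, y, count, radius) bubbles."""
    counts = Counter(points)
    return [
        (x, y, count, base_radius * count)
        for (x, y), count in sorted(counts.items())
    ]


# Hypothetical products: (units sold in thousands, revenue in millions).
products = [(10, 1.5), (10, 1.5), (25, 3.0), (40, 2.0), (10, 1.5)]

for x, y, count, radius in to_bubbles(products):
    print(f"({x}, {y}): {count} product(s), radius {radius}")
```

Some charting tools size bubbles by area rather than radius, so it is worth checking how your software interprets the size value before relying on it.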
While the bubble chart can be used in this manner to overcome some instances where occlusion may occur, it might not prevent every possibility for occlusion. Multiple data series in bubble charts reintroduce the possibility for occluding data points. Additionally, particularly large bubbles always run the risk of obscuring smaller ones. Theoretically, this problem can be solved by layering the data points from smaller to larger moving front to back, but not all software makes this option available, and it may be too time-intensive to warrant the effort.
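The layering idea can be sketched as a simple sort, assuming your charting tool draws items in the order it receives them, with later items painted on top. The `draw_order` function and the bubble records below are hypothetical.

```python
def draw_order(bubbles):
    """Put the largest bubble first (drawn at the back) and the smallest last (in front)."""
    return sorted(bubbles, key=lambda bubble: bubble["radius"], reverse=True)


# Hypothetical bubbles that overlap on the chart.
bubbles = [
    {"name": "Product A", "radius": 6.0},
    {"name": "Product B", "radius": 20.0},
    {"name": "Product C", "radius": 11.0},
]

for bubble in draw_order(bubbles):
    print(bubble["name"], bubble["radius"])
```

This keeps small bubbles visible on top of large ones, though it does nothing for bubbles of the same size at the same position.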
Nevertheless, for situations where a third dimension of quantification is required, bubble charts offer a unique and visually compelling option that will work well for many data sets.
Data sets with items defined by geography offer a great chance to use map charts. If you’re looking to make an impact with a visually impressive data presentation, it’s tough to beat a map chart.
The two most common types of map charts used in BI data visualizations are choropleths and proportional symbol maps. Which one is right for your use-case will depend primarily on whether you are looking to display data that maps to a territory or a specific point on the map. Map charts work particularly well in contexts that allow for interactivity, so that viewers can roll over locations on demand to reveal detailed quantitative data.
Choropleths are the map chart of choice when you need to differentiate measurements that correspond to territories on a map. Choropleths use colors, shading, or patterns overlaid on geographical units to display corresponding measurement data.
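Behind the scenes, a choropleth needs a rule for turning each territory's measurement into a shade. The sketch below uses equal-width bins, which is only one of several possible classification schemes. The function name, the shade labels, and the regional figures are invented for illustration.

```python
def classify_equal_interval(values, shades):
    """Assign each region a shade by splitting the value range into equal-width bins."""
    lowest = min(values.values())
    highest = max(values.values())
    # Fall back to a width of 1 when every region has the same value.
    width = (highest - lowest) / len(shades) or 1
    result = {}
    for region, value in values.items():
        index = min(int((value - lowest) / width), len(shades) - 1)
        result[region] = shades[index]
    return result


# Hypothetical sales by territory, in thousands of dollars.
territory_sales = {"Ohio": 10, "Indiana": 20, "Michigan": 40, "Illinois": 70}
shades = ["light", "medium", "dark"]

print(classify_equal_interval(territory_sales, shades))
```

With skewed data, equal-width bins can leave most territories in the same shade, in which case a different binning rule may communicate more.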
In cases where data points correspond to specific locations rather than wider territories, proportional symbol maps are the best choice. Much like a bubble chart, proportional symbol maps use the size of the icon (often a circle) marking the data point to demonstrate magnitude.
For some reason a bar graph with a single bar just doesn’t look right. But often it’s useful to spotlight a single measurement with a dedicated visual treatment.
Because of the laser focus provided by meters, they are a frequent element on executive dashboards. Business leaders need to make rapid good/bad judgments. Meters provide a mechanism to help them do so based on real metrics.
The real business value of any visualization is not only to display information, but to actually elicit a response. This is particularly true in the case of meters, which are often positioned in dashboard-type views. As a result, it’s important that meter design reinforces the qualitative judgment of the data presented within it. Gauges and other skeuomorphic displays that borrow from the vocabulary of real-world meters can be a good way to help viewers interpret the degree to which data corresponds to positive or negative performance.
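To show how a meter can carry a qualitative judgment along with the number, here is a small sketch that maps a metric to a good/warning/bad zone and to a needle angle on a half-circle gauge. The thresholds, function names, and the 180-degree dial are assumptions chosen for the example.

```python
def meter_zone(value, warning_at, good_at):
    """Classify a metric where higher values are better."""
    if value >= good_at:
        return "good"
    if value >= warning_at:
        return "warning"
    return "bad"


def gauge_angle(value, minimum, maximum):
    """Map a value to a needle angle on a 180-degree dial, clamped to the dial's range."""
    fraction = (value - minimum) / (maximum - minimum)
    fraction = max(0.0, min(1.0, fraction))
    return 180.0 * fraction


# Hypothetical metric: percent of quarterly sales quota reached.
quota_attainment = 92
print(meter_zone(quota_attainment, warning_at=70, good_at=90))
print(gauge_angle(quota_attainment, minimum=0, maximum=100))
```

In practice the zones would typically be drawn as colored bands on the dial, so the viewer sees the judgment before reading the number.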
While they are often overused, there’s nothing inherently “bad” about bar graphs. In many cases, they’re well-suited to the data. Being aware of the variety of bar graph variants, though, can help you choose the most appropriate bar graph option.
Bar graphs are great for comparing magnitude between multiple categories of data. But sometimes it is also important to display the value of individual items within each category. Stacked bar graphs provide a way to drill down to this level of detail. There’s one important caveat to be aware of when using stacked bar graphs: As both the number of categories and the number of individual items within categories increase, it will get progressively more difficult to interpret the stacked bar graph.
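The mechanics of stacking are simple: each item's segment starts where the previous one ended. A minimal sketch, with a hypothetical `stack_segments` function and invented sales figures:

```python
def stack_segments(items):
    """Return (item, bottom, top) for each item so the segments sit on top of each other."""
    segments = []
    bottom = 0
    for item, value in items.items():
        segments.append((item, bottom, bottom + value))
        bottom += value
    return segments


# Hypothetical unit sales per product within each region.
sales_by_region = {
    "East": {"Widgets": 30, "Gadgets": 20, "Gizmos": 10},
    "West": {"Widgets": 15, "Gadgets": 25, "Gizmos": 5},
}

for region, items in sales_by_region.items():
    print(region, stack_segments(items))
```

Only the bottom segment of each bar starts at zero. The others float at different heights, which is part of why crowded stacked bars get hard to compare.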
The more quickly a viewer is able to interpret data, the better the data visualization. One trick that can be used to increase the speed of comprehension is using the shape of the bar itself to communicate information about the data contained within it. A bar graph modified into the shape of a sales funnel provides a good illustration of this approach. In some shaped bar graph displays, the layers of the graph will no longer be proportional to the amounts they represent, but labels can overcome this potential shortcoming.
The Pareto chart is another example of how a modified bar graph can be used to display more information about the data. The foundation of a Pareto chart is a standard bar graph that displays the magnitude of data categories with bars. In a Pareto chart, the individual categories are arranged in order from largest to smallest. This arrangement allows for overlaying a line chart that shows the cumulative percentage of the whole, that is, the sum of each category and all preceding categories as a share of the total. The Pareto chart is especially useful for prioritizing the most important categories or factors in terms of their contribution to the whole. It is commonly used with data sets related to quality management.
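The two ingredients of a Pareto chart, the sorted bars and the cumulative percentage line, can be computed in a few lines. The `pareto_table` function and the defect counts below are made up for illustration.

```python
def pareto_table(counts):
    """Return (category, value, cumulative_percent) rows, largest category first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)
    rows = []
    running = 0
    for category, value in ordered:
        running += value
        rows.append((category, value, 100 * running / total))
    return rows


# Hypothetical defect counts from a quality inspection.
defects = {"Scratches": 10, "Dents": 50, "Misprints": 25, "Other": 15}

for category, value, cumulative in pareto_table(defects):
    print(f"{category}: {value} defects, {cumulative:.0f}% cumulative")
```

In this invented example, the first two categories account for 75% of all defects, which is the kind of prioritization signal the chart is meant to surface.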